May 6, 2024

Provider Data Sources Reference

Last week (May 3, 2024) CarePrecise published a dynamic new healthcare data resource, entitled the Provider Data Sources Reference Guide (or just "PDS" for short). By "dynamic" we mean that it will be continually updated, and it will grow with new entries relating to free and fee-based provider data sources.

Navigating public and private data sources is challenging and time consuming, and it's something that the CarePrecise resources team has been doing for decades. Opening our expertise to the public is built into the DNA of our company. We take pride in being the most open and transparent healthcare provider data vendor, and the Provider Data Sources Reference Guide is just the logical next step.

Provider Data Sources (PDS) Reference Guide

The PDS is free and publicly available, with listings of data sources, both public and proprietary. It's a great place to find clues about the data sources needed across the healthcare industry. Included are sources that we use to build our authoritative provider data packages, as well as others that can be integrated with data packages from CarePrecise and other vendors using the NPI, CCN and PAC ID unique identifiers, to augment and enhance value for our customers and the industry at large.

Our resources team accepts submissions for entries in the PDS, which are reviewed for quality, pertinence, and value of content. Direct links to the sources are included on the page, where possible.

Using the on-screen tools, listings can be sorted and filtered by Category (such as Physicians, Hospital/Medical Facility, Mental Health, etc.), and by Free, Fee-based, or Limited use sources.

The PDS is a companion to the CarePrecise U.S. Healthcare Administration and Information (USHAI) resources guide, which contains links to medical associations, healthcare IT, cost reduction, patient guidance information, and much more. Both public and proprietary sources are included, and the USHAI guide also accepts submissions from the industry. Note that submissions must come from the source of the information; submissions that come from a public relations agency or other third party are not considered for publication. Both the USHAI and the PDS consider limited commercial content of high quality, with inclusion at the discretion of CarePrecise.

Go here to submit a provider data source for the PDS reference guide. Go here for commercial submissions to the companion USHAI guide.

April 15, 2024

3D Views of Healthcare Locations

Google Maps has just released a fascinating new capability. Their new 3D Area Explorer offers the ability to create immersive, interactive views of any point of interest. Like a 2D map, locations are pinned, and the view can be rotated on various axes to explore the locations. This would be useful in applications like "find a provider" apps that would be able now to show the user around an unfamiliar building or facility compound, making it easier to find their destination and building entrance.

Combined with data from CarePrecise, such as HealthGeo, which contains latitude and longitude for U.S. providers, these clinicians and facilities, or a cluster of them, such as medical offices around a hospital, can be viewed as an interactive 3D map.
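As a rough illustration of the "cluster around a hospital" idea, the sketch below picks out the providers that fall within a given radius of a facility, using nothing but latitude and longitude. It is a minimal example under stated assumptions: the field names (`npi`, `lat`, `lon`), the sample coordinates, and the helper names are made up for illustration and are not the layout of HealthGeo or any other dataset. The distance is the standard haversine great-circle formula.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def providers_near(providers, center_lat, center_lon, radius_km):
    """Return (distance_km, provider) pairs within radius_km, nearest first."""
    hits = []
    for p in providers:
        d = haversine_km(center_lat, center_lon, p["lat"], p["lon"])
        if d <= radius_km:
            hits.append((d, p))
    hits.sort(key=lambda pair: pair[0])
    return hits


# Made-up sample rows (hypothetical field names, not real provider records)
hospital = {"name": "Example Hospital", "lat": 36.1540, "lon": -95.9928}
providers = [
    {"npi": "0000000001", "lat": 36.1551, "lon": -95.9940},
    {"npi": "0000000002", "lat": 36.1500, "lon": -95.9900},
    {"npi": "0000000003", "lat": 35.4676, "lon": -97.5164},
]

nearby = providers_near(providers, hospital["lat"], hospital["lon"], radius_km=2.0)
for distance, provider in nearby:
    print(provider["npi"], round(distance, 2), "km")
```

The resulting list of nearby points is the kind of input a mapping layer would then pin and render; how the pins are passed to a particular map API is outside this sketch.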

Using the Google Maps Platform API along with other tools from Google and CarePrecise datasets, such as CarePrecise Platinum extended healthcare provider data, it's possible to visualize information, such as:
  • All of one doctor's practice locations and the hospitals they're affiliated with, and to zoom around and identify travel routes
  • The locations of all medical facilities, or specific types of facilities, in a city or neighborhood
  • All of the practice locations of physicians with particular specialties, or who perform particular procedures
  • Locations of physicians who have opted out of Medicare, versus those who accept Medicare
We expect these tools to find uses in identifying areas that are underserved or overserved, offering improved revenue opportunities for providers. Overlaid with POI (Point of Interest) data from other vendors, heat maps can be created to indicate volumes of patients per location. From retail to investment to insurance, innumerable scenarios make use of geospatial data. With 3D visualization, these complex data can be better understood and communicated to team members, stakeholders, and consumers.

March 14, 2024

The Power of Physician Databases

In the ever-evolving landscape of healthcare in the United States, access to accurate and comprehensive provider data is crucial for improving patient outcomes, optimizing resource allocation, advancing medical research, and supporting communication between providers and innovators. Clinician data stands at the forefront of this revolution, offering a treasure trove of information that empowers stakeholders across the healthcare ecosystem. From pharmaceutical companies seeking to collaborate with key opinion leaders to healthcare organizations aiming to enhance their referral networks, the value of physician and other prescribing clinician data cannot be overstated.

The Backbone of Healthcare Insights

Physician data serves as the backbone of healthcare insights, providing centralized repositories of information on medical professionals, including their specialties, affiliations, contact details including email addresses, and clinical interests and treatment patterns. These databases are meticulously curated, drawing from authoritative sources such as federal provider data maintained by the Centers for Medicare and Medicaid Services (CMS), medical licensing boards, professional associations, and healthcare institutions. Good physician databases can offer a comprehensive view of the healthcare landscape, enabling stakeholders to make informed decisions and to reach out for strategic partnerships.

Driving Medical Innovation

Innovation in healthcare relies heavily on collaboration and knowledge sharing among medical professionals. Authoritative physician data fosters these connections by facilitating networking opportunities and identifying experts in specific fields. Pharmaceutical companies, for instance, leverage physician databases to identify potential investigators for clinical trials, gather insights on prescribing patterns, and engage with thought leaders to advance their research agendas. By streamlining the process of connecting with relevant healthcare providers, these databases accelerate the pace of medical innovation and drug development.

Enhancing Patient Care

Effective patient care hinges on seamless coordination among healthcare providers and access to timely, relevant information. Physician databases enable healthcare organizations to build robust referral networks, ensuring that patients receive the specialized care they need. Primary care physicians can quickly identify specialists based on their expertise and proximity, leading to shorter wait times and improved patient satisfaction. Access to comprehensive physician profiles allows clinicians to make well-informed referrals, resulting in better treatment outcomes and continuity of care. 

Accurate fax numbers for pharmacies facilitate delivery of prescriptions where prescribers do not subscribe to an e-prescription system, and contact information for the prescribers is crucial when a pharmacist needs clarification, or has potentially life-saving information on a drug interaction that may have escaped a prescriber's notice.

Communicating Vital Information to Doctors

The process of updating physicians on advances in their areas of practice relies heavily on good physician contact information. Up-to-date, accurate physician databases, with practice addresses, phone and fax numbers, and email addresses, from a reliable vendor are the basis for communication. Companies use these authoritative data resources to update their own in-house databases, keeping the lines of communication open and effective.

Informing Healthcare Policy

In an era of evidence-based medicine, data-driven insights are indispensable for shaping healthcare policy and regulation. Physician databases provide policymakers with valuable information on physician demographics, practice patterns, and geographic distribution, enabling them to identify areas of need and allocate resources effectively. By analyzing trends in physician workforce dynamics, policymakers can develop strategies to address shortages in underserved areas, promote diversity in healthcare, and support initiatives aimed at improving access to care for underserved populations.

Ensuring Data Accuracy and Privacy

While physician databases offer immense benefits, ensuring data accuracy and privacy is paramount. To maintain the integrity of these databases, data providers employ rigorous validation processes and adhere to strict privacy regulations such as HIPAA (Health Insurance Portability and Accountability Act) in the United States. Additionally, data anonymization techniques are often employed to protect sensitive information and preserve patient confidentiality. By prioritizing data quality and security, stakeholders can harness the full potential of physician databases while safeguarding patient privacy.

Looking Ahead: The Future of Physician Databases

As technology continues to evolve, the future of physician databases holds tremendous promise. Advancements in artificial intelligence and machine learning are poised to revolutionize data analytics, enabling stakeholders to extract deeper insights and predictive analytics from vast datasets. Integration with electronic health records (EHRs) and interoperability standards will further enhance the value of physician databases by providing real-time access to patient information and care coordination tools. With data coming in from so many sources in government and the healthcare industry, intelligent tools for merging information into a “single source of truth,” such as the CarePrecise Collection™ healthcare provider dataset, are key. CarePrecise developed its QoRelate™ record collection and linkage intelligence to build a range of data modules that can be used in any relational database environment across the industry.


October 9, 2023

Media Release: ScribeFax Announced

The following media bulletin was released 10/9/2023.

Database Reveals Hidden Clinician Fax Numbers

SUMMARY

ScribeFax™, a comprehensive database of fax numbers for physicians, physician assistants, dentists and other prescribing clinicians, is now available from CarePrecise LLC, a vendor of authoritative healthcare provider data. ScribeFax is created using advanced data mining, compiling monthly updates from millions of clinician, clinic, and other medical facility records to reveal hard-to-find fax numbers.

FOR IMMEDIATE RELEASE

CarePrecise LLC, a supplier of healthcare provider data since 2008, has announced the release version of ScribeFax™, the company’s comprehensive database of U.S. physicians, dentists, physician assistants, advanced practice nurses, and other prescribing clinicians’ fax numbers. The enhanced clinician fax database has graduated from beta to full production version, and is now available for download from the company’s web site.

ScribeFax is a unique resource, offering not only fax numbers reported by physicians and others, but also correct fax numbers to replace tens of thousands of unreported, hidden, incorrect, or obfuscated numbers. The database also includes the complete public NPI registry data, including robust contact information, specialties and licenses, in a format that can be easily used in ordinary database software, such as Access, part of the Microsoft Office 365 suite, FileMaker Pro, or any of the database management software like MS SQL Server, Oracle, MySQL, or other database systems.

Healthcare providers self-report their fax numbers to the NPI Registry, where they become part of the National Plan and Provider Enumeration System (NPPES) federal database. Fax numbers are not a required data point, and many providers choose not to report them, fearing abuse. This causes problems for businesses needing to communicate vital information to clinicians, including health plans, electronic prescription transmission services, and suppliers. Contacting by phone to request a fax number wastes many thousands of hours, making the ScribeFax directory an essential tool for fax communications.

In preparing each monthly update to ScribeFax, CarePrecise uses a testing methodology to flag the “good” and “bad” fax numbers, and adds new correct numbers found using advanced data mining technology. “Bad” numbers are included so that users can clean their in-house data. Where multiple fax numbers are found for a single medical practitioner, a prioritization scheme determines the best number to ensure correct delivery.

ScribeFax, like the company’s related product, the ScriptFax™ enhanced pharmacy fax database, is designed to save millions of hours of manual fax number collection across the healthcare industry, and has found wide adoption even in its beta version. Both products are now in full release, and are available for immediate download at CarePrecise.com. Subscriptions to monthly or quarterly updated releases are available.

Michael Christopher, Partner and Chief Data Analyst
CarePrecise LLC
Tulsa, Oklahoma, USA
877-782-2294 x9

September 27, 2023

Effects on Data of a Government Shutdown

UPDATE: An agreement reached for continuing funding has ensured that normal collection and dissemination of federal healthcare data will continue through at least the upcoming October update of CarePrecise data packages.

Discussion of a government shutdown that could affect the Centers for Medicare and Medicaid Services (CMS) – a primary source for U.S. healthcare data – has raised concerns that the regular updates of CarePrecise datasets may be impacted. Our pledge is to provide the most recent data as released from our sources, which are largely within the U.S. Department of Health and Human Services (HHS), the U.S. Department of Commerce, and the U.S. Postal Service, among other government agencies. Should a government shutdown occur, it was announced a few days ago that a large portion – perhaps 42% – of personnel in the HHS will be furloughed. In this event, Medicare, Medicaid, and Obamacare are not expected to be impacted from the perspectives of most patients, providers, and payers, beyond brief delays in reimbursements. However, it is not known at this time how the furloughs will affect processes related to delivery of healthcare data updates. Should any delays occur, CarePrecise will alert all of our data subscribers with what we know.

It is important to point out that there have been 21 government shutdowns since the days of the Ford Administration, and four of these have taken place since CarePrecise began leveraging federal data for the benefit of our customers. The longest, and perhaps most dramatic of these, was the seventeen-day shutdown that occurred in October 2013. No impact on CarePrecise data delivery was felt during any of these previous shutdowns. We note also that much of the responsibility for data production within the federal government falls to contractors. In many, if not most, cases, contractor agreements will not immediately be affected, and work may be expected to continue.

The criticality of current, accurate, standardized data in healthcare has sustained strong bipartisan support since the leadership of HHS Secretary Louis W. Sullivan in the administration of President George H.W. Bush.

It is also noteworthy that, since its founding in the 2008/2009 federal fiscal year, CarePrecise has never missed a data distribution. Whatever happens, CarePrecise will distribute the data as soon as digitally possible.


July 30, 2023

HIPAA Prevents State LEOs from Grazing for PHI -- Doesn't It?

h/t to Samantha Holvey's concise and timely weekly Whealth Care newsletter for addressing a question that is probably on every HIPAA-savvy reader's mind of late: "Can State Attorneys General just randomly scan out of state health records to see whether one of their residents may have committed a health care 'crime'?" This might apply to potentially pregnant patients seeking reproductive diagnosis and treatment, or parents of transgender minors seeking gender-affirming care not available at home.

Having been engaged with HIPAA since its earliest days, I was prepared to repeat my customary, reassuring, "HIPAA is better privacy protection than we had before" speech, but I quickly realized that this time, I was not so sure. See, when we were implementing the three pillars of HIPAA (Privacy | Security | Transactions and Code Sets), back in the aughts, people were most concerned about organizations within the industry misusing the data, or letting it leak out for commercial exploitation.

Very few were worried about a malevolent government. The pre-HIPAA government guardrails that had been erected were still in place, and HIPAA itself was relatively neutral on the matter. Or at least, we implementers were relatively complacent. We thought that, occasional abuse aside, law enforcement organizations would go through existing legal channels to obtain patient records in pursuit of fraud, theft, controlled substance misappropriation, or malpractice.

Now, state after state is passing laws that not only criminalize healthcare procedures that have been common practice for decades, they extend that criminality to procedures performed in states whose own laws preserve their legality. Private citizens can earn bounties by revealing someone has crossed a state line to pursue such treatment. Or even helped fund such an excursion.

And while CMS has published regulatory guidance that explains what sorts of inquiries are already unacceptable under HIPAA, they have also released a Notice of Proposed Rulemaking (NPRM) to tighten the federal regulations against potential state governmental fishing expeditions. The comment period on the NPRM has closed. Can federal regulations be far behind? HIPAA history says not to be too confident. Some NPRMs were allowed to languish for years. Other draft regulations were never formalized into a Final Rule.

July 28, 2023

Transitioning from AI Gee-Whiz to B2B Results

We at CarePrecise are as fascinated as anyone about the miraculous capabilities -- and astounding failures -- of the new Large Language Model Artificial Intelligence tools now battling it out in cyberspace. But we've been around too long not to reserve some skepticism about the hype cycle. The other day I was chatting with an LLM about a new medical device. It initially pointed me to the manufacturer's site and some related promo material, but when I told it I'd rather read content from actual users of the equipment it suggested some sites I generally prefer not to use. When I asked instead for Facebook Groups, it gave me a list of suggestions with very specific Group names.

None of which turned out to exist.

So, when pressed for different information than it had been providing, my chatty AI tool employed a very human tactic: MSU.

This suggests to us that perhaps the best way to effectively use AI will be to point it to data you know is good -- specifically, your own data about your customers and prospects.

This approach is already taking root in pharmaceutical marketing. Directing AI tools toward rich, highly accurate reference data will, we think, become a key component in making the new technology produce credible, and actionable, results.

June 7, 2023

The Facts on Medicare Spending

Analysis from the Kaiser Family Foundation

An excellent interactive explainer from KFF (Henry J. Kaiser Family Foundation) offers a clear and succinct view into the somewhat mystifying universe of Medicare spending. You can scroll to start viewing the interactive. 

Medicare Part A in-out over time

A Depleting Trust Fund

A particularly interesting chart shows how the solvency of the Part A trust fund presents challenges as those workers paying into the system become overwhelmed by the amounts that will need to be paid out (at right). The orange bars represent pay out, while the blue bars represent income to the fund.

In very plain language, when Medicare spends more money on Part A benefits, like hospital stays, than it brings in through payroll taxes, the assets in the Part A trust fund will gradually become depleted, and Medicare would not have enough money to pay for all Part A benefits from that point onward. If nothing is done to prevent it, the trust fund is expected to unravel slowly through the next decade.

What it Bodes

This isn't some doomsday prediction for partisan advantage; this report is based in cold, hard fact. Work should begin now to shore up the Part A trust fund, or the U.S. may face a dire future of rationed and/or restricted care [editor's opinion], where necessary medical care may be withheld and Americans' health will suffer. Other models have shown that when care isn't given when needed, more serious conditions arise and force higher costs, medical bankruptcies, and early death.

May 22, 2023

Algorithmic Bias in Healthcare AI

"Artificial intelligence (AI) and machine learning (ML) are used in healthcare to combat unsustainable spending and produce better outcomes with limited resources," says Ben Tuck in a recent article on the healthcare data blog ClosedLoop.ai. The article stresses the importance of keeping algorithmic bias in check, and goes on to offer four steps to address it.

When machine learning occurs, particularly in neural network-based systems where it is essentially impossible to fully grasp what's happening within the "mind" of the AI, the system may rely on data that reflects cultural biases, such as racism, sexism, homophobia, ageism, and all of the other stereotyping structures that have become written across our languages, interests, parenting, habits - whether we can precisely identify them (or openly admit them) or not.

Tuck's post identifies two general causes, or types, of algorithmic bias: subgroup invalidity and label choice bias.

Subgroup Invalidity Bias

Subgroup invalidity arises where the AI isn't up to the task of modeling the behavior of certain subgroups, due to training on homogeneous populations. Tuck offers the example of a study of pulse oximeter algorithms that demonstrated bias as a result of training on non-diverse data. The study found that "Black patients had nearly three times the frequency of occult hypoxemia that was not detected by pulse oximetry as white patients." The possibility for adverse health outcomes is obvious.

Label Choice Bias

Label choice bias is harder to detect. This is the situation when the AI's process returns a proxy variable, a stand-in for the real thing when the target metric is unavailable. The use of cost data to predict the need for future healthcare resources is an example: because Black people experience discrimination that results in their receiving less care than the White population receives, cost metrics derived from mostly White consumers' episodes are used as though they apply to everyone. An argument can be made that minorities receiving less acute care when needed may actually bias the model in exactly the opposite direction, and the existence of the argument is a strong reason to improve the way the model is built by including race very thoughtfully in the source investigations and in the model's computations.

Fixing It

Limiting bias and making the models useful is possible, Tuck says. "Organizations are taking major steps to ensure AI/ML is unbiased, fair, and explainable," he writes, pointing to a playbook developed by the Booth School of Business at the University of Chicago - a guide for healthcare organizations and policy makers on catching, quantifying, and reducing bias. Read Ben Tuck's article for steps that can be taken, and review the Algorithmic Bias Playbook for more on how to define, measure, and mitigate bias in AI/ML algorithms.

-------------------

CarePrecise is a supplier of authoritative healthcare provider data and insights used across the healthcare community.

May 8, 2023

Record-Linkage in Healthcare Research... and Marketing

Record-linkage is a term referring to technologies that make it possible to merge data on people and organizations from multiple, disparate sources. Early development of the technology was largely related to marketing, for instance, as a means of connecting magazine subscribers' contact information to sales records belonging to retail stores. It's still used that way (more than ever), but some very important applications have emerged since those early days in the 1950s and 1960s, when computers filled whole rooms and developing highly complex software that would use years of run time was pointless. 

CarePrecise uses record linkage to create business intelligence datasets from a broad range of information available through the U.S. Department of Health and Human Services, Department of Commerce, USPS, and other resources. For example, by merging Medicare claims data with NPI registry data and other federal data sources, we can build a 360 degree view of the U.S. healthcare system - from the health systems to the hospitals to the medical practice groups and clinics, to individual clinicians. Today, record linkage is also making significant inroads in improving patient care.


What is record linkage technology and how does it work?

Record linkage is becoming a vital tool for getting the most out of many types of data. Record linkage technology works by creating a unique identifier for each patient that is used to combine information from multiple sources. There are two general types of record linkage: Exact (deterministic) matching and statistical (probabilistic) matching. 

Disambiguation. Exact matching is, of course, ideal. Linking records based on email addresses or tax identification numbers is an excellent example. "Disambiguation" occurs when otherwise disconnected data can be "hard matched" to create an unambiguous match, for which one unique identifier - a number or other code - can be assigned.
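To make the "hard match" idea concrete, here is a minimal sketch of exact matching on a tax identification number. The record layout, field names, and sample values are invented for illustration. The only cleanup applied is stripping punctuation and case so that "12-3456789" and "123456789" compare as equal. Every record sharing a key receives the same identifier, and records with no key are left unlinked rather than lumped together.

```python
def normalize_key(value):
    """Keep only letters and digits, uppercased, so formatting differences do not block a match."""
    return "".join(ch for ch in value if ch.isalnum()).upper()


def link_exact(records, key_field):
    """Assign one entity ID to every record that shares the same normalized key."""
    ids = {}      # normalized key -> entity ID
    linked = []
    for rec in records:
        key = normalize_key(rec.get(key_field) or "")
        if not key:
            # A missing key proves nothing, so the record stays unlinked
            linked.append({**rec, "entity_id": None})
            continue
        if key not in ids:
            ids[key] = len(ids) + 1
        linked.append({**rec, "entity_id": ids[key]})
    return linked


# Made-up records from two hypothetical sources
records = [
    {"source": "A", "tax_id": "12-3456789", "name": "Smith Clinic"},
    {"source": "B", "tax_id": "123456789", "name": "SMITH CLINIC LLC"},
    {"source": "B", "tax_id": "98-7654321", "name": "Jones Medical"},
    {"source": "A", "tax_id": "", "name": "Unknown Practice"},
]

for row in link_exact(records, "tax_id"):
    print(row["entity_id"], row["source"], row["name"])
```

Note that the two "Smith Clinic" rows link even though their names are written differently; with a deterministic key, the name never has to be compared at all.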

Arriving at an unambiguous match may not be as easy as comparing Social Security Numbers. That's when we turn to statistical matching. This is trickier, and almost always less reliable. Probabilistic record linkage uses "fuzzy" matching algorithms to compare data points and make links between different records that may not have the same exact details. For example, if two records had similar birth dates or home addresses, the algorithm would recognize these as potential matches and create a statistical link between them.
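A very simplified sketch of this kind of scoring follows. It compares several fields with Python's standard `difflib.SequenceMatcher`, weights the similarities, and calls a pair a probable match when the combined score clears a threshold. The field names, the weights, and the 0.85 threshold are all assumptions chosen for the example. Production probabilistic linkage typically estimates its weights from the data rather than hard-coding them, so treat this as an illustration of the idea, not a recommended model.

```python
from difflib import SequenceMatcher

# Assumed weights and threshold, chosen only for this illustration
WEIGHTS = {"name": 0.5, "birth_date": 0.3, "address": 0.2}
THRESHOLD = 0.85


def similarity(a, b):
    """Character-level similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(rec_a, rec_b):
    """Weighted sum of per-field similarities."""
    return sum(weight * similarity(rec_a[field], rec_b[field])
               for field, weight in WEIGHTS.items())


def is_probable_match(rec_a, rec_b):
    return match_score(rec_a, rec_b) >= THRESHOLD


# Made-up records
a = {"name": "Jon Smith", "birth_date": "1980-04-12", "address": "12 Oak Street"}
b = {"name": "John Smith", "birth_date": "1980-04-12", "address": "12 Oak St"}
c = {"name": "Maria Lopez", "birth_date": "1975-11-30", "address": "400 Pine Avenue"}

print(is_probable_match(a, b))
print(is_probable_match(a, c))
```

Records `a` and `b` differ in spelling and address format but still clear the threshold, while `a` and `c` do not. Where the threshold sits is a judgment call: lower it and more wrong pairs get linked, raise it and more true pairs are missed.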

Relying on one or a few non-deterministic data points to match records is, naturally, a bad idea. People tend to change home addresses several times over their lifetimes, so using a street address, or phone number or email address, for that matter, would likely miss a number of records. Also, even if these markers have remained constant, another problem, frequently referred to as "fat fingering," occurs when a name, address, phone, etc. is wrongly entered in a database. 

Deliberate ambiguation. Early techniques for reducing this kind of ambiguity between datasets included creating a data field in which all of the vowels are removed from a name or street address. This "works" because numbers and consonants are statistically far less likely to be typed incorrectly. Not a good system, but better than nothing. "False positives," when records are matched that shouldn't be, and "false negatives," when records that should be matched aren't, abound using only this ham-handed method, but it can still be a part of the record linkage process. Where patient data is involved, and where scientists are relying on clean data to glean truth, much more must be done.
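The vowel-stripping trick is easy to sketch, and so are its failure modes. In the example below, the function name and sample names are made up, and the choice to treat only A, E, I, O and U as vowels is an assumption. The key survives a vowel typo, but it also collides for two different people and falls apart when the typo lands on a consonant.

```python
VOWELS = set("AEIOU")


def consonant_key(text):
    """Uppercase the text, then drop vowels and anything that is not a letter or digit."""
    return "".join(ch for ch in text.upper() if ch.isalnum() and ch not in VOWELS)


# A vowel typo still produces the same key, so the records match
print(consonant_key("Margaret Johnson"), consonant_key("Margarit Johnsen"))

# False positive: two different people collapse to the same key
print(consonant_key("Dale Mora"), consonant_key("Adele Moore"))

# False negative: a dropped consonant changes the key
print(consonant_key("Johnson"), consonant_key("Jonson"))

# Digits are kept, so street numbers remain part of an address key
print(consonant_key("123 Main St."))
```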


Tighter matching for critical healthcare data

Data that can be linked include sensitive medical records, hospital records, laboratory tests, insurance claims data and administrative databases. When used for research involving patient records, record linkage often involves matching information from multiple sources to create a single unified patient record identifier, sometimes called a Master Patient Identifier (MPI), that can be used to track and analyze health outcomes over time. By combining different datasets, researchers can gain insights into the effectiveness of treatments and interventions, as well as uncover patterns in disease progression or risk factors that would not be visible if looking at one dataset alone.

This allows researchers to gain insights into patient care outcomes by combining information from multiple sources and looking at patients over time. As data science developed, and much larger datasets became available, scholarly efforts to improve record matching began to emerge. Systems that compare text strings and score the difference have been among these methods. An algorithm known as Soundex compares text strings phonetically; the words "Mary" and "Merry" would have a low text-only score, but Soundex can add weight to the match because the words sound alike.
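For readers who want to see the phonetic idea in action, here is a compact sketch of the classic American Soundex coding: keep the first letter, turn the following consonants into digits by sound group, collapse adjacent repeats, and pad to four characters. The function name is ours, and real linkage tools often use refined variants, so take this as an illustration of the technique rather than any particular product's implementation.

```python
# Consonant groups that share a Soundex digit
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the four-character American Soundex code for a name."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")  # the first letter's own code counts for repeat detection
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so the same code after a vowel is kept
    return (first + "".join(digits) + "000")[:4]


print(soundex("Mary"), soundex("Merry"))
print(soundex("Robert"), soundex("Rupert"))
```

"Mary" and "Merry" receive the same code, which is exactly the extra evidence a matcher can add to a text-only comparison. The flip side is that unrelated names can share a code too, so a Soundex agreement is best treated as one weighted signal among several.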

Other fuzzy-logic methods exist, and can even be bought as part of record linkage software. "Standardization" essentially means making all of the same kinds of data appear the same way across different datasets. One such technique is address standardization, based either on proprietary technologies such as the CoLoCode technique developed by CarePrecise, or other, less precise, methods such as the USPS "Pub 28" standard. Getting mail delivered properly is important, to be sure, but the post office has the benefit of mail carriers' knowledge of their routes and the human ability to disambiguate on the fly. When comparing thousands or millions of rows of data, as is not unusual in medical research applications, "eyeballing" is not an option.
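A toy version of address standardization shows why it helps matching. The sketch below uppercases an address, strips punctuation, and swaps a handful of words for the abbreviations commonly used in postal addressing (STREET to ST, AVENUE to AVE, SUITE to STE, and so on). The abbreviation table is a tiny hand-picked subset, the function name is hypothetical, and the word-by-word substitution is deliberately naive: it would mishandle a street actually named "North Street," for instance. It is not the CoLoCode technique, which is proprietary and not described here.

```python
import re

# A small, hand-picked subset of common postal abbreviations (illustrative only)
ABBREVIATIONS = {
    "STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD",
    "ROAD": "RD", "DRIVE": "DR", "SUITE": "STE",
    "NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W",
}


def standardize_address(raw):
    """Uppercase, remove punctuation, and abbreviate known words, token by token."""
    text = re.sub(r"[^\w\s#]", " ", raw.upper())  # punctuation becomes whitespace
    tokens = text.split()                          # split() also collapses extra spaces
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)


print(standardize_address("123 North Main Street, Suite 200"))
print(standardize_address("123 n. main st. ste 200"))
```

Both inputs come out as the same string, so an exact comparison now succeeds where the raw text would have failed.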

Rather than get too deep in the weeds here, a fine elucidation on record linkage in medicine can be found on the National Library of Medicine website.


Benefits of record linkage technology in medicine

Data merged from many sources can provide a more comprehensive view of the patient, allowing researchers to make more accurate and reliable conclusions about healthcare outcomes. By combining multiple datasets, researchers can gain deeper insight into medical conditions and how treatments affect patients over time. It also makes it easier to compare health outcomes across different populations, as well as detect potential errors or risks in patient care. 

Additionally, record linkage technology can be used to reduce medical costs and improve efficiency in the healthcare system. By linking administrative databases with clinical data, researchers can better understand why certain treatments cost more than others and identify areas where cost savings can be made. This could lead to improved healthcare decisions, including changes in treatment protocols or resource allocations. 

Record linkage has also been used to analyze the prevalence of medical conditions in various populations, create predictive models for patient care, and identify potential drug interactions. All of these studies have helped to improve our understanding of healthcare outcomes and inform decisions about how best to provide care for different patient groups. 

Researchers at the University of California‐San Francisco used record linkage to combine patient records from different providers and examine how electronic medical records could be used to improve care coordination. 


Challenges in using record linkage technology 

Despite the many potential benefits of record linkage technology, there are still challenges that must be overcome. Lack of standardization between datasets can make it difficult for algorithms to identify matches, and data quality issues can lead to incorrect links or missing information. 

Additionally, privacy concerns arise when combining multiple datasets, as linking patient records can reveal identifying information about individuals. In order to ensure that patient data is kept secure and confidential, there must be safeguards in place to prevent unauthorized access or misuse of the information. This includes developing secure protocols for data sharing, as well as strong regulations for protecting patient privacy.

It is important to consider the ethics of combining multiple datasets in order to identify a single patient. This could lead to potential issues such as discrimination or stigmatization, and researchers must make sure that they are adhering to ethical codes when collecting and analyzing data. 

These issues must be addressed in order to ensure that record linkage technology is used responsibly and efficiently. Solutions such as secure data sharing protocols, improved standards for data quality, and rigorous processes for privacy can help researchers harness the power of record linkage technology while protecting patient privacy.
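The standardization problem mentioned above is easier to see with a small example. The following is a minimal sketch, not a production linkage method: it assumes two hypothetical patient records that carry `first`, `last`, and `dob` fields, and it shows how normalizing names before comparison lets an exact-match rule find a link that raw string comparison would miss. The records and field names are made up for illustration.

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Reduce a name to a comparable key: ASCII, lowercase, no punctuation."""
    # Decompose accented characters and drop the accents ("José" -> "Jose").
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Turn everything except letters and spaces into spaces, then collapse runs.
    letters_only = re.sub(r"[^a-z ]", " ", ascii_name.lower())
    return " ".join(letters_only.split())


def link_key(record: dict) -> tuple:
    """Hypothetical match key: normalized last name, first name, and birth date."""
    return (
        normalize_name(record["last"]),
        normalize_name(record["first"]),
        record["dob"],
    )


# Two made-up records for the same person, formatted differently by each source.
record_a = {"first": "Mary-Ann", "last": "O'Neil", "dob": "1980-02-14"}
record_b = {"first": "MARY ANN", "last": "O NEIL ", "dob": "1980-02-14"}

print(record_a == record_b)                      # raw comparison fails
print(link_key(record_a) == link_key(record_b))  # normalized keys agree
```

Real linkage systems typically go further, for example with probabilistic or fuzzy matching, but even this simple step shows why data standards matter so much to match quality.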


Examples of recent uses of advanced record linkage technology in medical research

April 18, 2023

How to Use Physician Compare to Extract Free Physician Information

Physician Compare Website
The Physician Compare website is a common and free way to acquire very basic physician data. Not only can you look up information on specific providers using the Physician Compare search tool, you can also download the physician and other clinician data as a set of CSV files. The files contain clinicians' NPI number, name, credentials, practice address, phone number, and specialties, along with some other useful data.

The Physician Compare data on the facility affiliations of the doctors and clinicians is very sparse, and doesn't even list the name of the facility, only its CCN identifier (CMS Certification Number) and PAC ID (PECOS Associate Control ID). For hospital and other facilities' names, addresses, and other data, you'll have to search and download numerous other files on the CMS website. CarePrecise acquires these from more than a dozen separate files. Alternatively, you can purchase the CarePrecise Advanced dataset that includes all of the clinicians' data plus the facilities' data.

Free Physician Data

Within the free Physician Compare data is the Doctors and Clinicians National Downloadable File. The file is too large to open in Excel, which has a 1,048,576-row limit. You will need software that can handle more records than that, and a way to integrate the file with the facility data in the next section, such as a SQL database, Microsoft Access, FileMaker Pro, or a similar relational database environment. (CarePrecise offers it all in an easy-to-use Microsoft Office format.) The file contains the following fields:

  • NPI (National Provider Identifier number)
  • Individual's PAC ID
  • Individual's Medicare Enrollment ID
  • Last Name, First Name, Middle Name, Suffix
  • Gender
  • Credential(s)
  • Medical school (for some)
  • Graduation year (a useful means of inferring approximate age)
  • Primary specialty
  • Secondary specialties
  • Whether the clinician offers telehealth services
  • Name of the group the clinician works with
  • Number of clinicians in the group
  • Practice address fields
  • Phone number
  • Whether the clinician accepts Medicare's approved amount as full payment
  • Whether the affiliated group accepts Medicare's approved amount as full payment
  • Reference Address ID, indicating the specific suite within the same practice address building
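For readers who want to try the database route, here is a minimal sketch of streaming a large CSV file into SQLite with Python's standard library. It is one possible approach, not the method CarePrecise uses. The function name `load_csv_to_sqlite` and the file and table names in the usage note are placeholders, and the sketch assumes a well-formed file whose first row holds the column names and whose rows all have the same number of fields.

```python
import csv
import sqlite3
from itertools import islice


def load_csv_to_sqlite(csv_path, db_path, table, batch_size=50_000):
    """Stream a CSV into a SQLite table in batches; return the rows loaded.

    Every column is stored as TEXT so identifiers keep any leading zeros.
    Assumes column names contain no double-quote characters.
    """
    conn = sqlite3.connect(db_path)
    try:
        # "utf-8-sig" also reads plain UTF-8 and drops a byte-order mark if present.
        with open(csv_path, newline="", encoding="utf-8-sig") as f, conn:
            reader = csv.reader(f)
            header = [name.strip() for name in next(reader)]
            columns = ", ".join(f'"{name}" TEXT' for name in header)
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
            placeholders = ", ".join("?" for _ in header)
            insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'
            total = 0
            while True:
                # Read only batch_size rows at a time to keep memory use flat.
                batch = list(islice(reader, batch_size))
                if not batch:
                    break
                conn.executemany(insert_sql, batch)
                total += len(batch)
        return total
    finally:
        conn.close()
```

A call such as `load_csv_to_sqlite("national_file.csv", "providers.db", "clinicians")` (both file names are placeholders) would create the table on first use and append rows to it. Adding an index on the NPI column afterward is usually worthwhile before joining to other files.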

Free Hospital and Other Facility Affiliation data

The Doctors and Clinicians Facility Affiliations file, which indicates the CCN numbers of hospitals and other medical facilities the doctors are affiliated with, contains these fields:

  • Clinician's NPI number
  • Clinician's Individual PAC ID
  • Clinician's name fields
  • Facility type (hospitals, long-term care, rehab, dialysis, etc.)
  • CCN number of the facility
  • CCN number of the parent/primary hospital where the clinician provides service

The file doesn't include the name or address of the facility. This file is too large to be used in Excel, which has a limitation of 1,048,576 rows.
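Once the files are in a relational database, attaching facility names is a matter of joining on the shared identifiers. The sketch below uses an in-memory SQLite database with made-up rows and hypothetical table and column names (`clinicians`, `affiliations`, `facilities`); the real CMS column headers differ, and the `facilities` table stands in for whatever separately downloaded facility file supplies the names.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    # CCNs and NPIs are stored as TEXT so leading zeros are preserved.
    conn.executescript("""
        CREATE TABLE clinicians (npi TEXT PRIMARY KEY, last_name TEXT);
        CREATE TABLE affiliations (npi TEXT, facility_type TEXT, ccn TEXT);
        CREATE TABLE facilities (ccn TEXT PRIMARY KEY, facility_name TEXT);
        INSERT INTO clinicians VALUES
            ('0000000001', 'Jones'),
            ('0000000002', 'Smith');
        INSERT INTO affiliations VALUES
            ('0000000001', 'Hospital', '000001'),
            ('0000000002', 'Hospital', '000002');
        INSERT INTO facilities VALUES ('000001', 'Example General Hospital');
    """)
    return conn


def affiliations_with_names(conn):
    """Join clinicians to affiliations by NPI, then to facilities by CCN."""
    return conn.execute("""
        SELECT c.npi, c.last_name, a.ccn, f.facility_name
        FROM clinicians AS c
        JOIN affiliations AS a ON a.npi = c.npi
        LEFT JOIN facilities AS f ON f.ccn = a.ccn
        ORDER BY c.npi
    """).fetchall()


conn = build_demo_db()
for row in affiliations_with_names(conn):
    print(row)
```

The `LEFT JOIN` is a deliberate choice here: an affiliation whose CCN has no match in the facility table still appears in the result, with an empty facility name, which makes gaps in the facility data easy to spot.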

Other files available in the Physician Compare download include:

  • Doctors and Clinicians Quality Payment Program PY 2021 Clinician Public Reporting: Overall MIPS Performance
  • Doctors and Clinicians Quality Payment Program PY 2021 Group Public Reporting: MIPS Measures
  • Doctors and Clinicians Quality Payment Program PY 2021 Group Public Reporting: Patient Experience
  • Doctors and Clinicians Quality Payment Program PY 2021 Virtual Group Public Reporting
  • Doctors and Clinicians 2020 Clinician Utilization Data

None of these files include licensed data, such as board certification or residency information.

Conclusion

There is a lot of useful free information in the Physician Compare downloadable files, but pulling it together with the more robust data in the NPI registry – the National Plan and Provider Enumeration System (NPPES) file – is more than a little bit difficult, requiring special methods for dealing with the 7.5 million-record file, and some relational database chops, as well. The hospital affiliations include the CCN number, but not the name, address, phone, etc. of the facilities, requiring additional search and extraction steps. For users who have mastered using these free files but need these additional data, CarePrecise offers data packages that can easily be linked to the Physician Compare data, or they can skip downloading and processing the free files themselves and go to CarePrecise for the combined ready-to-use dataset.

For deeper data on the wide range of U.S. healthcare facilities, CarePrecise also offers the Authoritative Hospital Database, with data on more than 50,000 facilities.

CarePrecise offers its customers free guidance in finding free, downloadable healthcare provider data to fill a wide variety of needs, and works with many research programs that require highly specialized healthcare provider information.

April 6, 2023

About CP Preferred Email™ for Healthcare Providers

We are often asked what we mean by "CP Preferred Email," our trademarked, proprietary email curation system. The "CP" part is obvious. The "Preferred" part, in a nutshell, is the way CarePrecise acquires, verifies, and maintains high quality email addresses for physicians, nurses, dentists, chiropractors, and other U.S. healthcare practitioners. Since we offer a 95% deliverability money-back guarantee, it is critically important to our business that the medical email addresses we sell are solid.

"Preferred" Physician Email Addresses?

When doctors and allied health professionals sign up for conferences and CME opportunities, subscribe to medical journals, and join medical organizations of various kinds, they're asked for their email addresses, and whether they want the organization to share with other organizations within the healthcare industry. You might be surprised to learn that a majority do "opt-in" to sharing. This doesn't mean that they have opted in for your particular use, but that they are OK with having their email used for a variety of medically related purposes. This is one way they can get timely information on new conferences, continuing education opportunities, new medical products and services, etc. It's where we source the vast majority of medical professionals' email information. 

Other sources include places where they give an email address for goods and services and have permitted sharing, and the small number of places where they are required to publicly share an email contact. We do NOT "screen scrape" to get email addresses surreptitiously from websites. That's just not cool, and could lead to unhappy addressees (and unhappy email customers, of course).

Email addresses sometimes "go dark." How do we keep up?

How does CP Preferred Email follow a physician when they move from one practice to another, or add an entrepreneurial venture email address, or one for a teaching position? And how do we know which address is the best one to use to get the email into the environment most likely to result in it being opened, read, and responded to positively?

The answer is partly a "secret recipe," of course. We wouldn't want every other medical email vendor to compete with us. But it's also partly just common sense, and we'll gladly explain.

First of all, we only use good sources. When we get new email addresses we immediately verify every one for deliverability. We also use a proprietary system to determine whether the email is addressing the correct practitioner (we track them by NPI number so we can be certain). There are many components in this step that take into account IP address locations, and whether the domain and local-part components of the address make sense (our own proprietary AI), among other factors. All this happens before a single email is sent.

And then there's the vitally important technique of Spending Gobs Of Money On Fresh Email Addresses So Often It Makes Our Finance Director Cringe. Keeping up with moving clinician emails requires it.

Anatomy of an email address

There are two primary components to an email address. The "domain" is the part following the "@" sign, and this may contain both the primary domain and a subdomain:

maryjones@[bighospital.com] or maryjones@[cardio.bighospital.com]

The domain and subdomain may be used anywhere in the world that has Internet access. The local-part generally identifies the specific mailbox associated with, for instance, a physician assistant's email address, and is usually the person's name:

[maryjones]@bighospital.com or [Dr.Bob]@GeriatricAssociates.co

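Note that, in practice, email addresses are not case-sensitive. The domain is never case-sensitive, and although the email standards technically allow a server to treat the local-part as case-sensitive, nearly all servers treat uppercase and lowercase letters the same. People can use as many uppercase letters as they like.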

Many of the special characters may be used in the local-part of an email address, notably !#$%&'*+-/=?^_`{|}~ (and the ever-popular "." as long as it's not the first or last character). So, depending on the specifics of a given email server's configuration, you may see some nutty or genuinely creative email addresses similar to ///pediatric.care///@hugehealthsystem.org, or even Hi,Kids!_:-}@jonespediatric.com or You-Like-Ice-Cream?_It'll-Put-On-###_And-Costs-$$$!@weightclinic.com. No kidding.

But knowing whether an email address is a "well-formed email string" means little in determining quality. 
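Well-formedness may say little about quality, but the anatomy described above is easy to express in code. The following is a minimal sketch with hypothetical function names (`split_email`, `is_plain_local_part`). It checks only the plain, unquoted form of the local-part against the character list given above, adds the standard rule that dots may not appear twice in a row, and makes no attempt at full standards-level validation or deliverability checking.

```python
import string

# The special characters listed above, plus letters, digits, and the dot.
ALLOWED_SPECIALS = "!#$%&'*+-/=?^_`{|}~"
ALLOWED_LOCAL_CHARS = set(
    string.ascii_letters + string.digits + ALLOWED_SPECIALS + "."
)


def split_email(address: str) -> tuple:
    """Split an address at its last "@" into (local_part, domain).

    The domain is lowercased because domains are never case-sensitive;
    the local-part is returned exactly as written.
    """
    local_part, at_sign, domain = address.rpartition("@")
    if not at_sign or not local_part or not domain:
        raise ValueError(f"not a two-part email address: {address!r}")
    return local_part, domain.lower()


def is_plain_local_part(local_part: str) -> bool:
    """Check an unquoted local-part against the character rules above."""
    if not local_part:
        return False
    # A dot may not come first or last, and may not be doubled.
    if local_part.startswith(".") or local_part.endswith(".") or ".." in local_part:
        return False
    return all(ch in ALLOWED_LOCAL_CHARS for ch in local_part)


local, domain = split_email("MaryJones@Cardio.BigHospital.com")
print(local, domain)
print(is_plain_local_part("///pediatric.care///"))
print(is_plain_local_part(".maryjones"))
```

Passing this kind of check means only that the string has a legal shape. Whether the mailbox exists, belongs to the right practitioner, and gets read is a separate question.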

Crowdsourcing for the highest quality medical email addresses

Through our gathering process we collect more than one email address for many individuals. Working with some of our higher-volume email customers, we ingest email bounce reports containing data on up to hundreds of thousands of email addresses per report. From this we get information on which email was opened, which addressees opted out of future mailings from that organization or program, and many more technical clues. Real world data not only lets us know which email is getting through to the mailbox, but also which ones were deleted without being opened, which ones were opened and responded to, and which ones produced a real ROI based on information shared with us by users. This process lets us tag which of the several email addresses for Dr. Jones is being responded to in a positive way. It may become one of our "Preferred" email addresses, if, that is, it meets other quality criteria and maintains its quality over a period of time.
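To make the general idea concrete, here is a purely illustrative sketch of picking the best-performing address per clinician from campaign feedback. It is not CarePrecise's proprietary system. The event format, the outcome labels, the ranking rule (responses first, then opens, then deliveries), and the function name `pick_best_address` are all assumptions made for this example, and the sample data is made up.

```python
from collections import Counter, defaultdict

KNOWN_OUTCOMES = {"bounced", "delivered", "opened", "responded"}


def pick_best_address(events):
    """Return {npi: email} with the most engaged address for each NPI.

    events is an iterable of (npi, email, outcome) tuples. Addresses that
    have only ever bounced are never selected.
    """
    tallies = defaultdict(Counter)  # (npi, email) -> count of each outcome
    for npi, email, outcome in events:
        if outcome not in KNOWN_OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        tallies[(npi, email)][outcome] += 1

    best = {}  # npi -> (score, email)
    for (npi, email), counts in tallies.items():
        score = (counts["responded"], counts["opened"], counts["delivered"])
        if score == (0, 0, 0):
            continue  # nothing but bounces for this address
        # Tuples compare element by element, so responses outrank opens,
        # and opens outrank deliveries. Ties keep the first address seen.
        if npi not in best or score > best[npi][0]:
            best[npi] = (score, email)
    return {npi: email for npi, (_, email) in best.items()}


sample_events = [
    ("0000000001", "a@example.org", "delivered"),
    ("0000000001", "b@example.org", "opened"),
    ("0000000001", "b@example.org", "responded"),
    ("0000000001", "c@example.org", "bounced"),
    ("0000000002", "d@example.org", "bounced"),
]
print(pick_best_address(sample_events))
```

A real curation process would also need the further steps described above, such as additional quality criteria and tracking whether an address holds up over time.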

Rigorous medical email verification

Nowhere else is it more important to respect an email recipient's time and attention than among medical clinicians. We check out every prospective email purchaser and grade them for worthwhile usage. That is, we simply do not sell to spammers. But beyond that, we re-verify all of our email addresses every few weeks to eliminate the ones that have gone dark, and those that we suspect of being problematic for a number of reasons. We don't throw out addresses only because they don't get a top deliverability score from a bulk verification service. If we did that, we would be throwing out many of our best emails – ones that are performing strongly in our Preferred email system. Ultimately, our customers don't just want email addresses that can get an automated score of some arbitrary level, they want email that actually performs.

Why are email addresses so much more expensive than other contact information?

Good email addresses are hard to come by. Our sources guard them jealously as a valuable business asset. You can't just go to Wikipedia and search for "all U.S. physician email addresses." Wouldn't that be nice? Or head on over to the American Dental Association and ask them for a free download of their membership email list. But nope. Won't work. There is no open government resource for email addresses (with vanishingly few exceptions) as there is for phone numbers and practice addresses. If you're willing to buy a large number of emails, chances are you can score a few hundred thousand nurse emails from a company that deals in low quality and high volume (in other words, the email addresses used by spammers), and then you can run them by a verification checker and throw away the stinkers (most of them). But will they produce returns? Our customers don't seem to think so. Yes, our pricing isn't as low as the screen-scraped (and sometimes stolen) addresses you might find, but ours are up-to-date, verified, and guaranteed.

Our customers make it happen

If not for the customers who send millions of messages using our email addresses, we wouldn't have "Preferred" email. Many use our email addresses to communicate with existing and prospective provider network members. For these uses and for medical marketing, the healthcare industry is a special case among email usage. To acquire email addresses from CarePrecise, our users must agree to abide diligently by the terms of the U.S. CAN-SPAM Act of 2003. Beyond that, CarePrecise offers medical email best practices that not only protect our users' reputations and their domains from blacklisting, but also help keep email communication pertinent to the recipient, and respectful of their attention and time. Together, CP Preferred Email and our users form an important piece of U.S. healthcare industry communication.