State by state, what's your infectious disease rate?

While the CDC is considered the nation’s leading authority on infectious diseases, it is also vitally important for each state in the country to understand the rate of infectious diseases in their area. We saw the need for this firsthand during the COVID-19 pandemic when states had to make timely decisions about quarantine, wearing masks in public, sending children to school, and other difficult choices that depended on the spread of the virus in their particular state. 

The number of COVID-19 cases varied widely by state, with California reporting a high of over 12 million cases and Vermont only 152,000.1 While each state collects this information differently, all state health departments have to rely on some form of self-reporting from local health departments, physicians, hospitals, labs, and other care providers. But in an emergency, is this timely and reliable enough? Even outside a crisis, if your state is experiencing a high rate of certain infectious diseases (from the seasonal flu to sexually transmitted diseases to food-borne pathogens), is the reported information enough to gain a true understanding of the situation and make the necessary policy decisions?

Real-world data to the rescue

What if states had direct access to the data collected by physicians, hospitals, labs and insurance companies? This real-world data (RWD), in the form of electronic medical records (EMR), hospital chargemasters, lab results and medical claims, can provide the insight that state agencies need to understand disease spread, predict potential outbreaks, allocate resources and make any necessary policy decisions to protect public health.

There are, however, several challenges when utilizing RWD:

  • Patient privacy - RWD contains personally identifiable information (PII) about a patient and, therefore, state agencies need to ensure that appropriate measures are taken to protect the patient’s privacy and make the data HIPAA compliant.

  • Data interoperability - When using multiple RWD sources or integrating RWD with state agency data, it needs to be interoperable and in a common format for proper analysis.

  • Matching accuracy - Patient records need to be accurately matched across data sources to provide a longitudinal view of that person’s healthcare journey. This is harder once the records are de-identified to be HIPAA compliant, and errors in either direction are costly. A false positive, where two different individuals are matched as one, could lead to inaccurate or contradictory conclusions. Conversely, a false negative, where the same person is counted as separate individuals, can lead to a fragmented view of the patient journey with short medical histories and potential blind spots to key indicators, and it could inflate infected population numbers.

  • Appropriate data sources - State agencies need to be able to source data that is representative of their area and includes the insights they need to make decisions. This could include lab results, hospital stays, previous medical histories or other comorbid conditions.
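To make the matching-accuracy challenge above concrete, here is a minimal Python sketch of how linkage errors distort a simple count of infected individuals. The records, names and IDs are entirely made up for illustration, and the "true person" column is only known because the data is synthetic.

```python
# Hypothetical linkage output: each row is (true_person, linked_id, tested_positive).
# These rows are synthetic; no real data or real linkage system is represented.

# Scenario A: a false negative splits one person across two linked IDs.
SPLIT = [
    ("alice", "id-1", True),
    ("alice", "id-2", True),   # same person, wrongly given a second ID
    ("bob",   "id-3", True),
]

# Scenario B: a false positive merges two people under one linked ID.
MERGED = [
    ("bob",  "id-3", True),
    ("dave", "id-3", True),    # different person, wrongly given Bob's ID
]


def count_positive(records, use_linked_id):
    """Count distinct positive individuals by true identity or by linked ID."""
    column = 1 if use_linked_id else 0
    return len({row[column] for row in records if row[2]})


for name, records in (("split", SPLIT), ("merged", MERGED)):
    actual = count_positive(records, use_linked_id=False)
    observed = count_positive(records, use_linked_id=True)
    print(f"{name}: actual={actual}, observed after linkage={observed}")
```

In the first scenario the observed count is higher than the actual number of infected people, and in the second it is lower, which is why both kinds of error matter when a state is estimating its infectious disease rate.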

A synchronized solution

Privacy-Preserving Record Linkage (PPRL) is a method recognized by government agencies for integrating data sources while maintaining privacy. While a few PPRL solutions have emerged in the market, there are certain gold standards that enable accurate de-identification and patient matching capabilities, providing the ability to synchronize disparate data across multiple sources into a completely interoperable and HIPAA-compliant view of a patient’s healthcare journey. 
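As a rough illustration of the general idea behind PPRL, the Python sketch below replaces identifying fields with a keyed hash so that two data sources can be joined on a token rather than on PII. This is a simplified, generic technique and not a description of any particular vendor's method; the key, the field choices and the function names are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical secret key. In practice it would be managed by a trusted party
# and never shared alongside the tokenized data.
SECRET_KEY = b"hypothetical-key-held-by-a-trusted-party"


def normalize(first, last, dob):
    """Trim and lowercase fields so trivial formatting differences do not matter."""
    return "|".join(part.strip().lower() for part in (first, last, dob))


def make_token(first, last, dob, key=SECRET_KEY):
    """Return a keyed SHA-256 token that stands in for the identifying fields."""
    message = normalize(first, last, dob).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


# Two sources hold the same (fictional) person with different formatting.
clinic_token = make_token("Ada", "Lovelace", "1815-12-10")
lab_token = make_token(" ada ", "LOVELACE", "1815-12-10")
print(clinic_token == lab_token)
```

A token like this only lines up when the normalized fields agree exactly, so a typo or a missing field in either source would produce a different token. Tokenization alone also does not make a dataset HIPAA compliant.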

As an example, the HealthVerity PPRL solution, Identity Manager, has been granted Federal Risk and Authorization Management Program (FedRAMP) authorization for its high-security standards, which align with state-related security requirements.

Beyond simply replacing the PII, HealthVerity matches patient identities to a continuously updated referential database of over 200 billion healthcare and consumer transactions and leverages machine learning techniques to ensure the highest accuracy rate. We also utilize probabilistic matching, which can handle the inherent typos, errors and missing fields found in RWD, as opposed to deterministic matching employed by legacy technologies, which requires an exact match. This results in 10x greater accuracy than other de-identification solutions. Additionally, in lieu of PII, each individual is assigned a unique but persistent universal identifier, known as a HealthVerity ID (HVID), creating a single source of truth that enables patient records to be reliably synchronized across time and data sources. An independent third-party reviewer ensures all data matched using this PPRL technology is HIPAA compliant via Expert Determination. This industry-leading process makes the data fully interoperable and research ready from day one.
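The difference between deterministic and probabilistic matching can be sketched in a few lines of Python. The example below is a simplified stand-in that uses weighted string similarity from the standard library; it is not HealthVerity's algorithm and not a full statistical linkage model, and the field weights, the 0.85 threshold and the sample records are assumptions chosen only for illustration. It also compares plain-text fields for readability, whereas a PPRL workflow would operate on de-identified data.

```python
from difflib import SequenceMatcher

# Hypothetical field weights and threshold, chosen only for this example.
WEIGHTS = {"first_name": 0.3, "last_name": 0.3, "dob": 0.3, "zip": 0.1}
THRESHOLD = 0.85


def normalize(value):
    """Lowercase and collapse whitespace."""
    return " ".join(value.lower().split())


def deterministic_match(a, b):
    """Exact-match rule: every field must agree after normalization."""
    return all(normalize(a[f]) == normalize(b[f]) for f in WEIGHTS)


def probabilistic_score(a, b):
    """Weighted average similarity over the fields present in both records."""
    score, total = 0.0, 0.0
    for field, weight in WEIGHTS.items():
        if not a.get(field) or not b.get(field):
            continue  # a missing field is skipped rather than treated as a mismatch
        ratio = SequenceMatcher(None, normalize(a[field]), normalize(b[field])).ratio()
        score += weight * ratio
        total += weight
    return score / total if total else 0.0


# Fictional records: the second has a typo in the first name and no ZIP code.
clinic = {"first_name": "Katherine", "last_name": "Johnson", "dob": "1984-03-12", "zip": "19103"}
lab = {"first_name": "Kathrine", "last_name": "Johnson", "dob": "1984-03-12", "zip": ""}
other = {"first_name": "Robert", "last_name": "Miller", "dob": "1971-11-02", "zip": "05401"}

print(deterministic_match(clinic, lab))                # exact rule rejects the pair
print(probabilistic_score(clinic, lab) >= THRESHOLD)   # similarity score accepts it
print(probabilistic_score(clinic, other) >= THRESHOLD) # a different person is rejected
```

The exact-match rule misses the clinic and lab records because of a single typo and a blank field, while the similarity score still links them and keeps the unrelated record apart. Where the threshold is set determines the trade-off between false positives and false negatives.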

Using this synchronized solution has enabled HealthVerity to create the nation’s largest fully interoperable and HIPAA-compliant healthcare and consumer data ecosystem, allowing state agencies to easily discover data representative of their area in HealthVerity Marketplace, licensing only what they need to better understand the infectious disease rate in their state. HealthVerity Marketplace offers a self-service cloud solution where state agencies can build custom cohorts and see patient counts and overlaps in real time. Agencies can also seamlessly weave in their proprietary data to enrich datasets, enabling them to instantly spot gaps in care and avoid licensing duplicative data. This provides the most comprehensive view of the patient journey to protect public health.



1. Statista. Total number of COVID-19 cases in the United States as of March 10, 2023, by state. https://www.statista.com/statistics/1102807/coronavirus-covid19-cases-number-us-americans-by-state/
