What is PPRL? A smarter way to link data securely and accurately

In this edition of our Ask an Expert series, we turn to Emery Niemiec, Principal of Government Sales at HealthVerity, for insight into one of the most powerful technologies enabling secure, privacy-compliant data integration today: privacy-preserving record linkage, or PPRL. With experience helping organizations across life sciences, government, and beyond connect disparate datasets responsibly, Emery breaks down what PPRL is, why it matters, and how HealthVerity is setting a new standard for trusted data exchange.

In today’s digital landscape, data is everywhere, but it’s rarely connected. Whether you work in public health, life sciences, government programs, insurance, finance, marketing, or another field, data is often scattered across systems, institutions, and formats. This makes it incredibly difficult to form a complete view of an individual or population, especially when privacy must be preserved.

So, what is privacy-preserving record linkage (PPRL)?

PPRL, or privacy-preserving record linkage, is a powerful data integration technique that allows organizations to connect de-identified data across disparate sources without compromising privacy.

PPRL is a privacy-enhancing technology (PET) that enables two or more organizations to link records about the same individual across different datasets, all without sharing personally identifiable information (PII) or protected health information (PHI). Instead of sending raw identifiers like names or Social Security numbers, each organization exchanges only de-identified, encrypted hashes of those identifiers. This solves the challenge of linking sensitive data accurately and securely, producing insights that could not be derived from a single data set.

At HealthVerity, Identity Manager provides this robust PPRL solution so you can de-identify patients with ease and link them securely across data sets.

Figure 1: An example of how HealthVerity Identity Manager enables privacy-preserving record linkage across disparate data sets while remaining fully privacy-compliant.

Why PPRL is critical in a fragmented data world

Across industries, organizations struggle with data silos: separate systems and vendors that each hold a slice of information but lack a way to safely connect the dots. This fragmentation limits insight, slows research, and increases risk.

For example:

  • A pharmaceutical company conducting safety studies might have data on prescription volume but be unable to access clinical information at scale to understand outcomes.

  • A marketing team can see the number of views for any given advertisement but has no way to assess its effectiveness.

  • A government agency may monitor disease outbreaks or public health trends but lack a unified view across hospitals, labs, and geographic regions.

Without PPRL, these linkages are either impossible or risky. With PPRL, they’re secure, scalable, and privacy-compliant.

"Organizations that adopt PPRL can build deeper insights faster, all while protecting the people behind the data. At HealthVerity, PPRL is embedded in everything we do, making secure, trusted data sharing with unprecedented accuracy and large-scale interoperability possible."

- Emery Niemiec, Principal of Government Sales

How PPRL works: a step-by-step process

So how does privacy-preserving record linkage work?

Here’s a simplified breakdown:

  1. Local de-identification: Each data owner installs secure software that replaces PII (e.g., names, birth dates) with de-identified hashes behind their firewall and encrypts those hashes for greater security before transmission.

  2. Encrypted matching: These encrypted records are matched using advanced techniques (such as probabilistic matching and Bloom filter technology), which account for typos, name variations, and data entry errors when matching individuals across disparate data silos.

  3. Assignment of a unique identifier: If records are found to refer to the same person, they’re linked using a persistent, universal identifier (like the HealthVerity HVID), enabling fully interoperable and longitudinal data across various systems.

Importantly, no raw PII is ever transferred or exposed.
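To make the three steps above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not HealthVerity's implementation: the keyed hash (HMAC-SHA256), the `SECRET_KEY` value, the normalization rules, and the `ID-0001` identifier format are all hypothetical choices. It also simplifies step 2 to exact token equality and omits the extra encryption layer applied before transmission.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical shared secret for illustration; real systems manage keys securely.
SECRET_KEY = b"example-shared-secret"


def normalize(value: str) -> str:
    """Lowercase, strip accents, and drop non-alphanumeric characters."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.lower() if ch.isalnum())


def make_token(first: str, last: str, dob: str) -> str:
    """Step 1: replace PII with a keyed hash, computed by the data owner."""
    message = "|".join(normalize(part) for part in (first, last, dob))
    return hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link(*token_sets):
    """Steps 2-3 (simplified): equal tokens share one persistent identifier."""
    ids = {}
    for tokens in token_sets:
        for token in tokens:
            # Assign the next identifier only the first time a token is seen.
            ids.setdefault(token, f"ID-{len(ids) + 1:04d}")
    return ids


# Two data owners tokenize locally; only tokens leave their systems.
pharmacy = [make_token("Ana", "García", "1980-02-14")]
lab = [make_token(" ANA ", "Garcia", "1980-02-14"),
       make_token("Bo", "Lee", "1975-07-01")]

ids = link(pharmacy, lab)
print(ids[pharmacy[0]] == ids[lab[0]])  # same person, same identifier
print(len(ids))                         # number of distinct people found
```

Because the linking function only ever sees tokens, the names and birth dates never leave the data owners' code paths. Exact token equality cannot absorb typos, which is why production systems layer fuzzy matching on top.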

What makes HealthVerity PPRL different (and better)

Legacy matching approaches rely on deterministic matching: exact matches between fields like name and date of birth. But real-world data is messy. Misspellings, missing info, or changes in address can throw off those matches.

That’s why PPRL uses probabilistic methods and advanced machine learning to detect and correct for variation, delivering much lower false positive and false negative rates.
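The difference between exact and typo-tolerant matching can be illustrated with a Bloom filter encoding, a technique commonly used in PPRL. The sketch below is a simplified illustration rather than HealthVerity's method: the filter length, the number of hash functions, the use of character bigrams, and the Dice similarity score are all assumptions chosen for clarity, and an unkeyed SHA-256 stands in for the keyed hashing a real deployment would use.

```python
import hashlib

FILTER_BITS = 256   # assumed filter length
NUM_HASHES = 4      # assumed number of hash functions per bigram


def bigrams(text: str) -> set:
    """Split a padded, lowercased string into overlapping two-character chunks."""
    padded = f"_{text.lower().strip()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(text: str) -> set:
    """Return the set of bit positions switched on for this string."""
    bits = set()
    for gram in bigrams(text):
        for seed in range(NUM_HASHES):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            bits.add(int.from_bytes(digest[:4], "big") % FILTER_BITS)
    return bits


def dice(a: set, b: set) -> float:
    """Dice coefficient: 2 * |A & B| / (|A| + |B|), between 0 and 1."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


reference = bloom_encode("Katherine")
misspelled = bloom_encode("Kathrine")   # one letter dropped
unrelated = bloom_encode("Michael")

print("exact match on raw text:", "Katherine" == "Kathrine")
print("similarity to misspelling:", round(dice(reference, misspelled), 2))
print("similarity to unrelated name:", round(dice(reference, unrelated), 2))
```

A deterministic comparison rejects the misspelled name outright, while the encoded comparison scores it far higher than the unrelated name, since most bigrams still overlap. A matcher would then accept pairs above a chosen threshold, and the threshold trades false positives against false negatives.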

For example, PPRL technology at HealthVerity boasts a:

  • 0.2% false positive rate: Compared to industry standards of up to 2% false positives.

  • 3–5% false negative rate: Compared to industry standards of up to 42% false negatives.

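That’s a tenfold reduction in false positives and roughly an 8- to 14-fold reduction in false negatives relative to the upper end of those industry figures, a major leap in both data quality and trust. (Figure 2)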

Figure 2: HealthVerity PPRL compared to industry standards for false positive, false negative, and matching accuracy rates.
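The arithmetic behind these comparisons can be checked in a few lines of Python. This sketch uses only the rates quoted above and takes the "up to" industry figures at their upper end. The function names are hypothetical, and the second function assumes the false negative rate means the share of true matches that go unlinked, a definition the article does not spell out.

```python
def fold_improvement(benchmark_pct: float, observed_pct: float) -> float:
    """How many times smaller the observed error rate is than the benchmark."""
    return benchmark_pct / observed_pct


def missed_matches(true_matches: int, false_negative_pct: float) -> int:
    """True matches left unlinked, assuming the rate is a share of true matches."""
    return round(true_matches * false_negative_pct / 100)


# False positives: 2% benchmark vs. 0.2%.
print(round(fold_improvement(2.0, 0.2), 1))

# False negatives: 42% benchmark vs. the 3-5% range.
print(round(fold_improvement(42.0, 5.0), 1), round(fold_improvement(42.0, 3.0), 1))

# What that means for 1,000 people who truly appear in both datasets.
print(missed_matches(1000, 42.0), missed_matches(1000, 5.0), missed_matches(1000, 3.0))
```

Expressed as counts, the false negative gap is the difference between losing several hundred of every thousand true links and losing a few dozen.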

Who benefits from PPRL?

While PPRL has gained traction in public health and government use cases, its value extends far beyond.

Healthcare and life sciences

  • Link EMR, claims, lab, and social data to form a longitudinal patient view.

  • Improve clinical research, outcomes tracking, and precision medicine.

Retail and consumer insights

  • Merge online/offline customer interactions securely.

  • Enable personalization without compromising user privacy.

Finance and insurance

  • Detect fraud by linking customer behavior across institutions.

  • Maintain compliance while improving risk profiling.

Research and academia

  • Collaborate across institutions with linked datasets.

  • Preserve subject privacy while enhancing analytical power.

The value of PPRL: efficiency, privacy, and trust

PPRL isn’t just a privacy tool. It unlocks access to richer, more complete datasets while:

  • Preserving privacy and security at every step

  • Accelerating data readiness by eliminating manual matching

  • Improving interoperability across diverse data environments

  • Reducing regulatory burden by minimizing PII exposure

Organizations that adopt PPRL can build deeper insights faster, all while protecting the people behind the data. At HealthVerity, PPRL is embedded in our IPGE framework and is a part of everything we do. If you want to learn more about identity, provenance, governance, and exchange (IPGE), explore the HealthVerity IPGE blog series.

Key takeaways

As data becomes more essential to innovation and decision-making, the ability to link that data responsibly is non-negotiable. If you’ve been asking “What is PPRL?”, the answer is simple:

PPRL is the key to unlocking the full value of your data, without unlocking your users’ privacy.

Whether you’re in healthcare, retail, finance, or research, now is the time to explore how privacy-preserving record linkage can transform your data strategy and empower you to connect, analyze, and act with confidence.