When two federal agencies, the National Center for Science and Engineering Statistics (NCSES) and the National Center for Health Statistics (NCHS), needed a demonstration to determine the feasibility of securely linking sensitive government data without exposing any personally identifiable information (PII), they turned to HealthVerity for privacy-preserving record linkage (PPRL) and Mathematica for data analysis and validation.
Through the America’s DataHub Consortium and under the National Science Foundation (NSF), this important PPRL project demonstrated that cross-agency data linkage could be done securely, accurately, and efficiently using HealthVerity commercial linkage tools such as Identity Manager, a powerful PPRL solution.
The result was an impactful report that demonstrates that if the U.S. government trusts HealthVerity to protect its data, you can too.
A foundation of privacy, precision, and stability
The NSF PPRL project is a validation of the HealthVerity FedRAMP-Moderate Identity Manager Solution, the same technology that powers secure data linkage for the nation’s largest healthcare and consumer data ecosystem. Because the privacy-preserving processes are the ones trusted by federal agencies, pharmaceutical clients can now benefit from that verified, government-tested precision when linking real-world data for clinical research, HEOR, and commercial strategy.
Implication for commercial pharma clients
The NSF PPRL project proved that HealthVerity privacy-preserving linkage isn’t just government-grade; it’s also industry-ready. The same verified technology that secured federal datasets now enables pharmaceutical partners to unify real-world data across claims, labs, and EMR sources with speed, compliance, and confidence.
Main findings of the summary report
HealthVerity and Mathematica delivered significant advancements in data privacy and utility through their collaboration on the NSF/NCSES–NCHS project:
1. Successful linkage of two federal datasets
- HealthVerity securely linked the Survey of Earned Doctorates (SED) from NCSES and the National Health Interview Survey (NHIS) from NCHS, two agencies that had never shared data before.
- This demonstrated, for the first time, that two federal statistical agencies could share and link data without exchanging any personally identifiable information (PII), in full compliance with CIPSEA, HIPAA and FedRAMP privacy and security standards.
2. Deployment of a FedRAMP-authorized secure PPRL environment
- HealthVerity provisioned a dedicated, FedRAMP Moderate-certified cloud environment to handle the linkage, meeting stringent federal security requirements.
- The company obtained an Authority to Operate (ATO) from NSF, confirming HealthVerity compliance with federal IT security protocols.
- All project data were isolated and destroyed within 30 days of delivery to the NCSES Secure Data Access Facility, with a formal Certificate of Destruction issued.
3. High-accuracy privacy-preserving record linkage
- Using the HealthVerity AI/ML-enhanced PPRL engine, we de-identified, tokenized, and matched ~93% of the individuals in these datasets to a corresponding HVID in our master patient list (94.9% for NHIS and 91.8% for SED), demonstrating strong accuracy and scalability at federal scale.
- This approach ensured that linkages were privacy-compliant and non-reversible, with no identifiable information ever exchanged or retained.
4. Establishment of a reusable interagency data sharing model
- HealthVerity helped design and document the Data Sharing Agreement (DSA) and Software License Agreement (SLA) frameworks used between NCSES and NCHS, which are now being considered as a template for future interagency linkages under the National Secure Data Service (NSDS) initiative.
- These agreements captured stakeholder roles, data handling standards, and encryption procedures that can guide future federal projects.
5. Proof of HealthVerity technical and governance leadership
- HealthVerity not only provided the PPRL technology but also led project management, compliance, and coordination across multiple federal teams and its partner, Mathematica.
- The project’s final recommendations to NSF emphasized that commercial PPRL solutions like HealthVerity Identity Manager provide greater accuracy, stronger support, and better scalability than open-source or homegrown options, effectively validating the HealthVerity model as a best practice for federal data linkage.
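For readers who want a concrete picture of the de-identify, tokenize, and match flow described in finding 3, the following is a minimal, generic sketch of token-based linkage. It is not the HealthVerity Identity Manager implementation, which is not described in the report summary above. It assumes a simple keyed one-way hash (HMAC-SHA256) over normalized fields, and the key, field choices, function names, and sample records are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret for illustration only; a real deployment would
# manage keys separately from the data and would not hard-code them.
SECRET_KEY = b"example-key-for-illustration-only"


def normalize(first: str, last: str, dob: str) -> str:
    """Standardize fields so trivial formatting differences do not block a match."""
    return "|".join(part.strip().lower() for part in (first, last, dob))


def tokenize(first: str, last: str, dob: str, key: bytes = SECRET_KEY) -> str:
    """Replace identifying fields with a keyed, one-way token (HMAC-SHA256)."""
    message = normalize(first, last, dob).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def link(tokens_a: dict[str, str], tokens_b: dict[str, str]) -> list[tuple[str, str]]:
    """Pair record IDs from two datasets whose tokens are equal; no PII is compared."""
    by_token = {token: rec_id for rec_id, token in tokens_b.items()}
    return [
        (rec_id, by_token[token])
        for rec_id, token in tokens_a.items()
        if token in by_token
    ]


# Each party tokenizes its own records locally and shares only IDs and tokens.
dataset_a = {
    "a1": tokenize("Ada", "Lovelace", "1815-12-10"),
    "a2": tokenize("Alan", "Turing", "1912-06-23"),
}
dataset_b = {
    "b1": tokenize(" ada ", "LOVELACE", "1815-12-10"),
    "b2": tokenize("Grace", "Hopper", "1906-12-09"),
}

matches = link(dataset_a, dataset_b)
match_rate = len(matches) / len(dataset_a)  # share of dataset A records that linked
print(matches, match_rate)
```

An exact-match scheme like this one misses records with typos or name changes, which is one reason production PPRL systems layer more sophisticated matching on top of tokenization.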
"We were honored that NCSES chose us to help them demonstrate the viability of agencies working in tandem to link data in a privacy-compliant manner. It was a great partnership that proved that when it comes to two agencies linking data, security, scale, accuracy and speed can coexist. This project is helping to define what trusted linkage looks like for government and industry alike, and HealthVerity was proud to have been a part of it."
- Jason Mayer, VP Government at HealthVerity
Learn more about Identity Manager, our powerful PPRL solution. If the U.S. government trusts HealthVerity to link its most sensitive data, you can trust it with your next study or launch strategy too.
References
HealthVerity; Mathematica. Utilizing Privacy Preserving Record Linkage (PPRL) to Link Data from Two Federal Statistical Agencies: Final Project Summary Report. Prepared for the National Center for Science and Engineering Statistics (NCSES) within the U.S. National Science Foundation; August 15, 2025. Available at: https://www.americasdatahub.org/wp-content/uploads/2025/09/NCSES-NCHS-PPRL-Final-Project-Summary-Report.pdf