Real World Data (RWD) is increasingly being incorporated into healthcare and life science research to inform a myriad of decisions important to public health. However, these high-stakes decisions are often undermined by a common challenge: accepting "good enough" data. While such data may appear broad and produce answers that feel directionally correct, there are risks to not properly vetting RWD resources. When organizations procure data without transparent provenance, source stability, or high-fidelity identity resolution, they operate on an opaque foundation that may compromise study design, commercial investment, and adherence to regulatory expectations.
When teams cannot evaluate where data came from, how patient identities were matched across sources, which populations are overrepresented or underrepresented, and where important gaps remain, they take on risk that is easy to miss. The output looks precise, but the assumptions beneath it have not been fully tested.1
A recent review on designing real-world studies argues that meaningful real-world evidence requires more than accessible data: it depends on carefully curated data, transparent provenance and quality assessment, representative populations and rigorous record linkage.2
The illusion of data adequacy
Many organizations still evaluate datasets on availability rather than quality, and can miss key questions that matter:
- What is the data source?
- How were fragmented patient records resolved across sources?
- What information is directly observed versus inferred?
- Can analysts trace a result back to the underlying data and logic?
- Is the sample demographically representative?
Standardized data structures alone do not create trust, and transparent, systematic data quality assessment is necessary before researchers can have confidence in the evidence generated from observational data.1 More recent work showed that data reliability can be evaluated through accuracy, completeness and traceability, and that stronger performance across those dimensions changes the quality of evidence teams can produce.3
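To make the accuracy-completeness-traceability framing concrete, here is a minimal sketch of how a team might screen a dataset for one of those dimensions, completeness. The field names and record structure are purely illustrative assumptions, not any particular vendor's schema or the cited authors' method.

```python
# Illustrative completeness screen over a simple list-of-dicts record set.
# REQUIRED_FIELDS is an assumed, hypothetical schema for this sketch.
REQUIRED_FIELDS = ["patient_id", "service_date", "diagnosis_code"]

def completeness(records, fields=REQUIRED_FIELDS):
    """Fraction of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(f) not in (None, "") for f in fields) for r in records
    )
    return complete / len(records)

records = [
    {"patient_id": "A1", "service_date": "2024-01-05", "diagnosis_code": "E11.9"},
    {"patient_id": "A2", "service_date": "", "diagnosis_code": "I10"},  # missing date
]
# Half of these records pass the completeness screen.
```

A real assessment would run many such checks per dimension and report them alongside provenance, which is what makes the result interpretable rather than merely available.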
When provenance, handling and quality assessment are not transparent, researchers have less basis for deciding whether the data is fit for purpose. The strongest real-world evidence comes from carefully curated RWD, robust study design and stronger approaches to record linkage when multiple data sources are brought together.2
Gaps in longitudinal continuity can change persistence or discontinuation estimates. Weak patient matching can split one person into multiple records or merge distinct people into one profile. Limited source transparency can make it difficult to explain why results shift from one study to the next.3
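Both matching failure modes are easy to see in a toy example. The sketch below uses a deliberately naive deterministic match key (name plus date of birth) to show how one patient can split into two profiles while two distinct patients merge into one; it is an illustration of the failure modes only, not a depiction of any production linkage method.

```python
# Toy illustration of record-linkage failure modes using a naive match key.
# Names, dates, and the key design are assumptions made for this sketch.
def match_key(record):
    """Naive deterministic key: normalized name + date of birth."""
    return (record["name"].strip().lower(), record["dob"])

source_a = {"name": "Jon Smith", "dob": "1980-02-01"}       # same person...
source_b = {"name": "Jonathan Smith", "dob": "1980-02-01"}  # ...recorded with a name variant
twin_1 = {"name": "Ana Lee", "dob": "1990-06-15"}           # two distinct people...
twin_2 = {"name": "Ana Lee", "dob": "1990-06-15"}           # ...who share name and birth date

# Split: one patient becomes two profiles because the keys differ.
assert match_key(source_a) != match_key(source_b)
# Merge: two patients collapse into one profile because the keys collide.
assert match_key(twin_1) == match_key(twin_2)
```

Either error silently distorts downstream measures such as persistence, incidence, or cohort size, which is why matching quality belongs on the evaluation checklist above.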
Two datasets can appear similar if they cover the same condition, time frame or patient count. In practice, differences in source stability, payer mix, longitudinal continuity, matching quality, clinical depth and documentation shape what each dataset can reliably support.
Where HealthVerity makes the difference
By prioritizing high-fidelity identity resolution, transparent data provenance, and privacy-first linkage, HealthVerity helps customers move beyond opaque datasets toward insights grounded in real populations and real-world behavior.
That rigor allows teams to:
- Evaluate data with a clear understanding of tradeoffs
- Reduce analytical risk caused by hidden assumptions
- Make defensible decisions in regulated environments
- Scale analytics with confidence, not guesswork
Instead of optimizing for the most accessible or “good enough” answers, organizations can optimize for trust, transparency, and impact. See the hundreds of publications and clients who trust their research to HealthVerity real-world data: https://healthverity.com/resources/publications/
References
- Blacketer C, Defalco FJ, Ryan PB, Rijnbeek PR. (2021). Increasing trust in real-world evidence through evaluation of observational data quality. Journal of the American Medical Informatics Association, 28(10), 2251-2257. https://academic.oup.com/jamia/article/28/10/2251/6328963
- Dreyer NA, Mack CD. (2023). Tactical considerations for designing real-world studies: Fit-for-purpose designs that bridge research and practice. Pragmatic and Observational Research, 14, 101-110. https://doi.org/10.2147/POR.S396024
- Riskin DJ, Monda KL, Gagne JJ, et al. (2025). Implementing accuracy, completeness, and traceability for data reliability. JAMA Network Open, 8(3), e250128. https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2831188