What Is Data Integrity and Why Is It Important in Healthcare?


Healthcare organizations produce large volumes of data, on everything from patient admissions to insurance reimbursements. This data can be invaluable. It can show executives and decision makers how to best allocate their organization’s resources to serve the needs of their communities and patients. It may also be used by clinicians to arrive at important conclusions about population health and related trends. Crucially, this information is only useful as long as it remains accurate and uncompromised; data of dubious quality simply has no utility to healthcare decision makers.

In thinking about the quality and consistency of data, healthcare information technology (IT) professionals and analysts are fundamentally concerned with data integrity. Formalized programs and protocols to ensure data integrity help healthcare organizations use their aggregated information confidently and strategically.

An excellent way to learn more about the importance of data integrity, and the means to ensure it, is to pursue an advanced degree in health information management or a related field.

What Is Data Integrity?

Data integrity refers to the accuracy and consistency of data over its entire lifecycle, that is, for as long as that piece of data is maintained and used within an organization.

Data integrity can describe either a state or a process. As a state, it means that a set of data is both accurate and consistent. As a process, it refers to the internal systems and procedures that health information management professionals use to check data sets and evaluate them for accuracy and consistency.

According to Digital Guardian, the notion that data should remain accurate and consistent means that a particular set of data should remain intact and unaltered, no matter how many times it is replicated or transmitted throughout the organization. Each replication of a data set should perfectly mirror the original. For a data set to have integrity, it must not deviate from the original data set nor have any pieces missing.
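One common way to check that a replica "perfectly mirrors the original" is to compare cryptographic checksums. The following is a minimal sketch, not a description of any particular vendor's method; the function names and sample record contents are hypothetical and chosen only for illustration.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def replica_matches(original: bytes, replica: bytes) -> bool:
    """True only if the replica is byte-for-byte identical to the original."""
    return sha256_digest(original) == sha256_digest(replica)


# Hypothetical record contents, for illustration only.
original = b"patient_id=123;allergy=penicillin"
intact_copy = b"patient_id=123;allergy=penicillin"
truncated_copy = b"patient_id=123;allergy="

print(replica_matches(original, intact_copy))     # True
print(replica_matches(original, truncated_copy))  # False
```

In practice, the digest of the original would typically be stored alongside it, so that any later copy can be checked without re-reading the original.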

Data integrity can be corrupted within a healthcare organization in numerous ways. Discrepancies may arise between a physician’s original case notes and the notes entered in the patient’s electronic file. Or, a spreadsheet that is distributed throughout the organization may show different departmental staffing arrangements, have rows or columns missing, or be damaged through the simple process of data transfer.

Data Integrity vs. Data Security

Data integrity is closely related to data security. Data security is primarily concerned with protecting data from unauthorized access, typically by hackers, and the subsequent data corruption that might ensue. In this way, data security is a prerequisite for ensuring data integrity.

Conversely, data integrity is one of the primary goals of data security. However, data security is just one of several disciplines that are used in pursuit of this goal. While it’s important for healthcare organizations to guard against data loss or cyberattack, it’s equally important to recognize that there are other ways in which data integrity can be compromised, absent any malicious third-party attacker.

How Data Integrity Is Compromised

As Digital Guardian makes clear, there are several potential causes of breached data integrity.

  • Unintentional human error, such as typos or mistakes when manually transcribing data
  • Technical errors that occur when a set of data is transmitted from one device to another
  • Compromised hardware, such as a hard drive that crashes or becomes physically corrupted
  • Different forms of cyberattack, including viruses, malware, or the disruptive work of hackers

Risks such as these demonstrate the need for healthcare organizations to be proactive and vigilant in safeguarding their important data, ensuring its ongoing integrity.

Data Integrity Examples in Healthcare

In any given healthcare organization, there are countless examples of important records and data sets that are crucial for delivering optimal patient outcomes, as well as enabling organizational efficiency. For any of these records or data sets to be useful, however, their integrity must be assured.

Electronic Health Records

One of the most foundational data integrity examples is electronic health records, or EHRs. An EHR represents a digitized version of a patient’s paper chart. These real-time, patient-centered files ensure that all relevant information about a patient’s treatments, diagnoses, and medical history can be easily accessible to any authorized users, including any doctors or nurses caring for the patient, and specialists to whom the patient has been referred.

In fact, data integrity is one of the primary reasons why electronic health records have become so commonplace. By providing a single electronic file, rather than multiple transcriptions of a doctor’s handwritten notes, healthcare organizations can make certain all authorized users are on the same page with regard to the patient’s status and condition. One of the foremost goals of EHRs is to guarantee accurate, precise diagnostics and care for each patient, even if their treatment takes them to different departments or providers.

Even so, EHRs provide opportunities for data integrity to be lost, whether due to human errors in the initial data compilation or due to data corruption when the EHR is stored in the cloud. Vigilance for data integrity is an essential part of maintaining EHRs.

When well-maintained, EHRs can bring great flexibility and efficiency to patient care. For example, EHRs make it easy for a patient’s data to be transferred from a primary care doctor to a specialist, ensuring that the patient receives the personalized care required at each stage of treatment. Also, EHRs provide legible and streamlined information that can expedite the medical coding and billing processes, minimizing errors. When data integrity is prioritized, organizations that use EHRs can realize significant cost savings.

Pharmaceutical Data

Another important data integrity example has to do with pharmaceutical interventions. Health professionals who research, test, and develop new drugs and vaccines collect huge volumes of data to validate that medications are safe and effective, and that all known side effects are documented. This data is compiled from a range of sources, including clinical observations as well as healthcare wearables, such as ECG monitors, and other mobile applications used by patients.

Not only does this information help pharmaceutical developers ensure their medications are safe, but it also gives providers confidence about when and how to use these medications. Given the importance of this data, maintaining its integrity is crucial. Here again, user errors, transmission failures, and cyberattacks are all potential threats that require vigilance.

Billing Information

Yet another example of data integrity pertains to billing information. Healthcare organizations use patient files to appropriately bill insurance companies, which helps them get reimbursed promptly and fully. Additionally, billing information may be maintained in a hospital or medical organization, where it can provide data analysts with indicators as to where the organization makes or loses money. Sustaining the integrity of financial data helps healthcare executives make fully informed decisions, resulting in a smoother operation and ideally in higher standards of patient care.

Ensuring Data Integrity Through Best Practices

Ensuring data integrity requires systems and processes that guide the collection, storage, and access of data. The best data integrity practices call for optimized IT systems and smart day-to-day habits from all members of the organization.

Clear Data Attribution

One of the most important steps in ensuring data integrity is providing clear attribution for all patient-centered data. This means every patient file should clearly and accurately convey who observed or recorded the data, which patient the file pertains to, and when the data was first obtained and recorded. This level of attribution ensures that, should questions about its validity ever arise, the data’s original sourcing can be checked.

Wise Data Sourcing

Another important step in ensuring data integrity is vigilantly recording its source. Sometimes, data originates inside the healthcare organization. For example, a patient’s EHR contains data points that were recorded by the doctor or nurse who examined them. But when data is received from a third party, whether a pharmaceutical company, an insurer, or a referring physician, it’s crucial to have a system in place to verify that the data is properly formatted, is complete, carries proper attribution, and contains no inconsistencies. Health organizations can safeguard their own data integrity by refusing to allow any subpar, outside data to enter their systems.
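A simple gatekeeping check on incoming records might look like the sketch below. The required field names (`patient_id`, `recorded_by`, `recorded_at`, `source`) and the ISO 8601 timestamp rule are assumptions made for illustration; a real organization would define its own schema and rules.

```python
from datetime import datetime

# Hypothetical required fields; a real system would define its own schema.
REQUIRED_FIELDS = ("patient_id", "recorded_by", "recorded_at", "source")


def validate_incoming_record(record: dict) -> list[str]:
    """Return a list of problems found; an empty list means the record passes."""
    problems = []
    # Completeness and attribution: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"missing field: {field}")
    # Formatting: the timestamp, if supplied, must parse as ISO 8601.
    recorded_at = record.get("recorded_at")
    if recorded_at:
        try:
            datetime.fromisoformat(recorded_at)
        except (TypeError, ValueError):
            problems.append("recorded_at is not an ISO 8601 timestamp")
    return problems


good = {
    "patient_id": "123",
    "recorded_by": "Dr. Example",
    "recorded_at": "2024-01-15T09:30:00",
    "source": "referring_clinic",
}
bad = {"patient_id": "123", "recorded_by": "", "recorded_at": "15/01/2024"}

print(validate_incoming_record(good))  # []
print(validate_incoming_record(bad))
```

Records that return a non-empty list of problems could be held back for review rather than being allowed into the organization's systems.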

Real-Time Data Collection

The timing of data collection and compilation can play an important role in ensuring its integrity. Specifically, it’s usually best to have contemporaneous data entry whenever possible. For example, when a patient receives any form of testing or diagnostic evaluation, the results should be added to their file immediately. The longer the delay in recording, the more likely errors are to be introduced.

Smart Data Governance

One of the most common forms of data corruption is data duplication. When multiple, identical files are entered into a healthcare IT system or database, it can lead to a lack of clarity. The best way to avoid confusion is through smart data governance, specifically through a formal data pipeline. Health IT professionals can put protocols in place that clearly define who owns the data, who may access it, who may alter it, and how frequently it is audited and reviewed. These are all important safeguards against unintentional data duplication.

Data Entry Training

Unintended human error is one of the greatest threats to data integrity. While it cannot be completely eradicated, it can be minimized through employee training. All employees who are tasked with manual data entry should receive training in the best practices and protocols for accurate data entry, review, and quality control.

Up-to-Date IT Systems

One final step in ensuring data integrity is keeping IT systems, including databases and physical hardware, up to date. This typically falls under the responsibilities of the healthcare IT department. Not only does maintaining up-to-date systems allow for the periodic review of data quality standards, but it also strengthens the organization’s defenses against hackers, malware, and other forms of cyberattack. Indeed, sophisticated, well-maintained IT systems provide a crucial means of ensuring data security, and thus data integrity.

The Healthcare Integrity and Protection Data Bank

Prioritizing healthcare integrity also entails knowledge of key legal resources and provisions, including the Healthcare Integrity and Protection Data Bank (HIPDB).

History of the HIPDB

The HIPDB was originally established through the Health Insurance Portability and Accountability Act of 1996, or HIPAA. Its goal was to minimize occurrences of healthcare fraud and abuse. While the HIPDB is no longer an active entity, the information that it collected and stored is now maintained by the National Practitioner Data Bank (NPDB).

Role of the NPDB

The NPDB website describes it as an online “repository of reports containing information on medical malpractice payments and certain adverse actions related to healthcare practitioners, providers, and suppliers.” The NPDB exists to protect the public and to establish high standards for healthcare. One way in which it accomplishes this is by collecting information about performance issues among physicians, preventing them from moving their practice from state to state without those performance issues being discovered and disclosed.

Additionally, physicians may conduct a “self-query” using the NPDB, which allows them to see any medical malpractice payments or negative licensure actions connected to their practice.

The NPDB as a Model of Data Integrity

Ultimately, the NPDB serves as an important example of how healthcare data can be collected, compiled, and meticulously preserved with the goal of creating higher standards of patient safety and care. The records contained in the NPDB, like those of the HIPDB before it, exemplify data integrity, making them valuable resources for providers and the general public.

The Importance of Data Integrity Testing in Healthcare

The reality of data collection is that, over time, any file or data set may become compromised, whether due to simple human errors, technological breakdowns, or undetected cyberattacks. The only way in which healthcare organizations can have confidence in the quality, accuracy, and consistency of their data is to subject it to routine data integrity testing.

Through this process, health IT professionals can validate that a data set has remained consistent over its lifecycle. Alternatively, they can identify any potential corruption or inconsistency within a data set. Not only does data integrity testing provide confidence in the data that’s been tested, but it may also uncover areas in which an organization’s data collection, storage, or transmission procedures can be improved.

Important Steps in Data Integrity Testing

Data security company Varonis highlights several key components in the data integrity testing process.

Validate Input

It’s always important to make certain that the original sourcing of the data is known. If the data originated inside the organization, this requires confirming the time, place, and person responsible for its input. If it came from an outside source, it’s necessary to verify that the source is credible. Sometimes, malicious or misleading data may be implanted by a cyber assailant. Validating data input helps root out these instances.

Validate Data

Data points themselves should also be checked to determine if they have remained consistent and unaltered throughout their lifecycle and across different uses or transmissions. A key component of this process involves maintaining a data “paper trail,” which helps analysts verify if and when any key specifications or attributes have been modified.
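One way to picture such a "paper trail" is a record that logs every change to its fields. The sketch below is only an illustration under stated assumptions: the `AuditedRecord` class, its field names, and the sample values are hypothetical, and a production system would normally rely on the audit features of its database or EHR platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditedRecord:
    """A record that logs every field change (hypothetical structure)."""
    values: dict
    history: list = field(default_factory=list)

    def update(self, key: str, new_value, changed_by: str) -> None:
        old_value = self.values.get(key)
        if old_value == new_value:
            return  # nothing changed, so nothing to log
        # Record what changed, who changed it, and when (UTC).
        self.history.append({
            "field": key,
            "old": old_value,
            "new": new_value,
            "changed_by": changed_by,
            "changed_at": datetime.now(timezone.utc).isoformat(),
        })
        self.values[key] = new_value


record = AuditedRecord({"blood_type": "A+"})
record.update("blood_type", "A-", changed_by="nurse_01")
for entry in record.history:
    print(entry["field"], entry["old"], "->", entry["new"], "by", entry["changed_by"])
```

Because each entry keeps the old value, the new value, the person responsible, and a timestamp, an analyst can verify if and when an attribute was modified.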

Remove Duplicates

Another vital step in data integrity testing is checking for files that have been duplicated. This includes instances of sensitive or private information being replicated in an unauthorized or unsecured place. There are a number of software tools to help healthcare IT professionals scan large databases or IT systems for “clone” or duplicate files.
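As a rough sketch of how such a scan can work, the example below fingerprints each record by hashing its contents and groups records that share a fingerprint. The record layout and function names are hypothetical, and this approach only catches exact duplicates, not near-duplicates such as misspelled names.

```python
import hashlib
import json
from collections import defaultdict


def record_fingerprint(record: dict) -> str:
    """Hash a record's contents so identical records get identical fingerprints."""
    # sort_keys makes the result independent of the order of the fields.
    canonical = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(records: list[dict]) -> list[list[int]]:
    """Return groups of indexes whose records have identical contents."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[record_fingerprint(record)].append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]


# Hypothetical records, for illustration only.
records = [
    {"patient_id": "123", "test": "CBC", "date": "2024-01-15"},
    {"patient_id": "456", "test": "CBC", "date": "2024-01-15"},
    {"date": "2024-01-15", "test": "CBC", "patient_id": "123"},  # same as first
]
print(find_duplicates(records))  # [[0, 2]]
```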

Testing Data to Ensure Patient Safety and Privacy

Ultimately, verifying data integrity not only helps healthcare organizations improve their own operations, but it also helps ensure that patients are cared for properly by preventing clinical decisions from being made on the basis of faulty or incomplete data. Additionally, data integrity testing provides an important check against sensitive information, whether medical or financial, being stored inappropriately and put at risk of exposure or theft.

Maintaining Data Integrity in Healthcare

Day in and day out, healthcare organizations produce huge volumes of data of many types. Indeed, every patient encounter provides a new data point that, when stored and analyzed properly, can be instrumental in improving the quality and efficiency of care. However, all of this relies on data integrity, something health information management professionals are equipped to safeguard.

A sound choice for developing the skills to maintain data integrity is to enroll in an advanced degree program, such as the University of Illinois at Chicago’s online Master of Health Information Management.

In this program, students can prepare for careers in which they add real value to healthcare organizations, specifically by mastering the skills associated with using data to improve patient outcomes. Find out more about the university’s master’s in health information management and the role it can play in advancing your career.




American Health Information Management Association, “Ensuring Data Integrity During Health Information Exchange”

BuiltIn, “From Fighting Cancer to Preventing Disease, Big Data in Healthcare Might Just Save Your Life”

Digital Guardian, “What is Data Integrity? Definition, Best Practices & More”

Health IT Analytics, “HIM Pros, Data Integrity Play Vital Role in Hospital Ratings”

HealthIT.gov, “What Are the Advantages of Electronic Health Records?”

HealthIT.gov, “What is an Electronic Health Record (EHR)?”

MedicalDirector, “Why Data Integrity is Critical for Healthcare”

National Practitioner Data Bank, “About Us”

National Practitioner Data Bank, “HIPDB Archive”

PromptCloud, “7 Steps to Improve Data Integrity”

Towards Data Science, “7 Steps to Ensure and Sustain Data Quality”

Varonis, “What is Data Integrity and How Can You Maintain it?”

ZS, “The Future of Data in Pharma: Data Privacy, Data Ownership and Going Beyond the Pill”