
Why do we have missing data in EHR

Amatur Rahman edited this page Aug 25, 2017 · 4 revisions

Strategies for Handling Missing Data in Electronic Health Record Derived Data
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4371484/

Data can be missing from an EHR because of a lack of collection (e.g., the patient was never asked about asthma) or a lack of documentation (e.g., the patient was asked about asthma but the response was never recorded in the medical record). Lack of documentation is particularly common when a patient does not have a symptom or comorbidity. Instead of recording a negative value for each potential symptom or comorbidity, the data fields are left blank (missing) and only the positive values are recorded. As a result, it can be impossible to distinguish between the absence of a comorbidity, a lack of documentation of the comorbidity, and a lack of data collection about the comorbidity. To conduct research using EHR data, it is typically necessary to assume that missing data elements indicate a negative value.
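A minimal sketch of the "missing means negative" assumption, in plain Python. The record layout, the field name `asthma`, and the helper `assume_negative` are hypothetical and used only for illustration. The sketch also keeps a flag showing whether a value was actually documented, so the assumption stays visible in later analysis.

```python
def assume_negative(records: list[dict], field: str) -> list[dict]:
    """Return copies of the records with a blank `field` treated as negative.

    A `<field>_documented` flag records whether a value was actually present,
    because a blank cannot tell us whether the patient was ever asked.
    """
    out = []
    for rec in records:
        new = dict(rec)  # copy so the raw record is left unchanged
        documented = rec.get(field) is not None
        new[f"{field}_documented"] = documented
        # Assumption: an undocumented comorbidity is treated as absent.
        new[field] = bool(rec[field]) if documented else False
        out.append(new)
    return out


# Hypothetical records: only positive findings were written down.
patients = [
    {"id": 1, "asthma": True},
    {"id": 2, "asthma": None},  # field present but left blank
    {"id": 3},                  # field never collected
]

cleaned = assume_negative(patients, "asthma")
for row in cleaned:
    print(row)
```

Patients 2 and 3 end up with the same negative value even though their data are missing for different reasons, which is exactly the ambiguity described above.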

For example, healthier patients are less likely to utilize the healthcare system and may be more likely to have missing data such as systolic blood pressure readings. Therefore, systolic blood pressure may have a direct, positive relationship with the number of office visits in a univariate analysis, but adequate adjustment for current health status would make this relationship between office visits and systolic blood pressure disappear.
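The effect of adjusting for health status can be illustrated with a small simulation. This is a sketch on synthetic data, not an analysis from the paper: the variable `severity` stands in for current health status, and every coefficient below is an invented assumption. Severity drives both office visits and systolic blood pressure, while visits have no direct effect on blood pressure.

```python
import random


def slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]


random.seed(0)
n = 20000

# Assumed data-generating process (all numbers are illustrative):
# sicker patients visit more often and have higher blood pressure.
severity = [random.gauss(0, 1) for _ in range(n)]
visits = [max(0, round(3 + 1.5 * s + random.gauss(0, 1))) for s in severity]
sbp = [120 + 10 * s + random.gauss(0, 10) for s in severity]

# Univariate analysis: blood pressure regressed on visits alone.
unadjusted = slope(visits, sbp)

# Adjusted analysis: remove the linear effect of severity from both
# variables, then regress the residuals on each other. This gives the
# visits coefficient from a regression of sbp on visits and severity.
adjusted = slope(residuals(severity, visits), residuals(severity, sbp))

print(f"unadjusted slope: {unadjusted:.2f} mmHg per visit")
print(f"adjusted slope:   {adjusted:.2f} mmHg per visit")
```

In this setup the unadjusted slope is clearly positive, while the adjusted slope is close to zero, because the apparent association came entirely from the shared dependence on severity.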

This argument is almost always plausible when working with EHR data (e.g., compliant patients, patients with good insurance, and patients with more severe disease tend to have less missing data).

Imputation techniques
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5144587/
