Section 4.7 Implement

Section 4 Implement—Data Management - 1

Data Management

Ensuring high quality data in an electronic health record (EHR) and being a good steward of data in a health information exchange (HIE) environment are critical to making sure your local public health (LPH) department gets optimal value from these and other forms of health information technology (HIT).

Time needed: 4 – 6 hours
Suggested other tools: NA

How to Use

Review the information in this tool to develop an understanding of how to manage and maintain the best quality data, both for an EHR and for HIE.

Nature of Data in Information Technology

In information systems, there are two forms of data: structured and unstructured.

  • Structured data are processable by a computer. They are captured as the values of variables that describe patient characteristics. For example, every client has a name. The name variable may have myriad values based on different combinations of the letters of the alphabet. Likewise, every client has a birth date, from which age may be calculated. The fact that structured data can be alphabetized, arranged in a list, compared with other data to generate new data (e.g., age) and alerts (e.g., this value is not within normal limits), graphed, and trended is the value of automation.
  • Unstructured, or narrative, data cannot be easily processed. Examples include handwritten or typed sentences; dictation, whether captured through traditional dictation/transcription, digital dictation, or speech recognition; and data captured as any other form of image.
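The value of structured data described above can be illustrated with a minimal Python sketch. It derives age from a stored birth date and raises an alert when a value falls outside a stated range. The client record, the field names, and the 90-140 range are hypothetical and chosen only for illustration; they are not clinical guidance or any vendor's data model.

```python
from datetime import date

def age_in_years(birth_date, today):
    """Derive age (new data) from a structured birth date."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not occurred yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

def out_of_range_alert(name, value, low, high):
    """Return an alert message if value is outside [low, high], else None."""
    if value < low or value > high:
        return f"{name} of {value} is not within normal limits ({low}-{high})"
    return None

# Hypothetical structured record: each variable holds one processable value.
client = {"name": "Pat Example", "birth_date": date(1980, 6, 15), "systolic_bp": 152}

print(age_in_years(client["birth_date"], date(2014, 1, 1)))  # 33
print(out_of_range_alert("systolic_bp", client["systolic_bp"], 90, 140))
```

Neither calculation is possible if the same facts exist only in a narrative note or a scanned image.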

Both types of data are very important. Structured data support clinical decision making and analytical processes, while unstructured data lend context and richness to information. Narrative information tells the story about the patient and is an important element in the thought processes of clinicians. As a result, it is important to balance the amount of structured and unstructured data, and ideally to find a way to structure data that are in unstructured formats.

New technologies are addressing the ability to structure data in narrative descriptions. These technologies are summarized below. As you evaluate EHR products and ways to conduct HIE, keep these new approaches in mind. Obtain information from your vendors about their future plans in this area. The future may occur sooner than you think! Emerging approaches to structuring unstructured data are:

  • Content management refers to indexing of narrative data so it is more readily retrievable. For example, if a nurse needs to be reminded of a patient’s allergies, the allergies recorded in one location (e.g., on a scanned form) can be copied to other locations, such as the notes screen for viewing. However, the nurse would still have to compare the allergies against the drug information to determine if there is a contraindication. Viewable data are not codified in a manner that can be processed further by the computer.
  • HL7 Clinical Document Architecture (CDA) can also be used to create processable data. In this case, an indexing process in templates is used. While the Continuity of Care Document (CCD) from HL7 is starting to be used, other templates available in the Consolidated Clinical Document Architecture (C-CDA) are not widely used yet. (See Section 2.11 Exchange of Clinical Summaries: CCR, CCD, C-CDA.)
  • XML is used to retrieve data that have been entered into a computer. The Continuity of Care Record (CCR) is able to be rendered in XML, but receiving systems must have a Web-Services Architecture in order to “browse” the data, and most EHRs today do not have this.
  • Natural language processing (NLP) is a technology that parses narrative information in electronic form (e.g., a word-processed document) to produce structured data. This technology is improving, but is not yet widely deployed in EHR systems. One exception is the use of NLP in discrete reportable transcription (DRT), which allows voice dictation that follows a specific template in the EHR to be captured as structured data points. Again, not many EHRs support this technology as yet.
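To make the difference between viewable and processable data concrete, the following Python sketch reads allergies from a small XML fragment and compares them against a drug's allergens, the comparison a nurse would otherwise make by eye. The XML layout, element names, and attribute names are hypothetical; real CCR, CCD, and C-CDA documents follow their own schemas, which this sketch does not attempt to reproduce.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration only; not a real CCR or CCD layout.
SUMMARY_XML = """
<summary>
  <client id="12345">
    <allergy substance="penicillin" reaction="hives"/>
    <allergy substance="latex" reaction="rash"/>
  </client>
</summary>
"""

def extract_allergies(xml_text):
    """Return the allergy substances recorded as structured XML attributes."""
    root = ET.fromstring(xml_text.strip())
    return [allergy.get("substance") for allergy in root.iter("allergy")]

def contraindicated(allergies, drug_allergens):
    """Return the allergens shared by the client and the drug, sorted."""
    return sorted(set(allergies) & set(drug_allergens))

allergies = extract_allergies(SUMMARY_XML)
print(allergies)                                   # ['penicillin', 'latex']
print(contraindicated(allergies, ["penicillin"]))  # ['penicillin']
```

If the same allergies existed only on a scanned form, there would be nothing for the computer to compare.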

Data Management

Data management refers to the process of ensuring that the structured data collected yield complete and accurate data. The data quality must support the “EHR five rights” of (1) having the right clinical data, achieved through the (2) right presentation of templates to capture data and displays that provide information, in support of (3) right decision making, provided within the context of the (4) right work processes, in order to produce (5) right results.

Data quality principles have been introduced to health care by the American Health Information Management Association in its Data Quality Management Model, depicted below. For a comprehensive discussion of how to ensure the quality of the data you are collecting, see:

AHIMA. Pocket Glossary of Health Information Management and Technology, Third Edition. Chicago: AHIMA Press, 2012.

Two key ways to manage data quality are:

  • Ensure that there is a well-maintained data dictionary in the EHR. A data dictionary is a descriptive list of the names, definitions, and attributes of all data elements to be collected in a database. Once a data dictionary is established, one should be very careful about making changes. While there will always be new terms that need adding because of new disease states, drugs, etc., simply adding “new” terms because of personal preference works against data quality.

For instance, if an EHR user finds that a person has a “bed sore” and wants to enter that in the EHR but does not find that expression in a list of choices, two things often happen. First, “bed sore” will often be written in a comment field and lost to computer processing. A subsequent user who does not read the comment field may not be alerted to look for and assess the pressure ulcer. Second, the user may lobby for the preferred term to be added to the data dictionary. (Note: some vendors keep the data dictionary to themselves and will only make changes upon request, and often with a fee; other vendors will give your organization access to the data dictionary for you to make these changes yourself.) In this example, if “bed sore” is added, any list of patients to be monitored for pressure ulcers will not include those with “bed sores” unless someone creates the list using both terms.

  • Ensure that the vendor uses standard definitions and terminologies and that users adhere to these standards. Every vendor maintains a data dictionary for its EHR, but not every vendor will use standard names, definitions, and attributes. Also, the vendor may only apply standards to some of the data collected via the EHR, such as for data sets used in state or federal reporting programs. Applying standard data definitions and terminologies not only helps users ensure data consistency but supports the ability to share data with other organizations in HIE. (See Section 1.3 Interoperability for EHR and HIE.)

For example, every terminology allows expression of a concept at a high level or a more specific level (e.g., “Diabetes” vs. “Diabetes Mellitus Type 2 with Neuropathy”). In entering structured data, there are often ways to select one or the other. For instance, “Diabetes” may be chosen by a type-ahead feature. Typing “Dia” may initially yield just “Diabetes,” and that may be selected over typing further to be more specific. If there is a drop-down menu of choices, “Diabetes” may be the first choice in the list, and therefore it tends to be chosen over a more specific statement as a matter of convenience.
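The “bed sore” problem described above can be sketched in Python. In this minimal illustration, a data dictionary holds one standard term per concept and treats alternative expressions as synonyms that map back to it, rather than as new terms. The dictionary structure, the synonym list, and the sample records are assumptions made for the example; an actual EHR vendor's data dictionary will be organized differently.

```python
# Hypothetical data dictionary: one standard term per concept, with synonyms
# mapped back to it instead of being added as separate "new" terms.
DATA_DICTIONARY = {
    "pressure ulcer": {
        "definition": "Localized skin or tissue injury caused by prolonged pressure",
        "synonyms": {"bed sore", "bedsore", "decubitus ulcer"},
    },
}

def normalize_term(entered):
    """Map an entered expression to its standard term, or None if unknown."""
    key = " ".join(entered.lower().split())  # ignore case and extra spaces
    for standard, entry in DATA_DICTIONARY.items():
        if key == standard or key in entry["synonyms"]:
            return standard
    return None

def clients_with(records, standard_term):
    """List clients whose recorded problem normalizes to the standard term."""
    return [r["client"] for r in records if normalize_term(r["problem"]) == standard_term]

records = [
    {"client": "A", "problem": "pressure ulcer"},
    {"client": "B", "problem": "Bed sore"},
    {"client": "C", "problem": "rash"},
]
print(clients_with(records, "pressure ulcer"))  # ['A', 'B']
```

Because both expressions resolve to a single standard term, a monitoring list built on “pressure ulcer” includes client B without anyone having to remember to search for both terms.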

Data Stewardship

Many corporations have applied the principles of data management to improve the quality of data used to make business decisions. Often, this is called data stewardship. (See Data Quality Management: The Most Critical Initiative You Can Implement, available at:

Health care has lagged behind other industries both in effectively using the massive amount of data the industry accumulates and in adopting data stewardship principles.

There are many issues associated with the use and potential misuse of health data. To make the best of such data in the most appropriate way, many are calling for the health care industry to adopt principles of data stewardship. (As an example, see “A Stewardship Framework for the Use of Community Health Data,” National Committee on Vital and Health Statistics (NCVHS):

Data stewardship is particularly important to LPH departments. As attempts are made to coordinate care among many different players, LPH departments need to use a common language. This is even more important when using HIE. Even if the HIE is not accumulating a data warehouse, it is vital to ensure that the data exchanged are of high quality and secure.

The following are definitions of data stewardship, data steward, and data governance:

  • Data stewardship, per NCVHS (see above), is “the responsibility of ensuring the appropriate collection, management, use, disclosure, or safeguarding of information.”
  • Data steward is a person (or entity) that has a trust relationship with the data.
  • Data governance is the process by which responsibilities of stewardship are conceptualized and carried out.

As information is exchanged among participants in an HIE organization (HIO), each participant is, in essence, a data steward. The HIO may provide governance, establishing the policies for the structure of the data, standardization of data sets, access privileges, and permissible use of data. However, every user must have a trust relationship with that data, or it will not be useful. Inappropriately acquired or misused data can potentially harm the patient or client, as well as the organization.

Recommendation: Instill in every EHR and HIO user a sense of data stewardship. Use the data stewardship framework from NCVHS (summarized and adapted for LPH below) to discuss what data stewardship means to your department:

  1. Openness, transparency, and choice. Make sure your clients know what information is being collected within your EHR and what information is being shared with others via an HIO.
  2. Purpose specification. Describe for your clients why you are collecting and sharing the data about them.
  3. Community engagement and participation. As an HIO participant, you are a member of a community. Engage with this community to ensure everyone upholds the principles of data stewardship.
  4. Data integrity and security. Data integrity refers to the reliability of the data: that it meets specified standards of fidelity and has not been altered. Security has a threefold purpose: to ensure confidentiality; to protect the data from alteration or destruction; and to make sure data are accessible when needed. This implies that systems must have sufficient redundancy so that you do not experience unusual downtime, and that there are contingency plans in the event of an uncontrollable downtime.
  5. Accountability requires identification of the person or entity responsible for stewardship at each point in the flow of data from initial collection and use through dissemination of any aggregation of the data, and its storage and ultimate destruction. From a practical perspective, accountability resides with everyone charged with adhering to data quality standards and security measures, such as protecting passwords and encrypting data in devices and media.
  6. Protecting de-identified data. HIPAA permits protected health information (PHI) to be de-identified, rendering it no longer protected under the Privacy and Security Rules. Still, many believe that de-identifying PHI and using it for another purpose is ethically wrong. An LPH department may not be in the position to de-identify its data, but understanding how the data you exchange through an HIO may be subsequently used is a data stewardship responsibility.
  7. Attending to the risks of “enhanced” data sets. Many HIOs are considering merging data from different sources to enrich what is available for community planning, research, and other purposes. Such “mash-ups” can be beneficial but also problematic, especially if the quality of the data is poor, the enhancement is not well performed, or the use is ill-conceived.
  8. Stigma and discrimination. HIOs should be alert to data uses that may result in discrimination for their clients, their participants, or their community as a whole. NCVHS cites redlining in the housing loan market as an example of how a report of poor health status for a particular community that was determined by an “enhanced” data set can harm the members of the community.
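The “has not been altered” part of data integrity (item 4 above) can be illustrated with a simple fingerprint check. This Python sketch computes a SHA-256 hash of a record when it is sent and recomputes it on receipt. The record fields and function names are hypothetical, and the approach is only an illustration: a plain hash detects accidental changes, while protection against deliberate tampering would require a keyed or signed mechanism agreed on with your HIO and vendor.

```python
import hashlib
import json

def fingerprint(record):
    """Return a SHA-256 hex digest of a record in a canonical JSON form."""
    # Sorting keys makes the digest independent of field order.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def is_unaltered(record, expected_fingerprint):
    """Check a received record against the fingerprint computed by the sender."""
    return fingerprint(record) == expected_fingerprint

# Hypothetical record exchanged between two HIO participants.
sent = {"client_id": "12345", "immunization": "influenza", "date": "2013-10-01"}
sent_fingerprint = fingerprint(sent)

received = dict(sent)
print(is_unaltered(received, sent_fingerprint))  # True

received["date"] = "2013-10-11"  # simulate a change in transit
print(is_unaltered(received, sent_fingerprint))  # False
```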

Copyright © 2014 Stratis Health. Updated 01-01-14
