De-Identifying Data

The HIPAA Privacy Rule provides two routes by which a Covered Entity may properly de-identify a data set:

1. Safe Harbor Method

All eighteen identifiers (listed below) concerning the individual and the individual’s employers, relatives, and household members must be removed by the Covered Entity before the data are released to the investigator. In addition, the Covered Entity must have no actual knowledge that the information remaining in the de-identified data set could be used, alone or in combination with information from other sources, to identify an individual within the data set.

Identification codes may be retained in a de-identified data set only if they meet all of the following conditions:

  • Codes must not be derived from any other identifiers
  • Codes must not be translatable such that an individual can be identified
  • Codes must not be used by the Covered Entity for any other purpose
  • The Covered Entity must not disclose its key (method of re-identifying the data)
  • Codes may not be derived from the Social Security Number or medical records number
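As an illustration of the conditions above, the following is a minimal Python sketch of one way to assign codes that are not derived from any identifier. It is not a prescribed method. The function name assign_study_codes, the sample record IDs, and the 8-byte code length are assumptions made for the example. Each code is drawn at random, so it cannot be translated back to the individual without the key, and the key must remain with the Covered Entity.

```python
import secrets


def assign_study_codes(record_ids, code_bytes=8):
    """Map each original record ID to a random, unrelated study code.

    Hypothetical helper for illustration. The returned dictionary is the
    re-identification key and must not be disclosed with the data set.
    """
    key = {}
    used = set()
    for record_id in record_ids:
        # Codes are random, not hashed or otherwise computed from the ID,
        # SSN, or medical record number.
        code = secrets.token_hex(code_bytes)
        while code in used:  # regenerate on the (very unlikely) collision
            code = secrets.token_hex(code_bytes)
        used.add(code)
        key[record_id] = code
    return key


# Sample IDs are made up for the example.
key = assign_study_codes(["MRN-1001", "MRN-1002", "MRN-1003"])
released_codes = list(key.values())  # only these leave the Covered Entity
```

A hash of the medical record number would not satisfy the conditions, because the code would then be derived from an identifier.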

The eighteen identifiers that must be removed to de-identify data are:

  1. Names
  2. Geographic subdivisions, addresses, and ZIP codes:
    All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent geocodes. However, the initial three digits of a ZIP code may be retained if, according to current publicly available data from the Bureau of the Census, the geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people. The initial three digits for all such geographic units containing 20,000 or fewer people must be changed to 000.
  3. All elements of dates (except year) for dates directly related to an individual, including:
    Birth date, dates of admission and discharge from a medical facility, and date of death; for persons age 90 and older, all elements of dates (including year) that would indicate such age must be removed, except that such ages and elements may be aggregated into a single category of “age 90 or older.”
  4. Telephone numbers
  5. Fax numbers
  6. Electronic mail addresses
  7. Social security numbers
  8. Medical record numbers and prescription numbers
  9. Health plan beneficiary numbers
  10. Account numbers
  11. Certificate/license numbers
  12. Vehicle identifiers, serial numbers, and license plate numbers
  13. Device identifiers and serial numbers
  14. Web Universal Resource Locators (URLs)
  15. Internet Protocol (IP) address numbers
  16. Biometric identifiers, including fingerprints and voice prints
  17. Full face or comparable photographic images
  18. Any other unique number, characteristic, or code that could be used to identify the individual
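The ZIP code and date rules in items 2 and 3 involve small calculations, which can be sketched in Python as follows. This is an illustrative sketch, not an official implementation. The function names and the population figures in ZIP3_POPULATION are hypothetical; real figures must come from current Census Bureau data. Treating an unknown three-digit prefix as 000 is a conservative assumption made for the example.

```python
from datetime import date

# Hypothetical populations per three-digit ZIP prefix (not real Census data).
ZIP3_POPULATION = {"336": 1_200_000, "036": 15_000}


def generalize_zip(zip_code, zip3_population, threshold=20_000):
    """Keep the first three digits only if the prefix area has more than
    `threshold` people; otherwise return "000"."""
    prefix = zip_code.strip()[:3]
    if zip3_population.get(prefix, 0) > threshold:
        return prefix
    return "000"


def generalize_age(age):
    """Aggregate ages 90 and above into a single category."""
    return "90 or older" if age >= 90 else str(age)


def birth_year_or_none(birth, as_of):
    """Reduce a birth date to its year, or to None when the year alone
    would reveal an age of 90 or older on the `as_of` date."""
    had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
    age = as_of.year - birth.year - (0 if had_birthday else 1)
    return None if age >= 90 else birth.year


print(generalize_zip("33620", ZIP3_POPULATION))  # prefix above threshold
print(generalize_zip("03601", ZIP3_POPULATION))  # prefix at or below threshold
print(generalize_age(93))
print(birth_year_or_none(date(1930, 5, 1), date(2003, 10, 3)))
```

Other dates directly related to the individual, such as admission and discharge dates, would likewise be reduced to the year alone.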

2. Statistical Method

A Covered Entity may obtain certification from a statistician familiar with accepted de-identification methods that there is a “very small” risk that recipients of the data would be able to identify individuals using the information alone or in combination with information from other sources.

If you have questions about de-identified data in research, please contact:
Vinita Witanachchi, J.D., USF DRC Research Privacy Officer, HIPAA Program Coordinator
Telephone: (813) 974-5478 or e-mail:

v1.0: 10-3-03