Human Research Protection Program Guidance Document

Ensuring Privacy When Mining Data for Research

The purpose of this document is to provide guidance for researchers and informatics departments on common uses for aggregate datasets and appropriate ways to access and share data from large data sources for research purposes while ensuring appropriate privacy protections.

What are Common Uses for Aggregate Data with PHI?

  • Preparatory to research activity to determine feasibility – one may review records without specific authorization, unless the reviewer is not a member of the workforce of the Health System (HS); however, one cannot contact individuals without Institutional Review Board (IRB) approval.
  • For research purposes, which should include:
      • IRB approval for the study, unless it involves de-identified data (see below), and
      • A waiver or partial waiver of HIPAA authorization granted by the IRB, OR research consent from participants to collect, use and disclose their data
  • Quality improvement purposes – does not require IRB approval, but may require departmental review and approval.

What are Common Sources of Large Data Sets within the Health System?

  • Electronic Health Records (EHRs) or similar internal systems that house clinical data from HS patients
  • Data repositories – data collected and managed by a HS department or research group; usually the research data repositories are reviewed and approved by the IRB.
  • Other HS data gathered and/or extracted by an external group.

When is IRB Approval Needed?

Because the IRB acts as the privacy board for research, there may be certain requirements to access, use or disclose information for research, depending on the type of aggregate data needed:

  • Anonymized/de-identified data (does not contain any of the 18 HIPAA identifiers/PHI) – no IRB approval required unless the researcher has access to a link allowing re-identification of the data.
  • Identifiable data (contains at least one of the 18 HIPAA identifiers/PHI) – IRB approval required.
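
The two cases above can be read as a simple decision rule. The following is a minimal Python sketch of that rule for illustration only; the `DatasetProfile` class, its field names and the `irb_approval_required` function are hypothetical and are not part of any HS system. It does not replace a determination by the IRB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical description of a requested aggregate dataset."""
    # True if the data contain at least one of the 18 HIPAA identifiers/PHI.
    contains_hipaa_identifiers: bool
    # True if the researcher has access to a link allowing re-identification.
    researcher_has_reid_link: bool


def irb_approval_required(profile: DatasetProfile) -> bool:
    """Apply the rule stated in the two bullets above (illustrative only)."""
    if profile.contains_hipaa_identifiers:
        # Identifiable data: IRB approval required.
        return True
    # De-identified data: approval is needed only if a re-identification
    # link is available to the researcher.
    return profile.researcher_has_reid_link
```

For example, a de-identified dataset with no re-identification link available to the researcher would evaluate to `False`, while the same dataset with such a link would evaluate to `True`.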

What Should HS Researchers Know when Requesting Data with PHI?

  • Ensure that you have appropriate IRB approval/waiver of HIPAA authorization for your research, and if you don’t qualify for a waiver, that you’ve obtained research consent from participants.
  • Have your IRB documentation ready when requesting aggregate data from informatics.
  • If data will be extracted by a person who is not part of your research team, it is best to include this information in your IRB protocol.
  • Ensure that you receive, use, store and dispose of data in a secure manner according to HS policies.

What Are Some Best Practices for HS Informatics Departments that Extract and Release Data with PHI for Research Purposes?

  • Include a process to request IRB documentation from the data requestor.
  • Request protocol data parameters, which are often outlined in the IRB submission, protocol or data collection sheet.
  • Release only the minimum necessary data required for the research study and approved by the IRB.
  • Maintain IRB and other relevant documentation in the recipient’s file for each data release.
  • If unsure whether the data requested is for research purposes, confirm with the requestor, obtain necessary documentation and/or contact the IRB.
  • Transmit data to the researcher in a secure manner according to HS policies.
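
As an illustration of the "minimum necessary" and documentation practices above, the following Python sketch refuses a release when no IRB documentation is on file and otherwise keeps only the fields approved for the study. The function name, its parameters and the field names in the usage note are hypothetical assumptions, not an HS tool or policy; secure transmission and storage are outside the scope of this sketch.

```python
from typing import Any, Dict, Iterable, List


def prepare_release(
    records: Iterable[Dict[str, Any]],
    approved_fields: Iterable[str],
    irb_documentation_on_file: bool,
) -> List[Dict[str, Any]]:
    """Return copies of the records limited to the approved fields.

    Hypothetical helper: `approved_fields` stands in for the data
    parameters outlined in the IRB submission, protocol or data
    collection sheet.
    """
    if not irb_documentation_on_file:
        # Do not release data with PHI without IRB documentation.
        raise PermissionError(
            "IRB documentation is required before releasing data with PHI"
        )
    approved = set(approved_fields)
    # Drop every field that is not explicitly approved.
    return [
        {name: value for name, value in record.items() if name in approved}
        for record in records
    ]
```

A call such as `prepare_release(rows, ["age_group", "diagnosis_code"], True)` would return the rows with only those two hypothetical fields, dropping any other column present in the extract.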

Additional Resources:

  • Refer to the HRPP ePHI Guidance document (Tools and Guidance section) for more information on data security.
  • For questions regarding whether your activity is considered research, contact the IRB or call (516) 321-2100.
  • HRPP policies are available online under “Policies and Procedures”.
  • Information is available on REDCap, which is an online electronic research data management tool.
  • Ensure that data are secured according to HS standards. IS policies are available through HealthPort and the IS web page.

v.3.18.14 NSLIJHS