PCORnet Data Strategy – Draft

Summary:

  • Promote and maintain a PCORnet common data model (CDM) that includes designations for both “core” (required) data domains and “optional” data domains or elements, with a governance process for reviewing and approving CDM extensions (see below) that PCORnet stakeholders (e.g., CRGs and PCORnet studies) submit for possible incorporation into the core CDM.
  • Develop the governance and processes necessary to allow network participants to pilot their own CDM extensions, including the use of external sources, while ensuring alignment when multiple efforts target the same element or domain. This will:
      • Enable organizations to plan participation in the network at varying levels of engagement, data standardization, and data specialization as resources permit, and
      • Allow the resulting extensions to be utilized by other network partners.
  • Support a distributed lead role for research while maintaining PCORnet’s commitment to the use of high-quality, well-characterized data.
  • The PCORnet Executive Committee shall charge members of the PCORnet Research and Data Committees and the Distributed Research Operations Center to work together to identify high-value examples of fundable research that would benefit from this approach and to define the processes by which those studies would be supported.

Rationale and Approach:

The initial phases of PCORnet have focused on comparative effectiveness research (CER) utilizing those data domains commonly available in electronic health record (EHR) and claims data sources (e.g., demographics, diagnoses, procedures, medications, and laboratory results). This focus has allowed PCORnet to leverage previous federal investments in distributed research network (DRN) infrastructure and to demonstrate value to stakeholders by quickly engaging in query activity that can ask questions of data collected on millions of patients. As illustrated by the level of interest in the Collaborative Research Groups (CRGs) and the range of responses to the Common Data Model (CDM) Survey commissioned by the PCORnet Data Committee, there are dozens of additional data elements, data sources, or data domains that are of interest and already curated by at least one network or network participant.

Given resource constraints, expansions or modifications to the CDM should be in service of the overall goal of sustainability, which is achieved when PCORnet can answer the research questions of those stakeholders who are in a position to provide financial support for network operations.

To address this need, we recommend transitioning to an iterative, collaborative approach that seeks to leverage external data resources as a way for multiple institutions to work together to prototype or pilot topic-specific extensions to the CDM. Every expansion of the CDM increases the overall maintenance costs of the network, both to partners [who create and maintain extract, transform, and load (ETL) procedures] and to the Coordinating Center (which defines data checks and data characterization routines). As a result, in some cases it may be more cost-effective to incorporate external sources that have already standardized data for a given domain or population than to try to include that information in the core CDM tables, though there will still be a need to characterize the external data to assess their overall fitness for use.

As the breadth of the CDM expands, the number of domains that are truly “common” for large numbers of patients will decrease. This means that future components may only be applicable to those networks or CRGs with an interest in a given topic area, potentially limiting their use in “network-wide” queries. Future expansions of the PCORnet CDM will need to balance efforts that expand the comprehensiveness of existing domains (e.g., including additional laboratory results or vital signs) with the inclusion of domains that may only apply to a subset of the network population (e.g., tumor staging for cancer patients). It is also important to note that regardless of how many resources are allocated to the instantiation and support of a CDM, there will always be studies that rely on data stored outside of it. Information about the nature of these outside data will help potential clients of PCORnet assess the fitness of possible performance sites for research.

Thus, we recommend moving to a more flexible common data model framework that defines both a core CDM and a process by which data can be incorporated back into the CDM, allowing subsets of the network to pilot their own extensions that follow this flexible CDM framework. Potential extensions may come from the CRGs, PCORnet studies, or network participants who have access to other external data sources that can be used for research. This, in turn, can inform the definition of sponsored data elements or domains, which can be developed into optional PCORnet CDM extensions that are created for a specific project. Using standard governance and prioritization processes, which would ensure that multiple efforts around the same domain or element are aligned and harmonized, participating organizations can develop, test, and implement these elements with community input and specific funding from the sponsor. This would eventually include projects where an individual site or network functions as a Coordinating Center. These data elements may be optional for sites not participating in the specific sponsored research, but data quality checks and other guidance will be publicly available. Should it become apparent that a data element or domain is critical, it could be promoted to a required data element or domain and be incorporated back into the core CDM. The result will be an iterative evolution in PCORnet’s data strategy that allows for more distributed development, with coordination to ensure alignment across the disparate efforts. A core CDM can be linked to external data sources that function as optional extensions that can answer specific research questions.
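The lifecycle described above (an extension is proposed as optional, overlapping efforts on the same domain are aligned, and a critical extension is promoted into the core CDM) can be illustrated with a minimal sketch. This is an illustration only, not a PCORnet specification: the class names, method names, and the sponsor "Oncology CRG" are hypothetical, and the rule that a second proposal for an already-registered domain is rejected is an assumed, simplified stand-in for the governance and prioritization processes.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    CORE = "core"          # required of all network partners
    OPTIONAL = "optional"  # piloted extension; required only of participants


@dataclass
class Component:
    domain: str            # illustrative name, e.g. "tumor_staging"
    sponsor: str           # hypothetical: who proposed or funds the extension
    status: Status = Status.OPTIONAL


@dataclass
class CdmRegistry:
    """Hypothetical registry of core and optional CDM components."""
    components: dict[str, Component] = field(default_factory=dict)

    def propose(self, domain: str, sponsor: str) -> Component:
        """Register an optional extension, or flag overlap with existing work."""
        existing = self.components.get(domain)
        if existing is not None:
            # Assumed alignment rule: a second effort on the same domain must
            # be harmonized with the first rather than registered separately.
            raise ValueError(
                f"{domain!r} is already registered by {existing.sponsor}; "
                "align with that effort"
            )
        component = Component(domain, sponsor)
        self.components[domain] = component
        return component

    def promote(self, domain: str) -> None:
        """Move an extension into the core CDM (after governance review)."""
        self.components[domain].status = Status.CORE

    def required_domains(self) -> list[str]:
        """Return the domains every partner must populate, sorted by name."""
        return sorted(
            d for d, c in self.components.items() if c.status is Status.CORE
        )


registry = CdmRegistry()
registry.components["diagnoses"] = Component("diagnoses", "PCORnet", Status.CORE)
registry.propose("tumor_staging", "Oncology CRG")  # piloted as optional
registry.promote("tumor_staging")                  # later judged critical
```

In this sketch an optional component never appears in `required_domains()` until it is promoted, which mirrors the idea that sites outside a sponsored project are not obliged to populate it.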

Alternatives Considered:

Discussions are ongoing about the scope of a new major release of the CDM (v4.0) and how this expansion can support the activities of the CRGs in time for Data Characterization Cycle 4, currently slated for execution in late 2017/early 2018. Given the time needed to complete the CDM feedback cycles and approval process, to define and program new data characterization routines and data checks, and to implement any CDM changes, any new content for CDM v4.0 would need to be specified around the end of Q1 of CY2017. Options considered by the Data Committee for this expansion included the following:

  • A focus on those elements that are common across CRGs, available within CDRNs, and can be incorporated into the existing CDM with relative ease (e.g., additional laboratory results).
  • Incorporation of a small number of additional domains that are of high value to many CRGs (e.g., Pulmonary Function Test results) and relatively available among CDRNs.
  • Piloting the process described above to allow for the distributed development of optional CDM components.

Data Committee Recommendation:

To allow the CRGs and other stakeholders to more rapidly develop and define their data elements and domains of interest, the PCORnet Data Committee recommends that CDM v4.0 be focused on the process to pilot the distributed development of optional CDM components. To support this transition, it is critical to first establish the processes and governance to ensure that this approach will align with the expectations of stakeholders such as the CRGs and the support provided to the Coordinating Center and network partners. To that end, under the oversight of the PCORnet Executive Committee (EC), a series of high-value, targeted research questions or topics should be identified that could be answered using this approach. Members of the PCORnet Research and Data Committees should then work in collaboration with the PCORnet DRN Operations Center to identify the issues that must be addressed and the funding necessary to define a repeatable process that ensures successful implementation and continued alignment. These findings would then be incorporated into the existing processes and support that govern the expansion of and extensions to the PCORnet CDM. They would also inform the creation of policies that clearly articulate the commitments expected of partners under this new strategy.