RWR CONTEXT

Study sponsors should ensure they have documented policies and procedures in place that enable them to address these FDA recommendations, so that they can systematically assess and use (appropriate quality) registry data as a source of real world data (RWD) to support their drug development strategies, new drug applications (NDAs)/marketing authorisation applications (MAAs), label extensions and post-marketing commitments (PMCs)/post-marketing requirements (PMRs).

In November 2021, the FDA published its draft guidance “Real-World Data: Assessing Registries to Support Regulatory Decision-Making for Drug and Biological Products” [1].

This FDA guidance aligns with the EMA’s Guideline on Registry-Based Studies, which was published in October 2021 [2]. We’ll discuss the EMA Guideline in detail in the March 2022 RWR Regulatory Updates Report.

According to the FDA, whether registry data are fit-for-use in regulatory decision-making depends on the attributes that support the collection of relevant and reliable data as well as additional scientific considerations related to study design and study conduct (as per Section I of the Draft FDA Guidance) [1].

In this article we explore the scientific aspects (e.g., strengths and limitations) and quality aspects of registries (e.g., policies and procedures) that registry owners and study sponsors should consider when addressing these FDA recommendations.

What is a Registry?

Definitions are important because they provide the parameters around which the guidelines and legislation are built.  Definitions help us understand what is applicable/relevant and therefore what we need to comply with.

In the US, the term ‘registry’ is often used to describe both the data collection system (the registry) and the clinical study that uses the data from that system (the registry-based study, e.g., a non-interventional study).

So, what is a registry in the context of this latest FDA draft guidance?

Registry: A registry is defined as an organized system that collects clinical and other data in a standardized format for a population defined by a particular disease, condition, or exposure. Establishing registries involves enrolling a predefined population and collecting pre-specified health-related data for each patient in that population (patient-level data) (as per Section II of the Draft FDA Guidance) [1].

This context and definition are important because they help us understand why the FDA specifically draws out the uses of registry data in a regulatory context, e.g., to inform the design and support the conduct of either interventional studies (clinical trials) or non-interventional (observational) studies.

Meaning? Non-interventional (observational) studies are not registries…they are clinical studies that use registry data…they are registry-based studies.  So, when we talk about using registry data to support regulatory decisions, think of this in the context of registry data being used as a source of real world data (RWD) for non-interventional studies (and/or clinical trials) which generate the real world evidence (RWE) that is submitted to the FDA as part of (for example) a new drug application (NDA).

Uses of Registry Data

Registries have the potential to support medical product development, and registry data can ultimately be used, when appropriate, to inform the design and support the conduct of either interventional studies (clinical trials) or non-interventional (observational) studies (as per Section II of the Draft FDA Guidance) [1]. 

Examples of such uses include, but are not limited to:

  • Characterizing the natural history of a disease
  • Providing information that can help determine sample size, selection criteria, and study endpoints when planning an interventional study
  • Selecting suitable study participants—based on factors such as demographic characteristics, disease duration or severity, and past history or response to prior therapy—to include in an interventional study (e.g., randomized trial) that will assign a drug to assess that drug’s safety or effectiveness
  • Identifying biomarkers or clinical characteristics that are associated with important clinical outcomes of relevance to the planning of interventional and non-interventional studies
  • Supporting, in appropriate clinical circumstances, inferences about safety and effectiveness in the context of:
      • A non-interventional study evaluating a drug received during routine medical practice and captured by the registry
      • An externally controlled trial including registry data as an external control arm

The data collected in a given registry and the procedures for data collection are relevant when considering how registry data can be used. For example, registries used for quality assurance purposes related to the delivery of care for a particular health care institution or health care system tend to collect limited data related to the provision of care. Registries designed to address specific research questions tend to systematically collect longitudinal data in a defined population, on factors characterizing patients’ clinical status, treatments received, and subsequent clinical events (as per Section II of the Draft FDA Guidance) [1].

Using Registry Data to Support Regulatory Decisions

[Garbage in = Garbage out]

Image Source: https://xkcd.com/2295/ 

Before using any RWD (including registry data) for regulatory decision-making, sponsors should consider whether the data are fit-for-use by assessing the data’s relevance and reliability. The term relevance includes the availability of key data elements (patient characteristics, exposures, outcomes) and a sufficient number of representative patients for the study, and the term reliability includes data accuracy, completeness, provenance, and traceability (as per Section III.A of the Draft FDA Guidance) [1].

Data Accuracy = Correctness of collection, transmission, and processing of data (as per the Glossary of the Draft FDA Guidance) [1].

Completeness = The presence of the necessary data to address the study question, design, and analysis (as per the Glossary of the Draft FDA Guidance) [1].

Provenance = An audit trail that “accounts for the origin of a piece of data (in a database, document or repository) together with an explanation of how and why it got to the present place” (as per the Glossary of the Draft FDA Guidance) [1].

Traceability = Permits an understanding of the relationships between the analysis results (tables, listings, and figures in the study report), analysis datasets, tabulation datasets, and source data (as per the Glossary of the Draft FDA Guidance) [1].

Registry data can have varying degrees of suitability within a regulatory context, depending on several factors, including how the data are intended to be used for regulatory purposes; the patient population enrolled; the data collected; and how registry datasets are created, maintained, curated, and analyzed. Registry data collected initially for one purpose (e.g., to obtain comprehensive clinical information on patients with a particular disease) may or may not be fit-for-use for another purpose (e.g., to examine a drug-outcome association in a subset of these patients) (as per Section III.A of the Draft FDA Guidance) [1].

According to the FDA, sponsors should consider both the strengths and limitations of using registries as a source of data to generate evidence for regulatory decision-making (as per Section III.A of the Draft FDA Guidance) [1].

Registry strengths:

    • Registries may have advantages over other RWD sources, given that registries collect structured and predetermined data elements and can offer longitudinal, curated data about a defined population of patients and their corresponding disease course, complications, and medical care. 
    • Registries can systematically collect patient-reported data that medical claims datasets or EHR datasets may lack.

Registry limitations:

  • Existing registries may focus on one disease, with limited information on comorbid conditions, even after linkage to other data sources.
  • Enrolled patients may not be representative of the target population of interest due to challenges related to patient recruitment and retention.
      • For example, patients with more severe disease may be more likely to be enrolled in a registry compared to patients with milder disease; or enrolled patients might have different self-care practices, socioeconomic backgrounds, or levels of supportive care versus the entire population of interest. These issues can potentially introduce bias into analyses that make use of registry data.
  • Additional potential limitations of registries involve issues with data heterogeneity (e.g., different clinical characteristics across various populations) and variation in approaches used to address data quality.

Relevance of Registry Data

When considering whether to use an existing registry for regulatory purposes, a sponsor’s overall assessment of the relevance of registry data should consider whether the registry is adequate for evaluating the scientific objectives (as per Section III.B of the Draft FDA Guidance) [1].

For example, the EMA recommends conducting a feasibility analysis prior to writing the study protocol, to guide its development and facilitate the discussion with national competent authorities (e.g., FDA, EMA), health technology assessors (HTAs) and other parties. The feasibility analysis should be performed in collaboration with registry holders and include the following information, as applicable (as per Section 3.3 of the EMA – Guideline on Registry-Based Studies, October 2021) [2]:

  • General Description – General description of the registry or network of registries; the Checklist for evaluating the suitability of registries for registry-based studies can be used to prepare this description; the epidemiology of the disease, medicines use and standards of care applied in the country or registry setting should be described if relevant for the specific study.
  • Availability of Core Data Elements – Analysis of the availability in the registry of the core data elements needed for the planned study period (as availability of data elements may vary over time), including relevant confounding and effect-modifying variables, whether they are mapped to any standard terminologies (e.g., MedDRA, OMOP common data model), the frequency of their recording and the capacity to collect any additional data elements or introduce additional data collection methods if necessary.
  • Quality and Completeness of the Data Elements – Analysis of the quality, completeness and timeliness of the available data elements needed for the study, including information on missing data and possible data imputations, risk of duplicate data for the same patient, results of any verification or validation performed (e.g., through an audit), analysis of the differences between several registries available in the network and their possible impact on data integration, description of the methods applied for data linkage as applicable, and possible interoperability measures that can be adopted.
  • Adverse Event Reporting Processes – Description of processes in place for the identification of adverse events and prompt reporting of suspected adverse reactions occurring in the course of treatments, and capacity to introduce additional processes for their collection and reporting if needed.
  • Study Size and Patient Recruitment – Study size estimation and analysis of the time needed to complete patient recruitment for the clinical study by providing available data on the number of centres involved in the registry(ies), numbers of registered patients and active patients, number of new patients enrolled per month/year, number of patients exposed to the medicinal product(s) of interest, duration of follow-up, missing data and losses to follow-up, need and possibility to obtain informed consent.
  • Bias – Evaluation of any potential information bias, selection bias due to the inclusion/exclusion criteria of centres (e.g., primary, secondary or tertiary care) and patients, potential time-related bias between and within registry(ies), and potential bias due to loss to follow-up.
  • Confounding – Evaluation of any potential confounding that may arise, especially if some data elements cannot be collected or measured.
  • Analytical Issues – Analytical issues that may arise based on the data characteristics and the study design.
  • Data Privacy – Any data privacy issues, possible limitations in relation to informed consent and governance related issues such as data access, data sharing and funding source.
  • Suitability of the Registry – Overall evaluation of the suitability of the registry for the specific study, taking into account any missing information on the above-mentioned aspects.
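
The "Study Size and Patient Recruitment" item above is largely arithmetic, so a rough sketch may help when drafting a feasibility analysis. The function below is an illustration only and is not taken from the EMA guideline: the name months_to_recruit, its parameters, and the simplifying assumption that enrolment and consent rates stay constant over time are all hypothetical.

```python
import math


def months_to_recruit(target_size, eligible_now, new_eligible_per_month,
                      consent_rate=1.0):
    """Rough number of months needed to reach the target study size.

    Hypothetical simplification: a fixed fraction (consent_rate) of the
    currently eligible registry patients and of each month's newly
    enrolled eligible patients can be included in the study.
    """
    available_now = math.floor(eligible_now * consent_rate)
    if available_now >= target_size:
        return 0  # the registry already holds enough consenting patients
    monthly_intake = new_eligible_per_month * consent_rate
    if monthly_intake <= 0:
        raise ValueError("target cannot be reached without new enrolment")
    return math.ceil((target_size - available_now) / monthly_intake)


# Example with made-up numbers: 300 patients needed, 120 eligible today,
# 15 new eligible patients per month, half of whom are expected to consent.
estimate = months_to_recruit(300, 120, 15, consent_rate=0.5)
```

In practice the inputs would come from the registry holder (numbers of registered and active patients, patients exposed to the product of interest, losses to follow-up), and a real estimate would need to account for how those figures change over time.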

Reliability of Registry Data

When considering using an existing registry or establishing a new registry, sponsors should ensure there are processes and procedures to govern (as per Section III.C of the Draft FDA Guidance) [1]:

  • Registry operation
  • Education and training of registry staff
  • Resource planning
  • General practices that help ensure the quality of the registry data. 

Such governance attributes help ensure that the registry can achieve its objectives and should include, but not be limited to:

  • An established data dictionary and rules for the validation of queries and edit checks of registry data (as applicable), to be made available for those who intend to use the registry data to perform analyses
      • To support the collection of reliable data within a registry, a registry’s data dictionary should include:
        1. Data elements and how the data elements are defined 
        2. Ranges and allowable values for the data elements 
        3. Reference to the source data for the data elements
  • Defined processes and procedures for the registry, such as:
      • Data collection, curation, management, and storage, including processes in place to help ensure that data within a registry can be confirmed by source data (as applicable) for that registry
      • Plans for how patients, researchers, and clinicians will access and interact with the registry data and the registry’s data collection systems
      • Terms and conditions for use of the registry data by parties other than the registry creator (e.g., terms and conditions a sponsor should satisfy to permit combining the registry data with data from another source)
  • Conformance with 21 CFR part 11, as applicable, including maintenance of access controls and audit trails to demonstrate the provenance of the registry data and to support traceability of the data [3]
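
To make the data dictionary requirement more concrete, here is a minimal sketch of how data elements, their definitions, their ranges and allowable values, and their source references could be represented and used for edit checks. The class DataElement, the function edit_check, and the two example elements are hypothetical; the draft guidance does not prescribe any particular format.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class DataElement:
    name: str                 # the data element
    definition: str           # how the data element is defined
    source: str               # reference to the source data
    allowed_values: Optional[frozenset] = None   # allowable values, if coded
    min_value: Optional[float] = None            # lower bound, if numeric
    max_value: Optional[float] = None            # upper bound, if numeric

    def check(self, value: Any) -> list:
        """Return a list of problems found for one value (empty if none)."""
        if value is None:
            return [f"{self.name}: missing value"]
        problems = []
        if self.allowed_values is not None and value not in self.allowed_values:
            problems.append(f"{self.name}: {value!r} is not an allowable value")
        if self.min_value is not None or self.max_value is not None:
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{self.name}: {value!r} is not numeric")
            elif self.min_value is not None and value < self.min_value:
                problems.append(f"{self.name}: {value} is below {self.min_value}")
            elif self.max_value is not None and value > self.max_value:
                problems.append(f"{self.name}: {value} is above {self.max_value}")
        return problems


# Hypothetical dictionary with one coded and one numeric element.
DATA_DICTIONARY = {
    element.name: element
    for element in [
        DataElement("sex", "Sex recorded at enrolment", "enrolment form",
                    allowed_values=frozenset({"F", "M", "U"})),
        DataElement("age_years", "Age at enrolment in completed years",
                    "enrolment form", min_value=0, max_value=120),
    ]
}


def edit_check(record: dict) -> list:
    """Run every dictionary rule against one registry record."""
    problems = []
    for name, element in DATA_DICTIONARY.items():
        problems.extend(element.check(record.get(name)))
    return problems
```

Keeping the rules in the dictionary itself, rather than scattered through data entry code, is one way to make them available to those who intend to analyse the registry data.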

Factors that FDA considers when assessing the reliability of registry data include (as per Section III.C of the Draft FDA Guidance) [1]:

  • How the data were collected (data accrual)
  • Whether the registry personnel and processes in place during data collection and analysis provide adequate assurance that errors are minimized and that data integrity is sufficient. 
  • Whether the registry has privacy and security controls in place to ensure that the confidentiality and security of data are preserved.

Quality Considerations when Using RWD from Registries to Support Regulatory Decisions

Based on the draft guidance provided by the FDA in their November 2021 publication, what quality aspects of registries (e.g., policies and procedures) should registry owners and study sponsors consider when addressing these FDA recommendations?

Quality Consideration #1: Policies and Procedures to Support FDA Review of Submissions that Include Registry Data (as per Section III.E of the Draft FDA Guidance) [1].

Sponsors interested in using a specific registry as a data source to support a regulatory decision should meet with the relevant FDA review division before conducting a study that will include registry data (as per Section III.E of the Draft FDA Guidance) [1]. 

Sponsors should:

  • Confer with FDA regarding:
      • The ability to accurately define and evaluate the target population based on the planned inclusion and exclusion criteria
      • Which data elements will come from the registry (versus other data sources) and their adequacy, as well as the frequency and timing of data collection
      • The planned approach for linking the registry to another registry or other data system, when linking is anticipated
      • The planned methods to ascertain and validate outcomes, including diagnostic requirements and the level of validation or adjudication of outcomes FDA agrees is needed
      • The planned methods to validate the diagnosis of the disease being studied.
  • Submit protocols and statistical analysis plans for FDA review and comment before conducting an interventional or a non-interventional study when including data from registries.
  • Predefine all essential elements of a registry study’s design, analysis, and conduct in the protocol and describe how each element will be ascertained from the selected RWD source or sources.
  • Ensure that patient-level data are provided to FDA in accordance with applicable legal and regulatory requirements.
  • Ensure that source records necessary to verify the RWD are made available for inspection as applicable.

Quality Consideration #2: Conduct a feasibility analysis of the registry to guide protocol development and facilitate discussions with regulators (as per Section III.B of the Draft FDA Guidance) [1].

  • Conduct a feasibility analysis prior to writing the study protocol, to guide its development and facilitate the discussion with national competent authorities (e.g., FDA, EMA), health technology assessors (HTAs) and other parties. The feasibility analysis should be performed in collaboration with registry holders (as per Section 3.3 of the EMA – Guideline on Registry-Based Studies, October 2021) [2].

Quality Consideration #3: Policies and procedures should be in place to support the reliability of the registry data, including (as per Section III.C of the Draft FDA Guidance) [1]:

  • Pre-specifying data validation rules for queries and edit checks of registry data
  • Validating the electronic systems used to collect registry data
  • Enabling FDA and persons interested in using the registry’s data to assess the quality of the data, including to help address issues such as errors in coding or interpretation of the source document or documents, as well as data entry, transfer, or transformation errors. 
  • Plans for how patients, researchers, and clinicians will access and interact with the registry data and the registry’s data collection systems
  • Terms and conditions for use of the registry data by parties other than the registry creator (e.g., terms and conditions a sponsor should satisfy to permit combining the registry data with data from another source)

Quality Consideration #4: Policies and Procedures for Linking a Registry to Another Registry or Another Data System (as per Section III.D of the Draft FDA Guidance) [1]. 

If a registry is to be populated with data from another data system, sponsors should:

  • Consider the potential impact of the additional data on overall integrity of the registry data. 
  • Use strategies to correct for redundant data, to resolve any inconsistencies in the data, and to address other potential problems, such as the ability to protect patient privacy while transferring data securely. 
  • Have a plan for addressing the adequacy of patient-level linkages (i.e., that the same patient is being matched). 
  • Consider any jurisdictional requirements (e.g., country-specific laws) when seeking to link patient-level data to another registry or data system.
  • Consider whether the data sources to be linked are interoperable and support appropriate informatics strategies to ensure data integration.
  • Ensure that:
    1. Sufficient testing is conducted to demonstrate interoperability of the linked data systems, 
    2. The automated electronic transmission of data elements to the registry functions in a consistent and repeatable fashion, and 
    3. Data are accurately, consistently, and completely transmitted.
  • Use predefined rules to check for logical consistency and value ranges to confirm that data within a registry were retrieved accurately from a linked data source and that the operational definitions for the linked variables are aligned.
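
The checks on patient-level linkage and logical consistency listed above could be sketched as follows. This is an illustration under stated assumptions, not a method from the draft guidance: the function check_linkage, the field names patient_id, birth_year and sex, and the assumption that the linked source should hold exactly one row per patient are all hypothetical.

```python
from collections import Counter


def check_linkage(registry_rows, linked_rows, key="patient_id",
                  confirm_fields=("birth_year", "sex")):
    """Return (patient_id, issue) pairs found when comparing linked data
    against the registry.

    Assumes, for illustration, one row per patient in the linked source,
    so repeated identifiers are reported as possible redundant data.
    """
    issues = []
    registry_by_id = {row[key]: row for row in registry_rows}

    # Redundant data: the same patient appears more than once.
    counts = Counter(row[key] for row in linked_rows)
    for pid, n in counts.items():
        if n > 1:
            issues.append((pid, f"{n} linked records for one patient"))

    for row in linked_rows:
        pid = row[key]
        registry_row = registry_by_id.get(pid)
        if registry_row is None:
            issues.append((pid, "no matching patient in registry"))
            continue
        # Adequacy of the match: fields that should agree in both sources.
        for field in confirm_fields:
            if registry_row.get(field) != row.get(field):
                issues.append((pid, f"{field} disagrees between sources"))
    return issues
```

A real linkage plan would also need to cover the points this sketch leaves out, such as secure transfer, privacy protection, jurisdictional requirements, and alignment of operational definitions for the linked variables.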

Quality Consideration #5: Documentation of the Process Used to Validate the Transfer of Data (as per Section III.D of the Draft FDA Guidance) [1].

Documentation of the process sponsors used to validate the transfer of data should be available for FDA to review during sponsor inspections. Sponsors should also ensure that software updates to the registry database or additional data sources do not affect the integrity, interoperability, and security of data transmitted to the registry. For example, issues such as the correct temporal alignment of linked data and registry data should be considered (as per Section III.D of the Draft FDA Guidance) [1].

The appropriateness of using additional data sources also depends on how the sponsor intends to use the linked data and the ability to obtain similar data for all patients. For example, for each potential data source, the sponsor should consider whether:

  • The linkage is appropriate for the proposed research question (e.g., the additional data source provides relevant clinical detail and/or long-term follow-up information)
  • The data can be accurately matched to patients in the registry and whether linking records between the two (or more) databases can be performed accurately
  • The variables of interest in the registry and additional data sources have consistent definitions and reliable ascertainment approaches
  • The data have been captured with sufficient accuracy, consistency, and completeness to meet registry objectives

After a sponsor decides to use an additional data source or sources to supplement the registry, the sponsor should: 

  • Develop the approach and algorithms needed to link such data to a registry.
  • Determine how data integrity will be evaluated, including how assessments of any inaccuracies introduced by the linkage (e.g., overcounts of a particular data measure) will be made. 
  • Use appropriate methods for data entry, coding, cleaning, and transformation for each linked data source.

Quality Consideration #6: Policies and Procedures to Support Data Management Strategies, including (as per Section III.C of the Draft FDA Guidance) [1]:

  • Standard Operating Procedures (SOPs) for Data Aggregation and Data Curation: Trained staff should follow standard operating procedures to aggregate data for a registry and carry out data curation
  • Implement and maintain version control by documenting the date, time, and originator of data entered in the registry; performing preventative and/or corrective actions to address changes to the data (including flagging erroneous data without deleting the erroneous data, while inserting the corrected data for subsequent use); and describing reasons for any changes to data without obscuring previous entries.
  • Ensure data transferred from another data format or system are not altered in the migration process
  • Seek to integrate data in the registry that were previously collected using data formats or technology (e.g., operating systems, hardware, software) that are now outdated
  • Account for changes in clinical information over time (e.g., criteria for disease diagnosis, cancer staging)
  • Explain the auditing rules and methods used and the mitigation strategies used to reduce errors 
  • Describe the types of errors that were identified based on audit findings and how the data were corrected
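
The version control item above (recording date, time, and originator; flagging erroneous data without deleting it; documenting the reason for each change) can be illustrated with an append-only store. This is a minimal in-memory sketch with hypothetical names (Entry, AuditedStore); it is not a 21 CFR part 11 compliant system and leaves out access controls, persistence, and electronic signatures.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass(frozen=True)
class Entry:
    patient_id: str
    element: str
    value: Any
    originator: str
    recorded_at: datetime          # date and time of entry (UTC)
    reason: Optional[str] = None   # reason for change, set on corrections


class AuditedStore:
    """Append-only store: entries are never edited or deleted."""

    def __init__(self):
        self._entries = []
        self._flagged = set()  # indexes of entries flagged as erroneous

    def record(self, patient_id, element, value, originator, reason=None):
        entry = Entry(patient_id, element, value, originator,
                      datetime.now(timezone.utc), reason)
        self._entries.append(entry)
        return len(self._entries) - 1

    def correct(self, index, new_value, originator, reason):
        """Flag an entry as erroneous and append the corrected value."""
        if not reason:
            raise ValueError("a reason is required for every correction")
        old = self._entries[index]
        self._flagged.add(index)  # the erroneous entry stays visible
        return self.record(old.patient_id, old.element, new_value,
                           originator, reason)

    def current_value(self, patient_id, element):
        """Latest value that has not been flagged as erroneous."""
        for i in range(len(self._entries) - 1, -1, -1):
            e = self._entries[i]
            if (i not in self._flagged and e.patient_id == patient_id
                    and e.element == element):
                return e.value
        return None

    def history(self, patient_id, element):
        """Full trail as (entry, flagged_as_erroneous) pairs, oldest first."""
        return [(e, i in self._flagged)
                for i, e in enumerate(self._entries)
                if e.patient_id == patient_id and e.element == element]
```

For example, a weight entered as 700 kg can be corrected to 70 kg by calling correct with a reason; the original entry remains in the history, marked as erroneous, while current_value returns the corrected figure.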

Quality Consideration #7: Periodic Assessment of Data Consistency, Accuracy and Completeness (as per Section III.C of the Draft FDA Guidance) [1].

  • Adequate controls should be in place to ensure confidence in the reliability, quality, and integrity of the electronic source data [4]
  • Indicators of data consistency, accuracy, and completeness should be assessed periodically, with the frequency dependent on the purposes of the registry data (e.g., for the sole purpose of facilitating recruitment in a randomized controlled trial versus using the registry data in an interventional or non-interventional study analysis). 
  • Routine descriptive statistical analyses should be performed to detect the extent of any missing data, inconsistent data, outliers, and losses to follow-up
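
As an illustration of the routine descriptive analyses mentioned above, the sketch below computes completeness per data element, flags outliers, and lists patients who may be lost to follow-up. The function names, the 1.5 × IQR outlier rule, and the 365-day follow-up gap are assumptions chosen for the example; the draft guidance does not specify particular statistics or thresholds.

```python
from datetime import date
from statistics import quantiles


def completeness(records, elements):
    """Fraction of records with a non-missing value, per data element."""
    if not records:
        return {element: 0.0 for element in elements}
    total = len(records)
    return {
        element: sum(r.get(element) is not None for r in records) / total
        for element in elements
    }


def iqr_outliers(values, k=1.5):
    """Values more than k interquartile ranges outside the quartiles.

    Uses statistics.quantiles with its default ("exclusive") method and
    needs at least two values.
    """
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def lost_to_follow_up(last_visit_by_patient, as_of, max_gap_days=365):
    """Patients whose last recorded visit is older than max_gap_days."""
    return sorted(
        pid for pid, last_visit in last_visit_by_patient.items()
        if (as_of - last_visit).days > max_gap_days
    )
```

Running such checks on a schedule, and keeping the results, gives registry owners and sponsors a record of how data consistency, accuracy, and completeness change over time.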

Conclusion

Whether registry data are fit-for-use in regulatory decision-making (e.g., as a data source for non-interventional studies) depends on the attributes that support the collection of relevant and reliable data as well as additional scientific considerations related to study design and study conduct (as per Section I of the Draft FDA Guidance) [1].

What does this mean for sponsors who are looking to utilise existing disease registries and their associated real world data (RWD) to support their drug development and life cycle management activities?

Study sponsors should ensure they have documented policies and procedures in place that enable them to address these FDA recommendations, so that they can systematically assess and use (appropriate quality) registry data as a source of real world data (RWD) to support their drug development strategies, new drug applications (NDAs)/marketing authorisation applications (MAAs), label extensions and post-marketing commitments (PMCs)/post-marketing requirements (PMRs).

Examples of the policies, procedures and documentation recommended in the draft FDA guidance [1] include:

    1. Policies and procedures to support FDA review of submissions that include registry data (Study Sponsor).
    2. Conducting a feasibility analysis of the registry to guide protocol development and facilitate discussions with regulators (Study Sponsor).
    3. Policies and procedures to support the reliability of the registry data (Registry Owner).
    4. Policies and procedures for linking a registry to another registry or another data system (Registry Owner).
    5. Documentation of the process(es) used to validate the transfer of data (Registry Owner and Study Sponsor).
    6. Policies and procedures to support data management strategies (Registry Owner and Study Sponsor).
    7. Periodic assessment of data consistency, accuracy, and completeness (Registry Owner and Study Sponsor).

References

1. Draft FDA Guidance – Real-World Data: Assessing Registries to Support Regulatory Decision-Making for Drug and Biological Products Guidance for Industry (November 2021)

Link: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/real-world-data-assessing-registries-support-regulatory-decision-making-drug-and-biological-products  

2. EMA – Guideline on Registry-Based Studies (October 2021)

Link: https://www.ema.europa.eu/en/guideline-registry-based-studies-0  

3. Draft FDA Guidance – Use of Electronic Records and Electronic Signatures in Clinical Investigations Under 21 CFR Part 11 — Questions and Answers (June 2017)

Link: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/use-electronic-records-and-electronic-signatures-clinical-investigations-under-21-cfr-part-11 

4. FDA Guidance – Electronic Source Data in Clinical Investigations (September 2013)

Link: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/electronic-source-data-clinical-investigations 

Useful Links

21 CFR 11 – Electronic Records; Electronic Signatures

Link: https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A/part-11  

EUnetHTA – REQueST Tool and its vision paper (September 2019)

Link: https://www.eunethta.eu/request-tool-and-its-vision-paper/