RWR Insights | USA – Lessons Learned from FDA Reviews of External Control Arms
FEBRUARY 2023 – On 1 February 2023 the FDA published draft guidance on ‘Considerations for the Design and Conduct of Externally Controlled Trials for Drug and Biological Products’. We have provided a summary of this new draft FDA guidance earlier in the report.
Here we explore some recent examples of external control arms reviewed by the FDA, highlight the limitations identified, and provide recommendations (Lessons Learned) for optimising the acceptability of external controls for regulatory decision making.
Based on the 3 case studies below, the FDA identified the following common limitations to the acceptability of external controls for regulatory decision making:
- Post-hoc analysis (lack of prior FDA review of the protocol and SAP)
- Selection Criteria Issues (selection bias, misclassification, and confounding)
- Index Date Issues (immortal time bias)
- Comparability Issues (lack of comparability, missing data, confounding bias)
- Limited Sample Size (Lack of statistical power due to insufficient real world cohort size)
- Real world data not fit for purpose
First, let’s set the foundation and have a quick look at the purpose of a control group in a clinical trial and the trends and uses of external control arms.
What is the Purpose of a Control Group?
- Control groups have one major purpose: to allow discrimination of patient outcomes (for example, changes in symptoms, signs, or other morbidity) caused by the test treatment from outcomes caused by other factors, such as the natural progression of the disease, observer or patient expectations, or other treatment (Section 1.2 of ICH E10) [2].
- The control group experience tells us what would have happened to patients if they had not received the test treatment or if they had received a different treatment known to be effective (Section 1.2 of ICH E10) [2].
- In most situations, a concurrent control group is needed because it is not possible to predict outcome with adequate accuracy or certainty (Section 1.2 of ICH E10) [2].
- A concurrent control group is one chosen from the same population as the test group and treated in a defined way as part of the same trial that studies the test treatment, and over the same period of time (Section 1.2 of ICH E10) [2].
- The test and control groups should be similar with regard to all baseline and on-treatment variables that could influence outcome, except for the study treatment (Section 1.2 of ICH E10) [2].
- Failure to achieve this similarity can introduce a bias into the study (Section 1.2 of ICH E10) [2].
- Bias here means the systematic tendency of any aspects of the design, conduct, analysis, and interpretation of the results of clinical trials to make the estimate of a treatment effect deviate from its true value (Section 1.2 of ICH E10) [2].
- Randomization and blinding are the two techniques usually used to minimize the chance of such bias and to ensure that the test treatment and control groups are similar at the start of the study and are treated similarly in the course of the study. Whether a trial design includes these features is a critical determinant of its quality and persuasiveness (Section 1.2 of ICH E10) [2].
External Control Arms – Trends and Uses
- Accelerated approval programs for high morbidity and high unmet need diseases have driven the use of single-arm studies, that is, studies that do not include placebo or active comparator arms (i.e., no concurrent control), for drug development (Jaksa et al., 2022) [3].
- External control arms (ECAs) generated from real world data (RWD) are emerging to contextualize single-arm trial data by exploring what would happen if single-arm trial patients did not receive the study drug (Jaksa et al., 2022) [3].
- Single-arm studies speed up patient access to innovative treatments because they often require fewer patients than randomized controlled trials (RCTs) and use intermediate or surrogate endpoints (e.g., objective response rate [ORR]) (Jaksa et al., 2022) [3].
- Oncology drug development has been increasingly relying on single-arm studies: from 1992 to 2017, 67% of the Food and Drug Administration’s (FDA) accelerated approvals were based on single-arm trials (Jaksa et al., 2022) [3].
- Similar trends have been observed in health technology assessment (HTA) submissions. From 2000 to 2016, 22 submissions to the National Institute for Health and Care Excellence (NICE) in the UK were based on non-randomized data, and oncology drugs accounted for more than half of these submissions (Jaksa et al., 2022) [3].
Case Study #1 – Rozlytrek (Entrectinib) for the Treatment of ROS1-Positive, Advanced Non-Small Cell Lung Cancer (NSCLC)
FDA (NDA 212725) = Priority Review. Breakthrough Therapy. Orphan Drug Designation. Accelerated Approval (15 August 2019) [4] [5].
This study report reviewed by the FDA contained a comparative analysis of time to treatment discontinuation (TTD), progression free survival (PFS), and overall survival (OS). It compared patients with ROS1-positive NSCLC receiving entrectinib in three single arm clinical trials (ALKA, STARTRK-1, STARTRK-2) and patients with ROS1-positive NSCLC receiving crizotinib in the real world captured by the Flatiron Health Analytic Database [6].
Is the Crizotinib RWE arm sufficient to establish the natural history of disease for ROS1-positive NSCLC?
FDA Review [7]:
- The crizotinib arm is unlikely to be generalizable to the entire population of patients with ROS1-positive NSCLC
- The generalizability of the RWE control arm was limited by the low rate of ROS1 testing in clinical practice and resultant sensitivity (estimated as 15%-30%) and the high proportion of community-treated patients in the selected data source.
- Examination of baseline characteristics demonstrates that the crizotinib arm is not sufficiently comparable to the entrectinib clinical trial population.
FDA Review Conclusions [8]:
- While the crizotinib population identified may be representative of patients who currently receive treatment for ROS1-positive NSCLC in the community setting, it is not generalizable to the entire ROS1-positive NSCLC population and it is not generalizable to patients enrolled in entrectinib clinical trials.
Does the study methodology provided allow for a comparison of treatment outcomes between the entrectinib arm and crizotinib arm in this study?
FDA Review [7]:
- The study identified substantial differences in time to treatment discontinuation (TTD), progression-free survival (PFS), and overall survival (OS) between study arms, all favoring the entrectinib arm.
- Differentially implemented study eligibility criteria, resultant differences in baseline criteria, and limitations in statistical modeling due to low sample size make it difficult to determine what proportion of the observed differences in rates of clinical outcomes is due to imbalances in study populations at baseline (i.e., selection bias) versus differential treatment effects of the study drugs. This limits comparison of the study arms.
- Additionally, despite a well-done attempt at defining treatment outcomes, there were limitations. TTD is complicated by treatment beyond disease progression, PFS is limited by missingness in radiographic imaging within electronic medical record data, and OS may be more subject to bias from baseline imbalances.
FDA Review Conclusions [8]:
- This study report is not adequate to allow a robust comparison of treatment outcomes between crizotinib and entrectinib study arms.
List of key external control arm limitations identified by the FDA [9]:
- Post-hoc analysis (lack of prior FDA review of the protocol and statistical analysis plan)
- Selection Bias
  - This is the greatest threat to study validity for the comparison of study arms. Substantial differences in baseline covariates were observed. While this is a generally well-done study report, it is unlikely these differences can be overcome with the provided analyses [10].
- Missing Data Among Covariates and Missing Covariates
  - The Applicant did not try to replicate all the study eligibility criteria in this RWE protocol, likely because data to implement them are missing for many inclusion and exclusion criteria in the crizotinib RWE arm [10].
  - It would have been useful for the Applicant to evaluate all eligibility criteria to the extent possible, especially baseline laboratory data [10].
  - It is noteworthy that ECOG was missing in 55.1% of patients in the crizotinib arm [10].
- Statistical Modelling…limited by sample size
- Measure of Study Outcomes
  - This study report provides a generally acceptable definition of study outcomes given limitations of available data. It does have limitations. Time to Treatment Discontinuation (TTD) is complicated by treatment beyond disease progression, Progression Free Survival (PFS) is limited by lack of radiographic imaging in EMR data, and Overall Survival (OS) may be more subject to bias from baseline imbalances [10].
  - This study report is not adequate to allow a robust comparison of treatment outcomes between crizotinib and entrectinib study arms [8].
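The baseline-imbalance and missing-covariate concerns listed above can be made concrete with a simple diagnostic. The following is a minimal Python sketch, not the FDA's or the Applicant's method: it computes the share of missing values for a covariate and a standardized mean difference (SMD) between two arms. The ECOG values are invented for illustration, the function names are hypothetical, and treating ECOG as a numeric score is a simplification.

```python
import math
from statistics import mean, variance

def missing_fraction(values):
    """Share of records where the covariate is missing (None)."""
    return sum(v is None for v in values) / len(values)

def standardized_mean_difference(arm_a, arm_b):
    """SMD using the pooled standard deviation of the observed values."""
    a = [v for v in arm_a if v is not None]
    b = [v for v in arm_b if v is not None]
    pooled_sd = math.sqrt((variance(a) + variance(b)) / 2)
    return (mean(a) - mean(b)) / pooled_sd

# Invented ECOG scores (illustration only); None marks a missing value.
trial_ecog = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0]
rwd_ecog = [1, None, 2, None, 1, None, 1, 2, None, None]

rwd_missing = missing_fraction(rwd_ecog)
smd = standardized_mean_difference(trial_ecog, rwd_ecog)
print(f"RWD ECOG missing: {rwd_missing:.0%}")
print(f"SMD (trial vs RWD): {smd:.2f}")
```

An absolute SMD above roughly 0.1 is often read as a meaningful imbalance; that threshold is a common rule of thumb and is not taken from the FDA review. Note also that the SMD here is computed on observed values only, so heavy missingness makes the comparison itself less reliable.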
Impact on Rozlytrek (Entrectinib) Approval = Label limitations. The resulting FDA label of Rozlytrek excluded Time to Treatment Discontinuation (TTD), Progression Free Survival (PFS), and Overall Survival (OS) outcomes, and only referenced improvements in Overall Response Rate (ORR) [11].
Case Study #2 – Xpovio (selinexor) for the Treatment of Adult Patients with Relapsed or Refractory Multiple Myeloma (RRMM)
FDA (NDA 212306)(Xpovio) = Regular Review. Orphan Designation. Accelerated Approval (3 July 2019) [12] [13]
In support of NDA 212306 for selinexor, the Applicant submitted the results of a phase 2b, open-label, single-arm clinical trial (STORM) and an external control arm derived from analyses of retrospectively collected electronic health record (EHR) data (Study KS-50039) [12].
Methodological Issues Identified by the FDA [14]:
- Selection Criteria Issues (selection bias, misclassification, and confounding)
  - Substantial differences in the inclusion and exclusion criteria for the STORM and the external control (Flatiron Health Analytical Database, FHAD) cohorts are likely to result in selection bias, misclassification, and confounding.
  - For example, the Applicant cited real-world overall survival (OS) of patients with penta-exposed, triple-class refractory MM as 3.5–3.7 months; however, patients with less than 4 months life expectancy were excluded from STORM. An exclusion criterion for minimal life expectancy was not implemented for the FHAD population. Differences in selection criteria between the study arms systematically ensure that the STORM cohort will have longer expected OS compared to the FHAD cohort.
- Index Date Issues (immortal time bias)
  - Systematic differences in how the index date was defined may have resulted in biased results (immortal time bias).
  - The definition of the index date has a direct effect on the length of the observed survival time intervals.
  - The index date to start assessment of overall survival, for both the STORM trial and FHAD, was the date upon which a patient failed his or her last treatment. Using this index date, some FHAD patients could have exhausted all treatment options and could not be indexed at their next treatment (the FDA noted that 27/64 FHAD patients had no subsequent treatment and should have been excluded from the study).
  - However, in STORM, all patients must survive until randomization (initiation of selinexor) by design. Thus, person-time between failure of the prior therapy and randomization is “immortal” by design in STORM. It is unknown how many months of immortal time this represents on average.
- Comparability Issues (lack of comparability, missing data, confounding bias)
  - In addition to differences in inclusion and exclusion criteria, additional factors result in a lack of comparability between the FHAD and STORM cohorts.
  - The RWD analysis compares patients in STORM, who are sufficiently healthy to enroll in a clinical trial, versus patients in FHAD who may or may not receive additional therapy.
  - Patients who have failed their current treatment but do not receive another treatment are likely to have a lower expectation for overall survival.
  - Patients who would likely have been more similar to the STORM cohort were explicitly excluded from the FHAD cohort.
  - FHAD patients had different prior treatment histories, and differential distributions of ECOG scores and missing data.
  - Imbalances between treatment groups were not adequately accounted for in the design or analysis phases, which likely resulted in confounding bias, primarily favoring survival for the STORM cohort.
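The index date issue described above can be illustrated with a small simulation. This is a hedged sketch under invented assumptions, not a reconstruction of the FDA's analysis: both cohorts draw survival from the same exponential distribution, so there is no treatment effect at all, but trial patients are only observed if they are still alive when the study drug is started. The mean survival, mean waiting time, cohort size, and function name are all hypothetical.

```python
import random
from statistics import median

def simulate_median_os(n=5000, mean_os_months=4.0, mean_wait_months=1.5, seed=1):
    """Return (external_median, trial_median) overall survival in months.

    Both cohorts draw survival from the SAME exponential distribution,
    measured from the same index date (failure of the last treatment).
    """
    rng = random.Random(seed)

    # External cohort: everyone is indexed at failure of the last treatment.
    external = [rng.expovariate(1 / mean_os_months) for _ in range(n)]

    # Trial cohort: a patient is only observed if still alive when the
    # study drug is started, so the waiting time is "immortal" person-time.
    trial = []
    while len(trial) < n:
        survival = rng.expovariate(1 / mean_os_months)
        wait = rng.expovariate(1 / mean_wait_months)
        if survival > wait:
            trial.append(survival)

    return median(external), median(trial)

external_median, trial_median = simulate_median_os()
print(f"External cohort median OS: {external_median:.1f} months")
print(f"Trial cohort median OS:    {trial_median:.1f} months")
```

Under these assumptions the trial cohort shows a longer median survival even though the underlying survival distribution is identical, which is the direction of bias the FDA describes.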
FDA Review Conclusions [15]:
- To enhance transparency and facilitate evaluation of validity, FDA requires submission of study protocols and statistical analysis plans (SAP) prior to study initiation. Pre-specification of study protocols and SAPs can preclude unplanned multiple testing and analyses, which may inflate Type I error probability and lead to spurious or un-reproducible findings. In support of NDA 212306 for selinexor, the Applicant submitted analyses using retrospectively collected electronic health record (EHR) data. However, neither the protocol nor the SAP for the selinexor RWD analysis was submitted to FDA prior to the conduct of the study. FDA was made aware of Study KS-50039 upon receiving the final study report on October 6, 2018 [15].
- Given the methodological limitations discussed above, we conclude that the evidence generated from the RWD analysis is not adequate to provide context or comparison for the overall survival observed in the STORM patients. This conclusion is based on the lack of comparability between the STORM and FHAD treatment groups. Furthermore, FDA’s analysis finds that post-hoc strategies to create greater comparability across cohorts were inadequate and resulted in very limited sample size and unstable estimates [15].
- Due to major methodological issues (including immortal time bias, selection bias, misclassification, confounding, and missing data), the FDA does not consider these results adequate to support regulatory decision making [15].
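To show why a very limited sample size leads to unstable estimates, the sketch below computes a Wilson score interval for a proportion at three cohort sizes. It is an illustration only: the counts are invented, the 30% rate is arbitrary, and a proportion is used as a stand-in because it is simpler than a survival estimate. The function name is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Invented cohorts, each with a 30% observed rate: (events, cohort size).
cohorts = [(3, 10), (15, 50), (60, 200)]

widths = []
for events, size in cohorts:
    low, high = wilson_interval(events, size)
    widths.append(high - low)
    print(f"n={size:>3}: 95% CI {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

The interval narrows as the cohort grows. Restricting a real world cohort after the fact to improve comparability moves in the opposite direction, towards the small cohort and the wide interval.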
Impact on Xpovio (selinexor) Approval = The Oncologic Drugs Advisory Committee (ODAC) members voted in favor of delaying approval until results of the randomized phase 3 BOSTON trial are available [16].
Case Study #3 – Abecma (idecabtagene vicleucel/ide-cel) for the Treatment of Multiple Myeloma
FDA (BLA 125736)(Abecma) = Priority Review. Orphan Designation. Breakthrough Therapy Designation. Regular Approval (26 March 2021) [17] [18].
Idecabtagene vicleucel was approved by the FDA in 2021 for relapsed or refractory multiple myeloma in the fourth-line setting. The clinical evidence included a phase 2b, single-arm, multi-centre clinical trial (MM-001) and an external control arm (ECA) generated from real world data across multiple sources including clinical sites, registries, and a research database (real world evidence study NDS-MM-003). The goal of the external control arm was to provide an estimate of overall response rate (ORR) in patients receiving at least 3 previous myeloma regimens [3] [19].
Study MM-001 – Enrolled 140 subjects and 127 were infused with conformal ide-cel [19].
Study NDS-MM-003 (Retrospective Observation Study using Real-World Data) – A global, non-interventional, retrospective study (NDS-MM-003) to compare the outcome of the MM-001 study with a real-world cohort of relapsed and refractory myeloma patients treated with standard therapies. Patient-level data from clinical sites, registries and a research database were collated into a single data model using a data cut-off of 30 October 2019. Subjects in the eligible RRMM cohort received approximately 90 different treatment regimens, predominantly as a combination of 3 or more drugs [20].
FDA Review Comments to the External Control Arm (Study NDS-MM-003) [20]:
- The Agency communicated concerns about the real-world evidence (RWE) study (NDS-MM-003), which was being conducted to provide an indirect comparison of effectiveness of bb2121 (ide-cel).
- Issues with the RWE study include selection of a population which may not be comparable to subjects enrolled in Study MM-001 due to missing baseline patient characteristics, missing or absent data on efficacy assessments which may bias the outcomes, and heterogeneity of real world data from different databases that will be collated for analysis.
  - Selection criteria issues
  - Missing data
  - Real world data not fit for purpose
- Given the methodological limitations discussed above, we conclude that the evidence generated from the RW analysis is not adequate to provide context or comparison for the outcome of MM-001 study.
  - Significant amount of missing data
  - Differences in follow up and response assessment of subjects from these different sources may impact the interpretability of the study results.
  - Significant heterogeneity in the RWE population limits its utility as a control arm.
  - The follow up schedule for response assessment in RWE and clinical trial myeloma patients may be different. This can result in potential bias in the estimate of duration of response.
  - Different response assessments for clinical trial vs RWE patients
  - The efficacy results from the RWE study population are uninterpretable as compared to efficacy evaluable population determined by the Agency (N=100) and based on FDA adjudicated efficacy results.
- While it reiterates the challenges of an appropriate choice of a treatment in the control arm and supports the approach of considerations for a single arm study design in support of a primary study intended for marketing purposes, an alternative approach may be to consider a randomized controlled trial with investigator’s choice of treatment from pre-specified therapeutic options as the control arm.
FDA Review Conclusions for Abecma [20]:
- Given the methodological limitations discussed above, we conclude that the evidence generated from the external control arm (Study NDS-MM-003) is not adequate to provide context or comparison for the outcome of MM-001 study.
Conclusions and Recommendations
The FDA identified the following common limitations in the external control arms of the 3 case studies above:
- Post-hoc analysis (lack of prior FDA review of the protocol and SAP)
- Selection Criteria Issues (selection bias, misclassification, and confounding)
- Index Date Issues (immortal time bias)
- Comparability Issues (lack of comparability, missing data, confounding bias)
- Limited Sample Size (Lack of statistical power due to insufficient real world cohort size)
- Real world data not fit for purpose
Recommendations for optimising the acceptability of external controls for regulatory decision making include:
- Apply the guidance provided in the new draft FDA guidance on the design and conduct of externally controlled trials [1]
- Design the external control to emulate the preferred randomised controlled trial – use a “target trial approach”
- Use appropriate real world data sources to optimise comparability, relevance and generalizability of the control group population
- Seek early FDA review of the protocol and statistical analysis plan (a priori rather than post hoc)
- Assess and limit bias using the appropriate methodological/statistical designs
- Use the STaRT-RWE templates to demonstrate the what, how and why of the data sources, curation and analysis [21] [22]
- List the limitations of the real world data/real world evidence in the protocol
References
1. FDA Draft Guidance – Considerations for the Design and Conduct of Externally Controlled Trials for Drug and Biological Products (February 2023)
2. ICH E10 – Choice of Control Group in Clinical Trials (July 2000)
Link: https://database.ich.org/sites/default/files/E10_Guideline.pdf
3. Ashley Jaksa, Anthony Louder, Christina Maksymiuk, Gerard T. Vondeling, Laura Martin, Nicolle Gatto, Eric Richards, Antoine Yver, Mats Rosenlund. A Comparison of Seven Oncology External Control Arm Case Studies: Critiques From Regulatory and Health Technology Assessment Agencies. Value in Health, 2022. ISSN 1098-3015, doi.org/10.1016/j.jval.2022.05.016. (25 June 2022)
4. FDA approves entrectinib for NTRK solid tumors and ROS-1 NSCLC (August 2019)
5. Drugs@FDA: FDA-Approved Drugs – Rozlytrek (NDA 212725)
Link: https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm?event=overview.process&ApplNo=212725
6. Section 3.1 – Study Overview (page 7) – CDER – Review of Study Report No WO40977: Comparative analysis of ROS1-positive locally advanced or metastatic non-small cell lung cancer between patients treated in entrectinib trials and crizotinib treated patients from real world data (11 July 2019)
Link: https://www.accessdata.fda.gov/drugsatfda_docs/nda/2019/212725Orig1s000,%20212726Orig1s000OtherR.pdf
7. CDER – Review of Study Report No WO40977: Comparative analysis of ROS1-positive locally advanced or metastatic non-small cell lung cancer between patients treated in entrectinib trials and crizotinib treated patients from real world data (11 July 2019)
Link: https://www.accessdata.fda.gov/drugsatfda_docs/nda/2019/212725Orig1s000,%20212726Orig1s000OtherR.pdf
8. Section 6 – Recommendations (page 28) – CDER – Review of Study Report No WO40977: Comparative analysis of ROS1-positive locally advanced or metastatic non-small cell lung cancer between patients treated in entrectinib trials and crizotinib treated patients from real world data (11 July 2019)
Link: https://www.accessdata.fda.gov/drugsatfda_docs/nda/2019/212725Orig1s000,%20212726Orig1s000OtherR.pdf
9. Section 8 – Appendix (page 30) – CDER – Review of Study Report No WO40977: Comparative analysis of ROS1-positive locally advanced or metastatic non-small cell lung cancer between patients treated in entrectinib trials and crizotinib treated patients from real world data (11 July 2019)
Link: https://www.accessdata.fda.gov/drugsatfda_docs/nda/2019/212725Orig1s000,%20212726Orig1s000OtherR.pdf
10. Section 4 – Discussion (page 27) – CDER – Review of Study Report No WO40977: Comparative analysis of ROS1-positive locally advanced or metastatic non-small cell lung cancer between patients treated in entrectinib trials and crizotinib treated patients from real world data (11 July 2019)
Link: https://www.accessdata.fda.gov/drugsatfda_docs/nda/2019/212725Orig1s000,%20212726Orig1s000OtherR.pdf
11. Learnings from three FDA decisions on ECA submissions in oncology, Aetion (November 2019)
Link: https://aetion.com/evidence-hub/learnings-from-three-fda-decisions-on-eca-submissions-in-oncology/
12. NDA 212306: Selinexor – Oncologic Drugs Advisory Committee Meeting – Introductory Comments – February 26, 2019
Link: https://www.fda.gov/media/121670/download
13. Drugs@FDA: FDA-Approved Drugs – Xpovio (NDA 212306)
Link: https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm?event=overview.process&ApplNo=212306
14. Section 7.2.6 (KS-50039 (Retrospective observational study using real-world data)) (page 84) – FDA – NDA/BLA Multi-disciplinary Review and Evaluation, NDA 212306, XPOVIO® (selinexor) (July 2019)
Link: https://www.accessdata.fda.gov/drugsatfda_docs/nda/2019/212306Orig1s000MultidisciplineR.pdf
15. Section 7.2.6 (KS-50039 (Retrospective observational study using real-world data)) (page 84) – FDA – NDA/BLA Multi-disciplinary Review and Evaluation, NDA 212306, XPOVIO® (selinexor) (July 2019)
Link: https://www.accessdata.fda.gov/drugsatfda_docs/nda/2019/212306Orig1s000MultidisciplineR.pdf
16. Section 1.2 (Conclusions on the Substantial Evidence of Effectiveness ) (page 16) – FDA – NDA/BLA Multi-disciplinary Review and Evaluation, NDA 212306, XPOVIO® (selinexor) (July 2019)
Link: https://www.accessdata.fda.gov/drugsatfda_docs/nda/2019/212306Orig1s000MultidisciplineR.pdf
17. FDA – ABECMA (idecabtagene vicleucel) (21 April 2021)
Link: https://www.fda.gov/vaccines-blood-biologics/abecma-idecabtagene-vicleucel
18. FDA – Summary Basis for Regulatory Action – Abecma (26 March 2021)
Link: https://www.fda.gov/media/147627/download
19. FDA – BLA 125736 (Abecma) – Statistical Review (27 July 2020)
Link: https://www.fda.gov/media/147781/download
20. Section 9.2 (Aspect(s) of the Clinical Evaluation Not Previously Covered – Study NDS-MM-003 (Retrospective Observation Study using Real-World Data))(page 113) – FDA BLA Clinical Review Memorandum – STN 125736/0 (27 July 2020)
Link: https://www.fda.gov/media/147740/download
21. Wang S V, Pinheiro S, Hua W, Arlett P, Uyama Y, Berlin J A et al. STaRT-RWE: structured template for planning and reporting on the implementation of real world evidence studies BMJ 2021; 372 :m4856 doi:10.1136/bmj.m4856
Link: https://www.bmj.com/content/372/bmj.m4856
22. Harvard Dataverse – Structured Template and Reporting Tool for Real World Evidence (STaRT-RWE)
Link: https://dataverse.harvard.edu/dataverse/STaRT-RWE