Secondary use of health data is the processing of health data for purposes other than the initial purposes for which the data were collected. This approach is becoming increasingly popular in real-world research (research that uses real-world data to generate real-world evidence) because of the large amounts of data available from various sources (e.g., electronic health records, administrative databases, and social media) and the availability of AI-powered analytical tools [1].
In many cases, secondary data analysis can provide valuable insights and answer research questions that would otherwise be difficult or impossible to answer with primary data collection. For example, researchers can use existing data to study disease trends, evaluate the effectiveness of health interventions, and identify risk factors for various health outcomes.
In clinical research, we often refer to secondary data as ‘real-world data’ (RWD) to distinguish it from data generated through clinical trials. As per FDA guidance, real-world data are data relating to patient health status and/or the delivery of health care that are routinely collected from a variety of sources. Examples of RWD include data derived from electronic health records, medical claims data, data from product or disease registries, and data gathered from other sources (such as digital health technologies) that can inform on health status [2].
Traditionally, existing healthcare data are collected from medical records and processed to provide insights into the safety and effectiveness of drugs and other treatments. In Europe, we call these types of studies retrospective non-interventional studies. These are protocol-defined studies that require local regulatory approvals and can only use data that were collected before the start of the study. This limits the usefulness of the research, especially given that healthcare data will continue to be generated after that date. For this reason, the emphasis is moving from ‘retrospective’ to ‘secondary use of existing data’, which can be both retrospective and prospective. The research is still non-interventional (or observational) because there are no healthcare interventions that impact the clinical management of the patient.
Under the General Data Protection Regulation (GDPR – Regulation (EU) 2016/679), personal data (especially health and genetic data – Article 9) must be collected and processed lawfully, fairly, and transparently, and individuals have the right to be informed about how their data are being used. This means that researchers should obtain explicit and informed consent from individuals to use their personal data for research purposes, and the data should be pseudonymised or anonymised to protect individuals’ privacy [3].
The requirement for explicit informed consent from each individual can become problematic when the intention is to process (analyse) very large healthcare datasets, such as electronic health records, in the context of scientific research. This is where GDPR becomes a facilitator, rather than the hindrance it was expected to be when it was first implemented.
One of the key ways that the GDPR supports the secondary use of health data for research is through the concept of “legitimate interests”. Article 6(1)(f) of the GDPR allows for the processing of personal data if it is necessary for the legitimate interests of the data controller or a third party, provided that those interests are not overridden by the interests or fundamental rights and freedoms of the data subject. Scientific research can be considered a legitimate interest, provided that appropriate safeguards are in place to protect individuals’ rights and freedoms. In addition, the GDPR includes provisions that specifically address the use of health data for scientific research. For example, Article 9(2)(j) allows for the processing of special categories of personal data, such as health data, for scientific research purposes, provided that appropriate safeguards are in place. Article 89(1), in turn, provides for further processing of existing data for scientific research subject to appropriate safeguards, such as pseudonymisation or, where the research purposes can be fulfilled that way, processing that no longer permits the identification of data subjects [3].
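To make the pseudonymisation safeguard more concrete, the following is a minimal sketch of one common approach: replacing a direct identifier with a keyed hash (HMAC-SHA256) and dropping other direct identifiers before analysis. It is an illustration only, not a method prescribed by the GDPR or by any of the cited reports. The field names (`patient_id`, `name`, `date_of_birth`), the `DIRECT_IDENTIFIERS` set, and both function names are hypothetical and chosen for the example.

```python
import hashlib
import hmac

# Hypothetical set of fields treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"patient_id", "name", "date_of_birth"}


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for patient_id using a keyed hash.

    The same patient_id and key always give the same pseudonym, so records
    for one patient can still be linked. Without the key, the pseudonym
    cannot be recomputed from a guessed identifier.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymise_records(records: list[dict], secret_key: bytes) -> list[dict]:
    """Drop direct identifiers from each record and add a pseudonym instead."""
    result = []
    for record in records:
        # Keep only the fields that are not direct identifiers.
        cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        cleaned["pseudonym"] = pseudonymise(record["patient_id"], secret_key)
        result.append(cleaned)
    return result


if __name__ == "__main__":
    # Assumption: in practice the key is generated once, stored separately
    # from the research dataset, and accessible only to the data controller.
    key = b"example-key-held-separately-by-the-controller"
    source = [
        {"patient_id": "P-001", "name": "Jane Doe", "diagnosis": "asthma"},
        {"patient_id": "P-001", "name": "Jane Doe", "diagnosis": "eczema"},
    ]
    for row in pseudonymise_records(source, key):
        print(row)
```

Because whoever holds the key can re-link pseudonyms to individuals, pseudonymised data are still personal data under the GDPR. The sketch shows a safeguard, not anonymisation.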
GDPR indicates that personal data should be gathered for an identifiable purpose or purposes and not further processed for incompatible purposes. Therefore, processing for purposes that are compatible with the purpose of the original gathering and processing of the data is permitted. The GDPR goes further and indicates that further processing for research purposes is compatible with the original purpose. This is very positive for RWD processing. However, it is not without difficulties (Section 4.4.4 of the draft CIOMS report Real-World Data and Real-World Evidence in Regulatory Decision Making) [4].
Currently, in the context of scientific research, GDPR (especially Article 89(1)) is interpreted and implemented differently at the national level. As per recent European Commission reports, more harmonisation of the implementation of GDPR is required at the national Member State level [5] [6].
See visual example below.
This is particularly relevant to the proposed European Health Data Space and the creation of a federated network of health data hubs that will facilitate access to secondary health data, especially for research purposes (HealthData@EU) [7].
As per the recent draft CIOMS report, there is a strong argument that the processing of RWD only works where data subjects have trust and confidence in the institutions and individuals who process data that relate to them, and therefore a strong personal data protection regime is essential to the acceptance and operation of RWD processing. As noted above, further work is needed on issues regarding compatible processing of RWD (secondary use of existing data) in the absence of consent or where data were gathered to form a patient record (Chapter 5 of the draft CIOMS report Real-World Data and Real-World Evidence in Regulatory Decision Making) [4].
References
1. World Health Organisation (WHO) – Meeting on Secondary Use of Health Data (13 December 2022)
2. FDA – Real-World Evidence
Link: https://www.fda.gov/science-research/science-and-research-special-topics/real-world-evidence
3. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation)
Link: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A02016R0679-20160504&qid=1687865819117
4. Real-World Data and Real-World Evidence in Regulatory Decision Making. CIOMS Working Group report. Geneva, Switzerland: Council for International Organizations of Medical Sciences (CIOMS), 2023
Link: https://cioms.ch/wp-content/uploads/2020/03/CIOMS-WG-XIII_6June2023_Draft-report-for-comment-1.pdf
5. European Commission, Consumers, Health, Agriculture and Food Executive Agency, Hansen, J., Wilson, P., Verhoeven, E., et al., Assessment of the EU Member States’ rules on health data in the light of GDPR, Publications Office, 2021
Link: https://data.europa.eu/doi/10.2818/546193
6. Study on the appropriate safeguards under Article 89(1) GDPR for the processing of personal data for scientific research – Final Report – EDPS/2019/02-08 (August 2021)
Link: https://edpb.europa.eu/system/files/2022-01/legalstudy_on_the_appropriate_safeguards_89.1.pdf
7. European Commission – Proposal for a regulation – The European Health Data Space (May 2022)
Link: https://health.ec.europa.eu/publications/proposal-regulation-european-health-data-space_en