9 Ways Unified Clinical Data Cloud Platforms Solve Data Challenges In Rare Disease
By Venu Mallarapu
Despite the challenges of developing therapies for rare diseases, there is hope for people living with one (or more) of the roughly 7,000 identified conditions, each of which affects fewer than 200,000 people in the U.S. under the Orphan Drug Act definition: Industry is steadily increasing its research efforts. According to a study by the Tufts Center for the Study of Drug Development, 34% of the drugs in R&D as of 2020 target rare diseases, up from 9% in 2010. And 31 of the 53 novel drugs approved in 2020 by FDA’s CDER were for rare diseases or conditions.
One key to the increasing focus on rare diseases was the passage of the Orphan Drug Act (ODA) in 1983. As the chart below indicates, there were seven times more Orphan Drug designations in the most recent decade (2013–2022) than in the first decade after the Act's passage (1983–1992).
Total orphan drug designations (n = 6340) and initial orphan drug approvals (n = 882) by decade, 1983–2022
Used under the Creative Commons Attribution 4.0 International License. Source: A comprehensive study of the rare diseases and conditions targeted by Orphan Drug designations and approvals over the 40 years of the Orphan Drug Act - PMC (nih.gov).
Challenges Of Rare Disease Clinical Research
As noted above, the small patient population affected by any one rare disease makes it challenging to find and recruit patients for clinical trials. This often leads to smaller studies with less conclusive results.
Many rare diseases are also not well understood, which makes developing effective treatments and studying their safety and efficacy time-consuming and costly. The incentives offered through the ODA, such as Orphan Drug designation and a 25% tax credit on R&D costs, certainly help.
Another challenge is the variability of symptoms from patient to patient, even among those with the same rare disease, which makes it hard to measure the effectiveness of treatments.
Given there are so few patients with rare diseases, researchers need to protect their safety and well-being. Ethical considerations therefore become paramount, and decisions like putting patients on a placebo or control arm may not be possible.
Finally, there is the financial aspect. Some of the more prevalent diseases across large populations provide the incentive for pharmaceutical companies to spend money on clinical research. If a successful treatment is found, these companies are likely to benefit financially. This may not be the case with rare diseases, given smaller populations are affected by them and thus fewer people are available to purchase treatments.
Using RWD Can Alleviate Challenges In Rare Disease Research
However, there are ways to address some of the above challenges, including using real-world data (RWD), sometimes supported by machine learning (ML). RWD can help by:
- Filling data gaps: supplements small clinical trials and identifies potential participants.
- Improving disease understanding: tracks disease progression, identifies risk factors, and informs personalized medicine.
- Optimizing trials: helps refine endpoints, calculate sample sizes, and design better trials.
- Monitoring safety and efficacy: tracks long-term outcomes and reveals rare drug interactions or adverse events.
- Supporting regulatory decisions: can inform approval of new therapies and updates to treatment guidelines.
However, it's important to acknowledge the limitations of RWD in rare disease such as:
- Data quality issues exist, such as missing data due to incomplete medical records, inconsistent recording due to variations in data collection, measurement errors in diagnostic coding, etc.
- Potential biases include selection bias, in which patients included in RWD may not be representative of the entire rare disease population; confounding by factors such as socioeconomic status and comorbidities; and information bias, in which missing or inaccurate data lead to biased conclusions.
- Causal inference is challenging because of the absence of the controlled comparison used in randomized trials; temporal confounding, as time-dependent factors may influence disease progression and treatment decisions; and reverse causality, in which the disease itself influences treatment choices and complicates the analysis.
These need to be carefully addressed. Nevertheless, with responsible data collection, analysis, and interpretation, RWD can be a powerful tool to overcome data challenges and accelerate progress in rare disease research and clinical care.
Where Data Challenges In Rare Disease Persist
Data challenges in rare diseases are significant and complex and pose major hurdles to R&D of treatments. Listed below are some of the major ones:
- Dearth Of Data
  - Small patient populations lead to insufficient data for meaningful analysis. Small data sets lack statistical power, making it hard to draw definitive conclusions.
  - Patient data is often scattered across different medical organizations, patient registries, and studies. This lack of centralization and standardization makes it difficult to integrate and analyze patient data to make informed decisions.
- Poor Quality
  - Variations in medical formats, diagnostic standards, and collection methods lead to inconsistent and heterogeneous data, which makes it difficult to compare data across sources such as local labs, sites, and data capture systems. (This is also a problem in non-rare disease clinical research.)
  - Missing data across medical records from various organizations, for the same patient, complicates aggregation and analysis.
- Access And Sharing
  - Privacy concerns are a perennial challenge, even more so with rare diseases. Sharing data across institutions and countries is arduous, which limits collaboration and research progress.
  - A lack of interoperability, due to different data formats and the absence of standardized data models, creates barriers to aggregating and analyzing data from different sources.
- Technology
  - A lack of robust data storage, management, and analysis tools limits the effective use of the already limited data in rare diseases.
  - Small and heterogeneous data sets make it challenging to apply advanced data analysis techniques, including machine learning and other novel approaches.
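To make the point about statistical power concrete, here is a minimal Python sketch. It assumes a simple two-arm comparison of means analyzed with a two-sided z-test (normal approximation), which is an illustrative simplification and not a method attributed to any study or platform discussed here. The function name `two_arm_power` and the effect size of 0.5 are assumptions chosen for the example.

```python
from statistics import NormalDist


def two_arm_power(effect_size: float, n_per_arm: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    Uses the normal approximation and ignores the negligible
    probability of rejecting in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * (n_per_arm / 2) ** 0.5
    return z.cdf(noncentrality - z_crit)


# Hypothetical scenario: a moderate effect (d = 0.5) at several trial sizes.
for n in (10, 20, 64, 100):
    print(n, round(two_arm_power(0.5, n), 2))
```

Under these assumptions, a moderate standardized effect of 0.5 needs roughly 64 patients per arm to reach about 80% power, while 10 patients per arm yields only about 20%.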
Apart from these, there is always an ethical angle, especially in rare diseases, to ensure the protection of a vulnerable patient population, including obtaining consent and eventual sharing of benefits of research with them.
Unified Clinical Data Cloud Platforms Can Help Overcome The Rare Disease Data Challenge
A unified clinical data cloud platform is a data and analytics platform that contains a data hub to aggregate clinical data from multiple sources and in multiple formats, supports data transformation and standardization, supports computing and analysis, and provides applications that use the data for management and decision-making. Such platforms range from commercially acquired products to applications built in-house, and their use is increasing as the industry recognizes the benefits of the technological investment. Using such a platform for rare diseases holds immense potential to overcome many of the above-mentioned data challenges through:
1. Data Aggregation
It offers the ability to pool data from across institutions, registries, and research studies to create a larger, more comprehensive data set, increasing statistical power and enabling more robust analysis. The ability to ingest and use RWD can complement and enhance the clinical research data. E.g.: Aggregation of data sets available with the sponsor and data sets obtained through patient registries and public databases.
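A rough Python sketch of the pooling idea follows. The record layouts, field names (`patient_id`, `visit`, `fvc_pct`), and source names are hypothetical, and the sketch assumes patient identifiers have already been linked across sources, which in practice is a substantial task of its own.

```python
from collections import defaultdict

# Hypothetical records; field names and values are made up for illustration.
sponsor_trial = [
    {"patient_id": "P001", "visit": "2023-01-10", "fvc_pct": 71.0},
    {"patient_id": "P002", "visit": "2023-01-12", "fvc_pct": 64.5},
]
patient_registry = [
    {"patient_id": "P002", "visit": "2022-06-01", "fvc_pct": 69.0},
    {"patient_id": "P003", "visit": "2022-07-15", "fvc_pct": 80.2},
]


def pool(sources: dict[str, list[dict]]) -> list[dict]:
    """Combine records from all sources, tagging each with its origin."""
    pooled = []
    for source_name, records in sources.items():
        for record in records:
            pooled.append({**record, "source": source_name})
    return pooled


def by_patient(records: list[dict]) -> dict[str, list[dict]]:
    """Group pooled records by patient, ordered by ISO visit date."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["patient_id"]].append(record)
    for visits in grouped.values():
        visits.sort(key=lambda r: r["visit"])
    return dict(grouped)


pooled = pool({"sponsor_trial": sponsor_trial, "patient_registry": patient_registry})
timeline = by_patient(pooled)
```

Keeping the `source` tag on every row means later analyses can still separate trial data from registry data, or adjust for the difference between them.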
2. Standardization And Harmonization
It offers a metadata and standards management capability that enables implementing standardized data formats and collection protocols to ensure consistency and comparability across different sources, facilitating effective integration and analysis. E.g.: Ability to hold study standards defined by the sponsors/CROs as well as industry standards like SDTM, CDASH, OMOP, etc.
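As an illustration of what harmonization involves, the sketch below maps two differently shaped source records onto one set of target variables. The source names, field maps, and the `WEIGHT` target are hypothetical. The `USUBJID` and `SEX` names are borrowed from SDTM for familiarity, but this is not a conformant SDTM mapping.

```python
# Hypothetical per-source mappings from local field names to target variables.
FIELD_MAP = {
    "site_a": {"subj": "USUBJID", "gender": "SEX", "wt_lb": "WEIGHT"},
    "registry_b": {"patient": "USUBJID", "sex": "SEX", "weight_kg": "WEIGHT"},
}
SEX_CODES = {"male": "M", "m": "M", "female": "F", "f": "F"}
LB_TO_KG = 0.45359237


def standardize(record: dict, source: str) -> dict:
    """Rename fields, normalize sex codes, and convert weight to kilograms."""
    out = {}
    for src_field, value in record.items():
        target = FIELD_MAP[source].get(src_field)
        if target is None:
            continue  # unmapped fields are dropped in this sketch
        if target == "SEX":
            # Anything unrecognized becomes "U" (unknown); an assumption.
            value = SEX_CODES.get(str(value).strip().lower(), "U")
        elif target == "WEIGHT" and src_field.endswith("_lb"):
            value = round(value * LB_TO_KG, 1)
        out[target] = value
    return out


row_a = standardize({"subj": "A-1", "gender": "Female", "wt_lb": 150}, "site_a")
row_b = standardize({"patient": "B-7", "sex": "m", "weight_kg": 68.0}, "registry_b")
```

After standardization both rows share the same keys and units, so they can be compared or pooled directly.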
3. Data Validation And Curation
It serves as a data management workbench for personnel to apply dedicated efforts to validate and clean data, which can help improve data accuracy and completeness, thus minimizing bias and enhancing reliability. E.g.: Review of CRF data collected from the patients by the sites to ensure data quality.
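A simplified sketch of automated edit checks, the kind of rule that might raise a query for a data manager to review, could look like the following. The rule set, field names, and plausibility ranges are assumptions made for illustration, not clinical guidance or any platform's actual checks.

```python
from dataclasses import dataclass


@dataclass
class Query:
    record_index: int
    field: str
    message: str


# Hypothetical edit checks: field -> (required, minimum, maximum).
RULES = {
    "age_years": (True, 0, 120),
    "systolic_bp": (False, 50, 260),
}


def run_edit_checks(records: list[dict]) -> list[Query]:
    """Return one query per missing required value or out-of-range value."""
    queries = []
    for i, record in enumerate(records):
        for field, (required, low, high) in RULES.items():
            value = record.get(field)
            if value is None:
                if required:
                    queries.append(Query(i, field, "required value is missing"))
                continue
            if not low <= value <= high:
                queries.append(Query(i, field, f"value {value} outside {low}-{high}"))
    return queries


crf_rows = [
    {"age_years": 34, "systolic_bp": 118},
    {"age_years": None, "systolic_bp": 400},
    {"age_years": 7},
]
open_queries = run_edit_checks(crf_rows)
```

The checks only flag records. Deciding whether a flagged value is an error or a genuine extreme remains a human review step.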
4. Missing Data Imputation
It offers statistical compute capability within the platform, so that statistical techniques can be applied to fill missing-data gaps, which further strengthens the data set and allows for more comprehensive analysis. E.g.: A statistical compute environment in the platform can support data engineering that uses advanced statistical analysis, and even machine learning models, to impute the data.
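The simplest form of imputation can be sketched in a few lines of Python. This example uses single median imputation, chosen only because it is easy to follow. Real analyses often call for more principled approaches such as multiple imputation or model-based methods. The function name and the `hgb` field are hypothetical.

```python
from statistics import median


def impute_median(records: list[dict], field: str) -> list[dict]:
    """Fill missing values of `field` with the median of observed values.

    Adds a `<field>_imputed` flag so imputed values stay distinguishable
    from observed ones in later analysis.
    """
    observed = [r[field] for r in records if r.get(field) is not None]
    if not observed:
        raise ValueError(f"no observed values for {field!r}")
    fill = median(observed)
    result = []
    for r in records:
        missing = r.get(field) is None
        result.append(
            {**r, field: fill if missing else r[field], f"{field}_imputed": missing}
        )
    return result


# Hypothetical lab values with one missing hemoglobin result.
labs = [
    {"id": 1, "hgb": 12.0},
    {"id": 2, "hgb": None},
    {"id": 3, "hgb": 14.0},
    {"id": 4, "hgb": 13.0},
]
completed = impute_median(labs, "hgb")
```

Flagging imputed values matters in small data sets, where a handful of filled-in values can noticeably shift a result.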
5. Secure Data Platforms
It allows users to securely store and share data through controlled-access platforms, which addresses privacy concerns while enabling collaboration and knowledge exchange. E.g.: Contributing any patient registry data or other deidentified data for research by third parties in a secure manner so it is accessed only by the intended parties and for the intended purpose.
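One building block of secure sharing is replacing patient identifiers with stable pseudonyms before data leave the contributing organization. The sketch below uses a keyed hash (HMAC-SHA256) from Python's standard library. The list of direct identifiers, the field names, and the 16-character token length are assumptions, and pseudonymization alone does not amount to full deidentification.

```python
import hashlib
import hmac

# Assumed list of direct identifiers; a real list would be far longer.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize(record: dict, secret_key: bytes, id_field: str = "patient_id") -> dict:
    """Drop direct identifiers and replace the patient ID with a keyed hash."""
    token = hmac.new(
        secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    shared = {
        k: v
        for k, v in record.items()
        if k not in DIRECT_IDENTIFIERS and k != id_field
    }
    shared["pseudo_id"] = token[:16]
    return shared


# Hypothetical registry record and key; the key would be kept by the data owner.
registry_row = {
    "patient_id": "P001",
    "name": "Jane Doe",
    "email": "jane@example.org",
    "diagnosis": "condition X",
}
shareable = pseudonymize(registry_row, secret_key=b"demo-key-not-for-production")
```

Because the hash is keyed, the same patient maps to the same pseudonym across data sets from one owner, while outside parties without the key cannot recompute it from a known ID.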
6. Common Data Models And Interoperability Tools
It uses common data models and interoperability tools to break down technical barriers and allow seamless data exchange and analysis across different platforms and systems. E.g.: Use of data models based on industry standards like SDTM and supporting data interchange standards like HL7 FHIR.
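As a small illustration of moving between the two standards named above, the sketch below converts a minimal HL7 FHIR Patient resource into a row shaped like the SDTM Demographics (DM) domain. Only a handful of DM variables are populated. Treating the FHIR logical `id` as the subject identifier, and mapping every gender other than male or female to "U", are simplifying assumptions. The study ID is made up.

```python
FHIR_GENDER_TO_SDTM_SEX = {"male": "M", "female": "F"}


def fhir_patient_to_dm(patient: dict, study_id: str) -> dict:
    """Map a minimal FHIR Patient resource to a partial SDTM DM-style row."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("expected a FHIR Patient resource")
    subject_id = patient["id"]
    return {
        "STUDYID": study_id,
        "DOMAIN": "DM",
        "USUBJID": f"{study_id}-{subject_id}",
        "SUBJID": subject_id,
        # Assumption: anything other than male/female is recorded as unknown.
        "SEX": FHIR_GENDER_TO_SDTM_SEX.get(patient.get("gender"), "U"),
        "BRTHDTC": patient.get("birthDate", ""),
    }


# Hypothetical FHIR resource as it might arrive from an EHR interface.
fhir_patient = {
    "resourceType": "Patient",
    "id": "1001",
    "gender": "female",
    "birthDate": "1987-04-12",
}
dm_row = fhir_patient_to_dm(fhir_patient, study_id="RD-001")
```

A production mapping would also handle controlled terminology, partial dates, and the many DM variables omitted here.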
7. Transparent Data Governance
It has robust data governance structures and transparent policies to ensure adherence to ethical principles and protect patient privacy while promoting responsible data sharing. E.g.: Ensuring that the provenance of data collected is well maintained, tracked, and available for audit and inspection purposes, if needed.
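One way to keep provenance auditable is an append-only log in which each entry carries a hash of the previous one, so later tampering becomes detectable. The sketch below is a toy version of that idea. The entry fields, actor names, and dataset names are hypothetical, and real platforms may implement audit trails quite differently.

```python
import hashlib
import json


def _digest(entry: dict) -> str:
    """Deterministic SHA-256 over the entry's JSON form."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log: list, actor: str, action: str, dataset: str, timestamp: str) -> None:
    """Append a provenance event linked to the previous entry's hash."""
    entry = {
        "actor": actor,
        "action": action,
        "dataset": dataset,
        "timestamp": timestamp,
        "prev_hash": log[-1]["hash"] if log else "",
    }
    entry["hash"] = _digest(entry)  # hash covers everything except itself
    log.append(entry)


def verify(log: list) -> bool:
    """Recompute every hash and check that the chain is unbroken."""
    prev_hash = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical events for one data set.
audit_log: list = []
append_event(audit_log, "site_12", "ingested", "registry_extract_v1", "2024-01-05T09:00:00Z")
append_event(audit_log, "data_manager", "cleaned", "registry_extract_v1", "2024-01-06T14:30:00Z")
chain_ok = verify(audit_log)
```

If any earlier entry is edited after the fact, `verify` returns `False`, which is the property an auditor or inspector would rely on.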
8. Advanced Data Analytics
Unified clinical data cloud platforms with modern architectures like data lakehouses (a newer architecture that combines the best elements of a data warehouse and a data lake) enable researchers to apply advanced analytics techniques, such as ML, to identify patterns and insights hidden within vast data sets (enhanced by the availability of RWD). E.g.: Use of statistical models to identify any outliers in the data collected, or use of machine learning models to drive standardization of data or anomaly detection.
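For flagging outliers in small data sets, a robust statistic is often a safer starting point than the mean and standard deviation, because a single extreme value can distort both. The sketch below uses the modified z-score based on the median and the median absolute deviation, with the commonly cited 3.5 cutoff. The choice of method, the function name, and the sample values are assumptions for illustration.

```python
from statistics import median


def flag_outliers(values: list[float], threshold: float = 3.5) -> list[bool]:
    """Flag values whose modified z-score exceeds the threshold.

    Modified z-score = 0.6745 * (x - median) / MAD, where MAD is the
    median absolute deviation from the median.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median; this sketch flags nothing.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Hypothetical lab measurements with one suspicious reading.
readings = [4.1, 4.3, 3.9, 4.0, 4.2, 9.8]
flags = flag_outliers(readings)
```

A flagged value is a prompt for review and not proof of an error, since genuine extreme values are plausible in heterogeneous rare disease populations.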
9. Collaborative Research Environment
By streamlining data access and sharing, cloud platforms foster collaboration among researchers, institutions, and industries, accelerating progress in rare disease research. E.g.: Unified clinical cloud platforms can act as a central repository that can provide access to internal users in data management, biostats, and stats programming and also make data available to external academic researchers and others, as needed.
A unified clinical data cloud platform for rare diseases can be a powerful tool to tackle data challenges at their core and pave the way for faster advancements in diagnosis, treatment, and ultimately, improved quality of life for individuals living with rare diseases.
About The Author:
Venu Mallarapu is a business and technology leader with 25+ years of experience in the industry. He has helped organizations with business and IT advice, strategic consulting, and relationship and delivery management. As eClinical Solutions’ VP of global strategy and operations, Venu drives overall strategy and operational excellence to increase adoption, market expansion, and use of the elluminate® platform and services.
Venu is an SME in all R&D functions. He has delivered strategy and transformation advice to top 50 global life sciences companies. He is a regular speaker at events on transformation, innovation, and next-generation technologies. He is a recognized industry thought leader with published articles, blog posts, webinars, and seminars on topics that drive the industry forward.