A Tool To Tackle The Risk Of Uninformative Trials
By Thomas Wood, Fast Data Science

A major challenge in clinical research is the high rate of trials that fail to deliver meaningful results, which we can call “uninformative” clinical trials. Uninformative trials hinder progress in clinical practice, policy decisions, and further research. They are a waste of money and a violation of medical ethics, since they expose subjects to risk without benefiting the body of medical knowledge. Unfortunately, many clinical trials are designed in a way that does not give them the best chance of delivering informative results.
What Are Uninformative Trials?
An uninformative trial is not the same as a failed trial. The drug under investigation may be an effective treatment, but an uninformative trial would fail to return this information. Likewise, a drug may be ineffective, but a trial could still be informative if it correctly delivers the negative result about the failure of that drug to treat a particular condition.
“Uninformativeness” isn’t yet officially defined, but it is possible to create an objective definition based on what constitutes an informative trial. According to Deborah Zarin, Steven Goodman, and Jonathan Kimmelman in 2019,1 an informative trial must fulfill the following five conditions (represented as a simple checklist in the sketch after this list):
- The study hypothesis addresses an important and unresolved question.
- The study is designed to provide meaningful evidence related to the question.
- The study is feasible (e.g., from the point of view of enrollment).
- The study is conducted in a scientifically valid manner.
- The methods and results are reported accurately, completely, and promptly.
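As a rough illustration, these five conditions could be written down as a simple checklist data structure that a reviewer, or a review tool, works through. The sketch below is illustrative only and is not part of the Clinical Trial Risk Tool.

```python
# Illustrative sketch only: the five conditions from Zarin, Goodman, and
# Kimmelman (2019) written as a reviewer checklist. Not part of the
# Clinical Trial Risk Tool.
from dataclasses import dataclass

@dataclass
class InformativenessChecklist:
    important_unresolved_question: bool = False   # hypothesis addresses an important, unresolved question
    meaningful_evidence_design: bool = False      # design can provide meaningful evidence on the question
    feasible: bool = False                        # e.g., realistic enrollment
    scientifically_valid_conduct: bool = False    # conducted in a scientifically valid manner
    accurate_prompt_reporting: bool = False       # accurate, complete, prompt reporting

    def unmet_conditions(self) -> list[str]:
        """Names of any conditions not yet satisfied."""
        return [name for name, met in self.__dict__.items() if not met]

checklist = InformativenessChecklist(important_unresolved_question=True, feasible=True)
print(checklist.unmet_conditions())
```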
Authors of meta-analyses sometimes exclude past trials if they are judged to have an excessive risk of bias. A trial that cannot be included in systematic reviews can therefore be considered uninformative — that is, it does not contribute to the body of knowledge in a field.
The Risk Of Uninformativeness
A question related to uninformativeness is the risk of a trial ending uninformatively. The risk of uninformativeness is distinct from the other risks associated with the trial, such as the risk to participant safety or financial risk.
At the trial planning stage, identifying the risk of uninformativeness (whether a trial is likely to be uninformative) is no easy task and could involve sifting through large numbers of documents, including clinical trial protocols.
Clinical trial investigators, sponsors, investors, and CROs often spend large amounts of time reading through clinical trial protocols and other documents, which can run to several hundred pages in length and contain all essential parameters of the trial such as the sample size, enrollment criteria, duration, interventions, endpoints, and other useful information. A person trying to anticipate the risk of uninformativeness of a trial, or the cost of running that trial, will need to spend time with the trial protocol.
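For illustration, the kind of structured summary a reviewer (or an NLP pipeline) tries to build from a protocol might look something like the sketch below. The field names are my own illustrative choices, not the Clinical Trial Risk Tool's internal schema.

```python
# Hypothetical record of the key parameters a reviewer extracts from a protocol.
# Field names are illustrative only, not the Clinical Trial Risk Tool's schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ProtocolSummary:
    trial_id: str
    condition: Optional[str] = None           # e.g., "tuberculosis"
    sample_size: Optional[int] = None         # planned number of participants
    duration_months: Optional[float] = None   # planned trial duration
    interventions: list[str] = field(default_factory=list)
    endpoints: list[str] = field(default_factory=list)
    inclusion_criteria: list[str] = field(default_factory=list)
    exclusion_criteria: list[str] = field(default_factory=list)

summary = ProtocolSummary(
    trial_id="EXAMPLE-001",
    condition="tuberculosis",
    sample_size=240,
    duration_months=18,
    endpoints=["culture conversion at week 8"],
)
print(summary)
```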
Reviewing The Risk Of Uninformativeness Using AI And NLP
Several pharmaceutical companies have been investigating technological solutions to facilitate the task of reviewing a protocol. In 2019, the German pharma company Boehringer Ingelheim contracted me to build a solution for protocol review that used natural language processing (NLP) to produce a trial complexity estimate from the trial protocol in PDF form. I wrote a brief blog post about the project on Fast Data Science’s website, and in 2021 the Bill and Melinda Gates Foundation contacted me with the idea for a trial protocol review tool that would produce risk estimates, and with questions about how AI and NLP could help with this.
Specifically, the Bill and Melinda Gates Foundation wanted to build a tool that would identify risk factors in HIV and tuberculosis trials running in low- and middle-income countries (LMICs). Initially, my team and I built a web-based, open-source tool in Python that lets a nontechnical user drag and drop a trial protocol in PDF format and rates the risk level as a traffic light (red, amber/yellow, or green). It became the Clinical Trial Risk Tool.
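The traffic-light idea can be pictured as a score built up from features detected in the protocol text and then mapped to thresholds. The sketch below is a deliberately simplified illustration with invented features, weights, and thresholds; the tool's actual scoring model is described in our Gates Open Research article.

```python
# Simplified illustration of mapping protocol features to a traffic-light rating.
# Features, weights, and thresholds are invented and are not the Clinical Trial
# Risk Tool's actual model.
def traffic_light(features: dict[str, bool]) -> str:
    weights = {
        "has_sample_size_calculation": 30,
        "has_statistical_analysis_plan": 30,
        "has_defined_primary_endpoint": 25,
        "mentions_data_management_plan": 15,
    }
    score = sum(weight for name, weight in weights.items() if features.get(name))
    if score >= 70:
        return "green"
    if score >= 40:
        return "amber"
    return "red"

print(traffic_light({
    "has_sample_size_calculation": True,
    "has_defined_primary_endpoint": True,
    "has_statistical_analysis_plan": False,
    "mentions_data_management_plan": False,
}))  # -> "amber"
```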
We put the source code on GitHub for public use, so that anyone could modify or extend the tool and send contributions back to the project, and we published an article on the development and workings of the tool in Gates Open Research,2 so users could understand how it works. The plan was that investigators would be able to self-assess their protocols with the Gates tool before applying for grants, to ensure that their protocols were sufficiently robust.
The first version of the Clinical Trial Risk Tool proved successful, but it became clear that a redesign would take the project further, covering more disease areas, such as vaccine trials, and experimenting with cost prediction. We spent most of 2024 redesigning the tool to cover more disease areas, predict cost, and retrieve more structured information from the text, with a new and improved user interface that allows users to save and retrieve documents and past analyses.
Version 2 of the Clinical Trial Risk Tool covers more disease areas, including enteric and diarrheal diseases, HIV, malaria, neglected tropical diseases, pneumonia, tuberculosis, and vaccine trials, as well as some areas that are outside the scope of the Gates Foundation, such as multiple sclerosis and oncology trials. We hope that with the broader scope, the tool will attract users across the pharmaceutical sector, including trial managers, CROs, sponsors, regulators, patient advocacy groups, and even private equity investors.
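Part of covering a new disease area is simply recognizing which condition a protocol describes. The sketch below shows a naive keyword-counting approach to that sub-problem, purely as an illustration; the keyword lists are invented, and this is not meant to represent the tool's own condition classifier.

```python
# Naive keyword-based condition detection, for illustration only.
# Keyword lists are hand-picked and incomplete.
from collections import Counter

CONDITION_KEYWORDS = {
    "tuberculosis": ["tuberculosis", "sputum", "rifampicin"],
    "hiv": ["hiv", "antiretroviral", "cd4"],
    "malaria": ["malaria", "plasmodium", "artemisinin"],
}

def guess_condition(protocol_text: str) -> str:
    words = Counter(protocol_text.lower().split())
    scores = {
        condition: sum(words[kw] for kw in keywords)
        for condition, keywords in CONDITION_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(guess_condition("Participants with pulmonary tuberculosis will provide sputum samples"))
```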
Implementing Feedback From Risk Tool Users
Everyone is interested in the information held in the protocol, but different groups often want different things or have different checklists regarding complexity, cost, and risk models. For example, many researchers are interested in patient safety and health risks rather than the risk of uninformativeness. Major risks are well-known to experts within each domain. For some disease areas or intervention types, it is difficult to enroll participants, whereas for other trial types, enrollment is not a problem.
Patient advocacy groups have asked for the tool to check that patient groups have been involved in drafting the protocol and that more quality of life measures are included in the endpoints. These groups say that pharma companies may be too focused on measurable outcomes such as overall survival rate, but they want trial design to include a greater emphasis on appropriate quality of life measures such as the EORTC Quality of Life questionnaire. The patient advocates reported that interventions that produce a significant and measurable improvement in quality of life could still be of value to patients, even if they do not have any significant effect on survival rate, and they would like to see more data gathered to allow these conclusions to be drawn.
Financial professionals who are not in the clinical space, such as private equity investors, have asked for the tool to assess trial protocols for risk so that they can decide whether to invest in biotech companies. Trial sites may also be interested in this information, as they need to provide quotes for sponsors and CROs and compete with other sites on cost for the opportunity to run a trial.
Some users are concerned about uploading company-internal documents to a cloud-based software solution, so we are planning to allow users to install the tool on their own servers or run it on their own laptops, meaning confidential documents never need to leave their corporate infrastructure.
Planning For Future Iterations Of The Risk Tool
We would like to make the tool output a template clinical trial budget, as many users have indicated that this would be helpful. If the budget could be derived from a database of real-life cost figures, such as those made public by the Sunshine Act, this would allow companies to get an accurate view of the breakdown of costs of a trial before it is run.
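Such a template budget could be as simple as multiplying per-category unit costs by quantities taken from the protocol, such as the number of sites and participants. The sketch below illustrates the idea with invented figures; none of the costs or categories come from a real database or from the tool's planned budget model.

```python
# Illustrative template budget: invented unit costs multiplied by quantities
# taken from the protocol. All figures are placeholders.
def template_budget(n_sites: int, n_participants: int, visits_per_participant: int) -> dict[str, float]:
    unit_costs = {
        "site_setup_per_site": 20_000.0,
        "per_participant_visit": 350.0,
        "monitoring_per_site": 8_000.0,
    }
    budget = {
        "site_setup": unit_costs["site_setup_per_site"] * n_sites,
        "participant_visits": unit_costs["per_participant_visit"] * n_participants * visits_per_participant,
        "monitoring": unit_costs["monitoring_per_site"] * n_sites,
    }
    budget["total"] = sum(budget.values())
    return budget

for line_item, amount in template_budget(n_sites=10, n_participants=240, visits_per_participant=6).items():
    print(f"{line_item}: ${amount:,.0f}")
```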
Another interesting extension to the tool would be to identify endpoints and inclusion/exclusion criteria and then retrieve past trials with similar endpoints or inclusion/exclusion criteria from clinical trial registries such as ClinicalTrials.gov, allowing users to quickly determine whether their planned trial uses non-standard endpoints or assess how realistic their recruitment criteria are.
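Matching endpoints against past trials is essentially a text-similarity problem. The sketch below compares a planned endpoint against a few hand-written example endpoint descriptions using TF-IDF and cosine similarity (scikit-learn); in a real implementation, the comparison set would come from a registry such as ClinicalTrials.gov rather than being typed in by hand.

```python
# Illustrative endpoint similarity using TF-IDF and cosine similarity.
# The registry endpoints below are hand-written examples, not data pulled
# from ClinicalTrials.gov.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

registry_endpoints = [
    "Overall survival at 24 months",
    "Sputum culture conversion at week 8",
    "Change from baseline in EORTC QLQ-C30 global health status at week 12",
]
planned_endpoint = "Time to sputum culture conversion over 8 weeks of treatment"

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(registry_endpoints + [planned_endpoint])
similarities = cosine_similarity(matrix[-1], matrix[:-1]).ravel()

# Rank the example registry endpoints by similarity to the planned endpoint
for endpoint, score in sorted(zip(registry_endpoints, similarities), key=lambda x: -x[1]):
    print(f"{score:.2f}  {endpoint}")
```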
Overall, the tool can be made more useful for a wider variety of stakeholders if we can design it for separate user profiles such as patient advocates, financial planners, and medical professionals.
For example, if the tool could make a simple printout of recommended actions needed to improve a protocol, tailored for a medical professional, finance professional, or advocacy group, this would allow highly personalized feedback during protocol development. The tool will not serve as a substitute for a human reviewer, but it allows users to flag gaps in a trial design early and identify high-risk indicators.
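One way to produce such a printout would be to map each flagged issue to profile-specific wording, along the lines of the sketch below. The issues, profiles, and recommendations shown are invented for illustration and do not reflect the tool's planned logic.

```python
# Illustrative mapping from flagged protocol issues to profile-specific advice.
# Issues, profiles, and wording are invented for this sketch.
RECOMMENDATIONS = {
    "no_sample_size_calculation": {
        "medical": "Add a sample size calculation justifying the planned enrollment.",
        "finance": "Budget risk: without a sample size calculation, enrollment costs are hard to forecast.",
        "advocacy": "Ask the sponsor how the planned number of participants was justified.",
    },
    "no_quality_of_life_endpoint": {
        "medical": "Consider adding a validated quality of life instrument as a secondary endpoint.",
        "finance": "Quality of life data may strengthen the value proposition for payers.",
        "advocacy": "Request that a quality of life measure be added to the endpoints.",
    },
}

def print_recommendations(flagged_issues: list[str], profile: str) -> None:
    for issue in flagged_issues:
        advice = RECOMMENDATIONS.get(issue, {}).get(profile)
        if advice:
            print(f"- {advice}")

print_recommendations(["no_sample_size_calculation", "no_quality_of_life_endpoint"], profile="advocacy")
```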
References:
- Zarin, Deborah A., Steven N. Goodman, and Jonathan Kimmelman. "Harms from uninformative clinical trials." JAMA 322.9 (2019): 813-814. https://pubmed.ncbi.nlm.nih.gov/31343666/
- Wood, Thomas A., and Douglas McNair. "Clinical Trial Risk Tool: software application using natural language processing to identify the risk of trial uninformativeness." Gates Open Research 7.56 (2023): 56. https://gatesopenresearch.org/articles/7-56/v1
About The Author:
Thomas Wood grew up in London and studied physics for his first degree, then earned a master's in natural language processing at Cambridge University in 2008. After finishing his studies, Wood moved to Germany and then Spain, working for a number of small and large companies in data science and NLP. He returned to the UK in 2014 and founded Fast Data Science in 2018. He now works on a number of consulting projects in pharmaceuticals and manages the Clinical Trial Risk Tool, funded by the Gates Foundation, as well as Harmony, a tool that assists psychologists in combining datasets.