i-Cubed Used End-To-End AI In A Proof-Of-Concept Trial. Here's What They Learned
A conversation with i-Cubed Associate Director Christoph Hornik, MD, MPH, and Clinical Leader Executive Editor Abby Proch

Project Loom is not dipping its toes into the shallow waters of AI but jumping headlong into its deep end.
Rather than optimizing isolated clinical trial tasks, this initiative explores whether AI agents working under human oversight can execute end-to-end trial workflows more efficiently. In a proof-of-concept study spanning document generation, IRB processes, patient engagement, data integration, and reporting, Project Loom showed that AI could help compress timelines while maintaining quality and traceability. At the same time, it revealed important limitations and the need for robust governance.
In this Q&A, i-Cubed’s Christoph Hornik discusses what the project revealed about AI’s role in clinical research, including its most promising use cases and where human expertise remains indispensable.
Clinical Leader: What inspired the launch of Project Loom, and what specific challenges in clinical trial operations were you hoping to address?
Christoph Hornik, MD, MPH: Clinical trials remain the foundation of evidence generation, yet they are often slowed by fragmented workflows, extensive documentation requirements, repeated handoffs between systems and organizations, and complex review processes. We launched Project Loom to explore whether agentic AI could help address these operational inefficiencies by coordinating trial activities across the entire life cycle rather than optimizing individual tasks in isolation.
The goal was not simply to make one process faster. We wanted to determine whether a network of AI agents, working together with human oversight, could execute multiple trial functions end-to-end while maintaining quality, transparency, and accountability.
Our ultimate goal is to find a way for all of us to collectively do more trials — faster, cheaper, easier, and with higher quality — because trials are the key to bringing evidence to patients faster.
Which trial activities did you include in the proof-of-concept study, and where did AI appear to deliver the greatest operational impact?
The proof of concept evaluated five phases of a clinical trial workflow: study document generation, IRB package preparation and review, participant identification, participant screening and engagement through a chatbot, and data integration and reporting. The platforms also created EDC system case report forms and generated a clinical study report.
We observed the greatest impact on tasks that are highly structured, repetitive, and document intensive. The AI platform generated initial protocol-related documents, ICFs, CRFs, and downstream trial artifacts in hours rather than weeks. We also observed high accuracy for automated data extraction and transfer into the EDC system.
In the study, AI systems performed tasks ranging from regulatory submissions to generating clinical study reports. Which of those results surprised you the most?
The most surprising finding was not a single task. It was the fact that the entire workflow could be completed end-to-end. Historically, different groups perform these activities using different systems over many months. Seeing an integrated AI workflow move from a trial synopsis to regulatory documents, participant interactions, data collection, and final reporting in one to two weeks was remarkable.
Also surprising was how well the platforms performed on some highly structured activities. For example, automated transfer of participant data into the EDC system achieved near-perfect accuracy. At the same time, the study highlighted areas where AI still struggles, particularly tasks requiring nuanced clinical reasoning or interpretation. That combination of strengths and weaknesses provided valuable insight into where these technologies are ready for deployment and where additional development is needed.
Were there any unexpected challenges or limitations revealed during the proof-of-concept phase?
Absolutely. One of the most important findings was that success varied substantially across different trial functions. All technology partners’ platforms performed well on document generation and data handling, but none successfully identified eligible patients from the synthetic EHR data set without intervention. This highlighted that seemingly straightforward tasks can become challenging when they require interpretation of complex clinical information.
We also learned that agent training is critically important. Performance was not determined solely by the underlying language model; it depended heavily on workflow design, evaluation mechanisms, and domain-specific training. Finally, we found that comprehensive audit logs and workflow traceability were essential for identifying failures and understanding how outputs were generated. These lessons reinforced the importance of governance and human oversight.
Skeptics often argue that AI cannot operate reliably within the regulatory and scientific rigor required for clinical research. What evidence from Project Loom addresses those concerns?
Skepticism is healthy, particularly in a field where patient safety and scientific integrity are paramount.
Project Loom demonstrated that AI could perform many clinical trial activities within a structured governance framework. The platforms generated regulatory and ethical documents that largely aligned with established guidance frameworks, maintained detailed audit trails, and achieved very high accuracy for structured data transfer tasks. We used independent human review and AI-based evaluation systems throughout the process to assess quality and completeness.
Importantly, we designed Project Loom not to show that AI can replace human judgment but to determine whether AI could serve as a highly capable execution layer operating under human supervision. The study provides evidence that AI can contribute meaningfully to regulated research environments when paired with appropriate oversight, validation, and accountability mechanisms.
Project Loom emphasizes integrating AI in a way that supports human expertise. Where do clinicians, trialists, and operations leaders remain indispensable in the clinical trial process?
Human expertise remains essential in areas involving judgment, ethics, risk assessment, scientific interpretation, and regulatory strategy.
Clinicians determine whether study designs make medical sense and protect participants. Trialists evaluate feasibility, operational risk, and scientific validity. Regulatory experts ensure compliance and appropriate positioning of evidence. These are not simply information-processing tasks; they require contextual understanding, experience, and accountability.
Our experience suggests that AI is most effective when it serves as a force multiplier. Humans provide direction, oversight, and decision-making, while AI performs large portions of the execution work. The combination appears substantially more powerful than either operating alone.
If AI can meaningfully automate trial operations, what does that mean for the future role of clinical operations teams?
We don’t believe it means fewer experts are needed. Rather, it means experts can spend less time on repetitive execution and more time on higher-value activities.
Many operational teams today devote significant effort to document preparation, reconciliation, data transfer, and workflow coordination. If AI can reliably perform portions of these activities, clinical operations professionals can focus more on study design, quality oversight, participant experience, risk management, and strategic decision-making.
In that sense, the future may be less about replacing people and more about changing how they work. The role shifts from manual execution toward orchestration, governance, and optimization of increasingly sophisticated human-AI workflows.
Looking five to 10 years ahead, what could clinical trials look like if initiatives like Project Loom are successful?
If successful, clinical trials may become significantly more connected, adaptive, and efficient.
Much of the routine operational work could be coordinated by AI systems that continuously manage documents, monitor data quality, support participant interactions, and prepare regulatory artifacts. AI systems could also compress trial timelines substantially, allowing studies to launch faster and generate evidence more rapidly.
At the same time, we expect human oversight to remain central. The most likely future is not fully autonomous clinical research. Instead, it is a hybrid model where AI manages execution while humans provide governance, scientific leadership, ethical oversight, and clinical judgment. That model has the potential to increase both efficiency and quality.
Finally, what are the immediate next steps for Project Loom as you move beyond this proof-of-concept study?
The proof of concept answered an important question: end-to-end agentic AI workflows for clinical trials are feasible today. The next phase focuses on refinement, validation, and deployment.
Our priorities include improving agent training, strengthening evaluation frameworks, expanding independent assessments of quality and risk, and identifying the trial activities where AI delivers the greatest return on investment. We are also interested in systematically comparing human-only, AI-assisted, and hybrid workflows to better understand where each approach is most appropriate.
The ultimate goal is safer, higher-quality, and more accessible evidence generation. We believe AI can help us achieve that.
About The Expert:
In addition to his role at i-Cubed, Christoph Hornik, MD, MPH, serves as vice chair for research in Duke’s Department of Pediatrics, director of the Duke Clinical Research Institute Pharmacometrics and Fit-for-Purpose Research Group, and a practicing pediatric cardiac intensivist. He also consults for pharmaceutical companies and startups.
Christoph has led multicenter randomized controlled trials, real-world evidence studies, regulatory-compliant data analysis and reporting, prospective cohort studies and registries, and digital applications supporting direct-to-participant research. Over the past decade, he has focused on team-based clinical research innovation and expanding participant access to research. He believes integrating clinical research into clinical care can improve evidence generation and health outcomes.
Christoph earned his medical degree from Albert-Ludwigs University in Freiburg, Germany, and completed a pediatrics residency and fellowships in pediatric cardiology and critical care medicine at Duke University. He also earned an MPH and Ph.D. from the University of North Carolina at Chapel Hill.