Guest Column | May 12, 2025

How Recursion Is Industrializing Clinical Trials With AI

A conversation with Sid Jain, SVP, clinical development and data science, Recursion


If you’ve ever met an executive, listened to a presentation, or read an article from Recursion, you already know that the company distinguishes itself as a TechBio (as opposed to a biotech). Known by many for its longstanding AI-driven drug discovery platform, Recursion is now extending its computationally intensive paradigm to the domain of clinical research. Its comfort with and expertise in AI-driven drug discovery position Recursion for a logical transition to AI-led processes in the clinical space.

In this interview, SVP of Clinical Development and Data Science Sid Jain discusses Recursion's three-point "ClinTech" strategy, focusing on trial design, trial execution, and evidence generation — all using large data/AI/ML.

Clinical Leader: Recursion has built an AI-driven drug discovery platform and is now bringing that same AI-forward mindset into clinical research. What does that look like?

Sid Jain: Recursion built its foundation on an AI-driven drug discovery platform, and we’re now bringing that same AI-first mindset into clinical development. We call this effort ClinTech, and it’s about applying technology across the whole development journey, not just in discovery. We’re focused on three key pillars: how we design our trials, how we execute those trials, and how we generate evidence by applying AI/ML to large datasets.

On the design side, it starts with writing smarter protocols and making sure they’re patient-centric. We ask: Who’s most likely to benefit from this drug? Can we model which patients are likely to be responders and non-responders using multi-omics data? How do we best identify and stratify the patients most likely to respond using causal modeling? How do we minimize the burden of the trial on the patients? Patients have many visits, and they can be hours long. There's a tremendous amount of patient and site burden, and we use AI to minimize that burden. We look at the inclusion/exclusion criteria and safely open the funnel to as many patients as possible by simulating trials in silico, without compromising study endpoints. Using real-world data and other data sets, we also do in-house simulations for scheduling assessments.

Second comes execution. We ask: Are we finding the sites that have the patients we are looking for, and do those sites have the operational capabilities? How likely are they to enroll patients successfully for a particular phase of the trial? We use multimodal real-world data, whether it's multi-omics data, EHR, claims, or labs, and then combine that with operational data sources. Hot spotting is the practice of using data to pinpoint areas ("hot spots") that are likely to yield a higher number of qualified candidates more efficiently, as opposed to broad, widespread recruitment campaigns for trials. We use patient matching to identify those hot spots so that we can enroll that patient population as fast as we can at high-quality sites.
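To make the hot-spotting idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Recursion's actual workflow: the `PatientRecord` type, the `hot_spots` function, the region names, and the records are all hypothetical, and real patient matching would draw on far richer data than a single eligibility flag.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class PatientRecord(NamedTuple):
    """Hypothetical, de-identified record: a region and a pre-computed match flag."""
    region: str
    meets_criteria: bool


def hot_spots(records: Iterable[PatientRecord], top_n: int = 3) -> List[Tuple[str, int]]:
    """Rank regions by the number of patients who match the trial criteria."""
    counts = Counter(r.region for r in records if r.meets_criteria)
    return counts.most_common(top_n)


# Made-up records for illustration only.
records = [
    PatientRecord("Region A", True),
    PatientRecord("Region A", True),
    PatientRecord("Region A", True),
    PatientRecord("Region B", True),
    PatientRecord("Region B", False),
    PatientRecord("Region C", True),
    PatientRecord("Region C", True),
    PatientRecord("Region C", False),
]

for region, matched in hot_spots(records, top_n=2):
    print(f"{region}: {matched} matched patients")
```

In practice the ranking would also be weighed against each site's operational capabilities, as described above; this sketch covers only the patient-density half of that question.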

The third is evidence generation. Many of our studies are in oncology and rare diseases. It is very important to contextualize the results that we see in clinical trials with the natural history of that disease. In some cases, we can't run a control arm or placebo arm, so we do an external control arm. We're investing in generating evidence and making that evidence holistic for the regulators.

Under one of those three pillars, can you share how something was done previously at Recursion and how you are doing it now with AI assistance?

We’ve had programs where traditional approaches to inclusion and exclusion criteria were creating barriers to enrollment, especially in rare diseases with small patient populations. One of the things we’ve done is take a more data-driven approach to challenge those assumptions, and we’ll be able to share the impact of those changes soon.

In my past work at a different company, we applied this thinking to lab-based cutoffs. As an example, we found that the hemoglobin criterion for exclusion was leading to disproportionate exclusion of Black patients, who historically tend to have higher rates of anemia. By changing the hemoglobin criterion for exclusion from 8 to 7.5, we were able to increase our eligible patient population by 8% to 10% overall and expand the patient funnel to recruit a more diverse patient population as well.
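The kind of check described here, comparing how many patients remain eligible under two lab cutoffs, can be sketched in a few lines of Python. This is a hedged illustration only: the function names and the cohort values are hypothetical, the sketch assumes patients below the cutoff are excluded, and its result has nothing to do with the 8% to 10% figure reported above.

```python
from typing import Sequence


def eligible_count(hemoglobin: Sequence[float], cutoff: float) -> int:
    """Count patients at or above the cutoff (assumes values below it are excluded)."""
    return sum(1 for value in hemoglobin if value >= cutoff)


def relative_gain(hemoglobin: Sequence[float], old_cutoff: float, new_cutoff: float) -> float:
    """Fractional increase in eligible patients when moving from old_cutoff to new_cutoff."""
    old = eligible_count(hemoglobin, old_cutoff)
    new = eligible_count(hemoglobin, new_cutoff)
    if old == 0:
        raise ValueError("No patients are eligible under the old cutoff.")
    return (new - old) / old


# Made-up hemoglobin values for a hypothetical screening cohort.
cohort = [6.9, 7.4, 7.6, 7.8, 8.0, 8.3, 9.1, 10.2, 11.5, 12.0]

print(eligible_count(cohort, 8.0))   # eligible under the original cutoff
print(eligible_count(cohort, 7.5))   # eligible under the relaxed cutoff
print(f"{relative_gain(cohort, 8.0, 7.5):.1%}")
```

Running the same comparison separately for each demographic subgroup is what would surface a disproportionate exclusion like the one described.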

It sounds like your AI-assisted work is giving you the insights and the confidence to break away from the traditional way of doing things. Is that right?

Absolutely. It's Recursion’s mission to decode biology to radically improve lives. To do that, we're trying to industrialize every aspect of drug development, not just discovery, but trial design, execution, and evidence generation. A good example is in site execution. Across the industry, nearly 20% of trial sites enroll no patients at all. We're tackling that by integrating real-world data and predictive modeling into site selection and start-up. Instead of relying on historical relationships or gut feel, we’re building workflows that surface high-potential sites based on actual patient-level signals and operational performance and flag operational or recruitment risks early.

When we're developing these capabilities to do things differently, we're also thinking about scale and about standardization across workflows throughout the development life cycle, so that we can automate them and use different data modalities to improve the probability of success for our programs.

When it comes to integrating AI, what is the role of human oversight?

Because we are tech- and AI-first from the ground up, validation has always been at the core of how we work. We have this tightly integrated wet lab and dry lab approach where the AI predictions are rigorously tested and validated through physical experiments; this provides biological proof and explains why the AI made certain decisions. Human-in-the-loop is baked into our DNA.

When I talk to my teams — or even the industry — I always say we don't think of AI as replacing humans. It's making the job for those humans easier and more satisfying in a way. Think of AI as a Ph.D.-level research assistant. You're not going to take the work of a research assistant and then just publish it. You're going to review it, challenge it, and validate it. For us, AI is both an enabler and an assistant, but it's not overtaking that human ingenuity.

Understanding there is a vast amount of data generated by AI, how do you ensure it is high quality and will ultimately satisfy regulators?

The way we ensure data quality, especially when using AI at scale, comes down to building the right systems from the ground up. We’ve invested heavily in infrastructure that tracks data lineage, ensures reproducibility and auditability, and maintains full traceability from raw data to final analysis. This isn’t just about good practice; it’s about earning trust with regulators. Our team ensures that the evidence that's generated, whether it's from the clinical trials or from the real-world data, meets the highest regulatory standards.
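As a rough illustration of what tracking lineage from raw data to final analysis can mean, the sketch below chains content hashes across processing steps using Python's standard `hashlib`. It is a toy under stated assumptions, not a description of Recursion's infrastructure: the `fingerprint` and `record_step` helpers, the step names, and the sample data are all hypothetical.

```python
import hashlib
from typing import Dict, List, Optional

LineageEntry = Dict[str, Optional[str]]


def fingerprint(payload: bytes) -> str:
    """Return a SHA-256 hex digest that identifies this exact content."""
    return hashlib.sha256(payload).hexdigest()


def record_step(lineage: List[LineageEntry], step: str, output: bytes) -> List[LineageEntry]:
    """Append a step that points back to the previous step's output hash."""
    parent = lineage[-1]["output_sha256"] if lineage else None
    entry: LineageEntry = {
        "step": step,
        "parent_sha256": parent,
        "output_sha256": fingerprint(output),
    }
    return lineage + [entry]


# Made-up data and step names for illustration only.
raw = b"patient_id,hemoglobin\n001,8.2\n"
cleaned = raw.upper()

lineage = record_step([], "ingest_raw", raw)
lineage = record_step(lineage, "normalize_case", cleaned)

for entry in lineage:
    print(entry["step"], entry["parent_sha256"], entry["output_sha256"])
```

Because each entry stores the hash of its input, an auditor can walk the chain backward from a final analysis to the raw file and detect any step whose data no longer matches its recorded fingerprint.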

Along those lines, how do you ensure it’s been trained properly with relevant data?

It starts with being intentional — we’re not chasing the biggest models or the largest data sets just for the sake of it. We care about relevant data, high-quality and representative data, and fit-for-purpose models. That means understanding where the data came from, how it was generated, and whether it reflects the biological or clinical reality we’re trying to model.

For the models we build in-house, provenance and auditability are built in from day one — we know exactly what’s in the data set. And when we evaluate external models, we have the technical depth to challenge them — to ask not just what does this model do, but what was it trained on? And is this the right tool for this problem?

Since Recursion considers itself a TechBio, does the AI expertise already exist within your ranks? And if so, how has that benefited the company as pharma increasingly embeds AI assistance into its operations?

When I joined Recursion about seven months ago, I didn't have to convince anyone that we needed to use AI or have large data sets to drive our clinical development strategy. That was already baked in. That was the expectation. It's really important to stop thinking of AI as an add-on or a shiny new tool. It needs to be embedded into the fabric of how you operate. And that goes from therapeutic hypothesis to study design and into decision-making. Integrating AI well requires designing workflows, data strategies, and decision-making frameworks so that AI isn't something you use occasionally, but rather something that accelerates every part of your process. The culture of open-mindedness and the bilingual nature of talent in tech and R&D make a huge difference in terms of adaptability and being able to pivot as rapid technological changes occur.

What's a good starting point for those figuring out how to integrate and implement AI?

A couple of things come to mind.

First, prioritize building internal capabilities over simply outsourcing expertise. While not every organization has the advantage of a team equally fluent in technology and drug development (that’s something unique about Recursion), it’s essential to cultivate both. You need internal depth to frame the right questions and make informed decisions about when and how to partner externally. Ultimately, the goal isn’t just to apply AI; it’s to ensure you’re solving the right problems, the ones that truly move the needle for patients. That requires not just tools, but vision, focus, and the courage to rethink what’s possible.

The other is data readiness. The AI is only as good as the data it learns from. Many biotechs underestimate how fragmented or incomplete their internal data really is. External partners often offer more than just technology: they bring well-curated, harmonized data sets that can jump-start AI applications. In the short term, especially when timelines are tight (6 to 12 months), partnering can accelerate progress. But long-term success often requires a hybrid approach: building internal capabilities while selectively partnering to fill critical gaps.

Because in the end, this isn’t just about deploying AI—it’s about ensuring that AI is being applied to the right problems. The goal is not automation for its own sake, but transformation that leads to better-designed trials, faster execution, and ultimately, better outcomes for patients.

About The Expert:

Sid Jain is a healthcare innovator passionate about reimagining clinical development through the power of data, technology, and biology. As SVP of clinical development & data science at Recursion, Sid leads teams that blend AI, real-world data, computational biology, and clinical expertise to modernize how therapies are developed. He is focused on transforming clinical development into a more predictive, efficient, and patient-centric process, seamlessly bridging the gap between drug discovery and development in Recursion’s TechBio platform. Prior to Recursion, Sid served as vice president of global development data science & digital health at Johnson & Johnson, where he led initiatives like real-world registries and AI-powered trial platforms that accelerated development timelines. Over his 20-year career, he has built and scaled data science and digital health teams across biopharma and health tech companies, including ConcertAI and NantHealth. Sid is also a committed advocate for advancing diversity, equity, and inclusion in clinical research.
