Simulants: Medidata's Generative AI Algorithm For Generating Synthetic Clinical Data

Clinical trials remain constrained by limited data access, high costs, and persistent inefficiencies in design and recruitment. Patient-level data, though rich in clinical insights, is often inaccessible. Generative AI offers a way forward by creating synthetic data: artificially generated datasets that replicate the statistical properties and complexity of real trial data without exposing sensitive patient or sponsor information. Synthetic trial data expands research opportunities, and it can also augment small or incomplete datasets, correct for underrepresentation, and improve predictive modeling.
Medidata’s Simulants generative AI algorithm exemplifies this innovation, leveraging one of the industry’s largest standardized trial data repositories to produce high-fidelity, regulatory-grade synthetic datasets. Simulants enables trial sponsors to model alternative designs, forecast dropout risks, refine eligibility criteria, and identify subpopulations most likely to benefit from new therapies. This synthetic data approach not only reduces trial burden and operational uncertainty but also accelerates breakthroughs in complex areas such as CAR-T and rare disease therapies. By coupling domain expertise with generative AI, synthetic data is emerging as a critical enabler of safer, faster, and more inclusive clinical development.