How AI And Secondary Data Use Are Changing The Way We Do CTAs, Part Two
By Katherine Leibowitz and Catherine London, Leibowitz Law

With AI advancing at lightning speed, cybersecurity threats multiplying, and electronic health records (EHRs) now the norm, it’s time to revisit your clinical trial agreements (CTAs). Emerging technologies are transforming the clinical research landscape, and your contracts need to keep up. A CTA refresh is a strategic move to safeguard sensitive data, manage cybersecurity risks, and ensure regulatory compliance.
In part one of this series, we explored foundational topics, including cybersecurity and EHR standards. Now, in part two, we delve into two emerging drivers of change: secondary use of study data and AI.
Secondary Research And Use
Background
A large industry has developed around secondary research, which is the reuse of information or biological specimens collected during clinical research for an unrelated, or “secondary,” research activity. For example, sites may contribute source records or leftover specimens to a database or repository or make their EHR systems containing study-related data available internally or to third parties for research or other purposes. Over time, and particularly with AI, secondary research has expanded beyond traditional research activities into a broader concept, “secondary use,” that includes non-research purposes. CTAs often address secondary research and use of information, either head-on or indirectly through the non-exclusive, royalty-free (NERF) license or EHR provision. This section focuses on data, not specimens.[1]
NERF To Study Data
In the CTA, the sponsor often grants the site a NERF to use the study data for internal, non-commercial academic research, patient (or participant) care, and education purposes. This license may be subject to the CTA’s confidentiality and publication obligations and may prohibit sublicensing, among other things. The NERF enables the site to use the data generated during the study for purposes unrelated to the study, such as advancing its academic mission. However, advances in technology and the increased sophistication of electronic databases are prompting sponsors to scrutinize this provision more carefully.
Contracting Party Perspectives
- Sites:
- Necessary for Academic Mission. Many sites require a NERF to advance their academic mission, including research, education, publication, and participant or patient care. Sites want to make sure the NERF terms do not unduly restrict them from carrying out their mission.
- Realistic Compliance. Sites need NERF terms that align with their actual policies, procedures, and technological capabilities.
- Sponsors:
- Maintaining Competitive Edge. Sponsors need the NERF to be narrowly tailored to shield their study data from competitors and protect their intellectual property.
- Ensuring “Internal” Truly Means Internal. Sponsors want clarity that study data will remain truly “internal” and “non-commercial.” Without precise wording and key restrictions, these two terms could be interpreted more loosely than intended. For example, can study data be used for research by the site (which is internal) that is funded by a competitor, a nonprofit, or the government (which are external)?
- Access and Controls. Whether the site’s exercise of the NERF is internal or non-commercial may ultimately depend on who can exercise the NERF (is sublicensing permitted?), who has access to NERF-generated data (secondary data or NERF results), and what database controls exist to prevent unintended access.
- AI Considerations. Sponsors are starting to consider whether AI will have access to the study data and, if so, who benefits from the AI access: The site? The AI vendor? Other third parties? Does any data ingested by the AI refine the AI? The answers may turn “internal” into “external.”
- Secondary Data Risks. The parties often do not thoroughly assess whether secondary data generated under the NERF remains internal, non-commercial, or confidential. In the past, sponsor concerns were largely theoretical, but with AI and evolving data-sharing practices, the risks are now very real. For example, if secondary data is pooled with data from other sources, can sponsor-specific data be isolated (perhaps through AI)? How high is the risk of competitor access?
- High Stakes for Clinical-Stage Companies. For sponsors whose entire pipeline hinges on a single investigational product, competitive exposure can “make or break” the company. These sponsors want to prevent the exercise of the NERF from crossing the line from internal use into competitor access or commercial use.
EHR Access
Health systems and universities are increasingly leveraging their EHR systems for research, AI development, and operational improvements. As with NERFs, this access raises significant concerns regarding secondary research or data use.
Contracting Party Perspectives
- Sites:
- EHR Ownership. As mentioned in part one of this series, the site owns or controls the EHR system and considers any restrictions on EHR use, including restrictions on study source documents, to be inappropriate.
- Patient Record Requests. Patients have the right to request their records, and sites must honor these requests — regardless of sponsor concerns about secondary use.
- AI and Healthcare Improvements. Many sites are actively collaborating with AI vendors to enhance healthcare delivery, and they do not want sponsors restricting their ability to license access to their EHR systems to AI vendors.
- Sponsors:
- Competitive Risk. EHR records may contain sponsor names, protocols, study products, and study adverse event information. If competitors gain access to this information through secondary research or data use, the risks are substantial.
- Risk Levels Vary. Sponsors with a single investigational product face the greatest exposure.
- De-Identification Isn’t a Solution. When questioned about these risks, some sites assure sponsors that they de-identify the study source records under HIPAA before permitting third-party access to their EHR. But HIPAA de-identification removes only patient identifiers; it does not remove non-protected health information (PHI) data like sponsor names and protocol titles, leaving sponsors vulnerable (see the sketch following this list).
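To make that gap concrete, here is a minimal sketch in Python. The record, field names, sponsor, and protocol are all hypothetical, and the identifier list is a simplified stand-in for the 18 HIPAA Safe Harbor categories; the point is simply that stripping patient identifiers leaves study-identifying metadata behind.

```python
# Hypothetical EHR record for a study participant; all values invented.
record = {
    "patient_name": "Jane Doe",       # PHI: removed by de-identification
    "mrn": "483-220-117",             # PHI (medical record number): removed
    "date_of_service": "2025-01-14",  # PHI (date): removed or generalized
    "sponsor": "Acme Therapeutics",   # NOT PHI: survives de-identification
    "protocol": "ACME-101: Phase 2 Study of ACM-42",  # NOT PHI: survives
    "note": "Grade 2 nausea attributed to study drug ACM-42.",
}

# Simplified stand-in for the 18 HIPAA Safe Harbor identifier categories.
SAFE_HARBOR_FIELDS = {"patient_name", "mrn", "date_of_service"}

def deidentify(rec: dict) -> dict:
    """Drop patient-identifier fields; everything else passes through."""
    return {k: v for k, v in rec.items() if k not in SAFE_HARBOR_FIELDS}

print(deidentify(record))
# The output no longer identifies the patient, but it still names the
# sponsor, the protocol, and the investigational product.
```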
Key Takeaways
- Understand Each Party’s Perspective. Sponsors and sites each have legitimate concerns about secondary access to and use of study data, whether through the NERF or the EHR system. Taking the time to learn and understand the other party’s perspective will help the parties structure the CTA accordingly.
- Use Cases and Downstream Data Access. Contracting parties should discuss likely use cases and access scenarios for study-related data held at the site. Who has access to study data? The NERF results? EHR records? For what purpose? Can those with access share the data with third parties? Addressing these questions early avoids conflicts later.
- Account for AI’s Role. The parties should define AI’s role in accessing and handling the NERF data, NERF results, and EHR data relating to the study. This is addressed in more detail below.
- Technology Benefits vs. Competitive Risk. New technologies bring promise but also risk. Sponsors and sites need to assess whether the CTA language permits exposure of sensitive data to unintended parties, and how attenuated or diluted that risk is (or isn’t).
- Precise NERF. The NERF should be precisely tailored to ensure both parties understand — and are comfortable with — its scope. The parties should discuss what database controls and safeguards the site can realistically implement to ensure compliance with the NERF, keeping AI in mind.
- Practical EHR Terms. Sites should make sure they can implement any EHR terms (such as access restrictions or data removal) they agree to.
- Audit Database Security. Sites should evaluate their data access policies (a sample access review follows this list). Key questions include:
- Who can access the study data under the NERF?
- Where are the NERF results stored?
- How is access monitored and restricted?
- What is the institution’s policy on making data available (including EHR records) for secondary research or use?
- Does AI have access? If so, under what guardrails?
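For illustration, here is a minimal sketch of the kind of access review these questions imply, assuming a hypothetical export of database grants and a site-maintained roster of approved internal users; a real review would pull grants from the actual database or identity provider.

```python
# Hypothetical export of accounts granted access to a NERF database.
current_grants = [
    {"account": "dr.smith",    "role": "investigator"},
    {"account": "stats-core",  "role": "analyst"},
    {"account": "copilot-svc", "role": "ai-service"},  # embedded AI tool
    {"account": "vendor-etl",  "role": "contractor"},
]

# Hypothetical roster of accounts the site has approved for internal,
# non-commercial use consistent with the NERF.
approved_roster = {"dr.smith", "stats-core"}

def audit(grants: list[dict], approved: set[str]) -> list[str]:
    """Flag grants outside the approved roster, calling out AI service
    accounts separately because they may route data externally."""
    findings = []
    for g in grants:
        if g["account"] in approved:
            continue
        if g["role"] == "ai-service":
            reason = "AI service account: confirm whether ingested data refines the vendor's model"
        else:
            reason = "not on the approved internal-use roster"
        findings.append(f"REVIEW {g['account']} ({g['role']}): {reason}")
    return findings

for finding in audit(current_grants, approved_roster):
    print(finding)
```

A finding here is a conversation starter, not a verdict; whether a flagged account is consistent with the NERF depends on the license terms discussed above.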
Artificial Intelligence
Background
We could write an entire article about AI[2] in clinical trials. While AI has been around for decades, the rise of ChatGPT and large language models has rapidly expanded AI’s role in clinical research. AI now supports drug discovery, patient-trial matching, participant adherence, trial design optimization, synthetic control arms, and adverse event prediction and monitoring.
But this article focuses on a critical issue: What happens when AI accesses study data without all parties knowing? As AI integrates deeper into clinical research, concerns around unauthorized access, data sharing, and confidentiality are becoming impossible to ignore.
- Tech Giant Access to Healthcare Institution Data. For over a decade, tech giants have been partnering with healthcare systems to access EHR data and develop machine learning tools for predicting medical events.[3] These activities, combined with the recent surge in AI applications, raise serious concerns for clinical trial stakeholders, particularly around data sharing, confidentiality, and competitive risks.
- Tradeoffs. More data means better AI models, leading to improved healthcare operations and analytics. But AI access to study data comes with significant risks, particularly for sponsors. The parties to the CTA risk having critical confidential data made accessible to third parties – though typically as part of a vast data set that may make individual study data difficult to extract or trace back to its source. While sponsors face the greatest exposure, both sides need to be aware of the implications.
- Confidentiality Concerns.
- Lawsuits Over AI Ingesting Proprietary Content. A wave of lawsuits highlights the risks of AI ingesting proprietary content without consent. On February 13, 2025, major publishers, including The Atlantic, Forbes, the LA Times, and Condé Nast, sued startup Cohere Inc. for copyright and trademark infringement. The lawsuit alleges Cohere used over 4,000 copyrighted works to train its AI, displayed large portions (or entire articles) to end users, and generated false articles misattributed to the publishers.[4]
- A Growing Legal Trend. Similar lawsuits are piling up: The New York Times sued OpenAI and Microsoft in December 2023, News Corp sued Perplexity in October 2024, and Thomson Reuters won a copyright ruling against AI company Ross Intelligence in February 2025. These lawsuits underscore AI’s legal and ethical challenges in using proprietary data.[4]
- The Risk for Clinical Trials. A key issue in these lawsuits is whether AI tools can refine themselves using ingested data. While most AI companies claim they do not train models with user data, fine-tuning remains common. For sponsors in clinical trials, confidential study data could be processed, repurposed, and potentially exposed through AI models.
Contracting Party Perspectives
- Both parties:
- Unauthorized Use and Disclosure of Confidential Information and PHI. Using AI on study data or documents may unintentionally make confidential data accessible to third parties, perhaps in diluted form, in violation of the CTA’s confidentiality obligations or PHI restrictions.
- Vendor-Based Risk. If parties are unaware of the AI tools their vendors use or if vendors lack proper AI controls, their vendors’ actions may place the parties in violation of the same CTA sections.
- State Law. Understanding how AI is being used is essential to complying with emerging state AI and privacy laws that may apply.
- Sites:
- Operational Freedom. Sites want to avoid sponsor-imposed restrictions on AI tools used in daily operations (e.g., the EHR, document management systems, the clinical trial management system (CTMS), and NERF databases).
- Confidentiality and PHI. AI use by sponsors or vendors could inadvertently breach CTA confidentiality or PHI obligations relating to site confidential information.
- Competitive Risk Dilution. Study data absorbed into AI models may become part of vast data sets, reducing traceability to sponsors or study products.
- Implementation Challenges. Completely banning AI tools is often impractical and difficult for sites to enforce.
- Sponsors:
- Loss of Confidentiality.
- If AI ingests study data from site databases and the AI is accessible by third parties, competitors may be able to extract study-related data.
- If a site uses AI for document review and the CTA itself is sponsor confidential information, the site may violate the CTA’s confidentiality obligations.
- Competitive Risks. AI access to study data may affect the sponsor’s:
- Market reach and share
- Intellectual property protection
- Patentability
- FDA review process
- Data Gateways. AI operates within databases, posing risks through the NERF, EHR systems (see the discussion above on secondary research and use), and site document management tools like CTMS.
Key Issues To Consider
AI in clinical trials is in its early stages, and most CTAs do not explicitly address AI… yet. Prudent stakeholders should understand AI’s role in studies to prevent unintended consequences. While AI provisions in CTAs are not standard, it’s critical for stakeholders to assess how AI may interact with study-related data. Questions and considerations include:
- AI Access to Sponsor Confidential Information
- Does AI at the site interact with study data, the protocol, NERF data, other sponsor confidential information, or EHR records? Consider AI in the site’s and its vendors’ systems as well as tools used by study personnel.
- Does the site’s document management system have built-in AI tools (e.g., Microsoft Copilot)? Does the site’s Office of the National Coordinator for Health Information Technology (ONC)-certified EHR system include AI?
- Does study data refine AI for broader use, benefiting other customers? If so, is the data sufficiently diluted to prevent it from being linked back to the sponsor?
- AI Access to NERF Data
- Where does the site store study data subject to NERF rights?
- Are databases containing secondary NERF data accessible by AI?
- Could AI-ingested data be traceable back to the sponsor or study product?
- AI Access to Site Confidential Information
- Sites should determine whether sponsor tools, CROs, database vendors, or cloud service providers use AI.
- Remote source data verification poses a lower risk due to site-controlled monitoring systems.
- AI Access to Patient Data (PHI)
- Both parties must assess whether AI use triggers state privacy law compliance obligations (and stay tuned for evolving AI regulations).
- Sites should ensure AI vendors sign business associate agreements (BAAs) where required and that those BAAs address AI-specific risks.
- Sponsors should evaluate whether their or their vendors’ AI tools risk unauthorized PHI disclosure in violation of the CTA.
- Sponsors should consider whether EHR records containing study adverse event information require redaction before AI access (see the redaction sketch following this list).
- Downstream Obligations. Unregulated AI use may violate CTA requirements to bind employees, agents, and contractors to the confidentiality and PHI protections.
- Current Industry Practices. Approaches to AI in CTAs vary:
- Silence: No mention of AI.
- Strict Prohibitions: Ban on AI interacting with the CTA or study data.
- Tailored NERF and/or EHR Language: AI-driven concerns addressed in these provisions.
- Confidentiality Provisions: AI prohibition tucked into confidentiality sections.
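To illustrate the redaction question raised above, here is a minimal sketch that scrubs study-identifying terms from free text before it reaches an AI tool. The sponsor name, protocol numbering convention, and product code are hypothetical; a production pipeline would draw its term list from the CTA and protocol and contend with much messier text.

```python
import re

# Hypothetical study-identifying terms; a real list would come from the
# CTA and protocol (sponsor names, protocol numbers, product codes, etc.).
STUDY_TERMS = [
    r"Acme Therapeutics",  # hypothetical sponsor name
    r"ACME-\d{3}",         # hypothetical protocol numbering convention
    r"ACM-\d{2}",          # hypothetical investigational product code
]
PATTERN = re.compile("|".join(STUDY_TERMS), flags=re.IGNORECASE)

def redact_study_terms(text: str) -> str:
    """Mask study-identifying terms so downstream AI ingestion cannot
    link the record back to the sponsor or investigational product."""
    return PATTERN.sub("[REDACTED]", text)

note = ("Participant enrolled in ACME-101 (Acme Therapeutics); "
        "grade 2 nausea possibly related to ACM-42.")
print(redact_study_terms(note))
# -> Participant enrolled in [REDACTED] ([REDACTED]); grade 2 nausea
#    possibly related to [REDACTED].
```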
Key Takeaways
- Learn
- Study sites and sponsors should discuss how each party – and its vendors – uses AI in relation to study data and other confidential information.
- Conduct AI audits (a sample inventory record follows these takeaways) to assess:
- What AI tools are in use?
- For what purpose?
- Are they homegrown or third party?
- What safeguards exist?
- How is data processed and stored?
- What privacy, confidentiality, and cybersecurity measures are in place?
- Is a blanket AI ban practical or enforceable?
- Has a data mapping exercise been conducted to track where data is received, generated, or stored?
- Be Transparent and Consider Guardrails:
- Adding an AI provision to the CTA will prompt discussions among the stakeholders and drive internal policy development.
- Define clear AI usage restrictions in the CTA based on the outcome of these discussions.
- AI is being adopted quickly, sometimes by employees or vendors who are not subject to formal policies or oversight. Sites and sponsors should investigate how their employees and vendors use AI.
- AI governance is evolving; stakeholders must update contracts and practices accordingly.
- Vendor Due Diligence
- Ensure vendor contracts for any study-related services, including clinical trial management, database hosting, and data processing, include:
- Security certifications (ideally SOC 2 or ISO 27001)
- Comprehensive documentation (including transparency on training data)
- Audit rights
- Indemnification provisions
- Incident response policies
- Policies for handling AI hallucinations
- Legal compliance requirements
- Ongoing monitoring and governance
- Align AI Terms with Other CTA Provisions:
- Align AI-related clauses with the CTA’s confidentiality, publication, security, NERF, EHR, indemnification, and limitation of liability provisions.
- Coordinate AI-related terms with known AI applications in the study, including those referenced in the protocol, incorporated into digital health technologies (DHTs), or embedded in the study product.
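As a starting point for the AI audit described in the “Learn” takeaway above, here is a minimal sketch of an inventory record, with hypothetical field names and an invented example entry; institutions would adapt the fields to their own governance programs.

```python
from dataclasses import dataclass, field

@dataclass
class AIToolRecord:
    """One row in a hypothetical AI inventory; fields mirror the audit
    questions above."""
    name: str
    purpose: str                 # what the tool is used for
    source: str                  # "homegrown" or "third party"
    touches_study_data: bool     # interacts with study data or confidential info?
    trains_on_inputs: bool       # does ingested data refine the model?
    safeguards: list[str] = field(default_factory=list)
    data_locations: list[str] = field(default_factory=list)  # from data mapping

# Invented example entry: a third-party document-review assistant.
example = AIToolRecord(
    name="DocAssist",            # hypothetical tool name
    purpose="summarize study documents and contracts",
    source="third party",
    touches_study_data=True,
    trains_on_inputs=False,      # verify in the vendor contract, not marketing copy
    safeguards=["SSO-gated access", "no-training clause", "audit logging"],
    data_locations=["contracts share", "CTMS attachments"],
)

# Records like this feed the CTA discussion: which tools need restrictions,
# vendor terms, or an outright carve-out.
print(example)
```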
Conclusion
Technology is revolutionizing clinical trials, but with innovation comes complexity. Clear governance, contract adaptations, and proactive risk management are essential. AI is just one piece of the puzzle. Protecting research investments means strengthening cybersecurity, keeping pace with evolving EHR system standards, setting clear policies for secondary data usage through NERF and EHR systems, and addressing AI.
Looking ahead, organizations must take a forward-thinking approach to technology-related risk management while strengthening data governance frameworks to stay compliant and competitive.
References/Footnotes:
1. A discussion of biobanking and other secondary use of biological specimens is beyond the scope of this post.
2. Many important and often confusing terms are used when referring to AI, including model, algorithm, tool, and program. For purposes of this post, we refer to AI generally to mean any or all of these terms.
3. IBM’s Watson (2015), Google’s DeepMind (2016), and Google’s partnerships with universities and health systems, including the University of Chicago (which spawned a class action lawsuit in 2019 that was eventually dismissed), UCSF, and Ascension (Project Nightingale, which sparked an HHS investigation in 2019 and Congressional attention in 2020), started this trend.
4. Alexandra Bruell, “Publishers Sue AI Startup Over Content Use,” The Wall Street Journal, February 14, 2025, p. B1.
A version of this article first appeared on Leibowitz Law's blog. It is republished here with permission.
About The Experts:
Katherine Leibowitz has supported the clinical trials enterprise for 25 years. She co-founded Leibowitz Law in 2013 after spending 17 years at a top global law firm. Her boutique life sciences regulatory and transactional law firm is laser-focused on clinical trials and technology commercialization, serving sponsors/manufacturers, technology service providers, research institutions, CROs, and digital health companies.
Katherine handles the full clinical trial operations contracting process from CTAs and budgets to HIPAA authorizations, informed consent forms, EDC vendor agreements, CRO MSAs, committee membership, physician consulting, and more. In today’s fast-evolving world of electronic databases, decentralized trials, AI, cyber risk, secondary research, and biobanking, she excels at modernizing contract templates and negotiations to align with the shifting landscape and move deals forward efficiently.
A frequent speaker and author, Katherine enjoys synthesizing the many regulatory, legal, and industry norms into integrated, practical guidance for the life sciences community.
Catherine London has over a decade of experience representing life sciences companies and health care providers on clinical research contracting and compliance matters, providing comprehensive legal support to sponsors, research institutions, CROs, and other stakeholders involved in sponsored research.
Catherine draws on her deep regulatory expertise to counsel clients nationwide on issues impacting clinical trials, including informed consent, IRB considerations, patient privacy, conflicts of interest, and risk management. She is well-versed in the laws, regulations, and industry standards governing clinical trials, including FDA requirements, the Common Rule, HIPAA, and Good Clinical Practice, as well as fraud and abuse laws, Medicare and Medicaid requirements, and transparency reporting requirements. Catherine employs this expertise to help clients navigate the intricacies of clinical trial agreements, ensuring that they align with regulatory requirements, protect business interests, and foster successful collaborations.