Article

The Governance of Clinical Trial Data

Author: F. Fagan

Abstract

Clinical trial results are among the most authoritative inputs in biomedical research and are increasingly reused as training, calibration, and evaluation data for Artificial Intelligence (AI) systems. Yet a large share of completed trials, especially small academic studies, pilot projects, and early-phase experiments, never disclose their findings. These omissions distort meta-analyses, hide negative results, and silently propagate upstream bias into AI models. Existing enforcement tools, including Food and Drug Administration (FDA) civil penalties and National Institutes of Health (NIH) grant sanctions, work tolerably well for high-stakes commercial trials when they are actually used, but fail across the long tail, where investigators lack the resources or incentives to upload results once grant funds expire.
This Article reframes clinical-trial transparency as a training-data governance problem and proposes a complementary solution: modest, fixed cost offsets for compliant uploads, funded through a two-tier access model. Public summaries on ClinicalTrials.gov would remain free, while a licensed API would provide harmonized, machine-readable datasets to high-volume users such as insurers, pharmaceutical consortia, and AI developers. Aligning the distribution of costs and benefits in this way makes transparency feasible for resource-constrained investigators and sustainable for the analytic institutions that depend on complete, unbiased data. The result is a light-touch institutional architecture that preserves fragile trial evidence and strengthens the reliability of both biomedical research and AI health systems.

Keywords: #ClinicalTrials, #TrainingData, #AITransparency, #DataCommons, #HealthLaw

How to Cite: Fagan, F. (2026) “The Governance of Clinical Trial Data”, Washington University Journal of Law and Policy. 81(1).
