FDA Docket FDA-2026-N-4390  ·  91 FR 23100

AI-Enabled Optimization of Early-Phase Clinical Trials

A structured response demonstrating how Aurelyn Trial | OS™ and the Aurelyn Clinical Engines™ directly address the FDA's Request for Information — across pilot design, evaluation metrics, and the trustworthy-AI principles of the NIST AI Risk Management Framework.

Issued: Apr 29, 2026
Comments Due: Jun 29, 2026
Coordinating Office: Dep. Chief Medical Officer, OC
Centers: CDER · CBER · OCE

The comment period was extended 30 days from the original deadline; submissions are accepted through June 29, 2026 at regulations.gov under Docket No. FDA-2026-N-4390. This document is formatted to mirror the RFI's question taxonomy (Categories A & B) for direct, citable response.

01 — Positioning

Why Aurelyn answers this RFI

The FDA identifies eight ways AI may improve early-phase trials and asks how to structure a pilot and measure its success. Aurelyn Trial | OS™ was architected for exactly this problem space: a governed, regulator-ready operating system that turns model-informed decision support into auditable, ALCOA+ evidence.

Built for the eight use cases

Recruitment, biomarker selection, stratification, dose escalation, adaptive design, safety monitoring, Phase 1→2 decisions, and endpoint validation are each handled by a named Clinical Engine. These engines are the core architecture, not bolt-ons.

Trustworthy by construction

Every engine runs inside a Governance & Assurance layer mapped to all seven NIST AI RMF characteristics and the FDA's risk-based credibility-assessment framework — context-of-use, model risk, and lifecycle monitoring as first-class objects.

Measurable from day one

The platform emits telemetry for the metrics the RFI's Category B requests: cycle times, decision concordance, signal-detection latency, drift, and subgroup fairness. Because this telemetry is pre-instrumented, the pilot can be evaluated rigorously as it runs rather than reconstructed retrospectively.

02 — Platform Architecture

The Aurelyn Clinical Engines™

Aurelyn Trial | OS™ is the orchestration layer; the Clinical Engines™ are modular, independently validated capabilities. Each maps to one or more of the FDA's stated AI opportunities. A cross-cutting Governance & Assurance layer wraps every engine.

FDA-stated opportunity → Aurelyn Engine mapping
⬡ Governance & Assurance Layer — NIST AI RMF · 21 CFR Part 11 · GMLP · Credibility Assessment
Engine 01

Cohort Intelligence Engine™

↳ Recruitment · Biomarker selection · Stratification

Site & patient feasibility modeling, eligibility-criteria optimization, and biomarker-based enrichment for small, hard-to-recruit early-phase populations.

Engine 02

Adaptive Dose Engine™

↳ Dose escalation · Adaptive design

Model-informed dose finding (Bayesian logistic regression, mTPI, BOIN) and simulation of seamless and adaptive designs, aligned with FDA Project Optimus.

Engine 03

Safety Sentinel Engine™

↳ Safety monitoring

Continuous AE/SAE/SUSAR signal detection, near-real-time pharmacovigilance triage, and automated narrative drafting with human adjudication.

Engine 04

Clinical Evidence Engine™

↳ Phase 1→2 go/no-go · Endpoint validation

Go/no-go decision support, predictive Phase-2 success modeling, and endpoint/biomarker qualification analytics with calibrated uncertainty.

Engine 05

eTMF Intelligence Engine™

↳ Data integrity · Inspection readiness

CDISC TMF Reference Model classification, ALCOA+ completeness scoring, and continuous inspection-readiness across the trial master file.

Layer · Cross-cutting

Governance & Assurance

↳ Trustworthy AI · Validation · Audit

Context-of-use registration, model-risk tiering, drift monitoring, immutable audit trails, model & system cards, and role-based human oversight.
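One common way to make an audit trail tamper-evident is a hash chain, in which each record stores the hash of the record before it. The following is a minimal sketch of that general technique, not a description of how the Governance & Assurance layer is implemented; the `AuditLog` class, its field names, and the sample events are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record (assumption)


def _digest(payload: dict, prev_hash: str) -> str:
    """Hash a record's payload together with the previous record's hash."""
    body = json.dumps(payload, sort_keys=True) + prev_hash
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditLog:
    """Hypothetical append-only log; each entry is chained to the one before."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, payload: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        self.entries.append(
            {"payload": payload, "prev": prev_hash, "hash": _digest(payload, prev_hash)}
        )

    def verify(self) -> bool:
        """Recompute every hash; an edited or reordered entry breaks the chain."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(entry["payload"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append({"actor": "investigator_01", "action": "reviewed_recommendation"})
log.append({"actor": "investigator_01", "action": "accepted_recommendation"})
print(log.verify())
```

Altering any stored payload after the fact causes `verify()` to return `False`. A production system would also need to protect the chain head itself, for example by signing or externally anchoring it, which this sketch omits.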

03 — Itemized Response

Answering the RFI, question by question

Below, every sub-question from RFI Categories A and B is reproduced and answered with the specific Aurelyn capability that addresses it.

A.1

Scope & Focus

a. Which trial types or issues benefit most from AI?

Aurelyn recommends anchoring the pilot in the highest-uncertainty, smallest-N contexts where model-informed methods deliver the greatest marginal value: first-in-human oncology dose escalation and rare-disease trials, with adaptive Phase 1b/2a designs as a secondary focus. These settings have the clearest decision points (dose, expansion, go/no-go) and the strongest existing precedent for quantitative methods.

Aurelyn: Adaptive Dose Engine™ and Clinical Evidence Engine™ target precisely these decision points, where small samples make every datum decision-relevant.

b. Target specific therapeutic areas, or remain broadly applicable?

A tiered approach: anchor in oncology and rare disease for interpretable early signal, but require platform-agnostic architecture so methods and governance generalize. This protects learning velocity without over-fitting the pilot to one indication.

Aurelyn: Trial | OS™ is therapeutic-area-agnostic; engines are configured by context-of-use rather than hard-coded to an indication.

c. Should priority go to specific AI use cases?

Yes — prioritize the three with the clearest measurable endpoints and regulatory touchpoints: (1) dose optimization (aligned with Project Optimus), (2) safety-signal detection, and (3) recruitment & biomarker stratification. These produce the cleanest evidence for the Category B metrics.

Aurelyn: Engines 01–03 map one-to-one to these priorities and emit pre-defined evaluation telemetry for each.
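Engine 02 lists BOIN among its dose-finding methods, and dose optimization is the first priority named above. As a rough illustration of how such a rule reaches a decision, here is a minimal sketch of the standard published BOIN escalation and de-escalation boundaries. It is not the Adaptive Dose Engine's code: the function names are hypothetical, the 0.6 and 1.4 multipliers are the commonly cited defaults, and the dose-elimination safety rule used in practice is omitted.

```python
import math


def boin_boundaries(target: float, low_ratio: float = 0.6, high_ratio: float = 1.4):
    """Return (escalation, de-escalation) boundaries for a target DLT rate.

    low_ratio and high_ratio define the DLT rates regarded as clearly
    under-dosing and over-dosing, expressed as multiples of the target.
    """
    phi, phi1, phi2 = target, low_ratio * target, high_ratio * target
    lam_e = math.log((1 - phi1) / (1 - phi)) / math.log(
        phi * (1 - phi1) / (phi1 * (1 - phi))
    )
    lam_d = math.log((1 - phi) / (1 - phi2)) / math.log(
        phi2 * (1 - phi) / (phi * (1 - phi2))
    )
    return lam_e, lam_d


def boin_decision(n_dlt: int, n_treated: int, target: float) -> str:
    """Compare the observed DLT rate at the current dose with the boundaries."""
    lam_e, lam_d = boin_boundaries(target)
    rate = n_dlt / n_treated
    if rate <= lam_e:
        return "escalate"
    if rate >= lam_d:
        return "de-escalate"
    return "stay"


lam_e, lam_d = boin_boundaries(0.30)
print(f"escalate if rate <= {lam_e:.3f}, de-escalate if rate >= {lam_d:.3f}")
print(boin_decision(n_dlt=1, n_treated=6, target=0.30))
```

For a 30% target DLT rate the boundaries come out at roughly 0.24 and 0.36, so one DLT among six patients (about 17%) leads to escalation. Because the rule reduces to two fixed thresholds, it is straightforward to pre-register and to audit.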
A.2

Participant Selection

a. What criteria should FDA use to select sponsors, trials, or technologies?

Select on: a well-defined context-of-use, an assigned model-risk tier, demonstrated data readiness/ALCOA+ maturity, a pre-specified credibility-assessment plan, and evidence of a managed AI lifecycle (Good Machine Learning Practice). This mirrors the FDA's own draft-guidance framework and keeps selection objective.

Aurelyn: The Governance layer ships a Readiness Rubric that scores candidates on each criterion before enrollment.

b. How can the pilot ensure representation across size, capability, and therapeutic area?

Use stratified selection quotas across sponsor size (small/emerging biotech through large pharma), AI maturity, and therapeutic area. The chief barrier for smaller sponsors is infrastructure — so a low-infrastructure delivery model is essential to genuine representation.

Aurelyn: Delivered as managed SaaS with no/low-code configuration, lowering the entry barrier so emerging sponsors participate on equal footing.
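The five selection criteria listed under A.2(a) lend themselves to a simple, objective scoring scheme. The sketch below shows one way such a rubric could be expressed; it is not the Readiness Rubric itself. The criterion keys paraphrase the answer above, while the 0 to 2 scale, equal weighting, 0.7 threshold, and the rule that no criterion may be entirely absent are illustrative assumptions.

```python
from dataclasses import dataclass

# Criteria paraphrased from the selection answer; scale and threshold are assumptions.
CRITERIA = (
    "context_of_use_defined",
    "model_risk_tier_assigned",
    "data_readiness_alcoa_plus",
    "credibility_plan_prespecified",
    "managed_ai_lifecycle",
)


@dataclass
class Candidate:
    name: str
    scores: dict  # criterion -> 0 (absent), 1 (partial), 2 (demonstrated)


def readiness(candidate: Candidate, threshold: float = 0.7) -> tuple[float, bool]:
    """Return (normalized score, eligible).

    A candidate is eligible only if every criterion is at least partially met
    and the normalized total reaches the threshold.
    """
    missing = [c for c in CRITERIA if c not in candidate.scores]
    if missing:
        raise ValueError(f"unscored criteria: {missing}")
    total = sum(candidate.scores[c] for c in CRITERIA) / (2 * len(CRITERIA))
    no_gaps = all(candidate.scores[c] >= 1 for c in CRITERIA)
    return total, no_gaps and total >= threshold


sponsor_a = Candidate("sponsor_a", dict(zip(CRITERIA, (2, 2, 1, 2, 1))))
sponsor_b = Candidate("sponsor_b", dict(zip(CRITERIA, (2, 2, 2, 2, 0))))
print(readiness(sponsor_a))
print(readiness(sponsor_b))
```

Both hypothetical sponsors reach the same total, but `sponsor_b` is ineligible because one criterion is entirely unmet. Whether a hard floor of that kind is appropriate would be a design decision for the pilot.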
A.3

Collaboration Models

a. Which partnerships are most effective?

A four-party sponsor–technology vendor–academic–FDA consortium, with an independent technology/assurance layer that no single sponsor owns. This separates the party that builds the model from the party that governs and validates it.

Aurelyn: Positioned as the neutral technology & assurance layer, interoperable with sponsor and third-party models alike.

b. How can FDA facilitate pre-competitive collaboration and knowledge sharing?

Stand up a shared validation harness and benchmark datasets in secure enclaves, with federated evaluation so participants contribute to common metrics without exposing proprietary data or models.

Aurelyn: Supports federated, behavioral (input/output) evaluation, so proprietary systems can be benchmarked without source-code disclosure.

c. What role should patient groups and investigators play in AI governance?

Embed both directly in the Govern function of the RMF: patient advisors weigh in on context-of-use, acceptable risk, and fairness; investigators provide the clinical-workflow reality check and serve as the human-in-the-loop for every consequential recommendation.

Aurelyn: Role-based oversight with investigator-in-the-loop checkpoints and a patient-advisory input field on each context-of-use record.
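The federated evaluation proposed under A.3(b) can be pictured as each participant computing aggregate counts locally and sharing only those counts with a coordinator. The sketch below illustrates that pattern for a binary signal-detection task; the names `SiteCounts`, `local_counts`, and `pooled_metrics` are hypothetical, and the data is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    """Aggregate counts a participant shares; no patient-level rows leave the site."""
    true_pos: int
    false_pos: int
    true_neg: int
    false_neg: int


def local_counts(predictions: list[bool], labels: list[bool]) -> SiteCounts:
    """Runs inside each participant's own environment, on its own data."""
    pairs = list(zip(predictions, labels))
    return SiteCounts(
        true_pos=sum(p and y for p, y in pairs),
        false_pos=sum(p and not y for p, y in pairs),
        true_neg=sum(not p and not y for p, y in pairs),
        false_neg=sum(not p and y for p, y in pairs),
    )


def pooled_metrics(sites: list[SiteCounts]) -> dict:
    """Runs at the coordinator: sum the counts, then compute the shared metrics."""
    tp = sum(s.true_pos for s in sites)
    fp = sum(s.false_pos for s in sites)
    tn = sum(s.true_neg for s in sites)
    fn = sum(s.false_neg for s in sites)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
    }


site_a = local_counts([True, True, False, False], [True, False, False, True])
site_b = local_counts([True, False, False], [True, False, False])
print(pooled_metrics([site_a, site_b]))
```

Only the four counts per site cross the boundary, which is what allows a common metric to be computed without exchanging data or model internals. A real deployment would likely need further safeguards, such as minimum cell sizes, since very small counts can themselves be revealing.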
A.4

Operational Structure

a. What support should FDA provide?

Early regulatory engagement (a pre-pilot context-of-use agreement), technical guidance on credibility assessment and model-risk tiering, and a standing review cadence so participants aren't guessing at expectations mid-pilot.

Aurelyn: Generates regulator-facing COU dossiers and credibility-assessment packages aligned to the FDA draft guidance structure.

b. What infrastructure is needed?

Secure, validated data environments (21 CFR Part 11 compliant), shared tooling for validation and monitoring, and immutable audit trails common across participants.

Aurelyn: Ships a Part 11-validated environment with electronic-records/signatures controls and tamper-evident audit logging out of the box.

c. How can the pilot accommodate varying levels of AI maturity?

Adopt a tiered maturity model — from advisory/shadow-mode for low-maturity participants to integrated decision support for the most mature — so every sponsor contributes evidence at a level matched to its readiness.

Aurelyn: Configurable autonomy: shadow → recommend → integrated, set per engine and per context-of-use.
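A tiered autonomy setting keyed by engine and context-of-use could be represented roughly as follows. This is a sketch under stated assumptions rather than the platform's configuration schema: the `Autonomy` enum, the configuration keys, the one-line descriptions of each level, and the choice to default to shadow mode are all illustrative.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    # The level descriptions below are assumptions made for illustration.
    SHADOW = 0      # runs in parallel; output is logged but not shown
    RECOMMEND = 1   # output is shown as advice; a human makes the decision
    INTEGRATED = 2  # output is embedded in the workflow under human oversight


# Hypothetical configuration keyed by (engine, context-of-use).
AUTONOMY_CONFIG = {
    ("adaptive_dose", "fih_oncology_escalation"): Autonomy.SHADOW,
    ("safety_sentinel", "sae_triage"): Autonomy.RECOMMEND,
}


def autonomy_for(engine: str, context_of_use: str) -> Autonomy:
    """Unregistered combinations fall back to the most conservative level."""
    return AUTONOMY_CONFIG.get((engine, context_of_use), Autonomy.SHADOW)


def may_display(engine: str, context_of_use: str) -> bool:
    """Only levels above shadow mode surface output to the study team."""
    return autonomy_for(engine, context_of_use) > Autonomy.SHADOW


print(autonomy_for("safety_sentinel", "sae_triage").name)
print(may_display("adaptive_dose", "fih_oncology_escalation"))
```

Using an ordered enum makes "no more autonomous than X" checks a simple comparison, and defaulting unknown combinations to shadow mode keeps a misconfiguration from silently raising the autonomy level.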
A.5

Timeline & Milestones

a. What is an appropriate duration?

18–24 months: long enough to carry at least one cohort from first-in-human dosing through a Phase 2 initiation decision, yet short enough for the findings to inform a timely decision on expanding the pilot.

Aurelyn: Telemetry is captured continuously, so interim readouts are available well before the full duration elapses.

b. What interim milestones or checkpoints should be included?

Recommended gates: (1) onboarding & context-of-use lock; (2) data-readiness gate; (3) mid-pilot safety & model-performance review; (4) model-drift checkpoint; (5) Phase 1→2 decision capture. Each gate has pre-registered pass/fail criteria.

Aurelyn: Milestone dashboards auto-populate from platform events; nothing is reconstructed after the fact.

c. How should FDA balance rapid insight with rigorous evaluation?

Use a learn-and-confirm staging with pre-registered metrics: continuous telemetry provides rapid operational insight, while confirmatory conclusions are gated on pre-specified, locked endpoints to preserve rigor.

Aurelyn: Pre-registration of metrics and locked analysis plans are native objects, so rapid signals never contaminate confirmatory analysis.
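The milestone gates in A.5(b), each with pre-registered pass/fail criteria, can be modeled as immutable records that are evaluated against observed metrics. The sketch below is one hypothetical shape for that. The two gate names follow the answer above, but the metric names and thresholds are placeholders, not proposed values.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Gate:
    name: str
    metric: str
    passes: Callable[[float], bool]  # criterion fixed before the pilot starts


# Metric names and thresholds are placeholders for illustration only.
GATES = (
    Gate("data_readiness", "alcoa_completeness", lambda v: v >= 0.95),
    Gate("model_drift", "population_stability_index", lambda v: v < 0.20),
)


def evaluate(gates, observed: dict) -> dict:
    """Return pass/fail per gate; a missing metric counts as a failure."""
    results = {}
    for gate in gates:
        value = observed.get(gate.metric)
        results[gate.name] = value is not None and gate.passes(value)
    return results


observed = {"alcoa_completeness": 0.97, "population_stability_index": 0.25}
print(evaluate(GATES, observed))
```

Freezing the gate definitions mirrors the idea of locking criteria up front: the evaluation step can read them but cannot quietly adjust a threshold after results are in.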
A.6

Knowledge Sharing

a. How should lessons learned be captured and disseminated?

Maintain a structured pilot registry with standardized context-of-use and credibility-assessment templates, culminating in a public summary report so the broader ecosystem inherits the learning.

Aurelyn: Standardized, exportable COU and credibility templates make cross-participant synthesis straightforward.

b. What mechanisms promote transparency while protecting proprietary information?

Adopt tiered disclosure: public model cards and system cards describe intended use, performance, and limitations at the context-of-use level; deeper artifacts are shared confidentially with the regulator. This satisfies transparency without exposing trade secrets.

Aurelyn: Auto-generates regulator-facing transparency artifacts (model/system cards) that disclose behavior and limits, not source.
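Tiered disclosure, as described in A.6(b), amounts to tagging each field of a model card with the audience allowed to see it. A minimal sketch of that idea follows; the `ModelCard` fields, the `tier` metadata key, and the sample values are hypothetical and do not reflect an actual template.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ModelCard:
    # Public tier: described at the context-of-use level.
    intended_use: str
    performance_summary: str
    limitations: str
    # Confidential tier: shared with the regulator only (field names are assumptions).
    training_data_description: str = field(default="", metadata={"tier": "regulator"})
    validation_report_ref: str = field(default="", metadata={"tier": "regulator"})

    def view(self, audience: str) -> dict:
        """Return every field for 'regulator'; public-tier fields otherwise."""
        visible = {}
        for f in fields(self):
            tier = f.metadata.get("tier", "public")
            if tier == "public" or audience == "regulator":
                visible[f.name] = getattr(self, f.name)
        return visible


card = ModelCard(
    intended_use="Decision support for dose-escalation review",
    performance_summary="Reported per the pre-registered analysis plan",
    limitations="Not evaluated outside the registered context-of-use",
    training_data_description="(confidential)",
    validation_report_ref="(confidential)",
)
print(sorted(card.view("public")))
print(sorted(card.view("regulator")))
```

Keeping the tier on the field definition, rather than in export code, means a newly added field is public only if someone deliberately leaves it untagged, which makes the disclosure boundary easy to review.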
04 — Evaluation Telemetry

What the pilot would actually measure

Aurelyn proposes these as the headline, pre-registered targets a pilot could test. The figures below are illustrative design targets and hypotheses — the platform's purpose is to measure them rigorously, not to assert them as proven outcomes.

30% · Target reduction in Phase 1→2 transition time · Clinical Evidence Engine™
40% · Target reduction in safety-signal detection latency · Safety Sentinel Engine™
25% · Target reduction in screen-fail rate via enrichment · Cohort Intelligence Engine™
100% · Decisions with logged human-in-the-loop & provenance · Governance & Assurance

Cycle-time impact by decision point

Baseline vs. AI-supported (illustrative target ranges; traditional baseline = 100%)
Patient screening & enrollment · AI-supported target: −25%
Dose-escalation decision · AI-supported target: −35%
Phase 1→2 go/no-go · AI-supported target: −30%
Safety-signal triage · AI-supported target: −40%

Evaluation coverage

Share of RFI Category-B questions with native telemetry: 21 of 22 questions

Native, pre-instrumented telemetry maps to nearly every Category-B evaluation question — minimizing bespoke measurement scaffolding during the pilot.

Target figures represent platform design hypotheses to be tested under the pilot's pre-registered analysis plan; they are not claims of demonstrated clinical results. Actual effects depend on indication, sponsor maturity, and comparator design, and would be evaluated per RFI Category B.
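To make the arithmetic behind a target such as "30% reduction in Phase 1→2 transition time" concrete, the sketch below computes the relative reduction in median cycle time and compares it with a pre-registered target. The cycle times are made-up numbers for illustration, the function name is hypothetical, and the choice of the median as the summary statistic is an assumption; a real analysis plan would specify the estimator and comparator.

```python
from statistics import median


def relative_reduction(baseline: list[float], supported: list[float]) -> float:
    """Fractional reduction in median cycle time versus the baseline arm."""
    base = median(baseline)
    if base <= 0:
        raise ValueError("baseline median must be positive")
    return (base - median(supported)) / base


# Made-up cycle times in days, for illustration only.
baseline_days = [40, 50, 60]
supported_days = [28, 34, 45]

reduction = relative_reduction(baseline_days, supported_days)
target = 0.30  # the illustrative Phase 1→2 target stated above
print(f"{reduction:.0%} reduction; target met: {reduction >= target}")
```

With these invented values the medians are 50 and 34 days, a reduction of 16/50 = 32%, which would meet a 30% target. The same calculation applies to each decision point listed in the cycle-time breakdown.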

05 — Pilot Lifecycle

A governed path from onboarding to evidence

How an Aurelyn-supported participant would move through the pilot, with the milestone gates recommended in answer A.5.

1. COU Lock: Context-of-use registered; model-risk tier assigned
2. Data Readiness: ALCOA+ gate; environment validated to Part 11
3. Shadow Mode: AI runs in parallel; concordance captured, no influence
4. Mid-Pilot Review: Safety, performance & drift checkpoint vs. pre-set criteria
5. Decision Capture: Phase 1→2 go/no-go logged with full provenance
6. Public Report: Model/system cards & lessons disseminated
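The Shadow Mode step captures concordance between AI recommendations and the decisions humans actually made. One conventional way to quantify that is raw agreement alongside Cohen's kappa, which corrects for agreement expected by chance. The sketch below assumes categorical decisions and uses invented data; the function name and the choice of kappa as the statistic are illustrative, not a stated platform metric.

```python
from collections import Counter


def concordance(ai: list[str], human: list[str]) -> tuple[float, float]:
    """Return (raw agreement, Cohen's kappa) for paired categorical decisions."""
    if len(ai) != len(human) or not ai:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ai)
    observed = sum(a == h for a, h in zip(ai, human)) / n
    ai_freq, human_freq = Counter(ai), Counter(human)
    # Chance agreement: sum over categories of the product of marginal rates.
    expected = sum(ai_freq[k] * human_freq[k] for k in ai_freq) / (n * n)
    if expected == 1.0:
        return observed, 1.0  # both sides used one identical category throughout
    return observed, (observed - expected) / (1 - expected)


# Invented shadow-mode log: AI recommendation vs. the committee's decision.
ai_recs = ["escalate", "stay", "stay", "de-escalate", "escalate", "stay"]
human_decisions = ["escalate", "stay", "escalate", "de-escalate", "escalate", "stay"]

agreement, kappa = concordance(ai_recs, human_decisions)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

In this invented log five of six decisions match (agreement of about 0.83), and kappa works out to 17/23, or about 0.74. Reporting both helps distinguish genuine concordance from agreement that arises simply because one decision category dominates.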

06 — Engage

Aurelyn AI Clinical seeks to participate

We welcome the opportunity to contribute to the FDA's pilot as a technology and assurance partner — and to submit this framework to Docket FDA-2026-N-4390. Aurelyn Trial | OS™ is ready to operationalize trustworthy AI in early-phase trials today.