Bioresearch
An autonomous research loop: Claude proposes protein-interaction modifications, Modal dispatches experiments to H200 GPUs, a 5-seed Welch's t-test validates each result, and a keep/revert state machine decides whether to commit — all without human intervention.
Python · Modal · Colab · FastAPI · Gradio · pytest · Claude API · H200 GPUs · Statistical testing
- ~12 experiments / hr, run autonomously
- 5-seed Welch's t-test per result
- 38 tests in the harness
The loop
The system closes the loop between hypothesis and evidence. A proposal becomes a dispatched job, a job becomes a statistically-validated result, and the result feeds a state machine that either keeps the change or reverts it — then the loop runs again.
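As a rough illustration of that cycle, the sketch below wires the four stages together as plain Python callables. It is not the project's code: `LoopState`, `run_loop`, and the `propose` / `run_experiment` / `validate` callables are hypothetical names chosen for this example, and the real system's states and persistence are not described here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Samples = Sequence[float]


@dataclass
class LoopState:
    """Hypothetical loop state: the changes kept so far plus a decision log."""
    accepted: List[str] = field(default_factory=list)
    history: List[Tuple[str, str]] = field(default_factory=list)


def run_loop(
    propose: Callable[[LoopState], str],
    run_experiment: Callable[[str, List[str]], Tuple[Samples, Samples]],
    validate: Callable[[Samples, Samples], str],
    iterations: int,
    state: Optional[LoopState] = None,
) -> LoopState:
    """Run propose -> experiment -> validate -> keep/revert for a fixed number of rounds."""
    if state is None:
        state = LoopState()
    for _ in range(iterations):
        change = propose(state)                                   # e.g. a model proposal
        candidate, baseline = run_experiment(change, state.accepted)  # per-seed metrics
        decision = validate(candidate, baseline)                  # "keep" or "revert"
        if decision == "keep":
            state.accepted.append(change)                         # the change persists
        # A "revert" leaves `accepted` untouched, so the next round starts
        # from the last accepted state.
        state.history.append((change, decision))
    return state
```

Passing the stages in as callables keeps the control flow testable with stubs, independently of any GPU backend or model API.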
Claude ──propose──▶ Modal dispatch ──▶ H200 GPUs (run experiment)
   ▲                                           │
   │                                           ▼
keep / revert ◀── 5-seed Welch's t-test ◀──── results
state machine
Validation
Every proposed change is run across 5 seeds and judged with a Welch's t-test before it's allowed to persist — no single lucky run can move the project. The keep/revert state machine makes the accept decision mechanical and reproducible.
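A minimal sketch of such a decision rule follows, using only the standard library. Several details are assumptions, since the text does not state them: that the candidate's five per-seed metrics are compared against five baseline runs, that higher is better, that the test is one-sided, and that the threshold is `alpha = 0.05`. The function names are illustrative.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test of mean(a) - mean(b).

    Raises ZeroDivisionError if both samples have zero variance.
    """
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def t_sf(t, df, steps=2000):
    """Upper-tail probability P(T > t) for Student's t, via Simpson's rule."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps                      # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3                           # integral of the pdf over [0, |t|]
    tail = 0.5 - area
    return tail if t >= 0 else 1.0 - tail


def decide(candidate, baseline, alpha=0.05):
    """Return ("keep" | "revert", p): keep only if candidate is significantly higher."""
    t, df = welch_t(candidate, baseline)
    p = t_sf(t, df)                         # one-sided p-value (assumption)
    return ("keep" if p < alpha else "revert"), p


if __name__ == "__main__":
    candidate = [0.81, 0.83, 0.82, 0.84, 0.80]   # made-up per-seed metrics
    baseline = [0.70, 0.72, 0.71, 0.69, 0.73]
    print(decide(candidate, baseline)[0])
```

With these made-up numbers the t statistic is about 11 on about 8 degrees of freedom, so the rule returns "keep"; comparing a sample against itself gives t = 0 and returns "revert". Where SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` computes the same test.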
Infrastructure
Modal handles GPU dispatch to H200s; Colab and FastAPI back the compute and service layers; a Gradio dashboard and a full CLI make the autonomous runs observable; 38 tests keep the harness honest. It sustains roughly 12 experiments per hour, unattended.
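The shape of a per-experiment dispatch, one job per seed gathered into a list of metrics, can be sketched locally. This example deliberately swaps Modal for a standard-library thread pool so that it runs anywhere; `run_seed`, `dispatch`, the seed values, and the pseudo-metric are all hypothetical stand-ins, not the project's functions.

```python
import random
from concurrent.futures import ThreadPoolExecutor

SEEDS = (0, 1, 2, 3, 4)  # assumption: the actual seed values are not stated


def run_seed(change: str, seed: int) -> float:
    """Stand-in for one GPU experiment: a deterministic pseudo-metric in [0, 1)."""
    rng = random.Random(f"{change}:{seed}")
    return rng.random()


def dispatch(change: str, seeds=SEEDS) -> list:
    """Fan one proposed change out across all seeds and collect results in seed order."""
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        return list(pool.map(lambda s: run_seed(change, s), seeds))


if __name__ == "__main__":
    print(len(dispatch("example-change")))  # one metric per seed
```

In a remote setup the thread pool would be replaced by the GPU backend's own fan-out, while the contract stays the same: one change in, one metric per seed out, ready for the statistical check.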