Project · 2026

Bioresearch

An autonomous research loop: Claude proposes protein-interaction modifications, Modal dispatches experiments to H200 GPUs, a 5-seed Welch's t-test validates each result, and a keep/revert state machine decides whether to commit — all without human intervention.

Python · Modal · Colab · FastAPI · Gradio · pytest · Claude API · H200 GPUs · Statistical testing

~12 / hr: experiments run autonomously
5-seed: Welch's t-test per result
38: tests in the harness

The loop

The system closes the loop between hypothesis and evidence. A proposal becomes a dispatched job, a job becomes a statistically-validated result, and the result feeds a state machine that either keeps the change or reverts it — then the loop runs again.

   Claude ──propose──▶ Modal dispatch ──▶ H200 GPUs (run experiment)
      ▲                                            │
      │                                            ▼
  keep / revert ◀── 5-seed Welch's t-test ◀──── results
  state machine
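
One iteration of the loop in the diagram can be sketched roughly as below. This is a minimal illustration, not the project's code: `LoopState`, `Decision`, `step`, and the three injected callables (`propose`, `run_seeds`, `is_improvement`) are hypothetical names, and the sketch assumes a kept change's scores become the new baseline.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Sequence, Tuple


class Decision(Enum):
    KEEP = "keep"
    REVERT = "revert"


@dataclass
class LoopState:
    # Per-seed scores of the currently accepted configuration (assumed shape).
    baseline_scores: List[float]
    # Changes that have been kept so far, in order.
    accepted: List[str] = field(default_factory=list)
    # Every proposal together with the decision made about it.
    history: List[Tuple[str, Decision]] = field(default_factory=list)


def step(
    state: LoopState,
    propose: Callable[[Sequence[str]], str],
    run_seeds: Callable[[str], Sequence[float]],
    is_improvement: Callable[[Sequence[float], Sequence[float]], bool],
) -> Decision:
    """Run one propose -> dispatch -> validate -> keep/revert cycle."""
    change = propose(state.accepted)   # stands in for the Claude proposal
    scores = run_seeds(change)         # stands in for the GPU dispatch
    if is_improvement(scores, state.baseline_scores):
        # Keep: the change persists and its scores become the new baseline.
        state.accepted.append(change)
        state.baseline_scores = list(scores)
        decision = Decision.KEEP
    else:
        # Revert: the state is left exactly as it was before the proposal.
        decision = Decision.REVERT
    state.history.append((change, decision))
    return decision
```

Passing the proposer, runner, and validator in as callables keeps the state machine testable with stubs, with no GPU or API access needed.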

Validation

Every proposed change is run across 5 seeds and judged with a Welch's t-test before it's allowed to persist — no single lucky run can move the project. The keep/revert state machine makes the accept decision mechanical and reproducible.
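
The statistical check can be sketched with the standard library alone. This sketch assumes things the page does not state: that higher scores are better, that the candidate's 5 seed scores are compared against 5 baseline scores, that the test is one-sided, and that the threshold is `alpha = 0.05`. The function names are hypothetical, and the p-value comes from numerically integrating the Student's t density rather than from a statistics library.

```python
import math
from statistics import mean, variance


def welch_t(candidate, baseline):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(candidate), len(baseline)
    # Squared standard error contributed by each sample.
    v1, v2 = variance(candidate) / n1, variance(baseline) / n2
    if v1 + v2 == 0:
        raise ValueError("both samples have zero variance; t is undefined")
    t = (mean(candidate) - mean(baseline)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t, via Simpson's rule on the density."""
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    coef /= math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    # Integrate the density from 0 to |t|; steps must be even for Simpson.
    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area


def is_significant_improvement(candidate, baseline, alpha=0.05):
    """True if the candidate beats the baseline at the assumed alpha."""
    t, df = welch_t(candidate, baseline)
    return t_upper_tail(t, df) < alpha
```

With only 5 seeds per side the degrees of freedom are at most 8, so a change needs a clear gap relative to seed-to-seed noise before it passes.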

Infrastructure

Modal dispatches GPU jobs to H200s, Colab and FastAPI back the compute and service layers, and a Gradio dashboard plus a full CLI make the autonomous runs observable. A pytest suite of 38 tests covers the harness. The loop sustains roughly 12 experiments per hour, unattended.