Quick Start
1. Create a config
Create a file called my_experiment.yaml:
model: "claude-sonnet-4-20250514"
provider: anthropic
work_dir: "./repos/my_project"
session_mode: isolated
system_prompt: |
  You are exploring a Python codebase. Use MEMORY.md to keep notes.
sessions:
  - session_index: 1
    prompt: "Explore the project structure. Take notes in MEMORY.md."
  - session_index: 2
    prompt: "Read the main module in detail. Update your notes."
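When an experiment has many sessions, it can be easier to generate the config than to write it by hand. The sketch below renders the same fields shown above as YAML text using only the standard library. The helper name `render_config` is hypothetical and not part of the harness, and the sketch assumes that session indices simply count up from 1.

```python
import json


def render_config(model, work_dir, system_prompt, prompts,
                  provider="anthropic", session_mode="isolated"):
    """Render a config as YAML text, one session per prompt (hypothetical helper)."""
    lines = [
        # json.dumps yields a double-quoted string, which is also a valid
        # YAML double-quoted scalar.
        f"model: {json.dumps(model)}",
        f"provider: {provider}",
        f"work_dir: {json.dumps(work_dir)}",
        f"session_mode: {session_mode}",
        "system_prompt: |",
    ]
    # Block-scalar content must be indented under the `|` indicator.
    lines += [f"  {line}" for line in system_prompt.splitlines()]
    lines.append("sessions:")
    # Assumption: session_index values count up from 1, as in the example.
    for index, prompt in enumerate(prompts, start=1):
        lines.append(f"  - session_index: {index}")
        lines.append(f"    prompt: {json.dumps(prompt)}")
    return "\n".join(lines) + "\n"


print(render_config(
    model="claude-sonnet-4-20250514",
    work_dir="./repos/my_project",
    system_prompt="You are exploring a Python codebase. Use MEMORY.md to keep notes.",
    prompts=[
        "Explore the project structure. Take notes in MEMORY.md.",
        "Read the main module in detail. Update your notes.",
    ],
))
```

Writing the returned text to my_experiment.yaml should give a file equivalent to the hand-written one above.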
2. Run it
harness run my_experiment.yaml
The harness will:
- Initialize a shadow git repo to track all file changes
- Seed MEMORY.md in the working directory
- Run each session sequentially
- Save ATIF trajectories, diffs, and metadata to runs/<run-name>/
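Because each run is saved under runs/<run-name>/, scripts often need to locate the most recent run. The sketch below assumes that run names start with a sortable timestamp, as in the example output further down this page (`2026-03-15T10-30-00_...`). The helper `latest_run` is hypothetical and not part of the harness.

```python
from pathlib import Path
from typing import Optional


def latest_run(runs_dir: Path) -> Optional[Path]:
    """Return the run directory whose name sorts last, or None if there is none."""
    if not runs_dir.is_dir():
        return None
    # Assumption: names begin with a timestamp, so lexical order is
    # also chronological order.
    run_dirs = sorted(
        (p for p in runs_dir.iterdir() if p.is_dir()),
        key=lambda p: p.name,
    )
    return run_dirs[-1] if run_dirs else None


print(latest_run(Path("runs")))
```

If the naming scheme differs in your version, sorting by modification time (`p.stat().st_mtime`) may be a safer key.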
3. Inspect results
# CLI summary
harness inspect runs/<run-name>
# Or browse in the web UI
cd ui && npm run dev
# Open http://localhost:5173
4. Resample for variance
# Resample a specific API turn
harness resample runs/<run-name> --session 1 --request 5 --count 10
# Or re-run a full session
harness resample-session runs/<run-name> --session 2 --count 5
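To re-run several sessions with the same count, the commands can be built in a loop. The sketch below only assembles argument lists for `harness resample-session` with the `--session` and `--count` flags shown above; it does not invoke the harness. The helper name `resample_session_commands` is hypothetical.

```python
import shlex


def resample_session_commands(run_dir, sessions, count):
    """Build one `harness resample-session` argv per session index."""
    return [
        ["harness", "resample-session", run_dir,
         "--session", str(session), "--count", str(count)]
        for session in sessions
    ]


# Print the commands in shell-quoted form for review before running them.
for argv in resample_session_commands("runs/<run-name>", [1, 2], count=5):
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once the run directory placeholder is replaced with a real path.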
Example output
$ harness run tests/smoke.yaml
[session 1] starting (mode=isolated)...
[session 1] done -- 15 steps, 5 tool calls, $0.0596
Run complete: runs/2026-03-15T10-30-00_claude-sonnet-4-20250514
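The per-session summary lines can be scraped when comparing runs. The sketch below parses lines shaped like the `done` line in the example output. It assumes that this exact wording is stable, which the harness may not guarantee, and the function name `parse_summaries` is hypothetical. The saved metadata in the run directory is likely a more robust source for these numbers.

```python
import re
from decimal import Decimal

# Matches lines such as:
#   [session 1] done -- 15 steps, 5 tool calls, $0.0596
# Singular forms ("1 step", "1 tool call") are accepted as a precaution.
SUMMARY = re.compile(
    r"\[session (?P<session>\d+)\] done -- "
    r"(?P<steps>\d+) steps?, "
    r"(?P<tool_calls>\d+) tool calls?, "
    r"\$(?P<cost>\d+(?:\.\d+)?)"
)


def parse_summaries(output):
    """Extract per-session step counts, tool calls, and cost from CLI output."""
    return [
        {
            "session": int(m["session"]),
            "steps": int(m["steps"]),
            "tool_calls": int(m["tool_calls"]),
            # Decimal avoids float rounding when costs are summed.
            "cost_usd": Decimal(m["cost"]),
        }
        for m in SUMMARY.finditer(output)
    ]


sample = (
    "[session 1] starting (mode=isolated)...\n"
    "[session 1] done -- 15 steps, 5 tool calls, $0.0596\n"
)
for row in parse_summaries(sample):
    print(row)
```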
Next steps
- Session Modes — isolated, chained, and forked sessions
- Resampling & Replay — study variance from API-level resampling to full turn-level replay
- Output Structure — where trajectories, diffs, transcripts, and metadata are stored
- Configuration — full config reference with all fields
- Subagents — delegate work to specialized subagents