Configuration

Experiments are defined as YAML config files. The harness validates each config with Pydantic, so errors are caught before any session runs.

Full example

model: "claude-sonnet-4-20250514"
provider: anthropic
hypothesis: "The agent preserves hedging across sessions"
work_dir: "./repos/my_project"
session_mode: chained
tags: ["experiment-1"]

system_prompt: |
  You are exploring a Python codebase. Use MEMORY.md to keep notes.

allowed_tools:
  - Read
  - Grep
  - Glob
  - Bash
  - Write
  - Edit

max_turns: 30
permission_mode: bypassPermissions
max_budget_usd: 1.00

memory_file: "MEMORY.md"
memory_seed: "# Project Notes\n"

capture_api_requests: true

sessions:
  - session_index: 1
    prompt: "Explore the project structure. Take notes in MEMORY.md."
  - session_index: 2
    prompt: "Read the main module in detail. Update your notes."
  - session_index: 3
    prompt: "Summarize what you know about this project."
    max_turns: 10

Config reference

Top-level fields

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| `model` | yes | (none) | Claude model identifier (e.g. `claude-sonnet-4-20250514`) |
| `provider` | no | `anthropic` | API provider: `anthropic`, `openrouter`, `bedrock`, or `vertex` |
| `base_url` | no | (none) | Custom API base URL (overrides the provider default) |
| `hypothesis` | no | (none) | What this experiment tests. Shown in the web UI. |
| `work_dir` | yes | (none) | Working directory the agent operates in (any directory) |
| `repo_name` | no | (none) | Human-readable name for the working directory |
| `sessions` | yes | (none) | List of session configs |
| `session_mode` | no | `isolated` | `isolated`, `chained`, or `forked` |
| `system_prompt` | no | (none) | System prompt for all sessions |
| `allowed_tools` | no | `Read`, `Grep`, `Glob`, `Bash`, `Write`, `Edit` | Tools the agent can use |
| `max_turns` | no | `50` | Max agent turns per session |
| `permission_mode` | no | `bypassPermissions` | `acceptEdits` or `bypassPermissions` |
| `memory_file` | no | `MEMORY.md` | File to auto-seed in the working directory |
| `memory_seed` | no | `# Notes\n` | Initial content for the memory file |
| `max_budget_usd` | no | (none) | Per-session spend cap in USD |
| `agents` | no | `[]` | Subagent definitions (see Subagents) |
| `capture_subagent_trajectories` | no | `true` | Save separate ATIF trajectories per subagent |
| `capture_api_requests` | no | `true` | Capture raw API requests (enables resampling) |
| `run_name` | no | auto | Custom name for the run directory |
| `tags` | no | `[]` | Metadata tags |
| `revert_work_dir` | no | `false` | Reset the working directory to its pre-run state after the run completes |
| `load_project_settings` | no | `false` | Load the repo's `CLAUDE.md` and `.claude/settings.json` |
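
To make the Required and Default columns concrete, here is a minimal plain-Python sketch of how the documented defaults could be applied to a parsed config. It is not the harness's Pydantic model: `REQUIRED`, `DEFAULTS`, and `apply_defaults` are hypothetical names, and only fields with a documented default value are included.

```python
import copy

# Hypothetical names; the values mirror the defaults documented in the table.
REQUIRED = ("model", "work_dir", "sessions")

DEFAULTS = {
    "provider": "anthropic",
    "session_mode": "isolated",
    "allowed_tools": ["Read", "Grep", "Glob", "Bash", "Write", "Edit"],
    "max_turns": 50,
    "permission_mode": "bypassPermissions",
    "memory_file": "MEMORY.md",
    "memory_seed": "# Notes\n",
    "agents": [],
    "capture_subagent_trajectories": True,
    "capture_api_requests": True,
    "tags": [],
    "revert_work_dir": False,
    "load_project_settings": False,
}


def apply_defaults(raw: dict) -> dict:
    """Return a copy of `raw` with the documented defaults filled in."""
    missing = [name for name in REQUIRED if name not in raw]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    # Deep-copy the defaults so list values are never shared between configs.
    merged = copy.deepcopy(DEFAULTS)
    merged.update(raw)
    return merged
```

Calling `apply_defaults` with only `model`, `work_dir`, and `sessions` yields a config whose `max_turns` is 50 and whose `session_mode` is `isolated`; omitting a required field raises `ValueError` before anything runs.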

Session fields

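| Field | Required | Default | Description |
| --- | --- | --- | --- |
| `session_index` | yes | (none) | Sequential index starting at 1 |
| `prompt` | yes | (none) | The user prompt for this session |
| `system_prompt` | no | (none) | Per-session override of the top-level `system_prompt` |
| `max_turns` | no | (none) | Per-session override of the top-level `max_turns` |
| `fork_from` | no | (none) | Session index to fork from (must be lower than this session's index) |
| `count` | no | `1` | Run N independent replicates of this session |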
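
The override and `count` fields can be illustrated with a short plain-Python sketch. This is not the harness's implementation: `resolve_sessions` and the `replicate` key are hypothetical, and the sketch assumes that an unset per-session field inherits the top-level value and that `count: N` expands into N independent runs of the same resolved session.

```python
def resolve_sessions(config: dict) -> list[dict]:
    """Expand sessions into concrete runs with overrides applied (hypothetical helper)."""
    runs = []
    for session in config["sessions"]:
        resolved = {
            "session_index": session["session_index"],
            "prompt": session["prompt"],
            # Assumption: a missing per-session value inherits the top-level one.
            "system_prompt": session.get("system_prompt", config.get("system_prompt")),
            "max_turns": session.get("max_turns", config.get("max_turns", 50)),
            "fork_from": session.get("fork_from"),
        }
        # Assumption: `count` yields N replicates; `replicate` is an illustrative label.
        for replicate in range(1, session.get("count", 1) + 1):
            runs.append({**resolved, "replicate": replicate})
    return runs


config = {
    "max_turns": 30,
    "system_prompt": "You are exploring a Python codebase.",
    "sessions": [
        {"session_index": 1, "prompt": "Explore the project structure."},
        {"session_index": 2, "prompt": "Summarize what you know.", "max_turns": 10, "count": 2},
    ],
}
runs = resolve_sessions(config)
print([(r["session_index"], r["replicate"], r["max_turns"]) for r in runs])
# prints [(1, 1, 30), (2, 1, 10), (2, 2, 10)]
```

Session 1 inherits the top-level `max_turns` of 30, while session 2 overrides it with 10 and is expanded into two replicates.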

Providers

| Provider | Config value | Env var | Notes |
| --- | --- | --- | --- |
| Anthropic | `anthropic` (default) | `ANTHROPIC_API_KEY` | Direct Anthropic API. Falls back to the Claude Code subscription if no key is set. |
| OpenRouter | `openrouter` | `OPENROUTER_API_KEY` | Routes through OpenRouter |
| AWS Bedrock | `bedrock` | AWS credentials | Sets `CLAUDE_CODE_USE_BEDROCK=1` |
| GCP Vertex AI | `vertex` | GCP credentials | Sets `CLAUDE_CODE_USE_VERTEX=1` |
| Claude Code subscription | `anthropic` | (none needed) | If no `ANTHROPIC_API_KEY` is set, the SDK uses your Claude Code subscription credentials from `~/.claude/credentials.json`. Usage is covered by your subscription (Pro/Max) and is subject to rate limits rather than per-token billing. |
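
The provider table can be read as a small lookup. The plain-Python sketch below reports which environment flags a provider sets and where its credentials come from, using only the variables listed in the table. `PROVIDERS` and `describe_provider` are hypothetical names, and the harness may wire this up differently.

```python
import os

# Mirrors the provider table; the dictionary layout is an illustrative assumption.
PROVIDERS = {
    "anthropic": {"key_var": "ANTHROPIC_API_KEY", "flags": {}},
    "openrouter": {"key_var": "OPENROUTER_API_KEY", "flags": {}},
    "bedrock": {"key_var": None, "flags": {"CLAUDE_CODE_USE_BEDROCK": "1"}},
    "vertex": {"key_var": None, "flags": {"CLAUDE_CODE_USE_VERTEX": "1"}},
}


def describe_provider(provider: str = "anthropic", env=None) -> dict:
    """Summarize which flags are set and where credentials come from."""
    env = os.environ if env is None else env
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    spec = PROVIDERS[provider]
    if spec["key_var"] is None:
        # AWS or GCP credentials are resolved outside this sketch.
        auth = "cloud credentials"
    elif env.get(spec["key_var"]):
        auth = spec["key_var"]
    elif provider == "anthropic":
        # Documented fallback when no ANTHROPIC_API_KEY is set.
        auth = "Claude Code subscription"
    else:
        auth = "missing"
    return {"flags": dict(spec["flags"]), "auth": auth}
```

For example, `describe_provider("anthropic", env={})` reports the Claude Code subscription as the credential source, while `describe_provider("bedrock", env={})` reports the `CLAUDE_CODE_USE_BEDROCK` flag.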

Cost reporting

Cost figures shown in `run_meta.json`, `harness inspect`, and the web UI come from the Claude Agent SDK's `total_cost_usd` field, which is calculated using Anthropic's list pricing regardless of which provider you use. This means:

OpenRouter: the reported cost reflects Anthropic list prices, not your actual OpenRouter bill (which may differ).
Bedrock / Vertex: the reported cost may not match AWS or GCP billing.
Claude Code subscription: a cost is reported, but you are not billed per token.

Treat cost figures as rough estimates, not authoritative billing data.

Automatic behaviors

The memory file is auto-seeded. If the file named by `memory_file` does not already exist, the harness creates it with the `memory_seed` content.
The working directory path is injected into the system prompt, so the agent knows where to read and write.
The agent's cwd is set to the working directory.
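
The seeding behavior can be sketched in a few lines of plain Python. `seed_memory_file` is a hypothetical helper, not the harness's own function; it assumes the memory file lives directly inside the working directory and that an existing file is never overwritten.

```python
import tempfile
from pathlib import Path


def seed_memory_file(work_dir, memory_file="MEMORY.md", memory_seed="# Notes\n") -> bool:
    """Create the memory file with seed content unless it already exists.

    Returns True if the file was created, False if it was left untouched.
    """
    path = Path(work_dir) / memory_file
    if path.exists():
        return False
    path.write_text(memory_seed, encoding="utf-8")
    return True


# Demonstrate against a throwaway directory so no real project is touched.
with tempfile.TemporaryDirectory() as work_dir:
    created = seed_memory_file(work_dir, memory_seed="# Project Notes\n")
    created_again = seed_memory_file(work_dir, memory_seed="overwritten?")
    content = (Path(work_dir) / "MEMORY.md").read_text(encoding="utf-8")
```

The second call is a no-op, so notes left by an earlier session or by hand survive a rerun.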

Validation rules

Session indices must be unique and contiguous, starting at 1.
`fork_from` must reference a session with a lower index.
`count` must be >= 1.
`session_index` must be >= 1.
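
The four rules above can be expressed as a single check. The following is a plain-Python sketch rather than the harness's Pydantic validators; `validate_sessions` is a hypothetical name, and the sketch assumes sessions may appear in any order in the file as long as their indices form 1..N.

```python
def validate_sessions(sessions: list[dict]) -> list[str]:
    """Return a list of rule violations; an empty list means the sessions pass."""
    errors = []
    indices = [s["session_index"] for s in sessions]
    for s in sessions:
        idx = s["session_index"]
        if idx < 1:
            errors.append(f"session_index must be >= 1 (got {idx})")
        if s.get("count", 1) < 1:
            errors.append(f"session {idx}: count must be >= 1")
        fork_from = s.get("fork_from")
        # fork_from must point at an existing session with a lower index.
        if fork_from is not None and (fork_from >= idx or fork_from not in indices):
            errors.append(f"session {idx}: fork_from must reference a lower session index")
    # Unique and contiguous from 1 means the sorted indices are exactly 1..N.
    if sorted(indices) != list(range(1, len(indices) + 1)):
        errors.append("session indices must be unique and contiguous starting at 1")
    return errors


ok = [
    {"session_index": 1, "prompt": "Explore."},
    {"session_index": 2, "prompt": "Summarize.", "fork_from": 1, "count": 3},
]
bad = [
    {"session_index": 1, "prompt": "Explore.", "fork_from": 1},
    {"session_index": 3, "prompt": "Summarize.", "count": 0},
]
print(validate_sessions(ok))        # prints []
print(len(validate_sessions(bad)))  # prints 3
```

The `bad` list trips three rules at once: a self-referencing `fork_from`, a `count` of 0, and a gap in the indices (1, 3).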