Changelog

All notable changes to AgentLens will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[0.1.1] - 2026-03-18

Fixed

  • Replay filesystem reset for chained sessions — when replaying session N > 1, the filesystem was incorrectly reset to baseline (pre-experiment state) instead of the end state of session N-1. This caused the agent to see stale files (e.g. an empty MEMORY.md instead of one populated by prior sessions). The fix falls back to session_{N-1} when no file-write tags exist within the current session before the replay turn.
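The fallback described in this fix can be sketched roughly as follows. This is an illustrative sketch only, not AgentLens's actual code: the function name `pick_reset_ref`, its parameters, and the tag strings are hypothetical, while `session_{N-1}` and the baseline state come from the entry above.

```python
from typing import Sequence


def pick_reset_ref(session_index: int, write_tags_before_turn: Sequence[str]) -> str:
    """Choose which snapshot the filesystem is reset to before replaying a turn.

    session_index: 1-based index N of the session being replayed.
    write_tags_before_turn: hypothetical names of file-write tags recorded in
        session N before the replay turn, oldest first.
    """
    if write_tags_before_turn:
        # A write already happened in this session: restore the latest one.
        return write_tags_before_turn[-1]
    if session_index > 1:
        # No writes yet in session N: start from the end state of session N-1
        # rather than the pre-experiment baseline.
        return f"session_{session_index - 1}"
    # The first session has no predecessor, so baseline is the correct state.
    return "baseline"
```

Under these assumptions, replaying an early turn of session 3 with no prior writes would resolve to `session_2`, so files such as MEMORY.md keep the content written by earlier sessions.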

Changed

  • Removed experiment configs from version control (already in .gitignore, now untracked)

[0.1.0] - 2026-03-17

Initial release.

Added

  • Experiment runner — YAML-based config for multi-session Claude Code experiments via the Claude Agent SDK
  • ATIF trajectory capture — every agent step, tool call, observation, and thinking block captured in ATIF v1.6 format
  • Shadow git change tracking — invisible bare git repo tracks all file changes with per-step write attribution and unified diffs
  • Session modes — isolated (fresh conversation, files persist), chained (conversation resumes), forked (independent branches from a base session)
  • Flexible forking — fork_from on individual sessions to fork from any prior session, not just session 1
  • Session replicates — count: N runs the same session N times as independent replicates with _rNN directory suffixes
  • Subagent capture — separate ATIF trajectories for each subagent invocation, linked to parent via SubagentTrajectoryRef
  • API request capture — local reverse proxy captures raw request/response bodies, system prompts, tool definitions, token usage, and compaction events
  • Turn-level resampling — replay a specific API request N times to study response variance (stateless, no tool execution)
  • Intervention testing — edit captured API requests (assistant text, tool results, system prompt) and resample with modified inputs; available from both CLI (harness resample-edit) and web UI
  • Session-level resampling — re-run a forked session N times with full tool execution (harness resample-session)
  • Turn-level replay — branch execution from any API turn with exact-match context, filesystem reset via git worktrees, and full tool execution; replicates run in parallel (harness replay)
  • Transcript capture — Claude Code transcript JSONL copied into session output for replay support
  • UUID map — per-turn correlation across transcript, ATIF trajectory, and raw API dumps using tool_call_id as join key
  • Web UI — SvelteKit interface for browsing runs, viewing trajectories, memory diffs, API captures, resamples, edit & resample, and file changelogs
  • CLI — harness run, list, inspect, resample, resample-edit, resample-session, replay
  • Provider support — OpenRouter (default), Anthropic, AWS Bedrock, GCP Vertex AI
  • Memory file — auto-seeded file in working directory for cross-session note persistence
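To illustrate the UUID map entry above, here is a minimal sketch of correlating records from several capture sources using tool_call_id as the join key. The function name `join_by_tool_call_id`, the source names, and every record field other than `tool_call_id` are assumptions made for illustration; the real map format may differ.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping

Record = Mapping[str, Any]


def join_by_tool_call_id(**sources: Iterable[Record]) -> Dict[str, Dict[str, Record]]:
    """Group records from each named source under their shared tool_call_id."""
    joined: Dict[str, Dict[str, Record]] = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            call_id = record.get("tool_call_id")
            if call_id is None:
                # Records without a tool call cannot be correlated this way.
                continue
            joined[call_id][source_name] = record
    return dict(joined)


# Hypothetical records; the field names besides tool_call_id are invented.
uuid_map = join_by_tool_call_id(
    transcript=[{"tool_call_id": "t1", "uuid": "a"}],
    atif=[{"tool_call_id": "t1", "step_id": 3}, {"tool_call_id": "t2", "step_id": 4}],
    api=[{"tool_call_id": "t1", "request_index": 0}],
)
```

In this sketch a tool call seen by only one source still gets an entry, so gaps in coverage stay visible rather than being dropped silently.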