CLI Reference
calibra <command> [options]
calibra validate
Validate a campaign configuration file without running anything.
calibra validate <config>
The config argument is a path to a campaign TOML file.
Validation checks TOML syntax and required fields, task directory structure (that task.md exists, env/ is a directory, verify.sh is executable), uniqueness of matrix dimension labels, validity of constraint references, session option keys and types (rejecting harness-managed keys, unknown keys, and type mismatches), and price coverage if require_price_coverage = true.
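The task directory checks listed above can be approximated before invoking Calibra at all. The following is a minimal sketch, not Calibra's own validator: `check_task_dir` is a hypothetical helper, and "executable" is assumed to mean that at least one execute bit is set on verify.sh.

```python
import stat
from pathlib import Path


def check_task_dir(task_dir: Path) -> list[str]:
    """Return the problems found in one task directory (empty list if none).

    Hypothetical helper: mirrors the documented checks, not Calibra's code.
    """
    problems = []
    if not (task_dir / "task.md").is_file():
        problems.append("task.md is missing")
    if not (task_dir / "env").is_dir():
        problems.append("env/ is not a directory")
    verify = task_dir / "verify.sh"
    if not verify.is_file():
        problems.append("verify.sh is missing")
    else:
        # Assumption: any execute bit (user, group, or other) counts.
        exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
        if not verify.stat().st_mode & exec_bits:
            problems.append("verify.sh is not executable")
    return problems
```

Running this over each task directory before `calibra validate` gives a quick local signal, but the validate command remains the authority on what counts as a valid campaign.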
uv run calibra validate experiments/model-shootout.toml
Output on success:
Config valid. 10 variants x 5 tasks x 5 repeats = 250 trials.
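The trial count in that line is the product of the three factors (10 x 5 x 5 = 250). If a script needs the number, for instance to estimate cost before a run, the line can be parsed. This sketch assumes the success message has exactly the shape shown above, which may differ between versions; `parse_trial_plan` is a hypothetical helper.

```python
import re

# Assumption: the success line matches the example in this reference exactly.
SUMMARY_RE = re.compile(
    r"Config valid\. (\d+) variants x (\d+) tasks x (\d+) repeats = (\d+) trials\."
)


def parse_trial_plan(line: str) -> dict[str, int]:
    """Extract the counts from the validate success line and check the product."""
    match = SUMMARY_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized validate output: {line!r}")
    variants, tasks, repeats, trials = map(int, match.groups())
    if variants * tasks * repeats != trials:
        raise ValueError("trial count does not equal variants x tasks x repeats")
    return {
        "variants": variants,
        "tasks": tasks,
        "repeats": repeats,
        "trials": trials,
    }
```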
calibra run
Execute a campaign.
calibra run <config> [--workers N] [--dry-run] [--filter EXPR] [--resume] [--output DIR] [--keep-workdirs] [-v]
The config argument is a path to a campaign TOML file.
| Option | Default | Description |
|---|---|---|
| --workers N | 1 | Number of parallel worker threads |
| --dry-run | off | Print trial plan without executing |
| --filter EXPR | none | Filter variants (e.g., "model=sonnet,skills=full") |
| --resume | off | Skip trials with existing valid results |
| --output DIR | results | Output directory for trial reports |
| --keep-workdirs | off | Preserve temporary workspace directories |
| -v, --verbose | off | Show detailed trial output (counters, timing, event timeline) |
# Basic run
uv run calibra run experiments/config.toml
# Parallel with filtering
uv run calibra run experiments/config.toml --workers 4 --filter "model=sonnet"
# Resume an interrupted run
uv run calibra run experiments/config.toml --resume --workers 4
# Dry run to preview
uv run calibra run experiments/config.toml --dry-run
# Debug a failing trial
uv run calibra run experiments/config.toml --keep-workdirs --filter "model=haiku"
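To reason about which variants a --filter expression would select, it can help to model the expression locally. The sketch below assumes that EXPR is a comma-separated list of key=value pairs, as in the examples above, and that all pairs must match; the grammar Calibra actually accepts is not specified here, and `parse_filter` and `matches` are hypothetical names.

```python
def parse_filter(expr: str) -> dict[str, str]:
    """Split 'model=sonnet,skills=full' into {'model': 'sonnet', 'skills': 'full'}."""
    pairs = {}
    for part in expr.split(","):
        key, sep, value = part.partition("=")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"expected key=value, got {part!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def matches(variant: dict[str, str], wanted: dict[str, str]) -> bool:
    """Assumption: a variant is selected only if every pair matches (logical AND)."""
    return all(variant.get(key) == value for key, value in wanted.items())
```

Combining a filter with --dry-run remains the reliable way to see what Calibra itself would select.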
calibra analyze
Aggregate trial results into statistical summaries.
calibra analyze <results_dir> [--output DIR]
The results_dir argument is a path to a campaign's results directory.
| Option | Default | Description |
|---|---|---|
| --output DIR | same as results_dir | Where to write summary files |
Produces three files: summary.json (full machine-readable aggregate data), summary.md (human-readable Markdown report), and summary.csv (spreadsheet format).
uv run calibra analyze results/model-shootout
uv run calibra analyze results/model-shootout --output reports/
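In an automated pipeline it can be useful to confirm that all three summary files exist before publishing or archiving them. This is a small sketch that relies only on the file names listed above; `missing_summaries` is a hypothetical helper, and the contents of the files are not inspected.

```python
from pathlib import Path

# File names taken from the description of `calibra analyze` above.
EXPECTED_SUMMARIES = ("summary.json", "summary.md", "summary.csv")


def missing_summaries(output_dir: Path) -> list[str]:
    """Return the expected summary files that are absent from output_dir."""
    return [
        name for name in EXPECTED_SUMMARIES if not (output_dir / name).is_file()
    ]
```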
calibra show
Pretty-print a single trial report.
calibra show <report.json>
The argument is a path to a trial JSON file. Output includes the task name, variant label, outcome, verification status, wall time, turns, LLM calls, total tool calls, tool failures, LLM time, tool time, compactions, and a per-tool usage breakdown.
uv run calibra show results/model-shootout/hello-world/sonnet_default_none_none_base_0.json
calibra compare
Compare two campaign result directories.
calibra compare <dir_a> <dir_b> [--output DIR]
| Option | Default | Description |
|---|---|---|
| --output DIR | parent of dir_a | Where to write comparison output |
Finds variants common to both campaigns and computes the pass rate delta (B minus A), Cliff's delta effect size and magnitude, and a token usage comparison.
uv run calibra compare results/run-v1 results/run-v2
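For readers unfamiliar with the statistics, the two headline numbers can be illustrated in a few lines. This is a sketch of the textbook definitions, not Calibra's implementation: the sign convention (positive means B tends to be larger, matching "B minus A") and the magnitude thresholds (0.147, 0.33, 0.474, a commonly cited convention) are assumptions, and the function names are hypothetical.

```python
def pass_rate_delta(passes_a: list[bool], passes_b: list[bool]) -> float:
    """Pass rate of B minus pass rate of A."""
    rate_a = sum(passes_a) / len(passes_a)
    rate_b = sum(passes_b) / len(passes_b)
    return rate_b - rate_a


def cliffs_delta(a: list[float], b: list[float]) -> float:
    """(# pairs with b > a  minus  # pairs with b < a) / (len(a) * len(b)).

    Assumption: oriented so that a positive value means B tends to be larger.
    """
    greater = sum(1 for x in a for y in b if y > x)
    less = sum(1 for x in a for y in b if y < x)
    return (greater - less) / (len(a) * len(b))


def magnitude(delta: float) -> str:
    """Label |delta| using commonly cited thresholds (assumed, not Calibra's)."""
    size = abs(delta)
    if size < 0.147:
        return "negligible"
    if size < 0.33:
        return "small"
    if size < 0.474:
        return "medium"
    return "large"
```

Cliff's delta ranges from -1 to 1 and depends only on the ordering of values, which makes it suitable for skewed metrics such as token counts or wall time.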
calibra web serve
Launch the interactive web dashboard.
calibra web serve <results_dir> [--port N] [--host ADDR] [--open]
The results_dir argument is the directory containing campaign result folders.
| Option | Default | Description |
|---|---|---|
| --port N | 8118 | Port to bind |
| --host ADDR | 127.0.0.1 | Host address to bind |
| --open | off | Open browser automatically |
uv run calibra web serve results/ --open
uv run calibra web serve results/ --host 0.0.0.0 --port 9000
calibra web build
Export a static HTML dashboard.
calibra web build <results_dir> [--output DIR]
| Option | Default | Description |
|---|---|---|
| --output DIR | <results_dir>/web | Output directory for static HTML |
uv run calibra web build results/ --output docs.md/dashboard/
Exit codes
Calibra exits with 0 on success and 1 on error, whether that's a configuration problem (invalid TOML, missing files, bad constraints) or a runtime failure (all trials failed, budget exceeded).
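Because every failure maps to exit code 1, a wrapper script can only distinguish success from failure, not the kind of failure. A minimal sketch of gating a campaign run on validation follows; `run_step` and `validate_then_run` are hypothetical helpers, and the commands are the ones shown earlier in this reference.

```python
import subprocess
import sys


def run_step(cmd: list[str]) -> bool:
    """Run a command and report whether it exited with code 0."""
    return subprocess.run(cmd).returncode == 0


def validate_then_run(config: str) -> None:
    """Validate first; start the campaign only if validation succeeded."""
    if not run_step(["uv", "run", "calibra", "validate", config]):
        sys.exit("validation failed; not starting the campaign")
    if not run_step(["uv", "run", "calibra", "run", config]):
        # Exit code 1 covers both configuration and runtime failures,
        # so the logs are needed to tell them apart.
        sys.exit("campaign reported an error")


# Example call (not executed here):
# validate_then_run("experiments/config.toml")
```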
Environment variables
Calibra inherits environment variables from the calling shell for provider authentication. The specific variables depend on which providers you use in your matrix; for example, Anthropic models require ANTHROPIC_API_KEY.
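A missing key typically surfaces only once trials start failing, so a pre-flight check can save a wasted run. This sketch assumes you maintain your own list of required variable names for the providers in your matrix; `missing_env` is a hypothetical helper and Calibra is not known to expose such a list.

```python
import os
from collections.abc import Mapping


def missing_env(
    required: list[str], env: Mapping[str, str] = os.environ
) -> list[str]:
    """Return the required variables that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


# Example: check the Anthropic key mentioned above before launching a run.
# if missing_env(["ANTHROPIC_API_KEY"]):
#     raise SystemExit("set ANTHROPIC_API_KEY before running the campaign")
```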