Usage
Subcommands
Recommended lifecycle flow
Start with the lifecycle entrypoints, not the low-level stage commands:
The eval subcommands are the most common supporting steps that verify asks
you to fill in when a draft still needs evidence.
If you want to drive the advanced draft-package loop manually, the stage-level
sequence is still:
selftune status, and per-skill report all read the artifacts
from this flow to show what is still missing before you trust a live deploy,
and when the skill has already moved into watch mode.
generate
Generate eval sets from real usage or synthetically:

| Flag | Description |
|---|---|
| --skill NAME | Skill to generate evals for |
| --list-skills | List all skills with available data |
| --stats | Show eval generation statistics |
| --max N | Maximum entries per side to generate |
| --seed N | Random seed for reproducibility |
| --output PATH | Output file path |
| --no-negatives | Omit negative eval entries |
| --no-taxonomy | Skip invocation_type classification |
| --skill-log PATH | Override the skill usage log source |
| --agent NAME | Runtime agent for synthetic or blended eval generation (claude, codex, opencode, pi) |
| --query-log PATH | Override the query log source |
| --telemetry-log PATH | Override the telemetry log source |
| --synthetic | Generate from SKILL.md instead of real data |
| --auto-synthetic | Fall back to SKILL.md cold-start generation when trusted triggers do not exist |
| --blend | Merge log-based evals with synthetic gap-fillers |
| --skill-path PATH | Path to SKILL.md (required with --synthetic) |
| --model MODEL | Override the synthetic-generation model |
| --help | Show command help |
selftune eval generate --help now prints the exact generate-subcommand
surface, including cold-start and blended eval flags.
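As a rough illustration of how the log-based flags combine, the sketch below wraps a reproducible generate run in a small shell function. The function name `generate_from_logs`, the skill name `my-skill`, the values `50` and `42`, and the output path (including its `.json` extension) are all hypothetical placeholders, not names taken from selftune. The function only prints the command so you can review it first; remove the `echo` to execute it.

```shell
# Dry-run helper: prints a log-based generate command instead of running it.
# Remove `echo` to actually invoke selftune.
generate_from_logs() {
  # $1 = skill name, $2 = max entries per side, $3 = random seed, $4 = output path
  echo selftune eval generate --skill "$1" --max "$2" --seed "$3" --output "$4"
}

# Hypothetical skill name and output path; substitute your own.
generate_from_logs my-skill 50 42 ./evals/my-skill.json
```

Pinning `--seed` together with `--max` is intended to keep repeated runs comparable, since the seed is described above as the reproducibility control.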
If Claude Code is rate-limited or you want to force a different runtime, use
--agent opencode (or codex / pi) for --synthetic, --auto-synthetic,
and --blend paths.
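The runtime switch described above might look like the following sketch for a synthetic run. It assumes a hypothetical skill called `my-skill` with its SKILL.md under `./skills/my-skill/`, and the helper name `generate_synthetic` is illustrative only. As with any dry run, the function prints the command rather than executing it; drop the `echo` to run it for real.

```shell
# Dry-run helper: prints a synthetic generate command for a chosen runtime agent.
# Remove `echo` to actually invoke selftune.
generate_synthetic() {
  # $1 = skill name, $2 = path to SKILL.md, $3 = runtime agent (claude, codex, opencode, pi)
  # --skill-path is required whenever --synthetic is used.
  echo selftune eval generate --skill "$1" --synthetic --skill-path "$2" --agent "$3"
}

# Hypothetical skill name and SKILL.md location; substitute your own.
generate_synthetic my-skill ./skills/my-skill/SKILL.md opencode
```

Passing the agent as an argument makes it easy to retry the same request with `codex` or `pi` if the first runtime is unavailable.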
Every successful generate run also mirrors a canonical copy to:
selftune status use to
decide whether a skill already has eval coverage.
For new draft packages, the next steps after eval generate are usually
rerunning verify or, if you are driving the advanced loop
manually, continuing with create replay and
create baseline.
unit-test
Run or generate deterministic unit tests:
selftune status.