Overview

After an evolution deploys a new skill description, selftune monitors for regressions. If the new description performs worse than the old one, selftune can automatically roll back.

How monitoring works

selftune watch compares a sliding window of post-deploy sessions against the pre-deploy baseline in four steps:
  1. Baseline capture — records pass rates before the evolution deploys
  2. Post-deploy tracking — monitors new sessions after deployment
  3. Regression detection — compares post-deploy metrics against the baseline
  4. Auto-rollback — if regression confidence is strong enough, reverts to the backup
To start monitoring a skill:
selftune watch --skill my-skill --skill-path path/to/SKILL.md
With auto-rollback enabled:
selftune watch --skill my-skill --skill-path path/to/SKILL.md --auto-rollback
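
The four monitoring steps can be sketched roughly as follows. This is an illustrative Python model, not selftune's actual implementation: the class name, the window size, the minimum session count, and the tolerance are all assumptions chosen for the example, and "regression confidence" is reduced here to a simple threshold on the pass-rate drop.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RegressionMonitor:
    """Hypothetical sliding-window monitor (all defaults are assumed values)."""

    baseline_pass_rate: float   # step 1: pass rate captured before the deploy
    window_size: int = 20       # assumed number of recent sessions to keep
    min_sessions: int = 5       # assumed minimum sessions before judging
    tolerance: float = 0.10     # assumed allowed drop below the baseline
    window: deque = field(default_factory=deque)

    def record(self, passed: bool) -> None:
        # Step 2: track each post-deploy session, keeping only the newest ones.
        self.window.append(passed)
        if len(self.window) > self.window_size:
            self.window.popleft()

    def pass_rate(self) -> float:
        return sum(self.window) / len(self.window) if self.window else 0.0

    def regressed(self) -> bool:
        # Step 3: compare the windowed pass rate against the baseline,
        # but only once enough sessions have been observed.
        if len(self.window) < self.min_sessions:
            return False
        return self.pass_rate() < self.baseline_pass_rate - self.tolerance

    def should_roll_back(self, auto_rollback: bool) -> bool:
        # Step 4: revert only when auto-rollback is enabled and a regression is seen.
        return auto_rollback and self.regressed()
```

Requiring a minimum number of sessions before judging keeps one or two early failures from triggering a rollback; a real monitor would likely use a stronger statistical test than a fixed tolerance.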

Activation rules

selftune includes built-in activation rules that trigger automatically:
Rule                     | Condition                                            | Action
post-session-diagnostic  | More than 2 unmatched queries in a session           | Suggests selftune last
grading-threshold-breach | Session pass rate below 60%                          | Suggests selftune evolve
stale-evolution          | No evolution in 7+ days with pending false negatives | Suggests evolve
regression-detected      | Monitoring detects regression                        | Suggests rollback
Rules fire at most once per session to avoid noise.
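
As a rough illustration, the rules and the once-per-session limit could be modelled as below. The `SessionState` fields and the `evaluate` function are hypothetical names invented for this sketch; only the rule names, thresholds, and suggestions come from the rules listed above. The sketch also assumes that "at most once per session" applies to each rule individually.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical snapshot of the signals the rules look at."""

    unmatched_queries: int = 0
    pass_rate: float = 1.0
    days_since_evolution: int = 0
    pending_false_negatives: int = 0
    regression_detected: bool = False


# (rule name, condition, suggestion), mirroring the built-in rules.
RULES = [
    ("post-session-diagnostic", lambda s: s.unmatched_queries > 2, "selftune last"),
    ("grading-threshold-breach", lambda s: s.pass_rate < 0.60, "selftune evolve"),
    (
        "stale-evolution",
        lambda s: s.days_since_evolution >= 7 and s.pending_false_negatives > 0,
        "evolve",
    ),
    ("regression-detected", lambda s: s.regression_detected, "rollback"),
]


def evaluate(state: SessionState, already_fired: set) -> list:
    """Return (rule, suggestion) pairs for rules that fire now.

    `already_fired` holds the rules that have fired in this session, so a
    rule whose condition stays true is not reported a second time.
    """
    suggestions = []
    for name, condition, suggestion in RULES:
        if name not in already_fired and condition(state):
            already_fired.add(name)
            suggestions.append((name, suggestion))
    return suggestions
```

Passing the same `already_fired` set for the whole session is what suppresses repeats; starting a new session means starting with an empty set.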

Orchestrate loop

For fully autonomous operation, selftune run executes the complete loop:
sync → grade → evolve → watch
selftune run
In continuous mode:
selftune run --loop --loop-interval 3600
See the orchestrate command reference for all options.
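
Conceptually, the loop runs the four stages in order and, in continuous mode, waits for the interval before starting again. The Python sketch below only models that control flow: `run_loop`, the stage callables, and `max_iterations` are hypothetical names for illustration, and treating the `--loop-interval` value as seconds is an assumption, not something this page states.

```python
import time
from typing import Callable, Optional, Sequence, Tuple

Stage = Tuple[str, Callable[[], None]]


def run_loop(
    stages: Sequence[Stage],
    loop: bool = False,
    interval_seconds: float = 3600,        # assumed unit: seconds
    max_iterations: Optional[int] = None,  # hypothetical stop condition for demos
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run every stage in order; repeat after each interval when `loop` is set.

    Returns the number of completed passes through the stages.
    """
    iterations = 0
    while True:
        for _name, stage in stages:
            stage()
        iterations += 1
        finished = max_iterations is not None and iterations >= max_iterations
        if not loop or finished:
            return iterations
        sleep(interval_seconds)


if __name__ == "__main__":
    # Stub stages that only print their names, in the documented order.
    demo = [(n, lambda n=n: print(n)) for n in ("sync", "grade", "evolve", "watch")]
    run_loop(demo)  # a single pass, analogous to running without --loop
```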

Dashboard monitoring

The local dashboard shows skill health in real time, with live updates delivered over server-sent events (SSE):
selftune dashboard
The dashboard displays:
  • Per-skill pass rates over time
  • Evolution history and outcomes
  • Missed queries and false negatives
  • Orchestrate run summaries
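
For readers unfamiliar with SSE, the sketch below shows how a client might split a `text/event-stream` into individual event payloads. It follows the general SSE wire format only; the dashboard's endpoint path and the shape of its event payloads are not documented here, so the function name and the sample payload are purely hypothetical.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE stream.

    Only `data:` fields are handled; comment lines (starting with ":") and
    other fields such as `event:` or `id:` are ignored in this sketch.
    """
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield "\n".join(data)
                data = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data.append(value[1:] if value.startswith(" ") else value)
    # An event without a terminating blank line is incomplete and is dropped.


if __name__ == "__main__":
    sample = [
        ": keep-alive\n",
        'data: {"skill": "my-skill", "pass_rate": 0.82}\n',  # hypothetical payload
        "\n",
    ]
    for payload in iter_sse_data(sample):
        print(payload)
```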