Usage

selftune watch [skill]
selftune watch [skill] --ignore-watch-alerts
Monitors a skill over time, detecting trigger regressions and grade regressions. After each cycle, watch computes a trust score that gates publishing.

Flags

Flag               Type     Default  Description
--skill            string   -        Required. Skill name to monitor
--skill-path       string   -        Required. Path to the skill’s SKILL.md
--window           number   20       Number of recent sessions to evaluate
--threshold        number   0.1      Trigger-rate regression threshold below baseline
--grade-threshold  number   0.15     Grade score regression threshold below baseline
--no-grade-watch   boolean  false    Disable grade-based regression detection
--auto-rollback    boolean  false    Automatically roll back when a regression is detected
--sync-first       boolean  false    Refresh telemetry before reading watch inputs
--sync-force       boolean  false    Force a full rescan during --sync-first
--help             boolean  false    Show command help
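
The two threshold flags can be read as "how far below baseline a metric may fall before it counts as a regression". The Python sketch below illustrates that reading only. The function names are hypothetical, and the strict "greater than" comparison is an assumption, since the exact rule selftune applies is not stated here.

```python
def is_regression(current: float, baseline: float, threshold: float) -> bool:
    """Return True when `current` has dropped more than `threshold` below `baseline`.

    Assumption: the comparison is strict; the real tool may treat the boundary differently.
    """
    return (baseline - current) > threshold


def check_cycle(
    pass_rate: float,
    baseline_pass_rate: float,
    grade: float,
    baseline_grade: float,
    threshold: float = 0.1,         # mirrors the --threshold default
    grade_threshold: float = 0.15,  # mirrors the --grade-threshold default
    grade_watch: bool = True,       # False mirrors --no-grade-watch
) -> dict:
    """Evaluate one hypothetical watch cycle against both thresholds."""
    return {
        "trigger_regression": is_regression(pass_rate, baseline_pass_rate, threshold),
        "grade_regression": grade_watch
        and is_regression(grade, baseline_grade, grade_threshold),
    }


if __name__ == "__main__":
    # Trigger rate fell 0.15 below baseline; grade fell only 0.02.
    print(check_cycle(0.70, 0.85, 0.80, 0.82))
```

Under this reading, raising --threshold makes watch more tolerant of trigger-rate dips, and --no-grade-watch removes the grade comparison entirely.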

Trust score

Every watch cycle produces a trust score between 0 and 1 that summarizes skill health:
Score     Meaning
1.0       No regressions, sufficient check data
0.5–0.99  Minor issues or limited data
< 0.5     Active regressions or rollbacks detected

How the score is calculated

Signal                                    Effect on score
Trigger pass rate regression              −0.5
Grade regression (scaled by delta)        up to −0.3
Active alert without specific regression  −0.2
Recent rollback                           −0.2
Insufficient check data                   Capped at 0.5
Scores are clamped to [0, 1]. A skill with no regressions and enough data scores 1.0. The current trust score is visible in the skill report on the dashboard.
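
The penalties above can be combined in a short Python sketch. This is an illustration, not selftune's implementation. It assumes the penalties are subtracted from a starting score of 1.0 and that the grade penalty grows linearly with the grade delta. The `WatchSignals` type and the `full_penalty_delta` parameter are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class WatchSignals:
    """Hypothetical container for the signals listed in the table."""
    trigger_regression: bool = False
    grade_regression: bool = False
    grade_delta: float = 0.0      # size of the grade drop below baseline
    active_alert: bool = False
    recent_rollback: bool = False
    sufficient_data: bool = True


def trust_score(s: WatchSignals, full_penalty_delta: float = 0.3) -> float:
    """Combine the documented penalties into a score in [0, 1]."""
    score = 1.0
    if s.trigger_regression:
        score -= 0.5
    if s.grade_regression:
        # Assumption: linear scaling, reaching the 0.3 maximum at `full_penalty_delta`.
        scale = min(1.0, max(0.0, s.grade_delta) / full_penalty_delta)
        score -= 0.3 * scale
    if s.active_alert and not (s.trigger_regression or s.grade_regression):
        # Applies only when the alert is not tied to a specific regression.
        score -= 0.2
    if s.recent_rollback:
        score -= 0.2
    if not s.sufficient_data:
        score = min(score, 0.5)   # cap, not a subtraction
    return max(0.0, min(1.0, score))


if __name__ == "__main__":
    print(trust_score(WatchSignals()))                         # healthy skill
    print(trust_score(WatchSignals(trigger_regression=True)))  # trigger regression only
```

The insufficient-data rule is modeled as a cap rather than a penalty, which is why a skill with no regressions but too little data lands at exactly 0.5.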

Output

watch emits structured JSON by default. The key fields are:
  • snapshot.pass_rate and snapshot.baseline_pass_rate for measured trigger delta
  • alert for any trigger or grade regression message
  • recommended_command for a machine-readable follow-up, usually rollback when watch detects a regression
  • gradeAlert and gradeRegression for grade-specific evidence
create publish --watch also returns this payload, nested under watch_result with the same shape, so agents and the local dashboard can inspect measured post-deploy watch evidence without reparsing raw terminal output.
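
A consumer of this JSON might read the key fields as in the Python sketch below. The sample payload is hypothetical: its values are invented for illustration, and the exact types (for example, whether alert is a string or null) and any fields beyond those listed above are assumptions.

```python
import json

# Hypothetical watch output; values and the placeholder command are illustrative only.
SAMPLE_OUTPUT = """
{
  "snapshot": {"pass_rate": 0.70, "baseline_pass_rate": 0.85},
  "alert": "trigger pass rate dropped below baseline",
  "recommended_command": "<follow-up command>",
  "gradeAlert": null,
  "gradeRegression": null
}
"""


def summarize_watch(payload: dict) -> dict:
    """Pull the documented key fields out of a watch payload, tolerating missing keys."""
    snapshot = payload.get("snapshot") or {}
    pass_rate = snapshot.get("pass_rate")
    baseline = snapshot.get("baseline_pass_rate")
    delta = None
    if pass_rate is not None and baseline is not None:
        delta = pass_rate - baseline  # negative means below baseline
    return {
        "trigger_delta": delta,
        "has_alert": bool(payload.get("alert")),
        "has_grade_alert": bool(payload.get("gradeAlert")),
        "recommended_command": payload.get("recommended_command"),
    }


if __name__ == "__main__":
    summary = summarize_watch(json.loads(SAMPLE_OUTPUT))
    print(summary)
```

For the nested form returned by create publish --watch, the same function could be applied to the watch_result object, assuming it carries the same fields.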

Grade watch

In addition to trigger-rate checks, watch compares grade scores against the baseline and reports a grade regression when the drop exceeds --grade-threshold (default 0.15). Pass --no-grade-watch to disable grade-based detection.

Publish gate

Before a skill is published to the registry, SelfTune evaluates the most recent watch results. This gate is advisory: it produces warnings, not hard blocks. Publishing with active warnings is still not recommended.

What triggers a warning

  • Low trust score — score below 0.70
  • Active alerts — any unresolved alert from recent watch cycles
  • Recent rollback — the skill was rolled back and the issue may not be resolved
  • No watch data — skill has never been watched; consider running selftune watch first
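
The four warning conditions can be sketched as a single check in Python. The function name, its arguments, and the use of None to represent "never watched" are hypothetical; only the 0.70 cutoff and the four conditions come from the list above.

```python
from typing import List, Optional

LOW_TRUST_CUTOFF = 0.70  # from the "Low trust score" condition


def publish_warnings(
    trust_score: Optional[float],  # None models a skill that has never been watched
    active_alerts: int,
    recent_rollback: bool,
) -> List[str]:
    """Return advisory warnings; an empty list means the gate raises nothing."""
    if trust_score is None:
        return ["No watch data: consider running selftune watch first"]
    warnings: List[str] = []
    if trust_score < LOW_TRUST_CUTOFF:
        warnings.append(f"Low trust score: {trust_score:.2f} is below {LOW_TRUST_CUTOFF:.2f}")
    if active_alerts > 0:
        warnings.append(f"Active alerts: {active_alerts} unresolved")
    if recent_rollback:
        warnings.append("Recent rollback: the issue may not be resolved")
    return warnings


if __name__ == "__main__":
    for line in publish_warnings(0.65, active_alerts=1, recent_rollback=True):
        print("warning:", line)
```

Because the gate is advisory, the sketch returns messages rather than raising an error; a caller decides whether to proceed.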

Bypassing warnings

If you are confident the alerts do not apply, pass --ignore-watch-alerts:
selftune watch my-skill --ignore-watch-alerts
This is intended for expert use. Warnings are still shown; they are not suppressed.

Alerts

When watch detects a problem, it sets an alert on the result. Alerts flow through to the publish gate and are visible in the dashboard. Resolve the underlying regression before publishing to avoid distributing a degraded skill.
To roll back automatically when watch detects a regression, pass --auto-rollback:
selftune watch --skill my-skill --skill-path path/to/SKILL.md --auto-rollback
Auto-rollback is irreversible without re-running selftune evolve. Use it only in automated pipelines where you have a clear re-evolution path.

Examples

Run watch on a specific skill:
selftune watch summarize
Publish despite active watch warnings (expert use):
selftune watch summarize --ignore-watch-alerts