Start with the symptom
Most skill failures fall into one of five buckets:

| Symptom | Usually means |
|---|---|
| The skill does not fire when it should | The description is too narrow or too technical |
| The skill fires for the wrong prompts | The description is too broad or overlaps with another skill |
| The skill fires but ignores the workflow | The instructions are too vague, too long, or not deterministic enough |
| The skill runs but output quality is poor | The workflow is missing constraints, examples, or validation steps |
| The agent hangs while running the skill | A bundled script is interactive or produces unusable output |
1. The skill undertriggers
This is the most common failure mode. The skill exists, but users ask in natural language and the trigger never matches.
What to check
- Does the description use developer terms instead of user terms?
- Does it say what the skill does but not when to use it?
- Are you missing common synonyms, adjacent phrases, or contextual wording?
What to run
Typical fix
Rewrite the description around user intent: say when to use the skill, in the words users would actually type.
2. The skill overtriggers
This is the opposite failure. The skill catches prompts that belong to another skill or no skill at all.
What to check
- Does the description include generic words like “analyze,” “build,” or “review” without domain boundaries?
- Are two skills trying to own the same user intent?
- Did a recent evolution increase recall by sacrificing precision?
What to run
Typical fix
Narrow the trigger boundary: attach a domain to generic verbs and make sure only one skill owns each user intent.
3. The skill fires but the agent ignores instructions
This is usually a Tier 2 problem: the trigger is fine, but the workflow execution is unreliable.
What to check
- Is `SKILL.md` too long to stay in context?
- Are the steps ambiguous or full of prose instead of explicit actions?
- Are repeated mechanical tasks still written as markdown instead of scripts?
- Are there too many branches packed into one file?
What to run
Typical fixes
- Split one large `SKILL.md` into a router plus focused workflow files
- Move deterministic logic into scripts
- Add concrete command examples
- Remove long explanatory text from the operational path
4. The workflow is followed, but the output is weak
This is a Tier 3 problem. The agent is activating the skill and roughly following instructions, but the result is still not good enough.
What to check
- Does the workflow specify quality bars, output format, or acceptance criteria?
- Are there examples of good output?
- Is there a validation or review step before final output?
What to run
Typical fixes
- Add a target structure for the final answer
- Include one short example of a strong result
- Add a deterministic validation step before completion
- Separate “collect data” from “present result”
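A deterministic validation step can be as small as a function that checks the draft against the target structure before the agent presents it. The sketch below is only an illustration: the section titles in `REQUIRED_SECTIONS` and the `validate_report` helper are hypothetical, and it assumes the final answer is a markdown report with `#`-style headings.

```python
import re

# Hypothetical acceptance criteria: section titles a finished report must contain.
REQUIRED_SECTIONS = ("Summary", "Findings", "Next steps")


def validate_report(text, required=REQUIRED_SECTIONS):
    """Return a list of problems; an empty list means the report passes."""
    problems = []
    # Collect markdown heading titles, lowercased for a case-insensitive match.
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^#+[ \t]+(.+)$", text, re.MULTILINE)
    }
    for title in required:
        if title.lower() not in headings:
            problems.append(f"missing section: {title}")
    if not text.strip():
        problems.append("report is empty")
    return problems
```

A workflow step could call this on the draft and only move on to "present result" when the returned list is empty.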
5. Scripts hang or behave unpredictably
This is almost always a script design problem, not a selftune problem.
What to check
- Does the script prompt for input interactively?
- Does it print verbose logs to stdout instead of returning structured data?
- Does it require local dependencies that are not declared?
- Does it mutate files or state without explicit flags?
Typical bad pattern
Typical fix
- All inputs via flags, env vars, or stdin
- `--help` explains usage
- stdout contains parseable output
- stderr contains diagnostics
- exit codes are meaningful
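As a minimal sketch of the conventions listed above, here is a hypothetical bundled script that counts words. The task, the `--input` flag, and the JSON field names are assumptions chosen for illustration, not part of selftune. It never prompts, takes input from a flag or stdin, writes one JSON object to stdout, sends diagnostics to stderr, and returns a distinct exit code on failure.

```python
import argparse
import json
import sys


def main(argv=None, stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    """Count words non-interactively; return the process exit code."""
    # argparse generates --help output from these declarations.
    parser = argparse.ArgumentParser(description="Count words in a file or stdin.")
    parser.add_argument("--input", default="-", help="path to read, or '-' for stdin")
    args = parser.parse_args(argv)

    try:
        if args.input == "-":
            text = stdin.read()
        else:
            with open(args.input, encoding="utf-8") as handle:
                text = handle.read()
    except OSError as exc:
        # Diagnostics go to stderr so stdout stays parseable.
        print(f"error: cannot read {args.input}: {exc}", file=stderr)
        return 2  # non-zero exit code signals failure to the caller

    print(f"read {len(text)} characters", file=stderr)
    # stdout carries exactly one JSON object and nothing else.
    json.dump({"input": args.input, "words": len(text.split())}, stdout)
    stdout.write("\n")
    return 0
```

Saved as a file, the script would end with `raise SystemExit(main())` under the usual `if __name__ == "__main__":` guard, so the agent sees the return value as the process exit status.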
6. selftune does not seem to have enough evidence
Sometimes the skill itself is fine, but selftune cannot judge or evolve it well because the local evidence is thin.
What to check
- Did you run `selftune sync` after recent sessions?
- Is this a brand-new skill with no real usage history?
- Did the local setup pass `selftune doctor`?
What to run
Typical fix
- Repair setup issues first
- Use synthetic evals until real usage arrives
- For draft packages, run `verify` first, then fill only the missing replay or baseline steps it asks for before you publish
- Establish a baseline before you judge evolution results
A practical triage order
When you are unsure where to start, use this order:
1. `selftune doctor`
2. `selftune sync`
3. `selftune status`
4. `selftune grade auto --skill my-skill`
5. `selftune eval generate --skill my-skill`
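If you run this sequence often, a small wrapper can walk the triage order for you. The sketch below copies the commands from the order above, with `my-skill` as a placeholder. Stopping at the first failing step is a design choice of this example, and it assumes each command signals failure with a non-zero exit status, which this page does not confirm.

```python
import subprocess

# Commands copied from the triage order; "my-skill" is a placeholder skill name.
TRIAGE_STEPS = [
    ["selftune", "doctor"],
    ["selftune", "sync"],
    ["selftune", "status"],
    ["selftune", "grade", "auto", "--skill", "my-skill"],
    ["selftune", "eval", "generate", "--skill", "my-skill"],
]


def run_triage(steps=TRIAGE_STEPS, run=subprocess.run):
    """Run each step in order; return the first failing command, or None."""
    for command in steps:
        result = run(command)
        # Assumption: a non-zero exit status means the step found a problem.
        if result.returncode != 0:
            return command
    return None
```

Calling `run_triage()` returns `None` when every step succeeds, or the command to investigate first. The `run` parameter exists so the function can be exercised with a stand-in runner on machines where selftune is not installed.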
Escalate carefully
Do not jump straight to rewriting the whole skill. Usually the right fix is smaller:
- Tighten the description
- Split a workflow
- Add a script
- Roll back a bad evolution
- Add stronger eval coverage
Next steps
Build Your First Skill
Follow the full authoring and optimization loop.
Testing Triggers
Diagnose undertriggering and overtriggering.
Managing Context
Fix workflows that get ignored mid-session.
Using Scripts
Make the mechanical parts deterministic.