What this is for

This guide is for people publishing skills that other users will install. If you want the concrete shipping sequence first, start with Create, Test, and Deploy a Skill, then come back here for the creator-specific before-ship / after-ship loop.

The creator lifecycle has two phases:
  • Before ship: structure -> verify -> publish
  • After ship: collect signal -> inspect -> propose -> apply -> watch

Before ship

Put the right thing in the right place

| Surface | What belongs there |
| --- | --- |
| description / routing | What the skill should trigger on |
| workflows/ | The ordered procedure once the skill is chosen |
| references/ | Durable context, taxonomy, checklists, examples |
| scripts/ / tools | Deterministic mechanics the agent should not reinvent |

Rule of thumb:
  • Routing problems belong in the description.
  • Execution problems belong in workflows.
  • Missing context belongs in references.
  • Repeated exact logic belongs in code.
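
As a rough illustration of the surfaces above, an empty package could be laid out with a few shell commands. This is a sketch only: the helper name scaffold_skill is hypothetical, and it assumes SKILL.md sits at the package root and carries the description / routing text, which this guide does not state explicitly.

```shell
# Sketch only: create the directories named in the table above.
# ASSUMPTION: SKILL.md lives at the package root and holds the description / routing text.
scaffold_skill() {
  # $1: package directory, for example path/to/my-skill
  mkdir -p "$1/workflows" "$1/references" "$1/scripts" || return 1
  # touch creates SKILL.md if it is absent and leaves existing content untouched.
  touch "$1/SKILL.md"
}
```

Calling scaffold_skill path/to/my-skill on an existing package should be harmless, since mkdir -p and touch do not overwrite anything.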

Keep the router small and legible

  • Start router-first.
  • Add only the trigger language needed to recognize the job.
  • Add negative examples for nearby intents that should not trigger.
  • Split workflows when the execution path really changes.

Run cold-start tests before launch

Use the lifecycle-first trust flow before launch:
selftune verify --skill-path path/to/my-skill
selftune eval generate --skill my-skill --skill-path path/to/SKILL.md
selftune verify --skill-path path/to/my-skill
selftune eval unit-test --skill my-skill --generate --skill-path path/to/SKILL.md
selftune verify --skill-path path/to/my-skill
selftune create replay --skill-path path/to/my-skill --mode package
selftune create baseline --skill-path path/to/my-skill --mode package
selftune verify --skill-path path/to/my-skill
selftune publish --skill-path path/to/my-skill
verify is the front door. The lower-level eval, replay, and baseline commands are supporting steps you fill in only when verify reports that evidence is still missing.

If you prefer the condensed version, the intended lifecycle is:
selftune verify --skill-path path/to/my-skill
# Fill only the missing eval, unit-test, replay, or baseline step that verify reports.
selftune publish --skill-path path/to/my-skill
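
If you script the condensed flow, one option is to gate publish on the result of verify. The wrapper below is a minimal sketch, not a documented selftune feature: the function name verify_then_publish is hypothetical, and it assumes that selftune verify exits with a non-zero status while evidence is still missing. Check that behavior in your installed version before relying on it.

```shell
# Sketch only: publish when verify succeeds, stop otherwise.
# ASSUMPTION: `selftune verify` returns a non-zero exit status while evidence is missing.
verify_then_publish() {
  # $1: package directory, for example path/to/my-skill
  if selftune verify --skill-path "$1"; then
    selftune publish --skill-path "$1"
  else
    echo "verify did not pass; fill in the step it reports, then re-run" >&2
    return 1
  fi
}
```

The wrapper uses only the verify and publish invocations shown above, so the missing eval, unit-test, replay, or baseline step still has to be run by hand.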
Ship only when you can explain:
  • what should trigger
  • what should not
  • what the workflow does
  • which parts require references or tools
For already-published skills you are iterating on in place, evolve --dry-run, grade baseline, and evolve --with-baseline are still the right mutation loop. For brand-new package drafts, prefer the lifecycle-first verify / publish flow above.

Bundle creator-directed contribution

If you want post-ship creator feedback in the cloud dashboard:
selftune creator-contributions enable --skill my-skill --creator-id <cloud-user-uuid>
This writes selftune.contribute.json into the skill package so users can opt in to privacy-safe creator-directed relay.

Supported creator-directed signals today:
  • trigger
  • grade
  • miss_category
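
Before publishing, you may want to confirm that the file was actually bundled. The check below is a sketch under one assumption that this guide does not confirm: that selftune.contribute.json is written at the package root. The helper name has_contribute_config is hypothetical.

```shell
# Sketch only: succeed when the contribution config is present in the package.
# ASSUMPTION: selftune.contribute.json is written at the package root.
has_contribute_config() {
  # $1: package directory, for example path/to/my-skill
  [ -f "$1/selftune.contribute.json" ]
}
```

For example, has_contribute_config path/to/my-skill || echo "creator-directed feedback is not bundled" prints a reminder when the file is absent.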

After ship

Understand the two contributor paths

  • selftune contributions approve <skill>: lightweight creator-directed relay into the community dashboard
  • selftune contribute --skill <skill> --submit: a deeper sanitized bundle export
Relay is the everyday signal loop. Bundles are the richer periodic contribution path.

Watch the right dashboard surfaces

After launch, check:
  • Community overview for cross-skill signal strength
  • Skill detail Community tab for missed categories, grades, and proposal drafts
  • Proposals for review/apply
  • Watch outcomes after apply

Wait for actionable signal

Trust contributor proposals only once a skill has:
  • at least 10 total signals
  • at least 3 distinct contributor cohorts
Below that threshold, gather more data before changing the skill.
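
The threshold above can be written as a small guard for your own review scripts. This is a sketch, not part of selftune: the function name has_actionable_signal is hypothetical, it assumes both conditions must hold together, and it expects you to supply the two counts yourself (for example, read from the dashboard).

```shell
# Sketch only: succeed when both thresholds from this guide are met.
# ASSUMPTION: the two conditions are combined with AND.
has_actionable_signal() {
  # $1: total signals, $2: distinct contributor cohorts
  [ "$1" -ge 10 ] && [ "$2" -ge 3 ]
}
```

A call such as has_actionable_signal 12 4 && echo "review proposals" only prints when both counts clear the bar; with 12 signals from 2 cohorts it stays silent.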

Interpret the signal correctly

  • A high miss count concentrated in a few categories usually means the router is wrong.
  • Low grades with a decent trigger rate usually mean the body/workflow/reference/tool split is wrong.
  • Repeated contributor proposals should be reviewed as hypotheses, not auto-applied truth.

Fast checklist

Before ship:
  • router is explicit
  • workflows are separated by execution path
  • references carry durable context
  • tools handle deterministic mechanics
  • the package passes selftune verify
  • evals cover more language than your own phrasing
  • selftune.contribute.json is bundled if you want creator-directed feedback
After ship:
  • users know how to opt in
  • community pages show the skill by name
  • proposal drafts are only created from coherent signal
  • watch closes the loop after apply