What this guide covers

Use this when you want one path from idea to shipped skill:
Create -> verify -> publish -> watch
It combines the Agent Skills authoring basics with selftune’s lifecycle-first creator flow, so you can answer four questions before you publish:
  1. Is the skill scoped correctly?
  2. Does it trigger on the right prompts?
  3. Does it add value compared with no skill?
  4. Can I deploy it without guessing?
This guide follows the current package-first flow:
Draft package -> verify -> fill missing evidence -> publish -> watch
If you only want the command reference, use selftune create. If you want the lighter introductory version first, use Build and Improve Your First Skill.

Step 1: Pick a coherent skill boundary

Start with one unit of work the agent would otherwise get wrong or do inconsistently. Good candidates:
  • A reusable workflow with domain-specific context
  • A task with recurring trigger misses
  • A job that mixes judgment with a few deterministic steps
Bad candidates:
  • A vague bucket like “do engineering work”
  • A tiny one-line trick that the base agent already handles well
  • A grab-bag of unrelated tasks
Rule of thumb:
  • Put routing in description
  • Put ordered execution in workflows/
  • Put durable context in references/
  • Put deterministic mechanics in scripts/ or tools

Step 2: Create the draft package

Start with selftune create init if you already know the skill you want to build:
selftune create init \
  --name "Summarize Issues" \
  --description "Use when the user wants issue trackers, support threads, or bug reports summarized into an engineering brief."
If you want to bootstrap from repeated telemetry instead, use:
selftune create scaffold --from-workflow 1 --write
Both commands produce the same package shape:
summarize-issues/
├── SKILL.md
├── workflows/default.md
├── references/overview.md
├── scripts/
├── assets/
└── selftune.create.json
Each file has one job:
  • SKILL.md is the router.
  • workflows/default.md is the first execution path.
  • references/overview.md is durable background context.
  • selftune.create.json records the package metadata selftune uses for readiness and package replay.
Start with a small router:
---
name: summarize-issues
description: >
  Use this skill when the user wants bug reports, support threads, or issue
  trackers summarized into an engineering brief with key failures, reproduction
  clues, and next steps. Do not use it for writing issue templates, backlog
  prioritization, or fixing the bug itself.
compatibility: Works with standard shell tools only.
---
Keep the description focused on when to use the skill, not how it works internally.
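A quick way to sanity-check the router is to read its frontmatter back out of SKILL.md. The sketch below is a minimal POSIX shell helper, not part of selftune: skill_field and name_matches_dir are hypothetical names, and the check that name equals the package directory name is an assumption based on the example above, where both are summarize-issues.

```shell
# Print the value of a top-level frontmatter field (e.g. "name") from a SKILL.md file.
# Only handles single-line "key: value" fields. A folded block such as
# "description: >" prints ">" and needs a real YAML parser.
skill_field() {
  # $1 = field name, $2 = path to SKILL.md
  awk -v key="$1" '
    /^---[ \t]*$/ { fence++; next }            # count the frontmatter delimiters
    fence == 1 && index($0, key ":") == 1 {    # inside frontmatter, line starts with "key:"
      sub(/^[^:]*:[ \t]*/, "")                 # drop the key and the following whitespace
      print
      exit
    }
    fence >= 2 { exit }                        # stop once the frontmatter has closed
  ' "$2"
}

# Succeed only if the frontmatter name equals the package directory name.
# (Assumption: the two are meant to agree, as they do in this guide's example.)
name_matches_dir() {
  # $1 = package directory
  [ "$(skill_field name "$1/SKILL.md")" = "$(basename "$1")" ]
}
```

For example, name_matches_dir .agents/skills/summarize-issues returns success when the draft's name field still matches its folder.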

Step 3: Put the right detail in the right file

Keep the always-loaded instructions lean. Use the rest of the directory intentionally:
  • workflows/ for the main path once the skill is selected
  • references/ for checklists, taxonomies, schemas, or examples the agent should load on demand
  • scripts/ for exact mechanics the agent should execute instead of reinventing
  • assets/ for templates, static examples, or config snippets
That split matters because agents load metadata first, full SKILL.md on activation, and support files only when needed.
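Before running any selftune command, it can help to confirm that the draft still has the package shape shown in Step 2. The following is a minimal sketch in POSIX shell; check_package_shape is a hypothetical helper, and treating scripts/ and assets/ as required directories is an assumption. selftune verify remains the authoritative check.

```shell
# Report which parts of the package shape shown in this guide are missing.
# Prints one "missing: <path>" line per absent entry and returns non-zero if any are missing.
check_package_shape() {
  # $1 = package directory, e.g. .agents/skills/summarize-issues
  pkg_dir=$1
  missing=0
  # Files named in the package tree.
  for f in SKILL.md workflows/default.md references/overview.md selftune.create.json; do
    if [ ! -f "$pkg_dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  # Directories named in the package tree (they may legitimately be empty).
  for d in scripts assets; do
    if [ ! -d "$pkg_dir/$d" ]; then
      echo "missing: $d/"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_package_shape .agents/skills/summarize-issues prints nothing and succeeds when every entry from the tree is present.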

Step 4: Check the package before generating evals

Use the draft-aware status and verify commands first:
selftune create status --skill-path .agents/skills/summarize-issues
selftune verify --skill-path .agents/skills/summarize-issues
create status is the fast local view. verify runs the same readiness contract as create check, then emits the measured package report once the draft is actually ready. At this point you want to confirm:
  • the package structure is complete enough to validate
  • the entry workflow exists
  • the description is specific enough to route
  • the next missing artifact is clear
Then do a quick manual trigger pass with three kinds of prompts:
  • should-trigger prompts
  • should-not-trigger near misses
  • realistic prompts with file paths, context, and messy phrasing
If the router is obviously too broad or too narrow, fix it now.
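One lightweight way to keep the manual trigger pass honest is to write the prompts down and confirm that all three kinds are represented. The sketch below assumes a tab-separated file of your own making; the kind labels (trigger, near-miss, realistic) and both function names are illustrative and are not a selftune format.

```shell
# Count prompts of one kind in a tab-separated file: "<kind><TAB><prompt>".
# Example rows (tab-separated):
#   trigger    Summarize the open bug reports in issues/ into an engineering brief
#   near-miss  Write an issue template for bug reports
#   realistic  can u boil down support-threads/q3.txt for eng? need repro clues
count_prompt_kind() {
  # $1 = kind, $2 = prompts file
  awk -F '\t' -v kind="$1" '$1 == kind { n++ } END { print n + 0 }' "$2"
}

# Succeed only if each of the three kinds has at least one prompt.
covers_all_kinds() {
  # $1 = prompts file
  for kind in trigger near-miss realistic; do
    [ "$(count_prompt_kind "$kind" "$1")" -gt 0 ] || { echo "no $kind prompts"; return 1; }
  done
}
```

If covers_all_kinds reports a missing kind, add prompts of that kind before judging whether the router is too broad or too narrow.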

Step 5: Generate your first eval set

If you already have real usage:
selftune eval generate --skill my-skill
If the skill is new or cold-start:
selftune eval generate \
  --skill my-skill \
  --auto-synthetic \
  --skill-path path/to/my-skill/SKILL.md
This creates the routing eval set and saves the canonical copy under:
~/.selftune/eval-sets/my-skill.json
After generating evals, rerun:
selftune verify --skill-path .agents/skills/summarize-issues
The package should now move from needs_evals to needs_unit_tests.

Step 6: Add skill-level unit tests

Generate or run deterministic tests for the workflow itself:
selftune eval unit-test \
  --skill my-skill \
  --generate \
  --skill-path path/to/my-skill/SKILL.md
This covers the “once it triggers, does it do the job correctly?” part of the loop. The latest test run summary is stored under:
~/.selftune/unit-tests/my-skill.last-run.json
Run selftune verify --skill-path ... again after the suite is generated or recorded.
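If verify does not advance, a first thing to rule out is that the evidence files were never written. The sketch below checks the two locations this guide names for the eval set and the latest unit-test summary. The helper names are hypothetical, and it only tests that the files exist and are non-empty; it does not inspect their contents.

```shell
# Paths this guide names for the saved eval set and the latest unit-test summary.
eval_set_path()      { echo "$HOME/.selftune/eval-sets/$1.json"; }
unit_test_run_path() { echo "$HOME/.selftune/unit-tests/$1.last-run.json"; }

# Print whether each evidence file exists and is non-empty.
# Returns non-zero if either file is absent or empty.
check_evidence_files() {
  # $1 = skill name, e.g. my-skill
  rc=0
  for p in "$(eval_set_path "$1")" "$(unit_test_run_path "$1")"; do
    if [ -s "$p" ]; then
      echo "found: $p"
    else
      echo "absent: $p"
      rc=1
    fi
  done
  return "$rc"
}
```

Running check_evidence_files my-skill after Steps 5 and 6 should list both files as found.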

Step 7: Prove the package with replay validation

For a new draft package, use the package-aware replay path instead of a generic evolve dry-run:
selftune create replay \
  --skill-path .agents/skills/summarize-issues \
  --mode package
This stages the whole package, not just the router text, so runtime replay is allowed to read:
  • workflows/default.md
  • references/overview.md
  • other package-local files the skill needs during execution
Use --mode routing only if you intentionally want to isolate the routing layer.

Step 8: Measure the no-skill baseline

Record whether the skill actually adds value versus doing nothing:
selftune create baseline \
  --skill-path .agents/skills/summarize-issues \
  --mode package
That baseline is what lets selftune say “this skill helped” instead of only “this skill triggered.” At this point, selftune verify --skill-path ... should report publish as the next lifecycle action.
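Steps 7 and 8 plus the follow-up verify are easy to rerun as one unit while you iterate. The sketch below wraps exactly the three commands shown in this guide; run_step, evidence_loop, and the DRY_RUN variable are hypothetical conveniences, not selftune features. With DRY_RUN=1 it only prints the commands, which is a safe way to review them first.

```shell
# Print a command, then run it unless DRY_RUN=1.
run_step() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Rerun package replay, the no-skill baseline, and verify, in that order.
# Stops at the first command that fails.
evidence_loop() {
  # $1 = package directory, e.g. .agents/skills/summarize-issues
  run_step selftune create replay --skill-path "$1" --mode package &&
  run_step selftune create baseline --skill-path "$1" --mode package &&
  run_step selftune verify --skill-path "$1"
}
```

Publishing is deliberately left out of the loop so that it stays a separate, reviewed action.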

Step 9: Publish the draft package

Once evals, unit tests, replay validation, and baseline are all in place, ship through the lifecycle surface:
selftune publish \
  --skill-path .agents/skills/summarize-issues
This is the recommended ship command for new draft packages because it:
  • blocks if the draft is not ready
  • reuses the same measured package evaluation contract you saw during verify
  • starts watch by default unless you pass --no-watch
If you want another review pass first, rerun create replay or create baseline, inspect the dashboard skill report, and publish only after the draft loop is green.

Step 10: Watch the deployed skill

If you published with --no-watch, start monitoring explicitly:
selftune watch --skill summarize-issues --skill-path .agents/skills/summarize-issues/SKILL.md
Or let the broader loop manage it:
selftune run --skill summarize-issues
The local dashboard and selftune status now expose this flow directly:
  • missing package resources
  • spec validation not yet run
  • missing evals
  • missing unit tests
  • missing replay validation for the package
  • missing baseline
  • ready to publish
  • already deployed and under watch
The dashboard is especially useful for new packages because draft skills now appear there before they have live telemetry, with package-local create readiness on the skill report.
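Each of those states points back to one step in this guide. The lookup below is a sketch of that mapping as I read it. The label strings are copied from the list above and may not match what selftune prints verbatim, and next_step_for is a hypothetical helper.

```shell
# Map a readiness label from this guide to the step that addresses it.
# Labels are taken from the guide's prose; the exact strings selftune shows may differ.
next_step_for() {
  case "$1" in
    "missing package resources")        echo "Step 3: fill in workflows/, references/, scripts/, assets/" ;;
    "spec validation not yet run")      echo "Step 4: check the package" ;;
    "missing evals")                    echo "Step 5: selftune eval generate" ;;
    "missing unit tests")               echo "Step 6: selftune eval unit-test" ;;
    "missing replay validation"*)       echo "Step 7: selftune create replay" ;;
    "missing baseline")                 echo "Step 8: selftune create baseline" ;;
    "ready to publish")                 echo "Step 9: selftune publish" ;;
    "already deployed and under watch") echo "Step 10: keep watching" ;;
    *)                                  echo "unknown state: $1"; return 1 ;;
  esac
}
```

For instance, next_step_for "missing baseline" prints the Step 8 command to revisit.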

Step 11: Publish and share it

When the skill is stable, distribute it through the Agent Skills ecosystem:
npx skills add your-org/your-skill
If you want post-ship creator feedback, bundle creator-directed contribution config:
selftune creator-contributions enable --skill my-skill --creator-id <cloud-user-uuid>
That makes it possible to collect privacy-safe contributor signals after launch.

Deployment checklist

Ship only when all of these are true:
  • the description explains when to use the skill, not how to implement it
  • nearby negative examples do not trigger
  • the package passes selftune verify
  • workflows/, references/, and scripts/ each have a clear purpose
  • the skill validates against the Agent Skills package rules
  • evals exist
  • unit tests exist
  • replay evidence exists for the package
  • the no-skill baseline exists
  • the publish command has been reviewed
Goal | Read next
Learn the introductory version first | Build and Improve Your First Skill
Tune the router | Writing Effective Descriptions
Test trigger boundaries harder | Testing Skill Triggers
Package it for other users | Publishing and Sharing Skills
Operate the loop continuously | The Iteration Loop