When to use scripts

Skills are markdown — they tell the agent what to do using natural language. Scripts are code — they do things deterministically. The question is: which parts of your skill belong in each? Use markdown for judgment: choosing an approach, interpreting user intent, adapting to context. Use scripts for mechanics: validation, data transformation, formatting, API calls with fixed parameters. A useful heuristic from the agent skills spec: if the agent reinvents the same logic every run, bundle it as a tested script.

Two approaches

Reference existing packages

When an existing package does what you need, reference it directly in your SKILL.md without bundling scripts:
## Validate the configuration

Run validation using the published schema:

\`\`\`bash
npx [email protected] validate -s schema.json -d config.json
\`\`\`
Common runners for one-off commands:
| Runner | Ecosystem | Example |
| --- | --- | --- |
| npx / bunx | Node.js | npx [email protected] . |
| uvx / pipx | Python | uvx ruff check . |
| deno run | Deno | deno run jsr:@std/csv |
| go run | Go | go run golang.org/x/tools/cmd/stringer@latest |
Pin versions (e.g., npx [email protected]) so the skill behaves consistently across environments. State prerequisites in the compatibility frontmatter field.

Bundle scripts in the skill

When you need custom logic, put scripts in a scripts/ directory:
my-skill/
├── SKILL.md
├── scripts/
│   ├── validate.sh
│   ├── transform.py
│   └── generate.ts
└── references/
    └── schema.json
Reference them from SKILL.md using relative paths:
## Workflow

1. Validate input: `bash scripts/validate.sh "$INPUT_FILE"`
2. Transform data: `python3 scripts/transform.py --input data.json`
3. Generate output: `bun scripts/generate.ts --format pdf`

Self-contained scripts

The best scripts declare their own dependencies inline, so the runner resolves them at launch and there is no separate install step:

Python (PEP 723)

# /// script
# dependencies = [
#   "beautifulsoup4>=4.12",
#   "httpx>=0.27",
# ]
# ///

import sys

import httpx
from bs4 import BeautifulSoup

# Fetch the URL passed as the first argument and print its visible text.
response = httpx.get(sys.argv[1])
soup = BeautifulSoup(response.text, "html.parser")
print(soup.get_text())
Run with: uv run scripts/extract.py https://example.com

TypeScript (Bun)

#!/usr/bin/env bun

import { parseArgs } from "util";

const { values } = parseArgs({
  args: Bun.argv.slice(2),
  options: {
    input: { type: "string" },
    format: { type: "string", default: "json" },
  },
});

// Script logic here
Run with: bun scripts/generate.ts --input data.json

Designing scripts for agents

Agents run scripts in non-interactive shells. The agent skills spec has specific design requirements:

No interactive prompts (hard requirement)

Any TTY prompt will hang indefinitely. Accept all input via flags, environment variables, or stdin:
# BAD: Will hang the agent
read -p "Enter filename: " filename

# GOOD: Accept as a positional argument (exits with a usage message if missing)
filename="${1:?Usage: validate.sh <filename>}"
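
The same rule carries over to scripts in other languages. A minimal Python sketch, assuming a hypothetical `--input` flag and `INPUT_FILE` environment variable (neither name comes from the agent skills spec): it resolves input from the flag, then the environment, then piped stdin, and exits with a message instead of prompting.

```python
import argparse


def resolve_input(argv, env, stdin):
    """Return ("file", path) or ("stdin", text) without ever prompting.

    Precedence: --input flag, then INPUT_FILE env var, then piped stdin.
    The flag and variable names are hypothetical, chosen for illustration.
    """
    parser = argparse.ArgumentParser(prog="validate.py")
    parser.add_argument("--input", help="path to the file to validate")
    args = parser.parse_args(argv)

    if args.input:
        return ("file", args.input)
    if env.get("INPUT_FILE"):
        return ("file", env["INPUT_FILE"])
    # Read stdin only when it is piped; a TTY would block waiting for a human.
    if not stdin.isatty():
        return ("stdin", stdin.read())
    raise SystemExit(
        "Error: no input. Pass --input <file>, set INPUT_FILE, or pipe data on stdin."
    )
```

A script would call it as `resolve_input(sys.argv[1:], os.environ, sys.stdin)`. Passing the three sources in explicitly keeps the function easy to test.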

Document with --help

This is how the agent learns your script’s interface:
$ scripts/validate.sh --help
Usage: validate.sh [OPTIONS] <input-file>

Validates a financial model JSON file against the schema.

Options:
  --strict    Fail on warnings (default: warnings only)
  --fix       Auto-fix common issues
  --format    Output format: json, text (default: text)

Examples:
  validate.sh model.json
  validate.sh --strict --format json model.json
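
For a script written in Python, `argparse` can generate help text of this kind from the declared options. A minimal sketch mirroring the interface above: the file name `validate.py` is an assumption, and the options are copied from the example, not from any real tool.

```python
import argparse


def build_parser():
    """Build a parser whose --help output documents the whole interface."""
    parser = argparse.ArgumentParser(
        prog="validate.py",
        description="Validates a financial model JSON file against the schema.",
        epilog=(
            "Examples:\n"
            "  validate.py model.json\n"
            "  validate.py --strict --format json model.json"
        ),
        # Keep the epilog's line breaks instead of re-wrapping them.
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("input_file", metavar="<input-file>",
                        help="model file to validate")
    parser.add_argument("--strict", action="store_true",
                        help="Fail on warnings (default: warnings only)")
    parser.add_argument("--fix", action="store_true",
                        help="Auto-fix common issues")
    parser.add_argument("--format", choices=["json", "text"], default="text",
                        help="Output format: json, text (default: text)")
    return parser
```

Calling `build_parser().parse_args()` handles `--help` automatically and rejects unknown flags with a usage message, so the documented interface and the enforced interface cannot drift apart.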

Actionable error messages

Error messages should tell the agent exactly what to do next:
Error: --format must be one of: json, csv, table.
       Received: "xml"

Error: Missing required field "revenue" in assumptions section.
       Fix: Add "revenue" to the assumptions object in model.json
       Hint: Run `scripts/scaffold.sh --template dcf` for a complete template
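
One way to produce messages in this shape is to build each error from what was received, what is allowed, and a suggested fix. A minimal Python sketch: the helper names `format_error` and `check_format` are hypothetical, and the allowed formats are taken from the first message above.

```python
import sys

ALLOWED_FORMATS = ("json", "csv", "table")


def format_error(problem, received=None, fix=None, hint=None):
    """Render an error as a headline plus aligned follow-up lines."""
    lines = [f"Error: {problem}"]
    indent = " " * len("Error: ")
    if received is not None:
        lines.append(f'{indent}Received: "{received}"')
    if fix:
        lines.append(f"{indent}Fix: {fix}")
    if hint:
        lines.append(f"{indent}Hint: {hint}")
    return "\n".join(lines)


def check_format(value):
    """Return None if the format is allowed, else an actionable message."""
    if value in ALLOWED_FORMATS:
        return None
    allowed = ", ".join(ALLOWED_FORMATS)
    return format_error(f"--format must be one of: {allowed}.", received=value)


if __name__ == "__main__":
    message = check_format("xml")
    if message:
        # Errors are diagnostics, so they go to stderr.
        print(message, file=sys.stderr)
```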

Structured output

Prefer JSON or CSV over free-form text. Separate data (stdout) from diagnostics (stderr):
# Data goes to stdout (agent can parse it)
echo '{"status": "valid", "warnings": 2}'

# Diagnostics go to stderr (agent reads for troubleshooting)
echo "Checking 42 fields..." >&2

Design checklist

| Consideration | Guidance |
| --- | --- |
| Idempotency | Agents may retry. Use “create if not exists” patterns. |
| Dry-run support | Add --dry-run for destructive or stateful operations. |
| Exit codes | Use distinct codes for different failure types. |
| Safe defaults | Require --confirm or --force for destructive operations. |
| Output size | Default to summaries. Many agent harnesses truncate beyond 10-30K characters. |
| No side effects | Don’t modify files unless explicitly asked. Return results to stdout. |
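
Several of these rows can be combined in one small script. A minimal Python sketch, assuming a hypothetical task of creating an output directory: it shows idempotency, a `--dry-run` flag, distinct exit codes, and a short JSON summary. The exit code values are illustrative only.

```python
import argparse
import json
import os

EXIT_OK = 0
EXIT_CONFLICT = 3  # path exists but is not a directory (illustrative value)


def ensure_dir(path, dry_run=False):
    """Create `path` if it does not exist; return (exit_code, summary)."""
    if os.path.isdir(path):
        # Idempotent: a retry after success is a no-op, not an error.
        return EXIT_OK, {"path": path, "action": "unchanged"}
    if os.path.exists(path):
        return EXIT_CONFLICT, {"path": path, "error": "exists and is not a directory"}
    if dry_run:
        # Dry run: describe the change without touching the filesystem.
        return EXIT_OK, {"path": path, "action": "would_create"}
    os.makedirs(path, exist_ok=True)
    return EXIT_OK, {"path": path, "action": "created"}


def main(argv):
    parser = argparse.ArgumentParser(prog="ensure_dir.py")
    parser.add_argument("path")
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args(argv)
    code, summary = ensure_dir(args.path, dry_run=args.dry_run)
    print(json.dumps(summary))  # a one-line summary keeps output small
    return code
```

A script would finish with `sys.exit(main(sys.argv[1:]))` so the exit code reaches the agent.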

The progression: skill to script

As you iterate on a skill with selftune, you’ll naturally discover which parts should become scripts:
Week 1: Skill says "validate by checking each field against the schema..."
        → Agent sometimes misses fields, inconsistent results

Week 2: selftune grade shows low process scores for validation steps
        → You extract validation into scripts/validate.sh

Week 3: Skill says "run scripts/validate.sh"
        → Validation is now fast, deterministic, and token-free

Week 4: selftune shows the agent sometimes runs validate with wrong flags
        → You add --help and better error messages to the script
selftune’s workflow analysis helps you spot these patterns by showing which steps agents repeat identically across sessions.

Further reading

Agent Skills Spec: Scripts

The full scripting section of the open standard.

Structuring Skills

How scripts fit into skill directory structure.

Managing Context

Move logic to scripts to reduce context load.

Iteration Loop

How selftune helps you identify script candidates.