Using Scripts in Skills

When to use scripts

Skills are markdown — they tell the agent what to do using natural language. Scripts are code — they do things deterministically. The question is: which parts of your skill belong in each? Use markdown for judgment: choosing an approach, interpreting user intent, adapting to context. Use scripts for mechanics: validation, data transformation, formatting, API calls with fixed parameters. A useful heuristic from the agent skills spec: if the agent reinvents the same logic every run, bundle it as a tested script.

Two approaches

Reference existing packages

When an existing package does what you need, reference it directly in your SKILL.md without bundling scripts:

## Validate the configuration

Run validation using the published schema:

\`\`\`bash
npx [email protected] validate -s schema.json -d config.json
\`\`\`

Common runners for one-off commands:

Runner	Ecosystem	Example
`npx` / `bunx`	Node.js	`npx [email protected] .`
`uvx` / `pipx`	Python	`uvx ruff check .`
`deno run`	Deno	`deno run jsr:@std/csv`
`go run`	Go	`go run golang.org/x/tools/cmd/stringer@latest`

Pin versions (e.g., npx [email protected]) so the skill behaves consistently across environments. State prerequisites in the compatibility frontmatter field.

Bundle scripts in the skill

When you need custom logic, put scripts in a scripts/ directory:

my-skill/
├── SKILL.md
├── scripts/
│   ├── validate.sh
│   ├── transform.py
│   └── generate.ts
└── references/
    └── schema.json

Reference them from SKILL.md using relative paths:

## Workflow

Validate input: `bash scripts/validate.sh "$INPUT_FILE"`
Transform data: `python3 scripts/transform.py --input data.json`
Generate output: `bun scripts/generate.ts --format pdf`

Self-contained scripts

The best scripts declare their own dependencies inline, so a runner such as `uv run` can resolve them at launch and no separate install step is needed:

Python (PEP 723)

# /// script
# dependencies = [
#   "beautifulsoup4>=4.12",
#   "httpx>=0.27",
# ]
# ///

import sys

import httpx
from bs4 import BeautifulSoup

# Fetch the URL passed as the first argument and print its visible text.
response = httpx.get(sys.argv[1])
soup = BeautifulSoup(response.text, "html.parser")
print(soup.get_text())

Run with: uv run scripts/extract.py https://example.com

TypeScript (Bun)

#!/usr/bin/env bun

import { parseArgs } from "util";

const { values } = parseArgs({
  args: Bun.argv.slice(2),
  options: {
    input: { type: "string" },
    format: { type: "string", default: "json" },
  },
});

// Script logic here

Run with: bun scripts/generate.ts --input data.json

Designing scripts for agents

Agents run scripts in non-interactive shells. The agent skills spec has specific design requirements:

No interactive prompts (hard requirement)

A script that waits on a TTY prompt will hang indefinitely, because nobody is there to answer it. Accept all input via flags, environment variables, or stdin:

# BAD: Will hang the agent
read -p "Enter filename: " filename

# GOOD: Accept via positional argument; fails with a usage message if it is missing
filename="${1:?Usage: validate.sh <filename>}"
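The same rule carries over to Python scripts. A minimal sketch, assuming a hypothetical `resolve_filename` helper and a hypothetical `INPUT_FILE` environment variable (neither comes from the spec): it reads the filename from the first argument, falls back to the environment, and exits with a usage message instead of prompting.

```python
def resolve_filename(argv, env):
    """Return the input filename from argv or env, never from a prompt.

    INPUT_FILE is a hypothetical variable name chosen for this sketch.
    """
    if len(argv) > 1 and argv[1]:
        return argv[1]
    if env.get("INPUT_FILE"):
        return env["INPUT_FILE"]
    # Fail immediately instead of calling input(), which would block the agent.
    raise SystemExit("Usage: validate.py <filename>  (or set INPUT_FILE)")


# Example call with explicit arguments.
print(resolve_filename(["validate.py", "model.json"], {}))
```

In a real script the call would be `resolve_filename(sys.argv, os.environ)`; passing both in explicitly keeps the function easy to test.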

Document with `--help`

This is how the agent learns your script’s interface:

$ scripts/validate.sh --help
Usage: validate.sh [OPTIONS] <input-file>

Validates a financial model JSON file against the schema.

Options:
  --strict    Fail on warnings (default: warnings only)
  --fix       Auto-fix common issues
  --format    Output format: json, text (default: text)

Examples:
  validate.sh model.json
  validate.sh --strict --format json model.json
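For a Python script, `argparse` can generate this kind of help text from the argument definitions, which keeps the documentation and the parser in sync. The sketch below assumes a hypothetical `validate.py` with the same options as the shell example. The generated layout will differ slightly from the hand-written text above.

```python
import argparse


def build_parser():
    """Build a parser whose --help output covers the interface shown above."""
    parser = argparse.ArgumentParser(
        prog="validate.py",  # hypothetical script name
        description="Validates a financial model JSON file against the schema.",
        epilog=(
            "Examples:\n"
            "  validate.py model.json\n"
            "  validate.py --strict --format json model.json"
        ),
        # Keep the line breaks in the epilog instead of re-wrapping them.
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("input_file", metavar="<input-file>")
    parser.add_argument("--strict", action="store_true",
                        help="Fail on warnings (default: warnings only)")
    parser.add_argument("--fix", action="store_true",
                        help="Auto-fix common issues")
    parser.add_argument("--format", choices=["json", "text"], default="text",
                        help="Output format (default: text)")
    return parser


# Parse an explicit argument list; a real script would call parse_args() with no arguments.
args = build_parser().parse_args(["--strict", "--format", "json", "model.json"])
print(args.strict, args.format, args.input_file)
```

Because `--format` is declared with `choices`, an unsupported value is rejected by the parser with a message that lists the valid options.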

Actionable error messages

Error messages should tell the agent exactly what to do next:

Error: --format must be one of: json, csv, table.
       Received: "xml"

Error: Missing required field "revenue" in assumptions section.
       Fix: Add "revenue" to the assumptions object in model.json
       Hint: Run `scripts/scaffold.sh --template dcf` for a complete template
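One way to keep messages in this shape consistent across a script is a small formatting helper. This is a sketch only: `format_error` and its parameters are hypothetical names, not part of the spec.

```python
import sys


def format_error(problem, received=None, fix=None, hint=None):
    """Render an error in the layout above: what failed, then what to do next."""
    lines = [f"Error: {problem}"]
    indent = " " * len("Error: ")  # align follow-up lines under the problem text
    if received is not None:
        lines.append(f'{indent}Received: "{received}"')
    if fix:
        lines.append(f"{indent}Fix: {fix}")
    if hint:
        lines.append(f"{indent}Hint: {hint}")
    return "\n".join(lines)


message = format_error(
    "--format must be one of: json, csv, table.",
    received="xml",
)
print(message, file=sys.stderr)  # errors are diagnostics, so they go to stderr
```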

Structured output

Prefer JSON or CSV over free-form text. Separate data (stdout) from diagnostics (stderr):

# Data goes to stdout (agent can parse it)
echo '{"status": "valid", "warnings": 2}'

# Diagnostics go to stderr (agent reads for troubleshooting)
echo "Checking 42 fields..." >&2
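The same separation in Python might look like the following sketch, where `render` is a hypothetical helper that builds the two streams separately so they cannot mix.

```python
import json
import sys


def render(result, diagnostics):
    """Return (stdout_text, stderr_text) for a result and its progress notes."""
    stdout_text = json.dumps(result, sort_keys=True) + "\n"
    stderr_text = "".join(f"{line}\n" for line in diagnostics)
    return stdout_text, stderr_text


data, notes = render({"status": "valid", "warnings": 2}, ["Checking 42 fields..."])
sys.stderr.write(notes)  # diagnostics: for troubleshooting only
sys.stdout.write(data)   # data: the only thing a caller should parse
```

Emitting exactly one JSON document on stdout means the caller can parse the output without filtering out log lines first.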

Design checklist

Consideration	Guidance
Idempotency	Agents may retry. Use “create if not exists” patterns.
Dry-run support	Add `--dry-run` for destructive or stateful operations.
Exit codes	Use distinct codes for different failure types.
Safe defaults	Require `--confirm` or `--force` for destructive operations.
Output size	Default to summaries. Many agent harnesses truncate beyond 10-30K characters.
No side effects	Don’t modify files unless explicitly asked. Return results to stdout.
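Several of these points can be combined in one small function. The sketch below is illustrative only: `ensure_dir` and the exit code values are hypothetical, chosen to show an idempotent "create if not exists" operation with `--dry-run` semantics, distinct exit codes, and a one-line summary as output.

```python
import os
import tempfile

# Hypothetical exit codes for this sketch; the checklist only asks that they be distinct.
EXIT_OK = 0
EXIT_INVALID_INPUT = 2
EXIT_IO_ERROR = 3


def ensure_dir(path, dry_run=False):
    """Create `path` if it does not exist; return (exit_code, summary)."""
    if not path:
        return EXIT_INVALID_INPUT, "Error: path must not be empty."
    if os.path.isdir(path):
        return EXIT_OK, f"unchanged: {path} already exists"
    if dry_run:
        # Report what would happen without touching the filesystem.
        return EXIT_OK, f"dry-run: would create {path}"
    try:
        os.makedirs(path, exist_ok=True)  # exist_ok makes retries safe
    except OSError as exc:
        return EXIT_IO_ERROR, f"Error: could not create {path}: {exc}"
    return EXIT_OK, f"created: {path}"


target = os.path.join(tempfile.mkdtemp(), "out")
print(ensure_dir(target, dry_run=True))
print(ensure_dir(target))
print(ensure_dir(target))  # a retry reports "unchanged" instead of failing
```

A command-line wrapper would pass the returned code to `sys.exit` and print the summary, so an agent can branch on the exit status without parsing text.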

The progression: skill to script

As you iterate on a skill with selftune, you’ll naturally discover which parts should become scripts:

Week 1: Skill says "validate by checking each field against the schema..."
        → Agent sometimes misses fields, inconsistent results

Week 2: selftune grade shows low process scores for validation steps
        → You extract validation into scripts/validate.sh

Week 3: Skill says "run scripts/validate.sh"
        → Validation is now fast, deterministic, and token-free

Week 4: selftune shows the agent sometimes runs validate with wrong flags
        → You add --help and better error messages to the script

selftune’s workflow analysis helps you spot these patterns by showing which steps agents repeat identically across sessions.

Further reading

Agent Skills Spec: Scripts

The full scripting section of the open standard.

Structuring Skills

How scripts fit into skill directory structure.

Managing Context

Move logic to scripts to reduce context load.

Iteration Loop

How selftune helps you identify script candidates.
