Writing Effective Descriptions

Why descriptions matter

Your skill description is the single most important piece of text in your skill. It carries the entire burden of triggering: at session start, the agent loads only the name and description of each skill (~50-100 tokens). If the description doesn’t match the user’s query, the skill never activates and none of your carefully written instructions matter. The agent skills spec sets a hard limit of 1024 characters for descriptions. That is your budget for convincing every compatible agent to activate your skill at the right time.
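A quick way to keep an eye on that budget is to count characters before publishing. The following is a minimal sketch, not part of selftune or the spec: the helper name check_description is hypothetical, and it assumes the limit is counted in characters after whitespace is collapsed the way a folded YAML scalar would collapse it.

```python
MAX_DESCRIPTION_CHARS = 1024  # limit quoted above from the agent skills spec


def check_description(description: str, limit: int = MAX_DESCRIPTION_CHARS) -> tuple[bool, int]:
    """Return (fits, remaining) for a description after collapsing whitespace."""
    # Assumption: a folded scalar (description: >) joins its lines with single
    # spaces, so runs of whitespace are collapsed before counting.
    normalized = " ".join(description.split())
    remaining = limit - len(normalized)
    return remaining >= 0, remaining


example = """
  Analyze CSV and tabular data. Use when the user has a CSV, TSV,
  or Excel file and wants to explore, transform, or visualize it.
"""
fits, remaining = check_description(example)
print(f"fits={fits}, characters left={remaining}")
```

If a particular agent counts bytes rather than characters, non-ASCII text would use up the budget faster than this sketch reports.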

The anatomy of a good description

A good description has three parts:

What it does — imperative phrasing, not passive
When to use it — explicit trigger contexts
Trigger keywords — the words users actually say

# All three parts:
description: >
  Create PowerPoint presentations and slide decks from structured data,
  outlines, or descriptions. USE WHEN presentation, slides, deck, pptx,
  pitch deck, board deck, keynote export.

Use imperative phrasing

The spec recommends “Use this skill when…” not “This skill does…” — imperative phrasing gives the agent a clearer activation signal.

# Passive (weaker trigger):
description: This skill processes CSV files and generates reports.

# Imperative (stronger trigger):
description: >
  Analyze CSV and tabular data — compute statistics, add derived columns,
  generate charts, and clean messy data. Use when the user has a CSV, TSV,
  or Excel file and wants to explore, transform, or visualize it.

Focus on user intent, not implementation

Users don’t say “execute web security assessment using OWASP methodology.” They say “check if my site is vulnerable.”

# Implementation-focused (misses real queries):
description: >
  Execute comprehensive web application security assessment using
  OWASP Top 10 methodology with automated scanning and manual verification.

# Intent-focused (matches how people talk):
description: >
  Test web applications for security vulnerabilities — SQL injection, XSS,
  authentication bypass, and OWASP Top 10 issues. Use when checking if a
  site is secure, running a pentest, or auditing web app security.

Be pushy about trigger contexts

The spec says to err on the side of listing contexts explicitly, including cases where the user doesn’t name the domain directly:

description: >
  Analyze CSV and tabular data files — compute summary statistics,
  add derived columns, generate charts, and clean messy data. Use
  this skill when the user has a CSV, TSV, or Excel file and wants
  to explore, transform, or visualize the data, even if they don't
  explicitly mention "CSV" or "analysis."

That last clause — “even if they don’t explicitly mention” — helps the agent activate on contextual queries like “the Q3 numbers look off, can you take a look at this spreadsheet?”

The developer-user gap

This is the core problem selftune solves. Developers think in technical terms; users think in task terms:

Developer writes	User says
"Create PowerPoint presentations"	"make me a slide deck"
"Execute web security assessment"	"check if my site is vulnerable"
"Generate TypeScript CLI tools"	"build me a command-line thing"
"Process PDF documents"	"grab the text from this contract"

You can’t anticipate every variation. That’s why selftune’s evolution pipeline observes real queries and proposes description updates automatically.

Testing and optimization

The spec recommends designing ~20 test queries and running an optimization loop. See the full guide in Testing Skill Triggers. The short version:

1. Write the description.
2. Create 8-10 should-trigger queries and 8-10 should-not-trigger queries.
3. Test against a live agent.
4. Identify failures.
5. Revise the description: generalize, don't add specific keywords.
6. Repeat (5 iterations is typically enough).
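The "test" and "identify failures" steps can be sketched as a small harness. Everything here is illustrative: find_failures and keyword_stub are hypothetical names, and keyword_stub is only a toy stand-in for a live agent (it fires when the query shares a word of four or more letters with the description). In practice you would pass a function that asks a real agent whether the skill activated.

```python
def _words(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {w.strip(".,?!\"'").lower() for w in text.split()}


def keyword_stub(description: str, query: str) -> bool:
    """Toy stand-in for a live agent, NOT how real agents decide."""
    shared = _words(description) & _words(query)
    return any(len(w) >= 4 for w in shared)


def find_failures(description, should_trigger, should_not_trigger, triggers=keyword_stub):
    """Return (missed, false_alarms): the queries whose outcome was wrong."""
    missed = [q for q in should_trigger if not triggers(description, q)]
    false_alarms = [q for q in should_not_trigger if triggers(description, q)]
    return missed, false_alarms


description = "Analyze CSV and tabular data. Use when the user has a spreadsheet to explore."
missed, false_alarms = find_failures(
    description,
    should_trigger=["can you look at this spreadsheet?", "summarize my csv"],
    should_not_trigger=["write a haiku about autumn"],
)
print("missed:", missed)
print("false alarms:", false_alarms)
```

Keeping missed triggers and false alarms in separate lists makes the revision step easier: misses suggest the description is too narrow, while false alarms suggest it is too broad.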

Avoiding overfitting

Split your test queries into a training set (~60%) and a validation set (~40%). Revise the description against the training set only, then check it against the validation set; if you optimize against the full set, you’ll overfit to specific phrasings. Better yet, let selftune handle it: evolution generates multiple candidate descriptions and validates each against eval sets using Pareto multi-candidate selection, so no single trigger dimension is sacrificed for another.
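For the manual route, the split itself takes only a few lines. This is a sketch under stated assumptions: split_queries is a hypothetical helper, the 0.6 fraction mirrors the ~60/40 guidance above, and the fixed seed is there only so the split stays the same across iterations.

```python
import random


def split_queries(queries, train_fraction=0.6, seed=0):
    """Shuffle a copy of the queries and split it into (train, validation)."""
    shuffled = list(queries)
    # A seeded generator keeps the split stable across optimization rounds,
    # so validation queries never leak into the training set.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


queries = [f"query {i}" for i in range(20)]
train, validation = split_queries(queries)
print(len(train), "training /", len(validation), "validation")
```

One reasonable refinement is to call the helper separately on the should-trigger and should-not-trigger lists, so that both sets contain a mix of each kind of query.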

Before and after

Here’s a real example of description evolution:

# Before (40% trigger rate):
description: Process CSV files.

# After selftune evolution (92% trigger rate):
description: >
  Analyze CSV and tabular data files — compute summary statistics,
  add derived columns, generate charts, and clean messy data. Use
  this skill when the user has a CSV, TSV, or Excel file and wants
  to explore, transform, or visualize the data, even if they don't
  explicitly mention "CSV" or "analysis."

The evolved version is longer but still well under the 1024-character limit. Every additional word earns its place by covering real user language patterns.

Further reading

Agent Skills Spec

The open standard for skill descriptions and discovery.

Testing Triggers

Verify your descriptions actually work.

Evolution

Let selftune optimize descriptions automatically.

Example Skills

Study real skill descriptions from Anthropic’s skill library.
