Add comprehensive tutorials for registry, tool composition, and testing

New documentation sections:
- "Discovering Community Tools" - browsing, searching, installing, and managing tools from the registry with both GUI and CLI instructions
- "Tools Within Tools" - composing tools using the tool step type, with patterns for pipelines, forks, and conditionals
- "The Testing Sandbox" - step-by-step testing in the Visual Builder, mock providers, and debugging workflows

Updated table of contents to include new sections under appropriate categories.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 00:34:55 -04:00 · 2026-01-18 00:34:55 -04:00 · d0581daa35
parent 50ba841a62
commit d0581daa35
1 changed file with 819 additions and 0 deletions
--- a/src/cmdforge/web/docs_content.py
+++ b/src/cmdforge/web/docs_content.py
@@ -2414,6 +2414,822 @@ application that lets you create, edit, and manage tools with a visual interface
            ("next-up", "Next Up"),
        ],
    },
+
+    "registry-usage": {
+        "title": "Discovering Community Tools",
+        "description": "Find, install, and use tools created by the community",
+        "content": """
+<p class="lead">Why build everything from scratch? The CmdForge Registry is a treasure trove of
+ready-to-use tools built by developers just like you. In minutes, you can have a fully-equipped
+command line without writing a single line of YAML.</p>
+
+<div class="bg-indigo-50 border-l-4 border-indigo-500 p-4 my-6">
+    <p class="font-semibold text-indigo-800">What You'll Learn</p>
+    <ul class="mt-2 text-indigo-700">
+        <li>How to search and browse the registry</li>
+        <li>Installing tools with a single command</li>
+        <li>Managing versions and updates</li>
+        <li>Using installed tools like a pro</li>
+    </ul>
+</div>
+
+<h2 id="browsing">Browsing the Registry</h2>
+
+<p>Think of the registry as an app store for your terminal. Every tool has been reviewed and
+tested, and is ready to use. There are two ways to explore:</p>
+
+<div class="grid grid-cols-1 md:grid-cols-2 gap-4 my-6">
+    <div class="bg-white border-2 border-blue-200 rounded-lg p-4">
+        <p class="font-bold text-blue-800">Visual Builder</p>
+        <p class="text-sm text-blue-700 mt-2">Run <code>cmdforge</code> and click <strong>Registry</strong>
+        in the sidebar. Browse by category, search by keyword, and install with one click.</p>
+    </div>
+    <div class="bg-white border-2 border-green-200 rounded-lg p-4">
+        <p class="font-bold text-green-800">Command Line</p>
+        <p class="text-sm text-green-700 mt-2">Use <code>cmdforge registry search</code> for quick
+        lookups when you know what you want.</p>
+    </div>
+</div>
+
+<h3>Searching from the Command Line</h3>
+
+<pre><code class="language-bash"># Search by keyword
+cmdforge registry search "code review"
+
+# Browse a category
+cmdforge registry search --category Developer
+
+# Find popular tools
+cmdforge registry search --sort downloads
+
+# Combine filters
+cmdforge registry search "translate" --category Text --sort downloads</code></pre>
+
+<p>Each result shows you the tool name, author, description, and download count—everything you need to decide if it's right for you.</p>
+
+<h2 id="installing">Installing Tools</h2>
+
+<p>Found something you like? Installation is one command:</p>
+
+<pre><code class="language-bash"># Install a tool
+cmdforge registry install official/summarize
+
+# Install a specific version
+cmdforge registry install official/summarize@1.2.0
+
+# Install multiple tools at once
+cmdforge registry install official/summarize official/translate official/fix-grammar</code></pre>
+
+<p>That's it. The tool is now available as a command. Try it:</p>
+
+<pre><code class="language-bash"># Use your newly installed tool
+echo "The quick brown fox jumps over the lazy dog" | summarize</code></pre>
+
+<div class="bg-green-50 border-l-4 border-green-500 p-4 my-6">
+    <p class="font-semibold text-green-800">What Just Happened?</p>
+    <p class="text-green-700">CmdForge downloaded the tool config to <code>~/.cmdforge/summarize/</code>
+    and created a wrapper script in <code>~/.local/bin/</code>. The tool is now part of your system
+    just like <code>grep</code> or <code>cat</code>.</p>
+</div>
+
+<h2 id="tool-info">Inspecting Tools Before Installing</h2>
+
+<p>Not sure what a tool does? Peek inside before you commit:</p>
+
+<pre><code class="language-bash"># View tool details
+cmdforge registry info official/explain-code
+
+# See what you'll get:
+# - Description
+# - Available arguments
+# - Required providers
+# - Version history
+# - Download count</code></pre>
+
+<p>In the Visual Builder, just click on any tool to see its full details in the right panel.</p>
+
+<h2 id="using-tools">Using Installed Tools</h2>
+
+<p>Every installed tool works like a Unix command. The universal pattern:</p>
+
+<pre><code class="language-bash"># Pipe input
+cat file.txt | toolname
+
+# Pass a file directly
+toolname file.txt
+
+# Use arguments
+cat file.txt | toolname --flag value
+
+# Chain tools together
+cat article.txt | summarize | translate --lang Spanish</code></pre>
+
+<h3>Discovering Arguments</h3>
+
+<p>Every tool comes with built-in help:</p>
+
+<pre><code class="language-bash"># See what arguments a tool accepts
+summarize --help
+
+# Output:
+# Usage: summarize [OPTIONS] [INPUT]
+#
+# Summarize text using AI
+#
+# Options:
+#   --max-length TEXT  Maximum summary length in words [default: 200]
+#   --provider TEXT    Override the AI provider
+#   --help             Show this message</code></pre>
+
+<h2 id="managing">Managing Your Tools</h2>
+
+<h3>List Installed Tools</h3>
+<pre><code class="language-bash"># See everything you have installed
+cmdforge list
+
+# Filter by category
+cmdforge list --category Developer</code></pre>
+
+<h3>Update Tools</h3>
+<pre><code class="language-bash"># Check for updates
+cmdforge registry check-updates
+
+# Update a specific tool
+cmdforge registry install official/summarize@latest
+
+# Update all tools (coming soon; this command is not available yet)
+cmdforge registry update-all</code></pre>
+
+<h3>Remove Tools</h3>
+<pre><code class="language-bash"># Remove a tool you no longer need
+cmdforge delete summarize</code></pre>
+
+<h2 id="providers">A Note on Providers</h2>
+
+<p>Registry tools specify which AI provider they use. If you don't have that provider configured,
+you have two options:</p>
+
+<div class="grid grid-cols-1 md:grid-cols-2 gap-4 my-6">
+    <div class="bg-white border rounded-lg p-4">
+        <p class="font-bold text-gray-900">Option 1: Configure the Provider</p>
+        <p class="text-sm text-gray-600 mt-2">Follow our <a href="/docs/providers">Providers Guide</a>
+        to set up the required provider.</p>
+    </div>
+    <div class="bg-white border rounded-lg p-4">
+        <p class="font-bold text-gray-900">Option 2: Override at Runtime</p>
+        <p class="text-sm text-gray-600 mt-2">Use a provider you already have:
+        <code>cat file.txt | summarize --provider ollama</code></p>
+    </div>
+</div>
+
+<h2 id="starter-collection">Quick Start: The Starter Collection</h2>
+
+<p>Not sure where to begin? Install our curated starter pack:</p>
+
+<pre><code class="language-bash"># Install the essentials
+cmdforge collections install starter</code></pre>
+
+<p>This gives you:</p>
+
+<table class="w-full my-4">
+    <thead class="bg-gray-100">
+        <tr>
+            <th class="px-4 py-2 text-left">Tool</th>
+            <th class="px-4 py-2 text-left">What It Does</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>summarize</code></td>
+            <td class="px-4 py-2">Condense long text into key points</td>
+        </tr>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>translate</code></td>
+            <td class="px-4 py-2">Translate to any language</td>
+        </tr>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>fix-grammar</code></td>
+            <td class="px-4 py-2">Fix spelling and grammar issues</td>
+        </tr>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>explain-error</code></td>
+            <td class="px-4 py-2">Decode cryptic error messages</td>
+        </tr>
+        <tr>
+            <td class="px-4 py-2"><code>commit-msg</code></td>
+            <td class="px-4 py-2">Generate commit messages from diffs</td>
+        </tr>
+    </tbody>
+</table>
+
+<h2 id="next-steps">What's Next?</h2>
+
+<p>Now that you've got tools from the registry:</p>
+
+<ul>
+    <li><a href="/docs/tool-steps">Use registry tools inside your own tools</a> - Build on top of community work</li>
+    <li><a href="/docs/publishing">Publish your own tools</a> - Share what you create</li>
+    <li><a href="/forum">Join the community</a> - Get help, share ideas, show off your creations</li>
+</ul>
+""",
+        "headings": [
+            ("browsing", "Browsing the Registry"),
+            ("installing", "Installing Tools"),
+            ("tool-info", "Inspecting Tools Before Installing"),
+            ("using-tools", "Using Installed Tools"),
+            ("managing", "Managing Your Tools"),
+            ("providers", "A Note on Providers"),
+            ("starter-collection", "Quick Start: The Starter Collection"),
+            ("next-steps", "What's Next?"),
+        ],
+    },
+
+    "tool-steps": {
+        "title": "Tools Within Tools",
+        "description": "Build powerful workflows by combining existing tools",
+        "content": """
+<p class="lead">Here's a secret that changes everything: tools can call other tools. That
+<code>summarize</code> command you installed from the registry? You can use it as a building
+block inside your own creations. It's like having LEGO bricks that already do something cool.</p>
+
+<div class="bg-indigo-50 border-l-4 border-indigo-500 p-4 my-6">
+    <p class="font-semibold text-indigo-800">What You'll Learn</p>
+    <ul class="mt-2 text-indigo-700">
+        <li>The <code>tool</code> step type and when to use it</li>
+        <li>Passing data between tools</li>
+        <li>Building pipelines from existing tools</li>
+        <li>Real-world composition patterns</li>
+    </ul>
+</div>
+
+<h2 id="why-compose">Why Compose Tools?</h2>
+
+<p>Let's say you want a tool that:</p>
+<ol>
+    <li>Summarizes an article</li>
+    <li>Translates the summary to Spanish</li>
+    <li>Fixes any grammar issues</li>
+</ol>
+
+<p>You <em>could</em> write three prompt steps from scratch. Or you could stand on the shoulders
+of giants and reuse tools that already exist:</p>
+
+<pre><code class="language-yaml">name: spanish-summary
+version: "1.0.0"
+description: Summarize text and translate to Spanish
+
+steps:
+  - type: tool
+    tool: summarize
+    output_var: summary
+
+  - type: tool
+    tool: translate
+    input: "{summary}"
+    args:
+      lang: Spanish
+    output_var: translated
+
+  - type: tool
+    tool: fix-grammar
+    input: "{translated}"
+    output_var: final
+
+output: "{final}"</code></pre>
+
+<p>Just a few lines per step. Each one reuses a battle-tested tool from the registry. The prompts,
+the edge cases, the provider configs—all handled.</p>
+
+<div class="bg-green-50 border-l-4 border-green-500 p-4 my-6">
+    <p class="font-semibold text-green-800">The Power of Composition</p>
+    <p class="text-green-700">When the <code>summarize</code> tool gets updated with better prompts,
+    your tool automatically benefits. You're not copying code—you're building on it.</p>
+</div>
+
+<h2 id="tool-step-anatomy">Anatomy of a Tool Step</h2>
+
+<p>A tool step has five fields, two of them optional:</p>
+
+<pre><code class="language-yaml">- type: tool              # This is a tool step
+  tool: owner/name        # Which tool to call (or just "name" for local tools)
+  input: "{variable}"     # What to send as input (optional, defaults to {input})
+  args:                   # Arguments to pass (optional)
+    flag-name: value
+  output_var: result      # Where to store the output</code></pre>
+
+<table class="w-full my-4">
+    <thead class="bg-gray-100">
+        <tr>
+            <th class="px-4 py-2 text-left">Field</th>
+            <th class="px-4 py-2 text-left">Required</th>
+            <th class="px-4 py-2 text-left">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>type</code></td>
+            <td class="px-4 py-2">Yes</td>
+            <td class="px-4 py-2">Must be <code>tool</code></td>
+        </tr>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>tool</code></td>
+            <td class="px-4 py-2">Yes</td>
+            <td class="px-4 py-2">Tool name or <code>owner/name</code> for registry tools</td>
+        </tr>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>input</code></td>
+            <td class="px-4 py-2">No</td>
+            <td class="px-4 py-2">Input template (defaults to <code>{input}</code>)</td>
+        </tr>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>args</code></td>
+            <td class="px-4 py-2">No</td>
+            <td class="px-4 py-2">Key-value pairs for tool arguments</td>
+        </tr>
+        <tr>
+            <td class="px-4 py-2"><code>output_var</code></td>
+            <td class="px-4 py-2">Yes</td>
+            <td class="px-4 py-2">Variable name to store the result</td>
+        </tr>
+    </tbody>
+</table>
+
+<h2 id="passing-data">Passing Data Between Tools</h2>
+
+<p>The <code>input</code> field supports variable substitution, just like prompt templates:</p>
+
+<pre><code class="language-yaml">steps:
+  # Step 1: Use original input
+  - type: tool
+    tool: extract-keywords
+    input: "{input}"           # The original stdin input
+    output_var: keywords
+
+  # Step 2: Use output from step 1
+  - type: tool
+    tool: expand-topic
+    input: "{keywords}"        # Output from previous step
+    output_var: expanded
+
+  # Step 3: Combine multiple variables
+  - type: tool
+    tool: write-article
+    input: |
+      Topic: {keywords}
+
+      Research:
+      {expanded}
+    output_var: article</code></pre>
+
+<h2 id="passing-args">Passing Arguments</h2>
+
+<p>Most tools accept arguments. Pass them through the <code>args</code> field:</p>
+
+<pre><code class="language-yaml">- type: tool
+  tool: translate
+  input: "{input}"
+  args:
+    lang: French          # --lang French
+    formal: "true"        # --formal true
+  output_var: french_text</code></pre>
+
+<div class="bg-amber-50 border-l-4 border-amber-500 p-4 my-6">
+    <p class="font-semibold text-amber-800">Variable Arguments</p>
+    <p class="text-amber-700">Arguments can use variables too, so a value from a previous step or a tool argument can configure the call:</p>
+    <pre class="mt-2"><code class="language-yaml">args:
+  lang: "{target_language}"    # From a previous step or tool argument</code></pre>
+</div>
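+
+<p>As a rough sketch of that idea, a code step can compute a value that a later tool step passes
+as an argument. The input convention used here (target language on the first line, text on the
+remaining lines) and the variable names <code>target_language</code> and <code>body</code> are
+assumptions made for illustration; they are not part of the <code>translate</code> tool:</p>
+
+<pre><code class="language-yaml">steps:
+  # Hypothetical convention: line 1 names the language, the rest is the text
+  - type: code
+    code: |
+      lines = input.splitlines()
+      target_language = lines[0].strip() if lines else "English"
+      body = " ".join(lines[1:])
+    output_vars: [target_language, body]
+
+  - type: tool
+    tool: translate
+    input: "{body}"
+    args:
+      lang: "{target_language}"   # Filled in from the code step above
+    output_var: translated
+
+output: "{translated}"</code></pre>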
+
+<h2 id="local-vs-registry">Local vs Registry Tools</h2>
+
+<p>You can call both your own tools and registry tools:</p>
+
+<pre><code class="language-yaml">steps:
+  # Call a local tool (in your ~/.cmdforge/)
+  - type: tool
+    tool: my-custom-extractor
+    output_var: extracted
+
+  # Call a registry tool
+  - type: tool
+    tool: official/summarize
+    input: "{extracted}"
+    output_var: summary</code></pre>
+
+<h2 id="real-patterns">Real-World Patterns</h2>
+
+<h3>Pattern 1: The Pipeline</h3>
+<p>Chain tools in sequence, each transforming the output of the last:</p>
+
+<pre><code class="language-yaml">name: polish-writing
+description: Clean up rough drafts
+
+steps:
+  - type: tool
+    tool: fix-grammar
+    output_var: fixed
+
+  - type: tool
+    tool: simplify
+    input: "{fixed}"
+    args:
+      level: "high school"
+    output_var: simple
+
+  - type: tool
+    tool: tone-shift
+    input: "{simple}"
+    args:
+      tone: professional
+    output_var: polished
+
+output: "{polished}"</code></pre>
+
+<h3>Pattern 2: The Fork</h3>
+<p>Send the same input to multiple tools, then combine results:</p>
+
+<pre><code class="language-yaml">name: multi-perspective
+description: Get analysis from different angles
+
+steps:
+  - type: tool
+    tool: summarize
+    output_var: summary
+
+  - type: tool
+    tool: extract-keywords
+    input: "{input}"        # Same original input
+    output_var: keywords
+
+  - type: tool
+    tool: sentiment-analysis
+    input: "{input}"        # Same original input
+    output_var: sentiment
+
+  - type: prompt
+    provider: claude
+    prompt: |
+      Combine these analyses into a report:
+
+      Summary: {summary}
+      Keywords: {keywords}
+      Sentiment: {sentiment}
+    output_var: report
+
+output: "{report}"</code></pre>
+
+<h3>Pattern 3: The Conditional</h3>
+<p>Use code steps to decide which tool's output to keep:</p>
+
+<pre><code class="language-yaml">name: smart-translate
+description: Only translate if not already in English
+
+steps:
+  - type: tool
+    tool: detect-language
+    output_var: detected_lang
+
+  - type: code
+    code: |
+      needs_translation = detected_lang.strip().lower() != "english"
+    output_vars: [needs_translation]
+
+  - type: tool
+    tool: translate
+    input: "{input}"
+    args:
+      lang: English
+    output_var: translated
+    # Note: this step always runs; the code step below decides whether its output is used
+
+  - type: code
+    code: |
+      if needs_translation:
+          result = translated
+      else:
+          result = input
+    output_vars: [result]
+
+output: "{result}"</code></pre>
+
+<h2 id="provider-override">Provider Overrides</h2>
+
+<p>Need to use a different AI provider than what the tool specifies? Override it:</p>
+
+<pre><code class="language-yaml">- type: tool
+  tool: summarize
+  provider: ollama          # Use local Ollama instead of cloud
+  output_var: summary</code></pre>
+
+<h2 id="debugging">Debugging Composed Tools</h2>
+
+<p>When things go wrong, test each tool step individually:</p>
+
+<pre><code class="language-bash"># Test the first tool manually
+echo "your input" | summarize
+
+# Save step 1's output so you can see exactly what step 2 receives
+echo "your input" | summarize | tee step1.txt
+
+# Then test step 2 with that saved output
+cat step1.txt | translate --lang Spanish</code></pre>
+
+<p>Or use the <a href="/docs/testing-steps">Testing Sandbox</a> in the Visual Builder to step through each tool interactively.</p>
+
+<h2 id="next">Next Steps</h2>
+
+<ul>
+    <li><a href="/docs/testing-steps">Test your tool steps</a> - Debug before you deploy</li>
+    <li><a href="/docs/multi-step">Mix with prompt and code steps</a> - The full power of multi-step tools</li>
+    <li><a href="/docs/publishing">Publish your composed tools</a> - Share your creations</li>
+</ul>
+""",
+        "headings": [
+            ("why-compose", "Why Compose Tools?"),
+            ("tool-step-anatomy", "Anatomy of a Tool Step"),
+            ("passing-data", "Passing Data Between Tools"),
+            ("passing-args", "Passing Arguments"),
+            ("local-vs-registry", "Local vs Registry Tools"),
+            ("real-patterns", "Real-World Patterns"),
+            ("provider-override", "Provider Overrides"),
+            ("debugging", "Debugging Composed Tools"),
+            ("next", "Next Steps"),
+        ],
+    },
+
+    "testing-steps": {
+        "title": "The Testing Sandbox",
+        "description": "Test your tool steps before deploying to the real world",
+        "content": """
+<p class="lead">There's nothing worse than deploying a tool and watching it fail on real data.
+The Testing Sandbox lets you run individual steps, inspect variables, and squash bugs without
+waiting for full pipeline runs and, with a mock provider, without burning through API credits.</p>
+
+<div class="bg-indigo-50 border-l-4 border-indigo-500 p-4 my-6">
+    <p class="font-semibold text-indigo-800">What You'll Learn</p>
+    <ul class="mt-2 text-indigo-700">
+        <li>How to test individual steps in isolation</li>
+        <li>Setting up test inputs and variables</li>
+        <li>Understanding test output and timing</li>
+        <li>Testing without API calls using mock providers</li>
+    </ul>
+</div>
+
+<h2 id="why-test">Why Test Steps Individually?</h2>
+
+<p>Imagine a 5-step tool that's failing. Is it step 1? Step 4? The output template? Running
+the whole thing over and over is slow and expensive. Step-by-step testing lets you:</p>
+
+<div class="grid grid-cols-1 md:grid-cols-3 gap-4 my-6">
+    <div class="bg-white border rounded-lg p-4 text-center">
+        <div class="text-3xl mb-2">🎯</div>
+        <p class="font-bold text-gray-900">Isolate Problems</p>
+        <p class="text-sm text-gray-600">Test one step at a time</p>
+    </div>
+    <div class="bg-white border rounded-lg p-4 text-center">
+        <div class="text-3xl mb-2">💰</div>
+        <p class="font-bold text-gray-900">Save Money</p>
+        <p class="text-sm text-gray-600">Only call APIs when ready</p>
+    </div>
+    <div class="bg-white border rounded-lg p-4 text-center">
+        <div class="text-3xl mb-2">⚡</div>
+        <p class="font-bold text-gray-900">Iterate Fast</p>
+        <p class="text-sm text-gray-600">Quick feedback loops</p>
+    </div>
+</div>
+
+<h2 id="opening-sandbox">Opening the Testing Sandbox</h2>
+
+<p>The Testing Sandbox is built into the Visual Builder. Here's how to access it:</p>
+
+<ol>
+    <li>Open the Visual Builder: <code>cmdforge</code></li>
+    <li>Navigate to <strong>My Tools</strong> and select a tool (or create a new one)</li>
+    <li>Click <strong>Edit</strong> to open the tool builder</li>
+    <li>Find the step you want to test</li>
+    <li>Click the <strong>Test</strong> button on that step</li>
+</ol>
+
+<p>The Testing Sandbox dialog opens with everything you need:</p>
+
+<div class="bg-gray-50 border rounded-lg p-4 my-6">
+    <div class="grid grid-cols-2 gap-4">
+        <div>
+            <p class="font-bold text-gray-800">Input Section</p>
+            <ul class="text-sm text-gray-600 mt-2">
+                <li>Test input text area</li>
+                <li>Variable table for setting values</li>
+                <li>Provider override dropdown</li>
+            </ul>
+        </div>
+        <div>
+            <p class="font-bold text-gray-800">Output Section</p>
+            <ul class="text-sm text-gray-600 mt-2">
+                <li>Step output display</li>
+                <li>Output variables table</li>
+                <li>Execution time</li>
+                <li>Success/error status</li>
+            </ul>
+        </div>
+    </div>
+</div>
+
+<h2 id="testing-prompt">Testing a Prompt Step</h2>
+
+<p>Prompt steps call your AI provider. Here's how to test one:</p>
+
+<h3>1. Set Your Test Input</h3>
+<p>In the input text area, enter the text you want to process:</p>
+
+<pre><code class="language-text">The quick brown fox jumps over the lazy dog.
+This is a sample sentence for testing purposes.</code></pre>
+
+<h3>2. Set Variables</h3>
+<p>If your prompt uses variables like <code>{language}</code> or <code>{max_length}</code>,
+fill them in the variables table:</p>
+
+<table class="w-full my-4">
+    <thead class="bg-gray-100">
+        <tr>
+            <th class="px-4 py-2 text-left">Variable</th>
+            <th class="px-4 py-2 text-left">Value</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>input</code></td>
+            <td class="px-4 py-2">(auto-filled from test input)</td>
+        </tr>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>language</code></td>
+            <td class="px-4 py-2">Spanish</td>
+        </tr>
+        <tr>
+            <td class="px-4 py-2"><code>max_length</code></td>
+            <td class="px-4 py-2">100</td>
+        </tr>
+    </tbody>
+</table>
+
+<h3>3. Choose a Provider</h3>
+<p>Select which provider to use for this test. You can:</p>
+<ul>
+    <li>Use the step's default provider</li>
+    <li>Override with a different provider</li>
+    <li>Use <code>mock</code> for testing without API calls</li>
+</ul>
+
+<h3>4. Run the Test</h3>
+<p>Click <strong>Run Test</strong> and watch the magic happen. You'll see:</p>
+
+<ul>
+    <li><strong>Output</strong> - What the AI returned</li>
+    <li><strong>Output Variable</strong> - The value stored in <code>output_var</code></li>
+    <li><strong>Timing</strong> - How long the call took (in milliseconds)</li>
+    <li><strong>Status</strong> - Success or error with details</li>
+</ul>
+
+<h2 id="testing-code">Testing a Code Step</h2>
+
+<p>Code steps run Python. The sandbox lets you verify your logic:</p>
+
+<h3>Set Up Variables</h3>
+<p>Code steps often depend on outputs from previous steps. Enter those values manually:</p>
+
+<table class="w-full my-4">
+    <thead class="bg-gray-100">
+        <tr>
+            <th class="px-4 py-2 text-left">Variable</th>
+            <th class="px-4 py-2 text-left">Value</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr class="border-b">
+            <td class="px-4 py-2"><code>emails_raw</code></td>
+            <td class="px-4 py-2">john@example.com, jane@example.com, invalid-email</td>
+        </tr>
+        <tr>
+            <td class="px-4 py-2"><code>count</code></td>
+            <td class="px-4 py-2">3</td>
+        </tr>
+    </tbody>
+</table>
+
+<h3>View All Output Variables</h3>
+<p>After running, you'll see every variable your code step creates:</p>
+
+<pre><code class="language-text">emails = ['john@example.com', 'jane@example.com']
+unique_count = 2
+is_valid = True</code></pre>
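+
+<p>For reference, here is one hypothetical code step that would produce those values from the
+<code>emails_raw</code> input above. The filtering rule and the meaning of <code>is_valid</code>
+(at least one usable address was found) are assumptions for this sketch; your own step will
+differ:</p>
+
+<pre><code class="language-yaml">- type: code
+  code: |
+    # Split the raw text and keep entries that look like an address
+    candidates = [e.strip() for e in emails_raw.split(",")]
+    emails = [e for e in candidates if "@" in e and "." in e.split("@")[-1]]
+    unique_count = len(set(emails))
+    is_valid = unique_count != 0   # Assumed meaning: something usable was found
+  output_vars: [emails, unique_count, is_valid]</code></pre>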
+
+<div class="bg-amber-50 border-l-4 border-amber-500 p-4 my-6">
+    <p class="font-semibold text-amber-800">Debugging Tip</p>
+    <p class="text-amber-700">If your code step isn't working, add temporary <code>print()</code>
+    statements. The output will appear in the error/output section, helping you trace what's happening.</p>
+</div>
+
+<h2 id="testing-tool">Testing a Tool Step</h2>
+
+<p>Tool steps call other tools. Testing them works the same way:</p>
+
+<ol>
+    <li>Set the input that will be sent to the tool</li>
+    <li>Configure any arguments the tool needs</li>
+    <li>Optionally override the provider</li>
+    <li>Run and verify the output</li>
+</ol>
+
+<p>This is incredibly useful for debugging composed tools—you can verify each tool in the chain
+is receiving and producing the right data.</p>
+
+<h2 id="mock-testing">Testing Without API Calls</h2>
+
+<p>Don't want to burn through API credits while iterating? Use the mock provider:</p>
+
+<h3>Option 1: Override in the Sandbox</h3>
+<p>Select <code>mock</code> from the provider dropdown before running your test.</p>
+
+<h3>Option 2: Create a Smart Mock</h3>
+<p>Add a mock provider that returns realistic test data:</p>
+
+<pre><code class="language-yaml"># In ~/.cmdforge/providers.yaml
+providers:
+  - name: mock
+    command: 'echo "This is a mock response for testing"'
+
+  - name: mock-json
+    command: 'echo "{\\"status\\": \\"ok\\", \\"count\\": 42}"'
+
+  - name: mock-summary
+    command: 'echo "This is a summary of the input text."'</code></pre>
+
+<div class="bg-green-50 border-l-4 border-green-500 p-4 my-6">
+    <p class="font-semibold text-green-800">Pro Tip: Echo the Input</p>
+    <p class="text-green-700">For testing variable flow, create a mock that echoes its input:</p>
+    <pre class="mt-2"><code class="language-yaml">- name: echo-mock
+  command: 'cat'  # Just returns whatever is piped in</code></pre>
+</div>
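+
+<p>A minimal sketch of how that mock might be used, assuming the provider command receives the
+fully substituted prompt on standard input: point a prompt step at <code>echo-mock</code> and
+the step's output should be the prompt text itself, which makes missing or misspelled variables
+easy to spot. The names <code>language</code> and <code>rendered</code> are illustrative:</p>
+
+<pre><code class="language-yaml">- type: prompt
+  provider: echo-mock        # The echo mock defined above
+  prompt: |
+    Translate the following text to {language}:
+
+    {input}
+  output_var: rendered       # Should hold the prompt as the provider saw it</code></pre>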
+
+<h2 id="common-issues">Common Testing Scenarios</h2>
+
+<h3>Scenario: "My variable is empty"</h3>
+<p>Check these in order:</p>
+<ol>
+    <li>Is the variable name spelled correctly? (The name in <code>output_var</code> must match the name you reference in braces.)</li>
+    <li>Did the previous step actually set it? (Test that step first)</li>
+    <li>Are you using the right syntax? (<code>{varname}</code> not <code>$varname</code>)</li>
+</ol>
+
+<h3>Scenario: "Provider call failed"</h3>
+<p>Verify:</p>
+<ol>
+    <li>Is the provider configured in <code>~/.cmdforge/providers.yaml</code>?</li>
+    <li>Does the provider CLI work on its own? Try: <code>echo "test" | claude -p</code></li>
+    <li>Are API keys/auth set up correctly?</li>
+</ol>
+
+<h3>Scenario: "Code step error"</h3>
+<p>The error message shows the Python exception. Common causes:</p>
+<ul>
+    <li><strong>NameError</strong> - Variable from previous step not available (test that step)</li>
+    <li><strong>SyntaxError</strong> - Check your Python indentation and syntax</li>
+    <li><strong>TypeError</strong> - You're treating a string as a list, or vice versa</li>
+</ul>
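+
+<p>If the <strong>TypeError</strong> comes from a value that arrived as text rather than as a list,
+converting it explicitly usually resolves it. The sketch below assumes <code>keywords</code> holds
+a comma-separated string from an earlier step; the names <code>keyword_list</code> and
+<code>keyword_count</code> are illustrative:</p>
+
+<pre><code class="language-yaml">- type: code
+  code: |
+    # Assumed input: keywords is one string such as "python, testing, yaml"
+    keyword_list = [k.strip() for k in keywords.split(",") if k.strip()]
+    keyword_count = len(keyword_list)
+  output_vars: [keyword_list, keyword_count]</code></pre>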
+
+<h2 id="workflow">The Testing Workflow</h2>
+
+<p>Here's how the pros do it:</p>
+
+<div class="bg-white border rounded-lg p-4 my-6">
+    <ol class="space-y-3">
+        <li><strong>1. Test Step 1 with mock provider</strong> - Verify input handling</li>
+        <li><strong>2. Test Step 1 with real provider</strong> - Confirm AI output format</li>
+        <li><strong>3. Copy Step 1's output, paste as Step 2's input</strong> - Test the handoff</li>
+        <li><strong>4. Repeat for each step</strong> - Build confidence layer by layer</li>
+        <li><strong>5. Run the full tool</strong> - Everything should work!</li>
+    </ol>
+</div>
+
+<h2 id="next">Next Steps</h2>
+
+<ul>
+    <li><a href="/docs/tool-steps">Compose tools together</a> - Build powerful pipelines</li>
+    <li><a href="/docs/code-steps">Master code steps</a> - Add Python processing</li>
+    <li><a href="/docs/publishing">Publish your tested tool</a> - Share with confidence</li>
+</ul>
+""",
+        "headings": [
+            ("why-test", "Why Test Steps Individually?"),
+            ("opening-sandbox", "Opening the Testing Sandbox"),
+            ("testing-prompt", "Testing a Prompt Step"),
+            ("testing-code", "Testing a Code Step"),
+            ("testing-tool", "Testing a Tool Step"),
+            ("mock-testing", "Testing Without API Calls"),
+            ("common-issues", "Common Testing Scenarios"),
+            ("workflow", "The Testing Workflow"),
+            ("next", "Next Steps"),
+        ],
+    },
 }


@@ -2434,10 +3250,13 @@ def get_toc():
            SimpleNamespace(slug="visual-builder", title="Visual Builder"),
            SimpleNamespace(slug="yaml-config", title="YAML Config"),
        ]),
+        SimpleNamespace(slug="registry-usage", title="Using the Registry", children=[]),
        SimpleNamespace(slug="arguments", title="Custom Arguments", children=[]),
        SimpleNamespace(slug="multi-step", title="Multi-Step Workflows", children=[
            SimpleNamespace(slug="code-steps", title="Code Steps"),
+            SimpleNamespace(slug="tool-steps", title="Tools Within Tools"),
        ]),
+        SimpleNamespace(slug="testing-steps", title="Testing Sandbox", children=[]),
        SimpleNamespace(slug="providers", title="Providers", children=[]),
        SimpleNamespace(slug="publishing", title="Publishing", children=[]),
        SimpleNamespace(slug="advanced-workflows", title="Advanced Workflows", children=[