Add voice/conversational loop reusing CmdForge tools

- driver.py (woodshop-talk): the conversational loop. Reuses dictate (STT),
  pa-load-tools (schemas), claude -p (interpret), pa-execute-tool (dispatch),
  read-aloud (TTS). Resolves $N symbols so multi-op utterances can reference
  boards placed earlier in the same sentence; tolerates fenced/garbage output.
- wood-* CmdForge tools generator (scripts/gen_wood_tools.py): place/join/sand/
  delete/undo wrappers over the woodshop CLI; arg descriptions double as the
  LLM's command documentation.
- UX/realism fixes: lenient anchor parsing (end/start/far/near), and joins now
  stack board B on A's face in Z instead of interpenetrating centerlines.
- Tests: 25 passing (added anchor, Z-stack, and driver symbol-resolution tests).
- CLAUDE.md: architecture, entry points, setup, known limitations.

Verified end-to-end (typed): the canonical sentence produces the correct 4-op
scene; follow-up commands on a non-empty scene resolve ids correctly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rob 2026-05-29 01:28:36 -03:00
parent a688623caf
commit fa03ee71d3
8 changed files with 459 additions and 6 deletions


@@ -4,7 +4,74 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
## Project Overview
**WoodShop** - Voice-driven conversational 3D woodworking & furniture modeler
**WoodShop** - Voice-driven conversational 3D woodworking & furniture modeler.
Speak (or type) commands like *"place a 6 foot 2x4, sand it, attach a 2 foot 2x4
at 90 degrees 10 inches from the end"* and watch the model build in a live 3D
viewport — Holodeck-style.
## Architecture
**Design principle:** reuse existing CmdForge tools for everything that isn't
woodshop-specific; don't reinvent voice/AI plumbing.
```
woodshop-talk (driver.py) ── the conversational loop
│ dictate ............... speech→text (CmdForge tool, reused)
│ pa-load-tools ......... wood-* → Claude schemas (reused)
│ claude -p ............. interpret utterance → JSON tool calls (reused provider)
│ pa-execute-tool ....... dispatch each wood-* tool (reused)
│ read-aloud ........... speak confirmation (reused)
scene.json ← single source of truth (parts, joints, selection, undo stack)
▲ │ writes
│ reads/mutates ▼
wood-* CmdForge tools woodshop-view (viewer.py)
(place/join/sand/delete/undo) watches scene.json → live pyvista 3D
thin wrappers over `woodshop` CLI
```
Only woodshop-specific code lives in this repo: the scene model
(`scene.py`), nominal→actual lumber table (`lumber.py`), length parsing
(`units.py`), the `woodshop` CLI (`cli.py`), build123d geometry + STL/STEP
export (`geometry.py`), the pyvista viewport (`viewer.py`), and the driver
(`driver.py`). The driver uses Claude (not `pa-tool-loop`, which hard-wires a
small local model) for reliable structured tool-calling.
### Entry points
| Command | Purpose |
|---------|---------|
| `woodshop <op>` | CLI: `place`, `join`, `sand`, `delete`, `undo`, `export`, `status` |
| `woodshop-view` | Live 3D viewport (watches `scene.json`) |
| `woodshop-talk` | Conversational driver (`--voice` for mic, `--once "..."` for one command) |
Scene file location: `$WOODSHOP_SCENE` or `~/.local/share/woodshop/scene.json`.
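
A minimal sketch of that lookup, assuming the environment variable simply overrides the default path. `scene_path` is a hypothetical helper name used for illustration; `scene.py` may resolve the location differently.

```python
import os
from pathlib import Path

# Default location stated above; the helper name is illustrative only.
DEFAULT_SCENE = Path("~/.local/share/woodshop/scene.json")


def scene_path(env=None) -> Path:
    """Return $WOODSHOP_SCENE if set and non-empty, else the default path."""
    env = os.environ if env is None else env
    override = env.get("WOODSHOP_SCENE")
    return Path(override).expanduser() if override else DEFAULT_SCENE.expanduser()


if __name__ == "__main__":
    print(scene_path())
```

Passing `env` explicitly keeps the helper easy to exercise in tests without mutating the real environment.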
### CmdForge tools (the documented command vocabulary)
`wood-place`, `wood-join`, `wood-sand`, `wood-delete`, `wood-undo` live in
`~/.cmdforge/<name>/` and wrap the `woodshop` CLI. Regenerate them with
`scripts/gen_wood_tools.py` (kept in the repo) if their schemas change. The
arg descriptions ARE the LLM's documentation, so keep them clear.
### Setup
```bash
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[viewer,dev]" # viewer extra pulls build123d + pyvista
pytest # 25 tests
```
### Known limitations / next steps
1. **No vertical orientation.** Boards only rotate in the horizontal (XY) plane
(`rotation_deg` about Z). Furniture legs that "stand up" (length along Z)
aren't representable yet — this is the top priority for real furniture.
2. **Joins stack in Z** (board B rests on A's top face). This avoids
interpenetration but isn't true joinery (no butt/mortise/lap geometry).
3. **Latency** ~7-13s per utterance (one `claude -p` call). Fine for now.
4. Voice path (`--voice`) reuses `dictate`; not yet exercised on real hardware
in this repo's tests.
## ⚠️ CRITICAL: Updating Todos, Milestones, and Goals


@@ -13,6 +13,7 @@ dependencies = []
[project.scripts]
woodshop = "woodshop.cli:main"
woodshop-view = "woodshop.viewer:main"
woodshop-talk = "woodshop.driver:main"
[project.optional-dependencies]
# Heavy 3D stack (OpenCASCADE etc.) — only needed to run the live viewport.

scripts/gen_wood_tools.py (new file, 126 lines)

@@ -0,0 +1,126 @@
"""Generate the wood-* CmdForge tools: the documented woodworking command
vocabulary. Each is a thin wrapper over the `woodshop` CLI so the logic lives in
one place; pa-load-tools turns these into Claude function schemas."""
import os
import stat
from pathlib import Path
import yaml
CMDFORGE_PY = "/home/rob/.local/share/pipx/venvs/cmdforge/bin/python"
CMDFORGE_DIR = Path.home() / ".cmdforge"
BIN_DIR = Path.home() / ".local" / "bin"
WS = 'ws = os.path.expanduser("~/PycharmProjects/woodshop/.venv/bin/woodshop")'
PLACE = f'''import subprocess, os
{WS}
r = subprocess.run([ws, "place", stock, length], capture_output=True, text=True)
out = (r.stdout + r.stderr).strip()
'''
JOIN = f'''import subprocess, os
{WS}
cmd = [ws, "join", part_b]
if to: cmd += ["--to", to]
if angle: cmd += ["--angle", str(angle)]
if offset: cmd += ["--offset", offset]
if anchor: cmd += ["--anchor", anchor]
r = subprocess.run(cmd, capture_output=True, text=True)
out = (r.stdout + r.stderr).strip()
'''
SAND = f'''import subprocess, os
{WS}
cmd = [ws, "sand"] + ([part] if part else [])
r = subprocess.run(cmd, capture_output=True, text=True)
out = (r.stdout + r.stderr).strip()
'''
DELETE = f'''import subprocess, os
{WS}
cmd = [ws, "delete"] + ([part] if part else [])
r = subprocess.run(cmd, capture_output=True, text=True)
out = (r.stdout + r.stderr).strip()
'''
UNDO = f'''import subprocess, os
{WS}
r = subprocess.run([ws, "undo"], capture_output=True, text=True)
out = (r.stdout + r.stderr).strip()
'''
TOOLS = {
"wood-place": {
"description": "Place a new board of dimensional lumber into the scene. Use for any 'place', 'add', 'put', 'grab', 'cut me a' board command.",
"arguments": [
{"flag": "--stock", "variable": "stock",
"description": "Nominal lumber size, e.g. 2x4, 2x6, 1x4, 4x4"},
{"flag": "--length", "variable": "length",
"description": "Length with units, e.g. '6 ft', '72 in', '3 ft 6 in'"},
],
"code": PLACE,
},
"wood-join": {
"description": "Attach/join one board to another at an angle, optionally offset along the target board. Use for 'attach', 'join', 'connect', 'fasten', 'screw to'.",
"arguments": [
{"flag": "--part-b", "variable": "part_b",
"description": "Id of the board being attached, e.g. p2"},
{"flag": "--to", "variable": "to", "default": "",
"description": "Id of the board to attach to, e.g. p1. Omit to use the most recently touched board."},
{"flag": "--angle", "variable": "angle", "default": "90",
"description": "Angle in degrees between the two boards (default 90)"},
{"flag": "--offset", "variable": "offset", "default": "",
"description": "Distance from the anchor end, e.g. '10 in'. Omit to attach at the very end."},
{"flag": "--anchor", "variable": "anchor", "default": "end_b",
"description": "Measure offset from 'end_a' (start) or 'end_b' (far end)"},
],
"code": JOIN,
},
"wood-sand": {
"description": "Sand a board smooth. Use for 'sand', 'smooth', 'finish'.",
"arguments": [
{"flag": "--part", "variable": "part", "default": "",
"description": "Id of the board to sand, e.g. p1. Omit to sand the most recently touched board ('it')."},
],
"code": SAND,
},
"wood-delete": {
"description": "Remove a board from the scene. Use for 'delete', 'remove', 'get rid of', 'scrap'.",
"arguments": [
{"flag": "--part", "variable": "part", "default": "",
"description": "Id of the board to delete, e.g. p2. Omit for the most recently touched board."},
],
"code": DELETE,
},
"wood-undo": {
"description": "Undo the last operation. Use for 'undo', 'never mind', 'take that back', 'go back'.",
"arguments": [],
"code": UNDO,
},
}
WRAPPER = '''#!/bin/bash
# CmdForge wrapper for '{name}'
# Auto-generated - do not edit
exec "{py}" -m cmdforge.runner "{name}" "$@"
'''
for name, spec in TOOLS.items():
tool_dir = CMDFORGE_DIR / name
tool_dir.mkdir(parents=True, exist_ok=True)
config = {
"name": name,
"description": spec["description"],
"category": "Other",
"version": "0.1.0",
"arguments": spec["arguments"],
"steps": [{"type": "code", "code": spec["code"], "output_var": "out"}],
"output": "{out}",
}
(tool_dir / "config.yaml").write_text(yaml.safe_dump(config, sort_keys=False))
wrapper = BIN_DIR / name
wrapper.write_text(WRAPPER.format(name=name, py=CMDFORGE_PY))
wrapper.chmod(wrapper.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
print(f"created {name}: {tool_dir/'config.yaml'} + {wrapper}")
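
The `JOIN` template above appends a flag only when its argument is non-empty, so omitted arguments fall back to the CLI's own defaults. A standalone sketch of that argv construction follows; `build_join_cmd` is a hypothetical helper written for illustration, and `ws` stands in for the path to the `woodshop` executable.

```python
def build_join_cmd(ws, part_b, to="", angle="90", offset="", anchor="end_b"):
    """Mirror the JOIN template: emit a flag only for non-empty arguments."""
    cmd = [ws, "join", part_b]
    if to:
        cmd += ["--to", to]
    if angle:
        cmd += ["--angle", str(angle)]
    if offset:
        cmd += ["--offset", offset]
    if anchor:
        cmd += ["--anchor", anchor]
    return cmd


if __name__ == "__main__":
    # Defaults here copy the "default" values declared in the TOOLS table.
    print(build_join_cmd("woodshop", "p2", to="p1", offset="10 in"))
```

Building the command as a list (rather than a shell string) means values such as `10 in` reach the CLI as a single argument without quoting.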


@@ -29,11 +29,23 @@ def cmd_place(scene: Scene, args) -> str:
return f"Placed {part.id}: a {_fmt_len(length)} {part.stock}."
_ANCHOR_ALIASES = {
"end_a": "end_a", "start": "end_a", "near": "end_a", "beginning": "end_a",
"end_b": "end_b", "end": "end_b", "far": "end_b", "tip": "end_b",
}
def normalize_anchor(value: str) -> str:
"""Map loose spoken anchors ('end', 'start', 'near', 'far') to end_a/end_b; anything unrecognized falls back to end_b."""
return _ANCHOR_ALIASES.get((value or "end_b").strip().lower(), "end_b")
def cmd_join(scene: Scene, args) -> str:
anchor = normalize_anchor(args.anchor)
offset = to_inches(args.offset, default_unit=args.unit) if args.offset else 0.0
joint = scene.join(args.part_a, args.part_b, angle_deg=args.angle,
offset_in=offset, anchor=args.anchor)
where = f" {_fmt_len(offset)} from {'the start' if args.anchor == 'end_a' else 'the end'}" if offset else ""
offset_in=offset, anchor=anchor)
where = f" {_fmt_len(offset)} from {'the start' if anchor == 'end_a' else 'the end'}" if offset else ""
return f"Joined {joint.part_b} to {joint.part_a} at {args.angle:g} degrees{where}."
@@ -86,8 +98,8 @@ def build_parser() -> argparse.ArgumentParser:
sp.add_argument("--to", dest="part_a", default=None, help="Board to attach to (default: selection)")
sp.add_argument("--angle", type=float, default=90.0, help="Angle in degrees")
sp.add_argument("--offset", default=None, help="Distance from anchor, e.g. '10 in'")
sp.add_argument("--anchor", choices=["end_a", "end_b"], default="end_b",
help="Measure offset from start (end_a) or far end (end_b)")
sp.add_argument("--anchor", default="end_b",
help="Measure offset from start (end_a/start) or far end (end_b/end)")
sp.add_argument("--unit", default="inch")
sp.set_defaults(func=cmd_join)
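
To see what the lenient parsing accepts, here is a standalone copy of the alias table and `normalize_anchor` from the hunk above, with indentation restored. One consequence worth noting: any phrase not in the table, including "the start", falls through to the `end_b` default rather than raising.

```python
_ANCHOR_ALIASES = {
    "end_a": "end_a", "start": "end_a", "near": "end_a", "beginning": "end_a",
    "end_b": "end_b", "end": "end_b", "far": "end_b", "tip": "end_b",
}


def normalize_anchor(value):
    """Lower-case and trim the value, then look it up; default to end_b."""
    return _ANCHOR_ALIASES.get((value or "end_b").strip().lower(), "end_b")


if __name__ == "__main__":
    for spoken in ("start", " Far ", "the end", "the start", None):
        print(repr(spoken), "->", normalize_anchor(spoken))
```

If the silent fallback turns out to be surprising in practice, stripping a leading article before the lookup would be one possible refinement; that is a suggestion, not something this commit does.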

src/woodshop/driver.py (new file, 179 lines)

@@ -0,0 +1,179 @@
"""The conversational driver: speak (or type) a command, watch it build.
Reuses existing CmdForge tools for everything that isn't woodshop-specific:
* `dictate` -> speech to text (with --voice)
* `pa-load-tools` -> turns the wood-* tools into Claude function schemas
* `claude -p` -> interprets the utterance into tool calls
* `pa-execute-tool`-> dispatches each wood-* tool
* `read-aloud` -> speaks the confirmation back
Only the orchestration here is woodshop-specific (it must be: we use Claude
rather than pa-tool-loop's hard-wired local model). Run the viewer alongside it:
woodshop-view & # 3D window
woodshop-talk # type commands; add --voice to speak them
"""
from __future__ import annotations
import argparse
import json
import os
import re
import subprocess
import sys
WOOD_TOOLS = ["wood-place", "wood-join", "wood-sand", "wood-delete", "wood-undo"]
REASON_PROVIDER = "claude -p" # chosen for reliable structured tool-calling
# A board placed earlier in the SAME utterance is referenced as $1, $2, ...
_SYMBOL = re.compile(r"\$(\d+)")
def _run(cmd: list[str], stdin: str = "") -> str:
proc = subprocess.run(cmd, input=stdin, capture_output=True, text=True)
return (proc.stdout or "").strip()
def load_schemas() -> str:
return _run(["pa-load-tools", "--tools", ",".join(WOOD_TOOLS), "--format", "anthropic"])
def scene_summary() -> str:
ws = os.path.expanduser("~/PycharmProjects/woodshop/.venv/bin/woodshop")
return _run([ws, "status"]) or "empty"
SYSTEM = """You are WoodShop, a voice-driven woodworking assistant. Translate the \
user's spoken command into a JSON array of tool calls.
Tools (JSON schemas):
{schemas}
Current scene:
{scene}
Rules:
- Respond with ONLY a JSON array. No prose, no markdown fences.
- Each element is {{"tool": "<name>", "args": {{...}}}}.
- Refer to boards that ALREADY exist by their real id (p1, p2, ...).
- For a board you place earlier in THIS response, refer to it later as $1, $2, ...
numbered by the order you place boards in this response (the first wood-place is $1).
- For wood-join, "part_b" is the board being attached (it gets moved); "to" is the
board it attaches to. Anchor is "end" (far end) or "start".
- If the command is ambiguous or not about woodworking, return a single
{{"tool": "say", "args": {{"text": "<short question or reply>"}}}}.
User said: "{utterance}"
"""
def interpret(utterance: str, schemas: str) -> list[dict]:
prompt = SYSTEM.format(schemas=schemas, scene=scene_summary(), utterance=utterance)
raw = _run(REASON_PROVIDER.split(), stdin=prompt)
match = re.search(r"\[.*\]", raw, re.DOTALL) # tolerate stray text/fences
if not match:
return [{"tool": "say", "args": {"text": "Sorry, I didn't catch a command."}}]
try:
calls = json.loads(match.group(0))
except json.JSONDecodeError:
return [{"tool": "say", "args": {"text": "Sorry, I couldn't parse that."}}]
return calls if isinstance(calls, list) else [calls]
def dispatch(calls: list[dict], verbose: bool = True) -> list[str]:
"""Execute calls in order, resolving $N to ids of boards placed this turn."""
placed: list[str] = []
messages: list[str] = []
def resolve(value):
if isinstance(value, str):
def sub(m):
i = int(m.group(1)) - 1
return placed[i] if 0 <= i < len(placed) else m.group(0)
return _SYMBOL.sub(sub, value)
return value
for call in calls:
tool = call.get("tool", "")
args = {k: resolve(v) for k, v in (call.get("args") or {}).items()}
if tool == "say":
messages.append(args.get("text", ""))
continue
result = _run(["pa-execute-tool", "--tool-name", tool,
"--tool-args", json.dumps(args)])
try:
payload = json.loads(result)
except json.JSONDecodeError:
payload = {"success": False, "output": "", "error": result}
out = payload.get("output") or payload.get("error") or "(no output)"
if payload.get("success") and tool == "wood-place":
m = re.search(r"\b(p\d+)\b", out) # remember the new id for $N
if m:
placed.append(m.group(1))
messages.append(out)
if verbose:
print(f" {tool}{args} -> {out}")
return messages
def speak(text: str) -> None:
if text.strip():
subprocess.run(["read-aloud", "--strip-md", "true"], input=text, text=True)
def handle(utterance: str, schemas: str, voice: bool, verbose: bool) -> None:
calls = interpret(utterance, schemas)
messages = dispatch(calls, verbose=verbose)
summary = " ".join(m for m in messages if m).strip()
print(f"WoodShop: {summary}")
if voice and summary:
speak(summary)
def get_utterance(voice: bool, duration: int) -> str | None:
if voice:
print(f"[listening {duration}s...]")
text = _run(["dictate", "--duration", str(duration)])
print(f"You said: {text!r}")
return text or None
try:
return input("you> ").strip() or None
except (EOFError, KeyboardInterrupt):
return None
def main(argv: list[str] | None = None) -> int:
ap = argparse.ArgumentParser(prog="woodshop-talk", description="Conversational woodworking.")
ap.add_argument("--voice", action="store_true", help="Listen on the mic instead of typing")
ap.add_argument("--duration", type=int, default=6, help="Mic recording seconds (--voice)")
ap.add_argument("--once", help="Run a single command (non-interactive) and exit")
ap.add_argument("--quiet", action="store_true", help="Don't print per-call detail")
args = ap.parse_args(argv)
schemas = load_schemas()
if not schemas:
print("Could not load wood-* tool schemas (is CmdForge/pa-load-tools available?)",
file=sys.stderr)
return 1
if args.once is not None:
handle(args.once, schemas, voice=args.voice, verbose=not args.quiet)
return 0
print("WoodShop ready. Say things like 'place a 6 foot 2x4'. Ctrl-C to quit.")
while True:
utterance = get_utterance(args.voice, args.duration)
if utterance is None:
print()
return 0
if utterance.lower() in ("quit", "exit", "stop", "done"):
return 0
handle(utterance, schemas, voice=args.voice, verbose=not args.quiet)
if __name__ == "__main__":
raise SystemExit(main())
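
The `$N` substitution inside `dispatch` can be read in isolation. The sketch below lifts the nested `resolve` closure into a free function (`resolve_symbols` is a name chosen here for illustration) so the behaviour is easy to check: symbols are 1-based, non-string values pass through, and out-of-range symbols are left as-is.

```python
import re

_SYMBOL = re.compile(r"\$(\d+)")


def resolve_symbols(value, placed):
    """Replace $N with the id of the N-th board placed this turn (1-based)."""
    if not isinstance(value, str):
        return value  # numbers, None, etc. are passed through unchanged

    def sub(match):
        index = int(match.group(1)) - 1
        # Out-of-range symbols are left untouched rather than raising.
        return placed[index] if 0 <= index < len(placed) else match.group(0)

    return _SYMBOL.sub(sub, value)


if __name__ == "__main__":
    placed = ["p3", "p4"]  # ids returned by two wood-place calls this turn
    print(resolve_symbols("$2", placed))
    print(resolve_symbols("$5", placed))
```

Because an unresolved `$5` is forwarded verbatim, the downstream tool is what reports the bad id; the driver itself does not flag it.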


@@ -149,9 +149,12 @@ class Scene:
# Distance measured along A from its start.
along = offset_in if anchor == "end_a" else max(a.length_in - offset_in, 0.0)
ux, uy = a.axis_unit()
# Stack B on A's top face (in Z) so the boards rest against each other
# instead of interpenetrating at the centerlines, as real lumber would.
stack_z = a.section_in[0] / 2 + b.section_in[0] / 2
attach = [a.position_in[0] + ux * along,
a.position_in[1] + uy * along,
a.position_in[2]]
a.position_in[2] + stack_z]
b.position_in = attach
b.rotation_deg = a.rotation_deg + angle_deg
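
A worked instance of the attach-point arithmetic in this hunk, as a sketch. It assumes board A is a 72 in 2x4 starting at the origin and lying along +X, and that `section_in[0]` is the board's actual thickness (1.5 in for a 2x4); `attach_point` is a hypothetical free function, not part of `scene.py`.

```python
def attach_point(a_pos, a_len, a_axis, a_thick, b_thick, offset_in, anchor):
    """Return where board B is placed when joined to board A."""
    # Distance along A from its start; end_b measures back from the far end.
    along = offset_in if anchor == "end_a" else max(a_len - offset_in, 0.0)
    ux, uy = a_axis
    # Half of each thickness: B's centerline sits one full board above A's.
    stack_z = a_thick / 2 + b_thick / 2
    return [a_pos[0] + ux * along, a_pos[1] + uy * along, a_pos[2] + stack_z]


if __name__ == "__main__":
    # "attach ... 10 inches from the end" of a 6 ft board: 72 - 10 = 62.
    print(attach_point([0.0, 0.0, 0.0], 72.0, (1.0, 0.0), 1.5, 1.5, 10.0, "end_b"))
```

With these inputs the X coordinate is 62.0 and the Z coordinate is 0.75 + 0.75 = 1.5, matching the values asserted in the example-sentence test.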

tests/test_driver.py (new file, 63 lines)

@@ -0,0 +1,63 @@
"""Tests for the driver's orchestration logic (external tools are mocked)."""
import json
from woodshop import driver
from woodshop.cli import normalize_anchor
def test_anchor_aliases():
assert normalize_anchor("end") == "end_b"
assert normalize_anchor("the end") == "end_b" # falls through to default end_b
assert normalize_anchor("start") == "end_a"
assert normalize_anchor("NEAR") == "end_a"
assert normalize_anchor("") == "end_b"
def test_dispatch_resolves_dollar_symbols(monkeypatch):
"""$1/$2 in a multi-op turn resolve to the ids of boards placed this turn."""
seen = []
def fake_run(cmd, stdin=""):
if cmd[0] != "pa-execute-tool":
return ""
name, args = cmd[2], json.loads(cmd[4])
seen.append((name, args))
if name == "wood-place":
n = sum(1 for c in seen if c[0] == "wood-place")
return json.dumps({"success": True, "output": f"Placed p{n}: a board.", "error": ""})
return json.dumps({"success": True, "output": f"did {name}", "error": ""})
monkeypatch.setattr(driver, "_run", fake_run)
calls = [
{"tool": "wood-place", "args": {"stock": "2x4", "length": "2 ft"}},
{"tool": "wood-place", "args": {"stock": "2x4", "length": "2 ft"}},
{"tool": "wood-join", "args": {"part_b": "$2", "to": "$1", "angle": "90"}},
]
driver.dispatch(calls, verbose=False)
join_args = next(a for n, a in seen if n == "wood-join")
assert join_args["part_b"] == "p2"
assert join_args["to"] == "p1"
def test_say_pseudo_tool_does_not_dispatch(monkeypatch):
calls_made = []
monkeypatch.setattr(driver, "_run", lambda cmd, stdin="": calls_made.append(cmd) or "")
msgs = driver.dispatch([{"tool": "say", "args": {"text": "which end?"}}], verbose=False)
assert msgs == ["which end?"]
assert calls_made == [] # nothing executed
def test_interpret_tolerates_fenced_json(monkeypatch):
monkeypatch.setattr(
driver, "_run",
lambda cmd, stdin="": '```json\n[{"tool": "wood-undo", "args": {}}]\n```'
if cmd[:2] != ["pa-load-tools", "--tools"] else "[]",
)
calls = driver.interpret("undo that", schemas="[]")
assert calls == [{"tool": "wood-undo", "args": {}}]
def test_interpret_handles_garbage(monkeypatch):
monkeypatch.setattr(driver, "_run", lambda cmd, stdin="": "I'm not sure what you mean")
calls = driver.interpret("blah", schemas="[]")
assert calls[0]["tool"] == "say"
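
The two `interpret` tests above exercise its JSON extraction step. A standalone sketch of that step follows; `extract_calls` is a hypothetical name, and it returns `None` where the driver substitutes a `say` reply. Note that the pattern is greedy, so trailing text that contains its own brackets widens the match and makes parsing fail.

```python
import json
import re


def extract_calls(raw):
    """Pull the first '[' through the last ']' out of raw text and parse it."""
    match = re.search(r"\[.*\]", raw, re.DOTALL)  # greedy: spans to the last ']'
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    fenced = '```json\n[{"tool": "wood-undo", "args": {}}]\n```'
    print(extract_calls(fenced))
    print(extract_calls("I'm not sure what you mean"))
    print(extract_calls('[{"tool": "wood-undo", "args": {}}] see [1]'))
```

The third call shows the greedy-match limitation: the match runs to the final `]`, the result is not valid JSON, and the caller gets the fallback path.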


@@ -60,6 +60,8 @@ def test_the_example_sentence():
assert "sanded" in p1.finishes
# attach point is 10in back from p1's far end (72 - 10 = 62 along +X)
assert p2.position_in[0] == pytest.approx(62.0)
# p2 rests on p1's top face: z = t_a/2 + t_b/2 = 0.75 + 0.75
assert p2.position_in[2] == pytest.approx(1.5)
assert p2.rotation_deg == pytest.approx(90.0)
# p2 now runs along +Y
ux, uy = p2.axis_unit()