# CmdForge Test Suite

## Overview

The test suite covers the core CmdForge modules with 205 unit tests, organized into the following test files:
| Test File | Module Tested | Test Count | Description |
|---|---|---|---|
| test_tool.py | tool.py | 35 | Tool definitions, YAML loading/saving |
| test_runner.py | runner.py | 31 | Variable substitution, step execution |
| test_providers.py | providers.py | 35 | Provider management, AI calls |
| test_cli.py | cli/ | 20 | CLI commands and arguments |
| test_collection.py | collection.py | 27 | Collection CRUD, tool resolution |
| test_collection_api.py | Registry API | 20 | Collection API endpoints, client methods |
| test_registry_integration.py | Registry | 37 | Registry client, resolution (12 require server) |
## Running Tests

```bash
# Run all unit tests (excluding integration tests)
pytest tests/ -m "not integration"

# Run all tests with verbose output
pytest tests/ -v

# Run a specific test file
pytest tests/test_tool.py -v

# Run a specific test class
pytest tests/test_runner.py::TestSubstituteVariables -v

# Run with coverage
pytest tests/ --cov=cmdforge --cov-report=html
```
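The `-m "not integration"` selection assumes the `integration` marker is registered with pytest. If the project does not already declare it in `pytest.ini` or `pyproject.toml` (neither is shown in this README), a `conftest.py` hook like the following sketch keeps pytest from warning about an unknown mark:

```python
# conftest.py (sketch) -- register the custom "integration" marker so that
# "-m integration" / "-m 'not integration'" selection works without a
# PytestUnknownMarkWarning. The real project may already do this elsewhere.
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "integration: tests that require a running registry server"
    )
```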
## Test Categories

### Unit Tests (run without external dependencies)

**test_tool.py** - Tool data structures and persistence

- `TestToolArgument` - Custom argument definition and serialization
- `TestPromptStep` - AI prompt step configuration
- `TestCodeStep` - Python code step configuration
- `TestTool` - Full tool configuration and roundtrip serialization
- `TestValidateToolName` - Tool name validation rules
- `TestToolPersistence` - Save/load/delete operations
- `TestLegacyFormat` - Backward compatibility with old format
**test_runner.py** - Tool execution engine

- `TestSubstituteVariables` - Variable placeholder substitution (sketched below)
  - Simple substitution: `{name}` → value
  - Escaped braces: `{{literal}}` → `{literal}`
  - Multiple variables, multiline templates
- `TestExecutePromptStep` - AI provider calls with mocking
- `TestExecuteCodeStep` - Python code execution via `exec()`
- `TestRunTool` - Full tool execution with steps
- `TestCreateArgumentParser` - CLI argument parsing
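A minimal sketch of that substitution behaviour, assuming the function under test is `cmdforge.runner.substitute_variables(template, variables)` (the name and signature are inferred from the test class, not confirmed by this README):

```python
from cmdforge.runner import substitute_variables  # assumed function name/location

def test_substitution_sketch():
    # Simple substitution: {name} is replaced by the variable's value.
    assert substitute_variables("Hello {name}!", {"name": "World"}) == "Hello World!"
    # Escaped braces: {{literal}} collapses to {literal} and is not substituted.
    assert substitute_variables("{{literal}}", {}) == "{literal}"
```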
**test_providers.py** - AI provider abstraction

- `TestProvider` - Provider dataclass and serialization
- `TestProviderResult` - Success/error result handling
- `TestMockProvider` - Built-in mock for testing
- `TestProviderPersistence` - YAML save/load operations
- `TestCallProvider` - Subprocess execution with mocking
- `TestProviderCommandParsing` - Shell command parsing with `shlex`
**test_cli.py** - Command-line interface

- `TestCLIBasics` - Help, version flags
- `TestListCommand` - List available tools
- `TestCreateCommand` - Create new tools
- `TestDeleteCommand` - Delete tools
- `TestRunCommand` - Execute tools
- `TestTestCommand` - Test tools with mock provider
- `TestProvidersCommand` - Manage AI providers
- `TestRefreshCommand` - Regenerate wrapper scripts
- `TestDocsCommand` - View/edit documentation
**test_collection.py** - Collection management

- `TestCollection` - Collection dataclass and CRUD operations
  - Create, save, load, delete collections
  - Registry URL generation
  - YAML serialization roundtrip
- `TestListCollections` - List local collections
- `TestGetCollection` - Get collection by name
- `TestClassifyToolReference` - Classify local vs registry tool refs (illustrated below)
- `TestResolveToolReferences` - Transform tool refs for publishing
  - Local tool name expansion (name → user/name)
  - Visibility issue detection
  - Registry tool validation
  - Pinned version transformation
- `TestToolResolutionResult` - Resolution result dataclass
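To make the local-versus-registry distinction concrete, here is a deliberately simplified, hypothetical classifier. It only illustrates the rule tested above; it is not the actual `collection.py` implementation, and the `user/name@version` pinning format shown is an assumption.

```python
# Hypothetical illustration of local vs. registry reference classification;
# not the real collection.py logic, and the "@version" pin format is assumed.
def classify_reference(ref: str) -> str:
    if "/" in ref:
        return "registry"   # already qualified, e.g. "alice/summarize@1.2.0"
    return "local"          # bare name; expanded to "user/name" when publishing

assert classify_reference("summarize") == "local"
assert classify_reference("alice/summarize") == "registry"
assert classify_reference("alice/summarize@1.2.0") == "registry"
```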
**test_collection_api.py** - Collection API and client

- `TestGetMeEndpoint` - GET /api/v1/me endpoint (requires Flask)
- `TestToolApprovedEndpoint` - GET /api/v1/tools/.../approved endpoint (requires Flask)
- `TestPostCollectionsEndpoint` - POST /api/v1/collections endpoint (requires Flask)
- `TestRegistryClientMethods` - Client methods with mocked requests (pattern sketched below)
  - `get_me()` - Get authenticated user info
  - `has_approved_public_tool()` - Check tool approval status
  - `publish_collection()` - Publish collection to registry
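The mocked-requests pattern for the client methods looks roughly like the sketch below. `RegistryClient`, its constructor argument, the patch target, and the response shape are all assumptions; only the `get_me()` method name comes from the list above.

```python
from unittest.mock import MagicMock, patch

from cmdforge.registry_client import RegistryClient  # assumed class name/location

def test_get_me_with_mocked_requests():
    fake_response = MagicMock(status_code=200)
    fake_response.json.return_value = {"username": "alice"}
    # Patch requests where the client module looks it up (assumed path).
    with patch("cmdforge.registry_client.requests.get", return_value=fake_response):
        client = RegistryClient("http://localhost:5000")  # assumed constructor
        assert client.get_me()["username"] == "alice"
```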
### Integration Tests (require running server)

**test_registry_integration.py** - Registry API tests (marked with `@pytest.mark.integration`)

- `TestRegistryIntegration` - List, search, categories, index
- `TestAuthIntegration` - Registration, login, tokens
- `TestPublishIntegration` - Tool publishing workflow
Run with a local registry server:

```bash
# Start registry server
python -m cmdforge.registry.app

# Run integration tests
pytest tests/test_registry_integration.py -v -m integration
```
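An integration test in this style is just an ordinary test carrying the marker and talking to the live server. The sketch below uses the `/api/v1/me` endpoint mentioned earlier and accepts a 401 since no token is supplied; the exact status handling is an assumption.

```python
import pytest
import requests

@pytest.mark.integration
def test_registry_is_reachable():
    # Live HTTP call against the local server started above; without a token
    # the /api/v1/me endpoint may legitimately answer 401 instead of 200.
    resp = requests.get("http://localhost:5000/api/v1/me")
    assert resp.status_code in (200, 401)
```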
## Test Fixtures

Common fixtures used across tests:

```python
@pytest.fixture
def temp_tools_dir(tmp_path):
    """Redirect TOOLS_DIR and BIN_DIR to temp directory."""
    with patch('cmdforge.tool.TOOLS_DIR', tmp_path / ".cmdforge"):
        with patch('cmdforge.tool.BIN_DIR', tmp_path / ".local" / "bin"):
            yield tmp_path

@pytest.fixture
def temp_providers_file(tmp_path):
    """Redirect providers.yaml to temp directory."""
    providers_file = tmp_path / ".cmdforge" / "providers.yaml"
    with patch('cmdforge.providers.PROVIDERS_FILE', providers_file):
        yield providers_file
```
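A test that declares `temp_tools_dir` as a parameter receives the temporary path and runs against the patched locations; the check below relies only on the patching shown above:

```python
def test_tools_dir_is_isolated(temp_tools_dir):
    from cmdforge import tool

    # While the fixture is active, TOOLS_DIR points inside the temp directory,
    # so nothing in the test can touch the real ~/.cmdforge.
    assert tool.TOOLS_DIR == temp_tools_dir / ".cmdforge"
```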
## Mocking Strategy

- **File system**: Use the `tmp_path` fixture and patch module-level paths
- **Subprocess calls**: Mock `subprocess.run` and `shutil.which`
- **Stdin**: Use `patch('sys.stdin', StringIO("input"))`
- **Provider calls**: Mock `call_provider` to return a `ProviderResult`
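Combined, the patterns look roughly like this; the patch targets (`cmdforge.providers.subprocess.run`, `cmdforge.runner.call_provider`) and the return shapes are assumptions about where each name is looked up:

```python
from io import StringIO
from unittest.mock import MagicMock, patch

def test_all_external_boundaries_mocked():
    fake_proc = MagicMock(returncode=0, stdout="provider output", stderr="")
    fake_result = MagicMock(success=True, output="mocked reply")
    with patch("cmdforge.providers.subprocess.run", return_value=fake_proc), \
         patch("sys.stdin", StringIO("piped input")), \
         patch("cmdforge.runner.call_provider", return_value=fake_result):
        # The code under test would run here with every external call faked:
        # subprocess execution, stdin reads, and AI provider calls.
        pass
```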
## Known Limitations

- **Tool name validation** - CLI doesn't validate names before creation (test skipped)
- **Nested braces** - `{{{x}}}` breaks due to overlapping escape sequences (documented behavior)
- **Integration tests** - Require a local registry server at `localhost:5000`
## Adding New Tests

- Create test functions in the appropriate test file
- Use descriptive names: `test_<what>_<scenario>`
- Add docstrings explaining the test purpose
- Use fixtures for common setup
- Mock external dependencies (subprocess, network, file system)
Example:

```python
def test_tool_with_multiple_code_steps(self):
    """Multiple code steps should pass variables between them."""
    tool = Tool(
        name="multi-code",
        steps=[
            CodeStep(code="x = 1", output_var="x"),
            CodeStep(code="y = int(x) + 1", output_var="y")
        ],
        output="{y}"
    )
    output, code = run_tool(tool, "", {})
    assert code == 0
    assert output == "2"
```