CmdForge Registry Design
Document Status (January 2026): This is the original design document used to build the registry. Phases 1-4 and 6-8 are complete. Phase 5 (Project Dependencies) is partially implemented. Many features beyond this spec have been added (reviews, issues, fork tracking, scrutiny, admin moderation). See the Implementation Phases section for current status. For the latest documentation, see the project-docs system.
Purpose
Build a centralized registry for CmdForge to enable discovery, publishing, dependency management, and future curation at scale.
Terminology
| Term | Definition |
|---|---|
| Tool definition | The full YAML file in the registry (config.yaml) containing name, steps, arguments, etc. |
| Tool config | The configuration within a tool definition (arguments, steps, provider settings) |
| cmdforge.yaml | Project manifest file declaring tool dependencies and overrides |
| config.yaml | The tool definition file, both in registry and when installed locally |
| Owner | Immutable namespace slug identifying the publisher (e.g., rob, alice) |
| Publisher | A registered user who can publish tools to the registry |
| Wrapper script | Auto-generated bash script in ~/.local/bin/ that invokes a tool |
Canonical naming: Use CmdForge-Registry (capitalized, hyphenated) for the repository name.
Diagram References
- System overview: discussions/diagrams/cmdforge-registry_rob_1.puml
- Data flows: discussions/diagrams/cmdforge-registry_rob_5.puml
System Overview
Users interact via the CLI and a future Web UI. Both call a Registry API hosted at https://gitea.brrd.tech/api/v1 (future alias: cmdforge.brrd.tech/api/v1). The API syncs from a Gitea-backed registry repo and maintains a SQLite cache/search index.
Canonical API base path: https://gitea.brrd.tech/api/v1
All API endpoints are versioned under /api/v1. When breaking changes are needed, a new version (/api/v2) will be introduced with deprecation notices.
Core API endpoints:
- GET /api/v1/tools
- GET /api/v1/tools/search?q=...
- GET /api/v1/tools/{owner}/{name}
- GET /api/v1/tools/{owner}/{name}/versions
- GET /api/v1/tools/{owner}/{name}/download?version=...
- POST /api/v1/tools (publish)
- GET /api/v1/categories
- GET /api/v1/collections
- GET /api/v1/collections/{name}
- GET /api/v1/stats/popular
- POST /api/v1/webhook/gitea
Pagination
All list endpoints support pagination:
| Parameter | Default | Max | Description |
|---|---|---|---|
| page | 1 | - | Page number (1-indexed) |
| per_page | 20 | 100 | Items per page |
| sort | downloads | - | Sort field |
| order | desc | - | Sort order (asc/desc) |
Stable ordering: To ensure deterministic results across pages, sorting includes secondary and tertiary keys:
- Primary: requested field (e.g., downloads)
- Secondary: published_at (desc)
- Tertiary: id (for absolute stability)
ORDER BY downloads DESC, published_at DESC, id DESC
LIMIT 20 OFFSET 0
Response pagination metadata:
{
"data": [...],
"meta": {
"page": 1,
"per_page": 20,
"total": 142,
"total_pages": 8
}
}
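The page arithmetic behind LIMIT/OFFSET and the meta block can be sketched as follows. This is a minimal illustration, not the registry's actual code: the function name paginate is hypothetical, and clamping out-of-range values (rather than rejecting them with a 400) is an assumption.

```python
import math

MAX_PER_PAGE = 100  # "Max" column of the pagination table


def paginate(total, page=1, per_page=20):
    """Return LIMIT/OFFSET values plus the `meta` block for a list response."""
    page = max(page, 1)                              # assumption: clamp instead of erroring
    per_page = min(max(per_page, 1), MAX_PER_PAGE)   # assumption: clamp to the documented max
    return {
        "limit": per_page,
        "offset": (page - 1) * per_page,             # pages are 1-indexed
        "meta": {
            "page": page,
            "per_page": per_page,
            "total": total,
            "total_pages": math.ceil(total / per_page),
        },
    }
```

With total=142 and the defaults, this reproduces the meta block shown above (8 pages).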
Input Constraints
Size limits to prevent oversized uploads:
| Field | Max Size | Notes |
|---|---|---|
| config.yaml | 64 KB | Tool definition |
| README.md | 256 KB | Documentation |
| Request body | 512 KB | Total POST payload |
| Tool name | 64 chars | Alphanumeric + hyphen |
| Description | 500 chars | Short summary |
| Tag | 32 chars | Individual tag |
| Tags array | 10 items | Maximum tags per tool |
Validation errors:
{
"error": {
"code": "PAYLOAD_TOO_LARGE",
"message": "config.yaml exceeds 64KB limit",
"details": {
"field": "config",
"size": 72000,
"limit": 65536
}
}
}
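A size check that produces this error shape might look like the sketch below. The helper name check_size, the "readme" field key, and the choice to measure UTF-8 bytes are assumptions; only the limits and the error layout come from this document.

```python
# Limits in bytes, taken from the Input Constraints table.
LIMITS = {"config": 64 * 1024, "readme": 256 * 1024}
FILENAMES = {"config": "config.yaml", "readme": "README.md"}  # "readme" key is hypothetical


def check_size(field, content):
    """Return None if `content` fits, else a PAYLOAD_TOO_LARGE error body."""
    size = len(content.encode("utf-8"))  # assumption: limits apply to encoded bytes
    limit = LIMITS[field]
    if size <= limit:
        return None
    return {
        "error": {
            "code": "PAYLOAD_TOO_LARGE",
            "message": f"{FILENAMES[field]} exceeds {limit // 1024}KB limit",
            "details": {"field": field, "size": size, "limit": limit},
        }
    }
```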
Sort Fields and Indexes
Allowed sort fields:
| Endpoint | Allowed sort values |
|---|---|
| GET /tools | downloads, published_at, name |
| GET /tools/search | relevance, downloads, published_at |
| GET /categories | name, tool_count |
Invalid sort values return 400:
{"error": {"code": "INVALID_SORT", "message": "Unknown sort field 'foo'. Allowed: downloads, published_at, name"}}
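Because the sort value ends up in an ORDER BY clause, checking it against a fixed allow-list also keeps arbitrary input out of the SQL. A minimal sketch, assuming a hypothetical validate_sort helper keyed by endpoint path:

```python
# Allow-list copied from the table above.
ALLOWED_SORTS = {
    "/tools": ("downloads", "published_at", "name"),
    "/tools/search": ("relevance", "downloads", "published_at"),
    "/categories": ("name", "tool_count"),
}


def validate_sort(endpoint, sort):
    """Return None when `sort` is allowed, else an (error_body, 400) pair."""
    allowed = ALLOWED_SORTS[endpoint]
    if sort in allowed:
        return None
    message = f"Unknown sort field '{sort}'. Allowed: {', '.join(allowed)}"
    return {"error": {"code": "INVALID_SORT", "message": message}}, 400
```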
Database indexes:
-- Frequent query patterns
CREATE INDEX idx_tools_owner_name ON tools(owner, name);
CREATE INDEX idx_tools_category ON tools(category);
CREATE INDEX idx_tools_published_at ON tools(published_at DESC);
CREATE INDEX idx_tools_downloads ON tools(downloads DESC);
CREATE INDEX idx_tools_owner_name_version ON tools(owner, name, version);
-- For pagination stability
CREATE INDEX idx_tools_sort_stable ON tools(downloads DESC, published_at DESC, id DESC);
-- Publisher lookups
CREATE INDEX idx_publishers_slug ON publishers(slug);
CREATE INDEX idx_publishers_email ON publishers(email);
-- Token lookups
CREATE INDEX idx_api_tokens_hash ON api_tokens(token_hash);
CREATE INDEX idx_api_tokens_publisher ON api_tokens(publisher_id);
API Version Compatibility
Forward compatibility: Clients should ignore unknown fields in API responses:
# Good: ignore unknown fields
tool = response['data']
name = tool.get('name')
# Don't fail if 'new_field' exists but client doesn't know about it
# Bad: strict parsing that fails on unknown fields
tool = ToolSchema.parse(response['data']) # May fail on new fields
Backward compatibility: The API will:
- Never remove fields in a version (only deprecate)
- Never change field types
- Add new optional fields without version bump
- Use new version (/api/v2) for breaking changes
Deprecation process:
- Add X-Deprecated-Field: old_field header
- Document in changelog
- Remove after 6 months minimum
- Major version bump if widely used
Client version header:
X-CmdForge-Client: cli/1.2.0
Helps server track client versions for deprecation decisions.
Source of Truth
- Gitea registry repo is the source of truth.
- API syncs repo content into SQLite for fast queries, stats, and FTS5 search.
- index.json remains useful for offline CLI search and as a fallback.
If the cache is stale, the API can fall back to repo reads; a warning header may be emitted.
Namespacing and Paths
Support owner/name from day one:
- Registry path: tools/{owner}/{name}/config.yaml
- API URL: /tools/{owner}/{name}
- Install: cmdforge registry install rob/summarize
- Shorthand: cmdforge registry install summarize resolves to the official namespace.
PR branches: submit/{owner}/{name}/{version}.
Namespace Identity
The owner is an immutable slug, not the display name:
-- In publishers table
slug TEXT UNIQUE NOT NULL, -- immutable: "rob", "alice-dev"
display_name TEXT NOT NULL, -- mutable: "Rob", "Alice Developer"
Slug rules:
- Lowercase alphanumeric + hyphens only: ^[a-z0-9][a-z0-9-]*[a-z0-9]$
- 2-39 characters
- Cannot start/end with hyphen
- Set once at registration, cannot be changed
- Reserved slugs: official, admin, system, api, registry
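The slug rules translate directly into a small validator. The sketch below is illustrative only: the function name slug_error and its short reason strings are hypothetical, while the regex, length bounds, and reserved list are the ones listed above.

```python
import re

SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9-]*[a-z0-9]$")
RESERVED_SLUGS = {"official", "admin", "system", "api", "registry"}


def slug_error(slug):
    """Return None for an acceptable slug, else a short reason string."""
    if not 2 <= len(slug) <= 39:
        return "length"
    # fullmatch so a trailing newline cannot slip past the `$` anchor
    if not SLUG_RE.fullmatch(slug):
        return "format"
    if slug in RESERVED_SLUGS:
        return "reserved"
    return None
```

Immutability itself is a storage concern (the slug is simply never updated after registration), so it is not something this check can express.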
Rename policy:
- display_name can be changed anytime via dashboard
- slug (owner) is permanent to preserve URLs and tool references
- If a publisher absolutely must change slug (legal reasons, etc.):
  - Create new account with new slug
  - Republish tools under new namespace
  - Mark old tools as deprecated with replacement pointing to new namespace
  - Old namespace remains reserved (cannot be reused by others)
Why immutable:
- rob/summarize@1.0.0 must always resolve to the same tool
- Prevents namespace hijacking after rename
- Simplifies caching and CDN strategies
Tool Format (Registry == Local)
Registry tool folders mirror local tools:
tools/
  rob/
    summarize/
      config.yaml
      README.md
Tool files match the existing CmdForge format. Registry-specific metadata is kept under registry:. Deprecation is tool-defined and top-level:
name: summarize
version: "1.2.0"
deprecated: true
deprecated_message: "Security issue. Use v1.2.1"
replacement: "rob/summarize@1.2.1"
registry:
  published_at: "2025-01-15T10:30:00Z"
  downloads: 142
Attribution and Source Fields
Tools can include optional source attribution for provenance and licensing:
name: summarize
version: "1.2.0"
description: "Summarize text using AI"
# Attribution fields (optional)
source:
  type: original # original, adapted, or imported
  license: MIT # SPDX license identifier
  url: https://example.com/tool-repo
  author: "Original Author"
  # For adapted/imported tools
  original_tool: other/original-summarize@1.0.0
  changes: "Added French language support"
Source types:
| Type | Description |
|---|---|
| original | Created from scratch by the publisher |
| adapted | Based on another tool with modifications |
| imported | Direct import of external tool (e.g., from npm/pip) |
License field:
- Uses SPDX identifiers: MIT, Apache-2.0, GPL-3.0, etc.
- Required for registry publication
- Validated against SPDX license list
Collections
Collections are curated groups of tools that can be installed together with a single command.
Collection Structure
Collections are defined in collections/{name}.yaml:
name: text-processing-essentials
display_name: "Text Processing Essentials"
description: "Essential tools for text processing and manipulation"
icon: "📝"
tools:
- official/summarize
- official/translate
- official/fix-grammar
- official/simplify
- official/tone-shift
# Optional
curator: official
tags: ["text", "nlp", "writing"]
Collections API
List all collections:
GET /api/v1/collections
Response:
{
"data": [
{
"name": "text-processing-essentials",
"display_name": "Text Processing Essentials",
"description": "Essential tools for text processing...",
"icon": "📝",
"tool_count": 5,
"curator": "official"
}
],
"meta": {"page": 1, "per_page": 20, "total": 8}
}
Get collection details:
GET /api/v1/collections/{name}
Response:
{
"data": {
"name": "text-processing-essentials",
"display_name": "Text Processing Essentials",
"description": "Essential tools for text processing...",
"icon": "📝",
"curator": "official",
"tools": [
{"owner": "official", "name": "summarize", "version": "1.2.0", ...},
{"owner": "official", "name": "translate", "version": "2.1.0", ...}
]
}
}
CLI Commands
# List available collections
cmdforge registry collections
# View collection details
cmdforge registry collections text-processing-essentials
# Install all tools in a collection
cmdforge registry install --collection text-processing-essentials
# Show what would be installed (dry run)
cmdforge registry install --collection text-processing-essentials --dry-run
Schema compatibility note: The current CmdForge config parser may reject unknown top-level keys like deprecated, replacement, and registry. Before implementing registry features:
- Update the YAML parser to ignore unknown keys (permissive mode)
- Or explicitly define these fields in the Tool dataclass with defaults
- Validate registry-specific fields only when publishing, not when running locally
This ensures local tools continue to work even if they don't have registry fields.
Versioning and Immutability
- Unique key: owner/name + version.
- Published versions are immutable.
- Deprecation uses deprecated, deprecated_message, and replacement.
- CLI warns on install if a version is deprecated.
Yank Policy
Yanking allows removing a version from resolution without deleting it (for auditability):
# In tool config
yanked: true
yanked_reason: "Critical security vulnerability CVE-2025-1234"
yanked_at: "2025-01-20T15:00:00Z"
Yanked version behavior:
| Operation | Behavior |
|---|---|
| install foo@1.0.0 (exact) | Warns but allows install |
| install foo@^1.0.0 (constraint) | Excludes yanked, resolves to next valid |
| search / browse | Hidden by default, shown with --include-yanked |
| Direct URL access | Returns tool with yanked: true in response |
| Already installed | Continues to work, no forced removal |
Database schema addition:
-- Add to tools table
yanked BOOLEAN DEFAULT FALSE,
yanked_reason TEXT,
yanked_at TIMESTAMP
Yank vs Delete:
- Yank: Version remains in DB, excluded from resolution, auditable
- Delete: Reserved for DMCA/legal, requires admin action, leaves tombstone record
Version Format
Tools use semantic versioning (semver):
MAJOR.MINOR.PATCH[-PRERELEASE][+BUILD]
Examples:
1.0.0 # stable release
1.2.3 # stable release
2.0.0-alpha.1 # prerelease
2.0.0-beta.2 # prerelease
2.0.0-rc.1 # release candidate
Version Constraints
Manifest files support these constraint formats:
| Constraint | Meaning | Example Match |
|---|---|---|
| 1.2.3 | Exact version | 1.2.3 only |
| >=1.2.0 | Minimum version | 1.2.0, 1.3.0, 2.0.0 |
| <2.0.0 | Below version | 1.9.9, 1.0.0 |
| >=1.0.0,<2.0.0 | Range | 1.0.0 up to (not including) 2.0.0 |
| ^1.2.3 | Compatible (same major) | >=1.2.3, <2.0.0 |
| ~1.2.3 | Approximately (same minor) | >=1.2.3, <1.3.0 |
| * | Any version | latest stable |
Version Resolution Rules
When resolving a version constraint:
- Filter: Get all versions matching the constraint
- Exclude prereleases: Unless constraint explicitly includes them (e.g., >=2.0.0-alpha.1)
- Sort: By semver precedence (descending)
- Select: Highest matching version
Tie-breakers:
- Stable versions preferred over prereleases
- Later publish date wins if versions are equal (shouldn't happen with immutability)
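The filter, exclude, sort, select steps can be sketched in plain Python. This is a simplified illustration, not the registry's resolver: the function names are hypothetical, ^ is treated strictly as "same major" (as in the constraint table), the latest alias is not handled, and the rule that an exact pin may still select a yanked version follows the Yank Policy section.

```python
import operator
import re

SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$")
OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
       "<": operator.lt, "==": operator.eq}


def parse(version):
    """Split a semver string into ((major, minor, patch), prerelease-or-None)."""
    m = SEMVER.match(version)
    if not m:
        raise ValueError(f"not a semver string: {version!r}")
    return tuple(int(m.group(i)) for i in (1, 2, 3)), m.group(4)


def sort_key(version):
    """Precedence key: a release outranks every prerelease of the same core."""
    core, pre = parse(version)
    if pre is None:
        return (core, 1, ())
    # Numeric identifiers rank below alphanumeric ones (semver precedence).
    ids = tuple((0, int(p), "") if p.isdigit() else (1, 0, p) for p in pre.split("."))
    return (core, 0, ids)


def terms_for(part):
    """Expand one comma-separated constraint term into (operator, version) pairs."""
    part = part.strip()
    if part == "*":
        return []
    if part[0] == "^":
        (major, _, _), _ = parse(part[1:])
        return [(">=", part[1:]), ("<", f"{major + 1}.0.0")]
    if part[0] == "~":
        (major, minor, _), _ = parse(part[1:])
        return [(">=", part[1:]), ("<", f"{major}.{minor + 1}.0")]
    for op in (">=", "<=", ">", "<"):
        if part.startswith(op):
            return [(op, part[len(op):].strip())]
    return [("==", part)]


def resolve(constraint, versions, yanked=()):
    """Return the highest version satisfying `constraint`, or None."""
    terms = [t for part in constraint.split(",") for t in terms_for(part)]
    exact = len(terms) == 1 and terms[0][0] == "=="
    # Prereleases only qualify when the constraint itself names one.
    allow_pre = any(parse(bound)[1] is not None for _, bound in terms)
    matches = []
    for v in versions:
        if v in yanked and not exact:
            continue
        if parse(v)[1] is not None and not allow_pre:
            continue
        if all(OPS[op](sort_key(v), sort_key(bound)) for op, bound in terms):
            matches.append(v)
    return max(matches, key=sort_key) if matches else None
```

A None result corresponds to the unsatisfiable-constraint error described next.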
Unsatisfiable constraints:
// API Response: 404
{
"error": {
"code": "VERSION_NOT_FOUND",
"message": "No version of 'rob/summarize' satisfies constraint '>=5.0.0'",
"details": {
"tool": "rob/summarize",
"constraint": ">=5.0.0",
"available_versions": ["1.0.0", "1.1.0", "1.2.0"],
"latest_stable": "1.2.0"
}
}
}
Prerelease Handling
- Prereleases are not returned for * or range constraints by default
- To install prerelease: cmdforge registry install rob/summarize@2.0.0-beta.1
- To allow prereleases in manifest: version: ">=2.0.0-0" (the -0 suffix includes prereleases)
Download Endpoint Version Selection
The /api/v1/tools/{owner}/{name}/download endpoint accepts version parameters:
| Parameter | Behavior | Example |
|---|---|---|
| (none) | Returns latest stable version | /download → 1.2.0 |
| version=1.2.0 | Exact version (must exist) | /download?version=1.2.0 |
| version=^1.0.0 | Server resolves constraint | /download?version=^1.0.0 → 1.2.0 |
| version=latest | Alias for latest stable | /download?version=latest |
Server-side resolution: The API server resolves version constraints, not the client. This ensures consistent resolution and allows the server to apply policies (e.g., exclude yanked versions).
GET /api/v1/tools/rob/summarize/download?version=^1.0.0&install=true
Response (200):
{
"data": {
"owner": "rob",
"name": "summarize",
"resolved_version": "1.2.0",
"config": "... YAML content ..."
},
"meta": {
"constraint": "^1.0.0",
"available_versions": ["1.0.0", "1.1.0", "1.2.0"]
}
}
Invalid/unsatisfiable constraint:
GET /api/v1/tools/rob/summarize/download?version=^5.0.0
Response (404):
{
"error": {
"code": "CONSTRAINT_UNSATISFIABLE",
"message": "No version matches constraint '^5.0.0'",
"details": {
"constraint": "^5.0.0",
"latest_stable": "1.2.0",
"available_versions": ["1.0.0", "1.1.0", "1.2.0"]
}
}
}
Tool Resolution Order
When a tool is invoked, the CLI searches in this order:
1. Local project: ./.cmdforge/<owner>/<name>/config.yaml (or ./.cmdforge/<name>/ for unnamespaced)
2. Global user: ~/.cmdforge/<owner>/<name>/config.yaml
3. Registry: Fetch from API, install to global, then run
4. Error: Tool '<toolname>' not found
Step 3 only occurs if auto_fetch_from_registry: true in config (default: true).
Path convention: Use .cmdforge/ (with leading dot) for both local and global to maintain consistency.
Resolution also respects namespacing:
- summarize → searches for any tool named summarize, prefers official/summarize if exists
- rob/summarize → searches for exactly rob/summarize
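The local part of this lookup (project directory first, then the global directory) could be sketched as below. The function name find_installed_tool is hypothetical, and the order used for unnamespaced names (flat layout, then official/, then any other owner alphabetically) is an assumption; the document only states that official/ is preferred.

```python
from pathlib import Path


def find_installed_tool(ref, project_dir, home_dir):
    """Return the config.yaml path for `ref` ("name" or "owner/name"), or None."""
    owner, _, name = ref.rpartition("/")
    for root in (Path(project_dir) / ".cmdforge", Path(home_dir) / ".cmdforge"):
        if owner:
            candidates = [root / owner / name / "config.yaml"]
        else:
            # Unnamespaced: flat layout, then official/, then any other owner.
            owners = sorted(p.name for p in root.iterdir() if p.is_dir()) if root.is_dir() else []
            candidates = [root / name / "config.yaml",
                          root / "official" / name / "config.yaml"]
            candidates += [root / o / name / "config.yaml" for o in owners]
        for path in candidates:
            if path.is_file():
                return path
    return None  # caller falls through to the registry fetch or the not-found error
```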
Official Namespace
The slug official is reserved for curated, high-quality tools maintained by the registry administrators.
- Shorthand summarize resolves to official/summarize if it exists
- If no official/summarize, falls back to most-downloaded tool named summarize
- To avoid ambiguity, always use full owner/name in manifests
Reserved slugs that cannot be registered: official, admin, system, api, registry, cmdforge
Auto-Fetch Behavior
When enabled (auto_fetch_from_registry: true), missing tools are automatically fetched:
$ summarize < file.txt
# Tool 'summarize' not found locally.
# Fetching from registry...
# Installed: official/summarize@1.2.0
# Running...
Behavior details:
- Fetches latest stable version unless pinned in cmdforge.yaml
- Installs to ~/.cmdforge/<owner>/<name>/
- Generates wrapper script in ~/.local/bin/
- Subsequent runs use local copy (no re-fetch)
To disable (require explicit install):
# ~/.cmdforge/config.yaml
auto_fetch_from_registry: false
Wrapper Script Collisions
When two tools from different owners have the same name:
| Scenario | Behavior |
|---|---|
| Install official/summarize | Creates wrapper ~/.local/bin/summarize |
| Install rob/summarize (collision) | Creates wrapper ~/.local/bin/rob-summarize |
| Uninstall official/summarize | Removes summarize wrapper, promotes rob-summarize → summarize if desired |
The first-installed tool with a given name gets the short wrapper. Subsequent tools use owner-name format.
To invoke a specific owner's tool:
# Short form (whichever was installed first)
summarize < file.txt
# Explicit owner form (always works)
rob-summarize < file.txt
# Or via cmdforge run
cmdforge run rob/summarize < file.txt
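The first-installed-wins rule reduces to a small naming function. In this sketch the function name wrapper_name and the existing mapping (wrapper filename to the owner/name it invokes) are hypothetical; how the CLI actually tracks installed wrappers is not specified here.

```python
def wrapper_name(owner, name, existing):
    """Pick the wrapper filename for owner/name.

    `existing` maps wrapper filenames already in ~/.local/bin/ to the
    "owner/name" they invoke (a hypothetical bookkeeping structure).
    """
    holder = existing.get(name)
    if holder is None or holder == f"{owner}/{name}":
        return name               # short name is free, or already ours (reinstall)
    return f"{owner}-{name}"      # collision: fall back to owner-name
```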
Project Manifest (cmdforge.yaml)
Defines tool dependencies with optional runtime overrides:
name: my-ai-project
version: "1.0.0"
dependencies:
  - name: rob/summarize
    version: ">=1.0.0"
overrides:
  rob/summarize:
    provider: ollama
Overrides are applied at runtime and do not mutate installed tool configs.
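One way to apply overrides without touching the installed config is to merge into a deep copy at load time. The sketch below assumes overrides are merged key by key, recursing into nested mappings; the function name apply_overrides and the sample provider value are hypothetical.

```python
import copy


def apply_overrides(tool_config, overrides):
    """Return a new config with `overrides` merged in; inputs are left untouched."""
    merged = copy.deepcopy(tool_config)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)  # merge nested mappings
        else:
            merged[key] = copy.deepcopy(value)                 # scalars and lists replace
    return merged


installed = {"name": "summarize", "provider": "default-provider", "steps": [{"type": "prompt"}]}
effective = apply_overrides(installed, {"provider": "ollama"})
```

Here effective carries provider: ollama for this run, while installed (and the config.yaml it was read from) keeps its original value.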
CLI Config and Tokens
Global config lives in ~/.cmdforge/config.yaml:
registry:
  url: https://gitea.brrd.tech/api/v1 # Must match canonical base path
  token: "reg_xxxxxxxxxxxx"
  client_id: "anon_abc123def456"
auto_fetch_from_registry: true
client_id is generated locally and used for anonymous install dedupe.
Publishing and Auth
Publishing uses registry accounts, not Gitea accounts:
- Public endpoints require no auth.
- POST /tools requires a registry token.
- The API server uses a private Gitea service account to open PRs.
Publish Idempotency and Edge Cases
Idempotency key: owner/name@version
| Scenario | API Response | HTTP Code |
|---|---|---|
| New version, no PR exists | Create PR, return URL | 201 Created |
| PR already exists (pending) | Return existing PR URL | 200 OK |
| Version already published | Error: version exists | 409 Conflict |
| PR was closed without merge | Allow new PR | 201 Created |
| PR was merged, then tool deleted | Error: version exists (tombstone) | 409 Conflict |
Version immutability enforcement:
// Attempt to publish existing version
// Response: 409 Conflict
{
"error": {
"code": "VERSION_EXISTS",
"message": "Version 1.2.0 of 'rob/summarize' already exists and cannot be overwritten",
"details": {
"published_at": "2025-01-15T10:30:00Z",
"action": "Bump version number to publish changes"
}
}
}
Closed PR handling:
- Track PR state in database: pending, merged, closed
- If PR was closed (rejected/abandoned), allow new submission for same version
- If PR was merged, version is immutable forever
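The scenario table and the PR-state rules can be condensed into one decision function. This is a sketch under assumptions: the function name publish_decision and the outcome labels EXISTING_PR and PR_CREATED are hypothetical, and published is taken to be true for tombstoned versions as well.

```python
def publish_decision(published, pr_status):
    """Map registry state for owner/name@version to (http_status, outcome).

    published: True if the version exists in the registry (including tombstones).
    pr_status: None, "pending", "merged", or "closed" from the PR tracking table.
    """
    if published or pr_status == "merged":
        return 409, "VERSION_EXISTS"   # merged versions are immutable forever
    if pr_status == "pending":
        return 200, "EXISTING_PR"      # idempotent retry: return the existing PR URL
    return 201, "PR_CREATED"           # no PR yet, or the previous PR was closed
```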
Update flow (new version, not overwrite):
- Developer modifies tool locally
- Bumps version in config.yaml (e.g., 1.2.0 → 1.3.0)
- Runs cmdforge registry publish
- New PR created for 1.3.0
- Old version 1.2.0 remains available
Publisher Registration
Publishers register on the registry website, not Gitea:
Registration flow:
- User visits https://gitea.brrd.tech/registry/register (or future cmdforge.brrd.tech)
- Creates account with email + password + slug
- Receives verification email (optional in v1, but track verified status)
- Logs into dashboard at /dashboard
- Generates API token from dashboard
- Uses token in CLI for publishing
Authentication Security
Password hashing:
- Algorithm: Argon2id (memory-hard, recommended by OWASP)
- Parameters: memory=65536, iterations=3, parallelism=4
- Library: argon2-cffi for Python
from argon2 import PasswordHasher
ph = PasswordHasher(memory_cost=65536, time_cost=3, parallelism=4)
password_hash = ph.hash(password)  # avoid shadowing the built-in hash()
ph.verify(password_hash, password)  # raises VerifyMismatchError on mismatch
API token format:
reg_<random-32-bytes-base62>
Example: reg_7kX9mPqR2sT4vW6xY8zA1bC3dE5fG7hJ
- Prefix reg_ for easy identification in logs/configs
- 32 bytes of cryptographically random data
- Base62 encoded (alphanumeric, no special chars)
- Total length: ~47 characters
- Stored as SHA-256 hash in database (never plain text)
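A token in this format can be produced with the standard library alone. The sketch below is one possible encoding, not necessarily the one the registry uses: the alphabet order and the helper names base62, new_token, and token_matches are assumptions.

```python
import hashlib
import secrets

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62(data):
    """Encode bytes as a base62 string (big-endian integer interpretation)."""
    n = int.from_bytes(data, "big")
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits)) or "0"


def new_token():
    """Return (token, token_hash); only the hash should be stored."""
    token = "reg_" + base62(secrets.token_bytes(32))
    return token, hashlib.sha256(token.encode()).hexdigest()


def token_matches(presented, stored_hash):
    """Hash the presented token and compare in constant time."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return secrets.compare_digest(digest, stored_hash)
```

Thirty-two random bytes encode to at most 43 base62 characters, which with the reg_ prefix gives the roughly 47-character length noted above. A fast hash such as SHA-256 is reasonable here because the token already carries 256 bits of entropy, unlike a user-chosen password.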
Token lifecycle:
| Action | Behavior |
|---|---|
| Generate | Create new token, return once, store hash |
| List | Show token name, created date, last used (not the token itself) |
| Revoke | Set revoked_at timestamp, reject future uses |
| Rotate | Generate new token, optionally revoke old |
Rate limits:
| Endpoint | Limit | Window | Scope | Retry-After |
|---|---|---|---|---|
| POST /register | 5 | 1 hour | IP | 3600 |
| POST /login | 10 | 15 min | IP | 900 |
| POST /login (failed) | 5 | 15 min | IP + email | 900 |
| POST /tokens | 10 | 1 hour | Token | 3600 |
| POST /tools | 20 | 1 hour | Token | 3600 |
| GET /tools/* | 100 | 1 min | IP | 60 |
| GET /download | 60 | 1 min | IP | 60 |
Rate limit response (429):
{
"error": {
"code": "RATE_LIMITED",
"message": "Too many requests. Try again in 60 seconds.",
"details": {
"limit": 100,
"window": "1 minute",
"retry_after": 60
}
}
}
Headers on rate-limited response:
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1705766400
Scope priority: For authenticated requests, both IP and token limits apply. The more restrictive limit wins.
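The document does not say which limiting algorithm is used; a fixed-window counter is one simple option that yields the headers shown above. The sketch below is in-memory and single-process, and the class name FixedWindowLimiter is hypothetical.

```python
import time


class FixedWindowLimiter:
    """Fixed-window counter keyed by scope (an IP address or a token id)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock            # injectable so tests can control time
        self.counters = {}            # key -> (window_start, request_count)

    def check(self, key):
        """Count one request; return (allowed, response_headers)."""
        now = self.clock()
        start, count = self.counters.get(key, (now, 0))
        if now - start >= self.window:
            start, count = now, 0     # window elapsed: start a fresh one
        reset = int(start + self.window)
        allowed = count < self.limit
        if allowed:
            count += 1
        self.counters[key] = (start, count)
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(self.limit - count),
            "X-RateLimit-Reset": str(reset),
        }
        if not allowed:
            headers["Retry-After"] = str(max(1, reset - int(now)))
        return allowed, headers
```

For authenticated requests, running one limiter keyed by IP and one keyed by token and allowing the request only when both allow it gives the "more restrictive limit wins" behavior.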
Account lockout:
- After 5 failed login attempts: 15-minute lockout for that email
- After 10 failed attempts: 1-hour lockout
- Lockout clears on successful password reset
Password reset flow (deferred to v1.1):
- User requests reset via email
- Server generates time-limited token (1 hour expiry)
- Email contains reset link with token
- User sets new password
- All existing sessions/tokens optionally invalidated
Email verification flow (deferred to v1.1):
- On registration, send verification email
- User clicks link with verification token
- Set verified = true in database
- Unverified accounts can browse but not publish
Token Scopes and Authorization
Tokens have scopes that limit their capabilities:
| Scope | Permissions |
|---|---|
| read | View own published tools, download stats |
| publish | Submit new tools, update own tool metadata |
| admin | Yank tools, manage categories (registry admins only) |
Default scope: New tokens get read,publish by default.
Ownership enforcement:
@app.route('/api/v1/tools', methods=['POST'])
@require_token(scopes=['publish'])
def publish_tool():
    token = get_current_token()
    tool_data = request.json
    # Enforce owner == token holder's slug
    if tool_data['owner'] != token.publisher.slug:
        return {
            "error": {
                "code": "FORBIDDEN",
                "message": f"Cannot publish to namespace '{tool_data['owner']}'. "
                           f"Your namespace is '{token.publisher.slug}'."
            }
        }, 403
    # Proceed with publish...
GET /api/v1/me/tools authorization:
- Requires valid token with read scope
- Returns only tools where owner == token.publisher.slug
- Includes pending PRs and all versions (including yanked)
Web Session Security
Dashboard login uses session cookies (not tokens) for browser auth:
Cookie settings:
SESSION_COOKIE_NAME = 'cmdforge_session'
SESSION_COOKIE_HTTPONLY = True # Prevent JS access
SESSION_COOKIE_SECURE = True # HTTPS only in production
SESSION_COOKIE_SAMESITE = 'Lax' # CSRF protection
PERMANENT_SESSION_LIFETIME = 86400 * 7 # 7 days, in seconds; applies to sessions marked permanent
CSRF protection:
- All POST/PUT/DELETE forms include csrf_token hidden field
- Token validated server-side before processing
- 403 Forbidden if token missing or invalid
Session lifecycle:
| Event | Action |
|---|---|
| Login | Create session, set cookie |
| Logout | Delete session, clear cookie |
| Idle 24h | Session expires, re-login required |
| Password change | Invalidate all sessions |
| Token revocation | Existing sessions continue (token != session) |
Secure session storage:
# Store sessions in DB, not filesystem
from flask_session import Session
app.config['SESSION_TYPE'] = 'sqlalchemy'
app.config['SESSION_SQLALCHEMY_TABLE'] = 'sessions'
Database schema:
-- Publishers
CREATE TABLE publishers (
id INTEGER PRIMARY KEY AUTOINCREMENT,
email TEXT UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
slug TEXT UNIQUE NOT NULL, -- immutable namespace: "rob", "alice-dev"
display_name TEXT NOT NULL, -- mutable: "Rob", "Alice Developer"
bio TEXT,
website TEXT,
verified BOOLEAN DEFAULT FALSE,
locked_until TIMESTAMP, -- account lockout
failed_login_attempts INTEGER DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- API tokens (one publisher can have multiple)
CREATE TABLE api_tokens (
id INTEGER PRIMARY KEY AUTOINCREMENT,
publisher_id INTEGER NOT NULL REFERENCES publishers(id),
token_hash TEXT NOT NULL,
name TEXT NOT NULL, -- "CLI token", "CI token"
last_used_at TIMESTAMP,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
revoked_at TIMESTAMP -- NULL if active
);
-- Tools (links to publisher)
CREATE TABLE tools (
id INTEGER PRIMARY KEY AUTOINCREMENT,
owner TEXT NOT NULL, -- namespace slug (immutable, from publisher.slug)
name TEXT NOT NULL,
version TEXT NOT NULL,
description TEXT,
category TEXT,
tags TEXT, -- JSON array
config_yaml TEXT NOT NULL, -- Full tool config
readme TEXT,
publisher_id INTEGER NOT NULL REFERENCES publishers(id),
deprecated BOOLEAN DEFAULT FALSE,
deprecated_message TEXT,
replacement TEXT,
downloads INTEGER DEFAULT 0,
published_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UNIQUE(owner, name, version)
);
-- Download stats (for deduplication)
CREATE TABLE download_stats (
id INTEGER PRIMARY KEY AUTOINCREMENT,
tool_id INTEGER NOT NULL REFERENCES tools(id),
client_id TEXT NOT NULL,
downloaded_at DATE NOT NULL,
UNIQUE(tool_id, client_id, downloaded_at)
);
-- Search index (FTS5)
CREATE VIRTUAL TABLE tools_fts USING fts5(
name, description, tags, readme,
content='tools',
content_rowid='id'
);
-- FTS5 sync triggers (required for external content tables)
CREATE TRIGGER tools_ai AFTER INSERT ON tools BEGIN
INSERT INTO tools_fts(rowid, name, description, tags, readme)
VALUES (new.id, new.name, new.description, new.tags, new.readme);
END;
CREATE TRIGGER tools_ad AFTER DELETE ON tools BEGIN
INSERT INTO tools_fts(tools_fts, rowid, name, description, tags, readme)
VALUES ('delete', old.id, old.name, old.description, old.tags, old.readme);
END;
CREATE TRIGGER tools_au AFTER UPDATE ON tools BEGIN
INSERT INTO tools_fts(tools_fts, rowid, name, description, tags, readme)
VALUES ('delete', old.id, old.name, old.description, old.tags, old.readme);
INSERT INTO tools_fts(rowid, name, description, tags, readme)
VALUES (new.id, new.name, new.description, new.tags, new.readme);
END;
-- Pending PRs (track publish state)
CREATE TABLE pending_prs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
publisher_id INTEGER NOT NULL REFERENCES publishers(id),
owner TEXT NOT NULL,
name TEXT NOT NULL,
version TEXT NOT NULL,
pr_number INTEGER NOT NULL,
pr_url TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'pending', -- pending, merged, closed
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UNIQUE(owner, name, version)
);
-- Webhook sync log (idempotency)
CREATE TABLE webhook_log (
id INTEGER PRIMARY KEY AUTOINCREMENT,
delivery_id TEXT UNIQUE NOT NULL, -- Gitea delivery ID
event_type TEXT NOT NULL,
processed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Note on tags indexing: The tags column stores JSON arrays as text. For v1, FTS5 will search within the JSON string. If tag filtering becomes a bottleneck, normalize to a tool_tags junction table:
-- Future: normalized tags (if needed)
CREATE TABLE tags (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT UNIQUE NOT NULL
);
CREATE TABLE tool_tags (
tool_id INTEGER REFERENCES tools(id),
tag_id INTEGER REFERENCES tags(id),
PRIMARY KEY (tool_id, tag_id)
);
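The UNIQUE(tool_id, client_id, downloaded_at) constraint on download_stats is what makes install counting idempotent per client per day. A minimal sketch with the standard sqlite3 module, using a trimmed copy of the schema; the function name record_download and the choice to bump tools.downloads in the same transaction are assumptions.

```python
import sqlite3


def record_download(conn, tool_id, client_id, day):
    """Count a download once per (tool, client, day); return True if it was counted."""
    with conn:  # one transaction: both statements commit together or not at all
        cur = conn.execute(
            "INSERT OR IGNORE INTO download_stats (tool_id, client_id, downloaded_at) "
            "VALUES (?, ?, ?)",
            (tool_id, client_id, day),
        )
        if cur.rowcount == 0:
            return False  # duplicate for this client and day: UNIQUE constraint hit
        conn.execute("UPDATE tools SET downloads = downloads + 1 WHERE id = ?", (tool_id,))
        return True


# Trimmed schema for demonstration only (the real tables have more columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tools (id INTEGER PRIMARY KEY, downloads INTEGER DEFAULT 0);
CREATE TABLE download_stats (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    tool_id INTEGER NOT NULL REFERENCES tools(id),
    client_id TEXT NOT NULL,
    downloaded_at DATE NOT NULL,
    UNIQUE(tool_id, client_id, downloaded_at)
);
INSERT INTO tools (id) VALUES (1);
""")
```

Repeated installs by the same client_id on the same day leave the counter unchanged, while a new day or a different client increments it.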
CLI first-time publish flow:
$ cmdforge registry publish
No registry account configured.
1. Register at: https://gitea.brrd.tech/registry/register
2. Generate a token from your dashboard
3. Enter your token below
Registry token: ********
Token saved to ~/.cmdforge/config.yaml
Validating tool...
✓ config.yaml is valid
✓ README.md exists (2.3 KB)
✓ Version 1.0.0 not yet published
Publishing rob/my-tool@1.0.0...
✓ PR created: https://gitea.brrd.tech/rob/CmdForge-Registry/pulls/42
Your tool is pending review. You'll receive an email when it's approved.
CLI Commands Reference
Full mapping of CLI commands to API calls:
Registry Commands
# Search for tools
$ cmdforge registry search <query> [--category=<cat>] [--limit=20]
→ GET /api/v1/tools/search?q=<query>&category=<cat>&limit=20
# Browse tools (TUI)
$ cmdforge registry browse [--category=<cat>]
→ GET /api/v1/tools?category=<cat>&page=1
→ GET /api/v1/categories
# View tool details
$ cmdforge registry info <owner/name>
→ GET /api/v1/tools/<owner>/<name>
# Install a tool
$ cmdforge registry install <owner/name> [--version=<ver>]
→ GET /api/v1/tools/<owner>/<name>/download?version=<ver>&install=true
→ Writes to ~/.cmdforge/<owner>/<name>/config.yaml
→ Generates ~/.local/bin/<name> wrapper (or <owner>-<name> if collision)
# Uninstall a tool
$ cmdforge registry uninstall <owner/name>
→ Removes ~/.cmdforge/<owner>/<name>/
→ Removes wrapper script
# Publish a tool
$ cmdforge registry publish [path] [--dry-run]
→ POST /api/v1/tools (with registry token)
→ Returns PR URL
# List my published tools
$ cmdforge registry my-tools
→ GET /api/v1/me/tools (with registry token)
# Update index cache
$ cmdforge registry update
→ GET /api/v1/index.json
→ Writes to ~/.cmdforge/registry/index.json
Project Commands
# Install project dependencies from cmdforge.yaml
$ cmdforge install
→ Reads ./cmdforge.yaml
→ For each dependency:
GET /api/v1/tools/<owner>/<name>/download?version=<constraint>&install=true
→ Installs to ~/.cmdforge/<owner>/<name>/
# Add a dependency to cmdforge.yaml
$ cmdforge add <owner/name> [--version=<constraint>]
→ Adds to ./cmdforge.yaml dependencies
→ Runs install for that tool
# Show project dependencies status
$ cmdforge deps
→ Reads ./cmdforge.yaml
→ Shows installed status for each dependency
→ Note: "cmdforge list" is reserved for listing installed tools
Command naming note: cmdforge list already exists to list locally installed tools. Use cmdforge deps to show project manifest dependencies.
Flags available on most commands
| Flag | Description |
|---|---|
| --offline | Use cached index only, don't fetch |
| --refresh | Force refresh of cached data |
| --json | Output in JSON format |
| --verbose | Show detailed output |
Webhooks and Security
HMAC Verification
All Gitea webhooks are verified using HMAC-SHA256:
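import hmac
import hashlib

def verify_webhook(request, secret):
    """Return True if X-Gitea-Signature matches the HMAC-SHA256 of the raw body."""
    signature = request.headers.get('X-Gitea-Signature')
    if not signature:
        return False
    expected = hmac.new(
        secret.encode(),
        request.get_data(),  # raw request bytes (Flask); sign the unparsed payload
        hashlib.sha256
    ).hexdigest()
    # Constant-time comparison to avoid leaking the signature via timing
    return hmac.compare_digest(signature, expected)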
Replay Protection
While sync is idempotent, implement basic replay protection:
from datetime import datetime, timezone

def process_webhook(request):
    # Verify signature first so unauthenticated requests never reach the log lookup
    if not verify_webhook(request, WEBHOOK_SECRET):
        return {"error": "invalid_signature"}, 401
    delivery_id = request.headers.get('X-Gitea-Delivery')
    # Check if already processed
    if db.webhook_log.exists(delivery_id=delivery_id):
        return {"status": "already_processed"}, 200
    # Process with lock to prevent concurrent processing
    with db.lock(f"webhook:{delivery_id}"):
        # Double-check after acquiring lock
        if db.webhook_log.exists(delivery_id=delivery_id):
            return {"status": "already_processed"}, 200
        # Process the webhook
        result = sync_from_repo()
        # Log successful processing
        db.webhook_log.insert(
            delivery_id=delivery_id,
            event_type=request.json.get('action'),
            processed_at=datetime.now(timezone.utc)
        )
    return {"status": "processed"}, 200
Sync Job Locking
Prevent concurrent sync operations:
# Using file lock or database advisory lock
SYNC_LOCK_TIMEOUT = 300  # 5 minutes max

def sync_from_repo():
    try:
        with acquire_lock("registry_sync", timeout=SYNC_LOCK_TIMEOUT):
            # Pull latest from Gitea
            repo.fetch()
            repo.reset('origin/main', hard=True)
            # Parse and update database
            for tool_path in glob('tools/*/*/config.yaml'):
                update_tool_in_db(tool_path)
            # Rebuild FTS index if needed
            rebuild_fts_index()
    except LockTimeout:
        logger.warning("Sync already in progress, skipping")
        return {"status": "skipped", "reason": "sync_in_progress"}
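The `acquire_lock` helper and `LockTimeout` exception used above are not defined in this document. A minimal in-process sketch, assuming a single API process, could look like the following; a multi-process deployment would need a file lock or database advisory lock instead, as the comment above suggests.

```python
import threading
from contextlib import contextmanager


class LockTimeout(Exception):
    """Raised when the lock cannot be acquired within the timeout."""


_locks = {}                      # lock name -> threading.Lock
_locks_guard = threading.Lock()  # protects the _locks dict itself


@contextmanager
def acquire_lock(name, timeout):
    """Named in-process lock; raises LockTimeout if not acquired in time."""
    with _locks_guard:
        lock = _locks.setdefault(name, threading.Lock())
    if not lock.acquire(timeout=timeout):
        raise LockTimeout(name)
    try:
        yield
    finally:
        lock.release()


def try_sync():
    """Stand-in for sync_from_repo(): report whether the lock was obtained."""
    try:
        with acquire_lock("registry_sync", timeout=0.1):
            return "synced"
    except LockTimeout:
        return "skipped"
```

Calling `try_sync()` while the lock is already held returns "skipped" after the timeout, which mirrors the skip-and-log behavior described above.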
Atomic Sync Strategy
To avoid a partially updated DB during webhook sync, use a transactional table swap:
def sync_from_repo_atomic():
    with acquire_lock("registry_sync", timeout=SYNC_LOCK_TIMEOUT):
        # 1. Pull latest from Gitea
        repo.fetch()
        repo.reset('origin/main', hard=True)

        # 2. Parse all tools into memory
        new_tools = []
        for tool_path in glob('tools/*/*/config.yaml'):
            tool_data = parse_tool(tool_path)
            if tool_data:
                new_tools.append(tool_data)

        # 3. Atomic swap using transaction
        with db.transaction():
            # Create temp table
            db.execute("CREATE TABLE tools_new AS SELECT * FROM tools WHERE 0")
            # Bulk insert into temp table
            for tool in new_tools:
                db.execute("INSERT INTO tools_new ...", tool)
            # Swap tables atomically
            db.execute("ALTER TABLE tools RENAME TO tools_old")
            db.execute("ALTER TABLE tools_new RENAME TO tools")
            db.execute("DROP TABLE tools_old")
            # Rebuild FTS index
            db.execute("INSERT INTO tools_fts(tools_fts) VALUES('rebuild')")
            # Update sync timestamp
            db.execute("UPDATE sync_status SET last_sync = ?", [datetime.utcnow()])
Why atomic: Per-row updates with FTS triggers can yield inconsistent reads under load. Readers may see partial state mid-sync. Table swap ensures all-or-nothing visibility.
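The swap can be tried end to end with Python's built-in `sqlite3` module. This is a sketch under simplifying assumptions: the three-column `tools` schema and the `swap_tools_table` helper are hypothetical, and a real table would also need its indexes, triggers, and FTS index recreated, because neither a fresh table nor `CREATE TABLE ... AS SELECT` carries them over.

```python
import sqlite3


def swap_tools_table(conn, new_rows):
    """Replace the contents of `tools` in one transaction via a table swap."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("CREATE TABLE tools_new (owner TEXT, name TEXT, version TEXT)")
        conn.executemany("INSERT INTO tools_new VALUES (?, ?, ?)", new_rows)
        conn.execute("ALTER TABLE tools RENAME TO tools_old")
        conn.execute("ALTER TABLE tools_new RENAME TO tools")
        conn.execute("DROP TABLE tools_old")
        conn.execute("COMMIT")
    except Exception:
        # Any failure undoes the CREATE/INSERT/RENAME steps together
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts sqlite3 in autocommit mode, so BEGIN/COMMIT are explicit
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE tools (owner TEXT, name TEXT, version TEXT)")
conn.execute("INSERT INTO tools VALUES ('rob', 'summarize', '1.0.0')")

swap_tools_table(conn, [("rob", "summarize", "1.2.0"), ("alice", "translate", "2.1.0")])
rows = conn.execute("SELECT owner, name, version FROM tools ORDER BY owner").fetchall()
```

Because SQLite DDL is transactional, a failure anywhere in the swap leaves the original `tools` table untouched.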
Error Handling
| Error Scenario | Behavior |
|---|---|
| Repo fetch fails | Log error, retry in 5 min, alert if 3 failures |
| YAML parse error | Skip tool, log error, continue with others |
| Database write fails | Rollback transaction, retry once, then alert |
| Lock timeout | Skip this sync, next webhook will retry |
Automated CI Validation
PRs are validated automatically using CmdForge (dogfooding):
PR Submitted
     │
     ▼
┌─────────────────────────────────────┐
│ Gitea CI runs validation tools:     │
│   • schema-validator                │
│   • security-scanner                │
│   • duplicate-detector              │
└───────────────┬─────────────────────┘
                │
        ┌───────┴───────┐
        │               │
    All pass        Any fail
        │               │
        ▼               ▼
  Auto-merge or   Add comment,
 flag for review request changes
Validation checks:
- Schema validation: config.yaml matches expected format
- Security scan: No dangerous shell commands, no secrets in prompts
- Duplicate detection: AI-powered similarity check against existing tools
- README check: README.md exists and is non-empty
CI workflow (.gitea/workflows/validate.yaml):
name: Validate Tool Submission
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Validate schema
        run: python scripts/validate_tool.py ${{ github.event.pull_request.head.sha }}
      - name: Security scan
        run: cmdforge run security-scanner < changed_files.txt
      - name: Check duplicates
        run: cmdforge run duplicate-detector < changed_files.txt
Registry Repository Structure
Full structure of the CmdForge-Registry repo:
CmdForge-Registry/
├── README.md                    # Registry overview
├── CONTRIBUTING.md              # How to submit tools
├── LICENSE
│
├── tools/                       # All published tools
│   ├── rob/
│   │   ├── summarize/
│   │   │   ├── config.yaml
│   │   │   └── README.md
│   │   └── translate/
│   │       ├── config.yaml
│   │       └── README.md
│   └── alice/
│       └── code-review/
│           ├── config.yaml
│           └── README.md
│
├── categories/
│   └── categories.yaml          # Category definitions
│
├── collections/                 # Curated tool collections
│   ├── text-processing-essentials.yaml
│   ├── developer-toolkit.yaml
│   └── data-pipeline-basics.yaml
│
├── index.json                   # Auto-generated search index
│
├── .gitea/
│   └── workflows/
│       ├── validate.yaml        # PR validation
│       ├── build-index.yaml     # Rebuild index on merge
│       └── notify-api.yaml      # Webhook to API server
│
└── scripts/
    ├── validate_tool.py         # Schema validation
    ├── build_index.py           # Generate index.json
    ├── check_duplicates.py      # Similarity detection
    └── security_scan.py         # Security checks
categories.yaml format:
categories:
  - name: text-processing
    description: Tools for manipulating and analyzing text
    icon: 📝
  - name: code
    description: Tools for code review, generation, and analysis
    icon: 💻
  - name: data
    description: Tools for data transformation and analysis
    icon: 📊
  - name: media
    description: Tools for image, audio, and video processing
    icon: 🎨
  - name: productivity
    description: General productivity and automation tools
    icon: ⚡
Download Stats
Counting Methodology
- Count installs only, not views or searches
- Increment after successful download (response sent)
- Dedupe by client_id + tool_id + date
import hashlib
from datetime import date

def download_tool(owner, name, version, install=False, client_id=None):
    tool = get_tool(owner, name, version)
    if not tool:
        return {"error": "not_found"}, 404
    config_yaml = tool.config_yaml
    # Only count if this is an install (not just viewing)
    if install:
        record_download(tool.id, client_id)
    return {"config": config_yaml}, 200

def record_download(tool_id, client_id):
    today = date.today()
    # Use client_id if provided, otherwise derive an anonymous fallback from the IP.
    # hashlib is used rather than the built-in hash(), which is salted per process
    # and would break dedupe across restarts and workers.
    ip_hash = hashlib.sha256(request.remote_addr.encode()).hexdigest()[:16]
    effective_client_id = client_id or f"anon_{ip_hash}"
    # Dedupe: only count once per client per tool per day
    try:
        db.download_stats.insert(
            tool_id=tool_id,
            client_id=effective_client_id,
            downloaded_at=today
        )
        # Increment counter (can be async/batch updated)
        db.execute("UPDATE tools SET downloads = downloads + 1 WHERE id = ?", [tool_id])
    except IntegrityError:
        pass  # Already counted today, ignore
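The dedupe rule relies on a uniqueness constraint over (tool, client, day). A minimal sketch with `sqlite3`, assuming a `download_stats` table with the three columns used above; the exact schema and the `record_once` helper are assumptions:

```python
import sqlite3
from datetime import date


def init_stats(conn):
    conn.execute(
        "CREATE TABLE download_stats ("
        " tool_id INTEGER NOT NULL,"
        " client_id TEXT NOT NULL,"
        " downloaded_at TEXT NOT NULL,"
        " UNIQUE (tool_id, client_id, downloaded_at))"
    )


def record_once(conn, tool_id, client_id, day):
    """Insert a stats row; return True only if it was newly counted."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO download_stats (tool_id, client_id, downloaded_at)"
        " VALUES (?, ?, ?)",
        (tool_id, client_id, day.isoformat()),
    )
    return cur.rowcount == 1  # 0 means the UNIQUE constraint ignored the row


conn = sqlite3.connect(":memory:")
init_stats(conn)
day = date(2026, 1, 15)
results = [
    record_once(conn, 1, "anon_abc", day),               # first install today
    record_once(conn, 1, "anon_abc", day),               # repeat, same day
    record_once(conn, 1, "anon_xyz", day),               # different client
    record_once(conn, 1, "anon_abc", date(2026, 1, 16)),  # same client, next day
]
```

Only calls that return True would go on to increment the `downloads` counter.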
Client ID Generation
CLI generates a persistent anonymous ID on first run:
# In CLI, on first run
import uuid
import os

CONFIG_PATH = os.path.expanduser("~/.cmdforge/config.yaml")

def get_or_create_client_id():
    config = load_config()
    if 'client_id' not in config:
        config['client_id'] = f"anon_{uuid.uuid4().hex[:16]}"
        save_config(config)
    return config['client_id']
Fallback when client_id missing:
- If the X-Client-ID header is not sent, use an IP hash as fallback
- This still provides some dedupe for anonymous users
- Logged-in users' downloads are attributed to their account instead
Privacy Considerations
- No IP addresses stored in database
- client_id is client-controlled and can be regenerated
- Stats are aggregated (total count), not individual tracking
Async Stats Strategy
To avoid DB contention on the hot download path:
from datetime import date
from queue import Empty, Queue
from threading import Thread

# In-memory queue for stats
stats_queue = Queue()
MAX_BATCH_SIZE = 100  # flush early so the batch cannot grow unbounded under load

def record_download_async(tool_id, client_id):
    """Non-blocking: enqueue for background processing"""
    stats_queue.put({
        'tool_id': tool_id,
        'client_id': client_id,
        'date': date.today()
    })

def stats_worker():
    """Background thread: flush after 5 idle seconds or when the batch is full"""
    batch = []
    while True:
        try:
            item = stats_queue.get(timeout=5)
            batch.append(item)
            if len(batch) < MAX_BATCH_SIZE:
                continue
        except Empty:
            pass
        if batch:
            flush_batch(batch)
            batch = []

def flush_batch(batch):
    """Bulk insert with conflict ignore"""
    with db.transaction():
        for item in batch:
            try:
                db.execute("""
                    INSERT INTO download_stats (tool_id, client_id, downloaded_at)
                    VALUES (?, ?, ?)
                    ON CONFLICT DO NOTHING
                """, [item['tool_id'], item['client_id'], item['date']])
            except Exception as e:
                logger.warning(f"Stats insert failed: {e}")
                # Don't fail downloads for stats errors
Failure behavior: If stats DB write fails, log the error but don't fail the download. Stats are "best effort" - the download must succeed.
Search
- Primary search: SQLite FTS5 inside the API.
- index.json provides offline CLI search and backup.
- If FTS5 is stale, return results with X-Search-Index-Stale: true.
API Caching Strategy
Cache Headers
| Endpoint | Cache-Control | ETag | Notes |
|---|---|---|---|
| GET /index.json | max-age=300, stale-while-revalidate=60 | Yes | 5 min cache, background refresh |
| GET /tools/{owner}/{name} | max-age=60 | Yes | 1 min cache |
| GET /tools/{owner}/{name}/download | max-age=3600, immutable | Yes | Immutable versions, 1 hour |
| GET /tools/search | no-cache | No | Always fresh |
| GET /categories | max-age=3600 | Yes | Categories change rarely |
ETag Implementation
import hashlib

def get_tool_etag(tool):
    """Generate ETag from tool identity (immutable versions don't change)"""
    # Since versions are immutable, owner/name@version is stable
    # Use published_at for extra safety (not updated_at, which doesn't exist)
    content = f"{tool.owner}/{tool.name}@{tool.version}:{tool.published_at.isoformat()}"
    # ETag header values are quoted strings
    return '"' + hashlib.md5(content.encode()).hexdigest() + '"'

def get_index_etag():
    """Generate ETag from last sync timestamp"""
    last_sync = db.get_last_sync_time()
    return '"' + hashlib.md5(last_sync.isoformat().encode()).hexdigest() + '"'

@app.route('/api/v1/tools/<owner>/<name>/download')
def download_tool(owner, name):
    version = request.args.get('version', 'latest')
    tool = resolve_and_get_tool(owner, name, version)
    etag = get_tool_etag(tool)
    # Check If-None-Match header
    if request.headers.get('If-None-Match') == etag:
        return '', 304  # Not Modified
    response = jsonify({
        "data": {
            "owner": tool.owner,
            "name": tool.name,
            "resolved_version": tool.version,
            "config": tool.config_yaml
        }
    })
    response.headers['ETag'] = etag
    response.headers['Cache-Control'] = 'max-age=3600, immutable'
    return response
Note: Since tool versions are immutable, the ETag based on owner/name@version is permanently stable. The published_at timestamp is included for defense-in-depth but won't change.
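Clients may send several ETags in If-None-Match, or a weak validator prefixed with W/, so a plain string comparison can miss valid matches. A minimal sketch of a more tolerant comparison, assuming quoted ETag values without embedded commas; `etag_matches` is a hypothetical helper, not part of the registry code:

```python
from typing import Optional


def _strip_weak(tag: str) -> str:
    """Drop the weak-validator prefix so W/"abc" compares equal to "abc"."""
    return tag[2:] if tag.startswith("W/") else tag


def etag_matches(if_none_match: Optional[str], etag: str) -> bool:
    """Return True when the If-None-Match header matches the current ETag."""
    if not if_none_match:
        return False
    if if_none_match.strip() == "*":
        return True  # wildcard matches any current representation
    # Assumes no commas inside the tags themselves (true for hex digests)
    candidates = {_strip_weak(part.strip()) for part in if_none_match.split(",")}
    return _strip_weak(etag) in candidates
```

When `etag_matches(...)` returns True, the handler would answer 304 with an empty body, as in the download route above.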
DB vs Repo Read Strategy
| Scenario | Read From | Reason |
|---|---|---|
| Normal operation | SQLite DB | Fast, indexed, FTS |
| DB empty/corrupted | Gitea repo | Fallback/recovery |
| Webhook sync in progress | DB (stale OK) | Avoid blocking reads |
| Search query | SQLite FTS5 | Full-text search |
| Download specific version | DB, fallback to repo | DB is cache, repo is truth |
Staleness Detection
from datetime import datetime, timedelta

STALE_THRESHOLD = timedelta(minutes=10)

def is_db_stale():
    last_sync = db.get_last_sync_time()
    return datetime.utcnow() - last_sync > STALE_THRESHOLD

@app.route('/tools/search')
def search_tools():
    # Query-string values arrive via request.args, not as function arguments
    q = request.args.get('q', '')
    results = db.search_fts(q)
    response = jsonify({"results": results})
    if is_db_stale():
        response.headers['X-Search-Index-Stale'] = 'true'
        response.headers['X-Last-Sync'] = db.get_last_sync_time().isoformat()
    return response
Error Model
Response Envelopes
Success response:
{
  "data": { ... },
  "meta": {
    "page": 1,
    "per_page": 20,
    "total": 42,
    "total_pages": 3
  }
}
Error response:
{
  "error": {
    "code": "TOOL_NOT_FOUND",
    "message": "Tool 'foo/bar' does not exist",
    "details": {
      "owner": "foo",
      "name": "bar",
      "suggestion": "Did you mean 'rob/bar'?"
    },
    "docs_url": "https://cmdforge.brrd.tech/docs/errors#TOOL_NOT_FOUND"
  }
}
Error Codes
| Code | HTTP | Description |
|---|---|---|
| TOOL_NOT_FOUND | 404 | Tool does not exist |
| VERSION_NOT_FOUND | 404 | Requested version doesn't exist |
| VERSION_EXISTS | 409 | Cannot overwrite published version |
| INVALID_VERSION | 400 | Version string is not valid semver |
| INVALID_CONSTRAINT | 400 | Version constraint syntax error |
| CONSTRAINT_UNSATISFIABLE | 404 | No version matches constraint |
| VALIDATION_ERROR | 400 | Tool config validation failed |
| UNAUTHORIZED | 401 | Missing or invalid auth token |
| FORBIDDEN | 403 | Token valid but lacks permission |
| RATE_LIMITED | 429 | Too many requests |
| SLUG_TAKEN | 409 | Namespace slug already registered |
| ACCOUNT_LOCKED | 403 | Too many failed login attempts |
| SERVER_ERROR | 500 | Internal error (logged for debugging) |
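The error envelope and the code table can be tied together with a small helper. This is a sketch only: `ERROR_STATUS` mirrors the table above, while `make_error` and its signature are hypothetical.

```python
ERROR_STATUS = {
    "TOOL_NOT_FOUND": 404,
    "VERSION_NOT_FOUND": 404,
    "VERSION_EXISTS": 409,
    "INVALID_VERSION": 400,
    "INVALID_CONSTRAINT": 400,
    "CONSTRAINT_UNSATISFIABLE": 404,
    "VALIDATION_ERROR": 400,
    "UNAUTHORIZED": 401,
    "FORBIDDEN": 403,
    "RATE_LIMITED": 429,
    "SLUG_TAKEN": 409,
    "ACCOUNT_LOCKED": 403,
    "SERVER_ERROR": 500,
}


def make_error(code, message, details=None, docs_url=None):
    """Build an (envelope, http_status) pair in the documented error shape."""
    status = ERROR_STATUS.get(code, 500)  # unknown codes are treated as 500
    error = {"code": code, "message": message}
    if details is not None:
        error["details"] = details
    if docs_url is not None:
        error["docs_url"] = docs_url
    return {"error": error}, status


body, status = make_error(
    "TOOL_NOT_FOUND",
    "Tool 'foo/bar' does not exist",
    details={"owner": "foo", "name": "bar"},
)
```

Keeping the mapping in one place means a handler cannot return a code with a mismatched HTTP status.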
Error Scenarios and Fallbacks
CLI Error Handling
| Scenario | CLI Behavior | User Message |
|---|---|---|
| Registry offline | Use cached tools if available | "Registry unavailable. Using cached version." |
| Tool not found | Check cache, then fail | "Tool 'foo/bar' not found in registry or cache." |
| Version constraint unsatisfiable | Show available versions | "No version matches '>=5.0.0'. Available: 1.0.0, 1.1.0, 1.2.0" |
| Auth token expired | Prompt for new token | "Token expired. Please re-authenticate." |
| Rate limited | Wait and retry (backoff) | "Rate limited. Retrying in 30 seconds..." |
| Network timeout | Retry with backoff, then fail | "Connection timed out. Check your network." |
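The "retry with backoff" rows can be sketched as a small wrapper. Everything here is illustrative: `TransientError`, `retry_with_backoff`, and the default of four attempts with delays doubling from one second are assumptions, not values taken from the CLI.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeouts, HTTP 429)."""


def retry_with_backoff(operation, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on TransientError wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the user
            sleep(base_delay * (2 ** attempt))


# Demo: fail twice, then succeed; record the delays instead of really sleeping
calls = {"count": 0}
delays = []


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("timed out")
    return "ok"


result = retry_with_backoff(flaky_fetch, sleep=delays.append)
```

Injecting `sleep` keeps the wrapper testable; for rate limiting, a server-provided retry interval would normally take precedence over the computed delay.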
Validation Failure Details
When VALIDATION_ERROR occurs, provide specific field errors:
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Tool configuration is invalid",
    "details": {
      "errors": [
        {
          "path": "steps[0].provider",
          "message": "Provider 'gpt5' is not recognized",
          "allowed": ["claude", "openai", "ollama", "mock"]
        },
        {
          "path": "version",
          "message": "Version '1.0' is not valid semver (use '1.0.0')"
        }
      ]
    },
    "docs_url": "https://cmdforge.brrd.tech/docs/tool-format"
  }
}
Dependency Resolution Failures
When cmdforge install fails on a manifest:
$ cmdforge install
Error: Could not resolve all dependencies
  rob/summarize@^2.0.0
    ✗ No matching version (latest: 1.2.0)
  alice/translate@>=1.0.0
    ✓ Found 1.3.0
Suggestions:
  - Update rob/summarize constraint to "^1.0.0"
  - Contact the tool author for a v2 release
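To illustrate why `^2.0.0` fails while `^1.0.0` would succeed, here is a simplified constraint matcher. It is a sketch, not the CLI's resolver: it only handles `*`, `>=X.Y.Z`, `^X.Y.Z` (for major versions of 1 or higher), and exact versions, and the function names are hypothetical.

```python
def parse_version(text):
    """Turn '1.2.0' into (1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, constraint):
    """Check one version against a simplified constraint string."""
    v = parse_version(version)
    if constraint == "*":
        return True
    if constraint.startswith(">="):
        return v >= parse_version(constraint[2:])
    if constraint.startswith("^"):
        base = parse_version(constraint[1:])
        # Caret: same major version, and not older than the base
        return v[0] == base[0] and v >= base
    return v == parse_version(constraint)


def resolve(available, constraint):
    """Return the highest matching version, or None if nothing matches."""
    matches = [v for v in available if satisfies(v, constraint)]
    return max(matches, key=parse_version) if matches else None


published = ["1.0.0", "1.1.0", "1.2.0"]
unsatisfied = resolve(published, "^2.0.0")  # no 2.x release exists
satisfied = resolve(published, "^1.0.0")
```

Comparing parsed tuples rather than strings matters: as text, "1.10.0" sorts before "1.2.0".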
Graceful Degradation
| Component Down | Fallback Behavior |
|---|---|
| API server | CLI uses ~/.cmdforge/registry/index.json for search |
| Gitea repo | API serves from DB cache (may be stale) |
| FTS5 index | Fall back to LIKE queries (slower but works) |
| Network | Use locally installed tools, skip registry features |
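The FTS5 row can be sketched as a try-then-fall-back query. This assumes a `tools` table with `name` and `description` columns and an FTS table named `tools_fts`, as used elsewhere in this document; `search_tools` here is a hypothetical standalone function. The demo database deliberately has no FTS table, so the LIKE path is the one exercised.

```python
import sqlite3


def search_tools(conn, query):
    """Search via FTS5 when available, otherwise fall back to LIKE."""
    try:
        rows = conn.execute(
            "SELECT name FROM tools_fts WHERE tools_fts MATCH ? ORDER BY name",
            (query,),
        ).fetchall()
    except sqlite3.OperationalError:
        # FTS table missing or unusable: slower substring match instead
        pattern = f"%{query}%"
        rows = conn.execute(
            "SELECT name FROM tools WHERE name LIKE ? OR description LIKE ? ORDER BY name",
            (pattern, pattern),
        ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tools (name TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO tools VALUES (?, ?)",
    [
        ("summarize", "Summarize text using AI"),
        ("translate", "Translate text between languages"),
    ],
)
fallback_hits = search_tools(conn, "summar")
```

The two paths are not equivalent: LIKE matches substrings, while FTS5 matches whole tokens unless a prefix query is used, so result sets may differ slightly in degraded mode.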
UX Requirements (CLI/TUI)
Publishing UX
- cmdforge registry publish --dry-run validates locally and shows what would be submitted:
  $ cmdforge registry publish --dry-run
  Validating tool...
    ✓ config.yaml is valid
    ✓ README.md exists (2.3 KB)
    ✓ Version 1.1.0 not yet published
  Would submit:
    Owner: rob
    Name: summarize
    Version: 1.1.0
    Category: text-processing
    Tags: summarization, ai, text
  Config preview:
  ─────────────────────────────
  name: summarize
  version: "1.1.0"
  description: Summarize text using AI
  ...
  ─────────────────────────────
  Run without --dry-run to submit for review.
- Version bump reminder: CLI warns if version hasn't changed from published:
  ⚠ Version 1.0.0 is already published. Bump version in config.yaml to publish changes.
- First-time publishing flow prompts for token and saves it to config.
Progress Indicators
Long-running operations show progress:
$ cmdforge install
Installing project dependencies...
[1/3] rob/summarize@^1.0.0
      Resolving version... 1.2.0
      Downloading... done
      Installing... done ✓
[2/3] alice/translate@>=2.0.0
      Resolving version... 2.1.0
      Downloading... done
      Installing... done ✓
[3/3] official/code-review@*
      Resolving version... 1.0.0
      Downloading... done
      Installing... done ✓
✓ Installed 3 tools

$ cmdforge registry publish
Submitting rob/summarize@1.1.0...
  Validating... done ✓
  Uploading... done ✓
  Creating PR... done ✓
✓ PR created: https://gitea.brrd.tech/rob/CmdForge-Registry/pulls/42
Your tool is pending review. You'll receive an email when it's approved.
TUI Browse
cmdforge registry browse opens a full-screen terminal UI:
┌─ CmdForge Registry ─────────────────────────────────────────────┐
│ Search: [________________] [All Categories ▼] [Sort: Popular ▼] │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│ ▶ rob/summarize                           v1.2.0          ⬇ 142 │
│   Summarize text using AI                                       │
│   [text-processing] [ai] [summarization]                        │
│                                                                 │
│   alice/translate                         v2.1.0           ⬇ 98 │
│   Translate text between languages                              │
│   [text-processing] [translation]                               │
│                                                                 │
│   official/code-review                    v1.0.0           ⬇ 87 │
│   AI-powered code review                                        │
│   [code] [review] [ai]                                          │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│ ↑↓ Navigate  Enter: Details  i: Install  /: Search  q: Quit     │
└─────────────────────────────────────────────────────────────────┘
Keyboard shortcuts:
| Key | Action |
|---|---|
| ↑/↓ or j/k | Navigate list |
| Enter | View tool details |
| i | Install selected tool |
| / | Focus search box |
| c | Change category filter |
| s | Change sort order |
| ? | Show help |
| q | Quit |
Virtual scrolling: For large tool lists (>100), use virtual scrolling to maintain performance.
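The core of virtual scrolling is keeping a small window of rows aligned with the selection and rendering only that slice. A minimal sketch of the window arithmetic, independent of any UI toolkit; `scroll_top` and its parameters are hypothetical names:

```python
def scroll_top(selected, top, height, total):
    """Return the new first visible row so that `selected` stays on screen."""
    if selected < top:
        top = selected                 # scrolled above the window
    elif selected >= top + height:
        top = selected - height + 1    # scrolled below the window
    # Clamp so the window never runs past either end of the list
    return max(0, min(top, max(0, total - height)))


items = [f"tool-{i}" for i in range(500)]
height = 10
top = scroll_top(selected=499, top=0, height=height, total=len(items))
visible = items[top:top + height]  # only these rows are rendered
```

Rendering cost then depends on the window height rather than on the number of tools in the registry.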
Project Initialization
$ cmdforge init
Creating cmdforge.yaml...
Project name [my-project]: my-ai-project
Version [1.0.0]:
Would you like to add any tools? (search with 's', skip with Enter)
> s
Search: summ
  1. rob/summarize v1.2.0 - Summarize text using AI
  2. alice/summary v1.0.0 - Generate summaries
Add tool (number, or Enter to finish): 1
Added rob/summarize@^1.2.0
Add tool (number, or Enter to finish):
✓ Created cmdforge.yaml
name: my-ai-project
version: "1.0.0"
dependencies:
  - name: rob/summarize
    version: "^1.2.0"
Run 'cmdforge install' to install dependencies.
Accessibility
- CLI: All output works with screen readers, no color-only information
- TUI: Full keyboard navigation, high-contrast mode support
- Web UI: WCAG 2.1 AA compliance target
  - Semantic HTML
  - ARIA labels for interactive elements
  - Focus management in modals
  - Skip links for navigation
Offline Cache
Cache registry index locally:
~/.cmdforge/registry/index.json
Refresh when older than 24 hours; support --offline and --refresh flags.
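The refresh rule can be expressed as a single decision function. This sketch assumes the file's modification time is used as the cache age and that --offline takes precedence over --refresh when both are given; both choices, and the `should_refresh` name, are assumptions rather than documented behavior.

```python
import os
import tempfile
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # refresh when older than 24 hours


def should_refresh(index_path, offline=False, refresh=False, now=None):
    """Decide whether the CLI should fetch a fresh index.json."""
    if offline:
        return False  # never touch the network in offline mode
    if refresh:
        return True
    try:
        mtime = os.path.getmtime(index_path)
    except OSError:
        return True  # no cached index yet
    now = time.time() if now is None else now
    return now - mtime > MAX_AGE_SECONDS


# Demo against a temporary file standing in for ~/.cmdforge/registry/index.json
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.json")
    missing = should_refresh(path)
    with open(path, "w") as fh:
        fh.write("{}")
    fresh = should_refresh(path)
    forced = should_refresh(path, refresh=True)
    two_days_ago = time.time() - 2 * MAX_AGE_SECONDS
    os.utime(path, (two_days_ago, two_days_ago))
    stale = should_refresh(path)
    stale_offline = should_refresh(path, offline=True)
```

An alternative would be to compare against the `generated_at` field inside the index instead of the file's mtime.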
Index Integrity
The cached index.json includes integrity metadata:
{
  "version": "1.0",
  "generated_at": "2025-01-20T12:00:00Z",
  "checksum": "sha256:abc123...",
  "tool_count": 142,
  "tools": [...]
}
API response headers:
ETag: "abc123def456"
X-Index-Checksum: sha256:abc123...
X-Index-Generated: 2025-01-20T12:00:00Z
CLI verification:
import hashlib
import json
import logging

logger = logging.getLogger(__name__)

def verify_cached_index():
    """Verify cached index integrity on load"""
    cached = load_cached_index()
    if not cached:
        return None
    # Verify checksum
    content = json.dumps(cached['tools'], sort_keys=True)
    computed = hashlib.sha256(content.encode()).hexdigest()
    if computed != cached.get('checksum', '').replace('sha256:', ''):
        logger.warning("Cached index checksum mismatch, will refresh")
        return None
    return cached
Corruption handling:
- If checksum fails, discard cache and fetch fresh
- If partial write detected (missing fields), discard and refresh
- CLI shows warning: "Cached index corrupted, fetching fresh copy..."
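For the check above to pass, the index generator has to compute the checksum over exactly the same serialization that the CLI recomputes. A sketch of both sides sharing one function, assuming the checksum covers only the `tools` array serialized with `sort_keys=True` as in the verification code; `compute_checksum`, `build_index`, and `is_intact` are hypothetical names.

```python
import hashlib
import json

REQUIRED_FIELDS = {"version", "generated_at", "checksum", "tool_count", "tools"}


def compute_checksum(tools):
    """Hash the canonical JSON form of the tools list."""
    content = json.dumps(tools, sort_keys=True)
    return "sha256:" + hashlib.sha256(content.encode()).hexdigest()


def build_index(tools, generated_at):
    """Produce an index document in the shape shown above."""
    return {
        "version": "1.0",
        "generated_at": generated_at,
        "checksum": compute_checksum(tools),
        "tool_count": len(tools),
        "tools": tools,
    }


def is_intact(index):
    """Detect partial writes (missing fields) and content corruption."""
    if not REQUIRED_FIELDS.issubset(index):
        return False
    return index["checksum"] == compute_checksum(index["tools"])


index = build_index(
    [{"owner": "rob", "name": "summarize"}, {"owner": "alice", "name": "translate"}],
    "2025-01-20T12:00:00Z",
)
tampered = dict(index, tools=index["tools"][:1])
partial = {key: value for key, value in index.items() if key != "checksum"}
```

Sharing one serialization function between generator and verifier avoids false mismatches caused by differences in key order or separators.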
Web UI Vision
The registry includes a full website, not just an API:
Site structure:
cmdforge.brrd.tech (or gitea.brrd.tech/registry)
├── / # Landing page
├── /tools # Browse all tools
├── /tools/{owner}/{name} # Tool detail page
├── /categories # Browse by category
├── /categories/{name} # Tools in category
├── /collections # Browse curated collections
├── /collections/{name} # Collection detail page
├── /search?q=... # Search results
├── /docs # Documentation
│ ├── /docs/getting-started
│ ├── /docs/creating-tools
│ ├── /docs/publishing
│ └── /docs/best-practices
├── /tutorials # Step-by-step guides
│ ├── /tutorials/first-tool
│ ├── /tutorials/chaining-steps
│ └── /tutorials/code-steps
├── /examples # Example projects
├── /blog # Updates, announcements (optional)
├── /register # Publisher registration
├── /login # Publisher login
├── /dashboard # Publisher dashboard
│ ├── /dashboard/tools # My published tools
│ ├── /dashboard/tokens # API tokens
│ └── /dashboard/settings # Account settings
└── /api/v1/... # API endpoints
Landing page content:
- Hero: "Share and discover AI-powered CLI tools"
- Quick install example
- Featured/popular tools
- Category highlights
- "Get Started" CTA
Tool detail page:
- Name, description, version, author
- README rendered as markdown (sanitized)
- Install command (copy-to-clipboard)
- Version history
- Download stats
- Category/tags
- "Report" button for abuse
README Security
When rendering README markdown, apply XSS sanitization:
import bleach
from markdown import markdown

ALLOWED_TAGS = [
    'h1', 'h2', 'h3', 'h4', 'h5', 'h6',
    'p', 'br', 'hr',
    'ul', 'ol', 'li',
    'strong', 'em', 'code', 'pre',
    'blockquote',
    'a', 'img',
    'table', 'thead', 'tbody', 'tr', 'th', 'td'
]
ALLOWED_ATTRS = {
    'a': ['href', 'title'],
    'img': ['src', 'alt', 'title'],
    'code': ['class'],  # for syntax highlighting
}

def render_readme_safe(readme_raw: str) -> str:
    """Convert markdown to sanitized HTML"""
    # Convert markdown to HTML
    html = markdown(readme_raw, extensions=['fenced_code', 'tables'])
    # Sanitize to prevent XSS
    safe_html = bleach.clean(
        html,
        tags=ALLOWED_TAGS,
        attributes=ALLOWED_ATTRS,
        strip=True
    )
    # Linkify URLs
    safe_html = bleach.linkify(safe_html)
    return safe_html
Storage strategy:
- Store raw README in tools.readme
- Render and sanitize on request (or cache rendered HTML)
- Never trust client-submitted HTML directly
Tech stack options:
| Option | Pros | Cons |
|---|---|---|
| Flask + Jinja + Tailwind | Simple, Python-only, fast to build | Less interactive |
| FastAPI + Vue/React SPA | Modern, interactive | More complex, separate build |
| Astro/Next.js | Great SEO, static-first | Different stack (Node.js) |
Recommendation: Flask + Jinja + Tailwind for v1
- Keeps everything in Python
- Server-rendered is fine for a registry
- Good SEO out of the box
- Can add interactivity with Alpine.js or htmx if needed
Monetization considerations:
- AdSense-compatible (server-rendered pages)
- Analytics tracking for traffic insights
- Future: sponsored tools, featured placements
- Future: premium publisher tiers (more tools, priority review)
Implementation Phases
Status as of January 2026: Phases 1-4 and 6-8 are complete. Phase 5 is partially complete.
Phase 1: Foundation ✅ Complete
- Define cmdforge.yaml manifest format
- Implement tool resolution order (local → global → registry)
- Create CmdForge-Registry repo on Gitea (bootstrap)
- Add 3-5 example tools to seed the registry (233+ Fabric patterns imported)
Phase 2: Core Backend ✅ Complete
- Set up Flask/FastAPI project structure
- Implement SQLite database schema (25+ tables including FTS5, reviews, issues, audit)
- Build core API endpoints (list, search, get, download)
- Implement webhook receiver for Gitea sync
- Set up HMAC verification
Phase 3: CLI Commands ✅ Complete
- cmdforge registry search (with faceted search, tags, categories, owner, date range)
- cmdforge registry install
- cmdforge registry info
- cmdforge registry browse (opens PySide6 GUI)
- Local index caching
- cmdforge registry tags (list available tags)
- cmdforge registry uninstall
- cmdforge registry my-tools
- cmdforge registry status
- cmdforge registry config (admin settings)
Phase 4: Publishing ✅ Complete
- Publisher registration (web UI)
- Token management
- cmdforge registry publish command (with dry-run, version bump, similarity warnings)
- PR creation via Gitea API
- CI validation workflows
- App pairing/connection flow (polling-based, replaces manual token entry)
- Connected apps management
Phase 5: Project Dependencies ⚠️ Partial
- cmdforge install (from manifest) - Not implemented
- cmdforge add command - Not implemented
- Runtime override application - Not implemented
- Dependency resolution - Not implemented
- Individual tool install via cmdforge registry install
Phase 6: Smart Features ✅ Complete
- SQLite FTS5 search index (with prefix matching)
- AI-powered auto-categorization
- Duplicate/similarity detection
- Security scanning (scrutiny system)
- AI-powered secondary review (scrutiny-ai-review integration)
- Auto-approve/review/reject decision logic
- Confidence-based moderation
Phase 7: Full Web UI ✅ Complete
- Landing page (with featured tools, tutorials, contributor spotlight)
- Tool browsing/search pages (with faceted filters)
- Tool detail pages with README rendering (plus reviews, issues, versions, forks)
- Publisher dashboard (tools, tokens, settings)
- Documentation/tutorials section
- Admin dashboard (pending queue, publishers, reports, scrutiny, settings)
- Forum (categories, topics, replies)
- Collections pages
- Publisher profiles with trust scores and badges
Phase 8: Polish & Scale ✅ Complete
- Rate limiting
- Abuse reporting
- Analytics integration (pageviews, consent management)
- Performance optimization
- Monitoring/alerting (Sentry integration)
- Audit logging
- Role-based access control (user, moderator, admin)
- Publisher banning/unbanning
- Review voting and flagging
- Issue tracking (bug, security, compatibility)
Features Implemented Beyond Original Design
The following features were added during development but were not part of the original design:
Reviews & Ratings System
- 1-5 star ratings with text reviews
- Helpful/unhelpful voting on reviews
- Review flagging for moderation
- Rating aggregates and statistics
- Publisher reputation scores
Issue Tracking
- Bug, security, and compatibility issue types
- Severity levels (low, medium, high, critical)
- Issue status workflow (open, confirmed, fixed, wontfix, duplicate)
- Issue resolution with notes
Fork Tracking
- Track original tool when forking (forked_from field)
- Store forked_version for version tracking
- Display fork information in GUI and web
Scrutiny System (Automated Vetting)
- Automated security/compatibility analysis on publish
- AI-powered secondary review
- Confidence-based approval decisions
- Admin scrutiny dashboard with statistics
Trust & Badges
- Publisher trust scores
- Achievement badges
- Download milestones
PySide6 Desktop GUI
- Modern desktop GUI replacing urwid TUI
- Tool Builder with visual form editor
- AI-assisted code generation for code steps
- AI persona profiles system