SmartTools Registry Design
Purpose
Build a centralized registry for SmartTools to enable discovery, publishing, dependency management, and future curation at scale.
Terminology
| Term | Definition |
|---|---|
| Tool definition | The full YAML file in the registry (config.yaml) containing name, steps, arguments, etc. |
| Tool config | The configuration within a tool definition (arguments, steps, provider settings) |
| smarttools.yaml | Project manifest file declaring tool dependencies and overrides |
| config.yaml | The tool definition file, both in registry and when installed locally |
| Owner | Immutable namespace slug identifying the publisher (e.g., rob, alice) |
| Publisher | A registered user who can publish tools to the registry |
| Wrapper script | Auto-generated bash script in ~/.local/bin/ that invokes a tool |
Canonical naming: Use SmartTools-Registry (capitalized, hyphenated) for the repository name.
Diagram References
- System overview: discussions/diagrams/smarttools-registry_rob_1.puml
- Data flows: discussions/diagrams/smarttools-registry_rob_5.puml
System Overview
Users interact via the CLI and a future Web UI. Both call a Registry API hosted at https://gitea.brrd.tech/api/v1 (future alias: registry.smarttools.dev/api/v1). The API syncs from a Gitea-backed registry repo and maintains a SQLite cache/search index.
Canonical API base path: https://gitea.brrd.tech/api/v1
All API endpoints are versioned under /api/v1. When breaking changes are needed, a new version (/api/v2) will be introduced with deprecation notices.
Core API endpoints:
- GET /api/v1/tools
- GET /api/v1/tools/search?q=...
- GET /api/v1/tools/{owner}/{name}
- GET /api/v1/tools/{owner}/{name}/versions
- GET /api/v1/tools/{owner}/{name}/download?version=...
- POST /api/v1/tools (publish)
- GET /api/v1/categories
- GET /api/v1/stats/popular
- POST /api/v1/webhook/gitea
Pagination
All list endpoints support pagination:
| Parameter | Default | Max | Description |
|---|---|---|---|
| page | 1 | - | Page number (1-indexed) |
| per_page | 20 | 100 | Items per page |
| sort | downloads | - | Sort field |
| order | desc | - | Sort order (asc/desc) |
Stable ordering: To ensure deterministic results across pages, sorting includes secondary and tertiary keys:
- Primary: requested field (e.g., downloads)
- Secondary: published_at (desc)
- Tertiary: id (for absolute stability)
ORDER BY downloads DESC, published_at DESC, id DESC
LIMIT 20 OFFSET 0
Response pagination metadata:
{
"data": [...],
"meta": {
"page": 1,
"per_page": 20,
"total": 142,
"total_pages": 8
}
}
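The pagination parameters and metadata above can be sketched as a small helper. This is a minimal illustration, not the registry's implementation: the function name paginate is hypothetical, and clamping out-of-range values (rather than rejecting them with a 400) is an assumption the design does not settle.

```python
import math

DEFAULT_PER_PAGE = 20
MAX_PER_PAGE = 100

def paginate(total, page=1, per_page=DEFAULT_PER_PAGE):
    """Clamp paging inputs and return (limit, offset, meta) for a list query."""
    page = max(1, int(page))                              # pages are 1-indexed
    per_page = min(MAX_PER_PAGE, max(1, int(per_page)))   # assumption: clamp to 1..100
    offset = (page - 1) * per_page
    meta = {
        "page": page,
        "per_page": per_page,
        "total": total,
        "total_pages": math.ceil(total / per_page),
    }
    return per_page, offset, meta

limit, offset, meta = paginate(total=142)
print(limit, offset, meta["total_pages"])
```

The returned limit and offset map directly onto the LIMIT and OFFSET of the query shown earlier.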
Input Constraints
Size limits to prevent oversized uploads:
| Field | Max Size | Notes |
|---|---|---|
| config.yaml | 64 KB | Tool definition |
| README.md | 256 KB | Documentation |
| Request body | 512 KB | Total POST payload |
| Tool name | 64 chars | Alphanumeric + hyphen |
| Description | 500 chars | Short summary |
| Tag | 32 chars | Individual tag |
| Tags array | 10 items | Maximum tags per tool |
Validation errors:
{
"error": {
"code": "PAYLOAD_TOO_LARGE",
"message": "config.yaml exceeds 64KB limit",
"details": {
"field": "config",
"size": 72000,
"limit": 65536
}
}
}
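A size check that produces the error body above might look like the following sketch. The helper name check_size and the LIMITS mapping are hypothetical; measuring size in UTF-8 bytes with 1 KB = 1024 bytes is an assumption, chosen because it matches the 65536 limit in the example response.

```python
KIB = 1024

# Hypothetical mapping: request field -> (file name used in messages, byte limit)
LIMITS = {
    "config": ("config.yaml", 64 * KIB),
    "readme": ("README.md", 256 * KIB),
}

def check_size(field, text):
    """Return None if the field is within its limit, else a PAYLOAD_TOO_LARGE error body."""
    filename, limit = LIMITS[field]
    size = len(text.encode("utf-8"))  # assumption: limits apply to encoded bytes
    if size <= limit:
        return None
    return {
        "error": {
            "code": "PAYLOAD_TOO_LARGE",
            "message": f"{filename} exceeds {limit // KIB}KB limit",
            "details": {"field": field, "size": size, "limit": limit},
        }
    }

print(check_size("config", "x" * 72000))
```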
Sort Fields and Indexes
Allowed sort fields:
| Endpoint | Allowed sort values |
|---|---|
| GET /tools | downloads, published_at, name |
| GET /tools/search | relevance, downloads, published_at |
| GET /categories | name, tool_count |
Invalid sort values return 400:
{"error": {"code": "INVALID_SORT", "message": "Unknown sort field 'foo'. Allowed: downloads, published_at, name"}}
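One way to combine the sort whitelist with the stable ordering rule is to build the ORDER BY clause only from allowed names. The sketch below covers GET /tools only; build_order_by and the choice to raise ValueError (which the API layer would translate into the 400 response) are assumptions for illustration.

```python
ALLOWED_TOOL_SORTS = ("downloads", "published_at", "name")

def build_order_by(sort="downloads", order="desc"):
    """Build a stable ORDER BY clause for GET /tools from whitelisted values only."""
    if sort not in ALLOWED_TOOL_SORTS:
        allowed = ", ".join(ALLOWED_TOOL_SORTS)
        raise ValueError(f"Unknown sort field '{sort}'. Allowed: {allowed}")
    if order not in ("asc", "desc"):
        raise ValueError(f"Unknown sort order '{order}'. Allowed: asc, desc")
    keys = [f"{sort} {order.upper()}"]
    # Tie-breakers keep page boundaries deterministic
    for tie_breaker in ("published_at", "id"):
        if tie_breaker != sort:
            keys.append(f"{tie_breaker} DESC")
    return "ORDER BY " + ", ".join(keys)

print(build_order_by())
```

Because the column name is never taken from user input directly, the clause can be concatenated into the query without opening an injection path.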
Database indexes:
-- Frequent query patterns
CREATE INDEX idx_tools_owner_name ON tools(owner, name);
CREATE INDEX idx_tools_category ON tools(category);
CREATE INDEX idx_tools_published_at ON tools(published_at DESC);
CREATE INDEX idx_tools_downloads ON tools(downloads DESC);
CREATE INDEX idx_tools_owner_name_version ON tools(owner, name, version);
-- For pagination stability
CREATE INDEX idx_tools_sort_stable ON tools(downloads DESC, published_at DESC, id DESC);
-- Publisher lookups
CREATE INDEX idx_publishers_slug ON publishers(slug);
CREATE INDEX idx_publishers_email ON publishers(email);
-- Token lookups
CREATE INDEX idx_api_tokens_hash ON api_tokens(token_hash);
CREATE INDEX idx_api_tokens_publisher ON api_tokens(publisher_id);
API Version Compatibility
Forward compatibility: Clients should ignore unknown fields in API responses:
# Good: ignore unknown fields
tool = response['data']
name = tool.get('name')
# Don't fail if 'new_field' exists but client doesn't know about it
# Bad: strict parsing that fails on unknown fields
tool = ToolSchema.parse(response['data']) # May fail on new fields
Backward compatibility: The API will:
- Never remove fields in a version (only deprecate)
- Never change field types
- Add new optional fields without version bump
- Use new version (/api/v2) for breaking changes
Deprecation process:
- Add X-Deprecated-Field: old_field header
- Document in changelog
- Remove after 6 months minimum
- Major version bump if widely used
Client version header:
X-SmartTools-Client: cli/1.2.0
Helps server track client versions for deprecation decisions.
Source of Truth
- Gitea registry repo is the source of truth.
- API syncs repo content into SQLite for fast queries, stats, and FTS5 search.
- index.json remains useful for offline CLI search and as a fallback.
If the cache is stale, the API can fall back to repo reads; a warning header may be emitted.
Namespacing and Paths
Support owner/name from day one:
- Registry path: tools/{owner}/{name}/config.yaml
- API URL: /tools/{owner}/{name}
- Install: smarttools registry install rob/summarize
- Shorthand: smarttools registry install summarize resolves to the official namespace.
PR branches: submit/{owner}/{name}/{version}.
Namespace Identity
The owner is an immutable slug, not the display name:
-- In publishers table
slug TEXT UNIQUE NOT NULL, -- immutable: "rob", "alice-dev"
display_name TEXT NOT NULL, -- mutable: "Rob", "Alice Developer"
Slug rules:
- Lowercase alphanumeric + hyphens only: ^[a-z0-9][a-z0-9-]*[a-z0-9]$
- 2-39 characters
- Cannot start/end with hyphen
- Set once at registration, cannot be changed
- Reserved slugs: official, admin, system, api, registry, smarttools
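The slug rules can be checked with a few lines of Python. This is a sketch under stated assumptions: validate_slug is a hypothetical helper, and the reserved set uses the longer list from the Official Namespace section, which also includes smarttools.

```python
import re

SLUG_PATTERN = re.compile(r"^[a-z0-9][a-z0-9-]*[a-z0-9]$")
RESERVED_SLUGS = {"official", "admin", "system", "api", "registry", "smarttools"}

def validate_slug(slug):
    """Return None if the slug is acceptable, otherwise a short reason string."""
    if not 2 <= len(slug) <= 39:
        return "length must be 2-39 characters"
    # fullmatch avoids '$' accepting a trailing newline
    if not SLUG_PATTERN.fullmatch(slug):
        return "use lowercase letters, digits, and inner hyphens only"
    if slug in RESERVED_SLUGS:
        return "slug is reserved"
    return None

for candidate in ("alice-dev", "-rob", "Rob", "admin", "a"):
    print(candidate, "->", validate_slug(candidate))
```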
Rename policy:
- display_name can be changed anytime via dashboard
- slug (owner) is permanent to preserve URLs and tool references
- If a publisher absolutely must change slug (legal reasons, etc.):
  - Create new account with new slug
  - Republish tools under new namespace
  - Mark old tools as deprecated with replacement pointing to new namespace
  - Old namespace remains reserved (cannot be reused by others)
Why immutable:
- rob/summarize@1.0.0 must always resolve to the same tool
- Prevents namespace hijacking after rename
- Simplifies caching and CDN strategies
Tool Format (Registry == Local)
Registry tool folders mirror local tools:
tools/
  rob/
    summarize/
      config.yaml
      README.md
Tool files match the existing SmartTools format. Registry-specific metadata is kept under registry:. Deprecation is tool-defined and top-level:
name: summarize
version: "1.2.0"
deprecated: true
deprecated_message: "Security issue. Use v1.2.1"
replacement: "rob/summarize@1.2.1"
registry:
  published_at: "2025-01-15T10:30:00Z"
  downloads: 142
Schema compatibility note: The current SmartTools config parser may reject unknown top-level keys like deprecated, replacement, and registry. Before implementing registry features:
- Update the YAML parser to ignore unknown keys (permissive mode)
- Or explicitly define these fields in the Tool dataclass with defaults
- Validate registry-specific fields only when publishing, not when running locally
This ensures local tools continue to work even if they don't have registry fields.
Versioning and Immutability
- Unique key: owner/name + version.
- Published versions are immutable.
- Deprecation uses deprecated, deprecated_message, and replacement.
- CLI warns on install if a version is deprecated.
Yank Policy
Yanking allows removing a version from resolution without deleting it (for auditability):
# In tool config
yanked: true
yanked_reason: "Critical security vulnerability CVE-2025-1234"
yanked_at: "2025-01-20T15:00:00Z"
Yanked version behavior:
| Operation | Behavior |
|---|---|
| install foo@1.0.0 (exact) | Warns but allows install |
| install foo@^1.0.0 (constraint) | Excludes yanked, resolves to next valid |
| search / browse | Hidden by default, shown with --include-yanked |
| Direct URL access | Returns tool with yanked: true in response |
| Already installed | Continues to work, no forced removal |
Database schema addition:
-- Add to tools table
yanked BOOLEAN DEFAULT FALSE,
yanked_reason TEXT,
yanked_at TIMESTAMP
Yank vs Delete:
- Yank: Version remains in DB, excluded from resolution, auditable
- Delete: Reserved for DMCA/legal, requires admin action, leaves tombstone record
Version Format
Tools use semantic versioning (semver):
MAJOR.MINOR.PATCH[-PRERELEASE][+BUILD]
Examples:
1.0.0 # stable release
1.2.3 # stable release
2.0.0-alpha.1 # prerelease
2.0.0-beta.2 # prerelease
2.0.0-rc.1 # release candidate
Version Constraints
Manifest files support these constraint formats:
| Constraint | Meaning | Example Match |
|---|---|---|
| 1.2.3 | Exact version | 1.2.3 only |
| >=1.2.0 | Minimum version | 1.2.0, 1.3.0, 2.0.0 |
| <2.0.0 | Below version | 1.9.9, 1.0.0 |
| >=1.0.0,<2.0.0 | Range | 1.0.0 up to, but not including, 2.0.0 |
| ^1.2.3 | Compatible (same major) | 1.2.3 up to, but not including, 2.0.0 |
| ~1.2.3 | Approximately (same minor) | 1.2.3 up to, but not including, 1.3.0 |
| * | Any version | latest stable |
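The constraint formats in the table reduce to lower and upper bounds on a version tuple. The sketch below handles stable versions only; parse, expand, and satisfies are hypothetical names, and the caret rule follows the table ("same major") rather than any other tool's special-casing of 0.x releases.

```python
import operator

OPS = {">=": operator.ge, "<": operator.lt, "==": operator.eq}

def parse(version):
    """Parse a stable 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = version.strip().split(".")
    return int(major), int(minor), int(patch)

def expand(term):
    """Expand one constraint term into a list of (operator, version) bounds."""
    term = term.strip()
    if term == "*":
        return []  # no bounds: any version matches
    if term.startswith("^"):
        low = parse(term[1:])
        return [(">=", low), ("<", (low[0] + 1, 0, 0))]
    if term.startswith("~"):
        low = parse(term[1:])
        return [(">=", low), ("<", (low[0], low[1] + 1, 0))]
    for op in (">=", "<"):
        if term.startswith(op):
            return [(op, parse(term[len(op):]))]
    return [("==", parse(term))]

def satisfies(version, constraint):
    """True if the version meets every comma-separated term of the constraint."""
    candidate = parse(version)
    bounds = [b for term in constraint.split(",") for b in expand(term)]
    return all(OPS[op](candidate, bound) for op, bound in bounds)

print(satisfies("1.10.0", "^1.2.3"), satisfies("1.3.0", "~1.2.3"))
```

Comparing integer tuples rather than strings is what makes 1.10.0 sort above 1.9.9.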
Version Resolution Rules
When resolving a version constraint:
- Filter: Get all versions matching the constraint
- Exclude prereleases: Unless constraint explicitly includes them (e.g., >=2.0.0-alpha.1)
- Sort: By semver precedence (descending)
- Select: Highest matching version
Tie-breakers:
- Stable versions preferred over prereleases
- Later publish date wins if versions are equal (shouldn't happen with immutability)
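The filter, exclude, sort, select sequence can be sketched as follows. The names precedence_key and resolve are hypothetical, the constraint check is passed in as a predicate to keep the example short, and excluding yanked versions here reflects the Yank Policy section rather than a confirmed implementation detail.

```python
def precedence_key(version):
    """Sort key following semver precedence (build metadata is ignored)."""
    core, _, prerelease = version.split("+")[0].partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    # Numeric identifiers rank below alphanumeric ones and compare as integers
    identifiers = tuple(
        (0, int(ident)) if ident.isdigit() else (1, ident)
        for ident in prerelease.split(".")
    ) if prerelease else ()
    is_stable = 0 if prerelease else 1  # a stable release outranks its prereleases
    return (major, minor, patch, is_stable, identifiers)

def resolve(versions, matches, yanked=frozenset(), allow_prerelease=False):
    """Return the highest version accepted by `matches`, or None if nothing qualifies."""
    candidates = [
        v for v in versions
        if matches(v)
        and v not in yanked
        and (allow_prerelease or "-" not in v.split("+")[0])
    ]
    return max(candidates, key=precedence_key, default=None)

available = ["1.0.0", "1.2.0", "1.10.0", "2.0.0-beta.2", "2.0.0-rc.1"]
print(resolve(available, lambda v: True))
print(resolve(available, lambda v: True, allow_prerelease=True))
```

A None result corresponds to the unsatisfiable case, which the API reports as an error response.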
Unsatisfiable constraints:
// API Response: 404
{
"error": {
"code": "VERSION_NOT_FOUND",
"message": "No version of 'rob/summarize' satisfies constraint '>=5.0.0'",
"details": {
"tool": "rob/summarize",
"constraint": ">=5.0.0",
"available_versions": ["1.0.0", "1.1.0", "1.2.0"],
"latest_stable": "1.2.0"
}
}
}
Prerelease Handling
- Prereleases are not returned for * or range constraints by default
- To install prerelease: smarttools registry install rob/summarize@2.0.0-beta.1
- To allow prereleases in manifest: version: ">=2.0.0-0" (the -0 suffix includes prereleases)
Download Endpoint Version Selection
The /api/v1/tools/{owner}/{name}/download endpoint accepts version parameters:
| Parameter | Behavior | Example |
|---|---|---|
| (none) | Returns latest stable version | /download → 1.2.0 |
| version=1.2.0 | Exact version (must exist) | /download?version=1.2.0 |
| version=^1.0.0 | Server resolves constraint | /download?version=^1.0.0 → 1.2.0 |
| version=latest | Alias for latest stable | /download?version=latest |
Server-side resolution: The API server resolves version constraints, not the client. This ensures consistent resolution and allows the server to apply policies (e.g., exclude yanked versions).
GET /api/v1/tools/rob/summarize/download?version=^1.0.0&install=true
Response (200):
{
"data": {
"owner": "rob",
"name": "summarize",
"resolved_version": "1.2.0",
"config": "... YAML content ..."
},
"meta": {
"constraint": "^1.0.0",
"available_versions": ["1.0.0", "1.1.0", "1.2.0"]
}
}
Invalid/unsatisfiable constraint:
GET /api/v1/tools/rob/summarize/download?version=^5.0.0
Response (404):
{
"error": {
"code": "CONSTRAINT_UNSATISFIABLE",
"message": "No version matches constraint '^5.0.0'",
"details": {
"constraint": "^5.0.0",
"latest_stable": "1.2.0",
"available_versions": ["1.0.0", "1.1.0", "1.2.0"]
}
}
}
Tool Resolution Order
When a tool is invoked, the CLI searches in this order:
1. Local project: ./.smarttools/<owner>/<name>/config.yaml (or ./.smarttools/<name>/ for unnamespaced)
2. Global user: ~/.smarttools/<owner>/<name>/config.yaml
3. Registry: Fetch from API, install to global, then run
4. Error: Tool '<toolname>' not found
Step 3 only occurs if auto_fetch_from_registry: true in config (default: true).
Path convention: Use .smarttools/ (with leading dot) for both local and global to maintain consistency.
Resolution also respects namespacing:
- summarize → searches for any tool named summarize, prefers official/summarize if exists
- rob/summarize → searches for exactly rob/summarize
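The local part of the lookup order (project directory first, then the user's global directory) can be sketched with pathlib. The function names candidate_paths and find_installed are hypothetical, and the registry fallback and shorthand-to-official resolution are deliberately left out.

```python
from pathlib import Path

def candidate_paths(tool, project_dir=".", home="~"):
    """Return the config.yaml locations to probe, in lookup order.

    `tool` is either 'owner/name' or a bare 'name' (unnamespaced local tool).
    """
    roots = [
        Path(project_dir) / ".smarttools",        # 1. local project
        Path(home).expanduser() / ".smarttools",  # 2. global user
    ]
    return [root / tool / "config.yaml" for root in roots]

def find_installed(tool, project_dir=".", home="~"):
    """Return the first existing config path, or None if the tool is not installed."""
    for path in candidate_paths(tool, project_dir, home):
        if path.is_file():
            return path
    return None

for path in candidate_paths("rob/summarize"):
    print(path)
```

A None result is the point at which the CLI would consult auto_fetch_from_registry before reporting that the tool was not found.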
Official Namespace
The slug official is reserved for curated, high-quality tools maintained by the registry administrators.
- Shorthand summarize resolves to official/summarize if it exists
- If no official/summarize, falls back to most-downloaded tool named summarize
- To avoid ambiguity, always use full owner/name in manifests
Reserved slugs that cannot be registered: official, admin, system, api, registry, smarttools
Auto-Fetch Behavior
When enabled (auto_fetch_from_registry: true), missing tools are automatically fetched:
$ summarize < file.txt
# Tool 'summarize' not found locally.
# Fetching from registry...
# Installed: official/summarize@1.2.0
# Running...
Behavior details:
- Fetches latest stable version unless pinned in smarttools.yaml
- Installs to ~/.smarttools/<owner>/<name>/
- Generates wrapper script in ~/.local/bin/
- Subsequent runs use local copy (no re-fetch)
To disable (require explicit install):
# ~/.smarttools/config.yaml
auto_fetch_from_registry: false
Wrapper Script Collisions
When two tools from different owners have the same name:
| Scenario | Behavior |
|---|---|
| Install official/summarize | Creates wrapper ~/.local/bin/summarize |
| Install rob/summarize (collision) | Creates wrapper ~/.local/bin/rob-summarize |
| Uninstall official/summarize | Removes summarize wrapper, promotes rob-summarize → summarize if desired |
The first-installed tool with a given name gets the short wrapper. Subsequent tools use owner-name format.
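The first-come rule for wrapper names can be expressed as a small pure function. This is a sketch: wrapper_name is a hypothetical helper, and representing the current state as a mapping from wrapper name to owner/name is an assumption about how the CLI might track installed wrappers.

```python
def wrapper_name(owner, name, existing):
    """Pick the wrapper file name for owner/name.

    `existing` maps wrapper names already present in ~/.local/bin/
    to the 'owner/name' of the tool each one invokes.
    """
    holder = existing.get(name)
    if holder is None or holder == f"{owner}/{name}":
        return name               # short name is free, or already ours (reinstall)
    return f"{owner}-{name}"      # collision: fall back to owner-name

installed = {}
installed[wrapper_name("official", "summarize", installed)] = "official/summarize"
installed[wrapper_name("rob", "summarize", installed)] = "rob/summarize"
print(sorted(installed))
```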
To invoke a specific owner's tool:
# Short form (whichever was installed first)
summarize < file.txt
# Explicit owner form (always works)
rob-summarize < file.txt
# Or via smarttools run
smarttools run rob/summarize < file.txt
Project Manifest (smarttools.yaml)
Defines tool dependencies with optional runtime overrides:
name: my-ai-project
version: "1.0.0"
dependencies:
  - name: rob/summarize
    version: ">=1.0.0"
overrides:
  rob/summarize:
    provider: ollama
Overrides are applied at runtime and do not mutate installed tool configs.
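Applying overrides without mutating the installed config amounts to merging into a copy. The sketch below is one possible reading: apply_overrides is a hypothetical name, the recursive merge of nested mappings is an assumption (the design only shows a top-level provider override), and the provider and argument values are placeholders.

```python
import copy

def apply_overrides(tool_config, overrides):
    """Return a new config with overrides merged in; the input is left untouched."""
    merged = copy.deepcopy(tool_config)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)  # merge nested mappings
        else:
            merged[key] = copy.deepcopy(value)                 # replace scalars and lists
    return merged

installed = {
    "name": "summarize",
    "provider": "example-provider",  # placeholder value
    "arguments": {"max_words": 100, "style": "brief"},
}
effective = apply_overrides(installed, {"provider": "ollama", "arguments": {"max_words": 50}})
print(effective["provider"], installed["provider"])
```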
CLI Config and Tokens
Global config lives in ~/.smarttools/config.yaml:
registry:
  url: https://gitea.brrd.tech/api/v1 # Must match canonical base path
  token: "reg_xxxxxxxxxxxx"
client_id: "anon_abc123def456"
auto_fetch_from_registry: true
client_id is generated locally and used for anonymous install dedupe.
Publishing and Auth
Publishing uses registry accounts, not Gitea accounts:
- Public endpoints require no auth.
- POST /tools requires a registry token.
- The API server uses a private Gitea service account to open PRs.
Publish Idempotency and Edge Cases
Idempotency key: owner/name@version
| Scenario | API Response | HTTP Code |
|---|---|---|
| New version, no PR exists | Create PR, return URL | 201 Created |
| PR already exists (pending) | Return existing PR URL | 200 OK |
| Version already published | Error: version exists | 409 Conflict |
| PR was closed without merge | Allow new PR | 201 Created |
| PR was merged, then tool deleted | Error: version exists (tombstone) | 409 Conflict |
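The table above can be read as a decision function over two pieces of state. In this sketch publish_outcome and its action labels are hypothetical; only the HTTP codes and the VERSION_EXISTS code come from the design.

```python
def publish_outcome(version_published, pr_status):
    """Map publish state to (http_status, action).

    version_published: True if owner/name@version is already in the tools table.
    pr_status: None, 'pending', 'merged', or 'closed' for the tracked PR, if any.
    """
    if version_published or pr_status == "merged":
        # Merged versions stay immutable even if the tool was later deleted (tombstone)
        return 409, "VERSION_EXISTS"
    if pr_status == "pending":
        return 200, "return_existing_pr"
    # No PR yet, or the previous PR was closed without merge
    return 201, "create_pr"

for state in [(False, None), (False, "pending"), (True, "merged"), (False, "closed"), (False, "merged")]:
    print(state, "->", publish_outcome(*state))
```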
Version immutability enforcement:
// Attempt to publish existing version
// Response: 409 Conflict
{
"error": {
"code": "VERSION_EXISTS",
"message": "Version 1.2.0 of 'rob/summarize' already exists and cannot be overwritten",
"details": {
"published_at": "2025-01-15T10:30:00Z",
"action": "Bump version number to publish changes"
}
}
}
Closed PR handling:
- Track PR state in database: pending, merged, closed
- If PR was closed (rejected/abandoned), allow new submission for same version
- If PR was merged, version is immutable forever
Update flow (new version, not overwrite):
- Developer modifies tool locally
- Bumps version in config.yaml (e.g., 1.2.0 → 1.3.0)
- Runs smarttools registry publish
- New PR created for 1.3.0
- Old version 1.2.0 remains available
Publisher Registration
Publishers register on the registry website, not Gitea:
Registration flow:
- User visits https://gitea.brrd.tech/registry/register (or future registry.smarttools.dev)
- Creates account with email + password + slug
- Receives verification email (optional in v1, but track verified status)
- Logs into dashboard at /dashboard
- Generates API token from dashboard
- Uses token in CLI for publishing
Authentication Security
Password hashing:
- Algorithm: Argon2id (memory-hard, recommended by OWASP)
- Parameters: memory=65536 (KiB, i.e. 64 MiB), iterations=3, parallelism=4
- Library: argon2-cffi for Python
from argon2 import PasswordHasher
ph = PasswordHasher(memory_cost=65536, time_cost=3, parallelism=4)
password_hash = ph.hash(password)  # avoid shadowing the built-in hash()
ph.verify(password_hash, password)  # raises VerifyMismatchError on mismatch
API token format:
reg_<random-32-bytes-base62>
Example: reg_7kX9mPqR2sT4vW6xY8zA1bC3dE5fG7hJ
- Prefix reg_ for easy identification in logs/configs
- 32 bytes of cryptographically random data
- Base62 encoded (alphanumeric, no special chars)
- Total length: ~47 characters
- Stored as SHA-256 hash in database (never plain text)
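Token generation along these lines needs only the standard library. The sketch below is illustrative: base62_encode and generate_token are hypothetical helpers, and the digit/uppercase/lowercase alphabet order is an assumption, since the design does not fix one.

```python
import hashlib
import secrets
import string

BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase

def base62_encode(data: bytes) -> str:
    """Encode bytes as a base62 string by treating them as one big-endian integer."""
    number = int.from_bytes(data, "big")
    if number == 0:
        return BASE62[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits))

def generate_token():
    """Return (token, token_hash); only the hash should be stored."""
    token = "reg_" + base62_encode(secrets.token_bytes(32))
    token_hash = hashlib.sha256(token.encode()).hexdigest()
    return token, token_hash

token, token_hash = generate_token()
print(len(token), token_hash[:8])
```

32 random bytes encode to at most 43 base62 characters, which together with the reg_ prefix gives the roughly 47-character length noted above. On each request the server would hash the presented token the same way and look it up by token_hash.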
Token lifecycle:
| Action | Behavior |
|---|---|
| Generate | Create new token, return once, store hash |
| List | Show token name, created date, last used (not the token itself) |
| Revoke | Set revoked_at timestamp, reject future uses |
| Rotate | Generate new token, optionally revoke old |
Rate limits:
| Endpoint | Limit | Window | Scope | Retry-After |
|---|---|---|---|---|
| POST /register | 5 | 1 hour | IP | 3600 |
| POST /login | 10 | 15 min | IP | 900 |
| POST /login (failed) | 5 | 15 min | IP + email | 900 |
| POST /tokens | 10 | 1 hour | Token | 3600 |
| POST /tools | 20 | 1 hour | Token | 3600 |
| GET /tools/* | 100 | 1 min | IP | 60 |
| GET /download | 60 | 1 min | IP | 60 |
Rate limit response (429):
{
"error": {
"code": "RATE_LIMITED",
"message": "Too many requests. Try again in 60 seconds.",
"details": {
"limit": 100,
"window": "1 minute",
"retry_after": 60
}
}
}
Headers on rate-limited response:
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1705766400
Scope priority: For authenticated requests, both IP and token limits apply. The more restrictive limit wins.
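The design does not say which counting algorithm backs these limits. As one possibility, a sliding-window counter keyed by IP or token could look like the sketch below; SlidingWindowLimiter is a hypothetical class, state is kept in memory for illustration only, and the clock is passed in explicitly so the behavior is easy to test.

```python
import math
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within any `window_seconds` span."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def check(self, key, now):
        """Return (allowed, retry_after_seconds); records the hit when allowed."""
        timestamps = self.hits[key]
        # Drop hits that have aged out of the window
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            retry_after = math.ceil(timestamps[0] + self.window - now)
            return False, retry_after
        timestamps.append(now)
        return True, 0

limiter = SlidingWindowLimiter(limit=2, window_seconds=60)
print([limiter.check("203.0.113.7", now=t) for t in (0, 10, 20)])
```

For authenticated requests, one limiter keyed by IP and another keyed by token would both be consulted, and the request is rejected if either one refuses it. The retry_after value feeds the Retry-After header.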
Account lockout:
- After 5 failed login attempts: 15-minute lockout for that email
- After 10 failed attempts: 1-hour lockout
- Lockout clears on successful password reset
Password reset flow (deferred to v1.1):
- User requests reset via email
- Server generates time-limited token (1 hour expiry)
- Email contains reset link with token
- User sets new password
- All existing sessions/tokens optionally invalidated
Email verification flow (deferred to v1.1):
- On registration, send verification email
- User clicks link with verification token
- Set verified = true in database
- Unverified accounts can browse but not publish
Token Scopes and Authorization
Tokens have scopes that limit their capabilities:
| Scope | Permissions |
|---|---|
| read | View own published tools, download stats |
| publish | Submit new tools, update own tool metadata |
| admin | Yank tools, manage categories (registry admins only) |
Default scope: New tokens get read,publish by default.
Ownership enforcement:
@app.route('/api/v1/tools', methods=['POST'])
@require_token(scopes=['publish'])
def publish_tool():
    token = get_current_token()
    tool_data = request.json
    # Enforce owner == token holder's slug
    if tool_data['owner'] != token.publisher.slug:
        return {
            "error": {
                "code": "FORBIDDEN",
                "message": f"Cannot publish to namespace '{tool_data['owner']}'. "
                           f"Your namespace is '{token.publisher.slug}'."
            }
        }, 403
    # Proceed with publish...
GET /api/v1/me/tools authorization:
- Requires valid token with read scope
- Returns only tools where owner == token.publisher.slug
- Includes pending PRs and all versions (including yanked)
Web Session Security
Dashboard login uses session cookies (not tokens) for browser auth:
Cookie settings:
SESSION_COOKIE_NAME = 'smarttools_session'
SESSION_COOKIE_HTTPONLY = True # Prevent JS access
SESSION_COOKIE_SECURE = True # HTTPS only in production
SESSION_COOKIE_SAMESITE = 'Lax' # CSRF protection
PERMANENT_SESSION_LIFETIME = 86400 * 7 # 7 days (Flask's session lifetime setting, in seconds)
CSRF protection:
- All POST/PUT/DELETE forms include csrf_token hidden field
- Token validated server-side before processing
- 403 Forbidden if token missing or invalid
Session lifecycle:
| Event | Action |
|---|---|
| Login | Create session, set cookie |
| Logout | Delete session, clear cookie |
| Idle 24h | Session expires, re-login required |
| Password change | Invalidate all sessions |
| Token revocation | Existing sessions continue (token != session) |
Secure session storage:
# Store sessions in DB, not filesystem
from flask_session import Session
app.config['SESSION_TYPE'] = 'sqlalchemy'
app.config['SESSION_SQLALCHEMY_TABLE'] = 'sessions'
Database schema:
-- Publishers
CREATE TABLE publishers (
id INTEGER PRIMARY KEY AUTOINCREMENT,
email TEXT UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
slug TEXT UNIQUE NOT NULL, -- immutable namespace: "rob", "alice-dev"
display_name TEXT NOT NULL, -- mutable: "Rob", "Alice Developer"
bio TEXT,
website TEXT,
verified BOOLEAN DEFAULT FALSE,
locked_until TIMESTAMP, -- account lockout
failed_login_attempts INTEGER DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- API tokens (one publisher can have multiple)
CREATE TABLE api_tokens (
id INTEGER PRIMARY KEY AUTOINCREMENT,
publisher_id INTEGER NOT NULL REFERENCES publishers(id),
token_hash TEXT NOT NULL,
name TEXT NOT NULL, -- "CLI token", "CI token"
last_used_at TIMESTAMP,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
revoked_at TIMESTAMP -- NULL if active
);
-- Tools (links to publisher)
CREATE TABLE tools (
id INTEGER PRIMARY KEY AUTOINCREMENT,
owner TEXT NOT NULL, -- namespace slug (immutable, from publisher.slug)
name TEXT NOT NULL,
version TEXT NOT NULL,
description TEXT,
category TEXT,
tags TEXT, -- JSON array
config_yaml TEXT NOT NULL, -- Full tool config
readme TEXT,
publisher_id INTEGER NOT NULL REFERENCES publishers(id),
deprecated BOOLEAN DEFAULT FALSE,
deprecated_message TEXT,
replacement TEXT,
downloads INTEGER DEFAULT 0,
published_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UNIQUE(owner, name, version)
);
-- Download stats (for deduplication)
CREATE TABLE download_stats (
id INTEGER PRIMARY KEY AUTOINCREMENT,
tool_id INTEGER NOT NULL REFERENCES tools(id),
client_id TEXT NOT NULL,
downloaded_at DATE NOT NULL,
UNIQUE(tool_id, client_id, downloaded_at)
);
-- Search index (FTS5)
CREATE VIRTUAL TABLE tools_fts USING fts5(
name, description, tags, readme,
content='tools',
content_rowid='id'
);
-- FTS5 sync triggers (required for external content tables)
CREATE TRIGGER tools_ai AFTER INSERT ON tools BEGIN
INSERT INTO tools_fts(rowid, name, description, tags, readme)
VALUES (new.id, new.name, new.description, new.tags, new.readme);
END;
CREATE TRIGGER tools_ad AFTER DELETE ON tools BEGIN
INSERT INTO tools_fts(tools_fts, rowid, name, description, tags, readme)
VALUES ('delete', old.id, old.name, old.description, old.tags, old.readme);
END;
CREATE TRIGGER tools_au AFTER UPDATE ON tools BEGIN
INSERT INTO tools_fts(tools_fts, rowid, name, description, tags, readme)
VALUES ('delete', old.id, old.name, old.description, old.tags, old.readme);
INSERT INTO tools_fts(rowid, name, description, tags, readme)
VALUES (new.id, new.name, new.description, new.tags, new.readme);
END;
-- Pending PRs (track publish state)
CREATE TABLE pending_prs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
publisher_id INTEGER NOT NULL REFERENCES publishers(id),
owner TEXT NOT NULL,
name TEXT NOT NULL,
version TEXT NOT NULL,
pr_number INTEGER NOT NULL,
pr_url TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'pending', -- pending, merged, closed
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UNIQUE(owner, name, version)
);
-- Webhook sync log (idempotency)
CREATE TABLE webhook_log (
id INTEGER PRIMARY KEY AUTOINCREMENT,
delivery_id TEXT UNIQUE NOT NULL, -- Gitea delivery ID
event_type TEXT NOT NULL,
processed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Note on tags indexing: The tags column stores JSON arrays as text. For v1, FTS5 will search within the JSON string. If tag filtering becomes a bottleneck, normalize to a tool_tags junction table:
-- Future: normalized tags (if needed)
CREATE TABLE tags (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT UNIQUE NOT NULL
);
CREATE TABLE tool_tags (
tool_id INTEGER REFERENCES tools(id),
tag_id INTEGER REFERENCES tags(id),
PRIMARY KEY (tool_id, tag_id)
);
CLI first-time publish flow:
$ smarttools registry publish
No registry account configured.
1. Register at: https://gitea.brrd.tech/registry/register
2. Generate a token from your dashboard
3. Enter your token below
Registry token: ********
Token saved to ~/.smarttools/config.yaml
Validating tool...
✓ config.yaml is valid
✓ README.md exists (2.3 KB)
✓ Version 1.0.0 not yet published
Publishing rob/my-tool@1.0.0...
✓ PR created: https://gitea.brrd.tech/rob/SmartTools-Registry/pulls/42
Your tool is pending review. You'll receive an email when it's approved.
CLI Commands Reference
Full mapping of CLI commands to API calls:
Registry Commands
# Search for tools
$ smarttools registry search <query> [--category=<cat>] [--limit=20]
→ GET /api/v1/tools/search?q=<query>&category=<cat>&limit=20
# Browse tools (TUI)
$ smarttools registry browse [--category=<cat>]
→ GET /api/v1/tools?category=<cat>&page=1
→ GET /api/v1/categories
# View tool details
$ smarttools registry info <owner/name>
→ GET /api/v1/tools/<owner>/<name>
# Install a tool
$ smarttools registry install <owner/name> [--version=<ver>]
→ GET /api/v1/tools/<owner>/<name>/download?version=<ver>&install=true
→ Writes to ~/.smarttools/<owner>/<name>/config.yaml
→ Generates ~/.local/bin/<name> wrapper (or <owner>-<name> if collision)
# Uninstall a tool
$ smarttools registry uninstall <owner/name>
→ Removes ~/.smarttools/<owner>/<name>/
→ Removes wrapper script
# Publish a tool
$ smarttools registry publish [path] [--dry-run]
→ POST /api/v1/tools (with registry token)
→ Returns PR URL
# List my published tools
$ smarttools registry my-tools
→ GET /api/v1/me/tools (with registry token)
# Update index cache
$ smarttools registry update
→ GET /api/v1/index.json
→ Writes to ~/.smarttools/registry/index.json
Project Commands
# Install project dependencies from smarttools.yaml
$ smarttools install
→ Reads ./smarttools.yaml
→ For each dependency:
GET /api/v1/tools/<owner>/<name>/download?version=<constraint>&install=true
→ Installs to ~/.smarttools/<owner>/<name>/
# Add a dependency to smarttools.yaml
$ smarttools add <owner/name> [--version=<constraint>]
→ Adds to ./smarttools.yaml dependencies
→ Runs install for that tool
# Show project dependencies status
$ smarttools deps
→ Reads ./smarttools.yaml
→ Shows installed status for each dependency
→ Note: "smarttools list" is reserved for listing installed tools
Command naming note: smarttools list already exists to list locally installed tools. Use smarttools deps to show project manifest dependencies.
Flags available on most commands
| Flag | Description |
|---|---|
| --offline | Use cached index only, don't fetch |
| --refresh | Force refresh of cached data |
| --json | Output in JSON format |
| --verbose | Show detailed output |
Webhooks and Security
HMAC Verification
All Gitea webhooks are verified using HMAC-SHA256:
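import hmac
import hashlib

def verify_webhook(request, secret):
    # Gitea sends the hex-encoded HMAC-SHA256 of the raw request body
    signature = request.headers.get('X-Gitea-Signature')
    if not signature:
        return False
    expected = hmac.new(
        secret.encode(),
        request.get_data(),  # raw request bytes (Flask request object)
        hashlib.sha256
    ).hexdigest()
    # Constant-time comparison to avoid timing leaks
    return hmac.compare_digest(signature, expected)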
Replay Protection
While sync is idempotent, implement basic replay protection:
from datetime import datetime

def process_webhook(request):
    # Verify signature first so unauthenticated callers learn nothing about past deliveries
    if not verify_webhook(request, WEBHOOK_SECRET):
        return {"error": "invalid_signature"}, 401
    delivery_id = request.headers.get('X-Gitea-Delivery')
    # Check if already processed
    if db.webhook_log.exists(delivery_id=delivery_id):
        return {"status": "already_processed"}, 200
    # Process with lock to prevent concurrent processing
    with db.lock(f"webhook:{delivery_id}"):
        # Double-check after acquiring lock
        if db.webhook_log.exists(delivery_id=delivery_id):
            return {"status": "already_processed"}, 200
        # Process the webhook
        result = sync_from_repo()
        # Log successful processing
        db.webhook_log.insert(
            delivery_id=delivery_id,
            event_type=request.json.get('action'),
            processed_at=datetime.utcnow()
        )
    return {"status": "processed"}, 200
Sync Job Locking
Prevent concurrent sync operations:
from glob import glob

# Using file lock or database advisory lock
SYNC_LOCK_TIMEOUT = 300 # 5 minutes max

def sync_from_repo():
    try:
        with acquire_lock("registry_sync", timeout=SYNC_LOCK_TIMEOUT):
            # Pull latest from Gitea
            repo.fetch()
            repo.reset('origin/main', hard=True)
            # Parse and update database
            for tool_path in glob('tools/*/*/config.yaml'):
                update_tool_in_db(tool_path)
            # Rebuild FTS index if needed
            rebuild_fts_index()
    except LockTimeout:
        logger.warning("Sync already in progress, skipping")
        return {"status": "skipped", "reason": "sync_in_progress"}
Atomic Sync Strategy
To avoid partially updated DB during webhook sync, use transactional table swap:
def sync_from_repo_atomic():
    with acquire_lock("registry_sync", timeout=SYNC_LOCK_TIMEOUT):
        # 1. Pull latest from Gitea
        repo.fetch()
        repo.reset('origin/main', hard=True)
        # 2. Parse all tools into memory
        new_tools = []
        for tool_path in glob('tools/*/*/config.yaml'):
            tool_data = parse_tool(tool_path)
            if tool_data:
                new_tools.append(tool_data)
        # 3. Atomic swap using transaction
        with db.transaction():
            # Create temp table
            # (CREATE TABLE ... AS SELECT copies columns only; indexes, constraints,
            # and the FTS sync triggers must be recreated on the new table)
            db.execute("CREATE TABLE tools_new AS SELECT * FROM tools WHERE 0")
            # Bulk insert into temp table
            for tool in new_tools:
                db.execute("INSERT INTO tools_new ...", tool)
            # Swap tables atomically
            db.execute("ALTER TABLE tools RENAME TO tools_old")
            db.execute("ALTER TABLE tools_new RENAME TO tools")
            db.execute("DROP TABLE tools_old")
            # Rebuild FTS index
            db.execute("INSERT INTO tools_fts(tools_fts) VALUES('rebuild')")
            # Update sync timestamp
            db.execute("UPDATE sync_status SET last_sync = ?", [datetime.utcnow()])
Why atomic: Per-row updates with FTS triggers can yield inconsistent reads under load. Readers may see partial state mid-sync. Table swap ensures all-or-nothing visibility.
Error Handling
| Error Scenario | Behavior |
|---|---|
| Repo fetch fails | Log error, retry in 5 min, alert if 3 failures |
| YAML parse error | Skip tool, log error, continue with others |
| Database write fails | Rollback transaction, retry once, then alert |
| Lock timeout | Skip this sync, next webhook will retry |
Automated CI Validation
PRs are validated automatically using SmartTools (dogfooding):
          PR Submitted
                │
                ▼
┌─────────────────────────────────────┐
│  Gitea CI runs validation tools:    │
│  • schema-validator                 │
│  • security-scanner                 │
│  • duplicate-detector               │
└───────────────┬─────────────────────┘
                │
        ┌───────┴───────┐
        │               │
    All pass        Any fail
        │               │
        ▼               ▼
  Auto-merge or    Add comment,
  flag for review  request changes
Validation checks:
- Schema validation: config.yaml matches expected format
- Security scan: No dangerous shell commands, no secrets in prompts
- Duplicate detection: AI-powered similarity check against existing tools
- README check: README.md exists and is non-empty
CI workflow (.gitea/workflows/validate.yaml):
name: Validate Tool Submission
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Validate schema
        run: python scripts/validate_tool.py ${{ github.event.pull_request.head.sha }}
      - name: Security scan
        run: smarttools run security-scanner < changed_files.txt
      - name: Check duplicates
        run: smarttools run duplicate-detector < changed_files.txt
Registry Repository Structure
Full structure of the SmartTools-Registry repo:
SmartTools-Registry/
├── README.md                    # Registry overview
├── CONTRIBUTING.md              # How to submit tools
├── LICENSE
│
├── tools/                       # All published tools
│   ├── rob/
│   │   ├── summarize/
│   │   │   ├── config.yaml
│   │   │   └── README.md
│   │   └── translate/
│   │       ├── config.yaml
│   │       └── README.md
│   └── alice/
│       └── code-review/
│           ├── config.yaml
│           └── README.md
│
├── categories/
│   └── categories.yaml          # Category definitions
│
├── index.json                   # Auto-generated search index
│
├── .gitea/
│   └── workflows/
│       ├── validate.yaml        # PR validation
│       ├── build-index.yaml     # Rebuild index on merge
│       └── notify-api.yaml      # Webhook to API server
│
└── scripts/
    ├── validate_tool.py         # Schema validation
    ├── build_index.py           # Generate index.json
    ├── check_duplicates.py      # Similarity detection
    └── security_scan.py         # Security checks
categories.yaml format:
categories:
  - name: text-processing
    description: Tools for manipulating and analyzing text
    icon: 📝
  - name: code
    description: Tools for code review, generation, and analysis
    icon: 💻
  - name: data
    description: Tools for data transformation and analysis
    icon: 📊
  - name: media
    description: Tools for image, audio, and video processing
    icon: 🎨
  - name: productivity
    description: General productivity and automation tools
    icon: ⚡
Download Stats
Counting Methodology
- Count installs only, not views or searches
- Increment after successful download (response sent)
- Dedupe by client_id + tool_id + date
import hashlib
from datetime import date

def download_tool(owner, name, version, install=False, client_id=None):
    tool = get_tool(owner, name, version)
    if not tool:
        return {"error": "not_found"}, 404
    config_yaml = tool.config_yaml
    # Only count if this is an install (not just viewing)
    if install:
        record_download(tool.id, client_id)
    return {"config": config_yaml}, 200

def record_download(tool_id, client_id):
    today = date.today()
    # Use client_id if provided, otherwise derive an anonymous fallback from the IP.
    # hashlib is used instead of the built-in hash(), which is randomized per process
    # for strings and would break dedupe across restarts and workers.
    ip_hash = hashlib.sha256(request.remote_addr.encode()).hexdigest()[:16]
    effective_client_id = client_id or f"anon_{ip_hash}"
    # Dedupe: only count once per client per tool per day
    try:
        db.download_stats.insert(
            tool_id=tool_id,
            client_id=effective_client_id,
            downloaded_at=today
        )
        # Increment counter (can be async/batch updated)
        db.execute("UPDATE tools SET downloads = downloads + 1 WHERE id = ?", [tool_id])
    except IntegrityError:
        pass  # Already counted today, ignore
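The dedupe rule only works if the database enforces uniqueness on client, tool, and date. The sketch below shows one way to get that with a UNIQUE constraint in SQLite. The table layout is an assumption for illustration (the real schema is defined elsewhere in this document), and the date is passed in explicitly so the example is deterministic.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tools (id INTEGER PRIMARY KEY, downloads INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE download_stats (
        tool_id       INTEGER NOT NULL,
        client_id     TEXT    NOT NULL,
        downloaded_at TEXT    NOT NULL,
        UNIQUE (tool_id, client_id, downloaded_at)  -- one row per client, tool, day
    );
    INSERT INTO tools (id) VALUES (1);
""")

def record_download(tool_id: int, client_id: str, day: date) -> bool:
    """Return True if the download was counted, False if it was a duplicate."""
    try:
        with conn:  # commits on success, rolls back if the insert is rejected
            conn.execute(
                "INSERT INTO download_stats VALUES (?, ?, ?)",
                (tool_id, client_id, day.isoformat()),
            )
            conn.execute("UPDATE tools SET downloads = downloads + 1 WHERE id = ?", (tool_id,))
        return True
    except sqlite3.IntegrityError:
        return False

day = date(2025, 1, 20)
results = [
    record_download(1, "anon_a", day),
    record_download(1, "anon_a", day),  # same client, same tool, same day
    record_download(1, "anon_b", day),
]
downloads = conn.execute("SELECT downloads FROM tools WHERE id = 1").fetchone()[0]
```

Because the insert and the counter update share a transaction, a rejected duplicate never bumps the counter.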
Client ID Generation
CLI generates a persistent anonymous ID on first run:
# In CLI, on first run
import uuid
import os

CONFIG_PATH = os.path.expanduser("~/.smarttools/config.yaml")

def get_or_create_client_id():
    config = load_config()
    if 'client_id' not in config:
        config['client_id'] = f"anon_{uuid.uuid4().hex[:16]}"
        save_config(config)
    return config['client_id']
Fallback when client_id missing:
- If the X-Client-ID header is not sent, use an IP hash as fallback
- This still provides some dedupe for anonymous users
- Logged-in users' downloads are attributed to their account instead
Privacy Considerations
- No IP addresses stored in database
- client_id is client-controlled and can be regenerated
- Stats are aggregated (total count), not individual tracking
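One way to reconcile the IP-hash fallback with the "no IP addresses stored" rule is to store only a keyed hash of the address. The sketch below is an assumption about how that might be done: `SERVER_SALT` and both function names are hypothetical, and the salt would come from server configuration rather than source code.

```python
import hashlib
import hmac

SERVER_SALT = b"replace-with-a-secret-from-config"  # hypothetical server-side secret

def anonymous_client_id(remote_addr: str) -> str:
    """Derive a stable anonymous ID from an IP without storing the IP itself."""
    digest = hmac.new(SERVER_SALT, remote_addr.encode(), hashlib.sha256).hexdigest()
    return f"anon_{digest[:16]}"

def effective_client_id(header_value, remote_addr: str) -> str:
    """Prefer the client-supplied X-Client-ID value, fall back to the keyed IP hash."""
    return header_value or anonymous_client_id(remote_addr)

first = effective_client_id(None, "203.0.113.7")
second = effective_client_id(None, "203.0.113.7")
explicit = effective_client_id("anon_1234567890abcdef", "203.0.113.7")
```

A keyed hash (HMAC) is used rather than a bare hash because the IPv4 space is small enough to brute-force an unsalted digest. Python's built-in `hash()` is unsuitable here since its output for strings changes between processes.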
Async Stats Strategy
To avoid DB contention on the hot download path:
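import time
from datetime import date
from queue import Empty, Queue
from threading import Thread

# In-memory queue for stats
stats_queue = Queue()
FLUSH_INTERVAL = 5  # seconds

def record_download_async(tool_id, client_id):
    """Non-blocking: enqueue for background processing"""
    stats_queue.put({
        'tool_id': tool_id,
        'client_id': client_id,
        'date': date.today()
    })

def stats_worker():
    """Background thread: flush queued stats at least every FLUSH_INTERVAL seconds"""
    batch = []
    last_flush = time.monotonic()
    while True:
        try:
            batch.append(stats_queue.get(timeout=FLUSH_INTERVAL))
        except Empty:
            pass
        # Flush on a timer, not only when the queue runs dry; otherwise a steady
        # stream of downloads would keep the batch growing without ever flushing.
        if batch and time.monotonic() - last_flush >= FLUSH_INTERVAL:
            flush_batch(batch)
            batch = []
            last_flush = time.monotonic()

def flush_batch(batch):
    """Bulk insert with conflict ignore"""
    with db.transaction():
        for item in batch:
            try:
                db.execute("""
                    INSERT INTO download_stats (tool_id, client_id, downloaded_at)
                    VALUES (?, ?, ?)
                    ON CONFLICT DO NOTHING
                """, [item['tool_id'], item['client_id'], item['date']])
            except Exception as e:
                logger.warning(f"Stats insert failed: {e}")
                # Don't fail downloads for stats errors

# Start the worker once at application startup
Thread(target=stats_worker, daemon=True).start()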
Failure behavior: If stats DB write fails, log the error but don't fail the download. Stats are "best effort" - the download must succeed.
Search
- Primary search: SQLite FTS5 inside the API.
- index.json provides offline CLI search and backup.
- If FTS5 is stale, return results with X-Search-Index-Stale: true.
API Caching Strategy
Cache Headers
| Endpoint | Cache-Control | ETag | Notes |
|---|---|---|---|
| GET /index.json | max-age=300, stale-while-revalidate=60 | Yes | 5 min cache, background refresh |
| GET /tools/{owner}/{name} | max-age=60 | Yes | 1 min cache |
| GET /tools/{owner}/{name}/download | max-age=3600, immutable | Yes | Immutable versions, 1 hour |
| GET /tools/search | no-cache | No | Always fresh |
| GET /categories | max-age=3600 | Yes | Categories change rarely |
ETag Implementation
import hashlib
from datetime import datetime

def get_tool_etag(tool):
    """Generate ETag from tool identity (immutable versions don't change)"""
    # Since versions are immutable, owner/name@version is stable
    # Use published_at for extra safety (not updated_at, which doesn't exist)
    content = f"{tool.owner}/{tool.name}@{tool.version}:{tool.published_at.isoformat()}"
    return hashlib.md5(content.encode()).hexdigest()

def get_index_etag():
    """Generate ETag from last sync timestamp"""
    last_sync = db.get_last_sync_time()
    return hashlib.md5(last_sync.isoformat().encode()).hexdigest()

@app.route('/api/v1/tools/<owner>/<name>/download')
def download_tool(owner, name):
    version = request.args.get('version', 'latest')
    tool = resolve_and_get_tool(owner, name, version)
    etag = get_tool_etag(tool)
    # Check If-None-Match header
    if request.headers.get('If-None-Match') == etag:
        return '', 304  # Not Modified
    response = jsonify({
        "data": {
            "owner": tool.owner,
            "name": tool.name,
            "resolved_version": tool.version,
            "config": tool.config_yaml
        }
    })
    response.headers['ETag'] = etag
    response.headers['Cache-Control'] = 'max-age=3600, immutable'
    return response
Note: Since tool versions are immutable, the ETag based on owner/name@version is permanently stable. The published_at timestamp is included for defense-in-depth but won't change.
DB vs Repo Read Strategy
| Scenario | Read From | Reason |
|---|---|---|
| Normal operation | SQLite DB | Fast, indexed, FTS |
| DB empty/corrupted | Gitea repo | Fallback/recovery |
| Webhook sync in progress | DB (stale OK) | Avoid blocking reads |
| Search query | SQLite FTS5 | Full-text search |
| Download specific version | DB, fallback to repo | DB is cache, repo is truth |
Staleness Detection
from datetime import datetime, timedelta

STALE_THRESHOLD = timedelta(minutes=10)

def is_db_stale():
    last_sync = db.get_last_sync_time()
    return datetime.utcnow() - last_sync > STALE_THRESHOLD

@app.route('/api/v1/tools/search')
def search_tools():
    # Flask exposes query-string parameters via request.args, not as view arguments
    q = request.args.get('q', '')
    results = db.search_fts(q)
    response = jsonify({"results": results})
    if is_db_stale():
        response.headers['X-Search-Index-Stale'] = 'true'
        response.headers['X-Last-Sync'] = db.get_last_sync_time().isoformat()
    return response
Error Model
Response Envelopes
Success response:
{
  "data": { ... },
  "meta": {
    "page": 1,
    "per_page": 20,
    "total": 42,
    "total_pages": 3
  }
}
Error response:
{
  "error": {
    "code": "TOOL_NOT_FOUND",
    "message": "Tool 'foo/bar' does not exist",
    "details": {
      "owner": "foo",
      "name": "bar",
      "suggestion": "Did you mean 'rob/bar'?"
    },
    "docs_url": "https://registry.smarttools.dev/docs/errors#TOOL_NOT_FOUND"
  }
}
Error Codes
| Code | HTTP | Description |
|---|---|---|
| TOOL_NOT_FOUND | 404 | Tool does not exist |
| VERSION_NOT_FOUND | 404 | Requested version doesn't exist |
| VERSION_EXISTS | 409 | Cannot overwrite published version |
| INVALID_VERSION | 400 | Version string is not valid semver |
| INVALID_CONSTRAINT | 400 | Version constraint syntax error |
| CONSTRAINT_UNSATISFIABLE | 404 | No version matches constraint |
| VALIDATION_ERROR | 400 | Tool config validation failed |
| UNAUTHORIZED | 401 | Missing or invalid auth token |
| FORBIDDEN | 403 | Token valid but lacks permission |
| RATE_LIMITED | 429 | Too many requests |
| SLUG_TAKEN | 409 | Namespace slug already registered |
| ACCOUNT_LOCKED | 403 | Too many failed login attempts |
| SERVER_ERROR | 500 | Internal error (logged for debugging) |
Error Scenarios and Fallbacks
CLI Error Handling
| Scenario | CLI Behavior | User Message |
|---|---|---|
| Registry offline | Use cached tools if available | "Registry unavailable. Using cached version." |
| Tool not found | Check cache, then fail | "Tool 'foo/bar' not found in registry or cache." |
| Version constraint unsatisfiable | Show available versions | "No version matches '>=5.0.0'. Available: 1.0.0, 1.1.0, 1.2.0" |
| Auth token expired | Prompt for new token | "Token expired. Please re-authenticate." |
| Rate limited | Wait and retry (backoff) | "Rate limited. Retrying in 30 seconds..." |
| Network timeout | Retry with backoff, then fail | "Connection timed out. Check your network." |
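The "retry with backoff, then fail" rows can be sketched as a small wrapper. The retry count, base delay, and the set of exceptions treated as transient are assumptions. The sleep function is injectable so the example runs instantly and can be checked.

```python
import time

def with_backoff(fetch, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on a transient network error wait 1x, 2x, 4x... base_delay and retry."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except (TimeoutError, ConnectionError):
            if attempt == retries:
                raise  # out of retries: let the CLI print its failure message
            sleep(base_delay * 2 ** attempt)

# Demo: a fetch that times out twice, then succeeds.
delays = []
calls = {"n": 0}

def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "index"

result = with_backoff(flaky_fetch, sleep=delays.append)
```

In the demo, `delays` records the waits instead of sleeping, so it ends up holding the two backoff intervals used before the third call succeeds.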
Validation Failure Details
When VALIDATION_ERROR occurs, provide specific field errors:
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Tool configuration is invalid",
    "details": {
      "errors": [
        {
          "path": "steps[0].provider",
          "message": "Provider 'gpt5' is not recognized",
          "allowed": ["claude", "openai", "ollama", "mock"]
        },
        {
          "path": "version",
          "message": "Version '1.0' is not valid semver (use '1.0.0')"
        }
      ]
    },
    "docs_url": "https://registry.smarttools.dev/docs/tool-format"
  }
}
Dependency Resolution Failures
When smarttools install fails on a manifest:
$ smarttools install
Error: Could not resolve all dependencies
rob/summarize@^2.0.0
✗ No matching version (latest: 1.2.0)
alice/translate@>=1.0.0
✓ Found 1.3.0
Suggestions:
- Update rob/summarize constraint to "^1.0.0"
- Contact the tool author for a v2 release
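The resolution failure shown above can be reproduced with a simplified constraint matcher. This sketch assumes npm-style caret semantics (same major version, at least the given version), supports only `*`, `^`, `>=`, and exact versions, and ignores pre-release tags and the special handling usually applied to 0.x versions. All function names are hypothetical.

```python
def parse(version: str) -> tuple:
    """Turn '1.2.0' into (1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def matches(version: str, constraint: str) -> bool:
    v = parse(version)
    if constraint == "*":
        return True
    if constraint.startswith("^"):
        base = parse(constraint[1:])
        return v >= base and v[0] == base[0]  # same major, not older than base
    if constraint.startswith(">="):
        return v >= parse(constraint[2:])
    return v == parse(constraint)  # exact pin

def resolve(available: list, constraint: str):
    """Return the highest matching version, or None if the constraint is unsatisfiable."""
    candidates = [v for v in available if matches(v, constraint)]
    return max(candidates, key=parse) if candidates else None

summarize_versions = ["1.0.0", "1.1.0", "1.2.0"]
unsatisfiable = resolve(summarize_versions, "^2.0.0")
suggested = resolve(summarize_versions, "^1.0.0")
translate = resolve(["1.0.0", "1.3.0"], ">=1.0.0")
```

A `None` result is what would map to the CONSTRAINT_UNSATISFIABLE error and the "No matching version" line in the CLI output.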
Graceful Degradation
| Component Down | Fallback Behavior |
|---|---|
| API server | CLI uses ~/.smarttools/registry/index.json for search |
| Gitea repo | API serves from DB cache (may be stale) |
| FTS5 index | Fall back to LIKE queries (slower but works) |
| Network | Use locally installed tools, skip registry features |
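The FTS5 row in the table could be implemented along these lines. The sketch assumes a `tools` table with `name` and `description` columns and an FTS table called `tools_fts`. In the demo the FTS table is deliberately absent, so the LIKE fallback path is the one exercised.

```python
import sqlite3

def search_tools(conn: sqlite3.Connection, query: str) -> list:
    """Search via FTS5 when available, otherwise fall back to a LIKE scan."""
    try:
        rows = conn.execute(
            "SELECT name FROM tools_fts WHERE tools_fts MATCH ? ORDER BY rank",
            (query,),
        ).fetchall()
    except sqlite3.OperationalError:
        # FTS table missing or unusable: slower substring match on the base table.
        # Note: % and _ in the query act as wildcards here; escape them in real code.
        pattern = f"%{query}%"
        rows = conn.execute(
            "SELECT name FROM tools WHERE name LIKE ? OR description LIKE ? ORDER BY name",
            (pattern, pattern),
        ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tools (name TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO tools VALUES (?, ?)",
    [("summarize", "Summarize text using AI"),
     ("translate", "Translate text between languages")],
)
hits_summ = search_tools(conn, "summ")
hits_text = search_tools(conn, "text")
```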
UX Requirements (CLI/TUI)
Publishing UX
- smarttools registry publish --dry-run validates locally and shows what would be submitted:
  $ smarttools registry publish --dry-run
  Validating tool...
  ✓ config.yaml is valid
  ✓ README.md exists (2.3 KB)
  ✓ Version 1.1.0 not yet published
  Would submit:
  Owner: rob
  Name: summarize
  Version: 1.1.0
  Category: text-processing
  Tags: summarization, ai, text
  Config preview:
  ─────────────────────────────
  name: summarize
  version: "1.1.0"
  description: Summarize text using AI
  ...
  ─────────────────────────────
  Run without --dry-run to submit for review.
- Version bump reminder: CLI warns if version hasn't changed from published:
  ⚠ Version 1.0.0 is already published. Bump version in config.yaml to publish changes.
- First-time publishing flow prompts for token and saves it to config.
Progress Indicators
Long-running operations show progress:
$ smarttools install
Installing project dependencies...
[1/3] rob/summarize@^1.0.0
Resolving version... 1.2.0
Downloading... done
Installing... done ✓
[2/3] alice/translate@>=2.0.0
Resolving version... 2.1.0
Downloading... done
Installing... done ✓
[3/3] official/code-review@*
Resolving version... 1.0.0
Downloading... done
Installing... done ✓
✓ Installed 3 tools
$ smarttools registry publish
Submitting rob/summarize@1.1.0...
Validating... done ✓
Uploading... done ✓
Creating PR... done ✓
✓ PR created: https://gitea.brrd.tech/rob/SmartTools-Registry/pulls/42
Your tool is pending review. You'll receive an email when it's approved.
TUI Browse
smarttools registry browse opens a full-screen terminal UI:
┌─ SmartTools Registry ───────────────────────────────────────┐
│ Search: [________________] [All Categories ▼] [Sort: Popular ▼] │
├─────────────────────────────────────────────────────────────┤
│ │
│ ▶ rob/summarize v1.2.0 ⬇ 142 │
│ Summarize text using AI │
│ [text-processing] [ai] [summarization] │
│ │
│ alice/translate v2.1.0 ⬇ 98 │
│ Translate text between languages │
│ [text-processing] [translation] │
│ │
│ official/code-review v1.0.0 ⬇ 87 │
│ AI-powered code review │
│ [code] [review] [ai] │
│ │
├─────────────────────────────────────────────────────────────┤
│ ↑↓ Navigate Enter: Details i: Install /: Search q: Quit │
└─────────────────────────────────────────────────────────────┘
Keyboard shortcuts:
| Key | Action |
|---|---|
| ↑/↓ or j/k | Navigate list |
| Enter | View tool details |
| i | Install selected tool |
| / | Focus search box |
| c | Change category filter |
| s | Change sort order |
| ? | Show help |
| q | Quit |
Virtual scrolling: For large tool lists (>100), use virtual scrolling to maintain performance.
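Virtual scrolling boils down to rendering only the slice of rows that fits on screen and shifting that slice as the selection moves. The following is a framework-independent sketch of the window arithmetic. `visible_window` is a hypothetical helper, not part of any TUI library.

```python
def visible_window(total: int, selected: int, height: int, top: int):
    """Return (new_top, visible_indices) so the selected row stays on screen.

    total    - number of tools in the full list
    selected - index of the highlighted tool
    height   - number of rows the list area can show
    top      - index of the first row shown before this update
    """
    if selected < top:                     # moved above the window: scroll up
        top = selected
    elif selected >= top + height:         # moved below the window: scroll down
        top = selected - height + 1
    top = max(0, min(top, max(0, total - height)))  # clamp to the list bounds
    return top, range(top, min(total, top + height))

top, rows = visible_window(total=500, selected=10, height=10, top=0)
end_top, end_rows = visible_window(total=500, selected=499, height=10, top=0)
short_top, short_rows = visible_window(total=3, selected=2, height=10, top=0)
```

Only the indices in the returned range need to be drawn, so redraw cost depends on the terminal height rather than the size of the registry.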
Project Initialization
$ smarttools init
Creating smarttools.yaml...
Project name [my-project]: my-ai-project
Version [1.0.0]:
Would you like to add any tools? (search with 's', skip with Enter)
> s
Search: summ
1. rob/summarize v1.2.0 - Summarize text using AI
2. alice/summary v1.0.0 - Generate summaries
Add tool (number, or Enter to finish): 1
Added rob/summarize@^1.2.0
Add tool (number, or Enter to finish):
✓ Created smarttools.yaml
name: my-ai-project
version: "1.0.0"
dependencies:
- name: rob/summarize
version: "^1.2.0"
Run 'smarttools install' to install dependencies.
Accessibility
- CLI: All output works with screen readers, no color-only information
- TUI: Full keyboard navigation, high-contrast mode support
- Web UI: WCAG 2.1 AA compliance target
- Semantic HTML
- ARIA labels for interactive elements
- Focus management in modals
- Skip links for navigation
Offline Cache
Cache registry index locally:
~/.smarttools/registry/index.json
Refresh when older than 24 hours; support --offline and --refresh flags.
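The refresh rule can be expressed as a single decision function. In this sketch the precedence (an explicit `--offline` wins over `--refresh`) is an assumption, as is the function name. `generated_at` is the timestamp of the cached index, or None when no cache exists.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=24)

def should_refresh(generated_at, now, offline=False, refresh=False) -> bool:
    """Decide whether the CLI should fetch a fresh index.json."""
    if offline:
        return False            # never touch the network in offline mode
    if refresh or generated_at is None:
        return True             # forced refresh, or nothing cached yet
    return now - generated_at > MAX_AGE

now = datetime(2025, 1, 21, 13, 0, tzinfo=timezone.utc)
fresh = datetime(2025, 1, 21, 1, 0, tzinfo=timezone.utc)    # 12 hours old
stale = datetime(2025, 1, 20, 12, 0, tzinfo=timezone.utc)   # 25 hours old

decisions = {
    "fresh": should_refresh(fresh, now),
    "stale": should_refresh(stale, now),
    "stale_offline": should_refresh(stale, now, offline=True),
    "fresh_forced": should_refresh(fresh, now, refresh=True),
    "no_cache": should_refresh(None, now),
}
```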
Index Integrity
The cached index.json includes integrity metadata:
{
"version": "1.0",
"generated_at": "2025-01-20T12:00:00Z",
"checksum": "sha256:abc123...",
"tool_count": 142,
"tools": [...]
}
API response headers:
ETag: "abc123def456"
X-Index-Checksum: sha256:abc123...
X-Index-Generated: 2025-01-20T12:00:00Z
CLI verification:
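import hashlib
import json

def verify_cached_index():
    """Verify cached index integrity on load"""
    cached = load_cached_index()
    if not cached:
        return None
    # A partial write can leave required fields missing; treat that as corruption
    if 'tools' not in cached or 'checksum' not in cached:
        logger.warning("Cached index is missing fields, will refresh")
        return None
    # Verify checksum
    content = json.dumps(cached['tools'], sort_keys=True)
    computed = hashlib.sha256(content.encode()).hexdigest()
    if computed != cached['checksum'].replace('sha256:', ''):
        logger.warning("Cached index checksum mismatch, will refresh")
        return None
    return cached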
Corruption handling:
- If checksum fails, discard cache and fetch fresh
- If partial write detected (missing fields), discard and refresh
- CLI shows warning: "Cached index corrupted, fetching fresh copy..."
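Partial writes are easier to prevent than to detect. The sketch below shows a writer that computes the checksum over the sorted-key JSON of the `tools` list and replaces the cache file atomically. The helper names are hypothetical, and `generated_at` is omitted for brevity.

```python
import hashlib
import json
import os
import tempfile

def tools_checksum(tools: list) -> str:
    content = json.dumps(tools, sort_keys=True)
    return "sha256:" + hashlib.sha256(content.encode()).hexdigest()

def build_index(tools: list) -> dict:
    return {"version": "1.0", "checksum": tools_checksum(tools),
            "tool_count": len(tools), "tools": tools}

def write_index_atomic(index: dict, path: str) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(index, f)
        os.replace(tmp_path, path)  # atomic: readers see the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

def checksum_ok(index: dict) -> bool:
    return "tools" in index and index.get("checksum") == tools_checksum(index["tools"])

with tempfile.TemporaryDirectory() as cache_dir:
    index_path = os.path.join(cache_dir, "index.json")
    write_index_atomic(build_index([{"owner": "rob", "name": "summarize"}]), index_path)
    with open(index_path) as f:
        loaded = json.load(f)

intact = checksum_ok(loaded)
loaded["tools"].append({"owner": "eve", "name": "injected"})
tampered = checksum_ok(loaded)
```

Writing the temp file in the same directory matters because `os.replace` is only atomic within one filesystem.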
Web UI Vision
The registry includes a full website, not just an API:
Site structure:
registry.smarttools.dev (or gitea.brrd.tech/registry)
├── / # Landing page
├── /tools # Browse all tools
├── /tools/{owner}/{name} # Tool detail page
├── /categories # Browse by category
├── /categories/{name} # Tools in category
├── /search?q=... # Search results
├── /docs # Documentation
│ ├── /docs/getting-started
│ ├── /docs/creating-tools
│ ├── /docs/publishing
│ └── /docs/best-practices
├── /tutorials # Step-by-step guides
│ ├── /tutorials/first-tool
│ ├── /tutorials/chaining-steps
│ └── /tutorials/code-steps
├── /examples # Example projects
├── /blog # Updates, announcements (optional)
├── /register # Publisher registration
├── /login # Publisher login
├── /dashboard # Publisher dashboard
│ ├── /dashboard/tools # My published tools
│ ├── /dashboard/tokens # API tokens
│ └── /dashboard/settings # Account settings
└── /api/v1/... # API endpoints
Landing page content:
- Hero: "Share and discover AI-powered CLI tools"
- Quick install example
- Featured/popular tools
- Category highlights
- "Get Started" CTA
Tool detail page:
- Name, description, version, author
- README rendered as markdown (sanitized)
- Install command (copy-to-clipboard)
- Version history
- Download stats
- Category/tags
- "Report" button for abuse
README Security
When rendering README markdown, apply XSS sanitization:
import bleach
from markdown import markdown

ALLOWED_TAGS = [
    'h1', 'h2', 'h3', 'h4', 'h5', 'h6',
    'p', 'br', 'hr',
    'ul', 'ol', 'li',
    'strong', 'em', 'code', 'pre',
    'blockquote',
    'a', 'img',
    'table', 'thead', 'tbody', 'tr', 'th', 'td'
]

ALLOWED_ATTRS = {
    'a': ['href', 'title'],
    'img': ['src', 'alt', 'title'],
    'code': ['class'],  # for syntax highlighting
}

def render_readme_safe(readme_raw: str) -> str:
    """Convert markdown to sanitized HTML"""
    # Convert markdown to HTML
    html = markdown(readme_raw, extensions=['fenced_code', 'tables'])
    # Sanitize to prevent XSS
    safe_html = bleach.clean(
        html,
        tags=ALLOWED_TAGS,
        attributes=ALLOWED_ATTRS,
        strip=True
    )
    # Linkify URLs
    safe_html = bleach.linkify(safe_html)
    return safe_html
Storage strategy:
- Store raw README in tools.readme
- Render and sanitize on request (or cache rendered HTML)
- Never trust client-submitted HTML directly
Tech stack options:
| Option | Pros | Cons |
|---|---|---|
| Flask + Jinja + Tailwind | Simple, Python-only, fast to build | Less interactive |
| FastAPI + Vue/React SPA | Modern, interactive | More complex, separate build |
| Astro/Next.js | Great SEO, static-first | Different stack (Node.js) |
Recommendation: Flask + Jinja + Tailwind for v1
- Keeps everything in Python
- Server-rendered is fine for a registry
- Good SEO out of the box
- Can add interactivity with Alpine.js or htmx if needed
Monetization considerations:
- AdSense-compatible (server-rendered pages)
- Analytics tracking for traffic insights
- Future: sponsored tools, featured placements
- Future: premium publisher tiers (more tools, priority review)
Implementation Phases
Phase 1: Foundation
- Define smarttools.yaml manifest format
- Implement tool resolution order (local → global → registry)
- Create SmartTools-Registry repo on Gitea (bootstrap)
- Add 3-5 example tools to seed the registry
Phase 2: Core Backend
- Set up Flask/FastAPI project structure
- Implement SQLite database schema
- Build core API endpoints (list, search, get, download)
- Implement webhook receiver for Gitea sync
- Set up HMAC verification
Phase 3: CLI Commands
- smarttools registry search
- smarttools registry install
- smarttools registry info
- smarttools registry browse (TUI)
- Local index caching
Phase 4: Publishing
- Publisher registration (web UI)
- Token management
- smarttools registry publish command
- PR creation via Gitea API
- CI validation workflows
Phase 5: Project Dependencies
- smarttools install (from manifest)
- smarttools add command
- Runtime override application
- Dependency resolution
Phase 6: Smart Features
- SQLite FTS5 search index
- AI-powered auto-categorization
- Duplicate/similarity detection
- Security scanning
Phase 7: Full Web UI
- Landing page
- Tool browsing/search pages
- Tool detail pages with README rendering
- Publisher dashboard
- Documentation/tutorials section
Phase 8: Polish & Scale
- Rate limiting
- Abuse reporting
- Analytics integration
- Performance optimization
- Monitoring/alerting