AI–Human Collaboration System
Process & Architecture Design Document (v2.0)
- Feature ID: FR_2025-10-21_initial-feature-request
- Status: Design Approved (Ready for Implementation)
- Date: 2025-10-21
- Owners: Rob (maintainer), AI_Moderator (process steward)
- Contributors: AI_Claude, AI_Deepseek, AI_Junie, AI_Chat-GPT, AI_GitHubCopilot
Table of Contents
- Executive Summary
- Repository Layout
- Stage Model & Operational Procedure
- Voting, Quorum & Etiquette
- Cascading Rules System
- Orchestration Architecture
- Moderator Protocol
- Error Handling & Resilience
- Security & Secrets Management
- Performance & Scale Considerations
- Testing Strategy
- Source Intelligence Automation (Auto-Review + Auto-Diagram)
- Discussion Summaries (Companion Artifacts per Stage)
- Implementation Plan
- Risks & Mitigations
- Template Evolution
- Roles & Agent Personas
- Glossary
- Appendices
Executive Summary
We are implementing a Git-native, rules-driven workflow that enables seamless collaboration between humans and multiple AI agents across the entire software development lifecycle. The system uses cascading .ai-rules.yml configurations and a thin Bash pre-commit hook to automatically generate and maintain development artifacts (discussions, design docs, reviews, diagrams, plans). A Python orchestrator provides structured checks and status reporting while preserving the fast Bash execution path.
Core Principles
- Lightweight & Fast: Everything stored in Git as Markdown; minimal external dependencies
- Single Source of Truth: Repository contains all conversations, decisions, and code artifacts
- Self-Driving with Human Safety: AI agents can propose and vote; humans must approve critical stages
- Deterministic & Reversible: All automated actions are diffed, logged, and easily revertible
- Composable Rules: Nearest-folder precedence via cascading .ai-rules.yml configurations
Key Innovations
- Stage-Per-Discussion Model: Separate conversation threads for each development phase
- Automated Artifact Generation: Discussions automatically drive corresponding documentation
- Integrated Bug Sub-Cycles: Test failures automatically spawn bug reports with their own mini-lifecycle
- Intelligent Promotion Gates: Status-based transitions with configurable voting thresholds
- Multi-Agent Role Specialization: Different AI personas with stage-specific responsibilities
Repository Layout
Canonical Structure (Option A: Per-Feature Folders)
/ (repository root)
├─ .ai-rules.yml # Global defaults + file associations
├─ automation/ # Orchestrator & adapters
│ ├─ workflow.py # Python status/reporting (v1 non-blocking)
│ ├─ adapters/
│ │ ├─ claude_adapter.py # Model interface (future)
│ │ ├─ gitea_adapter.py # Gitea API integration (future)
│ │ └─ agent_coordinator.py # Role routing & task allocation (future)
│ ├─ agents.yml # Role → stages mapping
│ └─ config.yml # Configuration (future)
├─ process/ # Process documentation & templates
│ ├─ design.md # This document
│ ├─ policies.md # Human-friendly policy documentation
│ ├─ policies.yml # Machine-readable policy configuration
│ └─ templates/
│ ├─ feature_request.md
│ ├─ discussion.md
│ ├─ design_doc.md
│ └─ implementation_plan.md
├─ Docs/
│ ├─ features/
│ │ ├─ .ai-rules.yml # Folder-scoped rules for all features
│ │ ├─ FR_YYYY-MM-DD_<slug>/ # Individual feature folders
│ │ │ ├─ request.md # Original feature request
│ │ │ ├─ discussions/ # Stage-specific conversations
│ │ │ │ ├─ feature.discussion.md # Discuss the request
│ │ │ │ ├─ design.discussion.md # Discuss the design
│ │ │ │ ├─ implementation.discussion.md # Track implementation
│ │ │ │ ├─ testing.discussion.md # Plan/track testing
│ │ │ │ └─ review.discussion.md # Final review
│ │ │ ├─ design/ # Design artifacts
│ │ │ │ ├─ design.md # Evolving design document
│ │ │ │ └─ diagrams/ # Architecture diagrams
│ │ │ ├─ implementation/ # Implementation artifacts
│ │ │ │ ├─ plan.md # Implementation plan
│ │ │ │ └─ tasks.md # Task checklist
│ │ │ ├─ testing/ # Testing artifacts
│ │ │ │ ├─ testplan.md # Test strategy
│ │ │ │ └─ checklist.md # Test checklist
│ │ │ ├─ review/ # Review artifacts
│ │ │ │ └─ findings.md # Review findings
│ │ │ └─ bugs/ # Auto-generated bug reports
│ │ │ └─ BUG_YYYYMMDD_<slug>/
│ │ │ ├─ report.md
│ │ │ ├─ discussion.md
│ │ │ └─ fix/
│ │ │ ├─ plan.md
│ │ │ └─ tasks.md
│ ├─ discussions/
│ │ └─ reviews/ # Code reviews from hook
│ └─ diagrams/
│ └─ file_diagrams/ # PlantUML from source files
├─ src/ # Application source code
└─ tests/ # System test suite
├─ unit/
├─ integration/
└─ bin/
Note: Each stage discussion has a companion summary maintained automatically next to it (e.g., feature.summary.md, design.summary.md, implementation.summary.md, testing.summary.md, review.summary.md) to provide a live, scannable state of the thread.
Naming Conventions
- Feature Folder: Docs/features/FR_YYYY-MM-DD_<slug>/
- Discussion Files: {stage}.discussion.md in discussions/ subfolder
- Bug Reports: bugs/BUG_YYYYMMDD_<slug>/ with standardized contents
- Source Files: Maintain existing patterns in src/
Template Variables
Supported in path resolution:
- {basename}: Filename without extension
- {date}: Current date in YYYY-MM-DD format
- {rel}: Repository-relative path to source file
- {dir}: Directory containing the source file (NEW)
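Illustrative substitution (a minimal Python sketch; the helper name resolve_output_path and its exact signature are assumptions, not the hook's actual code):
from datetime import date
from pathlib import Path

def resolve_output_path(template: str, source_file: str) -> str:
    # Substitute the documented template variables into an output path (sketch only).
    src = Path(source_file)
    values = {
        "basename": src.stem,              # filename without extension
        "date": date.today().isoformat(),  # YYYY-MM-DD
        "rel": source_file,                # repository-relative path
        "dir": str(src.parent),            # directory containing the source file
    }
    out = template
    for key, value in values.items():
        out = out.replace("{" + key + "}", value)
    return out

# resolve_output_path("Docs/diagrams/file_diagrams/{basename}.puml", "src/app/parser.js")
# -> "Docs/diagrams/file_diagrams/parser.puml"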
Stage Model & Operational Procedure
Complete Stage Lifecycle
Request → Feature Discussion → Design Discussion → Implementation Discussion → Testing Discussion → Review Discussion → Release
Stage 1: Request
Entry Criteria
- Docs/features/FR_*/request.md created from template
- Template completeness: intent, motivation, constraints, open questions
Artifacts Generated
- request.md: Source feature request document
Automated Actions
- Creates discussions/feature.discussion.md with standard header
- Adds Summary and Participation sections
- Appends initial AI comment with vote
Exit Criteria
- Discussion file created and populated
- Ready for feature discussion phase
Stage 2: Feature Discussion
- File: discussions/feature.discussion.md
Header Template
---
type: discussion
stage: feature
status: OPEN # OPEN | READY_FOR_DESIGN | REJECTED
feature_id: FR_YYYY-MM-DD_<slug>
created: YYYY-MM-DD
promotion_rule:
  allow_agent_votes: true
  ready_min_eligible_votes: all
  reject_min_eligible_votes: 1
participation:
  instructions: |
    - Append your input at the end as: "YourName: your comment…"
    - Every comment must end with a vote line: "VOTE: READY|CHANGES|REJECT"
    - Agents/bots must prefix names with "AI_"
voting:
  values: [READY, CHANGES, REJECT]
---
Operational Flow
- Participants append comments ending with vote lines
- Latest vote per participant counts toward thresholds
- AI_Moderator tracks unanswered questions and missing votes
- When READY threshold met: status → READY_FOR_DESIGN
Promotion Actions
- Creates discussions/design.discussion.md (OPEN)
- Creates design/design.md seeded from request + feature discussion
- Creates design/diagrams/ folder
Stage 3: Design Discussion
- File: discussions/design.discussion.md
Header
---
type: discussion
stage: design
status: OPEN # OPEN | READY_FOR_IMPLEMENTATION | NEEDS_MORE_INFO
# ... same promotion_rule, participation, voting as feature
---
Operational Flow
- AI_Architect updates design/design.md on each commit
- Design doc evolves with discussion: options, decisions, risks, acceptance criteria
- Participants vote on design completeness
- When READY threshold met: status → READY_FOR_IMPLEMENTATION
Design Document Structure
- Context & Goals
- Non-Goals & Constraints
- Options Considered with Trade-offs
- Decision & Rationale
- Architecture Diagrams
- Risks & Mitigations
- Measurable Acceptance Criteria
Promotion Actions
- Creates discussions/implementation.discussion.md (OPEN)
- Creates implementation/plan.md and implementation/tasks.md
- Tasks are checkboxes aligned to acceptance criteria
Stage 4: Implementation Discussion
- File: discussions/implementation.discussion.md
Header
---
type: discussion
stage: implementation
status: OPEN # OPEN | READY_FOR_TESTING
promotion_rule:
  allow_agent_votes: true
  ready_min_eligible_votes: 1_human # HUMAN GATE
  reject_min_eligible_votes: 1
# ...
---
Operational Flow
- AI_Implementer syncs implementation/tasks.md with discussion
- Parse checkboxes and PR mentions from discussion posts
- Link commits/PRs to tasks when mentioned ([#123], commit shas)
- When all required tasks complete: status → READY_FOR_TESTING
Task Management
- Tasks.md maintained as single source of truth
- Checkbox completion tracked automatically
- PR and commit references linked automatically
Promotion Actions
- Creates discussions/testing.discussion.md (OPEN)
- Creates testing/testplan.md and testing/checklist.md
- Test checklist derived from acceptance criteria + edge cases
Stage 5: Testing Discussion
- File: discussions/testing.discussion.md
Header
---
type: discussion
stage: testing
status: OPEN # OPEN | READY_FOR_REVIEW
promotion_rule:
  allow_agent_votes: true
  ready_min_eligible_votes: all
  reject_min_eligible_votes: 1
# ...
---
Operational Flow
- AI_Tester syncs testing/checklist.md with discussion posts
- Parse result blocks: [RESULT] PASS/FAIL: description
- Mark corresponding checklist items pass/fail
- On test failure: auto-create bug report with full sub-cycle
Bug Sub-Cycle Creation
bugs/BUG_YYYYMMDD_<slug>/
├─ report.md # Steps, expected/actual, environment
├─ discussion.md # Bug discussion (OPEN)
└─ fix/
├─ plan.md # Fix implementation plan
└─ tasks.md # Fix tasks checklist
Bug Resolution Flow
- Bug follows mini Implementation→Testing cycle
- On bug closure, return to main testing discussion
- Bug results integrated into main test checklist
Promotion Actions
- Creates discussions/review.discussion.md (OPEN)
- Creates review/findings.md with verification summary
Stage 6: Review Discussion
- File: discussions/review.discussion.md
Header
---
type: discussion
stage: review
status: OPEN # OPEN | READY_FOR_RELEASE | CHANGES_REQUESTED
promotion_rule:
  allow_agent_votes: true
  ready_min_eligible_votes: 1_human # HUMAN GATE
  reject_min_eligible_votes: 1
# ...
---
Operational Flow
- AI_Reviewer summarizes into review/findings.md
- Review covers: changes, risks, test evidence, deployment considerations
- Can spawn follow-up feature requests or bugs from findings
- When human READY present and no blockers: status → READY_FOR_RELEASE
Follow-up Artifact Creation
- New FR: ../../FR_YYYY-MM-DD_followup/request.md
- New Bug: bugs/BUG_YYYYMMDD_review/report.md
Stage 7: Release
Entry Criteria
- Review discussion status is READY_FOR_RELEASE
Automated Actions
- Generate release notes from feature changes
- Semver bump based on change type
- Create git tag
- Update changelog
- Document rollback procedure
Post-Release
- Queue post-release validation tasks
- Update documentation as needed
- Archive feature folder if complete
Voting, Quorum & Etiquette
Voting System
Vote Values: READY | CHANGES | REJECT
Format Requirements:
Each comment must end with: VOTE: READY|CHANGES|REJECT
Last line of comment, exact format
Multiple votes by same participant: latest wins
Vote Parsing Examples:
Rob: I agree with this approach. VOTE: READY
→ Rob: READY
AI_Claude: Here's my analysis... VOTE: CHANGES
→ AI_Claude: CHANGES (if allow_agent_votes=true)
User: I have concerns... VOTE: CHANGES
Later: User: Actually, addressed now. VOTE: READY
→ User: READY (latest vote wins)
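Sketch of the "latest vote wins" parsing in Python (the function name and the attribution-matching regex are illustrative assumptions, not the orchestrator's final API):
import re

NAME_RE = re.compile(r"^((?:AI_)?[A-Za-z][\w.-]*):\s")   # "Name: comment…" attribution lines
VOTE_RE = re.compile(r"VOTE:\s*(READY|CHANGES|REJECT)\b")

def latest_votes(discussion_text: str) -> dict:
    # Return the most recent vote per participant; later votes overwrite earlier ones.
    votes, speaker = {}, None
    for line in discussion_text.splitlines():
        attribution = NAME_RE.match(line)
        if attribution:
            speaker = attribution.group(1)
        vote = VOTE_RE.search(line)
        if vote and speaker:
            votes[speaker] = vote.group(1)
    return votes

# latest_votes("Rob: I agree with this approach. VOTE: READY") -> {"Rob": "READY"}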
Eligibility & Quorum
Default Policy (machine-readable in process/policies.yml):
version: 1
voting:
  values: [READY, CHANGES, REJECT]
  allow_agent_votes: true
quorum:
  discussion: { ready: all, reject: 1 }
  implementation: { ready: 1_human, reject: 1 }
  release: { ready: 1_human, reject: 1 }
eligibility:
  agents_allowed: true
  require_human_for: [implementation, release]
etiquette:
  name_prefix_agents: "AI_"
  vote_line_regex: "^VOTE:\\s*(READY|CHANGES|REJECT)\\b"
timeouts:
  discussion_stale_days: 3
  nudge_interval_hours: 24
Human Safety Gates:
Implementation promotion: ≥1 human READY required
Release promotion: ≥1 human READY required
Agent votes count toward discussion but cannot satisfy human requirements
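Threshold evaluation sketch (Python; illustrative only, assuming votes have already been reduced to the latest value per participant):
def quorum_met(votes: dict, threshold, eligible: set) -> bool:
    # votes maps participant -> READY|CHANGES|REJECT; threshold is "all", "1_human", or an int.
    ready = {name for name, vote in votes.items() if vote == "READY"}
    if threshold == "all":
        return bool(eligible) and eligible <= ready                # every eligible voter is READY
    if threshold == "1_human":
        return any(not name.startswith("AI_") for name in ready)   # human safety gate
    return len(ready) >= int(threshold)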
Participation Etiquette
Conciseness: Keep comments action-oriented and focused
References: Link to files/sections when possible (design.md#architecture)
Naming: Agents must prefix with AI_ (e.g., AI_Architect)
Ownership: Suggest explicit owners for next steps (@AI_Architect: please draft...)
Timeliness: Respond to direct questions within 24 hours
Cascading Rules System
Global Rules (Root .ai-rules.yml)
version: 1
# Map file extensions to rule names
file_associations:
  "*.js": "js-file"
  "*.ts": "js-file"
  "*.puml": "puml-file"
  "*.md": "md-file"
rules:
  js-file:
    description: "Generate PlantUML + review for JS/TS files"
    outputs:
      diagram:
        enabled: true
        path: "Docs/diagrams/file_diagrams/{basename}.puml"
        output_type: "puml-file"
        instruction: |
          Update the PlantUML diagram to reflect staged code changes.
          Focus on: key functions, control flow, data transformations, dependencies.
          Keep architectural elements clear and focused.
      review:
        enabled: true
        path: "Docs/discussions/reviews/{date}_{basename}.md"
        output_type: "md-file"
        instruction: |
          Create technical review of code changes.
          Include: summary of changes, potential risks, edge cases,
          testing considerations, performance implications.
          Use concise bullet points.
  puml-file:
    description: "Rules for PlantUML diagram files"
    instruction: |
      Maintain readable, consistent diagrams.
      Use descriptive element names, consistent arrow styles.
      Include brief legend for complex diagrams.
  md-file:
    description: "Rules for Markdown documentation"
    instruction: |
      Use proper Markdown syntax with concise paragraphs.
      Use code fences for examples, lists for multiple points.
      Maintain technical, clear tone.
settings:
  max_tokens: 4000
  temperature: 0.1
  model: "claude-sonnet-4-5-20250929"
Feature-Scoped Rules (Docs/features/.ai-rules.yml)
version: 1
file_associations:
  "request.md": "feature_request"
  "feature.discussion.md": "feature_discussion"
  "design.discussion.md": "design_discussion"
  "implementation.discussion.md": "impl_discussion"
  "testing.discussion.md": "test_discussion"
  "review.discussion.md": "review_discussion"
rules:
  feature_request:
    outputs:
      feature_discussion:
        path: "{dir}/discussions/feature.discussion.md"
        output_type: "feature_discussion_writer"
        instruction: |
          If discussion file missing: create with standard header (stage: feature, status: OPEN),
          add Summary and Participation sections, then append initial AI comment with vote.
          If exists: no operation.
  feature_discussion:
    outputs:
      self_append:
        path: "{dir}/discussions/feature.discussion.md"
        output_type: "feature_discussion_writer"
        instruction: |
          Append concise comment signed with AI name, ending with single vote line.
          Evaluate votes against header thresholds. If READY threshold met:
          - Flip status to READY_FOR_DESIGN
          - Clearly state promotion decision in comment
          Append-only with minimal diff.
      design_discussion:
        path: "{dir}/discussions/design.discussion.md"
        output_type: "design_discussion_writer"
        instruction: |
          Create ONLY if feature discussion status is READY_FOR_DESIGN.
          Seed with standard header (stage: design, status: OPEN).
      design_doc:
        path: "{dir}/design/design.md"
        output_type: "design_doc_writer"
        instruction: |
          Create ONLY if feature discussion status is READY_FOR_DESIGN.
          Seed design document from request content and feature discussion.
          Include: Context, Options, Decision, Risks, Acceptance Criteria.
  design_discussion:
    outputs:
      design_update:
        path: "{dir}/design/design.md"
        output_type: "design_doc_writer"
        instruction: |
          Update design document to reflect latest design discussion.
          Ensure acceptance criteria are measurable and complete.
          Maintain all standard sections. Minimal diffs.
      impl_discussion:
        path: "{dir}/discussions/implementation.discussion.md"
        output_type: "impl_discussion_writer"
        instruction: |
          Create ONLY if design discussion status is READY_FOR_IMPLEMENTATION.
      impl_plan:
        path: "{dir}/implementation/plan.md"
        output_type: "impl_plan_writer"
        instruction: |
          Create ONLY if design status is READY_FOR_IMPLEMENTATION.
          Draft implementation milestones and scope.
      impl_tasks:
        path: "{dir}/implementation/tasks.md"
        output_type: "impl_tasks_writer"
        instruction: |
          Create ONLY if design status is READY_FOR_IMPLEMENTATION.
          Generate task checklist aligned to acceptance criteria.
  impl_discussion:
    outputs:
      tasks_sync:
        path: "{dir}/implementation/tasks.md"
        output_type: "impl_tasks_maintainer"
        instruction: |
          Parse checkboxes and PR mentions from implementation discussion.
          Synchronize tasks.md accordingly.
          When all required tasks complete, mark implementation discussion READY_FOR_TESTING.
      test_discussion:
        path: "{dir}/discussions/testing.discussion.md"
        output_type: "test_discussion_writer"
        instruction: |
          Create ONLY if implementation status is READY_FOR_TESTING.
      test_plan:
        path: "{dir}/testing/testplan.md"
        output_type: "testplan_writer"
        instruction: |
          Create ONLY if implementation status is READY_FOR_TESTING.
          Derive test strategy from acceptance criteria.
      test_checklist:
        path: "{dir}/testing/checklist.md"
        output_type: "testchecklist_writer"
        instruction: |
          Create ONLY if implementation status is READY_FOR_TESTING.
          Generate test checklist covering acceptance criteria and edge cases.
  test_discussion:
    outputs:
      checklist_update:
        path: "{dir}/testing/checklist.md"
        output_type: "testchecklist_maintainer"
        instruction: |
          Parse result blocks from test discussion ([RESULT] PASS/FAIL: description).
          Update checklist accordingly with evidence links.
          On test failure, create appropriate bug report.
      bug_report:
        path: "{dir}/bugs/BUG_{date}_auto/report.md"
        output_type: "bug_report_writer"
        instruction: |
          Create bug report ONLY when test failure with clear reproduction steps.
          Initialize bug discussion and fix plan in same folder.
      review_discussion:
        path: "{dir}/discussions/review.discussion.md"
        output_type: "review_discussion_writer"
        instruction: |
          Create ONLY if all test checklist items pass.
          Set testing discussion status to READY_FOR_REVIEW.
      review_findings:
        path: "{dir}/review/findings.md"
        output_type: "review_findings_writer"
        instruction: |
          Create summary of verified functionality, risks, and noteworthy changes.
  review_discussion:
    outputs:
      followup_feature:
        path: "../../FR_{date}_followup/request.md"
        output_type: "feature_request_writer"
        instruction: |
          Create follow-up feature request ONLY when review identifies enhancement opportunity.
      followup_bug:
        path: "{dir}/bugs/BUG_{date}_review/report.md"
        output_type: "bug_report_writer"
        instruction: |
          Create bug report ONLY when review identifies defect.
          Seed discussion and fix plan.
Rule Resolution Precedence
Nearest Directory: Check source file directory and parents upward
Feature Scope: Docs/features/.ai-rules.yml for feature artifacts
Global Fallback: Root .ai-rules.yml for code files
Conflict Resolution: Nearest rule wins, with logging of override decisions
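For illustration, the nearest-directory lookup might look like this in Python (a sketch only; the production hook performs this resolution in Bash with yq, and PyYAML is assumed here):
import fnmatch
from pathlib import Path
import yaml  # PyYAML assumed

def resolve_rule(repo_root: Path, source_file: Path):
    # Walk from the file's directory up to the repository root; the nearest match wins.
    for folder in [source_file.parent, *source_file.parent.parents]:
        rules_file = folder / ".ai-rules.yml"
        if rules_file.exists():
            config = yaml.safe_load(rules_file.read_text()) or {}
            for pattern, rule_name in (config.get("file_associations") or {}).items():
                if fnmatch.fnmatch(source_file.name, pattern):
                    return (config.get("rules") or {}).get(rule_name), rules_file
        if folder == repo_root:
            break
    return None, None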
Orchestration Architecture
Bash Pre-commit Hook (Current Implementation)
Core Responsibilities:
Collect staged files (Added/Modified only)
Resolve rules via cascading lookup
Build context prompts from staged content
Call AI model via CLI for patch generation
Apply patches with robust error handling
Enhanced Template Support:
# Add to resolve_template() function
local dirpath
dirpath=$(dirname "$rel_path")
# ... append to the existing sed substitution list:
-e "s|{dir}|$dirpath|g"
Patch Application Strategy:
Preserve Index Lines: Enable 3-way merge capability
Try 3-way First: git apply --index --3way --recount --whitespace=nowarn
Fallback to Strict: git apply --index if 3-way fails
Debug Artifacts: Save raw/clean/sanitized/final patches to .git/ai-rules-debug/
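The same try-3-way-then-strict sequence, sketched in Python as the M4 migration might express it (the function name is an assumption):
import subprocess

def apply_patch(patch_path: str) -> bool:
    # Try the 3-way apply first, then fall back to a strict apply.
    for flags in (["--index", "--3way", "--recount", "--whitespace=nowarn"], ["--index"]):
        result = subprocess.run(["git", "apply", *flags, patch_path],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return True
    return False  # caller records debug artifacts under .git/ai-rules-debug/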
Discussion File Optimization:
Prefer append-only edits with optional header flips
For large files: generate full new content and compute diff locally
Minimize hunk drift through careful patch construction
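Sketch of the "compute diff locally" fallback using Python's difflib (illustrative; the current hook is Bash):
import difflib

def local_unified_diff(old_text: str, new_text: str, rel_path: str) -> str:
    # Build a unified diff from fully regenerated content instead of trusting model-produced hunks.
    return "".join(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{rel_path}",
        tofile=f"b/{rel_path}",
    ))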
Python Orchestrator (automation/workflow.py)
Phase 1 (Non-blocking Status):
#!/usr/bin/env python3
import json, os, sys, subprocess, re
from pathlib import Path

def main():
    # read_changed_files() and analyze_discussion_status() are the core helpers
    # described under Core Functions below (vote parsing, thresholds, status reporting).
    changed_files = read_changed_files()
    status_report = analyze_discussion_status(changed_files)
    if status_report:
        print("AI-Workflow Status Report")
        print(json.dumps(status_report, indent=2))
    sys.exit(0)  # Always non-blocking in v1

if __name__ == "__main__":
    main()
Core Functions:
Vote Parsing: Parse discussion files, track latest votes per participant
Threshold Evaluation: Compute eligibility and quorum status
Status Reporting: JSON output of current discussion state
Decision Hints: Suggest promotion based on policy rules
Future Enhancements:
Policy enforcement based on process/policies.yml
Gitea API integration for issue/PR management
Advanced agent coordination and task routing
Gitea Integration (Future)
Label System:
stage/*: stage/discussion, stage/design, stage/implementation, etc.
blocked/*: blocked/needs-votes, blocked/needs-human
needs/*: needs/design, needs/review, needs/tests
Automated Actions:
Open/label PRs for implementation transitions
Post status summaries to PR threads
Create tracking issues for feature implementation
Report status checks to PRs
Moderator Protocol
AI_Moderator Responsibilities
Conversation Tracking:
Monitor unanswered questions (>24 hours)
Track missing votes from active participants
Identify stale threads needing attention
Flag direct mentions that need responses
Progress Reporting:
Compute current vote tallies and thresholds
List participants who haven't voted recently
Summarize promotion status and remaining requirements
Highlight blocking issues or concerns
Task Allocation:
Suggest explicit owners for pending tasks
Example: "AI_Architect: please draft the acceptance criteria section"
Example: "Rob: could you clarify the deployment timeline?"
Moderator Implementation
Rule Definition (in Docs/features/.ai-rules.yml):
discussion_moderator_nudge:
  outputs:
    self_append:
      path: "{dir}/discussions/{basename}.discussion.md"
      output_type: "feature_discussion_writer"
      instruction: |
        Act as AI_Moderator. Analyze the entire discussion and:
        UNANSWERED QUESTIONS:
        - List any direct questions unanswered for >24 hours (mention @names)
        - Flag questions that need clarification or follow-up
        VOTE STATUS:
        - Current tally: READY: X, CHANGES: Y, REJECT: Z
        - Missing votes from: [list of participants without recent votes]
        - Promotion status: [based on header thresholds]
        ACTION ITEMS:
        - Suggest specific next owners for pending tasks
        - Propose concrete next steps with deadlines
        Keep comment under 10 lines. End with "VOTE: CHANGES".
        Append-only; minimal diff; update nothing else.
Nudge Frequency: Controlled by nudge_interval_hours in policies
Error Handling & Resilience
Common Failure Modes
Patch Application Issues:
Symptom: Hunk drift on large files, merge conflicts
Mitigation: 3-way apply with index preservation, append-only strategies
Fallback: Local diff computation from full new content
Model Output Problems:
Symptom: Malformed diff, missing markers, invalid patch format
Mitigation: Extract between markers, validate with git apply --check
Fallback: Clear diagnostics with patch validation output
Tooling Dependencies:
Symptom: Missing yq, claude, or other required tools
Mitigation: Pre-flight checks with clear error messages
Fallback: Graceful degradation with feature-specific disabling
Rule Conflicts:
Symptom: Multiple rules matching same file with conflicting instructions
Mitigation: Nearest-directory precedence with conflict logging
Fallback: Global rule application with warning
Recovery Procedures
Manual Override:
# Bypass hook for emergency edits
git commit --no-verify -m "Emergency fix: manually overriding discussion status"
# Manually update the discussion header if needed, e.g. flip
#   status: OPEN -> status: READY_FOR_IMPLEMENTATION
Debug Artifacts:
All patch variants saved to .git/ai-rules-debug/
Timestamped files: raw, clean, sanitized, final patches
Commit-specific directories for correlation
Rollback Strategy:
All generated artifacts are staged separately
Easy partial staging: git reset HEAD for specific artifacts
Full reset: git reset HEAD~1 to undo entire commit with generations
Audit Trail
Execution Logging:
All rule invocations logged with source→output mapping
Patch application attempts and outcomes recorded
Vote calculations and promotion decisions documented
Debug Bundle:
.git/ai-rules-debug/
├─ 20251021-143022-12345-feature.discussion.md/
│ ├─ raw.out # Raw model output
│ ├─ clean.diff # Extracted patch
│ ├─ sanitized.diff # After sanitization
│ └─ final.diff # Final applied patch
└─ execution.log # Chronological action log
Security & Secrets Management
Secret Protection
Never Commit:
API keys, authentication tokens
Personal identifying information
Internal system credentials
Private configuration data
Environment Variables:
# Current approach
export CLAUDE_API_KEY="your_key"
# Future .env approach (git-ignored)
# .env file loaded via python-dotenv in Python components
Configuration Management:
Keep sensitive endpoints in automation/config.yml
Use environment variable substitution in configuration
Validate no secrets in discussions, rules, or generated artifacts
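A minimal sketch of environment-variable substitution while loading automation/config.yml (the ${VAR} placeholder syntax and the example key are assumptions):
import os, re
import yaml  # PyYAML assumed

def load_config(path: str = "automation/config.yml") -> dict:
    # Replace ${VAR} placeholders with environment values; the file itself stays secret-free.
    raw = open(path, encoding="utf-8").read()
    expanded = re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), raw)
    return yaml.safe_load(expanded) or {}

# config.yml could then contain, e.g.:  gitea_url: "${GITEA_URL}"   (hypothetical key)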
Access Control
Repository Security:
Assume all repository contents are potentially exposed
No sensitive business logic in prompt instructions
Regular security reviews of rule definitions
Agent Permissions:
Limit file system access to repository scope
Validate output paths stay within repository
Sanitize all file operations for path traversal
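One way the output-path check could look (a Python sketch, not the hook's actual validation):
from pathlib import Path

def safe_output_path(repo_root: str, candidate: str) -> Path:
    # Reject output paths that escape the repository, e.g. via "../" or absolute paths.
    root = Path(repo_root).resolve()
    target = (root / candidate).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"Refusing to write outside the repository: {candidate}")
    return target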
Performance & Scale Considerations
Optimization Strategies
Prompt Efficiency:
Pass staged diffs instead of full file contents when possible
Use concise, structured instructions with clear formatting
Limit context to relevant sections for large files
Discussion Management:
Append-only edits with periodic summarization
Compact status reporting in moderator comments
Archive completed discussions if they become too large
Batch Operations:
Process multiple related files in single model calls when beneficial
Cache rule resolutions for multiple files in same directory
Parallelize independent output generations
Scaling Limits
File Size Considerations:
Small (<100KB): Full content in prompts
Medium (100KB-1MB): Diff-only with strategic context
Large (>1MB): Chunked processing or summary-only approaches
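Expressed as a small helper (thresholds as listed above; the strategy labels are illustrative):
def prompt_strategy(size_bytes: int) -> str:
    # Choose how much context to send to the model based on staged file size.
    if size_bytes < 100 * 1024:
        return "full-content"
    if size_bytes < 1024 * 1024:
        return "diff-with-context"
    return "chunked-or-summary"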
Repository Size:
Current approach suitable for medium-sized repositories
For very large codebases: scope rules to specific directories
Consider rule disabling for generated/binary assets
Rate Limiting:
Model API calls: implement throttling and retry logic
Gitea API: respect rate limits with exponential backoff
File operations: batch where possible to reduce I/O
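A generic retry-with-backoff sketch applicable to both model and Gitea calls (illustrative; no real adapter API is shown):
import random
import time

def with_backoff(call, max_attempts: int = 5, base_delay: float = 1.0):
    # Retry a callable with exponential backoff plus jitter for rate-limited APIs.
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.random())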
Testing Strategy
Testing Tiers
Unit Tests (Python):
Vote parsing and eligibility calculation
Policy evaluation and quorum determination
Rules resolution and conflict handling
Template variable substitution
Integration Tests (Bash + Python):
End-to-end rule → prompt → patch → apply cycle
Discussion status transitions and promotion logic
Error handling and recovery procedures
Multi-file rule processing
Artifact Validation:
PlantUML syntax checking: plantuml -checkonly
Markdown structure validation
Template completeness checks
YAML syntax validation
Test Architecture
tests/
├─ unit/
│ ├─ test_votes.py
│ ├─ test_policies.py
│ ├─ test_rules_resolution.py
│ └─ test_template_variables.py
├─ integration/
│ ├─ run.sh # Main test runner
│ ├─ lib.sh # Test utilities
│ ├─ fixtures/
│ │ └─ repo_skeleton/ # Minimal test repository
│ │ ├─ .ai-rules.yml
│ │ ├─ Docs/features/.ai-rules.yml
│ │ └─ Docs/features/FR_test/
│ │ ├─ request.md
│ │ └─ discussions/
│ └─ test_cases/
│ ├─ test_feature_promotion.sh
│ ├─ test_design_generation.sh
│ └─ test_bug_creation.sh
├─ bin/
│ └─ claude # Fake deterministic model
└─ README.md
Fake Model Implementation
Purpose: Deterministic testing without external API dependencies
Implementation (tests/bin/claude):
#!/bin/bash
# Fake Claude CLI for testing
# Reads prompt from stdin, outputs predetermined patch based on content
prompt=$(cat)
if echo "$prompt" | grep -q "OUTPUT FILE:.*discussion.md"; then
# Output discussion update patch
cat << 'EOF'
<<<AI_DIFF_START>>>
diff --git a/Docs/features/FR_test/discussions/feature.discussion.md b/Docs/features/FR_test/discussions/feature.discussion.md
index 1234567..890abcd 100644
--- a/Docs/features/FR_test/discussions/feature.discussion.md
+++ b/Docs/features/FR_test/discussions/feature.discussion.md
@@ -15,3 +15,6 @@ voting:
## Summary
Test feature for validation
+
+## Participation
+AI_Test: This is a test comment. VOTE: READY
<<<AI_DIFF_END>>>
EOF
else
# Default patch for other file types
echo "No specific patch for this file type"
fi
Integration Test Runner
Key Test Scenarios
- Feature Promotion: request.md → feature.discussion.md → READY_FOR_DESIGN
- Design Generation: design.discussion.md → design.md updates
- Bug Creation: test failure → auto bug report generation
- Error Recovery: Malformed patch → graceful failure with diagnostics
- Rule Conflicts: Multiple rule matches → nearest-directory resolution
Test Execution
# Run full test suite
cd tests/integration
./run.sh
# Run specific test case
./test_cases/test_feature_promotion.sh
Continuous Validation
Pre-commit Checks:
PlantUML syntax validation for generated diagrams
Markdown link validation
YAML syntax checking for rule files
Template variable validation
Performance Benchmarks:
Rule resolution time for typical commit
Patch generation and application duration
Memory usage during large file processing
Source Intelligence Automation (Auto-Review + Auto-Diagram)
Purpose
To keep technical documentation and diagrams in sync with evolving source code. On every staged change to src/**/*.js|ts|py, the automation layer:
- Analyzes the diff and AST to produce a concise review summary
- Extracts structure and updates a PlantUML diagram in Docs/diagrams/file_diagrams/
A) Folder Layout
src/
├─ automation/
│ ├─ __init__.py
│ ├─ analyzer.py # parses diffs, extracts structure & metrics
│ ├─ reviewer.py # writes review summaries (md)
│ ├─ diagrammer.py # emits PUML diagrams
│ └─ utils/
│ ├─ git_tools.py # staged diff, blob lookup
│ ├─ code_parser.py # AST helpers (JS/TS/Python)
│ └─ plantuml_gen.py # renders PlantUML text
B) Operational Flow (Triggered by Hook)
┌────────────────────────────────────────────────────────┐
│ pre-commit hook (bash) │
│ └──> detect src/**/*.js|ts|py changes │
│ ├─> call automation/analyzer.py --file <path> │
│ │ ├─ parse diff + AST │
│ │ ├─ collect functions, classes, calls │
│ │ └─ emit JSON summary │
│ ├─> reviewer.py → Docs/discussions/reviews/ │
│ └─> diagrammer.py → Docs/diagrams/file_diagrams/│
└────────────────────────────────────────────────────────┘
Each stage emits a unified diff so the same patch-application rules (3-way apply, append-only) still apply.
C) Sample Rule (Root .ai-rules.yml)
js-file:
  description: "Generate PlantUML + review for JS/TS files"
  outputs:
    diagram:
      path: "Docs/diagrams/file_diagrams/{basename}.puml"
      output_type: "puml-file"
      instruction: |
        Parse code structure and update a PlantUML diagram:
        - Modules, classes, functions
        - Control-flow edges between major functions
    review:
      path: "Docs/discussions/reviews/{date}_{basename}.md"
      output_type: "md-file"
      instruction: |
        Summarize this commit’s code changes:
        - What changed and why
        - Possible risks / performance / security notes
        - Suggested tests or TODOs
Similar rules exist for py-file, ts-file, etc.
D) Core Algorithms (pseudocode)
# 1 analyzer.py
def analyze_source(path):
    diff = git_diff(path)
    tree = parse_ast(path)
    funcs, classes = extract_symbols(tree)
    flows = extract_calls(tree)
    metrics = compute_metrics(tree)
    return {
        "file": path,
        "functions": funcs,
        "classes": classes,
        "flows": flows,
        "metrics": metrics,
        "diff_summary": summarize_diff(diff),
    }

# 2 diagrammer.py
def generate_puml(analysis):
    nodes = [*analysis["classes"], *analysis["functions"]]
    edges = analysis["flows"]
    puml = "@startuml\n"
    for n in nodes:
        puml += f"class {n}\n"
    for a, b in edges:
        puml += f"{a} --> {b}\n"
    puml += "@enduml\n"
    return puml

# 3 reviewer.py
def generate_review(analysis):
    return f"""# Auto Review — {analysis['file']}
## Summary
{analysis['diff_summary']}
## Key Functions
{', '.join(analysis['functions'][:10])}
## Potential Risks
- TODO: evaluate complexity or security implications
## Suggested Tests
- Unit tests for new/modified functions
"""
E) Outputs
- .puml → Docs/diagrams/file_diagrams/{basename}.puml (keeps architecture maps current)
- .md → Docs/discussions/reviews/{date}_{basename}.md (rolling code review history)
Each output follows 3-way apply / append-only rules; every commit leaves a diff trail in .git/ai-rules-debug/.
F) Integration with Orchestrator
# automation/workflow.py (aggregation example)
if src_changed():
    from automation import analyzer, reviewer, diagrammer
    for f in changed_src_files:
        data = analyzer.analyze_source(f)
        diagrammer.update_puml(data)
        reviewer.update_review(data)
Future versions can post summaries to the feature’s implementation discussion and link diagrams into design/design.md.
G) Testing the Source Automation Layer
- Unit: tests/unit/test_code_parser.py, tests/unit/test_puml_gen.py
- Integration: tests/integration/test_cases/test_auto_review.sh, test_auto_diagram.sh
- Fixtures: tests/integration/fixtures/repo_skeleton/src/ with fake commits to verify generation
H) Security & Performance Notes
- Sandbox analysis only — no execution of user code
- AST parsing limited to static structure
- Large files (>5k lines): partial summarization
- Output capped to ≤ 200 KB per artifact
I) Deliverables Added to Milestones
- M0 → create src/automation/ skeleton
- M1 → functional auto-review + auto-diagram for JS/TS files
- M2 → extend to Python + PlantUML cross-linking in design docs
Discussion Summaries (Companion Artifacts per Stage)
What it is
For every {stage}.discussion.md, maintain a sibling {stage}.summary.md. It is append-minimal with bounded section rewrites only (between stable markers). Contents: decisions, vote tallies, open questions, awaiting replies, action items, compact timeline.
Where it lives
Docs/features/FR_.../
└─ discussions/
├─ feature.discussion.md
├─ feature.summary.md
├─ design.discussion.md
├─ design.summary.md
├─ implementation.discussion.md
├─ implementation.summary.md
├─ testing.discussion.md
├─ testing.summary.md
├─ review.discussion.md
└─ review.summary.md
Header (machine-readable)
---
type: discussion-summary
stage: feature # feature|design|implementation|testing|review
status: ACTIVE # ACTIVE|SNAPSHOT|ARCHIVED
source_discussion: feature.discussion.md
feature_id: FR_YYYY-MM-DD_<slug>
updated: YYYY-MM-DDTHH:MM:SSZ
policy:
  allow_agent_votes: true
  require_human_for: [implementation, review]
---
Stable section markers (for tiny diffs)
# Summary — <Stage Title>
<!-- SUMMARY:DECISIONS START -->
## Decisions (ADR-style)
- (none yet)
<!-- SUMMARY:DECISIONS END -->
<!-- SUMMARY:OPEN_QUESTIONS START -->
## Open Questions
- (none yet)
<!-- SUMMARY:OPEN_QUESTIONS END -->
<!-- SUMMARY:AWAITING START -->
## Awaiting Replies
- (none yet)
<!-- SUMMARY:AWAITING END -->
<!-- SUMMARY:ACTION_ITEMS START -->
## Action Items
- (none yet)
<!-- SUMMARY:ACTION_ITEMS END -->
<!-- SUMMARY:VOTES START -->
## Votes (latest per participant)
READY: 0 • CHANGES: 0 • REJECT: 0
- (no votes yet)
<!-- SUMMARY:VOTES END -->
<!-- SUMMARY:TIMELINE START -->
## Timeline (most recent first)
- <YYYY-MM-DD HH:MM> <name>: <one-liner>
<!-- SUMMARY:TIMELINE END -->
<!-- SUMMARY:LINKS START -->
## Links
- Related PRs: –
- Commits: –
- Design/Plan: ../design/design.md
<!-- SUMMARY:LINKS END -->
How it updates
Trigger: whenever {stage}.discussion.md is staged, the hook also updates/creates {stage}.summary.md.
Deterministic logic:
- Votes: parse latest vote per participant (eligibility per policy)
- Decisions: if header status flips (e.g., READY_FOR_IMPLEMENTATION), append an ADR entry
- Open Questions: lines ending with ? or flagged Q: with @owner if present
- Awaiting Replies: mentions with no response from that participant within response_timeout_hours
- Action Items: unchecked tasks (- [ ]) with @owner remain tracked until checked
- Timeline: last N (default 15) comment one-liners with timestamp and name
- Links: auto-add PRs (#123), SHAs, and cross-file links
Rotation / snapshots (optional): when discussion grows large or on schedule, write discussions/summaries/.md (status: SNAPSHOT) and keep {stage}.summary.md trimmed while retaining Decisions/Open Q/Actions/Votes.
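Sketch of the marker-bounded rewrite that keeps these updates to tiny diffs (Python; the function name is illustrative):
import re

def replace_section(summary_text: str, section: str, new_body: str) -> str:
    # Rewrite only the content between the section's START/END markers; leave everything else untouched.
    start = f"<!-- SUMMARY:{section} START -->"
    end = f"<!-- SUMMARY:{section} END -->"
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    replacement = f"{start}\n{new_body.rstrip()}\n{end}"
    return pattern.sub(lambda _match: replacement, summary_text, count=1)

# e.g. replace_section(text, "VOTES", "## Votes (latest per participant)\nREADY: 1 • CHANGES: 0 • REJECT: 0")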
Rules (additions in Docs/features/.ai-rules.yml)
file_associations:
  "feature.discussion.md": "feature_discussion"
  "design.discussion.md": "design_discussion"
  "implementation.discussion.md": "impl_discussion"
  "testing.discussion.md": "test_discussion"
  "review.discussion.md": "review_discussion"
rules:
  feature_discussion:
    outputs:
      summary_companion:
        path: "{dir}/discussions/feature.summary.md"
        output_type: "discussion_summary_writer"
        instruction: |
          Create or update the summary file. Replace ONLY content between
          these markers: DECISIONS, OPEN_QUESTIONS, AWAITING, ACTION_ITEMS,
          VOTES, TIMELINE, LINKS. Do not touch other lines.
          Inputs: the entire feature.discussion.md.
  design_discussion:
    outputs:
      summary_companion:
        path: "{dir}/discussions/design.summary.md"
        output_type: "discussion_summary_writer"
        instruction: |
          Same summary policy as feature.summary.md; also add link to ../design/design.md.
  impl_discussion:
    outputs:
      summary_companion:
        path: "{dir}/discussions/implementation.summary.md"
        output_type: "discussion_summary_writer"
        instruction: |
          Same summary policy; include unchecked items from ../implementation/tasks.md.
  test_discussion:
    outputs:
      summary_companion:
        path: "{dir}/discussions/testing.summary.md"
        output_type: "discussion_summary_writer"
        instruction: |
          Same summary policy; include failing test artifacts and ensure FAILs surface in Open Questions or Awaiting.
  review_discussion:
    outputs:
      summary_companion:
        path: "{dir}/discussions/review.summary.md"
        output_type: "discussion_summary_writer"
        instruction: |
          Same summary policy; Decisions should note READY_FOR_RELEASE with date and follow-ups.
Orchestrator support (nice-to-have)
Provide workflow.py --summarize to output regenerated sections for tests/CI. Track awaiting replies via per-author timestamps; if an expected reply has not appeared, mark it as awaiting.
Testing additions
- Unit: parsing of votes, questions, mentions, action items
- Integration: commit a discussion with constructs → verify summary sections updated and only marker-bounded hunks changed
- Failure: malformed discussion / huge file → generator still writes sections; timeline truncates; no crash
Why this helps
Newcomers can open {stage}.summary.md and immediately see the state. Humans keep talking in the discussion; the system curates the signal in the summary. Promotions are transparent via Decisions. Open loops are visible and assigned.
Implementation Plan
Milestone M0: Process Foundation
Deliverables:
process/design.md (this document)
process/policies.md + process/policies.yml
process/templates/ (all four core templates)
automation/agents.yml (role mappings)
src/automation/ skeleton (analyzer.py, reviewer.py, diagrammer.py, utils/*)
Success Criteria:
All process documentation in place
Policy definitions machine-readable
Templates provide clear starting points
Milestone M1: Orchestrator MVP + Hook Enhancements
Deliverables:
automation/workflow.py (non-blocking status reporter)
Bash hook: {dir} template variable support
Bash hook: Index preservation for 3-way apply
Bash hook: Append-only optimization for discussions
Auto-review + auto-diagram operational for JS/TS via root rules (js-file)
Success Criteria:
Python orchestrator reports discussion status
Template variables work for feature folder paths
3-way apply handles merge conflicts gracefully
Milestone M2: Stage Automation & Moderator
Deliverables:
Enhanced Docs/features/.ai-rules.yml with stage rules
AI_Moderator implementation via discussion rules
Python orchestrator: policy-based decision hints
Test suite for feature promotion flow
Discussion summaries: rules (discussion_summary_writer) + tests
Success Criteria:
Feature requests auto-create discussions
Discussions promote through stages based on votes
Moderator provides useful conversation guidance
Milestone M3: Gitea Integration
Deliverables:
automation/adapters/gitea_adapter.py
Automated PR creation and labeling
Status reporting to PR threads
Issue tracking integration
Success Criteria:
Implementation stage auto-creates PRs
Review status visible in PR discussions
Labels reflect current stage and blockers
Milestone M4: Bash to Python Migration
Deliverables:
Core rule resolution logic in Python
Patch generation and application in Python
Bash hook as thin wrapper calling Python
Enhanced error handling and diagnostics
Success Criteria:
Maintains current functionality with better maintainability
Improved error messages and recovery options
Consistent behavior across all operations
Risks & Mitigations
Technical Risks
Over-Automation Bypassing Humans:
Risk: Critical decisions made without human oversight
Mitigation: Human READY gates for Implementation and Release stages
Control: Manual override capability for all automated promotions
Patch Instability on Large Files:
Risk: Hunk drift and merge conflicts in long discussions
Mitigation: 3-way apply with index preservation, append-only strategies
Fallback: Local diff computation from full content regeneration
Tooling Dependency Management:
Risk: Version conflicts or missing dependencies break system
Mitigation: Pre-flight validation with clear error messages
Recovery: Graceful degradation with feature flags
Context Limit Exceeded:
Risk: AI models cannot process very large discussions
Mitigation: Structured summarization, chunked processing
Alternative: Focus on recent changes with reference to history
Process Risks
Vote Manipulation or Gaming:
Risk: Participants exploit voting system for unwanted outcomes
Mitigation: Clear etiquette policies, human override capability
Oversight: Moderator monitoring for voting patterns
Discussion Fragmentation:
Risk: Conversations become scattered across multiple files
Mitigation: Clear stage boundaries, cross-references between discussions
Tooling: Search and navigation aids for related artifacts
Agent Coordination Conflicts:
Risk: Multiple agents making conflicting changes
Mitigation: Clear role definitions, sequential processing
Resolution: Human maintainer as final arbiter
Adoption Risks
Learning Curve:
Risk: New contributors struggle with system complexity
Mitigation: Comprehensive documentation, template guidance
Support: AI_Moderator provides onboarding assistance
Process Overhead:
Risk: System creates too much ceremony for small changes
Mitigation: Configurable rule enabling/disabling
Flexibility: Bypass options for trivial changes
Template Evolution
Versioning Strategy
Template Location as Version:
Current templates always in process/templates/
Breaking changes require new feature request and migration plan
Existing features use templates current at their creation
Migration Guidance:
Document template changes in release notes
Provide automated migration scripts for simple changes
Flag features using deprecated templates
Core Templates
Feature Request Template (process/templates/feature_request.md):