1st commit
parent
18d84c2b15
commit
b2dec202e3
502
Docs/DESIGN.md
|
|
@@ -859,7 +859,13 @@ rules:
|
|||
- Conflict Resolution: Nearest rule wins, with logging of override decisions
|
||||
|
||||
## Orchestration Architecture
|
||||
### Bash Pre-commit Hook (Current Implementation)
|
||||
**Principles**
|
||||
- **Single-commit boundary:** automation only stages changes within the *current* commit; it never creates new commits or loops.
|
||||
- **Deterministic prompts:** identical inputs produce identical patches (prompt hashing + stable sorting of inputs).
|
||||
- **Nearest-rule wins:** rule resolution favors the closest `.ai-rules.yml`.
|
||||
- **Fail fast, explain:** on any failure, keep the index untouched and write actionable diagnostics to `.git/ai-rules-debug/`.
|
||||
|
||||
### Bash Pre-commit Hook
|
||||
Core Responsibilities:
|
||||
- Collect staged files (Added/Modified only)
|
||||
- Resolve rules via cascading lookup
|
||||
|
|
@@ -867,14 +873,63 @@ Core Responsibilities:
|
|||
- Call AI model via CLI for patch generation
|
||||
- Apply patches with robust error handling
|
||||
|
||||
Prompt Envelope (deterministic)
|
||||
```text
|
||||
BEGIN ENVELOPE
|
||||
VERSION: 1
|
||||
SOURCE_FILE: <rel_path>
|
||||
RULE: <rule_name>/<output_key>
|
||||
FEATURE_ID: <feature_id>
|
||||
STAGE: <stage>
|
||||
POLICY_SHA256: <sha of process/policies.yml>
|
||||
CONTEXT_FILES: <sorted list>
|
||||
PROMPT_SHA256: <sha of everything above + inputs>
|
||||
--- INPUT:FILE ---
|
||||
<trimmed content or staged diff>
|
||||
--- INPUT:POLICY ---
|
||||
<process/policies.yml relevant subset>
|
||||
--- INSTRUCTION ---
|
||||
<rules.outputs[*].instruction>
|
||||
END ENVELOPE
|
||||
```
|
||||
On output, the model must return only a unified diff between
|
||||
`<<<AI_DIFF_START>>>` and `<<<AI_DIFF_END>>>`. The orchestrator records
|
||||
`PROMPT_SHA256` alongside the patch for reproducibility.
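
The marker contract above can be checked mechanically before any patch handling. The following is a minimal sketch, assuming the payload is a git-style diff that begins with `diff --git ` or `--- `; the function name `extract_diff` is hypothetical and not part of the hook.

```python
DIFF_START = "<<<AI_DIFF_START>>>"
DIFF_END = "<<<AI_DIFF_END>>>"


def extract_diff(model_output: str) -> str:
    """Return the text between the two markers, or raise ValueError.

    Assumption: a valid payload starts with 'diff --git ' or '--- '.
    Anything outside the markers is ignored.
    """
    start = model_output.find(DIFF_START)
    if start == -1:
        raise ValueError("missing start marker")
    body_start = start + len(DIFF_START)
    end = model_output.find(DIFF_END, body_start)
    if end == -1:
        raise ValueError("missing end marker")
    body = model_output[body_start:end].strip("\n")
    if not body.startswith(("diff --git ", "--- ")):
        raise ValueError("payload between markers is not a unified diff")
    return body + "\n"
```

Raising on any deviation, rather than trying to repair the output, keeps the failure path aligned with the "fail fast, explain" principle.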
|
||||
|
||||
Execution Order (per staged file)
|
||||
1) **resolve_rules(rel_path)** → pick nearest `.ai-rules.yml`, match `file_associations`, assemble outputs.
|
||||
2) **build_prompt(ctx)** → gather file content/diff, parsed headers, policy, `{feature_id}/{stage}` and neighboring artifacts.
|
||||
3) **invoke_model(prompt)** → receive a **unified diff** envelope (no raw text rewrites).
|
||||
4) **sanitize_diff()** → enforce patch constraints (no path traversal, within repo, size limits).
|
||||
5) **apply_patch()** → try 3-way apply, then strict apply; stage only on success.
|
||||
6) **log_diagnostics()** → write `resolution.log`, raw/clean/sanitized/final diffs.
|
||||
|
||||
Enhanced Template Support:
|
||||
|
||||
```bash
|
||||
# Add to resolve_template() function
|
||||
local dirpath
|
||||
dirpath=$(dirname "$rel_path")
|
||||
# ...
|
||||
-e "s|{dir}|$dirpath|g"
|
||||
# Add/extend in resolve_template() function
|
||||
resolve_template() {
|
||||
local tmpl="$1" rel_path="$2"
|
||||
local today dirpath basename name ext feature_id stage
|
||||
today="$(date +%F)"
|
||||
dirpath="$(dirname "$rel_path")"
|
||||
basename="$(basename "$rel_path")"
|
||||
name="${basename%.*}"
|
||||
ext="${basename##*.}"
|
||||
# nearest FR_* ancestor as feature_id
|
||||
feature_id="$(echo "$rel_path" | sed -n 's|.*Docs/features/\(FR_[^/]*\).*|\1|p')"
|
||||
# infer stage from <stage>.discussion.md when applicable
|
||||
stage="$(echo "$basename" | sed -n 's/^\([A-Za-z0-9_-]\{1,\}\)\.discussion\.md$/\1/p')"
|
||||
echo "$tmpl" \
|
||||
| sed -e "s|{date}|$today|g" \
|
||||
-e "s|{rel}|$rel_path|g" \
|
||||
-e "s|{dir}|$dirpath|g" \
|
||||
-e "s|{basename}|$basename|g" \
|
||||
-e "s|{name}|$name|g" \
|
||||
-e "s|{ext}|$ext|g" \
|
||||
-e "s|{feature_id}|$feature_id|g" \
|
||||
-e "s|{stage}|$stage|g"
|
||||
}
|
||||
```
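
If the Python orchestrator ever needs the same placeholder expansion, it could look roughly like the sketch below. This is an illustration, not the hook's implementation: the `today` parameter is an added assumption to make results repeatable in tests, unknown placeholders are left untouched, and the feature lookup takes the first `FR_*` match in the path.

```python
import re
from datetime import date
from pathlib import PurePosixPath
from typing import Optional

_FEATURE_RE = re.compile(r"Docs/features/(FR_[^/]*)")
_STAGE_RE = re.compile(r"([A-Za-z0-9_-]+)\.discussion\.md")
_PLACEHOLDER_RE = re.compile(r"\{(\w+)\}")


def resolve_template(tmpl: str, rel_path: str, today: Optional[str] = None) -> str:
    """Expand {date}, {rel}, {dir}, {basename}, {name}, {ext}, {feature_id}, {stage}."""
    path = PurePosixPath(rel_path)
    feature = _FEATURE_RE.search(rel_path)
    stage = _STAGE_RE.fullmatch(path.name)
    values = {
        "date": today or date.today().isoformat(),
        "rel": rel_path,
        "dir": str(path.parent),
        "basename": path.name,
        "name": path.stem,                # basename without its last extension
        "ext": path.suffix.lstrip("."),   # empty string when there is no extension
        "feature_id": feature.group(1) if feature else "",
        "stage": stage.group(1) if stage else "",
    }
    # Leave placeholders we do not know about exactly as written.
    return _PLACEHOLDER_RE.sub(lambda m: values.get(m.group(1), m.group(0)), tmpl)
```

For example, `resolve_template("{dir}/{stage}.discussion.sum.md", "Docs/features/FR_test/discussions/feature.discussion.md")` expands `{stage}` to `feature`.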
|
||||
Patch Application Strategy:
|
||||
- Preserve Index Lines: Enable 3-way merge capability
|
||||
|
|
@@ -882,13 +937,22 @@ Patch Application Strategy:
|
|||
- Fallback to Strict: git apply --index if 3-way fails
|
||||
- Debug Artifacts: Save raw/clean/sanitized/final patches to .git/ai-rules-debug/
|
||||
|
||||
Additional Safeguards:
|
||||
- Reject patches that:
|
||||
- create or edit files outside the repo root
|
||||
- exceed 200 KB per artifact (configurable)
|
||||
- modify binary or non-targeted files for the current output
|
||||
- Normalize line endings; ensure new files include headers when required.
|
||||
- Abort on conflicting hunks; do not partially apply a file’s patch.
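
The rejection rules above can be sketched as a single pre-apply check. This is a simplified illustration under stated assumptions: the diff is git-style (each file section starts with `diff --git`), the names `patch_targets`, `check_guards`, and `allowed_targets` are hypothetical, and binary detection relies only on the marker lines git itself emits.

```python
import posixpath

MAX_PATCH_BYTES = 200 * 1024  # default limit stated in this document; configurable


def patch_targets(diff_text):
    """Collect paths from '--- ' / '+++ ' header lines, with a/ and b/ stripped."""
    targets, in_header = set(), True
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            in_header = True
        elif line.startswith("@@"):
            in_header = False  # hunk body: ignore lines that merely look like headers
        elif in_header and line.startswith(("--- ", "+++ ")):
            path = line[4:].split("\t")[0].strip()
            if path == "/dev/null":
                continue
            if path.startswith(("a/", "b/")):
                path = path[2:]
            targets.add(path)
    return targets


def check_guards(diff_text, allowed_targets):
    """Return a list of violation labels; an empty list means the patch may proceed."""
    violations = []
    if len(diff_text.encode("utf-8")) > MAX_PATCH_BYTES:
        violations.append("size")
    if "GIT binary patch" in diff_text or "\nBinary files " in diff_text:
        violations.append("binary")
    for path in sorted(patch_targets(diff_text)):
        norm = posixpath.normpath(path)
        if path.startswith("/") or norm == ".." or norm.startswith("../"):
            violations.append(f"outside-repo:{path}")
        elif norm not in allowed_targets:
            violations.append(f"non-targeted:{path}")
    return violations
```

Returning labels instead of a boolean makes it straightforward to print the exact guard name and offending path in diagnostics.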
|
||||
|
||||
Discussion File Optimization:
|
||||
- Prefer append-only edits with optional header flips
|
||||
- For large files: generate full new content and compute diff locally
|
||||
- Minimize hunk drift through careful patch construction
|
||||
- Enforce append-only: refuse hunks that modify prior lines except header keys explicitly allowed (`status`, timestamps, `feature_id`, `stage_id`).
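
One way to picture the append-only rule is as a comparison of the file before and after the patch. The sketch below is deliberately simplified: it compares whole file contents rather than hunks, it does not check that a changed `key: value` line actually sits inside the header block, and the timestamp key names are not listed in this document, so callers would have to pass them in via the hypothetical `allowed_keys` parameter.

```python
def is_append_only(old_text, new_text,
                   allowed_keys=("status", "feature_id", "stage_id")):
    """True if new_text keeps every old line, except 'key: value' lines
    whose key is in allowed_keys, and only adds lines after them."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    if len(new_lines) < len(old_lines):
        return False  # something was deleted
    for old, new in zip(old_lines, new_lines):
        if old == new:
            continue
        old_key = old.split(":", 1)[0].strip()
        new_key = new.split(":", 1)[0].strip()
        if ":" in old and ":" in new and old_key == new_key and old_key in allowed_keys:
            continue  # permitted header flip, e.g. a status change
        return False
    return True
```

A stricter version would parse the front matter first and apply the key allowance only there.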
|
||||
|
||||
### Python Orchestrator (automation/workflow.py)
|
||||
Phase 1 (Non-blocking Status):
|
||||
#### Phase 1 (Non-blocking Status):
|
||||
|
||||
```python
|
||||
#!/usr/bin/env python3
|
||||
|
|
@@ -903,48 +967,97 @@ def main():
|
|||
print(json.dumps(status_report, indent=2))
|
||||
sys.exit(0) # Always non-blocking in v1
|
||||
```
|
||||
Core Functions:
|
||||
#### Core Functions:
|
||||
- Vote Parsing: Parse discussion files, track latest votes per participant
|
||||
- Threshold Evaluation: Compute eligibility and quorum status
|
||||
- Status Reporting: JSON output of current discussion state
|
||||
- Decision Hints: Suggest promotion based on policy rules
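
As an illustration of vote parsing and tallying, here is a minimal sketch. It assumes single-line comments of the form `Name: text VOTE: READY|CHANGES|REJECT`, as in the fake-model fixture elsewhere in this document; the actual `vote_line_regex` lives in policy and may differ, and the names `VOTE_LINE`, `latest_votes`, and `tally` are hypothetical.

```python
import re
from collections import Counter

# Assumed comment shape; the real pattern comes from vote_line_regex in policy.
VOTE_LINE = re.compile(
    r"^(?P<who>[A-Za-z0-9_]+):.*\bVOTE:\s*(?P<vote>READY|CHANGES|REJECT)\s*$"
)


def latest_votes(text):
    """Map each participant to their most recent vote (later lines win)."""
    votes = {}
    for line in text.splitlines():
        match = VOTE_LINE.match(line.strip())
        if match:
            votes[match.group("who")] = match.group("vote")
    return votes


def tally(votes):
    """Count votes per option, always reporting all three options."""
    counts = Counter(votes.values())
    return {option: counts.get(option, 0) for option in ("READY", "CHANGES", "REJECT")}
```

Compiling the pattern once at module level also matches the note about reusing a single regex object.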
|
||||
|
||||
Future Enhancements:
|
||||
#### Optimization Notes:
|
||||
- Memoize `load_policy()` and `parse_front_matter()` with LRU caches.
|
||||
- Reuse a single regex object for `vote_line_regex`.
|
||||
- Avoid re-reading unchanged files by comparing `git hash-object` results.
|
||||
|
||||
#### CLI (v1):
|
||||
- workflow.py --status # print stage/vote status for staged files
|
||||
- workflow.py --summarize <path> # regenerate summary sections to stdout (no write)
|
||||
- workflow.py --dry-run # run full pipeline but do not stage patches
|
||||
- Outputs are written to stdout and .git/ai-rules-debug/orchestrator.log.
|
||||
|
||||
#### Future Enhancements:
|
||||
- Policy enforcement based on process/policies.yml
|
||||
- Gitea API integration for issue/PR management
|
||||
- Advanced agent coordination and task routing
|
||||
|
||||
#### Model Invocation (env-configured)
|
||||
- `AI_MODEL_CMD` (default: `claude`) and `AI_MODEL_OPTS` are read from env.
|
||||
- `AI_RULES_SEED` allows deterministic sampling where supported.
|
||||
- If the model returns non-diff output, the hook aborts with diagnostics.
|
||||
|
||||
#### Environment toggles:
|
||||
- `AI_RULES_MAX_JOBS` caps parallel workers (default 4)
|
||||
- `AI_RULES_CACHE_DIR` overrides `.git/ai-rules-cache`
|
||||
- `AI_RULES_DISABLE_CACHE=1` forces re-generation
|
||||
- `AI_RULES_CI=1` enables dry-run & cache-only in CI
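
The environment variables listed above could be gathered into one immutable settings object at startup. This is a sketch only: `HookConfig` and `load_config` are hypothetical names, and clamping `AI_RULES_MAX_JOBS` to at least 1 is an assumption rather than documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class HookConfig:
    model_cmd: str
    model_opts: str
    seed: Optional[str]
    max_jobs: int
    cache_dir: str
    cache_enabled: bool
    ci_mode: bool  # per this document, CI mode implies dry-run and cache-only


def load_config(env: Optional[Mapping[str, str]] = None) -> HookConfig:
    """Read toggles from the given mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return HookConfig(
        model_cmd=env.get("AI_MODEL_CMD", "claude"),
        model_opts=env.get("AI_MODEL_OPTS", ""),
        seed=env.get("AI_RULES_SEED"),
        max_jobs=max(1, int(env.get("AI_RULES_MAX_JOBS", "4"))),  # clamp is an assumption
        cache_dir=env.get("AI_RULES_CACHE_DIR", ".git/ai-rules-cache"),
        cache_enabled=env.get("AI_RULES_DISABLE_CACHE") != "1",
        ci_mode=env.get("AI_RULES_CI") == "1",
    )
```

Passing a plain dictionary instead of the real environment keeps tests hermetic.
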
|
||||
### Gitea Integration (Future)
|
||||
Label System:
|
||||
|
||||
#### Label System:
|
||||
- stage/*: stage/discussion, stage/design, stage/implementation, etc.
|
||||
- blocked/*: blocked/needs-votes, blocked/needs-human
|
||||
- needs/*: needs/design, needs/review, needs/tests
|
||||
|
||||
Automated Actions:
|
||||
#### Automated Actions:
|
||||
- Open/label PRs for implementation transitions
|
||||
- Post status summaries to PR threads
|
||||
- Create tracking issues for feature implementation
|
||||
- Report status checks to PRs
|
||||
|
||||
#### Happy Path (single changed discussion file)
|
||||
```text
|
||||
git add Docs/features/FR_x/.../feature.discussion.md
|
||||
└─ pre-commit
|
||||
├─ resolve_rules → feature_discussion + summary_companion
|
||||
├─ build_prompt (PROMPT_SHA256=…)
|
||||
├─ invoke_model → <<<AI_DIFF_START>>>…<<<AI_DIFF_END>>>
|
||||
├─ sanitize_diff + guards
|
||||
├─ apply_patch (3-way → strict)
|
||||
└─ write logs under .git/ai-rules-debug/
|
||||
```
|
||||
|
||||
## Moderator Protocol
|
||||
### AI_Moderator Responsibilities
|
||||
Conversation Tracking:
|
||||
#### Conversation Tracking:
|
||||
- Monitor unanswered questions (>24 hours)
|
||||
- Track missing votes from active participants
|
||||
- Identify stale threads needing attention
|
||||
- Flag direct mentions that need responses
|
||||
|
||||
Progress Reporting:
|
||||
##### Signals & Triggers
|
||||
- **Unanswered Qs:** any line ending with `?` or prefixed `Q:` with an `@owner` and no reply within `response_timeout_hours`.
|
||||
- **Missing Votes:** participants who posted in the stage but whose last non-empty line does not match `vote_line_regex`.
|
||||
- **Stale Discussion:** no new comments within `discussion_stale_days`.
|
||||
- **Promotion Drift:** conflicting votes present beyond `promotion_timeout_days`.
|
||||
|
||||
#### Progress Reporting:
|
||||
- Compute current vote tallies and thresholds
|
||||
- List participants who haven't voted recently
|
||||
- Summarize promotion status and remaining requirements
|
||||
- Highlight blocking issues or concerns
|
||||
|
||||
Task Allocation:
|
||||
##### Comment Template & Constraints:
|
||||
- Max 10 lines, each ≤ 120 chars.
|
||||
- Sections in order: **UNANSWERED**, **VOTES**, **ACTION ITEMS**, **STATUS**.
|
||||
- Always end with `VOTE: CHANGES` (so it never promotes by itself).
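
These constraints are easy to verify after generation. The sketch below is one possible checker, with assumptions called out: section lines are recognized by a leading keyword (`ACTION` is used as the prefix so that both `ACTION` and `ACTION ITEMS` match), and the function name `moderator_comment_problems` is hypothetical.

```python
SECTIONS = ("UNANSWERED", "VOTES", "ACTION", "STATUS")  # expected order


def moderator_comment_problems(comment, max_lines=10, max_len=120):
    """Return a list of human-readable problems; empty means the comment is valid."""
    lines = comment.strip("\n").splitlines()
    problems = []
    if len(lines) > max_lines:
        problems.append(f"too many lines: {len(lines)} > {max_lines}")
    for number, line in enumerate(lines, start=1):
        if len(line) > max_len:
            problems.append(f"line {number} exceeds {max_len} chars")
    # Sections that are present must appear in the documented order.
    seen = [s for line in lines for s in SECTIONS if line.startswith(s)]
    if seen != sorted(seen, key=SECTIONS.index):
        problems.append("sections out of order")
    if not lines or lines[-1].strip() != "VOTE: CHANGES":
        problems.append("must end with 'VOTE: CHANGES'")
    return problems
```

A check like this fits naturally in the guard step, so an over-long or self-promoting moderator comment is rejected before staging.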
|
||||
|
||||
#### Task Allocation:
|
||||
- Suggest explicit owners for pending tasks
|
||||
- Example: "AI_Architect: please draft the acceptance criteria section"
|
||||
- Example: "Rob: could you clarify the deployment timeline?"
|
||||
|
||||
##### Escalation Path:
|
||||
- If blockers persist past `promotion_timeout_days`, ping owners + maintainer.
|
||||
- If still unresolved after one more nudge interval, create a follow-up entry in the summary’s **ACTION_ITEMS** with owner + due date.
|
||||
|
||||
### Moderator Implementation
|
||||
Rule Definition (in Docs/features/.ai-rules.yml):
|
||||
|
||||
|
|
@@ -952,8 +1065,8 @@ Rule Definition (in Docs/features/.ai-rules.yml):
|
|||
discussion_moderator_nudge:
|
||||
outputs:
|
||||
self_append:
|
||||
path: "{dir}/discussions/{basename}.discussion.md"
|
||||
output_type: "feature_discussion_writer"
|
||||
path: "{dir}/discussions/{stage}.discussion.md"
|
||||
output_type: "discussion_moderator_writer"
|
||||
instruction: |
|
||||
Act as AI_Moderator. Analyze the entire discussion and:
|
||||
|
||||
|
|
@@ -972,32 +1085,71 @@ discussion_moderator_nudge:
|
|||
|
||||
Keep comment under 10 lines. End with "VOTE: CHANGES".
|
||||
Append-only; minimal diff; update nothing else.
|
||||
UNANSWERED: list @owners for Qs > response_timeout_hours.
|
||||
VOTES: READY=X, CHANGES=Y, REJECT=Z; Missing=[@a, @b]
|
||||
ACTION: concrete next steps with @owner and a due date.
|
||||
STATUS: promotion readiness per process/policies.yml (voting.quorum).
|
||||
Constraints: ≤10 lines; ≤120 chars/line; append-only; end with:
|
||||
VOTE: CHANGES
|
||||
```
|
||||
Nudge Frequency: Controlled by nudge_interval_hours in policies
|
||||
> **Automation boundary:** Moderator comments are appended within the current commit; no auto-commits are created.
|
||||
|
||||
## Error Handling & Resilience
|
||||
|
||||
### Safety Invariants
|
||||
- **No auto-commits:** automation only stages changes in the current commit.
|
||||
- **Atomic per-file:** a patch for a file applies all-or-nothing; no partial hunks.
|
||||
- **Append-first for discussions:** prior lines are immutable except allowed header keys.
|
||||
- **Inside repo only:** patches cannot create/modify files outside the repository root.
|
||||
- **Deterministic retry:** identical inputs → identical patches (same prompt hash).
|
||||
|
||||
### Common Failure Modes
|
||||
Patch Application Issues:
|
||||
- Symptom: Hunk drift on large files, merge conflicts
|
||||
- Mitigation: 3-way apply with index preservation, append-only strategies
|
||||
- Fallback: Local diff computation from full new content
|
||||
- Exit code: 2 (apply failure); write `final.diff` and `apply.stderr`
|
||||
|
||||
Model Output Problems:
|
||||
- Symptom: Malformed diff, missing markers, invalid patch format
|
||||
- Mitigation: Extract between markers, validate with git apply --check
|
||||
- Fallback: Clear diagnostics with patch validation output
|
||||
- Exit code: 3 (invalid diff); write `raw.out`, `clean.diff`, `sanitize.log`
|
||||
|
||||
Tooling Dependencies:
|
||||
- Symptom: Missing yq, claude, or other required tools
|
||||
- Mitigation: Pre-flight checks with clear error messages
|
||||
- Fallback: Graceful degradation with feature-specific disabling
|
||||
- Exit code: 4 (missing dependency); write `preflight.log`
|
||||
|
||||
Rule Conflicts:
|
||||
- Symptom: Multiple rules matching same file with conflicting instructions
|
||||
- Mitigation: Nearest-directory precedence with conflict logging
|
||||
- Fallback: Global rule application with warning
|
||||
- Exit code: 5 (rule resolution); write `resolution.log`
|
||||
|
||||
Guardrail Violations:
|
||||
- Symptom: Patch touches forbidden paths, exceeds size, or edits outside markers
|
||||
- Mitigation: Reject patch, print exact guard name and offending path/line count
|
||||
- Exit code: 6 (guardrail); write `guards.json`
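
The exit codes and debug artifacts named in the failure modes above can be kept in one place so that callers and tests agree on them. A minimal sketch follows; the names `HookExit`, `DEBUG_ARTIFACTS`, and `artifacts_for` are hypothetical, and `OK = 0` is the conventional success code rather than something this document states.

```python
from enum import IntEnum


class HookExit(IntEnum):
    OK = 0                  # assumption: conventional success code
    APPLY_FAILURE = 2
    INVALID_DIFF = 3
    MISSING_DEPENDENCY = 4
    RULE_RESOLUTION = 5
    GUARDRAIL = 6


# Files written under .git/ai-rules-debug/ for each failure class.
DEBUG_ARTIFACTS = {
    HookExit.APPLY_FAILURE: ("final.diff", "apply.stderr"),
    HookExit.INVALID_DIFF: ("raw.out", "clean.diff", "sanitize.log"),
    HookExit.MISSING_DEPENDENCY: ("preflight.log",),
    HookExit.RULE_RESOLUTION: ("resolution.log",),
    HookExit.GUARDRAIL: ("guards.json",),
}


def artifacts_for(code):
    """Return the debug files to inspect for an exit code, or () if unknown."""
    try:
        return DEBUG_ARTIFACTS.get(HookExit(code), ())
    except ValueError:
        return ()
```

For instance, `artifacts_for(3)` points an operator at `raw.out`, `clean.diff`, and `sanitize.log`.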
|
||||
|
||||
### Retry & Idempotency
|
||||
- Re-run the same commit contents → identical `PROMPT_SHA256` and identical patch.
|
||||
- To force a new generation, change *only* the source file content or the rule instruction.
|
||||
- `--dry-run` prints the unified diff without staging; useful for CI and reproduction.
|
||||
|
||||
### Recovery Procedures
|
||||
|
||||
#### Quick Triage Map
|
||||
| Failure | Where to Look | Fast Fix |
|
||||
|---|---|---|
|
||||
| Patch won’t apply | `.git/ai-rules-debug/*/apply.stderr` | Rebase or re-run after pulling; if discussion, ensure append-only |
|
||||
| Invalid diff envelope | `raw.out`, `clean.diff`, `sanitize.log` | Check that model returned `<<<AI_DIFF_START/END>>>`; shorten file context |
|
||||
| Rule not found | `resolution.log` | Verify `file_associations` and `{stage}`/`{feature_id}` resolution |
|
||||
| Guardrail breach | `guards.json` | Reduce patch size, keep edits within markers, or adjust config limit |
|
||||
| Missing dependency | `preflight.log` | Install tool or disable rule until available |
|
||||
|
||||
Manual Override:
|
||||
|
||||
```bash
|
||||
|
|
@@ -1011,19 +1163,42 @@ Debug Artifacts:
|
|||
- All patch variants saved to .git/ai-rules-debug/
|
||||
- Timestamped files: raw, clean, sanitized, final patches
|
||||
- Commit-specific directories for correlation
|
||||
- Rule resolution decisions saved to `.git/ai-rules-debug/resolution.log` including matched rule, output keys, and template-expanded paths.
|
||||
|
||||
Rollback Strategy:
|
||||
- All generated artifacts are staged separately
|
||||
- Easy partial staging: git reset HEAD <file> for specific artifacts
|
||||
- Full reset: git reset HEAD~1 to undo entire commit with generations
|
||||
|
||||
Regenerate Safely:
|
||||
```bash
|
||||
# See what would be generated without staging anything
|
||||
automation/workflow.py --dry-run
|
||||
# Apply only after inspection
|
||||
git add -p
|
||||
```
|
||||
Bypass & Minimal Patch:
|
||||
```bash
|
||||
# Temporarily bypass the hook for urgent hand-edits
|
||||
git commit --no-verify -m "Hotfix: manual edit, will reconcile with rules later"
|
||||
```
|
||||
|
||||
### Audit Trail
|
||||
Execution Logging:
|
||||
|
||||
#### Patch Sanitization & Guards (summary)
|
||||
- Validate unified diff headers; reject non-diff content.
|
||||
- Enforce append-only on discussions; allow header keys: status, feature_id, stage_id, timestamps.
|
||||
- Enforce marker-bounded edits for *.discussion.sum.md.
|
||||
- Limit per-artifact patch size (default 200 KB; configurable).
|
||||
- Reject paths escaping repo root or targeting binaries.
|
||||
- See Appendix B for the normative, full rule set.
|
||||
|
||||
#### Execution Logging:
|
||||
- All rule invocations logged with source→output mapping
|
||||
- Patch application attempts and outcomes recorded
|
||||
- Vote calculations and promotion decisions documented
|
||||
|
||||
Debug Bundle:
|
||||
#### Debug Bundle:
|
||||
|
||||
```bash
|
||||
.git/ai-rules-debug/
|
||||
|
|
@@ -1034,15 +1209,42 @@ Debug Bundle:
|
|||
│ └─ final.diff # Final applied patch
|
||||
└─ execution.log # Chronological action log
|
||||
```
|
||||
### Operator Checklist (1-minute)
|
||||
1. git status → confirm only intended files are staged.
|
||||
2. Open .git/ai-rules-debug/…/apply.stderr (if failed) or final.diff.
|
||||
3. If discussion file: ensure your change is append-only.
|
||||
4. Re-run automation/workflow.py --dry-run and compare diffs.
|
||||
5. If still blocked, bypass with --no-verify, commit, and open a follow-up to reconcile.
|
||||
|
||||
## Security & Secrets Management
|
||||
|
||||
### Security Principles
|
||||
- **No plaintext secrets in Git** — ever.
|
||||
- **Scan before stage** — block secrets at pre-commit, not in CI.
|
||||
- **Redact on write** — debug logs and prompts never store raw secrets.
|
||||
- **Least scope** — env vars loaded only for the current process; not persisted.
|
||||
|
||||
### Secret Protection
|
||||
Never Commit:
|
||||
#### Never Commit:
|
||||
- API keys, authentication tokens
|
||||
- Personal identifying information
|
||||
- Internal system credentials
|
||||
- Private configuration data
|
||||
|
||||
Environment Variables:
|
||||
#### Secret Scanning & Blocking (pre-commit):
|
||||
- Run lightweight detectors before rule execution; fail fast on matches.
|
||||
- Suggested tools (any one is fine): `git-secrets`, `gitleaks`, or `trufflehog` (regex mode).
|
||||
- Provide a repo-local config at `process/secrets.allowlist` to suppress false positives.
|
||||
|
||||
#### Redaction Policy:
|
||||
- If a candidate secret is detected in an input file, the hook **aborts**.
|
||||
- If a secret appears only in model output or logs, it is **replaced** with `***REDACTED***` before writing artifacts.
|
||||
|
||||
#### Inbound/Outbound Data Handling:
|
||||
- **Inbound (source & discussions):** if a suspected secret is present, the hook blocks the commit and points to the line numbers.
|
||||
- **Outbound (logs, envelopes, diffs):** redact values and include a `[REDACTED:<key>]` tag to aid debugging without leakage.
|
||||
|
||||
#### Environment Variables:
|
||||
|
||||
```bash
|
||||
# Current approach
|
||||
|
|
@@ -1050,74 +1252,160 @@ export CLAUDE_API_KEY="your_key"
|
|||
# Future .env approach (git-ignored)
|
||||
# .env file loaded via python-dotenv in Python components
|
||||
```
|
||||
.gitignore (additions):
|
||||
```
|
||||
.env
|
||||
.env.*
|
||||
# .env.local, .env.prod, etc.
|
||||
*.key
|
||||
*.pem
|
||||
*.p12
|
||||
secrets/*.json
|
||||
secrets/*.yaml
|
||||
```
|
||||
Provide non-sensitive examples as *.sample:
|
||||
- .env.sample with placeholder keys
|
||||
- automation/config.sample.yml showing structure without values
|
||||
|
||||
Configuration Management:
|
||||
- Keep sensitive endpoints in automation/config.yml
|
||||
- Use environment variable substitution in configuration
|
||||
- Validate no secrets in discussions, rules, or generated artifacts
|
||||
- Substitution happens **in-memory** during prompt build; no expanded values are written back to disk.
|
||||
- Maintain a denylist of key names that must never appear in artifacts: `API_KEY, ACCESS_TOKEN, SECRET, PASSWORD, PRIVATE_KEY`.
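
A denylist-based redaction pass might look like the sketch below. It only covers `KEY=value` and `KEY: value` shapes whose key contains a denylisted name; it does not implement a high-entropy detector, and the function name `redact` is hypothetical.

```python
import re

DENYLIST = ("API_KEY", "ACCESS_TOKEN", "SECRET", "PASSWORD", "PRIVATE_KEY")

# key (containing a denylisted name), then '=' or ':', then a quoted or bare value
_ASSIGNMENT = re.compile(
    r"(?P<key>\b[A-Za-z0-9_]*(?:%s)[A-Za-z0-9_]*)"
    r"(?P<sep>\s*[=:]\s*)"
    r"(?P<val>\"[^\"]*\"|'[^']*'|\S+)" % "|".join(DENYLIST),
    re.IGNORECASE,
)


def redact(text):
    """Replace values assigned to denylisted keys with ***REDACTED***."""
    return _ASSIGNMENT.sub(
        lambda m: m.group("key") + m.group("sep") + "***REDACTED***", text
    )
```

Keeping the key name and masking only the value preserves enough context to debug a log line without leaking the secret itself.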
|
||||
|
||||
### Access Control
|
||||
Repository Security:
|
||||
- Assume all repository contents are potentially exposed
|
||||
- No sensitive business logic in prompt instructions
|
||||
- Regular security reviews of rule definitions
|
||||
- Guardrails: outputs cannot target paths outside repo root; writes to `secrets/` are blocked.
|
||||
|
||||
Agent Permissions:
|
||||
- Limit file system access to repository scope
|
||||
- Validate output paths stay within repository
|
||||
- Sanitize all file operations for path traversal
|
||||
- Prompt Redaction: when building the model prompt, mask env-like values with `***REDACTED***` for any key matching the denylist or high-entropy detector.
|
||||
- See **Appendix B: Diff Application Rules (Normative)** for the full list of path/size/marker guardrails enforced during patch application.
|
||||
|
||||
|
||||
### Incident Response & Rotation
|
||||
- If a secret is accidentally committed, immediately:
|
||||
1) Rotate the key at the provider,
|
||||
2) Purge it from Git history (e.g., `git filter-repo`),
|
||||
3) Invalidate caches and re-run the secret scanner.
|
||||
- Track rotations in a private runbook (outside the repo).
|
||||
|
||||
### Preflight Checks (hook)
|
||||
- Verify required tools present: `git`, `python3`, `yq` (optional), chosen secret scanner.
|
||||
- Run secret scanner against **staged** changes; on hit → exit 11.
|
||||
- Validate `.ai-rules.yml` schema; on error → exit 12.
|
||||
- Confirm patch guards (size/paths); violations → exit 13.
|
||||
- Diagnostics: write to `.git/ai-rules-debug/preflight.log`.
|
||||
|
||||
## Performance & Scale Considerations
|
||||
### Optimization Strategies
|
||||
Prompt Efficiency:
|
||||
|
||||
#### Deterministic Caching & Batching:
|
||||
- **Prompt cache**: reuse model outputs when `PROMPT_SHA256` is identical.
|
||||
- **Batch compatible files**: same rule/output pairs with small contexts can be grouped.
|
||||
- **Stable ordering**: sort staged files + outputs before batching to keep results repeatable.
|
||||
- **Cache location**: `.git/ai-rules-cache/` (keys by `PROMPT_SHA256` + rule/output).
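
To make the caching idea concrete, here is a small sketch of hashing a prompt and looking it up on disk. The canonical serialization (sorted header keys, NUL-separated inputs) and the directory layout are assumptions for illustration, as are the names `prompt_sha256`, `cache_path`, and `lookup_or_generate`; the document does not specify them.

```python
import hashlib
from pathlib import Path


def prompt_sha256(header_fields, inputs):
    """Hash header fields in sorted key order, then each input chunk."""
    digest = hashlib.sha256()
    for key in sorted(header_fields):
        digest.update(f"{key}: {header_fields[key]}\n".encode("utf-8"))
    for chunk in inputs:
        digest.update(b"\x00")  # separator so chunk boundaries affect the hash
        digest.update(chunk.encode("utf-8"))
    return digest.hexdigest()


def cache_path(cache_dir, rule, output_key, digest):
    """Assumed layout: <cache_dir>/<rule>__<output_key>/<digest>.diff"""
    return Path(cache_dir) / f"{rule}__{output_key}" / f"{digest}.diff"


def lookup_or_generate(cache_dir, rule, output_key, digest, generate, cache_only=False):
    """Return a cached diff, or call generate() and store its result.

    With cache_only=True (as in CI mode) a miss raises LookupError instead.
    """
    path = cache_path(cache_dir, rule, output_key, digest)
    if path.exists():
        return path.read_text(encoding="utf-8")
    if cache_only:
        raise LookupError(digest)
    result = generate()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(result, encoding="utf-8")
    return result
```

Sorting the header keys before hashing means the digest does not depend on dictionary insertion order, which is one way to keep results repeatable.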
|
||||
|
||||
#### Prompt Efficiency:
|
||||
- Pass staged diffs instead of full file contents when possible
|
||||
- Use concise, structured instructions with clear formatting
|
||||
- Limit context to relevant sections for large files
|
||||
- Preload policy once per run; inject only relevant subsections into the prompt
|
||||
- Memoize parsed front-matter (YAML) and ASTs across files in the same run
|
||||
- Trim discussion context to the last N lines (configurable) + stable summary
|
||||
|
||||
Discussion Management:
|
||||
#### Discussion Management:
|
||||
- Append-only edits with periodic summarization
|
||||
- Compact status reporting in moderator comments
|
||||
- Archive completed discussions if they become too large
|
||||
- Sliding-window summarization: regenerate `{stage}.discussion.sum.md` when diff > threshold lines
|
||||
- Limit TIMELINE to the last 15 entries (configurable)
|
||||
|
||||
Batch Operations:
|
||||
#### Batch Operations:
|
||||
- Process multiple related files in single model calls when beneficial
|
||||
- Cache rule resolutions for multiple files in same directory
|
||||
- Parallelize independent output generations
|
||||
- Cap parallelism with `AI_RULES_MAX_JOBS` (default 4) to avoid CPU thrash.
|
||||
- Deduplicate prompts for identical contexts across multiple outputs.
|
||||
|
||||
### Scaling Limits
|
||||
File Size Considerations:
|
||||
|
||||
#### File Size Considerations:
|
||||
- Small (<100KB): Full content in prompts
|
||||
- Medium (100KB-1MB): Diff-only with strategic context
|
||||
- Large (>1MB): Chunked processing or summary-only approaches
|
||||
- Very large (>5MB): refuse inline context; require pre-summarized artifacts
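
The size tiers above amount to a simple threshold lookup. In the sketch below, the strategy labels and the function name `context_strategy` are hypothetical, and 1 KB is assumed to mean 1024 bytes.

```python
KB = 1024          # assumption: binary kilobytes
MB = 1024 * KB


def context_strategy(size_bytes):
    """Map a file size to the prompt-context tier described in this section."""
    if size_bytes < 100 * KB:
        return "full-content"
    if size_bytes <= 1 * MB:
        return "diff-with-context"
    if size_bytes <= 5 * MB:
        return "chunked-or-summary"
    return "refuse-inline"  # require pre-summarized artifacts instead
```

Keeping the thresholds in one function makes it easier to move them into policy configuration later.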
|
||||
|
||||
Repository Size:
|
||||
#### Context Window Strategy:
|
||||
- Hard cap prompt body at 200 KB per output (configurable)
|
||||
- If over cap: (1) include diff; (2) include header + last 200 lines; (3) link to file path
|
||||
|
||||
#### AST/Diagram Work:
|
||||
- Cache ASTs in `.git/ai-rules-cache/ast/` keyed by `<rel_path>:<blob_sha>`
|
||||
- Rate-limit diagram updates to once per file per commit (guard duplicate runs)
|
||||
|
||||
#### Repository Size:
|
||||
- Current approach suitable for medium-sized repositories
|
||||
- For very large codebases: scope rules to specific directories
|
||||
- Consider rule disabling for generated/binary assets
|
||||
|
||||
Rate Limiting:
|
||||
#### Rate Limiting:
|
||||
- Model API calls: implement throttling and retry logic
|
||||
- Gitea API: respect rate limits with exponential backoff
|
||||
- File operations: batch where possible to reduce I/O
|
||||
|
||||
#### Performance Telemetry (optional)
|
||||
- Write `.git/ai-rules-debug/perf.json` with per-output timings:
|
||||
`{ resolve_ms, prompt_ms, model_ms, sanitize_ms, apply_ms, bytes_in, bytes_out }`
|
||||
- Summarize totals at the end of the run to spot regressions quickly.
|
||||
|
||||
## Testing Strategy
|
||||
### Testing Tiers
|
||||
Unit Tests (Python):
|
||||
|
||||
## Goals
|
||||
- Prove determinism (same inputs → same patch).
|
||||
- Prove guardrails (append-only, marker-bounded, path/size limits).
|
||||
- Prove promotion math (votes, quorum, human gates).
|
||||
- Keep runs fast and hermetic (temp repo, mock clock, seeded RNG).
|
||||
|
||||
## Testing Tiers
|
||||
### Unit Tests (Python):
|
||||
- Vote parsing and eligibility calculation
|
||||
- Policy evaluation and quorum determination
|
||||
- Rules resolution and conflict handling
|
||||
- Template variable substitution
|
||||
- Integration Tests (Bash + Python):
|
||||
|
||||
### Integration Tests (Bash + Python):
|
||||
- End-to-end rule → prompt → patch → apply cycle
|
||||
- Discussion status transitions and promotion logic
|
||||
- Error handling and recovery procedures
|
||||
- Multi-file rule processing
|
||||
|
||||
Artifact Validation:
|
||||
### Artifact Validation:
|
||||
- PlantUML syntax checking: plantuml -checkonly
|
||||
- Markdown structure validation
|
||||
- Template completeness checks
|
||||
- YAML syntax validation
|
||||
|
||||
### Golden & Snapshot Tests:
|
||||
- **Prompt Envelope Golden**: compare against `tests/gold/envelopes/<case>.txt`
|
||||
- **Diff Output Golden**: compare unified diffs in `tests/gold/diffs/<case>.diff`
|
||||
- **Summary Snapshot**: write `{stage}.discussion.sum.md` and compare against `tests/snapshots/<case>/<stage>.discussion.sum.md` (markers only)
|
||||
|
||||
### Property-Based Tests:
|
||||
- Using `hypothesis` to fuzz discussion comments; invariants:
|
||||
- last non-empty line drives the vote
|
||||
- regex `vote_line_regex` never matches malformed lines
|
||||
- marker-bounded writer never edits outside markers
|
||||
|
||||
### Mutation Tests (optional):
|
||||
- Run `mutmut` on `automation/workflow.py` vote math and ensure tests fail when logic is mutated.
|
||||
|
||||
### Test Architecture
|
||||
```text
|
||||
tests/
|
||||
|
|
@@ -1136,14 +1424,39 @@ tests/
|
|||
│ │ └─ Docs/features/FR_test/
|
||||
│ │ ├─ request.md
|
||||
│ │ └─ discussions/
|
||||
│ │ └─ data/
|
||||
│ │ ├─ bigfile.md # >1MB to trigger chunking
|
||||
│ │ ├─ bad.diff # malformed diff for sanitizer tests
|
||||
│ │ ├─ secrets.txt # simulated secrets for scanner tests
|
||||
│ │ └─ envelopes/ # golden prompt envelopes
|
||||
│ ├─ gold/
|
||||
│ │ ├─ envelopes/
|
||||
│ │ └─ diffs/
|
||||
│ └─ test_cases/
|
||||
│ ├─ test_feature_promotion.sh
|
||||
│ ├─ test_design_generation.sh
|
||||
│ └─ test_bug_creation.sh
|
||||
│ ├─ test_bug_creation.sh
|
||||
│ ├─ test_append_only_guard.sh
|
||||
│ ├─ test_summary_snapshot.sh
|
||||
│ ├─ test_secret_scanner_block.sh
|
||||
│ ├─ test_ci_cache_only_mode.sh
|
||||
│ ├─ test_moderator_nudge.sh
|
||||
│ └─ test_rule_precedence.sh
|
||||
├─ bin/
|
||||
│ └─ claude # Fake deterministic model
|
||||
├─ snapshots/
|
||||
│ └─ FR_test_case/
|
||||
│ ├─ feature.discussion.sum.md
|
||||
│ └─ design.discussion.sum.md
|
||||
└─ README.md
|
||||
```
|
||||
|
||||
### Hermetic Test Utilities
|
||||
- Mock clock: set SOURCE_DATE_EPOCH to freeze {date} expansions.
|
||||
- Temp repo: each test case creates a fresh TMP_REPO with isolated .git.
|
||||
- Seeded RNG: set AI_RULES_SEED for deterministic model variants.
|
||||
- Filesystem isolation: tests write only under TMPDIR and .git/ai-rules-*.
|
||||
|
||||
### Fake Model Implementation
|
||||
Purpose: Deterministic testing without external API dependencies
|
||||
|
||||
|
|
@@ -1152,11 +1465,24 @@ Implementation (tests/bin/claude):
|
|||
```bash
|
||||
#!/bin/bash
|
||||
# Fake Claude CLI for testing
|
||||
# Reads prompt from stdin, outputs predetermined patch based on content
|
||||
# Reads prompt envelope from stdin, outputs a unified diff or injected error.
|
||||
# Controls:
|
||||
# AI_FAKE_ERR=diff|apply|malformed (force error modes)
|
||||
# AI_FAKE_SEED=<int> (deterministic variant)
|
||||
# AI_FAKE_MODE=discussion|design (which template to emit)
|
||||
|
||||
set -euo pipefail
|
||||
prompt="$(cat)"
|
||||
|
||||
if [[ "${AI_FAKE_ERR:-}" == "malformed" ]]; then
|
||||
echo "this is not a diff"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
target_file=$(echo "$prompt" | awk '/^SOURCE_FILE:/ {print $2}')
|
||||
if echo "$prompt" | grep -q "RULE: .*feature_discussion/self_append"; then
|
||||
cat << 'EOF'
|
||||
|
||||
if grep -q "OUTPUT FILE:.*discussion.md" ; then
|
||||
# Output discussion update patch
|
||||
cat << 'EOF'
|
||||
<<<AI_DIFF_START>>>
|
||||
diff --git a/Docs/features/FR_test/discussions/feature.discussion.md b/Docs/features/FR_test/discussions/feature.discussion.md
|
||||
index 1234567..890abcd 100644
|
||||
|
|
@@ -1166,15 +1492,35 @@ index 1234567..890abcd 100644
|
|||
|
||||
## Summary
|
||||
Test feature for validation
|
||||
+
|
||||
+## Participation
|
||||
+AI_Test: This is a test comment. VOTE: READY
|
||||
+
|
||||
+## Participation
|
||||
+AI_Test: This is a test comment. VOTE: READY
|
||||
EOF
|
||||
elif echo "$prompt" | grep -q "RULE: .*discussion_summary_writer"; then
|
||||
cat << 'EOF'
|
||||
<<<AI_DIFF_START>>>
|
||||
diff --git a/Docs/features/FR_test/discussions/feature.discussion.sum.md b/Docs/features/FR_test/discussions/feature.discussion.sum.md
|
||||
index 1111111..2222222 100644
|
||||
--- a/Docs/features/FR_test/discussions/feature.discussion.sum.md
|
||||
+++ b/Docs/features/FR_test/discussions/feature.discussion.sum.md
|
||||
@@ -5,3 +5,4 @@
|
||||
<!-- SUMMARY:VOTES START -->
|
||||
## Votes (latest per participant)
|
||||
READY: 1 • CHANGES: 0 • REJECT: 0
|
||||
- Rob
|
||||
<!-- SUMMARY:VOTES END -->
|
||||
<<<AI_DIFF_END>>>
|
||||
EOF
|
||||
else
|
||||
# Default patch for other file types
|
||||
echo "No specific patch for this file type"
|
||||
fi
|
||||
echo "<<<AI_DIFF_START>>>"
|
||||
echo "diff --git a/README.md b/README.md"
|
||||
echo "index 0000000..0000001 100644"
|
||||
echo "--- a/README.md"
|
||||
echo "+++ b/README.md"
|
||||
echo "@@ -0,0 +1,1 @@"
|
||||
echo "+Generated by fake model"
|
||||
echo "<<<AI_DIFF_END>>>"
fi
|
||||
```
|
||||
### Integration Test Runner
|
||||
#### Key Test Scenarios
|
||||
|
|
@@ -1183,6 +1529,12 @@ fi
|
|||
- Bug Creation: test failure → auto bug report generation
|
||||
- Error Recovery: Malformed patch → graceful failure with diagnostics
|
||||
- Rule Conflicts: Multiple rule matches → nearest-directory resolution
|
||||
- Append-Only Guard: attempt to edit earlier lines in discussion → reject
|
||||
- Summary Snapshot: only markers mutate; outside text preserved
|
||||
- Secret Scanner: staged secret blocks commit with exit 11
|
||||
- CI Cache-only: with AI_RULES_CI=1 and cache miss → exit 20
|
||||
- Moderator Nudge: comment ≤10 lines, ends with `VOTE: CHANGES`
|
||||
- Rule Precedence: local overrides feature, feature overrides global
|
||||
|
||||
#### Test Execution
|
||||
```bash
|
||||
|
|
@@ -1193,17 +1545,42 @@ cd tests/integration
|
|||
# Run specific test case
|
||||
./test_cases/test_feature_promotion.sh
|
||||
```
|
||||
|
||||
Makefile (optional)
|
||||
```make
|
||||
.PHONY: test unit integ lint
|
||||
test: unit integ
|
||||
unit:
|
||||
- pytest -q tests/unit
|
||||
integ:
|
||||
- cd tests/integration && ./run.sh
|
||||
lint:
|
||||
- ruff check automation src || true
|
||||
```
|
||||
|
||||
### Continuous Validation
|
||||
Pre-commit Checks:
|
||||
#### Pre-commit Checks:
|
||||
- PlantUML syntax validation for generated diagrams
|
||||
- Markdown link validation
|
||||
- YAML syntax checking for rule files
|
||||
- Template variable validation
|
||||
|
||||
Performance Benchmarks:
|
||||
#### Performance Benchmarks:
|
||||
- Rule resolution time for typical commit
|
||||
- Patch generation and application duration
|
||||
- Memory usage during large file processing
|
||||
- **CI Mode (`AI_RULES_CI=1`)**:
|
||||
- Default to `--dry-run` and **cache-only** model lookups.
|
||||
- On cache miss, print the missing `PROMPT_SHA256`, skip invocation, and exit 20.
|
||||
- Use to keep CI fast and reproducible.
|
||||
|
||||
#### Coverage Targets:
|
||||
- ≥90% line coverage on `automation/workflow.py` vote/quorum logic
|
||||
- ≥80% branch coverage on rule resolution and guards
|
||||
|
||||
#### Success Criteria:
|
||||
- All golden prompts/diffs stable across runs (no drift)
|
||||
- Guardrail tests fail if append-only/marker or path checks are removed
|
||||
|
||||
## Source Intelligence Automation (Auto-Review + Auto-Diagram)
|
||||
|
||||
|
|
@@ -1247,14 +1624,14 @@ js-file:
|
|||
description: "Generate PlantUML + review for JS/TS files"
|
||||
outputs:
|
||||
diagram:
|
||||
path: "Docs/diagrams/file_diagrams/{basename}.puml"
|
||||
path: "Docs/diagrams/file_diagrams/{name}.puml"
|
||||
output_type: "puml-file"
|
||||
instruction: |
|
||||
Parse code structure and update a PlantUML diagram:
|
||||
- Modules, classes, functions
|
||||
- Control-flow edges between major functions
|
||||
review:
|
||||
path: "Docs/discussions/reviews/{date}_{basename}.md"
|
||||
path: "Docs/discussions/reviews/{date}_{name}.md"
|
||||
output_type: "md-file"
|
||||
instruction: |
|
||||
Summarize this commit’s code changes:
|
||||
|
|
@@ -1857,6 +2234,43 @@ timeouts:
|
|||
discussion_stale_days: 3
|
||||
nudge_interval_hours: 24
|
||||
promotion_timeout_days: 14
|
||||
moderation:
|
||||
max_lines: 10
|
||||
max_line_length: 120
|
||||
security:
|
||||
scanners:
|
||||
enabled: true
|
||||
tool: gitleaks # or git-secrets, trufflehog
|
||||
allowlist_file: process/secrets.allowlist
|
||||
redaction:
|
||||
apply_to:
|
||||
- logs
|
||||
- prompts
|
||||
- patches
|
||||
denylist_keys:
|
||||
- API_KEY
|
||||
- ACCESS_TOKEN
|
||||
- SECRET
|
||||
- PASSWORD
|
||||
- PRIVATE_KEY
|
||||
guards:
|
||||
block_paths:
|
||||
- secrets/
|
||||
max_patch_kb: 200
|
||||
forbid_binary_edits: true
|
||||
performance:
|
||||
max_jobs: 4
|
||||
prompt_kb_cap: 200
|
||||
discussion_timeline_limit: 15
|
||||
cache:
|
||||
enabled: true
|
||||
dir: .git/ai-rules-cache
|
||||
batching:
|
||||
enabled: true
|
||||
max_batch: 4
|
||||
ast_cache:
|
||||
enabled: true
|
||||
dir: .git/ai-rules-cache/ast
|
||||
```
|
||||
|
||||
### Appendix B: Diff Application Rules (Normative)
|
||||
|
|
|
|||