From 33b550ad5b2060cf9ffe36a45a8a9c4445e9131b Mon Sep 17 00:00:00 2001
From: rob
Date: Sun, 2 Nov 2025 20:04:09 -0400
Subject: [PATCH] docs: complete update for AI normalization architecture
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Updated all documentation to reflect the new two-tier extraction system:

**workflow-marker-extraction.puml:**
- Completely rewritten to show AI normalization flow
- Documents agents.normalize_discussion() as primary method
- Shows simple line-start fallback for explicit markers
- Includes natural conversation examples vs. explicit markers
- Demonstrates resilience and cost-effectiveness

**AUTOMATION.md:**
- Restructured "Conversation Guidelines" section
- Emphasizes natural conversation as recommended approach
- Clarifies AI normalization extracts from conversational text
- Documents explicit markers as fallback when AI unavailable
- Explains two-tier architecture benefits

**diagrams-README.md:**
- Already updated in previous commit

All documentation now accurately reflects:
✅ AI-powered extraction (agents.py) for natural conversation
✅ Simple fallback parsing (workflow.py) for explicit markers
✅ Multi-provider resilience (claude → codex → gemini)
✅ No strict formatting requirements for participants

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude
---
 docs/AUTOMATION.md                   |  70 +++++--
 docs/workflow-marker-extraction.puml | 283 +++++++++------------------
 docs/workflow-marker-extraction.svg  | 151 ++++++++++++++
 3 files changed, 293 insertions(+), 211 deletions(-)
 create mode 100644 docs/workflow-marker-extraction.svg

diff --git a/docs/AUTOMATION.md b/docs/AUTOMATION.md
index acb9db6..fcdf1ec 100644
--- a/docs/AUTOMATION.md
+++ b/docs/AUTOMATION.md
@@ -265,41 +265,73 @@ Captures architectural decisions with rationale.
 ```
 
-## Conversation Guidelines (Optional)
+## Conversation Guidelines
 
-Using these markers helps extract information accurately. **Many work without AI using regex:**
+### Natural Conversation (Recommended)
+
+**Write naturally - AI normalization extracts markers automatically:**
 
 ```markdown
-# Markers (✅ = works without AI)
+# Examples of natural conversation that AI understands:
 
-Q: # ✅ Mark questions explicitly (also: "Question:", or ending with ?)
-A: # Mark answers explicitly (AI tracks these)
-Re: # Partial answers or follow-ups (AI tracks these)
+- Alice: I think we should use OAuth2. Does anyone know if we need OAuth 2.1 specifically?
+  VOTE: READY
+
+- Bob: Good question Alice. I'm making a decision here - we'll use OAuth 2.0 for now.
+  @Carol can you research migration paths to 2.1? VOTE: CHANGES
+
+- Carol: I've completed the OAuth research. We can upgrade later without breaking changes.
+  VOTE: READY
+```
+
+**AI normalization (via `agents.py`) extracts:**
+- Decisions from natural language ("I'm making a decision here - ...")
+- Questions from conversational text ("Does anyone know if...")
+- Action items with @mentions ("@Carol can you research...")
+- Votes (always tracked: `VOTE: READY|CHANGES|REJECT`)
+
+### Explicit Markers (Fallback)
+
+**If AI is unavailable, these explicit line-start markers work as fallback:**
+
+```markdown
+# Markers (✅ = works without AI as simple fallback)
+
+QUESTION: # ✅ Explicit question marker
+Q: # ✅ Short form
 
 TODO: # ✅ New unassigned task
-ACTION: # ✅ Task with implied ownership (alias for TODO)
-ASSIGNED: @name # ✅ Claimed task (extracts @mention as assignee)
+ACTION: # ✅ Task with implied ownership
+ASSIGNED: @name # ✅ Claimed task
 DONE: # ✅ Mark task complete
 
-DECISION: # ✅ Architectural decision (AI adds rationale/alternatives)
-Rationale: # Explain reasoning (AI extracts this)
+DECISION: # ✅ Architectural decision
 
-VOTE: READY|CHANGES|REJECT # ✅ REQUIRED for voting (always tracked)
+VOTE: READY|CHANGES|REJECT # ✅ ALWAYS tracked (with or without AI)
 
-@Name # ✅ Mention someone specifically
-@all # ✅ Mention everyone
+@Name # ✅ Mention extraction (simple regex)
 ```
 
-**Example Workflow:**
+**Example with explicit markers:**
 
 ```markdown
-- Alice: Q: Should we support OAuth2?
+- Alice: QUESTION: Should we support OAuth2?
 - Bob: TODO: Research OAuth2 libraries
-- Bob: ASSIGNED: OAuth2 library research (@Bob taking ownership)
-- Carol: DECISION: Use OAuth2 for authentication. Rationale: Industry standard with good library support.
-- Carol: DONE: Completed OAuth2 comparison document
-- Dave: @all Please review the comparison by Friday. VOTE: READY
+- Bob: ASSIGNED: OAuth2 library research
+- Carol: DECISION: Use OAuth2 for authentication
+- Dave: @all Please review. VOTE: READY
 ```
 
+### Two-Tier Architecture
+
+1. **AI Normalization (Primary):** Handles natural conversation, embedded markers, context understanding
+2. **Simple Fallback:** Handles explicit line-start markers when AI unavailable
+
+Benefits:
+- ✅ Participants write naturally without strict formatting
+- ✅ Resilient (multi-provider fallback: claude → codex → gemini)
+- ✅ Works offline/API-down with explicit markers
+- ✅ Cost-effective (uses fast models for extraction)
+
 ## Implementation Details
 
 ### Incremental Processing
diff --git a/docs/workflow-marker-extraction.puml b/docs/workflow-marker-extraction.puml
index 7ecfad8..b740274 100644
--- a/docs/workflow-marker-extraction.puml
+++ b/docs/workflow-marker-extraction.puml
@@ -1,6 +1,6 @@
 @startuml workflow-marker-extraction
 !theme plain
-title Workflow Marker Extraction with Regex Pattern Matching
+title Workflow Marker Extraction with AI Normalization
 
 start
 
@@ -8,152 +8,80 @@ start
 
 :workflow.py reads file content;
 
-partition "Parse Comments" {
-  :Split file into lines;
+partition "Two-Tier Extraction" {
+  :Call extract_structured_basic()\nSimple fallback parsing;
 
-  repeat
-    :Read next line;
+  note right
+    **Fallback: Simple Line-Start Matching**
+    Only matches explicit markers at line start:
+    - DECISION: text
+    - QUESTION: text
+    - Q: text
+    - ACTION: text
+    - TODO: text
+    - ASSIGNED: text
+    - DONE: text
 
-    if (Line is HTML comment?) then (yes)
-      :Skip (metadata);
-    else if (Line is heading?) then (yes)
-      :Skip (structure);
-    else (participant comment)
-      :Extract participant name\n(before first ":");
+    Uses case-insensitive startswith() matching.
+    Handles strictly-formatted discussions.
+  end note
 
+  :Store fallback results\n(decisions, questions, actions, mentions);
+
+  :Call agents.normalize_discussion()\nAI-powered extraction;
+
+  partition "AI Normalization (agents.py)" {
+    :Build prompt for AI model;
+    note right
+      **AI Prompt:**
+      "Extract structured information from discussion.
+      Return JSON with: votes, questions, decisions,
+      action_items, mentions"
+
+      Supports natural conversation like:
+      "I'm making a decision here - we'll use X"
+      "Does anyone know if we need Y?"
+      "@Sarah can you check Z?"
+    end note
+
+    :Execute command chain\n(claude → codex → gemini);
+
+    if (AI returned valid JSON?) then (yes)
+      :Parse JSON response;
+      :Extract structured data:\n- votes\n- questions\n- decisions\n- action_items\n- mentions;
+      :Override fallback results\nwith AI results;
       note right
-        **Participant Format:**
-        - Rob: Comment text...
-        - Sarah: Comment text...
-        - AI_Claude: Comment text...
-
-        Names starting with "AI_"
-        are excluded from voting if
-        allow_agent_votes: false
+        **AI advantages:**
+        - Handles embedded markers
+        - Understands context
+        - Extracts from natural language
+        - No strict formatting required
+      end note
+    else (no - AI failed or unavailable)
+      :Use fallback results only;
+      note right
+        **Fallback activated when:**
+        - All providers fail
+        - Invalid JSON response
+        - agents.py import fails
+        - API rate limits hit
       end note
-
-      partition "Extract Structured Markers" {
-        :Apply regex patterns\nto comment text;
-
-        if (**DECISION**: found?) then (yes)
-          :Extract decision text;
-          :Store decision record;
-          note right
-            **Pattern:**
-            (?:\\*\\*)?DECISION(?:\\*\\*)?
-            \\s*:\\s*(.+?)
-            (?=\\s*(?:\\*\\*QUESTION|\\*\\*ACTION|VOTE:)|$)
-
-            **Captures:** Decision text until next marker
-
-            **Example:**
-            {
-              participant: "Rob",
-              decision: "text...",
-              rationale: "",
-              supporters: []
-            }
-          end note
-        endif
-
-        if (**QUESTION**: found?) then (yes)
-          :Extract question text;
-          :Store question record;
-          note right
-            **Pattern:**
-            (?:\\*\\*)?(?:QUESTION|Q)(?:\\*\\*)?
-            \\s*:\\s*(.+?)
-            (?=\\s*(?:\\*\\*DECISION|\\*\\*ACTION|VOTE:)|$)
-
-            **Captures:** Question text until next marker
-
-            **Example:**
-            {
-              participant: "Rob",
-              question: "text...",
-              status: "OPEN"
-            }
-          end note
-        endif
-
-        if (**ACTION**: found?) then (yes)
-          :Extract action text;
-          :Search for @mention in text;
-          :Store action record;
-          note right
-            **Pattern:**
-            (?:\\*\\*)?(?:ACTION|TODO)(?:\\*\\*)?
-            \\s*:\\s*(.+?)
-            (?=\\s*(?:\\*\\*DECISION|\\*\\*QUESTION|VOTE:)|$)
-
-            **Captures:** Action text + assignee from @mention
-
-            **Example:**
-            {
-              participant: "Rob",
-              action: "text...",
-              assignee: "Sarah",
-              status: "TODO"
-            }
-          end note
-        endif
-
-        if (Line ends with "?") then (yes)
-          :Auto-detect as question;
-          note right
-            Fallback heuristic:
-            If no explicit marker but
-            line ends with "?",
-            treat as question
-          end note
-        endif
-
-        if (@mention found?) then (yes)
-          :Extract @mentions;
-          :Store in "Awaiting Replies" list;
-        endif
-      }
-
-      if (VOTE: line found?) then (yes)
-        :Extract vote value:\nREADY|CHANGES|REJECT;
-        :Store latest vote per participant;
-      endif
     endif
-
-  repeat while (More lines?) is (yes)
-  -> no;
+  }
 }
 
 partition "Generate Summary Sections" {
-  :Format Decisions section:
-  - Group by participant
-  - Number sequentially
-  - Include rationale if present;
+  :Format Decisions section:\n- Group by participant\n- Number sequentially\n- Include rationale if present;
 
-  :Format Open Questions section:
-  - List unanswered questions
-  - Track by participant
-  - Mark status (OPEN/PARTIAL);
+  :Format Open Questions section:\n- List unanswered questions\n- Track by participant\n- Mark status (OPEN/PARTIAL);
 
-  :Format Action Items section:
-  - Group by status (TODO/ASSIGNED/DONE)
-  - Show assignees
-  - Link to requesters;
+  :Format Action Items section:\n- Group by status (TODO/ASSIGNED/DONE)\n- Show assignees\n- Link to requesters;
 
-  :Format Awaiting Replies section:
-  - Group by @mentioned person
-  - Show context of request
-  - Track unresolved mentions;
+  :Format Awaiting Replies section:\n- Group by @mentioned person\n- Show context of request\n- Track unresolved mentions;
 
-  :Format Votes section:
-  - Count by value (READY/CHANGES/REJECT)
-  - List latest vote per participant
-  - Exclude AI votes if configured;
+  :Format Votes section:\n- Count by value (READY/CHANGES/REJECT)\n- List latest vote per participant\n- Exclude AI votes if configured;
 
-  :Format Timeline section:
-  - Chronological order (newest first)
-  - Include status changes
-  - Summarize key events;
+  :Format Timeline section:\n- Chronological order (newest first)\n- Include status changes\n- Summarize key events;
 }
 
 :Update marker blocks in .sum.md;
@@ -168,73 +96,44 @@ end note
 stop
 
 legend bottom
-  **Example Input (feature.discussion.md):**
+  **Example Input (natural conversation):**
 
-  Rob: The architecture looks solid. **DECISION**: We'll use PostgreSQL
-  for the database. **QUESTION**: Should we use TypeScript or JavaScript?
-  **ACTION**: @Sarah please research auth libraries. Looking forward to
-  feedback. VOTE: CHANGES
+  Rob: I've been thinking about the timeline. I'm making a decision here -
+  we'll build the upload system first. Does anyone know if we need real-time
+  preview? @Sarah can you research Unity Asset Store API? VOTE: READY
 
-  **Extracted Output (.sum.md):**
+  **AI Normalization Output (JSON):**
+  {
+    "votes": [{"participant": "Rob", "vote": "READY"}],
+    "decisions": [{"participant": "Rob",
+                   "decision": "build the upload system first"}],
+    "questions": [{"participant": "Rob",
+                   "question": "if we need real-time preview"}],
+    "action_items": [{"participant": "Rob", "action": "research Unity API",
+                      "assignee": "Sarah"}],
+    "mentions": [{"from": "Rob", "to": "Sarah"}]
+  }
 
-
-  ## Decisions (ADR-style)
-  ### Decision 1: We'll use PostgreSQL for the database.
-  - **Proposed by:** @Rob
-
-
-
-  ## Open Questions
-  - @Rob: Should we use TypeScript or JavaScript?
-
-
-
-  ## Action Items
-  ### TODO (unassigned):
-  - [ ] @Sarah please research auth libraries (suggested by @Rob)
-
-
-
-  ## Awaiting Replies
-  ### @Sarah
-  - @Rob: ... **ACTION**: @Sarah please research auth libraries ...
-
+  **Fallback Only Matches:**
+  DECISION: We'll build upload first
+  QUESTION: Do we need real-time preview?
+  ACTION: @Sarah research Unity API
 endlegend
 
 note right
-  **Regex Pattern Details:**
+  **Architecture Benefits:**
 
-  **Decision Pattern:**
-  (?:\\*\\*)?DECISION(?:\\*\\*)?\\s*:\\s*(.+?)
-  (?=\\s*(?:\\*\\*QUESTION|\\*\\*ACTION|VOTE:)|$)
+  ✓ Participants write naturally
+  ✓ No strict formatting rules
+  ✓ AI handles understanding
+  ✓ Simple code for fallback
+  ✓ Resilient (multi-provider chain)
+  ✓ Cost-effective (fast models)
 
-  **Features:**
-  - Case-insensitive
-  - Optional markdown bold (**) on both sides
-  - Captures text until next marker or VOTE:
-  - DOTALL mode for multi-line capture
-
-  **Supported Formats:**
-  - DECISION: text
-  - **DECISION**: text
-  - decision: text
-  - **decision**: text
-end note
-
-note right
-  **Why Regex Instead of Line-Start Matching?**
-
-  ✗ Old approach: `if line.startswith("decision:"):`
-  Problem: Markers embedded mid-sentence fail
-
-  ✓ New approach: Regex search anywhere in line
-  Handles: "Good point. **DECISION**: We'll use X."
-
-  **Benefits:**
-  - Natural conversational style
-  - Markdown formatting preserved
-  - Multiple markers per comment
-  - Robust extraction
+  **Files:**
+  - automation/agents.py (AI normalization)
+  - automation/workflow.py (fallback + orchestration)
+  - automation/patcher.py (provider chain execution)
 end note
 
 @enduml
diff --git a/docs/workflow-marker-extraction.svg b/docs/workflow-marker-extraction.svg
new file mode 100644
index 0000000..e7fd675
--- /dev/null
+++ b/docs/workflow-marker-extraction.svg
@@ -0,0 +1,151 @@
+[From workflow-marker-extraction.puml (line 2) ]@startuml workflow-marker-extraction!theme plainSyntax Error?
\ No newline at end of file
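
For readers of this patch, the two-tier flow it documents (simple line-start fallback first, AI results overriding it when available) can be sketched roughly as below. This is a minimal illustration, not the code in `automation/workflow.py` or `automation/agents.py`: the function names `extract_basic` and `extract`, the mapping of markers to result keys, and the `normalize` callable's signature are all assumptions made for the example, and the `- Name:` participant prefix seen in the discussion samples is not handled.

```python
from typing import Callable, Optional

# Marker prefixes are the ones listed in the patch; which result key each
# one feeds is an assumption made for this sketch.
MARKERS = {
    "DECISION:": "decisions",
    "QUESTION:": "questions",
    "Q:": "questions",
    "ACTION:": "action_items",
    "TODO:": "action_items",
    "ASSIGNED:": "action_items",
    "DONE:": "action_items",
}


def extract_basic(text: str) -> dict:
    """Hypothetical fallback tier: case-insensitive line-start matching."""
    result = {"decisions": [], "questions": [], "action_items": []}
    for raw in text.splitlines():
        line = raw.strip()
        for prefix, bucket in MARKERS.items():
            # Compare only the leading characters, ignoring case.
            if line[: len(prefix)].upper() == prefix:
                result[bucket].append(line[len(prefix):].strip())
                break
    return result


def extract(
    text: str,
    normalize: Optional[Callable[[str], Optional[dict]]] = None,
) -> dict:
    """Run the fallback, then let AI results override it when they exist.

    `normalize` is a hypothetical stand-in for the AI tier; it may raise or
    return None, and either outcome leaves the fallback results in place.
    """
    result = extract_basic(text)
    if normalize is None:
        return result
    try:
        ai_result = normalize(text)
    except Exception:
        ai_result = None  # provider failure: keep the fallback results
    if isinstance(ai_result, dict):
        result.update(ai_result)  # AI results take precedence per key
    return result
```

Running the fallback unconditionally and treating the AI tier as an override keeps the failure path trivial: any exception or unusable response simply leaves the fallback results untouched.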