docs: complete update for AI normalization architecture

Updated all documentation to reflect the new two-tier extraction system:

**workflow-marker-extraction.puml:**
- Completely rewritten to show AI normalization flow
- Documents agents.normalize_discussion() as primary method
- Shows simple line-start fallback for explicit markers
- Includes natural conversation examples vs. explicit markers
- Demonstrates resilience and cost-effectiveness

**AUTOMATION.md:**
- Restructured "Conversation Guidelines" section
- Emphasizes natural conversation as recommended approach
- Clarifies AI normalization extracts from conversational text
- Documents explicit markers as fallback when AI unavailable
- Explains two-tier architecture benefits

**diagrams-README.md:**
- Already updated in previous commit

All documentation now accurately reflects:
- AI-powered extraction (agents.py) for natural conversation
- Simple fallback parsing (workflow.py) for explicit markers
- Multi-provider resilience (claude → codex → gemini)
- No strict formatting requirements for participants

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
rob 2025-11-02 20:04:09 -04:00
parent 380c7b5d12
commit 33b550ad5b
3 changed files with 293 additions and 211 deletions


@@ -265,41 +265,73 @@ Captures architectural decisions with rationale.
<!-- SUMMARY:DECISIONS END -->
```
## Conversation Guidelines (Optional)
## Conversation Guidelines
Using these markers helps extract information accurately. **Many work without AI using regex:**
### Natural Conversation (Recommended)
**Write naturally - AI normalization extracts markers automatically:**
```markdown
# Markers (✅ = works without AI)
# Examples of natural conversation that AI understands:
Q: <question> # ✅ Mark questions explicitly (also: "Question:", or ending with ?)
A: <answer> # Mark answers explicitly (AI tracks these)
Re: <response> # Partial answers or follow-ups (AI tracks these)
- Alice: I think we should use OAuth2. Does anyone know if we need OAuth 2.1 specifically?
VOTE: READY
- Bob: Good question Alice. I'm making a decision here - we'll use OAuth 2.0 for now.
@Carol can you research migration paths to 2.1? VOTE: CHANGES
- Carol: I've completed the OAuth research. We can upgrade later without breaking changes.
VOTE: READY
```
**AI normalization (via `agents.py`) extracts:**
- Decisions from natural language ("I'm making a decision here - ...")
- Questions from conversational text ("Does anyone know if...")
- Action items with @mentions ("@Carol can you research...")
- Votes (always tracked: `VOTE: READY|CHANGES|REJECT`)
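The list above names the categories the AI step returns. As a rough illustration only, the sketch below checks a model reply against the five top-level keys mentioned in the prompt (`votes`, `questions`, `decisions`, `action_items`, `mentions`). The helper name `parse_normalized_reply` and the per-item field names are assumptions for this example, not the actual `agents.py` API.

```python
import json
from typing import Optional

# Top-level keys named in the AI prompt; everything else here is illustrative.
EXPECTED_KEYS = ("votes", "questions", "decisions", "action_items", "mentions")
VALID_VOTES = {"READY", "CHANGES", "REJECT"}


def parse_normalized_reply(raw: str) -> Optional[dict]:
    """Hypothetical helper: return a dict with every expected key, or None if unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # caller should keep the fallback results
    if not isinstance(data, dict):
        return None
    result = {}
    for key in EXPECTED_KEYS:
        value = data.get(key, [])  # a missing key becomes an empty list
        if not isinstance(value, list):
            return None
        result[key] = value
    # Keep only votes whose value is one of READY, CHANGES, REJECT.
    result["votes"] = [
        v for v in result["votes"]
        if isinstance(v, dict) and str(v.get("vote", "")).upper() in VALID_VOTES
    ]
    return result


reply = (
    '{"votes": [{"participant": "Bob", "vote": "CHANGES"}],'
    ' "decisions": [{"participant": "Bob", "decision": "use OAuth 2.0 for now"}],'
    ' "action_items": [{"participant": "Bob", "action": "research migration paths to 2.1",'
    ' "assignee": "Carol"}]}'
)
parsed = parse_normalized_reply(reply)
```

Returning `None` for an unusable reply gives the caller a single, simple signal to keep whatever the explicit-marker pass already found.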
### Explicit Markers (Fallback)
**If AI is unavailable, these explicit line-start markers work as a fallback:**
```markdown
# Markers (✅ = works without AI as simple fallback)
QUESTION: <question> # ✅ Explicit question marker
Q: <question> # ✅ Short form
TODO: <action> # ✅ New unassigned task
ACTION: <action> # ✅ Task with implied ownership (alias for TODO)
ASSIGNED: <task> @name # ✅ Claimed task (extracts @mention as assignee)
ACTION: <action> # ✅ Task with implied ownership
ASSIGNED: <task> @name # ✅ Claimed task
DONE: <completion> # ✅ Mark task complete
DECISION: <choice> # ✅ Architectural decision (AI adds rationale/alternatives)
Rationale: <why> # Explain reasoning (AI extracts this)
DECISION: <choice> # ✅ Architectural decision
VOTE: READY|CHANGES|REJECT # ✅ REQUIRED for voting (always tracked)
VOTE: READY|CHANGES|REJECT # ✅ ALWAYS tracked (with or without AI)
@Name # ✅ Mention someone specifically
@all # ✅ Mention everyone
@Name # ✅ Mention extraction (simple regex)
```
**Example Workflow:**
**Example with explicit markers:**
```markdown
- Alice: Q: Should we support OAuth2?
- Alice: QUESTION: Should we support OAuth2?
- Bob: TODO: Research OAuth2 libraries
- Bob: ASSIGNED: OAuth2 library research (@Bob taking ownership)
- Carol: DECISION: Use OAuth2 for authentication. Rationale: Industry standard with good library support.
- Carol: DONE: Completed OAuth2 comparison document
- Dave: @all Please review the comparison by Friday. VOTE: READY
- Bob: ASSIGNED: OAuth2 library research
- Carol: DECISION: Use OAuth2 for authentication
- Dave: @all Please review. VOTE: READY
```
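A minimal sketch of how line-start matching could work for the explicit-marker example above, assuming every comment has the form `- Name: text`. The function name `extract_markers_sketch`, the result layout, and the regexes are assumptions for illustration; the project's own `extract_structured_basic()` in `workflow.py` may differ.

```python
import re

# Marker prefix -> result bucket (bucket names are an assumption of this sketch).
MARKERS = {
    "DECISION:": "decisions",
    "QUESTION:": "questions",
    "Q:": "questions",
    "TODO:": "action_items",
    "ACTION:": "action_items",
    "ASSIGNED:": "action_items",
    "DONE:": "action_items",
}
COMMENT_RE = re.compile(r"^-?\s*([\w-]+):\s*(.*)$")  # "- Name: text"
VOTE_RE = re.compile(r"VOTE:\s*(READY|CHANGES|REJECT)\b", re.IGNORECASE)
MENTION_RE = re.compile(r"@(\w+)")


def extract_markers_sketch(text):
    """Collect explicit markers, @mentions, and the latest vote per participant."""
    result = {"decisions": [], "questions": [], "action_items": [],
              "mentions": [], "votes": {}}
    for line in text.splitlines():
        match = COMMENT_RE.match(line.strip())
        if not match:
            continue  # not a participant comment
        participant, body = match.groups()
        upper = body.upper()
        for prefix, bucket in MARKERS.items():
            if upper.startswith(prefix):  # case-insensitive line-start match
                result[bucket].append({
                    "participant": participant,
                    "marker": prefix[:-1],
                    "text": body[len(prefix):].strip(),
                })
                break
        for name in MENTION_RE.findall(body):
            result["mentions"].append({"from": participant, "to": name})
        vote = VOTE_RE.search(body)
        if vote:
            result["votes"][participant] = vote.group(1).upper()  # latest vote wins
    return result


sample = """\
- Alice: QUESTION: Should we support OAuth2?
- Bob: TODO: Research OAuth2 libraries
- Bob: ASSIGNED: OAuth2 library research
- Carol: DECISION: Use OAuth2 for authentication
- Dave: @all Please review. VOTE: READY
"""
found = extract_markers_sketch(sample)
```

Because only the start of each comment body is checked, a marker embedded mid-sentence is ignored here, which is the gap the AI normalization pass is meant to cover.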
### Two-Tier Architecture
1. **AI Normalization (Primary):** Handles natural conversation, embedded markers, context understanding
2. **Simple Fallback:** Handles explicit line-start markers when AI is unavailable
Benefits:
- ✅ Participants write naturally without strict formatting
- ✅ Resilient (multi-provider fallback: claude → codex → gemini)
- ✅ Works offline/API-down with explicit markers
- ✅ Cost-effective (uses fast models for extraction)
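The two tiers and the provider chain can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: providers are modelled as plain callables that take the discussion text and return a JSON string, and `normalize_with_fallback` is a hypothetical name.

```python
import json
from typing import Callable, Sequence


def normalize_with_fallback(
    text: str,
    providers: Sequence[Callable[[str], str]],
    fallback: Callable[[str], dict],
) -> dict:
    """Run the simple parser first, then let the first usable AI reply override it."""
    result = fallback(text)  # tier 2 always runs, so there is always a result
    for provider in providers:  # e.g. claude, then codex, then gemini
        try:
            data = json.loads(provider(text))
        except Exception:
            continue  # provider failed or returned invalid JSON: try the next one
        if isinstance(data, dict):
            result.update(data)  # AI results override fallback results
            return result
    return result  # every provider failed: fallback results only


# Stub providers standing in for real CLI or API calls (assumptions for the demo).
def failing(_text):
    raise RuntimeError("provider down")


def garbled(_text):
    return "Sorry, I cannot help with that."


def working(_text):
    return '{"decisions": [{"participant": "Rob", "decision": "build the upload system first"}]}'


def basic(_text):
    return {"decisions": [], "questions": [], "action_items": [], "mentions": []}


merged = normalize_with_fallback("...", [failing, garbled, working], basic)
offline = normalize_with_fallback("...", [failing, garbled], basic)
```

Running the cheap parser before any AI call means an outage or a malformed reply degrades extraction quality rather than breaking the summary update.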
## Implementation Details
### Incremental Processing


@@ -1,6 +1,6 @@
@startuml workflow-marker-extraction
!theme plain
title Workflow Marker Extraction with Regex Pattern Matching
title Workflow Marker Extraction with AI Normalization
start
@@ -8,152 +8,80 @@ start
:workflow.py reads file content;
partition "Parse Comments" {
:Split file into lines;
repeat
:Read next line;
if (Line is HTML comment?) then (yes)
:Skip (metadata);
else if (Line is heading?) then (yes)
:Skip (structure);
else (participant comment)
:Extract participant name\n(before first ":");
partition "Two-Tier Extraction" {
:Call extract_structured_basic()\nSimple fallback parsing;
note right
**Participant Format:**
- Rob: Comment text...
- Sarah: Comment text...
- AI_Claude: Comment text...
**Fallback: Simple Line-Start Matching**
Only matches explicit markers at line start:
- DECISION: text
- QUESTION: text
- Q: text
- ACTION: text
- TODO: text
- ASSIGNED: text
- DONE: text
Names starting with "AI_"
are excluded from voting if
allow_agent_votes: false
Uses case-insensitive startswith() matching.
Handles strictly-formatted discussions.
end note
partition "Extract Structured Markers" {
:Apply regex patterns\nto comment text;
:Store fallback results\n(decisions, questions, actions, mentions);
if (**DECISION**: found?) then (yes)
:Extract decision text;
:Store decision record;
:Call agents.normalize_discussion()\nAI-powered extraction;
partition "AI Normalization (agents.py)" {
:Build prompt for AI model;
note right
**Pattern:**
(?:\\*\\*)?DECISION(?:\\*\\*)?
\\s*:\\s*(.+?)
(?=\\s*(?:\\*\\*QUESTION|\\*\\*ACTION|VOTE:)|$)
**AI Prompt:**
"Extract structured information from discussion.
Return JSON with: votes, questions, decisions,
action_items, mentions"
**Captures:** Decision text until next marker
**Example:**
{
participant: "Rob",
decision: "text...",
rationale: "",
supporters: []
}
Supports natural conversation like:
"I'm making a decision here - we'll use X"
"Does anyone know if we need Y?"
"@Sarah can you check Z?"
end note
endif
if (**QUESTION**: found?) then (yes)
:Extract question text;
:Store question record;
:Execute command chain\n(claude → codex → gemini);
if (AI returned valid JSON?) then (yes)
:Parse JSON response;
:Extract structured data:\n- votes\n- questions\n- decisions\n- action_items\n- mentions;
:Override fallback results\nwith AI results;
note right
**Pattern:**
(?:\\*\\*)?(?:QUESTION|Q)(?:\\*\\*)?
\\s*:\\s*(.+?)
(?=\\s*(?:\\*\\*DECISION|\\*\\*ACTION|VOTE:)|$)
**Captures:** Question text until next marker
**Example:**
{
participant: "Rob",
question: "text...",
status: "OPEN"
}
**AI advantages:**
- Handles embedded markers
- Understands context
- Extracts from natural language
- No strict formatting required
end note
endif
if (**ACTION**: found?) then (yes)
:Extract action text;
:Search for @mention in text;
:Store action record;
else (no - AI failed or unavailable)
:Use fallback results only;
note right
**Pattern:**
(?:\\*\\*)?(?:ACTION|TODO)(?:\\*\\*)?
\\s*:\\s*(.+?)
(?=\\s*(?:\\*\\*DECISION|\\*\\*QUESTION|VOTE:)|$)
**Captures:** Action text + assignee from @mention
**Example:**
{
participant: "Rob",
action: "text...",
assignee: "Sarah",
status: "TODO"
}
**Fallback activated when:**
- All providers fail
- Invalid JSON response
- agents.py import fails
- API rate limits hit
end note
endif
if (Line ends with "?") then (yes)
:Auto-detect as question;
note right
Fallback heuristic:
If no explicit marker but
line ends with "?",
treat as question
end note
endif
if (@mention found?) then (yes)
:Extract @mentions;
:Store in "Awaiting Replies" list;
endif
}
if (VOTE: line found?) then (yes)
:Extract vote value:\nREADY|CHANGES|REJECT;
:Store latest vote per participant;
endif
endif
repeat while (More lines?) is (yes)
-> no;
}
partition "Generate Summary Sections" {
:Format Decisions section:
- Group by participant
- Number sequentially
- Include rationale if present;
:Format Decisions section:\n- Group by participant\n- Number sequentially\n- Include rationale if present;
:Format Open Questions section:
- List unanswered questions
- Track by participant
- Mark status (OPEN/PARTIAL);
:Format Open Questions section:\n- List unanswered questions\n- Track by participant\n- Mark status (OPEN/PARTIAL);
:Format Action Items section:
- Group by status (TODO/ASSIGNED/DONE)
- Show assignees
- Link to requesters;
:Format Action Items section:\n- Group by status (TODO/ASSIGNED/DONE)\n- Show assignees\n- Link to requesters;
:Format Awaiting Replies section:
- Group by @mentioned person
- Show context of request
- Track unresolved mentions;
:Format Awaiting Replies section:\n- Group by @mentioned person\n- Show context of request\n- Track unresolved mentions;
:Format Votes section:
- Count by value (READY/CHANGES/REJECT)
- List latest vote per participant
- Exclude AI votes if configured;
:Format Votes section:\n- Count by value (READY/CHANGES/REJECT)\n- List latest vote per participant\n- Exclude AI votes if configured;
:Format Timeline section:
- Chronological order (newest first)
- Include status changes
- Summarize key events;
:Format Timeline section:\n- Chronological order (newest first)\n- Include status changes\n- Summarize key events;
}
:Update marker blocks in .sum.md;
@@ -168,73 +96,44 @@ end note
stop
legend bottom
**Example Input (feature.discussion.md):**
**Example Input (natural conversation):**
Rob: The architecture looks solid. **DECISION**: We'll use PostgreSQL
for the database. **QUESTION**: Should we use TypeScript or JavaScript?
**ACTION**: @Sarah please research auth libraries. Looking forward to
feedback. VOTE: CHANGES
Rob: I've been thinking about the timeline. I'm making a decision here -
we'll build the upload system first. Does anyone know if we need real-time
preview? @Sarah can you research Unity Asset Store API? VOTE: READY
**Extracted Output (.sum.md):**
**AI Normalization Output (JSON):**
{
"votes": [{"participant": "Rob", "vote": "READY"}],
"decisions": [{"participant": "Rob",
"decision": "build the upload system first"}],
"questions": [{"participant": "Rob",
"question": "if we need real-time preview"}],
"action_items": [{"participant": "Rob", "action": "research Unity API",
"assignee": "Sarah"}],
"mentions": [{"from": "Rob", "to": "Sarah"}]
}
<!-- SUMMARY:DECISIONS START -->
## Decisions (ADR-style)
### Decision 1: We'll use PostgreSQL for the database.
- **Proposed by:** @Rob
<!-- SUMMARY:DECISIONS END -->
<!-- SUMMARY:OPEN_QUESTIONS START -->
## Open Questions
- @Rob: Should we use TypeScript or JavaScript?
<!-- SUMMARY:OPEN_QUESTIONS END -->
<!-- SUMMARY:ACTION_ITEMS START -->
## Action Items
### TODO (unassigned):
- [ ] @Sarah please research auth libraries (suggested by @Rob)
<!-- SUMMARY:ACTION_ITEMS END -->
<!-- SUMMARY:AWAITING START -->
## Awaiting Replies
### @Sarah
- @Rob: ... **ACTION**: @Sarah please research auth libraries ...
<!-- SUMMARY:AWAITING END -->
**Fallback Only Matches:**
DECISION: We'll build upload first
QUESTION: Do we need real-time preview?
ACTION: @Sarah research Unity API
endlegend
note right
**Regex Pattern Details:**
**Architecture Benefits:**
**Decision Pattern:**
(?:\\*\\*)?DECISION(?:\\*\\*)?\\s*:\\s*(.+?)
(?=\\s*(?:\\*\\*QUESTION|\\*\\*ACTION|VOTE:)|$)
✓ Participants write naturally
✓ No strict formatting rules
✓ AI handles understanding
✓ Simple code for fallback
✓ Resilient (multi-provider chain)
✓ Cost-effective (fast models)
**Features:**
- Case-insensitive
- Optional markdown bold (**) on both sides
- Captures text until next marker or VOTE:
- DOTALL mode for multi-line capture
**Supported Formats:**
- DECISION: text
- **DECISION**: text
- decision: text
- **decision**: text
end note
note right
**Why Regex Instead of Line-Start Matching?**
✗ Old approach: `if line.startswith("decision:"):`
Problem: Markers embedded mid-sentence fail
✓ New approach: Regex search anywhere in line
Handles: "Good point. **DECISION**: We'll use X."
**Benefits:**
- Natural conversational style
- Markdown formatting preserved
- Multiple markers per comment
- Robust extraction
**Files:**
- automation/agents.py (AI normalization)
- automation/workflow.py (fallback + orchestration)
- automation/patcher.py (provider chain execution)
end note
@enduml


@@ -0,0 +1,151 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" contentScriptType="application/ecmascript" contentStyleType="text/css" height="103px" preserveAspectRatio="none" style="width:352px;height:103px;background:#000000;" version="1.1" viewBox="0 0 352 103" width="352px" zoomAndPan="magnify"><defs/><g><rect fill="#11060A" height="1" style="stroke: #11060A; stroke-width: 1.0;" width="1" x="0" y="0"/><rect fill="#33FF02" height="24.0679" style="stroke: #33FF02; stroke-width: 1.0;" width="346" x="5" y="5"/><text fill="#000000" font-family="sans-serif" font-size="14" font-weight="bold" lengthAdjust="spacingAndGlyphs" textLength="344" x="6" y="20">[From workflow-marker-extraction.puml (line 2) ]</text><text fill="#33FF02" font-family="sans-serif" font-size="14" font-weight="bold" lengthAdjust="spacingAndGlyphs" textLength="0" x="9" y="43.0679"/><text fill="#33FF02" font-family="sans-serif" font-size="14" font-weight="bold" lengthAdjust="spacingAndGlyphs" textLength="275" x="5" y="62.1358">@startuml workflow-marker-extraction</text><text fill="#33FF02" font-family="sans-serif" font-size="14" font-weight="bold" lengthAdjust="spacingAndGlyphs" textLength="87" x="5" y="81.2038">!theme plain</text><text fill="#FF0000" font-family="sans-serif" font-size="14" font-weight="bold" lengthAdjust="spacingAndGlyphs" textLength="93" x="9" y="100.2717">Syntax Error?</text><!--MD5=[32d7802434cc4c797d2bc79c191390cf]
@startuml workflow-marker-extraction
!theme plain
title Workflow Marker Extraction with AI Normalization
start
:Discussion file staged\n(feature.discussion.md,\ndesign.discussion.md, etc);
:workflow.py reads file content;
partition "Two-Tier Extraction" {
:Call extract_structured_basic()\nSimple fallback parsing;
note right
**Fallback: Simple Line-Start Matching**
Only matches explicit markers at line start:
- DECISION: text
- QUESTION: text
- Q: text
- ACTION: text
- TODO: text
- ASSIGNED: text
- DONE: text
Uses case-insensitive startswith() matching.
Handles strictly-formatted discussions.
end note
:Store fallback results\n(decisions, questions, actions, mentions);
:Call agents.normalize_discussion()\nAI-powered extraction;
partition "AI Normalization (agents.py)" {
:Build prompt for AI model;
note right
**AI Prompt:**
"Extract structured information from discussion.
Return JSON with: votes, questions, decisions,
action_items, mentions"
Supports natural conversation like:
"I'm making a decision here - we'll use X"
"Does anyone know if we need Y?"
"@Sarah can you check Z?"
end note
:Execute command chain\n(claude → codex → gemini);
if (AI returned valid JSON?) then (yes)
:Parse JSON response;
:Extract structured data:\n- votes\n- questions\n- decisions\n- action_items\n- mentions;
:Override fallback results\nwith AI results;
note right
**AI advantages:**
- Handles embedded markers
- Understands context
- Extracts from natural language
- No strict formatting required
end note
else (no - AI failed or unavailable)
:Use fallback results only;
note right
**Fallback activated when:**
- All providers fail
- Invalid JSON response
- agents.py import fails
- API rate limits hit
end note
endif
}
}
partition "Generate Summary Sections" {
:Format Decisions section:\n- Group by participant\n- Number sequentially\n- Include rationale if present;
:Format Open Questions section:\n- List unanswered questions\n- Track by participant\n- Mark status (OPEN/PARTIAL);
:Format Action Items section:\n- Group by status (TODO/ASSIGNED/DONE)\n- Show assignees\n- Link to requesters;
:Format Awaiting Replies section:\n- Group by @mentioned person\n- Show context of request\n- Track unresolved mentions;
:Format Votes section:\n- Count by value (READY/CHANGES/REJECT)\n- List latest vote per participant\n- Exclude AI votes if configured;
:Format Timeline section:\n- Chronological order (newest first)\n- Include status changes\n- Summarize key events;
}
:Update marker blocks in .sum.md;
note right
<!- - SUMMARY:DECISIONS START - ->
...
<!- - SUMMARY:DECISIONS END - ->
end note
:Stage updated .sum.md file;
stop
legend bottom
**Example Input (natural conversation):**
Rob: I've been thinking about the timeline. I'm making a decision here -
we'll build the upload system first. Does anyone know if we need real-time
preview? @Sarah can you research Unity Asset Store API? VOTE: READY
**AI Normalization Output (JSON):**
{
"votes": [{"participant": "Rob", "vote": "READY"}],
"decisions": [{"participant": "Rob",
"decision": "build the upload system first"}],
"questions": [{"participant": "Rob",
"question": "if we need real-time preview"}],
"action_items": [{"participant": "Rob", "action": "research Unity API",
"assignee": "Sarah"}],
"mentions": [{"from": "Rob", "to": "Sarah"}]
}
**Fallback Only Matches:**
DECISION: We'll build upload first
QUESTION: Do we need real-time preview?
ACTION: @Sarah research Unity API
endlegend
note right
**Architecture Benefits:**
✓ Participants write naturally
✓ No strict formatting rules
✓ AI handles understanding
✓ Simple code for fallback
✓ Resilient (multi-provider chain)
✓ Cost-effective (fast models)
**Files:**
- automation/agents.py (AI normalization)
- automation/workflow.py (fallback + orchestration)
- automation/patcher.py (provider chain execution)
end note
@enduml
PlantUML version 1.2020.02(Sun Mar 01 06:22:07 AST 2020)
(GPL source distribution)
Java Runtime: OpenJDK Runtime Environment
JVM: OpenJDK 64-Bit Server VM
Java Version: 21.0.8+9-Ubuntu-0ubuntu124.04.1
Operating System: Linux
Default Encoding: UTF-8
Language: en
Country: CA
--></g></svg>
