CascadingDev/docs/workflow-marker-extraction....

@startuml workflow-marker-extraction
!theme plain
title Workflow Marker Extraction with AI Normalization

start

:Discussion file staged\n(feature.discussion.md,\ndesign.discussion.md, etc.);

:workflow.py reads file content;

partition "Two-Tier Extraction" {
  :Call extract_structured_basic()\nSimple fallback parsing;

  note right
    **Fallback: Simple Line-Start Matching**
    Only matches explicit markers at line start:
    - DECISION: text
    - QUESTION: text
    - Q: text
    - ACTION: text
    - TODO: text
    - ASSIGNED: text
    - DONE: text

    Uses case-insensitive startswith() matching.
    Handles only strictly formatted discussions.
  end note

  :Store fallback results\n(decisions, questions, actions, mentions);

  :Call agents.normalize_discussion()\nAI-powered extraction;

  partition "AI Normalization (agents.py)" {
    :Build prompt for AI model;
    note right
      **AI Prompt:**
      "Extract structured information from discussion.
      Return JSON with: votes, questions, decisions,
      action_items, mentions"

      Supports natural conversation like:
      "I'm making a decision here - we'll use X"
      "Does anyone know if we need Y?"
      "@Sarah can you check Z?"
    end note

    :Execute command chain\n(claude → codex → gemini);

    if (AI returned valid JSON?) then (yes)
      :Parse JSON response;
      :Extract structured data:\n- votes\n- questions\n- decisions\n- action_items\n- mentions;
      :Override fallback results\nwith AI results;
      note right
        **AI advantages:**
        - Handles embedded markers
        - Understands context
        - Extracts from natural language
        - No strict formatting required
      end note
    else (no - AI failed or unavailable)
      :Use fallback results only;
      note right
        **Fallback activated when:**
        - All providers fail
        - Invalid JSON response
        - agents.py import fails
        - API rate limits hit
      end note
    endif
  }
}

partition "Generate Summary Sections" {
  :Format Decisions section:\n- Group by participant\n- Number sequentially\n- Include rationale if present;

  :Format Open Questions section:\n- List unanswered questions\n- Track by participant\n- Mark status (OPEN/PARTIAL);

  :Format Action Items section:\n- Group by status (TODO/ASSIGNED/DONE)\n- Show assignees\n- Link to requesters;

  :Format Awaiting Replies section:\n- Group by @mentioned person\n- Show context of request\n- Track unresolved mentions;

  :Format Votes section:\n- Count by value (READY/CHANGES/REJECT)\n- List latest vote per participant\n- Exclude AI votes if configured;

  :Format Timeline section:\n- Reverse chronological order (newest first)\n- Include status changes\n- Summarize key events;
}

:Update marker blocks in .sum.md;
note right
  <!-- SUMMARY:DECISIONS START -->
  ...
  <!-- SUMMARY:DECISIONS END -->
end note

:Stage updated .sum.md file;

stop

legend bottom
  **Example Input (natural conversation):**

  Name: Rob
  I've been thinking about the timeline. I'm making a decision here -
  we'll build the upload system first. Does anyone know if we need real-time
  preview? @Sarah can you research Unity Asset Store API?
  VOTE: READY

  **AI Normalization Output (JSON):**
  {
    "votes": [{"participant": "Rob", "vote": "READY"}],
    "decisions": [{"participant": "Rob",
                   "decision": "build the upload system first"}],
    "questions": [{"participant": "Rob",
                   "question": "Do we need real-time preview?"}],
    "action_items": [{"participant": "Rob",
                      "action": "research Unity Asset Store API",
                      "assignee": "Sarah"}],
    "mentions": [{"from": "Rob", "to": "Sarah"}]
  }

  **Fallback only matches explicitly marked lines such as:**
  DECISION: We'll build upload first
  QUESTION: Do we need real-time preview?
  ACTION: @Sarah research Unity Asset Store API
endlegend

note right
  **Architecture Benefits:**

  ✓ Participants write naturally
  ✓ No strict formatting rules
  ✓ AI handles understanding
  ✓ Simple code for fallback
  ✓ Resilient (multi-provider chain)
  ✓ Cost-effective (fast models)

  **Files:**
  - automation/agents.py (AI normalization)
  - automation/workflow.py (fallback + orchestration)
  - automation/patcher.py (provider chain execution)
end note

@enduml
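
The two-tier flow in the diagram can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the code in `automation/workflow.py` or `automation/agents.py`: the names `extract_basic`, `parse_ai_response`, `extract`, and `update_marker_block` are hypothetical, the mapping of markers to result buckets is a guess, mention extraction is left out, and `ai_call` stands in for whatever runs the provider chain.

```python
import json
import re

# Marker prefixes come from the fallback note in the diagram; the bucket each
# one maps to is an assumption made for this sketch.
MARKERS = {
    "DECISION:": "decisions",
    "QUESTION:": "questions",
    "Q:": "questions",
    "ACTION:": "action_items",
    "TODO:": "action_items",
    "ASSIGNED:": "action_items",
    "DONE:": "action_items",
}

# Keys the AI prompt in the diagram asks for.
REQUIRED_KEYS = ("votes", "questions", "decisions", "action_items", "mentions")


def extract_basic(text):
    """Tier 1: case-insensitive startswith() matching of explicit markers."""
    found = {"decisions": [], "questions": [], "action_items": []}
    for line in text.splitlines():
        stripped = line.strip()  # assumption: surrounding whitespace is ignored
        for prefix, bucket in MARKERS.items():
            if stripped.upper().startswith(prefix):
                found[bucket].append(stripped[len(prefix):].strip())
                break
    return found


def parse_ai_response(raw):
    """Return the parsed dict if raw is JSON with a list under every expected key, else None."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):  # JSONDecodeError is a ValueError
        return None
    if not isinstance(data, dict):
        return None
    if not all(isinstance(data.get(key), list) for key in REQUIRED_KEYS):
        return None
    return data


def extract(text, ai_call=None):
    """Tier 2 overrides tier 1 only when the AI call yields usable JSON."""
    result = extract_basic(text)
    if ai_call is None:
        return result
    try:
        parsed = parse_ai_response(ai_call(text))
    except Exception:
        parsed = None  # provider failure: keep the fallback results
    if parsed is not None:
        result.update(parsed)
    return result


def update_marker_block(summary, name, body):
    """Replace the text between the SUMMARY:<name> START/END comments."""
    start = f"<!-- SUMMARY:{name} START -->"
    end = f"<!-- SUMMARY:{name} END -->"
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    # A function replacement avoids backslash interpretation in body.
    return pattern.sub(lambda m: f"{start}\n{body}\n{end}", summary, count=1)
```

Passing `ai_call` as a parameter keeps the fallback path testable without any provider installed: a callable that raises, or one that returns non-JSON text, leaves the line-start results untouched, which mirrors the "Use fallback results only" branch.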