CascadingDev/docs/workflow-marker-extraction....

@startuml workflow-marker-extraction
!theme plain
title Workflow Marker Extraction with Regex Pattern Matching

start

:Discussion file staged\n(feature.discussion.md,\ndesign.discussion.md, etc.);

:workflow.py reads file content;

partition "Parse Comments" {
  :Split file into lines;

  repeat
    :Read next line;

    if (Line is HTML comment?) then (yes)
      :Skip (metadata);
    elseif (Line is heading?) then (yes)
      :Skip (structure);
    else (participant comment)
      :Extract participant name\n(before first ":");

      note right
        **Participant Format:**
        - Rob: Comment text...
        - Sarah: Comment text...
        - AI_Claude: Comment text...

        Names starting with "AI_"
        are excluded from voting if
        allow_agent_votes: false
      end note

      partition "Extract Structured Markers" {
        :Apply regex patterns\nto comment text;

        if (**DECISION**: found?) then (yes)
          :Pattern: (?:\\*\\*)?DECISION(?:\\*\\*)?\n\\s*:\\s*(.+?)(?=\\*\\*|VOTE:|$);
          :Extract decision text;
          :Store: {
          participant: "Rob",
          decision: "text...",
          rationale: "",
          supporters: []
          };
        endif

        if (**QUESTION**: found?) then (yes)
          :Pattern: (?:\\*\\*)?(?:QUESTION|Q)(?:\\*\\*)?\n\\s*:\\s*(.+?)(?=\\*\\*|VOTE:|$);
          :Extract question text;
          :Store: {
          participant: "Rob",
          question: "text...",
          status: "OPEN"
          };
        endif

        if (**ACTION**: found?) then (yes)
          :Pattern: (?:\\*\\*)?(?:ACTION|TODO)(?:\\*\\*)?\n\\s*:\\s*(.+?)(?=\\*\\*|VOTE:|$);
          :Extract action text;
          :Search for @mention in text;
          :Store: {
          participant: "Rob",
          action: "text...",
          assignee: "Sarah",
          status: "TODO"
          };
        endif

        if (No explicit marker and\nline ends with "?") then (yes)
          :Auto-detect as question;
          note right
            Fallback heuristic:
            If no explicit marker but
            line ends with "?",
            treat as question
          end note
        endif

        if (@mention found?) then (yes)
          :Extract @mentions;
          :Store in "Awaiting Replies" list;
        endif
      }

      if (VOTE: line found?) then (yes)
        :Extract vote value:\nREADY|CHANGES|REJECT;
        :Store latest vote per participant;
      endif
    endif

  repeat while (More lines?) is (yes)
  -> no;
}

partition "Generate Summary Sections" {
  :Format Decisions section:
  - Group by participant
  - Number sequentially
  - Include rationale if present;

  :Format Open Questions section:
  - List unanswered questions
  - Track by participant
  - Mark status (OPEN/PARTIAL);

  :Format Action Items section:
  - Group by status (TODO/ASSIGNED/DONE)
  - Show assignees
  - Link to requesters;

  :Format Awaiting Replies section:
  - Group by @mentioned person
  - Show context of request
  - Track unresolved mentions;

  :Format Votes section:
  - Count by value (READY/CHANGES/REJECT)
  - List latest vote per participant
  - Exclude AI votes if configured;

  :Format Timeline section:
  - Reverse chronological order (newest first)
  - Include status changes
  - Summarize key events;
}

:Update marker blocks in .sum.md:
<!-- SUMMARY:DECISIONS START -->
...
<!-- SUMMARY:DECISIONS END -->;

:Stage updated .sum.md file;

stop

legend bottom
  **Example Input (feature.discussion.md):**

  Rob: The architecture looks solid. **DECISION**: We'll use PostgreSQL
  for the database. **QUESTION**: Should we use TypeScript or JavaScript?
  **ACTION**: @Sarah please research auth libraries. Looking forward to
  feedback. VOTE: CHANGES

  **Extracted Output (.sum.md):**

  <!-- SUMMARY:DECISIONS START -->
  ## Decisions (ADR-style)
  ### Decision 1: We'll use PostgreSQL for the database.
  - **Proposed by:** @Rob
  <!-- SUMMARY:DECISIONS END -->

  <!-- SUMMARY:OPEN_QUESTIONS START -->
  ## Open Questions
  - @Rob: Should we use TypeScript or JavaScript?
  <!-- SUMMARY:OPEN_QUESTIONS END -->

  <!-- SUMMARY:ACTION_ITEMS START -->
  ## Action Items
  ### TODO (unassigned):
  - [ ] @Sarah please research auth libraries (suggested by @Rob)
  <!-- SUMMARY:ACTION_ITEMS END -->

  <!-- SUMMARY:AWAITING START -->
  ## Awaiting Replies
  ### @Sarah
  - @Rob: ... **ACTION**: @Sarah please research auth libraries ...
  <!-- SUMMARY:AWAITING END -->
endlegend

note right
  **Regex Pattern Details:**

  **Decision Pattern:**
  (?:\\*\\*)?DECISION(?:\\*\\*)?\\s*:\\s*(.+?)
  (?=\\s*(?:\\*\\*QUESTION|\\*\\*ACTION|VOTE:)|$)

  **Features:**
  - Case-insensitive
  - Optional markdown bold (**) on both sides
  - Captures text until next marker or VOTE:
  - DOTALL mode for multi-line capture

  **Supported Formats:**
  - DECISION: text
  - **DECISION**: text
  - decision: text
  - **decision**: text
end note

note right
  **Why Regex Instead of Line-Start Matching?**

  ✗ Old approach: `if line.startswith("decision:"):`
  Problem: markers embedded mid-sentence are never matched

  ✓ New approach: regex search anywhere in the line
  Handles: "Good point. **DECISION**: We'll use X."

  **Benefits:**
  - Natural conversational style
  - Markdown formatting preserved
  - Multiple markers per comment
  - Robust extraction
end note

@enduml
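
The extraction step in the diagram can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code in workflow.py: the names `marker_pattern`, `MARKER_PATTERNS`, `MENTION_RE`, `VOTE_RE` and `extract_markers` are hypothetical, the stop lookahead generalizes the Decision pattern from the note to all three marker families, and the `\b` word boundary is an addition so that `Q:` does not match inside a word such as `FAQ:`. The "ends with ?" fallback is left out.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

# Stop capturing at the next bold marker, at VOTE:, or at the end of the text.
STOP = r"(?=\s*(?:\*\*(?:DECISION|QUESTION|Q|ACTION|TODO)\b|VOTE:)|$)"


def marker_pattern(names: str) -> re.Pattern:
    """Build the pattern for one marker family, e.g. "ACTION|TODO"."""
    # \b is an assumption of this sketch (not in the diagram's patterns).
    return re.compile(
        r"(?:\*\*)?\b(?:" + names + r")(?:\*\*)?\s*:\s*(.+?)" + STOP, FLAGS
    )


MARKER_PATTERNS = {
    "decisions": marker_pattern("DECISION"),
    "questions": marker_pattern("QUESTION|Q"),
    "actions": marker_pattern("ACTION|TODO"),
}
MENTION_RE = re.compile(r"@(\w+)")
VOTE_RE = re.compile(r"VOTE:\s*(READY|CHANGES|REJECT)\b", re.IGNORECASE)


def extract_markers(comment: str) -> dict:
    """Extract markers, @mentions and the latest vote from one comment."""
    # The participant name is everything before the first ":".
    participant, _, body = comment.partition(":")
    result = {"participant": participant.strip()}
    for key, pattern in MARKER_PATTERNS.items():
        result[key] = [m.group(1).strip() for m in pattern.finditer(body)]
    result["mentions"] = MENTION_RE.findall(body)
    votes = VOTE_RE.findall(body)
    # Keep only the last vote found, mirroring "latest vote per participant".
    result["vote"] = votes[-1].upper() if votes else None
    return result


sample = (
    "Rob: The architecture looks solid. **DECISION**: We'll use PostgreSQL "
    "for the database. **QUESTION**: Should we use TypeScript or JavaScript? "
    "**ACTION**: @Sarah please research auth libraries. Looking forward to "
    "feedback. VOTE: CHANGES"
)
print(extract_markers(sample))
```

Note that this sketch captures everything up to the next marker, so the action text keeps the trailing sentence "Looking forward to feedback." whereas the legend's example output shows it trimmed; the real implementation presumably applies further cleanup.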