CascadingDev/docs/ai-provider-fallback.puml

@startuml ai-provider-fallback
!theme plain
title AI Provider Fallback Chain with Model Hints

start

:Automation needs AI generation\n(from patcher.py or runner.py);

:Read config/ai.yml;

if (Rule has model_hint?) then (yes)
  if (model_hint == "fast"?) then (yes)
    :Use **command_chain_fast**:
    - claude -p (→ Haiku subagent)
    - codex --model gpt-5-mini
    - gemini --model gemini-2.5-flash;
  else if (model_hint == "quality"?) then (yes)
    :Use **command_chain_quality**:
    - claude -p (→ Sonnet subagent)
    - codex --model o3
    - gemini --model gemini-2.5-pro;
  else (unknown hint)
    :Fall back to **command_chain** (default);
  endif
else (no hint)
  :Use **command_chain** (default):
  - claude -p (→ auto-select subagent)
  - codex --model gpt-5
  - gemini --model gemini-2.5-flash;
endif

partition "Provider Loop" {
  repeat
    :Get next provider from chain;

    if (Provider == "claude"?) then (yes)
      :Execute: **claude -p**;
      note right
        Claude CLI uses the TASK COMPLEXITY hint
        in the prompt to select a subagent:
        - FAST → cdev-patch (Haiku)
        - QUALITY → cdev-patch-quality (Sonnet)
        - Default → auto-select
      end note

      if (Returned output?) then (yes)
        if (Contains diff markers?) then (yes)
          :✓ Success! Extract diff;
          stop
        else (no - non-diff response)
          :Log: "Claude non-diff output";
          :Try next provider;
        endif
      else (command failed)
        :Log: "Claude command failed";
        :Try next provider;
      endif

    else if (Provider == "codex"?) then (yes)
      :Execute: **codex exec --model X --json -**;
      note right
        Codex requires special handling:
        - Add "exec" subcommand
        - Add "--json" flag
        - Add "--color=never"
        - Add "-" to read from stdin
        - Parse JSON output for agent_message
      end note

      if (Exit code == 0?) then (yes)
        :Parse JSON lines;
        :Extract agent_message text;

        if (Contains diff?) then (yes)
          :✓ Success! Extract diff;
          stop
        else (no diff)
          :Log: "Codex no diff output";
          :Try next provider;
        endif
      else (non-zero exit)
        :Log: "Codex exited with 1";
        :Try next provider;
      endif

    else if (Provider == "gemini"?) then (yes)
      :Execute: **gemini --model X**;
      note right
        Gemini is the most reliable fallback:
        - Accepts plain text input
        - Returns consistent output
        - Supports sentinel token
      end note

      if (Returned output?) then (yes)
        if (Output == sentinel token?) then (yes)
          :Log: "No changes needed";
          :Return empty (intentional);
          stop
        else if (Contains diff?) then (yes)
          :✓ Success! Extract diff;
          stop
        else (no diff)
          :Log: "Gemini no diff output";
          :Try next provider;
        endif
      else (command failed)
        :Log: "Gemini command failed";
        :Try next provider;
      endif
    endif
  backward:Continue loop;
  repeat while (More providers in chain?) is (yes) not (no)

  :✗ All providers failed;
  :Raise PatchGenerationError;
  stop
}

legend bottom
  **Configuration Example (config/ai.yml):**

  runner:
    command_chain:
      - "claude -p"
      - "codex --model gpt-5"
      - "gemini --model gemini-2.5-flash"

    command_chain_fast:
      - "claude -p"
      - "codex --model gpt-5-mini"
      - "gemini --model gemini-2.5-flash"

    command_chain_quality:
      - "claude -p"
      - "codex --model o3"
      - "gemini --model gemini-2.5-pro"

    sentinel: "CASCADINGDEV_NO_CHANGES"

  **Environment Override:**
  export CDEV_AI_COMMAND="claude -p || gemini --model gemini-2.5-pro"
  (Overrides config/ai.yml for this commit only)
endlegend

note right
  **Why Fallback Chain?**

  1. **Redundancy**: Survives rate limits and API outages
  2. **Model specialization**: Different models excel at different tasks
  3. **Cost optimization**: Try cheaper models first
  4. **Quality assurance**: Fast models for simple tasks, quality models for complex ones

  **Observed Behavior:**
  - Claude occasionally returns non-diff output
  - Codex consistently exits with code 1 (possibly an auth issue)
  - Gemini is the most reliable fallback
end note

@enduml
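
@startuml ai-provider-fallback-example
!theme plain
title Example Run Through the Default Chain (illustrative)

' A minimal sketch of one pass through the default command_chain,
' added to make the fallback order in the activity diagram above concrete.
' Assumptions: this exact sequence of outcomes is hypothetical; it only
' strings together the behaviors listed under "Observed Behavior"
' (Claude returns non-diff output, Codex exits with code 1, Gemini
' returns a diff). The participant labels are the default chain commands;
' "runner.py" stands in for whichever caller (patcher.py or runner.py)
' requested generation.

participant "runner.py" as runner
participant "claude -p" as claude
participant "codex exec --model gpt-5" as codex
participant "gemini --model gemini-2.5-flash" as gemini

runner -> claude : prompt
claude --> runner : output without diff markers
note over runner : Log: "Claude non-diff output"

runner -> codex : prompt via stdin ("-")
codex --> runner : exit code 1
note over runner : Log: "Codex exited with 1"

runner -> gemini : prompt
gemini --> runner : output containing a diff
note over runner : ✓ Success! Extract diff

@enduml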