Add project vision and architecture to CLAUDE.md

parent fc1418ced3 · commit 0df0a31d79 · CLAUDE.md (55)
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Live Two-Way Chat** - Real-time conversational AI that simulates natural human conversation, moving beyond forum-style turn-taking.

### Vision

Current chatbots work like forums - wait for input, generate response, repeat. This project aims for natural conversation:

- **Continuous transcription** - Voice transcribed in small chunks, not waiting for silence
- **Predictive responses** - AI pre-prepares replies, modifying as context arrives
- **Natural interruption** - AI decides when to speak (interrupt with an important point, wait for a question)
- **Bidirectional listening** - AI listens even while speaking, handles interruptions gracefully
- **Shared context window** - Drag-and-drop workspace for images, code, documents
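The continuous-transcription and predictive-response ideas above can be sketched as a loop that consumes small transcript chunks and revises a draft reply on every chunk instead of waiting for the user to finish. This is a minimal illustration with hypothetical names (`transcript_chunks`, `draft_reply`); no real ASR or LLM is involved:

```python
def transcript_chunks():
    """Stand-in for a streaming ASR source (e.g. Whisper in chunked mode)."""
    yield from ["so about the", "deployment", "can we ship on friday"]

def draft_reply(context: str) -> str:
    """Stand-in for the predictive response engine."""
    if "friday" in context:
        return "Yes, Friday works if tests pass."
    return "(listening...)"

context = ""
reply = ""
for chunk in transcript_chunks():
    context = (context + " " + chunk).strip()
    reply = draft_reply(context)  # revised on every chunk, not once at the end

print(reply)  # → Yes, Friday works if tests pass.
```

The point of the sketch is the control flow: the reply exists (and keeps improving) while the user is still talking, so the AI can respond with near-zero latency once it decides to take the turn.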

### Shared Context Window

A visual workspace both human and AI can see/edit:
- Images: displayed and analyzed
- Code: displayed, editable by both
- Split view: multiple files at once
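A minimal data model for such a workspace might look like the following. This is a hypothetical sketch (the `ContextItem`/`ContextWindow` names are illustrative, not the project's real API); the actual UI is planned on PyQt6 and is not shown here:

```python
from dataclasses import dataclass, field

@dataclass
class ContextItem:
    """One item dropped into the shared workspace."""
    kind: str              # "image" | "code" | "document"
    name: str
    content: str = ""
    editable: bool = False  # code is editable by both human and AI

@dataclass
class ContextWindow:
    items: list = field(default_factory=list)

    def drop(self, item: ContextItem) -> None:
        """Handle a drag-and-drop of a new item into the workspace."""
        self.items.append(item)

    def split_view(self) -> list:
        """Return all items for side-by-side display."""
        return list(self.items)

ws = ContextWindow()
ws.drop(ContextItem(kind="code", name="demo.py", content="print('hi')", editable=True))
ws.drop(ContextItem(kind="image", name="diagram.png"))
print([i.name for i in ws.split_view()])  # → ['demo.py', 'diagram.png']
```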

## Development Commands

```
pip install -e ".[dev]"

# Run tests
pytest

# Run a single test
pytest tests/test_file.py::test_name

# Run the demo
python -m live_two_way_chat.demo
```

## Architecture

### Components

1. **Streaming ASR** - Real-time speech-to-text (Whisper or similar)
2. **Response Engine** - Predictive response generation with incremental updates
3. **Turn-Taking Model** - Decides when to speak/wait/interrupt
4. **TTS Output** - Text-to-speech with ducking for interruptions
5. **Context Window** - Shared visual workspace (PyQt6)
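The turn-taking decision (component 3) could be prototyped as a simple policy over a few signals. The signals and thresholds below (`silence_ms`, `urgency`, the 400 ms pause) are illustrative assumptions, not the project's actual model:

```python
from enum import Enum, auto

class Action(Enum):
    SPEAK = auto()
    WAIT = auto()
    INTERRUPT = auto()

def decide(user_speaking: bool, silence_ms: int, urgency: float) -> Action:
    """Toy turn-taking policy: interrupt only for urgent points,
    take the turn after a natural pause, otherwise keep listening."""
    if user_speaking:
        return Action.INTERRUPT if urgency > 0.9 else Action.WAIT
    if silence_ms > 400:  # user paused long enough to yield the turn
        return Action.SPEAK
    return Action.WAIT

print(decide(user_speaking=True, silence_ms=0, urgency=0.95).name)    # → INTERRUPT
print(decide(user_speaking=False, silence_ms=600, urgency=0.1).name)  # → SPEAK
```

A real model would likely replace the thresholds with a learned classifier over prosody and semantics, but the interface (signals in, one of three actions out) stays the same.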

### Key Modules (planned)

- `src/live_two_way_chat/asr.py` - Streaming speech recognition
- `src/live_two_way_chat/response.py` - Predictive response engine
- `src/live_two_way_chat/turn_taking.py` - Conversation flow control
- `src/live_two_way_chat/tts.py` - Text-to-speech output
- `src/live_two_way_chat/context_window.py` - Shared workspace UI
- `src/live_two_way_chat/main.py` - Application entry point

### Key Paths

- **Source code**: `src/live_two_way_chat/`
- **Tests**: `tests/`
- **Documentation**: `docs/` (symlink to project-docs)

## Technical Challenges

1. Low-latency streaming ASR
2. Incremental response generation (partial responses that update)
3. Turn-taking model (when to speak/wait/interrupt)
4. Context threading during interruptions
5. Audio ducking for simultaneous speech
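Challenge 5, audio ducking, amounts to attenuating the AI's output wherever voice activity detection flags the user as speaking. A pure-Python sketch on float samples (illustrative only; the gain value and the absence of real audio I/O or gain ramping are simplifications):

```python
def duck(ai_samples, user_active, gain=0.2):
    """Scale AI output down wherever the user is detected as speaking.

    ai_samples:  float samples in [-1.0, 1.0]
    user_active: per-sample bool flags from voice activity detection
    gain:        attenuation applied while the user talks (assumed value)
    """
    return [s * gain if active else s
            for s, active in zip(ai_samples, user_active)]

out = duck([1.0, 1.0, 0.5, 0.5], [False, True, True, False])
print(out)  # → [1.0, 0.2, 0.1, 0.5]
```

A production version would ramp the gain over a few milliseconds to avoid clicks and operate on buffers from the audio callback rather than whole lists.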

## Documentation

Documentation for this project lives in the centralized docs system:

When updating documentation:

4. Run `~/PycharmProjects/project-docs/scripts/build-public-docs.sh live-two-way-chat --deploy` to publish

Do NOT create documentation files directly in this repository.

## Related Projects

- **Ramble** - Voice transcription (could provide ASR component)
- **Artifact Editor** - Could power the shared context window