diff --git a/CLAUDE.md b/CLAUDE.md
index f077876..478e8f0 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -4,7 +4,24 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
-**Live Two-Way Chat** - Real-time conversational AI with natural speech flow
+**Live Two-Way Chat** - Real-time conversational AI that simulates natural human conversation, moving beyond forum-style turn-taking.
+
+### Vision
+
+Current chatbots work like forums - wait for input, generate response, repeat. This project aims for natural conversation:
+
+- **Continuous transcription** - Voice transcribed in small chunks, not waiting for silence
+- **Predictive responses** - AI pre-prepares replies, modifying as context arrives
+- **Natural interruption** - AI decides when to speak (interrupt with important point, wait for question)
+- **Bidirectional listening** - AI listens even while speaking, handles interruptions gracefully
+- **Shared context window** - Drag-and-drop workspace for images, code, documents
+
+### Shared Context Window
+
+A visual workspace both human and AI can see/edit:
+- Images: displayed and analyzed
+- Code: displayed, editable by both
+- Split view: multiple files at once
 
 ## Development Commands
 
@@ -15,24 +32,43 @@ pip install -e ".[dev]"
 # Run tests
 pytest
 
-# Run a single test
-pytest tests/test_file.py::test_name
+# Run the demo
+python -m live_two_way_chat.demo
 ```
 
 ## Architecture
 
-*TODO: Describe the project architecture*
+### Components
 
-### Key Modules
+1. **Streaming ASR** - Real-time speech-to-text (Whisper or similar)
+2. **Response Engine** - Predictive response generation with incremental updates
+3. **Turn-Taking Model** - Decides when to speak/wait/interrupt
+4. **TTS Output** - Text-to-speech with ducking for interruptions
+5. **Context Window** - Shared visual workspace (PyQt6)
 
-*TODO: List key modules and their purposes*
+### Key Modules (planned)
+
+- `src/live_two_way_chat/asr.py` - Streaming speech recognition
+- `src/live_two_way_chat/response.py` - Predictive response engine
+- `src/live_two_way_chat/turn_taking.py` - Conversation flow control
+- `src/live_two_way_chat/tts.py` - Text-to-speech output
+- `src/live_two_way_chat/context_window.py` - Shared workspace UI
+- `src/live_two_way_chat/main.py` - Application entry point
 
 ### Key Paths
 
-- **Source code**: `src/live-two-way-chat/`
+- **Source code**: `src/live_two_way_chat/`
 - **Tests**: `tests/`
 - **Documentation**: `docs/` (symlink to project-docs)
 
+## Technical Challenges
+
+1. Low-latency streaming ASR
+2. Incremental response generation (partial responses that update)
+3. Turn-taking model (when to speak/wait/interrupt)
+4. Context threading during interruptions
+5. Audio ducking for simultaneous speech
+
 ## Documentation
 
 Documentation for this project lives in the centralized docs system:
@@ -47,3 +83,8 @@ When updating documentation:
 4. Run `~/PycharmProjects/project-docs/scripts/build-public-docs.sh live-two-way-chat --deploy` to publish
 
 Do NOT create documentation files directly in this repository.
+
+## Related Projects
+
+- **Ramble** - Voice transcription (could provide ASR component)
+- **Artifact Editor** - Could power the shared context window
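Note on the new `turn_taking.py` module: the patch names the turn-taking model but does not specify its decision rule. A minimal sketch of what "speak/wait/interrupt" could look like, where the signal names (`silence_ms`, `user_asked_question`, `pending_importance`) and the thresholds are invented for illustration, not the project's actual API:

```python
from enum import Enum


class TurnAction(Enum):
    WAIT = "wait"            # keep listening, say nothing
    SPEAK = "speak"          # user has yielded the floor; respond
    INTERRUPT = "interrupt"  # cut in mid-utterance with an important point


def decide_turn(silence_ms: int, user_asked_question: bool,
                pending_importance: float) -> TurnAction:
    """Toy policy: interrupt only for a very important pending point,
    speak after a direct question or a long pause, otherwise wait."""
    if pending_importance >= 0.9:
        return TurnAction.INTERRUPT
    if user_asked_question or silence_ms >= 700:
        return TurnAction.SPEAK
    return TurnAction.WAIT
```

A real model would replace the hand-set thresholds with a learned classifier over prosody and transcript features, but the three-way output type is likely to survive that upgrade.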
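The patch also lists "Audio ducking for simultaneous speech" among the technical challenges. One way to picture the intent - attenuate the TTS stream while the user talks over it, so interruptions are audible - as a toy gain rule with illustrative values only:

```python
def duck_gain(tts_active: bool, user_speaking: bool,
              normal: float = 1.0, ducked: float = 0.25) -> float:
    """Gain applied to the TTS output stream: full volume while
    speaking alone, attenuated while the user talks over it,
    and zero when no TTS is playing."""
    if not tts_active:
        return 0.0
    return ducked if user_speaking else normal
```

In practice the gain change would be ramped over a few tens of milliseconds to avoid clicks, and the `user_speaking` flag would come from the same voice-activity detection that feeds the streaming ASR.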