# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Live Two-Way Chat** - Real-time conversational AI that simulates natural human conversation, moving beyond forum-style turn-taking.

### Vision

Current chatbots work like forums: wait for input, generate a response, repeat. This project aims for natural conversation:
- **Continuous transcription** - Voice transcribed in small chunks, not waiting for silence
- **Predictive responses** - AI pre-prepares replies, modifying as context arrives
- **Natural interruption** - AI decides when to speak (interrupt with an important point, or wait for a question)
- **Bidirectional listening** - AI listens even while speaking, handles interruptions gracefully
- **Shared context window** - Drag-and-drop workspace for images, code, documents

### Shared Context Window

A visual workspace that both the human and the AI can see and edit:
- Images: displayed and analyzed
- Code: displayed, editable by both
- Split view: multiple files at once
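
No code exists for the workspace yet; the sketch below is a hypothetical data model for items dropped into it, assuming a simple dataclass-based design (all names are illustrative, not part of the repository):

```python
# Hypothetical data model for shared-workspace items; every name here is a
# placeholder assumption, nothing below exists in the repository yet.
from dataclasses import dataclass, field
from enum import Enum, auto
from pathlib import Path


class ItemKind(Enum):
    IMAGE = auto()       # displayed and analyzed
    CODE = auto()        # displayed, editable by both parties
    DOCUMENT = auto()    # displayed, editable by both parties


@dataclass
class ContextItem:
    """One item in the shared workspace, visible to and editable by human and AI."""
    kind: ItemKind
    path: Path
    content: str = ""                                  # text body for code/documents
    history: list[str] = field(default_factory=list)   # who edited last: "human" or "ai"
```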

## Development Commands

```bash
# Install for development
pip install -e ".[dev]"

# Run tests
pytest

# Run the demo
python -m live_two_way_chat.demo
```

## Architecture

### Components

1. **Streaming ASR** - Real-time speech-to-text (Whisper or similar)
2. **Response Engine** - Predictive response generation with incremental updates
3. **Turn-Taking Model** - Decides when to speak/wait/interrupt
4. **TTS Output** - Text-to-speech with ducking for interruptions
5. **Context Window** - Shared visual workspace (PyQt6)
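
These components are planned rather than implemented; as a rough sketch of how they might be wired together, assuming an async streaming loop and entirely hypothetical class and method names:

```python
# Illustrative wiring of the five planned components. Every class and method
# name is an assumption for this sketch; none of it exists in the repo yet.
import asyncio


async def conversation_loop(asr, engine, turn_taking, tts, workspace):
    """Listen continuously, keep a draft reply updated, and speak when allowed."""
    async for partial in asr.stream():                       # 1. Streaming ASR chunks
        draft = engine.update(partial,                       # 2. Response Engine refines a draft
                              context=workspace.snapshot())  # 5. Shared context window
        decision = turn_taking.decide(partial, draft)        # 3. speak / wait / interrupt
        if decision in ("speak", "interrupt"):
            await tts.speak(draft, duck_on_input=True)       # 4. TTS with ducking
        # on "wait": keep listening and keep refining the draft
```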

### Key Modules (planned)

- `src/live_two_way_chat/asr.py` - Streaming speech recognition
- `src/live_two_way_chat/response.py` - Predictive response engine
- `src/live_two_way_chat/turn_taking.py` - Conversation flow control
- `src/live_two_way_chat/tts.py` - Text-to-speech output
- `src/live_two_way_chat/context_window.py` - Shared workspace UI
- `src/live_two_way_chat/main.py` - Application entry point

### Key Paths

- **Source code**: `src/live_two_way_chat/`
- **Tests**: `tests/`
- **Documentation**: `docs/` (symlink to project-docs)

## Technical Challenges

1. Low-latency streaming ASR
2. Incremental response generation (partial responses that update; see the sketch after this list)
3. Turn-taking model (when to speak/wait/interrupt)
4. Context threading during interruptions
5. Audio ducking for simultaneous speech
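
As an illustration of challenge 2, a toy sketch of incremental response generation: a draft reply is cached together with the transcript it was built from and regenerated only when new transcript text arrives (`generate` is a stand-in for whatever model call the Response Engine ends up using; none of this is implemented):

```python
# Toy sketch for challenge 2: keep a draft reply and refresh it only when the
# transcript grows. `generate` is a placeholder for the real model call.
class DraftResponse:
    def __init__(self, generate):
        self._generate = generate   # callable: transcript text -> reply text
        self._basis = ""            # transcript the current draft was built from
        self._draft = ""

    def update(self, transcript: str) -> str:
        """Return a draft reply, regenerating it only when the transcript changed."""
        if transcript != self._basis:
            self._draft = self._generate(transcript)
            self._basis = transcript
        return self._draft
```

A real implementation would regenerate incrementally, reusing the still-valid prefix of the previous draft instead of recomputing it, which is exactly what makes this challenge hard.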

## Documentation

Documentation for this project lives in the centralized docs system:
- **Source**: `~/PycharmProjects/project-docs/docs/projects/live-two-way-chat/`
- **Public URL**: `https://pages.brrd.tech/rob/live-two-way-chat/`

When updating documentation:
1. Edit files in `docs/` (the symlink) or the full path above
2. Use `public: true` frontmatter for public-facing docs
3. Use `<!-- PRIVATE_START -->` / `<!-- PRIVATE_END -->` to hide sections
4. Run `~/PycharmProjects/project-docs/scripts/build-public-docs.sh live-two-way-chat --deploy` to publish

Do NOT create documentation files directly in this repository.

## Related Projects

- **Ramble** - Voice transcription (could provide ASR component)
- **Artifact Editor** - Could power the shared context window