CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Live Two-Way Chat - Real-time conversational AI that simulates natural human conversation, moving beyond forum-style turn-taking.
Vision
Current chatbots work like forums: they wait for input, generate a response, and repeat. This project aims for natural conversation:
- Continuous transcription - Voice transcribed in small chunks, not waiting for silence
- Predictive responses - AI pre-prepares replies, modifying as context arrives
- Natural interruption - AI decides when to speak (interrupt with an important point, or wait for a question)
- Bidirectional listening - AI listens even while speaking, handles interruptions gracefully
- Shared context window - Drag-and-drop workspace for images, code, documents
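The "predictive responses" idea above can be sketched as a draft reply that is regenerated each time a small transcript chunk arrives. This is only an illustration: `PredictiveResponder`, `on_chunk`, and `echo_draft` are hypothetical names, and `echo_draft` stands in for whatever model call the real response engine would make.

```python
from typing import Callable, List


class PredictiveResponder:
    """Keeps a draft reply that is revised as transcript chunks arrive."""

    def __init__(self, draft_fn: Callable[[str], str]) -> None:
        # draft_fn is a hypothetical stand-in for a real model call.
        self._draft_fn = draft_fn
        self._chunks: List[str] = []
        self.draft = ""

    @property
    def transcript(self) -> str:
        """Everything heard so far, joined into one string."""
        return " ".join(self._chunks)

    def on_chunk(self, text: str) -> str:
        """Record one small transcript chunk and refresh the draft reply."""
        text = text.strip()
        if text:
            self._chunks.append(text)
            self.draft = self._draft_fn(self.transcript)
        return self.draft


def echo_draft(transcript: str) -> str:
    # Placeholder drafting logic; a real system would call a language model.
    return f"(draft reply to: {transcript})"


responder = PredictiveResponder(echo_draft)
for chunk in ["what time", "does the", "demo start"]:
    responder.on_chunk(chunk)
```

Regenerating the whole draft on every chunk is the simplest possible policy; a real engine would likely revise the draft only when new context changes its meaning.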
Shared Context Window
A visual workspace both human and AI can see/edit:
- Images: displayed and analyzed
- Code: displayed, editable by both
- Split view: multiple files at once
Development Commands
# Install for development
pip install -e ".[dev]"
# Run tests
pytest
# Run the demo
python -m live_two_way_chat.demo
Architecture
Components
- Streaming ASR - Real-time speech-to-text (Whisper or similar)
- Response Engine - Predictive response generation with incremental updates
- Turn-Taking Model - Decides when to speak/wait/interrupt
- TTS Output - Text-to-speech with ducking for interruptions
- Context Window - Shared visual workspace (PyQt6)
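As a rough illustration of the Turn-Taking Model's speak/wait/interrupt decision, the sketch below uses a simple rule-based policy. The `TurnState` fields, the `decide` function, and both thresholds are assumptions made for this example; the planned model may well be learned rather than rule-based.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SPEAK = "speak"
    WAIT = "wait"
    INTERRUPT = "interrupt"


@dataclass
class TurnState:
    user_speaking: bool  # is the user talking right now?
    silence_ms: int      # time since the user's last word
    draft_ready: bool    # does the response engine have a reply prepared?
    urgency: float       # 0.0 to 1.0, how important the pending point is


def decide(
    state: TurnState,
    silence_threshold_ms: int = 600,   # illustrative value, not tuned
    urgency_threshold: float = 0.8,    # illustrative value, not tuned
) -> Action:
    """Pick the next conversational action from the current turn state."""
    if not state.draft_ready:
        # Nothing to say yet, so keep listening.
        return Action.WAIT
    if state.user_speaking:
        # Only cut in when the pending point is important enough.
        if state.urgency >= urgency_threshold:
            return Action.INTERRUPT
        return Action.WAIT
    if state.silence_ms >= silence_threshold_ms:
        # The user has paused long enough to take the turn.
        return Action.SPEAK
    return Action.WAIT
```

For example, `decide(TurnState(user_speaking=True, silence_ms=0, draft_ready=True, urgency=0.9))` returns `Action.INTERRUPT`, while the same state with `urgency=0.3` returns `Action.WAIT`.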
Key Modules (planned)
- src/live_two_way_chat/asr.py - Streaming speech recognition
- src/live_two_way_chat/response.py - Predictive response engine
- src/live_two_way_chat/turn_taking.py - Conversation flow control
- src/live_two_way_chat/tts.py - Text-to-speech output
- src/live_two_way_chat/context_window.py - Shared workspace UI
- src/live_two_way_chat/main.py - Application entry point
Key Paths
- Source code: src/live_two_way_chat/
- Tests: tests/
- Documentation: docs/ (symlink to project-docs)
Technical Challenges
- Low-latency streaming ASR
- Incremental response generation (partial responses that update)
- Turn-taking model (when to speak/wait/interrupt)
- Context threading during interruptions
- Audio ducking for simultaneous speech
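Audio ducking for simultaneous speech can be sketched as a per-frame gain that ramps down while the user is talking and ramps back up afterwards. The function names, the 0.25 ducked gain, and the ramp step below are illustrative assumptions, and frames are modeled as plain lists of float samples rather than any particular audio library's buffers.

```python
from typing import List, Sequence


def duck_gains(
    user_speaking: Sequence[bool],
    duck_gain: float = 0.25,  # assumed output level while the user talks
    step: float = 0.25,       # assumed maximum gain change per frame
) -> List[float]:
    """Return one gain per frame, ramping toward the target to avoid clicks."""
    gain = 1.0
    gains: List[float] = []
    for speaking in user_speaking:
        target = duck_gain if speaking else 1.0
        if gain < target:
            gain = min(target, gain + step)
        else:
            gain = max(target, gain - step)
        gains.append(gain)
    return gains


def apply_gains(frames: Sequence[Sequence[float]], gains: Sequence[float]) -> List[List[float]]:
    """Scale every sample in each TTS frame by that frame's gain."""
    return [[sample * gain for sample in frame] for frame, gain in zip(frames, gains)]
```

Ramping instead of switching the gain instantly is a common way to avoid audible clicks; whether the TTS output should duck or stop entirely on interruption is a design decision this sketch does not settle.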
Documentation
Documentation for this project lives in the centralized docs system:
- Source: ~/PycharmProjects/project-docs/docs/projects/live-two-way-chat/
- Public URL: https://pages.brrd.tech/rob/live-two-way-chat/
When updating documentation:
- Edit files in docs/ (the symlink) or the full path above
- Use public: true frontmatter for public-facing docs
- Use <!-- PRIVATE_START --> / <!-- PRIVATE_END --> to hide sections
- Run ~/PycharmProjects/project-docs/scripts/build-public-docs.sh live-two-way-chat --deploy to publish
Do NOT create documentation files directly in this repository.
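To make the effect of the PRIVATE_START / PRIVATE_END markers concrete, here is a minimal sketch of how marked sections could be filtered out of a Markdown file. It is not the implementation of build-public-docs.sh, whose internals are not shown here; `strip_private` and the regular expression are assumptions made for illustration.

```python
import re

# Matches one private block, including the newline after the end marker.
# Non-greedy so that several separate blocks in one file are handled independently.
PRIVATE_BLOCK = re.compile(
    r"<!-- PRIVATE_START -->.*?<!-- PRIVATE_END -->\n?",
    re.DOTALL,
)


def strip_private(markdown: str) -> str:
    """Remove every section wrapped in the private markers."""
    return PRIVATE_BLOCK.sub("", markdown)


sample = (
    "Intro\n"
    "<!-- PRIVATE_START -->\n"
    "secret notes\n"
    "<!-- PRIVATE_END -->\n"
    "Public text\n"
)
public = strip_private(sample)
```

In this sketch an unmatched start marker is left untouched, so a missing PRIVATE_END would leave the private text in place; a real publishing step would probably want to fail loudly in that case.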
Related Projects
- Ramble - Voice transcription (could provide ASR component)
- Artifact Editor - Could power the shared context window