Add project vision and architecture to CLAUDE.md

parent fc1418ced3 · commit 0df0a31d79 · CLAUDE.md (55)
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Live Two-Way Chat** - Real-time conversational AI that simulates natural human conversation, moving beyond forum-style turn-taking.

### Vision

Current chatbots work like forums - wait for input, generate response, repeat. This project aims for natural conversation:

- **Continuous transcription** - Voice transcribed in small chunks, not waiting for silence
- **Predictive responses** - AI pre-prepares replies, modifying as context arrives
- **Natural interruption** - AI decides when to speak (interrupt with an important point, wait for a question)
- **Bidirectional listening** - AI listens even while speaking, handles interruptions gracefully
- **Shared context window** - Drag-and-drop workspace for images, code, documents
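The continuous-transcription and predictive-response ideas above can be sketched as a loop that consumes small transcript chunks and revises a draft reply on every chunk instead of waiting for the user to finish. This is a minimal illustration with hypothetical names (`transcript_chunks`, `draft_reply`); no real ASR or LLM is involved:

```python
def transcript_chunks():
    """Stand-in for a streaming ASR source (e.g. Whisper in chunked mode)."""
    yield from ["so about the", "deployment", "can we ship on friday"]

def draft_reply(context: str) -> str:
    """Stand-in for the predictive response engine."""
    if "friday" in context:
        return "Yes, Friday works if tests pass."
    return "(listening...)"

context = ""
reply = ""
for chunk in transcript_chunks():
    context = (context + " " + chunk).strip()
    reply = draft_reply(context)  # revised on every chunk, not once at the end

print(reply)  # → Yes, Friday works if tests pass.
```

The point of the sketch is the control flow: the reply exists (and keeps improving) while the user is still talking, so the AI can respond with near-zero latency once it decides to take the turn.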

### Shared Context Window

A visual workspace both human and AI can see/edit:
- Images: displayed and analyzed
- Code: displayed, editable by both
- Split view: multiple files at once
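A minimal data model for such a workspace might look like the following. This is a hypothetical sketch (the `ContextItem`/`ContextWindow` names are illustrative, not the project's real API); the actual UI is planned on PyQt6 and is not shown here:

```python
from dataclasses import dataclass, field

@dataclass
class ContextItem:
    """One item dropped into the shared workspace."""
    kind: str              # "image" | "code" | "document"
    name: str
    content: str = ""
    editable: bool = False  # code is editable by both human and AI

@dataclass
class ContextWindow:
    items: list = field(default_factory=list)

    def drop(self, item: ContextItem) -> None:
        """Handle a drag-and-drop of a new item into the workspace."""
        self.items.append(item)

    def split_view(self) -> list:
        """Return all items for side-by-side display."""
        return list(self.items)

ws = ContextWindow()
ws.drop(ContextItem(kind="code", name="demo.py", content="print('hi')", editable=True))
ws.drop(ContextItem(kind="image", name="diagram.png"))
print([i.name for i in ws.split_view()])  # → ['demo.py', 'diagram.png']
```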

## Development Commands

```
pip install -e ".[dev]"

# Run tests
pytest

# Run a single test
pytest tests/test_file.py::test_name

# Run the demo
python -m live_two_way_chat.demo
```

## Architecture

### Components

1. **Streaming ASR** - Real-time speech-to-text (Whisper or similar)
2. **Response Engine** - Predictive response generation with incremental updates
3. **Turn-Taking Model** - Decides when to speak/wait/interrupt
4. **TTS Output** - Text-to-speech with ducking for interruptions
5. **Context Window** - Shared visual workspace (PyQt6)
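The turn-taking decision (component 3) could be prototyped as a simple policy over a few signals. The signals and thresholds below (`silence_ms`, `urgency`, the 400 ms pause) are illustrative assumptions, not the project's actual model:

```python
from enum import Enum, auto

class Action(Enum):
    SPEAK = auto()
    WAIT = auto()
    INTERRUPT = auto()

def decide(user_speaking: bool, silence_ms: int, urgency: float) -> Action:
    """Toy turn-taking policy: interrupt only for urgent points,
    take the turn after a natural pause, otherwise keep listening."""
    if user_speaking:
        return Action.INTERRUPT if urgency > 0.9 else Action.WAIT
    if silence_ms > 400:  # user paused long enough to yield the turn
        return Action.SPEAK
    return Action.WAIT

print(decide(user_speaking=True, silence_ms=0, urgency=0.95).name)    # → INTERRUPT
print(decide(user_speaking=False, silence_ms=600, urgency=0.1).name)  # → SPEAK
```

A real model would likely replace the thresholds with a learned classifier over prosody and semantics, but the interface (signals in, one of three actions out) stays the same.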

### Key Modules (planned)

- `src/live_two_way_chat/asr.py` - Streaming speech recognition
- `src/live_two_way_chat/response.py` - Predictive response engine
- `src/live_two_way_chat/turn_taking.py` - Conversation flow control
- `src/live_two_way_chat/tts.py` - Text-to-speech output
- `src/live_two_way_chat/context_window.py` - Shared workspace UI
- `src/live_two_way_chat/main.py` - Application entry point

### Key Paths

- **Source code**: `src/live_two_way_chat/`
- **Tests**: `tests/`
- **Documentation**: `docs/` (symlink to project-docs)

## Technical Challenges

1. Low-latency streaming ASR
2. Incremental response generation (partial responses that update)
3. Turn-taking model (when to speak/wait/interrupt)
4. Context threading during interruptions
5. Audio ducking for simultaneous speech
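Challenge 5, audio ducking, amounts to attenuating the AI's output wherever voice activity detection flags the user as speaking. A pure-Python sketch on float samples (illustrative only; the gain value and the absence of real audio I/O or gain ramping are simplifications):

```python
def duck(ai_samples, user_active, gain=0.2):
    """Scale AI output down wherever the user is detected as speaking.

    ai_samples:  float samples in [-1.0, 1.0]
    user_active: per-sample bool flags from voice activity detection
    gain:        attenuation applied while the user talks (assumed value)
    """
    return [s * gain if active else s
            for s, active in zip(ai_samples, user_active)]

out = duck([1.0, 1.0, 0.5, 0.5], [False, True, True, False])
print(out)  # → [1.0, 0.2, 0.1, 0.5]
```

A production version would ramp the gain over a few milliseconds to avoid clicks and operate on buffers from the audio callback rather than whole lists.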

## Documentation

Documentation for this project lives in the centralized docs system:

When updating documentation:

4. Run `~/PycharmProjects/project-docs/scripts/build-public-docs.sh live-two-way-chat --deploy` to publish

Do NOT create documentation files directly in this repository.

## Related Projects

- **Ramble** - Voice transcription (could provide ASR component)
- **Artifact Editor** - Could power the shared context window