# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Live Two-Way Chat - Real-time conversational AI that simulates natural human conversation, moving beyond forum-style turn-taking.

## Vision

Current chatbots work like forums - wait for input, generate a response, repeat. This project aims for natural conversation:

- **Continuous transcription** - voice is transcribed in small chunks rather than waiting for silence
- **Predictive responses** - the AI pre-prepares replies, revising them as context arrives (sketched after this list)
- **Natural interruption** - the AI decides when to speak (interrupt with an important point, wait for a question)
- **Bidirectional listening** - the AI keeps listening even while speaking and handles interruptions gracefully
- **Shared context window** - a drag-and-drop workspace for images, code, and documents
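
A minimal, hypothetical sketch of the predictive-response idea (none of these names exist in the codebase yet): keep a draft reply and revise it as each partial transcript chunk arrives, so a near-final answer is ready the moment the turn-taking model decides to speak.

```python
# Illustrative only - DraftResponse is a hypothetical name, not a planned module.
from dataclasses import dataclass, field


@dataclass
class DraftResponse:
    """A reply prepared ahead of time and revised as context arrives."""
    text: str = ""
    heard: list[str] = field(default_factory=list)

    def update(self, partial_transcript: str) -> None:
        # The real system would call the response engine incrementally here;
        # this placeholder just regenerates the draft from everything heard so far.
        self.heard.append(partial_transcript)
        self.text = f"(draft reply based on: {' '.join(self.heard)})"


draft = DraftResponse()
for chunk in ["so I was thinking", "we could stream the audio", "in small chunks"]:
    draft.update(chunk)   # revise the draft as each transcribed chunk arrives
print(draft.text)         # ready to speak whenever the turn-taking model says go
```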

### Shared Context Window

A visual workspace both the human and the AI can see and edit (a rough data model is sketched below):

- **Images**: displayed and analyzed
- **Code**: displayed, editable by both
- **Split view**: multiple files at once
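
One way the workspace contents might be modeled (an illustrative assumption, not part of the planned modules): a shared list of typed items that the PyQt6 UI renders and both participants can read or edit.

```python
# Hypothetical data model for the shared context window.
from dataclasses import dataclass
from enum import Enum


class ItemKind(Enum):
    IMAGE = "image"        # displayed and analyzed
    CODE = "code"          # displayed, editable by both
    DOCUMENT = "document"


@dataclass
class ContextItem:
    kind: ItemKind
    path: str              # file dragged into the workspace
    editable: bool = False


# Both human and AI would see the same list; split view shows several at once.
workspace: list[ContextItem] = [
    ContextItem(ItemKind.IMAGE, "diagram.png"),
    ContextItem(ItemKind.CODE, "demo.py", editable=True),
]
```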

## Development Commands

```bash
# Install for development
pip install -e ".[dev]"

# Run tests
pytest

# Run the demo
python -m live_two_way_chat.demo
```

## Architecture

### Components

1. **Streaming ASR** - real-time speech-to-text (Whisper or similar)
2. **Response Engine** - predictive response generation with incremental updates
3. **Turn-Taking Model** - decides when to speak, wait, or interrupt
4. **TTS Output** - text-to-speech with ducking for interruptions
5. **Context Window** - shared visual workspace (PyQt6)
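
A rough sketch of how these components could be wired together. The protocols and loop below are assumptions for illustration, not the planned module APIs, and the real system would run listening and speaking concurrently rather than in a single loop.

```python
# Illustrative wiring of the five components; all names here are hypothetical.
from typing import Iterable, Protocol


class StreamingASR(Protocol):
    def transcribe(self) -> Iterable[str]: ...      # yields partial transcripts


class ResponseEngine(Protocol):
    def update(self, partial: str) -> str: ...      # returns the current draft reply


class TurnTakingModel(Protocol):
    def should_speak(self, partial: str, draft: str) -> bool: ...


class TTSOutput(Protocol):
    def speak(self, text: str) -> None: ...         # would duck if interrupted


def conversation_loop(asr: StreamingASR, engine: ResponseEngine,
                      turns: TurnTakingModel, tts: TTSOutput) -> None:
    """Listen continuously, keep a draft ready, and speak when the model decides to."""
    for partial in asr.transcribe():
        draft = engine.update(partial)
        if turns.should_speak(partial, draft):
            tts.speak(draft)
```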

### Key Modules (planned)

- `src/live_two_way_chat/asr.py` - streaming speech recognition
- `src/live_two_way_chat/response.py` - predictive response engine
- `src/live_two_way_chat/turn_taking.py` - conversation flow control
- `src/live_two_way_chat/tts.py` - text-to-speech output
- `src/live_two_way_chat/context_window.py` - shared workspace UI
- `src/live_two_way_chat/main.py` - application entry point

## Key Paths

- Source code: `src/live_two_way_chat/`
- Tests: `tests/`
- Documentation: `docs/` (symlink to project-docs)

## Technical Challenges

1. Low-latency streaming ASR
2. Incremental response generation (partial responses that update)
3. Turn-taking model (when to speak, wait, or interrupt)
4. Context threading during interruptions
5. Audio ducking for simultaneous speech (sketched below)
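
As a tiny illustration of challenge 5, audio ducking can be as simple as lowering the TTS gain whenever the microphone level suggests the human is talking over the AI. The function and thresholds below are assumptions, not an existing API.

```python
# Hypothetical ducking rule: drop TTS volume while the human is speaking.
def ducked_gain(mic_level: float,
                speaking_threshold: float = 0.05,
                duck_to: float = 0.2) -> float:
    """Return the gain to apply to TTS output given the current mic input level."""
    if mic_level > speaking_threshold:   # human is talking over the AI
        return duck_to                   # duck to 20% volume but keep speaking and listening
    return 1.0                           # otherwise play at full volume
```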

## Documentation

Documentation for this project lives in the centralized docs system:

- Source: `~/PycharmProjects/project-docs/docs/projects/live-two-way-chat/`
- Public URL: https://pages.brrd.tech/rob/live-two-way-chat/

When updating documentation:

1. Edit files in `docs/` (the symlink) or the full path above
2. Use `public: true` frontmatter for public-facing docs
3. Use `<!-- PRIVATE_START -->` / `<!-- PRIVATE_END -->` markers to hide sections (see the example after this list)
4. Run `~/PycharmProjects/project-docs/scripts/build-public-docs.sh live-two-way-chat --deploy` to publish
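
For reference, a doc file that uses both conventions above might look like this (the contents are placeholders):

```markdown
---
public: true
---

# Live Two-Way Chat

Public overview text goes here.

<!-- PRIVATE_START -->
Internal notes that should not appear on the public site.
<!-- PRIVATE_END -->
```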

Do NOT create documentation files directly in this repository.

## Related Projects

- **Ramble** - voice transcription (could provide the ASR component)
- **Artifact Editor** - could power the shared context window