
# Voice Connection Domain

Bridges the application layer to the low-level realtime infrastructure for voice calls and in-channel camera transport. Provides speaking detection via Web Audio analysis and per-peer volume control for playback. The actual WebRTC plumbing lives in `infrastructure/realtime`; this domain wraps it with a clean facade.

## Module map

```
voice-connection/
├── application/
│   ├── facades/
│   │   └── voice-connection.facade.ts           Proxy to RealtimeSessionFacade for voice and camera signals/methods
│   └── services/
│       ├── voice-activity.service.ts            RMS-based speaking detection via AnalyserNode (per-user signals)
│       └── voice-playback.service.ts            Per-peer GainNode chain, 0-200% volume, deafen support
│
├── domain/
│   └── models/
│       └── voice-connection.model.ts            Re-exports LatencyProfile, VoiceStateSnapshot from shared-kernel / realtime
│
└── index.ts                                 Barrel exports
```

## Service relationships

```mermaid
graph TD
    VCF[VoiceConnectionFacade]
    VAS[VoiceActivityService]
    VPS[VoicePlaybackService]
    RSF[RealtimeSessionFacade]
    Models[voice-connection.model]

    VCF --> RSF
    VAS --> VCF
    VPS --> VCF

    click VCF "application/facades/voice-connection.facade.ts" "Proxy to RealtimeSessionFacade" _blank
    click VAS "application/services/voice-activity.service.ts" "RMS-based speaking detection" _blank
    click VPS "application/services/voice-playback.service.ts" "Per-peer GainNode volume chain" _blank
    click RSF "../../infrastructure/realtime/realtime-session.service.ts" "Low-level WebRTC composition root" _blank
    click Models "domain/models/voice-connection.model.ts" "Re-exported types" _blank
```

## Voice connection facade

`VoiceConnectionFacade` exposes signals and methods from `RealtimeSessionFacade` without leaking infrastructure details into feature components. It covers:

- Connection state: `isVoiceConnected`, `isMuted`, `isDeafened`, `isCameraEnabled`, `hasConnectionError`
- Stream access: `getRemoteVoiceStream`, `getRemoteCameraStream`, `getLocalStream`, `getLocalCameraStream`, `getRawMicStream`
- Controls: `enableVoice`, `disableVoice`, `enableCamera`, `disableCamera`, `toggleMute`, `toggleDeafen`, `toggleNoiseReduction`
- Audio tuning: `setOutputVolume`, `setInputVolume`, `setAudioBitrate`, `setLatencyProfile`
- Peer events: `onRemoteStream`, `onPeerConnected`, `onPeerDisconnected`
- Heartbeat: `startVoiceHeartbeat`, `stopVoiceHeartbeat`

## Camera transport

Camera capture is treated as voice-adjacent transport, not screen share. The underlying realtime layer routes webcam video only to peers in the same active voice channel, exposes remote camera streams through `getRemoteCameraStream(peerId)`, and keeps webcam senders separate from screen-share senders so both features can run at the same time.

## Speaking detection

`VoiceActivityService` monitors audio levels for local and remote streams using the Web Audio API. Each tracked stream gets its own `AudioContext` with an `AnalyserNode`. A single `requestAnimationFrame` loop polls all analysers.

```mermaid
graph LR
    Stream[MediaStream] --> Ctx[AudioContext]
    Ctx --> Src[MediaStreamAudioSourceNode]
    Src --> Analyser[AnalyserNode<br/>fftSize = 256]
    Analyser --> Poll[rAF poll loop]
    Poll --> RMS{RMS >= 0.015?}
    RMS -- yes --> Speaking[speakingSignal = true]
    RMS -- no, 8 frames --> Silent[speakingSignal = false]

    click Stream "application/services/voice-activity.service.ts" "VoiceActivityService.trackStream()" _blank
    click Poll "application/services/voice-activity.service.ts" "VoiceActivityService.poll()" _blank
```

| Parameter | Value |
|---|---|
| FFT size | 256 samples |
| Speaking threshold | RMS >= 0.015 |
| Silent grace period | 8 consecutive frames below threshold |
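
The threshold and grace period above can be illustrated with a small, framework-free sketch. `computeRms` and `SpeakingDetector` are hypothetical names, not the service's actual API, and the sketch assumes the state flips to silent on the 8th consecutive quiet frame; the real implementation may count differently.

```typescript
// Values taken from the parameter table above.
const SPEAKING_THRESHOLD = 0.015;
const SILENT_FRAME_GRACE = 8;

/** Root mean square of time-domain samples in the range [-1, 1]. */
function computeRms(samples: Float32Array): number {
  if (samples.length === 0) {
    return 0;
  }
  let sumOfSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumOfSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumOfSquares / samples.length);
}

/** Hypothetical per-stream state holder (illustration only). */
class SpeakingDetector {
  private speaking = false;
  private silentFrames = 0;

  /** Feed one frame's RMS; returns the speaking state after this frame. */
  update(rms: number): boolean {
    if (rms >= SPEAKING_THRESHOLD) {
      // Any loud frame marks the user as speaking and resets the grace counter.
      this.speaking = true;
      this.silentFrames = 0;
    } else if (this.speaking) {
      // Quiet frames only end the speaking state once the grace period elapses.
      this.silentFrames += 1;
      if (this.silentFrames >= SILENT_FRAME_GRACE) {
        this.speaking = false;
        this.silentFrames = 0;
      }
    }
    return this.speaking;
  }
}
```

In a browser, each poll iteration could fill a `Float32Array` of length `analyser.fftSize` via `analyser.getFloatTimeDomainData(buffer)` and pass `computeRms(buffer)` to `update`. The grace period keeps the indicator from flickering during short pauses between words.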

The service exposes `isSpeaking(userId)` and `volume(userId)` as Angular signals. It automatically tracks remote peers via the `onRemoteStream` and `onPeerDisconnected` observables. Local mic tracking is started explicitly by calling `trackLocalMic(userId, stream)`.

A reactive `speakingMap` signal (a `Map<string, boolean>`) is published whenever any user's speaking state changes, so components can bind directly.
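
One way to publish such a map is to replace it with a new instance only when a user's state actually changes, since Angular signals compare values by reference by default. The helper below is a hypothetical sketch of that idea, not the service's code, and it assumes a missing entry means "not speaking".

```typescript
/**
 * Returns a new Map with the update applied, or the same instance when the
 * user's speaking state is unchanged (a missing entry is treated as false).
 */
function withSpeakingState(
  current: ReadonlyMap<string, boolean>,
  userId: string,
  speaking: boolean,
): ReadonlyMap<string, boolean> {
  const previous = current.get(userId) ?? false;
  if (previous === speaking) {
    return current; // Same reference: nothing to publish.
  }
  const next = new Map(current); // Copy so consumers see a new reference.
  next.set(userId, speaking);
  return next;
}
```

A caller could pass the result to the signal's `set` only when the returned reference differs from the current one, so bound components re-render on real changes only.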

## Voice playback

`VoicePlaybackService` handles audio output for remote peers. Each peer gets an independent Web Audio pipeline. A pipeline is rebuilt only when that peer's set of live voice audio tracks changes. Composite remote-stream notifications (camera, screen share, SDP renegotiation) reuse the existing graph, so `AudioContext` instances are not torn down and recreated unnecessarily.

```mermaid
graph LR
    Remote[Remote stream] --> Src[MediaStreamAudioSourceNode]
    Src --> Gain[GainNode<br/>0 - 200%]
    Gain --> Dest[MediaStreamAudioDestinationNode]
    Dest --> Audio[HTMLAudioElement<br/>.play]

    click Remote "application/services/voice-playback.service.ts" "VoicePlaybackService.setupPeer()" _blank
    click Gain "application/services/voice-playback.service.ts" "VoicePlaybackService.setUserVolume()" _blank
```
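
The "rebuild only when the audio track set changes" rule can be sketched by deriving a key from the live audio track ids and comparing it with the last key seen for that peer. `TrackLike`, `liveAudioTrackKey`, and `PipelineTracker` are hypothetical names. The sketch also assumes every live audio track on the stream counts as voice, which may not match how the real service separates voice from screen-share audio.

```typescript
/** Minimal shape of the MediaStreamTrack fields this sketch reads. */
interface TrackLike {
  id: string;
  kind: string; // "audio" or "video"
  readyState: string; // "live" or "ended"
}

/** Builds an order-independent key from the ids of live audio tracks. */
function liveAudioTrackKey(tracks: readonly TrackLike[]): string {
  return tracks
    .filter((track) => track.kind === "audio" && track.readyState === "live")
    .map((track) => track.id)
    .sort()
    .join("|");
}

/** Hypothetical bookkeeping: remembers the last key seen per peer. */
class PipelineTracker {
  private readonly keys = new Map<string, string>();

  /** Returns true when the peer's audio pipeline should be rebuilt. */
  shouldRebuild(peerId: string, tracks: readonly TrackLike[]): boolean {
    const key = liveAudioTrackKey(tracks);
    if (this.keys.get(peerId) === key) {
      return false; // Same audio tracks: keep the existing graph.
    }
    this.keys.set(peerId, key);
    return true;
  }

  /** Drops the remembered key, e.g. when the peer disconnects. */
  forget(peerId: string): void {
    this.keys.delete(peerId);
  }
}
```

With this shape, a notification that only adds a camera track yields the same key, so the existing graph is reused.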

Per-peer volume is stored in `localStorage` and restored on reconnect. The range is 0% to 200% (gain values 0.0 to 2.0). When the user deafens, all gain nodes are set to zero; undeafening restores the previous values.
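
A minimal sketch of the volume mapping and deafen behaviour, under stated assumptions: the storage key prefix, the 100% default, and the `PeerVolumes` class are all hypothetical. `StorageLike` and `GainLike` are narrow interfaces that `window.localStorage` and a `GainNode` would satisfy.

```typescript
/** Subset of the Web Storage API used by this sketch. */
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

/** Subset of GainNode used by this sketch. */
interface GainLike {
  gain: { value: number };
}

const KEY_PREFIX = "voice-volume:"; // Hypothetical storage key prefix.
const DEFAULT_PERCENT = 100; // Assumed default when nothing is stored.

/** Clamps a percentage to 0-200 and maps it to a gain of 0.0-2.0. */
function percentToGain(percent: number): number {
  return Math.min(200, Math.max(0, percent)) / 100;
}

class PeerVolumes {
  private readonly storage: StorageLike;
  private readonly nodes = new Map<string, GainLike>();
  private deafened = false;

  constructor(storage: StorageLike) {
    this.storage = storage;
  }

  /** Registers a peer's gain node and applies its stored volume. */
  attach(peerId: string, node: GainLike): void {
    this.nodes.set(peerId, node);
    this.apply(peerId);
  }

  setVolume(peerId: string, percent: number): void {
    const clamped = Math.min(200, Math.max(0, percent));
    this.storage.setItem(KEY_PREFIX + peerId, String(clamped));
    this.apply(peerId);
  }

  /** Deafening zeroes every gain; undeafening re-applies stored values. */
  setDeafened(deafened: boolean): void {
    this.deafened = deafened;
    this.nodes.forEach((_node, peerId) => this.apply(peerId));
  }

  private apply(peerId: string): void {
    const node = this.nodes.get(peerId);
    if (!node) {
      return;
    }
    const raw = this.storage.getItem(KEY_PREFIX + peerId);
    const parsed = raw === null ? NaN : Number(raw);
    const percent = Number.isFinite(parsed) ? parsed : DEFAULT_PERCENT;
    node.gain.value = this.deafened ? 0 : percentToGain(percent);
  }
}
```

Because the stored percentage is the source of truth, deafening never overwrites it, which is what lets undeafening and reconnects recover the previous level.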

A Chrome workaround attaches a muted `<audio>` element to keep the `AudioContext` from suspending when no audible output is detected.