# Voice Connection Domain

Bridges the application layer to the low-level realtime infrastructure for voice calls and in-channel camera transport. Provides speaking detection via Web Audio analysis and per-peer volume control for playback. The actual WebRTC plumbing lives in `infrastructure/realtime`; this domain wraps it with a clean facade.

## Module map

```
voice-connection/
├── application/
│   ├── facades/
│   │   └── voice-connection.facade.ts     Proxy to RealtimeSessionFacade for voice and camera signals/methods
│   └── services/
│       ├── voice-activity.service.ts      RMS-based speaking detection via AnalyserNode (per-user signals)
│       └── voice-playback.service.ts      Per-peer GainNode chain, 0-200% volume, deafen support
│
├── domain/
│   └── models/
│       └── voice-connection.model.ts      Re-exports LatencyProfile, VoiceStateSnapshot from shared-kernel / realtime
│
└── index.ts                               Barrel exports
```

## Service relationships

```mermaid
graph TD
    VCF[VoiceConnectionFacade]
    VAS[VoiceActivityService]
    VPS[VoicePlaybackService]
    RSF[RealtimeSessionFacade]
    Models[voice-connection.model]

    VCF --> RSF
    VAS --> VCF
    VPS --> VCF

    click VCF "application/facades/voice-connection.facade.ts" "Proxy to RealtimeSessionFacade" _blank
    click VAS "application/services/voice-activity.service.ts" "RMS-based speaking detection" _blank
    click VPS "application/services/voice-playback.service.ts" "Per-peer GainNode volume chain" _blank
    click RSF "../../infrastructure/realtime/realtime-session.service.ts" "Low-level WebRTC composition root" _blank
    click Models "domain/models/voice-connection.model.ts" "Re-exported types" _blank
```

## Voice connection facade

`VoiceConnectionFacade` exposes signals and methods from `RealtimeSessionFacade` without leaking infrastructure details into feature components. It covers:

- Connection state: `isVoiceConnected`, `isMuted`, `isDeafened`, `isCameraEnabled`, `hasConnectionError`
- Stream access: `getRemoteVoiceStream`, `getRemoteCameraStream`, `getLocalStream`, `getLocalCameraStream`, `getRawMicStream`
- Controls: `enableVoice`, `disableVoice`, `enableCamera`, `disableCamera`, `toggleMute`, `toggleDeafen`, `toggleNoiseReduction`
- Audio tuning: `setOutputVolume`, `setInputVolume`, `setAudioBitrate`, `setLatencyProfile`
- Peer events: `onRemoteStream`, `onPeerConnected`, `onPeerDisconnected`
- Heartbeat: `startVoiceHeartbeat`, `stopVoiceHeartbeat`
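
The proxy pattern described above can be sketched as a thin class that forwards to a backing session object. This is a minimal illustration, not the real facade: `SessionLike`, `VoiceFacadeSketch`, and `FakeSession` are hypothetical names, the method shapes are assumptions (the real members may be Angular signals or observables), and only three members from the list are shown.

```typescript
// Hypothetical stand-in for the infrastructure object. The real
// RealtimeSessionFacade API is not shown in this document.
interface SessionLike {
  isMuted(): boolean;
  toggleMute(): void;
  getRemoteVoiceStream(peerId: string): unknown | null;
}

// Thin facade: forwards every call and keeps no state of its own.
class VoiceFacadeSketch {
  private readonly session: SessionLike;

  constructor(session: SessionLike) {
    this.session = session;
  }

  isMuted(): boolean {
    return this.session.isMuted();
  }

  toggleMute(): void {
    this.session.toggleMute();
  }

  getRemoteVoiceStream(peerId: string): unknown | null {
    return this.session.getRemoteVoiceStream(peerId);
  }
}

// In-memory fake (an assumption for illustration) that lets feature
// components be exercised without any WebRTC plumbing.
class FakeSession implements SessionLike {
  private muted = false;

  isMuted(): boolean {
    return this.muted;
  }

  toggleMute(): void {
    this.muted = !this.muted;
  }

  getRemoteVoiceStream(_peerId: string): unknown | null {
    return null;
  }
}
```

Because feature components depend only on the facade, a fake like this can stand in for the realtime layer in unit tests.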

## Camera transport

Camera capture is treated as voice-adjacent transport, not screen share. The underlying realtime layer routes webcam video only to peers in the same active voice channel, exposes remote camera streams through `getRemoteCameraStream(peerId)`, and keeps webcam senders separate from screen-share senders so both features can run at the same time.

## Speaking detection

`VoiceActivityService` monitors audio levels for local and remote streams using the Web Audio API. Each tracked stream gets its own `AudioContext` with an `AnalyserNode`. A single `requestAnimationFrame` loop polls all analysers.

```mermaid
graph LR
    Stream[MediaStream] --> Ctx[AudioContext]
    Ctx --> Src[MediaStreamAudioSourceNode]
    Src --> Analyser[AnalyserNode<br/>fftSize = 256]
    Analyser --> Poll[rAF poll loop]
    Poll --> RMS{RMS >= 0.015?}
    RMS -- yes --> Speaking[speakingSignal = true]
    RMS -- no, 8 frames --> Silent[speakingSignal = false]

    click Stream "application/services/voice-activity.service.ts" "VoiceActivityService.trackStream()" _blank
    click Poll "application/services/voice-activity.service.ts" "VoiceActivityService.poll()" _blank
```

| Parameter | Value |
|---|---|
| FFT size | 256 samples |
| Speaking threshold | RMS >= 0.015 |
| Silent grace period | 8 consecutive frames below threshold |

The service exposes `isSpeaking(userId)` and `volume(userId)` as Angular signals. It automatically tracks remote peers via the `onRemoteStream` and `onPeerDisconnected` observables. Local mic tracking is started explicitly by calling `trackLocalMic(userId, stream)`.

A reactive `speakingMap` signal (a `Map<string, boolean>`) is published whenever any user's speaking state changes, so components can bind directly.
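
The threshold and grace-period rules from the parameter table can be sketched as a small per-user state machine. This is a minimal sketch, not the service's actual code: `computeRms` and `SpeakingDetector` are hypothetical names, and it assumes the samples are floats in the range [-1, 1], such as those filled in by `AnalyserNode.getFloatTimeDomainData()`.

```typescript
// Values taken from the parameter table above.
const SPEAKING_THRESHOLD = 0.015;
const SILENT_GRACE_FRAMES = 8;

/** Root-mean-square of time-domain samples in the range [-1, 1]. */
function computeRms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

/** Hypothetical per-user detector: speak at once, go silent after a grace period. */
class SpeakingDetector {
  private speaking = false;
  private silentFrames = 0;

  /** Feed one polled frame; returns the speaking state after this frame. */
  update(samples: Float32Array): boolean {
    if (computeRms(samples) >= SPEAKING_THRESHOLD) {
      // Any frame at or above the threshold marks the user as speaking.
      this.speaking = true;
      this.silentFrames = 0;
    } else if (this.speaking) {
      // Below threshold: only flip to silent after enough consecutive frames.
      this.silentFrames += 1;
      if (this.silentFrames >= SILENT_GRACE_FRAMES) {
        this.speaking = false;
        this.silentFrames = 0;
      }
    }
    return this.speaking;
  }
}
```

The grace period acts as hysteresis: short pauses between words do not make the speaking indicator flicker, because a single loud frame resets the silent-frame counter.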

## Voice playback

`VoicePlaybackService` handles audio output for remote peers. Each peer gets an independent Web Audio pipeline. A pipeline is rebuilt only when that peer's set of live voice audio tracks changes. Composite remote-stream notifications (camera, screen share, SDP renegotiation) reuse the existing graph, so `AudioContext` instances are not churned.

```mermaid
graph LR
    Remote[Remote stream] --> Src[MediaStreamAudioSourceNode]
    Src --> Gain[GainNode<br/>0 - 200%]
    Gain --> Dest[MediaStreamAudioDestinationNode]
    Dest --> Audio[HTMLAudioElement<br/>.play]

    click Remote "application/services/voice-playback.service.ts" "VoicePlaybackService.setupPeer()" _blank
    click Gain "application/services/voice-playback.service.ts" "VoicePlaybackService.setUserVolume()" _blank
```

Per-peer volume is stored in `localStorage` and restored on reconnect. The range is 0% to 200% (gain values 0.0 to 2.0). When the user deafens, all gain nodes are set to zero; undeafening restores the previous values.

A Chrome workaround attaches a muted `<audio>` element to keep the `AudioContext` from suspending when no audible output is detected.
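
The volume and deafen rules above can be sketched without a browser by using structural stand-ins for `GainNode` and `localStorage`. This is a minimal sketch under stated assumptions: `PeerVolumeSketch`, `GainLike`, `StorageLike`, the `voice-volume:` key prefix, and the 100% default are all hypothetical, and only `setUserVolume` is a name that appears in this document.

```typescript
// Structural stand-ins so the sketch runs outside a browser. A real
// GainNode and window.localStorage satisfy these shapes.
interface GainLike {
  gain: { value: number };
}
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_PREFIX = "voice-volume:"; // hypothetical key prefix
const DEFAULT_PERCENT = 100; // assumed default for peers with no saved value

/** Clamp a volume percentage to the documented 0-200 range. */
function clampPercent(percent: number): number {
  return Math.min(200, Math.max(0, percent));
}

/** Convert a percentage (0-200) to a gain value (0.0-2.0). */
function percentToGain(percent: number): number {
  return clampPercent(percent) / 100;
}

class PeerVolumeSketch {
  private readonly nodes = new Map<string, GainLike>();
  private readonly percents = new Map<string, number>();
  private readonly storage: StorageLike;
  private deafened = false;

  constructor(storage: StorageLike) {
    this.storage = storage;
  }

  /** Register a peer's gain node and restore its saved volume, if any. */
  attach(peerId: string, node: GainLike): void {
    const raw = this.storage.getItem(STORAGE_PREFIX + peerId);
    const saved = raw === null ? DEFAULT_PERCENT : Number(raw);
    this.nodes.set(peerId, node);
    this.percents.set(peerId, Number.isFinite(saved) ? saved : DEFAULT_PERCENT);
    this.apply(peerId);
  }

  /** Store and persist the clamped volume, then update the gain node. */
  setUserVolume(peerId: string, percent: number): void {
    const clamped = clampPercent(percent);
    this.percents.set(peerId, clamped);
    this.storage.setItem(STORAGE_PREFIX + peerId, String(clamped));
    this.apply(peerId);
  }

  /** Deafen zeroes every gain; undeafen reapplies the remembered volumes. */
  setDeafened(deafened: boolean): void {
    this.deafened = deafened;
    this.nodes.forEach((_node, peerId) => this.apply(peerId));
  }

  private apply(peerId: string): void {
    const node = this.nodes.get(peerId);
    if (!node) return;
    const percent = this.percents.get(peerId) ?? DEFAULT_PERCENT;
    node.gain.value = this.deafened ? 0 : percentToGain(percent);
  }
}
```

Keeping the remembered percentage separate from the live gain value is what lets undeafening restore each peer's previous volume instead of resetting it.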