# Realtime Infrastructure

Low-level WebRTC and WebSocket plumbing that the rest of the app sits on top of. Nothing in here knows about Angular components, NgRx, or domain logic. It exposes observables, signals, and callbacks that higher layers (facades, effects, components) consume.

## Module map

```
realtime/
├── realtime-session.service.ts                  Composition root (WebRTCService)
├── realtime.types.ts                            PeerData, credentials, tracker types
├── realtime.constants.ts                        ICE servers, signal types, bitrates, intervals
│
├── signaling/                                   WebSocket layer
│   ├── signaling.manager.ts                     One WebSocket per signaling URL
│   ├── signaling-transport-handler.ts           Routes messages to the right socket
│   ├── server-signaling-coordinator.ts          Maps peers/servers to signaling URLs
│   ├── signaling-message-handler.ts             Dispatches incoming signaling messages
│   └── server-membership-signaling-handler.ts   Join / leave / switch protocol
│
├── peer-connection-manager/                     WebRTC peer connections
│   ├── peer-connection.manager.ts               Owns all RTCPeerConnection instances
│   ├── shared.ts                                PeerData type + state factory
│   ├── connection/
│   │   ├── create-peer-connection.ts            RTCPeerConnection factory (ICE, transceivers)
│   │   └── negotiation.ts                       Offer/answer/ICE with collision handling
│   ├── messaging/
│   │   ├── data-channel.ts                      Ordered data channel for chat + control
│   │   └── ping.ts                              Latency measurement (PING/PONG every 5s)
│   ├── recovery/
│   │   └── peer-recovery.ts                     Disconnect grace period + reconnect loop
│   └── streams/
│       └── remote-streams.ts                    Classifies incoming tracks (voice vs screen)
│
├── media/                                       Local capture and processing
│   ├── media.manager.ts                         getUserMedia, mute, deafen, gain pipeline
│   ├── noise-reduction.manager.ts               RNNoise AudioWorklet graph
│   ├── voice-session-controller.ts              Higher-level wrapper over MediaManager
│   ├── screen-share.manager.ts                  Screen capture + per-peer track distribution
│   └── screen-share-platforms/
│       ├── shared.ts                            Electron desktopCapturer types
│       ├── browser-screen-share.capture.ts            Standard getDisplayMedia
│       ├── desktop-electron-screen-share.capture.ts   Electron source picker (Windows)
│       └── linux-electron-screen-share.capture.ts     PulseAudio/PipeWire routing (Linux)
│
├── streams/                                     Stream facades
│   ├── peer-media-facade.ts                     Unified API over peers, media, screen share
│   └── remote-screen-share-request-controller.ts   On-demand screen share delivery
│
├── state/
│   └── webrtc-state-controller.ts               Angular Signals for all connection state
│
└── logging/
    ├── webrtc-logger.ts                         Conditional [WebRTC] prefixed logging
    └── debug-network-metrics.ts                 Per-peer stats (drops, latency, throughput)
```

## How it all fits together

`WebRTCService` is the composition root. It instantiates every other manager, then wires their callbacks together after construction (to avoid circular references). No manager imports another manager directly.

```mermaid
graph TD
    WS[WebRTCService<br/>composition root]

    WS --> SC[SignalingTransportHandler]
    WS --> PCM[PeerConnectionManager]
    WS --> MM[MediaManager]
    WS --> SSM[ScreenShareManager]
    WS --> State[WebRtcStateController<br/>Angular Signals]
    WS --> VSC[VoiceSessionController]
    WS --> PMF[PeerMediaFacade]
    WS --> RSSRC[RemoteScreenShareRequestController]

    SC --> SM1[SignalingManager<br/>socket A]
    SC --> SM2[SignalingManager<br/>socket B]
    SC --> Coord[ServerSignalingCoordinator]

    PCM --> Conn[create-peer-connection]
    PCM --> Neg[negotiation]
    PCM --> DC[data-channel]
    PCM --> Ping[ping]
    PCM --> Rec[peer-recovery]
    PCM --> RS[remote-streams]

    MM --> NR[NoiseReductionManager<br/>RNNoise worklet]
    SSM --> BrowserCap[Browser capture]
    SSM --> ElectronCap[Electron capture]
    SSM --> LinuxCap[Linux audio routing]

    click WS "realtime-session.service.ts" "WebRTCService - composition root" _blank
    click SC "signaling/signaling-transport-handler.ts" "Routes messages to the right WebSocket" _blank
    click PCM "peer-connection-manager/peer-connection.manager.ts" "Owns all RTCPeerConnection instances" _blank
    click MM "media/media.manager.ts" "getUserMedia, mute, deafen, gain pipeline" _blank
    click SSM "media/screen-share.manager.ts" "Screen capture and per-peer distribution" _blank
    click State "state/webrtc-state-controller.ts" "Angular Signals for connection state" _blank
    click VSC "media/voice-session-controller.ts" "Higher-level voice session wrapper" _blank
    click PMF "streams/peer-media-facade.ts" "Unified API over peers, media, screen share" _blank
    click RSSRC "streams/remote-screen-share-request-controller.ts" "On-demand screen share delivery" _blank
    click SM1 "signaling/signaling.manager.ts" "One WebSocket per signaling URL" _blank
    click SM2 "signaling/signaling.manager.ts" "One WebSocket per signaling URL" _blank
    click Coord "signaling/server-signaling-coordinator.ts" "Maps peers/servers to signaling URLs" _blank
    click Conn "peer-connection-manager/connection/create-peer-connection.ts" "RTCPeerConnection factory" _blank
    click Neg "peer-connection-manager/connection/negotiation.ts" "Offer/answer/ICE with collision handling" _blank
    click DC "peer-connection-manager/messaging/data-channel.ts" "Ordered data channel for chat + control" _blank
    click Ping "peer-connection-manager/messaging/ping.ts" "Latency measurement via PING/PONG" _blank
    click Rec "peer-connection-manager/recovery/peer-recovery.ts" "Disconnect grace period + reconnect loop" _blank
    click RS "peer-connection-manager/streams/remote-streams.ts" "Classifies incoming tracks" _blank
    click NR "media/noise-reduction.manager.ts" "RNNoise AudioWorklet graph" _blank
    click BrowserCap "media/screen-share-platforms/browser-screen-share.capture.ts" "Standard getDisplayMedia" _blank
    click ElectronCap "media/screen-share-platforms/desktop-electron-screen-share.capture.ts" "Electron source picker" _blank
    click LinuxCap "media/screen-share-platforms/linux-electron-screen-share.capture.ts" "PulseAudio/PipeWire routing" _blank
```

## Signaling (WebSocket)

The signaling layer's only job is getting two peers to exchange SDP offers/answers and ICE candidates so they can establish a direct WebRTC connection. Once the peer connection is up, signaling is only used for presence (user joined/left) and reconnection.

Each signaling URL gets its own `SignalingManager` (one WebSocket each). `SignalingTransportHandler` picks the right socket based on which server the message is for. `ServerSignalingCoordinator` tracks which peers belong to which servers and signaling URLs, so we know when it is safe to tear down a peer connection after leaving a server.

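That per-URL routing can be sketched with illustrative message shapes; the wire format and the `routeToSocket` helper below are assumptions for illustration, not the app's real API:

```typescript
// Illustrative signaling message shapes (the actual protocol may differ).
type SignalingMessage =
  | { type: "identify"; token: string }
  | { type: "join_server"; serverId: string }
  | { type: "offer" | "answer"; sdp: string; to: string };

// One socket per signaling URL; the transport handler picks the right one.
function routeToSocket(
  sockets: Map<string, (msg: string) => void>,
  signalingUrl: string,
  msg: SignalingMessage,
): boolean {
  const send = sockets.get(signalingUrl);
  if (!send) return false; // no socket open for this URL
  send(JSON.stringify(msg));
  return true;
}
```
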
```mermaid
sequenceDiagram
    participant UI as App
    participant STH as SignalingTransportHandler
    participant SM as SignalingManager
    participant WS as WebSocket
    participant Srv as Signaling Server

    UI->>STH: identify(credentials)
    STH->>SM: send(identify message)
    SM->>WS: ws.send(JSON)
    WS->>Srv: identify

    UI->>STH: joinServer(serverId)
    STH->>SM: send(join_server)
    SM->>WS: ws.send(JSON)

    Srv-->>WS: server_users [peerA, peerB]
    WS-->>SM: onmessage
    SM-->>STH: messageReceived$
    STH-->>UI: routes to SignalingMessageHandler
```

### Reconnection

When the WebSocket drops, `SignalingManager` schedules reconnection with exponential backoff (1s, 2s, 4s, ... up to 30s). On reconnect it replays the cached `identify` and `join_server` messages so presence is restored without the UI doing anything.

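The backoff schedule and replay step can be sketched as small helpers; `backoffDelay`, `replayMessages`, and `CachedSession` are hypothetical names, with the real logic living in `SignalingManager`:

```typescript
// Hypothetical sketch of the reconnect backoff described above.
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 30_000;

function backoffDelay(attempt: number): number {
  // attempt 0 -> 1s, 1 -> 2s, 2 -> 4s, ... capped at 30s
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

// After the socket reopens, cached messages are replayed so the server
// restores presence without any UI involvement.
interface CachedSession {
  identify: string;        // serialized identify message
  joinedServers: string[]; // server IDs to re-join
}

function replayMessages(send: (msg: string) => void, cache: CachedSession): void {
  send(cache.identify);
  for (const serverId of cache.joinedServers) {
    send(JSON.stringify({ type: "join_server", serverId }));
  }
}
```
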
## Peer connection lifecycle

Peers connect to each other directly with `RTCPeerConnection`. The "initiator" (whoever was already in the room) creates the data channel and audio/video transceivers, then sends an offer. The other side creates an answer.

```mermaid
sequenceDiagram
    participant A as Peer A (initiator)
    participant Sig as Signaling Server
    participant B as Peer B

    Note over A: createPeerConnection(B, initiator=true)
    Note over A: Creates data channel + transceivers

    A->>Sig: offer (SDP)
    Sig->>B: offer (SDP)

    Note over B: createPeerConnection(A, initiator=false)
    Note over B: setRemoteDescription(offer)
    Note over B: Attach local audio tracks
    B->>Sig: answer (SDP)
    Sig->>A: answer (SDP)
    Note over A: setRemoteDescription(answer)

    A->>Sig: ICE candidates
    Sig->>B: ICE candidates
    B->>Sig: ICE candidates
    Sig->>A: ICE candidates

    Note over A,B: RTCPeerConnection state -> "connected"
    Note over A,B: Data channel opens, voice flows
```

### Offer collision

Both peers might send offers at the same time ("glare"). The negotiation module implements the "polite peer" pattern: one side is designated polite (the non-initiator) and will roll back its local offer if it detects a collision, then accept the remote offer instead. The impolite side ignores the incoming offer.

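The collision check follows the shape of the well-known "perfect negotiation" glare test; `shouldIgnoreOffer` and `GlareContext` are illustrative names, not the negotiation module's actual API:

```typescript
// Sketch of the glare check from the polite-peer pattern.
interface GlareContext {
  polite: boolean;        // the non-initiator side in this codebase
  makingOffer: boolean;   // we currently have a local offer in flight
  signalingState: string; // e.g. "stable", "have-local-offer"
}

// An incoming offer collides when we are mid-offer ourselves or the
// connection is not stable. The impolite peer ignores it; the polite
// peer rolls back its local offer and accepts the remote one instead.
function shouldIgnoreOffer(ctx: GlareContext, incomingType: string): boolean {
  const collision =
    incomingType === "offer" && (ctx.makingOffer || ctx.signalingState !== "stable");
  return !ctx.polite && collision;
}
```
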
### Disconnect recovery

```mermaid
stateDiagram-v2
    [*] --> Connected
    Connected --> Disconnected: connectionState = "disconnected"
    Disconnected --> Connected: recovers within 10s
    Disconnected --> Failed: grace period expires
    Failed --> Reconnecting: schedule reconnect (every 5s)
    Reconnecting --> Connected: new offer accepted
    Reconnecting --> GaveUp: 12 attempts failed
    Connected --> Closed: leave / cleanup
    GaveUp --> [*]
    Closed --> [*]
```

When a peer connection enters `disconnected`, a 10-second grace period starts. If it recovers on its own (network blip), nothing happens. If it reaches `failed`, the connection is torn down and a reconnect loop starts: a fresh `RTCPeerConnection` is created and a new offer is sent every 5 seconds, up to 12 attempts.

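That policy condenses into a small decision function; `nextRecoveryStep` is a hypothetical name, and the real loop in `peer-recovery.ts` additionally manages timers and connection teardown:

```typescript
// Illustrative sketch of the recovery policy: 10s grace period,
// retries every 5s, give up after 12 attempts.
const GRACE_PERIOD_MS = 10_000;
const RETRY_INTERVAL_MS = 5_000;
const MAX_ATTEMPTS = 12;

type RecoveryDecision = "wait" | "retry" | "give-up";

function nextRecoveryStep(connectionState: string, attempt: number): RecoveryDecision {
  if (connectionState === "disconnected") return "wait"; // grace period running
  if (attempt >= MAX_ATTEMPTS) return "give-up";         // stop retrying
  return "retry";                                        // fresh RTCPeerConnection + new offer
}
```
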
## Data channel

A single ordered data channel carries all peer-to-peer messages: chat events, voice/screen state broadcasts, state requests, pings, and screen share control.

Back-pressure is handled with a high-water mark (4 MB) and low-water mark (1 MB). `sendToPeerBuffered()` waits for the buffer to drain before sending, which matters during file transfers.

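A minimal standalone sketch of that back-pressure scheme, using the water marks from the prose; the real method is `sendToPeerBuffered()` on an `RTCDataChannel`, while `sendBuffered` and the `BufferedChannel` interface here are illustrative:

```typescript
// High/low water marks from the prose above.
const HIGH_WATER_MARK = 4 * 1024 * 1024; // 4 MB
const LOW_WATER_MARK = 1 * 1024 * 1024;  // 1 MB

// Structural subset of RTCDataChannel, so the sketch is self-contained.
interface BufferedChannel {
  bufferedAmount: number;
  bufferedAmountLowThreshold: number;
  send(data: string): void;
  onbufferedamountlow: (() => void) | null;
}

async function sendBuffered(channel: BufferedChannel, data: string): Promise<void> {
  if (channel.bufferedAmount > HIGH_WATER_MARK) {
    // Too much queued: wait until the buffer drains below the low-water
    // mark before sending, instead of growing the queue unboundedly.
    channel.bufferedAmountLowThreshold = LOW_WATER_MARK;
    await new Promise<void>(resolve => {
      channel.onbufferedamountlow = () => {
        channel.onbufferedamountlow = null;
        resolve();
      };
    });
  }
  channel.send(data);
}
```
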
Every 5 seconds a PING message is sent to each peer. The peer responds with PONG carrying the original timestamp, and the round-trip latency is stored in a signal.

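The PING/PONG exchange can be sketched with illustrative message shapes (the exact wire format is an assumption):

```typescript
// Hypothetical wire shapes for the latency probe.
interface PingMessage { type: "PING"; sentAt: number; }
interface PongMessage { type: "PONG"; sentAt: number; }

function makePing(now: number): PingMessage {
  return { type: "PING", sentAt: now };
}

// The peer echoes the original timestamp back unchanged.
function makePong(ping: PingMessage): PongMessage {
  return { type: "PONG", sentAt: ping.sentAt };
}

// Round-trip latency measured when the PONG arrives.
function latencyMs(pong: PongMessage, now: number): number {
  return now - pong.sentAt;
}
```
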
## Media pipeline

### Voice

```mermaid
graph LR
    Mic[getUserMedia] --> Raw[Raw mic stream]
    Raw --> RNN{RNNoise<br/>enabled?}
    RNN -- yes --> Worklet[AudioWorklet<br/>NoiseSuppressor]
    RNN -- no --> Gain
    Worklet --> Gain{Input gain<br/>adjusted?}
    Gain -- yes --> GainNode[GainNode pipeline]
    Gain -- no --> Out[Local media stream]
    GainNode --> Out
    Out --> Peers[replaceTrack on<br/>all peer audio senders]

    click Mic "media/media.manager.ts" "MediaManager.enableVoice()" _blank
    click Worklet "media/noise-reduction.manager.ts" "NoiseReductionManager.enable()" _blank
    click GainNode "media/media.manager.ts" "MediaManager.applyInputGainToCurrentStream()" _blank
    click Out "media/media.manager.ts" "MediaManager.localMediaStream" _blank
    click Peers "media/media.manager.ts" "MediaManager.bindLocalTracksToAllPeers()" _blank
```

`MediaManager` grabs the mic with `getUserMedia`, optionally pipes it through the RNNoise AudioWorklet for noise reduction (48 kHz, loaded from `rnnoise-worklet.js`), optionally runs it through a `GainNode` for input volume control, and then pushes the resulting stream to every connected peer via `replaceTrack`.

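A sketch of the gain stage, assuming a percentage-based input volume setting; `gainFromPercent` is a hypothetical helper, and the Web Audio wiring is shown as comments because it only runs in a browser:

```typescript
// Hypothetical mapping from a user-facing volume percentage to a
// GainNode value: 100% is unity gain, clamped to a sane range.
function gainFromPercent(percent: number): number {
  return Math.min(Math.max(percent, 0), 200) / 100;
}

// Browser-only wiring sketch (shape only, not the app's exact code):
//   const ctx = new AudioContext();
//   const src = ctx.createMediaStreamSource(rawStream);
//   const gain = ctx.createGain();
//   gain.gain.value = gainFromPercent(userInputGainPercent);
//   const dest = ctx.createMediaStreamDestination();
//   src.connect(gain).connect(dest);
//   // dest.stream becomes the local stream pushed to peers via replaceTrack.
```
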
Mute just disables the audio track (`track.enabled = false`); the connection stays up. Deafen suppresses incoming audio playback on the local side.

### Screen share

Screen capture uses a platform-specific strategy:

| Platform | Capture method |
|---|---|
| Browser | `getDisplayMedia` with quality presets |
| Windows (Electron) | Electron `desktopCapturer.getSources()` with a source picker UI |
| Linux (Electron) | `getDisplayMedia` for video + PulseAudio/PipeWire routing for system audio, keeping voice playback out of the capture |

Screen share tracks are distributed on-demand. A peer sends a `SCREEN_SHARE_REQUEST` message over the data channel, and only then does the sharer attach screen tracks to that peer's connection and renegotiate.

```mermaid
sequenceDiagram
    participant V as Viewer
    participant S as Sharer

    V->>S: SCREEN_SHARE_REQUEST (data channel)
    Note over S: Add viewer to requestedViewerPeerIds
    Note over S: Attach screen video + audio senders
    S->>V: renegotiate (new offer with screen tracks)
    V->>S: answer
    Note over V: ontrack fires with screen video
    Note over V: Classified as screen share stream
    Note over V: UI renders video

    V->>S: SCREEN_SHARE_STOP (data channel)
    Note over S: Remove screen senders
    S->>V: renegotiate (offer without screen tracks)
```

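The sharer-side bookkeeping for on-demand delivery can be sketched as follows; `requestedViewerPeerIds` comes from the diagram, while `handleScreenShareControl` is an illustrative name:

```typescript
// Viewers who asked for the screen share over the data channel.
const requestedViewerPeerIds = new Set<string>();

type ScreenShareControl = "SCREEN_SHARE_REQUEST" | "SCREEN_SHARE_STOP";

// Returns whether the sharer should (re)negotiate with this peer:
// on REQUEST, attach screen senders and send a new offer; on STOP,
// remove them, but only renegotiate if the peer was actually viewing.
function handleScreenShareControl(peerId: string, msg: ScreenShareControl): boolean {
  if (msg === "SCREEN_SHARE_REQUEST") {
    requestedViewerPeerIds.add(peerId);
    return true;
  }
  return requestedViewerPeerIds.delete(peerId);
}
```
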
## State

`WebRtcStateController` holds all connection state as Angular Signals: `isConnected`, `isMuted`, `isDeafened`, `isScreenSharing`, `connectedPeers`, `peerLatencies`, etc. Managers call update methods on the controller after state changes. Components and facades read these signals reactively.

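A minimal sketch of the controller's shape; it uses a tiny stand-in for Angular's `signal()` so the example is self-contained, and the field set is just a subset of what the prose lists:

```typescript
// Tiny stand-in for Angular's signal(): a readable function with .set().
function signal<T>(initial: T): (() => T) & { set(v: T): void } {
  let value = initial;
  const read = (() => value) as (() => T) & { set(v: T): void };
  read.set = (v: T) => { value = v; };
  return read;
}

// Shape of the state controller: managers write, components read.
class WebRtcStateSketch {
  readonly isConnected = signal(false);
  readonly isMuted = signal(false);
  readonly peerLatencies = signal<Record<string, number>>({});

  // Called by the ping module after a PONG arrives.
  setPeerLatency(peerId: string, ms: number): void {
    this.peerLatencies.set({ ...this.peerLatencies(), [peerId]: ms });
  }
}
```
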
## Logging

`WebRTCLogger` wraps `console.*` with a `[WebRTC]` prefix and a debug flag so logging can be toggled at runtime. `DebugNetworkMetrics` tracks per-peer stats (connection drops, handshake counts, message counts, download rates) for the debug console UI.

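A sketch of the conditional prefixed logger; the real `WebRTCLogger` API may differ, and the injectable sink here is an illustrative choice to keep the example testable:

```typescript
// Sketch of a runtime-toggleable, prefixed logger.
class PrefixedLogger {
  constructor(
    private enabled: boolean,
    private sink: (msg: string) => void = msg => console.log(msg),
  ) {}

  setEnabled(on: boolean): void {
    this.enabled = on;
  }

  log(message: string): void {
    // Drop everything silently when the debug flag is off.
    if (this.enabled) this.sink(`[WebRTC] ${message}`);
  }
}
```
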
## ICE and STUN

WebRTC connections require a way for two peers to discover how to reach each other across different networks (NATs, firewalls, etc.). This is handled by ICE, with help from STUN.

### ICE (Interactive Connectivity Establishment)

ICE is the mechanism WebRTC uses to establish a connection between peers. Instead of relying on a single network path, it:

- Gathers multiple possible connection candidates (IP address + port pairs)
- Exchanges those candidates via the signaling layer
- Attempts connectivity checks between all candidate pairs
- Selects the first working path

Typical candidate types include:

- **Host candidates** - local network interfaces (e.g. LAN IPs)
- **Server reflexive candidates** - public-facing address discovered via STUN
- **Relay candidates** - provided by TURN servers (fallback)

ICE runs automatically as part of `RTCPeerConnection`. As candidates are discovered, they are emitted via `onicecandidate` and must be forwarded to the remote peer through signaling.

Connection state transitions (e.g. `checking` → `connected` → `failed`) reflect ICE progress.

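Candidate forwarding can be sketched like this; the peer connection is structurally typed so the same shape applies to a real `RTCPeerConnection`, and `sendSignal` is a hypothetical callback into the signaling layer:

```typescript
// Structural subset of RTCPeerConnection, so the sketch is self-contained.
interface CandidateEvent { candidate: { candidate: string } | null; }
interface PeerConnectionLike {
  onicecandidate: ((ev: CandidateEvent) => void) | null;
}

function forwardIceCandidates(
  pc: PeerConnectionLike,
  sendSignal: (payload: { type: "ice"; candidate: string }) => void,
): void {
  pc.onicecandidate = ev => {
    // A null candidate means gathering is complete; nothing to forward.
    if (ev.candidate) sendSignal({ type: "ice", candidate: ev.candidate.candidate });
  };
}
```
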
### STUN (Session Traversal Utilities for NAT)

STUN is used to determine a peer's public-facing IP address and port when behind a NAT.

A STUN server responds with the external address it observes for a request. This allows a peer to generate a **server reflexive candidate**, which can be used by other peers to attempt a direct connection.

Without STUN, only local (host) candidates would be available, which typically do not work across different networks.

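An illustrative ICE server configuration; the app's actual list lives in `realtime.constants.ts`, and the Google STUN URL here is just a commonly cited public example:

```typescript
// Illustrative ICE server list (not necessarily the app's real one).
const ICE_SERVERS: { urls: string }[] = [
  { urls: "stun:stun.l.google.com:19302" },
];

// Passed to the RTCPeerConnection factory (browser-only):
//   new RTCPeerConnection({ iceServers: ICE_SERVERS });
```
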
### TURN

TURN (Traversal Using Relays around NAT) is a fallback mechanism used in some WebRTC systems when direct peer-to-peer connectivity cannot be established.

Instead of connecting peers directly:

- Each peer establishes a connection to a TURN server
- The TURN server relays all media and data between peers

This approach is more reliable in restrictive network environments but introduces additional latency and bandwidth overhead, since all traffic flows through the relay instead of directly between peers.

Toju/Zoracord does not use TURN and has no code to support it.

### Summary

- **ICE** coordinates connection establishment by trying multiple network paths
- **STUN** provides public-facing address discovery for NAT traversal