Repair connectivity correctly v1

2026-05-17 15:15:14 +02:00
parent e769a6ee4a
commit 9d0a4478b2
18 changed files with 1125 additions and 25 deletions

@@ -172,6 +172,8 @@ Join and leave broadcasts are also identity-aware: `handleJoinServer` only broad
Peer routing also has to stay scoped to the signaling server that reported the membership. A `user_left` from one signaling cluster must only subtract that cluster's shared servers; otherwise a leave on `signal.toju.app` can incorrectly tear down a peer that is still shared through `signal-sweden.toju.app` or a local signaling server. Route metadata is therefore kept across peer recreation and only cleared once the renderer no longer shares any servers with that peer.
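The scoping rule above can be sketched as a small route table keyed by peer and then by signaling URL. This is a minimal illustration, not the project's actual code: `PeerRouteTable`, `addShared`, `handleUserLeft`, and `sharedSignalUrls` are hypothetical names, and keying by plain hostname is an assumption.

```typescript
// Hypothetical sketch: only the scoping rule itself comes from the text above.
type SignalUrl = string;
type ServerId = string;

class PeerRouteTable {
  // peerId -> signaling URL -> servers shared with that peer via that URL
  private routes = new Map<string, Map<SignalUrl, Set<ServerId>>>();

  addShared(peerId: string, signalUrl: SignalUrl, serverId: ServerId): void {
    const bySignal = this.routes.get(peerId) ?? new Map<SignalUrl, Set<ServerId>>();
    const servers = bySignal.get(signalUrl) ?? new Set<ServerId>();
    servers.add(serverId);
    bySignal.set(signalUrl, servers);
    this.routes.set(peerId, bySignal);
  }

  // Apply a `user_left` reported by ONE signaling server. Only that server's
  // entry is reduced. Returns true when nothing is shared any more, which is
  // the only point where the peer may be torn down and its routes cleared.
  handleUserLeft(peerId: string, signalUrl: SignalUrl, serverId: ServerId): boolean {
    const bySignal = this.routes.get(peerId);
    if (!bySignal) return true;
    const servers = bySignal.get(signalUrl);
    if (servers) {
      servers.delete(serverId);
      if (servers.size === 0) bySignal.delete(signalUrl);
    }
    if (bySignal.size === 0) {
      this.routes.delete(peerId);
      return true;
    }
    return false;
  }

  sharedSignalUrls(peerId: string): SignalUrl[] {
    const bySignal = this.routes.get(peerId);
    return bySignal ? Array.from(bySignal.keys()) : [];
  }
}
```

With this shape, a leave reported by `signal.toju.app` cannot affect the entry recorded for `signal-sweden.toju.app`, because the two are stored under separate keys.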
When local voice is active, a transient `user_left` or stale presence snapshot must not immediately mute or tear down a peer whose P2P transport is still alive. The users store receives the current connected peer IDs with signaling presence updates and preserves live voice/camera/screen state while that transport is connected. The signaling handler also preserves an active voice peer route after a `user_left` blip so a later data-channel or peer reconnect can still target the correct signaling URL. Explicit voice leave still travels as a `voice-state` event with `isConnected: false`, and closed/failed peer connections still clean themselves up through the peer recovery path.
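One way to picture the presence rule above is a pure function that merges a presence snapshot with the set of peers whose transport is still connected. The sketch below is an assumption-heavy illustration: `UserEntry`, `VoiceState`, and `applyPresenceSnapshot` are hypothetical names, and the real users store may track more fields (camera and screen state are omitted here).

```typescript
// Hypothetical shapes; the real store is not shown in this section.
interface VoiceState { isConnected: boolean; isMuted: boolean; }
interface UserEntry { id: string; online: boolean; voice?: VoiceState; }

function applyPresenceSnapshot(
  current: UserEntry[],
  presentIds: ReadonlySet<string>,       // users listed in the signaling snapshot
  connectedPeerIds: ReadonlySet<string>, // peers whose P2P transport is connected
  localVoiceActive: boolean
): UserEntry[] {
  return current.map((user) => {
    if (presentIds.has(user.id)) return { ...user, online: true };

    // Absent from the snapshot: this may be a transient `user_left` or a stale
    // snapshot. Keep live voice state while the transport is still connected.
    const transportAlive = localVoiceActive && connectedPeerIds.has(user.id);
    if (transportAlive && user.voice?.isConnected) return user;

    // Otherwise treat the user as gone and drop media state.
    return { id: user.id, online: false };
  });
}
```

An explicit leave would not go through this path at all; per the text above it arrives as a `voice-state` event with `isConnected: false`.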
## Peer connection lifecycle
Peers connect to each other directly with `RTCPeerConnection`. The initiator is chosen deterministically from the identified logical peer IDs so only one side creates the offer and primary data channel for a given pair. The other side creates an answer. If identity or negotiation is still settling, the retry timer defers instead of comparing against the ephemeral local transport ID or reusing a half-open peer forever.
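A deterministic election of this kind can be as small as a string comparison on the logical peer IDs. The sketch below is illustrative only: `electRole` is a hypothetical name, and the specific tie-break (the lexicographically smaller ID creates the offer) is an assumption, since the text only says the choice is deterministic.

```typescript
type Role = "offer" | "answer" | "defer";

// Both sides run the same comparison on the same pair of logical IDs, so they
// reach complementary answers without exchanging any extra messages.
function electRole(
  localLogicalId: string | undefined,
  remoteLogicalId: string | undefined
): Role {
  // Identity still settling (or a nonsensical self-pair): defer and retry
  // later rather than falling back to an ephemeral transport ID.
  if (!localLogicalId || !remoteLogicalId || localLogicalId === remoteLogicalId) {
    return "defer";
  }
  // Assumed rule: the smaller ID offers and opens the primary data channel.
  return localLogicalId < remoteLogicalId ? "offer" : "answer";
}
```

Because the inputs are the identified logical IDs rather than per-connection transport IDs, the result stays stable across reconnects of the same pair.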
@@ -246,6 +248,8 @@ Profile avatar sync follows attachment-style chunk transport plus server-icon-st
Every 5 seconds a PING message is sent to each peer. The peer responds with a PONG carrying the original timestamp, and the sender stores the resulting round-trip latency in a signal.
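The heartbeat exchange can be sketched as two small pure functions. The message shape (`type`, `ts`) and the names `makePing` and `handleHeartbeat` are assumptions for illustration; only the 5-second interval and the echo of the original timestamp come from the text.

```typescript
const PING_INTERVAL_MS = 5000; // interval stated in the text

// Assumed wire shape; the real message format is not shown in this section.
type HeartbeatMessage =
  | { type: "PING"; ts: number }
  | { type: "PONG"; ts: number };

function makePing(now: number): HeartbeatMessage {
  return { type: "PING", ts: now };
}

// Returns the reply to send for a PING, or the measured latency for a PONG.
function handleHeartbeat(
  msg: HeartbeatMessage,
  now: number
): { reply?: HeartbeatMessage; latencyMs?: number } {
  if (msg.type === "PING") {
    // Echo the sender's timestamp unchanged so no clock sync is required.
    return { reply: { type: "PONG", ts: msg.ts } };
  }
  // PONG: both timestamps come from the local clock, so the difference is RTT.
  return { latencyMs: now - msg.ts };
}
```

In a real client, `makePing(Date.now())` would be sent on a timer every `PING_INTERVAL_MS`, and `latencyMs` would be written to the per-peer latency signal.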
Data-channel failures are treated as control-plane failures, not proof that RTP audio has stopped. When an open channel reports a non-fatal error, the client requests a fresh voice-state snapshot over that same channel. When the channel closes or cannot carry the resync request, the peer manager waits a short grace period so any still-flowing audio is not interrupted by a transient event. If the `RTCPeerConnection` is still connected after that grace period, the elected initiator replaces only the data channel in-place and preserves the media transport. Full peer recreation is reserved for cases where the media transport is no longer connected or the in-place control-channel repair fails.
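The recovery ladder described above can be summarized as two decision functions: one for the moment a channel problem is observed, and one for when the grace period expires. This is a sketch under stated assumptions: all type and function names are hypothetical, the connection states mirror the standard `RTCPeerConnection.connectionState` values, and the non-initiator's behavior (waiting for the initiator's replacement channel) is an assumption not spelled out in the text.

```typescript
type ChannelEvent = "nonfatal-error" | "closed";
type ConnState = "new" | "connecting" | "connected" | "disconnected" | "failed" | "closed";

type ImmediateAction = "request-voice-state-resync" | "start-grace-period";
type PostGraceAction = "replace-data-channel" | "await-replacement-channel" | "recreate-peer";

// Step 1: a data-channel problem is a control-plane failure, not proof that
// audio has stopped, so nothing is torn down at this point.
function onChannelProblem(event: ChannelEvent, canSendOnChannel: boolean): ImmediateAction {
  if (event === "nonfatal-error" && canSendOnChannel) {
    return "request-voice-state-resync"; // sent over the same, still-open channel
  }
  return "start-grace-period"; // channel closed or cannot carry the resync
}

// Step 2: decide what to do once the grace period has elapsed.
function afterGracePeriod(
  connectionState: ConnState,
  isElectedInitiator: boolean,
  inPlaceRepairFailed: boolean
): PostGraceAction {
  // Full recreation only when media is gone or the lighter repair failed.
  if (connectionState !== "connected" || inPlaceRepairFailed) return "recreate-peer";
  // Media transport is healthy: swap only the control channel.
  return isElectedInitiator ? "replace-data-channel" : "await-replacement-channel";
}
```

Keeping the two steps separate makes the ordering explicit: resync first, then wait, then repair in place, and only then recreate the peer.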
## Media pipeline
### Voice