fix: recurring network issue
All checks were successful
Queue Release Build / prepare (push) Successful in 18s
Deploy Web Apps / deploy (push) Successful in 6m32s
Queue Release Build / build-windows (push) Successful in 26m8s
Queue Release Build / build-linux (push) Successful in 40m18s
Queue Release Build / finalize (push) Successful in 42s
@@ -115,18 +115,22 @@ graph TD

## Signaling (WebSocket)

The signaling layer's only job is getting two peers to exchange SDP offers/answers and ICE candidates so they can establish a direct WebRTC connection. Once the peer connection is up, signaling is only used for presence (user joined/left) and reconnection.
The signaling layer gets peers to exchange SDP offers/answers and ICE candidates so they can establish direct WebRTC connections. It also carries identity, room membership, presence, typing, and selected server-relayed fallback events when the peer data channel is unavailable.

Each signaling URL gets its own `SignalingManager` (one WebSocket each). `SignalingTransportHandler` picks the right socket based on which server the message is for. `ServerSignalingCoordinator` tracks which peers belong to which servers and which signaling URLs, so we know when it is safe to tear down a peer connection after leaving a server.

Room affinity is authoritative at this layer as well. The renderer repairs each room's saved `sourceId` / `sourceUrl` from server-directory responses and routes `join_server`, `view_server`, and room-scoped signaling traffic to that room's signaling URL first. If that route fails, alternate endpoints can be tried temporarily, but server-scoped raw messages are no longer broadcast to every connected signaling manager when the route is unknown.
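
The routing rule above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `SavedRoom`, `RoutePlan`, `planRoomRoute`, and `sendRoomScoped` are hypothetical names, and dropping the message when the route is unknown is an assumption (the text only says such messages are no longer broadcast).

```typescript
// Hypothetical shapes for illustration; not the project's real types.
interface SavedRoom {
  id: string;
  sourceUrl?: string; // signaling URL recorded for this room, if known
}

interface RoutePlan {
  primary: string | null; // the room's own signaling URL, or null when unknown
  alternates: string[];   // other connected endpoints, tried only if primary fails
}

// Build an ordered route plan for a room-scoped message.
// Assumption: with no known route the plan is empty, so nothing is broadcast.
function planRoomRoute(room: SavedRoom, connectedUrls: string[]): RoutePlan {
  if (!room.sourceUrl) {
    return { primary: null, alternates: [] };
  }
  const alternates = connectedUrls.filter((url) => url !== room.sourceUrl);
  return { primary: room.sourceUrl, alternates };
}

// Try the primary route first, then alternates, stopping at the first success.
// Returns the URL that accepted the message, or null if none did.
function sendRoomScoped(
  plan: RoutePlan,
  trySend: (url: string) => boolean
): string | null {
  if (plan.primary === null) return null;
  for (const url of [plan.primary, ...plan.alternates]) {
    if (trySend(url)) return url;
  }
  return null;
}
```

In this sketch the caller supplies `trySend`, so the ordering logic stays independent of how a real socket reports success.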

Server-relayed fallbacks are intentionally narrow. Room chat (`chat_message`), direct-message events (`direct-message`, `direct-message-status`, `direct-message-mutation`), and voice presence (`voice_state`) may flow over signaling so users can still see written chat and voice roster state while P2P data channels are down. Media, attachments, message inventory sync, screen/camera state, and plugin data-channel traffic remain peer-plane responsibilities.
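
One way to keep that fallback surface narrow is an explicit allowlist. The sketch below uses the five message types named above; `RELAYABLE_TYPES`, `isServerRelayable`, and `selectRelayable` are hypothetical helpers introduced for illustration and do not appear in the commit.

```typescript
// Message types the paragraph above names as eligible for server relay.
const RELAYABLE_TYPES: ReadonlySet<string> = new Set([
  'chat_message',
  'direct-message',
  'direct-message-status',
  'direct-message-mutation',
  'voice_state'
]);

// Hypothetical predicate: true when a message type may be relayed by the
// signaling server instead of travelling over the peer data channel.
function isServerRelayable(messageType: string): boolean {
  return RELAYABLE_TYPES.has(messageType);
}

// Hypothetical helper: keep only the events that are allowed on the relay path.
// Everything else stays a peer-plane responsibility and is filtered out here.
function selectRelayable<T extends { type: string }>(events: T[]): T[] {
  return events.filter((event) => isServerRelayable(event.type));
}
```

An allowlist fails closed: a newly introduced event type stays off the signaling path until someone adds it deliberately.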

In UI/debug conversations, a **chat-server** means one of the saved rooms navigated from the server rail. Each chat-server has its own assigned signal server via `sourceId` / `sourceUrl`, and room-scoped feature/config checks must prefer that signal server before considering any global active endpoint. For example, KLIPY GIF picker visibility is resolved against the currently viewed chat-server's signal server so an unrelated offline chat-server does not hide the button everywhere.

Cold-start routing now waits for the initial server-directory health probes so same-backend aliases can collapse to one canonical signaling endpoint before any saved rooms reconnect. When a room is reconnected on a chosen socket, its background rooms are re-joined on that same socket as well so stale per-signal memberships do not keep orphan managers alive, and reconnect replay only sends `view_server` for rooms that manager still has joined.

This is still a non-federated model. Different signaling servers do not share peer registries or relay WebRTC offers for each other, so users in the same room must converge on the same signaling endpoint to discover one another reliably.

The fallback path is fragile by design: it only helps when a usable signaling socket exists. If a production origin returns Cloudflare `521`/`522` or the WebSocket closes with `1006`, room reconnect must continue to other active compatible endpoints instead of treating the room as missing or the client as incompatible.
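
The reconnect rule in the last paragraph could look something like the sketch below. All names here (`ConnectFailure`, `classifyFailure`, `pickReachableEndpoint`) are hypothetical, and stopping on any other kind of failure is an assumption made for the example; the text only requires that `521`/`522` and `1006` do not end the search.

```typescript
// Hypothetical description of a failed connection attempt.
interface ConnectFailure {
  httpStatus?: number;  // e.g. 521 or 522 returned by Cloudflare in front of the origin
  wsCloseCode?: number; // e.g. 1006, abnormal closure without a close frame
}

type FailureKind = 'endpoint-unreachable' | 'other';

// Treat origin-down responses and abnormal socket closure as a reachability
// problem with this endpoint, not as "room missing" or "client incompatible".
function classifyFailure(failure: ConnectFailure): FailureKind {
  const originDown = failure.httpStatus === 521 || failure.httpStatus === 522;
  const abnormalClose = failure.wsCloseCode === 1006;
  return originDown || abnormalClose ? 'endpoint-unreachable' : 'other';
}

// Walk candidate endpoints in order and return the first that connects.
// `tryConnect` returns null on success or a failure description otherwise.
function pickReachableEndpoint(
  candidates: string[],
  tryConnect: (url: string) => ConnectFailure | null
): string | null {
  for (const url of candidates) {
    const failure = tryConnect(url);
    if (failure === null) return url;
    // Assumption: only reachability failures justify trying the next endpoint.
    if (classifyFailure(failure) === 'other') return null;
  }
  return null;
}
```

Keeping classification separate from the loop makes it straightforward to add further transient conditions later without touching the failover order.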

```mermaid
sequenceDiagram
participant UI as App
@@ -399,6 +399,7 @@ export class WebRTCService implements OnDestroy {
   */
  broadcastMessage(event: ChatEvent): void {
    this.peerMediaFacade.broadcastMessage(event);
    this.relayBroadcastEvent(event);
  }

  /**
@@ -430,6 +431,12 @@ export class WebRTCService implements OnDestroy {
    return this.peerMediaFacade.getConnectedPeerIds();
  }

  hasSignalingRouteForPeer(peerId: string): boolean {
    const signalUrl = this.signalingCoordinator.getPeerSignalUrl(peerId);

    return !!signalUrl && this.signalingCoordinator.isSignalingConnectedTo(signalUrl);
  }

  /**
   * Get the composite remote {@link MediaStream} for a connected peer.
   *
@@ -658,6 +665,26 @@ export class WebRTCService implements OnDestroy {
    this.peerMediaFacade.stopScreenShare();
  }

  private relayBroadcastEvent(event: ChatEvent): void {
    if (event.type === 'chat-message' && event.message?.roomId) {
      this.signalingTransportHandler.sendRawMessage({
        type: 'chat_message',
        serverId: event.message.roomId,
        message: event.message
      });

      return;
    }

    if (event.type === 'voice-state' && event.voiceState?.serverId) {
      this.signalingTransportHandler.sendRawMessage({
        ...event,
        type: 'voice_state',
        serverId: event.voiceState.serverId
      });
    }
  }

  /** Disconnect from the signaling server and clean up all state. */
  disconnect(): void {
    this.leaveRoom();
@@ -134,6 +134,14 @@ export class SignalingTransportHandler<TMessage> {
    const connectedManagers = this.getConnectedSignalingManagers();

    if (connectedManagers.length === 0) {
      if (messageType === 'status_update') {
        this.dependencies.logger.warn('[signaling] Skipping status update without an active signaling connection', {
          type: messageType
        });

        return;
      }

      this.dependencies.logger.error('[signaling] No active signaling connection for outbound message', new Error('No signaling manager available'), {
        type: messageType
      });