Files
Toju/agents-docs/LESSONS.md
2026-06-09 22:00:06 +02:00

228 lines
26 KiB
Markdown

# Agent Lessons
Durable rules for AI agents working on this project. Read this file at session start. Append to it when this session produces a correction worth remembering.
## How to use this file
**At session start:** scan the rules below. If any match the work you're about to do, apply them.
**During the session:** if the user corrects you, reverts your edit, or re-prompts with the same instruction — that is a signal to record a lesson before closing the task. See the trigger list in `agents-docs/AGENT_WORKFLOW.md`.
**Format of a lesson:** every entry uses the four-slot template below. Brevity matters — if you can't state the rule in one sentence, the lesson isn't sharp enough yet.
```markdown
### <short imperative title>
- **Trigger:** what you were about to do that turned out wrong (one line, concrete enough to pattern-match against)
- **Rule:** what to do instead (one sentence, imperative voice)
- **Why:** the consequence of getting it wrong — past incident, hidden constraint, user preference
- **Example:** one concrete instance, ideally a code or command snippet
```
**Keep lessons sharp.** Tag each rule with one or two tags in square brackets after the title (e.g. `[testing] [migrations]`) so future agents can grep for relevance. If a rule no longer applies, delete it — stale rules drown the real ones.
---
## Lessons
### When renaming an Angular route, sweep every navigate/url-match/doc reference [routing]
- **Trigger:** the find-servers route was renamed `/search``/servers` in `app.routes.ts`, but `servers-rail.component.ts` still called `router.navigate(['/search'])` (leave-server) and matched `startsWith('/search')` for the user-bar visibility signal, throwing `NG04002: 'search'` on leave and never showing the user-bar on the discovery page.
- **Rule:** after changing a `path:` in `app.routes.ts`, grep the whole repo for the old literal (`/search`) across `*.ts`/`*.html` (router calls, `startsWith`/url-match signals) and docs (`docs-site`, `.agents/skills/playwright-e2e/SKILL.md` route tables, domain READMEs) and update them all in the same change.
- **Why:** `router.navigate` to a non-existent path raises `NG04002` and aborts navigation, and stale `startsWith` matches silently break route-derived UI state — neither is caught by the build (string literals) and there was no `servers-rail` spec to catch it.
- **Example:** fixed `isOnServers`/`router.navigate(['/servers'])` in `servers-rail.component.{ts,html}`; canonical post-leave/discovery route is `/servers` (`FindServersComponent`), matching `DashboardComponent`'s `router.navigate(['/servers'])`.
### Server discovery (featured/trending) must fan out across all online endpoints like search [server-directory]
- **Trigger:** the `/servers` (find-servers) page showed no servers by default but found them as soon as the user typed in the search box. Discovery (`getDiscoveryServers`) queried only the *active* endpoint via `getApiBaseUrl()`, and when that endpoint is a discovery-unsupported production host (`signal.toju.app` / `signal-sweden.toju.app` in `DISCOVERY_UNSUPPORTED_HOSTS`) it short-circuited to `[]`; search meanwhile fans out across every online endpoint, so typing surfaced the servers that lived on other endpoints (e.g. localhost).
- **Rule:** make `getFeaturedServers`/`getTrendingServers` fan out across `getSearchableEndpoints()` with `forkJoin` + `deduplicateById` (mirroring all-endpoint search), and apply the `endpointSupportsServerDiscovery` gate *per endpoint* (skip → `[]`) instead of short-circuiting the whole request on the active endpoint.
- **Why:** the empty-query find-servers view renders discovery sections, not search results, so any divergence between discovery's endpoint set and search's endpoint set makes the default view look broken while search works.
- **Example:** `getDiscoveryServers` + `fetchDiscoveryFromEndpoint` in `server-directory-api.service.ts`; verified the live server returns 12 featured/12 trending while the active production host is gated out client-side.
### Server registration needs `ownerPublicKey: oderId || id`, and must not be fire-and-forget [server-directory] [rooms]
- **Trigger:** creating a server appeared to work (the creator landed in the room view) but the server didn't exist on the backend — invite-link creation and search both 404'd. `createRoom$` sent `ownerPublicKey: currentUser.oderId` with no fallback; on restored sessions `oderId` can be falsy (identify still works because it falls back to `id`), so `POST /api/servers` returned `400 Missing required fields`, and the `.subscribe()` swallowed the error while `createRoomSuccess` fired regardless.
- **Rule:** resolve owner identity as `oderId || id` everywhere it's required (the server rejects an empty `ownerPublicKey`), and give `registerServer().subscribe()` an `error` handler so a failed registration is never silent.
- **Why:** verified against the live server — authed POST with a truthy `ownerPublicKey` → 201; authed POST with an empty one → 400; the swallowed 400 is exactly what produces a "ghost" room the creator can enter but no one can find.
- **Example:** `buildServerRegistrationPayload(room, currentUser, normalizedPassword)` in `toju-app/src/app/store/rooms/server-registration.rules.ts`, used by `RoomsEffects.createRoom$`.
### Identify must fall back to the legacy session token, not only the new credential store [realtime] [authentication]
- **Trigger:** the multi-signal-server auth refactor changed `resolveCredentialForSignalUrl` to read *only* `SignalServerCredentialStoreService`; sessions restored from disk (and logins where `user.homeSignalServerUrl` is unset) have an empty credential store, so `identify` was skipped on every signal server ("Skipping identify because no session token is available") and users appeared alone — no presence, no peers, sent messages visible only to themselves. E2E never caught it because every e2e flow does a *fresh* register/login that writes the credential store directly.
- **Rule:** when resolving the identify credential for a signal URL, prefer the per-signal credential but fall back to the legacy `AuthTokenStoreService` token reconstructed with the current home user's `id`/`displayName`; never gate `identify` solely on the new credential store.
- **Why:** `persistSessionToken` always writes the legacy `metoyou.authTokens` store on login, but the per-signal credential store is only populated on fresh login (with a `loginResponse`) or successful migration/provisioning — so on reload it can be empty while a valid session still exists.
- **Example:** `resolveSignalIdentity(credential, legacyTokenEntry, homeUser)` in `signal-server-credential-resolution.rules.ts`, wired through `SignalServerAuthService.resolveCredentialForSignalUrl` (which now passes `this.authTokenStore.getTokenEntry(httpUrl)` and a `homeUser` carrying `id`). Test cross-user behavior via a *session-restore* path, not just fresh login.
### Keep the per-signal-URL identify credential resolvable from the store [realtime] [authentication]
- **Trigger:** after the multi-signal-server auth refactor, `SignalingManager.getLastIdentify` was switched to `getIdentifyCredentialsForSignalUrl`, which only read an in-memory cache populated *after* `identify()` ran; a freshly (re)connected socket then emitted `join_server` before any identify and users silently never appeared in the presence roster (almost all multi-user e2e tests timed out waiting for the peer's `room-user-card`).
- **Rule:** `getIdentifyCredentialsForSignalUrl` must fall back to resolving the credential from the credential store so a new socket's `onopen` re-identifies before it re-joins; never restrict it to only the in-memory identify cache.
- **Why:** the server drops `join_server`/`view_server` on any unauthenticated connection, so an identify-less join is lost with no error and recovery only happens on a later reconnect (often beyond the 20s test timeout).
- **Example:** server log showed `join_server authed=false ... display=User` dropped, then `User identified: Alice` on a different connection but no `Alice joined server`; fixed in `signaling-transport-handler.ts` by resolving via `dependencies.resolveCredential(signalUrl)` when the cache is empty.
### Store clientInstanceId in sessionStorage not localStorage [realtime] [multi-device]
- **Trigger:** same user logged in on two tabs, browsers, or synced profiles sees alternating "Disconnected from signaling server" and no cross-device chat/voice sync.
- **Rule:** persist `metoyou.clientInstanceId` in `sessionStorage` (one id per tab/window) and clear any legacy `localStorage` copy on first read.
- **Why:** server identify evicts stale sockets with the same `(oderId, connectionScope, clientInstanceId)` tuple; a shared localStorage id makes each client kick the other in a reconnect loop.
- **Example:** `ClientInstanceService.getClientInstanceId()` writes to `sessionStorage`; two tabs get different ids and stay connected simultaneously.
### Revalidate IndexedDB scope without reinitializing on every read [persistence] [performance]
- **Trigger:** `DatabaseService.ensureReady()` called `initialize()` before every delegated read/write to fix user-scope races.
- **Rule:** cache the last validated `metoyou_currentUserId` and only re-run backend initialization when that scope changes or an in-flight initialize completes with a different scope.
- **Why:** per-operation revalidation fans out across ban lookups, room loads, and message reads, causing channel/chat UI to stay blank until repeated server clicks eventually win the race.
- **Example:** `ensureReady()` returns immediately when `isReady()` and `validatedUserScope` still match `getStoredCurrentUserId()`.
### Restore local user scope before protected writes [authentication] [persistence]
- **Trigger:** a logged-in in-memory user can create rooms or messages after `metoyou_currentUserId` was cleared by a late session-expired path.
- **Rule:** before protected local persistence or server-directory actions, restore `metoyou_currentUserId` from the current user and avoid treating a live current user as unauthenticated.
- **Why:** otherwise rooms/messages fall into the anonymous IndexedDB scope, and route checks redirect to login even though NgRx still has the authenticated user.
- **Example:** `MessagesEffects.sendMessage$`, `RoomsEffects.createRoom$`, and server-directory create/join components call `setStoredCurrentUserId(currentUser.id)` before writing or joining.
### Persisted local user state still requires a session token [authentication] [signaling]
- **Trigger:** Users appear logged in from local storage but cannot see peers online or send chat after session-token auth shipped.
- **Rule:** before connecting signaling or loading rooms for a persisted user, require a non-expired token in `metoyou.authTokens`; redirect to `/login` on `SESSION_EXPIRED`, `auth_required`, or `auth_error`.
- **Why:** WebSocket `identify` is skipped without a token, so `join_server`, RTC relay, and presence never establish even though the profile exists locally.
- **Example:** `hasValidPersistedSession()` in `auth-session.rules.ts` from `loadCurrentUser$`.
### Declare MODIFY_AUDIO_SETTINGS for Android WebRTC mic capture [mobile] [android]
- **Trigger:** Android users accept the microphone prompt but voice calls and channels still fail to join.
- **Rule:** include `android.permission.MODIFY_AUDIO_SETTINGS` in `toju-app/android/app/src/main/AndroidManifest.xml` and preflight Capacitor capture through `MobileMediaService.ensureVoiceCapturePermissions()` before `getUserMedia`.
- **Why:** Capacitor's `BridgeWebChromeClient.onPermissionRequest` requests `RECORD_AUDIO` and `MODIFY_AUDIO_SETTINGS` together; if the latter is undeclared, the combined grant is treated as denied even after the user taps Allow.
- **Example:** `ANDROID_REQUIRED_MANIFEST_PERMISSIONS` in `mobile-android-manifest-permissions.rules.ts`.
### Do not override Tailwind with box-sizing inherit [mobile] [css]
- **Trigger:** mobile pages still overflow horizontally until devtools disables `*, *::before, *::after { box-sizing: inherit }` in global styles.
- **Rule:** in `src/styles.scss` keep `box-sizing: border-box` on the universal selector (matching Tailwind preflight); never replace it with `inherit` from `html`.
- **Why:** `inherit` overrides preflight and some nested component hosts resolve to `content-box`, so `w-full` plus padding becomes wider than the parent — especially visible on the mobile dashboard beside the servers rail.
- **Example:** `src/styles.scss` `@layer base` universal rule uses `border-box`, not `inherit`.
### Use the app-shell servers rail for mobile discovery pages [mobile] [layout]
- **Trigger:** patching `min-w-0` / `overflow-x-hidden` on the dashboard (or find-people/find-servers) while the page still renders wider than the phone beside an embedded servers rail.
- **Rule:** on mobile discovery routes (`/dashboard`, `/people`, `/servers`, …) show the global `app.html` servers rail and render the page full-width in `appWorkspace`; keep embedded swiper+rail stacks only for chat/DM/call routes (`shouldShowMobileAppServersRail` in `mobile-shell-layout.rules.ts`).
- **Why:** nesting a second rail+Swiper stack inside `router-outlet` fights the shell flex width and content keeps sizing to intrinsic width, clipping cards and inputs on every viewport.
- **Example:** `hideAppServersRail()` in `app.html` + dashboard `pageContent` only (no local `<app-servers-rail>`).
### Defer attachment blob hydration on Electron startup [attachments] [electron]
- **Trigger:** fixing inline attachment display by eagerly calling `tryRestoreAttachmentFromLocal()` for every persisted attachment during `initFromDatabase()`.
- **Rule:** load attachment metadata at startup, but hydrate blob URLs only for the watched room on demand; read disk files through chunked IPC (`readFileChunk`) and yield between chunks/attachments so large images never block the renderer.
- **Why:** restoring every saved attachment as a single base64 round-trip plus synchronous `atob()` can freeze Electron for seconds even after the shell paints.
- **Example:** `runInitFromDatabase()` stops at `loadFromDatabase()`; `restoreLocalAttachmentsForRoom()` hydrates lazily via `restoreAttachmentBlobFromDiskPath()`.
### Lazy-load Capacitor modules on Electron/desktop [mobile] [electron]
- **Trigger:** adding mobile facades that statically import Capacitor adapters or `@capacitor/*` plugins into shared Angular services used by the desktop app.
- **Rule:** keep web/electron shells on web adapters synchronously and load Capacitor adapters/plugins only through dynamic `import()` after `runtime === 'capacitor'` — never top-level `import '@capacitor/...'` in code reachable from `app.ts` / `DirectCallService`.
- **Why:** bundlers evaluate static Capacitor imports during Electron startup, which can freeze the renderer before first paint even when runtime detection would have chosen the web adapter.
- **Example:** `resolveMobileAdapter()` in `mobile-capacitor-adapter.rules.ts` plus async `capacitor-plugin-loader.ts` / `loadMetoyouMobilePlugin()`.
### Use the upgrade transaction during IndexedDB schema migrations [persistence] [browser]
- **Trigger:** bumping `BROWSER_DATABASE_VERSION` and opening existing stores via `database.transaction(...)` inside `onupgradeneeded`.
- **Rule:** during `onupgradeneeded`, reuse `event.transaction.objectStore(name)` for existing stores and only call `database.createObjectStore` for missing ones — never start a second transaction while the version-change transaction is active.
- **Why:** nested transactions abort the upgrade, `authenticateUser` storage prep fails, and login/register navigates before `setCurrentUser` so DM routes throw "Cannot use direct messages without a current user."
- **Example:** `ensureObjectStoreDuringUpgrade(database, upgradeTransaction, 'messages')` in `browser-database-schema.ts`.
### Wait for authenticateUser storage prep before post-login navigation [authentication] [browser]
- **Trigger:** dispatching `UsersActions.authenticateUser` from login/register and immediately calling `router.navigate(...)`.
- **Rule:** wait for `setCurrentUser` or `loadCurrentUserFailure` (e.g. `waitForAuthenticationOutcome(actions$)`) before navigating to `returnUrl` or `/dashboard`.
- **Why:** `authenticateUser$` prepares per-user IndexedDB asynchronously; early navigation renders DM/shell routes before the current user exists in the store.
- **Example:** `await firstValueFrom(waitForAuthenticationOutcome(this.actions$))` in `register.component.ts` and `login.component.ts`.
### Use dense arrays for chunked transfer buffers [custom-emoji] [webrtc]
- **Trigger:** chunked P2P asset assembly marks a transfer complete after the first chunk because `array.some()` skips sparse holes created by `new Array(total)`.
- **Rule:** initialize chunk buffers with `Array.from({ length: total }, () => undefined)` (or another dense initializer) before using `some`/`every`/`filter` to detect completion.
- **Why:** a single assigned slot in a sparse array makes `.some((chunk) => !chunk)` return false, so multi-chunk custom emoji transfers are dropped and peers never receive uploaded images larger than one chunk.
- **Example:** `CustomEmojiService.receiveTransferStart` stores `chunks: Array.from({ length: total }, () => undefined)` instead of `new Array(total)`.
### Route custom emoji right-click through the native context menu [custom-emoji] [ux]
- **Trigger:** adding a second emoji-specific context menu beside `NativeContextMenuComponent`, or attaching handlers only to `<img>` nodes.
- **Rule:** mark emoji hosts with `data-custom-emoji` / `data-custom-emoji-library` plus `data-custom-emoji-id`, let `NativeContextMenuComponent` own add/remove actions, and use a capture-phase `preventDefault` so Electron/browser image menus do not override them.
- **Why:** the shell context menu already intercepts every image right-click; duplicate menus fight each other and button/div wrappers miss img-only handlers.
- **Example:** reaction pills and picker buttons carry the data attributes; `resolveCustomEmojiContextMenuTarget()` opens **Add to emoji library** / **Remove from emoji library** from the global menu.
### Separate known emoji assets from saved library [custom-emoji] [ux]
- **Trigger:** syncing remote custom emoji directly into the picker/library when it is first seen in chat.
- **Rule:** store remote emoji as known renderable assets, but only show them in the user's picker after an explicit save action such as right-clicking the rendered emoji.
- **Why:** users need messages to render, but they should control which seen emoji become part of their local emoji library.
- **Example:** `CustomEmojiService.emojis` filters to saved emoji, while `findEmoji(id)` can still resolve unsaved known assets for message rendering.
### Chunk custom emoji assets over data channels [custom-emoji] [webrtc]
- **Trigger:** sending uploaded custom emoji image data through a single `custom-emoji-full` peer event.
- **Rule:** stream custom emoji assets as a metadata envelope plus bounded `custom-emoji-chunk` events; use buffered sends for back-pressure, but never rely on buffering to make oversized messages safe.
- **Why:** a single base64 data URL can exceed browser SCTP message limits and fire `RTCDataChannel.onerror`, breaking the app-wide chat channel.
- **Example:** send `{ type: 'custom-emoji-full', customEmojiTransfer, total }`, then `custom-emoji-chunk` events with small `data` slices.
### Re-clear visible notification channels after recompute [notifications] [startup]
- **Trigger:** fixing startup unread badges by only changing read-marker writes or initial hydration.
- **Rule:** also check later `loadMessagesSuccess` and `syncMessages` recomputes, and re-clear the focused visible channel after applying derived unread counts.
- **Why:** the startup-selected server can load or sync messages after it was marked read, reintroducing a channel unread badge even though the user is viewing that channel.
- **Example:** `NotificationsService.refreshRoomUnreadFromMessages(...)` should clear `activeChannelId` for `currentRoom` after recalculating counts from a startup message batch.
### Disambiguate nested chat cards [chat] [ui]
- **Trigger:** removing a visual treatment from chat history when a system message has both an outer row wrapper and an inner pill/card.
- **Rule:** preserve the intended inner timeline pill unless the user explicitly targets it; render system messages outside the themed `chatMessageBubble` wrapper and keep `data-message-id` off direct child `div`s.
- **Why:** PM call-started history should stay as a compact centered pill, while theme CSS such as `app-chat-message-item > div[data-message-id]` can turn the full-width row around it into the unnecessary card.
- **Example:** In `chat-message-item.component.html`, keep `data-testid="chat-system-message"` with `rounded-full border bg-secondary/45`, put `appThemeNode="chatMessageBubble"` only on the non-system branch, and place `[attr.data-message-id]` on the nested pill instead of the system row wrapper.
### Use terminal Vitest when the test tool hangs [testing]
- **Trigger:** VS Code test execution stays at "Starting test run..." without producing Vitest output.
- **Rule:** run the focused spec through the terminal with `cd toju-app && npx vitest run <spec-path>` and report the direct Vitest result.
- **Why:** the test integration can hang before starting the runner, while the terminal Vitest command returns quickly and gives actionable failures.
- **Example:** `cd toju-app && npx vitest run src/app/domains/game-activity/application/game-activity.service.spec.ts`.
### Do not add fake chrome around screenshots [website] [design]
- **Trigger:** wrapping a real product screenshot in decorative titlebar/window chrome or placing oversized marketing headings beside copy without checking overlap.
- **Rule:** use the screenshot's existing frame when it already includes app chrome, and top-align large heading/copy columns with explicit readable widths.
- **Why:** duplicated chrome makes CTA/product previews look broken, and bottom-aligned large headings can cover accompanying text on the marketing site.
- **Example:** `website/src/app/pages/home/home.component.html` should render the screenshot directly; `host-section` should use top-aligned heading and `.host-section-copy` columns.
### Verify lint exits 0 before claiming done [verification]
- **Trigger:** about to report a task as complete after running tests but skipping ESLint.
- **Rule:** run `npm run lint` from the repo root and confirm exit code 0 before any "done" claim.
- **Why:** `npm run test` only runs the toju-app Vitest suite — it doesn't cover the server, Electron, or website packages. ESLint (flat config in `eslint.config.js`) is the universal check across every package; type-style violations slip through tests and break Gitea Workflows for the next agent.
- **Example:** `npm run lint && echo OK` — only claim done after seeing `OK`. For Electron type errors specifically, also confirm `npm run build:electron` succeeds (it invokes `tsc -p tsconfig.electron.json`).
### Use blob URLs for inline attachment previews [attachments] [electron]
- **Trigger:** receiving users see broken image icons or video players that never start, but "Download" saves a valid file.
- **Rule:** never bind `attachment.objectUrl` to `file://` URLs for chat `<img>`, `<video>`, or `<audio>` — always create a `blob:` URL from the bytes on disk or in memory; keep `savedPath`/`filePath` for IPC download/open only.
- **Why:** Electron runs with `webSecurity: true`, so renderer pages cannot load arbitrary `file://` app-data paths even when CSP allows `file:`; IPC download still works because it reads the path in the main process.
- **Example:** `ensureInlineDisplayObjectUrl()` in `AttachmentPersistenceService`, and `URL.createObjectURL(blob)` in `finalizeTransferIfComplete` / `handleDiskFileChunk` instead of `getFileUrl(savedPath)`.
### Resolve Electron drag-and-drop file paths with webUtils [attachments] [electron]
- **Trigger:** large videos play after drag-and-drop upload, but after restart the uploader sees a peer-download error even though they sent the file from disk.
- **Rule:** when accepting dropped or pasted files in Electron, call `webUtils.getPathForFile(file)` from preload (`getPathForFile` on `electronAPI`) and annotate the `File` before `publishAttachments`; never rely on `File.path` in the renderer.
- **Why:** Chromium removed direct `File.path` access in modern Electron; without `getPathForFile`, large uploads only exist as in-memory blobs and cannot be copied into app data for reload playback.
- **Example:** `annotateLocalFilePath(file, { getPathForFile: electronApi.getPathForFile })` in `ChatMessageComposerComponent.addPendingFiles`.
### Preserve uploader local attachment paths across sync [attachments] [persistence]
- **Trigger:** large Electron uploads play from `filePath` after send, but after reload the uploader sees "The connected peers do not have this file right now" and must P2P-download their own file.
- **Rule:** never persist synced attachment metadata with `filePath`/`savedPath` stripped — merge with stored local paths, finish attachment DB init before applying sync batches, and try local disk restore before sending `file-request` to peers.
- **Why:** P2P sync intentionally omits local-only paths; a startup race can overwrite the uploader's saved `filePath` with `null`, and large videos (>10 MB) are not auto-copied to app data so only the original path can restore playback.
- **Example:** copy large Electron uploads into app-data on `publishAttachments`, `mergeAttachmentLocalPaths(incomingMeta, storedRecord)` in `persistAttachmentMeta`, `await persistence.whenReady()` in `registerSyncedAttachments`, and `tryRestoreAttachmentFromLocal()` before any `file-request`.
<!--
Add new lessons above this comment, newest at the top.
Delete this example once the project has accumulated 2-3 real lessons.
-->