diff --git a/crates/lanspread-peer/ARCHITECTURE.md b/crates/lanspread-peer/ARCHITECTURE.md
new file mode 100644
index 0000000..5691880
--- /dev/null
+++ b/crates/lanspread-peer/ARCHITECTURE.md
@@ -0,0 +1,127 @@
+# lanspread-peer proposed protocol and architecture
+
+This document proposes a tighter, more fault-tolerant protocol while keeping
+the current idea: mDNS discovery, QUIC transport, on-demand metadata, and
+chunked file transfers.
+
+## Goals (unchanged)
+- Local LAN discovery via mDNS.
+- QUIC + JSON messages for control, raw streams for file data.
+- UI drives operations through `PeerCommand`, peers remain headless.
+- Peers can appear/disappear at any time without data loss.
+
+## Peer lifecycle and message flow
+
+### 1) Startup and advertise
+- Start QUIC server.
+- Advertise via mDNS with TXT records:
+  - `peer_id` (stable ID, not tied to IP)
+  - `proto_ver`
+  - `library_rev` (monotonic local library revision)
+  - optional `hostname`
+
+### 2) Discovery and handshake
+When a peer is discovered:
+1. Connect and send `Hello { peer_id, proto_ver, library_rev, library_digest, features }`.
+2. Receive `HelloAck { peer_id, proto_ver, library_rev, library_digest, features }`.
+3. If the remote `peer_id` is already known but the address changed, update it.
+4. If protocol versions are incompatible, drop the peer (and keep mDNS watching).
+5. If library digests match, do nothing else.
+6. If digests differ:
+   - If we have a known `library_rev` for that peer, request `LibraryDelta`.
+   - Otherwise request `LibrarySnapshot`.
+
+### 3) Steady state
+- Any message updates `last_seen`.
+- Pings run only when idle (or on a longer interval), not every 5 seconds.
+- Library updates are pushed as deltas, debounced and coalesced.
+
+### 4) Shutdown
+- Optional `Goodbye { peer_id }` lets others remove the peer quickly.
+- If a peer vanishes without goodbye, stale timeout + ping removal handle it.
+- Goodbye is a hint, never required for correctness.
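Steps 4 to 6 of the handshake can be sketched as a small pure function. This is only an illustration under stated assumptions: `SyncAction`, `KnownPeer`, and `decide_sync` are hypothetical names that do not exist in `lanspread-proto`, protocol compatibility is assumed to mean equal `proto_ver`, and "digests match" is assumed to compare the remote digest with the one last cached for that peer.

```rust
/// Fields of `Hello` / `HelloAck` used by the decision (`features` omitted).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Hello {
    pub peer_id: String,
    pub proto_ver: u32,
    pub library_rev: u64,
    pub library_digest: String,
}

/// Hypothetical cache entry: what we last knew about a peer's library.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KnownPeer {
    pub library_rev: u64,
    pub library_digest: String,
}

/// Hypothetical outcome of the handshake.
#[derive(Debug, PartialEq, Eq)]
pub enum SyncAction {
    /// Step 4: incompatible protocol, drop the peer.
    DropPeer,
    /// Step 5: digests match, nothing to fetch.
    UpToDate,
    /// Step 6a: digests differ and we have a known revision.
    RequestDelta { from_rev: u64 },
    /// Step 6b: digests differ and the peer is new to us.
    RequestSnapshot,
}

/// Decide what to request after receiving the remote `Hello`/`HelloAck`.
/// Assumption: versions are compatible only when they are equal.
pub fn decide_sync(local_proto_ver: u32, cached: Option<&KnownPeer>, remote: &Hello) -> SyncAction {
    if remote.proto_ver != local_proto_ver {
        return SyncAction::DropPeer;
    }
    match cached {
        Some(known) if known.library_digest == remote.library_digest => SyncAction::UpToDate,
        Some(known) => SyncAction::RequestDelta { from_rev: known.library_rev },
        None => SyncAction::RequestSnapshot,
    }
}
```

Keeping the decision free of I/O makes it easy to unit test separately from the QUIC connection code.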
+
+## Library sync protocol
+
+### Summary and snapshot
+- `LibrarySummary { library_rev, library_digest, game_count }`
+- `LibrarySnapshot { library_rev, games: Vec<GameSummary> }`
+
+### Delta updates
+- `LibraryDelta { from_rev, to_rev, added, updated, removed }`
+- `removed` is a list of game IDs.
+- Deltas are idempotent; ignore if `to_rev` <= known rev.
+
+### GameSummary (concept)
+- `id`, `name`, `eti_version`, `size`, `downloaded`, `installed`
+- `manifest_hash` (hash of file list + sizes)
+- `availability` (e.g., `ready`, `downloading`, `local_only`)
+
+## When peers broadcast their game list
+- Only on changes, not on a timer.
+- Use a short debounce (1-2 seconds) to coalesce bursts of filesystem events.
+- Send `LibraryDelta` to known peers; send `LibrarySummary` on new connections.
+
+## Local game scanning: fast and low cost
+
+### Strategy
+1. Maintain a persistent on-disk index (per game):
+   - `manifest_hash`, total size, file list (optional), and a fingerprint
+     (version.ini mtime, .eti mtime/size, local install dir presence).
+2. Use filesystem watchers to update only changed games.
+3. Keep a fallback periodic scan with a long interval (minutes) to recover from
+   missed events.
+
+### Fast-path scanning
+- On startup, list only top-level game directories.
+- For each game, read a cheap fingerprint:
+  - `.eti` size + mtime
+  - `version.ini` mtime (if installed)
+  - presence of `local/` content
+- If fingerprint unchanged, reuse cached size and manifest hash.
+- Only run a recursive scan for new or changed games.
+
+### Result
+Most scans become O(number of game dirs), with full recursion only when needed.
+
+## File manifests and downloads
+- Keep `GetGame`/manifest requests, but keyed by `manifest_hash` so repeated
+  calls can be skipped when unchanged.
+- Downloads remain chunked QUIC streams with the existing integrity checks.
+
+## Fault tolerance rules
+- Every peer is keyed by `peer_id`, not by IP address.
+- `library_rev` is monotonic and guards against out-of-order updates.
+- Any mismatch or missing delta falls back to `LibrarySnapshot`.
+- Loss of goodbye is harmless; stale timeout is authoritative.
+
+## TODO: roadmap from current design to this one
+1. Protocol updates in `lanspread-proto`:
+   - Add `Hello`, `HelloAck`, `LibrarySummary`, `LibrarySnapshot`,
+     `LibraryDelta`, and optional `Goodbye`.
+   - Add `peer_id`, `library_rev`, and `manifest_hash` to relevant types.
+2. Peer identity:
+   - Introduce stable `peer_id` in `PeerInfo` and `PeerGameDB`.
+   - Map `peer_id` to current `SocketAddr` and update on IP changes.
+3. Discovery handshake:
+   - Advertise TXT records in mDNS.
+   - Add handshake in `run_peer_discovery` or connection setup.
+   - Keep compatibility fallback to `ListGames` for older peers.
+4. Library revisioning:
+   - Track `library_rev` locally.
+   - Apply `LibraryDelta` and reject stale revisions.
+   - Use `LibrarySnapshot` for first sync or delta mismatch.
+5. Local index + scan optimizations:
+   - Add cached index storage (e.g., `.lanspread/index.json`).
+   - Implement filesystem watchers with debounce.
+   - Add a low-frequency full scan as a safety net.
+6. Announce updates:
+   - Replace broad `AnnounceGames` with deltas.
+   - Send `LibrarySummary` on new connections.
+7. File manifest caching:
+   - Store per-game `manifest_hash` and only fetch details when changed.
+8. Liveness:
+   - Reduce ping frequency; update `last_seen` on any message.
+   - Add optional `Goodbye` on shutdown paths.
+9. Tests:
+   - Delta apply/merge, rev ordering, manifest hashing, and scan cache behavior.
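The delta rules from "Library sync protocol" and "Fault tolerance rules" (ignore stale deltas, fall back to a snapshot on any gap) might look roughly like the sketch below. It is not the crate's implementation: `PeerLibrary`, `DeltaOutcome`, and `apply_delta` are hypothetical names, `GameSummary` is reduced to three of its fields, and a delta is assumed to apply only when its `from_rev` equals the revision we currently hold.

```rust
use std::collections::HashMap;

/// Reduced `GameSummary` with only the fields needed for the example.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GameSummary {
    pub id: String,
    pub name: String,
    pub manifest_hash: String,
}

#[derive(Debug, Clone)]
pub struct LibraryDelta {
    pub from_rev: u64,
    pub to_rev: u64,
    pub added: Vec<GameSummary>,
    pub updated: Vec<GameSummary>,
    /// IDs of removed games.
    pub removed: Vec<String>,
}

/// Hypothetical result of trying to apply a delta.
#[derive(Debug, PartialEq, Eq)]
pub enum DeltaOutcome {
    Applied,
    /// `to_rev` <= known rev: already seen, safe to ignore.
    IgnoredStale,
    /// Gap or mismatch: the caller should request `LibrarySnapshot`.
    NeedSnapshot,
}

/// Hypothetical per-peer view of a remote library.
#[derive(Debug, Default)]
pub struct PeerLibrary {
    pub rev: u64,
    pub games: HashMap<String, GameSummary>,
}

impl PeerLibrary {
    pub fn apply_delta(&mut self, delta: LibraryDelta) -> DeltaOutcome {
        if delta.to_rev <= self.rev {
            return DeltaOutcome::IgnoredStale;
        }
        // Assumption: a delta must start exactly at our known revision.
        if delta.from_rev != self.rev {
            return DeltaOutcome::NeedSnapshot;
        }
        for id in &delta.removed {
            self.games.remove(id);
        }
        // Added and updated entries both overwrite by game ID.
        for game in delta.added.into_iter().chain(delta.updated) {
            self.games.insert(game.id.clone(), game);
        }
        self.rev = delta.to_rev;
        DeltaOutcome::Applied
    }
}
```

Because a repeated delta hits the `IgnoredStale` branch, resending the same `LibraryDelta` leaves the stored library unchanged, which is the idempotence property the protocol section asks for.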