348a02c35f
Peers discovered over mDNS could still attribute later library sync traffic to temporary QUIC source ports. In a real GUI LAN run this made Host B try to push its library to Host A's outbound port instead of Host A's advertised listener, so Host A discovered the peer but never saw its games. Carry the stable listener address in Hello and HelloAck, and key library sync messages by peer_id instead of inferring identity from the transport source address. The handshake path now explicitly refreshes an empty peer library from the known listener address, matching the reliability of the direct-connect CLI path without overwriting richer snapshot state when it already arrived. This changes the current wire protocol, so PROTOCOL_VERSION is bumped to 3 and all peers must be rebuilt together. The architecture note now documents that listener addresses come from mDNS or Hello/HelloAck, never from ephemeral QUIC source ports. Test Plan: - just fmt - just test - just clippy - just build - git diff --check Refs: Local Linux/Win11 GUI LAN test logs from 2026-05-18.
8.2 KiB
8.2 KiB
lanspread-peer proposed protocol and architecture
This document proposes a tighter, more fault-tolerant protocol while keeping the current idea: mDNS discovery, QUIC transport, on-demand metadata, and chunked file transfers.
Goals (unchanged)
- Local LAN discovery via mDNS.
- QUIC + JSON messages for control, raw streams for file data.
- UI drives operations through
PeerCommand, peers remain headless. - Peers can appear/disappear at any time without data loss.
Peer lifecycle and message flow
1) Startup and advertise
- Start QUIC server.
- Advertise via mDNS with TXT records:
peer_id(stable ID, not tied to IP)proto_verlibrary_rev(monotonic local library revision)- optional
hostname
2) Discovery and handshake
When a peer is discovered:
- Connect and send
Hello { peer_id, proto_ver, library_rev, library_digest, features }.Helloalso carries the sender's advertisedlisten_addr; the QUIC source port is only a temporary transport port and must not be recorded as the peer's listener. - Receive
HelloAck { peer_id, proto_ver, listen_addr, library_rev, library_digest, features }. - If the remote
peer_idis already known but the address changed, update it. - If protocol versions are incompatible, drop the peer (and keep mDNS watching).
- If library digests match, do nothing else.
- If digests differ:
- If we have a known
library_revfor that peer, requestLibraryDelta. - Otherwise request
LibrarySnapshot.
- If we have a known
3) Steady state
- Any message updates
last_seen. - Pings run only when idle (or on a longer interval), not every 5 seconds.
- Library updates are pushed as deltas, debounced and coalesced.
4) Shutdown
- Optional
Goodbye { peer_id }lets others remove the peer quickly. - If a peer vanishes without goodbye, stale timeout + ping removal handle it.
- Goodbye is a hint, never required for correctness.
Library sync protocol
Summary and snapshot
LibrarySummary { peer_id, summary: { library_rev, library_digest, game_count } }LibrarySnapshot { peer_id, snapshot: { library_rev, games: Vec<GameSummary> } }
Delta updates
LibraryDelta { peer_id, delta: { from_rev, to_rev, added, updated, removed } }removedis a list of game IDs.- Deltas are idempotent; ignore if
to_rev<= known rev.
GameSummary (concept)
id,name,eti_version,size,downloaded,installedmanifest_hash(hash of file list + sizes)availability(e.g.,ready,downloading,local_only)
When peers broadcast their game list
- Only on changes, not on a timer.
- Filesystem events are gated per game ID instead of time-debounced:
- an active operation lock drops events for that game;
- a rescan already running for the ID sets a rescan-pending flag;
- the running rescan loops once more when that flag was set.
- Send
LibraryDeltato known peers; sendLibrarySummaryon new connections.
Local game scanning: fast and low cost
Strategy
- Maintain a persistent on-disk index (per game):
manifest_hash, total size, file list (optional), and a fingerprint (root-levelversion.inimtime, root-level.etimtime/size, andlocal/directory presence).
- Use filesystem watchers to update only changed games.
- Keep a 300-second fallback scan to recover from missed events.
Fast-path scanning
- On startup, list only top-level game directories.
- For each game, read a cheap fingerprint:
- root-level
.etifile names, sizes, and mtimes - root-level
version.inimtime - presence of
local/as a directory
- root-level
- If fingerprint unchanged, reuse cached size and manifest hash.
- Only run a recursive scan for new or changed games.
Local State and Recovery
Downloaded and installed are independent predicates:
downloadedis true only when<game_root>/version.iniexists as a regular file. The sentinel is written last through.version.ini.tmpand atomic rename. An interrupted replacement leaves no restored old sentinel because archive bytes may already have changed.installedis true when<game_root>/local/is a directory. The contents oflocal/are user-owned and are skipped by manifests, fingerprints, and file serving.
Reserved per-game paths:
.version.ini.tmpand.version.ini.discardedare download transaction scratch files and are swept during startup recovery..local.installing/is extraction staging..local.backup/holds the previous install while an update or uninstall is in flight..lanspread.jsonis the atomic per-game intent log..lanspread_ownedinside.local.*directories proves Lanspread ownership when the current intent isNone.
Recovery reads .lanspread.json and combines the recorded intent with the
observed local/, .local.installing/, and .local.backup/ state. Intent
states Installing, Updating, and Uninstalling prove ownership of the
corresponding reserved directories even if the marker was not flushed before a
crash. With intent None, markerless .local.* directories are left untouched.
Result
Most scans become O(number of game dirs), with full recursion only when needed.
File manifests and downloads
- Keep
GetGame/manifest requests, but keyed bymanifest_hashso repeated calls can be skipped when unchanged. - Downloads remain chunked QUIC streams with the existing integrity checks.
- A game is transferable only when its ID is in the catalog, no operation is
active for that ID, and the root-level
version.inisentinel exists. local/paths are never served, even if a stale or malicious manifest request asks for them.
Fault tolerance rules
- Every peer is keyed by
peer_id, not by IP address. - Peer addresses are listener addresses from mDNS or
Hello/HelloAck, never ephemeral QUIC source ports. library_revis monotonic and guards against out-of-order updates.- Any mismatch or missing delta falls back to
LibrarySnapshot. - Loss of goodbye is harmless; stale timeout is authoritative.
Roadmap from current design to this one
- Protocol updates in
lanspread-proto:- Define
Hello,HelloAck,LibrarySummary,LibrarySnapshot,LibraryDelta, and optionalGoodbyemessages. - Thread
peer_id,library_rev, andmanifest_hashthrough all library and manifest-bearing types. - Make
HelloandHelloAckcarry the sender'slisten_addr,library_rev, andlibrary_digestso both sides can record stable listener addresses and immediately selectLibraryDeltavsLibrarySnapshot.
- Define
- Peer identity:
- Persist a stable
peer_id(UUID) in the peer config and inject it intoPeerInfoandPeerGameDBat startup. - Track
peer_id -> SocketAddrin the discovery table and update the address on any incoming handshake or mDNS refresh.
- Persist a stable
- Discovery handshake:
- Publish
peer_idandlibrary_revin mDNS TXT records to avoid immediate TCP/QUIC roundtrips when nothing changed. - Add a lightweight handshake in
run_peer_discoverythat exchangesHello/HelloAckbefore any library sync. - Ignore peers that do not advertise the current protocol version.
- Publish
- Library revisioning:
- Store a monotonic
library_revlocally and increment only after a successful index refresh completes. - Apply
LibraryDeltawhenlibrary_revmatches; reject stale or future revisions and requestLibrarySnapshotinstead. - Cache the last accepted
manifest_hashper peer to short-circuit manifest requests when unchanged.
- Store a monotonic
- Local index + scan optimizations:
- Introduce a cached index file (e.g.,
.lanspread/index.json) that stores per-root fingerprints and computed manifests. - Use filesystem watchers with a debounce window to collect changes and incrementally update the cache.
- Schedule a low-frequency full scan to reconcile missed watcher events.
- Introduce a cached index file (e.g.,
- Announce updates:
- Broadcast
LibraryDeltaupdates keyed bylibrary_rev. - Send
LibrarySummaryon new connections to seed the delta flow.
- Broadcast
- File manifest caching:
- Store per-game
manifest_hashand only fetch details when changed.
- Store per-game
- Liveness:
- Reduce ping frequency; update
last_seenon any message. - Add optional
Goodbyeon shutdown paths.
- Reduce ping frequency; update
- Tests:
- Delta apply/merge, rev ordering, manifest hashing, and scan cache behavior.