Follow-up hardening for 348a02c, where `listen_addr` was added to Hello and
HelloAck as `Option<SocketAddr>`. Code review surfaced three concrete problems
that the previous commit left open:
1. Cold-start asymmetry. Discovery and the QUIC/mDNS advertiser are spawned
concurrently. If discovery saw a cached peer advertisement before our own
advertiser had written `ctx.local_peer_addr`, our outbound Hello carried
`listen_addr: None`. The receiver's `peer_record_addr` then returned `None`
and silently dropped the Hello while we still recorded their HelloAck, so
peer A learned about peer B but B never learned about A until a later
handshake happened to win the race.
2. Duplicate game-list pipeline. The previous commit added
`refresh_peer_games`, which post-handshake issued a `ListGames` to fetch
`peer.games`. The library-sync path (`LibrarySnapshot`) already populates
the same field. Both could race on first contact and overwrite each other.
Worse, `refresh_peer_games` was misnamed: a `peer_game_count > 0` guard
turned it into a fetch-once-then-no-op helper, while
`handle_library_summary` independently re-triggered a full handshake when
`previous_count == 0` was observed, producing a redundant ping-pong on
every first contact.
3. Argument explosion. `perform_handshake_with_peer`, `spawn_library_resync`,
and `after_peer_library_recorded` had grown to 6-8 individual parameters
and acquired `#[allow(clippy::too_many_arguments)]` opt-outs. Every caller
was destructuring the same fields out of `Ctx`/`PeerCtx`.
Changes (all in one commit because they jointly enforce the same invariant:
"a peer is only ever recorded by its listener address, and the local
listener address must exist before we participate in the protocol"):
- `Hello.listen_addr` and `HelloAck.listen_addr` are now `SocketAddr`, not
`Option<SocketAddr>`. Wire-incompatible, but PROTOCOL_VERSION already moved
to 3 in 348a02c so no additional version bump is needed.
- `required_listen_addr` reads `ctx.local_peer_addr` and returns an
`eyre::Result`; `build_hello_from_state` and `build_hello_ack` both call
it, so an outbound or inbound Hello can no longer be constructed before
the local QUIC listener is bound. The inbound path maps this into a
`Response::InternalPeerError` so the remote peer fails cleanly instead of
seeing a malformed HelloAck.
- `run_peer_discovery` blocks on `wait_for_local_peer_addr` (25 ms poll,
shutdown-aware) before subscribing to the mDNS browser. This closes the
cold-start race for outbound handshakes at the source.
- `refresh_peer_games`, `request_game_list_from_peer`, and the
`previous_count == 0` re-handshake trigger are removed. The post-handshake
flow now relies solely on `LibrarySummary`/`LibrarySnapshot`/`LibraryDelta`
for peer-library state; `ListGames` survives only for the
`request_game_details_*` paths that fetch per-game file descriptions on
demand.
- New `HandshakeCtx` (with `from_ctx` and `from_peer_ctx` constructors)
replaces the long argument lists. All `too_many_arguments` allow-attrs in
`handshake.rs` are gone, and call sites in `handlers.rs`, `discovery.rs`,
and `stream.rs` collapse to a single clone.
- `handle_library_delta` no longer acquires a read lock on the apply path:
the `peer_addr` lookup moved into the `else` resync branch where it is
actually needed.
- `accept_inbound_hello`'s `remote_addr` parameter is renamed to
`transport_addr`. It is now used only for warn-log formatting, and the
new name signals that this is the ephemeral QUIC source port, never the
authoritative listener address that gets recorded.
User-visible effect: on cold start, peers can no longer end up with an
asymmetric view of each other ("A sees B but B never sees A"). First-contact
library sync now does one handshake plus one snapshot/delta exchange instead
of the previous handshake + ListGames + redundant follow-up handshake. The
direct-connect CLI path (`handle_connect_peer_command`) now fails fast with
"local peer listener address is not ready" if invoked before the QUIC server
has bound; this is intentional - the previous behaviour would have sent a
Hello that the receiver had to silently discard.
Test Plan:
- just fmt
- just clippy
- just test (80 peer + 3 cli + 5 tauri tests pass)
- just build
- Manual: bring up `just peer-cli-alpha`/`bravo`/`charlie`, confirm symmetric
peer discovery and that games show up on every side after one library
digest cycle, with no duplicated ListGames traffic in trace logs.
Refs: Review feedback on commit 348a02c (listener-address handshake fix).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Peers discovered over mDNS could still attribute later library sync traffic to
temporary QUIC source ports. In a real GUI LAN run this made Host B try to
push its library to Host A's outbound port instead of Host A's advertised
listener, so Host A discovered the peer but never saw its games.
Carry the stable listener address in Hello and HelloAck, and key library sync
messages by peer_id instead of inferring identity from the transport source
address. The handshake path now explicitly refreshes an empty peer library from
the known listener address, matching the reliability of the direct-connect CLI
path without overwriting richer snapshot state when it already arrived.
This changes the current wire protocol, so PROTOCOL_VERSION is bumped to 3 and
all peers must be rebuilt together. The architecture note now documents that
listener addresses come from mDNS or Hello/HelloAck, never from ephemeral QUIC
source ports.
Test Plan:
- just fmt
- just test
- just clippy
- just build
- git diff --check
Refs: Local Linux/Win11 GUI LAN test logs from 2026-05-18.
The follow-up backlog had drifted into three settled peer/runtime issues: the
legacy game-list fallback contradicted the one-wire-version policy, the Tauri
shell still re-derived local install state from disk after peer snapshots, and
`Availability::Downloading` existed even though active operations are already
reported through a separate operation table.
Remove the legacy `AnnounceGames` request and fallback service. Discovery now
ignores peers that do not advertise the current protocol and a peer id, and
library changes are sent through the current delta path only. This keeps the
runtime aligned with the documented current-build-only interoperability model.
Make peer `LocalGamesUpdated` snapshots authoritative for local fields in the
Tauri database. The GUI-side catalog still owns static metadata such as names,
sizes, and descriptions, but downloaded, installed, local version, and
availability now come from the peer runtime instead of a second whole-library
filesystem scan. Snapshot reconciliation also pins the missing-begin and
missing-finish lifecycle cases in tests.
Collapse availability back to the settled `Ready` and `LocalOnly` states.
Aggregation now counts only `Ready` peers as download sources, and the frontend
no longer carries a dead `Downloading` enum value.
The core peer also exposes the small non-GUI hooks needed by scripted callers:
startup options for state and mDNS, a local-ready event, direct connection, peer
snapshots, and an explicit post-download install policy. Those hooks reuse the
same current protocol path and do not add compatibility shims.
Test Plan:
- `git diff --check`
- `just fmt`
- `just clippy`
- `just test`
Refs: BACKLOG.md, FINDINGS.md, IMPL_DECISIONS.md
Game::availability used string labels that were carried through persisted
library data, protocol summaries, and the Tauri-facing game payload. That
allowed invalid states to exist and required legacy summary conversion code to
defensively map strings back into protocol availability values.
Move Availability to lanspread-db and re-export it from lanspread-proto so the
persisted Game type and wire GameSummary type share one serde enum. The JSON
spelling stays Ready, Downloading, or LocalOnly, so the serialized shape does
not change for current library indexes or peer payloads.
Add typed helpers for sentinel-derived download state. Game::set_downloaded
keeps downloaded and Ready/LocalOnly in lockstep and intentionally collapses
non-ready local state, including Downloading, back to LocalOnly. That matches
the current local-summary contract where active operations are suppressed
instead of advertised as Downloading. Game::normalized_availability keeps the
legacy Game-to-summary path from publishing an inconsistent Ready value when
downloaded is false.
Update the follow-up status note so typed availability is no longer listed as
open work.
Test Plan:
- just fmt
- just test
- just clippy
- just build
Refs: none
The follow-up review found a few stale lifecycle edges around local game
transactions. Recovery could sweep active roots, post-operation refreshes
still re-ran full startup recovery, and the UI kept inferring local-only state
from downloaded and installed flags instead of the backend availability.
This updates the peer lifecycle so startup recovery skips active operations,
install/update/uninstall refresh only the affected game after the operation
guard is dropped, and path-changing game-directory updates are rejected while
operations are active. It also removes the dead UpdateGame command, drops the
unused manifest_hash write field while preserving old JSON reads, renames the
internal install-finished event, and carries availability through the DB,
peer summaries, Tauri refreshes, and the React model.
The included follow-up documents record the review source, implementation
decisions, and the remaining FOLLOW_UP_2.md work so later commits can stay
small instead of reopening the completed plan items.
Test Plan:
- git diff --check
- just fmt
- just clippy
- just test
Follow-up-Plan: FOLLOW_UP_PLAN.md
- more robust client <-> server connection
- new client event: DownloadGameFilesFailed
- 3 seconds to reconnect
- retry forever if server is gone and never lose a UI request
- code cleanup here and there (mostly server)