Follow-up hardening for 348a02c, where `listen_addr` was added to Hello and
HelloAck as `Option<SocketAddr>`. Code review surfaced three concrete problems
that the previous commit left open:
1. Cold-start asymmetry. Discovery and the QUIC/mDNS advertiser are spawned
concurrently. If discovery saw a cached peer advertisement before our own
advertiser had written `ctx.local_peer_addr`, our outbound Hello carried
`listen_addr: None`. The receiver's `peer_record_addr` then returned `None`
and silently dropped the Hello while we still recorded their HelloAck, so
peer A learned about peer B but B never learned about A until a later
handshake happened to win the race.
2. Duplicate game-list pipeline. The previous commit added
`refresh_peer_games`, which post-handshake issued a `ListGames` to fetch
`peer.games`. The library-sync path (`LibrarySnapshot`) already populates
the same field. Both could race on first contact and overwrite each other.
Worse, `refresh_peer_games` was misnamed: a `peer_game_count > 0` guard
turned it into a fetch-once-then-no-op helper, while
`handle_library_summary` independently re-triggered a full handshake when
`previous_count == 0` was observed, producing a redundant ping-pong on
every first contact.
3. Argument explosion. `perform_handshake_with_peer`, `spawn_library_resync`,
and `after_peer_library_recorded` had grown to 6-8 individual parameters
and acquired `#[allow(clippy::too_many_arguments)]` opt-outs. Every caller
was destructuring the same fields out of `Ctx`/`PeerCtx`.
Changes (all in one commit because they jointly enforce the same invariant:
"a peer is only ever recorded by its listener address, and the local
listener address must exist before we participate in the protocol"):
- `Hello.listen_addr` and `HelloAck.listen_addr` are now `SocketAddr`, not
`Option<SocketAddr>`. Wire-incompatible, but PROTOCOL_VERSION already moved
to 3 in 348a02c so no additional version bump is needed.
- `required_listen_addr` reads `ctx.local_peer_addr` and returns an
`eyre::Result`; `build_hello_from_state` and `build_hello_ack` both call
it, so an outbound or inbound Hello can no longer be constructed before
the local QUIC listener is bound. The inbound path maps this into a
`Response::InternalPeerError` so the remote peer fails cleanly instead of
seeing a malformed HelloAck.
- `run_peer_discovery` blocks on `wait_for_local_peer_addr` (25 ms poll,
shutdown-aware) before subscribing to the mDNS browser. This closes the
cold-start race for outbound handshakes at the source.
- `refresh_peer_games`, `request_game_list_from_peer`, and the
`previous_count == 0` re-handshake trigger are removed. The post-handshake
flow now relies solely on `LibrarySummary`/`LibrarySnapshot`/`LibraryDelta`
for peer-library state; `ListGames` survives only for the
`request_game_details_*` paths that fetch per-game file descriptions on
demand.
- New `HandshakeCtx` (with `from_ctx` and `from_peer_ctx` constructors)
replaces the long argument lists. All `too_many_arguments` allow-attrs in
`handshake.rs` are gone, and call sites in `handlers.rs`, `discovery.rs`,
and `stream.rs` collapse to a single clone.
- `handle_library_delta` no longer acquires a read lock on the apply path:
the `peer_addr` lookup moved into the `else` resync branch where it is
actually needed.
- `accept_inbound_hello`'s `remote_addr` parameter is renamed to
`transport_addr`. It is now used only for warn-log formatting, and the
new name signals that this is the ephemeral QUIC source port, never the
authoritative listener address that gets recorded.
User-visible effect: on cold start, peers can no longer end up with an
asymmetric view of each other ("A sees B but B never sees A"). First-contact
library sync now does one handshake plus one snapshot/delta exchange instead
of the previous handshake + ListGames + redundant follow-up handshake. The
direct-connect CLI path (`handle_connect_peer_command`) now fails fast with
"local peer listener address is not ready" if invoked before the QUIC server
has bound; this is intentional - the previous behaviour would have sent a
Hello that the receiver had to silently discard.
Test Plan:
- just fmt
- just clippy
- just test (80 peer + 3 cli + 5 tauri tests pass)
- just build
- Manual: bring up `just peer-cli-alpha`/`bravo`/`charlie`, confirm symmetric
peer discovery and that games show up on every side after one library
digest cycle, with no duplicated ListGames traffic in trace logs.
Refs: Review feedback on commit 348a02c (listener-address handshake fix).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The follow-up backlog had drifted into three settled peer/runtime issues: the
legacy game-list fallback contradicted the one-wire-version policy, the Tauri
shell still re-derived local install state from disk after peer snapshots, and
`Availability::Downloading` existed even though active operations are already
reported through a separate operation table.
Remove the legacy `AnnounceGames` request and fallback service. Discovery now
ignores peers that do not advertise the current protocol and a peer id, and
library changes are sent through the current delta path only. This keeps the
runtime aligned with the documented current-build-only interoperability model.
Make peer `LocalGamesUpdated` snapshots authoritative for local fields in the
Tauri database. The GUI-side catalog still owns static metadata such as names,
sizes, and descriptions, but downloaded, installed, local version, and
availability now come from the peer runtime instead of a second whole-library
filesystem scan. Snapshot reconciliation also pins the missing-begin and
missing-finish lifecycle cases in tests.
Collapse availability back to the settled `Ready` and `LocalOnly` states.
Aggregation now counts only `Ready` peers as download sources, and the frontend
no longer carries a dead `Downloading` enum value.
The core peer also exposes the small non-GUI hooks needed by scripted callers:
startup options for state and mDNS, a local-ready event, direct connection, peer
snapshots, and an explicit post-download install policy. Those hooks reuse the
same current protocol path and do not add compatibility shims.
Test Plan:
- `git diff --check`
- `just fmt`
- `just clippy`
- `just test`
Refs: BACKLOG.md, FINDINGS.md, IMPL_DECISIONS.md
The peer runtime used to spawn each long-running service inline inside
run_peer. That made the startup path harder to scan because service names,
clone setup, and task error handling were interleaved with the command loop.
Move the task wrappers into a startup module and leave run_peer as the
lifecycle overview: create shared context, start services, handle commands,
then send shutdown goodbyes. The spawned services and their error handling are
unchanged; only the ownership plumbing moved into named helpers.
Test Plan:
- cargo clippy
- cargo clippy --benches
- cargo clippy --tests
- cargo +nightly fmt
Refs: none