refactor(peer): tighten listener-addr handshake invariant
Follow-up hardening for348a02c, where `listen_addr` was added to Hello and HelloAck as `Option<SocketAddr>`. Code review surfaced three concrete problems that the previous commit left open: 1. Cold-start asymmetry. Discovery and the QUIC/mDNS advertiser are spawned concurrently. If discovery saw a cached peer advertisement before our own advertiser had written `ctx.local_peer_addr`, our outbound Hello carried `listen_addr: None`. The receiver's `peer_record_addr` then returned `None` and silently dropped the Hello while we still recorded their HelloAck, so peer A learned about peer B but B never learned about A until a later handshake happened to win the race. 2. Duplicate game-list pipeline. The previous commit added `refresh_peer_games`, which post-handshake issued a `ListGames` to fetch `peer.games`. The library-sync path (`LibrarySnapshot`) already populates the same field. Both could race on first contact and overwrite each other. Worse, `refresh_peer_games` was misnamed: a `peer_game_count > 0` guard turned it into a fetch-once-then-no-op helper, while `handle_library_summary` independently re-triggered a full handshake when `previous_count == 0` was observed, producing a redundant ping-pong on every first contact. 3. Argument explosion. `perform_handshake_with_peer`, `spawn_library_resync`, and `after_peer_library_recorded` had grown to 6-8 individual parameters and acquired `#[allow(clippy::too_many_arguments)]` opt-outs. Every caller was destructuring the same fields out of `Ctx`/`PeerCtx`. Changes (all in one commit because they jointly enforce the same invariant: "a peer is only ever recorded by its listener address, and the local listener address must exist before we participate in the protocol"): - `Hello.listen_addr` and `HelloAck.listen_addr` are now `SocketAddr`, not `Option<SocketAddr>`. Wire-incompatible, but PROTOCOL_VERSION already moved to 3 in348a02cso no additional version bump is needed. - `required_listen_addr` reads `ctx.local_peer_addr` and returns an `eyre::Result`; `build_hello_from_state` and `build_hello_ack` both call it, so an outbound or inbound Hello can no longer be constructed before the local QUIC listener is bound. The inbound path maps this into a `Response::InternalPeerError` so the remote peer fails cleanly instead of seeing a malformed HelloAck. - `run_peer_discovery` blocks on `wait_for_local_peer_addr` (25 ms poll, shutdown-aware) before subscribing to the mDNS browser. This closes the cold-start race for outbound handshakes at the source. - `refresh_peer_games`, `request_game_list_from_peer`, and the `previous_count == 0` re-handshake trigger are removed. The post-handshake flow now relies solely on `LibrarySummary`/`LibrarySnapshot`/`LibraryDelta` for peer-library state; `ListGames` survives only for the `request_game_details_*` paths that fetch per-game file descriptions on demand. - New `HandshakeCtx` (with `from_ctx` and `from_peer_ctx` constructors) replaces the long argument lists. All `too_many_arguments` allow-attrs in `handshake.rs` are gone, and call sites in `handlers.rs`, `discovery.rs`, and `stream.rs` collapse to a single clone. - `handle_library_delta` no longer acquires a read lock on the apply path: the `peer_addr` lookup moved into the `else` resync branch where it is actually needed. - `accept_inbound_hello`'s `remote_addr` parameter is renamed to `transport_addr`. It is now used only for warn-log formatting, and the new name signals that this is the ephemeral QUIC source port, never the authoritative listener address that gets recorded. User-visible effect: on cold start, peers can no longer end up with an asymmetric view of each other ("A sees B but B never sees A"). First-contact library sync now does one handshake plus one snapshot/delta exchange instead of the previous handshake + ListGames + redundant follow-up handshake. The direct-connect CLI path (`handle_connect_peer_command`) now fails fast with "local peer listener address is not ready" if invoked before the QUIC server has bound; this is intentional - the previous behaviour would have sent a Hello that the receiver had to silently discard. Test Plan: - just fmt - just clippy - just test (80 peer + 3 cli + 5 tauri tests pass) - just build - Manual: bring up `just peer-cli-alpha`/`bravo`/`charlie`, confirm symmetric peer discovery and that games show up on every side after one library digest cycle, with no duplicated ListGames traffic in trace logs. Refs: Review feedback on commit348a02c(listener-address handshake fix). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -11,7 +11,7 @@ use crate::{
|
||||
context::Ctx,
|
||||
events,
|
||||
peer_db::PeerId,
|
||||
services::handshake::perform_handshake_with_peer,
|
||||
services::handshake::{HandshakeCtx, perform_handshake_with_peer},
|
||||
};
|
||||
|
||||
struct MdnsPeerInfo {
|
||||
@@ -29,6 +29,10 @@ pub async fn run_peer_discovery(
|
||||
) -> eyre::Result<()> {
|
||||
log::info!("Starting peer discovery task");
|
||||
|
||||
if !wait_for_local_peer_addr(&ctx).await {
|
||||
return Ok(());
|
||||
}
|
||||
|
||||
let service_type = LANSPREAD_SERVICE_TYPE.to_string();
|
||||
let (service_tx, mut service_rx) = tokio::sync::mpsc::unbounded_channel();
|
||||
let worker_shutdown = ctx.shutdown.clone();
|
||||
@@ -93,6 +97,19 @@ pub async fn run_peer_discovery(
|
||||
}
|
||||
}
|
||||
|
||||
async fn wait_for_local_peer_addr(ctx: &Ctx) -> bool {
|
||||
loop {
|
||||
if ctx.local_peer_addr.read().await.is_some() {
|
||||
return true;
|
||||
}
|
||||
|
||||
tokio::select! {
|
||||
() = ctx.shutdown.cancelled() => return false,
|
||||
() = tokio::time::sleep(Duration::from_millis(25)) => {}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
fn parse_mdns_peer(service: &MdnsService) -> MdnsPeerInfo {
|
||||
MdnsPeerInfo {
|
||||
addr: service.addr,
|
||||
@@ -161,33 +178,21 @@ async fn handle_discovered_peer(
|
||||
}
|
||||
|
||||
if upsert.is_new || upsert.addr_changed {
|
||||
spawn_protocol_negotiation(&info, ctx, tx_notify_ui.clone(), peer_id);
|
||||
spawn_protocol_negotiation(&info, ctx, tx_notify_ui, peer_id);
|
||||
}
|
||||
}
|
||||
|
||||
fn spawn_protocol_negotiation(
|
||||
info: &MdnsPeerInfo,
|
||||
ctx: &Ctx,
|
||||
tx_notify_ui: UnboundedSender<PeerEvent>,
|
||||
tx_notify_ui: &UnboundedSender<PeerEvent>,
|
||||
peer_id: PeerId,
|
||||
) {
|
||||
let peer_addr = info.addr;
|
||||
let peer_id_arc = ctx.peer_id.clone();
|
||||
let local_peer_addr = ctx.local_peer_addr.clone();
|
||||
let local_library = ctx.local_library.clone();
|
||||
let peer_game_db = ctx.peer_game_db.clone();
|
||||
let handshake_ctx = HandshakeCtx::from_ctx(ctx, tx_notify_ui);
|
||||
|
||||
ctx.task_tracker.spawn(async move {
|
||||
if let Err(err) = perform_handshake_with_peer(
|
||||
peer_id_arc,
|
||||
local_peer_addr,
|
||||
local_library,
|
||||
peer_game_db,
|
||||
tx_notify_ui,
|
||||
peer_addr,
|
||||
Some(peer_id),
|
||||
)
|
||||
.await
|
||||
if let Err(err) = perform_handshake_with_peer(handshake_ctx, peer_addr, Some(peer_id)).await
|
||||
{
|
||||
log::warn!("Failed to negotiate protocol with peer {peer_addr}: {err}");
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user