Files
lanspread/crates/lanspread-peer
ddidderr 2d53848e0c test(peer-cli): harden S1-S47 scenario suite against vacuous and flaky checks
An adversarial audit of the headless peer-to-peer scenario suite
(crates/lanspread-peer-cli/scripts/run_extended_scenarios.py, driven by
`just peer-cli-tests`) found assertions that passed even when the behavior
they claim to test was not happening, plus timing races and doc-vs-code
divergences. A full baseline run (S1-S47) passed beforehand, confirming these
were test-quality gaps, not peer regressions; the baseline output itself
exposed the worst offenders -- e.g. S14 chunk totals {128 MiB, 1 MiB} (a
two-chunk file whose "balanced within one chunk" check can never fail) and
S16/S18 serving the whole ~120 MiB alienswarm.eti from a single source, so
fanout and retry were never exercised.

Test-correctness fixes (a broken behavior could previously pass green):
- S18: the "no download-failed" check was dead -- it reused a LineWaiter
  already advanced past download-finished, so it scanned an empty tail.
  Replaced with assert_no_event_since over the whole window. Switched to a
  4*CHUNK_SIZE sparse archive so both peers get chunks; the test now proves the
  download SURVIVES a mid-download source kill (every byte delivered, survivor
  served part, no download-failed, clean diff). Retry-onto-survivor is the
  mechanism but is not asserted: the kill/serve race against `docker rm -f`
  cannot be forced, so asserting an exact split would be flaky.
- S7: the only check was a diff against byte-identical ggoo fixtures, so it was
  source-agnostic. Added assertions that the download committed exactly once,
  every chunk came from the validated two-peer set, both peers served, and no
  chunk was fetched twice.
- S14: enlarged to 4*CHUNK_SIZE so the balance check can fail under a 3+1
  imbalance; asserts an exact 2+2 split summing to the file size.
- S16: inflated the .eti to 2*CHUNK_SIZE so it fans out across both
  catalog-version peers (the stock 120 MiB fixture is a single chunk).
- S37: validate the throughput rate fields (positive, self-consistent
  mbit/mib == 8.388608, mib_per_s == bytes/duration), not just the byte count.
- S35: assert the source actually advertises the unknown game before checking
  it is filtered, so "absent" means "filtered" and not "never sent".
- S15: cross-check each peer's raw advertised eti_version via list-peers; the
  list-games eti_game_version is synthesized from the catalog and can only ever
  equal the asserted value.
- S2: poll for library convergence and verify the bidirectional exchange
  (bravo sees alpha's 3 games, not just alpha seeing bravo's 4).
- S12/S28: require the gating unit test to appear as "<name> ... ok" so an
  #[ignore]d (un-run) test no longer satisfies the check.
- S24/S25: assert the requested install=false final state.
- S34: assert exactly 21 coherent chunks (20 files + version.ini), 21 distinct
  paths, no duplicates, instead of a >= 21 floor.

Flake fixes:
- S19: force-kill the sole source right after download-begin on a 4*CHUNK_SIZE
  file and accept download-failed or download-peers-gone. The old graceful
  shutdown on a single-chunk file could let the transfer finish first, turning
  the expected failure into a download-finished. A chunk may complete before
  the kill lands, but the full transfer cannot, so the failure is deterministic.
- S26: use a large sparse source so the first operation is reliably still
  active when the duplicate request is issued (TOCTOU on active_operations);
  also assert the active operation == "Downloading".
- S11: drop the "listener address must change" assertion -- it tested the OS
  ephemeral-port allocator and could fail spuriously; keep the same-identity /
  no-duplicate invariant.

Coverage and determinism:
- S27: add handshake::tests::inbound_hello_from_self_is_ignored for the
  protocol-level self guard. The CLI scenario only exercises the CLI
  string-compare guard, which short-circuits before any network call, so the
  peer-crate guard had no test.
- find_fixture_game now iterates sorted(FIXTURES) so the ambiguous cnctw
  (fixture-bravo/multi/solid) resolves deterministically to fixture-bravo.

Reviewed and deliberately left as-is (documented in the run log): S20, S21,
S30, S32/S39/S44 absence checks, S42 IP-order precondition, S45.

PEER_CLI_SCENARIOS.md rows S2, S11, S14, S16, S18, S19, S27 are updated to
match the harness, and a dated run-log entry records the audit, the fixes, the
accepted items, and the live-run evidence.

Test Plan:
- `just peer-cli-tests` (rebuilds the image, runs S1-S47 in Docker): baseline
  passed; post-fix passed; a final run on the exact committed code passed
  47/47. Evidence: S14 {268435456, 268435456} balanced 2+2; S16 .eti split
  across B and C {134217728, 134217728}; S18 all 536870912 bytes delivered with
  no download-failed; S19 deterministic download-failed; S37 ~874 MiB/s.
- `just test` (incl. inbound_hello_from_self_is_ignored), `just clippy`
  (-D warnings, all-targets), and `just fmt` all pass.

Refs: PEER_CLI_SCENARIOS.md scenario matrix and 2026-06-21 run-log entry.
2026-06-21 10:23:45 +02:00
..

lanspread-peer

lanspread-peer is the networking runtime that lets Lanspread nodes find each other on the local network, exchange library metadata, and transfer game files. It is designed to run headless other crates (most notably lanspread-tauri-deno-ts) embed it and drive it through a channel-based API.

Runtime Overview

  • start_peer(game_dir, tx_events, peer_game_db, unpacker, catalog) boots the asynchronous runtime in the background and returns a PeerRuntimeHandle whose sender controls the peer. The injected Unpacker keeps archive extraction out of the peer crate's platform layer, and the catalog set gates which local game roots are announced or served.
  • PeerCommand represents the small control surface exposed to the UI layer: ListGames, GetGame, FetchLatestFromPeers, DownloadGameFiles, StreamInstallGame, InstallGame, UninstallGame, RemoveDownloadedGame, CancelDownload, SetGameDir, and GetPeerCount.
  • PeerEvent enumerates everything the peer runtime reports back to the UI: library snapshots, download/install/uninstall lifecycle updates, runtime failures, and peer membership changes.
  • PeerGameDB collects remote peer metadata. It aggregates discovered peers Game definitions, tracks the latest ETI version per title, and keeps the last seen list of GameFileDescription entries for each peer.

Internally the peer runtime owns four long-lived tasks that run for the lifetime of the process:

  1. Server component (run_server_component) listens for QUIC connections, advertises via mDNS, and serves Request::ListGames, Request::GetGame, Request::GetGameFileData, Request::GetGameFileChunk, and Request::StreamInstall by reading from the local game directory.
  2. Discovery loop (run_peer_discovery) uses the lanspread-mdns helper to discover other peers. The blocking mDNS work is executed on a dedicated thread via tokio::task::spawn_blocking so that the Tokio runtime remains responsive.
  3. Ping service (run_ping_service) periodically issues QUIC ping requests to keep peer liveness up to date and prunes stale entries from PeerGameDB.
  4. Local game monitor (run_local_game_monitor) watches the configured game directory and each game root non-recursively, gates per-ID rescans while operations are active, emits local-library changes separately from active operation snapshots, and runs a 300-second fallback scan for missed events.

scan_local_library maintains a lightweight on-disk index and produces both a GameDB and protocol summaries. A game is downloaded only when its root-level version.ini sentinel exists; local/ being a directory is the install signal.

Networking and File Transfer

  • Transport is handled by s2n-quic; TLS cert/key material is compiled in from the repository root.
  • Protocol messages are JSON-encoded structures defined in lanspread-proto::{Request, Response}.
  • File transfers stream raw bytes over dedicated bidirectional QUIC streams. peer::send_game_file_data sends entire files, while peer::send_game_file_chunk services ranged requests.

Download Pipeline

When the UI asks to download a game:

  1. The UI first issues PeerCommand::GetGame for a new download, or PeerCommand::FetchLatestFromPeers for an update that must bypass local archives. The selected peers are queried via request_game_details_from_peer, and their file manifests are merged inside PeerGameDB.
  2. Once the UI receives PeerEvent::GotGameFiles, it forwards the selected file list back with PeerCommand::DownloadGameFiles.
  3. download_game_files starts a version-sentinel transaction, parks any old version.ini as .version.ini.discarded, prepares non-sentinel files, emits PeerEvent::DownloadGameFilesBegin, and builds a per-peer plan (build_peer_plans) that round-robins file chunks across the available peers that advertise the latest version.
  4. Each plan is executed in its own task (download_from_peer). Chunk requests use per-chunk QUIC streams and write into pre-created files. The chunk writer keeps existing data intact and only truncates when we intentionally fall back to a full file transfer, which prevents corruption when multiple peers fill different regions of the same file.
  5. DownloadProgressTracker samples byte counters, transfer speed, and the number of unique peers that are actively streaming chunks. The Tauri UI sees those values together through the regular download-progress event.
  6. version.ini chunks are buffered in memory and committed last via .version.ini.tmp followed by an atomic rename. Failures are accumulated and retried (up to MAX_RETRY_COUNT) via retry_failed_chunks; failed downloads sweep .version.ini.tmp and .version.ini.discarded without restoring the previous sentinel. Cancelled downloads also discard the peer-owned download payload while preserving local/ and install transaction metadata.
  7. After a successful sentinel commit, PeerEvent::DownloadGameFilesFinished is emitted and the peer auto-runs the install transaction.

Streamed Install Pipeline

Low-disk installs use PeerCommand::StreamInstallGame instead of the normal archive download pipeline. The peer core owns the whole operation: it refreshes file metadata from catalog-version peers, runs the same majority file-size validation used by normal downloads, selects a validated peer list, and emits the regular download/install lifecycle events while streaming archive-expanded bytes directly into a StreamedInstallTransaction.

The sender-side StreamInstallProvider writes control and chunk frames through a cancellable StreamInstallFrameSink. If the QUIC writer fails because the receiver cancelled or disconnected, the sink wakes any producer blocked on the bounded frame channel and lets the transfer guard drop normally.

Each failed peer attempt rolls back its staging directory before trying the next validated peer. A transaction that created a previously missing game root removes that root again when rollback leaves it empty. Once staging has been renamed to local/, post-promote intent or launch-settings cleanup failures are logged for startup recovery rather than reported as a failed install.

PeerCommand::CancelDownload cancels the tracked download token for an active transfer. The transfer task remains responsible for clearing active_operations, discarding partial payload files, and refreshing the settled local snapshot, so the UI continues to treat active-operation snapshots as the single source of truth for whether a download is still running.

Install Transactions

Install, update, uninstall, downloaded-file removal, and startup recovery live under src/install/. Install-side operation intent is stored atomically under the configured peer state directory, at games/<game_id>/install_intent.json. Game roots still use Lanspread-owned .local.installing/ and .local.backup/ directories marked by .lanspread_owned. Startup recovery combines the recorded intent with the observed filesystem state and only deletes reserved directories when intent or marker ownership proves they belong to Lanspread. Downloaded-file removal is deliberately separate from uninstall: it only accepts catalog IDs that are direct children of the configured game directory, refuses installed or in-flight roots, and deletes the whole game root only after finding a regular root-level version.ini sentinel.

Legacy launcher-owned files in game directories are migrated by a dedicated pre-start phase. Normal install, recovery, scan, and transfer paths use only the configured state directory for launcher-owned metadata.

Integration with lanspread-tauri-deno-ts

The Tauri application embeds this crate in crates/lanspread-tauri-deno-ts/src-tauri/src/lib.rs:

  • LanSpreadState holds onto the peer control channel, the latest aggregated GameDB, per-game operation state, the catalog set, and the user-selected game directory.
  • The Tauri commands (request_games, install_game, update_game, remove_downloaded_game, and update_game_directory) translate UI actions into PeerCommands. In particular, update_game_directory validates the filesystem path before storing it, loads the bundled catalog on first use, kicks off the peer runtime on demand, and mirrors the installed/uninstalled state into the UI-facing database.
  • A background task consumes PeerEvents and fans them out to the front-end via Tauri publish/subscribe events (games-list-updated, game-download-*, game-install-*, game-uninstall-*, peer-*). The Tauri crate now only provides the unrar sidecar through the injected Unpacker; rollback and cleanup live in the peer transaction code.

Security & Operational Notes

  • All QUIC connections are TLS encrypted; the shipped certificates are suitable for local-network trust but should be rotated for production deployments.
  • Peer discovery is restricted to the local link via mDNS.
  • Long-running blocking mDNS calls are isolated on dedicated threads which keeps the async runtime responsive even when discovery takes a long time.
  • File writes are chunk-safe: partial chunk downloads open files without truncating existing data, and root-level version.ini is written only after the rest of the download has succeeded.

Known Limitations

  • PeerGameDB currently models the latest metadata that other peers advertise. If the UI needs to surface titles that only exist locally, additional merging with the locally scanned GameDB will be required.
  • The download planner uses a simple round-robin and does not yet take per-peer throughput or failures into account when distributing work.

Refer to the source (particularly src/lib.rs) for the exact message shapes and state machines.