d9a6181ae2fd054aaf6df43bc975c14e50091cbf
12 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
2d53848e0c
|
test(peer-cli): harden S1-S47 scenario suite against vacuous and flaky checks
An adversarial audit of the headless peer-to-peer scenario suite
(crates/lanspread-peer-cli/scripts/run_extended_scenarios.py, driven by
`just peer-cli-tests`) found assertions that passed even when the behavior
they claim to test was not happening, plus timing races and doc-vs-code
divergences. A full baseline run (S1-S47) passed beforehand, confirming these
were test-quality gaps, not peer regressions; the baseline output itself
exposed the worst offenders -- e.g. S14 chunk totals {128 MiB, 1 MiB} (a
two-chunk file whose "balanced within one chunk" check can never fail) and
S16/S18 serving the whole ~120 MiB alienswarm.eti from a single source, so
fanout and retry were never exercised.
Test-correctness fixes (a broken behavior could previously pass green):
- S18: the "no download-failed" check was dead -- it reused a LineWaiter
already advanced past download-finished, so it scanned an empty tail.
Replaced with assert_no_event_since over the whole window. Switched to a
4*CHUNK_SIZE sparse archive so both peers get chunks; the test now proves the
download SURVIVES a mid-download source kill (every byte delivered, survivor
served part, no download-failed, clean diff). Retry-onto-survivor is the
mechanism but is not asserted: the kill/serve race against `docker rm -f`
cannot be forced, so asserting an exact split would be flaky.
- S7: the only check was a diff against byte-identical ggoo fixtures, so it was
source-agnostic. Added assertions that the download committed exactly once,
every chunk came from the validated two-peer set, both peers served, and no
chunk was fetched twice.
- S14: enlarged to 4*CHUNK_SIZE so the balance check can fail under a 3+1
imbalance; asserts an exact 2+2 split summing to the file size.
- S16: inflated the .eti to 2*CHUNK_SIZE so it fans out across both
catalog-version peers (the stock 120 MiB fixture is a single chunk).
- S37: validate the throughput rate fields (positive, self-consistent
mbit/mib == 8.388608, mib_per_s == bytes/duration), not just the byte count.
- S35: assert the source actually advertises the unknown game before checking
it is filtered, so "absent" means "filtered" and not "never sent".
- S15: cross-check each peer's raw advertised eti_version via list-peers; the
list-games eti_game_version is synthesized from the catalog and can only ever
equal the asserted value.
- S2: poll for library convergence and verify the bidirectional exchange
(bravo sees alpha's 3 games, not just alpha seeing bravo's 4).
- S12/S28: require the gating unit test to appear as "<name> ... ok" so an
#[ignore]d (un-run) test no longer satisfies the check.
- S24/S25: assert the requested install=false final state.
- S34: assert exactly 21 coherent chunks (20 files + version.ini), 21 distinct
paths, no duplicates, instead of a >= 21 floor.
Flake fixes:
- S19: force-kill the sole source right after download-begin on a 4*CHUNK_SIZE
file and accept download-failed or download-peers-gone. The old graceful
shutdown on a single-chunk file could let the transfer finish first, turning
the expected failure into a download-finished. A chunk may complete before
the kill lands, but the full transfer cannot, so the failure is deterministic.
- S26: use a large sparse source so the first operation is reliably still
active when the duplicate request is issued (TOCTOU on active_operations);
also assert the active operation == "Downloading".
- S11: drop the "listener address must change" assertion -- it tested the OS
ephemeral-port allocator and could fail spuriously; keep the same-identity /
no-duplicate invariant.
Coverage and determinism:
- S27: add handshake::tests::inbound_hello_from_self_is_ignored for the
protocol-level self guard. The CLI scenario only exercises the CLI
string-compare guard, which short-circuits before any network call, so the
peer-crate guard had no test.
- find_fixture_game now iterates sorted(FIXTURES) so the ambiguous cnctw
(fixture-bravo/multi/solid) resolves deterministically to fixture-bravo.
Reviewed and deliberately left as-is (documented in the run log): S20, S21,
S30, S32/S39/S44 absence checks, S42 IP-order precondition, S45.
PEER_CLI_SCENARIOS.md rows S2, S11, S14, S16, S18, S19, S27 are updated to
match the harness, and a dated run-log entry records the audit, the fixes, the
accepted items, and the live-run evidence.
Test Plan:
- `just peer-cli-tests` (rebuilds the image, runs S1-S47 in Docker): baseline
passed; post-fix passed; a final run on the exact committed code passed
47/47. Evidence: S14 {268435456, 268435456} balanced 2+2; S16 .eti split
across B and C {134217728, 134217728}; S18 all 536870912 bytes delivered with
no download-failed; S19 deterministic download-failed; S37 ~874 MiB/s.
- `just test` (incl. inbound_hello_from_self_is_ignored), `just clippy`
(-D warnings, all-targets), and `just fmt` all pass.
Refs: PEER_CLI_SCENARIOS.md scenario matrix and 2026-06-21 run-log entry.
|
||
|
|
c00e6eae84
|
fix(peer): drain streamed install senders after completion
A streamed install sender kept the original frame sink alive outside the producer task. After the producer sent Complete, or an Error for a provider failure, the forwarding loop still had a live mpsc sender in scope and waited forever for another frame. Move the sink into the producer so the channel closes when the producer exits. That lets the QUIC writer close, the request task return, and the outbound TransferGuard drop after successful streamed installs and provider-side failures. The peer-cli harness now keeps the outbound-transfer map it passes into the peer runtime and exposes per-game counts in status. S39 asserts that the source has no active outbound transfer for cnctw after the streamed install finishes, which catches the sender-side lifecycle leak that receiver-only assertions missed. The peer-cli README and scenario table document that status field and expectation. Test Plan: - just fmt - just test - just clippy - git diff --check - git diff --cached --check - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S39 S40 --build-image - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S41 S42 S43 S44 S45 S46 S47 Refs: NEXT_STEPS.md streamed install lifecycle hardening |
||
|
|
66c7d5912b
|
fix(peer): harden streamed install lifecycle
Claude Fable 5's branch review found that receiver cancellation or a QUIC send failure could leave the sender-side archive producer blocked on the bounded frame channel. That kept the outbound transfer guard alive and could block later installs or updates of the same game. Route archive frames through a cancellable StreamInstallFrameSink instead of exposing the raw channel sender to providers. The QUIC forwarder now cancels and closes the receive side before awaiting the producer, so a blocked send wakes and the transfer guard can drop normally. Make PeerCommand::StreamInstallGame own its peer metadata preflight inside the peer core. The Tauri layer now sends the command directly, and the peer runtime fetches file details from catalog-version peers before running the existing majority validation and retry logic. This removes the UI-only pending streamed install set and gives PeerEvent::GotGameFiles one meaning again: continue a normal archive download. Tighten the receiver transaction edge cases too. Rollback removes a newly created empty game root, but preserves pre-existing roots. Once streamed staging has been promoted to local/, intent or launch-settings cleanup failures are logged for startup recovery instead of reporting a failed install for bytes that are already committed. Accept missing RAR CRC32 metadata for zero-byte files as CRC32 00000000 while still requiring CRC32 metadata for non-empty files. Update the peer README, scenario docs, and next-steps handoff so the documented ownership and remaining trust limitation match the implementation. Test Plan: - just fmt - just test - just frontend-test - just clippy - git diff --check - python3 -m py_compile \ crates/lanspread-peer-cli/scripts/run_extended_scenarios.py - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py \ S39 S40 S41 S42 S43 S44 S45 S46 S47 --build-image Refs: streamed-install review handoff from Claude Fable 5 |
||
|
|
47ef87748f
|
test(peer-cli): align scenarios with catalog versions
Remote aggregation now filters to catalog-version roots, but the checked-in peer-cli fixtures and skew scenarios still stamped synthetic future versions. That hid fixture rows in S3 and left scenario docs asserting latest-version behavior. Teach the harness the catalog versions for fixture game IDs, stamp generated fixtures with catalog versions by default, and update skew, mesh, propagation, and throughput scenarios to expect only catalog-version peers. Also wire S38 into the executable matrix so the documented first-play launch-setting scenario is covered by the same full run as S1-S47. This keeps stale peers as negative coverage: they are absent from list-games and cannot provide descriptors, votes, or chunks. The fixture version.ini updates are checked in so alpha, bravo, charlie, and persona roots advertise downloadable catalog games again. Test Plan: - python3 -m py_compile crates/lanspread-peer-cli/scripts/run_extended_scenarios.py - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py \ S3 S8 S14 S15 S16 S17 S21 S22 S23 S24 S29 S30 S31 S34 S36 S37 \ S39 S40 S41 S42 S43 S44 S45 S46 S47 --build-image - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S38 - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py - git diff --check - git diff --cached --check Docs: PEER_CLI_SCENARIOS.md |
||
|
|
9288fda037
|
test(peer-cli): expand streamed install edge coverage
NEXT_STEPS item 6 called for the remaining streamed-install edge cases to be covered in the peer-cli matrix. Add S43-S47 for already-installed rejection, corrupt archive rollback, sender disconnect, receiver cancel, and sorted multi-archive streaming. The receiver-cancel scenario needs the harness to drive the same runtime path as the GUI, so `lanspread-peer-cli` now accepts a narrow `cancel-download` command that forwards to `PeerCommand::CancelDownload`. A parser test covers the new JSONL command shape. Add `fixture-multi/cnctw`, a tiny two-archive RAR fixture. S47 uses it to prove streamed installs process root `.eti` archives in sorted order and commit only extracted `local/` payloads, not the root archives or `version.ini` sentinel. Test Plan: - just fmt - python3 -m py_compile crates/lanspread-peer-cli/scripts/run_extended_scenarios.py - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S43 S44 S45 S46 S47 --build-image - just test - just clippy - git diff --check - git diff --cached --check Refs: NEXT_STEPS.md item 6 |
||
|
|
88bfaeb04a
|
test(peer-cli): cover streamed retry fallback
NEXT_STEPS item 5 needs streamed installs to have an explicit retry policy. The handler already retries whole-stream attempts across the majority-validated peer set, so add S42 to prove that behavior with the Docker harness instead of leaving it implicit. S42 starts two catalog-version-matching `cnctw` sources. The first source sorts first in retry order but has `--unrar /missing-unrar`, so its stream attempt fails before sending chunks. The second source then completes a fresh whole-stream attempt. The scenario asserts local-only installed state, no root archive or sentinel, no `.local.installing` staging leftover, chunk events only from the good source, matching streamed byte count, and SHA-256 payload equality against the good source's `unrar p`. This pins the current policy: retry the entire stream from another validated peer, do not preserve partial files across attempts, and do not promise byte-offset resume. Test Plan: - python3 -m py_compile crates/lanspread-peer-cli/scripts/run_extended_scenarios.py - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S42 - git diff --check - git diff --cached --check Refs: NEXT_STEPS.md item 5 |
||
|
|
0e970dcec7
|
test(peer-cli): cover solid streamed installs
NEXT_STEPS item 3 needed solid archive handling to be a deliberate contract instead of an incidental RAR header attribute. Add a tiny real solid RAR fixture and S41 to the extended peer-cli scenarios so the Docker harness proves this path end to end. The scenario verifies the source archive with container-bundled `unrar lt`, streams the install with the injected provider, and then asserts the receiver is installed local-only without a root archive or root `version.ini`. It also compares local payload SHA-256 hashes against `unrar p` output and checks the streamed byte count matches the extracted entries. This keeps the existing one metadata pass plus one sequential payload pass contract covered for solid archives. Test Plan: - just fmt - just test - python3 -m py_compile crates/lanspread-peer-cli/scripts/run_extended_scenarios.py - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S41 --build-image - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S41 - git diff --check - git diff --cached --check Refs: NEXT_STEPS.md item 3 |
||
|
|
373def6d44
|
feat(peer): prototype streamed installs
Add a streamed-install prototype that can receive archive-derived install bytes straight into local/ without first storing the peer-owned root archive payload. This is intended for low-disk clients that want to install a game but opt out of becoming a downloadable peer source for that game. The protocol gains a current-version-only StreamInstall request and framed StreamInstallFrame responses. The peer core owns the generic transport, transaction, path validation, size checks, CRC32 verification, and lifecycle state. The archive-specific work is hidden behind StreamInstallProvider so the prototype can use unrar while the final implementation can swap in a better provider without rewriting the peer command path. The receiver writes into .local.installing and only promotes to local/ after the full stream verifies. It deliberately does not write the root version.ini or archive files, so the settled local state is installed=true, downloaded=false, and availability=LocalOnly. That preserves the existing rule that local/ is not served to peers and makes streamed receivers non-sources by construction. The CLI is the only caller for now. It exposes stream-install and provides the prototype unrar implementation with unrar lt for entry metadata and unrar p for file bytes. This is simple and good enough to prove non-solid archive streaming, but it is not the production provider shape for solid archives because per-file unrar p would repeatedly decompress prefixes. The Tauri app explicitly passes stream_install_provider: None, so the GUI behavior stays unchanged until a real product path is designed. Document the production-readiness work in NEXT_STEPS.md. The main follow-up is to make the provider abstraction final-ish and replace the per-file CLI unrar provider with a one-pass archive provider, then wire a deliberate GUI low-disk mode, retry semantics, and broader failure scenarios. Test Plan: - just fmt - RUSTC_WRAPPER= CARGO_BUILD_RUSTC_WRAPPER= just test - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py \ S39 S40 --build-image - RUSTC_WRAPPER= CARGO_BUILD_RUSTC_WRAPPER= just clippy - git diff --check - git diff --cached --check Follow-up: NEXT_STEPS.md |
||
|
|
9835e77e8d
|
feat: store launcher state outside game dirs
Move launcher-owned metadata from game roots into the configured peer state area. Peer identity, the local library index, install intent logs, and setup markers now live under app/CLI state instead of being written beside games. The Tauri shell passes its app data directory into the peer, and the peer CLI runs the same path through its explicit --state-dir. Add a dedicated pre-start migration phase for legacy files. It migrates the old global library index, per-game install intents, and the old first-start marker into app state, then deletes legacy files only after the replacement write succeeds. Normal scan, install, recovery, and transfer paths no longer read legacy state files. Rename the old first-start meaning to setup_done and only set it after launching game_setup.cmd. Start/setup scripts keep the shared argument shape, while server_start.cmd now uses cmd /k and a visible window so server logs stay open for inspection. While validating the Docker scenario matrix, make download terminal events come from the handler after local state refresh and operation cleanup. This makes download-finished/download-failed safe points for immediate follow-up CLI commands. Also update the multi-peer chunking scenario to use a sparse archive large enough to actually span multiple production chunks. Test Plan: - just fmt - just test - just frontend-test - just build - just clippy - git diff --check - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py Refs: local app-state migration discussion |
||
|
|
0f10108438
|
perf(peer): widen LAN bulk-transfer windows and buffers
Centralize the bulk-transfer sizing in config.rs and bump the values used
on both ends of a QUIC connection:
- CHUNK_SIZE: 32 MiB -> 128 MiB
- QUIC_CONNECTION_DATA_WINDOW: 64 MiB -> 256 MiB
- QUIC_STREAM_DATA_WINDOW: 32 MiB -> 128 MiB
- QUIC_MAX_SEND_BUFFER_SIZE: 32 MiB -> 128 MiB
- QUIC_INITIAL_CONGESTION_WINDOW: 1 MiB -> 4 MiB
- FILE_TRANSFER_BUFFER_SIZE: 64 KiB -> 1 MiB (new constant)
The previous 32 MiB stream window was already comfortably above the
bandwidth-delay product of a sub-millisecond LAN at 2.5 GbE. The further
bump is deliberately generous: the goal is to push flow control and
per-syscall overhead far enough out of the way that they cannot be the
suspect when isolating the remaining LAN download bottleneck (disk, NIC,
or s2n-quic platform offload on the sending host). Memory pressure from
the larger windows is not observable on a desktop client moving GB-sized
games.
stream_file_bytes previously read the local file in 64 KiB chunks. At
multi-Gbit/s send rates that produced many thousands of disk reads per
second; bumping to 1 MiB keeps the per-file syscall load modest with no
measurable latency cost on streamed bulk transfers. The buffer size lives
in config.rs as FILE_TRANSFER_BUFFER_SIZE so it stays adjustable from one
place.
Also add a started/MiB-per-second log line at info level when a file
finishes streaming. This matches the S37 measurement methodology already
used in the peer-cli harness and makes per-file send throughput visible in
normal operation.
The peer-cli extended-scenarios harness uses CHUNK_SIZE as the tolerance
bound for chunk-boundary variance in its assertions, so its constant is
bumped to match. The multi-chunk planning unit test is rewritten to
reference CHUNK_SIZE symbolically (CHUNK_SIZE * 3 + CHUNK_SIZE / 2)
instead of a hardcoded 120 MiB; the previous literal would silently
degrade into a single-chunk test at the new chunk size and stop
exercising the spread-across-peers code path.
Test Plan:
- just fmt
- just clippy
- just test
- python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S37 \
--build-image
- python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S37
Refs: local LAN download performance investigation on 2026-05-20.
Depends-on:
|
||
|
|
8a9f420a06
|
test(peer-cli): measure single-source download throughput
Add peer-cli accounting for download sessions so terminal download events report bytes, chunks, elapsed time, MiB/s, and Mbit/s. The extended scenario runner now has S37, a focused single-source download benchmark that creates a 2 GiB sparse bf1942 archive, downloads it from one peer with install disabled, and checks the destination archive size and reported byte count. This gives the QUIC performance work a repeatable measurement below the 5 GiB limit from the original request. The source file is sparse, so S37 is aimed at the app, QUIC, and destination-write path rather than raw source-disk reads; the existing correctness scenarios still cover normal game downloads. Baseline S37 before QUIC tuning: - 733.22 MiB/s - 6150.72 Mbit/s - 2.793s for 2.00 GiB plus version.ini - 65 reported chunks Test Plan: - just fmt - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py S37 --build-image Refs: local LAN download performance investigation on 2026-05-20. |
||
|
|
a8edcd7450
|
test(peer-cli): cover full docker scenario matrix
Merge the S18-S36 scenario ideas into the official peer-cli scenario matrix and add a Docker-backed runner that now exercises S1-S36 with concrete file proofs. The runner creates temporary fixtures under .lanspread-peer-cli, drives JSONL peer containers, checks transferred roots with diff and SHA-256 manifests, and covers startup, discovery, transfer, failure, mutation, concurrency, mesh, lifecycle, and catalog edge cases. The scenarios exposed a few harness/runtime boundary gaps that would otherwise make the contract ambiguous. The peer CLI now rejects self-connects, rejects commands for game IDs outside the receiver catalog, filters unknown remote games from its command/event surface, and reports duplicate active same-game commands as operation-in-progress errors. The peer core also refuses non-catalog download commands before transfer, and PeerGameDB has a unit check that address changes preserve identity and library state. S12 and S28 remain unit-level invariants because the CLI cannot stably race raw serve-gate requests or rebind a live listener without restart. The runner treats those scenarios as covered by just test and checks the expected unit test names appear in the output. Test Plan: - just fmt - python3 -m py_compile crates/lanspread-peer-cli/scripts/run_extended_scenarios.py - RUSTC_WRAPPER= just test - RUSTC_WRAPPER= just clippy - RUSTC_WRAPPER= just peer-cli-build - just peer-cli-image - python3 crates/lanspread-peer-cli/scripts/run_extended_scenarios.py - git diff --check Refs: PEER_CLI_SCENARIOS.md S1-S36 |