An adversarial audit of the headless peer-to-peer scenario suite
(crates/lanspread-peer-cli/scripts/run_extended_scenarios.py, driven by
`just peer-cli-tests`) found assertions that passed even when the behavior
they claim to test was not happening, plus timing races and doc-vs-code
divergences. A full baseline run (S1-S47) passed beforehand, confirming these
were test-quality gaps, not peer regressions; the baseline output itself
exposed the worst offenders -- e.g. S14 chunk totals {128 MiB, 1 MiB} (a
two-chunk file whose "balanced within one chunk" check can never fail) and
S16/S18 serving the whole ~120 MiB alienswarm.eti from a single source, so
fanout and retry were never exercised.
Test-correctness fixes (a broken behavior could previously pass green):
- S18: the "no download-failed" check was dead -- it reused a LineWaiter
already advanced past download-finished, so it scanned an empty tail.
Replaced with assert_no_event_since over the whole window. Switched to a
4*CHUNK_SIZE sparse archive so both peers get chunks; the test now proves the
download SURVIVES a mid-download source kill (every byte delivered, survivor
served part, no download-failed, clean diff). Retry-onto-survivor is the
mechanism but is not asserted: the kill/serve race against `docker rm -f`
cannot be forced, so asserting an exact split would be flaky.
- S7: the only check was a diff against byte-identical ggoo fixtures, so it was
source-agnostic. Added assertions that the download committed exactly once,
every chunk came from the validated two-peer set, both peers served, and no
chunk was fetched twice.
- S14: enlarged to 4*CHUNK_SIZE so the balance check can fail under a 3+1
imbalance; asserts an exact 2+2 split summing to the file size.
- S16: inflated the .eti to 2*CHUNK_SIZE so it fans out across both
catalog-version peers (the stock 120 MiB fixture is a single chunk).
- S37: validate the throughput rate fields (positive, self-consistent
mbit/mib == 8.388608, mib_per_s == bytes/duration), not just the byte count.
- S35: assert the source actually advertises the unknown game before checking
it is filtered, so "absent" means "filtered" and not "never sent".
- S15: cross-check each peer's raw advertised eti_version via list-peers; the
list-games eti_game_version is synthesized from the catalog and can only ever
equal the asserted value.
- S2: poll for library convergence and verify the bidirectional exchange
(bravo sees alpha's 3 games, not just alpha seeing bravo's 4).
- S12/S28: require the gating unit test to appear as "<name> ... ok" so an
#[ignore]d (un-run) test no longer satisfies the check.
- S24/S25: assert the requested install=false final state.
- S34: assert exactly 21 coherent chunks (20 files + version.ini), 21 distinct
paths, no duplicates, instead of a >= 21 floor.
Flake fixes:
- S19: force-kill the sole source right after download-begin on a 4*CHUNK_SIZE
file and accept download-failed or download-peers-gone. The old graceful
shutdown on a single-chunk file could let the transfer finish first, turning
the expected failure into a download-finished. A chunk may complete before
the kill lands, but the full transfer cannot, so the failure is deterministic.
- S26: use a large sparse source so the first operation is reliably still
active when the duplicate request is issued (TOCTOU on active_operations);
also assert the active operation == "Downloading".
- S11: drop the "listener address must change" assertion -- it tested the OS
ephemeral-port allocator and could fail spuriously; keep the same-identity /
no-duplicate invariant.
Coverage and determinism:
- S27: add handshake::tests::inbound_hello_from_self_is_ignored for the
protocol-level self guard. The CLI scenario only exercises the CLI
string-compare guard, which short-circuits before any network call, so the
peer-crate guard had no test.
- find_fixture_game now iterates sorted(FIXTURES) so the ambiguous cnctw
(fixture-bravo/multi/solid) resolves deterministically to fixture-bravo.
Reviewed and deliberately left as-is (documented in the run log): S20, S21,
S30, S32/S39/S44 absence checks, S42 IP-order precondition, S45.
PEER_CLI_SCENARIOS.md rows S2, S11, S14, S16, S18, S19, S27 are updated to
match the harness, and a dated run-log entry records the audit, the fixes, the
accepted items, and the live-run evidence.
Test Plan:
- `just peer-cli-tests` (rebuilds the image, runs S1-S47 in Docker): baseline
passed; post-fix passed; a final run on the exact committed code passed
47/47. Evidence: S14 {268435456, 268435456} balanced 2+2; S16 .eti split
across B and C {134217728, 134217728}; S18 all 536870912 bytes delivered with
no download-failed; S19 deterministic download-failed; S37 ~874 MiB/s.
- `just test` (incl. inbound_hello_from_self_is_ignored), `just clippy`
(-D warnings, all-targets), and `just fmt` all pass.
Refs: PEER_CLI_SCENARIOS.md scenario matrix and 2026-06-21 run-log entry.
lanspread-peer-cli
Scriptable peer harness for automated LAN-spread tests. The binary starts the core peer runtime without the Tauri GUI, reads one JSON command per stdin line, and writes JSONL events, results, and errors to stdout.
Running
just peer-cli-build
just peer-cli-image
just peer-cli-tests
just peer-cli-run alpha
Useful flags:
--games-dir PATHstores local archives and installs.--state-dir PATHstores the generated peer identity.--fixture GAME_IDseeds a tiny archive that the fixture unpacker can install.
Fixture Game Directories
fixtures/fixture-alpha, fixtures/fixture-bravo, and
fixtures/fixture-charlie are ready-to-use game directories for local CLI
smoke tests. Point --games-dir at one of them to start a peer with several
catalog-backed fake games. Each game includes version.ini and a real RAR
archive renamed to .eti; fixture-alpha and fixture-bravo share ggoo,
while fixture-bravo and fixture-charlie share cnc4.
Commands
Every command is a JSON object with cmd or command; id is optional and is
echoed back on the result or error line.
{"id":"s1","cmd":"status"}
{"id":"p1","cmd":"wait-peers","count":1,"timeout_ms":5000}
{"id":"c1","cmd":"connect","addr":"127.0.0.1:34567"}
{"id":"g1","cmd":"list-games"}
{"id":"d1","cmd":"download","game_id":"fixture-one","install":true}
{"id":"i1","cmd":"install","game_id":"fixture-one"}
{"id":"u1","cmd":"uninstall","game_id":"fixture-one"}
{"id":"q1","cmd":"shutdown"}
The status result includes receiver-side active_operations and
sender-side active_outbound_transfers counts by game ID, which the scenario
runner uses to verify transfer lifecycle cleanup.