Files
lanspread/crates/lanspread-peer-cli/scripts
ddidderr 2d53848e0c test(peer-cli): harden S1-S47 scenario suite against vacuous and flaky checks
An adversarial audit of the headless peer-to-peer scenario suite
(crates/lanspread-peer-cli/scripts/run_extended_scenarios.py, driven by
`just peer-cli-tests`) found assertions that passed even when the behavior
they claim to test was not happening, plus timing races and doc-vs-code
divergences. A full baseline run (S1-S47) passed beforehand, confirming these
were test-quality gaps, not peer regressions; the baseline output itself
exposed the worst offenders -- e.g. S14 chunk totals {128 MiB, 1 MiB} (a
two-chunk file whose "balanced within one chunk" check can never fail) and
S16/S18 serving the whole ~120 MiB alienswarm.eti from a single source, so
fanout and retry were never exercised.

Test-correctness fixes (a broken behavior could previously pass green):
- S18: the "no download-failed" check was dead -- it reused a LineWaiter
  already advanced past download-finished, so it scanned an empty tail.
  Replaced with assert_no_event_since over the whole window. Switched to a
  4*CHUNK_SIZE sparse archive so both peers get chunks; the test now proves the
  download SURVIVES a mid-download source kill (every byte delivered, survivor
  served part, no download-failed, clean diff). Retry-onto-survivor is the
  mechanism but is not asserted: the kill/serve race against `docker rm -f`
  cannot be forced, so asserting an exact split would be flaky.
- S7: the only check was a diff against byte-identical ggoo fixtures, so it was
  source-agnostic. Added assertions that the download committed exactly once,
  every chunk came from the validated two-peer set, both peers served, and no
  chunk was fetched twice.
- S14: enlarged to 4*CHUNK_SIZE so the balance check can fail under a 3+1
  imbalance; asserts an exact 2+2 split summing to the file size.
- S16: inflated the .eti to 2*CHUNK_SIZE so it fans out across both
  catalog-version peers (the stock 120 MiB fixture is a single chunk).
- S37: validate the throughput rate fields (positive, self-consistent
  mbit/mib == 8.388608, mib_per_s == bytes/duration), not just the byte count.
- S35: assert the source actually advertises the unknown game before checking
  it is filtered, so "absent" means "filtered" and not "never sent".
- S15: cross-check each peer's raw advertised eti_version via list-peers; the
  list-games eti_game_version is synthesized from the catalog and can only ever
  equal the asserted value.
- S2: poll for library convergence and verify the bidirectional exchange
  (bravo sees alpha's 3 games, not just alpha seeing bravo's 4).
- S12/S28: require the gating unit test to appear as "<name> ... ok" so an
  #[ignore]d (un-run) test no longer satisfies the check.
- S24/S25: assert the requested install=false final state.
- S34: assert exactly 21 coherent chunks (20 files + version.ini), 21 distinct
  paths, no duplicates, instead of a >= 21 floor.

Flake fixes:
- S19: force-kill the sole source right after download-begin on a 4*CHUNK_SIZE
  file and accept download-failed or download-peers-gone. The old graceful
  shutdown on a single-chunk file could let the transfer finish first, turning
  the expected failure into a download-finished. A chunk may complete before
  the kill lands, but the full transfer cannot, so the failure is deterministic.
- S26: use a large sparse source so the first operation is reliably still
  active when the duplicate request is issued (TOCTOU on active_operations);
  also assert the active operation == "Downloading".
- S11: drop the "listener address must change" assertion -- it tested the OS
  ephemeral-port allocator and could fail spuriously; keep the same-identity /
  no-duplicate invariant.

Coverage and determinism:
- S27: add handshake::tests::inbound_hello_from_self_is_ignored for the
  protocol-level self guard. The CLI scenario only exercises the CLI
  string-compare guard, which short-circuits before any network call, so the
  peer-crate guard had no test.
- find_fixture_game now iterates sorted(FIXTURES) so the ambiguous cnctw
  (fixture-bravo/multi/solid) resolves deterministically to fixture-bravo.

Reviewed and deliberately left as-is (documented in the run log): S20, S21,
S30, S32/S39/S44 absence checks, S42 IP-order precondition, S45.

PEER_CLI_SCENARIOS.md rows S2, S11, S14, S16, S18, S19, S27 are updated to
match the harness, and a dated run-log entry records the audit, the fixes, the
accepted items, and the live-run evidence.

Test Plan:
- `just peer-cli-tests` (rebuilds the image, runs S1-S47 in Docker): baseline
  passed; post-fix passed; a final run on the exact committed code passed
  47/47. Evidence: S14 {268435456, 268435456} balanced 2+2; S16 .eti split
  across B and C {134217728, 134217728}; S18 all 536870912 bytes delivered with
  no download-failed; S19 deterministic download-failed; S37 ~874 MiB/s.
- `just test` (incl. inbound_hello_from_self_is_ignored), `just clippy`
  (-D warnings, all-targets), and `just fmt` all pass.

Refs: PEER_CLI_SCENARIOS.md scenario matrix and 2026-06-21 run-log entry.
2026-06-21 10:23:45 +02:00
..