Replace the `a9f9845` local-update dedup cache with explicit peer event
semantics. Local scans now emit `LocalLibraryChanged` when the library changes,
while operation mutations emit `ActiveOperationsChanged` from the mutation
path. Tauri keeps joining those facts into the existing `games-list-updated`
payload, so the frontend contract stays stable.
This removes the cache/invalidation coupling between scan emission and
operation state. The remaining forced local snapshot is explicit: accepted game
directory changes can refresh the UI for an equivalent new path without sending
a peer library delta.
Operation guard cleanup and liveness cancellation now publish the same active
operation snapshot as normal command-handler transitions. The peer CLI JSONL
events follow the same split with `local-library-changed` and
`active-operations-changed`.
Test Plan:
- `just fmt`
- `CARGO_BUILD_RUSTC_WRAPPER= just test`
- `CARGO_BUILD_RUSTC_WRAPPER= just clippy`
- `git diff --check`
Refs: CLEAN_CODE_PLAN_1.md
Add deeper peer CLI coverage for file-transfer integrity and multi-peer
chunking. The alpha fixture now carries a real renamed RAR archive larger
than 100 MB for alienswarm, which gives the chunk planner enough work to
split a single game archive across multiple peers.
Expose completed chunk source details as a peer event and have the CLI print
that event as JSONL. This keeps transfer behavior in lanspread-peer while the
CLI remains a harness that reports what the peer runtime did. The Tauri shell
logs the event at debug level so the shared PeerEvent enum stays exhaustive.
Document the new S13/S14 scenarios and record the manual run evidence,
including SHA-256 manifests and the per-peer byte split for the large archive.
Test Plan:
- just fmt
- just test
- just peer-cli-build
- just clippy
- just peer-cli-image
- unrar t -idq crates/lanspread-peer-cli/fixtures/fixture-alpha/alienswarm/alienswarm.eti
- Manual peer CLI: bravo -> deep-small-client bfbc2 download with matching SHA-256 manifests
- Manual peer CLI: alpha -> deep-stage-b alienswarm download with matching SHA-256 manifests
- Manual peer CLI: alpha + deep-stage-b -> deep-stage-c alienswarm download with chunk events from both peers and matching SHA-256 manifests
Refs: PEER_CLI_SCENARIOS.md S13 S14
The follow-up backlog had drifted into three settled peer/runtime issues: the
legacy game-list fallback contradicted the one-wire-version policy, the Tauri
shell still re-derived local install state from disk after peer snapshots, and
`Availability::Downloading` existed even though active operations are already
reported through a separate operation table.
Remove the legacy `AnnounceGames` request and fallback service. Discovery now
ignores peers that do not advertise the current protocol and a peer id, and
library changes are sent through the current delta path only. This keeps the
runtime aligned with the documented current-build-only interoperability model.
Make peer `LocalGamesUpdated` snapshots authoritative for local fields in the
Tauri database. The GUI-side catalog still owns static metadata such as names,
sizes, and descriptions, but downloaded, installed, local version, and
availability now come from the peer runtime instead of a second whole-library
filesystem scan. Snapshot reconciliation also pins the missing-begin and
missing-finish lifecycle cases in tests.
Collapse availability back to the settled `Ready` and `LocalOnly` states.
Aggregation now counts only `Ready` peers as download sources, and the frontend
no longer carries a dead `Downloading` enum value.
The core peer also exposes the small non-GUI hooks needed by scripted callers:
startup options for state and mDNS, a local-ready event, direct connection, peer
snapshots, and an explicit post-download install policy. Those hooks reuse the
same current protocol path and do not add compatibility shims.
Test Plan:
- `git diff --check`
- `just fmt`
- `just clippy`
- `just test`
Refs: BACKLOG.md, FINDINGS.md, IMPL_DECISIONS.md
FINDINGS.md identified three merge blockers in the post-plan install/update
flow.
Updates now use FetchLatestFromPeers so the Tauri update command bypasses
local manifest serving and asks peers that advertise the latest version for
fresh file metadata. PeerGameDB now aggregates and validates file descriptions
from latest-version peers, keeping stale cached metadata for older versions
from poisoning chunk planning when filenames stay the same but sizes change.
Download-to-install handoff now performs explicit async state transitions.
The download task mutates Downloading to Installing or Updating under the
active-operation write lock, clears the cancellation token, and then runs the
install transaction. OperationGuard remains armed only as crash or abort
cleanup and is disarmed after normal explicit cleanup, so final refreshes no
longer race a deferred Drop cleanup.
Local library index writers now serialize the load/mutate/save window with one
async mutex. The index fingerprint also includes the root version.ini contents
so a same-length version rewrite in the same mtime second still updates the
reported local version.
The tradeoff is that local index mutations are serialized in-process instead
of moved into a dedicated actor. That keeps the fix small and scoped to the
merge blockers while preserving the existing scanner API.
Test Plan:
- just fmt
- just test
- just clippy
- just build
- git diff --check
Refs:
- FINDINGS.md
Game::availability used string labels that were carried through persisted
library data, protocol summaries, and the Tauri-facing game payload. That
allowed invalid states to exist and required legacy summary conversion code to
defensively map strings back into protocol availability values.
Move Availability to lanspread-db and re-export it from lanspread-proto so the
persisted Game type and wire GameSummary type share one serde enum. The JSON
spelling stays Ready, Downloading, or LocalOnly, so the serialized shape does
not change for current library indexes or peer payloads.
Add typed helpers for sentinel-derived download state. Game::set_downloaded
keeps downloaded and Ready/LocalOnly in lockstep and intentionally collapses
non-ready local state, including Downloading, back to LocalOnly. That matches
the current local-summary contract where active operations are suppressed
instead of advertised as Downloading. Game::normalized_availability keeps the
legacy Game-to-summary path from publishing an inconsistent Ready value when
downloaded is false.
Update the follow-up status note so typed availability is no longer listed as
open work.
Test Plan:
- just fmt
- just test
- just clippy
- just build
Refs: none
Local operation spinners were driven by begin, finish, and failure event
history. If one of those lifecycle events was missed, the Tauri bridge could
keep a stale active operation and the React state would keep showing an
in-progress spinner until restart.
Peer local scan updates now carry an authoritative active-operation snapshot.
The peer still suppresses active game roots from peer-facing library deltas,
but it emits LocalGamesUpdated to the UI even when no library delta changed so
the snapshot can clear stale state after rollback or completion. The Tauri
bridge replaces its active-operation map from that snapshot, emits it with the
games-list payload, and the React merge uses it to restore download, install,
update, and uninstall spinners from current peer state rather than event
history alone.
This also enables the Tauri lib unit-test target so the reconciliation helper
can stay covered by the workspace test recipe.
Test Plan:
- git diff --check
- just fmt
- just clippy
- just test
Follow-up-Plan: FOLLOW_UP_2.md
The follow-up review found a few stale lifecycle edges around local game
transactions. Recovery could sweep active roots, post-operation refreshes
still re-ran full startup recovery, and the UI kept inferring local-only state
from downloaded and installed flags instead of the backend availability.
This updates the peer lifecycle so startup recovery skips active operations,
install/update/uninstall refresh only the affected game after the operation
guard is dropped, and path-changing game-directory updates are rejected while
operations are active. It also removes the dead UpdateGame command, drops the
unused manifest_hash write field while preserving old JSON reads, renames the
internal install-finished event, and carries availability through the DB,
peer summaries, Tauri refreshes, and the React model.
The included follow-up documents record the review source, implementation
decisions, and the remaining FOLLOW_UP_2.md work so later commits can stay
small instead of reopening the completed plan items.
Test Plan:
- git diff --check
- just fmt
- just clippy
- just test
Follow-up-Plan: FOLLOW_UP_PLAN.md
Remove the Tauri-side whole-game backup and unpack flow. The Tauri shell now
provides an injected unrar sidecar implementation and lets the peer own
install, update, uninstall, rollback, and recovery decisions.
Route install commands by local state: missing version.ini fetches from peers,
downloaded archives without local/ send InstallGame directly, and already
installed games are left to the Play action. Updates request a fresh download
and uninstalls forward UninstallGame. The UI mirrors peer operation events for
downloading, installing, updating, and uninstalling.
Render installed-but-not-downloaded games as LocalOnly and surface the local
version for downloaded-but-not-installed games. Add a secondary uninstall
affordance that does not change the main Install/Open action.
Test Plan:
- just fmt
- just clippy
- just test
- just build
Refs: PLAN.md
Replace the detached tokio::spawn pattern in the peer runtime with a
supervised model built on tokio_util's CancellationToken and TaskTracker.
Long-lived services and child tasks now have an explicit parent, a
cancellation path, and a join point. Tauri can request a clean shutdown
on app exit instead of leaking work into process termination.
Background
~~~~~~~~~~
start_peer() previously returned only a command sender. The four startup
services (QUIC server, mDNS discovery, peer liveness, local library
monitor) and their child tasks (ping workers, handshake jobs, download
workers, announcement fan-outs, connection/stream handlers) were spawned
with raw tokio::spawn and detached. Closing the command channel sent
Goodbye notifications but did not stop those services. The mDNS blocking
worker had no cancellation path at all. Active downloads were stored as
JoinHandle<()> and force-aborted, which could interrupt file writes
mid-chunk.
Supervisor
~~~~~~~~~~
The runtime now owns a CancellationToken and a TaskTracker, threaded
through Ctx and PeerCtx. Each long-lived service is spawned through a
small supervisor (spawn_supervised_service) that wraps the service in
catch_unwind and enforces an explicit SupervisionPolicy:
QuicServer: Required (fatal; cancels the runtime if it dies)
Discovery: Restart(5s) (matches the prior self-restart loop)
Liveness: Restart(5s)
LocalMonitor: BestEffort (logs and exits, no restart)
A Required failure emits a new RuntimeFailed { component, error } event
to the UI and cancels the runtime; the command loop and goodbye
notifications still run to completion. The Tauri layer forwards the
event as "peer-runtime-failed" so a future UI can surface it.
mDNS cancellation
~~~~~~~~~~~~~~~~~
MdnsBrowser previously blocked on receiver.recv() forever. It now
exposes next_service_timeout(Duration) returning an MdnsServicePoll
enum (Service/Timeout/Closed) via recv_timeout(). The discovery worker
polls at 250ms and checks the shutdown flag between ticks, so
cancellation reaches the blocking thread within one poll interval
instead of waiting for the next mDNS event.
Downloads
~~~~~~~~~
active_downloads is now HashMap<String, CancellationToken>. Each
download gets a child token of the runtime shutdown, checked at chunk
and peer-attempt boundaries (never inside file writes). When all peers
with a game disappear, liveness cancels the token and emits
DownloadGameFilesAllPeersGone; the download exits Ok(()) without
emitting a duplicate Failed event.
DownloadStateGuard (context.rs) is held inside the download task and
clears downloading_games + active_downloads on Drop, covering the happy
path, error returns, cancellation, and task abort. Drop falls back to
spawning the cleanup if write-lock contention prevents try_write.
Public API and Tauri integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
start_peer() now returns PeerRuntimeHandle exposing:
fn sender(&self) -> UnboundedSender<PeerCommand>
fn shutdown(&self)
async fn wait_stopped(&mut self)
The Tauri layer stores the handle in managed state and switches its
main loop from .run(ctx) to .build(ctx).run(|h, e| ...). On
RunEvent::Exit it calls handle.shutdown() and blocks up to 2s on
wait_stopped(), giving services time to cancel and Goodbye packets time
to flush over a healthy LAN while staying short enough not to delay
process exit noticeably on a dead network.
The command loop distinguishes graceful shutdown from unexpected
channel closure: if recv() returns None and shutdown.is_cancelled() is
set, the loop returns Ok(()) silently. Only an unexpected close (no
cancellation observed) still emits RuntimeFailed. This avoids a
spurious failure event on every normal app close.
User-visible behavior changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Closing the app no longer leaks services into process termination;
Goodbye notifications are reliably attempted before exit.
- Downloads cancel cleanly (between chunks) instead of force-aborting
mid-write.
- A new "peer-runtime-failed" Tauri event fires when a Required service
cannot recover. No frontend handler exists yet — that is a follow-up.
Tradeoffs
~~~~~~~~~
- Workspace tokio-util now requires the "rt" feature for TaskTracker.
- The mDNS worker still runs in spawn_blocking and may stay parked
briefly between 250ms polls — acceptable for a desktop app.
- The 2s shutdown timeout on app exit is a deliberate compromise.
Tests
~~~~~
New unit tests:
- DownloadStateGuard clears tracking on completion, cancellation, and
parent-task abort (context.rs).
- Required failure cancels the runtime and emits RuntimeFailed
(startup.rs).
- Restart policy restarts until shutdown is requested (startup.rs).
- PeerRuntimeHandle.shutdown() observable via wait_stopped()
(startup.rs).
- Peers-gone cancellation emits only PeersGone, no duplicate Failed
(services/liveness.rs).
Test plan
~~~~~~~~~
cargo test --workspace
cargo clippy --workspace --all-targets
Manual smoke test on two peers on the same LAN:
1. Start a download, verify chunks transfer.
2. Close the receiving app mid-download — verify the sending peer
logs a Goodbye, not a connection-reset error.
3. Stop the sending peer mid-download — verify the receiver emits
DownloadGameFilesAllPeersGone, not Failed.
Follow-ups
~~~~~~~~~~
- Frontend handler for "peer-runtime-failed".
- Consider exposing the runtime handle's stopped watch to the frontend
for a reconnecting indicator on Required failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Peer startup used to bootstrap itself by spawning the runtime and immediately
sending a SetGameDir command back through its own control channel. The Tauri
integration then polled shared state until a directory appeared and waited two
seconds before asking peers for games. That made startup ordering implicit and
left a race-prone sleep in the UI bridge.
Install the initial game directory directly into the peer context instead. The
runtime now attempts the initial local-library scan before starting discovery,
then launches the server, discovery, liveness, and local monitor services from
that initialized context. Later directory changes still use SetGameDir, so the
existing UI command surface stays intact.
Use PathBuf and Path references across peer filesystem boundaries so directory
state is represented as a path rather than an optional string. The Tauri layer
now validates a selected game directory before storing it, loads the bundled
catalog on first use, and starts or updates the peer runtime from one helper.
Peer event fan-out is split into named handlers so the Tauri setup closure only
wires state and starts the event loop.
Shutdown goodbye notifications are still best-effort, but they are now awaited
with a short timeout instead of being spawned and forgotten. The tradeoff is a
small bounded wait during peer runtime shutdown in exchange for clearer task
ownership.
Test Plan:
- cargo test -p lanspread-peer
- cargo clippy
- cargo clippy --benches
- cargo clippy --tests
- cargo +nightly fmt
- git diff --check
Refs: none
Move the required game.db resource resolution and ETI catalog loading out of
Tauri setup into small helpers. The setup closure now describes the startup
flow instead of carrying resource-resolution and conversion details inline.
This keeps the existing fail-fast behavior for a missing or unreadable bundled
catalog, while giving the required resource path and in-memory GameDB conversion
clear names. There is no intended user-visible behavior change.
Test Plan:
- cargo clippy
- cargo clippy --benches
- cargo clippy --tests
- cargo +nightly fmt
Refs: none
LanSpreadState now owns its empty initialization through Default. This keeps
the root runtime state construction in one place instead of building each
Arc<RwLock<_>> value inline before registering it with Tauri.
The setup hook now retrieves peer_game_db from the managed state and clones the
Arc before spawning async peer initialization. That preserves the existing
lifetime boundary while removing the separate outer peer_game_db binding.
There is no user-visible behavior change. The peer database, game list,
download tracking, games folder, and peer control channel still start empty and
are populated through the same setup and command paths.
Test Plan:
- cargo clippy
- cargo clippy --benches
- cargo clippy --tests
- cargo +nightly fmt
Refs: none