ddidderr d15031c9d1 fix(client): clear gateway status from welcome identity
The client initialized gateway connectivity from ServerWelcome, but welcome only
exposed a boolean. If a gateway disconnected before the client saw the catch-up
PeerJoined event, the later unknown PeerLeft could not be tied to the gateway
and the status could stay connected.

Carry an optional gateway peer id in ServerWelcome. The relay fills it from the
joining gateway or the existing room gateway, and the Windows client stores it
so a matching unknown PeerLeft clears gateway connectivity. The boolean remains
for wire compatibility with older welcomes that do not carry the id.

Test Plan:
- cargo fmt --check
- cargo test -p lanparty-ctrl server_welcome
- cargo test -p lanparty-relay accepts_gateway_and_client_into_room
- cargo test -p lanparty-relay reports_missing_gateway_to_client_joining_first
- cargo test -p lanparty-client-win relay_lifecycle
- cargo test -p lanparty-client-win \
  clears_gateway_status_when_welcome_gateway_leaves_before_join_event
- cargo test -p lanparty-relay bridges_real_client_and_gateway_sessions
- cargo test -p lanparty-client-core connects_to_relay_control_stream_as_client
- cargo test --workspace
- cargo clippy --workspace --all-targets -- -D warnings
- cargo build --release -p lanparty-relay -p lanparty-gateway
- git diff --check
- git diff --cached --check

Refs: MVP lifecycle cleanup
2026-05-22 07:22:06 +02:00


# softlan-vpn
Monorepo for a Layer 2 over QUIC LAN party bridge.
## Workspace crates
- `lanparty-proto`: shared frame format, MAC validation, MTU helpers.
- `lanparty-ctrl`: control-plane messages (join/hello/role/version).
- `lanparty-net`: shared relay endpoint parsing and resolution.
- `lanparty-obs`: shared diagnostics/logging event models.
- `lanparty-client-core`: platform-agnostic client session state.
- `lanparty-client-route`: Windows relay-route inspection.
- `lanparty-client-tap`: TAP-Windows6 adapter discovery and frame I/O.
- `lanparty-client-win`: Windows TAP + route/metric handling binary.
- `lanparty-gateway`: Linux AF_PACKET gateway binary.
- `lanparty-relay`: public QUIC relay binary.
### `lanparty-proto`
Transport-agnostic tunnel contract shared by all binaries:
- overlay datagram header encoding and decoding
- v1 overlay datagrams reject reserved nonzero flags until their semantics are
defined
- negotiated QUIC datagram budget validation before send
- Ethernet frame header parsing
- MAC address parsing and identity validation
- QUIC datagram to TAP MTU budget helpers
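
The MAC parsing and identity validation bullet can be pictured with a minimal sketch. The error type and function names below are hypothetical, and the rule that an identity must be a nonzero unicast address is an assumption for illustration, not the crate's confirmed policy.

```rust
/// Hypothetical error type for this sketch; not taken from `lanparty-proto`.
#[derive(Debug, PartialEq, Eq)]
pub enum MacError {
    Malformed,
    AllZero,
    Multicast,
}

/// Parses exactly six colon-separated, two-digit hex octets.
pub fn parse_mac(text: &str) -> Result<[u8; 6], MacError> {
    let mut octets = [0u8; 6];
    let mut parts = text.split(':');
    for octet in octets.iter_mut() {
        let part = parts.next().ok_or(MacError::Malformed)?;
        // Check digits explicitly: `from_str_radix` alone would accept "+f".
        if part.len() != 2 || !part.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err(MacError::Malformed);
        }
        *octet = u8::from_str_radix(part, 16).map_err(|_| MacError::Malformed)?;
    }
    if parts.next().is_some() {
        return Err(MacError::Malformed);
    }
    Ok(octets)
}

/// Assumed identity rule: a peer MAC must be a nonzero unicast address.
pub fn validate_identity(mac: [u8; 6]) -> Result<[u8; 6], MacError> {
    if mac == [0u8; 6] {
        return Err(MacError::AllZero);
    }
    // The low bit of the first octet marks group (multicast/broadcast) addresses.
    if mac[0] & 0x01 != 0 {
        return Err(MacError::Multicast);
    }
    Ok(mac)
}
```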
### `lanparty-ctrl`
Reliable control-plane schema shared by the QUIC stream handlers:
- endpoint hello messages with role, room, MAC, and datagram budget
- server welcome mode, reject, peer lifecycle, stats, and disconnect messages
- initial room gateway-presence status in server welcomes
- room-code, role/MAC, peer-id, and effective-MTU validation
- length-prefixed JSON control frames for reliable QUIC streams
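
To illustrate length-prefixed control frames, here is a sketch that uses only the standard library. The 4-byte big-endian prefix and the 64 KiB cap are assumptions; the crate's actual prefix width and limits are not stated here. The JSON body is treated as opaque bytes.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound for one control frame in this sketch.
const MAX_FRAME_LEN: usize = 64 * 1024;

/// Writes a 4-byte big-endian length followed by the JSON bytes.
pub fn write_frame<W: Write>(writer: &mut W, json: &[u8]) -> io::Result<()> {
    if json.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "control frame too large"));
    }
    let len = json.len() as u32;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(json)
}

/// Reads one frame, rejecting oversized lengths before allocating the body.
pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut prefix = [0u8; 4];
    reader.read_exact(&mut prefix)?;
    let len = u32::from_be_bytes(prefix) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "control frame too large"));
    }
    let mut body = vec![0u8; len];
    reader.read_exact(&mut body)?;
    Ok(body)
}
```

Checking the length before allocating keeps a misbehaving peer from forcing a large allocation with a single forged prefix.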
### `lanparty-obs`
Shared diagnostics and structured logging vocabulary:
- client/gateway/relay frame logs with MACs, ethertype, length, peer, and action
- tunnel counters shared by control messages and runtime diagnostics
- client connectivity/TAP diagnostics and user-facing status messages
### `lanparty-net`
Shared network address handling for tunnel binaries:
- relay DNS name, IP literal, and socket-address parsing
- UDP/443 default for bare relay hosts
- relay address resolution before tunnel interface activation
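
The parsing rules above (DNS name, IP literal, socket address, and the UDP/443 default for bare hosts) could look roughly like the following. `RelayEndpoint` and `parse_relay` are hypothetical names for this sketch, not the crate's API.

```rust
use std::net::{IpAddr, SocketAddr};

const DEFAULT_RELAY_PORT: u16 = 443;

/// Hypothetical parsed form of a `--relay` value.
#[derive(Debug, PartialEq, Eq)]
pub enum RelayEndpoint {
    /// An IP literal with an explicit or defaulted port; no DNS needed.
    Socket(SocketAddr),
    /// A DNS name that still has to be resolved.
    Dns { host: String, port: u16 },
}

pub fn parse_relay(text: &str) -> Option<RelayEndpoint> {
    // Full socket addresses such as "203.0.113.7:4433" or "[2001:db8::1]:4433".
    if let Ok(addr) = text.parse::<SocketAddr>() {
        return Some(RelayEndpoint::Socket(addr));
    }
    // Bare IP literals get the default port.
    if let Ok(ip) = text.parse::<IpAddr>() {
        return Some(RelayEndpoint::Socket(SocketAddr::new(ip, DEFAULT_RELAY_PORT)));
    }
    // Otherwise treat the input as "host" or "host:port".
    let (host, port) = match text.rsplit_once(':') {
        Some((host, port)) => (host, port.parse::<u16>().ok()?),
        None => (text, DEFAULT_RELAY_PORT),
    };
    if host.is_empty() || host.contains(':') {
        return None;
    }
    Some(RelayEndpoint::Dns { host: host.to_string(), port })
}
```

A `Dns` result would then be resolved (for example through `std::net::ToSocketAddrs`) before the tunnel interface is activated, matching the ordering described in the last bullet.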
### `lanparty-client-core`
Platform-neutral remote client relay session:
- relay QUIC connection with pinned relay certificate trust
- client hello with room, virtual MAC, and datagram budget
- welcome/reject handling with assigned peer id and effective TAP MTU
- QUIC DATAGRAM support and negotiated datagram budget diagnostics
- relay RTT diagnostics from the active QUIC connection
- reliable relay control-event reads for peer lifecycle messages
- Ethernet frame send/receive helpers over QUIC DATAGRAM with budget, source
MAC, and remote-to-LAN safety checks plus local drop outcomes
- client tunnel statistics for frame/datagram rx/tx and drops
- reliable client stats snapshot sends for relay diagnostics
- best-effort graceful disconnect messages before QUIC close
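
The budget checks mentioned above come down to simple arithmetic. This sketch assumes that one QUIC datagram carries an overlay header followed by a whole Ethernet frame. The overlay header length is passed in as a parameter because its real size is not stated here; only the 14-byte untagged Ethernet header is a fixed fact.

```rust
/// Destination MAC (6) + source MAC (6) + ethertype (2), without a VLAN tag.
const ETHERNET_HEADER_LEN: usize = 14;

/// Largest TAP MTU (L3 payload size) that still fits one datagram, if any.
pub fn max_tap_mtu(datagram_budget: usize, overlay_header_len: usize) -> Option<usize> {
    datagram_budget
        .checked_sub(overlay_header_len)?
        .checked_sub(ETHERNET_HEADER_LEN)
}

/// True when the encoded datagram for `frame_len` bytes of Ethernet frame
/// stays within the negotiated budget.
pub fn fits_budget(frame_len: usize, datagram_budget: usize, overlay_header_len: usize) -> bool {
    match overlay_header_len.checked_add(frame_len) {
        Some(encoded_len) => encoded_len <= datagram_budget,
        None => false,
    }
}
```

For example, with a hypothetical 8-byte overlay header and a 1200-byte budget, the largest usable TAP MTU would be 1200 - 8 - 14 = 1178.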
### `lanparty-client-route`
Windows route-table boundary:
- read-only best-route lookup for a relay destination IP
- selected source address, next hop, interface index/LUID, prefix, and metric
- interface index/LUID lookup from Windows network adapter GUIDs
- scoped IP interface MTU overrides with restore-on-drop behavior
- scoped IP interface metric overrides with restore-on-drop behavior
- scoped default-route suppression with restore-on-drop behavior
- unicast IP address snapshots for TAP diagnostics
- scoped host-route pinning for the relay IP on the pre-TAP interface
- reuse of an already-existing matching relay host route without deleting it on exit
- non-Windows builds return a clear unsupported-platform error
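
The restore-on-drop behavior in several bullets above follows the usual Rust guard pattern. In this sketch a `Cell` stands in for the operating-system setting (interface metric, MTU, and so on); the real crate would call Windows APIs instead, and `ScopedOverride` is a hypothetical name.

```rust
use std::cell::Cell;

/// Applies a temporary value and restores the previous one when dropped.
pub struct ScopedOverride<'a, T: Copy> {
    slot: &'a Cell<T>,
    previous: T,
}

impl<'a, T: Copy> ScopedOverride<'a, T> {
    pub fn apply(slot: &'a Cell<T>, value: T) -> Self {
        // Remember the old value at the moment the override is applied.
        let previous = slot.replace(value);
        Self { slot, previous }
    }
}

impl<'a, T: Copy> Drop for ScopedOverride<'a, T> {
    fn drop(&mut self) {
        // Runs on normal scope exit and during unwinding, not on process abort.
        self.slot.set(self.previous);
    }
}
```

Binding the guard to a named variable such as `_guard` keeps it alive for the whole scope, whereas binding it to `_` would drop it, and restore the setting, immediately.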
### `lanparty-client-tap`
Windows TAP adapter boundary:
- TAP-Windows6 adapter discovery from the Windows network adapter registry
- TAP `NetworkAddress` registry configuration for the tunnel MAC identity
- `\\.\Global\{NetCfgInstanceId}.tap` device path construction
- blocking Ethernet frame reads/writes through the TAP device handle
- TAP driver IOCTL helpers for media status, adapter MAC, and MTU
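
A small sketch of the device path construction follows. It assumes the path keeps the braces around the adapter GUID, as the registry stores `NetCfgInstanceId`, and accepts the id with or without braces. The function name is hypothetical.

```rust
/// Builds `\\.\Global\{GUID}.tap` from an adapter instance id.
pub fn tap_device_path(net_cfg_instance_id: &str) -> String {
    // Normalize so that both "{GUID}" and "GUID" produce the same path.
    let guid = net_cfg_instance_id.trim_matches(|c| c == '{' || c == '}');
    // In a format string, "{{" and "}}" are literal braces around the "{}" argument.
    format!(r"\\.\Global\{{{}}}.tap", guid)
}
```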
### `lanparty-relay`
Public relay binary and relay-owned room state:
- QUIC endpoint binding and first-stream hello/welcome admission
- room admission for clients and gateways
- one gateway per room, duplicate client MAC rejection, and room limits
- stable effective room MTU chosen before Ethernet datagrams flow
- live Ethernet datagram forwarding with no ingress reflection
- forwarding drops for Ethernet frames above the negotiated TAP MTU
- per-peer egress budget checks against the negotiated datagram size
- reliable `PeerJoined`/`PeerLeft` notifications plus gateway identity in
welcome messages
- L2 safety filters for invalid-source, jumbo, switch-control, remote VLAN
tags, remote IPv6 fragments, IPv4/IPv6 DHCP-server, and IPv6-RA frames,
including frames behind ordinary IPv6 extension headers
- client broadcast/multicast, unknown-unicast, and total bandwidth limiting
- malformed peer datagram disconnect threshold
- peer stats control events retained for relay diagnostics
- graceful disconnect control events propagated as peer-leave reasons
- per-peer last-seen timestamps in relay room snapshots
- peer leave cleanup for room membership and MAC indexes
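
The admission rules in this list (one gateway per room, duplicate client MAC rejection, room limits, and leave cleanup) can be sketched as a small state machine. All type and variant names are hypothetical, and the real relay tracks considerably more state per peer.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Gateway,
    Client { mac: [u8; 6] },
}

#[derive(Debug, PartialEq, Eq)]
pub enum Reject {
    GatewayAlreadyPresent,
    DuplicateClientMac,
    RoomFull,
}

pub struct Room {
    max_peers: usize,
    next_peer_id: u64,
    gateway: Option<u64>,
    clients: HashMap<[u8; 6], u64>, // MAC index: client MAC -> peer id
}

impl Room {
    pub fn new(max_peers: usize) -> Self {
        Self { max_peers, next_peer_id: 1, gateway: None, clients: HashMap::new() }
    }

    fn peer_count(&self) -> usize {
        self.clients.len() + usize::from(self.gateway.is_some())
    }

    /// Admits a peer and returns its assigned id, or the reject reason.
    pub fn admit(&mut self, role: Role) -> Result<u64, Reject> {
        match role {
            Role::Gateway if self.gateway.is_some() => return Err(Reject::GatewayAlreadyPresent),
            Role::Client { mac } if self.clients.contains_key(&mac) => {
                return Err(Reject::DuplicateClientMac)
            }
            _ => {}
        }
        if self.peer_count() >= self.max_peers {
            return Err(Reject::RoomFull);
        }
        let id = self.next_peer_id;
        self.next_peer_id += 1;
        match role {
            Role::Gateway => self.gateway = Some(id),
            Role::Client { mac } => {
                self.clients.insert(mac, id);
            }
        }
        Ok(id)
    }

    /// Removes the peer from room membership and from the MAC index.
    pub fn leave(&mut self, peer_id: u64) {
        self.clients.retain(|_, id| *id != peer_id);
        if self.gateway == Some(peer_id) {
            self.gateway = None;
        }
    }
}
```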
## Build
```bash
cargo check --workspace
```
For the manual MVP end-to-end proof, see [TESTING.md](TESTING.md).
## Relay
```bash
cargo run -p lanparty-relay -- --listen 443/udp --dev-cert-der-out relay-cert.der
```
`--listen` accepts either a socket address or a UDP port shorthand such as
`443/udp`. The relay binds a QUIC endpoint, accepts a control-stream `hello`,
replies with `welcome` or `reject`, and forwards live Ethernet QUIC datagrams
between accepted peers in the same room. It currently uses a generated
self-signed development certificate; `--dev-cert-der-out` writes that
certificate so the gateway and client can pin it in development. Production
certificate handling remains future work. Ethernet forwarding decisions are
logged with room, peer, MAC, ethertype, action, drop reason, and target count.
Safety-policy rejects use the `filtered` action so they are distinguishable
from malformed/unknown-destination drops and rate limits.
Malformed peer datagrams log their per-peer count before the relay disconnects
peers that cross the malformed-datagram threshold.
Relay egress skips caused by a target peer's smaller datagram budget are logged
with the ingress peer, target peer, encoded length, and target budget.
Ingress datagrams larger than the sending peer's negotiated datagram budget are
dropped before decode/forwarding and logged with `reason=datagram_budget`.
Unknown unicast from a client is forwarded only to the gateway port; unknown
unicast from the gateway is dropped instead of flooded to every remote client.
When a peer joins or leaves, the relay sends a reliable lifecycle control event
to peers that are still present in the room. Newly joined peers also receive
`PeerJoined` events for peers that were already present. Accepted joins notify
existing peers before the joining peer receives its welcome, so gateways can
seed client MAC state before a freshly accepted client starts sending frames.
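
The unicast forwarding rules described above can be summarized as one decision function. This is a sketch under assumptions: the names and the drop-reason strings are hypothetical, and broadcast/multicast handling, rate limits, and safety filters are left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ingress {
    Client,
    Gateway,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    ToPeer(u64),
    ToGateway(u64),
    Drop(&'static str),
}

/// `known_owner` is the peer that owns the destination MAC, if the relay knows it.
pub fn route_unicast(
    ingress: Ingress,
    ingress_peer: u64,
    known_owner: Option<u64>,
    gateway: Option<u64>,
) -> Decision {
    match (known_owner, ingress) {
        // Never send a frame back to the peer it arrived from.
        (Some(owner), _) if owner == ingress_peer => Decision::Drop("ingress_reflection"),
        (Some(owner), _) => Decision::ToPeer(owner),
        // Unknown unicast from a client goes only to the gateway port.
        (None, Ingress::Client) => match gateway {
            Some(id) => Decision::ToGateway(id),
            None => Decision::Drop("no_gateway"),
        },
        // Unknown unicast from the gateway is dropped, not flooded to clients.
        (None, Ingress::Gateway) => Decision::Drop("unknown_unicast"),
    }
}
```
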
### MVP Trust Model
The MVP relay terminates QUIC for every client and gateway connection. QUIC
protects traffic on the public network path, but the relay process sees
plaintext Ethernet frames while forwarding them between peers in a room. That is
acceptable for the first LAN-party proof, where the relay is an operator-trusted
component, but it is not end-to-end encrypted.
Future room-key payload encryption should keep the relay-visible routing header
small and leave only Ethernet payload bytes encrypted end-to-end between clients
and the LAN gateway.
## Gateway
```bash
cargo run -p lanparty-gateway -- \
--relay lanparty-relay.local \
--server-name lanparty-relay.local \
--relay-ca-cert relay-cert.der \
--room ROOM1 \
--interface eth0
```
The gateway first opens the wired LAN interface as an AF_PACKET socket with
promiscuous packet membership, then connects to the relay as `role = gateway`
and completes the control-stream hello/welcome handshake. That startup order
keeps an invalid, wireless, or unplugged interface from briefly advertising a
gateway that cannot bridge. Once both sides are ready, it bridges Ethernet
frames between the relay and wired LAN until shutdown. It captures whole LAN
frames up to the overlay payload-length ceiling before deciding whether they
fit the tunnel. It
never fragments Ethernet frames; LAN frames with invalid source MACs, L2
control-plane traffic, jumbo frames, frames above the negotiated TAP MTU, or
encoded datagrams exceeding the negotiated QUIC budget are counted, dropped,
and logged locally instead of stopping the bridge or consuming relay bandwidth.
Remote frames received from the relay are safety-checked again before LAN
injection and must use the
announced virtual MAC for their source peer, so invalid-source, forged-source,
L2 control-plane, remote VLAN, DHCP-server, IPv6 Router Advertisement, IPv6
fragment, jumbo, and over-TAP-MTU frames cannot cross the gateway's final
physical-LAN boundary even if they reached the gateway over QUIC.
`--relay` accepts a DNS name or socket address; bare hosts default to UDP/443.
The gateway rejects Linux interfaces that sysfs identifies as Wi-Fi, and rejects
wired interfaces whose sysfs carrier state reports no link; managed wireless
NICs are not supported for the physical LAN bridge.
The gateway tracks remote-client MACs from relay lifecycle events and periodically emits
small CAM refresh frames, logged with `reason=periodic`, so the physical
switch keeps those MACs associated with the gateway port. A newly observed
client also triggers an immediate CAM refresh frame logged with
`reason=peer_joined` instead of waiting for the first periodic refresh tick.
When control events and frame work are both ready, the bridge handles the
lifecycle event first so first packets after a client joins use the freshest
remote-MAC state available locally. Gateway frame logs include direction, peer
id when present, MACs, ethertype/length, frame length, action, and drop reason.
The gateway also tracks frame/datagram counters and periodically sends stats
snapshots to the relay. Malformed or runt LAN frames are counted and logged as
dropped instead of disappearing before accounting. The gateway drops unrelated
LAN unicast locally once the destination is known
not to be a connected remote client, so busy LAN traffic is not sent to the
public relay just to be discarded there. Relay lifecycle events seed and retire
remote-client MACs for CAM refresh and LAN-destination filtering even before
that client sends traffic. On shutdown, the gateway sends a best-effort
disconnect control message before closing QUIC so the relay can report the
intended reason.
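
As a rough illustration of how lifecycle events can drive both CAM refresh and LAN-destination filtering, the sketch below keeps a set of remote-client MACs. `RemoteMacs` and its methods are hypothetical. It also assumes that broadcast and multicast destinations remain tunnel candidates, still subject to the safety filters described earlier.

```rust
use std::collections::HashSet;

/// Remote-client MACs learned from relay lifecycle events.
#[derive(Default)]
pub struct RemoteMacs {
    macs: HashSet<[u8; 6]>,
}

impl RemoteMacs {
    /// Returns true when the MAC is newly observed, which is the case that
    /// would trigger an immediate CAM refresh frame.
    pub fn peer_joined(&mut self, mac: [u8; 6]) -> bool {
        self.macs.insert(mac)
    }

    /// Retires the MAC so it is no longer refreshed or matched.
    pub fn peer_left(&mut self, mac: [u8; 6]) {
        self.macs.remove(&mac);
    }

    /// LAN unicast is only worth sending to the relay when its destination
    /// is a connected remote client; group addresses are kept as candidates.
    pub fn should_send_to_relay(&self, dst: [u8; 6]) -> bool {
        let is_group = dst[0] & 0x01 != 0; // broadcast or multicast
        is_group || self.macs.contains(&dst)
    }
}
```
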
## Windows Client
```bash
cargo run -p lanparty-client-win -- \
--relay lanparty-relay.local \
--server-name lanparty-relay.local \
--relay-ca-cert relay-cert.der \
--room ROOM1
```
The Windows client binary currently connects to the relay as `role = client`
with a generated locally administered virtual MAC persisted in
`lanparty-client-identity.json`. Before resolving or connecting to the relay,
it writes the generated tunnel MAC to the selected TAP driver's
`NetworkAddress` registry setting and marks TAP media disconnected. That clears
stale connected state from a previous crashed run without letting the TAP
adapter influence relay DNS or route selection. The client then resolves the
relay endpoint, completes the control-stream hello/welcome handshake, pins a
host route for the resolved relay IP on the current pre-TAP interface, verifies
that the relay route still uses that pinned host route after TAP activation,
and bridges Ethernet frames between the relay and the TAP-Windows6 adapter
until shutdown. `--relay` accepts a DNS name or socket address; bare hosts
default to UDP/443.
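
A locally administered unicast MAC is defined by two bits in the first octet. This sketch shows how arbitrary bytes could be turned into such an address and formatted. The function names are hypothetical, and the source of randomness and the JSON persistence are deliberately left out.

```rust
/// Forces the locally administered bit on and the group (multicast) bit off.
pub fn to_locally_administered(mut octets: [u8; 6]) -> [u8; 6] {
    octets[0] = (octets[0] | 0x02) & !0x01;
    octets
}

/// True for unicast addresses with the locally administered bit set.
pub fn is_locally_administered_unicast(mac: &[u8; 6]) -> bool {
    mac[0] & 0x03 == 0x02
}

/// Formats a MAC as lowercase, colon-separated hex.
pub fn format_mac(mac: &[u8; 6]) -> String {
    mac.iter()
        .map(|byte| format!("{:02x}", byte))
        .collect::<Vec<_>>()
        .join(":")
}
```
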
TAP frames whose source MAC does not match that generated tunnel MAC are
dropped locally before they can consume relay bandwidth; the relay still
enforces the same source-MAC rule.
If the exact relay host route already exists, the client uses it and leaves it
alone on exit. The startup status reports whether the relay already has a LAN
gateway for the room.
`--virtual-mac` can still override the stored identity for manual testing. On
Windows, the client sets the TAP IP interface MTU to the relay-selected MTU, marks the
TAP media connected for the scoped client run, and reports the driver MAC/MTU
before forwarding frames, along with the TAP interface index/LUID. The client
applies a scoped TAP interface metric and disables TAP default routes while it
runs, periodically rechecks that the relay route remains pinned, then restores
the previous route policy and TAP media status on exit. Startup prints a warning
when TAP default routes were enabled before the scoped protection was applied.
Startup still fails before bridging
if the driver-reported MAC does not match the tunnel identity, because an
already-initialized Windows TAP adapter may need to be disabled/enabled or
reinstalled before it reloads the configured `NetworkAddress`.
If exactly one TAP-Windows6 adapter is installed, the client opens it
automatically. If multiple TAP-Windows6 adapters are installed, startup fails
until `--tap-instance-id` selects the intended adapter by NetCfgInstanceId /
InterfaceGuid. `--list-tap-adapters` prints the TAP adapter ids and exits
without connecting.
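
The adapter selection rule can be sketched as follows. The function and error names are hypothetical, and matching the requested id case-insensitively is an assumption rather than documented behavior.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum SelectError {
    NoneInstalled,
    /// Several adapters are installed and no id was requested.
    Ambiguous(usize),
    NotFound,
}

/// Picks the adapter id to open from the installed TAP-Windows6 adapters.
pub fn select_tap<'a>(
    installed: &'a [String],
    requested: Option<&str>,
) -> Result<&'a str, SelectError> {
    match requested {
        // An explicit `--tap-instance-id` must match an installed adapter.
        Some(id) => installed
            .iter()
            .map(String::as_str)
            .find(|candidate| candidate.eq_ignore_ascii_case(id))
            .ok_or(SelectError::NotFound),
        // Without an explicit id, only a single adapter is unambiguous.
        None => match installed {
            [] => Err(SelectError::NoneInstalled),
            [only] => Ok(only.as_str()),
            many => Err(SelectError::Ambiguous(many.len())),
        },
    }
}
```
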
The client prints and reports client diagnostics snapshots with relay reachability,
LAN-gateway presence, route-pinning, QUIC datagram budget, relay RTT, TAP
status/IP, broadcast frame flow, frame/datagram counters, and drops. The
periodic diagnostics refresh the TAP unicast IP so DHCP results that arrive
after bridging starts become visible in later status lines, preferring a
non-link-local IPv4 address when Windows reports several TAP addresses. Each
snapshot also emits short user-facing lines such as relay/gateway connection status,
relay-route and TAP readiness warnings, DHCP address presence, relay RTT, and
broadcast-flow confirmation. One-way broadcast diagnostics distinguish frames
sent toward the LAN from broadcast frames received back from the LAN. Malformed frames
read from TAP, invalid or unauthorized source-MAC frames, L2 control-plane
traffic, remote VLAN tags, DHCP server replies, IPv6 Router Advertisements, IPv6
fragments, jumbo frames, frames above the negotiated TAP MTU, and TAP frames
whose encoded datagrams exceed the negotiated QUIC budget are counted and
dropped before relay send without stopping the bridge. Relayed LAN frames are
also safety-checked before TAP writes, so switch-control traffic,
invalid-source frames, jumbo frames, and over-TAP-MTU frames stay out of the
Windows adapter even if they reached the client over QUIC.
Misdirected unicast frames not addressed to the client's virtual MAC are also
counted and skipped; accepted TAP-to-relay and relay-to-TAP frames are logged
with direction, peer id, MACs, ethertype/length, frame length, action, and drop
reason. TAP device read/write errors still stop the bridge.
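
The preference for a non-link-local IPv4 address in the diagnostics could be expressed like this. `preferred_tap_ip` is a hypothetical helper, and falling back to the first reported address when no such IPv4 address exists is an assumption.

```rust
use std::net::IpAddr;

/// Picks the address to show in status lines from a TAP address snapshot.
pub fn preferred_tap_ip(addresses: &[IpAddr]) -> Option<IpAddr> {
    // Prefer an IPv4 address outside 169.254.0.0/16, such as a DHCP lease.
    let routable_v4 = addresses
        .iter()
        .copied()
        .find(|ip| matches!(ip, IpAddr::V4(v4) if !v4.is_link_local()));
    // Assumed fallback: otherwise report whatever Windows listed first.
    routable_v4.or_else(|| addresses.first().copied())
}
```
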
Relay lifecycle events are logged as they arrive, including gateway joins and
peer leaves. The client remembers peer identities from join and catch-up events
so later leave logs can identify a disconnected LAN gateway or client MAC when
that peer was known.
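
A minimal sketch of that bookkeeping follows. It folds in the optional gateway peer id carried in the welcome, as described in the commit note at the top of this page. All names are hypothetical, and the compatibility boolean for older welcomes is omitted.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PeerIdentity {
    Gateway,
    Client { mac: [u8; 6] },
}

#[derive(Default)]
pub struct ClientView {
    peers: HashMap<u64, PeerIdentity>,
    gateway_peer_id: Option<u64>,
    gateway_connected: bool,
}

impl ClientView {
    /// Seeds gateway status from the welcome, including the gateway's peer id.
    pub fn on_welcome(&mut self, gateway_peer_id: Option<u64>) {
        self.gateway_peer_id = gateway_peer_id;
        self.gateway_connected = gateway_peer_id.is_some();
    }

    /// Handles both live joins and catch-up joins for already-present peers.
    pub fn on_peer_joined(&mut self, peer_id: u64, identity: PeerIdentity) {
        if identity == PeerIdentity::Gateway {
            self.gateway_peer_id = Some(peer_id);
            self.gateway_connected = true;
        }
        self.peers.insert(peer_id, identity);
    }

    /// Returns the identity to log for the leaving peer, if it can be determined.
    pub fn on_peer_left(&mut self, peer_id: u64) -> Option<PeerIdentity> {
        let known = self.peers.remove(&peer_id);
        let was_gateway =
            known == Some(PeerIdentity::Gateway) || self.gateway_peer_id == Some(peer_id);
        if was_gateway {
            // Covers a gateway that left before its catch-up join was seen.
            self.gateway_peer_id = None;
            self.gateway_connected = false;
            return Some(PeerIdentity::Gateway);
        }
        known
    }

    pub fn gateway_connected(&self) -> bool {
        self.gateway_connected
    }
}
```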