From b41f75fbc96f26cf4a94d756f6cd5512ab2ab719 Mon Sep 17 00:00:00 2001
From: ddidderr
Date: Thu, 21 May 2026 17:00:58 +0200
Subject: [PATCH] plan

---
 PLAN.md | 578 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 578 insertions(+)
 create mode 100644 PLAN.md

diff --git a/PLAN.md b/PLAN.md
new file mode 100644
index 0000000..388f15e
--- /dev/null
+++ b/PLAN.md
@@ -0,0 +1,578 @@
+What I want to do:
+
+A simple one-click Layer 2 tunnel client (Windows 11) that bridges people who cannot attend a LAN party in person onto the party's LAN, plus a simple server endpoint (Linux) that runs physically at the LAN party and bridges the tunneled traffic onto the real LAN.
+
+I have already discussed this with several AIs; here is the current plan:
+
+# LAN Party Tunnel Plan
+
+Build a **TAP-based L2-over-QUIC tunnel**.
+
+The remote Windows client gets a real virtual Ethernet adapter. Ethernet frames from that adapter are sent over QUIC to a public relay. The relay forwards them to a Linux gateway at the LAN party. The Linux gateway injects those frames onto the physical LAN and captures replies.
+
+```text
+Windows game
+  ⇄ Windows TAP adapter
+  ⇄ lanparty-client.exe
+  ⇄ QUIC datagrams
+  ⇄ public relay
+  ⇄ QUIC datagrams
+  ⇄ Linux LAN gateway
+  ⇄ physical Ethernet LAN
+```
+
+No WireGuard.
+No Npcap.
+No Windows bridge.
+No packet rewriting from the user’s real NIC.
+No tunnel fragmentation for MVP.
+
+## Goal
+
+The remote player should do this:
+
+```text
+1. Install client.
+2. Start it.
+3. Enter domain / room code.
+4. Click Connect.
+5. Game sees a normal LAN adapter.
+```
+
+The physical LAN party host does this:
+
+```text
+1. Plug Linux gateway PC into the LAN with wired Ethernet.
+2. Run lanparty-gateway --iface eth0 --room ABCD.
+3. Done.
+```
+
+The public server does this:
+
+```text
+lanparty-relay --listen 443/udp
+```
+
+UDP/443 is a good default, but the port must be configurable because some networks block QUIC/UDP.
+
+## Components
+
+### 1. Windows client: `lanparty-client.exe`
+
+Written in Rust.
+
+Responsibilities:
+
+```text
+- create/open TAP adapter
+- give the TAP adapter a unique stable MAC address
+- set TAP MTU to a safe small value
+- connect to the relay via QUIC
+- read Ethernet frames from TAP
+- send one Ethernet frame per QUIC datagram
+- receive Ethernet frames from QUIC datagrams
+- write frames back into TAP
+- keep the relay connection routed through the real internet NIC
+```
+
+Use a real TAP/Ethernet adapter. `tap-windows6` is an NDIS TAP-Windows driver used by OpenVPN and other apps, which is the right class of device here because we need Ethernet frames, not just IP packets. ([GitHub][1])
+
+Do **not** use Wintun for this design. Wintun is L3/TUN-style and does not give you the Ethernet/L2 behavior needed for ARP, DHCP, broadcast discovery, and old LAN games.
+
+The TAP adapter is the remote player’s LAN-party identity.
+
+```text
+Game binds to TAP
+TAP gets DHCP from real LAN via tunnel
+Game sends ARP/broadcast/multicast through TAP
+Client tunnels the Ethernet frames
+```
+
+### 2. Linux gateway: `lanparty-gateway`
+
+Runs on the physical LAN party machine.
+
+Responsibilities:
+
+```text
+- connect outbound to relay
+- open raw L2 socket on the wired LAN interface
+- capture Ethernet frames from the LAN
+- inject remote Ethernet frames onto the LAN
+- learn remote MAC addresses
+- apply safety filters
+- periodically refresh switch CAM table entries
+```
+
+Use Linux `AF_PACKET` / `SOCK_RAW` on the real wired NIC. Packet sockets operate at device-driver / OSI Layer 2 level, and `SOCK_RAW` includes the link-layer header, which is exactly what we need for Ethernet frames. ([man7.org][2])
+
+For MVP, run as root. Later, reduce privileges. Opening raw sockets and enabling promiscuous mode require elevated networking privileges; `CAP_NET_ADMIN` covers things like setting promiscuous mode, and `CAP_NET_RAW` covers raw packet access. ([man7.org][3])
+
+No Linux bridge is needed for MVP. No `br0`. No moving the host’s IP from `eth0` to a bridge. The gateway daemon directly captures and injects frames on the physical NIC.
+
+Wired Ethernet only. No Wi-Fi gateway mode for MVP. Wi-Fi NICs in managed (client) mode cannot reliably inject frames with arbitrary source MACs.
+
+### 3. Public relay: `lanparty-relay`
+
+Runs on VPS/public server.
+
+Responsibilities:
+
+```text
+- accept QUIC connections
+- group clients and gateway into rooms
+- forward Ethernet datagrams
+- enforce room limits
+- reject duplicate MACs
+- rate-limit abuse
+- later: auth / invite codes / E2E overlay encryption
+```
+
+For MVP, the relay is the full data path, not merely NAT traversal.
+
+That gives the best UX:
+
+```text
+client → outbound QUIC → relay
+gateway → outbound QUIC → relay
+```
+
+No port forwarding. No NAT traversal pain. Direct P2P can come later.
+
+## Transport
+
+Use QUIC.
+
+Use reliable QUIC streams for control:
+
+```text
+hello
+join room
+role = client | gateway
+version negotiation
+assigned peer id
+announced MAC
+MTU negotiation
+stats
+disconnect reason
+future auth
+```
+
+Use QUIC DATAGRAM for Ethernet frames. QUIC DATAGRAM is the unreliable datagram extension for QUIC (RFC 9221), which fits Ethernet/game traffic better than reliable streams because old frames should not block newer frames. ([IETF Datatracker][4])
+
+Rust QUIC implementation: start with `quinn`. It exposes `Connection::max_datagram_size()`, which returns the maximum datagram payload size or `None` if datagrams are unsupported/disabled. ([Docs.rs][5])
+
+## No fragmentation for MVP
+
+Do **not** fragment Ethernet frames inside the overlay.
+
+Instead:
+
+```text
+small TAP MTU
+one TAP Ethernet frame = one QUIC datagram
+```
+
+Startup flow:
+
+```text
+1. establish QUIC connection
+2. verify QUIC DATAGRAM support
+3. query max_datagram_size()
+4. compute safe inner MTU
+5. configure TAP MTU
+6. bring TAP up
+```
+
+MVP default:
+
+```text
+TAP MTU: 1200 or 1280-ish
+hard fail if QUIC datagram budget is too small
+```
+
+Formula:
+
+```text
+tap_mtu <= quic_max_datagram_size
+           - overlay_header_len
+           - ethernet_header_len
+           - safety_margin
+```
+
+No fragment table. No reassembly timeout. No “one lost fragment kills the whole Ethernet frame.” Add fragmentation later only if testing proves it is necessary.
+
+## Overlay frame format
+
+Keep the outer routing header small and stable.
+
+Example:
+
+```text
+magic: u32
+version: u8
+type: u8 // frame, control, keepalive
+room_id: u64
+peer_id: u32
+flags: u16
+payload_len: u16
+payload: Ethernet frame bytes
+```
+
+For future relay-blind encryption, split this mentally into:
+
+```text
+clear routing header
+encrypted Ethernet payload
+```
+
+MVP can skip payload encryption beyond QUIC, but the wire format should not make later E2E encryption painful.
+
+## Trust model
+
+MVP relay sees plaintext Ethernet frames.
+
+QUIC encrypts traffic on the wire, but because the relay terminates QUIC connections, it decrypts frames from clients and re-encrypts them to the gateway.
+
+That is acceptable for a LAN-party MVP, but it should be explicitly documented.
+
+Future version:
+
+```text
+client/gateway share room key
+Ethernet payload is AEAD-encrypted before QUIC
+relay only sees room id, peer id, size, timing
+```
+
+Do not retrofit this into a bad packet format later. Reserve the shape now.
+
+## Switching model
+
+Treat the whole thing as a tiny user-space Ethernet switch.
+
+Maintain:
+
+```text
+MAC -> peer_id
+peer_id -> QUIC connection
+last_seen timestamp
+```
+
+Forwarding rules:
+
+```text
+source MAC from client:
+  learn source MAC -> client
+
+known unicast:
+  forward only to target peer/gateway
+
+broadcast/multicast:
+  flood to gateway and relevant clients
+
+unknown unicast:
+  flood initially, later rate-limit
+
+never reflect frame back to ingress peer
+```
+
+For MVP, simplify:
+
+```text
+remote client frames mostly go to gateway
+LAN frames go to matching remote client or all clients if broadcast/multicast
+```
+
+But MAC learning belongs in the real design.
+
+## MAC identity
+
+Each Windows client needs a unique locally administered unicast MAC.
+
+Example range:
+
+```text
+02:xx:xx:xx:xx:xx
+```
+
+Generate once per install or per profile. Store it. Configure TAP with it. Announce it during join.
+
+Relay must reject:
+
+```text
+- duplicate MAC in same room
+- broadcast/multicast source MAC
+- obviously invalid MAC
+- too many source MACs per client
+```
+
+Default policy:
+
+```text
+1 MAC per client
+maybe 2 later for weird cases
+```
+
+Enforcing this is the software’s responsibility, not the user’s.
+
+## Linux gateway CAM-table refresh
+
+The physical LAN switch must learn that remote clients’ MACs live behind the gateway port.
+
+That happens when the gateway injects frames onto the LAN using the remote client’s source MAC.
+
+But switch CAM entries age out. So the gateway should periodically refresh them.
+
+Every ~60 seconds:
+
+```text
+for each connected remote MAC:
+  inject a tiny harmless Ethernet frame with that MAC as source
+```
+
+The exact frame can be decided during implementation, but the goal is simple: keep the LAN switch mapping the remote MAC to the gateway’s physical port.
+
+Phase 1 success criterion:
+
+```text
+remote client MAC appears in the LAN switch MAC table on the gateway port
+```
+
+If that is false, the L2 illusion is broken.
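The "Switching model" and "MAC identity" rules above can be sketched in Rust roughly as follows. This is a minimal illustration rather than the planned implementation. The names `PeerId`, `Mac`, `Forward`, `MacTable`, and `is_valid_client_mac` are hypothetical, and `last_seen` tracking, per-client MAC limits, and rate limiting are left out. It assumes source-MAC learning is applied only to frames arriving from remote clients.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for a relay peer (a client or the gateway).
type PeerId = u32;
type Mac = [u8; 6];

/// Forwarding decision. `Flood` means "send to every peer except the ingress peer".
#[derive(Debug, PartialEq)]
enum Forward {
    To(PeerId),
    Flood,
    Drop,
}

/// Accept only unicast (I/G bit clear), locally administered (U/L bit set) MACs,
/// e.g. 02:xx:xx:xx:xx:xx. This rejects broadcast, multicast, and all-zero MACs.
fn is_valid_client_mac(mac: &Mac) -> bool {
    let unicast = mac[0] & 0x01 == 0;
    let locally_administered = mac[0] & 0x02 != 0;
    unicast && locally_administered
}

/// MAC -> peer mapping for one room.
#[derive(Default)]
struct MacTable {
    owners: HashMap<Mac, PeerId>,
}

impl MacTable {
    /// Learn `src -> peer` for a frame from a remote client. Returns false if the
    /// MAC is invalid or is already owned by a different peer (duplicate MAC).
    fn learn(&mut self, src: Mac, peer: PeerId) -> bool {
        if !is_valid_client_mac(&src) {
            return false;
        }
        match self.owners.get(&src) {
            Some(&owner) if owner != peer => false,
            _ => {
                self.owners.insert(src, peer);
                true
            }
        }
    }

    /// Decide where a frame with destination `dst` from `ingress` should go.
    fn decide(&self, dst: &Mac, ingress: PeerId) -> Forward {
        if dst[0] & 0x01 != 0 {
            return Forward::Flood; // broadcast or multicast
        }
        match self.owners.get(dst) {
            Some(&peer) if peer == ingress => Forward::Drop, // never reflect
            Some(&peer) => Forward::To(peer),
            None => Forward::Flood, // unknown unicast, e.g. a host on the real LAN
        }
    }
}
```

In this sketch a caller would run `learn` on every frame from a client, disconnect or drop on `false`, and then use `decide` to pick the target. Unknown unicast is flooded, so frames for hosts on the real LAN still reach the gateway even though the table never learns LAN-side MACs.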
+
+## Safety filters
+
+Remote clients must not be allowed to spray arbitrary L2 control-plane junk onto the real LAN.
+
+Drop remote → LAN unconditionally:
+
+```text
+- EAPOL / 802.1X
+- STP / BPDUs
+- LLDP
+- LACP
+- DHCP server replies
+- IPv6 Router Advertisements
+- jumbo frames
+- frames from unauthorized source MACs
+```
+
+Also drop LAN → remote:
+
+```text
+- EAPOL
+- STP
+- LLDP
+- LACP
+```
+
+No remote Windows client needs to see switch/control-plane traffic.
+
+EAPOL is especially important: remote clients should never be able to interfere with 802.1X or port authentication behavior on the physical switch.
+
+Add rate limits:
+
+```text
+- broadcast/multicast per client
+- unknown unicast per client
+- total bandwidth per client
+- malformed packet disconnect threshold
+```
+
+## Windows routing / metric handling
+
+The TAP adapter may receive DHCP from the party LAN. That is good.
+
+But if DHCP gives it a default gateway, Windows might try to route the relay connection through the tunnel itself. That would break the tunnel.
+
+Client startup should:
+
+```text
+1. resolve relay domain before TAP is active
+2. remember current real default gateway/interface
+3. add explicit host route to relay IP via real NIC
+4. bring TAP up
+5. set TAP interface metric appropriately
+6. detect and neutralize TAP default-route takeover
+```
+
+The TAP should be preferred for the party LAN subnet, but it must not steal general internet traffic.
+
+Also strongly recommend uncommon LAN party subnets:
+
+```text
+good: 10.73.42.0/24
+bad: 192.168.0.0/24
+bad: 192.168.1.0/24
+bad: 192.168.178.0/24
+```
+
+A party subnet that duplicates a remote user’s home LAN subnet will be painful.
+
+## Relay placement / latency
+
+Relay-as-data-path is the right MVP. It makes the product work through NAT immediately.
+
+But latency becomes:
+
+```text
+client → relay → gateway
+```
+
+So relay location matters.
+
+For Europe/Germany-focused usage, put the relay near the expected players and LAN site, e.g. Frankfurt/Nuremberg/Amsterdam depending on hosting. Later, add direct QUIC path attempts with relay fallback, but do not block MVP on NAT traversal.
+
+Design the room protocol so future modes are possible:
+
+```text
+mode = relay
+mode = direct-p2p
+mode = direct-failed-relay-fallback
+```
+
+## Logging / diagnostics
+
+Phase 1 should log heavily.
+
+Gateway frame log:
+
+```text
+direction
+src MAC
+dst MAC
+ethertype
+length
+peer id
+action = forwarded | dropped | filtered | rate-limited
+```
+
+Client diagnostics:
+
+```text
+relay reachable: yes/no
+QUIC datagram support: yes/no
+max datagram size
+TAP adapter found: yes/no
+TAP MAC
+TAP MTU
+TAP IP from DHCP
+relay route pinned: yes/no
+frames rx/tx
+drops
+```
+
+User-facing diagnostics should eventually say things like:
+
+```text
+Connected to relay
+Connected to LAN gateway
+DHCP received: 10.73.42.51
+Gateway latency: 23 ms
+Broadcast traffic flowing
+Warning: TAP received default route, adjusted metric
+```
+
+## Phase plan
+
+### Phase 1: prove the illusion
+
+Manual, ugly, real.
+
+```text
+- manual TAP install on Windows
+- Rust Windows client opens TAP
+- fixed TAP MTU, e.g. 1200
+- Linux gateway opens AF_PACKET on wired eth0
+- relay forwards one client
+- no auth except room string
+- no fragmentation
+- heavy frame logging
+```
+
+Success criteria:
+
+```text
+- Windows TAP gets DHCP from real LAN
+- Windows client can ARP LAN host
+- Windows client can ping LAN host
+- remote MAC appears in switch MAC table on gateway port
+- one real LAN game discovers or joins a LAN server
+```
+
+### Phase 2: multi-client
+
+```text
+- multiple Windows clients
+- unique MAC generation
+- duplicate MAC rejection
+- MAC learning
+- broadcast/multicast fanout
+- CAM refresh frames
+- reconnect handling
+```
+
+### Phase 3: safety and correctness
+
+```text
+- L2 control-plane filters
+- DHCP server reply filtering
+- IPv6 RA filtering
+- MAC limits
+- rate limits
+- route/metric protection
+- better malformed-frame handling
+```
+
+### Phase 4: product UX
+
+```text
+- Windows installer
+- TAP driver install/check
+- simple GUI
+- room code / domain field
+- diagnostics screen
+- configurable relay port
+- logs export button
+```
+
+Driver signing and TAP bundling must be validated early. `tap-windows6` is the right kind of driver, but Windows driver installation/signing is a product risk, not something to handwave. ([GitHub][1])
+
+### Phase 5: better security and latency
+
+```text
+- invite tokens / auth
+- room ACLs
+- optional room-key E2E payload encryption
+- direct QUIC path attempt
+- relay fallback
+- regional relay selection
+```
+
+## Explicit non-goals
+
+For MVP, do not build:
+
+```text
+- Npcap mode
+- WinDivert mode
+- source-IP rewriting
+- Windows bridge
+- Hyper-V virtual switch
+- WireGuard underlay
+- custom Ethernet fragmentation
+- Wi-Fi LAN gateway support
+- full internet VPN mode
+```
+
+## One-sentence version
+
+Build a **Rust Windows TAP client + public QUIC relay + Linux AF_PACKET gateway** that carries one small-MTU Ethernet frame per QUIC datagram, gives each remote player a unique virtual MAC on the real LAN, filters dangerous L2 control traffic, and keeps the physical LAN gateway as the only machine touching the real LAN.
+
+[1]: https://github.com/OpenVPN/tap-windows6 "OpenVPN/tap-windows6: Windows TAP driver (NDIS 6)"
+[2]: https://man7.org/linux/man-pages/man7/packet.7.html "packet(7) - Linux manual page"
+[3]: https://man7.org/linux/man-pages/man7/capabilities.7.html "capabilities(7) - Linux manual page"
+[4]: https://datatracker.ietf.org/doc/html/rfc9221 "RFC 9221 - An Unreliable Datagram Extension to QUIC"
+[5]: https://docs.rs/quinn/latest/quinn/struct.Connection.html "Connection in quinn - Rust"
+
+I want a mono-repo in Rust: one Cargo workspace, with all crates in a "crates" folder.
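As one possible first piece of a shared protocol crate in that workspace, the header from "Overlay frame format" and the formula from "No fragmentation for MVP" could be sketched as below. Everything here is an assumption for illustration: the names `OverlayHeader`, `safe_tap_mtu`, and `take`, the `MAGIC` value, big-endian field encoding, and a 14-byte untagged Ethernet header. The plan itself fixes only the field list and the inequality.

```rust
/// Clear routing header size: magic 4 + version 1 + type 1 + room_id 8
/// + peer_id 4 + flags 2 + payload_len 2 bytes.
pub const OVERLAY_HEADER_LEN: usize = 22;
/// Untagged Ethernet II header: dst MAC 6 + src MAC 6 + EtherType 2 (assumption: no VLAN tag).
pub const ETHERNET_HEADER_LEN: usize = 14;
/// Hypothetical magic value; the plan does not specify one.
pub const MAGIC: u32 = 0x4C41_4E50;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OverlayHeader {
    pub version: u8,
    pub kind: u8, // "type" in the plan: frame, control, keepalive
    pub room_id: u64,
    pub peer_id: u32,
    pub flags: u16,
    pub payload_len: u16,
}

/// Copy the first N bytes of `b` into a fixed-size array. Panics if `b` is shorter than N.
fn take<const N: usize>(b: &[u8]) -> [u8; N] {
    let mut a = [0u8; N];
    a.copy_from_slice(&b[..N]);
    a
}

impl OverlayHeader {
    /// Serialize the header in big-endian byte order (an assumption of this sketch).
    pub fn encode(&self) -> [u8; OVERLAY_HEADER_LEN] {
        let mut out = [0u8; OVERLAY_HEADER_LEN];
        out[0..4].copy_from_slice(&MAGIC.to_be_bytes());
        out[4] = self.version;
        out[5] = self.kind;
        out[6..14].copy_from_slice(&self.room_id.to_be_bytes());
        out[14..18].copy_from_slice(&self.peer_id.to_be_bytes());
        out[18..20].copy_from_slice(&self.flags.to_be_bytes());
        out[20..22].copy_from_slice(&self.payload_len.to_be_bytes());
        out
    }

    /// Parse one datagram into (header, payload). Returns None on a short buffer,
    /// a wrong magic, or a payload length that does not match the datagram.
    pub fn decode(buf: &[u8]) -> Option<(Self, &[u8])> {
        if buf.len() < OVERLAY_HEADER_LEN {
            return None;
        }
        let (head, rest) = buf.split_at(OVERLAY_HEADER_LEN);
        if u32::from_be_bytes(take(&head[0..])) != MAGIC {
            return None;
        }
        let header = OverlayHeader {
            version: head[4],
            kind: head[5],
            room_id: u64::from_be_bytes(take(&head[6..])),
            peer_id: u32::from_be_bytes(take(&head[14..])),
            flags: u16::from_be_bytes(take(&head[18..])),
            payload_len: u16::from_be_bytes(take(&head[20..])),
        };
        if rest.len() != header.payload_len as usize {
            return None;
        }
        Some((header, rest))
    }
}

/// Largest TAP MTU for which one full Ethernet frame still fits in one QUIC datagram.
/// Returns None when the datagram budget is too small (the "hard fail" case).
pub fn safe_tap_mtu(max_datagram_size: usize, safety_margin: usize) -> Option<usize> {
    max_datagram_size
        .checked_sub(OVERLAY_HEADER_LEN)?
        .checked_sub(ETHERNET_HEADER_LEN)?
        .checked_sub(safety_margin)
}
```

With a hypothetical datagram budget of 1200 bytes and a 16-byte margin, `safe_tap_mtu(1200, 16)` gives 1200 - 22 - 14 - 16 = 1148. That is below the fixed 1200 mentioned for Phase 1, so the fixed value is worth checking against whatever `max_datagram_size()` actually reports on a real connection.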