2026-05-21 17:00:58 +02:00

What I want to do:
A simple one-click Layer 2 tunnel client (Windows 11) that bridges people who cannot attend a LAN party in person onto the party's LAN, plus a simple server endpoint (Linux) that runs physically at the LAN party and bridges the tunneled traffic onto the real LAN.
I have already discussed how to do this with several AIs. Here is the current plan:
# LAN Party Tunnel Plan
Build a **TAP-based L2-over-QUIC tunnel**.
The remote Windows client gets a real virtual Ethernet adapter. Ethernet frames from that adapter are sent over QUIC to a public relay. The relay forwards them to a Linux gateway at the LAN party. The Linux gateway injects those frames onto the physical LAN and captures replies.
```text
Windows game
⇄ Windows TAP adapter
⇄ lanparty-client.exe
⇄ QUIC datagrams
⇄ public relay
⇄ QUIC datagrams
⇄ Linux LAN gateway
⇄ physical Ethernet LAN
```
No WireGuard.
No Npcap.
No Windows bridge.
No packet rewriting from the user's real NIC.
No tunnel fragmentation for MVP.
## Goal
The remote player should do this:
```text
1. Install client.
2. Start it.
3. Enter domain / room code.
4. Click Connect.
5. Game sees a normal LAN adapter.
```
The physical LAN party host does this:
```text
1. Plug Linux gateway PC into the LAN with wired Ethernet.
2. Run lanparty-gateway --iface eth0 --room ABCD.
3. Done.
```
The public server does this:
```text
lanparty-relay --listen 443/udp
```
UDP/443 is a good default, but the port must be configurable because some networks block QUIC/UDP.
## Components
### 1. Windows client: `lanparty-client.exe`
Written in Rust.
Responsibilities:
```text
- create/open TAP adapter
- give the TAP adapter a unique stable MAC address
- set TAP MTU to a safe small value
- connect to the relay via QUIC
- read Ethernet frames from TAP
- send one Ethernet frame per QUIC datagram
- receive Ethernet frames from QUIC datagrams
- write frames back into TAP
- keep the relay connection routed through the real internet NIC
```
Use a real TAP/Ethernet adapter. `tap-windows6` is an NDIS TAP-Windows driver used by OpenVPN and other apps, which is the right class of device here because we need Ethernet frames, not just IP packets. ([GitHub][1])
Do **not** use Wintun for this design. Wintun is L3/TUN-style and does not give you the Ethernet/L2 behavior needed for ARP, DHCP, broadcast discovery, and old LAN games.
The TAP adapter is the remote player's LAN-party identity.
```text
Game binds to TAP
TAP gets DHCP from real LAN via tunnel
Game sends ARP/broadcast/multicast through TAP
Client tunnels the Ethernet frames
```
### 2. Linux gateway: `lanparty-gateway`
Runs on the physical LAN party machine.
Responsibilities:
```text
- connect outbound to relay
- open raw L2 socket on the wired LAN interface
- capture Ethernet frames from the LAN
- inject remote Ethernet frames onto the LAN
- learn remote MAC addresses
- apply safety filters
- periodically refresh switch CAM table entries
```
Use Linux `AF_PACKET` / `SOCK_RAW` on the real wired NIC. Packet sockets operate at device-driver / OSI Layer 2 level, and `SOCK_RAW` includes the link-layer header, which is exactly what we need for Ethernet frames. ([man7.org][2])
For MVP, run as root. Later, reduce privileges. Opening raw sockets and enabling promiscuous mode both need elevated networking privileges: `CAP_NET_RAW` covers raw packet access, and `CAP_NET_ADMIN` covers things like setting promiscuous mode. ([man7.org][3])
No Linux bridge is needed for MVP. No `br0`. No moving the host's IP from `eth0` to a bridge. The gateway daemon directly captures and injects frames on the physical NIC.
Wired Ethernet only. No Wi-Fi gateway mode for MVP. A Wi-Fi NIC in managed (station) mode is not reliable for arbitrary source-MAC injection, because the access point generally only accepts frames from the associated station's own MAC.
### 3. Public relay: `lanparty-relay`
Runs on VPS/public server.
Responsibilities:
```text
- accept QUIC connections
- group clients and gateway into rooms
- forward Ethernet datagrams
- enforce room limits
- reject duplicate MACs
- rate-limit abuse
- later: auth / invite codes / E2E overlay encryption
```
For MVP, the relay is the full data path, not merely NAT traversal.
That gives the best UX:
```text
client → outbound QUIC → relay
gateway → outbound QUIC → relay
```
No port forwarding. No NAT traversal pain. Direct P2P can come later.
## Transport
Use QUIC.
Use reliable QUIC streams for control:
```text
hello
join room
role = client | gateway
version negotiation
assigned peer id
announced MAC
MTU negotiation
stats
disconnect reason
future auth
```
Use QUIC DATAGRAM for Ethernet frames. QUIC DATAGRAM is specifically the unreliable datagram extension for QUIC, which fits Ethernet/game traffic better than reliable streams because old frames should not block newer frames. ([IETF Datatracker][4])
Rust QUIC implementation: start with `quinn`. It exposes `Connection::max_datagram_size()`, which returns the maximum datagram payload size or `None` if datagrams are unsupported/disabled. ([Docs.rs][5])
## No fragmentation for MVP
Do **not** fragment Ethernet frames inside the overlay.
Instead:
```text
small TAP MTU
one TAP Ethernet frame = one QUIC datagram
```
Startup flow:
```text
1. establish QUIC connection
2. verify QUIC DATAGRAM support
3. query max_datagram_size()
4. compute safe inner MTU
5. configure TAP MTU
6. bring TAP up
```
MVP default:
```text
TAP MTU: 1200 or 1280-ish
hard fail if QUIC datagram budget is too small
```
Formula:
```text
tap_mtu <= quic_max_datagram_size
           - overlay_header_len
           - ethernet_header_len
           - safety_margin
```
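The formula can be sketched as a small helper. This is only an illustration: `OVERLAY_HEADER_LEN` assumes the 22-byte example header from the overlay frame format section, and `safety_margin` and `min_mtu` are hypothetical tuning parameters, not values the plan fixes. The `quic_max_datagram_size` argument stands for whatever `Connection::max_datagram_size()` reported.

```rust
/// Example overlay header: magic(4) + version(1) + type(1) + room_id(8)
/// + peer_id(4) + flags(2) + payload_len(2). Assumed, not final.
const OVERLAY_HEADER_LEN: usize = 22;
/// Untagged Ethernet II header: dst(6) + src(6) + ethertype(2).
const ETHERNET_HEADER_LEN: usize = 14;

/// Returns the largest TAP MTU that still fits one frame per datagram,
/// or `None` if the budget is too small (the "hard fail" case).
fn safe_tap_mtu(
    quic_max_datagram_size: usize,
    safety_margin: usize,
    min_mtu: usize,
) -> Option<usize> {
    let overhead = OVERLAY_HEADER_LEN + ETHERNET_HEADER_LEN + safety_margin;
    let mtu = quic_max_datagram_size.checked_sub(overhead)?;
    if mtu >= min_mtu {
        Some(mtu)
    } else {
        None
    }
}
```

With these numbers, a TAP MTU of 1200 needs a datagram budget of at least 1200 + 14 + 22 = 1236 bytes plus the margin, so the fixed default should be checked against the real `max_datagram_size()` rather than assumed.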
No fragment table. No reassembly timeout. No “one lost fragment kills the whole Ethernet frame.” Add fragmentation later only if testing proves it is necessary.
## Overlay frame format
Keep the outer routing header small and stable.
Example:
```text
magic: u32
version: u8
type: u8 // frame, control, keepalive
room_id: u64
peer_id: u32
flags: u16
payload_len: u16
payload: Ethernet frame bytes
```
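A minimal encode/decode sketch for this header, to make the layout concrete. Several details are assumptions that the plan does not specify: the `MAGIC` value, big-endian field order, and the rule that `payload_len` must match the datagram's remaining bytes exactly.

```rust
use std::convert::TryInto;

/// Hypothetical magic value; the plan does not define one.
const MAGIC: u32 = 0x4C50_5431;
/// 4 + 1 + 1 + 8 + 4 + 2 + 2 bytes.
const HEADER_LEN: usize = 22;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct OverlayHeader {
    version: u8,
    msg_type: u8, // frame, control, keepalive
    room_id: u64,
    peer_id: u32,
    flags: u16,
    payload_len: u16,
}

impl OverlayHeader {
    /// Serializes header + payload; refuses a mismatched `payload_len`.
    fn encode(&self, payload: &[u8]) -> Option<Vec<u8>> {
        if payload.len() != usize::from(self.payload_len) {
            return None;
        }
        let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
        out.extend_from_slice(&MAGIC.to_be_bytes());
        out.push(self.version);
        out.push(self.msg_type);
        out.extend_from_slice(&self.room_id.to_be_bytes());
        out.extend_from_slice(&self.peer_id.to_be_bytes());
        out.extend_from_slice(&self.flags.to_be_bytes());
        out.extend_from_slice(&self.payload_len.to_be_bytes());
        out.extend_from_slice(payload);
        Some(out)
    }

    /// Parses one datagram into header and payload slice.
    fn decode(buf: &[u8]) -> Option<(OverlayHeader, &[u8])> {
        if buf.len() < HEADER_LEN {
            return None;
        }
        if u32::from_be_bytes(buf[0..4].try_into().ok()?) != MAGIC {
            return None;
        }
        let header = OverlayHeader {
            version: buf[4],
            msg_type: buf[5],
            room_id: u64::from_be_bytes(buf[6..14].try_into().ok()?),
            peer_id: u32::from_be_bytes(buf[14..18].try_into().ok()?),
            flags: u16::from_be_bytes(buf[18..20].try_into().ok()?),
            payload_len: u16::from_be_bytes(buf[20..22].try_into().ok()?),
        };
        let payload = &buf[HEADER_LEN..];
        if payload.len() != usize::from(header.payload_len) {
            return None;
        }
        Some((header, payload))
    }
}
```

Keeping the routing fields in a fixed-size prefix means a later version can encrypt everything after byte 22 without changing how the relay parses the datagram.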
For future relay-blind encryption, split this mentally into:
```text
clear routing header
encrypted Ethernet payload
```
MVP can skip payload encryption beyond QUIC, but the wire format should not make later E2E encryption painful.
## Trust model
MVP relay sees plaintext Ethernet frames.
QUIC encrypts traffic on the wire, but because the relay terminates QUIC connections, it decrypts frames from clients and re-encrypts them to the gateway.
That is acceptable for a LAN-party MVP, but it should be explicitly documented.
Future version:
```text
client/gateway share room key
Ethernet payload is AEAD-encrypted before QUIC
relay only sees room id, peer id, size, timing
```
Do not retrofit this into a bad packet format later. Reserve the shape now.
## Switching model
Treat the whole thing as a tiny user-space Ethernet switch.
Maintain:
```text
MAC -> peer_id
peer_id -> QUIC connection
last_seen timestamp
```
Forwarding rules:
```text
source MAC from client:
  learn source MAC -> client
known unicast:
  forward only to target peer/gateway
broadcast/multicast:
  flood to gateway and relevant clients
unknown unicast:
  flood initially, later rate-limit
never reflect frame back to ingress peer
```
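The forwarding rules above can be sketched as a small learning table. This is a simplified illustration: the names `SwitchTable`, `Forward`, and `handle_frame` are made up here, the gateway is treated as just another peer id, and the `last_seen` timestamp and rate limiting are left out.

```rust
use std::collections::HashMap;

type Mac = [u8; 6];
type PeerId = u32;

#[derive(Debug, PartialEq, Eq)]
enum Forward {
    /// Known unicast: send to exactly one peer.
    Unicast(PeerId),
    /// Broadcast, multicast, or unknown unicast: send to all other peers.
    Flood(Vec<PeerId>),
    /// Malformed frame, or the only target would be the sender itself.
    Drop,
}

#[derive(Default)]
struct SwitchTable {
    mac_to_peer: HashMap<Mac, PeerId>,
    peers: Vec<PeerId>,
}

impl SwitchTable {
    fn add_peer(&mut self, peer: PeerId) {
        if !self.peers.contains(&peer) {
            self.peers.push(peer);
        }
    }

    /// Learns the source MAC, then decides where the frame goes.
    fn handle_frame(&mut self, ingress: PeerId, frame: &[u8]) -> Forward {
        if frame.len() < 14 {
            return Forward::Drop;
        }
        let mut dst = [0u8; 6];
        let mut src = [0u8; 6];
        dst.copy_from_slice(&frame[0..6]);
        src.copy_from_slice(&frame[6..12]);

        // Only unicast source addresses are learned.
        if src[0] & 0x01 == 0 {
            self.mac_to_peer.insert(src, ingress);
        }

        // The low bit of the first octet marks broadcast/multicast.
        let is_group = dst[0] & 0x01 != 0;
        if !is_group {
            if let Some(&peer) = self.mac_to_peer.get(&dst) {
                // Never reflect a frame back to its ingress peer.
                return if peer == ingress {
                    Forward::Drop
                } else {
                    Forward::Unicast(peer)
                };
            }
        }
        let targets: Vec<PeerId> = self
            .peers
            .iter()
            .copied()
            .filter(|&p| p != ingress)
            .collect();
        if targets.is_empty() {
            Forward::Drop
        } else {
            Forward::Flood(targets)
        }
    }
}
```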
For MVP, simplify:
```text
remote client frames mostly go to gateway
LAN frames go to matching remote client or all clients if broadcast/multicast
```
But MAC learning belongs in the real design.
## MAC identity
Each Windows client needs a unique locally administered unicast MAC.
Example range:
```text
02:xx:xx:xx:xx:xx
```
Generate once per install or per profile. Store it. Configure TAP with it. Announce it during join.
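A minimal sketch of deriving such a MAC. It assumes the caller supplies five random bytes from an OS random source, since the standard library has none; the function names are illustrative only.

```rust
/// Builds a MAC in the 02:xx:xx:xx:xx:xx range from five random bytes.
/// 0x02 sets the locally-administered bit and clears the multicast bit.
fn mac_from_random(random: [u8; 5]) -> [u8; 6] {
    [0x02, random[0], random[1], random[2], random[3], random[4]]
}

/// Formats a MAC as lowercase colon-separated hex for storage and logs.
fn format_mac(mac: [u8; 6]) -> String {
    mac.iter()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(":")
}
```

Fixing the first octet leaves 40 random bits, which should make accidental collisions within one room unlikely, but the relay still has to check for duplicates.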
Relay must reject:
```text
- duplicate MAC in same room
- broadcast/multicast source MAC
- obviously invalid MAC
- too many source MACs per client
```
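The rejection rules could look roughly like this on the relay. The `Room` type, the `Reject` variants, and the choice to treat only the all-zero address as "obviously invalid" are assumptions made for this sketch.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
enum Reject {
    /// Broadcast or multicast address used as a source.
    GroupSource,
    /// 00:00:00:00:00:00 is never a valid station address.
    AllZero,
    /// Another peer in the room already owns this MAC.
    Duplicate,
    /// The peer already announced its allowed number of MACs.
    TooManyMacs,
}

struct Room {
    owners: HashMap<[u8; 6], u32>,
    max_macs_per_client: usize,
}

impl Room {
    /// Checks a source MAC announced (or used) by `peer`.
    fn admit(&mut self, peer: u32, mac: [u8; 6]) -> Result<(), Reject> {
        if mac[0] & 0x01 != 0 {
            return Err(Reject::GroupSource);
        }
        if mac == [0u8; 6] {
            return Err(Reject::AllZero);
        }
        match self.owners.get(&mac) {
            Some(&owner) if owner == peer => return Ok(()),
            Some(_) => return Err(Reject::Duplicate),
            None => {}
        }
        let owned = self.owners.values().filter(|&&p| p == peer).count();
        if owned >= self.max_macs_per_client {
            return Err(Reject::TooManyMacs);
        }
        self.owners.insert(mac, peer);
        Ok(())
    }
}
```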
Default policy:
```text
1 MAC per client
maybe 2 later for weird cases
```
This is your responsibility, not the user's.
## Linux gateway CAM-table refresh
The physical LAN switch must learn that remote clients' MACs live behind the gateway port.
That happens when the gateway injects frames onto the LAN using the remote client's source MAC.
But switch CAM entries age out. So the gateway should periodically refresh them.
Every ~60 seconds:
```text
for each connected remote MAC:
  inject a tiny harmless Ethernet frame with that MAC as source
```
The exact frame can be decided during implementation, but the goal is simple: keep the LAN switch mapping the remote MAC to the gateway's physical port.
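One possible shape for such a refresh frame, as a sketch only. It assumes the IEEE 802 Local Experimental EtherType `0x88B5` as a placeholder payload type and leaves the destination to the caller; neither choice comes from the plan, and the real frame should be settled during testing against actual switches.

```rust
/// IEEE 802 Local Experimental EtherType 1, used here as a placeholder.
const ETHERTYPE_LOCAL_EXPERIMENTAL: u16 = 0x88B5;
/// Minimum Ethernet frame length without the FCS.
const MIN_FRAME_LEN: usize = 60;

/// Builds a zero-padded minimum-size frame whose source is the remote
/// client's MAC, so the switch re-learns that MAC on the gateway port.
fn cam_refresh_frame(remote_mac: [u8; 6], dst_mac: [u8; 6]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(MIN_FRAME_LEN);
    frame.extend_from_slice(&dst_mac);
    frame.extend_from_slice(&remote_mac);
    frame.extend_from_slice(&ETHERTYPE_LOCAL_EXPERIMENTAL.to_be_bytes());
    frame.resize(MIN_FRAME_LEN, 0);
    frame
}
```

A unicast destination such as the gateway's own NIC address might be a reasonable choice to avoid waking every host on the LAN with a broadcast, but that behavior depends on the switch and needs to be verified.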
Phase 1 success criterion:
```text
remote client MAC appears in the LAN switch MAC table on the gateway port
```
If that is false, the L2 illusion is broken.
## Safety filters
Remote clients must not be allowed to spray arbitrary L2 control-plane junk onto the real LAN.
Drop remote → LAN unconditionally:
```text
- EAPOL / 802.1X
- STP / BPDUs
- LLDP
- LACP
- DHCP server replies
- IPv6 Router Advertisements
- jumbo frames
- frames from unauthorized source MACs
```
Also drop LAN → remote:
```text
- EAPOL
- STP
- LLDP
- LACP
```
No remote Windows client needs to see switch/control-plane traffic.
EAPOL is especially important: remote clients should never be able to interfere with 802.1X or port authentication behavior on the physical switch.
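A rough sketch of the remote → LAN filter. It makes simplifying assumptions that a real implementation would need to revisit: frames are untagged (no 802.1Q), IPv6 extension headers are ignored, and STP is caught through the `01:80:C2:00:00:0x` reserved group address range rather than by parsing BPDUs. `Verdict` and the function names are illustrative.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Pass,
    Drop(&'static str),
}

const ETHERTYPE_IPV4: u16 = 0x0800;
const ETHERTYPE_IPV6: u16 = 0x86DD;
const ETHERTYPE_SLOW: u16 = 0x8809; // slow protocols, including LACP
const ETHERTYPE_EAPOL: u16 = 0x888E;
const ETHERTYPE_LLDP: u16 = 0x88CC;
/// 1500-byte payload plus the 14-byte header; anything larger is dropped.
const MAX_FRAME_LEN: usize = 1514;
/// Prefix of the 802.1 reserved group addresses (STP, LACP, EAPOL, LLDP).
const RESERVED_GROUP_PREFIX: [u8; 5] = [0x01, 0x80, 0xC2, 0x00, 0x00];

fn filter_remote_to_lan(frame: &[u8], allowed_src: [u8; 6]) -> Verdict {
    if frame.len() < 14 {
        return Verdict::Drop("shorter than an Ethernet header");
    }
    if frame.len() > MAX_FRAME_LEN {
        return Verdict::Drop("oversized frame");
    }
    if frame[6..12] != allowed_src {
        return Verdict::Drop("unauthorized source MAC");
    }
    if frame[0..5] == RESERVED_GROUP_PREFIX && frame[5] <= 0x0F {
        return Verdict::Drop("802.1 reserved group address");
    }
    let ethertype = u16::from_be_bytes([frame[12], frame[13]]);
    let l3 = &frame[14..];
    match ethertype {
        ETHERTYPE_EAPOL => Verdict::Drop("EAPOL"),
        ETHERTYPE_LLDP => Verdict::Drop("LLDP"),
        ETHERTYPE_SLOW => Verdict::Drop("LACP / slow protocols"),
        ETHERTYPE_IPV4 if is_dhcp_server_reply(l3) => Verdict::Drop("DHCP server reply"),
        ETHERTYPE_IPV6 if is_router_advertisement(l3) => Verdict::Drop("IPv6 RA"),
        _ => Verdict::Pass,
    }
}

/// IPv4 + UDP with source port 67 (the DHCP server port).
fn is_dhcp_server_reply(ip: &[u8]) -> bool {
    if ip.len() < 20 || ip[9] != 17 {
        return false;
    }
    let ihl = usize::from(ip[0] & 0x0F) * 4;
    ihl >= 20 && ip.len() >= ihl + 2 && u16::from_be_bytes([ip[ihl], ip[ihl + 1]]) == 67
}

/// IPv6 with next header ICMPv6 (58) and ICMPv6 type 134.
fn is_router_advertisement(ip6: &[u8]) -> bool {
    ip6.len() > 40 && ip6[6] == 58 && ip6[40] == 134
}
```

The LAN → remote direction would reuse the EtherType and reserved-address checks but not the source-MAC check.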
Add rate limits:
```text
- broadcast/multicast per client
- unknown unicast per client
- total bandwidth per client
- malformed packet disconnect threshold
```
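These limits could each be backed by a token bucket. The plan does not name an algorithm, so this is one possible sketch; the capacity and refill rate are placeholders to be tuned, and the clock is passed in so the logic stays testable.

```rust
use std::time::Instant;

/// One bucket per client and traffic class (e.g. broadcast frames).
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl TokenBucket {
    /// Starts full, so a short burst is allowed right after connecting.
    fn new(capacity: f64, refill_per_sec: f64, now: Instant) -> Self {
        TokenBucket {
            capacity,
            tokens: capacity,
            refill_per_sec,
            last: now,
        }
    }

    /// Refills according to elapsed time, then tries to pay `cost`
    /// (1.0 per frame, or the frame length for a bandwidth limit).
    fn try_consume(&mut self, cost: f64, now: Instant) -> bool {
        let elapsed = now.saturating_duration_since(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last = now;
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```

A frame that fails `try_consume` would be logged with `action = rate-limited` and dropped.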
## Windows routing / metric handling
The TAP adapter may receive DHCP from the party LAN. That is good.
But if DHCP gives it a default gateway, Windows might try to route the relay connection through the tunnel itself. That would break the tunnel.
Client startup should:
```text
1. resolve relay domain before TAP is active
2. remember current real default gateway/interface
3. add explicit host route to relay IP via real NIC
4. bring TAP up
5. set TAP interface metric appropriately
6. detect and neutralize TAP default-route takeover
```
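Step 3 could, for a first prototype, shell out to the Windows `route` command. The sketch below only builds the argument list and does not run anything; the function name is hypothetical, it covers IPv4 only, and a production client would more likely call the IP Helper API directly.

```rust
use std::net::Ipv4Addr;

/// Arguments for `route.exe` that pin a /32 host route to the relay
/// through the real gateway and interface, captured before TAP is up.
fn pin_relay_route_args(
    relay: Ipv4Addr,
    real_gateway: Ipv4Addr,
    real_if_index: u32,
) -> Vec<String> {
    vec![
        "ADD".to_string(),
        relay.to_string(),
        "MASK".to_string(),
        "255.255.255.255".to_string(),
        real_gateway.to_string(),
        "METRIC".to_string(),
        "1".to_string(),
        "IF".to_string(),
        real_if_index.to_string(),
    ]
}
```

The resulting vector can be passed to `std::process::Command::new("route").args(...)` from an elevated process; the matching `DELETE` should run on disconnect so stale routes do not accumulate.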
The TAP should be preferred for the party LAN subnet, but it must not steal general internet traffic.
Also strongly recommend uncommon LAN party subnets:
```text
good: 10.73.42.0/24
bad: 192.168.0.0/24
bad: 192.168.1.0/24
bad: 192.168.178.0/24
```
A party subnet that overlaps with a remote user's home LAN will be painful.
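The client could warn about this case before bringing the TAP up. A minimal IPv4 overlap check, assuming subnets are given as an address plus prefix length (the helper names are illustrative):

```rust
use std::net::Ipv4Addr;

/// Netmask for a prefix length; values above 32 are treated as 32.
fn mask(prefix: u8) -> u32 {
    if prefix == 0 {
        0
    } else {
        u32::MAX << (32 - u32::from(prefix.min(32)))
    }
}

/// Two CIDR blocks overlap exactly when they agree on the bits
/// covered by the shorter of the two prefixes.
fn subnets_overlap(a: (Ipv4Addr, u8), b: (Ipv4Addr, u8)) -> bool {
    let m = mask(a.1.min(b.1));
    (u32::from(a.0) & m) == (u32::from(b.0) & m)
}
```

For example, a party LAN on `10.73.42.0/24` does not overlap a home LAN on `192.168.178.0/24`, but it does overlap a home network configured as `10.0.0.0/8`.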
## Relay placement / latency
Relay-as-data-path is the right MVP. It makes the product work through NAT immediately.
But latency becomes:
```text
client → relay → gateway
```
So relay location matters.
For Europe/Germany-focused usage, put the relay near the expected players and LAN site, e.g. Frankfurt/Nuremberg/Amsterdam depending on hosting. Later, add direct QUIC path attempts with relay fallback, but do not block MVP on NAT traversal.
Design the room protocol so future modes are possible:
```text
mode = relay
mode = direct-p2p
mode = direct-failed-relay-fallback
```
## Logging / diagnostics
Phase 1 should log heavily.
Gateway frame log:
```text
direction
src MAC
dst MAC
ethertype
length
peer id
action = forwarded | dropped | filtered | rate-limited
```
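The gateway frame log fields map naturally onto a small record type. This sketch assumes untagged Ethernet II frames and a single-line text format; the type names and the exact output layout are illustrative choices.

```rust
use std::fmt;

#[derive(Debug, Clone, Copy)]
enum Direction {
    LanToRemote,
    RemoteToLan,
}

#[derive(Debug, Clone, Copy)]
enum Action {
    Forwarded,
    Dropped,
    Filtered,
    RateLimited,
}

#[derive(Debug, Clone, Copy)]
struct FrameLog {
    direction: Direction,
    src: [u8; 6],
    dst: [u8; 6],
    ethertype: u16,
    length: usize,
    peer_id: u32,
    action: Action,
}

impl FrameLog {
    /// Extracts the logged fields from a raw frame; `None` if too short.
    fn from_frame(direction: Direction, peer_id: u32, action: Action, frame: &[u8]) -> Option<FrameLog> {
        if frame.len() < 14 {
            return None;
        }
        let mut dst = [0u8; 6];
        let mut src = [0u8; 6];
        dst.copy_from_slice(&frame[0..6]);
        src.copy_from_slice(&frame[6..12]);
        let ethertype = u16::from_be_bytes([frame[12], frame[13]]);
        Some(FrameLog { direction, src, dst, ethertype, length: frame.len(), peer_id, action })
    }
}

fn mac(m: &[u8; 6]) -> String {
    format!("{:02x}:{:02x}:{:02x}:{:02x}:{:02x}:{:02x}", m[0], m[1], m[2], m[3], m[4], m[5])
}

impl fmt::Display for FrameLog {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let dir = match self.direction {
            Direction::LanToRemote => "lan->remote",
            Direction::RemoteToLan => "remote->lan",
        };
        let action = match self.action {
            Action::Forwarded => "forwarded",
            Action::Dropped => "dropped",
            Action::Filtered => "filtered",
            Action::RateLimited => "rate-limited",
        };
        write!(
            f,
            "{} src={} dst={} ethertype=0x{:04x} len={} peer={} action={}",
            dir, mac(&self.src), mac(&self.dst), self.ethertype, self.length, self.peer_id, action
        )
    }
}
```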
Client diagnostics:
```text
relay reachable: yes/no
QUIC datagram support: yes/no
max datagram size
TAP adapter found: yes/no
TAP MAC
TAP MTU
TAP IP from DHCP
relay route pinned: yes/no
frames rx/tx
drops
```
User-facing diagnostics should eventually say things like:
```text
Connected to relay
Connected to LAN gateway
DHCP received: 10.73.42.51
Gateway latency: 23 ms
Broadcast traffic flowing
Warning: TAP received default route, adjusted metric
```
## Phase plan
### Phase 1: prove the illusion
Manual, ugly, real.
```text
- manual TAP install on Windows
- Rust Windows client opens TAP
- fixed TAP MTU, e.g. 1200
- Linux gateway opens AF_PACKET on wired eth0
- relay forwards one client
- no auth except room string
- no fragmentation
- heavy frame logging
```
Success criteria:
```text
- Windows TAP gets DHCP from real LAN
- Windows client can ARP LAN host
- Windows client can ping LAN host
- remote MAC appears in switch MAC table on gateway port
- one real LAN game discovers or joins a LAN server
```
### Phase 2: multi-client
```text
- multiple Windows clients
- unique MAC generation
- duplicate MAC rejection
- MAC learning
- broadcast/multicast fanout
- CAM refresh frames
- reconnect handling
```
### Phase 3: safety and correctness
```text
- L2 control-plane filters
- DHCP server reply filtering
- IPv6 RA filtering
- MAC limits
- rate limits
- route/metric protection
- better malformed-frame handling
```
### Phase 4: product UX
```text
- Windows installer
- TAP driver install/check
- simple GUI
- room code / domain field
- diagnostics screen
- configurable relay port
- logs export button
```
Driver signing and TAP bundling must be validated early. `tap-windows6` is the right kind of driver, but Windows driver installation/signing is a product risk, not something to handwave. ([GitHub][1])
### Phase 5: better security and latency
```text
- invite tokens / auth
- room ACLs
- optional room-key E2E payload encryption
- direct QUIC path attempt
- relay fallback
- regional relay selection
```
## Explicit non-goals
For MVP, do not build:
```text
- Npcap mode
- WinDivert mode
- source-IP rewriting
- Windows bridge
- Hyper-V virtual switch
- WireGuard underlay
- custom Ethernet fragmentation
- Wi-Fi LAN gateway support
- full internet VPN mode
```
## One-sentence version
Build a **Rust Windows TAP client + public QUIC relay + Linux AF_PACKET gateway** that carries one small-MTU Ethernet frame per QUIC datagram, gives each remote player a unique virtual MAC on the real LAN, filters dangerous L2 control traffic, and keeps the physical LAN gateway as the only machine touching the real LAN.
[1]: https://github.com/OpenVPN/tap-windows6 "OpenVPN/tap-windows6: Windows TAP driver (NDIS 6)"
[2]: https://man7.org/linux/man-pages/man7/packet.7.html "packet(7) - Linux manual page"
[3]: https://man7.org/linux/man-pages/man7/capabilities.7.html "capabilities(7) - Linux manual page"
[4]: https://datatracker.ietf.org/doc/html/rfc9221 "RFC 9221 - An Unreliable Datagram Extension to QUIC"
[5]: https://docs.rs/quinn/latest/quinn/struct.Connection.html "Connection in quinn - Rust"
I want a mono-repo with Rust code, all crates in a "crates" folder, and one Cargo workspace.