ddidderr 1923ff2a6f

feat: support parallel multi-file uploads

The browser upload flow was built around one selected file and one global
upload state. That made the existing chunk pool useful for a single file, but
users could not start several selected files at the same time.

Refactor the browser state into per-file upload items. Each selected file now
has its own upload record, completed-chunk set, abort controller, retry state,
progress row, and saved IndexedDB resume record. The picker accepts multiple
files, `Start all` and `Resume all` use a bounded file-level pool, and each file
keeps the existing bounded chunk pool. This keeps parallel uploads useful
without letting one large selection create unbounded request fan-out.

Keep the server API unchanged. Each file still receives a separate server upload
id, and server-side progress remains authoritative before any missing chunks are
scheduled. Terminal conflicts still stop the affected file without overwriting
completed data.

Update the user-facing markup, styles, project docs, and test checklist for the
multi-file scheduler. Add a server regression test that interleaves two uploads
and verifies the completed files contain exactly their own bytes.

Test Plan:
- just check
- git diff --check

2026-05-30 18:32:29 +02:00


Resumable Upload Plan

Goal

Build a small personal web app for uploading large files without losing progress when the network drops, the tab closes, or the Rust server restarts.

The final deployment is:

browser -> nginx -> upl Rust server -> local filesystem

The program should stay simple:

one Rust server binary
one static browser UI
no database server
no frontend framework
no Tus/Uppy/Resumable.js for the first version
local filesystem metadata as the source of truth

Top-Level Design

Browser

The browser owns file selection and chunk scheduling.

Let the user pick one or more files.
Slice each file into fixed-size chunks with Blob.slice().
Upload a few files concurrently, with a separate chunk pool per file.
Retry failed chunks with exponential backoff.
Persist pending upload state in IndexedDB.
Use the File System Access API when available so the same local file can be reopened after a browser restart without making the user browse to it again.

nginx

nginx owns TLS, external access control, and reverse proxying.

Bind the Rust server to localhost only.
Terminate HTTPS in nginx.
Restrict access (for example with HTTP basic auth, an IP allowlist, or VPN-only access), because this is a personal upload tool.
Forward upload API requests to the Rust server without buffering whole request bodies before they reach Rust.

Rust Server

The Rust server owns upload identity, chunk validation, progress reporting, and final assembly.

Serve the static page.
Create upload records.
Accept raw binary chunk bodies.
Store chunks on disk as they arrive.
Report which chunks already exist.
Assemble chunks into the final file once all chunks are present.

Storage Layout

data/
  staging/
    <upload_id>/
      meta.json
      chunks/
        000000.part
        000001.part
        000002.part
  complete/
    <safe_file_name>

meta.json is the durable upload record:

{
  "id": "random-server-id",
  "original_name": "movie.mkv",
  "safe_name": "movie.mkv",
  "size": 1234567890,
  "last_modified": 1760000000000,
  "chunk_size": 16777216,
  "total_chunks": 74,
  "created_at": "2026-05-30T16:00:00Z"
}

The server should generate upload_id. The browser should not invent the primary upload identity from file metadata. File name, size, and modified time are useful for display and sanity checks, but they are not unique enough to be the durable server identity.
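The layout above maps naturally onto a few path helpers. The following is a minimal sketch, not the project's actual storage.rs: the function names and the ID character check are assumptions made for illustration. It shows the six-digit, zero-padded chunk naming and a conservative check that an ID taken from a URL cannot escape the staging directory.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical check: accept only non-empty IDs made of ASCII letters,
/// digits, '-' and '_', so an ID from the URL can never contain a path
/// separator or "..".
fn is_plausible_upload_id(id: &str) -> bool {
    !id.is_empty()
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}

/// <root>/staging/<upload_id>
fn staging_dir(root: &Path, upload_id: &str) -> PathBuf {
    root.join("staging").join(upload_id)
}

/// <root>/staging/<upload_id>/meta.json
fn meta_path(root: &Path, upload_id: &str) -> PathBuf {
    staging_dir(root, upload_id).join("meta.json")
}

/// <root>/staging/<upload_id>/chunks/NNNNNN.part (index zero-padded to six digits)
fn chunk_path(root: &Path, upload_id: &str, index: u64) -> PathBuf {
    staging_dir(root, upload_id)
        .join("chunks")
        .join(format!("{index:06}.part"))
}
```

Keeping these helpers in one place fits the plan's rule that a single module knows the on-disk layout.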

HTTP API

Keep the API small and boring.

GET  /
POST /api/uploads
GET  /api/uploads/:id
PUT  /api/uploads/:id/chunks/:index
POST /api/uploads/:id/complete

Create Upload

POST /api/uploads

Request:

{
  "name": "movie.mkv",
  "size": 1234567890,
  "last_modified": 1760000000000
}

Response:

{
  "upload_id": "random-server-id",
  "chunk_size": 16777216,
  "total_chunks": 74,
  "completed_chunks": []
}

Start with a fixed chunk size of 16 MiB. This keeps request count reasonable while making failed chunks cheap enough to retry.
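The chunk count in the example response follows from ceiling division of the file size by the chunk size. A small sketch of that arithmetic, with hypothetical function names and the assumptions that chunk_size is non-zero and that an empty file has zero chunks (the plan does not say how empty files are handled):

```rust
/// 16 MiB, the fixed chunk size proposed in the plan.
const CHUNK_SIZE: u64 = 16 * 1024 * 1024;

/// Number of chunks needed for `size` bytes (ceiling division).
/// Assumes `chunk_size > 0`; an empty file yields zero chunks.
fn total_chunks(size: u64, chunk_size: u64) -> u64 {
    if size == 0 {
        0
    } else {
        (size - 1) / chunk_size + 1
    }
}

/// Expected byte length of chunk `index`, or None if the index is out of range.
/// Every chunk is `chunk_size` bytes except the last, which may be shorter.
fn expected_chunk_len(size: u64, chunk_size: u64, index: u64) -> Option<u64> {
    if index >= total_chunks(size, chunk_size) {
        return None;
    }
    let start = index * chunk_size;
    Some((size - start).min(chunk_size))
}
```

For the 1234567890-byte example, this gives 74 chunks, with the final chunk (index 73) holding the remaining 9831122 bytes.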

Query Progress

GET /api/uploads/:id

Response:

{
  "upload_id": "random-server-id",
  "name": "movie.mkv",
  "size": 1234567890,
  "chunk_size": 16777216,
  "total_chunks": 74,
  "completed_chunks": [0, 1, 2, 5]
}

The server can compute completed_chunks by scanning the chunk directory and checking file lengths. This avoids needing a database.
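A directory scan along these lines could look as follows. This is a synchronous, std-only sketch under stated assumptions: the function name and signature are hypothetical, chunk files follow the NNNNNN.part naming from the storage layout, and chunk_size is non-zero. A real handler would likely use the async filesystem API instead.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the sorted indexes of chunks that exist on disk with the expected length.
/// Files that are not named like `000123.part` (for example `000123.part.tmp`)
/// and chunks with an unexpected length are ignored.
fn completed_chunks(chunks_dir: &Path, size: u64, chunk_size: u64) -> io::Result<Vec<u64>> {
    let total = if size == 0 { 0 } else { (size - 1) / chunk_size + 1 };
    let mut done = Vec::new();
    for entry in fs::read_dir(chunks_dir)? {
        let entry = entry?;
        let name = entry.file_name();
        let Some(name) = name.to_str() else { continue };
        let Some(stem) = name.strip_suffix(".part") else { continue };
        if stem.is_empty() || !stem.bytes().all(|b| b.is_ascii_digit()) {
            continue;
        }
        let Ok(index) = stem.parse::<u64>() else { continue };
        if index >= total {
            continue;
        }
        // Every chunk is full-size except possibly the last one.
        let expected = (size - index * chunk_size).min(chunk_size);
        if entry.metadata()?.len() == expected {
            done.push(index);
        }
    }
    done.sort_unstable();
    Ok(done)
}
```

Because temporary files never end in .part, a chunk that was interrupted mid-write is not reported as complete.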

Upload Chunk

PUT /api/uploads/:id/chunks/:index

Use a raw request body:

Content-Type: application/octet-stream

Do not use multipart form uploads for chunks in the minimal version. Raw bytes make the Rust handler simpler and avoid multipart parsing.

Server rules:

reject unknown upload IDs
reject out-of-range chunk indexes
reject chunks with the wrong length
allow the final chunk to be shorter than chunk_size
write to 000123.part.tmp first
rename the temp file to 000123.part only after the write succeeds
make duplicate chunk uploads idempotent when the existing chunk has the expected length
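The length check, temp-file write, rename, and idempotent duplicate handling can be sketched together. The names below (ChunkOutcome, store_chunk) are hypothetical, and the sketch makes two simplifying assumptions: the caller has already looked up the upload and computed expected_len for an in-range index, and the whole body is available as a byte slice, whereas a real handler would probably stream it to disk.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

#[derive(Debug, PartialEq)]
enum ChunkOutcome {
    /// The chunk was written and renamed into place.
    Stored,
    /// A chunk with the expected length already existed; nothing was written.
    AlreadyPresent,
}

fn store_chunk(
    chunks_dir: &Path,
    index: u64,
    expected_len: u64,
    body: &[u8],
) -> io::Result<ChunkOutcome> {
    // Reject chunks with the wrong length before touching the disk.
    if body.len() as u64 != expected_len {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "chunk has the wrong length",
        ));
    }
    let final_path = chunks_dir.join(format!("{index:06}.part"));
    let tmp_path = chunks_dir.join(format!("{index:06}.part.tmp"));

    // Duplicate uploads are idempotent when the existing chunk looks complete.
    if let Ok(meta) = fs::metadata(&final_path) {
        if meta.len() == expected_len {
            return Ok(ChunkOutcome::AlreadyPresent);
        }
    }

    // Write to the temp file first, flush it to disk, then rename into place.
    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(body)?;
    tmp.sync_all()?;
    fs::rename(&tmp_path, &final_path)?;
    Ok(ChunkOutcome::Stored)
}
```

Note that two simultaneous uploads of the same chunk would share one temp path in this sketch; a per-request temp name would avoid that if it turns out to matter.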

Complete Upload

POST /api/uploads/:id/complete

The server should:

Load meta.json.
Verify every expected chunk exists.
Verify every chunk has the expected length.
Concatenate chunks in order into a temp final file.
Rename the temp final file into data/complete/.
Return the final file path or download URL.

The server should not delete staging data until assembly succeeds.
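The assembly steps could be sketched as below. This is an illustrative, synchronous version with a hypothetical function name and signature; it assumes a non-zero chunk_size, a temp file named <safe_name>.tmp, and that an existing final file should cause an error rather than be overwritten. It deliberately leaves the staging directory alone, so the caller can delete it only after this returns Ok.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Verifies all chunks, concatenates them in order into a temp file, and
/// renames it into `complete_dir`. Returns the number of bytes written.
fn assemble(
    chunks_dir: &Path,
    complete_dir: &Path,
    safe_name: &str,
    size: u64,
    chunk_size: u64,
) -> io::Result<u64> {
    let total = if size == 0 { 0 } else { (size - 1) / chunk_size + 1 };

    // Verify every chunk exists with the expected length before writing anything.
    for index in 0..total {
        let expected = (size - index * chunk_size).min(chunk_size);
        let len = fs::metadata(chunks_dir.join(format!("{index:06}.part")))?.len();
        if len != expected {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!("chunk {index} has length {len}, expected {expected}"),
            ));
        }
    }

    fs::create_dir_all(complete_dir)?;
    let final_path = complete_dir.join(safe_name);
    if final_path.exists() {
        // Assumption: never overwrite a previously completed file.
        return Err(io::Error::new(
            io::ErrorKind::AlreadyExists,
            "final file already exists",
        ));
    }

    let tmp_path = complete_dir.join(format!("{safe_name}.tmp"));
    let mut out = BufWriter::new(File::create(&tmp_path)?);
    let mut written = 0u64;
    for index in 0..total {
        let mut chunk = File::open(chunks_dir.join(format!("{index:06}.part")))?;
        written += io::copy(&mut chunk, &mut out)?;
    }
    out.flush()?;
    out.get_ref().sync_all()?;
    fs::rename(&tmp_path, &final_path)?;
    Ok(written)
}
```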

Resume Flow

First Upload

User selects one or more files.
Browser creates one selected upload row per file.
Browser calls POST /api/uploads once for each file being started.
Browser stores each returned upload_id and file handle in IndexedDB.
Browser uploads missing chunks with bounded file and chunk concurrency pools.
Browser calls /complete for each file when all of its chunks are uploaded.

After Interruption

Browser loads pending upload records from IndexedDB.
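Browser calls GET /api/uploads/:id for each pending record.
Browser asks for read permission on the saved file handle.
Browser compares the server's completed_chunks with the full index range 0 to total_chunks - 1 to find the missing chunks.
Browser uploads only missing chunks.
Browser calls /complete for each file once all of its chunks are present.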

The server is authoritative. Browser state helps find the file again, but server state decides what has actually been uploaded.
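Working out what is left to upload is a set difference between the full index range and the server's completed_chunks. In the plan this runs in the browser; the sketch below expresses the same logic in Rust purely for illustration, with a hypothetical function name.

```rust
use std::collections::HashSet;

/// Indexes in 0..total_chunks that the server has not reported as completed.
fn missing_chunks(total_chunks: u64, completed: &[u64]) -> Vec<u64> {
    let done: HashSet<u64> = completed.iter().copied().collect();
    (0..total_chunks).filter(|i| !done.contains(i)).collect()
}
```

For example, with 8 chunks in total and completed_chunks of [0, 1, 2, 5], the missing list is [3, 4, 6, 7]. An empty result means the upload is ready for /complete.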

Browser State

IndexedDB record:

{
  "upload_id": "random-server-id",
  "name": "movie.mkv",
  "size": 1234567890,
  "last_modified": 1760000000000,
  "chunk_size": 16777216,
  "total_chunks": 74,
  "file_handle": "<FileSystemFileHandle>",
  "updated_at": "2026-05-30T16:00:00Z"
}

If showOpenFilePicker() is unavailable, fall back to a normal <input type="file">. That fallback can still resume server-side progress, but the user must reselect the same file after a page reload.

Upload Scheduler

Start with these defaults:

chunk size: 16 MiB
file concurrency: 3
chunk concurrency per file: 3
max retries per chunk: 5

The scheduler should support:

pause with AbortController
resume by rebuilding the missing chunk list
retry with exponential backoff
visible progress based on verified completed chunks

Progress should be based on chunks the server has accepted, not bytes merely sent by the browser.
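The retry delay can be computed by doubling a base delay on each attempt and capping the result. The plan does not specify a base delay or cap, so the 500 ms and 30 s values in the example afterwards are assumptions, as is the function name. The browser scheduler would do the same arithmetic in JavaScript; it is shown in Rust here only as a sketch.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `cap`.
/// Overflow in either the power or the multiplication falls back to the cap.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}
```

With a 500 ms base and a 30 s cap, the five retries allowed by the defaults would wait 0.5 s, 1 s, 2 s, 4 s, and 8 s. Adding random jitter is a common refinement that this sketch omits.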

nginx Requirements

Example shape:

server {
    listen 443 ssl;
    server_name uploads.example.com;

    client_max_body_size 64m;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_request_buffering off;
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
    }
}

Notes:

client_max_body_size only needs to exceed the maximum single chunk size, not the full file size.
proxy_request_buffering off lets the Rust server receive upload bodies directly instead of waiting for nginx to buffer the whole chunk first.
Long timeouts are useful for slow links and large chunks.
Add HTTP basic auth, an IP allowlist, VPN-only access, or another protection layer before exposing this publicly.

Rust Implementation Shape

Suggested crates:

axum for HTTP routing
tokio for async runtime and filesystem operations
serde and serde_json for metadata
uuid or nanoid for upload IDs
tower-http for static file serving

Suggested modules:

src/
  main.rs
  api.rs
  storage.rs
  model.rs
  static_files.rs

storage.rs should be the only module that knows the on-disk layout.

Validation

Manual checks for the MVP:

upload a small file in one pass
upload a file larger than one chunk
kill the browser tab mid-upload and resume
restart the Rust server mid-upload and resume
interrupt the network and resume
retry a duplicate chunk and confirm it is accepted idempotently
attempt an invalid chunk index and confirm it is rejected
attempt a wrong-size non-final chunk and confirm it is rejected
complete an upload and compare the final file with the source file

Useful checksum command:

sha256sum source-file data/complete/uploaded-file

Milestones

Serve a static page from Rust.
Add upload creation and on-disk metadata.
Add raw chunk upload and chunk validation.
Add progress query from existing chunk files.
Add browser chunk slicing and concurrency.
Add IndexedDB state.
Add File System Access API resume.
Add completion assembly.
Put the server behind nginx and verify resume still works.

Explicit Non-Goals For The First Version

multi-user accounts
cloud object storage
encryption at rest
background service worker upload
content-addressed deduplication
full-file hashing before upload
Tus protocol compatibility
drag-and-drop polish
mobile browser support

These can be added later if they become useful, but they are unnecessary for a correct personal uploader.
