4527e23b8b

Document the minimal design for a personal large-file upload app that can resume after browser, network, or server interruptions.

The plan keeps the first version intentionally small: one Rust server, one static browser UI, filesystem-backed upload metadata, raw chunk uploads, and no database or third-party resumable upload protocol.

The deployment notes include nginx as the external TLS and access-control layer, with the Rust server bound behind it and upload-specific proxy settings called out.

Test Plan:
- git diff --cached --check

Refs: user request

# Resumable Upload Plan

## Goal

Build a small personal web app for uploading large files without losing
progress when the network drops, the tab closes, or the Rust server restarts.

The final deployment is:

```text
browser -> nginx -> upl Rust server -> local filesystem
```

The program should stay simple:

- one Rust server binary
- one static browser UI
- no database server
- no frontend framework
- no Tus/Uppy/Resumable.js for the first version
- local filesystem metadata as the source of truth

## Top-Level Design

### Browser

The browser owns file selection and chunk scheduling.

- Let the user pick one file.
- Slice it into fixed-size chunks with `Blob.slice()`.
- Upload a few chunks concurrently.
- Retry failed chunks with exponential backoff.
- Persist pending upload state in IndexedDB.
- Use the File System Access API when available so the same local file can be
  reopened after a browser restart without making the user browse to it again.

### nginx

nginx owns TLS, external access control, and reverse proxying.

- Bind the Rust server to localhost only.
- Terminate HTTPS in nginx.
- Protect the app because it is a personal upload tool.
- Forward upload API requests to the Rust server without buffering whole request
  bodies before they reach Rust.

### Rust Server

The Rust server owns upload identity, chunk validation, progress reporting, and
final assembly.

- Serve the static page.
- Create upload records.
- Accept raw binary chunk bodies.
- Store chunks on disk as they arrive.
- Report which chunks already exist.
- Assemble chunks into the final file once all chunks are present.

## Storage Layout

```text
data/
  staging/
    <upload_id>/
      meta.json
      chunks/
        000000.part
        000001.part
        000002.part
  complete/
    <safe_file_name>
```

`meta.json` is the durable upload record:

```json
{
  "id": "random-server-id",
  "original_name": "movie.mkv",
  "safe_name": "movie.mkv",
  "size": 1234567890,
  "last_modified": 1760000000000,
  "chunk_size": 16777216,
  "total_chunks": 74,
  "created_at": "2026-05-30T16:00:00Z"
}
```

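
The record keeps both `original_name` and `safe_name`, but the plan does not say how the safe name is derived. The sketch below is one possible approach and is purely an assumption: the function name `safe_file_name`, the allowed character set, and the `upload.bin` fallback are all hypothetical choices, not part of the design.

```rust
/// Hypothetical helper (not specified by the plan): derive a conservative
/// on-disk file name from a client-supplied name.
pub fn safe_file_name(original: &str) -> String {
    // Keep only the last path component so "../../etc/passwd" cannot escape
    // the target directory. Both '/' and '\' are treated as separators.
    let base = original
        .rsplit(|c: char| c == '/' || c == '\\')
        .next()
        .unwrap_or("");

    // Replace anything outside a small ASCII allowlist with '_'.
    let cleaned: String = base
        .chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '_') {
                c
            } else {
                '_'
            }
        })
        .collect();

    // Strip leading dots so the result is never ".", "..", or a hidden file.
    let trimmed = cleaned.trim_start_matches('.');
    if trimmed.is_empty() {
        // Assumed fallback name for inputs that sanitize to nothing.
        "upload.bin".to_string()
    } else {
        trimmed.to_string()
    }
}
```

A real implementation would probably also cap the name length and decide what to do when `data/complete/` already contains a file with the same safe name; the plan leaves both open.
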
The server should generate `upload_id`. The browser should not invent the
primary upload identity from file metadata. File name, size, and modified time
are useful for display and sanity checks, but they are not unique enough to be
the durable server identity.

## HTTP API

Keep the API small and boring.

```text
GET  /
POST /api/uploads
GET  /api/uploads/:id
PUT  /api/uploads/:id/chunks/:index
POST /api/uploads/:id/complete
```

### Create Upload

`POST /api/uploads`

Request:

```json
{
  "name": "movie.mkv",
  "size": 1234567890,
  "last_modified": 1760000000000
}
```

Response:

```json
{
  "upload_id": "random-server-id",
  "chunk_size": 16777216,
  "total_chunks": 74,
  "completed_chunks": []
}
```

Start with a fixed chunk size of 16 MiB. This keeps request count reasonable
while making failed chunks cheap enough to retry.

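
The `total_chunks` value in the response follows from `size` and `chunk_size` by ceiling division, and the same arithmetic gives the byte range the browser would pass to `Blob.slice()` for each index. A minimal sketch, with hypothetical function names `total_chunks` and `chunk_range`:

```rust
/// Number of chunks needed to cover `size` bytes (ceiling division).
/// An empty file yields zero chunks.
pub fn total_chunks(size: u64, chunk_size: u64) -> u64 {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    size / chunk_size + u64::from(size % chunk_size != 0)
}

/// Half-open byte range `[start, end)` covered by chunk `index`,
/// or `None` when the index is out of range.
pub fn chunk_range(size: u64, chunk_size: u64, index: u64) -> Option<(u64, u64)> {
    if index >= total_chunks(size, chunk_size) {
        return None;
    }
    // Cannot overflow: index < total_chunks implies index * chunk_size < size.
    let start = index * chunk_size;
    // Only the final chunk is clipped by the file size.
    let end = start.saturating_add(chunk_size).min(size);
    Some((start, end))
}
```

For the example file, 1234567890 bytes at 16777216 bytes per chunk gives 74 chunks; chunk 73 covers bytes 1224736768 to 1234567890, so it is 9831122 bytes long.
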
### Query Progress

`GET /api/uploads/:id`

Response:

```json
{
  "upload_id": "random-server-id",
  "name": "movie.mkv",
  "size": 1234567890,
  "chunk_size": 16777216,
  "total_chunks": 74,
  "completed_chunks": [0, 1, 2, 5]
}
```

The server can compute `completed_chunks` by scanning the chunk directory and
checking file lengths. This avoids needing a database.

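
The directory scan could look roughly like the sketch below. It uses only the standard library and blocking I/O for brevity; the function names and the strict six-digit file name check are assumptions layered on the storage layout shown earlier.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Expected length of chunk `index`, or `None` if the index is out of range.
fn expected_len(size: u64, chunk_size: u64, index: u64) -> Option<u64> {
    let start = index.checked_mul(chunk_size)?;
    if start >= size {
        return None;
    }
    Some((size - start).min(chunk_size))
}

/// Scan `chunks_dir` and return the sorted indexes of chunks whose files
/// exist with exactly the expected length.
pub fn completed_chunks(chunks_dir: &Path, size: u64, chunk_size: u64) -> io::Result<Vec<u64>> {
    let mut done = Vec::new();
    for entry in fs::read_dir(chunks_dir)? {
        let entry = entry?;
        let name = entry.file_name();
        // Only `NNNNNN.part` counts; `.part.tmp` files are unfinished writes.
        let Some(index) = name
            .to_str()
            .and_then(|n| n.strip_suffix(".part"))
            .filter(|stem| stem.len() == 6 && stem.bytes().all(|b| b.is_ascii_digit()))
            .and_then(|stem| stem.parse::<u64>().ok())
        else {
            continue;
        };
        // A file with the wrong length is treated as missing.
        if expected_len(size, chunk_size, index) == Some(entry.metadata()?.len()) {
            done.push(index);
        }
    }
    done.sort_unstable();
    Ok(done)
}
```

Because the result is derived from the files on disk each time, it stays correct across server restarts without any extra bookkeeping.
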
### Upload Chunk

`PUT /api/uploads/:id/chunks/:index`

Use a raw request body:

```http
Content-Type: application/octet-stream
```

Do not use multipart form uploads for chunks in the minimal version. Raw bytes
make the Rust handler simpler and avoid multipart parsing.

Server rules:

- reject unknown upload IDs
- reject out-of-range chunk indexes
- reject chunks with the wrong length
- allow the final chunk to be shorter than `chunk_size`; its expected length is
  `size - (total_chunks - 1) * chunk_size`
- write to `000123.part.tmp` first
- rename the temp file to `000123.part` only after the write succeeds
- make duplicate chunk uploads idempotent when the existing chunk has the
  expected length

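
The write-then-rename and idempotency rules above can be sketched as follows. This is a simplified illustration, not the handler itself: it takes the whole chunk as a byte slice and uses blocking `std::fs`, whereas a real handler would likely stream the request body. The function name `store_chunk` and its parameters are hypothetical, and range checking of `index` is assumed to have happened already.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Store one chunk under `chunks_dir` following the temp-file-then-rename rule.
/// `expected_len` is assumed to come from the upload's metadata.
pub fn store_chunk(chunks_dir: &Path, index: u64, expected_len: u64, body: &[u8]) -> io::Result<()> {
    // Reject chunks with the wrong length before touching the disk.
    if body.len() as u64 != expected_len {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "chunk has the wrong length",
        ));
    }

    let final_path = chunks_dir.join(format!("{index:06}.part"));

    // Idempotent duplicate: an existing chunk of the expected length is kept.
    if let Ok(meta) = fs::metadata(&final_path) {
        if meta.len() == expected_len {
            return Ok(());
        }
    }

    // Write to the temp name first so a crash never leaves a partial `.part`.
    let tmp_path = chunks_dir.join(format!("{index:06}.part.tmp"));
    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(body)?;
    tmp.sync_all()?;

    // Publish the chunk only after the write has fully succeeded.
    fs::rename(&tmp_path, &final_path)
}
```

One limitation of this sketch: two simultaneous uploads of the same index would share a temp file name, so a real server may want a per-request temp suffix or a per-chunk lock.
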
### Complete Upload

`POST /api/uploads/:id/complete`

The server should:

1. Load `meta.json`.
2. Verify every expected chunk exists.
3. Verify every chunk has the expected length.
4. Concatenate chunks in order into a temp final file.
5. Rename the temp final file into `data/complete/`.
6. Return the final file path or download URL.

The server should not delete staging data until assembly succeeds.

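
Steps 4 and 5 can be sketched with the standard library as shown below. The function name `assemble`, its parameters, and the `.tmp` suffix for the temp final file are assumptions. Instead of checking each chunk length separately, this version checks that the total number of bytes copied equals `size`; a missing chunk surfaces as an error from `File::open`.

```rust
use std::fs::{self, File};
use std::io;
use std::path::{Path, PathBuf};

/// Concatenate chunks `0..total_chunks` into `complete_dir/safe_name`.
/// Staging data is left untouched; the caller removes it after success.
pub fn assemble(
    chunks_dir: &Path,
    total_chunks: u64,
    size: u64,
    complete_dir: &Path,
    safe_name: &str,
) -> io::Result<PathBuf> {
    // The temp file lives in the destination directory so the final rename
    // stays on one filesystem. A leftover temp file from a failed run is
    // simply truncated and overwritten here.
    let tmp_path = complete_dir.join(format!("{safe_name}.tmp"));
    let final_path = complete_dir.join(safe_name);

    let mut out = File::create(&tmp_path)?;
    let mut written = 0u64;
    for index in 0..total_chunks {
        // A missing chunk makes `File::open` fail and aborts the assembly.
        let mut chunk = File::open(chunks_dir.join(format!("{index:06}.part")))?;
        written += io::copy(&mut chunk, &mut out)?;
    }

    // Refuse to publish a file whose length disagrees with the metadata.
    if written != size {
        drop(out);
        let _ = fs::remove_file(&tmp_path);
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "assembled size does not match metadata",
        ));
    }

    out.sync_all()?;
    fs::rename(&tmp_path, &final_path)?;
    Ok(final_path)
}
```
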
## Resume Flow

### First Upload

1. User selects a file.
2. Browser calls `POST /api/uploads`.
3. Browser stores the returned `upload_id` and file handle in IndexedDB.
4. Browser uploads missing chunks with a small concurrency pool.
5. Browser calls `/complete` when all chunks are uploaded.

### After Interruption

1. Browser loads pending upload records from IndexedDB.
2. Browser calls `GET /api/uploads/:id`.
3. Browser asks for read permission on the saved file handle.
4. Browser compares server `completed_chunks` with total chunks.
5. Browser uploads only missing chunks.
6. Browser calls `/complete`.

The server is authoritative. Browser state helps find the file again, but
server state decides what has actually been uploaded.

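
The comparison in step 4 of the interruption flow is a set difference between all chunk indexes and the server's `completed_chunks`. The logic is shown here in Rust to match the server code; the browser would do the equivalent in JavaScript. The function name `missing_chunks` is hypothetical.

```rust
use std::collections::HashSet;

/// Indexes in `0..total_chunks` that the server has not reported as complete.
/// Entries in `completed` that are out of range are ignored.
pub fn missing_chunks(total_chunks: u64, completed: &[u64]) -> Vec<u64> {
    let done: HashSet<u64> = completed.iter().copied().collect();
    (0..total_chunks).filter(|i| !done.contains(i)).collect()
}
```

With `total_chunks = 8` and `completed_chunks = [0, 1, 2, 5]`, the chunks still to upload are 3, 4, 6, and 7. An empty result means the browser can go straight to `/complete`.
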
## Browser State

IndexedDB record:

```json
{
  "upload_id": "random-server-id",
  "name": "movie.mkv",
  "size": 1234567890,
  "last_modified": 1760000000000,
  "chunk_size": 16777216,
  "total_chunks": 74,
  "file_handle": "<FileSystemFileHandle>",
  "updated_at": "2026-05-30T16:00:00Z"
}
```

If `showOpenFilePicker()` is unavailable, fall back to a normal
`<input type="file">`. That fallback can still resume server-side progress, but
the user must reselect the same file after a page reload.

## Upload Scheduler

Start with these defaults:

```text
chunk size: 16 MiB
concurrency: 3
max retries per chunk: 5
```

The scheduler should support:

- pause with `AbortController`
- resume by rebuilding the missing chunk list
- retry with exponential backoff
- visible progress based on verified completed chunks

Progress should be based on chunks the server has accepted, not bytes merely
sent by the browser.

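
The plan fixes the retry limit at 5 but does not give backoff timings. The sketch below shows one way to compute the delay before each retry, written in Rust for consistency with the other examples even though the scheduler runs in the browser. The base delay, the cap, and the names `MAX_RETRIES` and `backoff_delay` are assumptions; random jitter, which is often added to backoff, is left out to keep the example dependency-free.

```rust
use std::time::Duration;

/// Retry limit taken from the scheduler defaults.
pub const MAX_RETRIES: u32 = 5;

/// Delay to wait before retry number `attempt` (1-based), doubling each time
/// and clamped to `cap`. Returns `None` once the retry budget is exhausted.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Option<Duration> {
    if attempt == 0 || attempt > MAX_RETRIES {
        return None;
    }
    // 1, 2, 4, 8, 16 for attempts 1 through 5.
    let factor = 1u32 << (attempt - 1);
    // If the multiplication overflows, fall back to the cap.
    Some(base.checked_mul(factor).map_or(cap, |d| d.min(cap)))
}
```

With an assumed base of 500 ms and a cap of 10 s, the five retries wait 0.5 s, 1 s, 2 s, 4 s, and 8 s; a sixth failure returns `None`, at which point the chunk would be reported as failed.
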
## nginx Requirements

Example shape:

```nginx
server {
    listen 443 ssl;
    server_name uploads.example.com;

    client_max_body_size 64m;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_request_buffering off;
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
    }
}
```

Notes:

- `client_max_body_size` only needs to exceed the maximum single chunk size, not
  the full file size.
- `proxy_request_buffering off` lets the Rust server receive upload bodies
  directly instead of waiting for nginx to buffer the whole chunk first.
- Long timeouts are useful for slow links and large chunks.
- Add HTTP basic auth, an IP allowlist, VPN-only access, or another protection
  layer before exposing this publicly.

## Rust Implementation Shape

Suggested crates:

- `axum` for HTTP routing
- `tokio` for async runtime and filesystem operations
- `serde` and `serde_json` for metadata
- `uuid` or `nanoid` for upload IDs
- `tower-http` for static file serving

Suggested modules:

```text
src/
  main.rs
  api.rs
  storage.rs
  model.rs
  static_files.rs
```

`storage.rs` should be the only module that knows the on-disk layout.

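
One way to keep the layout knowledge in a single place is a small path-building type that the rest of the server calls instead of joining paths by hand. The sketch below is an assumption about what `storage.rs` might contain: the `Layout` type, its method names, and the upload ID character check are all hypothetical. The directory names come from the storage layout shown earlier.

```rust
use std::path::PathBuf;

/// Hypothetical ID check: only characters that `uuid` or `nanoid` style IDs
/// use, so an ID taken from the URL can never contain a path separator.
fn is_valid_upload_id(id: &str) -> bool {
    !id.is_empty()
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}

/// Single source of truth for where things live under the data root.
pub struct Layout {
    root: PathBuf,
}

impl Layout {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }

    /// `<root>/staging/<upload_id>`, or `None` for a malformed ID.
    pub fn upload_dir(&self, upload_id: &str) -> Option<PathBuf> {
        is_valid_upload_id(upload_id).then(|| self.root.join("staging").join(upload_id))
    }

    /// `<root>/staging/<upload_id>/meta.json`
    pub fn meta_path(&self, upload_id: &str) -> Option<PathBuf> {
        Some(self.upload_dir(upload_id)?.join("meta.json"))
    }

    /// `<root>/staging/<upload_id>/chunks`
    pub fn chunks_dir(&self, upload_id: &str) -> Option<PathBuf> {
        Some(self.upload_dir(upload_id)?.join("chunks"))
    }

    /// `<root>/staging/<upload_id>/chunks/NNNNNN.part`
    pub fn chunk_path(&self, upload_id: &str, index: u64) -> Option<PathBuf> {
        Some(self.chunks_dir(upload_id)?.join(format!("{index:06}.part")))
    }

    /// `<root>/complete/<safe_name>`; the caller must pass a sanitized name.
    pub fn complete_path(&self, safe_name: &str) -> PathBuf {
        self.root.join("complete").join(safe_name)
    }
}
```

Returning `None` for a malformed ID gives the API layer a natural place to apply the "reject unknown upload IDs" rule before any filesystem access happens.
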
## Validation

Manual checks for the MVP:

- upload a small file in one pass
- upload a file larger than one chunk
- kill the browser tab mid-upload and resume
- restart the Rust server mid-upload and resume
- interrupt the network and resume
- retry a duplicate chunk and confirm it is accepted idempotently
- attempt an invalid chunk index and confirm it is rejected
- attempt a wrong-size non-final chunk and confirm it is rejected
- complete an upload and compare the final file with the source file

Useful checksum command:

```sh
sha256sum source-file data/complete/uploaded-file
```

## Milestones

1. Serve a static page from Rust.
2. Add upload creation and on-disk metadata.
3. Add raw chunk upload and chunk validation.
4. Add progress query from existing chunk files.
5. Add browser chunk slicing and concurrency.
6. Add IndexedDB state.
7. Add File System Access API resume.
8. Add completion assembly.
9. Put the server behind nginx and verify resume still works.

## Explicit Non-Goals For The First Version

- multi-user accounts
- cloud object storage
- encryption at rest
- background service worker upload
- content-addressed deduplication
- full-file hashing before upload
- Tus protocol compatibility
- drag-and-drop polish
- mobile browser support

These can be added later if they become useful, but they are unnecessary for a
correct personal uploader.