feat: write chunks directly to temp upload files

Completed uploads used to copy every staged chunk into a second file before
renaming the result into data/complete. That doubled write volume and required
peak disk space for both the chunk set and the final file.

Write each chunk directly into one private temp upload file at its final offset
instead. After a chunk write succeeds, record a tiny durable completion marker
for progress and resume scans. Completion now verifies the temp file length and
all markers, then renames the temp file into the completed upload directory.
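
The flow above can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the code in `src/storage.rs`: the file name `upload.part`, the `<index>.done` marker naming, the tiny `CHUNK_SIZE`, and both function names are assumptions made for the example.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Seek, SeekFrom, Write};
use std::path::Path;

/// Hypothetical chunk size for this sketch; the real server uses its own value.
const CHUNK_SIZE: u64 = 4;

/// Write one chunk at its final offset, then record a completion marker.
/// `upload.part` and `<index>.done` are assumed names, not the project's layout.
fn write_chunk(temp_dir: &Path, index: u64, data: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(false) // keep chunks that were already written
        .open(temp_dir.join("upload.part"))?;
    file.seek(SeekFrom::Start(index * CHUNK_SIZE))?;
    file.write_all(data)?;
    file.sync_all()?; // flush chunk data before the marker claims it exists

    // The marker is created only after the chunk write has succeeded.
    let marker = File::create(temp_dir.join(format!("{index}.done")))?;
    marker.sync_all()
}

/// Verify the temp file length and every marker, then promote with a rename.
fn complete(temp_dir: &Path, final_path: &Path, total_len: u64) -> io::Result<()> {
    let temp_path = temp_dir.join("upload.part");
    if fs::metadata(&temp_path)?.len() != total_len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "length mismatch"));
    }
    let chunk_count = (total_len + CHUNK_SIZE - 1) / CHUNK_SIZE;
    for index in 0..chunk_count {
        if !temp_dir.join(format!("{index}.done")).exists() {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "missing marker"));
        }
    }
    fs::rename(&temp_path, final_path)
}
```

In this sketch the markers carry real weight: writing the last chunk first already extends the temp file to its full length, so a length check alone cannot tell a finished upload from one with holes.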

Add UPL_TEMP_DIR and --temp-dir so operators can choose where upload metadata,
markers, and temp files live. The default remains data/staging, and docs call
out that the temp directory must be on the same filesystem as data/complete for
atomic promotion. The nginx example now aliases only the completed upload
directory, and the smoke test verifies that final-file alias.
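
An operator who wants to check the same-filesystem requirement up front could compare device IDs, as in the sketch below. It is Unix-only and is not part of this commit; `same_filesystem` is a hypothetical helper, and matching device IDs are a heuristic rather than a guarantee that a rename will succeed.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical helper: true when both existing paths report the same
/// device ID, which usually means a rename between them stays on one filesystem.
fn same_filesystem(a: &Path, b: &Path) -> io::Result<bool> {
    Ok(fs::metadata(a)?.dev() == fs::metadata(b)?.dev())
}
```

Called with the temp directory and `<data-dir>/complete`, a `false` result would suggest that promotion cannot be a plain rename; on Unix, `rename` across mount points fails with `EXDEV`.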

This keeps the existing length-based validation model; it does not add per-chunk
hashing.

Test Plan:
- just check
- just nginx-smoke
- cargo clippy && cargo clippy --benches && cargo clippy --tests
- cargo +nightly fmt --all

Refs: none
2026-05-30 18:10:05 +02:00
parent 428af52e2f
commit c072b93726
10 changed files with 232 additions and 101 deletions
+18 -8
@@ -6,9 +6,9 @@
browser -> nginx -> upl Rust server -> local filesystem
```
-The first implementation milestone provides the Rust server shell and static
-browser UI. Upload metadata, chunk persistence, resume state, and completion
-assembly are tracked in `PLAN.md` and will be added in later coherent slices.
+The server writes upload chunks directly into an inaccessible temp file at
+their final offsets. Once every chunk is present, completion atomically renames
+that temp file into the completed upload directory.
## Project Structure
@@ -19,7 +19,7 @@ upl
src/app.rs Axum router, shared state, static file service
src/api.rs HTTP handlers and API error responses
src/model.rs JSON request, response, and metadata shapes
-src/storage.rs local filesystem layout, chunks, and assembly
+src/storage.rs local filesystem layout, offset writes, and final rename
src/lib.rs library surface used by integration tests
Browser UI
static/index.html upload tool markup
@@ -39,11 +39,17 @@ upl
`127.0.0.1:3000`.
- `--static-dir` sets the static asset directory. It overrides `UPL_STATIC_DIR`
and defaults to `static/` inside this repository.
-- `--data-dir` sets the upload data directory. It overrides `UPL_DATA_DIR` and
-defaults to `data/` inside this repository.
+- `--data-dir` sets the completed upload data root. Completed files land under
+its `complete/` subdirectory. It overrides `UPL_DATA_DIR` and defaults to
+`data/` inside this repository.
+- `--temp-dir` sets the directory for upload metadata, completion markers, and
+inaccessible temp upload files. It overrides `UPL_TEMP_DIR` and defaults to
+`<data-dir>/staging`.
- `upl --help` prints the full argument help text.
- The server accepts request bodies up to 64 MiB, which leaves room for the
planned 16 MiB upload chunks and matches the nginx example in `PLAN.md`.
+- Keep `UPL_TEMP_DIR` on the same filesystem as `<data-dir>/complete` so
+completion can promote files with an atomic rename.
## Common Commands
@@ -61,12 +67,16 @@ just run
Run `upl` on localhost and put nginx in front of it for TLS and access control:
```sh
-UPL_BIND=127.0.0.1:3000 UPL_DATA_DIR=/srv/upl/data upl
+UPL_BIND=127.0.0.1:3000 \
+UPL_DATA_DIR=/srv/upl/data \
+UPL_TEMP_DIR=/srv/upl/data/staging \
+upl
```
Use `deploy/nginx/upl.conf.example` as the starting point for the nginx site.
Before exposing the service, replace the certificate paths and add a protection
-layer such as HTTP basic auth, an IP allowlist, or VPN-only access.
+layer such as HTTP basic auth, an IP allowlist, or VPN-only access. The nginx
+example aliases only `/srv/upl/data/complete`; do not expose `UPL_TEMP_DIR`.
For a local Docker-based reverse-proxy smoke test: