feat!: multi-threaded pipeline + length-committed/random-access decrypt

Completes the two follow-ups deferred from the v0.10 format/secrets
work: multi-threaded AEAD encrypt/decrypt and a length-committed file
format that enables random-access decryption.

# Format change (file format v2)

Bumps the on-disk header version to 2 and introduces a flag bit
(`FLAG_LENGTH_COMMITTED`, bit 0). When set, an authenticated `u64 LE`
plaintext length is appended to the header after the nonce prefix. v1
files still decrypt unchanged. v2 readers reject unknown flag bits.
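
A minimal sketch of the reader-side flag handling described above. The byte offsets assume the raw-key header layout used in the integration tests (version at byte 4, flags at byte 6, length at bytes 32..40); the function name, the offset constants, and the error strings are hypothetical, and treating v1 headers as never carrying a length is an assumption.

```rust
/// Bit 0 of the flags byte, as named in this commit.
const FLAG_LENGTH_COMMITTED: u8 = 0x01;

// Assumed offsets for the raw-key header layout (hypothetical constant names).
const VERSION_OFFSET: usize = 4;
const FLAGS_OFFSET: usize = 6;
const LENGTH_OFFSET: usize = 32;

/// Returns `Ok(Some(len))` when the header commits a plaintext length,
/// `Ok(None)` when it does not, and `Err` for headers a v2 reader must reject.
fn committed_length(header: &[u8]) -> Result<Option<u64>, String> {
    let version = *header.get(VERSION_OFFSET).ok_or("header too short")?;
    let flags = *header.get(FLAGS_OFFSET).ok_or("header too short")?;

    match version {
        // Assumption: v1 headers never carry a committed length.
        1 => return Ok(None),
        2 => {}
        v => return Err(format!("unsupported header version {}", v)),
    }

    // v2 readers reject any flag bit they do not know about.
    if flags & !FLAG_LENGTH_COMMITTED != 0 {
        return Err(format!("unknown flag bits {:#04x}", flags));
    }
    if flags & FLAG_LENGTH_COMMITTED == 0 {
        return Ok(None);
    }

    // The length is a little-endian u64 placed after the nonce prefix.
    let raw = header
        .get(LENGTH_OFFSET..LENGTH_OFFSET + 8)
        .ok_or("header too short for committed length")?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(raw);
    Ok(Some(u64::from_le_bytes(buf)))
}
```

This only reads the field. The value is not trustworthy until a chunk has authenticated, because the header is bound to the ciphertext through the AEAD's AAD.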

The flag is set automatically when the input is a regular file (we
stat the open FD to avoid TOCTOU). Stdin/pipes/FIFOs encrypt as before
with the flag clear. Sequential decrypt cross-checks the produced byte
count against the committed length as defense in depth (the AEAD
already authenticates the value via header AAD, but failing before we
rename the temp file into place is preferable to failing after).
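
The two decisions in this paragraph can be sketched as follows, using only the standard library. `length_to_commit` and `check_produced` are hypothetical names, not the functions in this commit; the point is that `File::metadata` queries the already-open handle rather than the path, so the answer cannot go stale between the check and the read.

```rust
use std::fs::File;
use std::io;

/// Length to commit for an already-open input, or `None` for anything that
/// is not a regular file (pipes, FIFOs, character devices, ...).
fn length_to_commit(input: &File) -> io::Result<Option<u64>> {
    // Metadata comes from the open handle, not from a second path lookup.
    let meta = input.metadata()?;
    Ok(if meta.is_file() { Some(meta.len()) } else { None })
}

/// Defense-in-depth check after a sequential decrypt: when a length was
/// committed, the number of plaintext bytes produced must match it.
fn check_produced(produced: u64, committed: Option<u64>) -> Result<(), String> {
    match committed {
        Some(expected) if expected != produced => Err(format!(
            "plaintext length mismatch: header commits {}, produced {}",
            expected, produced
        )),
        _ => Ok(()),
    }
}
```

Running `check_produced` before the output is committed keeps a mismatch from ever reaching the final path.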

# Random-access decrypt

`fcry -d -i FILE --offset N --length L` seeks directly to the chunk(s)
covering `[N, N+L)` and decrypts only those, without scanning the
predecessors. This requires a seekable file whose header has the
length-committed flag set. Files encrypted from stdin or a pipe carry
no committed length, so the CLI rejects random access on them with a
clear error.

The chunk layout is fully determined by `chunk_size` and the committed
total length (last chunk's plaintext is
`last_pt = total - (n_chunks-1)*chunk_size`; its ciphertext length is
`last_pt + 16`, the extra 16 bytes being the Poly1305 tag). Each
chunk's nonce is `make_nonce(prefix, chunk_index, is_last_chunk)`,
which matches what sequential encrypt produced, so plaintext slices
come out bit-identical to a full sequential decrypt.
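
The layout arithmetic can be sketched as below. All function names are hypothetical. The sketch assumes `n_chunks = ceil(total / chunk_size)`, that an empty plaintext still occupies one empty final chunk, and that ciphertext offsets are measured from the end of the header; none of these are confirmed by the text above. Nonce construction is left out because the layout of `make_nonce` is not described here.

```rust
/// Poly1305 tag appended to every chunk's ciphertext.
const TAG_LEN: u64 = 16;

/// Number of chunks for `total` plaintext bytes (`chunk_size` must be non-zero).
/// Assumption: an empty plaintext still occupies one empty final chunk.
fn chunk_count(total: u64, chunk_size: u64) -> u64 {
    if total == 0 { 1 } else { (total - 1) / chunk_size + 1 }
}

/// Plaintext bytes in the last chunk: total - (n_chunks - 1) * chunk_size.
fn last_chunk_plaintext(total: u64, chunk_size: u64) -> u64 {
    total - (chunk_count(total, chunk_size) - 1) * chunk_size
}

/// Byte offset of chunk `index`, relative to the end of the header.
/// Every chunk before the last one is exactly `chunk_size + TAG_LEN` bytes.
fn ciphertext_offset(index: u64, chunk_size: u64) -> u64 {
    index * (chunk_size + TAG_LEN)
}

/// Inclusive range of chunk indices covering `[offset, offset + length)`.
/// `Ok(None)` means a zero-length slice (nothing to decrypt); `Err` means the
/// slice does not fit inside the committed length.
fn covering_chunks(
    offset: u64,
    length: u64,
    total: u64,
    chunk_size: u64,
) -> Result<Option<(u64, u64)>, String> {
    let end = offset
        .checked_add(length)
        .ok_or("offset + length overflows u64")?;
    if end > total {
        return Err(format!("slice end {} exceeds committed length {}", end, total));
    }
    if length == 0 {
        return Ok(None);
    }
    Ok(Some((offset / chunk_size, (end - 1) / chunk_size)))
}
```

For the fixture used in the slice test (`total = 5 * 1024 * 1024 + 12345`, `chunk_size = 4096`) this gives 1284 chunks, the last one holding 57 plaintext bytes and 73 ciphertext bytes.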

# Multi-threaded pipeline

New `src/pipeline.rs` implements:

  reader thread → bounded jobs channel → N AEAD workers
                → bounded results channel → writer thread

The reader stays serial (it owns the input handle and uses lookahead
to detect the last chunk). Workers parallelize the AEAD step (each
chunk is independent under STREAM). The writer holds a
`BTreeMap<u32, Vec<u8>>` reorder buffer and only flushes in counter
order. Commit is deferred to the main thread, so a failure anywhere —
reader I/O, AEAD auth, writer I/O — drops `OutSink` without renaming
the temp file into place. The
`atomic_output_no_stale_tmp_on_failure` integration test still
passes.
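
A standard-library sketch of the shape described above: a serial reader, a pool of workers, and a writer that reorders by counter. It is not the code in `src/pipeline.rs`. It swaps `crossbeam-channel` for `std::sync::mpsc::sync_channel` with the job receiver shared behind a `Mutex`, and it replaces the AEAD step with a stand-in XOR `transform` that provides no security. `run_pipeline` and `transform` are hypothetical names.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for the per-chunk AEAD step. NOT encryption; illustration only.
fn transform(counter: u32, chunk: &[u8]) -> Vec<u8> {
    chunk.iter().map(|b| b ^ (counter as u8)).collect()
}

/// Runs `chunks` through `threads` workers and returns the outputs in
/// counter order, regardless of which worker finished first.
fn run_pipeline(chunks: Vec<Vec<u8>>, threads: usize) -> Vec<Vec<u8>> {
    let threads = threads.max(1);
    let cap = 2 * threads; // bounded channels, as in the commit text
    let (job_tx, job_rx) = sync_channel::<(u32, Vec<u8>)>(cap);
    let (res_tx, res_rx) = sync_channel::<(u32, Vec<u8>)>(cap);
    let job_rx = Arc::new(Mutex::new(job_rx));

    // Reader: stays serial and assigns the chunk counters.
    let reader = thread::spawn(move || {
        for (i, chunk) in chunks.into_iter().enumerate() {
            if job_tx.send((i as u32, chunk)).is_err() {
                break; // all workers are gone
            }
        }
        // job_tx is dropped here, which lets the workers drain and exit.
    });

    // Workers: each chunk is independent, so any worker may take any job.
    let workers: Vec<_> = (0..threads)
        .map(|_| {
            let rx = Arc::clone(&job_rx);
            let tx = res_tx.clone();
            thread::spawn(move || loop {
                // The lock guard is released at the end of this statement.
                let job = rx.lock().unwrap().recv();
                match job {
                    Ok((n, data)) => {
                        if tx.send((n, transform(n, &data))).is_err() {
                            break;
                        }
                    }
                    Err(_) => break, // reader finished and queue is empty
                }
            })
        })
        .collect();
    drop(res_tx); // the writer loop ends once every worker has exited

    // Writer: park out-of-order results and flush strictly in counter order.
    let mut pending: BTreeMap<u32, Vec<u8>> = BTreeMap::new();
    let mut next = 0u32;
    let mut out = Vec::new();
    for (n, data) in res_rx {
        pending.insert(n, data);
        while let Some(ready) = pending.remove(&next) {
            out.push(ready);
            next += 1;
        }
    }

    reader.join().unwrap();
    for w in workers {
        w.join().unwrap();
    }
    out
}
```

The sketch collects results in memory to stay short, and its reorder map is unbounded. The pipeline described above instead writes to a temp file and bounds the reorder buffer, leaving the rename to the main thread.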

Channel and reorder capacities scale with worker count (`2*threads`);
peak memory is roughly `chunk_size * 4 * threads`. With 1 MiB chunks
and 8 threads that's ~32 MiB, which we accept.

Default thread count is `std::thread::available_parallelism()`;
override with `-j/--threads N`. `-j 1` keeps the original serial path.
Stdin/stdout streaming works under the parallel path because `Stdin`
(unlocked) is `Send` — only `StdinLock` isn't, so the boxed reader
wraps `Stdin` directly in a `BufReader`.
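
A small sketch of why this works, assuming a boxed-reader design like the one described. `open_input` and `read_on_thread` are hypothetical names. The `Send` bound on the trait object is what allows the reader to be moved into a spawned thread; swapping `io::stdin()` for `io::stdin().lock()` would fail to compile because `StdinLock` is not `Send`.

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};
use std::path::Path;
use std::thread;

/// Opens the input as a boxed reader that can be handed to another thread.
/// `None` selects stdin; the unlocked `Stdin` handle is wrapped directly.
fn open_input(path: Option<&Path>) -> io::Result<Box<dyn Read + Send>> {
    let reader: Box<dyn Read + Send> = match path {
        Some(p) => Box::new(BufReader::new(File::open(p)?)),
        None => Box::new(BufReader::new(io::stdin())),
    };
    Ok(reader)
}

/// Moves the reader into a dedicated thread and drains it there.
fn read_on_thread(mut input: Box<dyn Read + Send>) -> io::Result<Vec<u8>> {
    thread::spawn(move || -> io::Result<Vec<u8>> {
        let mut buf = Vec::new();
        input.read_to_end(&mut buf)?;
        Ok(buf)
    })
    .join()
    .expect("reader thread panicked")
}
```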

Adds `crossbeam-channel = "0.5"` for bounded MPMC. The cipher
(`XChaCha20Poly1305`) and the header AAD are shared across workers via
`Arc`; the AEAD's internal key copy is zeroized on drop as before.

# CLI surface

  -j, --threads <N>     worker thread count (default: cores)
      --offset <BYTES>  random-access decrypt: slice start
      --length <BYTES>  random-access decrypt: slice length

`--offset`/`--length` require `--decrypt` and `--input-file` (clap
enforces; we also surface a clean runtime error if only one is
supplied).
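
The runtime half of that check might look like the sketch below. `slice_request` and the error strings are hypothetical; only the rule itself (both flags or neither) comes from the text above.

```rust
/// Combines the two optional CLI values into a single slice request.
/// `Ok(None)` means a normal full decrypt; supplying only one of the two
/// flags is an error.
fn slice_request(
    offset: Option<u64>,
    length: Option<u64>,
) -> Result<Option<(u64, u64)>, String> {
    match (offset, length) {
        (None, None) => Ok(None),
        (Some(o), Some(l)) => Ok(Some((o, l))),
        (Some(_), None) => Err("--offset requires --length".to_string()),
        (None, Some(_)) => Err("--length requires --offset".to_string()),
    }
}
```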

# Test plan

* `cargo test` — 5 unit + 27 integration, all green.
* New integration coverage:
  - parallel roundtrip on multi-chunk inputs (`-j 4`)
  - parallel-encrypted ciphertext decrypted serially, and vice-versa
    (output bit-identical regardless of worker count)
  - parallel pipe stdin↔stdout (asserts flag byte is 0 for stdin
    inputs — no length committed without a known size)
  - file inputs auto-commit length (asserts version=2 and flags bit 0
    set in the raw header bytes)
  - random-access slices spanning chunk-aligned, mid-chunk,
    last-chunk, and full-file ranges
  - random-access rejects out-of-range and stdin-encrypted inputs,
    accepts zero-length
  - tampering with a byte of the committed length fails AEAD
    authentication
  - hand-crafted v1 header still decodes (no flag bit set)
* `cargo clippy --all-targets -- -D warnings` clean.
* `cargo +nightly fmt` clean.

Removes `TODO.md` since both deferred items are now implemented.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 20:33:00 +02:00
parent f72f9034f3
commit 75afadb1ec
10 changed files with 1095 additions and 51 deletions
@@ -422,6 +422,307 @@ fn atomic_output_no_stale_tmp_on_failure() {
    assert!(!tmp.exists(), "temp file must be cleaned up");
}
// ---------------------------------------------------------------------------
// Multi-threaded pipeline + length-committed + random-access tests
// ---------------------------------------------------------------------------
fn encrypt_file_threads(
    plain: &std::path::Path,
    ct: &std::path::Path,
    chunk_size: Option<u32>,
    threads: usize,
) {
    let mut cmd = fcry();
    cmd.arg("-i")
        .arg(plain)
        .arg("-o")
        .arg(ct)
        .arg("--raw-key")
        .arg(KEY_STR)
        .arg("-j")
        .arg(threads.to_string());
    if let Some(cs) = chunk_size {
        cmd.arg("--chunk-size").arg(cs.to_string());
    }
    let out = cmd.output().unwrap();
    assert!(
        out.status.success(),
        "encrypt -j{threads} failed: {}",
        String::from_utf8_lossy(&out.stderr)
    );
}
fn decrypt_file_threads(ct: &std::path::Path, rt: &std::path::Path, threads: usize) {
    let out = fcry()
        .arg("-d")
        .arg("-i")
        .arg(ct)
        .arg("-o")
        .arg(rt)
        .arg("--raw-key")
        .arg(KEY_STR)
        .arg("-j")
        .arg(threads.to_string())
        .output()
        .unwrap();
    assert!(
        out.status.success(),
        "decrypt -j{threads} failed: {}",
        String::from_utf8_lossy(&out.stderr)
    );
}
#[test]
fn roundtrip_multi_threaded() {
    // Multi-chunk input. Encrypt+decrypt with -j 4 must round-trip.
    let dir = TempDir::new().unwrap();
    let plain = dir.path().join("p.bin");
    let ct = dir.path().join("c.bin");
    let rt = dir.path().join("r.bin");
    let data = pseudo_random(11, 5 * 1024 * 1024 + 12345);
    fs::write(&plain, &data).unwrap();
    encrypt_file_threads(&plain, &ct, Some(64 * 1024), 4);
    decrypt_file_threads(&ct, &rt, 4);
    assert_eq!(fs::read(&rt).unwrap(), data);
}
#[test]
fn parallel_and_serial_outputs_round_trip() {
    // Encrypt with -j 4 and decrypt serially (and vice-versa); both directions
    // must yield the original plaintext.
    let dir = TempDir::new().unwrap();
    let plain = dir.path().join("p.bin");
    let data = pseudo_random(13, 256 * 1024 + 17);
    fs::write(&plain, &data).unwrap();
    let ct_par = dir.path().join("c_par.bin");
    let ct_ser = dir.path().join("c_ser.bin");
    encrypt_file_threads(&plain, &ct_par, Some(8192), 4);
    encrypt_file_threads(&plain, &ct_ser, Some(8192), 1);
    let rt1 = dir.path().join("r1.bin");
    let rt2 = dir.path().join("r2.bin");
    // par-encrypted, serial-decrypted
    decrypt_file_threads(&ct_par, &rt1, 1);
    // serial-encrypted, par-decrypted
    decrypt_file_threads(&ct_ser, &rt2, 4);
    assert_eq!(fs::read(&rt1).unwrap(), data);
    assert_eq!(fs::read(&rt2).unwrap(), data);
}
#[test]
fn roundtrip_pipe_multi_threaded() {
    // stdin/stdout mode with -j 4: length flag must NOT be set (no committed
    // length when we don't know the input size), but encrypt/decrypt must still
    // round-trip cleanly across the pipeline.
    let data = pseudo_random(14, 200_000);
    let mut enc = fcry()
        .arg("--raw-key")
        .arg(KEY_STR)
        .arg("-j")
        .arg("4")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .unwrap();
    enc.stdin.as_mut().unwrap().write_all(&data).unwrap();
    let enc_out = enc.wait_with_output().unwrap();
    assert!(
        enc_out.status.success(),
        "pipe encrypt -j4 failed: {}",
        String::from_utf8_lossy(&enc_out.stderr)
    );
    // flags byte at offset 6 must be 0 (no length committed for stdin input).
    assert_eq!(
        enc_out.stdout[6], 0,
        "stdin-encrypted file unexpectedly committed length"
    );
    let mut dec = fcry()
        .arg("-d")
        .arg("--raw-key")
        .arg(KEY_STR)
        .arg("-j")
        .arg("4")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .unwrap();
    dec.stdin
        .as_mut()
        .unwrap()
        .write_all(&enc_out.stdout)
        .unwrap();
    let dec_out = dec.wait_with_output().unwrap();
    assert!(
        dec_out.status.success(),
        "pipe decrypt -j4 failed: {}",
        String::from_utf8_lossy(&dec_out.stderr)
    );
    assert_eq!(dec_out.stdout, data);
}
#[test]
fn file_input_commits_length() {
    // Encrypting from a regular file must auto-set FLAG_LENGTH_COMMITTED (bit 0
    // of the flags byte at offset 6) and embed the length.
    let dir = TempDir::new().unwrap();
    let plain = dir.path().join("p.bin");
    let ct = dir.path().join("c.bin");
    let data = pseudo_random(15, 50_000);
    fs::write(&plain, &data).unwrap();
    encrypt_file(&plain, &ct, Some(4096));
    let bytes = fs::read(&ct).unwrap();
    // Header layout: magic(4) at 0..4, version at byte 4, alg at byte 5,
    // flags at byte 6.
    assert_eq!(bytes[4], 2, "version should be 2");
    assert_eq!(bytes[6] & 0x01, 0x01, "length-committed flag should be set");
}
fn encrypt_random_access_fixture(
    dir: &std::path::Path,
    data: &[u8],
    chunk_size: u32,
) -> std::path::PathBuf {
    let plain = dir.join("p.bin");
    let ct = dir.join("c.bin");
    fs::write(&plain, data).unwrap();
    encrypt_file(&plain, &ct, Some(chunk_size));
    ct
}
fn random_access_decrypt(
    ct: &std::path::Path,
    out: &std::path::Path,
    offset: u64,
    length: u64,
) -> std::process::Output {
    fcry()
        .arg("-d")
        .arg("-i")
        .arg(ct)
        .arg("-o")
        .arg(out)
        .arg("--raw-key")
        .arg(KEY_STR)
        .arg("--offset")
        .arg(offset.to_string())
        .arg("--length")
        .arg(length.to_string())
        .output()
        .unwrap()
}
#[test]
fn random_access_decrypt_slices() {
    let dir = TempDir::new().unwrap();
    let chunk = 4096u32;
    let total = 5 * 1024 * 1024 + 12345;
    let data = pseudo_random(16, total);
    let ct = encrypt_random_access_fixture(dir.path(), &data, chunk);
    // (offset, length) cases, in the order listed below:
    //   - first byte
    //   - chunk-aligned start, mid-chunk end
    //   - short slice straddling a chunk boundary
    //   - mid-chunk start crossing several chunks
    //   - last byte (inside the final partial chunk)
    //   - final 100 bytes, spanning the last two chunks
    //   - entire file
    let cases: &[(u64, u64)] = &[
        (0, 1),
        (chunk as u64, 7),
        (chunk as u64 - 5, 100),
        (10, chunk as u64 * 3 + 17),
        (total as u64 - 1, 1),
        (total as u64 - 100, 100),
        (0, total as u64),
    ];
    for (i, (offset, length)) in cases.iter().copied().enumerate() {
        let out = dir.path().join(format!("slice_{i}.bin"));
        let r = random_access_decrypt(&ct, &out, offset, length);
        assert!(
            r.status.success(),
            "slice {i} ({offset}, {length}) failed: {}",
            String::from_utf8_lossy(&r.stderr)
        );
        let got = fs::read(&out).unwrap();
        let expected = &data[offset as usize..(offset + length) as usize];
        assert_eq!(got, expected, "slice {i} mismatch");
    }
}
#[test]
fn random_access_rejects_out_of_range() {
    let dir = TempDir::new().unwrap();
    let data = pseudo_random(17, 1000);
    let ct = encrypt_random_access_fixture(dir.path(), &data, 256);
    let out = dir.path().join("oob.bin");
    let r = random_access_decrypt(&ct, &out, 900, 1000); // 900+1000 > 1000
    assert!(!r.status.success(), "out-of-range slice should fail");
}
#[test]
fn random_access_rejects_stdin_encrypted() {
    // Encrypt via stdin → no length committed → random access must refuse.
    let data = pseudo_random(18, 2000);
    let dir = TempDir::new().unwrap();
    let ct = dir.path().join("c.bin");
    let mut enc = fcry()
        .arg("--raw-key")
        .arg(KEY_STR)
        .arg("-o")
        .arg(&ct)
        .stdin(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .unwrap();
    enc.stdin.as_mut().unwrap().write_all(&data).unwrap();
    assert!(enc.wait().unwrap().success());
    let out = dir.path().join("slice.bin");
    let r = random_access_decrypt(&ct, &out, 0, 100);
    assert!(
        !r.status.success(),
        "random access on stdin-encrypted file should fail"
    );
}
#[test]
fn random_access_zero_length() {
    let dir = TempDir::new().unwrap();
    let data = pseudo_random(19, 1000);
    let ct = encrypt_random_access_fixture(dir.path(), &data, 256);
    let out = dir.path().join("empty.bin");
    let r = random_access_decrypt(&ct, &out, 500, 0);
    assert!(r.status.success(), "zero-length slice should succeed");
    assert_eq!(fs::read(&out).unwrap(), Vec::<u8>::new());
}
#[test]
fn random_access_tampered_length_fails() {
    // Flip a byte inside the committed plaintext_length field. The header is
    // AAD for every chunk, so the AEAD must reject decryption.
    let dir = TempDir::new().unwrap();
    let data = pseudo_random(20, 4000);
    let ct = encrypt_random_access_fixture(dir.path(), &data, 1024);
    let mut bytes = fs::read(&ct).unwrap();
    // For raw-kdf header: magic(4)+ver(1)+alg(1)+flags(1)+rsv(1)+chunksize(4)+kdf_id(1)+nonce_prefix(19) = 32
    // plaintext_length is at offset 32..40.
    bytes[34] ^= 0xff;
    fs::write(&ct, &bytes).unwrap();
    let out = dir.path().join("bad.bin");
    let r = random_access_decrypt(&ct, &out, 0, 100);
    assert!(
        !r.status.success(),
        "tampered plaintext_length must fail authentication"
    );
}
#[test]
fn header_chunk_size_is_authoritative_on_decrypt() {
    // Encrypt with a non-default chunk size; decrypt without specifying one.