Compare commits

..

5 Commits

Author SHA1 Message Date
ddidderr 2c101abdbd fix(cli): reject -j 0 instead of silently falling through to serial
`--threads` was an `Option<usize>` with no value-parser bound. Passing
`-j 0` slipped past clap and reached the dispatch in `crypto::encrypt`
/ `decrypt`, where the `threads > 1` check evaluates to false for 0 and
the call quietly fell through to the serial path. The user thought
they had asked for "no worker threads" and got something else instead;
either way, 0 workers is not a meaningful configuration.

Switch the field to `Option<u32>` and add `value_parser!(u32).range(1..)`
so clap rejects `-j 0` at parse time with a usage error. Cast to
`usize` at the single use site. Using `u32` rather than `usize` avoids
shipping a host-pointer-width-dependent CLI surface; thread counts well
past 4 billion are not a thing we need to plan for.

Test plan:
  - New integration test `rejects_zero_threads` invokes the binary
    with `-j 0` and asserts non-zero exit.
  - Existing `roundtrip_multi_threaded` (`-j 4`) still passes, so the
    range bound has not broken normal usage.

Refs: external review (GLM51 #1; Gemini #3 misdescribed the symptom
as "silent corruption" — verified the actual behaviour was a fall-
through to the serial path, not output corruption, but the fix is
the same).
2026-05-02 21:29:28 +02:00
ddidderr 91b459657e fix(pipeline): bound reorder buffer and fail fast on worker error
The multi-threaded pipeline introduced in 75afadb had two related defects
flagged by external review:

1. The writer's reorder buffer was unbounded. `ordered_writer` accepted
   a `_cap` parameter that was documented as the in-flight bound but
   was never read. The writer drained `done_rx` eagerly into a
   `BTreeMap`, so neither the bounded job channel nor the bounded done
   channel ever exerted backpressure on the reader. A slow or stuck
   worker would let the writer accumulate every subsequent chunk in
   `pending` until the system ran out of memory. With adversarial input
   this is a memory-exhaustion vector; with merely uneven workers it
   silently violated the documented memory ceiling.

2. The pipeline did not fail fast. When a worker hit an AEAD
   authentication failure it returned `Err`, dropped its channel
   clones, and exited — but the other workers, the reader, and the
   writer kept running until natural EOF. On a tampered N-byte file
   we burned full I/O plus (T-1)/T of the AEAD CPU before surfacing
   the error. Combined with (1) this also stretched the window in
   which `pending` could grow.

Both issues are addressed by a single rewrite of `pipeline.rs`:

  - A bounded "permit" channel pre-filled with `in_flight_capacity`
    `()` tokens. The reader acquires one before sending each job; the
    writer releases one after flushing the corresponding chunk in
    order. Total in-flight chunks (queued jobs + in-progress at
    workers + pending in the reorder map) is now hard-capped at
    `4 * threads`, with the writer in lockstep with the actual disk
    write rather than ahead of it.

  - An `Arc<AtomicBool> cancel` flag that workers set on AEAD failure.
    Workers check it at the top of their loop and drain remaining
    queued jobs without doing AEAD work. The reader checks it before
    each new chunk, so a tampered chunk causes the reader to stop
    within the in-flight window rather than after EOF.

The reader uses `permit_rx.recv_timeout(50ms)` rather than a blocking
`recv` so it can poll the cancel flag even when the rest of the
pipeline has quiesced. Without this, a 3-way deadlock is possible:
worker errors after all permits are out, the writer is blocked on a
missing-counter chunk that will never arrive, the other workers are
idle on `jobs_rx`, and the reader is blocked on `permit_rx`. The
50 ms wakeup is well below typical user-perceptible latency and only
runs when the pipeline is otherwise idle, so its cost is negligible.

While rewriting I also collapsed `encrypt_parallel`/`decrypt_parallel`
onto a shared `run_pipeline` helper parameterised by an `is_encrypt`
bool — the two functions previously duplicated ~150 lines of channel
plumbing for a one-line difference (`encrypt_in_place` vs
`decrypt_in_place`). Same for the reorder writer: a single
`ordered_writer` now returns `(OutSink, u64)`, and encrypt simply
ignores the byte count (decrypt uses it for the length cross-check).
Removed the stale "wrapping_add" on the in-order counter — wrapping
here would mask a real bug since `bump_counter` already rejects
overflow upstream — and corrected the per-thread memory estimate in
the module-level doc to match the new bounded model.

The job-channel capacity (`channel_capacity = 2 * threads`) is left
unchanged. The new permit cap (`4 * threads`) is deliberately larger
so out-of-order completion has slack; if the gap is ever exhausted
the only consequence is reader backpressure, never unbounded growth.

Test plan:
  - `cargo test` still passes the full 28-test integration suite,
    including `parallel_and_serial_outputs_round_trip` (proves the
    refactored unified pipeline produces bit-identical output to the
    serial path) and `rejects_tampered_ciphertext` (still surfaces
    the AEAD error, now via the cancel path).
  - Manual fail-fast probe: 200 MiB random plaintext, encrypt with
    `-j 8`, flip a byte at offset 2000 (inside chunk 0), decrypt
    with `-j 8`. Errors in ~2 ms, vs ~28 ms for a clean decrypt of
    the same file — confirming the reader stopped within the
    in-flight window rather than draining the whole input.
  - The `ordered_writer` cancel deadlock case is hit organically by
    the same probe: chunk 0 fails authentication, no further
    counter-0 chunk ever arrives, but the reader exits via the
    50 ms cancel poll and the rest of the pipeline drains.

Refs: external review (P2 / Gemini #1, Gemini #2, GLM51 #2/#7/#8).
2026-05-02 21:29:08 +02:00
ddidderr 53bb927a87 fix(header): preserve on-disk version through decode/encode
`Header::encode()` previously hard-coded `VERSION_CURRENT` (= 2) on every
write. Because the encoded header is fed back as AEAD AAD on decrypt, this
broke decryption of any v1 ciphertext written before commit 75afadb: the
file's authenticated AAD has version byte 1, but the recomputed AAD has
byte 2, so AEAD verification fails on every chunk. The release notes for
75afadb explicitly promised v1 compatibility, so this is a regression
against the documented contract — caught by an external reviewer who
reproduced it by encrypting with HEAD^ and decrypting with HEAD.

Fix by adding a `version: u8` field to `Header`. `Header::read()` now
captures the on-disk byte and `encode()` writes it back. New encrypts in
`crypto::encrypt()` set `version = VERSION_CURRENT`, so freshly written
files are unchanged on the wire; only the round-trip path through
`read → encode` is now byte-identical for v1 inputs.

This was the simplest fix that preserves the existing AAD design (header
bytes verbatim → AAD). Alternatives considered:

  - Storing the raw header bytes alongside the parsed struct would also
    work but spends an extra allocation and adds a second source of
    truth for the same data.
  - Conditionally emitting v2 only when flags != 0 would happen to
    produce v1 bytes for a v1 input, but it conflates "what version
    does this file claim" with "does it carry length commitment" — two
    things that should stay independent for future flag bits.

Test plan:
  - `header::tests::reads_v1_header` now also asserts that re-encoding
    a parsed v1 header reproduces the original bytes exactly (so the
    AAD round-trips).
  - New `crypto::tests::decrypts_v1_ciphertext` and
    `decrypts_v1_ciphertext_parallel` build a multi-chunk v1 fixture
    by hand (XChaCha20Poly1305 with version-byte-1 AAD) and confirm
    `decrypt()` succeeds on both the serial and parallel paths.
  - Manual: built `HEAD^` in a worktree, encrypted a 200-byte payload
    with `--chunk-size 64`, decrypted with the patched binary at
    `-j 1` and `-j 4`. Both round-trip; output bytes match input.

Refs: external review identifying this regression.
2026-05-02 21:28:26 +02:00
ddidderr 75afadb1ec feat!: multi-threaded pipeline + length-committed/random-access decrypt
Completes the two follow-ups deferred from the v0.10 format/secrets
work: multi-threaded AEAD encrypt/decrypt and a length-committed file
format that enables random-access decryption.

# Format change (file format v2)

Bumps the on-disk header version to 2 and introduces a flag bit
(`FLAG_LENGTH_COMMITTED`, bit 0). When set, an authenticated `u64 LE`
plaintext length is appended to the header after the nonce prefix. v1
files still decrypt unchanged. v2 readers reject unknown flag bits.

The flag is set automatically when the input is a regular file (we
stat the open FD to avoid TOCTOU). Stdin/pipes/FIFOs encrypt as before
with the flag clear. Sequential decrypt cross-checks the produced byte
count against the committed length as defense in depth (the AEAD
already authenticates the value via header AAD, but failing before we
rename the temp file into place is preferable to failing after).

# Random-access decrypt

`fcry -d -i FILE --offset N --length L` seeks directly to the chunk(s)
covering `[N, N+L)` and decrypts only those, without scanning the
predecessors. Requires a seekable file whose header has the
length-committed flag — stdin/pipe-encrypted files cannot use this
path and the CLI rejects it with a clear error.

The chunk layout is fully determined by `chunk_size` and the committed
total length (last chunk's plaintext is
`total - (n_chunks-1)*chunk_size`; its ciphertext length is
`last_pt + 16`). Each chunk's nonce is
`make_nonce(prefix, chunk_index, is_last_chunk)` which matches what
sequential encrypt produced, so plaintext slices come out
bit-identical to a full sequential decrypt.

# Multi-threaded pipeline

New `src/pipeline.rs` implements:

  reader thread → bounded jobs channel → N AEAD workers
                → bounded results channel → writer thread

The reader stays serial (it owns the input handle and uses lookahead
to detect the last chunk). Workers parallelize the AEAD step (each
chunk is independent under STREAM). The writer holds a
`BTreeMap<u32, Vec<u8>>` reorder buffer and only flushes in counter
order. Commit is deferred to the main thread, so a failure anywhere —
reader I/O, AEAD auth, writer I/O — drops `OutSink` without renaming
the temp file into place. The
`atomic_output_no_stale_tmp_on_failure` integration test still
passes.

Channel and reorder capacities scale with worker count (`2*threads`);
peak memory is roughly `chunk_size * 4 * threads`. With 1 MiB chunks
and 8 cores that's ~32 MiB, which we accept.

Default thread count is `std::thread::available_parallelism()`;
override with `-j/--threads N`. `-j 1` keeps the original serial path.
Stdin/stdout streaming works under the parallel path because `Stdin`
(unlocked) is `Send` — only `StdinLock` isn't, so the boxed reader
wraps `Stdin` directly in a `BufReader`.

Adds `crossbeam-channel = "0.5"` for bounded MPMC. The cipher
(`XChaCha20Poly1305`) and the header AAD are shared across workers via
`Arc`; the AEAD's internal key copy is zeroized on drop as before.

# CLI surface

  -j, --threads <N>     worker thread count (default: cores)
      --offset <BYTES>  random-access decrypt: slice start
      --length <BYTES>  random-access decrypt: slice length

`--offset`/`--length` require `--decrypt` and `--input-file` (clap
enforces; we also surface a clean runtime error if only one is
supplied).

# Test plan

* `cargo test` — 5 unit + 27 integration, all green.
* New integration coverage:
  - parallel roundtrip on multi-chunk inputs (`-j 4`)
  - parallel-encrypted ciphertext decrypted serially, and vice-versa
    (output bit-identical regardless of worker count)
  - parallel pipe stdin↔stdout (asserts flag byte is 0 for stdin
    inputs — no length committed without a known size)
  - file inputs auto-commit length (asserts version=2 and flags bit 0
    set in the raw header bytes)
  - random-access slices spanning chunk-aligned, mid-chunk,
    last-chunk, and full-file ranges
  - random-access rejects out-of-range and stdin-encrypted inputs,
    accepts zero-length
  - tampering the committed length byte fails AEAD authentication
  - hand-crafted v1 header still decodes (no flag bit set)
* `cargo clippy --all-targets -- -D warnings` clean.
* `cargo +nightly fmt` clean.

Removes `TODO.md` since both deferred items are now implemented.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 20:33:00 +02:00
ddidderr f72f9034f3 feat(cli): default to interactive passphrase when no key source given
Previously, invoking fcry without any of --raw-key, --passphrase, or
--passphrase-env produced a hard error ("must provide one of ..."). The
common, secure case (interactive TTY passphrase) thus required an
explicit flag, while the dangerous case (--raw-key on the command line)
was equally accessible.

Make the secure path the default: if no key source is specified, fall
back to PassphraseSource::Tty, which prompts on the terminal and runs
argon2id on encrypt. Explicit --passphrase still works and is now
redundant for the default invocation; --raw-key and --passphrase-env
remain unchanged and still suppress the default.

The previous "must provide one of ..." error path becomes unreachable
and is removed: the only way pw_src is None is when raw_key_str is Some,
which is handled by the existing encrypt/decrypt branches.

User-visible change: `fcry -i foo -o foo.enc` now prompts for a
passphrase instead of erroring out. Scripts that relied on the error to
detect missing arguments will instead block on a TTY read; non-TTY
callers should continue to pass --passphrase-env or --raw-key
explicitly.

Test Plan:
- `fcry -i plain -o plain.enc` prompts twice (passphrase + confirm),
  then `fcry -d -i plain.enc -o plain.out` prompts once and round-trips.
- `fcry --raw-key $(head -c32 /dev/urandom | base64) ...` still works
  and does not prompt.
- `PW=hunter2 fcry --passphrase-env PW ...` still works and does not
  prompt.
- `fcry --passphrase --raw-key ...` still rejected by clap
  (conflicts_with_all).
2026-05-02 19:52:19 +02:00
10 changed files with 1256 additions and 58 deletions
Generated
+16
View File
@@ -232,6 +232,21 @@ dependencies = [
"libc",
]
[[package]]
name = "crossbeam-channel"
version = "0.5.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "82b8f8f868b36967f9606790d1903570de9ceaf870a7bf9fbbd3016d636a2cb2"
dependencies = [
"crossbeam-utils",
]
[[package]]
name = "crossbeam-utils"
version = "0.8.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d0a5c400df2834b80a4c3327b3aad3a4c4cd4de0629063962b03235697506a28"
[[package]]
name = "crypto-common"
version = "0.1.7"
@@ -290,6 +305,7 @@ dependencies = [
"assert_cmd",
"chacha20poly1305",
"clap",
"crossbeam-channel",
"getrandom 0.4.2",
"libc",
"rlimit",
+1
View File
@@ -8,6 +8,7 @@ version = "0.10.0"
argon2 = "0.5"
chacha20poly1305 = "0.10"
clap = {version = "4", features = ["derive"]}
crossbeam-channel = "0.5"
getrandom = {version = "0.4"}
protected-secrets = {package = "secrets", version = "1.3"}
zeroize = {version = "1", features = ["derive"]}
-3
View File
@@ -1,3 +0,0 @@
**Deferred to follow-up commits** (in order):
1. Multi-threaded pipeline (worker pool + ordered writer)
2. Length-committed mode + random-access decrypt fast path for files
+318 -22
View File
@@ -1,21 +1,26 @@
// SPDX-License-Identifier: GPL-3.0-only
use chacha20poly1305::{KeyInit, XChaCha20Poly1305, XNonce, aead::AeadInPlace};
use std::io::Write;
use std::fs::File;
use std::io::{BufReader, Read, Seek, SeekFrom, Write};
use std::sync::Arc;
use crate::error::*;
use crate::header::{AlgId, Header, KdfParams, NONCE_PREFIX_LEN, TAG_LEN};
use crate::header::{
AlgId, FLAG_LENGTH_COMMITTED, Header, KdfParams, NONCE_PREFIX_LEN, TAG_LEN, VERSION_CURRENT,
};
use crate::pipeline;
use crate::reader::{AheadReader, ReadInfoChunk};
use crate::secrets::{SecretBytes32, SecretVec};
use crate::utils::*;
/// XChaCha20Poly1305 nonce: 24 bytes total. STREAM splits the trailing 5 bytes
/// into a 4-byte big-endian counter and a 1-byte "last block" flag.
const NONCE_LEN: usize = 24;
const COUNTER_LEN: usize = 4;
pub(crate) const NONCE_LEN: usize = 24;
pub(crate) const COUNTER_LEN: usize = 4;
const _: () = assert!(NONCE_PREFIX_LEN + COUNTER_LEN + 1 == NONCE_LEN);
fn make_nonce(prefix: &[u8; NONCE_PREFIX_LEN], counter: u32, last: bool) -> XNonce {
pub(crate) fn make_nonce(prefix: &[u8; NONCE_PREFIX_LEN], counter: u32, last: bool) -> XNonce {
let mut n = [0u8; NONCE_LEN];
n[..NONCE_PREFIX_LEN].copy_from_slice(prefix);
n[NONCE_PREFIX_LEN..NONCE_PREFIX_LEN + COUNTER_LEN].copy_from_slice(&counter.to_be_bytes());
@@ -55,36 +60,73 @@ pub fn derive_key(
Ok(out)
}
/// Build the AEAD cipher from the protected key. The cipher holds an
/// unprotected copy of the key while alive; `chacha20poly1305` zeroizes that
/// copy on drop. Wrapping in `Arc` lets us share it across worker threads.
fn build_aead(key: &SecretBytes32) -> Arc<XChaCha20Poly1305> {
Arc::new(key.with_array(|key| XChaCha20Poly1305::new(key.into())))
}
/// Bump the per-chunk counter; surface a domain error on overflow rather than
/// panicking on debug or wrapping in release.
pub(crate) fn bump_counter(counter: u32) -> Result<u32, FcryError> {
counter
.checked_add(1)
.ok_or_else(|| FcryError::Format("STREAM counter overflow (input too large)".into()))
}
pub fn encrypt<S: AsRef<str>>(
input_file: Option<S>,
output_file: Option<S>,
key: &SecretBytes32,
chunk_size: u32,
kdf: KdfParams,
threads: usize,
) -> Result<(), FcryError> {
let chunk_sz = chunk_size as usize;
let mut f_plain = AheadReader::from(open_input(input_file)?, chunk_sz);
let input = open_input(input_file)?;
let plaintext_length = input.length;
let mut f_plain = AheadReader::from(input.reader, chunk_sz);
let mut f_encrypted = OutSink::open(output_file)?;
let mut nonce_prefix = [0u8; NONCE_PREFIX_LEN];
getrandom::fill(&mut nonce_prefix)?;
let flags = if plaintext_length.is_some() {
FLAG_LENGTH_COMMITTED
} else {
0
};
let header = Header {
version: VERSION_CURRENT,
alg: AlgId::XChaCha20Poly1305,
flags: 0,
flags,
chunk_size,
kdf,
nonce_prefix,
plaintext_length,
};
let aad = header.encode();
let aad = Arc::new(header.encode());
f_encrypted.write_all(&aad)?;
// The AEAD keeps its own unprotected key copy while the loop runs.
// chacha20poly1305 zeroizes that copy on drop.
let aead = key.with_array(|key| XChaCha20Poly1305::new(key.into()));
let aead = build_aead(key);
if threads > 1 {
return pipeline::encrypt_parallel(
f_plain,
f_encrypted,
aead,
aad,
nonce_prefix,
chunk_sz,
threads,
plaintext_length,
);
}
let mut buf = vec![0u8; chunk_sz];
let mut counter: u32 = 0;
let mut bytes_seen: u64 = 0;
loop {
match f_plain.read_ahead(&mut buf)? {
@@ -93,15 +135,15 @@ pub fn encrypt<S: AsRef<str>>(
aead.encrypt_in_place(&nonce, &aad, &mut buf)?;
f_encrypted.write_all(&buf)?;
buf.truncate(chunk_sz);
counter = counter.checked_add(1).ok_or_else(|| {
FcryError::Format("STREAM counter overflow (input too large)".into())
})?;
bytes_seen = bytes_seen.saturating_add(chunk_sz as u64);
counter = bump_counter(counter)?;
}
ReadInfoChunk::Last(n) => {
buf.truncate(n);
let nonce = make_nonce(&nonce_prefix, counter, true);
aead.encrypt_in_place(&nonce, &aad, &mut buf)?;
f_encrypted.write_all(&buf)?;
bytes_seen = bytes_seen.saturating_add(n as u64);
break;
}
ReadInfoChunk::Empty => {
@@ -116,6 +158,17 @@ pub fn encrypt<S: AsRef<str>>(
}
}
if let Some(committed) = plaintext_length
&& committed != bytes_seen
{
// Defense in depth: the input changed between stat and EOF. The
// committed length is part of the AEAD AAD, so any decrypter would
// also surface this, but we prefer to fail before publishing the file.
return Err(FcryError::Format(format!(
"input length changed during encryption: committed {committed}, read {bytes_seen}"
)));
}
f_encrypted.commit()?;
Ok(())
}
@@ -125,10 +178,11 @@ pub fn decrypt<S: AsRef<str>>(
output_file: Option<S>,
raw_key: Option<&SecretBytes32>,
passphrase: Option<&SecretVec>,
threads: usize,
) -> Result<(), FcryError> {
let mut reader = open_input(input_file)?;
let mut reader = open_input(input_file)?.reader;
let header = Header::read(&mut reader)?;
let aad = header.encode();
let aad = Arc::new(header.encode());
let key = derive_key(&header.kdf, raw_key, passphrase)?;
@@ -138,12 +192,24 @@ pub fn decrypt<S: AsRef<str>>(
let mut f_encrypted = AheadReader::from(reader, cipher_chunk);
let mut f_plain = OutSink::open(output_file)?;
// The AEAD keeps its own unprotected key copy while the loop runs.
// chacha20poly1305 zeroizes that copy on drop.
let aead = key.with_array(|key| XChaCha20Poly1305::new(key.into()));
let aead = build_aead(&key);
if threads > 1 {
return pipeline::decrypt_parallel(
f_encrypted,
f_plain,
aead,
aad,
header.nonce_prefix,
cipher_chunk,
threads,
header.plaintext_length,
);
}
let mut buf = vec![0u8; cipher_chunk];
let mut counter: u32 = 0;
let mut bytes_written: u64 = 0;
loop {
match f_encrypted.read_ahead(&mut buf)? {
@@ -151,16 +217,16 @@ pub fn decrypt<S: AsRef<str>>(
let nonce = make_nonce(&header.nonce_prefix, counter, false);
aead.decrypt_in_place(&nonce, &aad, &mut buf)?;
f_plain.write_all(&buf)?;
bytes_written = bytes_written.saturating_add(buf.len() as u64);
buf.resize(cipher_chunk, 0);
counter = counter
.checked_add(1)
.ok_or_else(|| FcryError::Format("STREAM counter overflow".into()))?;
counter = bump_counter(counter)?;
}
ReadInfoChunk::Last(n) => {
buf.truncate(n);
let nonce = make_nonce(&header.nonce_prefix, counter, true);
aead.decrypt_in_place(&nonce, &aad, &mut buf)?;
f_plain.write_all(&buf)?;
bytes_written = bytes_written.saturating_add(buf.len() as u64);
break;
}
ReadInfoChunk::Empty => {
@@ -171,6 +237,236 @@ pub fn decrypt<S: AsRef<str>>(
}
}
if let Some(committed) = header.plaintext_length
&& committed != bytes_written
{
return Err(FcryError::Format(format!(
"decrypted length {bytes_written} disagrees with committed {committed}"
)));
}
f_plain.commit()?;
Ok(())
}
/// Random-access decrypt of a byte range. Requires a seekable input file
/// whose header has `FLAG_LENGTH_COMMITTED` set, so we know exactly where
/// each ciphertext chunk lives and which chunk is the last (its nonce uses
/// the STREAM last-block flag).
pub fn decrypt_range<S: AsRef<str>>(
input_file: &str,
output_file: Option<S>,
raw_key: Option<&SecretBytes32>,
passphrase: Option<&SecretVec>,
offset: u64,
length: u64,
) -> Result<(), FcryError> {
let file = File::open(input_file)?;
let mut reader = BufReader::new(file);
let header = Header::read(&mut reader)?;
let aad = header.encode();
let header_len = aad.len() as u64;
let total = header.plaintext_length.ok_or_else(|| {
FcryError::Format(
"random-access decrypt requires a length-committed header (encrypt from a regular file)".into(),
)
})?;
let end = offset
.checked_add(length)
.ok_or_else(|| FcryError::Format("offset + length overflows u64".into()))?;
if end > total {
return Err(FcryError::Format(format!(
"range [{offset}, {end}) exceeds plaintext length {total}"
)));
}
let key = derive_key(&header.kdf, raw_key, passphrase)?;
let aead = build_aead(&key);
let chunk_sz = header.chunk_size as u64;
let cipher_chunk = chunk_sz + TAG_LEN as u64;
// Layout invariants:
// n_chunks = ceil(total / chunk_sz), but always ≥ 1 (the empty file
// still authenticates a single empty "last" chunk).
// last_idx = n_chunks - 1
// last_pt = total - last_idx * chunk_sz (in [0, chunk_sz])
let (n_chunks, last_pt) = if total == 0 {
(1u64, 0u64)
} else {
let n = total.div_ceil(chunk_sz);
let last = total - (n - 1) * chunk_sz;
(n, last)
};
let last_idx = n_chunks - 1;
let mut out = OutSink::open(output_file)?;
if length == 0 {
out.commit()?;
return Ok(());
}
let start_chunk = offset / chunk_sz;
let end_chunk = (end - 1) / chunk_sz;
// Reusable buffer sized to a full chunk + tag.
let mut buf = Vec::with_capacity(cipher_chunk as usize);
let mut file = reader.into_inner();
for i in start_chunk..=end_chunk {
let i_u32 =
u32::try_from(i).map_err(|_| FcryError::Format("chunk index exceeds u32".into()))?;
let is_last = i == last_idx;
let cipher_len = if is_last {
last_pt + TAG_LEN as u64
} else {
cipher_chunk
};
let cipher_len_usz =
usize::try_from(cipher_len).map_err(|_| FcryError::Format("chunk too big".into()))?;
let chunk_offset = header_len + i * cipher_chunk;
file.seek(SeekFrom::Start(chunk_offset))?;
buf.clear();
buf.resize(cipher_len_usz, 0);
file.read_exact(&mut buf)?;
let nonce = make_nonce(&header.nonce_prefix, i_u32, is_last);
aead.decrypt_in_place(&nonce, &aad, &mut buf)?;
// `buf` is now plaintext for this chunk. Compute the chunk's plaintext
// window in absolute bytes and intersect with the requested range.
let chunk_start = i * chunk_sz;
let chunk_end = chunk_start + buf.len() as u64;
let lo = offset.max(chunk_start) - chunk_start;
let hi = end.min(chunk_end) - chunk_start;
out.write_all(&buf[lo as usize..hi as usize])?;
}
out.commit()?;
Ok(())
}
#[cfg(test)]
mod tests {
//! Regression tests for cross-version compatibility. The on-disk header
//! is part of the AEAD AAD, so any byte that ends up in `Header::encode()`
//! must match the bytes that were authenticated when the file was
//! written. The v1 test below catches the regression where `encode()`
//! used to hard-code the current version on output.
use super::*;
use crate::header::{Header, KdfParams, NONCE_PREFIX_LEN};
use std::fs;
use tempfile::TempDir;
fn write_v1_ciphertext(path: &std::path::Path, key: &SecretBytes32, plaintext: &[u8]) {
// Build a v1 header by hand: same wire format as v2 with flags=0,
// but with version byte = 1.
let nonce_prefix = [0x42u8; NONCE_PREFIX_LEN];
let header = Header {
version: 1,
alg: AlgId::XChaCha20Poly1305,
flags: 0,
chunk_size: 64,
kdf: KdfParams::Raw,
nonce_prefix,
plaintext_length: None,
};
let aad = header.encode();
// First byte after MAGIC is the version — verify our fixture really
// is v1 (so this test fails open if encode() ever reverts).
assert_eq!(aad[4], 1);
let chunk_sz = header.chunk_size as usize;
let aead = build_aead(key);
let mut out = Vec::new();
out.extend_from_slice(&aad);
let mut counter: u32 = 0;
let mut pos = 0;
while pos < plaintext.len() {
let end = (pos + chunk_sz).min(plaintext.len());
let last = end == plaintext.len() && (end - pos) < chunk_sz;
let mut buf = plaintext[pos..end].to_vec();
let nonce = make_nonce(&nonce_prefix, counter, last);
aead.encrypt_in_place(&nonce, &aad, &mut buf).unwrap();
out.extend_from_slice(&buf);
pos = end;
if last {
break;
}
// If we hit a chunk-boundary EOF we still need a trailing "last"
// empty chunk to authenticate end-of-stream.
counter = bump_counter(counter).unwrap();
if pos == plaintext.len() {
let mut empty = Vec::new();
let nonce = make_nonce(&nonce_prefix, counter, true);
aead.encrypt_in_place(&nonce, &aad, &mut empty).unwrap();
out.extend_from_slice(&empty);
break;
}
}
// Empty plaintext: emit a single empty "last" chunk.
if plaintext.is_empty() {
let mut empty = Vec::new();
let nonce = make_nonce(&nonce_prefix, 0, true);
aead.encrypt_in_place(&nonce, &aad, &mut empty).unwrap();
out.extend_from_slice(&empty);
}
fs::write(path, &out).unwrap();
}
#[test]
fn decrypts_v1_ciphertext() {
let dir = TempDir::new().unwrap();
let ct = dir.path().join("v1.bin");
let rt = dir.path().join("rt.bin");
let mut key = SecretBytes32::zeroed();
key.with_mut_array(|k| k.copy_from_slice(b"0123456789abcdef0123456789abcdef"));
// Multi-chunk plaintext (chunk_size = 64 in the fixture).
let plain: Vec<u8> = (0..200u8).collect();
write_v1_ciphertext(&ct, &key, &plain);
decrypt(
Some(ct.to_str().unwrap()),
Some(rt.to_str().unwrap()),
Some(&key),
None,
1,
)
.expect("v1 decrypt should succeed");
let got = fs::read(&rt).unwrap();
assert_eq!(got, plain);
}
#[test]
fn decrypts_v1_ciphertext_parallel() {
// Same fixture, but exercising the multi-threaded pipeline.
let dir = TempDir::new().unwrap();
let ct = dir.path().join("v1.bin");
let rt = dir.path().join("rt.bin");
let mut key = SecretBytes32::zeroed();
key.with_mut_array(|k| k.copy_from_slice(b"0123456789abcdef0123456789abcdef"));
let plain: Vec<u8> = (0..200u8).collect();
write_v1_ciphertext(&ct, &key, &plain);
decrypt(
Some(ct.to_str().unwrap()),
Some(rt.to_str().unwrap()),
Some(&key),
None,
4,
)
.expect("v1 parallel decrypt should succeed");
assert_eq!(fs::read(&rt).unwrap(), plain);
}
}
+118 -6
View File
@@ -13,25 +13,40 @@
//! kdf_id u8 1
//! kdf_params variable (depends on kdf_id)
//! nonce_prefix [u8; 19] 19 (STREAM nonce prefix)
//! plaintext_length u64 LE 8 (only if version >= 2 and flags & 0x01)
//! --- end of header ---
//! chunk[0..N] each chunk_size + 16 bytes,
//! last may be shorter
//! ```
//!
//! The full encoded header is fed as AAD to every chunk, so any tampering
//! with chunk_size, nonce_prefix, kdf params, etc. causes authentication
//! failure on every chunk.
//! with chunk_size, nonce_prefix, kdf params, plaintext_length, etc. causes
//! authentication failure on every chunk.
//!
//! Versions:
//! * v1 — no length committed, no flag bits used.
//! * v2 — adds `FLAG_LENGTH_COMMITTED` (bit 0); when set, the total plaintext
//! length is appended after `nonce_prefix`. This enables random-access
//! decryption without scanning predecessors.
use std::io::Read;
use crate::error::FcryError;
const MAGIC: [u8; 4] = *b"fcry";
const VERSION: u8 = 1;
pub const VERSION_CURRENT: u8 = 2;
const VERSION_MIN: u8 = 1;
pub const NONCE_PREFIX_LEN: usize = 19;
pub const TAG_LEN: usize = 16;
/// Set in `flags` when the header carries an authenticated `plaintext_length`
/// field. Required for random-access decryption.
pub const FLAG_LENGTH_COMMITTED: u8 = 0x01;
/// Mask of all flag bits this build understands. Unknown bits → reject.
const FLAG_KNOWN_MASK: u8 = FLAG_LENGTH_COMMITTED;
#[repr(u8)]
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum AlgId {
@@ -116,18 +131,24 @@ impl KdfParams {
#[derive(Clone, Debug)]
pub struct Header {
/// On-disk format version. Set to `VERSION_CURRENT` for new encrypts;
/// preserved as-read for decrypt so the AAD recomputed on decode matches
/// the bytes that were authenticated when the file was written.
pub version: u8,
pub alg: AlgId,
pub flags: u8,
pub chunk_size: u32,
pub kdf: KdfParams,
pub nonce_prefix: [u8; NONCE_PREFIX_LEN],
/// Total plaintext byte count. `Some` iff `flags & FLAG_LENGTH_COMMITTED`.
pub plaintext_length: Option<u64>,
}
impl Header {
pub fn encode(&self) -> Vec<u8> {
let mut out = Vec::with_capacity(64);
let mut out = Vec::with_capacity(72);
out.extend_from_slice(&MAGIC);
out.push(VERSION);
out.push(self.version);
out.push(self.alg as u8);
out.push(self.flags);
out.push(0); // reserved
@@ -135,6 +156,12 @@ impl Header {
out.push(self.kdf.id());
self.kdf.write_into(&mut out);
out.extend_from_slice(&self.nonce_prefix);
if (self.flags & FLAG_LENGTH_COMMITTED) != 0 {
let len = self
.plaintext_length
.expect("FLAG_LENGTH_COMMITTED set but plaintext_length is None");
out.extend_from_slice(&len.to_le_bytes());
}
out
}
@@ -148,12 +175,20 @@ impl Header {
let mut fixed = [0u8; 4];
r.read_exact(&mut fixed)?;
let [version, alg_id, flags, reserved] = fixed;
if version != VERSION {
if !(VERSION_MIN..=VERSION_CURRENT).contains(&version) {
return Err(FcryError::Format(format!("unsupported version: {version}")));
}
if reserved != 0 {
return Err(FcryError::Format("reserved byte must be zero".into()));
}
if (flags & !FLAG_KNOWN_MASK) != 0 {
return Err(FcryError::Format(format!(
"unknown flag bits: 0x{flags:02x}"
)));
}
if version < 2 && flags != 0 {
return Err(FcryError::Format("v1 header must have flags == 0".into()));
}
let alg = AlgId::from_u8(alg_id)?;
let mut chunk_size_bytes = [0u8; 4];
@@ -170,12 +205,22 @@ impl Header {
let mut nonce_prefix = [0u8; NONCE_PREFIX_LEN];
r.read_exact(&mut nonce_prefix)?;
let plaintext_length = if (flags & FLAG_LENGTH_COMMITTED) != 0 {
let mut b = [0u8; 8];
r.read_exact(&mut b)?;
Some(u64::from_le_bytes(b))
} else {
None
};
Ok(Self {
version,
alg,
flags,
chunk_size,
kdf,
nonce_prefix,
plaintext_length,
})
}
}
@@ -188,30 +233,55 @@ mod tests {
#[test]
fn roundtrip() {
let h = Header {
version: VERSION_CURRENT,
alg: AlgId::XChaCha20Poly1305,
flags: 0,
chunk_size: 1024 * 1024,
kdf: KdfParams::Raw,
nonce_prefix: [7u8; NONCE_PREFIX_LEN],
plaintext_length: None,
};
let bytes = h.encode();
let mut cur = Cursor::new(&bytes);
let parsed = Header::read(&mut cur).unwrap();
assert_eq!(parsed.version, h.version);
assert_eq!(parsed.alg, h.alg);
assert_eq!(parsed.flags, h.flags);
assert_eq!(parsed.chunk_size, h.chunk_size);
assert_eq!(parsed.nonce_prefix, h.nonce_prefix);
assert_eq!(parsed.plaintext_length, None);
assert_eq!(cur.position() as usize, bytes.len());
}
#[test]
fn roundtrip_length_committed() {
let h = Header {
version: VERSION_CURRENT,
alg: AlgId::XChaCha20Poly1305,
flags: FLAG_LENGTH_COMMITTED,
chunk_size: 65536,
kdf: KdfParams::Raw,
nonce_prefix: [9u8; NONCE_PREFIX_LEN],
plaintext_length: Some(123_456_789),
};
let bytes = h.encode();
let mut cur = Cursor::new(&bytes);
let parsed = Header::read(&mut cur).unwrap();
assert_eq!(parsed.flags, FLAG_LENGTH_COMMITTED);
assert_eq!(parsed.plaintext_length, Some(123_456_789));
assert_eq!(cur.position() as usize, bytes.len());
}
#[test]
fn rejects_bad_magic() {
let mut bytes = Header {
version: VERSION_CURRENT,
alg: AlgId::XChaCha20Poly1305,
flags: 0,
chunk_size: 4096,
kdf: KdfParams::Raw,
nonce_prefix: [0u8; NONCE_PREFIX_LEN],
plaintext_length: None,
}
.encode();
bytes[0] ^= 1;
@@ -220,4 +290,46 @@ mod tests {
Err(FcryError::Format(_))
));
}
#[test]
fn rejects_unknown_flag_bits() {
let mut bytes = Header {
version: VERSION_CURRENT,
alg: AlgId::XChaCha20Poly1305,
flags: 0,
chunk_size: 4096,
kdf: KdfParams::Raw,
nonce_prefix: [0u8; NONCE_PREFIX_LEN],
plaintext_length: None,
}
.encode();
// flags byte is at offset 6 (4 magic + version + alg)
bytes[6] = 0x80;
assert!(matches!(
Header::read(&mut Cursor::new(&bytes)),
Err(FcryError::Format(_))
));
}
#[test]
fn reads_v1_header() {
// hand-crafted v1 header (raw kdf, no length field)
let mut bytes = Vec::new();
bytes.extend_from_slice(b"fcry");
bytes.push(1); // version
bytes.push(1); // alg
bytes.push(0); // flags
bytes.push(0); // reserved
bytes.extend_from_slice(&1024u32.to_le_bytes());
bytes.push(0); // kdf id raw
bytes.extend_from_slice(&[3u8; NONCE_PREFIX_LEN]);
let parsed = Header::read(&mut Cursor::new(&bytes)).unwrap();
assert_eq!(parsed.version, 1);
assert_eq!(parsed.flags, 0);
assert_eq!(parsed.chunk_size, 1024);
assert_eq!(parsed.plaintext_length, None);
// Re-encoding must reproduce the original v1 bytes exactly so the
// recomputed AAD matches what the file was authenticated with.
assert_eq!(parsed.encode(), bytes);
}
}
+62 -9
View File
@@ -3,6 +3,7 @@
mod crypto;
mod error;
mod header;
mod pipeline;
mod reader;
mod secrets;
mod utils;
@@ -39,6 +40,7 @@ struct Cli {
raw_key: Option<Zeroizing<String>>,
/// Read passphrase interactively (terminal). Implies argon2id KDF on encrypt.
/// This is the default when no key source is specified.
#[clap(short, long)]
passphrase: bool,
@@ -62,6 +64,32 @@ struct Cli {
/// Argon2id parallelism / lanes (encryption only).
#[clap(long, default_value_t = 4)]
argon_parallelism: u32,
/// Number of worker threads for AEAD work. Defaults to the number of
/// available CPUs. Set to 1 for fully serial encrypt/decrypt.
#[clap(short = 'j', long, value_parser = clap::value_parser!(u32).range(1..))]
threads: Option<u32>,
/// Random-access decrypt: byte offset of the slice to read.
/// Requires `--decrypt`, an `--input-file` whose header has the
/// length-committed flag set, and `--length`.
#[clap(
long,
requires = "length",
requires = "decrypt",
requires = "input_file"
)]
offset: Option<u64>,
/// Random-access decrypt: byte length of the slice to read.
/// Requires `--decrypt`, `--input-file`, and `--offset`.
#[clap(
long,
requires = "offset",
requires = "decrypt",
requires = "input_file"
)]
length: Option<u64>,
}
fn parse_raw_key(s: &str) -> Result<SecretBytes32, FcryError> {
@@ -131,8 +159,13 @@ fn run(mut cli: Cli) -> Result<(), FcryError> {
let raw_key_str: Option<Zeroizing<String>> = cli.raw_key.take();
let pw_src: Option<PassphraseSource> = if cli.passphrase {
Some(PassphraseSource::Tty)
} else if let Some(var) = cli.passphrase_env.take() {
Some(PassphraseSource::EnvVar(var))
} else if raw_key_str.is_none() {
// Default to interactive TTY passphrase when no key source is given.
Some(PassphraseSource::Tty)
} else {
cli.passphrase_env.take().map(PassphraseSource::EnvVar)
None
};
let decrypt_mode = cli.decrypt;
@@ -142,14 +175,15 @@ fn run(mut cli: Cli) -> Result<(), FcryError> {
let argon_memory = cli.argon_memory;
let argon_passes = cli.argon_passes;
let argon_parallelism = cli.argon_parallelism;
let threads = cli.threads.map(|n| n as usize).unwrap_or_else(|| {
std::thread::available_parallelism()
.map(|n| n.get())
.unwrap_or(1)
});
let offset = cli.offset;
let length = cli.length;
drop(cli);
if pw_src.is_none() && raw_key_str.is_none() {
return Err(FcryError::Format(
"must provide one of --raw-key, --passphrase, --passphrase-env".into(),
));
}
if decrypt_mode {
let raw_key = match raw_key_str.as_deref() {
Some(s) => Some(parse_raw_key(s)?),
@@ -159,7 +193,26 @@ fn run(mut cli: Cli) -> Result<(), FcryError> {
Some(src) => Some(read_passphrase(src, false)?),
None => None,
};
decrypt(input, output, raw_key.as_ref(), pw.as_ref())?;
match (offset, length) {
(Some(o), Some(l)) => {
// clap's `requires` makes this unreachable, but keep the
// dynamic check so the failure mode is a clean error.
let path = input.as_deref().ok_or_else(|| {
FcryError::Format(
"--offset/--length require --input-file (random-access needs a seekable file)".into(),
)
})?;
decrypt_range(path, output, raw_key.as_ref(), pw.as_ref(), o, l)?;
}
(None, None) => {
decrypt(input, output, raw_key.as_ref(), pw.as_ref(), threads)?;
}
_ => {
return Err(FcryError::Format(
"--offset and --length must be supplied together".into(),
));
}
}
} else {
let (key, kdf) = if let Some(src) = &pw_src {
let mut salt = [0u8; ARGON2_SALT_LEN];
@@ -180,7 +233,7 @@ fn run(mut cli: Cli) -> Result<(), FcryError> {
let key = parse_raw_key(raw_key_str.as_deref().unwrap())?;
(key, KdfParams::Raw)
};
encrypt(input, output, &key, chunk_size, kdf)?;
encrypt(input, output, &key, chunk_size, kdf, threads)?;
}
Ok(())
+373
View File
@@ -0,0 +1,373 @@
// SPDX-License-Identifier: GPL-3.0-only
//! Multi-threaded encrypt/decrypt pipeline.
//!
//! Topology:
//!
//! ```text
//! reader thread → jobs (bounded MPMC) → N AEAD workers →
//! → results (bounded MPMC) → writer thread
//! ```
//!
//! The reader is sequential (one input handle, lookahead detects last chunk),
//! workers parallelize the AEAD step (independent per chunk), and the writer
//! reorders results by counter before writing them to `OutSink`.
//!
//! Bounded memory: a permit channel caps the total number of in-flight chunks
//! (queued jobs + in-progress at workers + pending in the writer's reorder
//! buffer). The reader acquires a permit before sending each job; the writer
//! releases a permit after flushing the chunk in order. A slow or stuck worker
//! therefore stalls the reader rather than letting the writer's reorder buffer
//! grow without bound.
//!
//! Fail-fast: a shared `cancel` flag lets workers signal an authentication or
//! AEAD error to the reader. The reader checks it each iteration and exits
//! early, so a tampered chunk doesn't waste full-file I/O on top of the
//! detection.
//!
//! Peak memory ≈ chunk_size × (in_flight_cap + 2). For 1 MiB chunks and 8
//! cores (cap = 32) that's ~34 MiB. Adjust `in_flight_capacity` if you need
//! a different memory/throughput tradeoff.
use std::collections::BTreeMap;
use std::io::Write;
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread::{self, JoinHandle};
use std::time::Duration;
use chacha20poly1305::{XChaCha20Poly1305, aead::AeadInPlace};
use crossbeam_channel::{Receiver, RecvTimeoutError, Sender, bounded};
use crate::crypto::{bump_counter, make_nonce};
use crate::error::FcryError;
use crate::header::NONCE_PREFIX_LEN;
use crate::reader::{AheadReader, ReadInfoChunk};
use crate::utils::OutSink;
struct Job {
counter: u32,
last: bool,
buf: Vec<u8>,
}
struct Done {
counter: u32,
buf: Vec<u8>,
}
/// Job-channel capacity: small multiples of worker count, enough to keep
/// workers fed without unbounded memory.
fn channel_capacity(threads: usize) -> usize {
(threads * 2).max(2)
}
/// Total in-flight chunk cap (jobs queued + at workers + in writer's reorder
/// buffer). Permit count; bounded above the job-channel capacity to absorb
/// reordering without blocking workers unnecessarily.
fn in_flight_capacity(threads: usize) -> usize {
(threads * 4).max(4)
}
#[allow(clippy::too_many_arguments)]
pub(crate) fn encrypt_parallel(
input: AheadReader,
output: OutSink,
aead: Arc<XChaCha20Poly1305>,
aad: Arc<Vec<u8>>,
nonce_prefix: [u8; NONCE_PREFIX_LEN],
chunk_sz: usize,
threads: usize,
expected_length: Option<u64>,
) -> Result<(), FcryError> {
let (sink, bytes_seen) = run_pipeline(
input,
output,
aead,
aad,
nonce_prefix,
chunk_sz,
threads,
true,
)?;
if let Some(committed) = expected_length
&& committed != bytes_seen
{
return Err(FcryError::Format(format!(
"input length changed during encryption: committed {committed}, read {bytes_seen}"
)));
}
sink.commit()?;
Ok(())
}
#[allow(clippy::too_many_arguments)]
pub(crate) fn decrypt_parallel(
input: AheadReader,
output: OutSink,
aead: Arc<XChaCha20Poly1305>,
aad: Arc<Vec<u8>>,
nonce_prefix: [u8; NONCE_PREFIX_LEN],
cipher_chunk: usize,
threads: usize,
expected_length: Option<u64>,
) -> Result<(), FcryError> {
let (sink, written) = run_pipeline(
input,
output,
aead,
aad,
nonce_prefix,
cipher_chunk,
threads,
false,
)?;
if let Some(committed) = expected_length
&& committed != written
{
return Err(FcryError::Format(format!(
"decrypted length {written} disagrees with committed {committed}"
)));
}
sink.commit()?;
Ok(())
}
/// Drives the reader/worker/writer pipeline. `is_encrypt = true` performs
/// `encrypt_in_place` and tracks bytes-read; `false` performs
/// `decrypt_in_place` and tracks bytes-written. The single shared topology
/// keeps backpressure, reorder, and fail-fast logic in one place.
#[allow(clippy::too_many_arguments)]
fn run_pipeline(
mut input: AheadReader,
output: OutSink,
aead: Arc<XChaCha20Poly1305>,
aad: Arc<Vec<u8>>,
nonce_prefix: [u8; NONCE_PREFIX_LEN],
chunk_sz: usize,
threads: usize,
is_encrypt: bool,
) -> Result<(OutSink, u64), FcryError> {
let cap = channel_capacity(threads);
let in_flight = in_flight_capacity(threads);
let (jobs_tx, jobs_rx) = bounded::<Job>(cap);
let (done_tx, done_rx) = bounded::<Done>(cap);
// Pre-fill the permit channel. Each permit represents one in-flight chunk
// slot. The reader consumes a permit before sending a job; the writer
// returns a permit after flushing in order.
let (permit_tx, permit_rx) = bounded::<()>(in_flight);
for _ in 0..in_flight {
permit_tx
.send(())
.expect("pre-fill of permit channel cannot fail");
}
let cancel = Arc::new(AtomicBool::new(false));
// Reader thread: dispatches jobs in counter order and tracks bytes read
// (used for the encrypt-side length cross-check). On decrypt the count is
// ignored — the writer's count is authoritative there.
let reader_handle: JoinHandle<Result<u64, FcryError>> = {
let cancel = cancel.clone();
thread::spawn(move || {
let mut counter: u32 = 0;
let mut bytes_seen: u64 = 0;
loop {
// Acquire an in-flight slot. We recv with a short timeout so
// a worker error (which sets `cancel`) is observed even if
// the rest of the pipeline has quiesced and is no longer
// releasing permits — this avoids a 3-way deadlock between
// reader, idle workers, and a stalled writer.
loop {
if cancel.load(Ordering::Acquire) {
return Ok(bytes_seen);
}
match permit_rx.recv_timeout(Duration::from_millis(50)) {
Ok(()) => break,
Err(RecvTimeoutError::Timeout) => continue,
Err(RecvTimeoutError::Disconnected) => return Ok(bytes_seen),
}
}
let mut buf = vec![0u8; chunk_sz];
match input.read_ahead(&mut buf)? {
ReadInfoChunk::Normal(_) => {
if jobs_tx
.send(Job {
counter,
last: false,
buf,
})
.is_err()
{
return Ok(bytes_seen);
}
bytes_seen = bytes_seen.saturating_add(chunk_sz as u64);
counter = bump_counter(counter)?;
}
ReadInfoChunk::Last(n) => {
buf.truncate(n);
let _ = jobs_tx.send(Job {
counter,
last: true,
buf,
});
bytes_seen = bytes_seen.saturating_add(n as u64);
return Ok(bytes_seen);
}
ReadInfoChunk::Empty => {
if is_encrypt {
buf.clear();
let _ = jobs_tx.send(Job {
counter,
last: true,
buf,
});
return Ok(bytes_seen);
}
// On decrypt an unexpected EOF means the ciphertext is
// truncated. Surface it as an error so the writer
// doesn't commit a partial output.
return Err(FcryError::Format(
"truncated ciphertext: missing final chunk".into(),
));
}
}
}
})
};
// Worker threads: AEAD encrypt/decrypt in place, ship to writer. On error
// we set the cancel flag so the reader exits early, and drop the senders
// so the writer drains and exits.
let mut worker_handles: Vec<JoinHandle<Result<(), FcryError>>> = Vec::with_capacity(threads);
for _ in 0..threads {
let jobs_rx = jobs_rx.clone();
let done_tx = done_tx.clone();
let aead = aead.clone();
let aad = aad.clone();
let cancel = cancel.clone();
worker_handles.push(thread::spawn(move || {
for mut job in jobs_rx.iter() {
if cancel.load(Ordering::Acquire) {
// Drain remaining queued jobs without doing AEAD work.
// Returning Ok here keeps the previously-set error from
// being clobbered by a fresh "ok" status.
break;
}
let nonce = make_nonce(&nonce_prefix, job.counter, job.last);
let res = if is_encrypt {
aead.encrypt_in_place(&nonce, aad.as_slice(), &mut job.buf)
} else {
aead.decrypt_in_place(&nonce, aad.as_slice(), &mut job.buf)
};
if let Err(e) = res {
cancel.store(true, Ordering::Release);
return Err(e.into());
}
if done_tx
.send(Done {
counter: job.counter,
buf: job.buf,
})
.is_err()
{
break;
}
}
Ok(())
}));
}
drop(jobs_rx);
drop(done_tx);
// Writer thread: ordered writeback. Returns the `OutSink` ownership back
// without committing; the caller commits only after every other thread
// has joined cleanly so a failure anywhere drops the sink and unlinks the
// temp file. Releases one permit per chunk flushed so the reader can make
// forward progress in lockstep with the actual disk write.
let writer_handle: JoinHandle<Result<(OutSink, u64), FcryError>> =
thread::spawn(move || ordered_writer(done_rx, output, permit_tx));
let reader_res = reader_handle.join().expect("reader thread panicked");
let mut first_err: Option<FcryError> = None;
let bytes_seen = match reader_res {
Ok(n) => Some(n),
Err(e) => {
cancel.store(true, Ordering::Release);
first_err.get_or_insert(e);
None
}
};
for h in worker_handles {
if let Err(e) = h.join().expect("worker thread panicked")
&& first_err.is_none()
{
first_err = Some(e);
}
}
let writer_res = writer_handle.join().expect("writer thread panicked");
let written = match writer_res {
Ok((sink, n)) => Some((sink, n)),
Err(e) => {
if first_err.is_none() {
first_err = Some(e);
}
None
}
};
if let Some(e) = first_err {
return Err(e);
}
let (sink, n) = written.expect("no error but no sink");
let count = if is_encrypt {
bytes_seen.expect("no error but no reader count")
} else {
n
};
Ok((sink, count))
}
/// Drain `done_rx` in counter order, writing each chunk to `output` and
/// returning a permit to `permit_tx` after every flush so the reader is held
/// in lockstep with disk writes (bounded reorder buffer).
fn ordered_writer(
done_rx: Receiver<Done>,
mut output: OutSink,
permit_tx: Sender<()>,
) -> Result<(OutSink, u64), FcryError> {
let mut next: u32 = 0;
let mut pending: BTreeMap<u32, Vec<u8>> = BTreeMap::new();
let mut total: u64 = 0;
for done in done_rx.iter() {
pending.insert(done.counter, done.buf);
while let Some(buf) = pending.remove(&next) {
output.write_all(&buf)?;
total = total.saturating_add(buf.len() as u64);
// `bump_counter` rejects overflow upstream; a wrap here would be
// a real bug, so use plain addition and let it panic in debug.
next += 1;
// Release one in-flight slot. If the reader is gone the channel
// is closed; we don't care about the send result.
let _ = permit_tx.send(());
}
}
if !pending.is_empty() {
return Err(FcryError::Format(
"internal: ordered writer left chunks unflushed".into(),
));
}
Ok((output, total))
}
// Compile-time check that the job type is Send+Sync (channel sends across
// threads). Kept as a footgun for future struct edits.
#[allow(dead_code)]
fn _assert_send_sync<T: Send + Sync>() {}
const _: fn() = || _assert_send_sync::<Sender<Job>>();
+2 -2
View File
@@ -10,14 +10,14 @@ pub enum ReadInfoChunk {
}
pub struct AheadReader {
inner: Box<dyn BufRead>,
inner: Box<dyn BufRead + Send>,
buf: Vec<u8>,
bufsz: usize,
capacity: usize,
}
impl AheadReader {
pub fn from(reader: Box<dyn BufRead>, capacity: usize) -> Self {
pub fn from(reader: Box<dyn BufRead + Send>, capacity: usize) -> Self {
Self {
inner: reader,
buf: vec![0; capacity],
+32 -4
View File
@@ -10,11 +10,39 @@ use std::path::PathBuf;
/// breaking older files (the decryptor reads the size from the header).
pub const DEFAULT_CHUNK_SIZE: u32 = 1024 * 1024;
pub(crate) fn open_input<S: AsRef<str>>(input_file: Option<S>) -> io::Result<Box<dyn BufRead>> {
Ok(match input_file {
Some(f) => Box::new(BufReader::new(File::open(f.as_ref())?)),
None => Box::new(io::stdin().lock()),
/// Opened input.
///
/// `length` is `Some(n)` only when the source is a regular file (we stat the
/// open FD to avoid TOCTOU). For stdin, FIFOs, sockets, char devices, etc.
/// it is `None` — those paths cannot commit a length in the header.
pub(crate) struct Input {
pub reader: Box<dyn BufRead + Send>,
pub length: Option<u64>,
}
pub(crate) fn open_input<S: AsRef<str>>(input_file: Option<S>) -> io::Result<Input> {
match input_file {
Some(f) => {
let file = File::open(f.as_ref())?;
// Stat the open FD (not the path) so we can't be raced between
// stat and open.
let length = file
.metadata()
.ok()
.filter(|m| m.is_file())
.map(|m| m.len());
Ok(Input {
reader: Box::new(BufReader::new(file)),
length,
})
}
None => Ok(Input {
// `Stdin` is `Send` (unlike `StdinLock`), so wrap it in a
// `BufReader` and box for cross-thread use in the parallel pipeline.
reader: Box::new(BufReader::new(io::stdin())),
length: None,
}),
}
}
/// Output sink that supports atomic file replacement.
+322
View File
@@ -422,6 +422,328 @@ fn atomic_output_no_stale_tmp_on_failure() {
assert!(!tmp.exists(), "temp file must be cleaned up");
}
// ---------------------------------------------------------------------------
// Multi-threaded pipeline + length-committed + random-access tests
// ---------------------------------------------------------------------------
fn encrypt_file_threads(
plain: &std::path::Path,
ct: &std::path::Path,
chunk_size: Option<u32>,
threads: usize,
) {
let mut cmd = fcry();
cmd.arg("-i")
.arg(plain)
.arg("-o")
.arg(ct)
.arg("--raw-key")
.arg(KEY_STR)
.arg("-j")
.arg(threads.to_string());
if let Some(cs) = chunk_size {
cmd.arg("--chunk-size").arg(cs.to_string());
}
let out = cmd.output().unwrap();
assert!(
out.status.success(),
"encrypt -j{threads} failed: {}",
String::from_utf8_lossy(&out.stderr)
);
}
fn decrypt_file_threads(ct: &std::path::Path, rt: &std::path::Path, threads: usize) {
let out = fcry()
.arg("-d")
.arg("-i")
.arg(ct)
.arg("-o")
.arg(rt)
.arg("--raw-key")
.arg(KEY_STR)
.arg("-j")
.arg(threads.to_string())
.output()
.unwrap();
assert!(
out.status.success(),
"decrypt -j{threads} failed: {}",
String::from_utf8_lossy(&out.stderr)
);
}
#[test]
fn roundtrip_multi_threaded() {
// Multi-chunk input. Encrypt+decrypt with -j 4 must round-trip.
let dir = TempDir::new().unwrap();
let plain = dir.path().join("p.bin");
let ct = dir.path().join("c.bin");
let rt = dir.path().join("r.bin");
let data = pseudo_random(11, 5 * 1024 * 1024 + 12345);
fs::write(&plain, &data).unwrap();
encrypt_file_threads(&plain, &ct, Some(64 * 1024), 4);
decrypt_file_threads(&ct, &rt, 4);
assert_eq!(fs::read(&rt).unwrap(), data);
}
#[test]
fn parallel_and_serial_outputs_round_trip() {
// Encrypt with -j 4 and decrypt serially (and vice-versa); both directions
// must yield the original plaintext.
let dir = TempDir::new().unwrap();
let plain = dir.path().join("p.bin");
let data = pseudo_random(13, 256 * 1024 + 17);
fs::write(&plain, &data).unwrap();
let ct_par = dir.path().join("c_par.bin");
let ct_ser = dir.path().join("c_ser.bin");
encrypt_file_threads(&plain, &ct_par, Some(8192), 4);
encrypt_file_threads(&plain, &ct_ser, Some(8192), 1);
let rt1 = dir.path().join("r1.bin");
let rt2 = dir.path().join("r2.bin");
// par-encrypted, serial-decrypted
decrypt_file_threads(&ct_par, &rt1, 1);
// serial-encrypted, par-decrypted
decrypt_file_threads(&ct_ser, &rt2, 4);
assert_eq!(fs::read(&rt1).unwrap(), data);
assert_eq!(fs::read(&rt2).unwrap(), data);
}
#[test]
fn roundtrip_pipe_multi_threaded() {
// stdin/stdout mode with -j 4: length flag must NOT be set (no committed
// length when we don't know the input size), but encrypt/decrypt must still
// round-trip cleanly across the pipeline.
let data = pseudo_random(14, 200_000);
let mut enc = fcry()
.arg("--raw-key")
.arg(KEY_STR)
.arg("-j")
.arg("4")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.unwrap();
enc.stdin.as_mut().unwrap().write_all(&data).unwrap();
let enc_out = enc.wait_with_output().unwrap();
assert!(
enc_out.status.success(),
"pipe encrypt -j4 failed: {}",
String::from_utf8_lossy(&enc_out.stderr)
);
// flags byte at offset 6 must be 0 (no length committed for stdin input).
assert_eq!(
enc_out.stdout[6], 0,
"stdin-encrypted file unexpectedly committed length"
);
let mut dec = fcry()
.arg("-d")
.arg("--raw-key")
.arg(KEY_STR)
.arg("-j")
.arg("4")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.unwrap();
dec.stdin
.as_mut()
.unwrap()
.write_all(&enc_out.stdout)
.unwrap();
let dec_out = dec.wait_with_output().unwrap();
assert!(
dec_out.status.success(),
"pipe decrypt -j4 failed: {}",
String::from_utf8_lossy(&dec_out.stderr)
);
assert_eq!(dec_out.stdout, data);
}
#[test]
fn file_input_commits_length() {
// Encrypting from a regular file must auto-set FLAG_LENGTH_COMMITTED (bit 0
// of the flags byte at offset 6) and embed the length.
let dir = TempDir::new().unwrap();
let plain = dir.path().join("p.bin");
let ct = dir.path().join("c.bin");
let data = pseudo_random(15, 50_000);
fs::write(&plain, &data).unwrap();
encrypt_file(&plain, &ct, Some(4096));
let bytes = fs::read(&ct).unwrap();
// Magic(4) + version(1) + alg(1) + flags(1) = byte 6
assert_eq!(bytes[4], 2, "version should be 2");
assert_eq!(bytes[6] & 0x01, 0x01, "length-committed flag should be set");
}
fn encrypt_random_access_fixture(
dir: &std::path::Path,
data: &[u8],
chunk_size: u32,
) -> std::path::PathBuf {
let plain = dir.join("p.bin");
let ct = dir.join("c.bin");
fs::write(&plain, data).unwrap();
encrypt_file(&plain, &ct, Some(chunk_size));
ct
}
fn random_access_decrypt(
ct: &std::path::Path,
out: &std::path::Path,
offset: u64,
length: u64,
) -> std::process::Output {
fcry()
.arg("-d")
.arg("-i")
.arg(ct)
.arg("-o")
.arg(out)
.arg("--raw-key")
.arg(KEY_STR)
.arg("--offset")
.arg(offset.to_string())
.arg("--length")
.arg(length.to_string())
.output()
.unwrap()
}
#[test]
fn random_access_decrypt_slices() {
let dir = TempDir::new().unwrap();
let chunk = 4096u32;
let total = 5 * 1024 * 1024 + 12345;
let data = pseudo_random(16, total);
let ct = encrypt_random_access_fixture(dir.path(), &data, chunk);
// (offset, length) cases:
// - chunk-aligned start, mid-chunk end
// - mid-chunk start crossing several chunks
// - last partial chunk
// - last byte
// - entire file
let cases: &[(u64, u64)] = &[
(0, 1),
(chunk as u64, 7),
(chunk as u64 - 5, 100),
(10, chunk as u64 * 3 + 17),
(total as u64 - 1, 1),
(total as u64 - 100, 100),
(0, total as u64),
];
for (i, (offset, length)) in cases.iter().copied().enumerate() {
let out = dir.path().join(format!("slice_{i}.bin"));
let r = random_access_decrypt(&ct, &out, offset, length);
assert!(
r.status.success(),
"slice {i} ({offset}, {length}) failed: {}",
String::from_utf8_lossy(&r.stderr)
);
let got = fs::read(&out).unwrap();
let expected = &data[offset as usize..(offset + length) as usize];
assert_eq!(got, expected, "slice {i} mismatch");
}
}
#[test]
fn random_access_rejects_out_of_range() {
let dir = TempDir::new().unwrap();
let data = pseudo_random(17, 1000);
let ct = encrypt_random_access_fixture(dir.path(), &data, 256);
let out = dir.path().join("oob.bin");
let r = random_access_decrypt(&ct, &out, 900, 1000); // 900+1000 > 1000
assert!(!r.status.success(), "out-of-range slice should fail");
}
#[test]
fn random_access_rejects_stdin_encrypted() {
// Encrypt via stdin → no length committed → random access must refuse.
let data = pseudo_random(18, 2000);
let dir = TempDir::new().unwrap();
let ct = dir.path().join("c.bin");
let mut enc = fcry()
.arg("--raw-key")
.arg(KEY_STR)
.arg("-o")
.arg(&ct)
.stdin(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.unwrap();
enc.stdin.as_mut().unwrap().write_all(&data).unwrap();
assert!(enc.wait().unwrap().success());
let out = dir.path().join("slice.bin");
let r = random_access_decrypt(&ct, &out, 0, 100);
assert!(
!r.status.success(),
"random access on stdin-encrypted file should fail"
);
}
#[test]
fn random_access_zero_length() {
let dir = TempDir::new().unwrap();
let data = pseudo_random(19, 1000);
let ct = encrypt_random_access_fixture(dir.path(), &data, 256);
let out = dir.path().join("empty.bin");
let r = random_access_decrypt(&ct, &out, 500, 0);
assert!(r.status.success(), "zero-length slice should succeed");
assert_eq!(fs::read(&out).unwrap(), Vec::<u8>::new());
}
#[test]
fn random_access_tampered_length_fails() {
// Flip a byte inside the committed plaintext_length field. The header is
// AAD for every chunk, so the AEAD must reject decryption.
let dir = TempDir::new().unwrap();
let data = pseudo_random(20, 4000);
let ct = encrypt_random_access_fixture(dir.path(), &data, 1024);
let mut bytes = fs::read(&ct).unwrap();
// For raw-kdf header: magic(4)+ver(1)+alg(1)+flags(1)+rsv(1)+chunksize(4)+kdf_id(1)+nonce_prefix(19) = 32
// plaintext_length is at offset 32..40.
bytes[34] ^= 0xff;
fs::write(&ct, &bytes).unwrap();
let out = dir.path().join("bad.bin");
let r = random_access_decrypt(&ct, &out, 0, 100);
assert!(
!r.status.success(),
"tampered plaintext_length must fail authentication"
);
}
#[test]
fn rejects_zero_threads() {
// -j 0 is almost certainly a user mistake. Clap should reject it before
// we ever reach the pipeline.
let dir = TempDir::new().unwrap();
let plain = dir.path().join("p.bin");
fs::write(&plain, b"hello").unwrap();
let out = fcry()
.arg("-i")
.arg(&plain)
.arg("-o")
.arg(dir.path().join("c.bin"))
.arg("--raw-key")
.arg(KEY_STR)
.arg("-j")
.arg("0")
.output()
.unwrap();
assert!(!out.status.success(), "-j 0 should be rejected");
}
#[test]
fn header_chunk_size_is_authoritative_on_decrypt() {
// Encrypt with a non-default chunk size; decrypt without specifying one.