75afadb1ecac4f4ac59ad4226d6fcc1e46d23a3e
13 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
75afadb1ec
|
feat!: multi-threaded pipeline + length-committed/random-access decrypt
Completes the two follow-ups deferred from the v0.10 format/secrets
work: multi-threaded AEAD encrypt/decrypt and a length-committed file
format that enables random-access decryption.
# Format change (file format v2)
Bumps the on-disk header version to 2 and introduces a flag bit
(`FLAG_LENGTH_COMMITTED`, bit 0). When set, an authenticated `u64 LE`
plaintext length is appended to the header after the nonce prefix. v1
files still decrypt unchanged. v2 readers reject unknown flag bits.
The flag is set automatically when the input is a regular file (we
stat the open FD to avoid TOCTOU). Stdin/pipes/FIFOs encrypt as before
with the flag clear. Sequential decrypt cross-checks the produced byte
count against the committed length as defense in depth (the AEAD
already authenticates the value via header AAD, but failing before we
rename the temp file into place is preferable to failing after).
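
The flag handling described above could be sketched roughly as follows. Only `FLAG_LENGTH_COMMITTED` (bit 0), the `u64 LE` length field, and the "reject unknown flag bits" rule come from this message; the helper names and the treatment of flag bits in v1 headers are assumptions made for illustration.

```rust
/// Flag bit 0 from the text: an authenticated `u64 LE` plaintext length
/// follows the nonce prefix.
pub const FLAG_LENGTH_COMMITTED: u8 = 1 << 0;

/// Assumption: bit 0 is the only flag defined so far.
const KNOWN_FLAGS: u8 = FLAG_LENGTH_COMMITTED;

/// Hypothetical helper: reports whether a header commits to a length, or
/// rejects a version / flag combination this sketch does not understand.
pub fn length_committed(version: u8, flags: u8) -> Result<bool, String> {
    match version {
        // Assumption: v1 defines no flags, so any set bit is an error here.
        1 if flags == 0 => Ok(false),
        1 => Err(format!("v1 header with flags {flags:#04x}")),
        2 if (flags & !KNOWN_FLAGS) != 0 => {
            Err(format!("unknown flag bits {:#04x}", flags & !KNOWN_FLAGS))
        }
        2 => Ok((flags & FLAG_LENGTH_COMMITTED) != 0),
        v => Err(format!("unsupported header version {v}")),
    }
}

/// Decodes the committed length field (little-endian u64).
pub fn decode_committed_len(field: [u8; 8]) -> u64 {
    u64::from_le_bytes(field)
}

/// Sequential-decrypt cross-check: compare the produced byte count with
/// the committed length, if the header carries one.
pub fn check_produced(committed: Option<u64>, produced: u64) -> Result<(), String> {
    match committed {
        Some(expected) if expected != produced => Err(format!(
            "length mismatch: header commits {expected} bytes, produced {produced}"
        )),
        _ => Ok(()),
    }
}
```

In this sketch the cross-check would run before the output is committed, which matches the "fail before rename" intent stated above.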
# Random-access decrypt
`fcry -d -i FILE --offset N --length L` seeks directly to the chunk(s)
covering `[N, N+L)` and decrypts only those, without scanning the
predecessors. Requires a seekable file whose header has the
length-committed flag — stdin/pipe-encrypted files cannot use this
path and the CLI rejects it with a clear error.
The chunk layout is fully determined by `chunk_size` and the committed
total length (last chunk's plaintext is
`total - (n_chunks-1)*chunk_size`; its ciphertext length is
`last_pt + 16`). Each chunk's nonce is
`make_nonce(prefix, chunk_index, is_last_chunk)` which matches what
sequential encrypt produced, so plaintext slices come out
bit-identical to a full sequential decrypt.
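
A minimal sketch of the chunk arithmetic above, assuming a full final chunk when the total is an exact multiple of `chunk_size` and ciphertext offsets measured from the first chunk (that is, after the header). `ChunkSpan` and `chunks_for_range` are hypothetical names, not the ones used in the code.

```rust
/// Poly1305 tag appended to each chunk's ciphertext.
const TAG_LEN: u64 = 16;

/// One chunk that must be read and decrypted to serve a slice request.
#[derive(Debug, PartialEq)]
pub struct ChunkSpan {
    pub index: u64,     // chunk counter fed into the nonce
    pub ct_offset: u64, // ciphertext offset, relative to the first chunk
    pub ct_len: u64,    // plaintext length of this chunk + 16-byte tag
    pub is_last: bool,  // selects the "last chunk" nonce flag
}

/// Lists the chunks covering plaintext range `[offset, offset + length)`.
pub fn chunks_for_range(
    total: u64,
    chunk_size: u32,
    offset: u64,
    length: u64,
) -> Result<Vec<ChunkSpan>, String> {
    if chunk_size == 0 {
        return Err("chunk_size must be non-zero".to_string());
    }
    let cs = u64::from(chunk_size);
    let end = offset
        .checked_add(length)
        .filter(|&e| e <= total)
        .ok_or_else(|| "range exceeds committed length".to_string())?;
    if length == 0 {
        return Ok(Vec::new()); // zero-length slices are accepted
    }
    // Ceiling division: number of chunks in the whole file.
    let n_chunks = total / cs + u64::from(total % cs != 0);
    let first = offset / cs;
    let last = (end - 1) / cs;
    (first..=last)
        .map(|i| -> Result<ChunkSpan, String> {
            let is_last = i == n_chunks - 1;
            // last_pt = total - (n_chunks - 1) * chunk_size
            let pt_len = if is_last { total - i * cs } else { cs };
            let ct_offset = i
                .checked_mul(cs + TAG_LEN)
                .ok_or_else(|| "ciphertext offset overflows u64".to_string())?;
            Ok(ChunkSpan { index: i, ct_offset, ct_len: pt_len + TAG_LEN, is_last })
        })
        .collect()
}
```

For a 10-byte file with 4-byte chunks, a request for `[3, 6)` touches chunks 0 and 1 only; the caller would then trim the decrypted plaintext to the requested bytes.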
# Multi-threaded pipeline
New `src/pipeline.rs` implements:
    reader thread → bounded jobs channel → N AEAD workers
                  → bounded results channel → writer thread
The reader stays serial (it owns the input handle and uses lookahead
to detect the last chunk). Workers parallelize the AEAD step (each
chunk is independent under STREAM). The writer holds a
`BTreeMap<u32, Vec<u8>>` reorder buffer and only flushes in counter
order. Commit is deferred to the main thread, so a failure anywhere —
reader I/O, AEAD auth, writer I/O — drops `OutSink` without renaming
the temp file into place. The
`atomic_output_no_stale_tmp_on_failure` integration test still
passes.
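
The shape of that pipeline can be sketched with the standard library alone. This is not `src/pipeline.rs`: it substitutes `std::sync::mpsc::sync_channel` plus a mutex-shared receiver for the bounded crossbeam channels, runs the writer on the calling thread, leaves out error handling and the deferred commit, and does not bound the reorder buffer. The `work` closure stands in for the per-chunk AEAD step, and `process_in_order` is a hypothetical name.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Runs `work` on each chunk in parallel and returns the outputs
/// concatenated in chunk order.
pub fn process_in_order<F>(chunks: Vec<Vec<u8>>, threads: usize, work: F) -> Vec<u8>
where
    F: Fn(u32, &[u8]) -> Vec<u8> + Send + Sync + 'static,
{
    let threads = threads.max(1);
    let cap = 2 * threads; // channel capacity scales with worker count
    let (job_tx, job_rx) = sync_channel::<(u32, Vec<u8>)>(cap);
    let (res_tx, res_rx) = sync_channel::<(u32, Vec<u8>)>(cap);
    // std's Receiver is single-consumer, so workers share it via a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));
    let work = Arc::new(work);

    // Reader: serial, tags every chunk with its counter.
    let reader = thread::spawn(move || {
        for (i, chunk) in chunks.into_iter().enumerate() {
            if job_tx.send((i as u32, chunk)).is_err() {
                break; // all workers are gone
            }
        }
    });

    // Workers: each chunk is independent, so they need no coordination.
    let workers: Vec<_> = (0..threads)
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            let work = Arc::clone(&work);
            thread::spawn(move || loop {
                // The lock guard is released at the end of this statement.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok((idx, data)) => {
                        if res_tx.send((idx, (*work)(idx, &data))).is_err() {
                            break;
                        }
                    }
                    Err(_) => break, // reader finished and queue is drained
                }
            })
        })
        .collect();
    drop(res_tx); // only the workers hold result senders now

    // Writer: park out-of-order results, flush strictly in counter order.
    let mut pending = BTreeMap::new();
    let mut next = 0u32;
    let mut out = Vec::new();
    for (idx, data) in res_rx {
        pending.insert(idx, data);
        while let Some(ready) = pending.remove(&next) {
            out.extend_from_slice(&ready);
            next += 1;
        }
    }
    reader.join().unwrap();
    for w in workers {
        w.join().unwrap();
    }
    out
}
```

Because the writer only ever emits chunk `next`, the output is the same for any worker count, which is the property the serial/parallel cross-tests rely on.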
Channel and reorder capacities scale with worker count (`2*threads`);
peak memory is roughly `chunk_size * 4 * threads`. With 1 MiB chunks
and 8 cores that's ~32 MiB, which we accept.
Default thread count is `std::thread::available_parallelism()`;
override with `-j/--threads N`. `-j 1` keeps the original serial path.
Stdin/stdout streaming works under the parallel path because `Stdin`
(unlocked) is `Send` — only `StdinLock` isn't, so the boxed reader
wraps `Stdin` directly in a `BufReader`.
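
A small sketch of the boxed-reader idea, assuming a hypothetical `open_input` helper and `Input` alias. The only point it illustrates is that `BufReader<Stdin>` satisfies `Read + Send`, so the box can be moved into the reader thread, whereas a `StdinLock` could not.

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};

/// Boxed input that can be handed to another thread.
pub type Input = Box<dyn Read + Send>;

/// Opens the named file, or falls back to (unlocked) stdin.
pub fn open_input(path: Option<&str>) -> io::Result<Input> {
    let input: Input = match path {
        Some(p) => Box::new(BufReader::new(File::open(p)?)),
        // `Stdin` is `Send`; `io::stdin().lock()` would not compile here.
        None => Box::new(BufReader::new(io::stdin())),
    };
    Ok(input)
}
```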
Adds `crossbeam-channel = "0.5"` for bounded MPMC. The cipher
(`XChaCha20Poly1305`) and the header AAD are shared across workers via
`Arc`; the AEAD's internal key copy is zeroized on drop as before.
# CLI surface
    -j, --threads <N>     worker thread count (default: cores)
        --offset <BYTES>  random-access decrypt: slice start
        --length <BYTES>  random-access decrypt: slice length
`--offset`/`--length` require `--decrypt` and `--input-file` (clap
enforces; we also surface a clean runtime error if only one is
supplied).
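
The runtime pairing check might look like the following sketch; `slice_request` and the error wording are hypothetical.

```rust
/// Returns the requested slice, `None` for a full decrypt, or an error
/// when only one of the two options was supplied.
pub fn slice_request(
    offset: Option<u64>,
    length: Option<u64>,
) -> Result<Option<(u64, u64)>, String> {
    match (offset, length) {
        (Some(o), Some(l)) => Ok(Some((o, l))),
        (None, None) => Ok(None),
        _ => Err("--offset and --length must be given together".to_string()),
    }
}
```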
# Test plan
* `cargo test` — 5 unit + 27 integration, all green.
* New integration coverage:
- parallel roundtrip on multi-chunk inputs (`-j 4`)
- parallel-encrypted ciphertext decrypted serially, and vice-versa
(output bit-identical regardless of worker count)
- parallel pipe stdin↔stdout (asserts flag byte is 0 for stdin
inputs — no length committed without a known size)
- file inputs auto-commit length (asserts version=2 and flags bit 0
set in the raw header bytes)
- random-access slices spanning chunk-aligned, mid-chunk,
last-chunk, and full-file ranges
- random-access rejects out-of-range and stdin-encrypted inputs,
accepts zero-length
- tampering the committed length byte fails AEAD authentication
- hand-crafted v1 header still decodes (no flag bit set)
* `cargo clippy --all-targets -- -D warnings` clean.
* `cargo +nightly fmt` clean.
Removes `TODO.md` since both deferred items are now implemented.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f72f9034f3
|
feat(cli): default to interactive passphrase when no key source given
Previously, invoking fcry without any of --raw-key, --passphrase, or
--passphrase-env produced a hard error ("must provide one of ..."). The
common, secure case (interactive TTY passphrase) thus required an
explicit flag, while the dangerous case (--raw-key on the command line)
was equally accessible.
Make the secure path the default: if no key source is specified, fall
back to PassphraseSource::Tty, which prompts on the terminal and runs
argon2id on encrypt. Explicit --passphrase still works and is now
redundant for the default invocation; --raw-key and --passphrase-env
remain unchanged and still suppress the default.
The previous "must provide one of ..." error path becomes unreachable
and is removed: the only way pw_src is None is when raw_key_str is Some,
which is handled by the existing encrypt/decrypt branches.
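
The resulting decision table is small enough to sketch. `PassphraseSource::Tty` is named above; the `Env` variant, the `KeySource` enum, and `resolve_key_source` are hypothetical names used only for illustration, and conflicting flags are assumed to have been rejected by clap already.

```rust
#[derive(Debug, PartialEq)]
pub enum PassphraseSource {
    Tty,         // interactive prompt on the terminal
    Env(String), // name of the environment variable to read
}

#[derive(Debug, PartialEq)]
pub enum KeySource {
    RawKey(String),
    Passphrase(PassphraseSource),
}

/// Picks the key source; with nothing specified, falls back to the tty.
pub fn resolve_key_source(raw_key: Option<String>, passphrase_env: Option<String>) -> KeySource {
    match (raw_key, passphrase_env) {
        (Some(key), _) => KeySource::RawKey(key),
        (None, Some(var)) => KeySource::Passphrase(PassphraseSource::Env(var)),
        // An explicit --passphrase and no flag at all both end up here.
        (None, None) => KeySource::Passphrase(PassphraseSource::Tty),
    }
}
```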
User-visible change: `fcry -i foo -o foo.enc` now prompts for a
passphrase instead of erroring out. Scripts that relied on the error to
detect missing arguments will instead block on a TTY read; non-TTY
callers should continue to pass --passphrase-env or --raw-key
explicitly.
Test Plan:
- `fcry -i plain -o plain.enc` prompts twice (passphrase + confirm),
then `fcry -d -i plain.enc -o plain.out` prompts once and round-trips.
- `fcry --raw-key $(head -c32 /dev/urandom | base64) ...` still works
and does not prompt.
- `PW=hunter2 fcry --passphrase-env PW ...` still works and does not
prompt.
- `fcry --passphrase --raw-key ...` still rejected by clap
(conflicts_with_all).
|
||
|
|
669f5ed073
|
deps: cargo update | ||
|
|
898697016a
|
refactor(secrets): back SecretBytes32/SecretVec with the secrets crate
Replace the homegrown `region::lock` + `Zeroizing` wrappers with thin
adapters over `secrets::SecretBox` and `secrets::SecretVec`. The
upstream crate already provides everything the local types were doing
by hand (mlock, zero-on-drop) and adds protections we didn't have:
guard pages around the allocation and `mprotect`-based access control
(`PROT_NONE` at rest, `PROT_READ` during a borrow, `PROT_READ|WRITE`
during a mut borrow). Net result is a security upgrade, not just a
dependency swap.
Why a local adapter still exists
--------------------------------
`secrets::SecretVec` is fixed-length and has no `push`. The tty
passphrase reader needs to append bytes one at a time without ever
reallocating (a panicking partial read must not leave stale plaintext
on the heap), so the local `SecretVec` adapter keeps a separate logical
`len` over a fixed protected allocation of `MAX_PASSPHRASE_LEN` bytes.
Bytes past `len` stay zero-padding and are never exposed through
`with_slice`.
API shape: closure-scoped borrows
---------------------------------
The previous `as_slice` / `as_array` returned long-lived `&[u8]`
references, which would have kept the upstream pages in `PROT_READ`
for the full lifetime of the borrow. The new API uses
`with_array(|s| ...)`, `with_mut_array(|s| ...)`, `with_slice(|s| ...)`
so the unprotected window is exactly the closure body. This is uglier
at call sites (notably the nested closures in `derive_key`) but it's
the right tradeoff — minimizing the unprotected window is the whole
point of using the crate.
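
To illustrate the API shape only, here is a sketch over a plain heap buffer. It has none of the protections of the `secrets` crate (no mlock, guard pages, or `mprotect`), and its zero-on-drop is an ordinary write that a compiler may elide. `FixedSecret` is a hypothetical stand-in for the local adapter; the 1024-byte capacity is the figure quoted for the passphrase buffer.

```rust
pub const MAX_PASSPHRASE_LEN: usize = 1024;

/// Fixed allocation with a separate logical length; never reallocates.
pub struct FixedSecret {
    buf: Box<[u8; MAX_PASSPHRASE_LEN]>,
    len: usize,
}

impl FixedSecret {
    pub fn new() -> Self {
        Self { buf: Box::new([0u8; MAX_PASSPHRASE_LEN]), len: 0 }
    }

    /// Appends one byte, refusing to grow past the reserved capacity.
    pub fn push(&mut self, byte: u8) -> Result<(), &'static str> {
        if self.len == MAX_PASSPHRASE_LEN {
            return Err("passphrase exceeds MAX_PASSPHRASE_LEN");
        }
        self.buf[self.len] = byte;
        self.len += 1;
        Ok(())
    }

    /// Closure-scoped borrow: the slice cannot outlive the call, and the
    /// zero padding past `len` is never exposed.
    pub fn with_slice<R>(&self, f: impl FnOnce(&[u8]) -> R) -> R {
        f(&self.buf[..self.len])
    }
}

impl Drop for FixedSecret {
    fn drop(&mut self) {
        // Illustration only; real code relies on the secrets crate here.
        self.buf.fill(0);
    }
}
```

A caller would write something like `secret.with_slice(|pw| derive(pw))`, where `derive` is whatever consumes the bytes, so the borrow ends with the closure body.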
AEAD key copy footnote
----------------------
`XChaCha20Poly1305::new` copies the key into its own (unprotected)
state, which then lives in the `aead` binding for the entire
encrypt/decrypt loop. This is unchanged from before — the cipher
state was never protected — but it's now called out explicitly with
a comment at both call sites noting that `chacha20poly1305` zeroizes
that internal copy on drop. Future readers shouldn't have to
rediscover this by reading upstream source.
`from_vec` zeroing
------------------
`SecretVec::from_vec(v: Vec<u8>)` is used on the env-var path. It
calls `secrets::SecretVec::from(&mut [u8])`, which (verified against
secrets-1.3.0: `Box::from` -> `transfer` -> `memtransfer`) copies the
bytes into protected storage and zeroes the source slice. The
original Vec's allocation is then released through the normal
allocator — the bytes inside it are zero, but the heap block itself
isn't specially handled. The doc comment on `from_vec` reflects this
precisely. As before, the env-var path also leaves a copy in the
process `environ` table, which is a known accepted leak.
Cargo.toml
----------
Use `protected-secrets = { package = "secrets", version = "1.3" }`
with default features. The `secrets` crate has no pure-Rust backend
at v1.3 — disabling default features only switches *how* libsodium
is linked (bundled `libsodium-sys` vs. the crate's own bindings to a
system libsodium), and can break builds where the chosen path isn't
set up. Defaults are correct here. The `region` dependency is
dropped.
Test plan
---------
- `cargo build` clean.
- `cargo test` — 2 unit + 18 integration tests pass, including
`roundtrip_passphrase_argon2id` which exercises the full
passphrase -> argon2id -> AEAD key path through the new wrappers.
- `cargo clippy` (and `--tests`, `--benches`) clean.
- `cargo +nightly fmt` applied.
|
||
|
|
fe65e1f899
|
feat!: argon2id passphrases, secret hardening, atomic output, manual STREAM
This commit lands four follow-up items that were explicitly deferred in
TODO.md after the prior file-format change, plus a CLI/units cleanup
that fell out of reviewing them:
1. Manual STREAM nonce construction (drops `stream` cargo feature).
2. Atomic file output (`.tmp` + rename, with cleanup on failure).
3. Argon2id KDF + passphrase prompt + matching CLI flags.
4. Hardened secret handling: zeroize-on-drop, mlock'd buffers,
custom cross-platform tty reader (replaces `rpassword`).
Why
---
The prior version had three concrete weaknesses that were fine for
"early development" but unacceptable past that point:
* `--raw-key` was the only way to supply a key, exposing it in
`/proc/$pid/cmdline`. There was no passphrase mode at all.
* Crashes/aborts during encrypt could leave a half-written output
file in place of (or replacing) the user's target.
* Key material wasn't zeroed and could end up in swap or coredumps.
rpassword's reallocating String buffers also leaked stale heap
copies of typed passphrases that no `Zeroizing` wrapper could
reach after the fact.
(1) Manual STREAM nonces
------------------------
Replaces `aead::stream::EncryptorBE32` / `DecryptorBE32` with
explicit `make_nonce(prefix, counter, last)` and direct
`XChaCha20Poly1305::{encrypt,decrypt}_in_place` calls. The wire format
is unchanged (XChaCha20Poly1305 STREAM-BE32 = 19-byte prefix || 4-byte
big-endian counter || 1-byte last-block flag), so files written by the
previous version still decrypt. Counter overflow is now an explicit
`Format` error rather than a panic in the upstream stream wrapper.
This removes the `stream` cargo feature from `chacha20poly1305` and
prepares the encrypt path for parallelism: with explicit nonces we can
hand chunks to a worker pool keyed by counter without the stream
wrapper's stateful API getting in the way.
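
The nonce layout quoted above (19-byte prefix, 4-byte big-endian counter, 1-byte last-block flag) can be sketched as below. The argument types and the `next_counter` helper are assumptions; only the `make_nonce(prefix, counter, last)` shape and the layout come from this message.

```rust
pub const PREFIX_LEN: usize = 19;
pub const NONCE_LEN: usize = 24; // XChaCha20Poly1305 nonce size

/// Builds prefix || counter (u32, big-endian) || last-block flag.
pub fn make_nonce(prefix: &[u8; PREFIX_LEN], counter: u32, last: bool) -> [u8; NONCE_LEN] {
    let mut nonce = [0u8; NONCE_LEN];
    nonce[..PREFIX_LEN].copy_from_slice(prefix);
    nonce[PREFIX_LEN..PREFIX_LEN + 4].copy_from_slice(&counter.to_be_bytes());
    nonce[NONCE_LEN - 1] = last as u8;
    nonce
}

/// Advances the chunk counter, reporting overflow as an error rather
/// than panicking (hypothetical helper).
pub fn next_counter(counter: u32) -> Result<u32, String> {
    counter
        .checked_add(1)
        .ok_or_else(|| "chunk counter overflow".to_string())
}
```

Since the nonce depends only on the prefix, the counter, and the last-chunk flag, any worker can compute it for any chunk without shared state.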
(2) Atomic file output
----------------------
New `utils::OutSink` writes to `<path>.tmp`, calls `sync_all()` on
`commit()`, and renames into place. If dropped without commit (panic,
crypto/IO error, ctrl-C), the temp file is unlinked so the existing
target is untouched. Stdout output is unaffected (no temp dance).
A new integration test (`atomic_output_no_stale_tmp_on_failure`)
verifies that a failed decrypt leaves neither the final output nor
the temp file behind.
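
A minimal sketch of the write-to-temp, sync, rename pattern with cleanup on drop. It borrows the `OutSink` name and the `<path>.tmp` convention from the text, but the fields and method signatures are assumptions, and signal handling is not covered.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes to `<dest>.tmp` and only renames into place on `commit()`.
pub struct OutSink {
    file: File,
    tmp: PathBuf,
    dest: PathBuf,
    committed: bool,
}

impl OutSink {
    pub fn create(dest: &Path) -> io::Result<Self> {
        let mut tmp = dest.as_os_str().to_owned();
        tmp.push(".tmp");
        let tmp = PathBuf::from(tmp);
        let file = File::create(&tmp)?;
        Ok(Self { file, tmp, dest: dest.to_path_buf(), committed: false })
    }

    /// Flushes to disk, then renames the temp file over the target.
    /// If either step fails, `Drop` still removes the temp file.
    pub fn commit(mut self) -> io::Result<()> {
        self.file.sync_all()?;
        fs::rename(&self.tmp, &self.dest)?;
        self.committed = true;
        Ok(())
    }
}

impl Write for OutSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.file.write(buf)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.file.flush()
    }
}

impl Drop for OutSink {
    fn drop(&mut self) {
        if !self.committed {
            // Best effort: an existing target is left untouched.
            let _ = fs::remove_file(&self.tmp);
        }
    }
}
```

Any early return with `?` between `create` and `commit` drops the sink, so the error paths need no explicit cleanup code.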
(3) Argon2id + passphrase
-------------------------
New `KdfParams::Argon2id { salt, m_cost, t_cost, p_cost }` variant
encoded into the header (and authenticated as AAD), so tampering with
KDF params fails authentication on every chunk.
CLI surface (BREAKING):
* `--raw-key` is now optional; one of `--raw-key`, `--passphrase`,
`--passphrase-env <VAR>` is required.
* `--passphrase` prompts on the controlling terminal with echo off,
and asks for confirmation when encrypting.
* `--passphrase-env <VAR>` reads from a named env var; intended for
non-interactive use (scripts, tests). The env-table copy is a
known leak for that path.
* `--argon-memory <MiB>` (default 1024 = 1 GiB), `--argon-passes`
(default 2), `--argon-parallelism` (default 4). Names follow
argon2 RFC 9106 terminology; memory is MiB rather than KiB to
match how humans actually think about RAM. Defaults follow the
"Balanced" preset for 2026-era hardware (~1.5–4 s on a laptop).
The argon2 crate wants KiB internally, so the CLI value is
multiplied by 1024 with overflow-check.
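
The MiB-to-KiB conversion with its overflow check could be as small as the sketch below, assuming the memory cost is carried as a `u32`; `argon_memory_kib` is a hypothetical name.

```rust
/// Converts the CLI's MiB value into the KiB figure the KDF expects,
/// failing instead of wrapping on overflow.
pub fn argon_memory_kib(mib: u32) -> Result<u32, String> {
    mib.checked_mul(1024)
        .ok_or_else(|| format!("--argon-memory {mib} MiB does not fit the KiB cost parameter"))
}
```

With the default of 1024 MiB this yields 1,048,576 KiB, i.e. 1 GiB.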
(4) Secret hardening
--------------------
New `secrets` module provides:
* `SecretBytes32`: heap-allocated 32-byte buffer wrapped in
`Zeroizing<[u8; 32]>` and mlock'd via the `region` crate.
Field order ensures the lock guard drops *before* the buffer is
freed (otherwise munlock would target freed memory).
* `SecretVec`: fixed-capacity, mlock'd, zeroize-on-drop byte
buffer. `push()` rejects writes past the reserved capacity so
the underlying allocation never reallocates and moves — which
would invalidate the lock and leave a stale unzeroed copy on
the heap.
* `read_passphrase_tty()`: direct tty reader. On Unix, opens
`/dev/tty`, clears `ECHO` via `tcgetattr`/`tcsetattr` with an
RAII guard that restores termios on drop. On Windows, opens
`CONIN$`/`CONOUT$` and clears `ENABLE_ECHO_INPUT` via
`Get/SetConsoleMode`. Reads byte-by-byte into a pre-reserved
`SecretVec` (1024 bytes), so neither the Rust side nor the libc
side reallocates during read. This replaces `rpassword`, which
returned a `String` that grew by reallocation and left
unzeroed copies of typed passphrases on the heap.
`PartialEq` on `SecretVec` is constant-time-ish (length check +
xor-or accumulate) so the confirmation comparison doesn't early-out
on the first differing byte.
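
The "length check + xor-or accumulate" comparison can be sketched as a free function. `ct_eq` is a hypothetical name, and as the "-ish" implies, nothing here stops an optimizer from reintroducing data-dependent timing.

```rust
/// Compares two byte strings without stopping at the first mismatch.
/// The length check itself is not hidden.
pub fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate every differing bit
    }
    diff == 0
}
```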
`disable_core_dumps()` calls `setrlimit(CORE, 0)` on Unix; on
Windows it's a no-op (WER/minidump suppression is a per-machine
policy and intentionally not done here).
`Cli`'s secret-bearing fields are moved out into local bindings at
the top of `run()` and the `Cli` is explicitly dropped, so they
don't sit in the parsed struct for the rest of the function.
`Cli.raw_key` is `Option<Zeroizing<String>>` so the field we own
zeroes itself on drop. Clap's own intermediate copies during
parsing are an accepted leak.
Threat model — what is and isn't covered
-----------------------------------------
Covered (best-effort):
* Secrets in coredumps → rlimit on Unix.
* Secrets paged to swap or hibernation → mlock on the AEAD key
and passphrase buffer.
* Half-written ciphertext on crash → atomic rename.
* Stale heap copies of typed passphrase → custom tty reader,
pre-reserved buffer.
* Stale stack/heap copies of the AEAD key or passphrase lingering
after process exit → zeroize on drop.
Not covered (and not pretending to be):
* Live-process attackers with ptrace or `/proc/$pid/mem` access.
* The kernel's tty/line buffer.
* Clap's transient String allocations during arg parsing.
* The `environ` table copy of an env-var passphrase.
* Swap on systems without functioning mlock or with
`RLIMIT_MEMLOCK = 0`.
The mlock'd footprint is small (32 bytes + 1024 bytes, two pages at
most on any of the three target OSes), so it fits well under the
typical unprivileged `RLIMIT_MEMLOCK` of 64 KiB.
Portability
-----------
The whole binary targets Linux, macOS, and Windows 11 with the
same security properties where the OS supports them:
* `region` crate provides cross-platform mlock/munlock.
* `libc::tcgetattr`/`tcsetattr` covers Linux + macOS.
* `windows-sys` covers Console API.
* `rlimit` is gated to `cfg(unix)`.
The Windows tty path compiles in my head but is unverified on this
machine — there is no `x86_64-pc-windows-*` target installed and
no Windows runner. Treat that path as "best-effort, needs CI on
Windows" until exercised.
Files written by the previous v0.10 (Raw KDF, BE32 STREAM) are
still readable: the wire format is unchanged for that path.
Test plan
---------
Existing 17 integration tests pass unchanged. Two new tests:
* `roundtrip_passphrase_argon2id` — encrypts and decrypts via
`--passphrase-env` with cheap argon2 params (8 MiB / 1 pass) so
the test stays fast; also verifies that a wrong passphrase
fails.
* `atomic_output_no_stale_tmp_on_failure` — wrong-key decrypt
leaves neither the final file nor the `.tmp` in place.
Manual sanity (not automated): run with `--passphrase` on a
terminal and confirm echo is off and confirmation works.
Follow-ups (still in TODO.md)
-----------------------------
* Multi-threaded encrypt pipeline (now feasible — manual nonces).
* Length-committed mode + random-access decrypt fast path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
4eee8e7a95
|
feat!: add file-format header, configurable chunks, integration tests
Introduce a self-describing on-disk format and use it to address several
shortcomings of the 0.9 file layout, where the file simply began with a
raw 19-byte STREAM nonce prefix and used a hardcoded 64 KiB chunk size.
What changed for users
----------------------
* fcry files now start with a 16-byte header: magic ("fcry"), version,
algorithm id, flags, reserved byte, plaintext chunk_size (u32 LE),
KDF id + params, then the 19-byte nonce prefix. The full encoded
header is bound as AAD to every chunk, so tampering with chunk_size,
algorithm id, nonce prefix, or any future KDF parameter causes
authentication failure on every chunk -- not just the first.
* New `--chunk-size` CLI flag (encryption only). The decryptor reads
the chunk size from the header, so files encrypted with a non-default
size decrypt without the user having to remember it.
* Default plaintext chunk size raised from 64 KiB to 1 MiB.
* Bad input is now reported as an error instead of panicking: empty
ciphertext, truncated final chunk, wrong magic, bad version, zero
chunk_size, unknown algorithm id, and short --raw-key all return a
non-zero exit status with a diagnostic on stderr.
* Empty plaintext now produces a valid (authenticated) empty
ciphertext instead of panicking; the decryptor verifies it.
* `main` exits with status 1 on error (previously it printed and
returned 0).
This is a breaking change to the file format: 0.9.x files have no magic
or header and cannot be read by 0.10.x. Version bumped to 0.10.0.
Why this approach
-----------------
The header-as-AAD pattern is the standard way to make file-format
metadata tamper-evident without a separate signature: any bit-flip in
the header propagates into every chunk's authentication tag check, so
an attacker cannot, for example, change chunk_size to mis-frame the
stream or downgrade the algorithm id.
Storing chunk_size in the header (rather than fixing it at compile
time) lets us experiment with chunk sizes without breaking decrypt
compatibility, and is preparation for the parallel-pipeline work in
Roadmap 1.0 where worker count and chunk size interact.
The KDF section is a tagged variant (currently only `Raw`) so that
adding Argon2id later only adds a new variant + its salt/cost fields;
existing files keep decrypting because they carry `kdf_id = 0`.
Other changes bundled in
------------------------
* Switch RNG from `rand` (0.10) to `getrandom` (0.3). We only need
OS-provided random bytes for the nonce prefix; pulling in the full
`rand` crate for one `OsRng.fill_bytes` call was overkill, and
`rand` 0.10's `OsRng` API churn makes `getrandom` the cleaner fit.
* `FcryError` gains a `Format(String)` variant for header / framing
errors and a `From<getrandom::Error>` impl (replacing the
`rand::Error` impl).
* Drop the noisy `[reader]` / `[encrypt]` / `[decrypt]` stderr
tracing prints and the `dbg!(&cli.raw_key)` (which leaked the key
to stderr).
* Replace `unwrap()` on file open / create with `?` so I/O errors
surface as structured `FcryError::Io` instead of aborting.
* Remove the unused `AheadReader::read_exact` wrapper -- the
decryptor now reads the header through the underlying `BufRead`
directly before wrapping it in `AheadReader`.
Tests
-----
Add `tests/roundtrip.rs` (assert_cmd + tempfile) covering: empty
input, single byte, sub-chunk, exact chunk, chunk+1, multi-chunk,
custom small chunk size (4096), pathological 1-byte chunk size,
stdin/stdout pipe mode, wrong key rejection, tampered header,
tampered ciphertext, truncated ciphertext, bad magic, short raw key,
and the header-is-authoritative property (encrypt with a weird chunk
size, decrypt without specifying one). Also adds a unit test in
`header.rs` for header encode/decode roundtrip and bad-magic rejection.
TODO.md trimmed to the concrete follow-up sequence (manual STREAM
nonces, secrets/rlimit, atomic output, argon2id KDF + prompt,
multi-threaded pipeline, length-committed mode).
Test plan
---------
* `cargo clippy && cargo clippy --tests` -- clean.
* `cargo +nightly fmt` -- no diff.
* `cargo test` -- 16 integration + 2 header unit tests pass.
* Manual: `echo hi | fcry --raw-key 0123456789abcdef0123456789abcdef
| fcry -d --raw-key 0123456789abcdef0123456789abcdef` prints `hi`.
Trailers
--------
Refs: TODO.md (Roadmap 1.0 follow-up sequence)
Breaking-Change: file format; 0.9.x files cannot be decrypted by 0.10.x
|
||
|
|
5e51b4bfe1
|
whatever | ||
|
|
1ae56389fc
|
[fix] debug prints have to go to stderr | ||
|
|
7ffd6a4a11
|
[deps] cargo update
Updating anstream v0.6.11 -> v0.6.12
Updating clap v4.5.0 -> v4.5.1
Updating clap_builder v4.5.0 -> v4.5.1
Updating syn v2.0.48 -> v2.0.49 |
||
|
|
392a752976
|
remove .cargo, disable lto for faster compilation during testing | ||
|
|
ad03e176c3
|
on the way to a usable version | ||
|
|
668e726c21
|
(docs) updated TODO and README to clarify project status | ||
|
|
d7e86d8f88
|
fcry - [f]ile[cry]pt - initial commit (alpha 0.9.0)
A file en-/decryption tool for easy use. Currently `fcry` uses `ChaCha20Poly1305` ([RFC 8439](https://datatracker.ietf.org/doc/html/rfc8439)) as its [AEAD](https://en.wikipedia.org/wiki/Authenticated_encryption) cipher, provided by the [chacha20poly1305](https://docs.rs/chacha20poly1305/latest/chacha20poly1305/) crate. |