ddidderr/fcry

Author	SHA1	Message	Date
ddidderr	75afadb1ec	feat!: multi-threaded pipeline + length-committed/random-access decrypt Completes the two follow-ups deferred from the v0.10 format/secrets work: multi-threaded AEAD encrypt/decrypt and a length-committed file format that enables random-access decryption. # Format change (file format v2) Bumps the on-disk header version to 2 and introduces a flag bit (`FLAG_LENGTH_COMMITTED`, bit 0). When set, an authenticated `u64 LE` plaintext length is appended to the header after the nonce prefix. v1 files still decrypt unchanged. v2 readers reject unknown flag bits. The flag is set automatically when the input is a regular file (we stat the open FD to avoid TOCTOU). Stdin/pipes/FIFOs encrypt as before with the flag clear. Sequential decrypt cross-checks the produced byte count against the committed length as defense in depth (the AEAD already authenticates the value via header AAD, but failing before we rename the temp file into place is preferable to failing after). # Random-access decrypt `fcry -d -i FILE --offset N --length L` seeks directly to the chunk(s) covering `[N, N+L)` and decrypts only those, without scanning the predecessors. Requires a seekable file whose header has the length-committed flag — stdin/pipe-encrypted files cannot use this path and the CLI rejects it with a clear error. The chunk layout is fully determined by `chunk_size` and the committed total length (last chunk's plaintext is `total - (n_chunks-1)*chunk_size`; its ciphertext length is `last_pt + 16`). Each chunk's nonce is `make_nonce(prefix, chunk_index, is_last_chunk)` which matches what sequential encrypt produced, so plaintext slices come out bit-identical to a full sequential decrypt. # Multi-threaded pipeline New `src/pipeline.rs` implements: reader thread → bounded jobs channel → N AEAD workers → bounded results channel → writer thread The reader stays serial (it owns the input handle and uses lookahead to detect the last chunk). 
Workers parallelize the AEAD step (each chunk is independent under STREAM). The writer holds a `BTreeMap<u32, Vec<u8>>` reorder buffer and only flushes in counter order. Commit is deferred to the main thread, so a failure anywhere — reader I/O, AEAD auth, writer I/O — drops `OutSink` without renaming the temp file into place. The `atomic_output_no_stale_tmp_on_failure` integration test still passes. Channel and reorder capacities scale with worker count (`2*threads`); peak memory is roughly `chunk_size * 4 * threads`. With 1 MiB chunks and 8 cores that's ~32 MiB, which we accept. Default thread count is `std::thread::available_parallelism()`; override with `-j/--threads N`. `-j 1` keeps the original serial path. Stdin/stdout streaming works under the parallel path because `Stdin` (unlocked) is `Send` — only `StdinLock` isn't, so the boxed reader wraps `Stdin` directly in a `BufReader`. Adds `crossbeam-channel = "0.5"` for bounded MPMC. The cipher (`XChaCha20Poly1305`) and the header AAD are shared across workers via `Arc`; the AEAD's internal key copy is zeroized on drop as before. # CLI surface -j, --threads <N> worker thread count (default: cores) --offset <BYTES> random-access decrypt: slice start --length <BYTES> random-access decrypt: slice length `--offset`/`--length` require `--decrypt` and `--input-file` (clap enforces; we also surface a clean runtime error if only one is supplied). # Test plan * `cargo test` — 5 unit + 27 integration, all green. 
* New integration coverage: - parallel roundtrip on multi-chunk inputs (`-j 4`) - parallel-encrypted ciphertext decrypted serially, and vice-versa (output bit-identical regardless of worker count) - parallel pipe stdin↔stdout (asserts flag byte is 0 for stdin inputs — no length committed without a known size) - file inputs auto-commit length (asserts version=2 and flags bit 0 set in the raw header bytes) - random-access slices spanning chunk-aligned, mid-chunk, last-chunk, and full-file ranges - random-access rejects out-of-range and stdin-encrypted inputs, accepts zero-length - tampering the committed length byte fails AEAD authentication - hand-crafted v1 header still decodes (no flag bit set) * `cargo clippy --all-targets -- -D warnings` clean. * `cargo +nightly fmt` clean. Removes `TODO.md` since both deferred items are now implemented. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 20:33:00 +02:00
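The random-access chunk arithmetic described in the commit above can be sketched as follows. This is only an illustration under stated assumptions, not the code in the repository: `ChunkSpan`, `chunks_for_range`, and the `header_len` parameter are hypothetical names, and the sketch assumes chunks are stored back-to-back directly after the header, each full chunk occupying `chunk_size + 16` bytes.

```rust
/// Poly1305 tag length appended to every chunk (the "+ 16" in the commit message).
const TAG_LEN: u64 = 16;

/// Location of one encrypted chunk inside the ciphertext file (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct ChunkSpan {
    pub index: u64,     // chunk counter, starting at 0
    pub ct_offset: u64, // byte offset of this chunk's ciphertext in the file
    pub ct_len: u64,    // plaintext length of the chunk + TAG_LEN
    pub is_last: bool,  // selects the "last chunk" nonce flag
}

/// Returns the chunks covering the plaintext range [offset, offset + length).
/// `header_len` is an assumed parameter: the size of the encoded header in bytes.
pub fn chunks_for_range(
    header_len: u64,
    chunk_size: u64,
    total: u64,
    offset: u64,
    length: u64,
) -> Result<Vec<ChunkSpan>, String> {
    if chunk_size == 0 {
        return Err("chunk_size must be non-zero".to_string());
    }
    let end = offset
        .checked_add(length)
        .ok_or_else(|| "range overflows u64".to_string())?;
    if end > total {
        return Err("range extends past the committed length".to_string());
    }
    if length == 0 {
        // Zero-length slices are accepted and touch no chunk.
        return Ok(Vec::new());
    }
    // total > 0 here, because length > 0 and end <= total.
    let n_chunks = total.div_ceil(chunk_size);
    let last_pt = total - (n_chunks - 1) * chunk_size;
    let first = offset / chunk_size;
    let last = (end - 1) / chunk_size;
    Ok((first..=last)
        .map(|i| {
            let is_last = i == n_chunks - 1;
            let pt_len = if is_last { last_pt } else { chunk_size };
            ChunkSpan {
                index: i,
                ct_offset: header_len + i * (chunk_size + TAG_LEN),
                ct_len: pt_len + TAG_LEN,
                is_last,
            }
        })
        .collect())
}
```

With a 4-byte chunk size, a 10-byte plaintext, and an assumed 32-byte header, the slice `[8, 10)` maps to the single final chunk with a 2-byte plaintext and an 18-byte ciphertext.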
ddidderr	f72f9034f3	feat(cli): default to interactive passphrase when no key source given Previously, invoking fcry without any of --raw-key, --passphrase, or --passphrase-env produced a hard error ("must provide one of ..."). The common, secure case (interactive TTY passphrase) thus required an explicit flag, while the dangerous case (--raw-key on the command line) was equally accessible. Make the secure path the default: if no key source is specified, fall back to PassphraseSource::Tty, which prompts on the terminal and runs argon2id on encrypt. Explicit --passphrase still works and is now redundant for the default invocation; --raw-key and --passphrase-env remain unchanged and still suppress the default. The previous "must provide one of ..." error path becomes unreachable and is removed: the only way pw_src is None is when raw_key_str is Some, which is handled by the existing encrypt/decrypt branches. User-visible change: `fcry -i foo -o foo.enc` now prompts for a passphrase instead of erroring out. Scripts that relied on the error to detect missing arguments will instead block on a TTY read; non-TTY callers should continue to pass --passphrase-env or --raw-key explicitly. Test Plan: - `fcry -i plain -o plain.enc` prompts twice (passphrase + confirm), then `fcry -d -i plain.enc -o plain.out` prompts once and round-trips. - `fcry --raw-key $(head -c32 /dev/urandom | base64) ...` still works and does not prompt. - `PW=hunter2 fcry --passphrase-env PW ...` still works and does not prompt. - `fcry --passphrase --raw-key ...` still rejected by clap (conflicts_with_all).	2026-05-02 19:52:19 +02:00
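A minimal sketch of the key-source defaulting rule from the commit above. Only `PassphraseSource::Tty` is named in the commit message; the `Env` variant, the `KeySource` enum, and `resolve_key_source` are hypothetical, and the raw key is held in a plain `String` here purely for brevity.

```rust
/// Where a passphrase comes from. `Tty` is named in the commit; `Env` is assumed.
#[derive(Debug, PartialEq, Eq)]
pub enum PassphraseSource {
    Tty,
    Env(String),
}

/// Hypothetical summary of the three CLI key options.
#[derive(Debug, PartialEq, Eq)]
pub enum KeySource {
    RawKey(String),
    Passphrase(PassphraseSource),
}

/// Resolves the key source. Assumes the argument parser has already rejected
/// conflicting flags, so at most one of the three inputs is set.
pub fn resolve_key_source(
    raw_key: Option<String>,
    passphrase_flag: bool,
    passphrase_env: Option<String>,
) -> KeySource {
    if let Some(key) = raw_key {
        return KeySource::RawKey(key);
    }
    if let Some(var) = passphrase_env {
        return KeySource::Passphrase(PassphraseSource::Env(var));
    }
    // An explicit --passphrase and "no flag at all" resolve the same way,
    // which is why the explicit flag is described as redundant.
    let _ = passphrase_flag;
    KeySource::Passphrase(PassphraseSource::Tty)
}
```

Because every combination now resolves to some source, no "must provide one of ..." branch remains.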
ddidderr	669f5ed073	deps: cargo update	2026-05-02 19:16:50 +02:00
ddidderr	898697016a	refactor(secrets): back SecretBytes32/SecretVec with the `secrets` crate Replace the homegrown `region::lock` + `Zeroizing` wrappers with thin adapters over `secrets::SecretBox` and `secrets::SecretVec`. The upstream crate already provides everything the local types were doing by hand (mlock, zero-on-drop) and adds protections we didn't have: guard pages around the allocation and `mprotect`-based access control (`PROT_NONE` at rest, `PROT_READ` during a borrow, `PROT_READ|WRITE` during a mut borrow). Net result is a security upgrade, not just a dependency swap. Why a local adapter still exists -------------------------------- `secrets::SecretVec` is fixed-length — no `push`. The tty passphrase reader needs to append bytes one at a time without ever reallocating (a panicking partial read must not leave stale plaintext on the heap), so `SecretVec` keeps a separate logical `len` over a fixed protected allocation of `MAX_PASSPHRASE_LEN` bytes. Bytes past `len` stay zero-padding and are never exposed through `with_slice`. API shape: closure-scoped borrows --------------------------------- The previous `as_slice` / `as_array` returned long-lived `&[u8]` references, which would have kept the upstream pages in `PROT_READ` for the full lifetime of the borrow. The new API uses `with_array(|s| ...)`, `with_mut_array(|s| ...)`, `with_slice(|s| ...)` so the unprotected window is exactly the closure body. This is uglier at call sites (notably the nested closures in `derive_key`) but it's the right tradeoff — minimizing the unprotected window is the whole point of using the crate. AEAD key copy footnote ---------------------- `XChaCha20Poly1305::new` copies the key into its own (unprotected) state, which then lives in the `aead` binding for the entire encrypt/decrypt loop. 
This is unchanged from before — the cipher state was never protected — but it's now called out explicitly with a comment at both call sites noting that `chacha20poly1305` zeroizes that internal copy on drop. Future readers shouldn't have to rediscover this by reading upstream source. `from_vec` zeroing ------------------ `SecretVec::from_vec(v: Vec<u8>)` is used on the env-var path. It calls `secrets::SecretVec::from(&mut [u8])`, which (verified against secrets-1.3.0: `Box::from` -> `transfer` -> `memtransfer`) copies the bytes into protected storage and zeroes the source slice. The original Vec's allocation is then released through the normal allocator — the bytes inside it are zero, but the heap block itself isn't specially handled. The doc comment on `from_vec` reflects this precisely. As before, the env-var path also leaves a copy in the process `environ` table, which is a known accepted leak. Cargo.toml ---------- Use `protected-secrets = { package = "secrets", version = "1.3" }` with default features. The `secrets` crate has no pure-Rust backend at v1.3 — disabling default features only switches how libsodium is linked (bundled `libsodium-sys` vs. the crate's own bindings to a system libsodium), and can break builds where the chosen path isn't set up. Defaults are correct here. The `region` dependency is dropped. Test plan --------- - `cargo build` clean. - `cargo test` — 2 unit + 18 integration tests pass, including `roundtrip_passphrase_argon2id` which exercises the full passphrase -> argon2id -> AEAD key path through the new wrappers. - `cargo clippy` (and `--tests`, `--benches`) clean. - `cargo +nightly fmt` applied.	2026-05-02 19:14:42 +02:00
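The adapter shape described in the commit above (a logical `len` over a fixed allocation, exposed only through closure-scoped borrows) can be sketched with plain standard-library types. This is an illustration only: `FixedSecretBuf` is a hypothetical name, an ordinary `Box<[u8]>` stands in for the protected allocation, and the mlock, `mprotect`, and zero-on-drop behaviour provided by the `secrets` crate is deliberately omitted.

```rust
/// Capacity of the passphrase buffer (1024 bytes, per the earlier commit).
pub const MAX_PASSPHRASE_LEN: usize = 1024;

/// Fixed-capacity byte buffer with a logical length (hypothetical stand-in).
pub struct FixedSecretBuf {
    storage: Box<[u8]>, // allocated once, never resized
    len: usize,         // number of meaningful bytes; the rest stays zero
}

impl FixedSecretBuf {
    pub fn new() -> Self {
        Self {
            storage: vec![0u8; MAX_PASSPHRASE_LEN].into_boxed_slice(),
            len: 0,
        }
    }

    /// Appends one byte, refusing to grow past the fixed capacity so the
    /// allocation never moves and no stale copy is left behind.
    pub fn push(&mut self, byte: u8) -> Result<(), &'static str> {
        if self.len == self.storage.len() {
            return Err("passphrase exceeds MAX_PASSPHRASE_LEN");
        }
        self.storage[self.len] = byte;
        self.len += 1;
        Ok(())
    }

    /// Closure-scoped borrow: the caller sees only the first `len` bytes,
    /// and only for the duration of the closure body.
    pub fn with_slice<R>(&self, f: impl FnOnce(&[u8]) -> R) -> R {
        f(&self.storage[..self.len])
    }
}
```

In the real adapter, the closure boundary is where the protected pages would switch between `PROT_NONE` and `PROT_READ`; here it only limits how long the reference lives.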
ddidderr	fe65e1f899	feat!: argon2id passphrases, secret hardening, atomic output, manual STREAM This commit lands four follow-up items that were explicitly deferred in TODO.md after the prior file-format change, plus a CLI/units cleanup that fell out of reviewing them: 1. Manual STREAM nonce construction (drops `stream` cargo feature). 2. Atomic file output (`.tmp` + rename, with cleanup on failure). 3. Argon2id KDF + passphrase prompt + matching CLI flags. 4. Hardened secret handling: zeroize-on-drop, mlock'd buffers, custom cross-platform tty reader (replaces `rpassword`). Why --- The prior version had three concrete weaknesses that were fine for "early development" but unacceptable past that point: * `--raw-key` was the only way to supply a key, exposing it in `/proc/$pid/cmdline`. There was no passphrase mode at all. * Crashes/aborts during encrypt could leave a half-written output file in place of (or replacing) the user's target. * Key material wasn't zeroed and could end up in swap or coredumps. rpassword's reallocating String buffers also leaked stale heap copies of typed passphrases that no `Zeroizing` wrapper could reach after the fact. (1) Manual STREAM nonces ------------------------ Replaces `aead::stream::EncryptorBE32` / `DecryptorBE32` with explicit `make_nonce(prefix, counter, last)` and direct `XChaCha20Poly1305::{encrypt,decrypt}_in_place` calls. The wire format is unchanged (XChaCha20Poly1305 STREAM-BE32 = 19-byte prefix || 4-byte big-endian counter || 1-byte last-block flag), so files written by the previous version still decrypt. Counter overflow is now an explicit `Format` error rather than a panic in the upstream stream wrapper. This removes the `stream` cargo feature from `chacha20poly1305` and prepares the encrypt path for parallelism: with explicit nonces we can hand chunks to a worker pool keyed by counter without the stream wrapper's stateful API getting in the way. 
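The nonce layout in section (1) can be sketched as below. The function name and argument order follow the commit message; the concrete types, the `next_counter` helper, and the choice of `1` as the last-block flag value are assumptions for illustration.

```rust
pub const PREFIX_LEN: usize = 19;
pub const NONCE_LEN: usize = 24; // XChaCha20Poly1305 nonce size

/// Builds a STREAM-BE32 style nonce:
/// 19-byte prefix || 4-byte big-endian counter || 1-byte last-block flag.
pub fn make_nonce(prefix: &[u8; PREFIX_LEN], counter: u32, last: bool) -> [u8; NONCE_LEN] {
    let mut nonce = [0u8; NONCE_LEN];
    nonce[..PREFIX_LEN].copy_from_slice(prefix);
    nonce[PREFIX_LEN..PREFIX_LEN + 4].copy_from_slice(&counter.to_be_bytes());
    // Assumption: the flag byte is 1 for the final chunk and 0 otherwise.
    nonce[NONCE_LEN - 1] = last as u8;
    nonce
}

/// Hypothetical helper: advances the chunk counter and reports overflow as
/// an error instead of panicking.
pub fn next_counter(counter: u32) -> Result<u32, String> {
    counter
        .checked_add(1)
        .ok_or_else(|| "chunk counter overflow".to_string())
}
```

Because the nonce depends only on the prefix, the counter, and the last-chunk flag, any worker can compute it for any chunk without shared state.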
(2) Atomic file output ---------------------- New `utils::OutSink` writes to `<path>.tmp`, calls `sync_all()` on `commit()`, and renames into place. If dropped without commit (panic, crypto/IO error, ctrl-C), the temp file is unlinked so the existing target is untouched. Stdout output is unaffected (no temp dance). A new integration test (`atomic_output_no_stale_tmp_on_failure`) verifies that a failed decrypt leaves neither the final output nor the temp file behind. (3) Argon2id + passphrase ------------------------- New `KdfParams::Argon2id { salt, m_cost, t_cost, p_cost }` variant encoded into the header (and authenticated as AAD), so tampering with KDF params fails authentication on every chunk. CLI surface (BREAKING): * `--raw-key` is now optional; one of `--raw-key`, `--passphrase`, `--passphrase-env <VAR>` is required. * `--passphrase` prompts on the controlling terminal with echo off, and asks for confirmation when encrypting. * `--passphrase-env <VAR>` reads from a named env var; intended for non-interactive use (scripts, tests). The env-table copy is a known leak for that path. * `--argon-memory <MiB>` (default 1024 = 1 GiB), `--argon-passes` (default 2), `--argon-parallelism` (default 4). Names follow argon2 RFC 9106 terminology; memory is MiB rather than KiB to match how humans actually think about RAM. Defaults follow the "Balanced" preset for 2026-era hardware (~1.5–4 s on a laptop). The argon2 crate wants KiB internally, so the CLI value is multiplied by 1024 with overflow-check. (4) Secret hardening -------------------- New `secrets` module provides: * `SecretBytes32`: heap-allocated 32-byte buffer wrapped in `Zeroizing<[u8; 32]>` and mlock'd via the `region` crate. Field order ensures the lock guard drops before the buffer is freed (otherwise munlock would target freed memory). * `SecretVec`: fixed-capacity, mlock'd, zeroize-on-drop byte buffer. 
`push()` rejects writes past the reserved capacity so the underlying allocation never reallocates and moves — which would invalidate the lock and leave a stale unzeroed copy on the heap. * `read_passphrase_tty()`: direct tty reader. On Unix, opens `/dev/tty`, clears `ECHO` via `tcgetattr`/`tcsetattr` with an RAII guard that restores termios on drop. On Windows, opens `CONIN$`/`CONOUT$` and clears `ENABLE_ECHO_INPUT` via `Get/SetConsoleMode`. Reads byte-by-byte into a pre-reserved `SecretVec` (1024 bytes), so neither the Rust side nor the libc side reallocates during read. This replaces `rpassword`, which returned a `String` that grew by reallocation and left unzeroed copies of typed passphrases on the heap. `PartialEq` on `SecretVec` is constant-time-ish (length check + xor-or accumulate) so the confirmation comparison doesn't early-out on the first differing byte. `disable_core_dumps()` calls `setrlimit(CORE, 0)` on Unix; on Windows it's a no-op (WER/minidump suppression is a per-machine policy and intentionally not done here). `Cli`'s secret-bearing fields are moved out into local bindings at the top of `run()` and the `Cli` is explicitly dropped, so they don't sit in the parsed struct for the rest of the function. `Cli.raw_key` is `Option<Zeroizing<String>>` so the field we own zeroes itself on drop. Clap's own intermediate copies during parsing are an accepted leak. Threat model — what is and isn't covered ----------------------------------------- Covered (best-effort): * Secrets in coredumps → rlimit on Unix. * Secrets paged to swap or hibernation → mlock on the AEAD key and passphrase buffer. * Half-written ciphertext on crash → atomic rename. * Stale heap copies of typed passphrase → custom tty reader, pre-reserved buffer. * Stale stack/heap copies of the AEAD key or passphrase post-process-exit → zeroize on drop. Not covered (and not pretending to be): * Live-process attackers with ptrace or `/proc/$pid/mem` access. * The kernel's tty/line buffer. 
* Clap's transient String allocations during arg parsing. * The `environ` table copy of an env-var passphrase. * Swap on systems without functioning mlock or with `RLIMIT_MEMLOCK = 0`. mlock is small (32 bytes + 1024 bytes — two pages at most on any of the three target OSes), so it fits well under the typical unprivileged `RLIMIT_MEMLOCK` of 64 KiB. Portability ----------- The whole binary targets Linux, macOS, and Windows 11 with the same security properties where the OS supports them: * `region` crate provides cross-platform mlock/munlock. * `libc::tcgetattr`/`tcsetattr` covers Linux + macOS. * `windows-sys` covers Console API. * `rlimit` is gated to `cfg(unix)`. The Windows tty path compiles in my head but is unverified on this machine — there is no `x86_64-pc-windows-` target installed and no Windows runner. Treat that path as "best-effort, needs CI on Windows" until exercised. Files written by the previous v0.10 (Raw KDF, BE32 STREAM) are still readable: the wire format is unchanged for that path. Test plan --------- Existing 17 integration tests pass unchanged. Two new tests: `roundtrip_passphrase_argon2id` — encrypts and decrypts via `--passphrase-env` with cheap argon2 params (8 MiB / 1 pass) so the test stays fast; also verifies that a wrong passphrase fails. * `atomic_output_no_stale_tmp_on_failure` — wrong-key decrypt leaves neither the final file nor the `.tmp` in place. Manual sanity (not automated): run with `--passphrase` on a terminal and confirm echo is off and confirmation works. Follow-ups (still in TODO.md) ----------------------------- * Multi-threaded encrypt pipeline (now feasible — manual nonces). * Length-committed mode + random-access decrypt fast path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 18:26:44 +02:00
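The atomic-output behaviour from section (2) of the commit above can be sketched with the standard library alone. `OutSink` and `commit()` are named in the commit message; the fields, the `create` and `write_all` methods, and the `committed` flag are assumptions, and this sketch ignores platform differences such as renaming or unlinking a file that is still open on Windows.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes to `<path>.tmp` and renames into place only on commit.
pub struct OutSink {
    file: File,
    tmp_path: PathBuf,
    final_path: PathBuf,
    committed: bool,
}

impl OutSink {
    pub fn create(final_path: &Path) -> io::Result<Self> {
        // Append ".tmp" to the full file name rather than replacing the extension.
        let mut tmp = final_path.as_os_str().to_owned();
        tmp.push(".tmp");
        let tmp_path = PathBuf::from(tmp);
        let file = File::create(&tmp_path)?;
        Ok(Self {
            file,
            tmp_path,
            final_path: final_path.to_path_buf(),
            committed: false,
        })
    }

    pub fn write_all(&mut self, buf: &[u8]) -> io::Result<()> {
        self.file.write_all(buf)
    }

    /// Flushes data to disk, then renames the temp file over the target.
    /// If either step fails, `Drop` still removes the temp file.
    pub fn commit(mut self) -> io::Result<()> {
        self.file.sync_all()?;
        fs::rename(&self.tmp_path, &self.final_path)?;
        self.committed = true;
        Ok(())
    }
}

impl Drop for OutSink {
    fn drop(&mut self) {
        if !self.committed {
            // Best effort: an error here cannot be reported from drop.
            let _ = fs::remove_file(&self.tmp_path);
        }
    }
}
```

Dropping the sink on any error path therefore leaves an existing target untouched, which is the property the `atomic_output_no_stale_tmp_on_failure` test is described as checking.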
ddidderr	4eee8e7a95	feat!: add file-format header, configurable chunks, integration tests Introduce a self-describing on-disk format and use it to address several shortcomings of the 0.9 file layout, where the file simply began with a raw 19-byte STREAM nonce prefix and used a hardcoded 64 KiB chunk size. What changed for users ---------------------- * fcry files now start with a 16-byte header: magic ("fcry"), version, algorithm id, flags, reserved byte, plaintext chunk_size (u32 LE), KDF id + params, then the 19-byte nonce prefix. The full encoded header is bound as AAD to every chunk, so tampering with chunk_size, algorithm id, nonce prefix, or any future KDF parameter causes authentication failure on every chunk -- not just the first. * New `--chunk-size` CLI flag (encryption only). The decryptor reads the chunk size from the header, so files encrypted with a non-default size decrypt without the user having to remember it. * Default plaintext chunk size raised from 64 KiB to 1 MiB. * Bad input is now reported as an error instead of panicking: empty ciphertext, truncated final chunk, wrong magic, bad version, zero chunk_size, unknown algorithm id, and short --raw-key all return a non-zero exit status with a diagnostic on stderr. * Empty plaintext now produces a valid (authenticated) empty ciphertext instead of panicking; the decryptor verifies it. * `main` exits with status 1 on error (previously it printed and returned 0). This is a breaking change to the file format: 0.9.x files have no magic or header and cannot be read by 0.10.x. Version bumped to 0.10.0. Why this approach ----------------- The header-as-AAD pattern is the standard way to make file-format metadata tamper-evident without a separate signature: any bit-flip in the header propagates into every chunk's authentication tag check, so an attacker cannot, for example, change chunk_size to mis-frame the stream or downgrade the algorithm id. 
Storing chunk_size in the header (rather than fixing it at compile time) lets us experiment with chunk sizes without breaking decrypt compatibility, and is preparation for the parallel-pipeline work in Roadmap 1.0 where worker count and chunk size interact. The KDF section is a tagged variant (currently only `Raw`) so that adding Argon2id later only adds a new variant + its salt/cost fields; existing files keep decrypting because they carry `kdf_id = 0`. Other changes bundled in ------------------------ * Switch RNG from `rand` (0.10) to `getrandom` (0.3). We only need OS-provided random bytes for the nonce prefix; pulling in the full `rand` crate for one `OsRng.fill_bytes` call was overkill, and `rand` 0.10's `OsRng` API churn makes `getrandom` the cleaner fit. * `FcryError` gains a `Format(String)` variant for header / framing errors and a `From<getrandom::Error>` impl (replacing the `rand::Error` impl). * Drop the noisy `[reader]` / `[encrypt]` / `[decrypt]` stderr tracing prints and the `dbg!(&cli.raw_key)` (which leaked the key to stderr). * Replace `unwrap()` on file open / create with `?` so I/O errors surface as structured `FcryError::Io` instead of aborting. * Remove the unused `AheadReader::read_exact` wrapper -- the decryptor now reads the header through the underlying `BufRead` directly before wrapping it in `AheadReader`. Tests ----- Add `tests/roundtrip.rs` (assert_cmd + tempfile) covering: empty input, single byte, sub-chunk, exact chunk, chunk+1, multi-chunk, custom small chunk size (4096), pathological 1-byte chunk size, stdin/stdout pipe mode, wrong key rejection, tampered header, tampered ciphertext, truncated ciphertext, bad magic, short raw key, and the header-is-authoritative property (encrypt with a weird chunk size, decrypt without specifying one). Also adds a unit test in `header.rs` for header encode/decode roundtrip and bad-magic rejection. 
TODO.md trimmed to the concrete follow-up sequence (manual STREAM nonces, secrets/rlimit, atomic output, argon2id KDF + prompt, multi-threaded pipeline, length-committed mode). Test plan --------- * `cargo clippy && cargo clippy --tests` -- clean. * `cargo +nightly fmt` -- no diff. * `cargo test` -- 16 integration + 2 header unit tests pass. * Manual: `echo hi | fcry --raw-key 0123456789abcdef0123456789abcdef | fcry -d --raw-key 0123456789abcdef0123456789abcdef` prints `hi`. Trailers -------- Refs: TODO.md (Roadmap 1.0 follow-up sequence) Breaking-Change: file format; 0.9.x files cannot be decrypted by 0.10.x	2026-05-02 17:22:47 +02:00
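A rough sketch of the header encode/decode roundtrip described in the commit above, for the `Raw` KDF case only. The field order follows the commit message, but the one-byte widths for version, algorithm id, flags, and KDF id, the resulting 32-byte length, the version value 1, and the algorithm id value 1 are all assumptions made for illustration; the real `header.rs` may differ.

```rust
pub const MAGIC: &[u8; 4] = b"fcry";
pub const NONCE_PREFIX_LEN: usize = 19;
/// Assumed layout: magic(4) version(1) alg(1) flags(1) reserved(1)
/// chunk_size(4, LE) kdf_id(1) nonce_prefix(19) = 32 bytes.
pub const HEADER_LEN: usize = 32;
const VERSION: u8 = 1; // assumed value
const ALG_XCHACHA20POLY1305: u8 = 1; // assumed value
const KDF_RAW: u8 = 0; // `kdf_id = 0` per the commit message

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Header {
    pub flags: u8,
    pub chunk_size: u32,
    pub nonce_prefix: [u8; NONCE_PREFIX_LEN],
}

impl Header {
    /// Encodes the header; these exact bytes would also serve as the AAD
    /// for every chunk, so any bit-flip fails authentication.
    pub fn encode(&self) -> [u8; HEADER_LEN] {
        let mut out = [0u8; HEADER_LEN];
        out[..4].copy_from_slice(MAGIC);
        out[4] = VERSION;
        out[5] = ALG_XCHACHA20POLY1305;
        out[6] = self.flags;
        out[7] = 0; // reserved byte
        out[8..12].copy_from_slice(&self.chunk_size.to_le_bytes());
        out[12] = KDF_RAW;
        out[13..].copy_from_slice(&self.nonce_prefix);
        out
    }

    /// Decodes and validates a header, returning an error instead of panicking.
    pub fn decode(buf: &[u8]) -> Result<Self, String> {
        if buf.len() < HEADER_LEN {
            return Err("truncated header".to_string());
        }
        if buf[..4] != MAGIC[..] {
            return Err("bad magic".to_string());
        }
        if buf[4] != VERSION {
            return Err(format!("unsupported version {}", buf[4]));
        }
        if buf[5] != ALG_XCHACHA20POLY1305 {
            return Err(format!("unknown algorithm id {}", buf[5]));
        }
        let chunk_size = u32::from_le_bytes([buf[8], buf[9], buf[10], buf[11]]);
        if chunk_size == 0 {
            return Err("chunk_size must be non-zero".to_string());
        }
        if buf[12] != KDF_RAW {
            return Err(format!("unknown KDF id {}", buf[12]));
        }
        let mut nonce_prefix = [0u8; NONCE_PREFIX_LEN];
        nonce_prefix.copy_from_slice(&buf[13..HEADER_LEN]);
        Ok(Self { flags: buf[6], chunk_size, nonce_prefix })
    }
}
```

The structural checks in `decode` only catch malformed input; detecting deliberate tampering relies on passing the encoded bytes as AAD to the AEAD, which this sketch does not include.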
ddidderr	5e51b4bfe1	whatever	2026-05-02 16:20:20 +02:00
ddidderr	1ae56389fc	[fix] debug prints have to go to stderr	2024-05-08 21:08:46 +02:00
ddidderr	7ffd6a4a11	[deps] cargo update Updating anstream v0.6.11 -> v0.6.12 Updating clap v4.5.0 -> v4.5.1 Updating clap_builder v4.5.0 -> v4.5.1 Updating syn v2.0.48 -> v2.0.49	2024-02-18 23:33:07 +01:00
ddidderr	392a752976	remove .cargo, disable lto for faster compilation during testing	2024-02-14 23:36:56 +01:00
ddidderr	ad03e176c3	on the way to a usable version	2024-02-14 22:23:57 +01:00
ddidderr	668e726c21	(docs) updated TODO and README to clarify project status	2022-04-10 21:31:44 +02:00
ddidderr	d7e86d8f88	fcry - [f]ile[cry]pt - initial commit (alpha 0.9.0) A file en-/decryption tool for easy use. Currently `fcry` uses `ChaCha20Poly1305` ([RFC 8439](https://datatracker.ietf.org/doc/html/rfc8439)) as [AEAD](https://en.wikipedia.org/wiki/Authenticated_encryption) cipher provided by the [chacha20poly1305](https://docs.rs/chacha20poly1305/latest/chacha20poly1305/) crate.	2022-04-10 21:09:08 +02:00

13 Commits