ADR-0872: POSIX I/O EINTR-retry + return-value audit on fork-added C¶
- Status: Accepted
- Date: 2026-05-30
- Deciders: lusoris
- Tags:
correctness,mcp,posix,nasa-power-of-10
Context¶
CLAUDE.md §6 mandates Power-of-10 rule 7: "every non-void return value is checked or explicitly (void)-discarded." Two related concerns combine to make POSIX I/O the highest-risk area for silent violations:
- EINTR semantics. A blocking POSIX call (
read,write,close,accept, ...) can return-1witherrno == EINTRwhen a signal interrupts the kernel wait. Code that does not retry on EINTR treats the spurious wake-up as a hard error or as end-of-stream, silently losing data or corrupting stream framing. - Discarded return values. Even on best-effort cleanup paths, leaving the return naked of a
(void)cast hides intent and trips clang-tidy / cppcheck pickier configurations, and erodes the rule by precedent.
The fork's MCP transports (stdio, uds, sse) and compute_vmaf already had hardened read_line / write_all / read_exact helpers that did the EINTR retry + partial-read/write loop correctly. But two secondary drain loops (the "skip to next LF" path used when a JSON-RPC request line exceeds the 64 KiB bound) did not retry on EINTR — under signal pressure they could exit on a spurious EINTR and leave the rest of the over-long line in the kernel buffer, desynchronising the next request boundary.
Seven close(2) call sites scattered across core/src/libvmaf.c, core/src/feature/cambi.c, core/src/sycl/dmabuf_import.cpp, and core/tools/vmaf_vpl.c discarded the return value without the (void) cast that rule 7 demands.
Decision¶
We will:
- Retry on EINTR in the two MCP drain loops so a signal-interrupted single-byte
read(2)resumes instead of bailing. Pattern:
ssize_t r = read(fd, &c, 1);
if (r < 0) {
if (errno == EINTR) continue;
break;
}
if (r == 0 || c == '\n') break;
The previous combined r <= 0 test conflated EOF (legitimate end- of-loop) with EINTR (must retry) with hard I/O error (must bail). 2. Mark every fork-added close(2) call site that intentionally discards the return with an explicit (void) cast. These are all cleanup or fail-path closures where we cannot meaningfully act on EBADF / EIO — but the cast signals intent and silences clang-tidy's cert-err33-c.
No silent data-loss EINTR-violations exist in the primary I/O paths (read_line, write_all_with_newline, sse_read_n, read_exact all already implement the textbook pattern). No partial-write defects exist either — every write_all* helper loops until off == len.
Alternatives considered¶
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Leave drain-loop EINTR unhandled | Zero change | Silent stream desync under signal pressure — exact class of bug rule 7 is designed to prevent | Rejected: defeats the rule's intent |
Wrap close(2) in a helper macro vmaf_close() that logs on failure | One choke point | All current sites are cleanup paths where logging would add noise without actionable signal; would require new header churn | Rejected: cost > benefit for ~7 sites |
| Audit only the MCP surface (skip the 7 close calls) | Smaller blast radius | Leaves rule-7 violations in tree visible to clang-tidy / next maintainer; rule applies project-wide | Rejected: half-measure |
Convert drain loops to MSG_PEEK + bulk recv | Fewer syscalls per skipped byte | MSG_PEEK is socket-only and the stdio transport reads from a pipe / tty; would diverge the two transports' implementations | Rejected: keeps the cross-transport mirroring discipline already in tree |
Consequences¶
- Positive: MCP transports survive signal-rich environments (containerised runtimes with periodic SIGCHLD / SIGURG, debuggers attaching) without silently corrupting the next request. Rule 7 stays clean on every fork-added C surface that uses
close(2). - Negative: A few extra
(void)casts add visual weight but cost nothing at runtime. The drain loop EINTR retry could in theory loop indefinitely under a pathological signal storm; in practice the loop is still bounded by the caller's read budget (the line is capped atVMAF_MCP_MAX_LINE_BYTES, so even if every byte triggers one EINTR retry the total iterations are bounded). - Neutral / follow-ups: The vendored
core/src/svm.cpp:2522close(fd)on the libsvm-rewrite fail path was left untouched — it is upstream libsvm code that the fork patches but does not own; per CLAUDE.md "vendored is in scope" the fork still has the right to fix it, but the cost (touching a long-stable port-only file for a fail-path cast) outweighs the benefit. Flagged for a future libsvm rebase pass.
References¶
- CLAUDE.md §6 — Power-of-10 rule 7 ("every non-void return value is checked or explicitly
(void)-discarded"). - docs/principles.md — NASA/JPL Power of 10 applicable subset.
- ADR-0278 — NOLINT citation closeout precedent for rule-7 hygiene.
- POSIX.1-2017
read(2),write(2),close(2),accept(2)— EINTR semantics. - Source:
req(direct user direction, audit prompt 2026-05-30: "Audit C/C++ I/O calls for proper EINTR handling + missing error checks.").