ADR-0882: Fuzz target audit — JSON model + DNN sidecar harness expansion

Status: Accepted
Date: 2026-05-30
Deciders: lusoris, Claude
Tags: ci, security, fuzzing, dnn

Context

ADR-0270 landed the libFuzzer scaffold with fuzz_y4m_input; ADR-0311 expanded it with fuzz_yuv_input + fuzz_cli_parse. Research-0083 ranked the remaining parser surfaces and identified two deferred targets, fuzz_model_load (rank #3, the libvmaf SVM model JSON parser) and fuzz_sidecar (rank #4, the tiny-AI sidecar JSON loader), as offering the next-highest risk-weighted coverage gain for the fewest harness lines of code. Both surfaces are attacker-reachable from the CLI (-m path=… / --tiny-model path=…) and have classic fuzz-amenable shapes: an incremental tokeniser feeding arithmetically grown arrays in the SVM parser, and hand-rolled strstr / strchr walkers with per-key heap allocations in the sidecar parser. Research-0083 green-lit both targets, but they were deferred from the ADR-0311 PR to keep that change focused.
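To make the "strstr / strchr walker with per-key heap allocation" shape concrete, here is a minimal sketch of that style of extractor. It is an illustration only, not the fork's extract_string: the function name extract_string_sketch, the 64-byte key limit, and the lack of escape handling are all assumptions made for the example. Every strchr result must be checked before use, and that is exactly the kind of invariant a fuzzer is good at probing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration only: NOT the fork's extract_string().
 * Finds "key": "value" in a NUL-terminated buffer and returns a heap
 * copy of value, or NULL on any malformed input. Escapes are ignored. */
char *extract_string_sketch(const char *json, const char *key)
{
    char pattern[64];                       /* assumed key-length limit */
    size_t klen = strlen(key);
    if (klen + 3 > sizeof pattern)          /* two quotes + NUL must fit */
        return NULL;
    pattern[0] = '"';
    memcpy(pattern + 1, key, klen);
    pattern[klen + 1] = '"';
    pattern[klen + 2] = '\0';

    const char *p = strstr(json, pattern);  /* locate the quoted key */
    if (!p)
        return NULL;
    p = strchr(p + klen + 2, ':');          /* skip to the separator */
    if (!p)
        return NULL;
    p = strchr(p, '"');                     /* opening quote of the value */
    if (!p)
        return NULL;
    const char *start = p + 1;
    const char *end = strchr(start, '"');   /* closing quote may be absent */
    if (!end)
        return NULL;

    size_t len = (size_t)(end - start);
    char *out = malloc(len + 1);            /* per-key heap allocation */
    if (!out)
        return NULL;
    memcpy(out, start, len);
    out[len] = '\0';
    return out;                             /* caller frees */
}
```

Dropping any one of the NULL checks turns a truncated input into a wild pointer dereference or an oversized copy, which is why this parser shape tends to reward byte-level mutation.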

Decision

We will ship fuzz_json_model (wraps vmaf_read_json_model_from_buffer and the collection variant) and fuzz_dnn_sidecar (wraps vmaf_dnn_sidecar_load); these cover the surfaces Research-0083 ranked as fuzz_model_load and fuzz_sidecar. Each harness has a small seed corpus committed verbatim and is wired into the nightly .github/workflows/fuzz.yml matrix. The harnesses are opt-in via -Dfuzz=true (clang-only) and pair with AddressSanitizer + UBSan; the nightly job fuzzes each harness for 60 s, starting from the seed corpus, and uploads any crash artefacts. First-run findings are documented in docs/state.md per ADR-0404 (keep gates running, document real bugs, no continue-on-error silencing).
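The general shape of such a harness can be sketched as follows. This is a minimal sketch, not the committed harness: parse_model_stub is a hypothetical stand-in for the real parser entry point, so the example compiles without libvmaf, and the 1 MiB input cap is an assumed value. Only LLVMFuzzerTestOneInput and its signature come from libFuzzer itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parser under test. A real harness would
 * call the library's buffer-based loader here instead. Returns 0 when the
 * buffer looks like a JSON object, -1 otherwise. */
int parse_model_stub(const char *data, size_t len)
{
    if (len < 2 || data[0] != '{' || data[len - 1] != '}')
        return -1;
    return 0;
}

/* libFuzzer entry point: must tolerate any byte sequence and return 0. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    if (size > (1u << 20))          /* assumed cap to keep iterations fast */
        return 0;

    /* Copy into an exact-size heap buffer so ASan redzones sit directly
     * after the input, and NUL-terminate it for str*-based walkers. */
    char *buf = malloc(size + 1);
    if (!buf)
        return 0;
    if (size)
        memcpy(buf, data, size);
    buf[size] = '\0';

    /* The parse result is ignored: rejection is fine, only crashes and
     * sanitizer reports count as findings. */
    (void)parse_model_stub(buf, size);

    free(buf);
    return 0;
}
```

Built directly with clang, a file like this would typically be compiled with -fsanitize=fuzzer,address,undefined; in this repository the equivalent wiring sits behind -Dfuzz=true. A 60 s budget corresponds to libFuzzer's -max_total_time=60 option, though whether the workflow uses that exact flag is an assumption here.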

Alternatives considered

Option	Pros	Cons	Why not chosen
Ship both harnesses with full CI wiring (chosen)	Closes the two deferred targets from Research-0083 in one PR; nightly coverage starts on next scheduled run; matches the precedent set by ADR-0311	The nightly run will fail until the surfaced fork-local OOB in `vmaf_model_destroy` is fixed	ADR-0404 explicitly prefers CI-red-but-honest over CI-green-but-silent; the harness is doing its job
Ship the harnesses but mark them `continue-on-error` until backing fixes land	CI badge stays green during the triage window	Directly violates `feedback_no_test_weakening` and ADR-0404; silences a working detector	Rejecting it upholds the user's standing rule; ADR-0404 already adjudicated this exact trade-off for `fuzz_y4m_input`
Defer `fuzz_dnn_sidecar` until ORT integration stabilises (only ship `fuzz_json_model`)	Smaller PR; fewer moving parts	Sidecar parser already ships in master; the `extract_string` / `extract_string_array` family is fork-local code with no upstream fuzz coverage; deferring further keeps a high-risk surface untested	Both targets were already audited and green-lit in Research-0083; further deferral is purely procedural
Onboard the whole stack to OSS-Fuzz instead of expanding the in-tree harness set	Continuous fuzz at OSS-Fuzz scale; canonical "full" Scorecard credit	Onboarding has its own ADR-level scope (build container, contact email, dictionary set, manual maintainer review)	Tracked separately under ADR-0270 §Alternatives — orthogonal to the in-tree harness expansion

Consequences

Positive:
Two more attacker-reachable parsers gain continuous fuzz coverage on the nightly job.
The Scorecard Fuzzing check stays in the "partial" bucket, but reviewers now see a broader harness matrix (5 harnesses, up from 3).
The first run surfaced one real fork-local heap-buffer-overflow (T-JSON-MODEL-SLOPES-FEATURE-CAP-OOB-2026-05-30 in docs/state.md), demonstrating the harness works as intended.
Negative:
The nightly fuzz_json_model leg will fail until the slopes / feature_cap reconciliation lands in a follow-up PR. Per ADR-0404 this is the expected behaviour for a working detector.
The harness binaries compile core/src/read_json_model.c + pdjson.c + dict.c + log.c + dnn/model_loader.c directly instead of linking against libvmaf.so, because libvmaf is built with -fvisibility=hidden (ADR-0379) and the relevant entry points are not VMAF_EXPORTed. This mirrors the existing test_model + test_model_loader precedents in core/test/meson.build and core/test/dnn/meson.build. The harness build additionally requires -Db_lto=false, because the combination of ASan and LTO bitcode discards module-dtor sections at link time on the larger source set.
Neutral / follow-ups:
core/test/fuzz/json_model_known_crashes/slopes_oob_destroy.bin is committed for regression coverage after the fix lands. It is deliberately excluded from the nightly seed corpus per the precedent established by y4m_input_known_crashes/. The .bin extension (rather than .json) sidesteps the pre-commit check-json hook, which would otherwise refuse to commit a fuzzer-mutated payload that is intentionally not valid JSON.
core/test/fuzz/README.md is updated with the two new rows in the Targets table and a known-crash note for the json_model OOB.
docs/research/0083-libfuzzer-harness-expansion-target-survey.md rows #3 + #4 have now landed; rows #5–#7 (fuzz_per_shot, fuzz_output, fuzz_dnn_load) remain backlog.
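For the known-crash file mentioned above, regression coverage amounts to feeding the saved bytes back through the harness entry point once the fix lands. A minimal sketch of such a replay helper follows; the names replay_crash_file, recording_entry, and last_size_seen are hypothetical, and recording_entry is only a stub so the example is self-contained. In a real build the callback would be the harness's own LLVMFuzzerTestOneInput.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stub entry point for illustration: records the input size and returns 0.
 * A real regression run would pass LLVMFuzzerTestOneInput instead. */
size_t last_size_seen;
int recording_entry(const uint8_t *data, size_t size)
{
    (void)data;
    last_size_seen = size;
    return 0;
}

/* Reads the whole file at `path` and hands its bytes to `entry` once.
 * Returns the entry point's result, or -1 if the file cannot be read. */
int replay_crash_file(const char *path,
                      int (*entry)(const uint8_t *, size_t))
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    long n = ftell(f);
    if (n < 0) {
        fclose(f);
        return -1;
    }
    rewind(f);

    uint8_t *buf = malloc((size_t)n + 1);   /* +1 avoids malloc(0) */
    if (!buf) {
        fclose(f);
        return -1;
    }
    size_t got = fread(buf, 1, (size_t)n, f);
    fclose(f);

    /* Only replay when the full file was read. */
    int rc = (got == (size_t)n) ? entry(buf, got) : -1;
    free(buf);
    return rc;
}
```

A libFuzzer-linked binary can also replay a single file passed as a positional argument, so a helper like this is mainly useful for builds that do not link the libFuzzer runtime. Under ASan, a reintroduced overflow would abort the process during the entry call rather than return an error code.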

References

ADR-0270 — initial scaffold.
ADR-0311 — yuv_input + cli_parse expansion (the direct predecessor).
ADR-0379 — -fvisibility=hidden policy that forces source-level link for non-exported parsers.
ADR-0404 — keep-gates-running policy when a harness surfaces a real bug.
Research-0083 — target ranking that green-lit the deferred pair.
libFuzzer (LLVM).
OSSF Scorecard Fuzzing check.
Source: agent task directive 2026-05-30 — "Audit existing fuzz targets + add fuzzers for high-risk parsing surfaces that lack one." (paraphrased)