ADR-0615: Fast NR Pre-Scoring for CRF Bisect Acceleration

  • Status: Proposed
  • Date: 2026-05-19
  • Deciders: lusoris
  • Tags: ai, planning, vmaf-tune

Context

Full-reference VMAF (FR VMAF) requires decoding both the reference and the distorted stream to YUV, then running six feature extractors frame by frame. On a 10-second HD shot, a single FR VMAF call takes 2–5 seconds on CPU. The Phase B bisect makes 5–8 calls per shot, so a 500-shot title needs 2,500–4,000 FR calls, or roughly 5,000–20,000 seconds of scoring. The in-tree nr_metric_v1.onnx model scores the distorted stream alone, with no reference decode, and runs in <200 ms per shot. Using it as a coarse proxy for early bisect iterations, where the midpoint is usually far from the target, can cut bisect wall-time by an estimated 2–4× (see Consequences).
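A rough per-midpoint cost model makes the potential saving concrete. The sketch below assumes that every bisect midpoint pays the NR cost and that a fraction p_skip of midpoints avoids the FR call entirely. The name p_skip and the function bisect_speedup are hypothetical, introduced only for illustration; the timings are the figures quoted above, not new measurements.

```python
def bisect_speedup(t_fr: float, t_nr: float, p_skip: float) -> float:
    """Ratio of FR-only scoring time to NR-proxied scoring time per midpoint.

    t_fr   -- seconds per FR VMAF call (2-5 s quoted for a 10-second HD shot)
    t_nr   -- seconds per NR call (<0.2 s quoted for nr_metric_v1.onnx)
    p_skip -- assumed fraction of midpoints where the FR call is skipped
    """
    if not 0.0 <= p_skip <= 1.0:
        raise ValueError("p_skip must lie in [0, 1]")
    # Every midpoint runs NR; only the non-skipped share also runs FR.
    proxied = t_nr + (1.0 - p_skip) * t_fr
    return t_fr / proxied


if __name__ == "__main__":
    for t_fr in (2.0, 5.0):
        for p_skip in (0.0, 0.5, 0.75):
            ratio = bisect_speedup(t_fr, 0.2, p_skip)
            print(f"t_fr={t_fr:.0f}s p_skip={p_skip:.2f} speedup={ratio:.2f}x")
```

With p_skip = 0 the ratio falls below 1, because the NR call is then pure overhead on top of the FR call.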

Decision

We will implement an NR-early-elimination strategy (Research 0611 Option B). At each bisect midpoint, the NR score is computed first. If |NR - target| > δ_fast (a calibrated threshold, default 8 VMAF points), the FR call is skipped and the bisect proceeds in the NR-implied direction. Otherwise the midpoint lies in the uncertainty zone and the FR call is made. The threshold δ_fast is calibrated on the Netflix corpus by fitting vmaf_fr ≈ f(vmaf_nr) and setting δ_fast to 2σ of the fit residual. An NRProxyBackend class is added to score_backend.py to encapsulate this logic; it plugs in through the existing backend selector.
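The decision rule can be sketched as follows. This is a minimal illustration, not the planned implementation: the class fields, the callable signatures, the bisect_crf helper, and the CRF range are all assumptions, and the NR score is assumed to be already mapped onto the VMAF scale. The actual interface in score_backend.py may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class NRProxyBackend:
    """Hypothetical sketch of the NR-early-elimination rule."""
    score_nr: Callable[[int], float]  # CRF -> NR score on the VMAF scale (assumed)
    score_fr: Callable[[int], float]  # CRF -> FR VMAF score
    target: float                     # target VMAF for the bisect
    delta_fast: float = 8.0           # default threshold from this ADR
    fr_calls: int = 0                 # how many FR calls were actually paid

    def score(self, crf: int) -> float:
        nr = self.score_nr(crf)
        if abs(nr - self.target) > self.delta_fast:
            # Far from the target: NR alone decides the bisect direction.
            return nr
        # Inside the uncertainty zone: pay the FR cost.
        self.fr_calls += 1
        return self.score_fr(crf)


def bisect_crf(backend: NRProxyBackend, lo: int, hi: int) -> Optional[int]:
    """Highest CRF in [lo, hi] that still meets the target, or None.

    Assumes the score decreases monotonically as CRF increases.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if backend.score(mid) >= backend.target:
            best, lo = mid, mid + 1
        else:
            hi = mid - 1
    return best


if __name__ == "__main__":
    fr = lambda crf: 120.0 - 1.5 * crf   # toy FR curve, not real data
    nr = lambda crf: fr(crf) + 3.0       # toy NR proxy with a constant bias
    backend = NRProxyBackend(score_nr=nr, score_fr=fr, target=80.0)
    print(bisect_crf(backend, 18, 40), backend.fr_calls)
```

Because the final midpoints always land near the target, they fall inside the uncertainty zone and are scored by FR, which is why the boundary decision does not rest on the NR model alone.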

Alternatives considered

| Option | Pros | Cons | Why not chosen |
| --- | --- | --- | --- |
| A: NR-only bisect + FR confirm on final CRF | Maximum speedup (3–6×) | Risk of selecting wrong CRF on NR-biased content | Correctness risk too high without calibration validation |
| B: NR early elimination (chosen) | Correct by construction at boundary; 2–4× speedup | Two-model invocation; requires calibrating δ_fast | n/a (chosen) |
| C: conformal-calibrated NR threshold | Principled UQ; lowest false-pass rate | 1–2 weeks extra for conformal calibration | Deferred as V2 upgrade path |
| D: bitstream-feature NR (no decode) | 5–10× speedup | Encoder-specific parsers; non-portable | Significant engineering; deferred |

Consequences

  • Positive: 2–4× bisect wall-time reduction on in-domain content; directly enables the DO (ADR-0613) and per-shot ABR (ADR-0614) items whose O(candidates) inner loops become tractable.
  • Negative: Midpoints that fall inside the uncertainty zone pay for both models, so the NR call adds up to ~200 ms of overhead wherever the FR call is still required; calibration requires a one-time corpus sweep.
  • Neutral / follow-ups: δ_fast must be documented in model/tiny/nr_metric_v1.json as a new calibration_threshold field. Conformal upgrade (Option C) is a natural V2.
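Reading the threshold back from the sidecar could look like the sketch below. Only the field name calibration_threshold and the default of 8 come from this ADR; the function name, the fallback behaviour, and the validation rule are assumptions, and the rest of the sidecar schema is not modelled here.

```python
import json
from pathlib import Path

DEFAULT_DELTA_FAST = 8.0  # default from this ADR, in VMAF points


def load_delta_fast(sidecar: Path, default: float = DEFAULT_DELTA_FAST) -> float:
    """Return calibration_threshold from the model sidecar JSON.

    Falls back to `default` when the file or the field is missing
    (assumed behaviour, not specified by the ADR).
    """
    try:
        meta = json.loads(sidecar.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return default
    value = meta.get("calibration_threshold", default)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
        raise ValueError(f"calibration_threshold must be a positive number, got {value!r}")
    return float(value)
```

A call such as load_delta_fast(Path("model/tiny/nr_metric_v1.json")) would then return the calibrated value once the sweep has recorded it, and 8.0 before that.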

Dependencies

  • model/tiny/nr_metric_v1.onnx — in-tree; no new model required.
  • tools/vmaf-tune/src/vmaftune/score_backend.py — integration point.
  • Netflix corpus (.workingdir2/netflix/) — for calibration sweep.

Implementation phases

| Phase | Description | Effort |
| --- | --- | --- |
| P1 | NRProxyBackend in score_backend.py; unit tests with stub NR model | 1 day |
| P2 | Calibration sweep script against Netflix corpus; compute and record δ_fast | 1 day |
| P3 | Wire NRProxyBackend into Phase B bisect as opt-in --fast-nr flag | 0.5 day |
| P4 | Docs: docs/usage/vmaf-tune-fast-nr.md | 0.5 day |

Total estimate: 3 days.

References

  • Research digest: docs/research/0611-fast-nr-prescoring-research.md.
  • model/tiny/nr_metric_v1.onnx, model/tiny/nr_metric_v1.json.
  • tools/vmaf-tune/src/vmaftune/score_backend.py, bisect.py.
  • tools/vmaf-tune/src/vmaftune/conformal.py — future conformal upgrade path.
  • Source: per user direction (roadmap planning session 2026-05-19).