FR regressor v2 — codec-aware (superseded design card)¶
vmaf_tiny_fr_regressor_v2_codec_aware — a codec-conditioned successor to the v1 FR MOS regressor (fr_regressor_v1.onnx). Maps a libvmaf FULL_FEATURES vector plus a one-hot codec id to a single MOS scalar, lifting cross-codec PLCC/SROCC by 1–3 points per the literature cited in ADR-0235 and Research-0040.
Status: superseded by
fr_regressor_v2. This ADR-0235-era card described a canonical-9 /FULL_FEATUREScodec-aware graph before the vmaf-tune Phase-A corpus path became the production route. No separatemodel/tiny/fr_regressor_v2_codec_aware.onnxis shipped. The registry-backed production checkpoint isfr_regressor_v2.onnx(smoke: false), and the wider 16-slot vocabulary successor isfr_regressor_v3.
What the output means¶
One scalar score per (frame, codec) pair, calibrated to the same MOS range as fr_regressor_v1 (typically [0, 100], with the training corpus's per-clip MOS pooling determining the floor / ceiling empirically).
Codec vocabulary¶
The model concatenates a one-hot codec id to its FULL_FEATURES input. The vocabulary is closed and order-stable — ai/src/vmaf_train/codec.py:
| Index | Codec | Distortion signature |
|---|---|---|
| 0 | x264 | Block edges, deblocking-filter aliasing |
| 1 | x265 | CTU-boundary blur, SAO ringing |
| 2 | libsvtav1 | DCT ringing, CDEF + loop-restoration smoothing |
| 3 | libvvenc | Large-CTU deblocking, ALF banding |
| 4 | libvpx-vp9 | Block-transform + loop-filter signature |
| 5 | unknown | Fallback for corpora without codec metadata |
Aliases at parquet-ingest time: h264 → x264, hevc → x265, av1 → libsvtav1, vp9 → libvpx-vp9, vvc / h266 → libvvenc. Anything else collapses to unknown.
The vocabulary is versioned via CODEC_VOCAB_VERSION = 1. Adding a codec is a schema bump that requires retraining; the model-card sidecar pins the vocabulary version against which the ONNX was trained.
Historical ONNX I/O contract¶
Two-input session, matching the LPIPS-Sq precedent (ADR-0040, ADR-0041):
features: float32 [N, 22] # FULL_FEATURES vector
codec_onehot: float32 [N, 6] # one-hot codec id
score: float32 [N] # MOS scalar
Variance-mode export emits score_logvar: float32 [N, 2] (matching the v1 emit_variance precedent — ADR-0207 + the confidence.py gaussian NLL path).
Backwards compatibility with v1¶
FRRegressor(num_codecs=0) is the v1 contract — single input, single output, identical to fr_regressor_v1.onnx. Existing checkpoints load unchanged, the v1 ONNX in model/tiny/ continues to ship, and inference callers without codec metadata can keep calling the v1 graph. v2 ships only when the empirical lift clears the 0.005 PLCC bar (per ADR-0235's "no negative releases" rule).
Historical training recipe¶
Same as v1 (ai/src/vmaf_train/train.py driver, FRRegressor hyperparameters per the C1 baseline) with three deltas:
- Constructor:
FRRegressor(in_features=22, num_codecs=6, ...). - Datamodule: emit 3-tuple
(features, codec_onehot, mos)batches instead of 2-tuple(features, mos). The codec one-hot comes from the parquet'scodeccolumn viavmaf_train.codec.codec_one_hot_batch. - Export:
export_to_onnxcall uses two input names (features,codec_onehot) with dynamic batch axis on both.
Re-extracting features with codec metadata¶
# Netflix Public corpus — distortions are pre-encoded, no codec metadata
python ai/scripts/extract_full_features.py --codec unknown
# BVI-DVC — script encodes internally with libx264 at CRF 35
python ai/scripts/bvi_dvc_to_full_features.py --tier D --codec x264
# KoNViD-1k — synthetic distorted side encoded with libx264
python ai/scripts/konvid_to_full_features.py \
--konvid-root "$VMAF_KONVID_1K_DIR" \
--vmaf-bin core/build-cpu/tools/vmaf \
--codec x264
# Optional audit mode: use ffprobe's source stream codec_name instead.
python ai/scripts/konvid_to_full_features.py \
--konvid-root "$VMAF_KONVID_1K_DIR" \
--vmaf-bin core/build-cpu/tools/vmaf \
--codec-from-source
Known limitations of this superseded design¶
- 22-feature
FULL_FEATURESset only. Re-running with a different feature pool requires a fresh export. - Codec one-hot, not embedding. New codecs can't be added at inference time without retraining (see ADR-0235's alternatives table for why we chose one-hot over
nn.Embedding). - Not the shipped v2 graph. The shipped
fr_regressor_v2.onnxuses canonical-6 features plus an 8-D codec block documented infr_regressor_v2.md, not this card'sFULL_FEATURES+codec_onehotshape. - Single-frame regression. No temporal pooling inside the graph; pool with
mean/harmonic_meanat theVmafContextlevel.
See also¶
- ADR-0235 — design decision.
- Research-0040 — empirical context.
ai/src/vmaf_train/codec.py— vocabulary + one-hot helpers.ai/src/vmaf_train/models/fr_regressor.py— codec-aware model surface.- overview.md — C1 capability map.