Research-0033 — Bristol VI-Lab Neural Video Compression review (2026-04 preprint)

| Field | Value |
|---|---|
| Date | 2026-05-01 |
| Status | Audit; actionable items extracted, no code change |
| Source | .workingdir2/preprints202604.0035.v1.{pdf,html,zip} (40 pages) |
| Authors of source | Gao, Feng, Jiang, Peng, Kwan, Teng, Zeng, Li, Wang, Hamilton, Qi, Zhang, Bull (University of Bristol VI-Lab) |
| Tags | literature, ai, fr-regressor, codec, lpips, dists |

What the preprint is

Advances in Neural Video Compression: A Review and Benchmarking (Gao et al. 2026). The paper presents a taxonomy of scene-agnostic vs scene-adaptive Neural Video Compression (NVC) across paradigms, backbones, and test protocols (258 references), followed by an empirical BD-rate and complexity benchmark of DCVC-DC/FM/RT, MaskCRT, PNVC, GIViC, NVRC, HiNeRV, and C3 against VTM-20, AV1-3.8.1, ECM-12, and AVM-2 on UVG, MCL-JCV, HEVC B-E, and AOM A2-A5, with PSNR, MS-SSIM, and VMAF as quality metrics.

The paper is not about VMAF directly, but it uses VMAF as one of three reporting metrics throughout §6 and cites several pieces of prior art that map directly to fork-local surfaces.

Direct relevance to this fork (ranked by actionability)

1. Tiny-AI training corpus expansion — BVI-AOM ingest (1 week + ADR)

Surface: ai/src/vmaf_train/data/datasets.py, docs/ai/training-data.md. Mirror the existing BVI-DVC manifest.

The paper's Table 2 catalogues canonical NVC training/test sets, including BVI-DVC (Ma, Zhang, Bull 2021; already ingested) and BVI-AOM (Nawała et al. 2024; new, actionable, and not yet ingested). BVI-AOM has 956 sequences and is AOM-CTC-aligned. Concrete change: add a BVI-AOM ingest path next to the existing BVI-DVC one.

Source: §5.1.1, Table 2, refs [216,217,224].

2. NEG-VMAF caveat in fr_regressor_v1 model card (1 day, no ADR)

Surface: ai/src/vmaf_train/models/fr_regressor.py, docs/ai/models/fr_regressor_v1.md.

§5.3 cites Siniukov et al. 2021 [238] on "VMAF hacking by pre/post processing" and Netflix's NEG-VMAF response [235]. Our fr_regressor_v1 is trained as a VMAF regressor and therefore likely inherits this vulnerability. Concrete change: add a NEG-style adversarial holdout (sharpened/contrast-boosted distortions) to the eval suite, and document the NEG caveat in the model card.

Source: §5.3, refs [235], [238].
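
A minimal sketch of how the sharpened/contrast-boosted holdout could be generated, assuming frames are available as 2-D luma arrays. The function names (`box_blur3`, `enhance`) and the default strengths are hypothetical, and the unsharp-mask plus linear contrast stretch shown here is one plausible choice of enhancement, not the procedure from [238] or from NEG-VMAF.

```python
import numpy as np

def box_blur3(frame):
    """3x3 box blur with edge replication; frame is a 2-D array."""
    padded = np.pad(frame, 1, mode="edge")
    h, w = frame.shape
    acc = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0

def enhance(frame, sharpen_amount=1.0, contrast_gain=1.1, peak=255.0):
    """Apply an unsharp mask, then stretch contrast around the frame mean.

    The defaults are illustrative assumptions, not tuned values.
    """
    frame = np.asarray(frame, dtype=float)
    # Unsharp mask: add back the detail removed by the blur.
    sharpened = frame + sharpen_amount * (frame - box_blur3(frame))
    # Linear contrast boost around the mean of the sharpened frame.
    mean = sharpened.mean()
    boosted = mean + contrast_gain * (sharpened - mean)
    return np.clip(boosted, 0.0, peak)
```

Applying `enhance` to the distorted side of each reference/distorted pair yields the holdout. A regressor whose predicted score rises on the enhanced frames would be showing the behaviour the caveat is meant to document.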

3. ST-VMAF / Zhang-2021 prior art for fr_regressor_v2 design space (1 day digest)

Surface: future fr_regressor_v2, model registry.

§5.3 cites Bampis et al. 2018 [236] (ST-VMAF, spatiotemporal feature integration) and Zhang et al. 2021 [237] ("Enhancing VMAF through new feature integration and model combination", PCS 2021, Bull lab) as published improvements over stock VMAF. Both are direct prior work on the architectural axis our fr_regressor already explores (motion-aware FR fusion). Action: read [237], compare its feature set to our Phase-3 sweep, and file a research digest. A re-implementation would be multi-week.

Source: §5.3, refs [236,237].

4. NVC-style BD-rate report recipe (1 day, doc-only)

Surface: python/vmaf/tools/bd_rate_calculator.py, docs/usage/.

Tables 3 & 4 use VMAF as a co-equal axis with PSNR and MS-SSIM for BD-rate against VTM-20 LD/RA. We already ship bd_rate_calculator.py, but we lack a documented "NVC-style BD-rate report" recipe that mirrors the paper's protocol on UVG / MCL-JCV / HEVC B-E so that users can reproduce the standard table.

Source: §5.4, §6.1, Tables 3-4, ref [255] (Bjøntegaard).
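
For readers unfamiliar with the metric, the classic Bjøntegaard computation can be sketched as below. This is an illustrative NumPy version using the cubic-polynomial fit in the log-rate domain; the function name `bd_rate` and its signature are assumptions for this sketch and are not claimed to match the interface or the interpolation method of bd_rate_calculator.py.

```python
import numpy as np

def bd_rate(rate_anchor, quality_anchor, rate_test, quality_test):
    """Bjontegaard delta rate in percent (negative = test codec saves bitrate).

    Quality can be PSNR, MS-SSIM, or VMAF, as long as both curves use
    the same metric. Four rate points per curve is the usual setup.
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    qa = np.asarray(quality_anchor, dtype=float)
    qt = np.asarray(quality_test, dtype=float)

    # Fit log-rate as a cubic polynomial of quality for each codec.
    poly_a = np.polyfit(qa, log_ra, 3)
    poly_t = np.polyfit(qt, log_rt, 3)

    # Integrate only over the quality interval that both curves cover.
    lo = max(qa.min(), qt.min())
    hi = min(qa.max(), qt.max())
    if hi <= lo:
        raise ValueError("rate-quality curves do not overlap in quality")

    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    # Average log-rate difference, converted back to a percentage.
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return float((np.exp(avg_log_diff) - 1.0) * 100.0)
```

A report recipe would call this once per sequence and per metric against the VTM-20 anchor, then average per dataset. Piecewise-cubic (PCHIP) interpolation is a common alternative to the plain cubic fit and can give slightly different numbers, so the recipe should state which variant it uses.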

5. DISTS extractor as LPIPS companion (1 week, ADR)

Surface: core/src/dnn/, model/tiny/.

§5.3 discusses LPIPS [239] and DISTS [240, Ding et al. PAMI 2020] as the deep-feature FR pair widely used for video quality. We ship lpips_sq.onnx but no DISTS extractor. Concrete change: scaffold a dists_sq extractor analogous to feature_lpips.c. This needs an ONNX export, an op-allowlist check, and an ADR following the ADR-0041 pattern.

Source: §5.3, ref [240].

Source-list audit

The paper lists 258 references in total. We cross-checked them against the repo with grep -rli over docs/, model/, and ai/. "already-cited" means the citation key, DOI, or a unique title fragment appears in our tree.
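
The cross-check can also be scripted. The sketch below is a rough Python equivalent of a case-insensitive recursive grep; the helper names `files_mentioning` and `is_already_cited` are hypothetical and exist only for illustration.

```python
from pathlib import Path

def files_mentioning(fragment, roots):
    """Return files under any of `roots` whose text contains `fragment`.

    Matching is case-insensitive, similar in spirit to `grep -rli`.
    Missing roots are skipped and undecodable bytes are ignored.
    """
    needle = fragment.lower()
    hits = []
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue
        for path in sorted(root_path.rglob("*")):
            if not path.is_file():
                continue
            try:
                text = path.read_text(encoding="utf-8", errors="ignore")
            except OSError:
                continue
            if needle in text.lower():
                hits.append(path)
    return hits

def is_already_cited(fragments, roots):
    """True if any citation key, DOI, or title fragment appears in the tree."""
    return any(files_mentioning(fragment, roots) for fragment in fragments)
```

For example, `is_already_cited(["DISTS"], ["docs", "model", "ai"])` would be the check behind the DISTS row. Short fragments can produce false positives, so a unique title fragment or DOI is the safer key.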

| Ref | Year | Title (abbrev.) | Class | Where in our repo |
|---|---|---|---|---|
| [4] | 2003 | Wiegand: H.264/AVC overview | already-cited | docs/metrics/ (ffmpeg context) |
| [5] | 2012 | Sullivan: HEVC overview | already-cited | docs/metrics/ctc/ |
| [6] | 2021 | Bross: VVC overview | already-cited | docs/metrics/ |
| [7] | 2021 | Han: AV1 technical overview | already-cited | docs/metrics/ctc/aom.md |
| [216] | 2021 | Ma, Zhang, Bull: BVI-DVC training database | already-cited | docs/research/0019-tiny-ai-netflix-training.md, ai/src/vmaf_train/data/datasets.py |
| [217] | 2024 | Nawała: BVI-AOM training dataset | directly-relevant + new | ai/src/vmaf_train/data/ (ingest path) |
| [219] | 2010 | Bossen: JVET CTC | already-cited | docs/metrics/ctc/ |
| [220] | 2021 | Zhao: AOM CTC v2 | already-cited | docs/metrics/ctc/aom.md |
| [221] | 2020 | Mercat: UVG dataset | tangential | not used as training set |
| [222] | 2016 | Wang: MCL-JCV dataset | tangential | not used |
| [225] | 2022 | Zhao: AOM CTC v3 | already-cited | docs/metrics/ctc/aom.md |
| [226] | 2004 | Wang: SSIM | already-cited | core/src/feature/ssim.c (origin) |
| [231] | 2006 | Sheikh & Bovik: VIF | already-cited | core/src/feature/vif.c |
| [234] | 2010 | Seshadrinathan: MOVIE | tangential | mentioned only in literature |
| [235] | 2016 | Li: VMAF (Netflix Tech Blog) | already-cited | model/vmaf_*.json, docs/metrics/vmaf.md |
| [236] | 2018 | Bampis: ST-VMAF | directly-relevant + new | ai/src/vmaf_train/models/fr_regressor.py (motion-aware fusion) |
| [237] | 2021 | Zhang: Enhancing VMAF (Bull lab) | directly-relevant + new | docs/research/ (digest); FR-regressor v2 backlog |
| [238] | 2021 | Siniukov: VMAF hacking | directly-relevant + new | docs/ai/models/fr_regressor_v1.md (NEG caveat) |
| [239] | 2018 | Zhang: LPIPS | already-cited | model/tiny/lpips_sq.onnx, ADR-0041 |
| [240] | 2020 | Ding: DISTS | directly-relevant + new | new tiny-AI extractor candidate |
| [241] | 2017 | Liu: RankIQA | tangential | NR-only, future NR-extractor work |
| [242] | 2024 | Feng: RankDVQA (Bull lab) | directly-relevant + new | NR-metric backlog |
| [243] | 2025 | Feng: Towards unified VQA | tangential | survey-level |
| [244-254] | 2023-25 | LMM-VQA series (Q-Align / Q-Insight) | tangential | out of scope for current tiny-AI bar |
| [255] | 2001 | Bjøntegaard: BD-rate definition | already-cited | python/vmaf/tools/bd_rate_calculator.py |
| [257] | 2003 | Wang/Simoncelli/Bovik: MS-SSIM | already-cited | core/src/feature/ms_ssim.c, ADR-0125 |

Not-relevant count: ~225 of 258 — entropy-coding theory, NeRF / Gaussian-splatting variants, end-to-end NVC architectures (DCVC-*, MaskCRT, GIViC, HiNeRV, NVRC, PNVC), normalizing-flow priors, vector-quantization variants, LMM-based VQA. These describe full neural codecs and codec internals — interesting context but no actionable mapping to a fork-local file. The fork is a quality-metric library, not a codec.

Backlog candidates

Short — T7-NEG-VMAF (1 day, no ADR)

Title: document NEG / hacking caveat in fr_regressor_v1 model card.

Scope: add one section to docs/ai/models/fr_regressor_v1.md citing Siniukov [238] and NEG-VMAF [235], plus a smoke eval on a sharpened-distortion holdout under ai/scripts/ to confirm whether the regressor inherits the same vulnerability profile as upstream VMAF.

Success criterion: model card updated, smoke eval committed, PLCC/SROCC delta on the sharpened-set documented.
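
The PLCC/SROCC delta in the success criterion could be computed along these lines. This is a dependency-free sketch; the function names and the clean-versus-sharpened argument layout are assumptions made for illustration, and an existing eval script in the repo may already provide equivalent helpers.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank-order correlation (SROCC) as Pearson on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def correlation_delta(targets_clean, preds_clean, targets_sharp, preds_sharp):
    """Change in PLCC/SROCC when moving from the clean set to the sharpened set."""
    return {
        "plcc_delta": pearson(targets_sharp, preds_sharp)
        - pearson(targets_clean, preds_clean),
        "srocc_delta": spearman(targets_sharp, preds_sharp)
        - spearman(targets_clean, preds_clean),
    }
```

A clearly negative delta on the sharpened set would indicate that the regressor tracks its targets less faithfully once enhancement is applied, which is the number the model card should report.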

Long — T7-BVI-AOM-INGEST (1 week, ADR needed)

Title: add BVI-AOM as a second training corpus for fr_regressor_v2.

Scope: mirror the BVI-DVC ingest in ai/src/vmaf_train/data/datasets.py, add a manifest under ai/src/vmaf_train/data/manifests/, set up LOSO splits, train fr_regressor_v2 on the combined BVI-DVC + BVI-AOM + KoNViD-1k corpus, and compare PLCC/SROCC against the v1 baseline.

Success criterion: v2 ONNX checkpoint registered with PLCC ≥ v1 + 0.01 on held-out test set; LOSO results table in research digest.
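
A sketch of the LOSO split generation, assuming LOSO here means leave-one-source-out and that each manifest entry carries a `source` field identifying its pristine source sequence. Both the field name and the `loso_splits` helper are hypothetical and would need to follow the actual manifest schema.

```python
from collections import defaultdict

def loso_splits(samples):
    """Yield (held_out_source, train, test) for leave-one-source-out CV.

    Every sample from the held-out source goes to the test fold, so no
    content from that source leaks into training.
    """
    by_source = defaultdict(list)
    for sample in samples:
        by_source[sample["source"]].append(sample)

    for held_out in sorted(by_source):
        test = list(by_source[held_out])
        train = [
            sample
            for source, group in by_source.items()
            if source != held_out
            for sample in group
        ]
        yield held_out, train, test
```

With a combined corpus, it may be worth prefixing source IDs with the dataset name so that identically named sequences from different datasets do not collide.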

Bottom line

The paper is a codec-side review, so most of its 258 refs are out of scope for a quality-metric fork. The actionable signal lives in §5 (Test Protocols), mostly in §5.3 (Quality Measures): BVI-AOM as new training data, NEG-VMAF / Siniukov as a hardening axis for the FR regressor, ST-VMAF / Zhang-2021 as direct prior art for the FR regressor's design space, a documented NVC-style BD-rate recipe, and DISTS as a clean LPIPS-companion extractor candidate.