Research-0070: vmaf-tune codec-agnostic encode dispatcher¶
- Status: digest for ADR-0297
- Date: 2026-05-03
Problem¶
tools/vmaf-tune/src/vmaftune/encode.py was hard-coded to libx264: the FFmpeg argv contained the literal -c:v libx264 -preset $PRESET -crf $CRF, the version regex only knew x264's core <N> line, and the corpus loop assumed CRF semantics throughout. Nine in-flight codec adapter PRs (libx265, libsvtav1, libaom, libvvenc, NVENC, QSV, AMF, VideoToolbox, plus the wave to 17) could not drive end-to-end encodes until the harness stopped baking x264-shaped argv.
Survey of codec quality knobs¶
| Codec | FFmpeg -c:v value(s) | Quality knob | Preset shape |
|---|---|---|---|
| libx264 | libx264 | -crf (0..51) | ultrafast..veryslow |
| libx265 | libx265 | -crf (0..51) | ultrafast..veryslow |
| libsvtav1 | libsvtav1 | -crf (0..63) | -preset 0..13 (integer) |
| libaom-av1 | libaom-av1 | -crf + -b:v 0 | -cpu-used 0..8 |
| libvpx-vp9 | libvpx-vp9 | -crf + -b:v 0 + -deadline good | -cpu-used 0..8 |
| libvvenc | libvvenc | -qp (constant-QP) or -b:v (target-rate) | faster/fast/medium/slow/slower |
| NVENC (h264/hevc/av1) | h264_nvenc, ... | -cq (with -rc vbr) | p1..p7 |
| QSV (h264/hevc/av1/vp9) | h264_qsv, ... | -global_quality | veryfast..veryslow |
| AMF (h264/hevc/av1) | h264_amf, ... | -qp_i + -qp_p (with -rc cqp) | -quality speed/balanced/quality |
| VideoToolbox (h264/hevc) | h264_videotoolbox | -q:v (1..100, higher = better) | (none — VT picks internally) |
Two non-uniformities forced the dispatcher design:
- Quality knob is not always
-crf. NVENC uses-cq, VVenC uses-qp, QSV uses-global_quality, VideoToolbox uses-q:von a 1..100 scale where higher is better. - Preset slot is not always
-preset. libaom-av1 / libvpx-vp9 take-cpu-used, AMF takes-quality. libsvtav1 keeps-presetbut the value is an integer.
A single hard-coded ffmpeg invocation cannot serve all of these. The adapter is the natural place to translate.
Options¶
A — Codec-agnostic dispatcher (chosen)¶
Adapter exposes ffmpeg_codec_args(preset, quality) -> list[str] returning the codec-specific argv slice. Dispatcher concatenates it into the otherwise-uniform [ffmpeg, -y, ..., -i src, *codec_args, *extra_params, output] shape.
- Pros: One PR unblocks 17 adapters; harness stays codec-agnostic per ADR-0237 invariant; existing x264 argv is bit-preserved.
- Cons: Duck-typed contract; missing methods fall back silently. Mitigated by per-codec test cases pinning the expected argv shape.
B — Per-codec driver functions¶
run_encode_x264, run_encode_x265, ... each in its own module.
- Pros: Strong isolation; type-checker can pin signatures per codec.
- Cons: Forks the harness loop 17 ways; rebase nightmare; every bisect / retry / version-parse fix must touch 17 files.
- Verdict: rejected — same anti-pattern ADR-0237 already ruled against.
C — Strict Protocol with mandatory methods¶
Define a runtime_checkable Protocol requiring ffmpeg_codec_args; reject adapters that don't implement it.
- Pros: Mis-typed adapters fail loudly at registration.
- Cons: Forces all 9 in-flight adapter PRs to land their
ffmpeg_codec_argsbefore the dispatcher merges — the whole point of this PR is to unblock them, so adding a synchronous-rendezvous requirement defeats the purpose. - Verdict: rejected — task hard-rule explicitly demands the fallback path.
D — Bump SCHEMA_VERSION to add a quality row column¶
Replace crf in the JSONL row with a codec-agnostic quality.
- Pros: Cleaner naming for non-CRF codecs.
- Cons: Forces every Phase B/C consumer to migrate; row schema contract (rebase-notes #0227) explicitly says "adding optional keys is fine; renaming requires bumping
SCHEMA_VERSIONand every downstream consumer in the same PR". - Verdict: rejected — keep
crfin the row, exposequalityonly as a request-side property.
Reproducer¶
cd /home/kilian/dev/vmaf
pytest tools/vmaf-tune/tests/ -q
# 32 passed (13 existing + 19 new multi-codec dispatcher tests)
python -c "
from pathlib import Path
from vmaftune.encode import EncodeRequest, build_ffmpeg_command
req = EncodeRequest(
source=Path('ref.yuv'), width=1920, height=1080, pix_fmt='yuv420p',
framerate=24.0, encoder='libx264', preset='medium', crf=23,
output=Path('out.mp4'),
)
print(build_ffmpeg_command(req))
"
Expected output contains -c:v libx264 -preset medium -crf 23 — unchanged from Phase A.
References¶
- ADR-0237 — parent spec; codec-agnostic-search-loop invariant.
- Research-0044 — option-space digest for the parent ADR.
- FFmpeg encoder docs:
ffmpeg -h encoder=libx264,=libx265,=libsvtav1,=libaom-av1,=libvpx-vp9,=libvvenc,=h264_nvenc,=hevc_qsv,=h264_amf,=h264_videotoolbox.