ADR-0579: vmaf-tune auto --execute — Phase F real encode/score execution mode
- Status: Accepted
- Date: 2026-05-16
- Deciders: lusoris, Claude (Anthropic)
- Tags: vmaf-tune, phase-f, encode, score, cli, fork-local
Context
vmaf-tune auto (ADR-0325 / ADR-0364) runs the Phase F decision tree and emits a deterministic JSON plan: one or more (codec, preset, crf) cells with predictor estimates for VMAF and bitrate, and a selected flag marking the planner's chosen winner (ADR-0428). Until this ADR, plan emission was the terminal step — actual FFmpeg encodes and libvmaf scores were left to the operator as a manual follow-up. This made the auto subcommand useful for preview and pipeline composition but prevented it from being a fully self-contained end-to-end tool.
The remaining Phase F work tracked in .workingdir2/OPEN.md identifies "real encode/score execution mode" as the next concrete deliverable after pass 29's winner selection. Two design questions arise:
- Output format: the planning dossier mentioned Parquet; the vmaf-tune package has zero mandatory dependencies (pyproject.toml dependencies = []). Adding pyarrow as a mandatory dep for one output serialiser would break zero-dep installs. JSONL is already used by corpus.py and understood by every downstream consumer that currently reads vmaf-tune output.
- Default behaviour: plan-only must remain the default so existing CI and operator scripts are unaffected.
Decision
We will add a run_plan() function in a new vmaftune/executor.py module that:
- Iterates the selected cell(s) from an AutoPlan (or all cells when execute_all=True).
- Drives FFmpeg via the existing run_encode() seam from encode.py.
- Scores each output with the libvmaf CLI via the existing run_score() seam from score.py.
- Writes one JSONL row per cell to <runs_dir>/tune_results.jsonl, appending so partial runs survive restarts.
- Exposes encode_runner and score_runner kwargs as test seams (same pattern as the rest of the harness).
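
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the executor.py implementation: the Cell dataclass, the runner signatures, the output file naming, and the row fields are all hypothetical stand-ins for AutoPlan, run_encode(), and run_score(), whose real interfaces are not shown in this ADR.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Callable, Iterable


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one (codec, preset, crf) plan cell."""
    codec: str
    preset: str
    crf: int
    selected: bool = False


# Assumed runner shape: (cell, output path) -> dict of result fields.
Runner = Callable[[Cell, Path], dict]


def run_plan_sketch(
    cells: Iterable[Cell],
    runs_dir,
    encode_runner: Runner,
    score_runner: Runner,
    execute_all: bool = False,
) -> list:
    """Encode and score the chosen cells, appending one JSONL row per cell."""
    runs_dir = Path(runs_dir)
    runs_dir.mkdir(parents=True, exist_ok=True)
    results_path = runs_dir / "tune_results.jsonl"
    chosen = [c for c in cells if execute_all or c.selected]
    rows = []
    # Append mode keeps rows written by earlier (possibly partial) runs.
    with results_path.open("a", encoding="utf-8") as fh:
        for cell in chosen:
            # Hypothetical naming scheme for the encoded file.
            out_path = runs_dir / f"{cell.codec}_{cell.preset}_crf{cell.crf}.mkv"
            row = {
                **asdict(cell),
                **encode_runner(cell, out_path),
                **score_runner(cell, out_path),
            }
            fh.write(json.dumps(row, sort_keys=True) + "\n")
            fh.flush()  # make each finished cell durable before the next one starts
            rows.append(row)
    return rows


# Fake runners of the kind a test could inject instead of FFmpeg and vmaf.
def fake_encode(cell: Cell, out_path: Path) -> dict:
    return {"encode_size_bytes": 1000 + cell.crf, "encode_time_s": 0.5}


def fake_score(cell: Cell, out_path: Path) -> dict:
    return {"vmaf": 95.0 - cell.crf / 10}
```

Because the runners are plain callables, a test can call run_plan_sketch(cells, tmp_dir, fake_encode, fake_score) and inspect tune_results.jsonl without launching any subprocess.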
The vmaf-tune auto CLI gains three new flags:
- --execute (store_true, default False): enables execute mode; plan-only is unchanged when absent.
- --runs-dir PATH (default runs/): destination for encoded files and tune_results.jsonl.
- --execute-all (store_true): run every plan cell rather than only the selected winner.
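
As an illustration, the three flags could be declared with argparse roughly as below. This is a sketch only: build_auto_parser is a hypothetical helper, the real auto subcommand is presumably registered on an existing parser, and the help strings are paraphrased from this ADR. Whether --execute-all implies --execute is not stated here, so the sketch leaves the two flags independent.

```python
import argparse
from pathlib import Path


def build_auto_parser() -> argparse.ArgumentParser:
    """Hypothetical standalone parser carrying only the new execute-mode flags."""
    # allow_abbrev=False avoids prefix guessing between --execute and --execute-all.
    parser = argparse.ArgumentParser(prog="vmaf-tune auto", allow_abbrev=False)
    parser.add_argument(
        "--execute",
        action="store_true",
        default=False,
        help="run encodes and scores after planning (plan-only when absent)",
    )
    parser.add_argument(
        "--runs-dir",
        type=Path,
        default=Path("runs"),
        metavar="PATH",
        help="destination for encoded files and tune_results.jsonl",
    )
    parser.add_argument(
        "--execute-all",
        action="store_true",
        help="run every plan cell rather than only the selected winner",
    )
    return parser
```

With no arguments the parsed namespace has execute=False, so existing plan-only invocations behave as before.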
Output format is JSONL (not Parquet) to preserve the zero-dependency invariant. A future optional [execute] extra can add pyarrow for operators who want Parquet column-store output; that is out of scope for this ADR.
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Parquet via pyarrow (mandatory dep) | Native columnar; best for downstream ML | Breaks zero-dep install; large binary dep | Zero-dep invariant is a deliberate design choice (pyproject.toml) |
| Parquet via pyarrow (optional extra) | Same columnar benefit; install-optional | Adds a new [execute] extra, more complex install matrix | Can be added later; not needed for Phase F unblock |
| CSV | Universally readable | No native null support; schema fragile with feature columns | JSONL handles variable feature sets (cambi-only vs full CANONICAL6) naturally |
| JSONL (chosen) | Zero new deps; consistent with corpus.py; null-safe | Not columnar; conversion step for ML consumers | Correct choice for v1; Parquet layer is additive |
| Embed in auto.py rather than a new module | Fewer files | auto.py is already large; mixing planning + execution concerns | Separation of concerns; executor.py is independently testable |
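
The CSV versus JSONL trade-off in the table can be made concrete with the standard library. The sketch below round-trips two hypothetical result rows with different feature columns; the keys cambi_mean and adm2_mean are illustrative assumptions, not the real tune_results.jsonl schema.

```python
import csv
import io
import json

# Hypothetical rows: the first carries one feature, the second carries two.
ROWS = [
    {"crf": 23, "vmaf": 92.7, "cambi_mean": 0.12},
    {"crf": 21, "vmaf": 94.1, "cambi_mean": 0.10, "adm2_mean": 0.98},
]


def roundtrip_jsonl(rows):
    """Each row keeps its own keys and value types."""
    text = "".join(json.dumps(row) + "\n" for row in rows)
    return [json.loads(line) for line in text.splitlines()]


def roundtrip_csv(rows):
    """CSV needs the union of all keys up front and returns only strings."""
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty strings
    buf.seek(0)
    return list(csv.DictReader(buf))
```

The JSONL round trip returns the rows unchanged, while the CSV round trip turns every value into a string and reports the missing adm2_mean as an empty string that cannot be told apart from a real empty value.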
Consequences
- Positive: vmaf-tune auto --execute is now a self-contained plan+run verb; operators get a JSONL file with encode size, encode time, VMAF score, and per-feature aggregates without any manual post-processing.
- Positive: The subprocess boundary is a clean test seam: test_executor.py achieves 100% path coverage without FFmpeg or the vmaf binary.
- Positive: tune_results.jsonl is appended to on each run, so partial runs and incremental re-runs do not overwrite previous results.
- Negative: JSONL is not columnar; downstream ML consumers that want efficient column reads need a one-off conversion (polars.read_ndjson / pyarrow table conversion).
- Neutral: --execute is off by default; existing auto callers are unaffected.
- Neutral: follow-up work can add saliency-aware per-shot execution (the other open Phase F item) as a second call to run_plan gated by the plan's saliency short-circuit metadata.
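
For consumers that cannot install polars or pyarrow, the one-off conversion named under "Negative" can be approximated with the standard library. The helper below is a hypothetical stand-in, not part of vmaf-tune: it pivots JSONL rows into a column-oriented dict and fills features missing from a row with None.

```python
import json
from typing import Iterable


def jsonl_to_columns(lines: Iterable[str]) -> dict:
    """Pivot JSONL rows into {column name: list of values}, None where absent."""
    records = [json.loads(line) for line in lines if line.strip()]
    # Union of keys in first-seen order, so column order is deterministic.
    names = list(dict.fromkeys(key for rec in records for key in rec))
    return {name: [rec.get(name) for rec in records] for name in names}
```

A caller would pass an open file handle, for example jsonl_to_columns(open("runs/tune_results.jsonl", encoding="utf-8")), and hand the resulting columns to whichever ML tooling it uses.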
References
- ADR-0325: vmaf-tune auto Phase F decision tree.
- ADR-0364: Phase F adaptive recipe and confidence-aware tuning.
- ADR-0428: Plan winner selection (metadata.winner + cells[].selected).
- .workingdir2/OPEN.md lines 184–188: Phase F remaining work (real encode/score execution).
- req: per user direction in agent task brief 2026-05-16, Phase F execute scaffolding.