ADR-0717: vmafx-node — ffmpeg latest-tag pinning + ffmpeg-patches bundled into Dockerfile
- Status: Accepted
- Date: 2026-05-28
- Deciders: Lusoris
- Tags: node, ffmpeg, docker, phase4b, fork-local
Context
Phase 4b.4 (ADR-0709) requires the vmafx-node worker container to run ffmpeg as a subprocess for all encoding operations. The node image must therefore include a fully-compiled ffmpeg binary with the fork's ffmpeg-patches/ series applied (the patches wire the VMAFX-specific libvmaf filters into FFmpeg's filter graph, expose the SYCL / Vulkan / CUDA / HIP backend selectors, and add the vmaf-tune qpfile and profile-glue patches).
Two packaging decisions require an explicit record:
- Which FFmpeg version to pin in the node image: the options are a pinned stable release tag (e.g., n8.2) versus tracking a rolling release branch.
- Where the ffmpeg build lives: the node Dockerfile vs a shared external image.
The existing dev/Containerfile (the developer container) already builds FFmpeg n8.1.1 with all 15 patches applied. That container is not a runtime delivery artifact; it exists to give developers a full-featured workbench. The node Dockerfile needs its own independent ffmpeg build so the production node image stays lean (no developer toolchain, no Python environment, no MCP server).
Decision
Option A (pin to the latest stable tagged release) is chosen.
The docker/Dockerfile.node pins FFMPEG_TAG=n8.2 (the latest stable FFmpeg 8.x release at the time of this ADR). The tag is a Docker build argument with a documented default; CI can override it to test against a newer tag. The value is updated on every release-sync PR that touches ffmpeg-patches/.
The Dockerfile structure is a shared multi-stage build:
- vmaf-builder: compiles libvmaf.so + vmaf CLI from core/ (CPU-only, release/stripped). Shared by all four node variants.
- ffmpeg-deps-base: installs codec build-time dependencies (nasm, libx264-dev, libx265-dev, libvpx-dev, libdav1d-dev, libopus-dev, ...) and builds SVT-AV1 from source (Ubuntu's libsvtav1-dev omits SvtAv1Enc.pc; same root cause documented in the dev Containerfile's §"NOTE: libaom" comment block).
- ffmpeg-builder-cpu: clones FFmpeg at FFMPEG_TAG, applies all patches in ffmpeg-patches/series.txt order via git am --3way, configures with --enable-libvmaf --enable-libx264 --enable-libx265 --enable-libvpx --enable-libsvtav1 --enable-libdav1d, and installs into /ffmpeg-install.
- go-builder: compiles cmd/vmafx-node with CGo disabled for libvmaf linkage.
- runtime-base: gcr.io/distroless/cc-debian12 (same base as the controller and production images).
- node-cpu / node-cuda / node-rocm / node-sycl: four final targets that add vendor-specific runtime libs on top of the shared base.
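The patch-application step in ffmpeg-builder-cpu can be sketched in Go as follows. This is a minimal illustration, not the Dockerfile's actual logic. It assumes series.txt lists one patch filename per line and that blank lines and lines starting with # are skipped, which this ADR does not specify. PatchOrder, GitAmArgs, and any patch filenames passed to them are hypothetical.

```go
package main

import (
	"bufio"
	"path"
	"strings"
)

// PatchOrder returns patch paths in the order they appear in the series file.
// Assumption: one filename per line; blank lines and '#' comments are skipped.
func PatchOrder(series, dir string) []string {
	var patches []string
	sc := bufio.NewScanner(strings.NewReader(series))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		patches = append(patches, path.Join(dir, line))
	}
	return patches
}

// GitAmArgs builds the argument list for one `git am --3way <patch>` call.
func GitAmArgs(patch string) []string {
	return []string{"am", "--3way", patch}
}
```

A caller would run git with GitAmArgs(p) for each returned path in turn and stop at the first failure, since later patches in a series usually depend on earlier ones.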
libaom is excluded for the same reason as in the dev container: ffmpeg-patches/0007 references aom_roi_map_t fields (enabled, skip, ref_frame, delta_qp_enabled) that do not exist in any released libaom. SVT-AV1 covers the AV1 production lane.
The pkg/encoder package (in tree at pkg/encoder/encoder.go) already shells out to ffmpeg via EncodeParams.FFmpegBin. The node startup probe (cmd/vmafx-node/probe/probe.go) runs ffmpeg -encoders once at startup and caches the result; the server logs the inventory on every start so operators can confirm the codec matrix without shelling into the container.
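To illustrate what the startup probe has to do with the ffmpeg -encoders output, here is a minimal parsing sketch. It is not the code in probe.go; ParseEncoders is a hypothetical name. It relies on the listing format FFmpeg prints: a legend, a dashed separator line, then one row per encoder with a flags column followed by the encoder name.

```go
package main

import (
	"bufio"
	"strings"
)

// ParseEncoders extracts encoder names from the text printed by
// `ffmpeg -encoders`. Rows before the dashed separator (the header and
// the flag legend) are ignored.
func ParseEncoders(out string) map[string]bool {
	inventory := make(map[string]bool)
	pastSeparator := false
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !pastSeparator {
			pastSeparator = strings.HasPrefix(line, "---")
			continue
		}
		// Each row is "<flags> <name> <description...>".
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		inventory[fields[1]] = true
	}
	return inventory
}
```

The input would typically come from something like exec.Command(ffmpegBin, "-hide_banner", "-encoders").Output(), called once and stored so later lookups do not spawn another process.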
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| A — pin to latest stable tag (chosen) | Reproducible builds; n8.2 tag is immutable; CI can test against a known state | Must update the tag on every release sync | Reproducibility outweighs the small maintenance cost. Rolling builds introduce non-determinism that is difficult to bisect when a codec behaviour changes. |
| B — track release/8.x branch | Always gets the latest patch-level fixes without a manual tag bump | Rolling: two builds of the same image can produce different binaries; ffmpeg-patches may break against a mid-branch commit | Violates the principle of immutable image content. Commit SHAs are acceptable; branch names are not. |
| C — reuse the dev container ffmpeg | Zero extra build time | The dev container is a 4GB+ developer workbench; pulling it at node-runtime time is impractical; the CUDA/oneAPI toolchains would inflate the production image by 2–3GB | Not a production delivery artifact; wrong base image. |
| D — install distro ffmpeg via apt | Zero build time | Ubuntu's ffmpeg package cannot have the fork's patches applied; lags the upstream release significantly; codec matrix is frozen to whatever the distro ships | The patches are load-bearing; distro ffmpeg cannot be used. |
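The rule stated for option B (tags and commit SHAs are acceptable, branch names are not) could be enforced on a CI-supplied FFMPEG_TAG override with a check like the one below. This is a sketch under assumptions: IsImmutableRef is a hypothetical helper, and the patterns accept only nX.Y or nX.Y.Z release tags and full 40-character lowercase commit SHAs.

```go
package main

import "regexp"

var (
	// FFmpeg release tags such as n8.2 or n8.1.1.
	releaseTag = regexp.MustCompile(`^n\d+\.\d+(\.\d+)?$`)
	// Full-length lowercase hexadecimal commit SHA.
	commitSHA = regexp.MustCompile(`^[0-9a-f]{40}$`)
)

// IsImmutableRef reports whether ref looks like a release tag or a full
// commit SHA. Anything else, including branch names, is rejected.
func IsImmutableRef(ref string) bool {
	return releaseTag.MatchString(ref) || commitSHA.MatchString(ref)
}
```

With this check, n8.2 and n8.1.1 pass, while release/8.x and master are rejected before the build starts.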
Consequences
Positive:
- Every vmafx-node pod runs the exact same ffmpeg binary with all 15 fork patches.
- The pkg/encoder package and cmd/vmafx-node/probe package are codec-inventory-aware at startup, enabling the controller to dispatch only to nodes that carry the requested codec.
- The four Docker targets (cpu / cuda / rocm / sycl) share the same ffmpeg binary via the common ffmpeg-builder-cpu stage; the vendor variants add only their GPU runtime libs on top.
- CI can validate the node image build with docker buildx build --target node-cpu.
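Codec-aware dispatch, as described in the second point above, amounts to filtering nodes by their reported inventory. The sketch below shows the idea only; the Node type and Eligible function are hypothetical and do not reflect the controller's actual types or proto.

```go
package main

// Node is a hypothetical view of a worker as seen by the controller.
type Node struct {
	Name     string
	Encoders map[string]bool // inventory reported by the startup probe
}

// Eligible returns the names of nodes whose inventory contains the
// requested encoder, preserving the input order.
func Eligible(nodes []Node, encoder string) []string {
	var names []string
	for _, n := range nodes {
		if n.Encoders[encoder] {
			names = append(names, n.Name)
		}
	}
	return names
}
```

A job requesting an encoder that no node reports yields an empty result, which the caller can treat as "no eligible node" rather than dispatching blindly.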
Negative:
- Every node image build must compile ffmpeg from source (~5–10 min on standard runners). This can be accelerated with --cache-from in CI.
- When ffmpeg-patches/ is updated (e.g., after an FFmpeg n8.3 base bump), the FFMPEG_TAG build-arg in docker/Dockerfile.node must be updated in the same PR (CLAUDE.md §12 r14).
Neutral / follow-ups:
- Phase 4b.5+ will add full gRPC service registration in cmd/vmafx-node/server/server.go once the controller proto is finalized.
- NVENC / QSV / AMF hardware encoders require the host's GPU driver; the node container does not bundle drivers. The startup probe logs WARN for absent hw-encoder codec names but does not block startup.
- A CI job that validates the patches apply against both n8.1.1 (existing) and n8.2 (new node target) is tracked as a follow-up to this PR; the existing ffmpeg-patches/test/build-and-run.sh gate covers n8.1.1, and the Dockerfile itself serves as a build-time n8.2 validation.
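The warn-but-do-not-block behaviour for hardware encoders described in the list above can be sketched as follows. WarnMissingHW and the hwEncoders list are hypothetical; the list is an illustrative subset of FFmpeg's NVENC, QSV, and AMF encoder names, not the set the probe actually checks.

```go
package main

import "log/slog"

// hwEncoders is an illustrative subset of FFmpeg hardware encoder names.
var hwEncoders = []string{
	"h264_nvenc", "hevc_nvenc", // NVENC
	"h264_qsv", "hevc_qsv", // QSV
	"h264_amf", "hevc_amf", // AMF
}

// WarnMissingHW logs one WARN line per hardware encoder absent from the
// inventory and returns the missing names. It returns no error, so a
// node without GPU encoders still starts.
func WarnMissingHW(inventory map[string]bool) []string {
	var missing []string
	for _, name := range hwEncoders {
		if !inventory[name] {
			slog.Warn("hardware encoder not available", "encoder", name)
			missing = append(missing, name)
		}
	}
	return missing
}
```

Returning the missing names instead of an error keeps the decision with the caller: startup proceeds, and the list can be surfaced alongside the logged inventory.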
References
- ADR-0709 — Phase 4b umbrella (parent ADR); item 4b.4 is the direct parent scope.
- ADR-0541 — dev-container codec matrix rationale (libaom exclusion, SVT-AV1 source build).
- ffmpeg-patches/README.md — patch application invariants and verification gate.
- req: "of course this has to be fully connected to a ffmpeg worker as well (latest of course)..." (architecture popup, 2026-05-28, captured verbatim in ADR-0709 §References).