ADR-0819: PR-time CI gate for dev/Containerfile
- Status: Accepted
- Date: 2026-05-29
- Deciders: lusoris
- Tags: ci, build, workspace
Context
The fork's primary development environment is the vmaf-dev-mcp container defined in dev/Containerfile. Prior to this decision there was no CI gate on that file, so regressions were discovered only at user runtime. The concrete trigger was a session in which source-path references in dev/Containerfile were not updated after the libvmaf/ → core/ rename (ADR-0700). The container build broke without any CI signal, and the failure surfaced only hours later during a local docker compose build run.
The existing docker-image.yml workflow targets the root-level Dockerfile (the upstream-inherited image), not dev/Containerfile, and is marked continue-on-error: true (advisory-only) with no smoke test. It does not cover the dev container at all.
Decision
We will add .github/workflows/dev-container-build.yml that triggers on every PR that touches dev/Containerfile, dev/scripts/, dev/docker-compose.yml, or the workflow file itself. The job builds the container up to the libvmaf-build stage (covering all GPU SDK layers and the libvmaf source build), then runs two smoke tests inside the resulting image: vmaf --version and a CPU-backend scoring pass against the bundled 48-frame fixture YUVs. No image is pushed. The job is blocking (not continue-on-error).
Targeting only libvmaf-build (not the full dev-mcp stage) keeps the gate under 45 min on a free-tier runner while still covering the main regression surface (GPU SDK package pinning, meson configure flags, ONNX Runtime install). The FFmpeg stage and final assembly are exercised by the release-time docker-publish-production workflow.
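The workflow described above might look roughly like the following. This is a minimal sketch, not the committed file: the job name, runner label, timeout, image tag, build context (repository root), fixture paths, and fixture dimensions are all assumptions introduced for illustration. Only the trigger paths, the libvmaf-build target, the two smoke tests, and the blocking behaviour come from this ADR.

```yaml
# Hypothetical sketch of .github/workflows/dev-container-build.yml.
# Names, tags, and paths marked "assumption" are illustrative only.
name: dev-container-build

on:
  pull_request:
    paths:
      - "dev/Containerfile"
      - "dev/scripts/**"
      - "dev/docker-compose.yml"
      - ".github/workflows/dev-container-build.yml"

jobs:
  build-and-smoke:            # assumption: job name
    runs-on: ubuntu-latest    # assumption: free-tier hosted runner
    timeout-minutes: 60       # assumption: headroom above the 45 min target
    # continue-on-error is deliberately absent, so any failing step
    # fails the check on the PR.
    steps:
      - uses: actions/checkout@v4

      - name: Build up to the libvmaf-build stage (nothing is pushed)
        run: |
          docker build \
            --file dev/Containerfile \
            --target libvmaf-build \
            --tag vmaf-dev-ci:pr \
            .

      - name: Smoke test 1 (binary starts)
        run: docker run --rm vmaf-dev-ci:pr vmaf --version

      - name: Smoke test 2 (CPU scoring pass on the 48-frame fixtures)
        env:
          # assumption: fixture locations inside the image
          REF_YUV: /path/in/image/ref_576x324.yuv
          DIS_YUV: /path/in/image/dis_576x324.yuv
        run: |
          # assumption: 576x324, 8-bit 4:2:0 fixtures and a built-in default model
          docker run --rm vmaf-dev-ci:pr vmaf \
            --reference "$REF_YUV" \
            --distorted "$DIS_YUV" \
            --width 576 --height 324 \
            --pixel_format 420 --bitdepth 8
```

Stopping at `--target libvmaf-build` is what keeps the FFmpeg stage and final assembly out of the PR gate; a non-zero exit from either `docker run` step is enough to fail the job.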
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Full multi-stage build (all stages) | Validates the entire image end-to-end | 80–120 min on free-tier; exhausts runner minutes on every Containerfile PR | Too slow for a PR gate; covered at release time |
| Advisory-only gate (continue-on-error: true) | Never blocks a PR | Defeats the purpose; current docker-image.yml already proves advisory gates go unread | Not chosen; we need a hard failure signal |
| No gate (status quo) | Zero CI cost | Regressions discovered only at user runtime; exact failure mode that motivated this ADR | Not chosen |
Consequences
- Positive: Any dev/Containerfile commit that breaks the build or makes the vmaf binary non-functional will fail the PR gate before it reaches master.
- Positive: GPU SDK package-pin regressions (the most common failure class from the path-rename incident) are caught at PR time, not at runtime.
- Negative: PRs that touch dev/Containerfile now incur a 30–45 min CI job. This is acceptable given how infrequently the file changes and how costly the alternative (user runtime failures) is.
- Neutral / follow-ups: required-aggregator.yml does not list this check as a hard required gate (to avoid blocking unrelated PRs that touch only dev/). The check is required by path filter on the PR itself.
References
- ADR-0700: libvmaf/ → core/ rename (root cause of the path-stale incident).
- ADR-0317: path-filter rationale for docker-image.yml.
- docker-image.yml: existing advisory-only Dockerfile gate.
- Per user direction: the dev container build must be verified on every PR that touches it; the existing docker-image.yml did not cover dev/Containerfile.