ADR-0959: Metal kernel parity coverage round 4 — closeout¶
- Status: Accepted
- Date: 2026-05-31
- Deciders: lusoris, Claude
- Tags: testing, metal, gpu, parity, regression-guard
Context¶
The Metal backend ships 8 feature-extractor kernels under core/src/feature/metal/ (one .mm host wrapper and one .metal MSL source per kernel: float_moment, float_motion, float_ms_ssim, float_psnr, float_ssim, integer_motion, integer_motion_v2, integer_psnr). Per-kernel CPU-vs-Metal parity tests were closed across three rounds:
| Round | PR | Kernels covered |
|---|---|---|
| 1 (registration) | #351 | 8 extractors discoverable via vmaf_get_feature_extractor_by_name |
| 2 (parity, batch A) | #379 | motion_v2, integer_psnr, float_psnr, float_ssim |
| 3 (parity, batch B) | #447 | integer_motion, float_motion, float_moment, float_ms_ssim |
| Round 4 (closeout) | this PR | regression guard + 8/8 closeout audit |
The gap this ADR closes is structural, not per-kernel. Each round-2 / round-3 test names its target extractor inline ("integer_motion_metal", etc.). When a future round-5 kernel lands under core/src/feature/metal/ without a corresponding test_metal_*_parity.c file, the existing suite passes — there is no test naming the new kernel, so the gap is silent until someone manually re-runs the coverage audit. The fork-wide cross-backend parity gate (ADR-0214) applies to tests that exist; it does not enforce that every kernel has a parity test.
The CUDA round-4 closeout (PR #464, 19/19 kernels) and the SYCL round-4 closeout (PR #465) set the precedent of shipping a single audit test that asserts "every kernel in the source tree has a registered extractor + dispatch row" so the day a kernel ships without registration the audit goes red. This ADR aligns Metal with that precedent.
Decision¶
We will ship a single test_metal_kernel_coverage_audit.c test that enumerates the 8 expected Metal kernel basenames and asserts:
- Each
<basename>_metalresolves to a registered extractor viavmaf_get_feature_extractor_by_name. CPU-side check — runs on every host (Linux / Windows / Intel Mac / Apple Silicon). - Each
<basename>_metalis accepted byvmaf_metal_dispatch_supports. Needs aVmafMetalContext; skips cleanly with-ENODEVon non-Apple-Family-7 hosts. - The dispatch strategy does not falsely report plausible-looking non-existent Metal names (
vif_metal,adm_metal,ciede2000_metal,ssimulacra2_metal) as supported — wildcard-regression guard. - The basename list size equals the
EXPECTED_KERNEL_COUNTmacro (8) — defensive cross-check that catches edits to one half without the other.
The audit binds the grow-together contract: when a 9th .mm kernel lands, the contributor MUST extend g_metal_kernel_basenames[], add a registration row, add a dispatch-strategy row, and (per ADR-0214) ship a per-kernel parity test in the same PR. Otherwise this audit fails first, before the per-kernel test would have been needed.
Alternatives considered¶
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Don't add a closeout audit; rely on rounds 1–3 | Zero new code | Silent gap the day a 9th kernel lands; manual re-audit cost compounds | Defeats the point of ADR-0214 cross-backend gate — gate must enforce coverage, not just per-pair parity |
Generate the audit list from a build-time glob() over core/src/feature/metal/*.mm | Auto-syncs with the source tree | Meson files() doesn't allow globs; build-time codegen adds a CI step; SYCL r4 (PR #465) explicitly rejected this pattern for the same reason | Hand-maintained list with an explicit EXPECTED_KERNEL_COUNT is the SYCL / CUDA precedent; trade auto-discovery for one-line audit edit per kernel |
Add 8 individual test_metal_<kernel>_registration.c files | Per-kernel granularity | 8× test binaries for what is one audit; round-1 PR #351 already does per-kernel registration | Already covered by PR #351; round 4 is the structural gate, not per-kernel |
Move the audit into test_metal_smoke.c | Single test binary | Smoke is for runtime/lifecycle, not coverage policy; mixing concerns reduces test-failure attribution | Coverage-audit lives in its own binary so a coverage failure is unambiguous in CI logs |
Consequences¶
- Positive:
- 100 % Metal kernel coverage is now a structural CI invariant.
- New Metal kernels MUST register + add a dispatch row + extend the audit list in the same PR — the audit fails first otherwise.
-
Aligns Metal with CUDA r4 (PR #464) and SYCL r4 (PR #465) closeout precedent — three backends now share the same audit-by-enumeration pattern.
-
Negative:
- Hand-maintained basename list — one edit per new kernel. Mitigated by the failure message which names the missing kernel.
-
Test runs on every CPU build (registration probe is CPU-side); a handful of extra
strcmpcalls permeson testinvocation. Cost is sub-millisecond. -
Neutral / follow-ups:
core/test/AGENTS.md— no rebase-sensitive invariant added; the audit file is fork-local.docs/state.md— closesT-METAL-KERNEL-PARITY-ROUND4-2026-05-31under "Recently closed".- Future round-5 PR (if a 9th Metal kernel lands) bumps the audit list and the
EXPECTED_KERNEL_COUNTmacro in the same diff.
References¶
- ADR-0214 — cross-backend parity gate (places=4 default).
- ADR-0361 — Metal backend scaffold (T8-1).
- ADR-0421 — first Metal kernel =
motion_v2(T8-1c). - ADR-0589 — Metal SSIM-family 1e-3 tolerance bound.
- PR #351 — Metal kernel coverage round 1 (registration audit).
- PR #379 — Metal kernel coverage round 2 (parity for 4 kernels).
- PR #447 — Metal kernel coverage round 3 (parity for remaining 4).
- PR #464 — CUDA kernel coverage round 4 closeout (sibling precedent).
- PR #465 — SYCL kernel coverage round 4 closeout (sibling precedent).
- Source:
req— "Identify any Metal kernels undercore/src/feature/metal/that still lack a CPU-vs-Metal parity test … final sweep to verify 100% kernel coverage or document remaining gaps with reason."