Research digest: Metal kernel coverage round 4 (closeout)
Date: 2026-05-31
ADR: ADR-0959
Companion ADRs: ADR-0214 (cross-backend parity gate), ADR-0361 (Metal scaffold), ADR-0421 (first Metal kernel), ADR-0589 (Metal SSIM tolerance).
Question
Are there any kernels under core/src/feature/metal/ that still lack a CPU-vs-Metal parity test after rounds 2 (PR #379) and 3 (PR #447)? And if 8/8 coverage is achieved, what structural guard prevents silent regression the day a 9th kernel lands?
Method
- Enumerate .mm / .metal files on disk. This yields 8 kernel pairs: float_moment, float_motion, float_ms_ssim, float_psnr, float_ssim, integer_motion, integer_motion_v2, integer_psnr.
- Enumerate test_metal_* tests on master. This yields only test_metal_install_header.c + test_metal_smoke.c. No per-kernel parity tests on master (rounds 2 + 3 are still open DRAFTs as of 2026-05-31).
- Examine the Metal coverage PRs:
  - PR #379 (round 2, DRAFT): adds parity for motion_v2, integer_psnr, float_psnr, float_ssim.
  - PR #447 (round 3, DRAFT): adds parity for integer_motion, float_motion, float_moment, float_ms_ssim.
  - PR #351 (round 1, merged): registration audit for all 8 extractors.
  Rounds 2 and 3 together cover 4 + 4 = 8 kernels once merged.
- Build-wiring audit: Read core/src/metal/meson.build. Every .mm is in the metal_objcpp_lib sources; every .metal has a custom_target producing a .air file; all 8 .air outputs fold into metal_air_files → default.metallib. No dormant scaffolds: every kernel file on disk is in the build.
- Dispatch-strategy audit: Read core/src/metal/dispatch_strategy.c. The g_metal_features[] array carries all 8 <name>_metal names plus the provided-feature keys each kernel emits. No phantom entries; no missing entries.
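The round tally from the PR review can be checked mechanically. The following is a minimal C sketch, not code from the repository: it asserts that the kernels named by rounds 2 and 3 cover each of the 8 on-disk kernels exactly once. The array and function names are illustrative, and it assumes that motion_v2 in PR #379 refers to the integer_motion_v2 kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* The 8 kernel basenames enumerated from core/src/feature/metal/. */
static const char *const kernels[] = {
    "float_moment", "float_motion",   "float_ms_ssim",     "float_psnr",
    "float_ssim",   "integer_motion", "integer_motion_v2", "integer_psnr",
};

/* Kernels covered by each parity round.
 * ASSUMPTION: "motion_v2" in PR #379 means the integer_motion_v2 kernel. */
static const char *const round2[] = {
    "integer_motion_v2", "integer_psnr", "float_psnr", "float_ssim",
};
static const char *const round3[] = {
    "integer_motion", "float_motion", "float_moment", "float_ms_ssim",
};

/* Count how many times `name` occurs in `list`. */
static size_t occurrences(const char *name, const char *const *list, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (strcmp(name, list[i]) == 0)
            hits++;
    return hits;
}

/* Number of kernels NOT covered exactly once across rounds 2 and 3. */
size_t coverage_defects(void)
{
    size_t defects = 0;
    for (size_t i = 0; i < COUNT_OF(kernels); i++) {
        size_t hits = occurrences(kernels[i], round2, COUNT_OF(round2)) +
                      occurrences(kernels[i], round3, COUNT_OF(round3));
        if (hits != 1)
            defects++;
    }
    return defects;
}
```

A result of zero defects, together with the two round lists summing to 8 entries, means the rounds partition the kernel set with no overlap and no stray names.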
Findings
- Kernel coverage: 8 / 8 kernels will be backed by per-kernel parity tests once PR #379 and PR #447 merge. No remaining gaps.
- Build wiring: 8 / 8 .mm + 8 / 8 .metal files wired into metal_objcpp_lib and metal_air_files. No dormant scaffolds (contrast with SYCL r4 PR #465, which surfaced speed_chroma_sycl.cpp as a dormant .cpp not in the build).
- Structural gap: the per-kernel tests name their target extractor inline; the suite passes silently when a new kernel ships without a test. The fork-wide cross-backend gate (ADR-0214) enforces parity for tests that exist; it does not enforce that every kernel HAS a test.
Decision rationale
Ship a coverage-audit test that enumerates the 8 expected kernel basenames and asserts:
- Each <basename>_metal is registered via vmaf_get_feature_extractor_by_name (CPU-side, runs everywhere).
- Each is accepted by vmaf_metal_dispatch_supports (gated on -ENODEV for non-Apple-Family-7 runners).
- Plausible-looking phantom names (vif_metal, adm_metal, ciede2000_metal, ssimulacra2_metal) are not supported: a wildcard-regression guard.
- The basename list size matches the explicit EXPECTED_KERNEL_COUNT macro: a defensive cross-check.
This pattern mirrors the CUDA round-4 closeout (PR #464, 19/19) and SYCL round-4 closeout (PR #465). Hand-maintained list with an explicit count is chosen over build-time glob because (a) meson files() doesn't allow globs, (b) build-time codegen adds a CI step, (c) SYCL r4 set the precedent. The trade-off is one audit-list edit per new kernel, in exchange for explicit "every Metal kernel MUST have registration + dispatch + parity test in the same PR" enforcement.
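The shape of such an audit can be sketched in C as below. This is a self-contained illustration under stated assumptions, not the test shipped in the repository. The real vmaf_get_feature_extractor_by_name and vmaf_metal_dispatch_supports calls are replaced by hypothetical stubs (stub_is_registered, stub_dispatch_supports) whose signatures and return conventions are invented here, and the have_device flag stands in for runtime detection of a supported GPU.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))
#define EXPECTED_KERNEL_COUNT 8

/* Hand-maintained audit list: one edit per new kernel. */
static const char *const expected_kernels[] = {
    "float_moment", "float_motion",   "float_ms_ssim",     "float_psnr",
    "float_ssim",   "integer_motion", "integer_motion_v2", "integer_psnr",
};

/* Plausible names that must NOT be accepted (wildcard-regression guard). */
static const char *const phantom_names[] = {
    "vif_metal", "adm_metal", "ciede2000_metal", "ssimulacra2_metal",
};

/* HYPOTHETICAL stand-in for the backend's registered extractor names. */
static const char *const stub_registry[] = {
    "float_moment_metal",      "float_motion_metal", "float_ms_ssim_metal",
    "float_psnr_metal",        "float_ssim_metal",   "integer_motion_metal",
    "integer_motion_v2_metal", "integer_psnr_metal",
};

/* HYPOTHETICAL registry lookup: exact string match only, no wildcards. */
static int stub_is_registered(const char *name)
{
    for (size_t i = 0; i < COUNT_OF(stub_registry); i++)
        if (strcmp(name, stub_registry[i]) == 0)
            return 1;
    return 0;
}

/* HYPOTHETICAL dispatch check: 1 = supported, 0 = not supported,
 * -ENODEV = no usable device on this runner. */
static int stub_dispatch_supports(const char *name, int have_device)
{
    if (!have_device)
        return -ENODEV;
    return stub_is_registered(name);
}

/* Run all four audit assertions; return the number of failures. */
int run_coverage_audit(int have_device)
{
    int failures = 0;
    char name[64];

    /* Defensive cross-check: list size must match the explicit count. */
    if (COUNT_OF(expected_kernels) != EXPECTED_KERNEL_COUNT)
        failures++;

    for (size_t i = 0; i < COUNT_OF(expected_kernels); i++) {
        snprintf(name, sizeof(name), "%s_metal", expected_kernels[i]);
        if (!stub_is_registered(name))  /* registration: runs everywhere */
            failures++;
        int rc = stub_dispatch_supports(name, have_device);
        if (rc != -ENODEV && rc != 1)   /* dispatch: skipped without device */
            failures++;
    }

    for (size_t i = 0; i < COUNT_OF(phantom_names); i++) {
        int rc = stub_dispatch_supports(phantom_names[i], have_device);
        if (rc != -ENODEV && rc != 0)   /* phantom names must be rejected */
            failures++;
    }
    return failures;
}
```

In this sketch the registration check still runs when no device is present, while the dispatch and phantom checks are skipped on -ENODEV, mirroring the gating described in the list above. Adding a ninth basename to expected_kernels without bumping EXPECTED_KERNEL_COUNT makes the audit report a failure.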
Reproducer
# Audit step 1: enumerate kernels.
find core/src/feature/metal -maxdepth 3 -type f \
    \( -name '*.mm' -o -name '*.metal' \) | sort
# Audit step 2: enumerate existing parity tests.
find core/test -maxdepth 2 -name 'test_metal*parity*' -type f | sort
# Audit step 3: run the new closeout audit (CPU-only OK; dispatch
# check skips with -ENODEV on non-Apple-Family-7).
cd core && meson setup build-cpu \
    -Denable_cuda=false -Denable_sycl=false -Denable_metal=enabled
ninja -C build-cpu test_metal_kernel_coverage_audit
meson test -C build-cpu test_metal_kernel_coverage_audit
Cross-references
- ADR-0214: cross-backend parity gate, places=4 default.
- ADR-0361: Metal backend scaffold (T8-1).
- ADR-0421: first Metal kernel integer_motion_v2.
- ADR-0460: dispatch registry audit (HIP/Metal dispatch-support alignment).
- ADR-0589: Metal float_ssim option parity + 1e-3 SSIM-family tolerance bound.
- PR #351 / PR #379 / PR #447: Metal coverage rounds 1 / 2 / 3.
- PR #464: CUDA kernel coverage round 4 (sibling).
- PR #465: SYCL kernel coverage round 4 (sibling, with speed_chroma_sycl.cpp dormant-scaffold finding).
- docs/research/0761-metal-backend-audit-20260529.md: prior Metal audit that informed this closeout.