ADR-0338: macOS Vulkan-via-MoltenVK CI lane (advisory) for the Vulkan backend
- Status: Accepted
- Date: 2026-05-09
- Deciders: Lusoris, Claude (Anthropic)
- Tags: ci, vulkan, macos, moltenvk, gpu, advisory
Context
The fork ships a working Vulkan compute backend (ADR-0127, ADR-0175, and the kernel ADRs 0176–0252) with end-to-end coverage on Linux via Mesa lavapipe (CI) and Intel anv (developer hardware). Apple Silicon is the canonical hardware platform we have no automated coverage for: CUDA / SYCL / HIP do not target macOS, and the planned native Metal backend is a multi-month workstream tracked separately.
MoltenVK is the Khronos-supported open-source Vulkan-on-Metal translation layer. If MoltenVK works against the fork's existing SPIR-V kernels, macOS users get GPU-accelerated VMAF without waiting for the Metal port — and the fork validates that SPIR-V → MSL translation is a usable secondary route on Apple platforms.
The risk is real but bounded:
- Most of the fork's shaders use only non-atomic `int64` arithmetic (`GL_EXT_shader_explicit_arithmetic_types_int64`), which lowers to Metal's native `long` type and is well-supported on M1+.
- One shader (`moment.comp`) uses `atomicAdd` on `int64` (`GL_EXT_shader_atomic_int64`). Per the MoltenVK Runtime User Guide, this requires Metal Tier-2 argument buffers, which are supported on M1+ but are the most fragile capability dependency in the shader set.
- Some Vulkan extensions used by the fork's runtime (notably `VK_KHR_external_memory_fd` for DMABUF import) are not supported by MoltenVK, but the smoke tests don't exercise the import path, and the host-staged copy fallback already exists (per ADR-0127 §Decision).
CI cost on macOS Apple Silicon runners is non-trivial (macos-latest billed at 10× the Linux rate per GitHub's billing schedule). The lane is justified by the gap it closes — no other matrix entry exercises any Apple-platform GPU code path.
Decision
Add a single advisory CI lane to libvmaf-build-matrix.yml:
- Job name: `Build — macOS Vulkan via MoltenVK (advisory)`
- Runner: `macos-latest` (Apple Silicon, Homebrew prefix `/opt/homebrew`).
- Install: `brew install -q molten-vk vulkan-loader vulkan-headers shaderc`.
  - Formula `molten-vk` lays `MoltenVK_icd.json` at `/opt/homebrew/etc/vulkan/icd.d/MoltenVK_icd.json` per the Homebrew formula source.
  - Formula `vulkan-loader` provides `libvulkan.dylib` (the loader volk `dlopen()`s).
  - Formula `vulkan-headers` provides the API headers consumed via the libvmaf wrap-fallback path.
  - Formula `shaderc` ships `glslc`.
- ICD pin: `VK_ICD_FILENAMES=/opt/homebrew/etc/vulkan/icd.d/MoltenVK_icd.json` exported via `$GITHUB_ENV` so the loader is deterministic.
- Build: `meson setup … -Denable_vulkan=enabled` + `ninja`.
- Smoke: runs `test_vulkan_smoke`, `test_vulkan_pic_preallocation`, and `test_vulkan_async_pending_fence` against the live MoltenVK ICD and captures `vulkaninfo --summary` for triage.
- Advisory mode: `continue-on-error: ${{ matrix.experimental == true && matrix.moltenvk == true }}`. Lane stays advisory until one green run on `master`, after which the `continue-on-error` reverts to default (`false`) and the job name is added to `required-aggregator.yml`.
- Failure mode: if a kernel pipeline fails to compile or enumerate on MoltenVK, the failing kernel + suspected MoltenVK gap is documented in `docs/backends/vulkan/moltenvk.md` "Known limitations". Per `feedback_no_test_weakening`, thresholds are never lowered to make a failing kernel pass; the fix path is upstream MoltenVK or a kernel rewrite.
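The lane described above could be sketched roughly as the workflow fragment below. This is an illustration, not the fork's actual `libvmaf-build-matrix.yml`: the job id `build`, the matrix keys `name` and `os`, the `actions/checkout@v4` pin, and the ICD sanity-check step are assumptions added for readability. Only the `experimental` and `moltenvk` keys, the install command, and the ICD path come from this ADR. The build and smoke steps are left out because the ADR elides the `meson setup` arguments.

```yaml
# Sketch only. Job id, matrix keys `name`/`os`, checkout version, and the
# sanity-check step are hypothetical; they are not taken from the real workflow.
jobs:
  build:
    name: ${{ matrix.name }}
    runs-on: ${{ matrix.os }}
    # Advisory: a failure in this matrix entry does not fail the workflow run.
    continue-on-error: ${{ matrix.experimental == true && matrix.moltenvk == true }}
    strategy:
      fail-fast: false
      matrix:
        include:
          - name: "Build — macOS Vulkan via MoltenVK (advisory)"
            os: macos-latest
            experimental: true
            moltenvk: true
    steps:
      - uses: actions/checkout@v4
      - name: Install MoltenVK, loader, headers, glslc
        run: brew install -q molten-vk vulkan-loader vulkan-headers shaderc
      - name: Pin the MoltenVK ICD for all later steps
        run: echo "VK_ICD_FILENAMES=/opt/homebrew/etc/vulkan/icd.d/MoltenVK_icd.json" >> "$GITHUB_ENV"
      - name: Sanity-check the pinned ICD manifest (hypothetical step)
        # Values written to $GITHUB_ENV are visible from the next step onward.
        run: test -f "$VK_ICD_FILENAMES"
      # Build (meson setup … -Denable_vulkan=enabled + ninja) and the three
      # smoke tests would follow here.
```

Writing the variable to `$GITHUB_ENV` rather than setting it inline means every subsequent step, including the smoke tests, sees the same ICD manifest without repeating the path.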
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| MoltenVK CI lane (chosen) | Validates SPIR-V → MSL on real Apple Silicon hardware; reuses existing kernel set; closes the macOS GPU coverage gap today | One extra macos-latest job per non-draft PR, billed at the macOS runner rate; coverage bounded by MoltenVK's translation gaps (atomicInt64, external memory) | Cheapest credible coverage of the fork's macOS GPU story |
| Wait for native Metal backend | Zero CI cost change | Leaves macOS without GPU coverage for the entire Metal port window (months); doesn't validate the SPIR-V translation path at all | Validation gap is real and the cost differential is small |
| MoltenVK on a self-hosted Apple Silicon runner | Stable hardware; no GHA macOS-runner billing | Requires a self-hosted runner registration we don't have today; secret-management overhead; inconsistent with the lavapipe lane shape | Premature optimisation — the GHA macOS runner is fine for advisory coverage |
| Required (not advisory) lane on day one | Forces every PR to confirm MoltenVK | If MoltenVK trips on a known-fragile capability (atomicInt64 / external memory), every PR red-lights regardless of whether the PR touches Vulkan | Advisory mode is the ADR-precedented pattern (cf. the Arc-A380 nightly lane in ADR-0127) |
| Build-only lane (no smoke test) | Cheaper; faster | Doesn't actually validate the shaders run on Metal — only that the C compiles | Defeats the purpose of the lane |
Consequences
Positive
- macOS is no longer a GPU-coverage hole. PRs that touch Vulkan runtime code surface MoltenVK regressions on the next push.
- The lane stress-tests SPIR-V → MSL translation on the fork's actual kernels, which feeds back into the native Metal backend decision: if MoltenVK works, the Metal backend's value is performance alone, since GPU coverage on macOS is already in place.
- The operator-facing `docs/backends/vulkan/moltenvk.md` gives macOS users a documented install + troubleshooting path with no Metal backend required.
Negative
- One additional `macos-latest` job per PR; see GHA billing schedule for current cost. Mitigated by `if: github.event_name != 'pull_request' || github.event.pull_request.draft == false` (already shared with the rest of the matrix per ADR-0331), so draft PRs don't pay for it.
- MoltenVK is a moving target: its limitations matrix shifts with each release. The `moltenvk.md` known-limitations table is the canonical place to track current gaps; rebase notes call out that this table is hand-maintained.
- The lane's `continue-on-error` masks regressions until promoted to required. Promotion is a follow-up tracked in docs/state.md.
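To make the draft-PR mitigation concrete, here is a minimal, self-contained workflow sketch built around the same `if:` expression. The workflow name, the job id `moltenvk`, the trigger list, and the placeholder step are hypothetical. In particular, this ADR does not say whether the fork's workflow subscribes to the `ready_for_review` activity type. It is shown because GitHub Actions does not include it in the default `pull_request` types, so without it a PR that leaves draft state would not trigger the lane until its next push.

```yaml
# Sketch only. Workflow name, job id, triggers, and the step body are hypothetical.
name: draft-skip-sketch
on:
  push:
    branches: [master]
  pull_request:
    # `ready_for_review` is not a default type; listing it lets the lane
    # run at the moment a draft PR is marked ready.
    types: [opened, synchronize, reopened, ready_for_review]
jobs:
  moltenvk:
    # Non-PR events always run; pull_request events run only for non-draft PRs.
    if: github.event_name != 'pull_request' || github.event.pull_request.draft == false
    runs-on: macos-latest
    steps:
      - run: echo "lane body goes here"
```

Because the condition sits at job level, a skipped draft run never allocates a macOS runner, which is what keeps the billing cost at zero for drafts.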
Neutral
- No change to the existing Linux Vulkan lane, no change to any required CI check, no change to the Netflix golden gate.
- No public C-API surface change; the lane consumes existing smoke-test binaries.
References
- [req] Implementation task brief 2026-05-09 paraphrased: add a macOS CI lane that builds + smoke-tests the existing Vulkan compute backend on macOS via MoltenVK, complementary to the native Metal backend dispatched separately.
- ADR-0127: Vulkan compute backend decision; this ADR appends a Status update to it.
- ADR-0175: backend scaffold the smoke tests pin.
- ADR-0331: draft-PR skip pattern reused by this lane.
- Research-0089: feasibility digest, MoltenVK capability matrix vs the fork's shader inventory.
- docs/backends/vulkan/moltenvk.md: operator-facing documentation.
- Homebrew/homebrew-core `molten-vk.rb`: install layout, `MoltenVK_icd.json` path.
- MoltenVK Runtime User Guide: known limitations source for the moltenvk.md matrix.
- CLAUDE.md §12 r10: doc-substance rule (this lane ships `docs/backends/vulkan/moltenvk.md` in the same PR).
- CLAUDE.md §12 r11 / ADR-0108: six-deliverable rule.