ADR-0561: Widen HIP gfx_targets hardcoded fallback
| Field | Value |
|---|---|
| Status | Accepted |
| Date | 2026-05-18 |
| Deciders | lusoris, Claude (Anthropic) |
| Tags | hip, build, gfx, rocm, meson, fork-local |
Context
core/src/meson.build resolves the list of AMD GPU ISA targets for hipcc --genco in this order:
1. The -Dhip_gfx_targets=<csv> Meson option (explicit operator override).
2. rocm_agent_enumerator, filtered to gfx* lines and comma-joined.
3. hipconfig --amdgpu-target.
4. A hardcoded fallback when all three probes fail (the typical no-GPU build sandbox case: BuildKit CI, clean-room containers without a GPU bind-mount).
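The resolution order above can be modelled outside Meson as a small pure function. The following Python sketch is illustrative only and is not the code in core/src/meson.build. The function names, the argument names, and the assumption that hipconfig prints a comma-separated list are all hypothetical; only the four-step order and the fallback list come from this ADR.

```python
from typing import List, Optional

# Widened fallback list from this ADR's Decision section.
FALLBACK_TARGETS = ["gfx90a", "gfx1030", "gfx1036", "gfx1100"]


def parse_csv(value: str) -> List[str]:
    """Split a comma-separated target list, dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_enumerator(output: str) -> List[str]:
    """Keep only the lines that start with 'gfx' (enumerator-style output)."""
    return [line.strip() for line in output.splitlines()
            if line.strip().startswith("gfx")]


def resolve_gfx_targets(option: str = "",
                        enumerator_output: Optional[str] = None,
                        hipconfig_output: Optional[str] = None) -> List[str]:
    """Return the target list using the four-step order described above."""
    # Step 1: an explicit operator override always wins.
    targets = parse_csv(option)
    if targets:
        return targets
    # Step 2: rocm_agent_enumerator output; None models a failed probe.
    if enumerator_output is not None:
        targets = parse_enumerator(enumerator_output)
        if targets:
            return targets
    # Step 3: hipconfig output (assumed here to be comma-separated).
    if hipconfig_output is not None:
        targets = parse_csv(hipconfig_output)
        if targets:
            return targets
    # Step 4: hardcoded fallback for no-GPU sandboxes.
    return list(FALLBACK_TARGETS)


if __name__ == "__main__":
    # No option and both probes failed: the no-GPU sandbox case.
    print(",".join(resolve_gfx_targets()))
```

In this model a failed probe is passed as None, so a sandbox build with no option set falls through to step 4 and yields the widened list.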
Until this ADR, step 4 fell back to gfx90a only, a single CDNA2 server target. The fork's primary development host is an AMD Raphael APU whose integrated GPU presents as gfx1036. On that host, hsa_init() requires HSA_OVERRIDE_GFX_VERSION=10.3.0 to satisfy ROCm's allowlist, so at runtime the loader looks for an HSACO code object compiled for gfx1030 (the override target). Any libvmaf.so compiled in a build sandbox used the gfx90a-only fallback and produced an HSACO fat binary that contained no compatible object for gfx1030. The dynamic loader emitted "hip_fatbin.cpp: No compatible code objects found for: gfx1030", and the HIP backend failed to initialise at runtime even though the host had a functioning GPU and a full ROCm stack.
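The mismatch can be illustrated with a deliberately simplified model of the lookup. The Python sketch below is not the ROCm loader. The helper names are hypothetical, and it assumes that an override value such as 10.3.0 maps to a gfx name by concatenating the major number with the minor and stepping written as hex digits, and that a code object matches only on an exact target name (target features such as xnack are ignored).

```python
from typing import Iterable, Optional


def override_to_gfx(version: str) -> str:
    """Map an override string such as '10.3.0' to a gfx target name.

    Assumption: minor and stepping are rendered as single hex digits,
    so '10.3.0' becomes 'gfx1030' and '9.0.10' becomes 'gfx90a'.
    """
    major, minor, stepping = (int(part) for part in version.split("."))
    return f"gfx{major}{minor:x}{stepping:x}"


def effective_target(native: str, override: Optional[str] = None) -> str:
    """Target the runtime asks for: the override if set, else the native ISA."""
    return override_to_gfx(override) if override else native


def has_code_object(fatbin_targets: Iterable[str], native: str,
                    override: Optional[str] = None) -> bool:
    """True if the fat binary holds an object for the effective target."""
    return effective_target(native, override) in set(fatbin_targets)


if __name__ == "__main__":
    old_fallback = ["gfx90a"]
    new_fallback = ["gfx90a", "gfx1030", "gfx1036", "gfx1100"]
    # Raphael iGPU (gfx1036) running with HSA_OVERRIDE_GFX_VERSION=10.3.0.
    print(has_code_object(old_fallback, "gfx1036", "10.3.0"))  # False
    print(has_code_object(new_fallback, "gfx1036", "10.3.0"))  # True
```

Under these assumptions the gfx90a-only fat binary has nothing to offer a host that requests gfx1030, which matches the failure described above.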
The RDNA2 desktop GPU (gfx1030), AMD Raphael APU iGPU (gfx1036), and RDNA3 desktop GPU (gfx1100) represent the three most common consumer AMD GPUs in the target demographic. Including them in the fallback ensures that a container or CI-built libvmaf.so runs on all three without operator intervention.
Decision
Widen the hardcoded fallback from gfx90a to gfx90a,gfx1030,gfx1036,gfx1100.
- gfx90a: AMD Instinct MI200 (CDNA2); the original fallback, kept for server / cloud GPU compatibility.
- gfx1030: RDNA2 desktop (RX 6000 series) and the effective target for the AMD Raphael APU iGPU via HSA_OVERRIDE_GFX_VERSION=10.3.0.
- gfx1036: AMD Raphael APU iGPU; the fork's primary HIP test device.
- gfx1100: RDNA3 desktop (RX 7000 series); forward-compatibility for the most common RDNA3 target.
docs/backends/hip/overview.md gains a new ### -Dhip_gfx_targets section documenting the four-step resolution order and the operator override syntax.
Alternatives considered
| Option | Notes | Decision |
|---|---|---|
| Keep gfx90a only | Requires every operator to pass -Dhip_gfx_targets=... explicitly; error-prone and not documented at build time. The fork broke silently on first use after a container rebuild: the build succeeded but runtime failed. | Rejected |
| Query GPU at configure time only | rocm_agent_enumerator / hipconfig already fill steps 2–3; no new logic needed. The fix is purely step 4. | Redundant: steps 2–3 already do this |
| Add gfx1035 (a sibling RDNA2 APU target) | HSA_OVERRIDE_GFX_VERSION=10.3.0 already maps the iGPU to gfx1030; a native gfx1035 blob is not needed. | Rejected: not a production target in this fork |
| Auto-detect via hipcc --print-targets | Not available in ROCm toolchains before 6.0; adds a configure-time hipcc dependency even for no-GPU builds. | Rejected: fragile |
Consequences
- HSACO fat binaries built in no-GPU sandboxes cover four targets instead of one. Fat binary size increases by approximately 3× (measured: vif_statistics HSACO grows from ~180 KB to ~620 KB). Total libvmaf.so size increase: < 4 MB, which is acceptable.
- Operators who need a smaller binary can pin via -Dhip_gfx_targets=gfx1036.
- No change to the behaviour of builds where rocm_agent_enumerator or hipconfig succeeds; step 4 is only reached in no-GPU sandboxes.
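As a quick check on the size figures above, the growth factor follows directly from the two measured values. This is a minimal arithmetic sketch; the function name is hypothetical and the inputs are the approximate sizes quoted in this ADR.

```python
def growth_factor(before_kb: float, after_kb: float) -> float:
    """Ratio of the widened fat binary size to the single-target size."""
    return after_kb / before_kb


if __name__ == "__main__":
    # Approximate vif_statistics HSACO sizes quoted above, in KB.
    factor = growth_factor(180, 620)
    average_per_target = 620 / 4
    print(f"growth: {factor:.2f}x, average per target: {average_per_target:.0f} KB")
    # prints "growth: 3.44x, average per target: 155 KB"
```

The growth is somewhat below 4× for four targets, which suggests per-target objects are not all the same size as the original gfx90a object.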
References
- Runtime failure: "hip_fatbin.cpp: No compatible code objects found for: gfx1030" (surfaced on fork dev host running AMD Raphael APU gfx1036 / ROCm 6.4)
- docs/backends/hip/overview.md: -Dhip_gfx_targets section (this PR)
- ADR-0537: HIP integer VIF kernel crash fixes (context: first real HIP VIF run)
- ADR-0552: HIP VIF deterministic wavefront reduction (sibling fix)