ADR-0852: Wire speed_chroma_hip and speed_temporal_hip into HIP Build and Dispatch
- Status: Accepted
- Date: 2026-05-29
- Deciders: lusoris
- Tags: hip, build, speed, feature, gpu, fork-local
Context
ADR-0567 landed real on-device GPU kernels for speed_chroma and speed_temporal on all four backends (CUDA, SYCL, HIP, Vulkan). The HIP half produced three files:
- `core/src/feature/hip/speed/speed_score.hip`: five `extern "C"` kernels
- `core/src/feature/hip/speed_chroma_hip.c`: host wrapper, `vmaf_fex_speed_chroma_hip`
- `core/src/feature/hip/speed_temporal_hip.c`: host wrapper, `vmaf_fex_speed_temporal_hip`
All three files were committed but none were wired into the build system or the feature extractor registry. The hip_kernel_sources meson dict in core/src/meson.build had no speed_score entry, core/src/hip/meson.build had no entries for the two .c wrappers, and core/src/feature/feature_extractor.c had no extern declarations or dispatch-table entries for the two extractors. This left the SpEED HIP path unreachable regardless of build flags.
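The registry gap described above can be pictured with a minimal sketch. It assumes a NULL-terminated dispatch table that is scanned by extractor name. The `FeatureExtractor` struct, the table, and `sketch_get_extractor_by_name` are simplified, hypothetical stand-ins rather than the real definitions in `feature_extractor.c`; only the two `vmaf_fex_*` symbol names and the extractor names come from this ADR.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the real extractor type. */
typedef struct FeatureExtractor {
    const char *name;
} FeatureExtractor;

/* In the real tree these would be `extern` declarations resolved by the
 * host wrappers; they are defined here so the sketch links on its own. */
FeatureExtractor vmaf_fex_speed_chroma_hip = { .name = "speed_chroma_hip" };
FeatureExtractor vmaf_fex_speed_temporal_hip = { .name = "speed_temporal_hip" };

/* NULL-terminated dispatch table: an extractor that is missing from this
 * list cannot be found by name, even if its object file is linked. */
static FeatureExtractor *feature_extractor_list[] = {
    &vmaf_fex_speed_chroma_hip,
    &vmaf_fex_speed_temporal_hip,
    NULL,
};

/* Hypothetical lookup, deliberately named differently from the real
 * vmaf_get_feature_extractor_by_name. */
FeatureExtractor *sketch_get_extractor_by_name(const char *name)
{
    if (!name)
        return NULL;
    for (size_t i = 0; feature_extractor_list[i]; i++) {
        if (strcmp(feature_extractor_list[i]->name, name) == 0)
            return feature_extractor_list[i];
    }
    return NULL;
}
```

Under this assumption, dropping either table entry makes the corresponding lookup return NULL, which is the "unreachable regardless of build flags" state described above.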
Decision
We add the three missing wiring entries, one in each of the build and registry files named above: core/src/meson.build, core/src/hip/meson.build, and core/src/feature/feature_extractor.c. No implementation changes are made; only build-system and registry connections are added. The scaffold posture is preserved: without enable_hipcc=true, both extractors' init() returns -ENOSYS, matching every prior HIP consumer's contract.
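The scaffold posture can be illustrated with a short sketch. It assumes the fallback is selected by a compile-time switch; the macro `SKETCH_HAVE_HIPCC` and the function `sketch_speed_chroma_hip_init` are hypothetical names, and the real wrappers may instead reach -ENOSYS at run time when the kernel module fails to load.

```c
#include <errno.h>

/* Hypothetical compile-time switch standing in for enable_hipcc=true. */
#ifndef SKETCH_HAVE_HIPCC
#define SKETCH_HAVE_HIPCC 0
#endif

/* Hypothetical init() for a HIP extractor; not the real wrapper code. */
int sketch_speed_chroma_hip_init(void)
{
#if SKETCH_HAVE_HIPCC
    /* A real build would load the compiled kernel module here. */
    return 0;
#else
    /* Scaffold posture: no kernel blob was built, so report
     * "function not implemented" to the caller. */
    return -ENOSYS;
#endif
}
```

Returning a negative errno value lets callers treat a scaffold-only build like any other unavailable backend and fall back to another extractor.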
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Wire all three simultaneously (chosen) | Closes the gap atomically; consistent with ADR-0533 pattern | Slightly larger changeset than wiring only one side | Wiring only host wrappers without the kernel blob gives a guaranteed -ENOSYS on hipModuleLoadData — no user benefit |
| Add a weak HSACO stub and defer kernel wiring | Keeps link green, mirrors the pre-ADR-0539 ADM pattern | Users get -ENOSYS on the kernel load, same as without the stub | The real kernel source already exists; there is no reason to defer it |
Consequences
- Positive: `speed_chroma_hip` and `speed_temporal_hip` are now reachable via `vmaf_get_feature_extractor_by_name`; with `enable_hipcc=true` they run real on-device kernels per ADR-0567.
- Negative: none; the implementation was already reviewed as part of ADR-0567.
- Neutral / follow-ups: ADR-0214 cross-backend ULP gate (`places=4` vs CPU reference) should be run once a ROCm device is available in CI.
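As a rough illustration of the follow-up gate, the sketch below compares a GPU score against a CPU reference. It assumes `places=4` follows the common "almost equal" convention, where two values agree when their absolute difference is below 0.5 × 10^-4. ADR-0214 defines the actual criterion, and `sketch_scores_agree` is a hypothetical helper.

```c
#include <stdbool.h>

/* Assumed semantics: scores agree when |gpu - cpu| < 0.5 * 10^-places.
 * This is an illustration, not the gate defined in ADR-0214. */
bool sketch_scores_agree(double gpu, double cpu, int places)
{
    /* Build the tolerance 0.5 * 10^-places without needing libm. */
    double tol = 0.5;
    for (int i = 0; i < places; i++)
        tol /= 10.0;

    /* Absolute difference, computed by hand to avoid <math.h>. */
    double diff = gpu - cpu;
    if (diff < 0)
        diff = -diff;

    return diff < tol;
}
```

With `places` set to 4, a difference of 1e-5 passes and a difference of 2e-4 fails.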
References
- ADR-0567 (`0567-speed-chroma-temporal-real-gpu.md`): original implementation.
- ADR-0533 (`0533-hip-extractor-registration-sweep.md`): HIP extractor wiring pattern.
- ADR-0539 (`0539-integer-adm-hip-kernels.md`): prior art for `hip_kernel_sources`.
- `core/src/feature/hip/AGENTS.md`: `hip_hsaco_stubs.c` fallback pattern.