ADR-0471: Add enable_chroma to integer_psnr_hip (chroma parity with CUDA/SYCL/Vulkan twins)
- Status: Accepted
- Date: 2026-05-16
- Deciders: lusoris, Claude (Anthropic)
- Tags: hip, psnr, option-parity, chroma, fork-local
Context
ADR-0453 added enable_chroma (and full chroma dispatch) to the three GPU PSNR twins that existed at the time: integer_psnr_cuda.c, integer_psnr_sycl.cpp, and psnr_vulkan.c. The HIP twin (integer_psnr_hip.c) was not included in that scope because it was still listed as luma-only with a follow-up note in ADR-0372 ("Chroma extension is a follow-up").
As a result the HIP extractor:
- only dispatched and emitted psnr_y;
- did not advertise psnr_cb / psnr_cr in provided_features;
- silently dropped any enable_chroma=false caller intent;
- diverged from all other GPU PSNR twins.
The HIP kernel (psnr_score.hip) is plane-agnostic and accepts arbitrary (pointer, stride, width, height) arguments, so chroma can be dispatched by invoking the same kernel once per plane (three times for Y, Cb, and Cr), identical to the CUDA pattern.
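To illustrate what "plane-agnostic" means here, the following is a minimal CPU reference sketch, not the HIP kernel itself. It assumes 8-bit samples and byte strides; the names plane_sse_u8 and sketch_sse_all_planes are hypothetical and do not appear in the codebase. The point is only that one routine taking (pointer, stride, width, height) can serve every plane when called in a loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU reference (not the HIP kernel): sum of squared errors
 * over a single 8-bit plane. Strides are in bytes and may exceed width. */
static uint64_t plane_sse_u8(const uint8_t *ref, const uint8_t *dis,
                             size_t ref_stride, size_t dis_stride,
                             unsigned width, unsigned height)
{
    uint64_t sse = 0;
    for (unsigned y = 0; y < height; y++) {
        const uint8_t *r = ref + (size_t)y * ref_stride;
        const uint8_t *d = dis + (size_t)y * dis_stride;
        for (unsigned x = 0; x < width; x++) {
            const int64_t diff = (int64_t)r[x] - (int64_t)d[x];
            sse += (uint64_t)(diff * diff);
        }
    }
    return sse;
}

/* The same routine is reused for each plane; only the arguments change.
 * For simplicity this sketch assumes ref and dis share a stride per plane. */
static void sketch_sse_all_planes(const uint8_t *ref[], const uint8_t *dis[],
                                  const size_t stride[],
                                  const unsigned width[], const unsigned height[],
                                  unsigned n_planes, uint64_t sse_out[])
{
    for (unsigned p = 0; p < n_planes; p++)
        sse_out[p] = plane_sse_u8(ref[p], dis[p], stride[p], stride[p],
                                  width[p], height[p]);
}
```

Because nothing in the per-plane routine depends on which plane it is processing, a luma-only run is simply the same loop with n_planes set to 1.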
Decision
Mirror the ADR-0453 fix pattern in integer_psnr_hip.c:
- Add bool enable_chroma to PsnrStateHip with a matching VmafOption (default true).
- Compute per-plane geometry in init() from pix_fmt (ceiling division for 4:2:0 / 4:2:2; same as CUDA twin). Apply the enable_chroma guard: clamp n_planes to 1 when false or when pix_fmt == YUV400P.
- Allocate per-plane readback pairs rb[PSNR_NUM_PLANES] and staging buffers ref_in[3] / dis_in[3] (luma + up to two chroma planes).
- In submit(), loop over n_planes: one HtoD copy + one kernel dispatch per plane.
- In collect(), loop over n_planes: read each plane's SSE from its pinned host slot and emit the score under psnr_name[p].
- Update provided_features to {"psnr_y", "psnr_cb", "psnr_cr", NULL}.
- Update n_dispatches_per_frame from 1 to 3.
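The geometry and guard step in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in integer_psnr_hip.c: the enum, struct, and function names (sketch_pix_fmt, sketch_geometry, sketch_plane_geometry) are hypothetical stand-ins for the real pixel-format enum and PsnrStateHip fields.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real pixel-format enum and state fields. */
enum sketch_pix_fmt {
    SKETCH_YUV400P,
    SKETCH_YUV420P,
    SKETCH_YUV422P,
    SKETCH_YUV444P
};

#define SKETCH_NUM_PLANES 3u

struct sketch_geometry {
    unsigned n_planes;                  /* planes actually dispatched */
    unsigned width[SKETCH_NUM_PLANES];  /* [0] = Y, [1] = Cb, [2] = Cr */
    unsigned height[SKETCH_NUM_PLANES];
};

static struct sketch_geometry sketch_plane_geometry(enum sketch_pix_fmt fmt,
                                                    unsigned w, unsigned h,
                                                    bool enable_chroma)
{
    struct sketch_geometry g;

    /* 4:2:0 halves both chroma dimensions; 4:2:2 halves only the width. */
    const bool half_w = (fmt == SKETCH_YUV420P || fmt == SKETCH_YUV422P);
    const bool half_h = (fmt == SKETCH_YUV420P);

    /* Ceiling division so odd luma sizes keep their last chroma sample. */
    const unsigned cw = half_w ? (w + 1u) / 2u : w;
    const unsigned ch = half_h ? (h + 1u) / 2u : h;

    g.width[0] = w;
    g.height[0] = h;
    g.width[1] = g.width[2] = cw;
    g.height[1] = g.height[2] = ch;

    /* Guard: luma only when chroma is disabled or the format has none.
     * Only the first n_planes entries are meaningful to callers. */
    g.n_planes = (!enable_chroma || fmt == SKETCH_YUV400P)
                     ? 1u : SKETCH_NUM_PLANES;
    return g;
}
```

Clamping n_planes once at init time keeps submit() and collect() free of format checks, since both only need to loop up to n_planes.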
No changes to the .hip kernel source are needed; it is already plane-agnostic.
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Keep HIP luma-only; document as known gap | No code change | Permanent backend divergence; callers using psnr_cb/psnr_cr on HIP silently get no output | Masks the gap rather than closing it |
| Add a new chroma-specific HIP kernel | Could be optimised for chroma subsampling | Extra kernel maintenance; the existing kernel is already generic | Unnecessary complexity |
Consequences
- Positive: integer_psnr_hip now provides full PSNR parity with the CUDA/SYCL/Vulkan twins: psnr_y, psnr_cb, psnr_cr all emitted at enable_chroma=true; luma-only at enable_chroma=false or YUV400P.
- Negative: None. The default path (enable_chroma=true) dispatches three kernels per frame instead of one, but each is a small reduction, consistent with the CUDA twin's profile.
- Neutral: The .hip kernel source is unchanged; no HSACO rebuild required beyond the normal hipcc pipeline.