ADRs tagged cuda
Auto-generated by scripts/docs/generate-adr-by-tag.sh. To update this page, edit the Tags: lines in the ADRs themselves.
80 ADRs carry this tag.
| ID | Title |
|---|---|
| ADR-0022 | Inference runtime is ONNX Runtime via execution providers |
| ADR-0027 | Non-conservative image pins with experimental toolchain flags |
| ADR-0121 | Windows GPU build-only matrix legs (MSVC + CUDA, MSVC + oneAPI SYCL) |
| ADR-0122 | CUDA gencode coverage + actionable init-failure logging |
| ADR-0123 | CUDA prev_ref null-deref on ffmpeg libvmaf_cuda path |
| ADR-0131 | Port Netflix#1382 — cuMemFreeAsync → cuMemFree in vmaf_cuda_picture_free |
| ADR-0150 | Port Netflix#1472 — CUDA feature extraction on Windows (MSYS2/MinGW) |
| ADR-0156 | CUDA backend: graceful error propagation (Netflix#1420) |
| ADR-0157 | CUDA preallocation memory leak fix + vmaf_cuda_state_free public API (Netflix#1300) |
| ADR-0181 | Global feature-characteristics registry + per-backend dispatch strategy |
| ADR-0182 | GPU long-tail batch 1 — psnr + ciede + moment on CUDA / SYCL / Vulkan |
| ADR-0188 | GPU long-tail batch 2 — psnr_hvs / ssim / ms_ssim across CUDA / SYCL / Vulkan |
| ADR-0192 | GPU long-tail batch 3 — closing every remaining metric gap (motion_v2 / float_ansnr / ssimulacra2 / cambi + float twins) |
| ADR-0194 | float_ansnr GPU kernels — single-dispatch 3x3 + 5x5 filters with per-WG float partials |
| ADR-0195 | float_psnr GPU kernels — single-dispatch diff² with float partials, bit-exact vs CPU |
| ADR-0196 | float_motion GPU kernels — float twin of integer_motion blur+SAD |
| ADR-0197 | float_vif GPU kernels — 4-scale pyramid with mirror-asymmetry fix |
| ADR-0202 | float_adm CUDA + SYCL twins — sixth Group B float kernel finishes |
| ADR-0206 | ssimulacra2 CUDA + SYCL twins |
| ADR-0214 | GPU-parity CI gate (T6-8) — cross-device variance matrix |
| ADR-0219 | motion3 GPU coverage on Vulkan + CUDA + SYCL (3-frame window) |
| ADR-0234 | GPU-generation-aware ULP calibration head |
| ADR-0239 | Backend-agnostic GPU picture pool (gpu_picture_pool.{h,c}) |
| ADR-0243 | enable_lcs MS-SSIM extras on CUDA + Vulkan |
| ADR-0246 | Per-backend GPU kernel scaffolding templates (CUDA + Vulkan) |
| ADR-0271 | Wire integer_ms_ssim_cuda through the CUDA fence-batching helper |
| ADR-0299 | GPU scoring backend for vmaf-tune (--score-backend) |
| ADR-0345 | cambi × {CUDA, SYCL, HIP} GPU port strategy |
| ADR-0351 | CUDA PSNR — chroma extension (psnr_cb / psnr_cr) |
| ADR-0358 | CUDA motion correctness — SAD race, pinned-mem leak, and motion2/motion3 precision parity with CPU |
| ADR-0360 | CAMBI CUDA port (Strategy II hybrid, T3-15a) |
| ADR-0374 | Build-time-optional public APIs return -ENOSYS when disabled |
| ADR-0378 | Per-picture CUDA streams must use CU_STREAM_NON_BLOCKING |
| ADR-0385 | Feature-extractor deduplication by provided-feature names |
| ADR-0408 | FFmpeg libvmaf filter — CUDA backend selector |
| ADR-0410 | ssimulacra2_cuda GPU module leak + per-scale malloc removal |
| ADR-0431 | Split CUDA and CPU Feature Passes for FR-from-NR Extraction |
| ADR-0447 | Motion features under-report on HFR / 50p content |
| ADR-0451 | Local dev-MCP container for live probing |
| ADR-0453 | PSNR enable_chroma option parity across all GPU backends |
| ADR-0454 | VIF CUDA shared-memory staging for horizontal and vertical filter passes |
| ADR-0456 | SSIMULACRA2 CUDA Blur: 3-Channel Kernel Fusion and V-Pass Transpose for Coalesced Access |
| ADR-0464 | CAMBI CUDA spatial-mask shared-memory tile |
| ADR-0483 | Extract shared vmaf_gpu_dispatch_parse_env tokenizer |
| ADR-0485 | Extract VMAF_LIFECYCLE_ZERO macro to eliminate struct-init duplication across HIP and Metal kernel templates |
| ADR-0486 | Codify the three-function GPU backend context-API contract in docs |
| ADR-0487 | Wire adm_min_val option into integer_adm GPU backends |
| ADR-0488 | Shared once-snapshot helper for GPU dispatch env variables |
| ADR-0514 | dev-MCP container exposes every host GPU backend (CUDA + SYCL + Vulkan + HIP) |
| ADR-0529 | Replace /dev/dri/by-path bind with whole /dev/dri bind in dev container |
| ADR-0542 | Full GPU backend plumbing in the dev-mcp container |
| ADR-0564 | Real integer_ssim GPU kernels (CUDA, HIP, SYCL) — replace silent float_ssim substitution |
| ADR-0567 | Real On-Device GPU Kernels for speed_chroma and speed_temporal (4 Backends) |
| ADR-0573 | Dev-mcp container — ubuntu:26.04 + CUDA 13.2 + hipcc + ocloc |
| ADR-0574 | CUDA Twins for HDR-Model Features — Phase 1 (aim, adm3) |
| ADR-0576 | ffmpeg-patches n8.1.1 full-feature-exposure sync |
| ADR-0582 | MS-SSIM enable_db and clip_db option parity on CUDA and SYCL backends |
| ADR-0590 | Wire enable_db / clip_db into the CUDA and SYCL MS-SSIM twins |
| ADR-0591 | Restore rfe_hw_flags per-frame bitmask cache after PR #1067 clobber |
| ADR-0596 | Delete orphan and duplicate HIP/CUDA translation units |
| ADR-0597 | integer_vif is luma-only across every backend; CUDA enable_chroma is a documented no-op |
| ADR-0599 | Cross-Backend Parity Audit — Full Extractor Matrix (2026-05-18) |
| ADR-0603 | Ubuntu 26.04 (Resolute Raccoon) fallout fixes — CUDA 13.2, Python 3.14, apt renames |
| ADR-0605 | Renovate customManagers for all dev/Containerfile pinned dependencies |
| ADR-0662 | Vulkan Motion Lavapipe Parity |
| ADR-0664 | Install Windows CUDA directly in CI |
| ADR-0667 | vmaf-tune score backend native priority |
| ADR-0699 | VMAFX Helm Chart and Kubernetes Manifests with 3-Vendor GPU Device-Plugin Support |
| ADR-0738 | Bump local CUDA toolkit pin to 13.3 + R610 minimum driver (partial — CI deferred) |
| ADR-0743 | CUDA VIF filter1d ncu-driven performance optimizations |
| ADR-0744 | CUDA adm_cm __launch_bounds__(128, 8) register reduction (ms_ssim_decimate smem tiling reverted) |
| ADR-0746 | integer_adm_cuda — emit integer_adm3 + integer_aim (parity with CPU) |
| ADR-0750 | Hardware Measurement Verdict for PR perf/cuda-ms-ssim-decimate-adm-cm-ncu-driven |
| ADR-0753 | Runtime Resolution-Aware CUDA Kernel Variant Dispatch |
| ADR-0756 | CUDA F3 struct-by-value kernel audit (scope + dispatch order) |
| ADR-0759 | HIP ADM — AdmBufferHip passed by pointer (F3 fix) |
| ADR-0760 | CUDA motion kernel multi-resolution ncu profiling methodology |
| ADR-0762 | CUDA CIEDE2000 8bpc/16bpc — __ldg() read-only cache routing (F3 fix) |
| ADR-0840 | Fix cu_state leak on import failure and gpu_dispatch_env TOCTOU |
| ADR-0841 | Environment variable reference page and canonical naming |