ADRs tagged performance
Auto-generated by scripts/docs/generate-adr-by-tag.sh. To update this page, edit the `Tags:` line of the relevant ADR.
29 ADR(s) carry this tag.
| ID | Title |
|---|---|
| ADR-0138 | _iqa_convolve AVX2 bit-exact double-precision fast path |
| ADR-0139 | SSIM SIMD accumulate bit-exact to scalar via per-lane scalar double |
| ADR-0145 | motion_v2 NEON SIMD — bit-exact to scalar |
| ADR-0147 | Thread-pool job-object recycling + inline data buffer |
| ADR-0159 | psnr_hvs AVX2 port — bit-exact DCT vectorization (T3-5) |
| ADR-0160 | psnr_hvs NEON port — bit-exact DCT vectorization (T3-5-neon) |
| ADR-0161 | SSIMULACRA 2 SIMD bit-exact ports — AVX2 + AVX-512 + NEON (T3-1 + T3-2) |
| ADR-0162 | SSIMULACRA 2 IIR blur SIMD ports — AVX2 + AVX-512 + NEON (T3-1 phase 2) |
| ADR-0251 | Vulkan VkImage import — v2 async pending-fence model (T7-29 part 4) |
| ADR-0252 | ssimulacra2 Vulkan host-path AVX2 + NEON SIMD (T-GPU-OPT-VK-2) |
| ADR-0353 | Vulkan submit-pool migration PR-B — six secondary kernels |
| ADR-0378 | Per-picture CUDA streams must use CU_STREAM_NON_BLOCKING |
| ADR-0383 | K150K corpus scoring driver — parallel CPU worker redesign |
| ADR-0445 | Persistent VkPipelineCache for Vulkan compute backend |
| ADR-0454 | VIF CUDA shared-memory staging for horizontal and vertical filter passes |
| ADR-0464 | CAMBI CUDA spatial-mask shared-memory tile |
| ADR-0489 | CAMBI SYCL — Replace GPU-to-GPU q.wait() Calls with Event Chains (SY-1) |
| ADR-0503 | vif_subsample_rd_8_avx512 Loop Fission to Reduce ZMM Register Spill |
| ADR-0504 | AVX-512F port of float separable convolution scanlines |
| ADR-0551 | VCQ-223 LocalExplainer CI timeout — root cause and fix path |
| ADR-0562 | VCQ-223 LocalExplainer hang fix — cap neighbor_samples in test runner |
| ADR-0600 | Port upstream USE_DIRECT_READ zero-copy input path (Netflix/vmaf@30a6e2a8d) |
| ADR-0607 | vmaf-tune compare: decode reference YUV once for the entire run |
| ADR-0743 | CUDA VIF filter1d ncu-driven performance optimizations |
| ADR-0744 | CUDA adm_cm __launch_bounds__(128, 8) register reduction (ms_ssim_decimate smem tiling reverted) |
| ADR-0750 | Hardware Measurement Verdict for PR perf/cuda-ms-ssim-decimate-adm-cm-ncu-driven |
| ADR-0759 | HIP ADM — AdmBufferHip passed by pointer (F3 fix) |
| ADR-0762 | CUDA CIEDE2000 8bpc/16bpc — __ldg() read-only cache routing (F3 fix) |
| ADR-0931 | MCP server — replace subprocess delegation with direct cgo (Phase 1) |