ADR-0655: Saliency Feature Materializer
- Status: Accepted
- Date: 2026-05-20
- Deciders: Lusoris, Codex
- Tags: ai, saliency, training-data, docs
Context
The signal-mix audit and CHUG HDR MOS experiments both need to answer whether saliency helps once it is present as a real numeric signal, not as placeholder zeros. Existing predictor plumbing can preserve saliency columns once they exist, but historical feature tables still need an explicit enrichment pass.
The enrichment pass must work on existing JSONL/parquet corpus tables without coupling every trainer to the saliency model stack. It also needs bounded decode cost because operators will run it over large local corpora while other refresh jobs are active.
Decision
Add ai/scripts/materialize_saliency_features.py, a table-shaped materializer that reads JSONL or parquet rows, resolves a source clip path, decodes a bounded sample to temporary yuv420p via FFmpeg, invokes the fork saliency helper, and writes saliency_mean, saliency_var, and an optional status column.
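The bounded decode step could be expressed as an FFmpeg argument list along the following lines. This is a minimal sketch, not the script's actual code: the function name build_decode_command and its parameters are hypothetical, and only the FFmpeg options themselves (-nostdin, -y, -frames:v, -pix_fmt, -f rawvideo) are standard.

```python
from pathlib import Path


def build_decode_command(src: Path, out_yuv: Path, max_frames: int) -> list[str]:
    """Build an FFmpeg argv that decodes at most max_frames frames to raw yuv420p.

    Hypothetical helper for illustration; the real script may differ.
    """
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")
    return [
        "ffmpeg",
        "-nostdin",                    # never wait on terminal input in batch runs
        "-y",                          # overwrite the temporary output if it exists
        "-i", str(src),
        "-frames:v", str(max_frames),  # stop after this many video frames
        "-pix_fmt", "yuv420p",         # normalize the pixel format for the helper
        "-f", "rawvideo",              # headerless planar output
        str(out_yuv),
    ]
```

Returning an argument list rather than a shell string keeps clip paths with spaces or quotes safe when the list is handed to subprocess.run.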
The default row contract is deliberately conservative:
- source clip path from src;
- geometry from width/height, with ffprobe fallback;
- bounded decode via --max-frames and saliency sampling via --frame-samples;
- status values that distinguish existing rows, missing sources, missing geometry, decode failures, and model failures.
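The row contract above can be sketched as a single per-row function. Everything named here is an assumption made for illustration: the status strings, the saliency_status column name, and the materialize_row signature are hypothetical, and the decoder, saliency helper, and probe are injected as plain callables so the control flow can be exercised without FFmpeg or model weights.

```python
import os
from typing import Callable, Optional, Tuple

# Hypothetical status labels; the ADR names the categories, not the strings.
OK = "ok"
EXISTING = "existing"
MISSING_SOURCE = "missing_source"
MISSING_GEOMETRY = "missing_geometry"
DECODE_FAILED = "decode_failed"
MODEL_FAILED = "model_failed"

Geometry = Tuple[int, int]


def materialize_row(
    row: dict,
    decode: Callable[[str, int, int], bytes],
    saliency: Callable[[bytes, int, int], Tuple[float, float]],
    probe: Optional[Callable[[str], Optional[Geometry]]] = None,
    exists: Callable[[str], bool] = os.path.exists,
) -> dict:
    """Return a copy of row with saliency columns and a status value."""
    out = dict(row)

    def done(status: str) -> dict:
        out["saliency_status"] = status
        return out

    # Rows that already carry both aggregates are left untouched.
    if row.get("saliency_mean") is not None and row.get("saliency_var") is not None:
        return done(EXISTING)

    src = row.get("src")
    if not src or not exists(src):
        return done(MISSING_SOURCE)

    # Geometry comes from the row first, then from the optional probe.
    width, height = row.get("width"), row.get("height")
    if not (width and height) and probe is not None:
        probed = probe(src)
        if probed:
            width, height = probed
    if not (width and height):
        return done(MISSING_GEOMETRY)

    try:
        frames = decode(src, width, height)
    except Exception:
        return done(DECODE_FAILED)

    try:
        mean, var = saliency(frames, width, height)
    except Exception:
        return done(MODEL_FAILED)

    out["saliency_mean"] = mean
    out["saliency_var"] = var
    return done(OK)
```

Because every failure path returns a row instead of raising, a bulk run can finish and report per-status counts afterwards.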
The script is a preparation surface, not a trainer side effect. Trainers should consume already materialized saliency columns and fail or report missing-signal coverage explicitly instead of silently running saliency inference inside a training loop.
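On the trainer side, an explicit coverage check might look like the sketch below. The function names and the min_coverage parameter are hypothetical; the only things taken from this ADR are the saliency_mean and saliency_var column names. Note that the check only detects missing or non-finite values, so telling real zeros from placeholder zeros would still rely on the materializer's status column.

```python
import math
from typing import Iterable, Mapping, Sequence


def saliency_coverage(
    rows: Iterable[Mapping],
    columns: Sequence[str] = ("saliency_mean", "saliency_var"),
) -> float:
    """Fraction of rows where every saliency column holds a finite number."""
    total = 0
    covered = 0
    for row in rows:
        total += 1
        values = [row.get(name) for name in columns]
        if all(
            isinstance(v, (int, float))
            and not isinstance(v, bool)
            and math.isfinite(v)
            for v in values
        ):
            covered += 1
    return covered / total if total else 0.0


def require_saliency_coverage(rows: Iterable[Mapping], min_coverage: float = 1.0) -> float:
    """Raise instead of silently training on rows without saliency signal."""
    coverage = saliency_coverage(rows)
    if coverage < min_coverage:
        raise ValueError(
            f"saliency coverage {coverage:.1%} is below the required {min_coverage:.1%}"
        )
    return coverage
```

A trainer that prefers reporting over failing can call saliency_coverage directly and log the result.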
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Compute saliency inside each trainer | No extra operator command; every trainer can request the signal directly | Duplicates decode/model code, makes training runs non-deterministically slower, and hides missing-signal coverage | Rejected because corpus enrichment should be observable and reusable |
| Add a one-off CHUG-only enrichment script | Fastest route for the current HDR run | Leaves KoNViD/UGC/Netflix refresh tables without the same path and repeats the scaffold problem in the next corpus | Rejected because the audit needs a cross-corpus table utility |
| Materialize table rows as a standalone script | Reuses existing saliency helper, gives operators a status column, and keeps trainers simple | Adds a new user-facing script and docs surface | Chosen as the narrowest reusable path |
| Require raw YUV inputs only | Avoids FFmpeg decode variance | Most MOS/HDR corpora are MP4/encoded assets; operators would need a separate decode pipeline first | Rejected because the materializer is meant to unblock real corpus tables |
Consequences
- Positive: Refreshed AI tables can now carry real saliency aggregates before MOS-head and predictor retrains.
- Positive: The status column makes bulk corpus quality visible without scraping stderr.
- Negative: Large runs still require local video access and model weights; CI covers the control flow with fake decoders rather than shipping a real video/model fixture.
- Neutral / follow-ups: Run this materializer over refreshed CHUG, KoNViD/UGC, and Netflix-derived tables, then re-run the signal-mix audit to measure whether saliency should stay in the production feature mix.
References
- ADR-0286
- ADR-0396
- ADR-0649
- Research-0655
- Source: req - "well and in this audit perhaps find gaps that we have no metric/signal for at all or so"
- Source: req - "yeah every possible gain through intersection (even of not yet included metrics)"