Research-0655: Saliency Feature Materializer¶
Question¶
How should existing AI feature tables receive real saliency signals without turning every trainer into an implicit video decoder and saliency runner?
Findings¶
- Existing refreshed corpus tables are row-oriented JSONL/parquet assets, so saliency needs to be added as row-level aggregates before model selection can measure it.
vmaftune.saliency.compute_saliency_mapalready owns the model invocation and expects rawyuv420pplus explicit geometry.- Most local corpora are encoded video files, not raw YUV, so a reusable materializer must decode a bounded sample through FFmpeg and must probe geometry when table rows are incomplete.
- A status column is more useful than fail-fast behaviour for large local sweeps: operators need to know which rows decoded, which were skipped, and which failed because the source or geometry was missing.
Result¶
Ship one table materializer under ai/scripts/ with JSONL/parquet I/O, bounded FFmpeg decode, ffprobe fallback, saliency_mean / saliency_var columns, and status values for operational triage.
Verification¶
ai/tests/test_materialize_saliency_features.py covers JSONL roundtrip, existing-column skips, missing-source status, ffprobe geometry fallback, and the fake FFmpeg-to-saliency handoff.