ADR-0650: Add a Signal-Mix Audit CLI
- Status: Accepted
- Date: 2026-05-20
- Deciders: Lusoris, Codex
- Tags: ai, metrics, audit, hdr
Context
The fork now has multiple overlapping VQA signal sources: Netflix canonical features, full-feature parquet tables, subjective MOS corpora, DNN perceptual models, saliency models, encoder profile metadata, and HDR-specific CHUG rows. Several older audits catalogued feature availability, but they did not give an operator a current, table-driven answer to "what signal family is missing?" or "which not-yet-wired metric could add value through intersection with the current stack?"
This matters because the next product gains are unlikely to come from one more canonical-six retrain. HDR, panel-aware recommendations, saliency-weighted encodes, texture metrics, codec-specific profiles, and MOS heads each need a clear signal map before model promotion.
Decision
We will ship ai/scripts/signal_mix_audit.py as a table-only diagnostic CLI. It reads parquet/JSONL/JSON feature tables, classifies columns into VQA signal families, computes target correlations, flags redundant pairs, surfaces cross-family complementary intersections, and renders JSON plus Markdown reports with missing/weak signal-family recommendations.
The CLI is advisory and side-effect free. It does not extract features, train models, mutate corpus files, or gate CI by default.
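As a rough illustration of the target-correlation and redundant-pair steps described above, the following is a minimal sketch, not the script's actual implementation. It assumes Pearson correlation and a 0.95 redundancy threshold, neither of which this ADR specifies; the `pearson` and `audit` helpers and the toy table are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def audit(table, target, redundancy_threshold=0.95):
    """Read-only pass over a column-oriented table (dict of lists)."""
    features = [c for c in table if c != target]
    # Correlation of every feature column with the target column.
    target_corr = {c: pearson(table[c], table[target]) for c in features}
    # Feature pairs whose absolute correlation meets the (assumed) threshold.
    redundant = [
        (a, b)
        for a, b in combinations(features, 2)
        if abs(pearson(table[a], table[b])) >= redundancy_threshold
    ]
    return {"target_correlation": target_corr, "redundant_pairs": redundant}


# Hypothetical toy table; column names and values are illustrative only.
table = {
    "vmaf": [10.0, 20.0, 30.0, 40.0],
    "vmaf_copy": [11.0, 21.0, 31.0, 41.0],
    "saliency_mean": [0.9, 0.1, 0.8, 0.2],
    "mos": [1.0, 2.0, 3.0, 4.0],
}
report = audit(table, target="mos")
```

Here `vmaf` and `vmaf_copy` are flagged as a redundant pair, while `saliency_mean` is not. The function only reads its input and returns a new dictionary, which mirrors the side-effect-free intent stated above.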
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Extend feature_correlation.py | Reuses an existing script | That script is intentionally narrow: pairwise correlations and sklearn importances for one parquet target | Rejected; the new question is signal-family coverage and candidate intersections, not just ranking columns |
| Keep documenting audits manually | Fast for one session and flexible for narrative notes | Goes stale immediately after refreshes and does not inspect generated tables | Rejected; stale speed-feature notes already showed why prose-only audits are unreliable |
| Build the full continuous feature-mix evaluator now | Most complete long-term answer | Larger scope: YAML grids, subset search, model fitting, uncertainty intervals, and cost weighting | Deferred; this PR gives a safe first diagnostic layer while the larger evaluator work remains separate |
| Add table-only signal-mix audit | Cheap, deterministic, can run on current local tables, highlights missing metric families and candidate intersections | Heuristic column-family mapping can miss custom names until updated | Accepted |
Consequences
- Positive: model-refresh and HDR work can see missing signal families before burning compute on a retrain.
- Positive: the report explicitly calls out candidate metrics that are not yet wired, such as U2NetP, DISTS, HDR-VDP, DOVER, Q-Align, and panel metadata.
- Positive: stale prose audits become inputs to compare against, not the source of truth for current table state.
- Negative: family matching is name-based. New custom column names may need new regexes before the audit classifies them correctly.
- Negative: geometry/content metadata is only conditioning context; it can explain MOS variation but does not replace perceptual metrics.
- Neutral / follow-ups: the continuous feature-mix evaluator can consume this audit's findings later, but remains a separate modelling/evaluation surface.
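The name-based matching limitation noted above can be illustrated with a small sketch. The family names, regexes, and the `classify` helper below are hypothetical and are not taken from signal_mix_audit.py; they only show how a custom column name stays unclassified until a pattern is added.

```python
import re

# Hypothetical family patterns; the real script's families and regexes may differ.
FAMILY_PATTERNS = {
    "canonical": re.compile(r"^(vmaf|adm|vif|motion)", re.IGNORECASE),
    "saliency": re.compile(r"saliency", re.IGNORECASE),
    "hdr": re.compile(r"(hdr|pq|hlg)", re.IGNORECASE),
    "subjective": re.compile(r"^mos", re.IGNORECASE),
}


def classify(columns, patterns=FAMILY_PATTERNS):
    """Assign each column to the first family whose regex matches its name."""
    families = {name: [] for name in patterns}
    unclassified = []
    for col in columns:
        for name, pattern in patterns.items():
            if pattern.search(col):
                families[name].append(col)
                break
        else:
            # No pattern matched: the audit cannot place this column.
            unclassified.append(col)
    # Families with no matching column are reported as missing.
    missing = [name for name, cols in families.items() if not cols]
    return families, unclassified, missing


columns = ["vmaf", "adm2", "mos", "u2netp_mask_mean"]
families, unclassified, missing = classify(columns)

# After extending the saliency regex, the custom column is picked up.
updated = dict(FAMILY_PATTERNS)
updated["saliency"] = re.compile(r"(saliency|u2netp)", re.IGNORECASE)
families2, unclassified2, missing2 = classify(columns, updated)
```

In the first pass, `u2netp_mask_mean` lands in `unclassified` and the saliency family is reported as missing even though a saliency-derived column is present. Surfacing unclassified columns alongside missing families is one way to make this failure mode visible to the operator.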
References
- Research-0026 — earlier cross-metric feature-fusion rationale.
- continuous-feature-mix-evaluation-design-2026-05-18 — broader evaluator design this diagnostic does not try to replace.
- Source: req: "well and in this audit perhaps find gaps that we have no metric/signal for at all or so"
- Source: req: "yeah every possible gain through intersection (even of not yet included metrics)... thats an interesting topic for sure lol"