ADR-0674: Second-Opinion Materializer Batch Manifest
- Status: Accepted
- Date: 2026-05-21
- Deciders: Lusoris, Codex
- Tags: ai, second-opinion, materializer, provenance, fork-local
Context
ADR-0657 added `ai/scripts/materialize_second_opinion_features.py`, a table-side joiner for externally generated NR/MOS scorer JSON. That design intentionally keeps competitor execution out of the repo: operators generate DOVER, Q-Align, FAST-VQA, fork-NR, or other sidecars elsewhere, then join the scalar evidence onto refreshed feature tables.
The current signal-mix backlog needs those joins across several refreshed tables. Repeating the single-table command by hand risks mixing up scorer labels, missing-data policies, and output paths, and it leaves no single artifact that downstream retraining can cite as the second-opinion refresh set.
Decision
Add `ai/scripts/batch_materialize_second_opinion_features.py`, a manifest-driven orchestrator over the existing second-opinion materializer. The manifest has shared defaults and a `tables[]` array. Each table entry carries `id`, `features`, `scores`, `out`, an optional `audit_json`, and any single-table join overrides. Relative paths resolve from the manifest directory unless `--base-dir` is supplied.
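The manifest shape and path rule can be illustrated with a minimal sketch. It is not the script's actual implementation: the `defaults` key name, the `scorer` override, the `.parquet` file names, and the `resolve_tables` helper are all assumptions made for the example. Only `tables[]`, the per-table field names, and the "manifest directory unless a base directory is given" rule come from this ADR.

```python
from pathlib import Path
from typing import Optional

# Path-valued table fields named in this ADR.
PATH_KEYS = ("features", "scores", "out", "audit_json")


def resolve_tables(manifest: dict, manifest_dir: Path,
                   base_dir: Optional[Path] = None) -> list:
    """Merge shared defaults into each table entry and resolve relative paths."""
    # Relative paths resolve from the manifest directory unless a base dir is given.
    root = Path(base_dir) if base_dir is not None else Path(manifest_dir)
    defaults = manifest.get("defaults", {})  # key name is an assumption
    resolved = []
    for entry in manifest["tables"]:
        table = {**defaults, **entry}  # per-table values override shared defaults
        for key in PATH_KEYS:
            value = table.get(key)
            if value is not None and not Path(value).is_absolute():
                table[key] = str(root / value)
        resolved.append(table)
    return resolved


# Hypothetical manifest content, written as a dict for brevity.
manifest = {
    "defaults": {"scorer": "dover"},  # "scorer" is a made-up override name
    "tables": [
        {"id": "konvid", "features": "konvid/features.parquet",
         "scores": "konvid/dover.json", "out": "konvid/joined.parquet"},
        {"id": "chug", "features": "chug/features.parquet",
         "scores": "chug/qalign.json", "out": "chug/joined.parquet",
         "audit_json": "chug/audit.json", "scorer": "q-align"},
    ],
}

tables = resolve_tables(manifest, manifest_dir=Path("/data/manifests"))
```

In this sketch the `chug` entry overrides the shared `scorer` default, while `konvid` inherits it and gets no `audit_json` because none was requested.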
The batch runner writes each joined table, an optional per-table audit JSON, and a `second-opinion-materializer-batch-v1` report with ADR-0661 run provenance. It must call `materialize_second_opinion_features.materialize()` for every table; it does not parse scorer payloads itself and does not invoke external scorer binaries.
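The delegation boundary described above might look roughly like the following sketch. The real signature and return value of `materialize()` are not given in this ADR, so the joiner is passed in as a callable and replaced by a stub in the demo. The report layout and the `provenance` field are placeholders, not the ADR-0661 format.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable

REPORT_SCHEMA = "second-opinion-materializer-batch-v1"  # schema id named in this ADR
RUNNER_KEYS = {"id", "audit_json"}  # handled by the runner, not forwarded


def run_batch(tables: list, materialize: Callable[..., dict], report_path: Path) -> dict:
    """Call the shared joiner once per table and write one batch report."""
    results = []
    for table in tables:
        # The runner only forwards options; it never reads scorer payloads
        # and never launches a scorer binary.
        options = {k: v for k, v in table.items() if k not in RUNNER_KEYS}
        audit = materialize(**options)
        if table.get("audit_json"):
            Path(table["audit_json"]).write_text(json.dumps(audit, indent=2))
        results.append({"id": table["id"], "out": table["out"], "audit": audit})
    report = {
        "schema": REPORT_SCHEMA,
        # Placeholder only: the actual ADR-0661 provenance fields are not shown here.
        "provenance": {"generated_at": datetime.now(timezone.utc).isoformat()},
        "tables": results,
    }
    Path(report_path).write_text(json.dumps(report, indent=2))
    return report


def stub_materialize(**options) -> dict:
    """Stand-in for materialize_second_opinion_features.materialize()."""
    return {"joined_rows": 0, "options": sorted(options)}


with tempfile.TemporaryDirectory() as tmp:
    demo_tables = [{
        "id": "konvid", "features": "f.parquet", "scores": "s.json",
        "out": "joined.parquet",
        "audit_json": str(Path(tmp) / "konvid.audit.json"),
    }]
    report = run_batch(demo_tables, stub_materialize, Path(tmp) / "batch_report.json")
    audit_written = (Path(tmp) / "konvid.audit.json").exists()
```

Injecting the joiner keeps the sketch testable without the real script and makes the "no scorer parsing in the runner" rule visible in the code.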
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Manifest-driven batch wrapper over the shared joiner | Repeatable multi-table joins; one provenance report; preserves the no-external-scorer boundary | Adds one operator-facing CLI | Chosen: it closes the execution gap without changing row semantics |
| Keep shell loops | No new Python surface | No stable batch artifact, weak provenance, easy label/path drift | Rejected: shell history is not adequate training evidence |
| Invoke competitor scorers from the batch runner | Fully automated end-to-end scorer generation | Vendors/links third-party competitors and violates ADR-0657's table-side boundary | Rejected: scorer execution remains outside this repo |
| Add corpus-specific second-opinion scripts | Simple per-corpus defaults | Duplicates join policy and column semantics | Rejected: corpus differences are already expressible in manifest entries |
Consequences
- Positive: Second-opinion joins for CHUG, KoNViD, UGC, Netflix, and BVI can be replayed from one manifest and cited by retraining jobs.
- Negative: Join option changes must keep the single-table script and batch manifest validation in sync.
- Neutral / follow-ups: Generate scorer sidecars, run the batch manifest on refreshed tables, rerun the signal-mix audit, and measure MOS/predictor retrain impact.
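One possible way to limit the sync burden named in the negative consequence is to derive the accepted manifest keys from the joiner's own signature. The sketch below assumes `materialize()` takes its join options as named parameters, which this ADR does not confirm. The `materialize` stand-in, its parameter names, and both helpers are hypothetical.

```python
import inspect

RUNNER_KEYS = {"id", "audit_json"}  # keys the batch runner handles itself


def materialize(features, scores, out, scorer="dover", on_missing="error"):
    """Hypothetical stand-in for the single-table joiner's entry point."""


def allowed_table_keys(joiner) -> set:
    # Reading parameter names from the joiner means a newly added join
    # option is accepted by manifest validation without a second edit.
    return RUNNER_KEYS | set(inspect.signature(joiner).parameters)


def validate_table(entry: dict, joiner) -> None:
    """Reject manifest entries carrying keys the joiner does not accept."""
    unknown = set(entry) - allowed_table_keys(joiner)
    if unknown:
        raise ValueError(f"table {entry.get('id')!r}: unknown keys {sorted(unknown)}")


# A valid entry passes silently; a misspelled option would raise ValueError.
validate_table(
    {"id": "ugc", "features": "f", "scores": "s", "out": "o", "on_missing": "skip"},
    materialize,
)
```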
References
- ADR-0657 — table-side second-opinion joiner.
- ADR-0661 — shared AI run provenance.
- Research-0694 — implementation digest.
- Source: req — "yeah every possible gain through intersection (even of not yet included metrics)... thats an interesting topic for sure lol"
- Source: req — "well go on i guess we have enough backlog..."