ADR-0991: Second-Opinion Batch Materializer — Smoke-Run Scaffold and Test Fix¶

Status: Accepted
Date: 2026-06-03
Deciders: Lusoris
Tags: ai, second-opinion, materializer, testing, smoke, fork-local

Context¶

ADR-0674 introduced ai/scripts/batch_materialize_second_opinion_features.py and its test suite. The implementation is correct; however, two gaps were identified when the PR (archived as #1497) was reviewed for its first real corpus run:

No committed smoke-run scaffold. There is no versioned manifest or fixture set that an operator can use to verify the script works end-to-end against a known input before pointing it at real corpus data. The analogous saliency batch runner (ADR-0673) also lacks such a scaffold, but the second-opinion path is more frequently referenced in signal-mix planning.
ai/pyproject.toml missing pythonpath = ["scripts"] in pytest configuration. The batch tests import the scripts via importlib.spec_from_file_location, which triggers _script_bootstrap resolution at module load time. Without ai/scripts/ on sys.path, all three second-opinion batch tests and all four saliency batch tests fail with ModuleNotFoundError: No module named '_script_bootstrap'. This is a pre-existing bug affecting any invocation of pytest ai/tests/ from the repo root without an explicit PYTHONPATH=ai/scripts prefix.

Decision¶

Add pythonpath = ["scripts"] to [tool.pytest.ini_options] in ai/pyproject.toml. This fixes the pre-existing failure for both the second-opinion batch tests and the saliency batch tests without changing any test logic.
Commit a smoke-run scaffold under ai/testdata/smoke-second-opinion-batch/: a batch.json manifest, minimal synthetic fixture JSONL tables, and a README.md with the exact invocation and expected output. The manifest outputs to /tmp/ so no generated artifacts are committed.

Alternatives considered¶

Option	Pros	Cons	Why not chosen
Add `PYTHONPATH=ai/scripts` to every CI step	No pyproject.toml change	Requires updating every CI job that runs ai tests; leaks an env var concern into infra rather than the package config	Rejected: the cleaner fix is the package declaring its own pytest paths
Fix only second-opinion tests, leave saliency broken	Smaller PR scope	Leaves a known CI failure in the saliency batch test suite	Rejected: same root cause, same one-line fix
Place smoke manifest in `ai/runs/`	Conventional location for run manifests	`runs/` is gitignored at the root `.gitignore` level, so the manifest would not be committed	Rejected: `ai/testdata/` is not gitignored and is the correct home for committed fixture data
No smoke scaffold, rely on unit tests alone	No new committed files	No single "run this to prove it works" command for operators new to the script	Rejected: operator confidence requires an end-to-end smoke path with known fixtures

Consequences¶

Positive: pytest ai/tests/ from the repo root passes for all batch materializer tests without a manual PYTHONPATH prefix.
Positive: Operators can verify the second-opinion batch path end-to-end with one command and two synthetic tables before pointing the script at real corpus data.
Negative: ai/testdata/ grows by five small files (batch manifest, two fixture feature tables, two fixture score tables, README).
Neutral / follow-ups: Run the batch manifest on real scorer sidecars for CHUG, KoNViD, UGC, Netflix, and BVI tables, then rerun the signal-mix audit. Consider adding the same pytest path fix for any other ai/scripts batch runners added in the future (saliency batch runner currently uses the same fix path).

References¶

ADR-0657 — table-side second-opinion joiner.
ADR-0661 — AI run provenance.
ADR-0673 — saliency batch runner (same pytest fix also applied).
ADR-0674 — second-opinion batch manifest design.
Research-0991 — implementation digest.
Source: req — "Script merged (PR #1497 archived) but no artifacts under .workingdir2/ or runs/. If wired but not run, scaffold a smoke launch."