Research-0647: `fr_regressor_v1` refresh on the 2026-05-20 Netflix table

Question

Does `fr_regressor_v1`, retrained on the refreshed Netflix full-feature table, still clear the ADR-0249 C1 ship gate, and how much do its shipped metrics move?

Inputs

Feature table: runs/full_features_netflix_refresh_20260520.parquet
Table shape: 11190 rows, 30 columns
Trainer: PYTHONPATH=ai/src .venv/bin/python ai/scripts/train_fr_regressor.py
Model recipe: unchanged ADR-0249 FRRegressor (hidden=64, depth=2, dropout=0.1, canonical-6 input)
Ship gate: mean LOSO PLCC >= 0.95

Result

Source	PLCC	SROCC	RMSE
BigBuckBunny	0.9962	0.6277	4.466
BirdsInCage	0.9993	0.9997	1.854
CrowdRun	0.9997	0.9998	1.005
ElFuente1	0.9992	0.9963	1.635
ElFuente2	0.9964	0.9969	3.169
FoxBird	0.9967	0.9954	2.352
OldTownCross	0.9992	0.9998	1.931
Seeking	0.9989	0.9962	1.982
Tennis	0.9982	0.9982	1.356

Summary (mean ± sample standard deviation over the nine LOSO folds):

Mean PLCC: 0.9982 ± 0.0014
Mean SROCC: 0.9567 ± 0.1234
Mean RMSE: 2.194 ± 1.049
Final all-source in-sample PLCC: 0.9993
Exported ONNX sha256: b57dee2509290d77c7980f8f23aa1380f64937c485d1b1d1e5f78c13a3a54c63
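
The summary figures can be recomputed from the per-source rows of the Result table. The following is a minimal sketch, assuming the ± values are sample standard deviations (ddof = 1) over the nine folds. The `folds` dictionary, the `summarize` helper, and the `SHIP_GATE_PLCC` name are illustrative and are not taken from the trainer.

```python
from statistics import mean, stdev

# Per-source LOSO metrics copied from the Result table: (PLCC, SROCC, RMSE).
folds = {
    "BigBuckBunny": (0.9962, 0.6277, 4.466),
    "BirdsInCage": (0.9993, 0.9997, 1.854),
    "CrowdRun": (0.9997, 0.9998, 1.005),
    "ElFuente1": (0.9992, 0.9963, 1.635),
    "ElFuente2": (0.9964, 0.9969, 3.169),
    "FoxBird": (0.9967, 0.9954, 2.352),
    "OldTownCross": (0.9992, 0.9998, 1.931),
    "Seeking": (0.9989, 0.9962, 1.982),
    "Tennis": (0.9982, 0.9982, 1.356),
}

# ADR-0249 C1 threshold on mean LOSO PLCC (name is hypothetical).
SHIP_GATE_PLCC = 0.95


def summarize(column):
    """Return (mean, sample standard deviation) for one metric column."""
    values = [row[column] for row in folds.values()]
    return mean(values), stdev(values)  # stdev uses ddof = 1


plcc_mean, plcc_std = summarize(0)
srocc_mean, srocc_std = summarize(1)
rmse_mean, rmse_std = summarize(2)

print(f"Mean PLCC:  {plcc_mean:.4f} ± {plcc_std:.4f}")
print(f"Mean SROCC: {srocc_mean:.4f} ± {srocc_std:.4f}")
print(f"Mean RMSE:  {rmse_mean:.3f} ± {rmse_std:.3f}")
print("Ship gate passed:", plcc_mean >= SHIP_GATE_PLCC)
```

With the table values as listed, this reproduces the three reported summary lines. The population standard deviation (ddof = 0) would give slightly smaller spreads, so the reported values are consistent with the sample convention.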

Interpretation

The refreshed model clears the unchanged PLCC ship gate by a wide margin (mean LOSO PLCC 0.9982 against a threshold of 0.95) and reduces fold-to-fold PLCC variance relative to the previous sidecar. The low BigBuckBunny SROCC (0.6277) is a real caveat: rank ordering within that fold is weaker even though linear agreement remains high (PLCC 0.9962). This does not block the C1 checkpoint because ADR-0249 defines PLCC-vs-teacher as the ship gate, but it should be checked again when the aggregate 4/5-corpus models are refreshed.
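
To illustrate how PLCC and SROCC can diverge in general, the sketch below uses synthetic values only. It is not drawn from the BigBuckBunny fold and does not diagnose the cause there. It shows one pattern: when most samples sit in a narrow score band and a few sit far away, small within-band errors barely affect linear correlation but can scramble the ranks. The helper functions and the data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (SROCC): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


# Synthetic example: three well-separated scores plus a tight band near 95.
band = [95.0 + 0.1 * i for i in range(10)]
teacher = [10.0, 30.0, 50.0] + band
# Predictions match outside the band but reverse the order inside it.
predicted = [10.0, 30.0, 50.0] + band[::-1]

plcc = pearson(teacher, predicted)
srocc = spearman(teacher, predicted)
print(f"PLCC  = {plcc:.4f}")
print(f"SROCC = {srocc:.4f}")
```

In this toy case the within-band errors are under one score point, so PLCC stays very close to 1, while SROCC falls sharply because ten of the thirteen ranks are inverted.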

Reproducer

# Retrain on the refreshed feature table and write the metrics JSON.
PYTHONPATH=ai/src .venv/bin/python ai/scripts/train_fr_regressor.py \
  --parquet runs/full_features_netflix_refresh_20260520.parquet \
  --metrics-out runs/fr_regressor_v1_refresh_20260520_metrics.json

# Check that model/tiny/fr_regressor_v1.onnx is a well-formed ONNX model.
PYTHONPATH=ai/src .venv/bin/python - <<'PY'
import onnx
onnx.checker.check_model(onnx.load("model/tiny/fr_regressor_v1.onnx"))
PY

# Run the DNN registry test script.
bash core/test/dnn/test_registry.sh
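
The exported ONNX digest listed under Result can also be checked after a rerun. This is a minimal sketch using only the standard library. It assumes `model/tiny/fr_regressor_v1.onnx` (the path used in the ONNX check) is the artifact the sha256 refers to, and the `sha256_of` helper is illustrative.

```python
import hashlib
import os

# Digest reported in the Result section.
EXPECTED_SHA256 = "b57dee2509290d77c7980f8f23aa1380f64937c485d1b1d1e5f78c13a3a54c63"
# Assumption: this is the exported artifact the digest refers to.
MODEL_PATH = "model/tiny/fr_regressor_v1.onnx"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex sha256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if os.path.exists(MODEL_PATH):
    actual = sha256_of(MODEL_PATH)
    if actual == EXPECTED_SHA256:
        print("sha256 matches the reported digest")
    else:
        print(f"sha256 mismatch: {actual}")
else:
    print(f"{MODEL_PATH} not found")
```

A mismatch does not by itself mean the model is wrong: whether retraining is bit-for-bit reproducible depends on seeding and the export toolchain, which this note does not specify.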
