ADR-0685: Tiny-AI Netflix corpus training scaffold, 2026-05-27 prep scope

Status: Accepted
Date: 2026-05-27
Deciders: Lusoris, Claude (Anthropic)
Tags: ai, training, fork-local, onnx, mcp, docs

Context

ADR-0242 (2026-04-27) defined the scaffold-only PR strategy for training tiny-AI full-reference regressors on the original Netflix VMAF corpus (.workingdir2/netflix/{ref,dis}/, 9 reference + 70 distorted YUVs, gitignored). Subsequent iterations (ADR-0612, ADR-0640, ADR-0682) refreshed the research digest and extended the architecture alternatives table through 2026-05-22.

ADR-0682 (2026-05-22) opened the canonical ai/tiny-netflix-training-scaffold branch and draft PR. When this routine ran on 2026-05-27, that branch was absent from origin (either merged and deleted, or never pushed), so the idempotency gate cleared and this iteration opens a fresh scaffold PR. The merge of PR #152 (fix/volk-static-archive-priv-remap, ADR-0198) also satisfied the gate condition and confirms that master is in the expected clean state.
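How the routine implements its idempotency gate is not recorded in this ADR. As a minimal sketch, assuming the gate only asks whether the branch is listed on the remote, the check could be built on `git ls-remote --heads`. The function names `branch_in_ls_remote` and `gate_cleared` are hypothetical.

```python
import subprocess

BRANCH = "ai/tiny-netflix-training-scaffold"  # branch name taken from this ADR


def branch_in_ls_remote(ls_remote_output: str, branch: str) -> bool:
    """Return True if `git ls-remote --heads` output lists exactly this branch."""
    wanted = f"refs/heads/{branch}"
    for line in ls_remote_output.splitlines():
        # Each output line has the form "<sha>\t<ref>".
        parts = line.split("\t", 1)
        if len(parts) == 2 and parts[1].strip() == wanted:
            return True
    return False


def gate_cleared(remote: str = "origin", branch: str = BRANCH) -> bool:
    """Hypothetical gate: cleared when the branch is absent from the remote."""
    result = subprocess.run(
        ["git", "ls-remote", "--heads", remote, branch],
        check=True,
        capture_output=True,
        text=True,
    )
    return not branch_in_ls_remote(result.stdout, branch)
```

The parsing step is kept separate from the `git` call so it can be exercised without network access. The exact-match comparison matters because `git ls-remote` treats the branch argument as a pattern and may return other refs that end in the same path.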

The corpus itself is held locally at .workingdir2/netflix/ and is never committed. File naming follows the Netflix encoding-ladder convention: <source>_<quality_label>_<height>_<bitrate-kbps>.yuv. A full description of the loader API and corpus path contract is in docs/ai/training-data.md.
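The naming convention above can be illustrated with a small parser. This is a sketch only: the ADR gives the field order but not the exact shape of each field, so the regular expression below assumes that the height and bitrate are plain integers and that the quality label contains no underscore. `DistortedName` and `parse_distorted_name` are hypothetical names, not the loader API described in docs/ai/training-data.md.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed field shapes; only the field order comes from the ADR.
_NAME_RE = re.compile(
    r"^(?P<source>.+)_(?P<quality_label>[^_]+)"
    r"_(?P<height>\d+)_(?P<bitrate_kbps>\d+)\.yuv$"
)


@dataclass(frozen=True)
class DistortedName:
    source: str
    quality_label: str
    height: int
    bitrate_kbps: int


def parse_distorted_name(filename: str) -> Optional[DistortedName]:
    """Split a file name into its four fields, or return None if it does not fit."""
    match = _NAME_RE.match(filename)
    if match is None:
        return None
    return DistortedName(
        source=match["source"],
        quality_label=match["quality_label"],
        height=int(match["height"]),
        bitrate_kbps=int(match["bitrate_kbps"]),
    )
```

For a made-up name such as `ExampleClip_20_288_375.yuv`, this yields source `ExampleClip`, quality label `20`, height 288, and bitrate 375 kbps. Names without the trailing fields return `None` rather than raising, so a caller can decide how to treat files that fall outside the convention.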

Three open questions carried forward from ADR-0682 remain unresolved:

1. Architecture selection: 2×64 nano MLP vs 3×128 tiny MLP vs attention-pooled variant. No architecture has been selected; Research Digest 0730 (this PR) updates the literature survey for 2025–2026.
2. Distillation vs from-scratch: soft labels from vmaf_v0.6.1 vs training on any published Netflix subjective scores. No option has been chosen.
3. Evaluation scope: Netflix golden pairs as held-out correctness gate only, vs also including cross-backend ULP deltas in the eval harness.

Decision

We will open a consolidated draft PR from branch ai/tiny-netflix-training-scaffold that:

- Ships ADR-0685 (this file) as the 2026-05-27 scope record, cross-referencing ADR-0242 as the root decision and ADR-0682 as the immediately prior iteration.
- Adds Research Digest 0730, updating the literature survey through 2026-05-27.
- Updates docs/ai/training-data.md to cross-reference ADR-0685.
- Adds a CHANGELOG fragment and rebase-notes entry per ADR-0108.
- Does NOT run training, download corpus data, or touch Netflix golden test assertions (CLAUDE.md §8).

Architecture selection and the first actual training run remain deferred to a follow-up PR. The architecture table in section A of Alternatives considered lists the open options; the user selects one via the popup workflow in the follow-up PR.

Alternatives considered

A. Architecture choice (deferred; decision table for follow-up PR)

| Architecture | Parameters | Distillation PLCC target | From-scratch PLCC target | Notes |
| --- | --- | --- | --- | --- |
| 2×64 nano MLP | ~8 k | ~0.92 | ~0.88 | Fits in 32 KB; ORT MatMul improvement applies |
| 3×128 tiny MLP | ~50 k | ~0.96 | ~0.93 | Prior recommendation (Research-0706) |
| 4×256 small MLP | ~200 k | ~0.97 | ~0.95 | Diminishing returns above 3×128 per EfficientVMAF |
| Attention-pooled frame encoder | ~300 k | ~0.97 | ~0.95 | Adds temporal modelling; requires frame-level features |
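The "Fits in 32 KB" note for the nano MLP follows from the parameter count if each weight is stored as a 4-byte float32. The sketch below applies the same arithmetic to every row. It assumes float32 storage and ignores ONNX graph and metadata overhead, neither of which this ADR specifies; the parameter counts are the approximate figures from the table.

```python
# Approximate parameter counts copied from the architecture table.
APPROX_PARAMS = {
    "2x64 nano MLP": 8_000,
    "3x128 tiny MLP": 50_000,
    "4x256 small MLP": 200_000,
    "attention-pooled frame encoder": 300_000,
}


def weight_bytes(param_count: int, bytes_per_param: int = 4) -> int:
    """Raw weight storage, assuming float32 and no container overhead."""
    return param_count * bytes_per_param


for name, count in APPROX_PARAMS.items():
    print(f"{name}: ~{weight_bytes(count) / 1000:.0f} KB")
```

Under these assumptions the four rows come to roughly 32 KB, 200 KB, 800 KB, and 1200 KB of weights.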

B. Training regime (deferred)

| Option | Pros | Cons | Status |
| --- | --- | --- | --- |
| Distill from `vmaf_v0.6.1` | Soft labels free; no subjective data needed; direct comparison baseline | Inherits v0.6.1 biases | Recommended per ADR-0242 |
| Train from scratch on Netflix subjective scores | Potentially higher ground-truth fidelity | Subjective scores unpublished for this corpus; annotation uncertainty | Viable but requires annotation |
| Hybrid: distill then fine-tune on subjective subset | Best of both | More complex pipeline; annotation required | Deferred |
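To make the distillation option concrete: the teacher's per-clip `vmaf_v0.6.1` scores act as regression targets, and the PLCC targets in the architecture table are Pearson correlations between student predictions and the chosen labels. The following is a minimal sketch in plain Python, assuming a mean-squared-error distillation loss, which this ADR does not prescribe. The function names are hypothetical and no training framework is implied.

```python
import math
from typing import Sequence


def distillation_mse(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared error between student predictions and teacher soft labels."""
    if len(student) != len(teacher) or not student:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def plcc(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("PLCC is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In a from-scratch regime the same `plcc` helper would be applied against subjective scores instead of teacher outputs, which is why the table lists the two targets separately.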

C. Evaluation scope (deferred)

| Option | Pros | Cons | Status |
| --- | --- | --- | --- |
| Netflix golden pairs as correctness gate only | Zero extra work; gates already exist | Does not catch cross-backend regressions | Default |
| Golden pairs + cross-backend ULP deltas | Full parity check | Requires GPU access in CI | Deferred to follow-up |
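A ULP delta counts how many representable floating-point values lie between two results, which makes it a scale-independent measure of backend disagreement. The sketch below assumes scores are compared as IEEE 754 float64 values; this ADR does not state the width, and the function names are illustrative rather than part of the existing eval harness.

```python
import math
import struct


def _ordered_bits(x: float) -> int:
    """Map a float64 to an integer whose ordering matches the float ordering."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative floats are stored as sign plus magnitude; negate the magnitude
    # so the integer sequence is monotonic across zero (and -0.0 maps to 0).
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)


def ulp_delta(a: float, b: float) -> int:
    """Number of float64 steps between a and b (0 means bit-identical or +/-0)."""
    if math.isnan(a) or math.isnan(b):
        raise ValueError("ULP distance is undefined for NaN")
    return abs(_ordered_bits(a) - _ordered_bits(b))
```

A harness could then compare, say, a CPU score and a GPU score for the same pair and fail when `ulp_delta` exceeds a chosen budget. The budget itself is a policy decision left to the follow-up PR.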

D. PR scope (this ADR)

| Option | Pros | Cons | Status |
| --- | --- | --- | --- |
| Open the PR immediately (this ADR) | Unblocks architecture discussion; satisfies idempotency key | Another ADR without new code | Chosen |
| Wait for architecture selection | Fewer PRs | Delays the formal review gate; state lost on context reset | Rejected |
| Merge directly to master | Fewer branch steps | Violates CLAUDE.md §12 rule 3 | Rejected |

Consequences

- Positive: branch ai/tiny-netflix-training-scaffold exists on origin, satisfying the routine's idempotency key for all subsequent daily runs.
- Positive: architecture-selection discussion has a single reviewable PR home.
- Positive: MCP smoke-test (test_smoke_e2e.py) is verified to exercise the full vmaf_score JSON-RPC path against the Netflix golden fixture within places=2.
- Negative: ADR count grows without new code shipping.
- Neutral / follow-ups:
  - Architecture selection PR (follow-up, pending user popup response).
  - First training run PR (multi-day GPU job, --data-root .workingdir2/netflix/).
  - CI cannot validate the corpus path (gitignored); manual pre-run check required.
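Because CI cannot see the gitignored corpus, the manual pre-run check could be scripted locally. The following is a sketch under the assumptions that the layout is exactly `ref/` and `dis/` under the data root and that every clip carries a `.yuv` suffix; the expected counts are the 9 reference and 70 distorted files named in the Context section. `check_corpus` is a hypothetical helper, not an existing script in the repository.

```python
from pathlib import Path
from typing import List

# Expected file counts per subdirectory, taken from the Context section.
EXPECTED = {"ref": 9, "dis": 70}


def check_corpus(data_root: Path) -> List[str]:
    """Return a list of layout problems; an empty list means the check passed."""
    problems: List[str] = []
    for subdir, expected in EXPECTED.items():
        directory = data_root / subdir
        if not directory.is_dir():
            problems.append(f"missing directory: {directory}")
            continue
        found = sum(
            1 for path in directory.iterdir()
            if path.is_file() and path.suffix == ".yuv"
        )
        if found != expected:
            problems.append(
                f"{directory}: expected {expected} .yuv files, found {found}"
            )
    return problems
```

Calling `check_corpus(Path(".workingdir2/netflix"))` before launching the training job and aborting on a non-empty result would catch a missing or partial corpus early. The check only counts files; it does not validate their contents or names.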

References

ADR-0242: Tiny-AI training on the original Netflix VMAF corpus — root decision.
ADR-0612, ADR-0640, ADR-0682: prior research iterations (2026-05-19, 2026-05-20, 2026-05-22).
ADR-0198: volk static-archive priv remap — gate PR (PR #152).
Research-0019: Tiny-AI Training on the Netflix VMAF Corpus — original methodology survey.
Research-0706: Tiny-AI Netflix Training Prep — 2026-05-22 — prior digest.
Research-0730: Tiny-AI Netflix Training Prep — 2026-05-27 — this PR's digest.
project_netflix_training_corpus_local user-memory entry: local corpus at .workingdir2/netflix/, 9 reference + 70 distorted YUVs, gitignored, naming convention <source>_<quality>_<height>_<bitrate-kbps>.yuv.
Source: req (daily prep-scaffolding routine, Lusoris project instruction).