learned_filter_v1 — tiny residual luma filter

learned_filter_v1 is a self-supervised residual convolutional neural network that maps a degraded luma frame to a clean reconstruction. It serves as the C3 baseline for the fork's tiny-AI filter capability (ADR-0020), exercising the full training + export + quantisation pipeline end-to-end.

Status — shipped 2026-04-25. Production baseline for the C3 filter capability (KoNViD-1k self-supervised). An INT8 sidecar is available via learned_filter_v1.int8.onnx (dynamic-PTQ). See ADR-0168 and ADR-0174.

What the output means

The model takes a degraded luma frame (blurred + JPEG-compressed) and produces a residual-corrected clean luma estimate. The output is the reconstructed luma tensor on the same scale as the input; it is not a quality score. Downstream consumers can subtract the input from the output to obtain the learned residual correction, or pass the output directly to a downstream feature extractor.
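
As an illustration of the residual view, the sketch below computes the per-pixel correction from a degraded input and the model's reconstruction. The helper name `learned_residual` and the sign convention (reconstruction minus input) are assumptions made for this example, and the toy tensors stand in for real model output.

```python
import numpy as np


def learned_residual(degraded: np.ndarray, restored: np.ndarray) -> np.ndarray:
    """Return restored - degraded for two luma tensors of identical shape.

    Hypothetical helper: the sign convention is an assumption of this sketch.
    """
    if degraded.shape != restored.shape:
        raise ValueError("input and output tensors must have the same shape")
    return restored.astype(np.float32) - degraded.astype(np.float32)


# Toy NCHW tensors standing in for a real input/output pair.
degraded = np.full((1, 1, 2, 2), 0.50, dtype=np.float32)
restored = np.array([[[[0.50, 0.55], [0.45, 0.50]]]], dtype=np.float32)

residual = learned_residual(degraded, restored)
```

Positive entries mark pixels the filter brightened and negative entries mark pixels it darkened; a residual of all zeros would mean the filter left the frame unchanged.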

Shipped checkpoint

| Field | Value |
| --- | --- |
| Model id | learned_filter_v1 |
| Location | model/tiny/learned_filter_v1.onnx |
| INT8 sidecar | model/tiny/learned_filter_v1.int8.onnx |
| Architecture | 4-block residual CNN: Conv(1→16, 3×3) → 3×ResBlock(16, 3×3) → Conv(16→1, 3×3); ~19 K params |
| Input | input: float32 NCHW [1, 1, H, W] normalised luma in [0, 1] |
| Output | output: float32 NCHW [1, 1, H, W] reconstructed luma |
| ONNX opset | 17 |
| Training corpus | KoNViD-1k middle-frames (1 200 clips; not redistributed in-tree) |
| Val loss (L1) | ~0.019 on normalised luma (KoNViD-1k validation split) |
| Quantisation | Dynamic-PTQ INT8 via ai/scripts/ptq_dynamic.py; quant_accuracy_budget_plcc = 0.01 |
| License | BSD-3-Clause-Plus-Patent |
| Trainer / exporter | ai/scripts/export_tiny_models.py |
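
The input and output rows above imply a small amount of pre- and post-processing around the model. A minimal sketch follows, assuming an 8-bit full-range luma plane (0..255); the helper names `luma_to_tensor` and `tensor_to_luma` are hypothetical and not part of the fork.

```python
import numpy as np


def luma_to_tensor(y_plane: np.ndarray) -> np.ndarray:
    """Convert an 8-bit luma plane (H, W) to float32 NCHW [1, 1, H, W] in [0, 1].

    Assumes full-range 8-bit luma; limited-range video would need a
    different scaling.
    """
    if y_plane.ndim != 2 or y_plane.dtype != np.uint8:
        raise ValueError("expected a 2-D uint8 luma plane")
    return (y_plane.astype(np.float32) / 255.0)[np.newaxis, np.newaxis, :, :]


def tensor_to_luma(tensor: np.ndarray) -> np.ndarray:
    """Convert a [1, 1, H, W] tensor back to an 8-bit plane, clipping to [0, 1]."""
    if tensor.ndim != 4 or tensor.shape[:2] != (1, 1):
        raise ValueError("expected an NCHW tensor with N = C = 1")
    clipped = np.clip(tensor[0, 0], 0.0, 1.0)
    return np.rint(clipped * 255.0).astype(np.uint8)


# Small synthetic plane; a real caller would pass a decoded Y plane.
y_plane = np.arange(12, dtype=np.uint8).reshape(3, 4) * 20
tensor = luma_to_tensor(y_plane)
round_trip = tensor_to_luma(tensor)
```

The clip in `tensor_to_luma` matters because nothing in the card guarantees the reconstruction stays inside [0, 1]; the tensor produced by `luma_to_tensor` is what would be bound to the graph input named input.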

Fresh exports from ai/scripts/export_tiny_models.py add ADR-0661 run_provenance to model/tiny/learned_filter_v1.json. That block records the C3 checkpoint input, parsed exporter arguments, ONNX output, sidecar output, and registry target so a refreshed filter baseline can be replayed.

Training corpus provenance

| Field | Value |
| --- | --- |
| Dataset | KoNViD-1k |
| Source | https://datasets.vqa.mmsp-kn.de/databases/KoNViD-1k/ |
| Licence | CC BY 4.0; clips are not redistributed in-tree |
| Usage | Middle frame extracted per clip; synthetic degradation applied (Gaussian blur σ=1.2 + JPEG quality=35); self-supervised (degraded→clean pairs, no external MOS labels used for the filter task) |

Acknowledgement. Training uses KoNViD-1k frames for self-supervised degradation recovery. The clips are not committed to this repository.

Op-allowlist conformance

Every op in the graph is on core/src/dnn/op_allowlist.c: Conv, Relu, Add (residual skip connection).
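
A conformance check of this kind can be sketched in a few lines. The set below contains only the three ops this card names, not the full contents of op_allowlist.c, and `disallowed_ops` is a hypothetical helper written for this example.

```python
# Only the ops named in this card; the real allowlist lives in
# core/src/dnn/op_allowlist.c and may contain more entries.
ALLOWED_OPS = frozenset({"Conv", "Relu", "Add"})


def disallowed_ops(op_types, allowed=ALLOWED_OPS):
    """Return the sorted list of op types that are not on the allowlist."""
    return sorted(set(op_types) - set(allowed))


# Op types as they might be collected from a graph, for example with
# [node.op_type for node in onnx.load(path).graph.node] if the onnx
# package is installed.
graph_ops = ["Conv", "Relu", "Conv", "Add", "Relu", "Conv"]

violations = disallowed_ops(graph_ops)
```

An empty `violations` list means every op in the graph is covered by the allowlist subset used here.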

Degradation recipe

The synthetic training pairs are produced inside export_tiny_models.py:

  1. Load middle frame of each KoNViD-1k clip as 224×224 grayscale (nearest-neighbour crop; no random augment at export time).
  2. Apply Gaussian blur with σ=1.2.
  3. JPEG-compress at quality 35 using PIL.Image.save(..., quality=35).
  4. Luma pair: (degraded, original) both normalised to [0, 1].
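
Steps 2 to 4 above can be sketched with Pillow and NumPy as follows. This is an illustration of the recipe, not the code in export_tiny_models.py: the function name `make_pair` is hypothetical, a random frame stands in for a KoNViD-1k middle frame, and Pillow's `GaussianBlur` radius is used as the standard deviation.

```python
import io

import numpy as np
from PIL import Image, ImageFilter


def to_unit(img: Image.Image) -> np.ndarray:
    """Convert an 8-bit greyscale image to float32 in [0, 1]."""
    return np.asarray(img, dtype=np.float32) / 255.0


def make_pair(clean_u8: np.ndarray, sigma: float = 1.2, quality: int = 35):
    """Return (degraded, original) float32 arrays from an 8-bit luma frame."""
    clean = Image.fromarray(clean_u8)  # 2-D uint8 array becomes mode "L"
    # Step 2: Gaussian blur.
    blurred = clean.filter(ImageFilter.GaussianBlur(radius=sigma))
    # Step 3: JPEG round trip through an in-memory buffer.
    buf = io.BytesIO()
    blurred.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    degraded = Image.open(buf).convert("L")
    # Step 4: normalise both frames to [0, 1].
    return to_unit(degraded), to_unit(clean)


# Synthetic 224x224 frame standing in for a real clip's middle frame.
rng = np.random.default_rng(42)
frame = rng.integers(0, 256, size=(224, 224), dtype=np.uint8)
degraded, original = make_pair(frame)
```

Encoding to an in-memory buffer avoids writing temporary JPEG files while still exercising the same `Image.save(..., quality=35)` path the recipe names.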

Usage — vmaf_pre FFmpeg filter

learned_filter_v1 is the filter loaded by ffmpeg-patches/0002 (vmaf_pre) when --tiny-model=learned_filter_v1 is passed:

ffmpeg \
  -i ref.yuv -i dist.yuv \
  -filter_complex '[1:v]vmaf_pre=model_path=model/tiny/learned_filter_v1.onnx[d];
                   [d][0:v]libvmaf' \
  -f null -

The filter applies the model to the distorted stream's luma before VMAF scoring, enabling a learned pre-processing step upstream of the feature extractors.

Reproducing the model

# 1. fetch KoNViD-1k (~40 GB) — not redistributed in-tree
.venv/bin/python ai/scripts/fetch_konvid_1k.py

# 2. train the C2/C3 checkpoints
#    A run-provenance sidecar is written automatically to
#    runs/c2_konvid/train_konvid.manifest.json (ADR-0668).
#    Pass --manifest-out <path> to override the sidecar location.
.venv/bin/python ai/scripts/train_konvid.py \
    --model both \
    --output-c2 runs/c2_konvid \
    --output-c3 runs/c3_konvid \
    --epochs-c2 50 \
    --epochs-c3 200 \
    --seed 42

# 3. export the ONNX + sidecar + registry rows
.venv/bin/python ai/scripts/export_tiny_models.py \
    --c2-ckpt runs/c2_konvid/last.ckpt \
    --c3-ckpt runs/c3_konvid/last.ckpt

# 4. quantise to INT8
.venv/bin/python ai/scripts/ptq_dynamic.py \
    --model model/tiny/learned_filter_v1.onnx \
    --output model/tiny/learned_filter_v1.int8.onnx

# 5. validate against the registry
.venv/bin/python ai/scripts/validate_model_registry.py
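
The card lists quant_accuracy_budget_plcc = 0.01 for the INT8 sidecar but does not spell out how the budget is evaluated. One plausible reading, assumed here purely for illustration, is that the Pearson correlation (PLCC) between FP32 and INT8 outputs may drop by at most 0.01 from a perfect 1.0. The helper names and sample values below are hypothetical.

```python
import math


def plcc(a, b):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("PLCC is undefined for constant sequences")
    return cov / math.sqrt(var_a * var_b)


def within_budget(fp32_out, int8_out, budget=0.01):
    """Assumed check: the correlation loss 1 - PLCC must not exceed the budget."""
    return (1.0 - plcc(fp32_out, int8_out)) <= budget


# Made-up flattened outputs standing in for real FP32 and INT8 inference.
fp32_out = [0.10, 0.25, 0.40, 0.55, 0.70]
int8_out = [0.11, 0.24, 0.41, 0.54, 0.71]
shuffled = [0.70, 0.10, 0.55, 0.25, 0.40]

ok = within_budget(fp32_out, int8_out)
broken = within_budget(fp32_out, shuffled)
```

Small quantisation noise keeps the correlation close to 1 and passes the check, while outputs that no longer track the FP32 model fail it.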

Known limitations

  • Task scope: the degradation recipe (Gaussian blur + JPEG) covers blur and JPEG-style compression artefacts, but not the block-noise patterns typical of AVC/HEVC at very low bitrates or the artefacts of content-adaptive quantisation.
  • Luma only: the filter operates on the Y channel. Chroma artefacts (colour bleed, cross-component leakage) are not corrected.
  • Fixed crop: training used 224×224 random crops; inference is fully convolutional (no size constraint), but quality on very large or very small inputs may degrade.
  • Self-supervised only: no perceptual loss (LPIPS, SSIM); the L1 reconstruction target may slightly over-smooth texture.

See also

  • nr_metric_v1.md: sibling KoNViD-1k baseline (NR quality metric, same training corpus).
  • ADR-0168: decision record for both C2 + C3 KoNViD baselines.
  • ADR-0174: INT8 dynamic-PTQ policy.
  • ADR-0042: tiny-AI doc-substance rule this card satisfies.