`vmaf-tune ladder` — uncertainty-aware ABR ladder construction¶

vmaf-tune ladder builds a per-title bitrate ladder by sampling the (resolution × target-VMAF) plane, computing the convex hull of the resulting (bitrate, VMAF) cloud, picking --quality-tiers knees along the hull, and emitting an HLS / DASH / JSON manifest. The base behaviour is documented in vmaf-tune.md and is unchanged when the uncertainty flags are omitted.

This page covers the uncertainty-aware extension (ADR-0393, shipped on top of the conformal-VQA prediction surface in PR #488). The extension is wired through both the library API and the CLI. The default sampler preserves vmaf_interval blocks from corpus rows; when those intervals are present, --with-uncertainty applies the same prune/insert recipe before knee selection. Point-only rows use the active wide_interval_min_width as a conservative centred fallback, so the recipe can still probe uncertain bitrate gaps.

Why uncertainty-aware¶

A point-estimate ladder treats every rung's VMAF as exact. In practice the predictor's intervals carry information that bears directly on rung selection:

Adjacent rungs whose intervals overlap a lot — the predictor cannot statistically distinguish them. Shipping both adds no quality information; ship the higher-quality rung and drop the lower-bitrate one.
Adjacent rungs whose averaged interval width is in the WIDE band — the predictor cannot localise even the rung. Insert a synthetic mid-rung; this is the highest-information-per-encode use of the budget because the operator empirically does not know which of "ship A" / "ship B" / "ship a hypothetical mid-rung" is best.

Both transforms run after convex_hull and before select_knees, so the Pareto-frontier invariant is preserved.

Flag surface¶

Flag	Default	Purpose
`--with-uncertainty`	off	Apply uncertainty-aware rung pruning/insertion. Sampled `vmaf_interval` payloads win; point-only rows use a conservative centred interval based on `wide_interval_min_width`.
`--uncertainty-sidecar PATH`	none	Calibration sidecar JSON (same schema as `recommend --uncertainty-sidecar`). Falls back to the documented Research-0067 floor.
`--rung-overlap-threshold F`	0.5	Overlap fraction above which two adjacent rungs are treated as indistinguishable and the lower-bitrate one is dropped.
`--framerate F`	`24.0`	Source frame rate fed to the default corpus sampler. Set to match the real source — the legacy hardcoded `24.0` produced incorrect bitrate math on non-24 fps inputs (ADR-0497).
`--duration S`	`1.0`	Analysed window length in seconds. Drives `bitrate_kbps = encode_size / duration` AND (ADR-0506) bounds the ffmpeg encode pipe and the reference decode to the first `S` seconds via input-side `-t`. Set this to a short value (e.g. `--duration 10`) when smoke-testing the ladder against a long source — pre-ADR-0506 the flag was metadata-only and a 10-second probe against a 9-minute container re-encoded the full source at every CRF in the sweep.
`--pix-fmt FMT`	`yuv420p`	Source pixel format fed to the default corpus sampler (e.g. `yuv422p`, `yuv420p10le`). Symmetric with `compare` / `tune-per-shot` (ADR-0497).
`--crf-sweep CSV`	`18,23,28,33,38`	Comma-separated CRF list overriding the canonical 5-point default sweep. Useful for smoke runs that only need to verify the ladder plumbing (e.g. `--crf-sweep 23,28`, ADR-0497).

The threshold defaults are documented in vmaftune.uncertainty.ConfidenceThresholds: tight_interval_max_width = 2.0 and wide_interval_min_width = 5.0 VMAF, sourced from Research-0067 and shared with the auto-driver F.3 loader. The DEFAULT_RUNG_OVERLAP_THRESHOLD = 0.5 is documented in the same research note.

The two transforms¶

`prune_redundant_rungs_by_uncertainty`¶

Walks the input rungs in ascending-bitrate order. For each adjacent pair (prev, cur), computes the overlap of their conformal intervals divided by the wider interval's width. When the overlap exceeds overlap_threshold, drops prev — the lower-bitrate rung.

The first and last rungs are always retained so the ladder's bitrate range is preserved; select_knees later picks interior rungs from whatever survives.

`insert_extra_rungs_in_high_uncertainty_regions`¶

Walks the rungs in ascending-bitrate order. For each adjacent pair (a, b), computes the pair-averaged interval width. When the averaged width is >= wide_interval_min_width, inserts a synthetic rung at:

Bitrate — geometric midpoint of (a.bitrate, b.bitrate). The geometric mean matches the log-bitrate spacing convention used by HLS authoring (see select_knees spacing="log_bitrate").
VMAF — arithmetic midpoint of (a.vmaf, b.vmaf).
Interval — union of the parent intervals (min(a.low, b.low), max(a.high, b.high)). Conservative on purpose: subsequent encodes refine it.
CRF — rounded average of the parent CRFs.
Resolution — inherited from the higher-quality parent.

`apply_uncertainty_recipe`¶

Composes the two transforms in their canonical order: prune first (so the inserted mid-rungs aren't immediately re-pruned against their parents), then insert.

Worked example — library API¶

from vmaftune.ladder import (
    UncertaintyLadderPoint,
    apply_uncertainty_recipe,
    convex_hull,
    select_knees,
    emit_manifest,
)
from vmaftune.uncertainty import (
    ConfidenceThresholds,
    load_confidence_thresholds,
)

# Sampler emits points with conformal intervals attached.
sampled = [
    UncertaintyLadderPoint(
        width=1920, height=1080, bitrate_kbps=8000.0,
        vmaf=95.5, crf=20, vmaf_low=92.5, vmaf_high=98.5,  # WIDE
    ),
    UncertaintyLadderPoint(
        width=1280, height=720, bitrate_kbps=2500.0,
        vmaf=91.0, crf=24, vmaf_low=88.0, vmaf_high=94.0,  # WIDE
    ),
    UncertaintyLadderPoint(
        width=854, height=480, bitrate_kbps=1200.0,
        vmaf=85.0, crf=27, vmaf_low=84.5, vmaf_high=85.5,  # tight
    ),
]

thresholds = load_confidence_thresholds("calibration.json")
augmented = apply_uncertainty_recipe(sampled, thresholds=thresholds)
# `augmented` may contain synthetic mid-rungs in any wide-interval gap.

hull = convex_hull([p.as_ladder_point() for p in augmented])
rungs = select_knees(hull, n=5, spacing="log_bitrate")
print(emit_manifest(rungs, format="hls"))

Reading the above: the (1080p, 8000 kbps) rung and the (720p, 2500 kbps) rung have intervals of width 6.0 each — averaged width 6.0 is >= wide_min = 5.0, so the recipe inserts a synthetic ~4470 kbps / 93.25 VMAF rung between them. The (480p, 1200 kbps) tight rung is left alone.

Decision rules¶

Transform	Condition	Action
Prune	overlap fraction > `rung_overlap_threshold`	Drop the lower-bitrate rung.
Insert	pair-averaged width >= `wide_interval_min_width`	Add a synthetic mid-rung.

When the sampler ships plain LadderPoint rungs without intervals, --with-uncertainty widens each point to a centred fallback interval whose width is the active wide_interval_min_width. That keeps point-only corpora conservative: pruning still has real overlap data to evaluate, and wide adjacent gaps can receive a synthetic midpoint rung.

Worked example — CLI¶

$ vmaf-tune ladder --src trailer.mp4 --encoder libx264 \
    --resolutions 1920x1080,1280x720,854x480 \
    --target-vmafs 95,90,85 --quality-tiers 5 \
    --with-uncertainty --uncertainty-sidecar calibration.json
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-STREAM-INF:BANDWIDTH=1200000,RESOLUTION=854x480,CODECS="avc1.640028"
rendition_854x480_1200k.m3u8
...

If the sampled corpus rows contain vmaf_interval objects, the CLI attaches those intervals to the post-hull rungs, calls apply_uncertainty_recipe(), and selects knees from the adjusted rung set. If the rows are point-only, --with-uncertainty loads the threshold sidecar and uses its wide_interval_min_width as the fallback interval width before running the same recipe.

What this does NOT change¶

Per the feedback_no_test_weakening project rule, the uncertainty-aware recipe affects which rungs the ladder builder evaluates. It does not change:

The Netflix golden-data assertions (§8 of CLAUDE.md).
The convex-hull invariant (convex_hull is unchanged).
The knee-selection invariant (select_knees is unchanged).
The HLS / DASH / JSON manifest schema (emit_manifest is unchanged).

vmaf-tune ladder — uncertainty-aware ABR ladder construction¶