ADR-1008: Fix C lifecycle bugs — pic_cnt double-increment, div-by-zero in pooled score, silent test failures¶
- Status: Accepted
- Date: 2026-06-04
- Deciders: Lusoris
- Tags:
core,correctness,test,c
Context¶
Three correctness defects identified by the r3/r4 audit:
-
libvmaf.c:2507/2576—vmaf->pic_cntis incremented beforeread_pictures_validate_and_prep/vmaf_sycl_queue_waitsucceeds. If the caller retries on a transient error (e.g.-ENOMEMfrom picture-pool exhaustion),pic_cntis double-incremented, corrupting the FPS calculation and theindex_highvalue passed tovmaf_score_pooled. Same pattern in the SYCL path. -
libvmaf.c:2762-2774—vmaf_feature_score_pooledloops over[index_low, index_high]and incrementspic_cntonly for frames that are not subsampled (n_subsample). Whenn_subsampleis large enough that every frame in the range is skipped,pic_cntstays 0;sum / pic_cntand(double)pic_cnt / i_sumthen divide by zero / produce NaN. -
test_feature_collector.c:236,239—score = 105./score = 109.are assignments (=), not equality comparisons (==). Themu_assertcondition is always non-zero (truthy), so the aggregate-score value checks never exercise the actual score. -
test_framesync.c:77— The framesync verification loop compareddependent_buf[ctr]againstref+distbut the writer storedref+dist+2(line 56). The mismatch triggered on every non-zero-seeded frame, but the error was only printed tostderrviafprintf— invisible to the test harness. The test always "passed" regardless of data corruption.
Decision¶
- Move
vmaf->pic_cnt++to after the success check in bothvmaf_read_picturesandvmaf_read_pictures_sycl. - Add
if (pic_cnt == 0) return -EINVAL;before theswitchinvmaf_feature_score_pooled; also guardi_sum > 0.in the harmonic-mean case. - Fix
score = 105.→score == 105.andscore = 109.→score == 109.intest_feature_collector.c. - Fix the off-by-2 in
test_framesync.c(checkref+dist+2, matching the writer) and replace the silentfprintfwithabort()so data corruption terminates the test process.
Alternatives considered¶
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Leave pic_cnt before validate | No diff | Silent FPS corruption on retry | Rejected |
| Return NaN from pooled score on zero pic_cnt | Preserves partial result | NaN propagates silently into scores | Rejected: -EINVAL is cleaner |
Use assert() in framesync test | Standard | Does not show which byte failed | abort() with fprintf gives more info |
Consequences¶
- Positive: FPS calculation is correct on error retries; pooled score returns an explicit error instead of NaN/Inf; test assertions now exercise real values; framesync data corruption is no longer silent.
- Neutral: No golden-assertion values change; no public ABI change.
- Negative: None.
References¶
- r3/r4 audit findings:
[r3-resource-exhaustion],[r4-test-quality] - ADR-0141 (touched-file cleanup rule)