ADR-0795: Clarify and harden VmafFeatureExtractor.prev_ref thread-safety invariant¶
- Status: Accepted
- Date: 2026-05-29
- Deciders: lusoris, Claude (Anthropic)
- Tags: threading, feature-extractor, batch-threading, correctness
Context¶
VmafFeatureExtractor.prev_ref is a VmafPicture field written by the dispatch layer immediately before calling an extractor's extract() callback and cleared immediately after. The field exists because the VMAF_FEATURE_EXTRACTOR_PREV_REF protocol (used by integer_motion_v2) requires the framework to inject the previous reference frame into the extractor struct rather than passing it as a parameter.
A PR #115 thread-safety audit flagged the pattern as a potential data race when multiple BATCH_THREADING workers write fex->prev_ref concurrently. The audit recommended hoisting prev_ref out of VmafFeatureExtractor and into BatchThreadData (the per-thread TLS data structure) as recommendation #4.
Actual race analysis:
-
BATCH_THREADING path (
threaded_extract_batch_func): each thread has its ownBatchThreadData *td(lazy-allocated in thread-pool TLS, never shared).td->fex_ctx[i]is created viavmaf_feature_extractor_context_create, whichmemcpys the registered extractor into a new heap object. Thereforetd->fex_ctx[i]->fex != registered_fex_ctx[i]->fex— different heap objects. Theprev_refwrite goes to the per-thread copy, not the shared registered fex. No race. -
Non-batch pool path (
threaded_extract_func): each pool slot owns its own deep-copiedVmafFeatureExtractor. The pool serialises access (one thread holds a slot at a time viain_use). Theprev_refwrite and clear are entirely within the holding thread's critical section. No race. -
Sequential path (
read_pictures_dispatch_one): called only whenread_pictures_should_skipreturns false. Withthread_pool != NULL,read_pictures_should_skipreturns true for all non-CUDA/SYCL non-TEMPORAL fex, soread_pictures_dispatch_onenever writesprev_refon any fex that is also dispatched to the batch/pool thread paths. No race.
Despite the absence of an active race, the code was fragile: the fex variable in threaded_extract_batch_func aliased the shared registered extractor pointer with no const annotation, the write used the same variable name as the read-only flag checks, and there was no assertion enforcing that the per-thread copy was distinct from the shared registration.
Decision¶
Apply a defensive hardening in core/src/libvmaf.c without changing the extractor API (VmafFeatureExtractor.prev_ref is kept as the injection point):
-
In
threaded_extract_batch_func: rename the localfex(which pointed to the shared registered extractor) toshared_fexand mark itconst. Add anassert(td->fex_ctx[i]->fex != shared_fex)to enforce at runtime that the per-thread context's fex is a distinct heap object. Add a comment citing this ADR. -
In
threaded_extract_func: add a comment explaining why the pool-slot-ownedfex->prev_refwrite is safe (exclusive pool acquisition + deep copy at slot creation time).
These changes are documentation + assertion only — no semantic change. They make the invariant machine-checked (assert fires in debug builds if the deep-copy contract is broken by a future refactor) and human-visible (const qualifier, rename, comments).
Alternatives considered¶
| Option | Pros | Cons |
|---|---|---|
Add prev_ref array to BatchThreadData indexed by extractor slot | Completely eliminates fex->prev_ref writes in the batch path | Requires allocating a VmafPicture[cnt] per thread; complicates lifecycle of picture refs; extractor still needs fex->prev_ref for the sequential path |
Pass prev_ref as a new extract() parameter | Eliminates the struct-field injection entirely | Breaks the VmafFeatureExtractor public API contract; requires changing every extractor's extract callback and all callers |
Add a mutex around fex->prev_ref reads/writes | Provably safe | Unnecessary — no concurrent writes exist today; adds lock overhead on the hot-path |
| Current approach (rename + const + assert + comments) | Zero runtime overhead; machine-checked invariant; no API change | Does not prevent the race if the deep-copy contract is broken silently |
The chosen approach (const rename + assert + comments) was preferred because the race does not actually exist in the current code and a full API change is not justified. The assert provides a regression guard for the underlying deep-copy invariant.
Consequences¶
Positive:
- The
assertinthreaded_extract_batch_funcfires in debug builds ifvmaf_feature_extractor_context_createever returns a context sharing itsfexpointer with the registered object. const VmafFeatureExtractor *shared_fexmakes it a compile-time error to accidentally write through the shared pointer.- Comments reduce re-investigation cost when future engineers audit threading.
Negative:
- None. This is a documentation + assertion change only.
Neutral follow-ups:
- The fuller recommendation #4 (move
prev_refintoBatchThreadDataas a staged slot) remains open as a future refactor once the extractor API is versioned (VMAFX Phase 4 / ADR-0709).
References¶
- PR #115 thread-safety audit thread (recommendation #4: hoist
prev_refintoBatchThreadData) core/src/libvmaf.c—threaded_extract_batch_func,threaded_extract_funccore/src/feature/integer_motion_v2.c— only PREV_REF consumer in the default model; readsfex->prev_ref.data[0]andfex->prev_ref.stride[0]- ADR-0795 (this document)