ADR-1023: MCP server asyncio correctness - async wrappers for blocking I/O
- Status: Accepted
- Date: 2026-06-04
- Deciders: Lusoris
- Tags: mcp, asyncio, python, correctness
Context
The MCP server (`mcp-server/vmaf-mcp/src/vmaf_mcp/server.py`) runs on a single-threaded asyncio event loop. Several functions that make blocking `subprocess.run` calls were invoked directly from async coroutines, stalling the event loop for the duration of the subprocess. The affected sites, together with two related correctness issues, were:
- `_probe_backends(vmaf)`: called from `_run_vmaf_score`, `_probe_backend`, and `_list_backends` (all async); runs `vmaf --help` via `subprocess.run`.
- `_ffprobe_geometry(path)`: called from `_run_vmaf_score_encoded` (async); runs `ffprobe` via `subprocess.run`.
- `_vmaf_version()`: called from the async `_call_tool` dispatch handler; runs `vmaf --version` and `vmaf --help` via `subprocess.run`.
- `asyncio.gather(...)` in `_run_vmaf_score_encoded` lacked `return_exceptions=True`, so a failure in one decode task propagated immediately and left the other task running unawaited, without surfacing a clear error to the caller.
- `VMAF_MCP_ASYNC` env-var parsing in `main()` accepted arbitrary strings as anyio backend names, causing a confusing `RuntimeError` from anyio on ambiguous values such as `"true"` or `"1"`.
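The stall described above can be reproduced with a minimal sketch. It assumes `time.sleep` as a stand-in for a blocking `subprocess.run` call; `blocking_probe`, `count_ticks`, and `measure` are hypothetical names that do not come from `server.py`. A background ticker counts how often the event loop gets to run while the blocking call is in flight.

```python
import asyncio
import time


def blocking_probe(seconds: float) -> str:
    """Hypothetical stand-in for a blocking subprocess.run call."""
    time.sleep(seconds)
    return "done"


async def count_ticks(stop: asyncio.Event, interval: float = 0.01) -> int:
    """Count how often the event loop schedules this task until stop is set."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(interval)
        ticks += 1
    return ticks


async def measure(offload: bool, seconds: float = 0.3) -> int:
    """Run the blocking probe directly or in a worker thread; return ticker count."""
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # yield once so the ticker can start
    if offload:
        await asyncio.to_thread(blocking_probe, seconds)
    else:
        blocking_probe(seconds)  # stalls the loop: the ticker cannot run meanwhile
    stop.set()
    return await ticker


ticks_direct = asyncio.run(measure(offload=False))
ticks_offloaded = asyncio.run(measure(offload=True))
print(f"direct call: {ticks_direct} ticks, to_thread: {ticks_offloaded} ticks")
```

With the direct call the ticker barely advances, because no other coroutine can be scheduled until the blocking function returns. With `asyncio.to_thread` the ticker keeps running for the whole wait.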
Decision
We will:
- Add `_probe_backends_async(vmaf)`: an async wrapper that returns the cached result on a cache hit and delegates to `asyncio.to_thread(_probe_backends, vmaf)` on a miss. All async call sites use the async wrapper.
- Add `_ffprobe_geometry_async(path)`: an async wrapper that delegates entirely to `asyncio.to_thread(_ffprobe_geometry, path)`.
- Make `_vmaf_version()` and `_list_backends()` async, using `asyncio.to_thread` for the blocking `subprocess.run` calls and `_probe_backends_async` for the help probe.
- Add `return_exceptions=True` to the `asyncio.gather` call in `_run_vmaf_score_encoded` and inspect the results to re-raise the first exception.
- Restrict `VMAF_MCP_ASYNC` to well-defined tokens: `""`/`"asyncio"`/`"0"`/`"false"`/`"no"` → `asyncio.run`; `"1"`/`"true"`/`"yes"`/`"trio"` → `anyio.run(backend="trio")`; any other value is treated as an explicit anyio backend name (e.g. `"uvloop"`).
The synchronous `_probe_backends` and `_ffprobe_geometry` functions are preserved unchanged so that sync call sites (test helpers, offline tooling) continue to work.
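For the `asyncio.gather` change in the decision list, a minimal sketch of the "collect, then re-raise the first exception" pattern follows. `decode` and `decode_both` are hypothetical stand-ins for the two decode tasks in `_run_vmaf_score_encoded`; the real task bodies and error types are not shown in this ADR.

```python
import asyncio


async def decode(name: str, fail: bool) -> str:
    """Hypothetical decode task standing in for the per-input decode step."""
    await asyncio.sleep(0.01)
    if fail:
        raise RuntimeError(f"decode failed for {name}")
    return f"{name}.yuv"


async def decode_both(ref_fails: bool = False, dist_fails: bool = False) -> list[str]:
    # return_exceptions=True makes gather wait for both tasks and hand back
    # exceptions as ordinary results instead of raising on the first failure.
    results = await asyncio.gather(
        decode("reference", ref_fails),
        decode("distorted", dist_fails),
        return_exceptions=True,
    )
    for result in results:
        if isinstance(result, BaseException):
            raise result  # re-raise the first failure, in argument order
    return list(results)
```

Because `gather` returns only after both tasks have finished, no task is left running in the background when the error reaches the caller.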
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Rewrite `_probe_backends` as natively async | Single implementation | Breaks sync callers; needs `asyncio.get_event_loop()` fallback | Unnecessary churn given the thin wrapper pattern |
| Use `loop.run_in_executor` directly | Standard library, no helper | More verbose; `asyncio.to_thread` is idiomatic Python 3.9+ | No advantage over `asyncio.to_thread` |
| NOLINT the blocking sites | Quick | Hides a real bug class | Not a fix |
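To illustrate the verbosity trade-off in the `loop.run_in_executor` row, the sketch below performs the same offload both ways. `blocking_call` is a hypothetical helper, not a function from `server.py`.

```python
import asyncio
import functools


def blocking_call(path: str, *, timeout: float = 5.0) -> str:
    """Hypothetical blocking helper standing in for an ffprobe-style call."""
    return f"probed {path} (timeout={timeout})"


async def via_run_in_executor(path: str) -> str:
    loop = asyncio.get_running_loop()
    # run_in_executor forwards positional arguments only, so keyword arguments
    # have to be bound with functools.partial first.
    bound = functools.partial(blocking_call, path, timeout=2.0)
    return await loop.run_in_executor(None, bound)


async def via_to_thread(path: str) -> str:
    # to_thread forwards *args and **kwargs directly.
    return await asyncio.to_thread(blocking_call, path, timeout=2.0)
```

Both coroutines run the helper in the default thread pool and return the same value; `asyncio.to_thread` needs neither the loop lookup nor the `functools.partial` binding.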
Consequences
- Positive: The event loop is never blocked by an external process. Concurrent MCP tool calls no longer serialise on the subprocess wait.
- Positive: The `asyncio.gather` failure mode is now explicit: a decode error for one input is re-raised with a clear message once both tasks have finished, rather than being masked.
- Positive: `VMAF_MCP_ASYNC` resolves ambiguous values such as `"true"` or `"1"` to a defined backend at startup rather than producing a cryptic anyio error.
- Negative: Marginal complexity increase from the thin async wrapper functions.
- Neutral: The `_probe_backends` cache means the thread-hop cost is paid at most once per binary path per process lifetime.
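The `VMAF_MCP_ASYNC` token mapping from the Decision section can be sketched as a pure function. `resolve_async_backend`, its return shape, and the case-insensitive, whitespace-tolerant matching are assumptions made for illustration; the ADR does not say how `main()` normalises the value.

```python
from typing import Optional

_ASYNCIO_TOKENS = frozenset({"", "asyncio", "0", "false", "no"})
_TRIO_TOKENS = frozenset({"1", "true", "yes", "trio"})


def resolve_async_backend(raw: Optional[str]) -> tuple[str, Optional[str]]:
    """Map a VMAF_MCP_ASYNC value to (runner, anyio backend name).

    Hypothetical helper: the name, the return shape, and the normalisation
    (strip + lower) are assumptions, not taken from server.py.
    """
    token = (raw or "").strip().lower()
    if token in _ASYNCIO_TOKENS:
        return ("asyncio.run", None)
    if token in _TRIO_TOKENS:
        return ("anyio.run", "trio")
    # Any other value is passed through as an explicit anyio backend name.
    return ("anyio.run", token)
```

Keeping the mapping separate from the code that starts the event loop makes each token testable without importing anyio.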
References
- Reported as part of the r5-python-async + r5-integration-boundaries review round.
- Related: ADR-0608 (MCP tool surface), ADR-0988 (JSON serialisation).