MCP server: vmaf-mcp
The VMAFX fork ships three MCP surfaces:
- External Python MCP server (vmaf-mcp): wraps the vmaf CLI via subprocess. Stable; in production use. Lives in mcp-server/vmaf-mcp/.
- External Go MCP server (vmafx-mcp): single static binary, same 15 tools, byte-for-byte schema parity with the Python server. Lives in cmd/vmafx-mcp/. Stage 1 (Python preserved alongside). See Go implementation below.
- Embedded MCP server inside libvmaf: runs in-process on the host that loaded libvmaf.so; serves stdio, UDS, and loopback SSE transports with list_features and compute_vmaf. It is the right surface when an embedding host needs an in-process control plane rather than a child vmaf process. Model hot-swap and frame-boundary SPSC draining remain future work. See docs/mcp/embedded.md for build flags, transport limits, and the C API reference.
All three surfaces are additive; running any combination at once is fine. This document covers surfaces 1 and 2. See embedded.md for surface 3.
vmaf-mcp is a Model Context Protocol server that exposes the VMAFx fork's scoring CLI to LLM tooling (Claude Desktop, Cursor, custom MCP clients) over JSON-RPC on stdio. It lives in mcp-server/vmaf-mcp/.
Use it when you want an LLM to:
- score a (reference, distorted) YUV pair and reason about the result,
- enumerate which VMAF models shipped with the build,
- probe which runtime backends (CPU / CUDA / SYCL / Vulkan / HIP / Metal) the local binary can dispatch to,
- run the Netflix benchmark harness and summarise the output,
- evaluate a tiny-AI ONNX regressor against a parquet feature cache on a deterministic split and report PLCC / SROCC / RMSE,
- rank several candidate tiny-AI models on the same split.
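For readers unfamiliar with the three metrics named above, here is a minimal standard-library sketch of how PLCC, SROCC, and RMSE are defined. It is not the server's code: per the comparison table further down, the Python server relies on onnxruntime, pandas, and scipy. The function names `pearson`, `ranks`, and `quality_metrics` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = average_rank
        i = j + 1
    return out


def quality_metrics(predicted, target):
    """Return PLCC, SROCC (Pearson on ranks), and RMSE for two score lists."""
    squared_error = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return {
        "plcc": pearson(predicted, target),
        "srocc": pearson(ranks(predicted), ranks(target)),
        "rmse": math.sqrt(squared_error / len(target)),
    }
```

Ranking candidates on the same split then reduces to sorting their metric dictionaries by `plcc` in descending order, for example `sorted(results, key=lambda r: r["plcc"], reverse=True)`.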
The server executes the repo's own built vmaf binary with an argv list, never a shell string, and refuses any file path that is not under an allowlisted root. See the security model below.
Tool catalogue
| Tool | Purpose | Detail |
|---|---|---|
| vmaf_score | Score one (ref, dis) YUV pair; return the full JSON report | tools.md#vmaf_score |
| vmaf_score_encoded | Score a (ref, dis) encoded video pair via ffmpeg decode | tools.md#vmaf_score_encoded |
| list_models | Enumerate .json / .pkl / .onnx under model/ | tools.md#list_models |
| list_backends | Report which backends the local vmaf binary was built with | tools.md#list_backends |
| probe_backend | Check whether a specific backend is runtime-healthy on this host | tools.md#probe_backend |
| vmaf_version | Return the version string reported by the local vmaf binary | tools.md#vmaf_version |
| run_benchmark | Run testdata/bench_all.sh on a pair | tools.md#run_benchmark |
| run_compare | Wrap vmaf-tune compare: compare codec adapters at target VMAF scores | tools.md#run_compare |
| run_ladder | Wrap vmaf-tune ladder: generate a quality-ladder bitrate report | tools.md#run_ladder |
| run_tune_per_shot | Wrap vmaf-tune per-shot: per-shot CRF/QP tuning | tools.md#run_tune_per_shot |
| eval_model_on_split | Evaluate a tiny-AI ONNX model on a parquet feature cache | tools.md#eval_model_on_split |
| compare_models | Rank several ONNX models on the same split by descending PLCC | tools.md#compare_models |
| list_extractors | Enumerate all VmafFeatureExtractor implementations in core/src/feature/ | tools.md#list_extractors |
| describe_model | Return metadata for a VMAF model by name or path | tools.md#describe_model |
| describe_worst_frames | Score a pair, extract the N worst-VMAF frames as PNGs, and describe visible artefacts via a local VLM | tools.md#describe_worst_frames |
All tools return a single TextContent message whose body is a JSON document. On error the body is still a JSON document, of the form {"error": "<message>"}, so the client can always json.loads() the response.
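A client can rely on that contract to separate results from failures. The sketch below is one way to do it, assuming that a body whose only key is `error` always signals failure; `ToolError` and `parse_tool_body` are hypothetical names, not part of the server.

```python
import json


class ToolError(RuntimeError):
    """Raised when a tool response body is the {"error": ...} document."""


def parse_tool_body(text: str) -> dict:
    """Decode the TextContent body and raise if it reports an error."""
    body = json.loads(text)
    # Assumption: an error body contains exactly one key, "error".
    if isinstance(body, dict) and body.keys() == {"error"}:
        raise ToolError(body["error"])
    return body
```

Because both outcomes are valid JSON, the caller needs no special handling for non-JSON output; the only branch is on the presence of the `error` key.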
Install
From a checkout of the repo:
# 1. build vmaf (Meson + Ninja; see CLAUDE.md §2)
meson setup build -Denable_cuda=false -Denable_sycl=false
ninja -C build
# 2. install the MCP server package
cd mcp-server/vmaf-mcp
pip install -e .
# optional: pull in ML deps for eval_model_on_split / compare_models
pip install -e '.[eval]'
The server binary lands as vmaf-mcp on your PATH. It expects to find the vmaf CLI at build/tools/vmaf relative to the repo root. Override with VMAF_BIN=/abs/path/to/vmaf.
Run
No network ports are opened. The server reads JSON-RPC requests from stdin and writes responses to stdout; diagnostic logs go to stderr.
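To make the stdio exchange concrete, here is a rough sketch of how a client might frame one request. It uses the MCP `tools/call` method and the newline-delimited framing of the MCP stdio transport. The helper name `tool_call_message` is hypothetical, and the example assumes `vmaf_version` accepts an empty argument object; check tools.md for the actual schemas. A real session also performs the MCP `initialize` handshake first, which is omitted here.

```python
import json


def tool_call_message(request_id: int, tool: str, arguments: dict) -> bytes:
    """Build one JSON-RPC 2.0 `tools/call` request as a single stdio line."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, so the message stays on one line;
    # the trailing "\n" is the delimiter between messages.
    return (json.dumps(request, separators=(",", ":")) + "\n").encode("utf-8")


# Assumed call: ask the server for the version reported by the vmaf binary.
message = tool_call_message(1, "vmaf_version", {})
```

The bytes would be written to the server's stdin, and the matching response (same `id`) read back as one line from its stdout.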
Claude Desktop configuration
Drop this into ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%/Claude/claude_desktop_config.json (Windows):
{
"mcpServers": {
"vmaf-local": {
"command": "vmaf-mcp",
"env": {
"VMAF_BIN": "/home/you/dev/vmaf/build/tools/vmaf",
"VMAF_MCP_ALLOW": "/home/you/yuv-corpus:/home/you/renders"
}
}
}
}
A complete example covering the Docker image variant lives in mcp-server/vmaf-mcp/claude-desktop-config-example.json.
Environment variables
| Variable | Purpose | Default |
|---|---|---|
| VMAF_BIN | Absolute path to the vmaf CLI binary | <repo>/build/tools/vmaf |
| VMAF_MCP_ALLOW | Colon-separated extra roots under which file paths are accepted | (empty; only built-in roots) |
| VMAF_MCP_ASYNC | AnyIO backend (asyncio / trio) | asyncio |
Security model
The server is meant to run on the user's own machine, driven by a local LLM client. Even so, any JSON-RPC input could be crafted by the LLM to try to coerce the server into reading arbitrary host paths — so the server enforces a path allowlist:
- Built-in roots (always allowed): testdata/, python/test/resource/, model/.
- Extra roots can be added via VMAF_MCP_ALLOW=<abs-path>[:<abs-path>...].
Any tool argument that names a file (ref, dis, model, features, each member of models) is resolved with Path.resolve() and rejected unless it lands under one of the allowed roots and refers to an existing regular file. .. segments and symlinks that escape the allowlist are rejected by resolution.
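The check described above can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: `allowed_roots` and `check_path` are hypothetical names, and the built-in roots are passed in by the caller rather than discovered from the repo.

```python
import os
from pathlib import Path


def allowed_roots(builtin, env=None):
    """Combine built-in roots with the colon-separated VMAF_MCP_ALLOW list."""
    env = os.environ if env is None else env
    extra = [p for p in env.get("VMAF_MCP_ALLOW", "").split(":") if p]
    return [Path(p).resolve() for p in [*builtin, *extra]]


def check_path(candidate, roots):
    """Resolve a path and reject it unless it is a regular file under a root."""
    # resolve() collapses ".." segments and follows symlinks, so the
    # comparison below sees where the path really points.
    resolved = Path(candidate).resolve()
    if not resolved.is_file():
        raise ValueError(f"not an existing regular file: {candidate}")
    if not any(root in resolved.parents for root in roots):
        raise ValueError(f"path outside allowlist: {candidate}")
    return resolved
```

Resolving before comparing is the important design choice: a string-prefix test on the unresolved path would accept `root/../secret`, whereas the resolved path no longer has the root among its parents.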
The underlying CLI is exec'd with an argv list — never a shell string — so there is no pathway for shell-metacharacter injection.
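The difference between an argv list and a shell string is easy to demonstrate. The sketch below uses the Python interpreter as a stand-in child process, since the exact vmaf command line the server builds is not shown here; `run_argv` is a hypothetical helper.

```python
import subprocess
import sys


def run_argv(argv):
    """Run a command from an argv list; no shell ever parses the arguments."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout


# A hostile "file name" containing shell metacharacters.
hostile = "clip.yuv; echo pwned"

# Stand-in child: a Python one-liner that prints its first argument verbatim.
echo_first_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
```

Calling `run_argv(echo_first_arg)` hands the hostile string to the child as a single argument, so the `;` is never interpreted as a command separator. With a shell string the same input would run a second command.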
See also ai/security.md for the tiny-AI-specific hardening (ONNX operator allowlist, model size cap).
When not to use the MCP server
- Bulk scoring in a pipeline: use the vmaf CLI directly. MCP is request/response; the CLI streams pictures and does not pay JSON-RPC overhead per frame.
- Integration into your own code: use the C API or the Python bindings for an in-process surface.
- CI checks: the Docker image is a better fit than stdio-attached MCP.
MCP shines when the caller is an LLM that benefits from having a tool-calling interface with declared schemas and a JSON-shaped response.
Go implementation: vmafx-mcp
vmafx-mcp is a single static Go binary that exposes the same 15 MCP tools as the Python server with byte-for-byte schema parity (ADR-0704). It is the recommended implementation for deployments that cannot install a Python environment.
Build
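The binary has no runtime dependencies other than the vmaf CLI binary (resolved via VMAF_BIN or the standard search order). The exception is the pair eval_model_on_split / compare_models, which delegate to a python3 subprocess and therefore need a Python environment (see the differences table below).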
Run
The Go binary is wired on the golusoris fx framework (ADR-1119) and is configured entirely through environment variables — there are no CLI flags. Transport selection moved from the removed --transport / --port flags to VMAFX_MCP_TRANSPORT / VMAFX_MCP_HTTP_ADDR.
# Default stdio transport — drop-in replacement for vmaf-mcp
vmafx-mcp
# Streamable-HTTP transport on the default address :3000
VMAFX_MCP_TRANSPORT=http vmafx-mcp
# Streamable-HTTP transport on a custom address
VMAFX_MCP_TRANSPORT=http VMAFX_MCP_HTTP_ADDR=:8080 vmafx-mcp
Migration (ADR-1119). The pre-framework binary used vmafx-mcp --transport http --port 3000. Replace --transport <t> with VMAFX_MCP_TRANSPORT=<t> and --port <N> with VMAFX_MCP_HTTP_ADDR=:<N> (a full listen address, not a bare port). The historical default port 3000 is preserved as the default address :3000.
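For wrapper scripts that still carry the old invocation, the substitution can be automated. The sketch below is a hypothetical helper, not something shipped with the binary. It assumes the space-separated flag form shown in the migration note and does not handle a `--port=3000` spelling.

```python
def flags_to_env(argv):
    """Translate the removed --transport/--port flags into their env vars."""
    env = {}
    args = iter(argv)
    for arg in args:
        if arg == "--transport":
            env["VMAFX_MCP_TRANSPORT"] = next(args)
        elif arg == "--port":
            # The env var takes a full listen address, so prefix the port with ":".
            env["VMAFX_MCP_HTTP_ADDR"] = f":{int(next(args))}"
        else:
            raise ValueError(f"unrecognised argument: {arg}")
    return env


old_invocation = ["--transport", "http", "--port", "3000"]
new_env = flags_to_env(old_invocation)
```

The resulting dictionary can be merged into the child environment before launching vmafx-mcp with no arguments.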
Claude Desktop configuration (Go binary)
{
"mcpServers": {
"vmafx-local": {
"command": "/path/to/vmafx-mcp",
"env": {
"VMAF_BIN": "/home/you/dev/vmaf/build/tools/vmaf",
"VMAF_MCP_ALLOW": "/home/you/yuv-corpus"
}
}
}
}
Differences from the Python server
| Feature | Python (vmaf-mcp) | Go (vmafx-mcp) |
|---|---|---|
| Tool names / schemas | Reference | Byte-for-byte parity |
| Transport | stdio (default), HTTP (PR #1583); --transport / --port flags | stdio (default), HTTP; selected via VMAFX_MCP_TRANSPORT / VMAFX_MCP_HTTP_ADDR env vars (no flags, ADR-1119) |
| VLM descriptions (describe_worst_frames) | SmolVLM / Moondream2 when [vlm] extras installed | Returns placeholder; Stage 2 will add a native VLM bridge |
| eval_model_on_split / compare_models | Native Python (onnxruntime, pandas, scipy) | Delegates to python3 subprocess; requires Python env |
| Binary size | ~50 MB Python env | ~10 MB static binary |
| Startup time | ~300 ms (Python import) | ~10 ms |
Environment variables
Tool-handler variables are the same as the Python server (VMAF_BIN, VMAF_MCP_ALLOW, plus VMAFX_MCP_DIRECT=1 to opt into the direct cgo scoring path; see ADR-0931). On top of those, the fx framework (ADR-1119) adds the config-driven keys below. Config uses the VMAFX_ env prefix with a . koanf delimiter: once the prefix is stripped and the remainder lowercased, every _ in the variable name becomes a . in the koanf key:
| Variable | koanf key | Default | Purpose |
|---|---|---|---|
| VMAFX_MCP_TRANSPORT | mcp.transport | stdio | Transport: stdio or http. |
| VMAFX_MCP_HTTP_ADDR | mcp.http.addr | :3000 | HTTP listen address (used only when transport is http). Full address (:3000), not a bare port. |
| VMAFX_LOG_LEVEL | (bridged to LOG_LEVEL) | INFO | slog level. golusoris#234: bridged to the bare LOG_LEVEL the v0.4.0 log module reads. |
| VMAFX_LOG_FORMAT | (bridged to LOG_FORMAT) | auto | Log handler (auto/tint/json). |
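The name-to-key rule for the two transport variables can be expressed in a few lines. This is a sketch of the mapping as inferred from the table (prefix stripped, lowercased, `_` replaced by `.`), not the framework's own loader; `koanf_key` is a hypothetical name. The two log variables are bridged to bare names instead and do not follow this rule.

```python
def koanf_key(env_name: str, prefix: str = "VMAFX_") -> str:
    """Map a prefixed environment variable name to its dotted config key."""
    if not env_name.startswith(prefix):
        raise ValueError(f"{env_name} does not start with {prefix}")
    # Drop the prefix, lowercase the rest, and turn "_" into the "." delimiter.
    return env_name[len(prefix):].lower().replace("_", ".")


transport_key = koanf_key("VMAFX_MCP_TRANSPORT")
addr_key = koanf_key("VMAFX_MCP_HTTP_ADDR")
```

One consequence of this scheme is that a key segment cannot itself contain an underscore, since every underscore becomes a level separator.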
All framework logging is written to stderr so the stdio JSON-RPC stream on stdout stays uncorrupted.
Tests
TestToolListMatchesPython and TestToolSchemasMatchPython run without any external dependencies. TestVmafScoreTool and TestGoVsPythonOutputParity require the Netflix golden YUVs and the vmaf binary (skipped automatically when absent).
Related
- Tool reference: request/response schemas and error codes for every tool.
- Backend discovery and default allowlist: how list_backends probes compiled-in GPU runtimes and which paths the server accepts without VMAF_MCP_ALLOW (ADR-0511).
- ADR-0100: the per-surface doc bar this page satisfies (MCP tool: what / schema / allowed paths / example / error codes).
- ADR-0704: decision record for the Go port.
- mcp-server/vmaf-mcp/README.md: short-form README kept alongside the Python code.