ADR-1109: vmafx-node Serve() registers the VmafxScoring gRPC service
- Status: Accepted
- Date: 2026-06-13
- Deciders: lusoris, Claude (Phase-4b RC build-out)
- Tags: go, node, grpc, scoring, streaming, phase4b, fork-local
Context
cmd/vmafx-node/server/server.go::Serve() shipped as a Phase-4b.4 scaffold: it bound a TCP port, logged readiness, and blocked on ctx.Done() without registering any gRPC service. A node started this way accepted connections but answered no RPCs, so it could not actually do work — the worker side of the distributed platform (ADR-0709 / ADR-0713) had a hole where its service surface should be.
Two service-surface shapes are defensible for the node, and the proto is the source of truth on which exist:
- proto/vmafx.proto defines exactly one scoring service, VmafxScoring (Score, ScoreStream, Health).
- cmd/vmafx-controller/proto/controller.proto defines VmafxController, whose Node API (RegisterNode/Heartbeat/PullWork/ReportResult) the node consumes as a client against the controller; it is not a service the node hosts.
ADR-0713 describes the node's controller-pull loop (PullWork → Execute → ReportResult) as a client role. That loop is orthogonal to what the node serves. The decision here is only about the service the node hosts on its own listen port.
Decision
vmafx-node's Serve() registers the VmafxScoring service — Score (unary, file-path), ScoreStream (bidirectional, in-memory per-frame, ADR-0933), and Health. This makes a node a directly-dispatchable scoring endpoint (push model), reusing the same pkg/libvmaf engine the standalone vmafx-server uses, with a graceful-shutdown path (GracefulStop + hard-stop fallback) that respects the node's 30 s SIGTERM budget from ADR-0713.
The node's scoring engine is optional: when Config.Scorer is nil the node still serves Health (so liveness probes and the Phase-4b.4 smoke test pass) and returns codes.FailedPrecondition from the scoring RPCs.
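The optional-engine behaviour can be modelled with a small guard at the top of each scoring handler. The sketch below is an illustration under stated assumptions, not the node's actual handler: Code, statusError, and codeOf are local stand-ins for the gRPC codes and status packages (9 is the numeric value of FAILED_PRECONDITION), and the Scorer interface, newNode, fixedScorer, and the boolean Health result are hypothetical shapes chosen to keep the example dependency-free. Only the Config.Scorer field name comes from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

// Code is a local stand-in for the gRPC status code type.
type Code int

// FailedPrecondition mirrors the numeric value of gRPC FAILED_PRECONDITION.
const FailedPrecondition Code = 9

// statusError is a stand-in for a gRPC status error.
type statusError struct {
	code Code
	msg  string
}

func (e *statusError) Error() string {
	return fmt.Sprintf("rpc error: code = %d desc = %s", e.code, e.msg)
}

// codeOf extracts the code from a statusError, if err is one.
func codeOf(err error) (Code, bool) {
	var se *statusError
	if errors.As(err, &se) {
		return se.code, true
	}
	return 0, false
}

// Scorer is a hypothetical interface for the scoring engine.
type Scorer interface {
	Score(refPath, distPath string) (float64, error)
}

// Config carries the optional engine; a nil Scorer means Health-only.
type Config struct{ Scorer Scorer }

type node struct{ scorer Scorer }

func newNode(cfg Config) *node { return &node{scorer: cfg.Scorer} }

// Health is served regardless of whether an engine is configured.
func (n *node) Health() bool { return true }

// Score refuses the call up front when no engine is configured.
func (n *node) Score(refPath, distPath string) (float64, error) {
	if n.scorer == nil {
		return 0, &statusError{FailedPrecondition, "no scoring engine configured"}
	}
	return n.scorer.Score(refPath, distPath)
}

// fixedScorer is a toy engine that always returns the same score.
type fixedScorer float64

func (f fixedScorer) Score(_, _ string) (float64, error) { return float64(f), nil }

func main() {
	idle := newNode(Config{}) // Health-only deployment
	_, err := idle.Score("ref.yuv", "dist.yuv")
	code, _ := codeOf(err)
	fmt.Println(idle.Health(), code == FailedPrecondition)

	ready := newNode(Config{Scorer: fixedScorer(93.5)})
	score, err := ready.Score("ref.yuv", "dist.yuv")
	fmt.Println(score, err)
}
```

Failing fast with a precondition error, rather than leaving the service unregistered, lets a caller tell "this node has no engine" apart from "this node is unreachable".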
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Register VmafxScoring (chosen) | The only scoring contract the proto defines; reuses the proven vmafx-server handler shape and the shared pkg/libvmaf engine; gives the node a dispatchable endpoint immediately; matches the existing server.go doc ("accepts the controller's job-dispatch calls"). | The node now has two roles (served scoring + future controller-pull client). | Smallest, contract-faithful change that turns the stub into a working service without inventing a new proto. |
| Wire the controller-pull loop (PullWork → Execute → ReportResult) instead | Matches ADR-0713's lifecycle narrative directly. | Much larger change to main.go (controller dial, heartbeat goroutine, retry/backoff); needs a live controller; does not give the node a served surface at all, leaving Serve() still empty of services. | Out of scope for "register the service the node exposes"; the pull loop is a separate, larger work item that can coexist with a served VmafxScoring. |
| Define a new node-only gRPC service in a new proto | Bespoke node API surface. | The proto is the source of truth and defines no such service; inventing one duplicates VmafxScoring for no added meaning today. | Premature; would fork the scoring contract. |
| Register Health only | Minimal; keeps the smoke test green. | Node still cannot score over gRPC, so the hole stays open. | Does not deliver the worker surface the platform needs. |
Consequences
- Positive: a node is now a working push-model scoring endpoint; the ScoreStream streaming surface (ADR-0933) is available on every node, not just the standalone server. Graceful shutdown is consistent with vmafx-server.
- Positive: pkg/libvmaf is the single scoring engine across server and node, with no duplicated cgo path.
- Negative: the node server package now depends on cgo (pkg/libvmaf), which it already did transitively via the executor; the Scorer field is optional so Health-only deployments still work.
- Neutral / follow-up: the controller-pull worker loop (ADR-0713) remains a separate work item. A node can serve VmafxScoring and later run the pull loop; the two are independent.
References
- See ADR-0709 for the controller/node/operator platform context.
- See ADR-0713 for the node's controller-pull client lifecycle (the role orthogonal to what the node serves).
- See ADR-0933 for the ScoreStream contract the node now serves.
- See ADR-0703 for the vmafx-server handler shape this mirrors.
- Source: per user direction, Phase-4b RC build-out task brief, 2026-06-13 ("register the appropriate gRPC service(s) the node is meant to expose").