ADR-1119: Adopt the golusoris fx framework across all vmafx Go binaries
- Status: Accepted
- Date: 2026-06-14
- Deciders: Lusoris
- Tags: go, framework, fx, golusoris, server, controller, node, operator, mcp, vmaf-tune, rc-blocking, fork-local
Context
vmafx ships a fleet of six Go binaries — cmd/vmafx-{server,controller,node,operator,mcp,tune} — plus twelve pkg/ libraries. Each binary hand-rolls its own composition root: bespoke slog setup, a private observability.InitOTel + shutdown dance, manual signal.NotifyContext + errgroup lifecycle, hand-built *http.Server / grpc.NewServer(...), and ad-hoc os.Getenv config. The same ~150 lines of boilerplate are copy-pasted across the fleet and drift independently.
The maintainer's RC condition is that all vmafx Go code adopt the sibling-org framework github.com/golusoris/golusoris — a thin, composable set of go.uber.org/fx modules (config via koanf, slog logging, OTel, chi HTTP, a gRPC server with OTel/logging/recovery interceptors baked in, a controller-runtime operator module, clikit for CLIs, health probes). The maintainer's directive is explicit: adopt the framework fully (every binary becomes an fx.New(...).Run() composition over golusoris modules), and where golusoris is missing a capability vmafx needs, file an issue on the golusoris repo for the maintainer to integrate upstream — rather than designing a vmafx-local workaround.
A read-only design pass (full per-binary plan in .workingdir2/rc/golusoris/PLAN.md) established that the migration is low-risk on dependencies: both repos already pin byte-identical versions of every shared dependency (grpc v1.81.1, otel v1.44.0, controller-runtime v0.24.1, client-go v0.36.2, cobra v1.10.2, MCP SDK v1.6.1, go 1.26.4), so the go.mod merge carries no version-skew risk. The same pass surfaced three real framework gaps (filed as golusoris issues, see References) and one non-code decision (the config env-prefix), captured below.
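The version-skew check described above can be illustrated with a small stdlib-only sketch. It is not the tooling used for the design pass: the parser is deliberately naive, the function names `parseRequires` and `skew` are hypothetical, and the two go.mod bodies are made up for the example (only the version numbers are the ones quoted in this ADR).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isVersion reports whether s looks like a module version ("v" + digit).
func isVersion(s string) bool {
	return len(s) > 1 && s[0] == 'v' && s[1] >= '0' && s[1] <= '9'
}

// parseRequires extracts "module version" pairs from go.mod text.
// It is a naive line parser for illustration only, not a replacement
// for a real go.mod parser.
func parseRequires(gomod string) map[string]string {
	out := make(map[string]string)
	for _, line := range strings.Split(gomod, "\n") {
		// Drop trailing comments such as "// indirect".
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		// Accept the single-line form "require <module> <version>".
		if len(fields) == 3 && fields[0] == "require" {
			fields = fields[1:]
		}
		if len(fields) == 2 && fields[0] != "module" && isVersion(fields[1]) {
			out[fields[0]] = fields[1]
		}
	}
	return out
}

// skew lists the modules present in both maps whose versions differ.
func skew(a, b map[string]string) []string {
	var diffs []string
	for mod, va := range a {
		if vb, ok := b[mod]; ok && va != vb {
			diffs = append(diffs, fmt.Sprintf("%s: %s != %s", mod, va, vb))
		}
	}
	sort.Strings(diffs)
	return diffs
}

func main() {
	// Made-up go.mod bodies; versions are those quoted in the ADR.
	vmafx := `
require (
	google.golang.org/grpc v1.81.1
	go.opentelemetry.io/otel v1.44.0
	github.com/spf13/cobra v1.10.2
)`
	golusoris := `
require (
	google.golang.org/grpc v1.81.1
	go.opentelemetry.io/otel v1.44.0
)`
	diffs := skew(parseRequires(vmafx), parseRequires(golusoris))
	fmt.Println("mismatched shared modules:", len(diffs))
}
```

An empty result from `skew` corresponds to the "no version-skew risk" claim: every module both repos require is pinned to the same version.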
Decision
1. Full fx adoption, pinned golusoris v0.4.0. Every vmafx Go binary is restructured as fx.New(...) over golusoris modules. A shared, vmafx-local internal/app/bootstrap package provides the common stanza (bootstrap.Base = golusoris.Core + otel.Module + fx.Supply(version.Info) and an FxLogger() helper routing fx events onto the golusoris *slog.Logger) so the boilerplate is wired once and imported by all six binaries.
2. Keep the VMAFX_ config env-prefix (do not migrate the deployment surface to golusoris's default APP_). golusoris's config.Module is parameterised, so each binary overrides the prefix with fx.Replace(config.Options{EnvPrefix: "VMAFX_", Delimiter: "."}) ahead of golusoris.Core. This preserves the existing Helm/k8s/dev/Containerfile/docs env contract; a prefix migration would be a large, breaking ops churn for no functional gain.
3. RC-blocking, phased, services first. Phase 0 is this foundation (go get, bootstrap, this ADR). Phase 1 migrates the production services in order vmafx-server → vmafx-controller → vmafx-node → vmafx-operator. Phase 2 migrates the CLI tools (vmafx-mcp, vmafx-tune) and finishes the pkg/ sweep. vmafx-server goes first because it exercises the full common stanza (Core + otel + HTTP + gRPC + health) with no auth interceptor, controller-runtime, embedded queue, or stdio constraint, proving every reusable pattern the later binaries copy.
4. Missing golusoris capabilities are filed upstream, not worked around. The controller's gRPC migration is gated on golusoris#225 (no fx-injectable interceptor hook today; vmafx needs to chain its JWKS auth interceptor); the controller is sequenced after that lands. version/buildinfo (golusoris#226) and the operator SetLogger/webhook-config polish (golusoris#227) are non-blocking: vmafx supplies its own version.Info and a one-line ctrl.SetLogger fx.Invoke in the interim, to be removed when the framework absorbs them.
5. Domain packages stay framework-agnostic. pkg/* (the cgo libvmaf scorer, encoder/probe subprocess wrappers, ladder/report math, rclone storage, ONNX ai registry) do not import *config.Config; values are resolved in each binary's fx.Provide provider and passed as plain typed args. This keeps the libraries unit-testable without fx. The controller's embedded modernc.org/sqlite job queue is kept as domain code: it is a deliberate single-binary design choice, not a gap to backfill with golusoris's Postgres/river Jobs module.
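The framework-agnostic boundary for domain packages can be sketched without fx at all. Everything named here is hypothetical and chosen for illustration: `ScorerOptions`, `NewScorer`, `provideScorer`, the `lookup` type, and the `scorer.model` / `scorer.threads` keys do not come from the vmafx or golusoris source. The point is only the shape: the domain constructor takes plain typed values, and the binary-side provider is the single place that knows about config keys.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// ScorerOptions is a hypothetical plain-typed argument bundle for a domain
// package. Nothing in it refers to fx or to a framework config type.
type ScorerOptions struct {
	ModelPath string
	Threads   int
}

// Scorer stands in for a domain type such as the cgo libvmaf scorer.
type Scorer struct{ opts ScorerOptions }

// NewScorer is the domain constructor. It validates plain values only, so a
// unit test can call it directly without building a dependency graph.
func NewScorer(opts ScorerOptions) (*Scorer, error) {
	if opts.ModelPath == "" {
		return nil, errors.New("scorer: model path is required")
	}
	if opts.Threads < 1 {
		return nil, errors.New("scorer: threads must be >= 1")
	}
	return &Scorer{opts: opts}, nil
}

// lookup abstracts the config source. In a real binary the framework's
// config object would play this role; a plain function keeps the sketch
// stdlib-only.
type lookup func(key string) string

// provideScorer is the binary-side provider: it resolves config values and
// hands typed arguments to the domain package.
func provideScorer(get lookup) (*Scorer, error) {
	threads, err := strconv.Atoi(get("scorer.threads"))
	if err != nil {
		return nil, fmt.Errorf("scorer.threads: %w", err)
	}
	return NewScorer(ScorerOptions{ModelPath: get("scorer.model"), Threads: threads})
}

func main() {
	// Hypothetical config values for the example.
	cfg := map[string]string{"scorer.model": "/models/example.json", "scorer.threads": "4"}
	s, err := provideScorer(func(k string) string { return cfg[k] })
	fmt.Println(s != nil, err)
}
```

Because `NewScorer` never sees the config type, swapping the config backend or the DI container only touches `provideScorer`.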
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Full fx adoption, pinned v0.4.0, phased services-first (chosen) | Eliminates ~150 LOC/binary of duplicated composition; one place to fix lifecycle/otel/health bugs; aligns vmafx with the sibling-org standard; gaps drive upstream improvements | Six binaries to migrate; controller blocked on one upstream fix; an env-prefix decision to document | Selected: matches the maintainer's explicit "full + file-issues-for-gaps" directive and the de-risked dependency alignment |
| Partial adoption (Core + log + otel only, keep hand-rolled servers) | Smaller blast radius | Leaves the duplicated *http.Server/grpc.NewServer boilerplate the framework exists to remove; half-migration drifts | Rejected: does not satisfy "all golang code uses the framework" |
| Float on @main / @latest instead of a pin | Picks up gap fixes (#225/#226/#227) automatically | Non-reproducible builds; a moving target across six in-flight PRs; supply-chain risk | Rejected: pin a tagged release (v0.3.1 initially, now v0.4.0) and bump deliberately when the gap fixes are tagged |
| Migrate env-prefix VMAFX_ → golusoris default APP_ | Uses framework defaults verbatim | Breaking change to every Helm chart, k8s manifest, Containerfile, and doc; large ops churn, zero functional gain | Rejected: override the prefix via fx.Replace and keep the contract |
| Adopt golusoris.Jobs (river/Postgres) for the controller queue | One less bespoke component | Forces a Postgres dependency onto a deliberately single-binary embedded-SQLite design; changes deployment topology | Rejected: keep the embedded queue as domain code (golusoris#99 frames river/Postgres as a deliberate framework strength, not a mandate) |
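To make the env-prefix trade-off concrete, the following stdlib-only sketch shows how a prefix and delimiter might turn environment variables into config keys. The lower-casing and "_"-to-delimiter convention is an assumption made for this example, not something taken from the golusoris source, and `envKey` and `load` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// envKey maps an environment variable name to a delimited config key under a
// given prefix, e.g. ("VMAFX_", ".", "VMAFX_LOG_LEVEL") -> "log.level".
// ASSUMPTION: lower-casing plus "_" -> delimiter is a convention chosen for
// this sketch only.
func envKey(prefix, delim, name string) (string, bool) {
	if !strings.HasPrefix(name, prefix) {
		return "", false // belongs to another application; ignore it
	}
	rest := strings.TrimPrefix(name, prefix)
	if rest == "" {
		return "", false
	}
	return strings.ReplaceAll(strings.ToLower(rest), "_", delim), true
}

// load builds a flat key/value view from "NAME=value" pairs, the format
// returned by os.Environ.
func load(prefix, delim string, environ []string) map[string]string {
	out := make(map[string]string)
	for _, kv := range environ {
		name, value, ok := strings.Cut(kv, "=")
		if !ok {
			continue
		}
		if key, ok := envKey(prefix, delim, name); ok {
			out[key] = value
		}
	}
	return out
}

func main() {
	environ := []string{"VMAFX_LOG_LEVEL=debug", "APP_LOG_LEVEL=warn", "PATH=/usr/bin"}
	fmt.Println(load("VMAFX_", ".", environ)) // only the VMAFX_ variable is picked up
	fmt.Println(load("APP_", ".", environ))   // only the APP_ variable is picked up
}
```

Under this assumed mapping, switching the prefix changes which deployed variables are visible to the binary at all, which is why keeping VMAFX_ leaves existing manifests untouched.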
Consequences
- Positive: a single composition idiom across the fleet; lifecycle, OTel, graceful shutdown, and health probes owned by the framework and fixed once; new binaries start from bootstrap.Base; framework gaps found by a real consumer flow back upstream (golusoris#225/#226/#227).
- Negative: a multi-PR migration that touches every binary's main; the controller is blocked until golusoris#225 ships an interceptor-injection hook (interim: a vmafx-local gRPC provider mirroring the framework chain plus the auth interceptor, deleted once #225 lands); pkg/observability is split (the OTel/logger/shutdown helpers are dropped in favour of framework modules; the Prometheus Metrics struct is kept and renamed pkg/metrics, since golusoris OTel is OTLP, not a Prometheus registry).
- Neutral / follow-ups: the go.mod closure widens (koanf, chi, river, rueidis, casbin enter the graph via the golusoris umbrella aliases), so import sub-packages directly where possible, re-run govulncheck, and watch binary size after Phase 0. The cgo-scorer lifetime under fx OnStop ordering (the gRPC server must drain before the scorer Close()s) is the key correctness risk, handled by the provider dependency edge.
- Update (2026-06-15): the maintainer integrated and closed all four filed gaps and the pin was bumped v0.3.1 → v0.4.0 (popup decision). The v0.4.0 tag carries only #226 (version) + the k8s/operator module; #225 (gRPC ServerOption fx-group injection), #227 (operator SetLogger + webhook config) and #234 (logger reads log.level from the koanf tree) are merged to golusoris main but not yet tagged. So on v0.4.0 each service binary keeps a small interim shim: the VMAFX_LOG_LEVEL/LOG_FORMAT → bare-env bridge for #234 (a levelledLogger decorator in vmafx-tune), and the operator's ctrl.SetLogger + config-gated webhook wiring for #227. The controller (which needs #225) waits for the next golusoris tag. Cutting a v0.4.1/v0.5.0 tag off main would drop every shim and unblock the controller in one step. Decision §4's interim-shim plan therefore still holds on the v0.4.0 tag and is retired tag-by-tag as the fixes land.
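The scorer-lifetime risk noted under the follow-ups rests on stop hooks running in the reverse of start order, which is how go.uber.org/fx orders OnStop relative to OnStart. The sketch below is a stdlib-only stand-in for that behaviour, not fx itself; `hook`, `lifecycle`, and `run` are hypothetical names. It shows why a dependency edge from the gRPC server to the scorer is enough to make the server drain first.

```go
package main

import "fmt"

// hook pairs a start and a stop action, loosely modelled on a lifecycle hook.
type hook struct {
	name  string
	start func() error
	stop  func() error
}

// lifecycle runs start hooks in registration order and stop hooks in
// reverse. It is an illustrative stand-in, not the fx implementation.
type lifecycle struct {
	hooks   []hook
	started int
}

func (l *lifecycle) Append(h hook) { l.hooks = append(l.hooks, h) }

func (l *lifecycle) Start() error {
	for _, h := range l.hooks {
		if err := h.start(); err != nil {
			return fmt.Errorf("start %s: %w", h.name, err)
		}
		l.started++
	}
	return nil
}

// Stop unwinds only the hooks that started, last-started first.
func (l *lifecycle) Stop() []error {
	var errs []error
	for i := l.started - 1; i >= 0; i-- {
		if err := l.hooks[i].stop(); err != nil {
			errs = append(errs, fmt.Errorf("stop %s: %w", l.hooks[i].name, err))
		}
	}
	l.started = 0
	return errs
}

// run wires a scorer and a gRPC server and returns the observed event order.
func run() []string {
	var events []string
	record := func(e string) func() error {
		return func() error { events = append(events, e); return nil }
	}
	lc := &lifecycle{}
	// The server depends on the scorer, so the scorer is constructed first
	// and its hook is registered first.
	lc.Append(hook{"scorer", record("scorer open"), record("scorer close")})
	lc.Append(hook{"grpc", record("grpc serve"), record("grpc drain")})
	if err := lc.Start(); err != nil {
		events = append(events, err.Error())
	}
	for _, err := range lc.Stop() {
		events = append(events, err.Error())
	}
	return events
}

func main() {
	for _, e := range run() {
		fmt.Println(e)
	}
}
```

If the server did not take the scorer as a constructor argument, nothing would pin the registration order, and the scorer could be closed while requests are still in flight.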
References
- golusoris roadmap epic: golusoris#99 (VMAFx pattern extraction).
- Filed gaps: golusoris#225 (gRPC interceptor injection + hard-stop; blocks the controller), golusoris#226 (version/buildinfo module), golusoris#227 (operator SetLogger + webhook config), golusoris#234 (logger reads log.level from the koanf tree).
- Per-binary migration plan: .workingdir2/rc/golusoris/PLAN.md.
- Research digest: Research-1119.
- Source: req (user direction: all golang code must use the github.com/golusoris/golusoris framework; adopt it fully; where the framework lacks a needed capability, open an issue on the golusoris repo for the maintainer to integrate rather than designing a local workaround; RC-blocking, phased with the production services first, pinned initially at v0.3.1, then bumped to v0.4.0 once the maintainer integrated all four filed gaps (#225/#226/#227/#234) and the k8s/operator module landed there).