VMAF Fork: Engineering Principles
This document defines the non-negotiable standards for code merged to master. Requirements here are codified in one of: .clang-tidy, .cppcheck-suppressions.txt, .semgrep.yml, .pre-commit-config.yaml, .github/workflows/{lint-and-format,security-scans,supply-chain}.yml. Any rule not yet codified in tooling is tracked as an OPEN item in .workingdir2/OPEN.md.
1. Coding
1.1 NASA/JPL "Power of 10" rules (adapted for C)
- Simple control flow. No goto, no setjmp/longjmp, no recursion (static or dynamic).
- Bounded loops. Every loop has a statically-verifiable upper bound on iterations.
- No dynamic allocation in hot paths. malloc/calloc/realloc are permitted at initialization only; frame-loop code paths are allocation-free.
- Short functions. Function bodies ≤ 60 lines (a single printed page). Enforced by readability-function-size in clang-tidy with LineThreshold: 60.
- Assertion density. ≥ 2 runtime assertions per function on average. Use assert() for invariants; in hot loops where assert() cost matters, use VMAF_ASSERT_DEBUG(), which compiles away in release.
- Minimal scope. Declare variables at the smallest possible scope.
- Check return values. All non-void function returns must be checked or explicitly discarded with (void). Enforced by cert-err33-c.
- Preprocessor restraint. Only header inclusion (#include) and simple macros. No token pasting, no recursive macros, no conditional compilation beyond platform/feature toggles established at build-system level.
- Pointer restraint. Single level of dereferencing in most cases. No function pointers except where essential for dispatch (feature extractor registry, SIMD runtime selection).
- Max strictness. Compile with -Wall -Wextra -Wpedantic -Werror. Static analyzers (clang-tidy, cppcheck, scan-build) must finish with zero findings at enabled levels.
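A minimal sketch of how several of these rules can look together in one translation unit: a statically bounded loop, assertions, entry validation, and a checked or explicitly discarded return value. The function names, MAX_SAMPLES, and the VMAF_ASSERT_DEBUG definition shown here are assumptions for illustration; the fork's actual macro may be defined differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical definition: the real VMAF_ASSERT_DEBUG lives in the fork's
 * headers and may differ. This version compiles away when NDEBUG is set. */
#ifdef NDEBUG
#define VMAF_ASSERT_DEBUG(cond) ((void)0)
#else
#define VMAF_ASSERT_DEBUG(cond) assert(cond)
#endif

enum { MAX_SAMPLES = 4096 }; /* illustrative static loop bound */

/* Sums up to MAX_SAMPLES values. Returns 0 on success, -1 on bad input. */
static int sum_samples(const float *samples, size_t count, double *out)
{
    if (samples == NULL || out == NULL || count > MAX_SAMPLES) {
        return -1; /* early return for validation at function entry */
    }
    assert(count <= MAX_SAMPLES); /* invariant established by the check above */

    double acc = 0.0;
    for (size_t i = 0; i < count; i++) { /* bounded by MAX_SAMPLES */
        VMAF_ASSERT_DEBUG(i < MAX_SAMPLES);
        acc += (double)samples[i];
    }
    *out = acc;
    return 0;
}

/* Checks the return value it depends on; discards the one it does not. */
static int report_sum(const float *samples, size_t count)
{
    double total = 0.0;
    const int err = sum_samples(samples, count, &total);
    if (err != 0) {
        return err;
    }
    (void)printf("sum = %f\n", total); /* explicit discard */
    return 0;
}
```

A caller would treat a non-zero result from report_sum() as a failure and propagate it, rather than ignoring it.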
1.2 JPL Institutional Coding Standard for C: applicable subset
The JPL-C-STD is the 31-rule superset of Power of 10. The following rules extend 1.1 and are additionally enforced:
- Rule 11. Place #include directives only at the top of a file.
- Rule 12. Do not place executable code inside a #included file.
- Rule 13. Do not use reserved names (names beginning with _ followed by an uppercase letter, or __).
- Rule 14. All type casts must be explicit.
- Rule 15. The evaluation order of operands must not matter. No i++ or ++i used within a larger expression; split into separate statements.
- Rule 16. No assignments inside expressions.
- Rule 17. All variables must be initialized before use.
- Rule 18. All global variables must have a single declaration, and it must be in the file that owns them.
- Rule 19. All non-trivial types used in a function's public interface must be defined in a header.
- Rule 20. Use const aggressively. Anything not modified is const.
- Rule 21. Use static on every symbol not referenced outside its translation unit.
- Rule 22. No side effects in conditional expressions.
- Rule 23. All switch statements must have a default case.
- Rule 24. No fall-through in switch statements except where explicitly marked with /* FALLTHROUGH */ or __attribute__((fallthrough)).
- Rule 25. Functions must have a single exit point where practical. Early returns are permitted for validation at function entry.
- Rule 26. Use of pointer arithmetic is restricted to array traversal within bounded loops.
- Rule 27. No use of <setjmp.h>, <signal.h> handlers that do real work, or any non-async-signal-safe function inside a signal handler.
- Rule 28. No variadic functions except standard printf/scanf family.
- Rule 29. No use of <stdarg.h> macros outside of printf/scanf wrappers.
- Rule 30. Banned functions: gets, strcpy, strcat, sprintf, strtok (non-reentrant), atoi, atol, atof (no error reporting), rand (non-cryptographic and non-reproducible), system (outside build scripts). Use: fgets, strncpy (with explicit null-termination) / snprintf / strlcpy, snprintf, strtok_r, strtol with errno checking, arc4random or a seeded xorshift, direct process calls with arg arrays.
- Rule 31. All headers must be self-contained (include what they need) and include-guard protected.
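As an illustration of the Rule 30 replacements, the sketch below swaps atoi for strtol with errno checking and strcpy for a bounded snprintf copy. The helper names parse_int and copy_string are hypothetical and are not taken from the fork's sources.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Replacement for atoi(): parses a base-10 int and reports every failure.
 * Returns 0 on success, -1 on empty input, trailing junk, or overflow. */
static int parse_int(const char *text, int *out)
{
    if (text == NULL || out == NULL) {
        return -1;
    }
    char *end = NULL;
    errno = 0;
    const long value = strtol(text, &end, 10);
    assert(end != NULL); /* strtol always sets end when given a non-NULL endptr */
    if (errno != 0 || end == text || *end != '\0') {
        return -1;
    }
    if (value < INT_MIN || value > INT_MAX) {
        return -1;
    }
    *out = (int)value;
    return 0;
}

/* Replacement for strcpy(): bounded copy that fails instead of truncating. */
static int copy_string(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }
    const int written = snprintf(dst, dst_size, "%s", src);
    if (written < 0 || (size_t)written >= dst_size) {
        return -1; /* encoding error or truncation */
    }
    return 0;
}
```

Treating truncation as an error, rather than silently accepting a shortened string, is a design choice of this sketch; some call sites may prefer to truncate deliberately and document it.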
1.3 SEI CERT C & CERT C++
Full compliance required for:
- INT (Integers): no signed overflow; always check size_t arithmetic and narrowing casts
- STR (Strings): no off-by-one; always check buffer bounds
- MEM (Memory management): every malloc has a matching free; no double-free; no use-after-free (ASan enforced)
- FIO (I/O): check every fopen, fread, fwrite, fclose return
- EXP (Expressions): no undefined behavior; no sequence-point violations
- CON (Concurrency): use atomics correctly (_Atomic); no data races (TSan nightly)
- ENV (Environment): never trust getenv input without validation
Enforcement: all cert-* checks in .clang-tidy are enabled. The noisy clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling check, which recommends the C11 Annex K _s functions (implemented mainly by Microsoft toolchains), is explicitly disabled because those functions do not map to any portable POSIX API.
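The INT and MEM items can be sketched as an overflow-checked size computation in front of an initialization-time allocation. The names checked_mul_size and alloc_plane, and the choice of 16-bit samples, are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Multiplies two size_t values; returns -1 instead of wrapping around. */
static int checked_mul_size(size_t a, size_t b, size_t *out)
{
    assert(out != NULL);
    if (a != 0 && b > SIZE_MAX / a) {
        return -1; /* a * b would exceed SIZE_MAX */
    }
    *out = a * b;
    return 0;
}

/* Allocates a width x height plane of 16-bit samples (init-time only).
 * Returns NULL on overflow or allocation failure; the caller frees it. */
static uint16_t *alloc_plane(size_t width, size_t height)
{
    size_t pixels = 0;
    size_t bytes = 0;
    if (checked_mul_size(width, height, &pixels) != 0) {
        return NULL;
    }
    if (checked_mul_size(pixels, sizeof(uint16_t), &bytes) != 0) {
        return NULL;
    }
    if (bytes == 0) {
        return NULL; /* zero-size planes are rejected, not allocated */
    }
    return (uint16_t *)malloc(bytes);
}
```

Every successful alloc_plane() call needs exactly one matching free(), which is the pairing the MEM item asks ASan to verify.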
1.4 MISRA C:2012 (informative subset)
Applied where it does not conflict with existing libvmaf conventions:
- Rule 8.5 (one external declaration per identifier in a single file)
- Rule 10.1–10.8 (essential type rules — stricter than standard C integer promotions)
- Rule 17.7 (function return values must be used or explicitly discarded) — matches Power of 10 #7
- Rule 21.x (banned functions) — matches JPL rule 30
Not enforced as PR-blocking; informational in review.
1.5 Style
- C: Existing libvmaf conventions preserved (K&R braces, 4-space indent, 100-char columns). Codified in .clang-format.
- C++ (SYCL code): same as C where applicable; RAII encouraged for queue/context wrappers; no exceptions in hot paths.
- Python: PEP 8 + black (line-length 100) + isort + ruff. Codified in pyproject.toml.
- CUDA (.cu): follows C style; kernel names kernel_*; device helpers device_*.
- Shell: shfmt + shellcheck; #!/usr/bin/env bash; set -euo pipefail.
2. Security
See SECURITY.md for reporting policy.
2.1 Input validation at boundaries
- All CLI inputs validated at parse time (see cli_parse.c).
- All public libvmaf API entry points validate pointer non-null + struct version.
- Video frame dimensions bounded; no negative or zero-size frames accepted.
- File paths validated for NULL and checked with access() before open.
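A hedged sketch of the entry-point checks listed above (non-null pointer, struct version, bounded frame dimensions). Everything in it is hypothetical: the struct, the version scheme, and the 16384-pixel limit are illustrative and are not taken from the libvmaf API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical names and limits, for illustration only. */
enum { EXAMPLE_CFG_VERSION = 1, EXAMPLE_MAX_DIM = 16384 };

typedef struct ExampleFrameConfig {
    unsigned version; /* must equal EXAMPLE_CFG_VERSION */
    int width;        /* pixels, 1..EXAMPLE_MAX_DIM */
    int height;       /* pixels, 1..EXAMPLE_MAX_DIM */
} ExampleFrameConfig;

/* Returns 0 if cfg is usable, otherwise a negative errno-style code. */
static int example_validate_config(const ExampleFrameConfig *cfg)
{
    if (cfg == NULL) {
        return -EINVAL;
    }
    if (cfg->version != EXAMPLE_CFG_VERSION) {
        return -EINVAL;
    }
    if (cfg->width <= 0 || cfg->width > EXAMPLE_MAX_DIM) {
        return -EINVAL;
    }
    if (cfg->height <= 0 || cfg->height > EXAMPLE_MAX_DIM) {
        return -EINVAL;
    }
    return 0;
}

/* Pixel count for a config that has already passed validation. */
static size_t example_pixel_count(const ExampleFrameConfig *cfg)
{
    assert(example_validate_config(cfg) == 0);
    return (size_t)cfg->width * (size_t)cfg->height;
}
```

Validating once at the boundary lets internal helpers such as example_pixel_count() rely on assertions instead of repeating the full checks.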
2.2 Memory safety
- ASan + UBSan on every PR (debug builds).
- TSan on nightly cron.
- valgrind memcheck on release-candidate audit.
- _FORTIFY_SOURCE=3 in release builds.
- Stack protector -fstack-protector-strong.
- PIE (-fPIE, -pie) for all binaries.
- RELRO + BIND_NOW: -Wl,-z,relro,-z,now.
2.3 Banned functions
See JPL rule 30 above — enforced by .semgrep.yml custom rules + bugprone-unsafe-functions in clang-tidy.
2.4 Dependency auditing
- trivy scans Dockerfile + filesystem for known CVEs on every PR.
- pip-audit for Python dependencies.
- OSV scanner enabled in CodeQL config.
- Dependabot opens weekly PRs for github-actions / pip / docker.
3. Quality gates (required for PR merge to master)
- ✅ make lint: zero clang-tidy / cppcheck / ruff / black findings
- ✅ make test: all unit + integration tests pass with ASan + UBSan
- ✅ make sec: zero gitleaks / semgrep critical / bandit high / trivy critical findings
- ✅ CodeQL: zero security-and-quality issues at the security-extended suite
- ✅ Conventional commit messages (enforced by commit-msg hook)
- ✅ CI matrix green on Linux/macOS/Windows
- ✅ Netflix source-of-truth golden tests pass (CPU, 3 pairs: 1 normal + 2 checkerboard)
- ✅ GPU-parity matrix gate (T6-8 / ADR-0214): CPU ↔ Vulkan/lavapipe variance across every enabled feature; CUDA / SYCL / hardware-Vulkan advisory until a self-hosted runner registers. See development/cross-backend-gate.md.
- ✅ Coverage ≥ 70% overall, ≥ 85% for security-critical code (validation, parsing, crypto-adjacent)
- ✅ Touched-file lint-clean rule (ADR-0141): every hunk in the PR's diff against its merge base must be lint-clean to the fork's strictest profile; // NOLINT is reserved for load-bearing invariants and must cite the ADR or digest that forces it.
- ✅ State-tracking rule (ADR-0165): every PR that opens, closes, or rules out a Netflix upstream report against the fork updates state.md in the same PR.
- ✅ FFmpeg-patches sync rule (ADR-0186): every PR touching a libvmaf C-API surface, CLI flag, meson_options.txt entry, or any other interface that the in-tree ffmpeg-patches/ patches consume updates the relevant patch file in the same PR.
3.1 Netflix golden-data gate
The fork preserves the three Netflix-authored reference test pairs as the canonical ground-truth gate for VMAF numerical correctness (CPU only):
| # | Type | Reference | Distorted |
|---|---|---|---|
| 1 | normal | src01_hrc00_576x324.yuv | src01_hrc01_576x324.yuv |
| 2 | checkerboard | checkerboard_1920_1080_10_3_0_0.yuv | checkerboard_1920_1080_10_3_1_0.yuv (1-px) |
| 3 | checkerboard | checkerboard_1920_1080_10_3_0_0.yuv | checkerboard_1920_1080_10_3_10_0.yuv (10-px) |
YUVs live in python/test/resource/yuv/. Golden expected scores are hardcoded assertAlmostEqual assertions in python/test/ (primarily quality_runner_test.py, vmafexec_test.py, vmafexec_feature_extractor_test.py, feature_extractor_test.py, result_test.py). These assertions are never modified. They run in CI as a required status check on every PR. They are not run as a pre-commit hook because their runtime is longer than the acceptable pre-commit latency budget.
Fork-added tests (SYCL, CUDA, SIMD snapshots, performance benchmarks) live in separate files and directories, and must not modify or override Netflix golden behavior.
4. Deterministic builds
- SOURCE_DATE_EPOCH respected throughout the build
- All build-time random data (tempdir names, etc.) is seeded from SOURCE_DATE_EPOCH
- Meson pinned to a specific minor version via dev-setup scripts
- Compiler/linker flags logged and archived per release
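One way the seeding rule could be realized, sketched under the assumption that a build helper is written in C: parse the SOURCE_DATE_EPOCH value strictly, then use it to seed a small xorshift generator so that any "random" build-time names are reproducible. The function names and the fallback constant are hypothetical; the fork's build scripts may do this differently or in another language.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parses a SOURCE_DATE_EPOCH-style string (decimal seconds since the epoch).
 * Returns 0 on success, -1 if the text is missing or malformed. */
static int parse_epoch(const char *text, uint64_t *out)
{
    if (text == NULL || out == NULL || text[0] < '0' || text[0] > '9') {
        return -1; /* also rejects signs and leading whitespace */
    }
    char *end = NULL;
    errno = 0;
    const unsigned long long value = strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0') {
        return -1;
    }
    *out = (uint64_t)value;
    return 0;
}

/* xorshift64 needs a non-zero state; fall back to a fixed constant for 0. */
static uint64_t seed_from_epoch(uint64_t epoch)
{
    return (epoch != 0) ? epoch : UINT64_C(0x9E3779B97F4A7C15);
}

/* One step of Marsaglia's xorshift64 generator; state must be non-zero. */
static uint64_t xorshift64(uint64_t *state)
{
    assert(state != NULL && *state != 0);
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}
```

In use, the string would come from getenv("SOURCE_DATE_EPOCH") and pass through parse_epoch() before being trusted, in line with the ENV item in 1.3. Two builds with the same epoch then draw the same sequence.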
5. Supply chain
See phases/03-framework/3c-supply-chain.md:
- SLSA Level 3 provenance attestation on every release
- CycloneDX + SPDX SBOMs on every release
- Keyless cosign signatures via Sigstore / GitHub OIDC
- Transparency log (Rekor) makes signatures publicly verifiable
6. Compliance targets
- OpenSSF Best Practices — Gold badge goal
- OpenSSF Scorecard — ≥ 8.5 score required on master
- OWASP ASVS — Level 2 applicable controls (auth/authz not applicable — library)
- NIST SSDF — PW.4 (design), PW.5 (implement), PW.7 (review), PW.9 (archive) aligned
7. Non-goals
We explicitly do not pursue:
- Full MISRA C compliance (too restrictive for numerical code — informative subset only)
- CII Best Practices "Passing" badge deadline commitment (will earn it when it earns itself)
- FIPS 140 (no cryptography in this library)
- DO-178C (aerospace software — out of scope)
8. Multi-language policy (ADR-0702)
VMAFX uses multiple languages. Each has a clearly bounded role:
| Language | Role | Constraint |
|---|---|---|
| C / C++23 | Core library (core/) — metric engine, feature extractors, GPU backend runtimes | All of §1–§7 apply. C ABI at core/include/libvmaf/ is frozen. New fork-added C files target C23; C++ files target C++23. Netflix-inherited C files are migrated per-TU only when a PR already touches the file. |
| Go (≥ 1.23) | Production CLI binaries and servers (cmd/) | Module root github.com/VMAFx/vmafx. go vet is a required CI gate. New packages must have _test.go coverage before merging. |
| Rust (stable) | libvmaf FFI bindings (bindings/rust/vmafx-sys) and optional feature-extractor pilots (core/src/feature/rust/) | cargo clippy + cargo test are required CI gates. unsafe blocks must be audited and explained in a doc comment. |
| Python | ML training (ai/), dev scripts (scripts/), MCP server scaffolding (mcp-server/), vmaf-tune Python harness (tools/vmaf-tune/) | ruff + mypy strict. No Python in hot-path scoring code. |
| CUDA / SYCL / HIP / Metal / GLSL | GPU compute kernels, inside the C core | Language-specific style rules in docs/backends/. |
Cross-language invariants:
- The C ABI at core/include/libvmaf/ is the single stable interface boundary. Go, Rust, and Python all consume it; none of them may call each other directly.
- A new language is never added to the project without a per-language CI gate (compile check + fast tests) and an ADR documenting the role.
- No production tooling binary embeds ML model weights or training code. That stays in ai/ (Python only).
10. Revising this document
Changes to this document require:
- A dedicated PR titled docs(principles): <summary>
- Two approvals from CODEOWNERS
- A corresponding change to tooling if the standard is newly enforceable in CI
- Linked rationale: either a design doc, a referenced security incident, or a standards-body update