Research digest: `__init__.py` export-completeness audit (2026-05-31)
Companion ADR: ADR-0911
Question
For every `__init__.py` in the fork-added Python tree, does the file:
- Carry the fork's Lusoris + SPDX header (CLAUDE.md §12 r7)?
- Document its sub-modules in the module docstring?
- Declare `__all__` as a machine-readable public-surface contract?

If not, fix the gaps in one focused PR and codify the pattern in an ADR so future packages follow it.
Method
- `find . -name "__init__.py" -not -path "*/build*" -not -path "*/.venv/*" -not -path "*/node_modules/*"`: enumerate every package init in the working tree.
- For each file: count lines, count `__all__` declarations, count `from .` re-exports.
- Classify as upstream-mirror (Netflix copyright → leave byte-identical), test-marker (empty by convention → leave alone), or fork-added (in scope for the audit).
- For each fork-added file, inspect the docstring vs. the actual sibling `.py` files; note staleness.
- Spot-check `from <pkg> import *` callers to confirm no behavioural regression from adding `__all__`.
- Verify each modified package imports cleanly with the right `PYTHONPATH`.
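
The enumeration and counting steps can also be scripted. The following is a minimal sketch, not the script used for this audit. `audit_init`, `find_inits`, and `SKIP_PARTS` are hypothetical names. The directory filter only approximates the `find` globs above (it skips directories named exactly `build`, `.venv`, or `node_modules`), and the header check looks only for the standard `SPDX-License-Identifier` tag, because the exact Lusoris header text is not reproduced in this digest.

```python
import ast
from pathlib import Path

# Assumption: exact directory names, a simplification of the find -not -path globs.
SKIP_PARTS = {"build", ".venv", "node_modules"}


def audit_init(path: Path) -> dict:
    """Collect the per-file facts that the audit tables report."""
    source = path.read_text(encoding="utf-8")
    tree = ast.parse(source)
    all_decls = 0
    re_exports = 0
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, ast.AnnAssign):
            targets = [node.target]
        else:
            targets = []
        if any(isinstance(t, ast.Name) and t.id == "__all__" for t in targets):
            all_decls += 1
        # A relative import such as `from .base import X` has level >= 1.
        if isinstance(node, ast.ImportFrom) and node.level >= 1:
            re_exports += 1
    return {
        "lines": len(source.splitlines()),
        "all_decls": all_decls,
        "re_exports": re_exports,
        "has_spdx": "SPDX-License-Identifier" in source,
        "has_docstring": ast.get_docstring(tree) is not None,
    }


def find_inits(root: Path):
    """Yield every __init__.py under root, skipping the excluded directories."""
    for path in sorted(root.rglob("__init__.py")):
        if not SKIP_PARTS.intersection(path.relative_to(root).parts):
            yield path
```

Calling `audit_init(p)` for each `p` in `find_inits(Path("."))` would give one row of raw counts per file; the upstream-mirror / test-marker / fork-added classification remains a manual judgement.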
Audit results
In scope (fork-added)
| File | Lines | `__all__`? | Re-exports | Header? | Verdict |
|---|---|---|---|---|---|
| `ai/__init__.py` | 12 | no | 0 | yes | fix: add `__all__`, expand docstring with sub-package list |
| `ai/data/__init__.py` | 19 | no | 0 | yes | fix: add `__all__` (docstring already enumerates sub-modules) |
| `ai/train/__init__.py` | 16 | no | 0 | yes | fix: add `__all__`; docstring was also stale (3 of 6 sub-modules listed) |
| `ai/src/vmaf_train/__init__.py` | 3 | no | 0 | no | fix: add SPDX header + expanded docstring + `__all__ = ["__version__"]` |
| `ai/src/vmaf_train/data/__init__.py` | 1 | no | 0 | no | fix: add SPDX header + sub-module list + `__all__` |
| `dev-llm/src/vmaf_dev_llm/__init__.py` | 3 | no | 0 | no | fix: add SPDX header + docstring + `__all__ = ["__version__"]` |
| `mcp-server/vmaf-mcp/src/vmaf_mcp/__init__.py` | 3 | no | 0 | no | fix: add SPDX header + docstring + `__all__ = ["__version__"]` |
| `scripts/lib/__init__.py` | 8 | no | 0 | no | fix: add SPDX header + `__all__` |
Out of scope (upstream-mirror, leave byte-identical)
| File | Reason |
|---|---|
| `compat/python-vmaf/__init__.py` (378 lines) | Netflix copyright header; bulk of the upstream Python harness; rebase-sensitive |
| `compat/python-vmaf/core/__init__.py` | Netflix copyright (Copyright 2016-2020, Netflix, Inc.) |
| `compat/python-vmaf/tools/__init__.py` | Netflix copyright |
| `compat/python-vmaf/script/__init__.py` | upstream-mirror tree |
| `compat/python-vmaf/third_party/__init__.py` | upstream-mirror tree |
| `compat/python-vmaf/third_party/xiph/__init__.py` | upstream-mirror tree |
| `python/vmaf/__init__.py` (27 lines) | Fork-added compatibility shim that deliberately re-imports via `sys.modules` manipulation; adding `__all__` would not be meaningful (it re-imports the compat package wholesale) |
Out of scope (test-marker, empty by convention)
| File | Reason |
|---|---|
| `python/test/__init__.py` (0 lines) | Pytest discovery marker; upstream-mirror anyway |
| `ai/tests/__init__.py` (2 lines, header only) | Pytest discovery marker; adding `__all__` is busywork |
| `tools/external-bench/tests/__init__.py` (0 lines) | Pytest discovery marker |
Already well-formed (no change)
| File | `__all__` size | Notes |
|---|---|---|
| `ai/src/aiutils/__init__.py` | 12 entries | Re-exports concrete symbols; uses `__getattr__` for lazy-import; reference example |
| `ai/src/corpus/__init__.py` | 10 entries | Re-exports concrete symbols from `.base` |
| `ai/src/vmaf_train/models/__init__.py` | 4 entries | Re-exports model classes |
| `tools/vmaf-tune/src/vmaftune/__init__.py` | 1 entry block | Pinned schema-version constants + canonical-6 feature names |
| `tools/vmaf-roi-score/src/vmafroiscore/__init__.py` | 4 entries | Pinned schema + result-key tuple |
| `tools/vmaf-tune/src/vmaftune/codec_adapters/__init__.py` | 21 re-exports | Codec-adapter dispatch table |
Star-import safety check
grep -rEn "from (ai|ai\.data|ai\.train|vmaf_train|vmaf_dev_llm|vmaf_mcp|scripts\.lib) import \*" \
--include="*.py" . | grep -v -E "(\.venv|\.cache|build/)"
→ zero hits. Adding `__all__` to a package that previously had none changes only which names `from pkg import *` binds, so it is a behavioural no-op when no caller star-imports the package. The audit confirmed no caller does.
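
To make the reasoning concrete, here is a small, self-contained sketch of what a star-import binds before and after `__all__` is added. It is illustrative only: `star_names` and `demo_pkg` are hypothetical names, and the throwaway package stands in for the real ones.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def star_names(init_source: str, pkg_name: str = "demo_pkg") -> set:
    """Create a throwaway package and return the names `from pkg import *` binds."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / pkg_name
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text(init_source, encoding="utf-8")
        sys.path.insert(0, tmp)
        try:
            sys.modules.pop(pkg_name, None)  # force a fresh import
            importlib.invalidate_caches()
            namespace = {}
            exec(f"from {pkg_name} import *", namespace)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(pkg_name, None)
    namespace.pop("__builtins__", None)  # added by exec, not by the import
    return set(namespace)


# Without __all__, only names that do not start with an underscore are bound.
before = star_names("helper = 1\n__version__ = '0.1.0'\n")
# With __all__, exactly the listed names are bound, underscores included.
after = star_names("helper = 1\n__version__ = '0.1.0'\n__all__ = ['__version__']\n")
print(before, after)
```

Plain `import pkg` and `from pkg import name` never consult `__all__`, which is why the grep above only needs to rule out star-imports.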
Import-sanity check (after edits)
PYTHONPATH=. python3 -c "import ai; print(ai.__all__)"
→ ['data', 'train']
PYTHONPATH=. python3 -c "import ai.data; print(ai.data.__all__)"
→ ['feature_extractor', 'netflix_loader', 'scores']
PYTHONPATH=. python3 -c "import ai.train; print(ai.train.__all__)"
→ ['dataset', 'eval', 'konvid_pair_dataset', 'qat', 'train', 'train_combined']
PYTHONPATH=ai/src python3 -c "import vmaf_train; print(vmaf_train.__version__)"
→ 0.1.0
PYTHONPATH=ai/src python3 -c "import vmaf_train.data; print(vmaf_train.data.__all__)"
→ ['datasets', 'feature_dump', 'frame_dataset', 'frame_loader', 'manifest_scan', 'splits']
PYTHONPATH=dev-llm/src python3 -c "import vmaf_dev_llm; print(vmaf_dev_llm.__version__)"
→ 0.1.0
PYTHONPATH=mcp-server/vmaf-mcp/src python3 -c "import vmaf_mcp; print(vmaf_mcp.__version__)"
→ 0.1.0
PYTHONPATH=. python3 -c "import scripts.lib; print(scripts.lib.__all__)"
→ ['backlog_tracker']
All eight modified packages import cleanly and expose the expected `__all__` / `__version__` surface.
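
Printing `__all__` shows what is declared, but not that every declared name resolves. A possible follow-up check is sketched below, assuming the convention seen above where `__all__` may list sub-modules that are not imported eagerly. `unresolved_exports` is a hypothetical helper, not part of the repository.

```python
import importlib
import importlib.util


def unresolved_exports(pkg_name: str) -> list:
    """Return the `__all__` entries that are neither attributes nor importable sub-modules."""
    pkg = importlib.import_module(pkg_name)
    missing = []
    for name in getattr(pkg, "__all__", []):
        if hasattr(pkg, name):
            continue  # concrete symbol, or a sub-module that is already loaded
        try:
            spec = importlib.util.find_spec(f"{pkg_name}.{name}")
        except ModuleNotFoundError:
            spec = None  # pkg_name is a plain module, so it has no sub-modules
        if spec is None:
            missing.append(name)
    return missing
```

Run with the same `PYTHONPATH` values as the commands above, an empty list for each touched package would mean a star-import cannot fail on a stale `__all__` entry.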
Stale-docstring note (`ai/train/__init__.py`)
The previous docstring listed 3 sub-modules; the directory actually contained 6:
- `dataset.py` (listed)
- `konvid_pair_dataset.py` (not listed)
- `qat.py` (not listed)
- `train_combined.py` (not listed)
- `eval.py` (listed)
- `train.py` (listed)

Per the user-scope "fix pre-existing inaccuracies in files you touch" rule (CLAUDE.md memory: feedback_fix_preexisting_bugs_too.md), the docstring is refreshed in the same change to enumerate all six.
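
Docstring staleness of this kind can be caught mechanically. The sketch below is one possible check, not the one used in the audit: `undocumented_modules` is a hypothetical helper, it only considers sibling `.py` files (not sub-package directories), and it treats a module as documented if its name appears as a whole word in the docstring.

```python
import ast
import re
from pathlib import Path


def undocumented_modules(init_path: Path) -> list:
    """List sibling modules whose names never appear in the package docstring."""
    tree = ast.parse(init_path.read_text(encoding="utf-8"))
    docstring = ast.get_docstring(tree) or ""
    siblings = sorted(
        p.stem for p in init_path.parent.glob("*.py") if p.name != "__init__.py"
    )
    # Word boundaries keep `train` from being satisfied by `train_combined`.
    return [
        name
        for name in siblings
        if not re.search(rf"\b{re.escape(name)}\b", docstring)
    ]
```

Against the pre-fix state described above, such a check would be expected to report the three unlisted modules.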
Findings summary
- 8 fork-added `__init__.py` files needed work; all fixed in this PR.
- 5 of those 8 also missed the Lusoris + SPDX header (CLAUDE.md §12 r7 pre-existing debt; fixed).
- 1 of those 8 also had a stale docstring (3 of 6 sub-modules listed; fixed).
- 0 callers use `from <pkg> import *` against the touched packages; the change is a no-op at runtime, but adds a machine-readable contract for pyright / IDE auto-import / future consumers.
- 6 fork-added packages were already well-formed; the audit confirms the existing convention and ADR-0911 codifies it.
Decision
See ADR-0911.