ADR-0628: Remote-aware ADR number allocator — cross-worktree collision prevention
- Status: Accepted
- Date: 2026-05-19
- Deciders: lusoris, Claude (Anthropic)
- Tags: adr, tooling, ci, governance, agents, fork-local
Context
On 2026-05-19, multiple parallel agents each running in an isolated worktree collided on the same ADR numbers. PR #1414 claimed ADR-0607; it merged before PR #1415 (which had independently picked 0607), forcing a manual renumber to 0612. PRs #1416 and #1417 both claimed ADR-0608 simultaneously; whichever lands second will require a renumber.
Root cause: the existing allocator (scripts/adr/next-free.sh, introduced by ADR-0535) serialises concurrent agents on the same host via a /tmp/vmaf_adr_claim_lock_* directory lock and a .md.stub file in docs/adr/. But agents in separate worktrees (the mandatory isolation model per feedback_agents_isolated_worktree_only.md) each have their own working tree and share only the common .git/ directory. An agent in worktree A cannot see the stub created by an agent in worktree B unless B has pushed its branch and A has fetched it. In the window between a claim and a push, both agents see the same highest-taken number and independently increment from it, producing identical candidate numbers.
Three concrete failure modes:
- Push-before-claim gap: two agents start, both fetch master, pick the same next number, and write stubs concurrently (the /tmp lock prevents this within the same PID namespace, but separate worktrees may use different /tmp mounts in container environments).
- Stale remote view: an agent fetches origin before another agent has pushed its branch, so the remote scan sees no in-flight ADR.
- Claim-not-pushed: an agent has written a stub but hasn't pushed, so its claim is invisible to the network scan of any sibling agent.
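The race behind these failure modes can be reproduced without git. The sketch below is illustrative only: next_adr_number is a hypothetical helper, not the code in scripts/adr/next-free.sh, and the two file names are made up. It shows that two agents computing max + 1 from the same snapshot pick the same number.

```shell
# Hypothetical helper: print the next ADR number (max + 1, zero-padded) given
# ADR file paths such as docs/adr/0606-some-slug.md as arguments.
next_adr_number() {
  for path in "$@"; do
    printf '%s\n' "${path##*/}"       # keep only the file name
  done |
    awk -F- '$1 ~ /^[0-9]+$/ { n = $1 + 0; if (n > max) max = n }
             END { printf "%04d\n", max + 1 }'
}

# Both agents fetched the same master snapshot; neither sees the other's stub.
snapshot="docs/adr/0605-foo.md docs/adr/0606-bar.md"
agent_a=$(next_adr_number $snapshot)   # unquoted on purpose: split into paths
agent_b=$(next_adr_number $snapshot)
echo "agent A picks $agent_a, agent B picks $agent_b"
```

Both variables end up holding 0607, which is the collision pattern described above.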
Decision
Extend scripts/adr/next-free.sh --claim with two complementary mechanisms, plus an offline fallback and matching --release cleanup:
- .git/adr-claims/<NUMBER> side-pointer (cross-worktree, instant visibility): after writing the .md.stub, write a one-line file at $(git rev-parse --git-common-dir)/adr-claims/<NUMBER> containing the slug, timestamp, and branch name. All worktrees sharing the same .git/ common directory (the standard worktree layout) observe this file immediately, even before a push. The lock key is also migrated to the common .git/ directory so parallel agents in different worktrees of the same repo contend on the same lock.
- Remote-branch scan via git ls-remote + per-SHA git ls-tree (cross-machine, covers pushed branches): before acquiring the lock, call git ls-remote --heads origin in the main shell (not a subshell) to get all open branch refs, then for each non-master SHA run git ls-tree -r to enumerate docs/adr/NNNN-*.md entries. The union of master + local stubs + adr-claims + remote-branch trees forms the taken-number set from which max + 1 is computed.
- Offline fallback: if git ls-remote --heads fails (network unreachable, no origin configured), set REMOTE_OFFLINE=1 in the main shell, emit a "WARNING: could not reach origin — remote branch scan skipped" message to stderr, and continue with local + master + adr-claims only. The number allocated is still collision-free among visible state.
- --release cleans both the stub and the side-pointer so abandoned claims do not permanently block numbers.
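A minimal sketch of the side-pointer claim and its release, assuming a one-line "slug timestamp branch" format. The function names and the ADR_CLAIMS_DIR override are hypothetical (the override exists only so the sketch can run outside a repository). Removing the .md.stub and taking the lock are not shown.

```shell
# Hypothetical helpers; the real next-free.sh may be structured differently.

# Print the claims directory. ADR_CLAIMS_DIR is an illustrative override;
# without it the function must run inside a git repository.
claims_dir() {
  local common
  if [ -n "${ADR_CLAIMS_DIR:-}" ]; then
    printf '%s\n' "$ADR_CLAIMS_DIR"
  else
    common=$(git rev-parse --git-common-dir) || return 1
    printf '%s/adr-claims\n' "$common"
  fi
}

# Record a claim as one line: slug, UTC timestamp, branch name.
# Returns non-zero if the number already has a claim file.
write_claim() {
  local number="$1" slug="$2" branch="$3" dir
  dir=$(claims_dir) || return 1
  mkdir -p "$dir" || return 1
  (
    set -C    # noclobber: the redirect fails if the file already exists
    printf '%s %s %s\n' "$slug" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$branch" \
      > "$dir/$number"
  ) 2>/dev/null
}

# Drop a claim so an abandoned number does not stay blocked.
release_claim() {
  local number="$1" dir
  dir=$(claims_dir) || return 1
  rm -f "$dir/$number"
}
```

Because the file lives under the common git directory, a call such as write_claim 0628 some-slug some-branch in one worktree would be visible to a sibling worktree as soon as the file exists.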
The allocator continues to bias upward (max + 1) rather than filling gaps; this ensures that a claim not yet pushed — but visible via adr-claims — is never clobbered by a subsequent allocator run.
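The upward bias can be illustrated with a small sketch. The helper next_from_union and the sample numbers are assumptions for illustration, not taken from the script.

```shell
# Hypothetical helper: read ADR numbers (one per line) on stdin, print max + 1.
next_from_union() {
  awk '/^[0-9]+$/ { n = $0 + 0; if (n > max) max = n }
       END { printf "%04d\n", max + 1 }'
}

master_numbers="0605 0606"
stub_numbers="0607"
claim_numbers="0609"      # claimed in a sibling worktree, not yet pushed
remote_numbers=""         # nothing pushed yet

# Unquoted on purpose so each number becomes its own line.
next=$(printf '%s\n' $master_numbers $stub_numbers $claim_numbers $remote_numbers | next_from_union)
echo "$next"   # 0610: the free slot 0608 is skipped, not reused
```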
The CI adr-collision-check job in rule-enforcement.yml is extended to also check each PR's added ADR numbers against all currently open PRs (via gh pr list --state open --json headRefName,number), not just against master. This catches cases where the local allocator was run offline or without a network round-trip.
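The cross-PR comparison could look roughly like the sketch below. find_collisions is a hypothetical helper and the input format is an assumption. In CI, the "<pr-number> <adr-number>" lines would be derived from the gh pr list output and each PR's added files, which is not shown here.

```shell
# Hypothetical helper: report ADR numbers that this PR adds and that another
# open PR also adds.
#   $1     this PR's number
#   $2     space-separated ADR numbers added by this PR
#   stdin  lines of "<pr-number> <adr-number>" covering all open PRs
# Returns 1 if any collision was reported, 0 otherwise.
find_collisions() {
  local self="$1" mine=" $2 " pr adr status=0
  while read -r pr adr; do
    [ -n "$adr" ] || continue
    [ "$pr" = "$self" ] && continue     # skip our own PR
    case "$mine" in
      *" $adr "*)
        echo "ADR-$adr is also claimed by PR #$pr"
        status=1
        ;;
    esac
  done
  return "$status"
}

# Sample data mirroring the incident described in the Context section.
open_prs="1416 0608
1417 0608
1415 0612"
report=$(printf '%s\n' "$open_prs" | find_collisions 1417 "0608") || echo "$report"
```

Run from the point of view of PR #1417, this prints that ADR-0608 is also claimed by PR #1416.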
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Server-side reservation (central DB or GitHub label) | Perfectly serialised across machines | Requires network for every claim; adds external dependency; complex to set up | Not suitable for offline dev workflows |
| Single shared worktree (no isolation) | No cross-worktree visibility problem | Violates ADR-0535 isolation rule; agents writing to same working tree cause checkout conflicts | Explicitly rejected by feedback_agents_isolated_worktree_only.md |
| Monotonic counter in a git note | Persists across forks without a DB | Git notes are easy to lose on rebase; requires force-push to update; racy without a lock | Too fragile; rebase notes do not solve the cross-worktree gap |
| Single-machine lock file in common .git/ | Cheap; visible to all worktrees | Does not cover separate machines or container agents with different .git/ mounts | Insufficient for the CI collision pattern (2026-05-19); combined with adr-claims as the intra-machine layer |
| Require all agents to push stubs before claiming | Perfectly visible via remote scan | Requires a push per claim (slow, noisy history); stub commits pollute the log | Unacceptable workflow friction |
Consequences
- Positive: parallel agents on the same host sharing a .git/ common dir will no longer collide; the adr-claims mechanism is instant (no network round-trip). Agents on different machines or containers still benefit from the remote-branch scan covering already-pushed branches. The CI gate now also catches collisions among in-flight PRs before merge.
- Negative: --claim now makes a network call (git ls-remote) that adds ~0.5–2 s to the claim path; the performance budget is < 5 s for 30 branches. Offline agents lose the remote-scan coverage (downgraded to a warning, not a fatal error).
- Neutral / follow-ups: the .git/adr-claims/ directory is created inside the common .git/ dir, which git does not track and git clean does not touch, so no .gitignore entry or git clean exclusion is needed. The CI collision guard update requires the gh CLI to be available in the runner (it is, on ubuntu-latest).
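The offline downgrade can be sketched as follows. scan_remote_heads and non_master_shas are hypothetical names; REMOTE_OFFLINE and the warning text come from the Decision section. The function assigns globals, so it has to be called directly rather than inside a command substitution, which is presumably why the Decision insists on the main shell.

```shell
# Hypothetical helper. Sets two globals in the calling shell:
#   REMOTE_OFFLINE  0 if the remote answered, 1 otherwise
#   REMOTE_HEADS    raw "<sha><TAB><ref>" lines from git ls-remote (empty when offline)
# Call it directly; inside $(...) the assignments would be lost with the subshell.
scan_remote_heads() {
  local remote="${1:-origin}"
  REMOTE_OFFLINE=0
  if ! REMOTE_HEADS=$(git ls-remote --heads "$remote" 2>/dev/null); then
    REMOTE_OFFLINE=1
    REMOTE_HEADS=""
    echo 'WARNING: could not reach origin — remote branch scan skipped' >&2
  fi
  return 0    # offline is a warning, not a fatal error
}

# Print the SHAs of all non-master branches from ls-remote output on stdin.
non_master_shas() {
  awk '$2 != "refs/heads/master" { print $1 }'
}
```

A caller would run scan_remote_heads, then either feed REMOTE_HEADS through non_master_shas for the per-SHA git ls-tree pass or, when REMOTE_OFFLINE is 1, continue with local + master + adr-claims only.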
References
- See ADR-0386 — original collision prevention ADR.
- See ADR-0535 — the atomic allocator this ADR extends.
- Source: req ("Today multiple PRs collided on ADR-0607 and ADR-0608. Root cause: each agent's worktree has its own local view (master tip + local .stub files only). Other agents' open PRs in separate worktrees / remote branches are INVISIBLE.")
- CI gate: .github/workflows/rule-enforcement.yml job adr-collision-check.
- Implementation: scripts/adr/next-free.sh, scripts/adr/tests/test-next-free-remote-aware.sh.