Chapter 18: Shared Repositories Under Write Pressure


A repository can feel healthy right up until it starts taking a lot of writes. Read-heavy pain usually shows up as slow clone, slow fetch, slow status, or expensive history walks. Push-heavy pain is different, because the server has to accept objects, run policy, move refs, and keep its own storage from drifting into a bad state while other writers do the same thing. The result is that a repository can look fine in quiet periods and turn miserable when CI, automation, merge queues, and humans all start pushing at once.

A Push Does More Than Move a Branch

git receive-pack: Server-side Git process that accepts pushed objects, validates the update request, and applies the ref changes if the push is allowed.

gitprotocol-pack describes push in plain terms. The client tells the server which refs it wants to update, sends the objects needed for those updates, and the server validates everything before changing the refs. On the server side, that path runs through git receive-pack, which is at least four jobs strung together:

receive the incoming pack
validate the proposed update
update the refs
run any follow-on work attached to that update

The misleading case is the tiny push. A developer rebases one branch, pushes a handful of new commits, and expects the operation to be cheap. Often it is, but other times the object transfer is trivial and the time goes into server policy, hooks, ref locking, or maintenance that happened to trigger afterward. When a team reports "push is slow," the useful question is which part of receive-pack is actually expensive, not how much data crossed the wire.
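The validation step is easier to reason about once you see its input. A pre-receive hook reads one line per proposed update on standard input, in the form `<old-oid> <new-oid> <refname>`, and an all-zero object ID means "this side does not exist." The sketch below classifies those lines. The helper names `is_zero_oid` and `classify_updates` are illustrative, not part of Git.

```shell
# Hypothetical helpers: classify the ref updates a pre-receive hook
# receives on stdin, one "<old-oid> <new-oid> <refname>" line each.
is_zero_oid() {
    # True when the ID is non-empty and contains only zeros
    # (so it works for both SHA-1 and SHA-256 repositories).
    case "$1" in
        ''|*[!0]*) return 1 ;;
        *) return 0 ;;
    esac
}

classify_updates() {
    while read -r old new ref; do
        if is_zero_oid "$old"; then
            echo "create $ref"
        elif is_zero_oid "$new"; then
            echo "delete $ref"
        else
            echo "update $ref"
        fi
    done
}
```

Inside a real hook you would call `classify_updates` with the hook's own stdin. Everything a policy check does beyond this loop is time added to the push.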

Incoming Objects Land in Quarantine First

Quarantine directory: Temporary object directory used by receive-pack so incoming objects can be validated before they are migrated into the main object store.

Incoming objects do not go straight into the main object store. They land in a temporary quarantine directory under objects/ and stay there until the pre-receive hook finishes successfully, which keeps failed pushes from leaving junk behind and gives hooks a clean way to reject bad updates before they become part of the repository.

That design means push spikes hit disk twice:

first while the incoming pack is received and staged
then again when accepted objects are migrated into the main store

On a quiet repository this is invisible. Under write pressure it starts to show: temporary disk usage rises, validation work sits in the critical path, and failed pushes still consume I/O even when they leave no lasting objects behind. Quarantine is one reason a repository can look idle from the outside while the server is still busy chewing through push work internally.
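If you want to watch quarantine happen, a throwaway pre-receive hook can record where the incoming objects are staged. This is a minimal sketch, assuming receive-pack exports `GIT_QUARANTINE_PATH` and `GIT_OBJECT_DIRECTORY` to the hook and that hooks live in the repository's own `hooks/` directory (no `core.hooksPath` override). The function name and log path are illustrative, and the exact quarantine directory name should be treated as an implementation detail.

```shell
# Sketch: install a pre-receive hook that records where incoming objects
# are staged. Function name and log location are illustrative assumptions.
install_quarantine_probe() {
    repo=$1   # path to a bare repository
    log=$2    # absolute path of the file the hook appends to
    mkdir -p "$repo/hooks"
    # $log is expanded now; the escaped variables are expanded when the hook runs.
    cat >"$repo/hooks/pre-receive" <<EOF
#!/bin/sh
# Drain stdin so receive-pack never blocks while writing the update list.
cat >/dev/null
{
    echo "quarantine=\${GIT_QUARANTINE_PATH:-unset}"
    echo "object-dir=\${GIT_OBJECT_DIRECTORY:-unset}"
} >>"$log"
exit 0
EOF
    chmod +x "$repo/hooks/pre-receive"
}
```

After a push, the log should show a path under the repository's `objects/` directory that no longer exists once the push completes. Use this on a scratch repository, not a production one.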

Writers Usually Collide at the Ref Layer

Git's push protocol is careful about ref updates. The client does not just say "set main to this new commit." It says, in effect, "I think main still points to this old object ID; if that is still true, move it to this new one." Concurrent pushes to the same ref therefore collide even when the object transfer itself is small, because one writer updates the ref and the next one to arrive carries an out-of-date view of the old object ID and loses.

gitprotocol-pack requires the server to validate that a ref has not changed while the request was being processed before updating it. The same locking is exposed through the reference-transaction hook: by the prepared state, refs are already locked on disk.
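The same compare-and-swap shape is available locally through `git update-ref`, which accepts an expected old value and refuses the update if the ref has moved. The sketch below is a local analogue of what the server does for a push, not the push protocol itself, and `advance_ref_if_unchanged` is a hypothetical helper name.

```shell
# Sketch: advance a ref only if it still points at the value we last saw.
# `git update-ref <ref> <new> <old>` verifies <old> before writing.
advance_ref_if_unchanged() {
    ref=$1
    new=$2
    expected_old=$3
    if git update-ref "$ref" "$new" "$expected_old" 2>/dev/null; then
        echo "moved $ref"
    else
        echo "lost race on $ref" >&2
        return 1
    fi
}
```

Two writers that both read the same old value can both call this, but only the first one succeeds. The second has to re-read the ref and decide what to do, which is exactly the position a rejected pusher is in.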

You do not need a branch with thousands of commits changing per second for this to bite you. You only need:

one hot branch
many writers
policy that forces everyone through the same narrow place

Merge queues, stacked-PR bots, retry-heavy automation, and branch-per-task agents all create that shape easily, and the symptom is often misread as a network flake or a Git race when the cause is usually simpler: too many writers trying to advance the same name. If the repository also creates many short-lived refs, the pressure spills outward into more loose ref files or more packed-refs churn, more reflog traffic, more hook invocations, and slower ref enumeration in surrounding tooling. A repository under write pressure exercises the ref backend much harder, not only the commit history.
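A quick way to see whether short-lived refs are piling up is to count refs by namespace. The sketch below groups ref names by their leading path components. The function name and the default depth of three components are assumptions chosen for illustration.

```shell
# Sketch: count refs grouped by their first N path components
# (default 3, e.g. refs/heads/bot) to spot namespaces that are ballooning.
ref_counts_by_prefix() {
    depth=${1:-3}
    git for-each-ref --format='%(refname)' |
        cut -d/ -f1-"$depth" |
        sort | uniq -c | sort -rn
}
```

Run it inside the server-side repository. A namespace with thousands of entries that nobody recognizes is usually automation that creates refs and never deletes them.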

Small Pushes Can Still Dirty the Repository Fast

One giant release push per day and ten thousand tiny automation pushes per day do not stress the repository the same way, and the second pattern is usually worse. git-config documents receive.unpackLimit: if a push arrives below that threshold, the received objects are unpacked into loose object files, and at or above the threshold the received pack is stored as a pack instead. Either outcome is fine in isolation; the problem is cumulative.

Repeated small pushes can create:

a growing pile of loose objects
many small packs
more follow-on maintenance work to clean the shape back up
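To know which side of the threshold your pushes land on, check the limit the repository actually uses. This sketch follows the fallback that git-config documents: receive.unpackLimit, then transfer.unpackLimit, then the built-in default of 100. `effective_unpack_limit` is a hypothetical helper name.

```shell
# Sketch: report the unpack limit receive-pack would apply in this repo.
# Fallback order per git-config: receive.unpackLimit, then
# transfer.unpackLimit, then the documented default of 100.
effective_unpack_limit() {
    git config --get receive.unpackLimit ||
        git config --get transfer.unpackLimit ||
        echo 100
}
```

If typical automation pushes carry a few dozen objects and the limit is 100, nearly every one of them becomes loose objects rather than a pack.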

Enough churn later, read paths start paying for write behavior. Fetch gets worse because pack layout degraded, bitmaps and graph data stop matching the current shape as cleanly, and object lookups spread across more files than they should. The aggregate write pattern keeps nudging the storage layout away from the shape that made clone and fetch cheap, even though each individual push was small and the repository is not technically broken.

You can get a quick read on that drift with:

git count-objects -vH
ls "$(git rev-parse --git-path objects/pack)"/*.pack 2>/dev/null | wc -l
test -f "$(git rev-parse --git-path objects/pack/multi-pack-index)" && echo "midx: yes" || echo "midx: no"
ls "$(git rev-parse --git-path objects/pack)"/*.bitmap 2>/dev/null || true

Those commands will not explain every slow push, but they will tell you quickly whether the repository is accumulating storage debt while it absorbs all that churn.
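The raw counts mean more next to a threshold. The sketch below compares the loose-object and pack counts from `git count-objects -v` against gc.auto (default 6700) and gc.autoPackLimit (default 50). It is a rough indicator only: Git estimates the loose count differently when it decides whether to run auto-gc, a configured value of 0 disables that check in Git and is not special-cased here, and `storage_debt` is a hypothetical name.

```shell
# Sketch: compare loose-object and pack counts against the auto-gc
# thresholds. Returns non-zero when either count exceeds its threshold.
storage_debt() {
    loose=$(git count-objects -v | awk '$1 == "count:" { print $2 }')
    packs=$(git count-objects -v | awk '$1 == "packs:" { print $2 }')
    loose_limit=$(git config --get gc.auto || echo 6700)
    pack_limit=$(git config --get gc.autoPackLimit || echo 50)
    echo "loose objects: $loose (gc.auto threshold $loose_limit)"
    echo "packs: $packs (gc.autoPackLimit threshold $pack_limit)"
    [ "$loose" -le "$loose_limit" ] && [ "$packs" -le "$pack_limit" ]
}
```

A repository that sits near or over either number most of the day is one where some push is regularly going to trigger cleanup.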

Hooks and Validation Are Often the Whole Problem

Push latency often has very little to do with pack transfer. The server may be running:

a pre-receive hook for whole-push policy
an update hook once per ref
a reference-transaction hook around ref updates
a post-receive hook for notifications, CI fanout, or integration work

That is before optional object checking such as receive.fsckObjects, signed-push verification, or platform-specific branch-protection logic wrapped around Git. When push feels slow, the cause is usually one of those layers rather than the wire:

a hook that shells out five times per ref
a policy check that walks too much history
a post-receive integration path that still blocks the client

The easiest proof is often negative. The pushed data set is tiny, the server CPU is hot, the disk is busy, and the branch moves only after a long pause, all of which point at server work rather than wire time.
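A positive proof is not much harder: wrap each stage of a hook in a timer and log the result. The following is a minimal sketch with one-second resolution. `time_stage` and the `HOOK_TIMING_LOG` variable are hypothetical names, not Git features.

```shell
# Hypothetical helper: run one hook stage and append its wall time
# (whole seconds) and exit status to the file named by HOOK_TIMING_LOG.
time_stage() {
    stage=$1
    shift
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    printf '%s %ss exit=%s\n' "$stage" "$((end - start))" "$status" \
        >>"${HOOK_TIMING_LOG:-/dev/null}"
    # Preserve the wrapped command's status so rejections still reject.
    return "$status"
}
```

A hook would call something like `time_stage policy ./check-policy`, where `./check-policy` stands in for whatever the hook already runs. The wrapped command inherits the hook's stdin, so only one stage per hook can consume the update list directly.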

If the server does a lot of validation after the pack arrives, receive.keepAlive matters too. receive-pack can produce no output while it is processing the pack, and some networks will drop the TCP connection if that quiet period runs too long. Keepalives do not make the push faster; they keep a long validation path from looking like a dead connection.

One tempting escape hatch is git receive-pack --skip-connectivity-check. Without an external validation mechanism, it risks corrupting the repository, so it is a specialized operator tool rather than a general performance flag. The fix for slow push validation is usually one of:

make the hook cheaper
move expensive work out of the blocking path
reduce per-ref fanout
accept the latency cost because the check is actually worth it
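Moving work out of the blocking path can be as simple as having post-receive write down what happened and return. The sketch below appends each update to a spool file for a separate worker to process. The worker is not shown, and `enqueue_updates` and `SPOOL_FILE` are illustrative names rather than Git settings.

```shell
# Sketch of a post-receive body: record the updates and return at once,
# leaving a separate worker (not shown) to drain the spool file.
enqueue_updates() {
    # Abort with a message if the (hypothetical) SPOOL_FILE variable is unset.
    spool=${SPOOL_FILE:?set SPOOL_FILE to a writable path}
    now=$(date +%s)
    # post-receive gets the same "<old> <new> <ref>" lines as pre-receive.
    while read -r old new ref; do
        printf '%s %s %s %s\n' "$now" "$old" "$new" "$ref"
    done >>"$spool"
}
```

The push then pays for one small append instead of CI fanout or notification calls. The trade is that failures in the worker no longer surface to the person pushing, so the worker needs its own monitoring.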

Push-Time Maintenance Has Spiky Costs

Write-heavy repositories need maintenance more, not less, and they suffer more when maintenance lands at the wrong moment. The modern steady-state path is git maintenance, especially the incremental tasks. incremental-repack uses a two-step process specifically to avoid race conditions with concurrent Git commands, and the incremental schedule is not gc on a timer: it spreads the work across commit-graph updates, loose-object cleanup, incremental repacking, and ref packing.

The other path is the unlucky push that winds up paying for auto-maintenance. receive.autogc, which is on by default, causes receive-pack to run git maintenance run --auto after receiving data and updating refs (older Git ran git gc --auto here). Either way, a push can become the moment Git decides the repository needs cleanup, which is fine under light traffic but makes latency spiky and unpredictable under heavy write load, because one pusher winds up paying for background debt.
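If you decide pushes should not pay for cleanup, the configuration side is small. This sketch turns off push-triggered maintenance and selects the incremental strategy, assuming a Git version recent enough to have `git maintenance` and `maintenance.strategy`. The function name is hypothetical.

```shell
# Sketch: stop pushes from paying for cleanup and lean on scheduled,
# incremental maintenance instead. Run inside the server-side repository.
prefer_scheduled_maintenance() {
    # receive-pack will no longer start auto-maintenance after a push.
    git config receive.autogc false
    # Select the incremental task schedule used by
    # `git maintenance run --schedule=<frequency>`.
    git config maintenance.strategy incremental
}
```

This only changes configuration. Something still has to invoke `git maintenance run --schedule=hourly` (and the daily and weekly equivalents), whether that is `git maintenance start` or your own scheduler. Turning off receive.autogc without that in place just lets the debt grow.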

Full git gc is a different class of event. The prune grace period exists partly to reduce corruption risk when gc overlaps another process that is writing objects, and aggressive pruning on a non-quiescent repository is dangerous: do not run --prune=all on a live write-hot repository unless you know exactly why the repository is quiet enough. The operational split is simple. Incremental maintenance is part of normal operations, while full gc is a planned event, and on a repository under real push pressure you should schedule full cleanup deliberately rather than treat it as background hygiene.

What To Check on a Stressed Server

When a shared repository feels bad under push load, I want four kinds of evidence quickly:

receive policy
storage shape
ref count and ref churn
hook surface area

That usually starts with:

git config --show-origin --get-regexp '^(receive\.|transfer\.fsck|gc\.|maintenance\.)' || true
git count-objects -vH
git for-each-ref --format='%(refname)' | wc -l
find "$(git rev-parse --git-path hooks)" -maxdepth 1 -type f -perm -111 ! -name '*.sample' | sort

Those commands answer useful questions immediately:

does receive-pack run extra validation?
can pushes trigger auto-maintenance?
is the repository piling up loose objects or packs?
how many refs exist?
which hooks are even in play?

If the repo is bare and shared, I also care about whether maintenance data is keeping up:

test -f "$(git rev-parse --git-path objects/pack/multi-pack-index)" && echo "midx: yes" || echo "midx: no"
{ test -f "$(git rev-parse --git-path objects/info/commit-graph)" || test -f "$(git rev-parse --git-path objects/info/commit-graphs/commit-graph-chain)"; } && echo "commit-graph: yes" || echo "commit-graph: no"
ls "$(git rev-parse --git-path objects/pack)"/*.bitmap 2>/dev/null || true

None of that proves the hooks are slow or the ref transaction is the bottleneck, but it does tell you whether the repository is running a high-churn workload on top of stale storage, which is a different problem with a different fix. A stressed repository may have a single bottleneck or several stacked together: hook latency, ref contention, and maintenance debt from constant tiny pushes. Work out which ones you have before changing anything.

Treat the Repository Like a Service

A repository under enough writers behaves like a service, with a write path, admission control, lock contention, background compaction, and failure modes caused by timing instead of size. If the workload is mostly pushes, the practical posture is short:

keep blocking hooks short
move fanout and heavy integration work out of the synchronous path where possible
watch ref growth, especially automation-created refs
prefer steady incremental maintenance over occasional heroic cleanup
schedule full gc like planned downtime, not like casual housekeeping

A repository under write pressure can fail without ever becoming formally corrupt. Branch updates start racing, push latency gets jagged, and reads degrade because writes left the object store in a mediocre shape. The repository is carrying more live write traffic than its current policy, maintenance posture, and ref layout were designed to absorb. The next chapter is about instrumenting Git so those problems stop being guesses.