High Performance Git

Section V · Diagnosis and Recovery

Chapter 18

Instrumenting Git


Git already gives you a lot to work with. You can time commands, capture repository layout, trace nested regions, log network chatter, and keep a small baseline for later comparison.


Start With One Reproducible Command

Start with one command you can run again without changing five other things at the same time.

Good examples:

Run the same command several times before you get fancy:

time git status >/dev/null
time git status >/dev/null
time git status >/dev/null
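If you repeat this often, a small wrapper keeps the runs consistent. The sketch below assumes bash, since it relies on the time keyword and the TIMEFORMAT variable. The name time_runs is a hypothetical helper, not a Git command.

```shell
# Hypothetical helper (bash): run a command N times and print the
# wall-clock time of each run on stderr.
time_runs() {
    local runs=$1 i=1
    shift
    while [ "$i" -le "$runs" ]; do
        # TIMEFORMAT controls what the bash `time` keyword prints;
        # %R is elapsed real time in seconds.
        local TIMEFORMAT="run $i: %R s"
        time "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
}

# Example: time_runs 5 git status
```

The first run is often slower than the rest because caches are cold, so read the spread across runs rather than a single number.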

Decide What "Slow" Means In This Context

There is no universal Git number that separates "fine" from "bad." The useful comparison is usually local:

That still leaves a practical question: what counts as enough to investigate?

The point is to anchor the discussion in a real workload. git status taking 1.5 seconds in a very large monorepo may be acceptable for now. The same 1.5 seconds in a small repo after a config change is a regression.

Capture Repository Shape Before You Start Guessing

Timing alone rarely tells you enough.

Take a quick snapshot of the repository you are talking about:

git --version
git count-objects -vH
test -f "$(git rev-parse --git-path objects/info/commit-graph)" && echo "commit-graph: yes" || echo "commit-graph: no"
test -f "$(git rev-parse --git-path objects/pack/multi-pack-index)" && echo "midx: yes" || echo "midx: no"
ls "$(git rev-parse --git-path objects/pack)"/*.bitmap 2>/dev/null || true

git count-objects gives you a fast read on loose objects, packed objects, and pack count. That alone can explain more than you might expect. If the question looks storage-heavy, git verify-pack belongs nearby too.
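As one way to use git verify-pack here, the sketch below ranks objects by the size column of its verbose output. The helper name largest_objects is hypothetical, and it assumes the documented column order of git verify-pack -v (object id, type, size, and so on).

```shell
# Hypothetical helper: read `git verify-pack -v` output on stdin and print
# "size type object-id" for the N largest objects (default 10).
# Summary lines such as "non delta: ..." are skipped by the type filter.
largest_objects() {
    awk '$2 ~ /^(blob|tree|commit|tag)$/ { print $3, $2, $1 }' |
        sort -rn | head -n "${1:-10}"
}

# Example, run inside a repository that has at least one pack:
#   for idx in "$(git rev-parse --git-path objects/pack)"/pack-*.idx; do
#       git verify-pack -v "$idx"
#   done | largest_objects 5
```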

Keep A Small Baseline On Disk

Do not trust yourself to remember what the repository looked like before you changed it.

A tiny baseline kit goes a long way:

mkdir -p perf

git --version >perf/version.txt
git count-objects -vH >perf/count-objects.txt
git config --show-origin --get-regexp '^(core\.fsmonitor|index\.|gc\.|maintenance\.|commitgraph\.|feature\.)' >perf/config.txt || true

You do not need a whole benchmark harness on day one. A few files with version, layout, and config already make later comparisons much less slippery.
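To make the later comparison mechanical, the same files can be written into named directories and diffed. This is a minimal sketch: snapshot is a hypothetical helper, and the perf/before and perf/after directory names are assumptions chosen for illustration.

```shell
# Hypothetical helper: write a small baseline into the given directory.
snapshot() {
    dir=$1
    mkdir -p "$dir"
    git --version >"$dir/version.txt"
    git count-objects -vH >"$dir/count-objects.txt"
}

# Example:
#   snapshot perf/before
#   ... make one change, for example repack or upgrade Git ...
#   snapshot perf/after
#   diff -ru perf/before perf/after
```

A plain diff is enough at this size. It shows at a glance whether the pack count, loose object count, or Git version moved between the two runs.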

Use Trace2 When Timing Stops Being Enough

Timing tells you that a command is slow. Trace2 helps show where the time went.

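Git's Trace2 outputs are the first place to reach when wall-clock timing is not enough. GIT_TRACE2_PERF writes a column-based format with elapsed times and nested regions, meant for reading by eye. GIT_TRACE2_EVENT writes one JSON object per line, meant for tools.

Examples: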

GIT_TRACE2_PERF=/tmp/status.perf git status >/dev/null
GIT_TRACE2_EVENT=/tmp/status.json git status >/dev/null

If the command spawns child processes or disappears into several layers of internal work, Trace2 is usually where the story gets clearer.
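A perf trace can get long, so it helps to pull out only the regions that took the most time. The sketch below assumes the trace was written with GIT_TRACE2_PERF_BRIEF=1, which drops the leading timestamp and source-location columns, and that the remaining columns follow the layout in Git's Trace2 documentation (depth, thread, event, repo, absolute time, relative time, category, message). The helper name slowest_regions is hypothetical.

```shell
# Hypothetical helper: read a brief-format Trace2 perf log on stdin and print
# "elapsed category message" for the N slowest regions (default 10).
# On region_leave lines, column 6 holds the time spent inside the region.
slowest_regions() {
    awk -F'|' '
        $3 ~ /region_leave/ {
            gsub(/ /, "", $6)
            gsub(/^ +| +$/, "", $7)
            gsub(/^ +| +$/, "", $8)
            print $6, $7, $8
        }' | sort -rn | head -n "${1:-10}"
}

# Example:
#   GIT_TRACE2_PERF_BRIEF=1 GIT_TRACE2_PERF=/tmp/status.perf git status >/dev/null
#   slowest_regions 5 </tmp/status.perf
```

Nested regions are counted inside their parents, so the top entries often overlap. Read the list as a map of where to look, not as a sum.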

Use Narrow Traces For Narrow Questions

Sometimes you do not need full Trace2 output. You need one specific argument settled.

A few useful examples:

GIT_TRACE_PERFORMANCE=1 git status >/dev/null
GIT_TRACE_SETUP=1 git status >/dev/null
GIT_TRACE_REFS=1 git for-each-ref --count=5 >/dev/null
GIT_TRACE_PACKET=/tmp/fetch.packet git fetch origin

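Those answer different questions. GIT_TRACE_PERFORMANCE reports how long each Git command took. GIT_TRACE_SETUP shows which Git directory, work tree, and working directory Git settled on after setup. GIT_TRACE_REFS logs operations on the ref database. GIT_TRACE_PACKET records the packets exchanged with the remote.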

Use the narrowest trace that can settle the question in front of you.

Keep Trace Output Separate From Normal Output

Trace output gets messy fast if you mix it with ordinary command output.

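Write traces to files. Git accepts only an absolute path as a trace file target and rejects a relative one with a warning, so build the path from $PWD:

mkdir -p perf

GIT_TRACE2_PERF="$PWD/perf/fetch.perf" git fetch origin >perf/fetch.out 2>perf/fetch.err
GIT_TRACE_PACKET="$PWD/perf/fetch.packet" git fetch origin >/dev/null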

That makes reruns easier and comparisons less annoying. Git appends to an existing trace file, so delete or rename the file between runs.

Git redacts some sensitive values by default when tracing is enabled, but trace files can still contain URLs, branch names, path names, and other identifying context. Treat them like logs, not throwaway terminal noise.

Instrument The Network Separately From The Checkout

Clone and fetch often hide several costs under one command name.

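A slow clone can include several distinct costs: the server enumerating and packing objects, the transfer over the network, indexing the received pack on the client, and writing the working tree at checkout.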

Packet tracing and Trace2 let you separate transport cost from local checkout cost before you start changing bundle strategy, bitmaps, or sparse settings.
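A coarse first split needs no tracing at all: clone without a checkout, then check out as a separate timed step. The sketch below assumes bash for the time keyword. split_clone is a hypothetical helper, and the URL and destination are whatever you pass in.

```shell
# Hypothetical helper (bash): time the transfer and the checkout separately.
split_clone() {
    local url=$1 dest=$2
    # Step 1: fetch objects and refs only; no working tree is written.
    local TIMEFORMAT='transfer: %R s'
    time git clone --quiet --no-checkout "$url" "$dest" || return 1
    # Step 2: populate the index and working tree from HEAD.
    TIMEFORMAT='checkout: %R s'
    time git -C "$dest" checkout --quiet
}

# Example: split_clone https://example.com/repo.git /tmp/repo-test
```

If the transfer dominates, look at the network and the server side. If the checkout dominates, the working tree size and local disk are better suspects.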

Measure Before And After

Once you make a change, rerun the same workload.

The order is simple:

  1. record the starting command and baseline
  2. make one meaningful change
  3. rerun the same command
  4. compare more than one run if the numbers bounce around

That keeps the work honest.
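When the numbers bounce around, comparing medians is steadier than comparing single runs. The sketch below is one way to do that. median is a hypothetical helper that reads one number per line, and the file names in the example are assumptions.

```shell
# Hypothetical helper: print the median of the numbers on stdin, one per line.
median() {
    sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            if (NR % 2) print v[(NR + 1) / 2]
            else print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# Example (bash): collect five timings after a change, then summarize.
#   mkdir -p perf
#   TIMEFORMAT=%R
#   for i in 1 2 3 4 5; do
#       { time git status >/dev/null 2>&1; } 2>>perf/after.times
#   done
#   median <perf/after.times
```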

For most investigations, a short list already covers a lot of real Git performance work: repeated timing of the exact command, git count-objects -vH, one baseline config snapshot, GIT_TRACE2_PERF when the command still feels opaque, GIT_TRACE_PACKET for network questions, and GIT_TRACE_REFS for ref-heavy questions. Instrumentation does not tell you which fix to pick. It gives you better evidence.