High Performance Git

Section III · Storage and Local Scale

Chapter 11

Sparse-Checkout and Sparse-Index


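Large-repository features tend to answer one of two questions: how to transfer less data over the network, or how to make local commands touch less of the data that is already on disk.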

Sparse-checkout and sparse-index live in the second camp.

They reduce the working-tree and index footprint so local commands touch less data.

That matters anywhere one checkout is being asked to represent more of the repository than the task actually needs.

That cost shows up in all the places the earlier chapters have already prepared for:


Sparse-Checkout Narrows the Working Tree

Sparse-Checkout
Git feature that limits the tracked files materialized in the working tree to a selected subset.

Sparse-checkout leaves the repository unchanged and narrows what appears in the working tree, so local commands have less data to scan.

Sparse-checkout subdivides tracked files into two groups: the subset you are focused on, and the rest. The rest remain tracked, but they are not present in the working tree.

Sparse-checkout is selective materialization, not an ignore system or a kind of untracking. The omitted paths are still part of the repository state. They are just not materialized in your checkout right now.

It composes naturally with normal Git semantics:

The gain is local, not semantic. The repository is still the repository; you are just not dragging all of it into your working tree. A lot of monorepo pain is exactly that: dragging.
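A minimal sketch of that distinction, assuming a throwaway repository with hypothetical paths (`src/main.c`, `docs/guide.md`) invented for illustration. It narrows the checkout to `src` and then checks that the omitted path is still tracked, is flagged skip-worktree, is absent from disk, and does not show up as a change.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical layout, invented for this illustration.
mkdir -p src docs
echo 'int main(void) { return 0; }' > src/main.c
echo '# Guide' > docs/guide.md
git add .
git commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src

# Still tracked: the path is listed in the index.
tracked=$(git ls-files -- docs/guide.md)

# Flagged skip-worktree ("S") instead of checked out ("H").
flags=$(git ls-files -t)

# Not materialized: there is no file on disk.
if [ -e docs/guide.md ]; then on_disk=yes; else on_disk=no; fi

# Not a deletion either: status stays clean.
status=$(git status --porcelain)

printf '%s\n' "$flags"
echo "tracked=$tracked on_disk=$on_disk"
```

The `S` tag is how the index records "tracked, but intentionally not in the working tree", which is why the missing file is not reported as deleted.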

Cone Mode Is the Default Shape for Good Reason

Cone Mode
Sparse-checkout mode where users specify directories, and Git expands those into a restricted high-performance pattern set.

Modern sparse-checkout is mostly a cone-mode story. In cone mode, the user specifies directories, while non-cone mode uses gitignore-style patterns.

This matters for both usability and performance. In cone mode, you describe the directories you want, and Git can translate that request into a restricted pattern set that is much easier to evaluate efficiently. Non-cone mode can suffer from inherent quadratic performance problems, because Git may have to test every path against every pattern.
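The restricted pattern set can be inspected directly. The sketch below assumes a throwaway repository with hypothetical directories (`src/app`, `src/lib`, `docs`) and prints both the directory list the user asked for and the patterns Git stored in the worktree's sparse-checkout file.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical layout, invented for this illustration.
mkdir -p src/app src/lib docs
echo 'app' > src/app/main.c
echo 'lib' > src/lib/util.c
echo 'doc' > docs/guide.md
git add .
git commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src/app

# What the user asked for, as a directory list.
requested=$(git sparse-checkout list)

# What Git stored: the expanded pattern set for this worktree.
patterns_file=$(git rev-parse --git-path info/sparse-checkout)
patterns=$(cat "$patterns_file")

printf 'requested:\n%s\n\nstored patterns:\n%s\n' "$requested" "$patterns"
```

The stored file should contain only a few fixed pattern shapes (a top-level `/*`, a `!/*/` exclusion, and directory entries such as `/src/app/`), which is what lets Git match paths by directory lookup rather than general pattern evaluation.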

So for most real large-repository use, the advice is straightforward: use cone mode.

That is part of how sparse-checkout stays fast enough to be practical.

git sparse-checkout init --cone
git sparse-checkout set src docs
git sparse-checkout list

Cone mode is sometimes described too loosely as "check out this directory." It is slightly more structured than that.

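When you specify a directory in cone mode, Git includes every file under that directory, recursively, plus the files immediately under each of its leading directories, including the top level of the repository.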

Cone mode behaves well for real development trees. Rather than materializing one isolated deep leaf, it also keeps the files in the leading directories, which preserves enough path context for a workable local checkout. It is one of those rare Git features that is both more practical and less fussy than the cleverer alternative.
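To see which files a cone actually pulls in, the following sketch builds a small throwaway repository (all paths are hypothetical) and selects a single nested directory, `src/app`. It then lists what is on disk.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical layout, invented for this illustration.
mkdir -p src/app/ui src/lib docs
echo 'readme' > README.md
echo 'all:'   > src/Makefile
echo 'app'    > src/app/main.c
echo 'ui'     > src/app/ui/button.c
echo 'lib'    > src/lib/util.c
echo 'doc'    > docs/guide.md
git add .
git commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src/app

# List the materialized files, skipping Git's own metadata directory.
materialized=$(find . -path ./.git -prune -o -type f -print | LC_ALL=C sort)
printf '%s\n' "$materialized"
```

In this sketch the selected directory arrives recursively (`src/app/main.c`, `src/app/ui/button.c`), the files directly inside the leading directories come along (`README.md`, `src/Makefile`), and sibling directories such as `src/lib` and `docs` stay out of the working tree.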

Sparse Specification, reapply, and Temporary Expansion

Sparse Specification
The set of paths Git currently treats as in-scope for the user's sparse working area.

The sparse specification is the set of paths in the user's area of focus, while the sparsity patterns are the contents of the sparse-checkout file that define the intended subset.

In a clean steady state, those line up. In practice, they can temporarily diverge.

There are several reasons:

Sparse-checkout is a controlled local state, not a perfectly rigid absence contract.

You can inspect that state directly:

git sparse-checkout list
git ls-files --sparse
git status

You correct those transient differences with git sparse-checkout reapply.

So reapply belongs in normal sparse-checkout hygiene, not off to the side as some weird emergency lever. It comes up after merges, rebases, conflicts, or other commands that temporarily materialize paths outside the intended focus. Sparse state is a living thing. Sometimes it wanders.

In practice, that means the workflow is often:

  1. set or add the directories you want
  2. do the work
  3. let Git temporarily expand the sparse specification if necessary during difficult operations
  4. reapply when you want the checkout tightened back to the intended scope
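The tightening step can be sketched in a throwaway repository. Provoking a real merge conflict takes more setup, so this example simulates the temporary widening by hand: it clears the skip-worktree bit on one out-of-scope path and writes the file to disk, which is an assumption standing in for what a merge or rebase might leave behind. The paths are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical layout, invented for this illustration.
mkdir -p src docs
echo 'int main(void) { return 0; }' > src/main.c
echo '# Guide' > docs/guide.md
git add .
git commit -q -m "initial"

# Step 1: set the intended scope.
git sparse-checkout init --cone
git sparse-checkout set src

# Step 3 (simulated): an out-of-scope path becomes materialized.
git update-index --no-skip-worktree docs/guide.md
git checkout-index -u -- docs/guide.md
if [ -e docs/guide.md ]; then widened=yes; else widened=no; fi

# The sparsity patterns themselves have not changed.
patterns_during=$(git sparse-checkout list)

# Step 4: tighten the checkout back to the intended scope.
git sparse-checkout reapply
if [ -e docs/guide.md ]; then after=present; else after=absent; fi
flag_after=$(git ls-files -t -- docs/guide.md)

echo "widened=$widened patterns=$patterns_during after_reapply=$after"
```

Note that `reapply` only removes files that are clean. A path with unstaged changes or unresolved conflicts is left in place with a warning, so it is usually run after the difficult operation has been fully resolved.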

Once you expect some temporary widening, the behavior feels much more coherent:

Sparse-Checkout Reduces Local Cost, Not Transfer Cost

Sparse-checkout reduces working-tree materialization. It does not, by itself, reduce which objects were downloaded into the repository. If you want to reduce transfer volume too, you are in the territory of partial clone and promisor remotes, which come in the next chapter.

So sparse-checkout can make status, add, and ordinary local navigation feel much better without necessarily shrinking the repository's full object store.

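The two features pair well, but they solve different problems: sparse-checkout limits what is materialized in the working tree, while partial clone limits which objects are downloaded in the first place.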

Keeping that split clear makes large-repository tuning much easier.
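A small check makes the split concrete. In this sketch, using a throwaway repository with hypothetical paths, a file outside the sparse scope is missing from disk while its blob is still readable from the local object store.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical layout, invented for this illustration.
mkdir -p src docs
echo 'int main(void) { return 0; }' > src/main.c
echo '# Guide' > docs/guide.md
git add .
git commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src

# The working tree no longer has the file.
if [ -e docs/guide.md ]; then on_disk=yes; else on_disk=no; fi

# The object store still does: read the blob straight from HEAD.
blob_content=$(git cat-file -p HEAD:docs/guide.md)

echo "on_disk=$on_disk"
echo "blob content: $blob_content"
```

In a full clone, that read is purely local. Whether the object would have been downloaded at all is a separate decision, made by the clone's filter rather than by sparse-checkout.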

Sparse-checkout and sparse-index stop feeling mysterious once you ask Git to show the state directly:

git sparse-checkout init --cone
git sparse-checkout set src docs
git sparse-checkout reapply --sparse-index
git ls-files --sparse

That sequence initializes cone mode, narrows the checkout, reapplies the sparse rules while enabling sparse-index, and then prints the index in a way that shows sparse directory entries.

Sparse-Index Reduces Index Surface Too

Sparse-Index
Sparse-checkout mode that represents out-of-scope regions in the index with directory entries instead of every individual file.

Sparse-index comes from a specific observation: even with sparse-checkout reducing the working tree, commands like git status can still be slow in very large repositories because the index may still contain an entry for every tracked file. Sparse-index changes the index representation itself so that out-of-scope directories can be stored as collapsed entries rather than fully expanded file lists.

Sparse-checkout reduces the working tree. Sparse-index extends the same idea into the index.

Sparse-index is a special mode for sparse-checkout that records a directory entry in lieu of all the files underneath that directory, and it is controlled by --[no-]sparse-index on init, set, or reapply.

If the index has to explicitly enumerate millions of out-of-scope paths just to preserve the full repository view, then many commands still pay O(HEAD) costs even when the user is only working in a small region. Sparse-index changes that by letting whole out-of-scope regions collapse into directory entries.

That mode is visible in both configuration and index contents:

git sparse-checkout reapply --sparse-index
git config --get index.sparse
git ls-files --sparse

For sparse work, it helps to see the underlying configuration concretely, keeping in mind that sparse state is usually worktree-local:

git config extensions.worktreeConfig true
git config --worktree core.sparseCheckout true
git config --worktree core.sparseCheckoutCone true
git config --worktree index.sparse true

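Those settings deserve plain-language reasons: extensions.worktreeConfig lets each worktree carry its own configuration, core.sparseCheckout turns sparse-checkout on, core.sparseCheckoutCone selects cone-mode pattern handling, and index.sparse allows the index to be written in sparse form.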

The command-line porcelain is still usually better than editing those knobs by hand:

git sparse-checkout set --cone src docs
git sparse-checkout reapply --sparse-index

The command is safer operationally because it updates the sparse-checkout file, the worktree-local config, the working tree, and the index together. Direct config is most useful when you are inspecting or standardizing the posture on purpose and want the reasons to stay explicit. Hand-editing sparse knobs is not forbidden. It is just a good way to meet your future self in a bad mood.
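One way to confirm what the porcelain touched is to read the state back afterward. This sketch assumes a throwaway repository with hypothetical paths. It runs the porcelain and then checks the worktree-local config, the sparse-checkout file, and the working tree.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical layout, invented for this illustration.
mkdir -p src docs
echo 'int main(void) { return 0; }' > src/main.c
echo '# Guide' > docs/guide.md
git add .
git commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src

# Config: per-worktree config is enabled, and the sparse settings live there.
wt_config=$(git config --get extensions.worktreeConfig)
sparse_on=$(git config --worktree --get core.sparseCheckout)
cone_on=$(git config --worktree --get core.sparseCheckoutCone)

# Pattern file: written under this worktree's Git directory.
patterns_file=$(git rev-parse --git-path info/sparse-checkout)
if [ -s "$patterns_file" ]; then patterns_written=yes; else patterns_written=no; fi

# Working tree: already narrowed to match.
if [ -e docs/guide.md ]; then docs_on_disk=yes; else docs_on_disk=no; fi

echo "worktreeConfig=$wt_config sparseCheckout=$sparse_on cone=$cone_on"
echo "patterns_written=$patterns_written docs_on_disk=$docs_on_disk"
```

Setting the same config keys by hand would leave the pattern file and the working tree untouched until something else updated them, which is the gap the porcelain closes.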

When there is a big imbalance between how many files exist at HEAD and how many are actually populated, sparse-index aims to take key index operations from O(HEAD) to O(Populated). Sparse-index is more than a compact serialization trick. It is an attempt to change the scaling law for common local commands when a sparse working tree is much smaller than the full repository.

It achieves that improvement by adding sparse-directory entries. In ordinary full indexes, the index is primarily a list of files. In a sparse index, some out-of-scope regions are represented as directory-level entries pointing at trees instead of every file underneath those trees being expanded into separate index entries.
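The sparse-directory entries can be observed in a small throwaway repository. This sketch assumes a Git new enough to support `git ls-files --sparse` and skips itself otherwise. The file layout is hypothetical, with eight tracked files of which only `src/` is in focus.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical layout: 8 tracked files, only src/ is in focus.
mkdir -p src docs tools
echo 'readme' > README.md
for f in a b; do echo "$f" > "src/$f.c"; done
for f in one two three; do echo "$f" > "docs/$f.md"; done
for f in x y; do echo "$f" > "tools/$f.sh"; done
git add .
git commit -q -m "initial"

# Assumption: --sparse on ls-files needs a reasonably recent Git.
if git ls-files --sparse >/dev/null 2>&1; then
    supported=yes
    git sparse-checkout init --cone --sparse-index
    git sparse-checkout set src

    # Default output expands the index back to one line per file.
    expanded_count=$(git ls-files | wc -l | tr -d ' ')

    # --sparse shows entries as stored; collapsed directories end in "/".
    sparse_entries=$(git ls-files --sparse)
    sparse_count=$(printf '%s\n' "$sparse_entries" | wc -l | tr -d ' ')

    printf '%s\n' "$sparse_entries"
    echo "expanded: $expanded_count entries, stored: $sparse_count entries"
else
    supported=no
    echo "skipping: this Git does not support ls-files --sparse"
fi
```

In the stored form, out-of-scope directories such as `docs/` should appear as single entries with a trailing slash, while in-scope files under `src/` remain listed individually.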

Sparse-index belongs conceptually with Chapters 2 and 8 on trees and the index:

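The sparse-index doc frames the scale problem around three dimensions: HEAD (how many files are present at HEAD), Populated (how many files are populated in the working directory), and Modified (how many files the user has modified in the working directory).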

If HEAD is huge and the populated set is relatively small, then commands like status and add can be dominated by index parsing and rewriting that scales with all of HEAD, not with the checkout you are actually using. Sparse-index is meant to pull those commands back toward the populated set instead. This matters most in monorepos because the scale imbalance is real.
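A rough way to size that imbalance for a given checkout is to count both sets. The sketch below uses a throwaway repository with a hypothetical eight-file layout and treats "populated" as tracked entries without the skip-worktree flag, which is a simplifying assumption for illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical layout: 8 tracked files, only src/ is in focus.
mkdir -p src docs tools
echo 'readme' > README.md
for f in a b; do echo "$f" > "src/$f.c"; done
for f in one two three; do echo "$f" > "docs/$f.md"; done
for f in x y; do echo "$f" > "tools/$f.sh"; done
git add .
git commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src

# HEAD: every file recorded in the current commit.
head_count=$(git ls-tree -r --name-only HEAD | wc -l | tr -d ' ')

# Populated: tracked entries that are not flagged skip-worktree ("S").
populated_count=$(git ls-files -t | grep -c -v '^S ')

echo "HEAD=$head_count populated=$populated_count"
```

The wider the gap between the two numbers in a real repository, the more a command that scales with HEAD is paying for paths nobody is working on.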

Compatibility Still Matters

Sparse-index is powerful, but it is also more structural than plain sparse-checkout. Compatibility and staged rollout matter for exactly that reason.

Sparse-index is not "free speed" in the abstract. It is a mode that works best when the command set you rely on is well integrated with it.

Sparse-index has matured since its introduction, but the right recommendation is still measured:

That caution keeps performance wins from becoming workflow surprises. Fast is good. Fast and weird is exhausting.

Why Sparse-Checkout and Worktrees Fit Together

The earlier worktree chapter and this chapter reinforce each other.

One worktree can represent one task. Sparse-checkout lets that task materialize only the directories it actually needs. Sparse-index keeps the index cost closer to that task's real local footprint.

That is a much better local model than:

A practical modern pattern is:

In that configuration, each checkout is shaped by the work it serves rather than passively inherited from the full repository.
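A minimal sketch of one task per worktree with its own sparse scope, assuming a throwaway repository. The branch name `task`, the directory names, and the file paths are hypothetical. The main worktree stays fully populated while the linked worktree is narrowed to `src`.

```shell
set -eu
base=$(mktemp -d)
git init -q "$base/main"
cd "$base/main"
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical layout, invented for this illustration.
mkdir -p src docs
echo 'int main(void) { return 0; }' > src/main.c
echo '# Guide' > docs/guide.md
git add .
git commit -q -m "initial"

# One worktree per task, on its own branch.
git worktree add -q -b task "$base/task"

# Narrow only the task worktree; the sparse settings are worktree-local.
git -C "$base/task" sparse-checkout init --cone
git -C "$base/task" sparse-checkout set src

task_scope=$(git -C "$base/task" sparse-checkout list)
echo "task worktree scope: $task_scope"
ls "$base/task"
ls "$base/main"
```

On a Git that supports it, adding `--sparse-index` to the `init` call would extend the same narrowing to the task worktree's index.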

When sparse-checkout or sparse-index feels disappointing, the first diagnostic question is usually:

"Am I solving the right problem?"

If the real problem is network transfer, sparse-checkout alone will not solve it.

If the real problem is local working-tree and index cost, sparse-checkout and sparse-index are exactly the features to examine.

If the real problem is that one task depends on half the repository anyway, then the sparse cone may keep expanding toward dense, and that tells you something useful too. Sometimes the repository slice really is large for the work being done.

Sparse-checkout says: "I do not need the whole tracked tree materialized right now."

Sparse-index says: "If I do not need the whole tracked tree materialized, I usually should not have to pay full-index costs for it either."

Those two ideas belong together. One shrinks the working tree. The other shrinks the index representation around that working tree.