Chapter 10: Sparse-Checkout and Sparse-Index


Large-repository features tend to answer one of two questions:

how much of the repository Git transfers to your machine
how much of the repository Git materializes into your working tree and index

Sparse-checkout and sparse-index live in the second camp. They reduce the working-tree and index footprint so local commands touch less data. That matters anywhere one checkout is being asked to represent more of the repository than the task actually needs. The cost of an oversized checkout shows up in all the places the earlier chapters have already prepared for:

status scanning too much tree state
add and diff touching too much index state
branch switches rewriting more files than the task actually needs
one monorepo checkout trying to serve every team and every task equally

Sparse-Checkout Narrows the Working Tree

Sparse-Checkout: Git feature that limits the tracked files materialized in the working tree to a selected subset.

Sparse-checkout leaves the repository unchanged and narrows what appears in the working tree, so local commands have less data to scan. It subdivides tracked files into two groups: the subset you are focused on, and the rest. The rest remain tracked but stay unmaterialized in your checkout, so the omitted paths are still part of the repository state. The mechanism composes naturally with normal Git semantics:

commits are still full-tree commits
history still names the whole repository
refs still point to full snapshots
commands can still reason about paths outside the working tree when they need to

The gain is local, with the semantic model intact. The repository is still the repository; you are just keeping most of it out of your working tree, and a lot of monorepo pain is exactly the cost of dragging that whole tree around when you do not need it.
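A minimal sketch of that claim, assuming a throwaway repository built in a temporary directory (the file names `src/main.c`, `docs/guide.md`, and `README.md` are hypothetical). After narrowing to `src`, the `docs` file should be gone from the working tree while the index and the commit still name it:

```shell
#!/bin/sh
set -eu

# Throwaway repository; every path below is illustrative, not from a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src docs
echo "int main(void) { return 0; }" > src/main.c
echo "guide" > docs/guide.md
echo "readme" > README.md
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

# Narrow the working tree to src/ (cone mode also keeps root-level files).
git sparse-checkout init --cone
git sparse-checkout set src

# The working tree no longer holds docs/guide.md ...
if [ -e docs/guide.md ]; then echo "materialized"; else echo "not materialized"; fi

# ... but the index and the commit still list it.
git ls-files
git ls-tree -r --name-only HEAD
```

The temporary directory in `$repo` is left in place so the result can be inspected; remove it when done.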

Cone Mode Is the Default Shape for Good Reason

Cone Mode: Sparse-checkout mode where users specify directories, and Git expands those into a restricted high-performance pattern set.

Sparse-checkout is mostly used in cone mode. In cone mode the user specifies directories, while non-cone mode uses gitignore-style patterns. The difference matters for both usability and performance: Git can translate a cone-mode directory request into a restricted pattern set that is much easier to evaluate efficiently, whereas non-cone mode can suffer from inherent quadratic performance problems. For most large-repository use, the advice is straightforward:

use cone mode unless you have a very specific reason not to
think in directories, not custom inclusion puzzles
optimize for a stable, understandable local slice

git sparse-checkout init --cone
git sparse-checkout set src docs
git sparse-checkout list

Cone mode is sometimes described too loosely as "check out this directory," but it is slightly more structured than that. When you specify a directory in cone mode, Git includes:

everything under that directory
files immediately under leading directories
files at the root of the repository, which cone mode always includes

Cone mode behaves well for real development trees. It materializes more than just an isolated deep leaf: it keeps the leading path context around, which preserves a workable local checkout. It is more practical and less fussy than the cleverer pattern-based alternative.
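The three inclusion rules can be checked with a small sketch. It assumes a hypothetical layout (`src/app`, `src/lib`, `docs`, plus a root `README.md`) in a temporary repository, and requests only the nested directory `src/app`:

```shell
#!/bin/sh
set -eu

# Hypothetical layout used only to illustrate cone-mode inclusion.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src/app src/lib docs
echo "readme" > README.md          # root-level file
echo "top" > src/top.txt           # file immediately under a leading directory
echo "app" > src/app/main.c        # inside the requested directory
echo "lib" > src/lib/util.c        # sibling directory, not requested
echo "guide" > docs/guide.md       # unrelated directory, not requested
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src/app

# Report which paths are materialized in the working tree.
for path in README.md src/top.txt src/app/main.c src/lib/util.c docs/guide.md; do
  if [ -e "$path" ]; then echo "present: $path"; else echo "omitted: $path"; fi
done
```

Under these assumptions the root file, the file directly under `src`, and everything under `src/app` should be present, while `src/lib` and `docs` stay out of the working tree.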

Sparse Specification, `reapply`, and Temporary Expansion

Sparse Specification: The set of paths Git currently treats as in-scope for the user's sparse working area.

The sparse specification is the set of paths in the user's area of focus, while the sparsity patterns are the contents of the sparse-checkout file that define the intended subset. In a clean steady state those line up, but in practice they can temporarily diverge for several reasons:

conflicts can materialize files
commands such as stash apply can implicitly vivify files
explicit path operations can bring files back temporarily
users or tools can write files into the working tree directly

Sparse-checkout is a controlled local state with room for transient widening, and you can inspect that state directly:

git sparse-checkout list
git ls-files --sparse
git status

You correct those transient differences with git sparse-checkout reapply, which belongs in normal sparse-checkout hygiene rather than off to the side as some weird emergency lever. It comes up after merges, rebases, conflicts, or other commands that temporarily materialize paths outside the intended focus, because sparse state is a living thing that occasionally wanders.

In practice, that means the workflow is often:

set or add the directories you want
do the work
let Git temporarily expand the sparse specification if necessary during difficult operations
reapply when you want the checkout tightened back to the intended scope

Treat temporary widening as expected:

sparse-checkout narrows the normal working set
some commands temporarily widen it
reapply restores the intended steady state
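A hedged sketch of that cycle follows. It uses one particular trigger, an uncommitted edit outside the cone that Git declines to remove when the checkout is narrowed; merges and conflicts can leave similar stray paths. The repository and file names are hypothetical:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src docs
echo "code" > src/main.c
echo "guide" > docs/guide.md
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

# An uncommitted edit outside the cone we are about to select.
echo "local edit" >> docs/guide.md

# Narrowing warns about the dirty file and leaves it in the working tree.
git sparse-checkout init --cone
git sparse-checkout set src
before=$(if [ -e docs/guide.md ]; then echo present; else echo absent; fi)
echo "after set: docs/guide.md is $before"

# Clean up the stray path, then tighten the checkout back to the patterns.
git checkout HEAD -- docs/guide.md
git sparse-checkout reapply
after=$(if [ -e docs/guide.md ]; then echo present; else echo absent; fi)
echo "after reapply: docs/guide.md is $after"
```

The point of the sketch is the shape of the workflow: the patterns never changed, the working tree drifted wider than the patterns, and `reapply` brought the two back into agreement once the stray file was clean.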

Sparse-Checkout Reduces Local Cost, Not Transfer Cost

Sparse-checkout reduces working-tree materialization. It does not, by itself, reduce which objects were downloaded into the repository. If you want to reduce transfer volume too, you are in the territory of partial clone and promisor remotes, which comes in the next chapter.

So sparse-checkout can make status, add, and ordinary local navigation feel much better without necessarily shrinking the repository's full object store.
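One way to see the distinction, sketched against a throwaway local repository with hypothetical file names: after the checkout is narrowed, the blob for an omitted file can still be read straight from the object store.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src docs
echo "code" > src/main.c
echo "guide" > docs/guide.md
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src

# docs/guide.md is not materialized in the working tree ...
if [ -e docs/guide.md ]; then echo "materialized"; else echo "not materialized"; fi

# ... yet its blob is still stored locally and readable by object name.
git cat-file -e HEAD:docs/guide.md && echo "blob present in object store"
git show HEAD:docs/guide.md
```

This sketch uses a locally created repository rather than a clone, so it only illustrates that sparse-checkout leaves the object store alone; it says nothing about what a partial clone would have skipped.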

The two features pair well, but they solve different problems:

sparse-checkout reduces local checkout size
partial clone reduces transferred object volume

A typical local-only setup, which changes nothing about what was transferred, looks like this:

git sparse-checkout init --cone
git sparse-checkout set src docs
git sparse-checkout reapply --sparse-index
git ls-files --sparse

That sequence initializes cone mode, narrows the checkout, reapplies the sparse rules while enabling sparse-index, and then prints the index in a way that shows sparse directory entries.

Sparse-Index Reduces Index Surface Too

Sparse-Index: Sparse-checkout mode that represents out-of-scope regions in the index with directory entries instead of every individual file.

Sparse-index comes from a specific observation: even with sparse-checkout reducing the working tree, commands like git status can still be slow in very large repositories because the index may still contain an entry for every tracked file. Sparse-index changes the index representation itself so that out-of-scope directories can be stored as collapsed entries rather than fully expanded file lists.

Sparse-checkout reduces the working tree. Sparse-index extends the same idea into the index.

Sparse-index is a special mode for sparse-checkout that records a directory entry in lieu of all the files underneath that directory, and it is controlled by --[no-]sparse-index on init, set, or reapply.

If the index has to explicitly enumerate millions of out-of-scope paths just to preserve the full repository view, then many commands still incur O(HEAD) costs even when the user is only working in a small region. Sparse-index changes that by letting whole out-of-scope regions collapse into directory entries.

That mode shows up in both configuration and index contents:

git sparse-checkout reapply --sparse-index
git config --get index.sparse
git ls-files --sparse

The underlying configuration consists of three settings:

git config core.sparseCheckout true
git config core.sparseCheckoutCone true
git config index.sparse true

Those settings deserve plain-language reasons:

core.sparseCheckout=true turns on sparse materialization at all. That is the switch that says "this working tree should not look dense by default."
core.sparseCheckoutCone=true is the practical default because cone mode scales better than arbitrary pattern matching and fits how most teams actually slice large repositories.
index.sparse=true is the config expression of sparse-index. Use it when the checkout is much smaller than HEAD and local index cost is still showing up after sparse-checkout narrows the working tree.

If you run multiple worktrees against the same repository, these settings usually want to be per-worktree rather than repository-wide. Chapter 13 shows how extensions.worktreeConfig makes that split clean.

The command-line porcelain is still usually better than editing those knobs by hand:

git sparse-checkout set --cone src docs
git sparse-checkout reapply --sparse-index

The command is safer operationally because it updates the sparse-checkout file, the worktree-local config, the working tree, and the index together. Direct config is most useful when you are inspecting or standardizing the posture on purpose and want the reasons to stay explicit. Hand-editing sparse knobs works but is error-prone.
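To see what the porcelain wrote on your behalf, a sketch like the following inspects the pattern file and the config after a single `set`. The directory names, including the extra `tools` directory, are hypothetical:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src docs tools
echo "code" > src/main.c
echo "guide" > docs/guide.md
echo "echo build" > tools/build.sh
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src docs

# Config the porcelain set for this worktree.
git config --get core.sparseCheckout
git config --get core.sparseCheckoutCone

# The pattern file the porcelain generated from the directory list.
patterns=$(git rev-parse --git-path info/sparse-checkout)
cat "$patterns"

# The same selection, reported back as directories.
git sparse-checkout list
```

Reading these values is low risk. Writing them by hand is where the pieces can drift apart, because nothing then updates the working tree and index to match.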

When there is a big imbalance between how many files exist at HEAD and how many are actually populated, sparse-index aims to take key index operations from O(HEAD) to O(Populated). It achieves that improvement by adding sparse-directory entries. An ordinary full index is primarily a list of files. In a sparse index, some out-of-scope regions are instead represented as directory-level entries that point at trees, rather than expanding every file underneath those trees into a separate index entry.

Sparse-index belongs conceptually with Chapters 2 and 7 on trees and the index:

trees already summarize directories in history
sparse-index borrows a similar directory summarization idea for the local index

Git's sparse-index technical documentation frames the scale problem around three dimensions:

how many files exist at HEAD
how many are populated in the sparse-checkout cone
how many are modified

If HEAD is huge and the populated set is relatively small, then commands like status and add can be dominated by index parsing and rewriting that scales with all of HEAD, not with the checkout you are actually using. Sparse-index is meant to pull those commands back toward the populated set instead. Monorepos see the biggest gain here.
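One rough way to put numbers on those three dimensions in a given checkout is sketched below. The counting commands are an illustrative choice rather than anything the sparse-index documentation prescribes, and the repository is a hypothetical six-file example:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p src docs
echo "readme" > README.md
echo "main" > src/main.c
echo "util" > src/util.c
echo "a" > docs/a.md
echo "b" > docs/b.md
echo "c" > docs/c.md
git add .
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set src
echo "// local edit" >> src/main.c

# Files at HEAD: every blob path in the commit's tree.
head_files=$(git ls-tree -r --name-only HEAD | wc -l | tr -d ' ')

# Populated: index entries not flagged skip-worktree ("S" in ls-files -t).
populated=$(git ls-files -t | grep -vc '^S ')

# Modified: tracked files whose working-tree content differs from the index.
modified=$(git ls-files -m | wc -l | tr -d ' ')

echo "HEAD=$head_files populated=$populated modified=$modified"
```

When the first number dwarfs the second in a real repository, that gap is the imbalance sparse-index is meant to address.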

Compatibility Still Matters

Sparse-index changes the index format, so compatibility and staged rollout matter more than for plain sparse-checkout.

Sparse-index is a mode that works best when the command set you rely on is well integrated with it. Treating it as "free speed" overstates the case.

Sparse-index has matured since its introduction, but the right recommendation is still measured:

use cone-mode sparse-checkout first
enable sparse-index when the repository and command mix justify it
keep an eye on tooling that assumes a fully expanded index

When Sparse Feels Disappointing

When sparse-checkout or sparse-index feels disappointing, the first diagnostic question is usually:

"Am I solving the right problem?"

If the bottleneck is network transfer, sparse-checkout alone will not solve it. If it's local working-tree and index cost, sparse-checkout and sparse-index are exactly the features to examine. And if the task genuinely depends on half the repository, the sparse cone may keep collapsing toward dense, and that tells you something useful too. Sometimes the repository slice really is large for the work being done.

Sparse-checkout says: "I do not need the whole tracked tree materialized right now."

Sparse-index says: "If I do not need the whole tracked tree materialized, I usually should not have to carry full-index costs for it either."