High Performance Git

Section 0 · Introduction

Chapter 0

Introduction

Pencil sketch of a harbor village with a dock, several workers, and small boats on calm water.

Git looks like one tool, but it very much is not. Underneath, it is several systems layered together: a content-addressed object store, a filesystem cache, a history graph, and a transfer protocol.
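To make "content-addressed" concrete, here is a minimal sketch of how Git derives the ID of a blob from its bytes. It assumes the default SHA-1 object format (repositories created with the SHA-256 format hash differently), and the function name git_blob_id is my own, not part of any Git API.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content.

    Assumes the default SHA-1 object format. The helper name is hypothetical.
    """
    # Git hashes a small header, "blob <size in bytes>\0", followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Identical bytes always produce the identical ID, regardless of file name or path.
    print(git_blob_id(b"hello world\n"))
```

For the same bytes, this should agree with what git hash-object reports in a SHA-1 repository. The point is only that an object's name is a function of its content, which is why identical files are stored once.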

That layered and somewhat haphazard design is why Git performance can be confusing and why this book exists. A slow git status, a slow git log -- path, a large clone, and a noisy fetch are usually different problems.

I wrote this book for engineers who need Git to stay fast as their repositories, histories, and teams get larger: build and CI engineers, monorepo owners, developer-experience teams, and the people who wind up debugging strange Git behavior when easy explanations stop working.

This book keeps coming back to a few ideas:

The early chapters spend time on Git's logical model, because the later performance discussions only make sense once objects, refs, the index, and history walks are clear. But most of the book's time is spent lower down, in the storage and metadata layers, because that is where Git's performance and scaling tradeoffs become visible. Even if you have used Git for years, the logical-model material should serve as a useful reset rather than a detour, and much of the storage-layer detail may be new to you.