Why Google Stores Billions of Lines of Code in a Single Repository

Early Google employees decided to work with a shared codebase managed through a centralized source control system. This approach has served Google well for more than 16 years, and today the vast majority of Google’s software assets continues to be stored in a single, shared repository. Meanwhile, the number of Google software developers has steadily increased, and the size of the Google codebase has grown exponentially (see Figure 1). As a result, the technology used to host the codebase has also evolved significantly.

This article outlines the scale of that codebase and details Google’s custom-built monolithic source repository and the reasons the model was chosen. Google uses a homegrown version-control system to host one large codebase visible to, and used by, most of the software developers in the company. This centralized system is the foundation of many of Google’s developer workflows. Here, we provide background on the systems and workflows that make it feasible to manage and work productively with such a large repository. We explain Google’s “trunk-based development” strategy and the support systems that structure workflow and keep Google’s codebase healthy, including software for static analysis, code cleanup, and streamlined code review.

Source: Why Google Stores Billions of Lines of Code in a Single Repository | July 2016 | Communications of the ACM
