Published in Communications of the ACM, 2016.
Authors: Rachel Potvin, Josh Levenberg (Google)
- Google chose to stick with the central repository due to its advantages.
- The monolithic model of source code management is not for everyone, e.g., organizations where large parts of the codebase are private or hidden between groups.
- Piper: The distributed source-code repository
- Implemented on top of standard Google infrastructure (originally Bigtable, now Spanner)
- Relies on the Paxos algorithm to guarantee consistency across replicas
- CitC (Clients in the Cloud): The workspace client
- With a cloud-based storage backend and a Linux-only FUSE file system
- Critique: The code-review tool
- Tricorder: Static analysis system
- Code quality, test coverage, and test results
- Rosie: large-scale cleanups and code changes
- Create a large patch, e.g., via find-and-replace
- Split the large patch into smaller patches; test them independently; send for code review; commit them automatically once they pass tests and a code review
- Google’s monolithic software repository is used by 95% of its software developers worldwide.
- The Google codebase includes
- approximately 1 billion files
- a history of 35 million commits
- 86TB of data (excluding release branches)
- Over 99% of files stored in Piper are visible to all full-time Google engineers.
- Over 80% of Piper users today use CitC.
- Unified versioning → a single source of truth
- Code sharing and reuse
- Simplified dependency management
- Avoid diamond dependency problem
- Atomic changes
- Large-scale refactoring
- Collaboration across teams
- Flexible team boundaries and code ownership
- Code visibility and clear tree structure → implicit team namespacing
- Tooling investments for both development and execution
- Code-indexing system
- Automated test infrastructure
- Build infrastructure
- Code search and browsing tools
- Codebase complexity
- Unnecessary dependencies → binary size bloat
- Efforts invested in code health
- Git (a distributed version control system)
- A team at Google is focused on supporting Git, which is used by Google’s Android and Chrome teams outside the main Google repository.
- Important for these teams due to external partner and open source collaborations.
- The Git community strongly prefers that developers use more, smaller repositories.
- A git clone copies all content to one’s local machine, which is impractical for a repository of this size.
- Mercurial
- An experimental effort
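
The Rosie workflow noted above (create a large patch, split it, test and review each piece, commit what passes) can be illustrated with a minimal sketch. This is not Rosie's actual implementation or API. The `Patch` type, the functions `split_by_directory` and `land_shards`, the choice to shard by top-level directory, and the example paths are all hypothetical, chosen only to show the shape of the process.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical representation: a patch maps file path -> new file content.
Patch = Dict[str, str]


def split_by_directory(patch: Patch, depth: int = 1) -> Dict[str, Patch]:
    """Group changed files by their leading `depth` path components.

    Sharding by directory is an assumption made for this sketch.
    """
    shards: Dict[str, Patch] = defaultdict(dict)
    for path, content in patch.items():
        key = "/".join(path.split("/")[:depth])
        shards[key][path] = content
    return dict(shards)


def land_shards(
    shards: Dict[str, Patch],
    run_tests: Callable[[Patch], bool],
    approved: Callable[[Patch], bool],
) -> List[str]:
    """Return the keys of shards that pass both tests and code review.

    Each shard is evaluated independently, so one failing shard
    does not block the others from being committed.
    """
    committed = []
    for key in sorted(shards):
        shard = shards[key]
        if run_tests(shard) and approved(shard):
            committed.append(key)  # stand-in for an automatic commit
    return committed


if __name__ == "__main__":
    # A find-and-replace style change touching three hypothetical projects.
    big_patch = {
        "search/index.py": "new_name()",
        "search/rank.py": "new_name()",
        "maps/tiles.py": "new_name()",
        "ads/bid.py": "new_name()",
    }
    shards = split_by_directory(big_patch)
    landed = land_shards(
        shards,
        run_tests=lambda shard: "ads/bid.py" not in shard,  # pretend ads tests fail
        approved=lambda shard: True,  # pretend every reviewer approves
    )
    print(landed)  # ['maps', 'search']
```

Because each shard is tested and reviewed on its own, the failing `ads` shard in this example is held back while the `maps` and `search` shards land.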