Chapter 5. Repository Administration

Table of Contents

The Subversion Repository, Defined
Strategies for Repository Deployment
    Planning Your Repository Organization
    Deciding Where and How to Host Your Repository
    Choosing a Data Store
        Berkeley DB
        FSFS
Creating and Configuring Your Repository
    Creating the Repository
    Implementing Repository Hooks
    Berkeley DB Configuration
    FSFS Configuration
Repository Maintenance
    An Administrator's Toolkit
        svnadmin
        svnlook
        svndumpfilter
        svnsync
        fsfs-reshard.py
        Berkeley DB utilities
    Commit Log Message Correction
    Managing Disk Space
        How Subversion saves disk space
        Removing dead transactions
        Purging unused Berkeley DB logfiles
        Packing FSFS filesystems
    Berkeley DB Recovery
    Migrating Repository Data Elsewhere
    Filtering Repository History
    Repository Replication
    Repository Backup
    Managing Repository UUIDs
Moving and Removing Repositories
Summary

The Subversion repository is the central storehouse of all your versioned data. As such, it is an obvious candidate for all the love and attention an administrator can offer. While the repository is generally a low-maintenance item, it is important to understand how to configure and care for it properly, so that potential problems are avoided and actual problems are safely resolved.

In this chapter, we'll discuss how to create and configure a Subversion repository. We'll also talk about repository maintenance, providing examples of how and when to use the svnlook and svnadmin tools provided with Subversion. Finally, we'll address some common questions and mistakes, and offer suggestions on how to arrange the data in the repository.

If you plan to access a Subversion repository only in the role of a user whose data is under version control (i.e., via a Subversion client), you can skip this chapter altogether. However, if you are, or wish to become, a Subversion repository administrator,[27] this chapter is for you.



[27] This may sound really prestigious and lofty, but we're just talking about anyone who is interested in that mysterious realm beyond the working copy where everyone's data hangs out.