Subversion is a relatively new version control system which is meant to be an improvement on CVS. Both of these tools are intended to allow multiple developers to check out working copies of source code so that concurrent development can occur on different sections of an application (or document).
I have been using Subversion for over a year now, and have found it to be useful in any situation where I want to keep track of changes to files. Specifically, I use it in the following situations:
Subversion is a new project, and CVS has been around a while and is widely used. So what are the reasons to use Subversion?
The world is never one-size-fits-all. There are many ways to keep track of changes to files. One way is to do daily backups and keep them for a couple of years. I would argue that this particular choice is inferior in most respects: almost no one is willing to bother with backups every day, and finding the right backup that contains the file you want can be a real chore.
Subversion is meant to be a general purpose tool. It keeps track of files and changes to those files. If you are looking for it to do more, then it might be a good idea to check for tools that are more specialized for your needs. A popular choice for web developers is a content management system like OpenCMS (www.opencms.org), which not only versions your files but is also tuned for web page development. The disadvantage of specialized tools is that they may be less adept at version control, may be more resource intensive, and may be more complex to install and configure.
The basic idea is that there is a central repository that keeps track of a group of files, including every change that is made to them. Individual developers check out a copy of these files, make changes to their private copy, and when they have something useful to submit they commit their changes back to the central store. Changes made by one developer cannot be seen by anyone else until they commit them.
Committed changes can be pulled into someone's working copy at any time. This allows a developer to merge the changes made by others into their own working copy so they can track the progress of the project as a whole.
Of course, there is nothing that says you have to use Subversion with more than one developer. I use it all the time for things that are unique to my environment.
For example, I store most of my UNIX home directory in Subversion. This allows me to keep tabs on all of my important files, and even allows me to share all or part of my home directory among my machines, such as my desktop and laptop. When I make a change to a file, I commit the changes to the central repository and it is then available for checkout on my other machines.
Storing these files in a versioned system also gives me the ability to recover files deleted long ago, undo changes to configurations that have proven to be unworkable, or restore files that I've accidentally erased. It also has the advantage that when I make a backup of my Subversion repositories, I am also backing up a complete history of my important files. Finally, the fact that I keep a working copy checked out on multiple machines means that I am well protected from data loss due to catastrophe or theft. I could lose all of my backup CDs and my repository machine in a fire, but if one of the machines that has a working copy survives, then I at least have a pretty recent version of my files intact.
On the down side, using Subversion will increase your disk space usage by quite a bit. This is usually not a concern on personal machines, where people usually have space to spare, but it may be a concern if you have a disk quota on a multi-user machine. The repository itself will be at least as big as your initial set of files, and will grow. Each working copy includes the files you work on as well as a hidden copy of those files in an unmodified state, and usually takes more than three times the disk space of an unmanaged set of files.
As an example, my open-source projects directory contains about 3MB of source files when exported from Subversion (i.e. as unmanaged files). The working copy for these same files takes 11MB. The repository that keeps track of them is currently at revision number 196, and takes 6MB.
The small size for the repository may seem a bit odd at first, especially since it has the complete history of 196 different versions of my files! This paradox is resolved by the fact that the repository only has to track the differences in the files from one version to the next. For example, in revision 192 I may have changed only one line of one file. That one line and the context for it (i.e. file and location) are all that has to be stored in order to move from version 192 to 193.
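The space saving can be illustrated with ordinary UNIX tools. The sketch below is only an analogy: it runs diff on two throwaway files (the names rev192.txt and rev193.txt are made up for this example), whereas Subversion stores its deltas in its own internal format.

```shell
# Scratch directory and file names are hypothetical, used only for illustration.
work=$(mktemp -d)

# A 1000-line "file", and a copy of it in which exactly one line has changed.
seq 1 1000 > "$work/rev192.txt"
sed '500s/.*/changed line/' "$work/rev192.txt" > "$work/rev193.txt"

# A unified diff records only the changed line plus a little context.
# diff exits with status 1 when the files differ, hence the "|| true".
diff -u "$work/rev192.txt" "$work/rev193.txt" > "$work/delta.patch" || true

full_size=$(wc -c < "$work/rev193.txt")
delta_size=$(wc -c < "$work/delta.patch")
echo "full file: $full_size bytes, delta: $delta_size bytes"
```

The delta should come out far smaller than the file it describes, which is why 196 revisions need not cost anything like 196 full copies.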
Space usage can be mitigated somewhat if you run the repository on a personal machine, and just keep a working copy on the multi-user machine. This is in fact what I do with my University computing account, where a good portion of my home directory is really a working copy that has been checked out of a Subversion repository that is running on my own networked Linux box.
This step can be very easy or very difficult depending on your target OS and how much control you have over the machine. Subversion can be built from source by an ordinary user on any of the supported operating systems, but the easiest way to install it is to use pre-compiled packages, which usually require administrative access to your target system. If you are trying to build it yourself, then be prepared to spend some time getting it all correct.
The simplest platforms on which to use Subversion are the ones that have binary distributions: Linux, Mac OS X, and Windows. A GUI client called RapidSVN is also available. It uses the wxWindows toolkit to give it some platform independence, and there are binary versions for Windows and a few variants of Linux.
If you are using Windows, you can also download a GUI system called TortoiseSVN which integrates with Windows Explorer to give you point-and-click access to your file management functions.
The first thing to do in order to use Subversion is to create a repository. This is nothing more than a directory that stores the Subversion database for a set of files. You can have as many repositories as you want, and I suggest making different repositories for files that need different levels of security. For example, I do not want to share my UNIX home directory with the world, so I put that in a repository that has very strong access restrictions (SSL and authentication required). I also work on projects that I make freely available on the Internet. I make those repositories read-only, and require authentication for committing changes.
The creation steps are the same for either kind of repository. The physical separation just gives you a way to easily break up your security policies later. The command to create a repository is:
svnadmin create name
Where name is the name of the directory into which your new repository should be created. The directory should not already exist, but the path to it should. Once it is created, you should change the ownership and permissions on the directory to appropriate settings. For example, if the repository will only be used through Apache, then the user that runs Apache should be the owner of the repository, and that should also be the only user that can read/write the repository files.
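As a rough sketch of those two steps for a repository that only you will use, something like the following should work when run as a script. It builds the repository under a scratch directory from mktemp, so the paths are placeholders rather than recommendations. For a repository served through Apache you would instead hand ownership to the web server's user, whose name varies from system to system.

```shell
set -e
# Skip quietly on machines where Subversion is not installed.
command -v svnadmin >/dev/null 2>&1 || { echo "svnadmin not found; skipping" >&2; exit 0; }

parent=$(mktemp -d)      # the path leading to the repository must already exist
repo="$parent/docs"      # the repository directory itself must not exist yet

svnadmin create "$repo"

# Remove all access for group and others, so only the owner can read or write.
chmod -R go-rwx "$repo"

ls -ld "$repo"
```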
If you plan to use Subversion from local disks only, then you may have some trouble if you want more than one person to write files to the repository. The problem is that new files are created from time-to-time, and the person who is using Subversion at the time ends up being the owner.
If you are using a binary distribution of Subversion, then you should have gotten a pre-compiled version of Apache and the modules needed to run a networked Subversion repository. The security of a network repository is completely controlled through Apache, and the instructions for setting up simple network access can be found in the Subversion Book available from http://subversion.tigris.org.
I find that the networked method of access is better in the long run for almost all uses, because it avoids permission, ownership, and process interruption issues that can cause problems with direct disk access.
Nevertheless, some users may need to use local disk instead of networked access. I have two warnings for those users:
Locations in your new repository are accessed via Internet URL syntax. If you are using local disk access, then you point to files with a file URL:
file:///path_to_repository/path_to_file
If you are using network access:
http://alias/path_to_file
where alias is the path you tell Apache to use for accessing the repository.
Subversion understands other URL types, including HTTPS, and a special one that lets you access a disk-based repository through secure shell (svn+ssh://host/path).
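One detail that is easy to miss with file URLs: an absolute UNIX path already begins with a slash, so a URL built from it ends up with three slashes in a row. A small sketch, assuming a repository at /home/nancy/svn/docs:

```shell
repo_path=/home/nancy/svn/docs     # absolute path to the repository (assumed location)
repo_url="file://${repo_path}"     # "file://" followed by "/home/..." gives three slashes
echo "$repo_url"                   # prints file:///home/nancy/svn/docs
```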
The following example assumes the following:
The Subversion commands will work at the command prompt on Linux/OS X/Windows if you substitute proper path/file naming. The other commands are UNIX-specific, and will only work on Linux and OS X. If you plan to use a GUI client, you will still need to do the initial repository setup from a command line.
mkdir -p /home/nancy/svn
chmod 700 /home/nancy/svn
cd /home/nancy/svn
svnadmin create docs
mv /home/nancy/docs /home/nancy/olddocs
svn import /home/nancy/olddocs file:///home/nancy/svn/docs
This copies the documents into the repository. Note that your olddocs directory is never part of the Subversion system. It is a backup of your documents before they were under the management of Subversion.
cd /home/nancy
svn co file:///home/nancy/svn/docs docs
You should see a list of your files as they are placed into the (new) docs directory. At this point you are ready to edit the files.
Be aware that you should no longer manage the files with regular file system commands (i.e. do not delete, copy, or rename them with the operating system once they are in Subversion). Subversion needs to know when such a change occurs so it can track it, so it has its own commands for doing these things.
cd /home/nancy/docs
svn status
This gives a list of all files that have changes. The type of change is marked next to the file. For example, 'A' means the file has been added, but has not been sent to the repository. 'D' means the file has been deleted from the local copy, but the central repository doesn't know it yet. 'M' means the file has been modified in the working copy, but the changes have not been sent. '!' means you deleted the file, but failed to use an svn command, so Subversion is not happy about it.
Simply creating a file in a managed directory does NOT make it a managed file. If you do an svn status after creating a new file, you will see that file listed with a '?' beside it. This means that Subversion sees the file, but has not been told to manage it yet. To put a new file under version control, create the file somewhere in the working copy, and then:
svn add filename
An svn status at this point would show the file with an 'A' beside it, indicating that it is to be added at the next commit. Remember that it isn't in the repository until you commit it!
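To see the '?' and 'A' markers for yourself without touching real files, a sketch along these lines can be run as a script. Everything lives in a scratch directory, and the file name notes.txt is made up for the example.

```shell
set -e
# Skip quietly on machines where Subversion is not installed.
command -v svn >/dev/null 2>&1 || { echo "svn not found; skipping" >&2; exit 0; }

sandbox=$(mktemp -d)
svnadmin create "$sandbox/repo"
svn checkout -q "file://$sandbox/repo" "$sandbox/wc"
cd "$sandbox/wc"

echo "draft" > notes.txt            # created, but not yet managed
before=$(svn status notes.txt)      # status line should begin with '?'

svn add -q notes.txt                # schedule the file for addition
after=$(svn status notes.txt)       # status line should now begin with 'A'

echo "$before"
echo "$after"
```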
svn rm name
Note that this will immediately remove the named object from your working copy. If it is a directory, then the whole thing (with content) is removed. This change is not made in the repository until you commit your changes, so it can be undone with a revert until then.
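The remove-then-revert sequence can be tried safely in a scratch repository. This is a minimal sketch meant to be run as a script; the file name report.txt is hypothetical.

```shell
set -e
# Skip quietly on machines where Subversion is not installed.
command -v svn >/dev/null 2>&1 || { echo "svn not found; skipping" >&2; exit 0; }

sandbox=$(mktemp -d)
svnadmin create "$sandbox/repo"
svn checkout -q "file://$sandbox/repo" "$sandbox/wc"
cd "$sandbox/wc"

# Put one file under version control and commit it.
echo "keep me" > report.txt
svn add -q report.txt
svn commit -q -m "Add report"

svn rm -q report.txt                        # the file vanishes from the working copy
removed_status=$(svn status report.txt)     # status line should begin with 'D'
echo "$removed_status"

svn revert -q report.txt                    # nothing was committed, so revert undoes it
ls
```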
You can have an entire directory structure in your repository, but you have to let Subversion know about it. You have two choices when you want to create a new folder: Create the folder and use svn add to add it to the repository, or use:
svn mkdir name
where name is the name of the new folder. You must create managed subdirectories in an already managed directory.
svn mv src dest
The source must exist and already be a managed resource. If dest exists, and is a directory, then the source item will be moved there. This command is very similar to the UNIX mv command.
cd /home/nancy/docs
svn commit -m "Log message"
Everything you do to your working copy is kept in the working copy until you explicitly decide you are at a point you wish to record. The commit command sends all of the changes to the central repository, and tells you the new revision number. Each time you commit changes you get a new revision number that can be used to recall the exact state of your files. For example, if you deleted a file two months ago and it had been committed at least once to your repository, then there is a revision number you can use to recall that file should you suddenly realize you need it. I recommend committing changes frequently, unless committing those changes would interfere with someone else's use of the repository. For example, if two people were working on a program, then it might be bad to commit changes you've made that keep the application from working because your code won't even compile yet.
The "Log message" is a note to yourself and others as to the nature of the changes you are committing. It is required. If you omit this option, then svn will try to open an editor to let you type in your log message. Commits are recursive: a commit starts from your current directory and includes everything below it.
cd /home/nancy/docs
svn update
This pulls changes that other people have committed to the repository into your working copy. It is also used to update the revision number when you have done partial commits. For example, if you made subdirectories under docs, and committed things there, then the upper level directories in your working copy would be at an old revision number and would need to be updated before commits could be made.
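A two-machine setup can be imitated with two working copies of the same scratch repository. In this sketch, meant to be run as a script, the directory names desktop and laptop and the file todo.txt are stand-ins chosen for the example.

```shell
set -e
# Skip quietly on machines where Subversion is not installed.
command -v svn >/dev/null 2>&1 || { echo "svn not found; skipping" >&2; exit 0; }

sandbox=$(mktemp -d)
svnadmin create "$sandbox/repo"
url="file://$sandbox/repo"

# Two independent working copies of the same repository.
svn checkout -q "$url" "$sandbox/desktop"
svn checkout -q "$url" "$sandbox/laptop"

# Commit a new file from the first working copy.
cd "$sandbox/desktop"
echo "buy milk" > todo.txt
svn add -q todo.txt
svn commit -q -m "Add todo list"

# The second working copy only sees the file after an update.
cd "$sandbox/laptop"
svn update -q
ls
```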
svn revert name
This restores the named resource to its previously committed state. Note that once you commit a change, revert no longer works as an undo. If you want to "undo" something that has been committed to the central repository then you must use a more complex command (typically svn merge or svn cp with revision numbers). See the next item.
This one is a little more complex. You have several options, and they are described in detail in chapter 4 of the Subversion Book. For example, if you wanted to reverse the changes to /home/nancy/docs/paper.doc that were committed in revision 43, you would use:
cd /home/nancy/docs
svn merge -r 43:42 file:///home/nancy/svn/docs/paper.doc
This tells Subversion to undo the changes made between revision 42 and 43 to paper.doc. Note that this changes the WORKING copy, not the one in the repository. Once you do this, look at the document in /home/nancy/docs to see that it looks right. Then commit the "undo" using the normal svn commit command.
If paper.doc is truly a binary file, then Subversion may not be able to undo the change. For example, if the file was also changed in revision 46, there is no way for Subversion to know how to undo just the change that was made in 42->43. In this case it will give you multiple files with names that reflect revision numbers, and you will have to fix paper.doc manually (using those files as reference). See the Subversion Book for details.
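The whole undo cycle can be rehearsed on a scratch repository before trying it on real work. The sketch below is meant to be run as a script and makes a few assumptions of its own: it uses a plain text file called paper.txt rather than paper.doc, the revisions are 1 and 2 rather than 42 and 43, and it names the target file explicitly on the merge command line. It also runs svn update first, since newer Subversion releases may refuse to merge into a working copy whose parts are at different revisions.

```shell
set -e
# Skip quietly on machines where Subversion is not installed.
command -v svn >/dev/null 2>&1 || { echo "svn not found; skipping" >&2; exit 0; }

sandbox=$(mktemp -d)
svnadmin create "$sandbox/repo"
url="file://$sandbox/repo"
svn checkout -q "$url" "$sandbox/wc"
cd "$sandbox/wc"

echo "first draft" > paper.txt
svn add -q paper.txt
svn commit -q -m "Add paper"           # becomes revision 1

echo "regrettable edit" > paper.txt
svn commit -q -m "Edit paper"          # becomes revision 2

svn update -q                          # bring the whole working copy to revision 2

# Apply the reverse of the 1 -> 2 change to the working copy only.
svn merge -q -r 2:1 "$url/paper.txt" paper.txt
cat paper.txt                          # inspect the result before committing

svn commit -q -m "Undo revision 2"     # the undo becomes revision 3
```

Nothing is erased from history by this: revision 2 is still in the repository, and revision 3 simply records the file going back to its earlier content.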
svn cp SRC DEST
This is also how you make a "branch" of development. Read the Subversion book for more details.
cd /home/nancy
svn co -r rev file:///home/nancy/svn/docs old
Where rev can be a revision number, or a date enclosed in braces. For example, to get a copy of your docs as they looked on January 1, 2002:
svn co -r {2002-01-01} file:///home/nancy/svn/docs jandocs
Subversion supports a generic dump format for when you do your periodic backups (you do make backups, right?). The idea is that the on-disk format for the repository may need to change as Subversion matures, but that backups should always be readable by any newer version.
As a result, you should do backups with svnadmin dump, which outputs the repository in a nice, forward-compatible format:
cd /home/nancy/svn
svnadmin dump docs > docs.dump
You can, of course, compress this file. Once complete, copy it to some form of external media and you are done. The restore command is svnadmin load, and it should be used on a newly-created repository unless you want to merge multiple repositories.
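A full backup-and-restore round trip might look something like the sketch below, which is meant to be run as a script. It works entirely in a scratch directory; the repository names docs and restored and the directory letters are placeholders, and gzip is just one choice of compressor.

```shell
set -e
# Skip quietly on machines where Subversion is not installed.
command -v svnadmin >/dev/null 2>&1 || { echo "svnadmin not found; skipping" >&2; exit 0; }

sandbox=$(mktemp -d)
cd "$sandbox"

# A small repository with one committed revision to back up.
svnadmin create docs
svn mkdir -q -m "Create a directory" "file://$sandbox/docs/letters"

# Backup: dump the repository and compress the stream on the way to disk.
svnadmin dump -q docs | gzip > docs.dump.gz

# Restore: load the dump into a newly created, empty repository.
svnadmin create restored
gunzip -c docs.dump.gz | svnadmin load -q restored

# The restored repository should report the same latest revision number.
svnlook youngest restored
```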
Subversion is a great tool for keeping a history of the changes to a set of files. It provides a useful extension of the classic backup schemes, and helps you share files among multiple machines on a network.