2009-08-26

Migrating from CVS to SVN

We have started a migration of our projects from CVS to SVN.

Choice of VCS and RAD plugin

We looked at other VCS tools and would have liked a distributed VCS. But that would have meant retraining our developers in a different configuration management model, so we went for Subversion instead.

For the same reason, we chose to use the Subversive plugin in RAD instead of the tigris Subclipse plugin. While Subclipse is more true to how Subversion works, Subversive provides us with behavior that (for the developer) is much more like the CVS Team integration in RAD.

There are still a few minor bugs in the plugin, but on the whole it is quite usable.

The biggest change for our developers will be the new tagging behavior. Before, they could tag projects in the Project Explorer; now they need to tag in the Subversion Repositories Explorer (because only it knows about the folder hierarchy in Subversion).


Repositories and Performance

Today we have 18 CVS repositories on a single server (in use by some 100 developers and Hudson).

While it has always been stable and running without problems, its performance is not good. Synchronization time for most projects is a minute or more, which is aggravating.

We expect Subversion to give us a much faster VCS experience - and that is certainly true for the 50-odd projects that have already been migrated.

In the new setup, we want to use fewer repositories, making it easier to move projects between departments when that is necessary.

However, as we have not been able to find any good data on Subversion performance, I am a little concerned about how the server will perform once all the projects have been migrated to SVN, and into fewer repositories.

Fortunately, the ability (as admin) to move data with full history between repositories will save us. That is the assumption, anyway :)

If many projects in few repositories becomes a problem, we can introduce more repositories. If load on the server becomes too high, we can use redirection in Apache to move some repositories to other machines.

So we feel safe at the moment (in our glorified ignorance).

Repository Layout

In CVS, a number of modules constitute a single RAD project. That makes it a (small) nuisance to check out a project. It may look like:

REPO_DEPT_A/
    projectA.cfg
    projectA.ear
    projectA.ejb
    projectA.web

REPO_DEPT_B/
    projectB.ear
    projectB.ejb
    projectB.web


In Subversion, we address that by introducing a new logical layer in the folder hierarchy. We also introduce a layer for the departments:

REPO/
    DEPT_A/
        projectA/
            trunk/
                projectA.cfg
                projectA.ear
                projectA.ejb
                projectA.web
            branches/
            tags/
    DEPT_B/
        projectB/
            trunk/
                projectB.ear
                projectB.ejb
                projectB.web
            branches/
            tags/


The layout is enforced by a pre-commit hook.


Subversion Hooks

We have a number of hook scripts implemented in Ruby. The scripts are launched from DOS bat scripts.

Layout Validator

The Layout Validator ensures the REPO/DEPT/LOGICAL/trunk|branches|tags/ hierarchy in Subversion.

Launcher pre-commit.bat, Script hook-layout-validator.rb
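
A minimal sketch of what such a check could look like, not the actual hook-layout-validator.rb. It assumes the changed paths come from `svnlook changed -t TXN REPO` (status in the first columns, path starting at column 5, directories ending in a slash) and that creating the department and project folders themselves is allowed. The function names are made up for illustration.

```ruby
# Hypothetical sketch; names and rules are assumptions, not the real hook.
LAYOUT_DIRS = %w[trunk branches tags].freeze

# True when a repository-relative path fits DEPT/PROJECT/trunk|branches|tags/...
def valid_layout?(path)
  parts = path.split('/').reject(&:empty?)
  # Assumption: at the DEPT and PROJECT levels only folders may be created.
  return path.end_with?('/') if parts.length <= 2
  LAYOUT_DIRS.include?(parts[2])
end

# Extracts the paths from `svnlook changed` output (path starts at column 5).
def changed_paths(svnlook_output)
  svnlook_output.lines.map { |line| line.chomp[4..-1] }.compact.reject(&:empty?)
end

# Returns the changed paths that break the layout; empty means the commit is fine.
def layout_violations(svnlook_output)
  changed_paths(svnlook_output).reject { |path| valid_layout?(path) }
end
```

A pre-commit hook built on this would print the violations to stderr and exit with a non-zero status, which makes Subversion reject the commit and show the message to the user.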

Commit Allower

In general, all employees have read access to Subversion. But our build users are not allowed to commit (because a commit from a build user would hide whoever started the job it runs).

The users who cannot commit are listed in the script.

Launcher start-commit.bat, Script hook-commit-allower.rb
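
Roughly, the idea can be sketched like this. Subversion passes the repository path and the user name to the start-commit hook, so the script only needs to compare the user against its list. The account names below are placeholders, not our real build users, and the case-insensitive comparison is an assumption.

```ruby
# Hypothetical build accounts; the real list lives in hook-commit-allower.rb.
BLOCKED_USERS = %w[hudson builduser].freeze

# True when the given user may commit. Comparison ignores case and
# surrounding whitespace (an assumption for this sketch).
def commit_allowed?(user, blocked = BLOCKED_USERS)
  !blocked.map(&:downcase).include?(user.to_s.strip.downcase)
end
```

The hook itself would exit with a non-zero status when `commit_allowed?` returns false, which stops the commit before any data is sent.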

Repository Locker

When doing administrative stuff directly on a repository (imports, for example) we prevent write access for the developers.

The script looks for a file named jb.txt in the repository's root folder. If it exists, commit is prevented and the text in the file is presented to the user.

Launcher start-commit.bat, Script hook-repository-locker.rb
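
A minimal sketch of the lock check, assuming that "root folder" means the repository's directory on the server (the path Subversion hands to the hook). The function name is invented for this example.

```ruby
# Returns the lock message when the lock file exists in the repository
# directory, or nil when the repository is open for commits.
# Assumption: jb.txt sits directly in the on-disk repository folder.
def lock_message(repo_path, lock_file = 'jb.txt')
  path = File.join(repo_path, lock_file)
  File.file?(path) ? File.read(path) : nil
end
```

In the hook, a non-nil message would be written to stderr followed by a non-zero exit, so the developer sees the text from the file as the reason the commit was refused.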


Integrity Checking

Each night we run a cron job that iterates over all repositories and runs this command:

svnadmin.exe verify -q $repo

The output is sent to a mailbox that is (more or less) constantly monitored.

So if there is data corruption on the server, we will know within a short time. With CVS, we would only find out when someone tried to build the project containing the corrupted file.
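
The nightly loop could be sketched in Ruby along these lines. This is an illustration under assumptions, not our actual job: it assumes all repositories are folders under one parent directory, and the function names are made up. The runner is passed in so the logic can be tried without a Subversion installation.

```ruby
require 'open3'

# Lists the repository folders under a parent directory
# (assumption: one folder per repository, nothing nested).
def repositories(parent_dir)
  Dir.children(parent_dir).sort
     .map { |name| File.join(parent_dir, name) }
     .select { |path| File.directory?(path) }
end

# Default runner: calls `svnadmin verify -q REPO` and returns [output, status].
SVNADMIN_VERIFY = ->(repo) { Open3.capture2e('svnadmin', 'verify', '-q', repo) }

# Verifies each repository and returns { repo => output } for the failures.
def verify_all(repos, runner: SVNADMIN_VERIFY)
  repos.each_with_object({}) do |repo, failures|
    output, status = runner.call(repo)
    failures[repo] = output unless status.success?
  end
end
```

An empty result means every repository passed; anything else is what would go into the mail.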


Per-Commit Backups

The next thing I will be looking at is writing per-commit diff files to another drive.

That should allow us to get up and running in a hurry if the main server is hit by logical errors on the primary drive.

Interesting, and a real improvement over CVS, where we only had the nightly backups to rely on...
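
One possible way to do this, sketched here as an assumption since nothing is implemented yet, is a post-commit hook that writes an incremental dump of the new revision with `svnadmin dump`. Subversion passes the repository path and the revision number to the post-commit hook. The file naming scheme and function names are invented for this example.

```ruby
require 'open3'

# Builds a sortable file name such as REPO-r00000042.dump (hypothetical scheme).
def backup_file_name(repo_path, rev)
  format('%s-r%08d.dump', File.basename(repo_path), Integer(rev))
end

# Command line for an incremental dump of a single revision.
def dump_command(repo_path, rev)
  ['svnadmin', 'dump', repo_path, '-r', rev.to_s, '--incremental', '-q']
end

# Dumps one revision into backup_dir and returns the path of the file written.
def backup_revision(repo_path, rev, backup_dir)
  target = File.join(backup_dir, backup_file_name(repo_path, rev))
  dump, status = Open3.capture2(*dump_command(repo_path, rev), binmode: true)
  raise "svnadmin dump failed for r#{rev}" unless status.success?
  File.binwrite(target, dump)
  target
end
```

Restoring would then mean loading the last nightly backup and replaying the per-commit dump files in order with `svnadmin load`.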
