Migrating a repo from CVS to Git

For more information on Git, visit git-scm.com


CVS was one of the first widely used source code management (SCM) tools. This tutorial covers the process of converting a CVS repository over to Git, a more modern and more powerful SCM.

These instructions are only compatible with CentOS 7 and RHEL 7.


The following need to be installed on the server or container that will perform the migration. This migration example uses CentOS 6 as the base OS.

1.    Directory Structure:
Create the directories to be used for the migration. This example uses two directories at the root level, named cvs and git, and a single directory in /tmp named cvs_migration.

  • /cvs will be used to hold the rsync clone of the cvs repository
  • /git will be used to hold the converted project(s).
  • /tmp/cvs_migration will be used during the conversion: it holds the conversion logs written while data is extracted from /cvs and converted into the final project(s) in /git.
mkdir /{cvs,git}
mkdir /tmp/cvs_migration
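
The directory setup can also be written so that it is safe to re-run. The sketch below is one way to do that, assuming a hypothetical prefix argument (not part of the original steps) so the layout can be tried in a scratch location first; with an empty prefix, run as root, it creates /cvs, /git and /tmp/cvs_migration as described above.

```shell
#!/bin/sh
# Create the three working directories idempotently (mkdir -p does not
# fail when a directory already exists).
# The prefix argument is hypothetical: pass "" to get /cvs, /git and
# /tmp/cvs_migration exactly as in the tutorial.
make_migration_dirs() {
    root=$1
    for d in "$root/cvs" "$root/git" "$root/tmp/cvs_migration"; do
        mkdir -p "$d" || return 1
    done
}

# Demonstration in a scratch directory so nothing outside it is touched.
scratch=$(mktemp -d)
make_migration_dirs "$scratch"
ls -d "$scratch/cvs" "$scratch/git" "$scratch/tmp/cvs_migration"
```

Calling make_migration_dirs "" on the real migration host gives the same result as the two mkdir commands above.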

2.    Install EPEL:
To ensure that we get the latest packages that we need for git, svn, and cvs, install the EPEL repository.

yum install epel-release

3.    Git:
The default version of git included in the CentOS 6 repository (version 1.7) works fine for the process of migrating a CVS repo over to git. Simply install the default version of git.

yum clean all; yum install git

4.    SVN:
As with Git, the default version found in the CentOS 6 repository works fine for the conversion. Simply install the default version of svn from the base repos; the package is named subversion.

yum clean all; yum install subversion

5.    CVS:
We need to install cvs as well in order to read the cvs repositories.

yum clean all; yum install rcs cvs

6.    Support Tools:
In order to do the CVS to git migration, we need to rsync the entire repository, as opposed to doing a checkout. In order to rsync the repo, we need to ensure that openssh-clients and rsync are both installed.

yum clean all; yum install openssh-clients rsync wget

7.    CVS2SVN:
The last tool that we need is cvs2svn. It provides the cvs2git command that we will use to convert the cvs repo over to git format.

cd /tmp
wget ftp://ftp.pbone.net/mirror/dag.wieers.com/redhat/el6/en/x86_64/dag/RPMS/cvs2svn-2.3.0-1.el6.rf.noarch.rpm
rpm -ivh cvs2svn-2.3.0-1.el6.rf.noarch.rpm
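
Before moving on, it can help to confirm that every tool from the steps above is actually on the PATH. The following is a minimal sketch of such a check; the function name missing_tools is made up for this example, and the tool list simply mirrors the packages installed above.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools installed in steps 3 to 7; cvs2git comes from the cvs2svn package.
missing=$(missing_tools git svn cvs rcs rsync ssh wget cvs2git)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names print on one line.
    echo "Not installed yet:" $missing
else
    echo "All migration tools found"
fi
```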

Rsync the repository:

The first step is to copy the repository from the CVS server down to the host that will perform the conversion. Run rsync from the migration host, placing the project in the newly created /cvs directory.

rsync -av user@cvs.yourcompany.com:/home/cvs/project /cvs
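
Once the rsync finishes, a quick check that the copy really contains RCS history files (names ending in ",v") can catch a wrong source path early. This is only a sketch: count_rcs_files and the REPO_COPY variable are names invented for this example, with REPO_COPY defaulting to the /cvs/project path used above.

```shell
#!/bin/sh
# Count RCS history files (names ending in ",v") under a directory,
# including those in Attic subdirectories.
count_rcs_files() {
    find "$1" -type f -name '*,v' | wc -l | tr -d ' '
}

# REPO_COPY is a hypothetical override; it defaults to the tutorial's path.
REPO_COPY=${REPO_COPY:-/cvs/project}
if [ -d "$REPO_COPY" ]; then
    echo "$REPO_COPY holds $(count_rcs_files "$REPO_COPY") ,v files"
else
    echo "$REPO_COPY does not exist yet; run the rsync first"
fi
```

A count of zero suggests that a checkout, rather than the repository itself, was copied.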

Migration Prep:

The next thing that we need to do is mark the binary files in the cvs data, so that keyword expansion does not corrupt them during the conversion. The rcs -kb command sets the default keyword substitution mode of each matching ,v file to binary. Typically I've needed to run this for any png or xcf files in the repository.

cd /cvs/project
find . \( -name "*.png,v" -o -name "*.xcf,v" \) -print0 | xargs -0 rcs -kb
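
To decide which file types need the binary flag, it can be useful to list the extensions present in the repository first. The sketch below tallies them; list_rcs_extensions and REPO_COPY are hypothetical names for this example, and files without an extension are simply left out of the tally.

```shell
#!/bin/sh
# Tally file extensions among the RCS ",v" files under a directory.
# Output is one "count extension" pair per line, largest count first.
list_rcs_extensions() {
    find "$1" -type f -name '*,v' -exec basename {} ',v' \; |
        awk -F. 'NF > 1 { n[$NF]++ } END { for (e in n) print n[e], e }' |
        sort -rn
}

# REPO_COPY is a hypothetical override; it defaults to the tutorial's path.
REPO_COPY=${REPO_COPY:-/cvs/project}
if [ -d "$REPO_COPY" ]; then
    list_rcs_extensions "$REPO_COPY"
fi
```

Any extension in the output that belongs to a binary format (images, archives, office documents) is a candidate for another -name pattern in the find command.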

Run the CVS migration for a single project:

Now that everything is in place, we will start the cvs to git migration.

1.    Convert the project:
This step will start to churn through the CVS repository, migrating it to git format.

cd /cvs
cvs2git --blobfile=/git/git-blob.dat --dumpfile=/git/git-dump.dat --username=myusername --fallback-encoding=ascii project >> /tmp/cvs_migration/project.log

This step could take several hours depending on both the size of the repository and the number of commits.
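
Because the conversion runs for a long time, it is worth confirming that it actually produced its two output files before starting the import. A minimal sketch, assuming the /git/git-blob.dat and /git/git-dump.dat paths from the command above; the function name conversion_output_ok is invented for this example.

```shell
#!/bin/sh
# Succeed only if every given file exists and is non-empty.
conversion_output_ok() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
}

if conversion_output_ok /git/git-blob.dat /git/git-dump.dat; then
    echo "Conversion output looks complete"
else
    echo "Conversion incomplete; check /tmp/cvs_migration/project.log"
fi
```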

2.    Initialize the git repo, and import the converted cvs history from the blob and dump files created by the conversion:

cd /git
git init --bare project.git
cd project.git
cat /git/git-blob.dat /git/git-dump.dat | git fast-import
git gc --prune=now
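
One way to sanity-check the import is to count the branches and tags that fast-import created and compare them with what you expect from the CVS repository. The sketch below does that with git for-each-ref; count_refs and the REPO variable are hypothetical names, with REPO defaulting to the /git/project.git path used above.

```shell
#!/bin/sh
# Count the refs under a prefix (e.g. refs/heads) in a bare repository.
count_refs() {
    git --git-dir="$1" for-each-ref "$2" | wc -l | tr -d ' '
}

# REPO is a hypothetical override; it defaults to the tutorial's path.
REPO=${REPO:-/git/project.git}
if [ -d "$REPO" ]; then
    echo "branches: $(count_refs "$REPO" refs/heads)"
    echo "tags:     $(count_refs "$REPO" refs/tags)"
    # Verify the integrity of the imported objects.
    git --git-dir="$REPO" fsck
fi
```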

Run the CVS migration for several projects:

If you have multiple projects, you can convert all of them in a single loop using the following methodology.

cd /cvs
for dir in /cvs/*; do
    # The project name is the last path component, e.g. /cvs/project -> project
    PROJECT=$(basename "$dir")
    cvs2git --blobfile=/git/git-blob.dat --dumpfile=/git/git-dump.dat --username=myusername --fallback-encoding=ascii "$dir" >> "/tmp/cvs_migration/$PROJECT.log"
    cd /git
    git init --bare "$PROJECT.git"
    cd "$PROJECT.git"
    cat /git/git-blob.dat /git/git-dump.dat | git fast-import
    git gc --prune=now
done

This step could take several hours depending on both the size of the repository and the number of commits.
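
Since a full run is slow, it can be worth printing what the loop would convert before starting it. The sketch below is a dry run only: plan_migration is a made-up name, and skipping a CVSROOT directory is an assumption for the case where the administrative directory was copied along with the projects.

```shell
#!/bin/sh
# Print "project -> target repository" for each directory under cvs_root,
# without converting anything.
plan_migration() {
    cvs_root=$1
    git_root=$2
    for dir in "$cvs_root"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # Assumption: CVSROOT, if present, is administrative and not a project.
        [ "$name" = CVSROOT ] && continue
        echo "$name -> $git_root/$name.git"
    done
}

if [ -d /cvs ]; then
    plan_migration /cvs /git
fi
```

Each line of output corresponds to one iteration of the conversion loop and one log file in /tmp/cvs_migration.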

Push the repo to Git:

The final step of the migration is to push the new repository or repositories to your git server. Depending on the git server or service that you are using, you will either have to create a new repo for each of your converted cvs repos, or use the service's tools to do a mass import.

1.    Pushing repos manually:

cd /git/project.git
git remote add origin http://gitlab.yourcompany.com/namespace/git_repo_name.git
# The repo is bare and already holds the converted history, so there is nothing to add or commit
git push origin --all
git push origin --tags
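
After pushing, one way to confirm that every branch reached the server is to compare the number of local branches with the number the remote reports. This is a sketch under stated assumptions: branches_match is a name invented for this example, and the remote is expected to be reachable as origin from /git/project.git.

```shell
#!/bin/sh
# Compare the branch count of a local bare repo with that of a remote.
# $1 is the local repository directory, $2 a remote name or URL.
branches_match() {
    local_n=$(git --git-dir="$1" for-each-ref refs/heads | wc -l | tr -d ' ')
    remote_n=$(git --git-dir="$1" ls-remote --heads "$2" | wc -l | tr -d ' ')
    echo "local=$local_n remote=$remote_n"
    [ "$local_n" -eq "$remote_n" ]
}

if [ -d /git/project.git ]; then
    branches_match /git/project.git origin ||
        echo "Branch counts differ; some branches may not have been pushed"
fi
```

A mismatch usually means only a single branch was pushed; the same comparison can be made for tags with refs/tags and ls-remote --tags.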

2.    Pushing repos to GitLab:
GitLab offers a quick and easy way to migrate all of the converted repositories in one go. First create a tar of all of the converted git repos, scp the tar to the GitLab server, and then use the rake import task to import all of the repos into GitLab at once.

cd /git
tar -czvf converted_repos.tar.gz *
scp converted_repos.tar.gz gitlabadmin@gitlab.mycompany.com:/var/opt/gitlab/git-data/repositories/<repo-group>

# On the gitlab server
cd /var/opt/gitlab/git-data/repositories/<repo-group>
tar -xzvf converted_repos.tar.gz
chown -R git:git *
gitlab-rake gitlab:import:repos

Post Requisites:

Go and crack yourself a beer.. you deserve one!