Subversion is growing in popularity and usage, and while it makes a fantastic version control system, like anything else it needs to be properly backed up.

If Subversion is used as a version control or backup system for other data, then that data becomes exceptionally important, and "backing up the backup" is a wise thing to do.

Fortunately, Subversion gives us the tools and flexibility to do this easily. With a monthly full backup and on-the-fly backups of each commit, backing up Subversion can be painless, and it protects against the rare case of data corruption.

The first step is to add a post-commit hook that dumps each commit as it is made. Assuming the Subversion repository is stored in /subversion/repos/mystuff, edit /subversion/repos/mystuff/hooks/post-commit and add the following line (the hook template sets REPOS and REV from the hook's first and second arguments):

svnadmin dump "$REPOS" --revision "$REV" --incremental >/subversion/backup/incremental/mystuff/commit-$REV

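The backup directory is /subversion/backup/incremental/mystuff; it must already exist and be writable by the user that runs the hook, because the shell redirection will not create it. Every commit is dumped to that directory under a name such as "commit-52" for revision 52.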

If you do not already have a post-commit file, copy the skeleton post-commit.tmpl file to post-commit in the same directory and make it executable; Subversion will not run a hook that is not executable.
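For reference, a complete hook built around that one line could look like the sketch below. It is only an illustration: the dump_commit function, the BACKUP_ROOT variable, and the choice to derive the backup subdirectory from the repository's directory name are assumptions made for this example, not part of the stock template. It creates the backup directory if needed and removes a partial file when the dump fails.

```shell
#!/bin/sh
# Sketch of hooks/post-commit. Subversion passes the repository path
# and the new revision number as the first two arguments.

# Assumption: same backup layout as in the article.
BACKUP_ROOT="${BACKUP_ROOT:-/subversion/backup/incremental}"

# dump_commit is a hypothetical helper name used only in this sketch.
dump_commit() {
    repos="$1"
    rev="$2"
    name=`basename "$repos"`
    dest="${BACKUP_ROOT}/${name}"
    mkdir -p "$dest" || return 1
    # Dump just this revision; remove the file again if the dump fails
    # so that no truncated commit-N file is left behind.
    if svnadmin dump -q "$repos" --revision "$rev" --incremental >"${dest}/commit-${rev}"; then
        return 0
    fi
    rm -f "${dest}/commit-${rev}"
    return 1
}

# Only act when called the way Subversion calls a hook.
if [ "$#" -ge 2 ]; then
    dump_commit "$1" "$2"
fi
```

Hooks run with a very sparse environment, so if svnadmin is not found you may need to spell out its full path for your system.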

Monthly backups have two parts: a full hotcopy of the Subversion repository, and a new dump baseline for subsequent commits. On a busy Subversion server the incremental dumps add up quickly, so every month a new full dump is taken as a baseline and the previous month's commit dumps are removed (they are part of the new baseline). The script below does both. To automate it, save it as /etc/cron.monthly/svn-fullbackup and make it executable:

#!/bin/sh

date="`date +%Y-%m`"
svnbasedir="/subversion/repos"
svnfullbkdir="/subversion/backup/full"
svnincbkdir="/subversion/backup/incremental"

echo "+++ Backing up subversion repositories..."
for repo in mystuff ; do
    echo ""
    # Full hot copy into a directory named after the current month.
    mkdir -p "${svnfullbkdir}/${date}"
    /usr/bin/hot-backup.py "${svnbasedir}/${repo}" "${svnfullbkdir}/${date}"
    if [ "$?" != "0" ]; then
        echo "!! Hot backup failed on repository: ${repo}"
    fi

    # New text-dump baseline, written to a temporary directory first.
    lastrev="$(svnlook youngest "${svnbasedir}/${repo}")"
    echo "** Creating new baseline for ${repo} (at revision ${lastrev})..."
    mkdir -p "${svnincbkdir}/${repo}" "${svnincbkdir}/${repo}.tmp"
    svnadmin dump -q "${svnbasedir}/${repo}" >"${svnincbkdir}/${repo}.tmp/baseline-${lastrev}"
    # Note: "=" rather than "==", and no brace expansion, so this works in plain sh.
    if [ "$?" = "0" ]; then
        # The dump succeeded: drop the old baseline and the commits it now covers.
        rm -f "${svnincbkdir}/${repo}"/baseline* "${svnincbkdir}/${repo}"/commit*
        mv -f "${svnincbkdir}/${repo}.tmp/baseline-${lastrev}" "${svnincbkdir}/${repo}/"
    else
        echo "!! Creating new baseline failed! Left the old baseline and commits intact!"
    fi
    rm -rf "${svnincbkdir}/${repo}.tmp"
done

Depending on how subversion is installed, the path to hot-backup.py may need to be adjusted; that script comes with subversion and its resulting installed location may depend on how the vendor packaged subversion for your particular Linux distribution.
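If hot-backup.py is not available on your system at all, the svnadmin hotcopy subcommand can take its place. The sketch below is one way to do that, not a drop-in copy of what hot-backup.py does: the hotcopy_repo function name and the "name-revision" directory naming are choices made for this example.

```shell
#!/bin/sh
# Sketch: full hot copy using only svnadmin and svnlook.

# hotcopy_repo is a hypothetical helper name used only in this sketch.
hotcopy_repo() {
    repo_path="$1"      # e.g. /subversion/repos/mystuff
    dest_dir="$2"       # e.g. /subversion/backup/full/2007-08
    name=`basename "$repo_path"`
    rev=`svnlook youngest "$repo_path"` || return 1
    target="${dest_dir}/${name}-${rev}"
    # Refuse to overwrite an earlier copy of the same revision.
    if [ -e "$target" ]; then
        echo "!! ${target} already exists, skipping" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    svnadmin hotcopy "$repo_path" "$target"
}

# Example call (commented out so the sketch has no side effects):
# hotcopy_repo /subversion/repos/mystuff "/subversion/backup/full/`date +%Y-%m`"
```

In the monthly script, the call to hotcopy_repo would replace the /usr/bin/hot-backup.py line; the exit status check that follows it can stay as it is.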

Now, every month, a new baseline of the entire repository is created in two forms: a text dump, which can easily be reloaded if the Subversion repository develops problems, and a hot backup of the full repository. If worst comes to worst, the repository can be restored from the hot backup, and subsequent changes can then be reloaded from the incremental dumps that the post-commit hook writes on every commit.
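A restore from the text dumps can be sketched as follows, assuming the backup directory holds one baseline-N file plus the commit-M files written since. The function names list_commits_after and restore_repo are made up for this example. The one subtle point is ordering: the commit files have to be loaded in numeric order, and a plain shell glob would put commit-100 before commit-11.

```shell
#!/bin/sh
# Sketch: rebuild a repository from baseline-N plus later commit-M dumps.

# Print the revision numbers of commit-M files with M greater than $2,
# in numeric order. Files that are not named commit-<number> are ignored.
list_commits_after() {
    dir="$1"
    after="$2"
    for f in "$dir"/commit-*; do
        [ -f "$f" ] || continue
        n="${f##*commit-}"
        case "$n" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$n" -gt "$after" ]; then
            echo "$n"
        fi
    done | sort -n
}

# Create a new repository and load the baseline, then each later commit.
restore_repo() {
    bkdir="$1"      # e.g. /subversion/backup/incremental/mystuff
    newrepo="$2"    # path of the repository to create
    baseline=`ls "$bkdir"/baseline-* 2>/dev/null | head -n 1`
    if [ -z "$baseline" ]; then
        echo "!! no baseline found in ${bkdir}" >&2
        return 1
    fi
    baserev="${baseline##*baseline-}"
    svnadmin create "$newrepo" || return 1
    svnadmin load -q "$newrepo" <"$baseline" || return 1
    for n in `list_commits_after "$bkdir" "$baserev"`; do
        svnadmin load -q "$newrepo" <"${bkdir}/commit-${n}" || return 1
    done
}

# Example call (commented out so the sketch has no side effects):
# restore_repo /subversion/backup/incremental/mystuff /subversion/repos/mystuff.restored
```

Restoring into a fresh path and swapping it in afterwards keeps the damaged repository around until the restored copy has been checked.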

In addition, because each hot backup goes to its own dated directory, you can roll the repository back to a specific point in time if required. For high-activity repositories, the script could be made more robust by stopping Apache before the backup and restarting it afterwards (if feasible, and assuming Apache is used to front-end the repository in the first place) so that no commits arrive during the backup. You could also verify the repository before backing it up to make sure the backup is sound; however, I recommend verifying the repository daily so that you spot problems as soon as they arise. This can be done by creating /etc/cron.daily/svn-verify with:

#!/bin/sh

svnbasedir="/subversion/repos"
echo "Verifying subversion repositories..."
for repo in mystuff ; do
    # Pass the name as an argument so a "%" in it cannot be read as a format.
    printf "Verifying: %s\t\t" "${repo}"
    svnadmin verify "${svnbasedir}/${repo}" >/dev/null 2>&1
    if [ "$?" != "0" ]; then
        printf " FAILED!\n"
    else
        printf " ok\n"
    fi
done

Of course, either script can verify or back up more than one Subversion repository: change "for repo in mystuff" to "for repo in mystuff stuff private" to act on the repositories mystuff, stuff, and private.
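If the list of repositories changes often, it can also be built automatically. The sketch below is one possible approach, and the list_repos name is made up for this example: it treats every directory directly under the base directory that contains a format file, a db directory, and a hooks directory as a repository. It assumes repository names contain no whitespace.

```shell
#!/bin/sh
# Sketch: print the names of all repositories directly under a base directory.

# list_repos is a hypothetical helper name used only in this sketch.
list_repos() {
    basedir="$1"
    for d in "$basedir"/*/; do
        # A repository directory has (among others) these three entries.
        if [ -f "${d}format" ] && [ -d "${d}db" ] && [ -d "${d}hooks" ]; then
            basename "$d"
        fi
    done
}

# Example use in either script (commented out so the sketch has no side effects):
# for repo in `list_repos "$svnbasedir"` ; do
#     ...
# done
```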

This was published in Open Sourcery; check every Monday for more stories.

Comments

1

Maciek - 29/08/07

Your article was very helpful. Thanks! However, the hot-copy.py script seems to have been deprecated and eventually removed since version 1.0 of Subversion. The svnadmin hotcopy command should be used instead.

4

ulfben - 29/06/09

I posted an improved version of Vincent's script on the ubuntu-server list. It has some additional features:
* it will compress (tar.bz2) a hotcopy of every repo in the basedir and any subdir
* it will do a hotcopy only if the repository has been touched since the last backup
* it will log all activities to a text file for your convenience

http://www.mail-archive.com/ubuntu-server@lists.ubuntu.com/msg01022.html
