dirvish

This backup system saves a lot of space by using rsync to create a “hardlink farm” to store “incremental” backups. This way, every image is a complete directory tree and backups can be restored just by copying the directory.

We use it a lot to create nightly backups of directory trees of up to 100 GB of small mail files. Usually we mirror the data from a remote server and back it up with dirvish from this local mirror directory. It is also possible to use dirvish directly with remote servers.

Dirvish is no longer developed. It still runs stably, though there are some known problems.

See the documentation for configuration: http://www.dirvish.org/FAQ.html

Problems

Dirvish can cause a lot of IO load, mainly when expiring images, because it needs to traverse the whole directory tree to check whether other copies of the files exist. This can even cause database servers to hang, because they wait too long for a lock on a file. Fix: avoid overloading dirvish. Overload can happen if you run it too often (like every three hours) and then expire too many images at once. If you back up more often than daily, run dirvish-expire --vault xy before you back up the vault again, not only once per night.
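One way to chain the expire step in front of each run is a cron fragment like the following sketch. The vault name "mail" and the six-hour schedule are placeholders, not from the dirvish docs; the --vault option is the one used above.

```shell
# /etc/cron.d/dirvish-mail (sketch): expire old images of the vault at
# idle IO priority right before each backup run, every six hours.
# Vault name "mail" is a placeholder; adjust paths to your setup.
0 */6 * * * root ionice -c 3 dirvish-expire --vault mail && ionice -c 3 dirvish --vault mail
```

Running the expire per vault and per run keeps the number of images expired at once small, which is the point of the fix above.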

“No good unexpired images found” - we found no definite cause. Maybe it is connected to multiple expire definitions: one in master.conf and one in the vault's default.conf. Maybe the reason is that multiple images expire on the same day: the code may not support that.

Workaround: change the expires-at date in the summary file of the first image to Never or some date far in the future, so it never gets expired.
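A minimal sketch of that workaround, assuming the expiry is stored in a line starting with "Expire:" in the image's summary file - check your own summary files before relying on this; the image path is an example:

```shell
# Mark one image as never-expiring so dirvish always keeps a reference
# image. Assumption: the summary file has a line "Expire: ...".
img=/bank/vault/20160103-2344          # example image directory
if [ -f "$img/summary" ]; then
    sed -i 's/^Expire:.*/Expire: never/' "$img/summary"
fi
```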

Tricks

Deleting images

Removing with rm causes a lot of IO load and can impact other services. Here is a gentler way to delete images:

cd imagedir/
mkdir empty
# delete single image:
ionice -c 3 nice -19 rsync -a --delete empty/ 20170418-0239/

# delete all images from 2017:
for d in ./*; do
  if [ "${d:2:4}" = "2017" ]; then
    echo "removing $d"
    ionice -c 3 nice -n 19 rsync -a --delete --stats empty/ "$d"/
  fi
done

Copy a vault to another Server

Rsync can copy the hardlinks correctly - even over the network. People have reported memory problems, but for me this worked with 100 GB of small files copied over a slow internet link:

rsync --progress -avrltH --delete -pgo --stats -D --numeric-ids -x --compress  -e "ssh -p 12345" root@yourserver:/sourcedir/ /local_targetdir/

Show disk usage

By default du -sch does not show the correct directory sizes, because it does not handle the hardlinks correctly. This is slow, but works well:

ls -1d /bank/vault/* | grep -v dirvish | xargs du -smch 

Example Output:

61G     /bank/vault/20160103-2344
9.1G    /bank/vault/20160403-0721
...
linux/backup/dirvish.txt · Last modified: 2017/12/12 02:33 by tkilla