PyCon.de: the borgbackup project - Thomas Waldmann

Tags: pycon, python

(One of my summaries of a talk at the 2017 PyCon.de conference.)

Borgbackup is 2.5 years old, but the code is older: it is a fork of Attic. Thomas discovered Attic after someone blogged about it. They forked it to get more collaboration and quicker releases.

Borg backup is a backup tool. There are 1000 backup tools, so what’s different? Borg is one you might actually enjoy using. The features sound logical: simple, efficient, safe, secure. How borg sees this:

  • Simple. Each backup is a full backup. Restore? Just do a FUSE mount. Easy pruning of old backups.

    Tooling: it is just borg, ssh and a shell. It is a single-file binary. There’s good filesystem and OS support.

    There’s good documentation.

  • Efficient. It is very fast for unchanged files. Every backup is a full backup, but unchanged files don’t need to be handled a second time.

    Chunk deduplication, sparse file support, flexible compression scheme.

    Compression is chunk-based, it doesn’t compress the whole file at once.

  • Safe. Checksums, transactions, filesystem syncing, atomic operations. Checkpoints while backing up.

    You can have off-site remote repositories.

  • Secure. Authenticated encryption. There’s nothing to see in the repo: borg doesn’t trust the backup host, everything is encrypted.

    Tampering/corruption detection. SSH transport for remote connections. Append-only mode repos.

    It is open source: you can see the code.
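
The "very fast for unchanged files" point can be illustrated with a small sketch. The talk summary doesn't say how borg decides that a file is unchanged, so the size-plus-mtime fingerprint and the names `changed_files` and `cache` below are my own assumptions, purely for illustration:

```python
import os


def changed_files(paths, cache):
    """Return the paths that need to be read again.

    `cache` maps a path to the (size, mtime) pair seen during the
    previous run. This fingerprint is an assumption made for this
    sketch, not borg's actual change detection.
    """
    to_process = []
    for path in paths:
        st = os.stat(path)
        fingerprint = (st.st_size, st.st_mtime_ns)
        if cache.get(path) != fingerprint:
            # New or modified file: it has to be read and chunked again.
            to_process.append(path)
            cache[path] = fingerprint
        # Otherwise the file is skipped; the new backup can simply
        # refer to the data that was already stored last time.
    return to_process
```

Calling `changed_files` twice with the same `cache` returns every path the first time and an empty list the second time, as long as nothing changed on disk in between.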

About deduplication: borg deduplicates in several ways. Similar files don’t need to be stored twice. Unchanged files don’t need to be stored a second time. And there’s chunking: bigger files are chopped up and the parts get the deduplication treatment.
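
To make the chunking idea concrete, here is a minimal sketch of a chunk store that deduplicates by content hash and compresses each chunk separately. It is not borg's code: the class `ChunkStore`, the fixed chunk size, SHA-256 as chunk id and zlib as compressor are all assumptions chosen to keep the example short.

```python
import hashlib
import zlib


class ChunkStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self, chunk_size=4):
        # A tiny fixed chunk size keeps the demo readable; this is a
        # simplification and not necessarily how borg cuts files.
        self.chunk_size = chunk_size
        self.chunks = {}  # chunk id -> compressed chunk bytes

    def add_file(self, data):
        """Store `data` and return the list of chunk ids describing it."""
        ids = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            if chunk_id not in self.chunks:
                # Only chunks that were never seen before cost space.
                # Compression happens per chunk, not per file.
                self.chunks[chunk_id] = zlib.compress(chunk)
            ids.append(chunk_id)
        return ids

    def read_file(self, ids):
        """Rebuild the original bytes from a list of chunk ids."""
        return b"".join(zlib.decompress(self.chunks[i]) for i in ids)
```

Storing `b"aaaabbbbcccc"` and then `b"aaaabbbbdddd"` in one `ChunkStore` leaves four chunks in the store instead of six, because the first two chunks of both files are identical. Each "file" is just a list of chunk ids, which is also why every backup can be a full backup without storing everything again.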

About the code: 90% is Python (the high-level logic), 5% is Cython and 5% is pure C. Testing: pytest and tox (for testing with various Python versions). They use pyenv.

PyInstaller makes a single-file binary. It bundles your Python/Cython/C code with the Python binary of your choice. Only glibc needs to come from the OS.

They use GPG to sign the releases. Even all the commits in git are signed.

Documentation: Sphinx. They also reuse the argparse output for man pages and the Sphinx documentation. The README is included in Sphinx. Make the effort to write a good README: it is your “elevator pitch”.
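
Reusing argparse output for documentation could look roughly like the sketch below. It is not how borg's docs are actually built: the tool name `mytool`, its arguments and the helper `help_as_rst` are made up for the example. The only real API used is argparse's `format_help()`.

```python
import argparse


def build_parser():
    # Hypothetical command line, just to have something to document.
    parser = argparse.ArgumentParser(
        prog="mytool", description="Hypothetical backup tool."
    )
    parser.add_argument("source", help="directory to back up")
    parser.add_argument(
        "--dry-run", action="store_true", help="do not write anything"
    )
    return parser


def help_as_rst(parser):
    """Wrap the parser's own help text in a reStructuredText literal block."""
    help_text = parser.format_help()
    indented = "\n".join(
        "    " + line if line else "" for line in help_text.splitlines()
    )
    title = parser.prog
    underline = "=" * len(title)
    return f"{title}\n{underline}\n\n::\n\n{indented}\n"
```

Writing the result of `help_as_rst(build_parser())` to a `.rst` file gives Sphinx a page that stays in sync with the command line options, because both come from the same parser definition.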

Tip: https://asciinema.org/ for documenting how your CLI app works.

https://abload.de/img/screenshot2017-10-24asms7y.png

Photo explanation: simply a picture from my train trip (with a nice planned detour through the Eifel) from Utrecht (NL) to Karlsruhe (DE). Station hall in Trier with rail lines. Half of them aren’t operational anymore.

 
Reinout van Rees

My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.
