Backups... and restores

Posted by Wxcafé on Mon 07 May 2018

So, as you might have noticed if you’re following me on Twitter/Mastodon, or if you check your RSS reader logs, or if you just happened to check this website in the last week, my server was down for about four days last week following a hardware failure. Here’s what happened.

So, on Monday morning (30th of April), I started seeing hardware errors in dmesg and broadcast on consoles. I figured that a kernel message about a hardware failure that was broadcast on all consoles was probably important enough to at least investigate, and I found out that it was related to the motherboard dying.

I immediately opened a ticket with my hosting provider (Online.net) to ask them to replace the motherboard. It took them 5 hours to react, and in the meantime the server had gone down. I pressed them; the support agent tried to reboot the machine in rescue mode, which obviously didn’t work since the mobo was toast, and then decided that the machine was lost and gave me a new one. Which meant that I didn’t have access to my data anymore.

I tried to have them plug the disk from the old machine into the new one, but they “couldn’t do that on this hardware”. (I’ve since checked, and that hardware uses 2.5” SATA drives, which means they’d only have had to unplug the disk from the old machine and put it in the new one. At the most, four screws might be involved. But anyway.) So they told me that they were sorry but I’d have to restore from my backups.

Which, thankfully, I had! Complete backups from that same day, 4:15am. Obviously the situation would have been much worse otherwise, and I blessed the day I had decided to set up a sensible backup strategy. So I set to work on restoring these.

My backups are managed via duplicity; I have a setup where the first puppet run on a server installs some basic backup definitions, and then some more targeted configuration once the server is set up, depending on what it’s used for. This setup is described at the end of the post, if you’re interested.

Anyway, these are broken up into what duplicity calls “targets”, which are sets of folders that are backed up with the same rules (frequency, time before expiration, etc…). The main ones in my setup are homedir, which includes… my home directory, yes; conf_file, which includes /etc, /var, /opt and /usr/local; srv_data, which includes most of /srv; and finally mysql and pgsql, which each have a pre-run hook that dumps the respective databases and then back up the resulting dump.

So, on the evening of the 30th, I started restoring these. After fiddling for a bit to figure out how duplicity restores work, I started restoring the homedir target. And that’s when I found out that restoring data from an sftp server running behind an ADSL connection takes ages, a fact that’s only made worse by duplicity’s insistence on first fetching the signature files and indexes for all the full backups, and not just the latest applicable ones. In this case, it took about three days.
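To get a feel for why a restore over ADSL drags on for days, here’s a rough back-of-the-envelope sketch. The function name, the 30 GB backup size, the 1 Mbit/s upstream rate and the 0.8 efficiency factor are all assumptions made up for illustration, not measurements from this incident.

```python
def estimate_restore_hours(size_gb: float, link_mbit_s: float,
                           efficiency: float = 0.8) -> float:
    """Rough transfer time in hours for size_gb gigabytes over a link.

    link_mbit_s is the usable rate of the slowest hop (here, the ADSL
    upstream of the backup host); efficiency is a guessed factor for
    protocol overhead.
    """
    if size_gb < 0 or link_mbit_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, rate > 0, efficiency in (0, 1]")
    size_bits = size_gb * 8 * 1000**3                      # decimal gigabytes
    seconds = size_bits / (link_mbit_s * 1000**2 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical numbers: 30 GB pulled through a 1 Mbit/s ADSL upstream.
    hours = estimate_restore_hours(30, 1.0)
    print(f"{hours:.1f} h (~{hours / 24:.1f} days)")
```

Running the same estimate before an incident, with the real backup size and link speed, gives a quick idea of whether the downtime it implies is acceptable.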

I managed to restore email first, as that was the most urgent, to avoid having bounces (most MTAs retry for 3-5 days before giving up on delivery), and then slowly worked my way through restoring all of /var (including the cache, which I had forgotten to exclude from my backups…), and /srv/pub, which holds https://pub.wxcafe.net and https://wxcafe.net/pub, and which included (among other things) a few HD movies, some taking over 4GB.

Needless to say, this restore took a long time. I’ve learned a few lessons from that whole thing, though:

  • never assume the hosting provider is gonna do the right thing
  • decide how much downtime you are willing to live with
  • check your backups regularly and see how fast they restore
  • define prioritized restoration targets (e.g. website and mail server; that XMPP server can probably wait)
  • don’t stress out too much about this. It’s gonna be okay, and rebuilding can always work. You’ll find a solution.
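The “prioritized restoration targets” idea can be written down ahead of time as a tiny restore plan. This is only a sketch: the RestoreTarget class, the priority ranks and the sizes are hypothetical, and only the paths and profile names come from the config at the end of this post.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RestoreTarget:
    path: str       # directory to restore
    profile: str    # backup profile it belongs to
    priority: int   # hypothetical rank: lower numbers are restored first
    size_gb: float  # hypothetical size estimate


def restore_order(targets: Iterable[RestoreTarget]) -> List[RestoreTarget]:
    """Sort by priority, then smallest first so quick wins come back early."""
    return sorted(targets, key=lambda t: (t.priority, t.size_gb))


if __name__ == "__main__":
    targets = [
        RestoreTarget("/srv/pub/", "srv_data", priority=3, size_gb=40.0),
        RestoreTarget("/srv/mail/", "srv_data", priority=1, size_gb=5.0),
        RestoreTarget("/srv/www", "srv_data", priority=2, size_gb=1.0),
        RestoreTarget("/home/", "homedir", priority=2, size_gb=10.0),
    ]
    for t in restore_order(targets):
        print(f"{t.priority}  {t.path}  ({t.profile})")
```

Keeping a list like this next to the backup config means the order is already decided when things are on fire: mail first, bulky stuff like /srv/pub last.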

Anyways, in the end all I lost was a few months of my RSS subscriptions, which, while annoying, is definitely something I can live with. It worked out alright in the end.

Now for that puppet/duplicity config…

I use the very good puppet-duplicity module, which defines most of what you need already. It so happens that there’s a bug in the paramiko version most of my servers have, so I have taken to replacing the file with that bug with a fixed version, which you can find here. I then define a backups class that can be used wherever it’s needed in host definitions:

## Puppet backups with duplicity

# definitions

class backups {
    file { '/var/backups/mysql/':
        ensure => directory,
    }

    file { '/var/backups/pgsql/':
        ensure => directory,
    }

    class { 'duplicity':
        backup_target_url      => "sftp://censored//srv/backups/$hostname",
        backup_target_username => 'duplicity',
        backup_target_password => 'censored',
    }

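    ## dirty hotfix: overwrite duplicity's paramiko backend with a patched copy
    ## (the install path differs between FreeBSD and Debian-style systems)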
    if $facts['os']['name'] == 'freebsd' {
        file { '/usr/local/lib/python2.7/site-packages/duplicity/backends/_ssh_paramiko.py':
            ensure  => present,
            content => file('base/backups/_ssh_paramiko.py'),
            require => Package['duply']
        }
    } else {
        file { '/usr/lib/python2.7/dist-packages/duplicity/backends/_ssh_paramiko.py':
            ensure  => present,
            content => file('base/backups/_ssh_paramiko.py'),
            require => Package['duply']
        }
    }

    if $facts['os']['name'] == 'freebsd' {
        package {'py27-pip':
            ensure => present,
        }
        package {'py27-cryptography':
            ensure => present,
        }
    } else {
        package {'python-pip':
            ensure => present,
        }
        package {'python-cryptography':
            ensure => present
        }
    }

    duplicity::profile { 'conf_file':
        full_if_older_than => "2W",
        max_full_backups   => 3,
        cron_hour          => '05',
        cron_minute        => '20',
        cron_enabled       => true,
        gpg_encryption     => false
    }

    duplicity::profile {'homedir':
        full_if_older_than => "1M",
        max_full_backups   => 3,
        cron_hour          => '04',
        cron_minute        => '40',
        cron_enabled       => true,
        gpg_encryption     => false,
    }

    duplicity::profile {'srv_data':
        full_if_older_than => "1M",
        max_full_backups   => 3,
        cron_hour          => '05',
        cron_minute        => '35',
        cron_enabled       => true,
        gpg_encryption     => false
    }

    duplicity::profile { 'pgsql':
        full_if_older_than  => "1W",
        max_full_backups    => 2,
        cron_hour           => '04',
        cron_minute         => '20',
        cron_enabled        => true,
        gpg_encryption      => false,
        exec_before_content => 'sudo pg_dumpall -h 127.0.0.1 -U postgres -f /var/backups/pgsql/db.sql'
    }

    duplicity::profile { 'mysql':
        full_if_older_than  => "1W",
        max_full_backups    => 2,
        cron_hour           => '04',
        cron_minute         => '20',
        cron_enabled        => true,
        gpg_encryption      => false,
        exec_before_content => 'sudo mysqldump -pcensored --all-databases --result-file=/var/backups/mysql/db.sql'
    }

}

And then here’s a sample from a node definition:

node 'yoshi.wxcafe.net' {
    $physical_location = "Iliad - DC2, Vitry-sur-Seine"
    include base
    include backups

    duplicity::file {'/var/backups/mysql/':
        profile => 'mysql',
        ensure  => 'present'
    }

    duplicity::file {'/var/backups/pgsql':
        profile => 'pgsql',
        ensure  => 'present'
    }

    duplicity::file {'/etc/':
        profile => 'conf_file',
        ensure  => 'present'
    }

    duplicity::file {'/var/':
        profile => 'conf_file',
        ensure  => present
    }

    duplicity::file {'/usr/local/':
        profile => 'conf_file',
        ensure  => present
    }

    duplicity::file {'/opt/':
        profile => 'conf_file',
        ensure  => present
    }

    duplicity::file {'/srv/lists/':
        profile => 'srv_data',
        ensure  => present
    }

    duplicity::file {'/srv/mail/':
        profile => 'srv_data',
        ensure  => present
    }

    duplicity::file {'/srv/pub/':
        profile => 'srv_data',
        ensure  => present
    }

    duplicity::file {'/srv/rpg/':
        profile => 'srv_data',
        ensure  => present
    }

    duplicity::file {'/srv/wallabag/':
        profile => 'srv_data',
        ensure  => present
    }

    duplicity::file {'/srv/www':
        profile => 'srv_data',
        ensure  => present
    }

    duplicity::file {'/home/':
        profile => 'homedir',
        ensure  => present,
    }
}