Juju_at_MAAS backup & restore procedures (with rollback)

We explain how to create a backup of a Juju controller on a MAAS cloud and restore it, and how to set up a rollback in case of problems.

Scenario

We have a MAAS environment set up on LXD containers.

We installed the Juju client (2.0) in an LXD container.

We bootstrapped a Juju controller and created a new model called testbck; we then deployed a test machine on the model testbck:

# juju controllers

Controller     Model    User   Access     Cloud/Region  Models  Machines    HA  Version
testbck-orig*  testbck  admin  superuser  maasCSD            -         -  none  2.0.0

# juju models

Controller: testbck-orig

Model       Cloud/Region  Status     Machines  Cores  Access  Last connection
controller  maasCSD       available         1      4  admin   just now
default     maasCSD       available         0      -  admin   42 minutes ago
testbck*    maasCSD       available         3      4  admin   11 minutes ago

Backup

First, let’s create a backup of the controller:

juju create-backup --filename "juju-maas-backup.tgz"
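Before going any further it is worth a quick sanity check of the archive. The function below is only a sketch: check_backup is a hypothetical helper name, and it assumes the archive is a gzip-compressed tar file, as the .tgz name suggests. It confirms that the file exists, is not empty and can be listed; it does not prove that the backup can be restored.

```shell
# check_backup ARCHIVE
# Hypothetical helper: succeeds if ARCHIVE is a non-empty, listable .tgz file.
check_backup() {
    archive="$1"
    if [ ! -s "$archive" ]; then
        echo "missing or empty: $archive" >&2
        return 1
    fi
    # tar -t lists the members without extracting anything
    if ! tar -tzf "$archive" >/dev/null 2>&1; then
        echo "not a readable gzip tar archive: $archive" >&2
        return 1
    fi
    echo "ok: $archive"
}

# Example: check_backup juju-maas-backup.tgz
```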

Setup rollback

We want to upgrade Juju to v2.2, and have a procedure to go back to the original condition if anything goes wrong. We take advantage of the fact that our infrastructure is based on LXD, so it is easy to clone.

  1. Clone the containers hosting the MAAS region and rack controllers
  2. Stop the original containers
  3. Set the same IP addresses on the new containers
  4. Start the new containers

After a while, the MAAS region and rack controllers are synchronized and available again.
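With the lxc client, the clone-and-swap steps above can be sketched roughly as follows. clone_and_swap is a hypothetical helper and the container names in the example are made up. Step 3 is left as a comment on purpose, because the way to assign the addresses depends on how the container network is configured.

```shell
# clone_and_swap ORIGINAL CLONE
# Hypothetical helper covering steps 1, 2 and 4 for one container.
clone_and_swap() {
    original="$1"
    clone="$2"
    lxc copy "$original" "$clone" || return 1   # 1. clone the container
    lxc stop "$original" || return 1            # 2. stop the original
    # 3. give the clone the IP addresses of the original; how to do this
    #    depends on the container network setup, so it is not shown here
    lxc start "$clone"                          # 4. start the clone
}

# Example, with made-up container names:
#   clone_and_swap maas-region maas-region-new
#   clone_and_swap maas-rack maas-rack-new
```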

Let’s now clone the Juju client container, then log in to the new client and update the Juju client packages (sudo su -; apt-get update; apt-get upgrade juju).

Restore

It is important to remember that, to restore a Juju controller from a file, the original controller must be unavailable in MAAS. Do NOT destroy the controller, otherwise the backup becomes useless!

To make it unavailable we must mark the hosting machine as broken or delete it from MAAS.
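From the command line, marking the machine as broken might look like the sketch below. It assumes the MAAS 2.x CLI with a profile that is already logged in; mark_controller_broken is a hypothetical wrapper, and the profile name and system ID in the example are placeholders.

```shell
# mark_controller_broken PROFILE SYSTEM_ID
# PROFILE is a logged-in MAAS CLI profile; SYSTEM_ID identifies the machine
# that hosts the Juju controller.
mark_controller_broken() {
    profile="$1"
    system_id="$2"
    maas "$profile" machine mark-broken "$system_id"
}

# Example, with placeholder values:
#   mark_controller_broken admin abc123
```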

Once the controller is unavailable we can restore the backup with this command:

juju restore-backup -b --constraints tags="tag" --file "juju-maas-backup.tgz" --debug

where -b means “bootstrap a new controller”.

(TODO: Currently Juju 2.2 ignores constraints! Ticket opened with Canonical support)

NOTE: Juju does not change the configuration on the hosted machines. If the controller machine changes its IP addresses, the deployed services can no longer contact it and are shown with status “down”.

To make them available again there are two ways:

  1. Take note of the IP addresses of the old controller and assign them to the target controller via MAAS (static configuration on the network interfaces);
  2. Once the backup is restored:
     • log in to the hosted machines;
     • update the apiaddresses parameter in every /var/lib/juju/agents/machine-<XXX>/agent.conf and /var/lib/juju/agents/unit-<XXX>/agent.conf, setting the IP addresses of the new controller, e.g.:

       apiaddresses:
        - 10.4.1.138:17070
        - 10.4.4.70:17070
        ...

     • restart the jujud-machine-<XXX> and jujud-unit-<XXX> services.
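The second option lends itself to a small script, run as root on each hosted machine. What follows is a sketch, not an official Juju tool: set_apiaddresses, repoint_agents, the AGENTS_DIR override and the .orig suffix are hypothetical names. It assumes that agent.conf stores apiaddresses as a plain YAML list like the one shown above, and that the agents run as systemd services named after their agent directory.

```shell
# set_apiaddresses FILE ADDR [ADDR...]
# Replace the entries under "apiaddresses:" in FILE with the given ADDRs.
set_apiaddresses() {
    file="$1"
    shift
    tmp="$(mktemp)" || return 1
    # Addresses contain no spaces, so they travel as one space-separated string.
    awk -v addrs="$*" '
        /^apiaddresses:/ {
            print
            n = split(addrs, a, " ")
            for (i = 1; i <= n; i++) print "- " a[i]
            skip = 1                  # we are now inside the old list
            next
        }
        skip && /^[ \t]*-/ { next }   # drop the old entries
        { skip = 0; print }           # anything else ends the list
    ' "$file" > "$tmp" && cp "$tmp" "$file"   # cp keeps the owner and mode of FILE
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# repoint_agents ADDR [ADDR...]
# Rewrite every agent.conf under AGENTS_DIR and restart the matching service.
repoint_agents() {
    for conf in "${AGENTS_DIR:-/var/lib/juju/agents}"/*/agent.conf; do
        [ -f "$conf" ] || continue
        [ -f "$conf.orig" ] || cp -p "$conf" "$conf.orig"   # keep a copy for rollback
        set_apiaddresses "$conf" "$@" || return 1
        agent="$(basename "$(dirname "$conf")")"            # e.g. machine-<XXX>
        systemctl restart "jujud-$agent"
    done
}

# Example: repoint_agents 10.4.1.138:17070 10.4.4.70:17070
```

The new entries are written without indentation, which is still a valid YAML list under the apiaddresses key. The untouched copy saved as agent.conf.orig is what we would put back if we later need to point the agents at the old controller again.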

Rollback

Now let’s suppose something is wrong with the restored controller and we want to go back to the previous condition. All we have to do is stop the new MAAS containers and start the original ones, then wait for them to re-synchronize. We’ll eventually find our old controller’s machine back in status “Deployed”, but switched off: it was shut down when we marked it as broken in the new MAAS, so just switch it on again. If necessary, restore the agent.conf files of the deployed services so that they point back to the old controller… and we’re done!
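For the container part of the rollback, a minimal sketch with the lxc client could look like this. rollback_container is a hypothetical helper and the container names in the example are made up; switching the controller machine back on and restoring the agent.conf files are not covered.

```shell
# rollback_container ORIGINAL CLONE
# Hypothetical helper: stop the clone and bring the original container back.
rollback_container() {
    original="$1"
    clone="$2"
    lxc stop "$clone" || return 1   # the clone holds the same IP addresses
    lxc start "$original"
}

# Example, with made-up container names:
#   rollback_container maas-region maas-region-new
#   rollback_container maas-rack maas-rack-new
```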