Rok 0.10

Rok Gateway

Before running the migration, update the Gateway's configuration as follows.

In /etc/rok/gw/backend.conf.py:

  • Replace the times parameter of any periodic version retention rule with the retain parameter, which is an interval denoting how long versions should be retained by the rule.
  • Rename the interval parameter of any age version retention rule to retain, without altering its value.

For instance, suppose you have the following version retention rules in your configuration, retaining all versions for 12 hours, one version of each object every 3 days, 10 times (i.e., for 30 days), and one version of each object every 1 month forever:

[{"strategy": "age",
  "params": {"interval": {"count": 12,
                          "unit": "hours"}}},
 {"strategy": "periodic",
  "params": {"interval": {"count": 3,
                          "unit": "days"},
             "start": "2018-01-01T00:00:00+00:00",
             "times": 10}},
 {"strategy": "periodic",
  "params": {"interval": {"count": 1,
                          "unit": "months"},
             "start": "2018-01-01T00:00:00+00:00",
             "times": null}}]

The equivalent rules using the updated format would be:

[{"strategy": "age",
  "params": {"retain": {"count": 12,
                        "unit": "hours"}}},
 {"strategy": "periodic",
  "params": {"interval": {"count": 3,
                          "unit": "days"},
             "start": "2018-01-01T00:00:00+00:00",
             "retain": {"count": 30,
                        "unit": "days"}}},
 {"strategy": "periodic",
  "params": {"interval": {"count": 1,
                          "unit": "months"},
             "start": "2018-01-01T00:00:00+00:00",
             "retain": null}}]

In all service configuration files under /etc/rok/gw/api/, update the variable syntax in the values of all parameters to use double instead of single braces. For instance, if a service parameter was using the value disk_{meta:disk_name} as its default value, replace it with disk_{{meta:disk_name}}.
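
To locate values that still use the old single-brace syntax, you can grep the configuration files before editing them (a rough sketch; the pattern is an assumption and the second grep merely filters out lines that already use double braces):

$ grep -rnE '\{[a-z_]+:[^{}]+\}' /etc/rok/gw/api/ | grep -v '{{'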

With the Gateway gunicorn process and daemons stopped, run the following command to apply migrations to the Gateway's store:

$ rok-gw-manage migrate

For more information on the available options, you can use the rok-gw-manage migrate --help and rok-gw-manage showmigrations commands.

Registered Roks

Rok v0.10 changes the way Roks are registered with an Indexer. More specifically, users that registered their Roks prior to v0.10 will need to register them again. This time, registrations are global, meaning that once a user has registered a Rok with an Indexer, the other users of the same Rok do not need to register it again.

Currently, Rok does not have the concept of administrators, so we advise that the actual Rok administrator registers the Rok(s) first, e.g., by publishing a dummy bucket. This will not be necessary once Rok has support for administrators.

Named OSD Partitions

Introduction

Rok v0.10 supports named OSD partitions. For this, a migration is required to rename the old OSD partitions, which use numeric IDs. For a default Rok installation, the mapping between the old partition IDs and the new partition names is:

  • 3e8 -> maps-named
  • 3e9 -> maps-uuid
  • 3ea -> maps-ca
  • 3eb -> chocks-idx
  • 3ec -> chocks-ca
  • 3ed -> maps-nodelocal

In addition, use the --old-object-names flag on filed/cephd. This flag ensures that the old objects, prefixed with 'e:' at the backend, will continue to work until a proper migration renames them.

Upgrade steps

Do the migration with every filed/cephd in the cluster shut down. Upgrade Rok to v0.10 and make sure that all filed/cephd instances are down.

Note

In case the Rok appliance is using Ceph, you can perform the upgrade without stopping the rok-cephd daemons. In this case you must ensure that the pools are renamed, the --old-object-names flag is set, and Rok is upgraded to v0.10 before any daemon is restarted. This can be achieved with rolling upgrades of RokE appliances.

Then, assuming a default Rok installation, for filed run the following:

# mv $filed_rootdir/3e8 $filed_rootdir/maps-named
# mv $filed_rootdir/3e9 $filed_rootdir/maps-uuid
# mv $filed_rootdir/3ea $filed_rootdir/maps-ca
# mv $filed_rootdir/3eb $filed_rootdir/chocks-idx
# mv $filed_rootdir/3ec $filed_rootdir/chocks-ca
# mv $filed_rootdir/3ed $filed_rootdir/maps-nodelocal

Note

The last folder ($filed_rootdir/3ed) will not exist if nodelocal mode has not been enabled in composer (which will most certainly be the case for installations migrating from v0.9).
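
If you prefer, the renames can also be scripted as a loop that skips missing partitions (a minimal sketch, assuming the default mapping listed above and that $filed_rootdir points to filed's root directory):

for pair in 3e8:maps-named 3e9:maps-uuid 3ea:maps-ca \
            3eb:chocks-idx 3ec:chocks-ca 3ed:maps-nodelocal; do
    old=${pair%%:*}
    new=${pair#*:}
    # 3ed may be missing when nodelocal mode has not been enabled.
    [ -d "$filed_rootdir/$old" ] && mv "$filed_rootdir/$old" "$filed_rootdir/$new"
done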

For cephd, run the following:

# ceph osd pool rename $ceph_osd_namespace:0x3e8 $ceph_osd_namespace:maps-named
# ceph osd pool rename $ceph_osd_namespace:0x3e9 $ceph_osd_namespace:maps-uuid
# ceph osd pool rename $ceph_osd_namespace:0x3ea $ceph_osd_namespace:maps-ca
# ceph osd pool rename $ceph_osd_namespace:0x3eb $ceph_osd_namespace:chocks-idx
# ceph osd pool rename $ceph_osd_namespace:0x3ec $ceph_osd_namespace:chocks-ca
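
To confirm that the renames took effect, you can list the pools and filter by the namespace (assuming the ceph CLI on that node is configured against the same cluster):

# ceph osd pool ls | grep "^$ceph_osd_namespace:"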

After that, start the filed/cephd service on one node and run rok-composer-tool verify to verify that the migration was done correctly. Finally, start the rest of the filed/cephd services.

Indexer

Apply any database migrations introduced in v0.10 with:

$ rok-indexer-manage migrate

RokE appliance

This section focuses on how to upgrade an existing cluster of RokE appliances. This is the first version that uses the RokE cluster configuration mechanism, and for this reason you need to do some extra manual steps to upgrade RokE appliances.

We suggest the following procedure:

  • Rename OSD partitions.
  • Upgrade one RokE appliance that is currently not the master, e.g. rok1.
  • Initialize RokE config based on existing setup.
  • Stop all clusterd instances so that a master failover takes place to rok1.
  • For each one of the remaining members:
      - Evacuate the physical host that the RokE appliance is running on.
      - Permanently remove the RokE appliance.
      - Remove the member from the cluster.
      - Re-install the RokE appliance.
      - Re-join the cluster, using the rok1 member as the existing member.
  • Do the above for rok1 as well.

The above procedure guarantees that rok-init will run again and configure the RokE appliances without requiring the user to make any manual changes. The only side effect is that the new appliances will be renumbered, e.g., instead of rok{0..9} you will now have rok{10..19}.

Note

The above procedure has been tested with external Ceph as the storage backend.

Rename the OSD partitions as described in the previous section.

To upgrade one RokE appliance, shut it down, replace the boot disk with the new image, and start it up again. In case you are using a Ganeti cluster, use:

sudo python ./scripts/rok-cluster-upgrade-ganeti --rapi-host GANETI_MASTER rok1.ROK_CLUSTERNAME

Note

You will probably have to grow the boot disk first, since the new image is larger than the previous one.
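
For instance, assuming the appliance is a Ganeti instance and its boot disk is disk 0, the disk could be grown from the Ganeti master with something like the following (rok1.ROK_CLUSTERNAME is the instance name used in the command above; the 1G increment is only illustrative):

gnt-instance grow-disk rok1.ROK_CLUSTERNAME 0 1G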

To avoid passing the --cluster option every time, simply export the corresponding environment variables:

export ROK_CLUSTERNAME="roke.example.com"
export MASTER_IP="192.168.0.100"

Initialize the RokE cluster configuration:

rok-config init

To list the available variables, use:

rok-config get --detail

Set the basic variables:

rok-config set -f cluster.type=enterprise
rok-config set -f cluster.storage.backend=EXT_CEPH
rok-config set -f cluster.name=$ROK_CLUSTERNAME
rok-config set cluster.master_candidates=rok1.$ROK_CLUSTERNAME

Set variables regarding networking:

rok-config set -f net.fqdn.clustername=$ROK_CLUSTERNAME
rok-config set -f net.fqdn.master.mgnt=master.mgnt.$ROK_CLUSTERNAME
rok-config set -f net.fqdn.master.public=master.public.$ROK_CLUSTERNAME
rok-config set -f net.fqdn.master.storage=master.storage.$ROK_CLUSTERNAME

rok-config set -f net.fqdn.master_alias.mgnt=mgnt.$ROK_CLUSTERNAME
rok-config set -f net.fqdn.master_alias.public=public.$ROK_CLUSTERNAME
rok-config set -f net.fqdn.master_alias.storage=storage.$ROK_CLUSTERNAME

rok-config set net.mgnt_master_ip=$MASTER_IP
rok-config set net.public_master_ip=$MASTER_IP
rok-config set net.storage_master_ip=$MASTER_IP

Set variables regarding SSL:

rok-config set cluster.ssl.internal.key=/etc/ssl/private/rok.key
rok-config set cluster.ssl.internal.cert=/etc/ssl/certs/rok.crt

rok-config set cluster.ssl.external.key=/etc/ssl/private/rok.key
rok-config set cluster.ssl.external.cert=/etc/ssl/certs/rok.crt

rok-config set cluster.trusted_CAs=file:/usr/local/share/ca-certificates/arrikto-ca.crt

Note

Ensure that all appliances have the same SSL certs.

pdcp -g roke /etc/ssl/private/rok.key /etc/ssl/private/rok.key
pdcp -g roke /etc/ssl/certs/rok.crt /etc/ssl/certs/rok.crt

Set variables regarding Fort:

rok-config set fort.db_host=localhost
rok-config set fort.db_port=5433
rok-config set fort.db_name=fortdb
rok-config set fort.db_user=fort
rok-config set fort.db_password=MY_FORT_DB_PASSWORD
rok-config set fort.django_secret_key=MY_DJANGO_SECRET_KEY

Note

See SECRET, HOST, NAME, USER and PASSWORD in /etc/rok/fort/fort.conf.py.
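
To quickly locate the current values in the old configuration, you can grep for the relevant settings, e.g.:

grep -E 'SECRET|HOST|NAME|USER|PASSWORD' /etc/rok/fort/fort.conf.py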

Set variables regarding access credentials:

rok-config set cluster.root_passwd_hash='MY_ROOT_PASSWORD_HASH'
rok-config set cluster.ssh.authorized_keys=file:/root/.ssh/authorized_keys

Note

See root:....: in /etc/shadow.
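
For example, the existing hash can be extracted with:

grep '^root:' /etc/shadow | cut -d: -f2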

Set variables regarding Rok daemons:

rok-config set topology.composerd.connect=cephd
rok-config set daemons.cephd.ceph_rbd_pool=$ROK_CLUSTERNAME:rbd
rok-config set daemons.cephd.ceph_osd_namespace=$ROK_CLUSTERNAME
rok-config set daemons.autostart=controllerd.0,cephd.0,composerd.0,hasherd.0,clusterd.0
rok-config append daemons.controllerd.policies+=vasa
rok-config set daemons.throwerd.extra_trackers=https://tracker.arr

Note

See ceph-osd-namespace and ceph-rbd-pool in /etc/rok/daemons.conf.

Set variables regarding Rok Gateway:

rok-config set gw.django_secret_key=MY_DJANGO_SECRET

Note

See SECRET in /etc/rok/gw/api/10-api.conf.py.

Check the current settings:

rok-config apply --dry-run

For the time being, you may also need to edit a few templates to apply some configuration that is not covered by the rok-config mechanism. We will fix this in an upcoming hotfix release.

To edit templates, perform the following steps (see the example after this list):

  • Download the config with rok-config download.
  • Edit the templates.
  • Upload the config with rok-config upload.
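
For example, the full round trip could look like the following (a sketch; it assumes that rok-config download places the templates under ./templates in the current directory, as the paths below suggest):

rok-config download
$EDITOR templates/rok/daemons.conf.j2
rok-config upload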

The following describes some configuration changes that are not covered by the rok-config mechanism:

In templates/rok/daemons.conf.j2:

[controllerd.root]
ignore-affinity-for-host = MY_HOST_FQDN

[cephd.0]
old-object-names = true

In ./templates/nginx/rok.nginx.j2:

server_name {{rok.cluster.name}} MY_CLUSTER_CNAME;

In templates/gw/api/10-api.conf.py.j2:

ALLOWED_HOSTS=["localhost", "127.0.0.1", "{{rok.cluster.name}}", "MY_CLUSTER_CNAME"]

Note

To be able to see all changes, at least for the first time, you might want to enable logging by commenting out no_log in rok_config.yml.

Update /etc/default/rok with:

ROK_CLUSTERNAME=$ROK_CLUSTERNAME
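
One way to do this non-interactively (a minimal sketch, assuming the variable is not already present in the file):

grep -q '^ROK_CLUSTERNAME=' /etc/default/rok || \
    echo "ROK_CLUSTERNAME=$ROK_CLUSTERNAME" >> /etc/default/rok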

Note

Currently, we have a partitioned setup: the old cluster daemons obtain the cluster status from /rok/rokcluster, whereas the new ones obtain it from /rok/$ROK_CLUSTERNAME.

To be able to start PostgreSQL, ensure that the private key has the proper ownership:

chown root:ssl-cert /etc/ssl/private/ssl-cert-snakeoil.key

Stop all clusterd instances and start the config daemon manually:

pdsh -g roke rok-daemon stop clusterd.0
systemctl start rok-configd

This will apply the configuration locally, restart services, and eventually cause a master failover to the upgraded appliance.

Upgrade Rok Gateway:

rok-gw-manage migrate

Now follow the procedure mentioned above for every other appliance:

  • Remove the instance from ganeti.

  • Remove the member from RokE cluster, i.e., remove the member from all etcd clusters and remove any related DNS entry from SkyDNS. To do so use the following script from inside one appliance:

    ./scripts/rok-cluster-member-remove.py MEMBER
    

    where MEMBER is rok1, rok2, etc.

  • Re-create the instance using the same MAC but a slightly larger boot disk (e.g., 4G).

Note

Make sure you have the --prealloc-wipe-disks=yes Ganeti cluster setting, otherwise you might end up with an unbootable appliance, since multiple disks can have the APP_BOOT label.
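
If it is not already enabled, the setting can typically be turned on from the Ganeti master (assuming a Ganeti version that allows changing it via gnt-cluster modify):

gnt-cluster modify --prealloc-wipe-disks=yes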