Snapshot policies for backup

Rok snapshot policies allow users to automatically create snapshots of their resources on a specified schedule. Additionally, version retention policies define how long each of these snapshots will be retained. The combination of the two enables users to restore their resources to a previous point in time, without needing to retain an unreasonably large number of snapshots.

Snapshot policy for Jupyter notebooks

This section shows how to create a snapshot policy for Jupyter notebooks, along with an accompanying version retention policy. The retention policy retains all recent snapshots, which minimizes the possibility of data loss and allows quick recovery from user errors, and retains older snapshots progressively more sparsely, which limits their total number.

Rok UI

This section describes how to create a snapshot policy for Jupyter notebooks and configure the bucket’s version retention policy using the Rok UI.

Create a snapshot policy

  1. Visit the Kubeflow dashboard, and go to Rok by selecting the Snapshots tab of the sidebar.

  2. (optional) Create a new bucket by clicking on the Create bucket button. Give it a name and click on Create. If you already have a bucket you want to use, you can skip this step.

  3. Click on the bucket, and select the Policies tab.

  4. Click the + Add policy button next to Snapshot policies.

  5. From the list of available drivers, select the JupyterLab driver.

  6. Provide a name for the policy, and optionally a detailed description.

  7. Set the Backup policy toggle to True, to ensure the policy snapshots resources unconditionally, even if they have already been snapshotted.

  8. Define a schedule for your policy. This controls how often the policy runs, i.e., how often you create snapshots of your notebooks. In this example we will configure the policy to run every 15 minutes, effective immediately. This means that we will be creating 4 snapshots of each notebook every hour.

    1. Click the + icon under the Schedule section.
    2. Select the Repeat tab, to add a scheduling rule that runs periodically.
    3. Enter 15 minute(s) in the interval.
    4. Optionally, you can specify the exact time the repeating runs will start, or leave the Starting now default to have them start immediately.
  9. Define which notebooks the policy should snapshot. This is achieved by attaching filters to your policy. Note that specifying at least one filter that defines the namespace is required. In this example we will configure the policy to snapshot all notebooks in namespace kubeflow-user whose name starts with dev-.

    1. Click the + icon under the Resources to Snapshot section.
    2. Configure the filter to limit the namespace to kubeflow-user by selecting Namespace, equal, kubeflow-user in the three input boxes.
    3. (optional) Click on + again to add a second filter, and enter JupyterLab, starts_with, dev- in the three new input boxes. This makes the policy snapshot only notebooks whose name starts with dev-. If this filter is omitted, all notebooks in the selected namespace will be snapshotted.
  10. Click the Add button to create the policy. The first task of the policy should now be visible in the Tasks tab of the bucket.
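
As a rough check of the schedule chosen in step 8, the following shell sketch computes how many snapshots a given interval produces per notebook. The variable names are illustrative and are not part of Rok; the only input taken from this example is the 15-minute interval.

```shell
#!/bin/sh
# Interval between policy runs, in minutes (15 in this example).
INTERVAL_MINUTES=15

# Each run snapshots every matching notebook once, so the number of runs
# equals the number of snapshots created per notebook.
PER_HOUR=$((60 / INTERVAL_MINUTES))
PER_DAY=$((24 * 60 / INTERVAL_MINUTES))

echo "Snapshots per notebook per hour: ${PER_HOUR}"
echo "Snapshots per notebook per day:  ${PER_DAY}"
```

With a 15-minute interval this gives 4 snapshots per hour and 96 per day for each notebook, which is why a version retention policy is needed to keep the total number of snapshots manageable.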

Configure the version retention policy

  1. Visit the Policies tab again in the same bucket.

  2. Under the Admin policies section, click the edit button on the version retention policy.

  3. Scroll down to the Schedule section, and click on the + button to add a new scheduling rule. Select the Repeat tab to add a scheduling rule that runs periodically, and enter 2 hour(s) as its interval, to make the policy run every 2 hours.

  4. Scroll down to the Retention section which specifies which versions of each object the policy will retain, and delete the existing rule that retains all versions forever by clicking on the trashcan icon.

  5. Create six new retention rules by clicking on the + icon, and entering the following options:

    1. Retain the current version forever.
    2. Retain all versions for 1 day(s).
    3. Retain one version every 1 hour(s) for 3 day(s).
    4. Retain one version every 12 hour(s) for 1 month(s).
    5. Retain one version every 1 day(s) for 3 month(s).
    6. Retain one version every 1 month(s) for 1 year(s).
  6. Click the Update button to update the policy. You should be able to see the first task of the policy under the Tasks tab. Additionally, in the Files tab the colored icon next to each file name now indicates its retention status, and hovering over it reveals information about the reasoning behind this decision.
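
To get a feel for how much these rules limit the number of versions, the sketch below estimates how many versions of a single notebook remain once the policies have been running for a full year. It is only a back-of-the-envelope calculation under our own assumptions, not Rok's documented semantics: snapshots are taken every 15 minutes (96 per day), each coarser rule only contributes versions older than the window of the previous rule, a month counts as 30 days, and a year as 12 months.

```shell
#!/bin/sh
# Snapshots per notebook per day with a 15-minute schedule.
PER_DAY=96

# Versions contributed by each rule, counting only the part of its window
# that is not already covered by the previous, finer-grained rule.
ALL=$((1 * PER_DAY))             # all versions for 1 day (includes the current one)
HOURLY=$(( (3 - 1) * 24 ))       # one per hour, day 1 to day 3
HALF_DAILY=$(( (30 - 3) * 2 ))   # one per 12 hours, day 3 to day 30
DAILY=$(( 90 - 30 ))             # one per day, day 30 to day 90
MONTHLY=$(( 12 - 3 ))            # one per month, month 3 to month 12

TOTAL=$((ALL + HOURLY + HALF_DAILY + DAILY + MONTHLY))
KEEP_ALL=$((PER_DAY * 365))

echo "Versions retained with the rules above: ${TOTAL}"
echo "Versions retained without any pruning:  ${KEEP_ALL}"
```

Under these assumptions roughly 267 versions per notebook are retained, compared to 35040 if every 15-minute snapshot were kept for a year.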

Rok CLI

This section describes how to create a snapshot policy for Jupyter notebooks and configure the bucket’s version retention policy using the Rok CLI.

Note

For the purposes of this example we will be executing commands from within a Jupyter notebook, where the Rok URL and Rok account, i.e., the --url and --account command line arguments, have already been populated via the ROK_GW_URL and ROK_GW_ACCOUNT environment variables.

Create a snapshot policy

  1. (optional) Create a new bucket. If you already have a bucket you want to use, you can skip this step:

    $ BUCKET="My bucket"
    $ rok bucket-create "${BUCKET?}"
    
  2. Create the snapshot policy for the bucket. The policy will run every 15 minutes and snapshot all notebooks in namespace kubeflow-user whose name starts with dev-:

    $ DESCRIPTION="Notebook snapshot policy"
    $ rok policy-create "${BUCKET?}" jupyter registration \
    >     --description "${DESCRIPTION?}" \
    >     --backup \
    >     --schedule now,15minutes \
    >     --filter params:namespace,equal,kubeflow-user \
    >     --filter params:lab,starts_with,dev-
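
The two filters above can be approximated locally to check which notebooks a policy like this would select. The sketch below is plain shell and does not call Rok; the notebook names are hypothetical, and it assumes that a notebook must satisfy both filters to be snapshotted.

```shell
#!/bin/sh
# Values taken from the two --filter arguments of this example.
NAMESPACE_FILTER="kubeflow-user"
NAME_PREFIX="dev-"

# matches NAMESPACE NAME
# Succeeds only if the namespace is equal to NAMESPACE_FILTER and the
# notebook name starts with NAME_PREFIX.
matches() {
    [ "$1" = "$NAMESPACE_FILTER" ] || return 1
    case "$2" in
        "$NAME_PREFIX"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical notebooks, written as namespace/name.
for nb in \
    "kubeflow-user/dev-training" \
    "kubeflow-user/devops" \
    "kubeflow-user/prod-serving" \
    "other-team/dev-training"
do
    ns=${nb%%/*}     # part before the first slash
    name=${nb#*/}    # part after the first slash
    if matches "$ns" "$name"; then
        echo "snapshot: ${nb}"
    else
        echo "skip:     ${nb}"
    fi
done
```

Only kubeflow-user/dev-training is selected. Note that devops is skipped because the prefix includes the trailing dash.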
    

Configure the version retention policy

  1. Make sure you have jq installed:

    $ sudo apt-get install jq
    
  2. Retrieve the ID of the version retention policy of the bucket:

    $ POLICY_ID=$(rok -o json policy-list --bucket "${BUCKET?}" | jq -r '.[] | select(.action=="version_retention") | .policy_id')
    
  3. By default, a version retention policy has a single retention rule, which retains all versions of each object forever. Here we retrieve the IDs of all existing rules so that we can replace them with the desired ones:

    $ DEL_RETENTION_ARGS=$(rok -o json policy-show ${POLICY_ID?} | jq -r '.action_params.rules[] | .version_retention_rule_id' | xargs -r -n1 echo --del-retention | xargs)
    
  4. By default, a version retention policy has no scheduling rules. Here we retrieve the IDs of any existing scheduling rules, in case some have been added, so that we can replace them with the desired ones:

    $ DEL_SCHEDULE_ARGS=$(rok -o json policy-show ${POLICY_ID?} | jq -r '.schedule[] | .scheduling_rule_id' | xargs -r -n1 echo --del-schedule | xargs)
    
  5. Remove the existing retention and scheduling rules and add new rules in order to:

    • Run every 2 hours.
    • Retain the current version of each object forever.
    • Retain all versions of each object for 1 day.
    • Retain one version of each object every hour for 3 days.
    • Retain one version of each object every 12 hours for 1 month.
    • Retain one version of each object every day for 3 months.
    • Retain one version of each object every month for 1 year.
    $ rok policy-update ${POLICY_ID?} \
    >    ${DEL_RETENTION_ARGS?} \
    >    ${DEL_SCHEDULE_ARGS?} \
    >    --add-schedule now,2hours \
    >    --add-retention 1,forever \
    >    --add-retention all,1days \
    >    --add-retention 1per1hours,3days \
    >    --add-retention 1per12hours,1months \
    >    --add-retention 1per1days,3months \
    >    --add-retention 1per1months,1years
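
The xargs pipelines used in steps 3 and 4 turn a list of rule IDs, one per line, into a single line of repeated command line flags. The sketch below reproduces that transformation without calling Rok. The IDs 17 and 18 are hypothetical stand-ins for the jq output, and the -r option assumes GNU xargs.

```shell
#!/bin/sh
# Hypothetical rule IDs, one per line, standing in for the jq output.
IDS=$(printf '%s\n' 17 18)

# The first xargs prints "--del-retention ID" once per ID; the second
# xargs joins those lines into a single space-separated line.
DEL_RETENTION_ARGS=$(printf '%s\n' "$IDS" | xargs -r -n1 echo --del-retention | xargs)
echo "with rules:    '${DEL_RETENTION_ARGS}'"

# With no IDs at all, -r prevents the first xargs from running echo,
# so the result is an empty string.
DEL_SCHEDULE_ARGS=$(printf '' | xargs -r -n1 echo --del-schedule | xargs)
echo "without rules: '${DEL_SCHEDULE_ARGS}'"
```

The first variable expands to `--del-retention 17 --del-retention 18`, while the second is empty. An empty value is still accepted by `${DEL_SCHEDULE_ARGS?}`, because the `?` form without a colon only fails when the variable is unset, so the final command also works for a policy that has no scheduling rules yet.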