Configure Default Retention Policy for Notebook Snapshots

Retention policies control which versions of your notebook snapshots are kept and for how long. Together with snapshot policies, version retention policies give you full control over which snapshots you retain. This way you can recover quickly from user errors, optimize your backup storage, and minimize data loss.

This guide will walk you through configuring the version retention policy for snapshots created by the default Jupyter Notebook snapshot policy in an Arrikto EKF installation.

Procedure

  1. Go to the skel-resources deploy overlay of your GitOps repository, inside your rok-tools management environment:

    root@rok-tools:~# cd ~/ops/deployments/kubeflow/manifests/common/skel-resources/overlays/deploy
  2. Add the bucket configuration patch to the kustomization of the overlay:

    root@rok-tools:~/ops/deployments/kubeflow/manifests/common/skel-resources/overlays/deploy# kustomize \
    >     edit add patch --path patches/auto-backup-config.yaml
  3. If you wish to modify how often the version retention policy will run, edit the patches/auto-backup-config.yaml file and set your desired time interval in the policy’s schedule. By default, the policy will run once per day:

    schedule:
    - interval: 1 day  # <-- Set this line to your desired interval
    rules:
    # Retain the latest version of each object forever
    - strategy: versionCount
      params:
        count: 1
        retain: forever

    Note

    The schedule of a policy only affects how often the policy will run. It does not affect which versions the policy will delete once it runs. Policy intervals support the following units: seconds, minutes, hours, days, weeks, months, and years.

  4. If you wish to modify which versions of each object the policy will retain, edit the patches/auto-backup-config.yaml file and modify, remove, or add a new rule to the list of existing rules.

    schedule:
    - interval: 1 day
    rules:  # <-- modify the items in this list
    # Retain the latest version of each object forever
    - strategy: versionCount
      params:
        count: 1
        retain: forever
    # Retain all versions for 1 week
    - strategy: age
      params:
        retain: 1 week
    # Retain one version of each object per week for 6 months
    - strategy: periodic
      params:
        interval: 1 week
        retain: 6 months
    # Retain one version of each object per 3 months for 5 years
    - strategy: periodic
      params:
        interval: 3 months
        retain: 5 years

    Note

    Each retention rule specifies a subset of the versions of each object that the policy will retain. Versions that are not retained by any rule will be deleted. The way each rule determines which versions to retain depends on its strategy and params.

    The supported strategies are the following:

    • versionCount: This rule retains a specified number of versions of each object. This rule accepts the parameters count and retain.
    • age: This rule retains all versions for a specified duration after their creation. This rule accepts only the parameter retain.
    • periodic: This rule retains one version of each object per specified time interval. If multiple versions of the same object exist within the same time interval, the latest version is selected. This rule accepts the parameters interval, start, and retain.

    The supported parameters are the following:

    • interval: The time interval in which the rule will retain one version. For example, if interval is set to 2 hours, the rule will retain one version in each two-hour window.
    • start: The starting timestamp of the retention interval, in ISO 8601 format. Use this parameter to fine-tune when each interval of the policy will start. If omitted, the timestamp of the rule’s creation will be used.
    • retain: The duration for which versions will be retained. If omitted or set to forever, the rule will retain the selected versions forever.
    • count: The number of versions the rule will retain. The latest versions available are always selected.

    Note

    By default, the version retention policy in the auto-backup bucket of each namespace uses the following retention rules:

    • Retain the latest version of each object forever. This rule ensures that the latest snapshot of each notebook is never deleted.

      - strategy: versionCount
        params:
          count: 1
          retain: forever
    • Retain all versions of each object for one week. This rule ensures that all notebook snapshots will be available for at least one week after their creation.

      - strategy: age
        params:
          retain: 1 week
    • Retain one version of each object per week for six months. This rule ensures that the policy will retain a weekly backup of each notebook for six months.

      - strategy: periodic
        params:
          interval: 1 week
          retain: 6 months
    • Retain one version of each object per three months for five years. This rule ensures that the policy will retain a quarterly backup of each notebook for five years.

      - strategy: periodic
        params:
          interval: 3 months
          retain: 5 years
  5. If you wish to temporarily disable the retention policy in all user namespaces, edit the patches/auto-backup-config.yaml file, uncomment the paused field and set it to true:

    retentionPolicy:
      paused: true  # <-- Uncomment this line and set it to true
      schedule:
      - interval: 1 day

    Note

    Paused policies do not schedule any new runs. You can restore the value of this field at any point in time to resume the policies.

  6. Return to the base directory of your GitOps repository, inside your rok-tools management environment:

    root@rok-tools:~/ops/deployments/kubeflow/manifests/common/skel-resources/overlays/deploy# cd \
    >     ~/ops/deployments
  7. Commit your changes:

    root@rok-tools:~/ops/deployments# git commit \
    >     -am "Configure Default Retention Policy for Notebook Snapshots"
  8. Apply the kustomization:

    root@rok-tools:~/ops/deployments# rok-deploy \
    >     --apply kubeflow/manifests/common/skel-resources/overlays/deploy
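
To make the rule semantics from step 4 more concrete, the following Python sketch models how the three strategies might select versions of a single object and how their results combine. It is an illustration only, not Rok's implementation. The names Version, within, keep_version_count, keep_age, and keep_periodic are hypothetical. The sketch assumes that retain is measured from each version's creation time, that periodic windows are aligned to start, and that six months can be approximated as 26 weeks.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Version:
    """Hypothetical stand-in for one version of a snapshot object."""
    name: str
    created: datetime


def within(version, now, retain):
    """True while the version is inside its retain period (None means forever)."""
    return retain is None or now - version.created <= retain


def keep_version_count(versions, now, count, retain=None):
    """versionCount: keep the `count` latest versions."""
    latest = sorted(versions, key=lambda v: v.created, reverse=True)[:count]
    return {v for v in latest if within(v, now, retain)}


def keep_age(versions, now, retain):
    """age: keep every version younger than `retain`."""
    return {v for v in versions if within(v, now, retain)}


def keep_periodic(versions, now, interval, start, retain=None):
    """periodic: keep the latest version in each `interval`-long window."""
    latest_per_window = {}
    for v in sorted(versions, key=lambda v: v.created):
        window = (v.created - start) // interval  # index of the window
        latest_per_window[window] = v  # later versions overwrite earlier ones
    return {v for v in latest_per_window.values() if within(v, now, retain)}


# One snapshot version per day, at noon, for three weeks.
start = datetime(2022, 7, 1)
now = datetime(2022, 7, 22)
versions = [Version(f"v{i:02d}", start + timedelta(days=i, hours=12))
            for i in range(21)]

# A version survives if at least one rule retains it (union of the rules).
retained = (
    keep_version_count(versions, now, count=1)
    | keep_age(versions, now, retain=timedelta(weeks=1))
    | keep_periodic(versions, now, interval=timedelta(weeks=1),
                    start=start, retain=timedelta(weeks=26))
)
deleted = set(versions) - retained

print(sorted(v.name for v in retained))
print(len(deleted))
```

With these inputs, the sketch retains v06 and v13 (the weekly versions) together with v14 through v20 (the last week, which includes the latest version), and marks the remaining 12 versions for deletion.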

Verify

Repeat the following steps for every namespace you wish to verify.

  1. Specify the namespace you wish to verify:

    root@rok-tools:~# export NAMESPACE=<NAMESPACE>

    Replace <NAMESPACE> with the name of the namespace you want to verify, for example:

    root@rok-tools:~# export NAMESPACE=kubeflow-user
  2. Retrieve the bucket configuration for the auto-backup bucket:

    root@rok-tools:~# kubectl get -n ${NAMESPACE?} \
    >     rokbucketconfiguration auto-backup -o wide
    NAME          RETENTION POLICY PAUSED   RETENTION LAST RUN ID              RETENTION LAST RUN AT              RETENTION LAST RUN STATUS   RETENTION LAST RUN STATS                           AGE
    auto-backup   false                     25621f516b09448d81e992d0950ec64c   2022-07-18T19:36:19.039267+00:00   success                     deleted: 17, failed: 0, objects: 6, versions: 44   3d17h
  3. Verify that the timestamp in the RETENTION LAST RUN AT field is no further in the past than the scheduling interval you have selected.

  4. Verify that the RETENTION LAST RUN STATUS is equal to success.

  5. Verify that the failed count of the RETENTION LAST RUN STATS field is equal to zero.
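
If you verify many namespaces, the three checks above can be scripted. The following Python sketch applies them to the values shown in the sample output. The helper names parse_stats and last_run_is_healthy are hypothetical, and the sketch assumes you have already extracted the three column values as strings. It does not call kubectl itself.

```python
from datetime import datetime, timedelta, timezone


def parse_stats(stats):
    """Turn 'deleted: 17, failed: 0, ...' into {'deleted': 17, 'failed': 0, ...}."""
    result = {}
    for item in stats.split(","):
        key, value = item.split(":")
        result[key.strip()] = int(value)
    return result


def last_run_is_healthy(last_run_at, status, stats, interval, now):
    """Apply the three verification checks to one bucket configuration."""
    ran_at = datetime.fromisoformat(last_run_at)   # ISO 8601 with UTC offset
    recent = now - ran_at <= interval              # ran within the last interval
    succeeded = status == "success"                # last run finished successfully
    no_failures = parse_stats(stats)["failed"] == 0
    return recent and succeeded and no_failures


# Values copied from the sample output; `now` is fixed for reproducibility.
healthy = last_run_is_healthy(
    last_run_at="2022-07-18T19:36:19.039267+00:00",
    status="success",
    stats="deleted: 17, failed: 0, objects: 6, versions: 44",
    interval=timedelta(days=1),
    now=datetime(2022, 7, 19, 12, 0, tzinfo=timezone.utc),
)
print(healthy)
```

In practice you would pass the current time, for example datetime.now(timezone.utc), and the interval you configured in the policy's schedule.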

Summary

You have successfully configured the version retention policy for notebook snapshots in all user namespaces.

What’s Next

Check out the rest of the operations you can perform on your Kubeflow deployment.