Snapshot Notebook

This guide will walk you through creating a snapshot of a notebook server using Rok.

What You’ll Need

Procedure

Choose one of the following three options, depending on your preferred environment: the UI, the rok CLI from a notebook terminal, or a Python script that uses the Rok client.

  1. Select Snapshots from the left-side menu.

    ../../_images/menu-snapshots3.png
  2. Click on the bucket in which you want to create the snapshot.

    ../../_images/buckets.png
  3. Click Snapshot.

    ../../_images/snapshot-button.png
  4. Select the JupyterLab service.

    ../../_images/services.png
  5. In the snapshot form:

    1. Set Namespace to the namespace of the notebook you wish to snapshot.

    2. Set Name to the name of the notebook you wish to snapshot.

    3. Set Filename to the name of the snapshot you are about to create.

    4. Click Snapshot.

      ../../_images/snapshot-notebook.png

      Troubleshooting

      The task failed with an authorization error

      Rok tasks use the rok-task-runner service account that exists in the same namespace as the task for authorization. Therefore, if the task fails with an error similar to the following in its logs:

      2021-11-11T16:25:59.911064+00:00 ERROR ApiException: (403) Reason: Forbidden HTTP response headers: HTTPHeaderDict({'Audit-Id': 'e95d761c-6dd3-4e72-ae27-a15b6e10a52f', 'Content-Length': '399', 'X-Content-Type-Options': 'nosniff', 'Cache-Control': 'no-cache, private', 'Date': 'Thu, 11 Nov 2021 16:25:59 GMT', 'Content-Type': 'application/json'}) HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"notebooks.kubeflow.org \"notebook2\" is forbidden: User \"system:serviceaccount:kubeflow-user:rok-task-runner\" cannot get resource \"notebooks\" in API group \"kubeflow.org\" in the namespace \"kubeflow-admins\"","reason":"Forbidden","details":{"name":"notebook2","group":"kubeflow.org","kind":"notebooks"},"code":403}

      then the rok-task-runner service account does not have permissions to snapshot the notebook in the namespace in which you are trying to create the snapshot. If you are using default EKF permissions, this means you have to switch to the namespace of the notebook you are about to snapshot from the Kubeflow dashboard’s namespace selector.

      Alternatively, if you wish to provide the rok-task-runner service account with access to the namespace of the notebook you are about to snapshot, follow the instructions in the Share Namespace section of the docs.
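To see at a glance which namespaces are involved in the authorization error above, the response body in the log can be parsed. The following is a minimal sketch, not part of Rok: the `explain_forbidden` helper and its regular expression are hypothetical and assume the message has the same shape as the example log line.

```python
import json
import re

# Hypothetical pattern, written against the "forbidden" message shown in the
# example log line; other Kubernetes messages may not match it.
FORBIDDEN_RE = re.compile(
    r'User "system:serviceaccount:(?P<sa_namespace>[^:"]+):(?P<sa_name>[^"]+)" '
    r'cannot (?P<verb>\S+) resource "(?P<resource>[^"]+)" '
    r'.*in the namespace "(?P<target_namespace>[^"]+)"'
)


def explain_forbidden(body):
    """Parse the JSON body of a 403 response and return who was denied what."""
    status = json.loads(body)
    match = FORBIDDEN_RE.search(status.get("message", ""))
    if match is None:
        raise ValueError("not a service-account 'forbidden' message")
    return match.groupdict()


# The HTTP response body copied from the example log line.
body = r'''{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"notebooks.kubeflow.org \"notebook2\" is forbidden: User \"system:serviceaccount:kubeflow-user:rok-task-runner\" cannot get resource \"notebooks\" in API group \"kubeflow.org\" in the namespace \"kubeflow-admins\"","reason":"Forbidden","details":{"name":"notebook2","group":"kubeflow.org","kind":"notebooks"},"code":403}'''

info = explain_forbidden(body)
if info["sa_namespace"] != info["target_namespace"]:
    print("Task ran in %s but the notebook lives in %s"
          % (info["sa_namespace"], info["target_namespace"]))
```

In the example, the task ran in kubeflow-user while the notebook lives in kubeflow-admins, which is exactly the mismatch that switching namespaces or sharing the namespace resolves.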

  1. Select Notebooks from the central Kubeflow dashboard.

  2. Select the namespace of the notebook you wish to snapshot using the namespace selector.

  3. Pick a notebook to connect to, and click CONNECT. This does not need to be the same notebook as the one you are about to snapshot.

  4. Open a new terminal.

  5. Specify the bucket in which you want to create the snapshot:

    jovyan@mynotebook-0:~$ export ROK_BUCKET=<BUCKET>

    Replace <BUCKET> with the bucket name, for example:

    jovyan@mynotebook-0:~$ export ROK_BUCKET="notebooks"
  6. Create the bucket, in case it does not already exist:

    jovyan@mynotebook-0:~$ rok bucket-create ${ROK_BUCKET?}

    Troubleshooting

    The bucket already exists

    If the command fails with the following error message:

    Error: Bucket 'notebooks' already exists

    then the bucket already exists. In this case, ignore the error and continue to the next step.

  7. Specify the name of the notebook you want to snapshot:

    jovyan@mynotebook-0:~$ export NOTEBOOK=<NOTEBOOK>

    Replace <NOTEBOOK> with the notebook name, for example:

    jovyan@mynotebook-0:~$ export NOTEBOOK=notebook2
  8. (Optional) Specify the namespace of the notebook you are about to snapshot. This guide assumes the notebook you have connected to lives inside the same namespace as the notebook you are about to snapshot. If that is the case, then you can skip this step.

    However, if you wish to snapshot a notebook that lives in a different namespace, specify the namespace explicitly in the following environment variable:

    jovyan@mynotebook-0:~$ export ROK_PARAM_REGISTER_JUPYTER_NAMESPACE=kubeflow-shared

    Note

    If you set the above environment variable, the rok CLI tool takes it into account automatically. Instead of specifying the namespace via an environment variable, you can alternatively include the --param namespace=<NAMESPACE> command line argument in the command of the next step.

  9. Snapshot the notebook:

    jovyan@mynotebook-0:~$ rok object-register jupyter ${ROK_BUCKET?} \
    > ${NOTEBOOK?} --param name=${NOTEBOOK?} --no-interactive
    2021-11-11T22:39:54.006763+0000 rok pid=150/tid=150/pytid=140696336713536 cli:1911 [INFO] Waiting for task `f36fbeee1bd94efc8f117432bb40c3d1' to complete
    Completed [################################################] 100% Complete Registration Task
      Account       kubeflow-user
      Bucket        notebooks
      Task ID       f36fbeee1bd94efc8f117432bb40c3d1
      Action        register
      Level         0
      Parent        -
      Status        success
      Progress      100%
      Message       Completed
      Created At    2021-11-11T22:39:54.002619+00:00
      Started At    2021-11-11T22:39:54.045853+00:00
      Updated At    2021-11-11T22:40:06.863876+00:00
      Completed At  2021-11-11T22:40:06.863564+00:00
    Task Result
      Event ID      025c7697-f191-4c29-98e2-181943160b1f
    Version Information
      Account Name  kubeflow-user
      Bucket Name   notebooks
      Object Name   notebook2
      Version Name  1d2cdfec-90d1-48b4-bb23-793948497208
      Rok URL       http://rok.rok.svc.cluster.local/swift/v1/kubeflow-user/notebooks/notebook2?version=1d2cdfec-90d1-48b4-bb23-793948497208
      Size          0
      Rok Fisk      local://rokgw-rok-dbc29df44f5648433bd087a963bb8b416ffe37d596907c883e231b76558b9273
    Object Name           Rok URL
    notebook2             http://rok.rok.svc.cluster.local/swift/v1/kubeflow-user/notebooks/notebook2?version=1d2cdfec-90d1-48b4-bb23-793948497208
    |-- data              http://rok.rok.svc.cluster.local/swift/v1/kubeflow-user/notebooks/notebook2_data?version=9810c9ea-0720-4b5e-b5db-b2159ec791f7
    |-- notebook2-volume  http://rok.rok.svc.cluster.local/swift/v1/kubeflow-user/notebooks/notebook2_notebook2-volume?version=b7c2e7b3-5c25-43be-85cc-08c7ad8e802d

    Troubleshooting

    The task failed with an authorization error

    Rok tasks use the rok-task-runner service account that exists in the same namespace as the task for authorization. Therefore, if the task fails with an error similar to the following in its logs:

    2021-11-11T16:25:59.911064+00:00 ERROR ApiException: (403) Reason: Forbidden HTTP response headers: HTTPHeaderDict({'Audit-Id': 'e95d761c-6dd3-4e72-ae27-a15b6e10a52f', 'Content-Length': '399', 'X-Content-Type-Options': 'nosniff', 'Cache-Control': 'no-cache, private', 'Date': 'Thu, 11 Nov 2021 16:25:59 GMT', 'Content-Type': 'application/json'}) HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"notebooks.kubeflow.org \"notebook2\" is forbidden: User \"system:serviceaccount:kubeflow-user:rok-task-runner\" cannot get resource \"notebooks\" in API group \"kubeflow.org\" in the namespace \"kubeflow-admins\"","reason":"Forbidden","details":{"name":"notebook2","group":"kubeflow.org","kind":"notebooks"},"code":403}

    then the rok-task-runner service account does not have permissions to snapshot the notebook in the namespace in which you are trying to create the snapshot. If you are using default EKF permissions, this means you have to switch to the namespace of the notebook you are about to snapshot from the Kubeflow dashboard’s namespace selector.

    Alternatively, if you wish to provide the rok-task-runner service account with access to the namespace of the notebook you are about to snapshot, follow the instructions in the Share Namespace section of the docs.
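The CLI steps above can also be driven from Python. The sketch below only assembles the command line from this guide. The `build_register_command` helper is hypothetical, and it assumes that `--param` can be passed more than once, as the note on the namespace parameter suggests.

```python
import shlex


def build_register_command(bucket, notebook, namespace=None):
    """Return the argument list for `rok object-register jupyter ...`."""
    argv = ["rok", "object-register", "jupyter", bucket, notebook,
            "--param", "name=%s" % notebook]
    if namespace:
        # Assumption: a second --param flag is accepted alongside the first.
        argv += ["--param", "namespace=%s" % namespace]
    argv.append("--no-interactive")
    return argv


cmd = build_register_command("notebooks", "notebook2",
                             namespace="kubeflow-shared")
# Print the command in a form that can be pasted into a terminal.
print(shlex.join(cmd))
```

From a notebook terminal where the rok CLI is available, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Leaving `namespace` unset corresponds to skipping the optional step and snapshotting in the current namespace.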

  1. Select Notebooks from the central Kubeflow dashboard.

  2. Select the namespace of the notebook you wish to snapshot using the namespace selector.

  3. Pick a notebook to connect to, and click CONNECT.

  4. Start a new terminal inside the notebook.

  5. Create a new Python file and name it snapshot-notebook.py:

    jovyan@mynotebook-0:~$ touch snapshot-notebook.py
  6. Copy and paste the following code inside snapshot-notebook.py:

    snapshot-notebook.py
    # Copyright © 2021-2022 Arrikto Inc. All Rights Reserved.

    """Snapshot a notebook."""

    from rok_gw_client import RokClient

    BUCKET_NAME = "mybucket"
    # When running inside a notebook, leave this empty to create the snapshot in
    # the current namespace.
    BUCKET_NAMESPACE = None
    NOTEBOOK_NAME = "mynotebook"
    NOTEBOOK_NAMESPACE = "mynamespace"

    # Initialize the Rok client
    client = RokClient(account=BUCKET_NAMESPACE)

    # Specify the snapshot parameters
    commit_title = "Snapshot notebook %s" % NOTEBOOK_NAME
    commit_message = "Snapshot notebook %s in namespace %s" % (NOTEBOOK_NAME,
                                                               NOTEBOOK_NAMESPACE)
    params = {"name": NOTEBOOK_NAME,
              "namespace": NOTEBOOK_NAMESPACE,
              "commit_title": commit_title,
              "commit_message": commit_message}

    # Create the snapshot
    task_info = client.version_register(BUCKET_NAME,
                                        NOTEBOOK_NAME,
                                        "jupyter",
                                        params,
                                        wait=True)
    print("Successfully snapshotted notebook %s in namespace %s via task %s"
          % (NOTEBOOK_NAME, NOTEBOOK_NAMESPACE, task_info["task"]["id"]))

    Alternatively, download the Python file above and upload it to the notebook.

  7. Update the following fields in the file:

    1. Set BUCKET_NAME to the bucket you wish to create the snapshot in.

    2. Set NOTEBOOK_NAME to the name of the notebook you wish to snapshot.

    3. Set NOTEBOOK_NAMESPACE to the namespace of the notebook you wish to snapshot.

      snapshot-notebook-updated.py
       # Copyright © 2021-2022 Arrikto Inc. All Rights Reserved.

       """Snapshot a notebook."""

       from rok_gw_client import RokClient

      -BUCKET_NAME = "mybucket"
      +BUCKET_NAME = "notebooks"
       # When running inside a notebook, leave this empty to create the snapshot in
       # the current namespace.
       BUCKET_NAMESPACE = None
      -NOTEBOOK_NAME = "mynotebook"
      -NOTEBOOK_NAMESPACE = "mynamespace"
      +NOTEBOOK_NAME = "notebook1"
      +NOTEBOOK_NAMESPACE = "kubeflow-user"

       # Initialize the Rok client
       client = RokClient(account=BUCKET_NAMESPACE)

       # Specify the snapshot parameters
       commit_title = "Snapshot notebook %s" % NOTEBOOK_NAME
       commit_message = "Snapshot notebook %s in namespace %s" % (NOTEBOOK_NAME,
                                                                  NOTEBOOK_NAMESPACE)
       params = {"name": NOTEBOOK_NAME,
                 "namespace": NOTEBOOK_NAMESPACE,
                 "commit_title": commit_title,
                 "commit_message": commit_message}

       # Create the snapshot
       task_info = client.version_register(BUCKET_NAME,
                                           NOTEBOOK_NAME,
                                           "jupyter",
                                           params,
                                           wait=True)
       print("Successfully snapshotted notebook %s in namespace %s via task %s"
             % (NOTEBOOK_NAME, NOTEBOOK_NAMESPACE, task_info["task"]["id"]))
  8. Run the script using Python 3:

    jovyan@mynotebook-0:~$ python3 snapshot-notebook.py

    Troubleshooting

    The task failed with an authorization error

    Rok tasks use the rok-task-runner service account that exists in the same namespace as the task for authorization. Therefore, if the task fails with an error similar to the following in its logs:

    2021-11-11T16:25:59.911064+00:00 ERROR ApiException: (403) Reason: Forbidden HTTP response headers: HTTPHeaderDict({'Audit-Id': 'e95d761c-6dd3-4e72-ae27-a15b6e10a52f', 'Content-Length': '399', 'X-Content-Type-Options': 'nosniff', 'Cache-Control': 'no-cache, private', 'Date': 'Thu, 11 Nov 2021 16:25:59 GMT', 'Content-Type': 'application/json'}) HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"notebooks.kubeflow.org \"notebook2\" is forbidden: User \"system:serviceaccount:kubeflow-user:rok-task-runner\" cannot get resource \"notebooks\" in API group \"kubeflow.org\" in the namespace \"kubeflow-admins\"","reason":"Forbidden","details":{"name":"notebook2","group":"kubeflow.org","kind":"notebooks"},"code":403}

    then the rok-task-runner service account does not have permissions to snapshot the notebook in the namespace in which you are trying to create the snapshot. If you are using default EKF permissions, this means you have to switch to the namespace of the notebook you are about to snapshot from the Kubeflow dashboard’s namespace selector.

    Alternatively, if you wish to provide the rok-task-runner service account with access to the namespace of the notebook you are about to snapshot, follow the instructions in the Share Namespace section of the docs.
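If you snapshot notebooks regularly, the script above could be wrapped in a function so that the bucket, notebook name, and namespace become arguments instead of edited constants. The following is a sketch under assumptions: `snapshot_notebook` and `FakeClient` are hypothetical names, and `FakeClient` only mimics the `version_register` call and the `task_info["task"]["id"]` result used in this guide so that the helper can be dry-run without a Rok deployment.

```python
def snapshot_notebook(client, bucket, name, namespace):
    """Register a new version of notebook `name` and return the task ID."""
    params = {"name": name,
              "namespace": namespace,
              "commit_title": "Snapshot notebook %s" % name,
              "commit_message": "Snapshot notebook %s in namespace %s"
                                % (name, namespace)}
    # Same call as in snapshot-notebook.py; wait=True blocks until completion.
    task_info = client.version_register(bucket, name, "jupyter", params,
                                        wait=True)
    return task_info["task"]["id"]


class FakeClient:
    """Hypothetical stand-in for RokClient, used only to dry-run the helper."""

    def __init__(self):
        self.calls = []

    def version_register(self, bucket, obj, register_type, params, wait=False):
        # Record the arguments and return a result shaped like the one the
        # script in this guide reads.
        self.calls.append((bucket, obj, register_type, params, wait))
        return {"task": {"id": "dry-run"}}


fake = FakeClient()
task_id = snapshot_notebook(fake, "notebooks", "notebook1", "kubeflow-user")
print(task_id)
```

Inside a notebook, you would pass `RokClient(account=None)` from `rok_gw_client` in place of `FakeClient()`, as in the script above.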

Summary

You have successfully created a snapshot of a notebook server.

What’s Next

Check out the rest of the user guides available for Rok.