Changelog

This file describes code and packaging changes for all Rok releases starting with Rok 0.15. It is mostly of interest to packagers, administrators, and developers.

Version 1.4 (Titanium)

  • bootstrap: Rename --no-check to --validate/--no-validate.
  • Add Verify and Troubleshooting sections in the AKS docs to ensure that managed identities are enabled on AKS clusters.
  • Remove dpkg-dev, apt-utils and bzip2 from our images.
  • Use reproducible base images for Debian, Ubuntu, AmazonLinux, and CentOS.
  • Use a single task to build all bootstrap images.
  • Tag MiniKF image with labels.
  • Fix a bug to prevent a key error in a CloudFormation stack status.
  • Add a Check Kubernetes Version section in our upgrade guides.
  • Restructure the "Configure Access to Arrikto’s Private Registry" guide and add a verify section.
  • Introduce a developer guide for installing and configuring Docker.
  • Introduce arrikto-admin admonition in docs.
  • Introduce fast-forward admonition in docs.
  • Improve nested lists style in docs.
  • Introduce custom design in nested lists in docs.
  • Move the cleanup instructions to the top level of the docs.
  • Split the cleanup instructions into separate documents that clean up apps, the RokCluster, identities, storage, and the Kubernetes cluster itself.
  • Improve the structure of our cleanup documents.
  • Add support for Azure in our cleanup documents.
  • Make the Kubeflow cleanup instructions part of the Rok cleanup guide.
  • Extend the docs with Azure CLI instructions for creating an AKS cluster.
  • Extend the docs with Azure CLI instructions for attaching disks to nodes.
  • Extend the docs with Azure CLI instructions for creating a storage account.
  • Extend the docs with Azure CLI instructions for creating a Managed Identity.
  • Introduce internal ops guide for new customer onboarding.
  • Introduce internal ops guide for Team Member onboarding to AWS.
  • Improve the scroll behavior of anchor links in docs.
  • Introduce persistent state for toggles and admonition directives.
  • Extend the Azure docs to add tags in storage accounts.
  • Fix autofill suggestions in presentation policies in Rok UI.
  • Add account management in Rok deployments.
  • Update the Kubeflow guide to not deploy AuthService or Dex.
  • rok-csi: Extend GC to handle LIO devices.
  • Remove the section about draining CSI nodes from the upgrade instructions.
  • Add user guide for Kale JupyterLab extension.
  • Add Verify section for Azure in the "Authorize Access to Object Storage" guide.
  • Restructure the 'Hot-Patch an Arbitrary Image in Your Deployment' section of the ops docs.
  • Render Jinja2 YAML templates when rendering manifests.
  • Fix regression causing slow Rok CSI reboots.
  • rok-csi: Wait until LIOd has been fully initialized.
  • Introduce persistent state for tab directive.
  • Restructure "Create Kubernetes Cluster on AWS".
  • Explicitly specify in the Sphinx configuration file the paths included per tag for doc builds.
  • Fix the default CLI help message of True/False questions.
  • Support environment files as a new input source for answering questions.
  • Log question related events at INFO level.
  • Automate the "Clone GitOps Repository" guide.
  • Add a restore mechanism for retrieving answered questions from the deployment context.
  • Add a save mechanism for storing answers to questions in the deployment context.
  • Automate the "Configure Access to Arrikto's Private Registry" guide.
  • Add a fast-forward admonition for the 'Configure Access to Arrikto’s Private Registry' guide.
  • Extend literalinclude directive in docs.
  • Support network-accessible RWX volumes in rok-csi.
  • Support adding EBS volumes to managed node groups.
  • Fix doc builds to retrieve correct Rok version info from a vcs-version file.
  • Restructure the "Create Cloud Identity" guide and improve the verify section.
  • Restructure the "Authorize Access to Object Storage" guide and add a verify section.
  • Restructure the "Grant Rok Access to Private Docker Registry" guide and add a verify section.
  • Automate the "Create Cloud Identity" guide.
  • Automate the "Authorize Access to Object Storage" guide.
  • Automate the "Grant Rok Access to Private Docker Registry" guide.
  • Restructure the "Deploy Kubeflow" section so that it follows structure and writing guidelines.
  • Add styles for the :guilabel: role in docs.
  • Add instructions for logging in to EKF via the Okta Provider.
  • rok-csi: Fix RWX volumes becoming unresponsive after restarting the rok-csi-node Pod.
  • Add the mechanism to save and restore the context of the docs.
  • Restructure the "Set Up Users for Rok" guide and add a verify section.
  • Improve the verify section of the "Deploy Rok Components" guide.
  • Improve literalinclude directive's output when rendering diffs.
  • Automate granting access to Rok and Kubeflow Pipelines to user namespaces using skel resources.
  • Add design doc for the skel controller.
  • Update Kale images to work with KF 1.4.
  • Introduce user guides for the Kale integration with the Kubeflow PyTorch Operator.
  • Support disabling automatic Profile creation upon login.
  • Automate the "Configure Git" guide.
  • Automate the "Configure AWS CLI" guide.
  • Support patching the Kale Python image to use in manifests with rok-image-patch.
  • Handle all image references of KFServing in air-gapped deployments.
  • Handle deleted resource types in rok-deploy --delete.
  • Add validation checks when restoring the deployment context of a task.
  • Automate the "Set Up Cloud Environment for AWS" guide.
  • Automate the "Create VPC" guide for AWS.
  • Automate the "Configure Subnets" guide for AWS.
  • Fix broken hidden-literalinclude directive.
  • Introduce user guides for Rok.
  • Introduce user guides for the Kale support for pipeline conditionals, and the use of volumes for data passing.
  • Support unpinning of RWX volumes in rok-csi.
  • Add Verify section for AWS in the "Authorize Access to Object Storage" guide.
  • Fix upgrade guide to first apply the new CRD and then the new CR.
  • Introduce user guides for the Kale support for Kubernetes metadata and spec configuration of pipeline steps.
  • Extend user guides with Kale-KFServing integration docs.
  • Use NFSv4 for RWX volumes in rok-csi to support file locking.
  • Add rok/rwx-enable-local-access annotation to disable the local access optimization for RWX volumes in rok-csi.
  • Prune stale resources after upgrading to Kubeflow 1.4.
  • Support filtering Notebooks by image in rok-notebook-upgrade.
  • Extend rok-do to build the access server image that Rok CSI uses to provide RWX volumes on Kubernetes.
  • Make the skel controller ignore the status field of Kubernetes objects.
  • Include version information in all Rok API service driver calls.
  • Make Rok API tasks impersonate the 'rok-task-runner' service account in their namespace, instead of the last user that created or updated them.
  • Fix stale references to Dex and AuthService Rok manifests.
  • Introduce a Kubernetes controller for Rok policies.
  • Introduce an operations guide for setting a culling policy for your Notebook Controller.
  • Add ops guide for setting up a backup policy in DML for the EBS volume that Rok etcd uses.
  • Introduce user guides for Kale container-based steps.
  • Restructure the cluster-autoscaler kustomization package and configure it using j2.
  • Patch Cluster Autoscaler to support scale-in operations in clusters running Rok.
  • Revamp Rok Monitoring Stack to work with Kubernetes 1.19 and 1.20.
  • Extend rok-deploy to support server-side applying resources to Kubernetes.
  • Deploy Rok Monitoring Stack using server-side apply.
  • Always deploy Rok Monitoring Stack on Kubernetes using rok-deploy.
  • Upgrade Kale to support numerous new features and fix bugs.
  • Extend the Rok policy controller to add finalizers to policies it manages.
  • Fix VerifyPasswordInputQuestion to respect question attributes.
  • Use NFSv4.2 and non-privileged NFS ports to prevent using stale conntrack entries after migrating the NFS server of a RWX volume in rok-csi.
  • rok-csi: Don't mix pods accessing a RWX volume over NFS with pods accessing it locally.
  • rok-csi: Track the nodes where a volume is staged to work around a Kubernetes bug which results in unpublishing in-use volumes.
  • Introduce optional field spec.images.rokAccessServer in the RokCluster CR.
  • Support auto-recovery of RWX volumes in rok-csi.
  • rok-csi: Work around NodeStageVolume Kubernetes bug.
  • Add support for tolerations in the RokCluster CR.
  • Add fast-forward support in deploy2.
  • Automate the "Create EKS Cluster IAM Role" guide.
  • Automate the "Create EKS Node IAM Role" guide.
  • Automate the "Create EKS Cluster" guide.
  • Automate the "Enable IAM Roles for Kubernetes Service Accounts" guide.
  • Automate the "Access EKS Cluster" guide.
  • Automate the "Create EKS Node Group" guide.
  • Introduce an Ops guide to create a default snapshot policy for notebooks.
  • Introduce an Ops guide to create a snapshot policy for Kubeflow PVCs.
  • Improve toggle formatting in docs.
  • Automate the "Set Up Users for Rok" guide.
  • Automate the "Deploy Rok Components" guide.
  • Automate the "Set Up Rok Storage Class" guide.
  • Automate the "Install Kubeflow" guide.
  • rok-csi: Use common labels for Rok Access Server StatefulSet and Service.
  • Add an AuthorizationPolicy client in our Rok Kubernetes clients.
  • rok-csi: Restrict access to Rok Access Server using Istio Authorization Policy.
  • Automate the "Integrate Rok with Kubeflow Dashboard" guide.
  • Improve the Verify section of the "Authorize Access to Object Storage" guide to detect authorization errors if the bucket does not exist.
  • Automate the "Create Hosted Zone" guide.
  • Automate the "Create IAM Role for ExternalDNS" guide.
  • Improve the display of task logs for tasks with large numbers of log lines in Chromium browsers.
  • Improve handling of unknown labels in navigation buttons in the docs.
  • Automate the "Deploy ExternalDNS" guide.
  • Support hiding the first paragraph of admonitions on demand.
  • Automate the "Create ACM Certificate" guide.
  • Improve nested lists styles in docs.
  • Automate the "Deploy cert-manager" guide.
  • Automate the "Create IAM Role for AWS Load Balancer Controller" guide.
  • Automate the "Deploy AWS Load Balancer Controller" guide.
  • Automate the "Deploy NGINX Ingress Controller" guide.
  • Improve explicit numbering in numbered lists in docs.
  • Improve the style of tabs inside admonitions.
  • Automate the "Expose Istio" guide.
  • Support running rok-k8s-reboot in air-gapped environments.
  • Fix EKS_IAM_{CLUSTER, NODE}_ROLE variables in docs and j2 templates.
  • Remove exports from the CF stacks for the EKS cluster/node IAM roles.
  • Provide clearer messages for the fast-forward path in rok-deploy2.
  • Standardize deploy2 logic for rok-deploy.
  • Fix new line omission when rendering jinja templates in deploy2.
  • Fix LOW priority questions in the fast-forward path of deploy2.
  • Automate the "Deploy Cluster Autoscaler on AWS" guide.
  • Split some Rok guides into parent-fork structure.
  • Extend the Rok CSI GC code to check the system state and reconcile the 'staged' list accordingly.
  • rok-csi: Fix ControllerPublishVolume leaving behind stale NFS server Pods.
  • Split the AWS VPC guide into 2 guides for VPC creation and subnets configuration.
  • Use edit commands in Deploy Autoscaler guide instead of kustomize edit.
  • Update Kubeflow manifests to fix KFP UI bugs.
  • Extend the Frontend module to support writing a summary to stderr at the end of the execution.
  • LVMd: Fail volume creation if hydration is stuck for more than one minute.
  • Update Kale images to introduce a PyTorch distributed example, support the new ML Notebook driver, and fix a KFP client credentials initialization bug.
  • Update the Test Rok section of the installation docs to deploy an application in the user's rather than the default namespace, so it is compatible with the task authentication changes introduced in Rok 1.4.
  • rok-csi: Add 30 minute timeout on volume lock acquisition for snapshots.
  • Improve the Rok driver for Jupyter Notebooks to handle Notebook CRs instead of Pods.
  • Add toleration to ensure that RWX volumes work on GPU dedicated nodes.
  • Patch knative-serving Deployments and set the safe-to-evict annotation to true.
  • Support deploying Rok monitoring stack in air-gapped environments.
  • Restructure subnet configuration.
  • Minify the Node Exporter Grafana dashboard JSON definition to avoid server-side applying the Rok Monitoring Stack.
  • Fix a rendering bug in rok-deploy for the "Create IAM Role for Cluster Autoscaler" task.
  • Fix LIOd waiting forever for the TCM loop device to appear.
  • Add save/restore mechanism to manual installation guides.
  • Use simulate-principal-policy to verify permissions of IAM role for ExternalDNS.
  • Use simulate-principal-policy to verify permissions of IAM role for AWS Load Balancer Controller.
  • Use CloudFormation in the "Create IAM Role for ExternalDNS" guide.
  • Use CloudFormation in the "Create IAM Role for AWS Load Balancer Controller" guide.
  • Use simulate-principal-policy to verify permissions of IAM role for EKS Cluster and EKS Node IAM Role guides.
  • Use CloudFormation in the "Create Hosted Zone" guide.
  • Support using existing hosted zones in the "Create Hosted Zone" guide.
  • Update Verify section in the "Create Hosted Zone" guide.
  • Save the names of CloudFormation stacks.
  • Add missing environment variables to some questions in deploy2.
  • Support AMI releases 1.18.20-20211001, 1.18.20-20211003 and 1.18.20-20211004 [kernel version 4.14.246-187.474.amzn2.x86_64] for node groups on EKS.
  • Support AMI releases 1.18.20-20211008 and 1.18.20-20211013 [kernel version 4.14.248-189.473.amzn2.x86_64] for node groups on EKS.
  • Support AMI releases 1.19.13-20211001, 1.19.13-20211003 [kernel version 5.4.144-69.257.amzn2.x86_64] for node groups on EKS.
  • Support AMI releases 1.19.13-20211004, 1.19.14-20211008 and 1.19.14-20211013 [kernel version 5.4.149-73.259.amzn2.x86_64] for node groups on EKS.
  • Add fast-forward admonitions to manual installation guides for AWS.
  • Fix Kale image building breakage for Python versions other than 3.6.
  • Add the --run-from TASK argument to deploy2, to allow starting the installation from an arbitrary task.
  • Remove CloudFormation stack names from the deployment context.
  • Improve the 'Set Up Rok Storage Class' guide and introduce a verification step.
  • Update Verify section in the "Deploy Rok Components" guide.
  • Specify trusted CIDRs for both internal and internet-facing ALBs.
  • Restructure the "Configure Git" guide.
  • Add ops guide on how to gather logs for troubleshooting.
  • Restructure the "Clone GitOps Repository" guide.
  • Fix a bug when creating snapshots of notebooks with emptyDir volumes.
  • Add missing export for the subnets environment variable in the fast-forward section of the "Create EKS Managed Node Group" guide.
  • Fix a bug in the NodeUnstageVolume Rok CSI method that could result in stale (not deactivated) volumes.
  • Work around NFS kernel bugs that could result in leaving behind stale knfsd threads, preventing rok-csi from deactivating and deleting a RWX volume.
  • Improve the Rok Registry installation docs.
  • Add instructions to snapshot a notebook using the Rok UI, command line and Rok Python client.
  • Fix some omissions in the fast-forward sections of our docs.
  • docs: Add operations guide about recovering RWX volumes after node failure.
  • Use CloudFormation in the "Create ACM Certificate" guide.
  • Introduce ops guides related to firewalling.
  • Fix typo in the "Gather Logs for Troubleshooting" guide.
  • Add an operations guide on how to issue Rok Registry tokens.
  • Refactor rok-deploy to use client-side apply for the Rok Monitoring Stack.
  • rok-csi: Don't try to record events on non-existing resources.
  • Extend the docs with instructions to retrieve the logs of a Rok API task via the Rok UI.
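
As an illustration of the --validate/--no-validate flag pair mentioned in the bootstrap entry above, the following is a minimal sketch of how such a pair can be declared with Python's argparse (3.9 or later). The parser name, the help text, and the default value of True are assumptions made for the example; the actual bootstrap implementation is not shown in this changelog.

```python
import argparse


def build_parser():
    """Build a parser with a paired --validate/--no-validate flag.

    Hypothetical sketch: the program name and the default (True) are
    assumptions, not taken from the Rok bootstrap code.
    """
    parser = argparse.ArgumentParser(prog="bootstrap-sketch")
    # BooleanOptionalAction registers both --validate and --no-validate
    # and stores the result in a single boolean attribute.
    parser.add_argument(
        "--validate",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="run validation checks before bootstrapping",
    )
    return parser


def wants_validation(argv):
    """Return the effective value of the flag for a given argument list."""
    return build_parser().parse_args(argv).validate
```

With this sketch, wants_validation(["--no-validate"]) evaluates to False, while an empty argument list falls back to the assumed default.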

Version 1.3.1 (Sapphire)

  • Extend Istio to support regular expressions in Authorization Policies.

Version 1.3 (Sapphire)

  • Support RDM on Google Cloud.
  • Enable auto-recovery for Rok on Google Cloud.
  • Configure gcloud inside rok-tools.
  • Set up cloud environment for GCP inside rok-tools.
  • Support creating a GKE cluster.
  • Expose services on GCP.
  • Add instructions for logging in to EKF via the Google Identity Provider.
  • Rename the --aws-region argument of the Rok S3 daemon to --region.
  • Introduce the --authentication-scheme argument to the Rok S3 daemon, which controls the authentication scheme used when accessing the S3 service.
  • Introduce the --gcp-access-token argument to the Rok S3 daemon to pass the OAuth2 token when using the GCP authentication scheme.
  • Extend the Rok S3 daemon to automatically retrieve security credentials from GCP instance metadata when they have not been provided via the environment.
  • Introduce the --gcp-project-id argument to the Rok S3 daemon to pass the Google project ID to use when accessing Google Cloud Storage.
  • Extend the Rok Operator to support deploying Rok using Workload Identities on GKE.
  • Support deploying Rok in GKE using Workload Identities.
  • Add instructions to deploy Rok using a Workload Identity on GKE.
  • Prevent GKE from forcing v1beta1 CSI snapshot CRDs.
  • Use high performance storage for Rok external services on GKE.
  • Introduce Kubernetes resource quotas on Rok to allow assigning the system critical priority classes to System Pods.
  • Improve ordered list styles in docs.
  • Make the deploy overlays of our kustomizations build-able.
  • Support running nginx-ingress-controller in security-wise strict environments where privilege escalation is not allowed.
  • Introduce nav-buttons directive in docs.
  • do: Enrich the labels rok-do attaches to snapshots and remotes.
  • Improve toggle directive's nested functionality in docs.
  • Improve list design in docs.
  • Support deploying rok-tools inside an EC2 instance.
  • Support air gapped deployments on AWS.
  • Improve numbering in nested ordered lists in docs.
  • Make all documentation's headers black.
  • Improve code-block design in docs.
  • Introduce helper for preserving comments when removing entries from YAML manifests.
  • Temporarily revert changes when running rok-image-patch to support seamless upgrades after first invocation.
  • Extend the Rok common download helper to automatically encode the downloaded content using the encoding found in the HTTP headers of the response.
  • Introduce ec2 specific helpers in rok-aws that fetch an AWS instance's metadata.
  • Implement a Python 3 client for the MiniKF DDNS API.
  • Introduce a developer guide for bug report workflow.
  • Rename Maintenance section to Operations Guide.
  • Restructure "Configure Rok" and move it to Operations Guide.
  • Upgrade cert-manager to version 1.3.1.
  • Use hex encoding in S3Proxy credentials.
  • Use a predictable and unique storage account name on Azure.
  • Specify the S3 bucket prefix when deploying Rok on Azure.
  • Introduce a Rok Kubernetes client for SubjectAccessReview resources.
  • Introduce a Rok Kubernetes client for TokenReview resources.
  • Use the Rok Kubernetes client in the Kubernetes authorization and authentication backends of the Rok Django library.
  • Restructure "Test Rok".
  • Restructure the "Expose Services on AWS with ALB" guide.
  • Use a predictable and unique Managed Identity name on Azure.
  • Support Ubuntu Bionic kernel 5.4.0-1048-azure for AKS node pools.
  • Remove rok-conf dependency from RDM.
  • Introduce Kubernetes resource quotas to our manifests to allow assigning the system critical priority classes to System Pods.
  • Mark System Pods of Rok external services as critical to protect from OOM kills and evictions.
  • Add CPU requests to containers of Rok external services to protect them from CPU starvation.
  • Fix rok-image-patch to work with EKF 1.3.
  • Extend rok-kf-rebase to handle commits made with rok-image-patch.
  • Drop support for Kubernetes 1.16.
  • Bump the version of the Kubeflow manifests.
  • Introduce dedicated guide for patching manifests to use mirrored images.
  • Modify the "Switch release channel" document for 1.3.
  • Add Auto Scaling Rok AWS client.
  • Support multiple node groups and Availability Zones in rok-k8s-drain tool.
  • rok-k8s-drain: Recalculate utilization of candidate node before draining it.
  • rok-k8s-drain: Add extra logs while waiting for a node to be removed.
  • Support AMI releases 1.17.12-20210628 and 1.18.9-20210628 [kernel version 4.14.232-177.418.amzn2.x86_64] for managed node groups on EKS.
  • Restructure and enhance the Kale SDK guides.
  • Enable the AutoML-related features of Kale.
  • Increase the HTTP request header limits for the NGINX and Istio proxies, and Rok's Gunicorn.
  • Restructure and enhance the Mirror Arrikto GitOps repository guide.
  • Extend rok-notebook-upgrade script to support label selectors.
  • Extend rok-notebook-upgrade script to remove PodDefaults from notebooks.
  • Extend rok-notebook-upgrade script to add PodDefaults to notebooks.
  • Remove the AGPL-licensed libjbig2dec0 package from rok-tools.
  • Add instructions for logging in to EKF via the PingID Identity Provider.
  • Upgrade Kale to pick up bug fixes.
  • Fix an error in the migration script for config version v010300_0002.
  • Update the Rok 1.3 upgrade guide to check for Kubernetes version 1.17 or 1.18.
  • Introduce the Kale - Katib integration user guides.
  • Extend rok-image-list to include the Rok Registry image.
  • Ensure user-enabled Istio patches take effect after running rok-image-patch.
  • Remove some unnecessary dependencies from the Rok Registry image.
  • Move the Rok Registry overlays named registry-* into their own registry/ directory.
  • Enable Rok Trackers to port-check arbitrary hosts.
  • Enable Rok Thrower to specify a user-defined host during port-checking.
  • Enable Rok Thrower to announce a user-defined host to a Rok Tracker.
  • Enable users to change the Rok Tracker configuration from the RokRegistryCluster CR.
  • Make the Rok Tracker trust by default the hosts that Rok Thrower announces.
  • Allow exposing the Rok Thrower using a LoadBalancer Service.
  • Restructure the "Deploy Rok Registry" guide.
  • Handle modify/delete conflicts during rebase.
  • Fix rok-kf-prune to not remove necessary resources for cert-manager leader election.
  • Add a guide on how to configure a Rok cluster to sync data with other peers.
  • Introduce an ops guide for trusting a custom CA.
  • Use Dex as the default OIDC provider for authentication in Rok Registry.
  • Add a user guide on how to register a Rok cluster with Rok Registry.
  • Add user guides on how to publish and subscribe to buckets.
  • Support containerd as a container runtime for Kubernetes, by configuring Argo to use the PNS executor.
  • Update the "Scale-in Kubernetes Cluster" documentation and remove the single node group, single Availability Zone requirement.
  • Support Ubuntu Bionic kernels 5.4.0-1049-azure and 5.4.0-1051-azure for AKS node pools.
  • Add repo to detect 5.4 kernel source packages in the Amazon Linux 2 image.
  • Support AMI release 1.18.9-20210722 [kernel version 4.14.238-182.422.amzn2.x86_64] for managed node groups on EKS.
  • Support AMI release 1.18.20-20210813 [kernel version 4.14.241-184.433.amzn2.x86_64] for managed node groups on EKS.
  • Support AMI release 1.19.13-20210813 [kernel version 5.4.129-63.229.amzn2.x86_64] for managed node groups on EKS.
  • Support Ubuntu Bionic kernel 5.4.0-1044-gke for GKE.
  • Upgrade kubectl in rok-tools to 1.18.19.
  • Support Kubernetes version 1.19.
  • Restore "clone an existing Rok snapshot" functionality in VWA.
  • Set the failurePolicy to Fail in the MutatingWebhookConfiguration for the admission-webhook controller.
  • Upgrade Kale to fix local execution bugs and introduce some new features.
  • Set Node Exporter's port to 9200 to avoid possible conflicts when deploying Rok's monitoring stack alongside vanilla Prometheus installations.
  • Support AMI releases 1.18.20-20210826 and 1.18.20-20210830 [kernel version 4.14.243-185.433.amzn2.x86_64] for node groups on EKS.
  • Support AMI releases 1.19.13-20210826 [kernel version 5.4.129-63.229.amzn2.x86_64] and 1.19.13-20210830 [kernel version 5.4.141-67.229.amzn2.x86_64] for node groups on EKS.
  • Fix a bug where Rok Operator mishandled the trusted_CA_certs configvar during cluster upgrades.
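
The download helper entry above refers to using the encoding advertised in the HTTP response headers. The sketch below shows one way to do this with the Python standard library, reading the entry as "decode the response bytes into text using the charset from Content-Type". That reading, the function names, and the UTF-8 fallback are all assumptions; the real helper is not shown in this changelog.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value.

    Hypothetical helper: returns `default` when the header is missing
    or carries no charset parameter.
    """
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset lowercased, or failobj.
    return msg.get_content_charset(failobj=default)


def decode_body(body, content_type, default="utf-8"):
    """Decode response bytes using the charset found in the headers."""
    charset = charset_from_content_type(content_type, default)
    try:
        return body.decode(charset)
    except LookupError:
        # Unknown charset name: fall back to the assumed default.
        return body.decode(default, errors="replace")
```

For example, decode_body("café".encode("latin-1"), "text/plain; charset=ISO-8859-1") returns the original string, whereas decoding the same bytes as UTF-8 would fail.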

Version 1.2.2 (Ruby)

  • Set imagePullPolicy to IfNotPresent in Istio manifests.

Version 1.2.1 (Ruby)

  • Update the Rok 1.2 upgrade guide to check for Kubernetes version 1.17 or 1.18.
  • Handle potential conversion webhook misconfiguration during upgrades.

Version 1.2 (Ruby)

  • Introduce a new Django view in Rok GW to serve HTTP GET requests at /metrics and expose Rok metrics in Prometheus's text-based format.
  • Introduce a Grafana dashboard with multiple rows and panels to visualize Rok metrics, extracted from Prometheus's TSDB.
  • Set newTag only if necessary when patching images for air gapped deployments.
  • Add python-authlib in the Debian packages to install for CI, RokE and Registry images.
  • Separate Istio deployment from Rok and Rok Registry in rok-deploy.
  • Introduce "Arrikto" and "air gapped" custom admonitions in docs.
  • Include the S3 action performed in the logs of the S3 daemon.
  • Include the names of all libs3 functions called in the logs of the S3 daemon.
  • Truncate the MiniKF image name to conform to the naming restrictions of GCP and AWS.
  • Use Kubernetes 1.17 for EKS clusters.
  • Implement a common button component in the UI.
  • Introduce social login buttons in the UI.
  • Improve the button hover functionality in the UI.
  • Disable GC cron jobs in Rok Registry clusters.
  • Generate the rok-dlm-break service dynamically, based on the type of the appliance.
  • Add a Python script in the rok_pu package for testing individual target PUs.
  • Fix a bug where the Rok S3 daemon would not verify the SSL certificate of the S3 service it connected to.
  • Add a Rok cluster config variable to allow connecting to an S3 service without verifying its SSL certificate.
  • Configure Prometheus to run in multiprocess mode to allow Gunicorn workers to cooperate in order to expose GW metrics.
  • Restructure the 'Prepare Management Environment' section of the EKS docs to follow the current documentation guidelines.
  • Add the Prometheus Python client as a dependency to Rok's Django library.
  • Install the Prometheus Python client in Rok Registry container images.
  • Add settings for external OIDC providers for Rok Fort.
  • Add the 'SocialUser' model which holds information about users who authenticate with external OIDC providers in Rok Fort.
  • Add support for authentication via external OIDC providers in Rok Fort.
  • Protect the OIDC endpoints using a state parameter.
  • Add support for the OIDC callback URL in the common UI code.
  • Extend Rok Registry UI to initialize/finalize OIDC cycles.
  • Prevent updating the browser's history when scrolling in docs.
  • Increase documentation's content width.
  • Change ordered list design in docs.
  • Remove the depth limitation from the docs menu.
  • Update our docs with instructions on how to edit Registry-related images.
  • Fix a bug in Registry UI that was showing the "Sign In" form when there's a single Social provider.
  • Introduce Python helper to calculate Rok's build ID and use it from CMake.
  • Introduce Python helper to calculate the version for Rok's Python packages and use it from CMake.
  • Include Rok Registry in the release procedure.
  • Extend rok-image-mirror to dump list of mirrored images.
  • Skip creating a pending cluster configuration if there are no changes.
  • Fix a bug that prevented setting cluster config variables to values that contain braces.
  • Extend Rok Operator to upgrade cluster config variables that are not specified under .spec.configVars, but are provided by the users as fields in the CR's spec.
  • Add documentation for configuring external OIDC providers in Rok Fort.
  • Fix an incompatibility issue in Rok APIs that caused Prometheus metrics to be registered more than once in Python 3.
  • Fix a Python 3 compatibility bug in the Rok etcd3 client.
  • Implement an etcd backend for the Dynamic DNS API for MiniKF.
  • Introduce a Django-based Dynamic DNS API for MiniKF AWS instances that serves names under the minikf.arrikto.ai zone.
  • Introduce arrikto-dev, arrikto-contact and air-gapped admonition directives in docs.
  • Allow long links to wrap in docs.
  • Gracefully exit GC task of rok-do when the working directory is empty.
  • Fix error logs in Rok Registry and Rok Fort due to Prometheus integration.
  • Fix a validation bug for config variables that have already been converted to the proper Python type.
  • Fix a bug in MiniKF's provision script, where the list of downloaded images was not correctly passed to the ConfigMap of the admission webhook that sets the imagePullPolicy of downloaded images to Never.
  • Change the invocation policy of MiniKF's admission webhook, so that it is invoked again if a subsequent webhook (e.g., Istio injection webhook) further changes the Pod.
  • Introduce RDM overlay with a disk-script that works on Azure.
  • Upgrade Linux kernel in MiniKF to 5.4.104-0504104-generic to fix a Go runtime issue that made CSI sidecars crash because of hitting max locked memory limits.
  • Install virtualbox-guest-dkms and nvidia-440 in MiniKF of all supported platforms.
  • Do not attach the AmazonEKSClusterPolicy IAM policy to the EKS cluster IAM role.
  • Declaratively manage IAM roles needed to create an EKS cluster with AWS CloudFormation stacks.
  • Rename the assume-no-versioning command line argument of the Rok S3 daemon to --no-validate-versioning, and make it skip validation of S3 bucket versioning status when provided, regardless of whether versioning is used by the daemon.
  • Remove the --no-versioning argument from the Rok S3 daemon and automatically enable versioning when the IFC library is enabled via the --enable-ifc argument.
  • Instead of always listing versions to determine if an S3 bucket exists and is empty, only list versions if IFC is enabled, otherwise list objects, to ensure the S3 daemon is compatible with S3 APIs that do not support versioning.
  • Add a note for rebalancing the pods.
  • Update the gcloud SDK in MiniKF, as the currently pinned version was removed from the repo.
  • Enable TCP keepalives globally in Istio.
  • Fix a bug where custom admonitions did not support multiple CSS classes.
  • Introduce toggle directive in docs.
  • Introduce foldable admonitions in docs.
  • Add sphinx-tabs extension for tabbed content in docs.
  • Fix a bug where a user couldn't register a new Rok Registry from the settings page in the UI.
  • Fix email symbols handling in Rok Registry links in the UI.
  • Update NVIDIA driver and CUDA version in MiniKF to 460 and 11.2 respectively.
  • Mount ~/.docker/ on tmpfs to fix the broken symlink across MiniKF reboots.
  • Extend MiniKF to use rok-image-list and automatically generate the list of images that provision.py needs to pre-pull.
  • Use a newer version of python3-git to work with packed-refs created from newer Git versions. As a result, fix some import issues.
  • Redesign MiniKF's landing page for Vagrant.
  • Use our own nginx-ingress-controller kustomization instead of Minikube's ingress addon.
  • Use manifests to deploy Istio Ingress instead of applying a formatted string value.
  • Extend MiniKF to read docker/images-exclude and exclude images mentioned in this file.
  • Fix a bug in Rok UI where it throws a NullInjectorError for the AuthUrl InjectionToken.
  • Fix a bug that resulted in an incorrect suggested file name in Dataset snapshot policies.
  • Fix a bug where after changing the file name of a snapshot policy, the Rok UI would still display the default value.
  • Produce a smaller Vagrant box for MiniKF by excluding non-critical images from the pre-pull list.
  • Extend rok-version to generate a valid SemVer for MiniKF.
  • Fix a bug in MiniKF where it would always try to pull images from index.docker.io even if they exist locally.
  • Add design doc for authentication with external OIDC providers in Rok Fort.
  • Increase the amount of required RAM for MiniKF on VirtualBox from 10GB to 12GB.
  • Exclude extra Docker images from MiniKF on GCP to improve provisioning times.
  • Implement a composite authentication backend for the MiniKF Dynamic DNS API, to allow bearer token authentication for instances and admins.
  • Ensure that no stale containers are left in the final MiniKF image.
  • Update APT cache before installing kernel build dependencies on Ubuntu.
  • Support Ubuntu Bionic kernel 5.4.0-1040-azure for AKS node pools.
  • Support Ubuntu Xenial kernel 4.15.0-1108-azure for AKS node pools.
  • Support Ubuntu Xenial kernel 4.15.0-1109-azure for AKS node pools.
  • Support Ubuntu Xenial kernel 4.15.0-1111-azure for AKS node pools.
  • Extend the rok-tools manifests to support deployment on Azure.
  • Disable Azure's Admissions Enforcer for Istio.
  • Support RDM on Azure.
  • Retry Kubernetes watch() operations on ProtocolError exceptions.
  • Enable TCP keepalives in rok-kubernetes Python module.
  • Install Azure CLI in rok-tools.
  • Bring rok-deploy up-to-date with the latest instructions for cloning our GitOps repository.
  • Introduce manifests to deploy S3Proxy on AKS.
  • Extend the docs with instructions to deploy Rok over S3Proxy on Azure cloud.
  • Deploy Rok's external services (etcd/PostgreSQL/Redis) on Azure.
  • Expose services on Azure.
  • Configure Azure CLI inside rok-tools.
  • Set up a cloud environment for Azure inside rok-tools.
  • Support creating an AKS cluster.
  • Introduce the rok-kf-rebase CLI tool to help with manifests rebase.
  • Introduce the rok-kf-prune CLI tool to help with resource pruning during upgrades.
  • Update to Enterprise Kubeflow 1.3 manifests.
  • Add upgrade instructions for EKF 1.3.
  • Remove EKS references from platform-agnostic sections of the docs.
  • Add a maintenance guide with instructions on how to add an internal GitHub repository as a backup GitOps remote.
  • Add a maintenance guide with instructions on how to set up cluster-wide access to a Docker Registry.
  • Add aliases for Kubernetes memory units Ei, Pi, Ti, Gi, Mi, Ki.
  • Introduce script to scale-in a Kubernetes cluster.
  • Improve highlighting of prompts in doc's code blocks.
  • Update the Debian base image rok-do uses to debian/snapshot:stretch-20210511.
  • Use Kubernetes 1.18 for EKS clusters.
  • Expose services on AWS using Classic Load Balancer.
  • Fix a validation check for emails in our githooks that failed if an email address contained a dot.
  • Add maintenance guide for adding users in dex.
  • rok-k8s-drain: Fix scale-in script to handle Unauthorized Errors.
  • rok-k8s-drain: Remove K8s configuration confirmation question.
  • rok-k8s-drain: Ask for user input confirmation.
  • rok-k8s-drain: Update log file location.
  • rok-k8s-drain: Fix help argument to work with missing kube config file.
  • rok-k8s-drain: Use AWSRegion Question instead of AWSRegionArgument.
  • Introduce script to protect Arrikto EKF Pods from OOM conditions and CPU starvation.
  • Support rendering the Rok version string in docs, for example: Rok 1.4-rc5 "Titanium" (release - release-1.4) (iliastsi@rok-dev) (GCC 6.3.0) 2021-11-19T13:20:36Z.
  • Increase the buffer size that NGINX Ingress Controller allocates for reading HTTP response headers, so that it doesn't fail when the Rok UI returns large headers.
  • Add upgrade instructions for NGINX Ingress Controller.
  • Fix supported list of kernels.
  • Support AMI releases 1.17.12-20210526, 1.17.12-20210621, 1.18.9-20210526 and 1.18.9-20210621 [kernel version 4.14.232-176.381.amzn2.x86_64] for managed node groups on EKS.
  • Remove check for the AWS CLI credentials file when deploying in EKS.
  • Make the deploy overlays of our kustomizations build-able.
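
Note on the memory unit aliases listed above (Ei, Pi, Ti, Gi, Mi, Ki): the changelog does not say where these aliases are defined or how they are consumed. The following is only a minimal sketch of what the suffixes mean, assuming the standard Kubernetes power-of-two interpretation. The names BINARY_UNITS and parse_memory are hypothetical and are not part of Rok.

```python
# Hypothetical helper, not Rok code: maps Kubernetes binary memory
# suffixes to their power-of-two multipliers.
BINARY_UNITS = {
    "Ki": 2**10,
    "Mi": 2**20,
    "Gi": 2**30,
    "Ti": 2**40,
    "Pi": 2**50,
    "Ei": 2**60,
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '12Gi' to a number of bytes."""
    quantity = quantity.strip()
    for suffix, factor in BINARY_UNITS.items():
        if quantity.endswith(suffix):
            # Strip the two-character suffix and scale the integer part.
            return int(quantity[: -len(suffix)]) * factor
    # No known suffix: treat the value as a plain byte count.
    return int(quantity)
```

With this sketch, a value such as "12Gi" (the MiniKF RAM requirement mentioned above) becomes 12 * 2**30 bytes. Decimal suffixes (k, M, G) and fractional quantities are deliberately left out.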

Version 1.1.1 (Quartz)

  • Update the Rok 1.1 upgrade guide to check for Kubernetes version 1.17.

Version 1.1 (Quartz)

  • Make our AWS CloudFormation client, and rok-s3-authorize by extension, idempotent.
  • Improve the periodic rule of Rok API version retention policies to retain the latest instead of the earliest version in each interval.
  • Do not include group members in the files list API call of the Rok API.
  • Extend the files list API call of the Rok API to support including deleted files in the response.
  • Include the number of versions of each object in the files list API call of the Rok API.
  • Support pagination in the files list API call of the Rok API.
  • Extend Rok's provisioning tool for Kubernetes with the --delete mode to delete specified Kustomize packages.
  • Add a loader to the select all button of the Rok UI.
  • Use pagination in the copy and delete files dialogs of the Rok UI.
  • Use pagination in the files list page of the Rok UI.
  • Remove a backwards compatibility fix for Rok versions v0.10 or earlier, that allowed passing the task ID in place of the bucket name to retrieve a task by ID in the API call to list the tasks of a bucket in the Rok v1 services API.
  • Replace the coarse grained authorization which was applied by the Rok API to provide namespace isolation with fine grained authorization tests for each API call, ensuring the user is authorized to perform the specific action they requested.
  • Remove a workaround that automatically added the Kubeflow-UserID header in all Rok client requests performed inside a Kubernetes cluster.
  • Only allow authenticating via a token in the Rok client and CLI.
  • Drop the GW_ part of all environment variables used by the Rok client. For example, rename ROK_GW_TOKEN to ROK_TOKEN.
  • Use the Authorization: Bearer <token> header instead of the X-Auth-Token: <token> header for authentication in the Rok API and client.
  • Relax a restriction in our githooks that required every introduced Rok config version in our repo to also immediately be the target one.
  • Support using more than one authentication backend simultaneously in the Rok API.
  • Support authentication via Kubernetes tokens in the Rok API.
  • Retrieve the CSRF token from the X-XSRF-Token header in the Rok API.
  • docs: Document how Rok CSI handles auto-registration for VolumeSnapshots
  • Introduce more fine-grained ClusterRoles for users and administrators to provide access to the Rok API.
  • Restrict access to individual Rok API services via RBAC rules.
  • Fix a bug where Rok API tasks created using a Kubernetes token failed to access the Kubernetes API due to using the user ID instead of the username for impersonation.
  • Introduce a design document for the Kubernetes Rok operator.
  • Restrict Rok CSI to only allow registering VolumeSnapshots in the same Rok account as the snapshot's Kubernetes namespace.
  • Restrict Rok CSI to only allow creating PVCs from a Rok URL in the same Kubernetes namespace as the account of the Rok URL.
  • Remove support for the rok/origin-fisk and rok/origin-fisk-group annotations from Rok CSI, which violated namespace isolation by allowing users to register any fisk into their account.
  • Extend our APT helper to install packages in a batch while retaining progress reports.
  • Remove a 500ms delay from our progress messages in the 'dialog' frontend.
  • Use a distinct call to list group members in the versions list page of the Rok UI.
  • Introduce separate tasks to manage different deployments repos.
  • Rename the Rok CLI from rok-gw to rok.
  • Automatically reload tokens before every request in the Rok client if they have been provided using the file: prefix.
  • Extend rok-do to garbage-collect local artifacts.
  • Add design document for Rok CLI questions.
  • Set argparse.SUPPRESS as the global default for CLI args and display the enclosing Question's default in the CLI arg's help message.
  • Do not mutate CLI argument defaults via preseed files.
  • Extend rok-version with the --build-tag argument to report the versioned tag of build artifacts.
  • Extend Rok's build version with the source branch of the release.
  • Add license, build type and git branch information to rok-do tasks that manage manifests, docs and the deployment repositories.
  • Introduce per release open-ended upgrade notes and fold any generic ones into the version-specific ones.
  • Include fixes for upstream dm-era bugs in the rok-kmod images.
  • Introduce a script to upgrade the image of all notebooks in a cluster.
  • Create Rok Registry images with rok-do.
  • Introduce a script to perform a rolling reboot of a Kubernetes cluster.
  • Introduce a script to reset the CBT data of all Rok PVCs.
  • Fix a bug where the Rok etcd library would sometimes report an incorrect number of retries in its logs.
  • Fix a bug where the Rok DLM CLI would incorrectly log warnings about all other DLM clients being missing when requested to retrieve information for one of them.
  • Fix an out of bounds memory access bug in the Python bindings of librok_dlm that resulted in the rok-dlm CLI occasionally segfaulting and leaving behind stale locks after a pod restart.
  • Extend rok-deploy to deploy Rok Registry clusters and split the deployment process into three steps: Deploy, Generate manifests, Apply manifests.
  • Improve the Kubeflow recurring runs upgrade instructions to use the Jobs page and clone old failing runs.
  • Include the user's AWS account ID in the default S3 bucket name prefix.
  • Omit the -rok-rok suffix from the name of the CF stack and related IAM resources needed to grant Rok full access to S3 buckets.
  • Fix a bug where the modal for entering an authorization code in Rok UI closes unexpectedly.
  • Use UI's path as a prefix when storing and retrieving localStorage values.
  • Introduce rok-do tasks for building the Rok Documentation with any combination of (builder type, tags).
  • Incorporate the public tag of the Rok Documentation into the logic/content of the docs.
  • Use Debian image snapshots as the base Docker images for rok-do tasks.
  • Add an option that disables the offline warning notification for specific requests in the UI.
  • Remove the v prefix from Rok version and related artifacts.
  • Fix a bug where the Rok S3 daemon would attempt to assume an AWS role using the AWS STS endpoint of an incorrect region.
  • Revamp the Rok S3 daemon bucket versioning validation to first retrieve the versioning, and then if required either update it during formatting or fail with an error during validation.
  • Support deploying Rok over pre-existing, empty S3 buckets
  • Fix a wrong route in Authservice's SKIP_AUTH_URLS setting.
  • Replace the patchesStrategicMerge and JSONPatches6902 fields with the patches one in the kustomization file of monitoring's deploy overlay.
  • Allow the user to verify if the S3 IAM role exists, instead of making it a strict check in rok-deploy.
  • Prevent the auto-redirect to the Kubeflow dashboard from the OAuth callback page.
  • Highlight the active menu item in the Rok docs.
  • Upgrade Font Awesome version in docs.
  • Improve the appearance of admonitions in the docs.
  • Allow selecting the prompts in all code-blocks except console in the docs.
  • do: Improve the way we clean up and snapshot MiniKFs
  • Loosen the newsworthiness check of our githooks by ensuring that at least one of NEWS.rst, Changelog.rst is updated by a commit that closes a GH issue.
  • Fix a bug in the responses of the OAuth endpoints in the Rok API.
  • Use the correct Registry base URL in the Rok UI during the Rok registration process.
  • Support using classic ELB instead of ALB to expose NGINX.
  • Support terminating TLS on NGINX instead of using an ACM certificate at ALB.
  • Introduce manifests for creating self-signed certificates and expose Rok+EKF with ELB in front of NGINX.
  • Support AMI release 1.16.15-20210310 [kernel version 4.14.219-164.354.amzn2.x86_64] for managed node groups on EKS.
  • Fix rok-lio bug that causes rok-csi to misdetect whether a Fisk is exposed as a block device.
  • Fix race in the pre-clone verification step of LVMd that could lead to errors, such as failures to unexport the origin Fisk, I/O errors, and stale TCMU handlers.
  • Support applying different set of patches for each supported kernel version in do/kmod tasks.
  • Support AMI release 1.16.15-20210322 [kernel version 4.14.225-168.357.amzn2.x86_64] for managed node groups on EKS.
  • Support serving multiple versions of the docs.
  • Fix rok-do to download the correct kernel source for Ubuntu kernels.
  • Support AMI releases 1.16.15-20210329 and 1.16.15-20210414 [kernel version 4.14.225-169.362.amzn2.x86_64] for managed node groups on EKS.
  • Support AMI release 1.16.15-20210501 [kernel version 4.14.231-173.360.amzn2.x86_64] for managed node groups on EKS.
  • Support AMI releases 1.16.15-20210504, 1.16.15-20210512 and 1.16.15-20210518 [kernel version 4.14.231-173.361.amzn2.x86_64] for managed node groups on EKS.
  • Mark Rok and RokCSI Pods as critical, to avoid OOM kills and evictions.
  • Improve the copy button to implement exactly the same behavior as manually selecting and copying text.
  • Improve copy behavior for secondary prompts in doc's code blocks.
  • Improve text color for command's output in doc's code blocks.
  • Improve copy behavior in doc's code blocks with command's outputs.
  • Add CPU requests for RokE and Rok CSI containers to protect them from CPU starvation.
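
Note on the periodic retention rule listed above ("retain the latest instead of the earliest version in each interval"): the changelog does not describe how Rok defines its intervals. The sketch below assumes fixed-width intervals aligned at timestamp zero and versions identified by integer timestamps. The function name retain_latest_per_interval is hypothetical and only illustrates the "latest per interval" idea.

```python
from typing import Dict, Iterable, List


def retain_latest_per_interval(timestamps: Iterable[int], interval: int) -> List[int]:
    """Keep only the newest timestamp that falls in each interval.

    Assumption: intervals are [k * interval, (k + 1) * interval).
    """
    latest: Dict[int, int] = {}
    for ts in timestamps:
        bucket = ts // interval
        # Replace the retained version whenever a newer one shows up
        # in the same interval.
        if bucket not in latest or ts > latest[bucket]:
            latest[bucket] = ts
    return sorted(latest.values())
```

For versions taken at seconds 0, 10, 59, 60, 61 and 130 with a 60-second interval, this keeps 59, 61 and 130. A rule that retained the earliest version instead would keep 0, 60 and 130.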

Version 1.0 (Platinum)

  • Fix a bug where the account selector in the Rok UI sometimes displayed the incorrect account.
  • Do not display a logout button when logging out is not possible in the Rok UI
  • Fix a bug where Rok API drivers would use the account instead of the user to perform authorization checks for tasks.
  • Fix a bug where the Rok UI would sometimes raise an undefined variable exception after logging in.
  • Fix a bug where the Rok UI would ignore the namespace selected via the Kubeflow dashboard selector
  • Fix a bug where the Rok UI would not render correctly in a Kubeflow environment.
  • Fix a bug where Kubernetes exceptions would not be converted to a Unicode string properly, resulting in the messages of Kubernetes errors not being visible in Rok task logs.
  • Fix a bug where the Rok client would fail to retrieve the user's ID when using static authentication.
  • Remove secrets from the allowed variables in Rok CSI auto-register URLs.
  • Fix a bug where Rok CSI would fail to auto-register a VolumeSnapshot when the Rok API was using AuthService authentication.
  • Fix a bug where Rok CSI would fail to hydrate a PVC when the Rok API was using AuthService authentication.
  • Give Rok CSI a rok-admin ClusterRole to allow it to access all Rok accounts.
  • Extend Rok's provisioning tool for Kubernetes with the --apply mode to avoid questions, skip regeneration of manifests and only apply specified Kustomize packages.
  • Make rok-do fail by default if a path in the host is needed by a task and it does not exist.
  • Replace CommandNotFoundError with CommandOSError, which is more broad and accurate.
  • Fix the logging of byte strings (and the b'...' prefix) in the cmdutils module.
  • Persist the home directory of user root inside rok-tools by mounting a Docker volume or Kubernetes PVC at /root.
  • Correctly display the account name instead of the user ID in Rok CLI.
  • Move authorization code from the Rok API views to a dedicated backend.
  • Store the Kubernetes namespace UUID in Rok API accounts and verify it matches the one on Kubernetes with every request to prevent accessing resources on Rok after the namespace has been deleted.
  • Add fine-grained authorization to account metadata updates in the Rok API.
  • Introduce the rok-cluster-admin ClusterRole for Rok cluster administrators on Kubernetes.
  • Prevent auto redirect to KF dashboard when the Rok UI is in chooser mode.
  • Bump the version of Istio that Rok's provisioning tool for Kubernetes installs to 1.5.7.
  • Remove a late import in Rok's log formatting code, which could cause a deadlock between the log handler's lock and the Python module import lock during the initialization of the Rok client by Rok CSI.
  • Improve the style of all links in the Rok UI.
  • Display the number of versions in the object list of the Rok UI
  • Migrate githooks to Python 3.
  • Use Angular's infinite scroll component in the Rok UI.
  • Implement search support for buckets and objects in the Rok UI.
  • Export the Rok client, its error classes and the helpers responsible for querying Rok URLs at the Rok client's module level.
  • Introduce a helper to the Rok client to list the members of a group.
  • Fix a bug where Rok CSI would sometimes use the incorrect Rok API version when restoring a volume from the Rok URL of a group.
  • Introduce group delete for objects and versions in Rok UI.
  • Improve messaging in UI's network errors.
  • Suppress C812 Flake8 error, because it doesn't offer us much and leads to a bit uglier code.
  • Perform retries when setting the versioning status of an S3 bucket, to work around the fact that the S3 API sometimes returns 404 errors for buckets that have just been created.
  • Suppress E741 Flake8 error, because most monospace fonts already do a good job at showing "l", "I" and "1" differently.
  • Add a way to lazily evaluate Task attributes in rok-do
  • Introduce rok-dev, a Debian Stretch environment for Arrikto devs.
  • Enable logs in UI's production builds
  • Fix CRD validation in Istio kustomizations.
  • Provide a ClusterRoleBinding for the rok-admin and rok-cluster-admin ClusterRoles to the rok and rok-operator ServiceAccounts.
  • Fix Githooks random behavior regarding flake8 checks
  • Add support for creating a Docker image with Python 3.5.1 installed.
  • Preserve LC_ALL when running tasks in a remote with rok-do.
  • Build Rok Enterprise Docker images with rok-do
  • Improve rok-dev with support for running rok-do
  • Make Python bindings compatible with Python 3 and ship the corresponding Python 3 packages.
  • Add support for building the Rok Operator Docker image with rok-do.
  • Add support for building the Rok Disk Manager Docker image with rok-do.
  • Add support for building the Rok CSI Docker image with rok-do.
  • Give Rok CSI nodes the rok-admin ClusterRole, to provide them access to all Rok accounts.
  • Reduce configd log spam by rendering config only if member is not up-to-date
  • Improve the Rok API error message when accessing an account for a Kubernetes namespace that does not exist.
  • Fix a bug where the Rok Composer could deadlock while serving simultaneous requests to delete and access a fisk.
  • Support snapshot policies in the Rok GW Jupyter driver.
  • Support snapshot policies in the Rok GW dataset driver.
  • Reduce electiond log spam by watching the master lease without timeout.
  • Preserve query parameters when the namespace changes in Rok UI.
  • Add documentation for cmdutils, as well as a developer guide with examples for some common scenarios.
  • Extend LVMd to report successful snapshot completion.
  • Allow LVMd to recover from an interrupted snapshot.
  • Introduce config variables to set up cron jobs for local/global GC.
  • rok-csi: Add support for garbage collecting LVs and nodelocal fisks owned by LVMd.
  • Remove the "escalate" permission from the Rok Operator/Cluster pods.
  • Fix a bug where the UI was showing the wrong object count when deleting objects.
  • Add a mixin with common helpers for Rok-related tasks in rok-do.
  • Handle existing tags in deployments repo and avoid tagging trunk versions.
  • Handle transient disconnections in a less intrusive way in Rok UI.
  • Introduce a user guide for snapshot and retention policies.
  • Disable msg_delay in text progressbar
  • lvmd: Ensure we delete stale resources under normal operation.
  • rok-csi: Skip GC-ing nodelocal fisks when composer runs in non-nodelocal mode.
  • rok-csi: Improve GC logs.
  • Add a rok-do task to GC old Docker images used by rok-do.
  • Fix a bug where the rok_common.apt Python module would ignore failures to update the APT cache, because apt-get update returns with a 0 exit code.
  • Fine-tune the update strategy for rok-disk-manager and rok-kmod DaemonSets so that they can be upgraded in parallel.
  • Remove the message limits in the Rok etcd v3 client.
  • Add support for building the Rok Tools Docker image with rok-do.
  • Fix running rok-do subtasks as direct goal tasks.
  • Implement API call to retrieve the members of a group in the Rok API.
  • Make the task-gc management command more efficient by avoiding having to protect all parameters of all tasks.
  • Use an LRU cache for the classes dynamically created when protecting objects to fix a performance issue when protecting large numbers of objects. This will also improve performance of task-gc management command.
  • Improve the efficiency of recursive listing in the etcd v2 emulation client by using a node index when formatting the response.
  • rok-csi: Extend GC to unfreeze frozen filesystems and collect stale device mapper devices.
  • Document how we generate Docker images for the Kubernetes CSI Sidecars.
  • Remove any force-cleanup logic from rok-deploy that could purge a non-empty directory specified by the user as their local GitOps repository.
  • Introduce manifests to deploy a monitoring stack alongside Rok on Kubernetes, based on Prometheus and Grafana.
  • Configure Prometheus to periodically scrape and store metrics from Rok's etcd.
  • Add a dashboard to Grafana to visualize Rok's etcd metrics.
  • Configure Prometheus to periodically scrape and store metrics from Rok's Redis.
  • Add a dashboard to Grafana to visualize Rok's Redis metrics.
  • Add public document with description and deployment steps for Rok's monitoring stack on Kubernetes.
  • Use Kubernetes 1.16 for EKS clusters.
  • Work around a Mitogen issue where the standard I/O streams in the remote are in non-blocking mode.
  • Update the code for deploying a Rok Registry cluster.
  • rok-csi: Record all logs and progress updates as events on the corresponding Kubernetes object.
  • rok-csi: Allow displaying the subjob progress along with the total progress.
  • rok-gw: Allow displaying the virtual subtask progress along with the total progress.
  • rok-csi: Fail stale VolumeSnapshots after Pod restart
  • do: Warn when a task does not support caching
  • Fix task's logs alignment in Rok UI
  • rok-csi: Support migrating PVs from cordoned nodes.
  • do: Create rok-kmod image using Debian packages.
  • Decouple do task NGINXStaticSite from docs
  • do: Support caching in NGINXStaticSite
  • Introduce the run-if-master tool to allow easily running commands on the master node of the Rok cluster.
  • Introduce a helper to acquire an exclusive cluster-wide DLM lock.
  • do: Take the env and entrypoint task attributes into account when caching a task.
  • Introduce a way to uniquely identify a process in a running host, by computing an ID that cannot be reused during the host's uptime.
  • Extend the run-if-master tool to break all stale DLM locks left behind by the process it executed.
  • Allow garbage collecting Rok API tasks based on their status.
  • Enable automatic garbage collection of Rok API tasks in the Rok cluster.
  • do: Hint to the task that must run when a fromsnap is not found.
  • do: Support adding labels to rok-do snapshots.
  • do: Add support for GCP remotes.
  • Support provisioning MiniKF using the new Kubeflow manifests.
  • Remove Pod deletion logic from Rok Operator; delegate this task to the DaemonSet Controller
  • do: Automate building MiniKF images for GCP.
  • deploy: Improve auto-detection of EKS cluster name to handle clusters created with eksctl.
  • do: Automate building MiniKF images for AWS.
  • Use the j2 CLI to render Jinja templates instead of using envsubst and environment variables.
  • csi: Unpin both used and unused PVCs.
  • csi: Produce events when pinning/unpinning a volume.
  • csi: Automate garbage collecting completed jobs every hour.
  • csi: Do not crash if etcd goes down.
  • Update the Rok operator and systemd units to break locks in the master namespace.
  • Fix an issue where computing the run ID of a process occasionally failed due to a bug when parsing the process stat file.
  • Use fixed size widgets in our dialog based frontend.
  • Fix yld() not to leave open fds behind.
  • electiond: Fix a bug where if the Rok master node was permanently removed, other nodes did not attempt to become master.
  • cluster: Do not lock the master lease just for inspecting it
  • aws: Add CloudFormation support
  • minikf: Reduce timeout limit of APT connections
  • lvmd: Log info that can help us debug filesystem related issues.
  • lvmd: Verify the filesystem state.
  • lvmd: Recover the filesystem journal when activating volumes.
  • csi: Use the same PU object for both CSI and LVMd running on the same process.
  • liod: Set the tcmu_handler timeout to infinity while waiting for a connection with Rok to succeed.
  • operator: Use the kubernetes.io/hostname Kubernetes node label over the name one to schedule Rok CSI Guard Pods more robustly.
  • manifests: Remove Pod Disruption Budgets for Istio.
  • operator: Take into account unschedulable nodes when calculating which nodes to guard to avoid unneeded resource create-delete-recreate cycle.
  • Use the watch helpers provided by the Rok etcd clients when watching for document changes in the Rok API.
  • operator: Emit more events to increase observability into the cluster scaling algorithm
  • Add design document for Rok Disk Manager (RDM)
  • Revamp Rok Disk Manager to always request LVs with size that is a multiple of the block size, i.e. 512.
  • RDM: Hash block devices based on the underlying kernel device, not their path.
  • Fix a bug where rok-deploy modified the kustomization file for Istio, removing some useful resources/transformers.
  • docs: Extend our guides with instructions on how to create a dedicated VPC for the EKS cluster
  • Add missing packages (curl and bsdmainutils) in rok-tools image
  • rok-gw: Fix a bug where the Rok StatefulSet driver would create a group resource with the wrong order for the registered disks.
  • rok-gw: Fix a bug where the Rok StatefulSet driver would not sort the Pod names correctly, placing pod-10 before pod-2 inside the generated group resource.
  • csi: Document how to create a StatefulSet from a Rok group resource using the rok/origin annotation.
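
Note on the StatefulSet driver fix listed above (pod-10 placed before pod-2): this ordering is what plain string sorting produces, because "1" compares lower than "2". The changelog does not show the actual fix. The sketch below is one common way to sort by the numeric ordinal instead, assuming Pod names of the form <name>-<ordinal>. The helpers ordinal_key and sort_pods are hypothetical.

```python
import re
from typing import List, Tuple


def ordinal_key(pod_name: str) -> Tuple[str, int]:
    """Split '<name>-<ordinal>' so that the ordinal compares as an integer."""
    match = re.fullmatch(r"(.*)-(\d+)", pod_name)
    if match is None:
        # Names without a numeric suffix sort before numbered ones.
        return (pod_name, -1)
    return (match.group(1), int(match.group(2)))


def sort_pods(names: List[str]) -> List[str]:
    """Return the Pod names ordered by base name, then by numeric ordinal."""
    return sorted(names, key=ordinal_key)
```

For example, sort_pods(["pod-10", "pod-2", "pod-0"]) yields pod-0, pod-2, pod-10, whereas sorted() without a key yields pod-0, pod-10, pod-2.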

Version 0.15.1 (Onyx)

  • Move docs out of the CMake build system.
  • Make the building of docs depend on version-specific manifests.

Version 0.15 (Onyx)

  • manifests: Use latest kmod image and kubeflow/manifests
  • Revamp the instructions to test a Rok installation on EKS
  • doc: Use proper mount for Docker
  • doc: Add deploy overlays to EKS guide manual option
  • doc: Update instructions of building the rok-kmod image
  • manifests: Add .cache kfctl folder to gitignore
  • Enhance guides of onboarding and release procedure
  • cli: Store logs under ~/.rok/log
  • operator: Fix bug with stale cluster config
  • Add instructions to configure the Kubernetes namespaces and RBAC rules after installing a Rok cluster in EKS
  • scripts: Fix tag creation in manifests script
  • rok-kmod: Update Dockerfile.local with missing kernel
  • Restore all Rok probes except the one used by the Rok appliance to Python 3
  • conf: Set master_capable to True on Kubernetes
  • deploy: Provision auth components
  • doc: Treat warnings as errors when building with Makefile
  • Fix an invalid JSON document in the EKS installation docs
  • scripts: Make manifests script adopt existing repos
  • doc: Mention EKF instead of MiniKF
  • doc: Do not copy the results when the user selects text
  • manifests: Use string replacement instead of jinja2 templating
  • kmod: Don't start a progress bar if there are no modules to install
  • gw: Always display cancel button in services form
  • Static rok and ekf themes
  • doc: Do not copy the results shown in blocks
  • gw: Move namespace selector into its own component
  • doc: Update Kubeflow integration doc
  • Hide and show code blocks in docs
  • Make our manifests templates and have bases only refer to proper image tags
  • Introduce a developer guide for the Kubernetes client's initialization
  • Fix a bug in the Kubernetes Rok API drivers that caused SubjectAccessReview requests to sometimes fail with an unauthorized error
  • doc: AuthService Integration
  • Kubernetes: Configure dockerconfig with rok-deploy
  • Introduce the v2 services and OAuth APIs in Rok, to allow Rok clients to interact with any account instead of only the one matching their user UUID
  • Include CMake>=3.8.2 as a new build dependency since we make use of the COMMAND_EXPAND_LISTS option of add_custom_command.
  • Make AuthService authentication the default in Kubernetes
  • Introduce the AUTHORIZATION_BACKEND setting for the Rok API to control the way requests are authorized
  • Convert all Rok API authentication backend names to lowercase
  • Rename the static-authservice authentication backend to authservice in the Rok API
  • Fix custom fonts in doc
  • Further improve Python 3 compatibility
  • doc: Use example.com in our public docs
  • Make Kubeflow-UserID the default user header when using AuthService authentication in the Rok API
  • doc: Improve doc on Kubeflow's integration with GitLab
  • Fix services request with namespaces
  • Enhance Kubeflow integration and use ekf overlays in KfDef
  • doc: Fix broken copy button
  • manifests: Move Rok manifest to its proper place
  • Revert Rok probes to Python 2 to work around missing dependencies for the Rok cluster probe
  • Make the Rok etcd3 client compatible with Python 3
  • Automatically allow access to Rok API resources to users that have access to Kubeflow resources in the same Kubernetes namespace
  • doc: Add absolute URL in snippet commands
  • cmake: Separate ctypesgen preprocessor flags
  • Kubernetes: Refactor manifests
  • Fix a bug in the Rok S3 daemon template
  • Build custom dex image
  • Enable the Rok API and UI to run behind Istio with AuthService authentication
  • common: Detect dirty repo and return trunk version
  • Kubernetes: Make Redis probe Python3-compatible
  • etcd: Add Python3 package for v3
  • doc: Extend docs and add integrations
  • Enable building reproducible rok-kmod images locally
  • kmod: Fix typo in Ubuntu PPA Dockerfile
  • rok-tools: Serve Rok's public docs
  • rok-kmod: Use rok-kmod debian package in rok-kmod's Dockerfile
  • githooks: Exclude json.in from Copyright check
  • debian: Introduce rok-kmod package
  • rok-kmod: Convert to Python3 and introduce python package
  • doc: Make public docs customer-friendly
  • common: Properly dump to file in current dir
  • Kubernetes: Introduce rok-deploy
  • probes: Make probes library Python3 compatible
  • doc: Change doc's layout
  • common: Open dump_to_file in text mode by default
  • Mention bootstrapping in the docs
  • Make a number of small fixes to the Rok client to ensure our CI tests pass after transitioning to Python 3
  • ci: Configure locale inside chroot
  • Update the botocore dependency of the Rok AWS library to 1.12.103
  • Integrate Rok with ctypesgen 1.0.2
  • doc: Fix broken copy button image in nested docs
  • Support mass deletion in the Rok UI
  • Revamp the initialization of the Rok S3 daemon to identify deployment errors as soon as possible
  • Introduce formatting and validation to all Rok PUs
  • Correctly include the Rok Tools template in the docs
  • kmod: Build reproducible rok-kmod images
  • doc: Do not copy/link sources in public docs
  • Minor fixes in the Python wheels doc
  • Fix error reporting in Python 3 in the Rok client
  • kmod: Find available custom modules
  • Give modules installed by rok-kmod the highest priority
  • Introduce instructions for EKS
  • Add design document about the formatting and validation of Rok daemons
  • Introduce kustomize overlays for EKS
  • Introduce Rok Tools
  • doc: Make various adjustments to the rok-do guide
  • Avoid retrying all available methods of retrieving security credentials when updating them in the Rok S3 daemon
  • Support reading values from a file in the Rok C argument parser
  • Display bucket descriptions in the Rok UI
  • Prepare towards Python3 packages
  • rokfs: Make ioctl prototype conditional
  • operator: Set/apply cluster config
  • Allow deleting a specific bucket or all buckets of a Rok cluster using the Rok AWS helper scripts
  • Make rok_cluster an optional dependency of rok_aws
  • Add entrypoints for AWS helper scripts
  • Add AWS C++ SDK to rok-do build dependencies
  • cmake: Use -Og on Debug and fix ctypesgen flags
  • Kubernetes: Use rok-probed in initContainers
  • Make the Rok common helpers that convert strings to bytes and Unicode compatible with Python 3
  • doc: Add Rok upgrade guides for Kubernetes
  • Disable Fort signups
  • operator: Cluster-neutral logging
  • scripts: Allow purging multiple S3 buckets at once
  • Add search support in Rok UI
  • rdm: Activate the LVs when loading a VG/LV
  • Check if the source directory exists when adding Python tests in CMake
  • gw: Display the number of versions in objects list
  • gw: Change link style across the Rok UI
  • Update rok-do instructions
  • cmdutils: Add check and log_error to wait()
  • bootstrap: Improve validation
  • Kubernetes: Treat configVars as object
  • cmake: Add non-bootstrapped env as possible failure reason
  • libredis: Implement scanning keys and batch deletions
  • libtasks: Fixes and support for disabling logging to frontend
  • Remove a stale file
  • Add support for the IAM Roles for Service Accounts feature of EKS to the Rok S3 daemon
  • Add a design document explaining in detail the way Rok pods gain access to AWS services when running within an EKS cluster
  • Improve handling of time durations in timeutils
  • Add script to attach an IAM role to the Rok service inside an EKS cluster
  • Add script to purge an S3 bucket
  • rok_args: Do not set dest for Sensitive arg
  • libredis: Fix various bugs
  • gw: Disable group toggle button when group is empty
  • Add bootstrap and get build version with Python
  • gw: Remove created info from task popover
  • dm clone: Fix discard handling and overflow bugs which could cause data corruption
  • operator: Add helpers to get CR info as rok-init metadata
  • Add new tooltip messages
  • Introduce file badge component in Rok UI
  • githooks: Use relative paths for symlinks
  • scripts: Fix a check for enabled githooks
  • Fix various issues related to double reclassing
  • Add guidelines for testing to the Rok documentation
  • Add script to attach EBS volumes to a Kubernetes cluster
  • Add perf tests for libfiber
  • gw: Use bigger icons in services header
  • Introduce new delete dialogs in UI
  • Fix monospace and bold in UI
  • Style changes in the authorizations page in UI
  • Minor Kubernetes-related fixes
  • libredis: Refactor code and support retries
  • conf: Fix ip_reachable and remove default gateway verification
  • Correctly initialize the Rok 0.15 client in MiniKF
  • Update the MiniKF kustomize templates and wheels
  • blkutils: Add --force for RAID devices with 1 drive
  • Improve reporting of sizes in CLIs
  • config: Factor out DLM lock break
  • Fix and upgrade custom tensorflow images
  • Update the Dockerfile used to produce the notebook image to create the required Python wheels using rok-do
  • Make rok-do less noisy in case of errors
  • conf: Support disabling host header check
  • libredis: Enforce redis scheme
  • scripts: Improve add_signature() to work on rebase
  • Update Rok Kubernetes guides
  • operator: Support cluster upgrades
  • End-to-end building of Python wheels with rok-do
  • operator: Retrieve secrets from CR
  • Add generic helpers to get, list, and retrieve the owners of resources to the Rok Kubernetes client
  • libmap: Migrate epoch cache to Redis
  • Keep logs in case cronic fails
  • Do not deepcopy service params to increase the performance of service-related API calls
  • operator: Remove hardcoded cluster refs
  • rok-csi: Recover volumes from deleted nodes
  • libredis: Introduce connection pool
  • Add a simple graph implementation to the Rok common module
  • kustomize: Manage Rok Storage/VolumeSnapshot classes
  • trpt: Print message when magic number is invalid
  • rok-init: Add basic support for upgrading clusters
  • operator/kustomize: Add Redis endpoint
  • appliance: Add redis endpoint
  • operator: Fix bug in member removal
  • doc: Update stretch build dependencies
  • python/pu: Check PU status before releasing objects
  • python: Replace select() with poll()
  • Search for ext2/ext3/ext4 libraries in CMake
  • Add a bucket icon in Rok UI's breadcrumb trail
  • Fix the dependency on the PyYAML package in the Rok Kubernetes client
  • kustomize: Introduce Redis
  • Correctly display access tokens which were issued without an application
  • libredis: Introduce a Redis library
  • electiond: Improve detecting master changes
  • operator: Fix typo in postgresql_probe()
  • Specify arbitrary device attributes for CSI volumes
  • Reduce LU-oriented lock contention in the I/O path
  • csi: Start dm-clone monitoring threads after successfully initializing lvmd
  • csi: Fix imports
  • lvmd: Fix imports
  • lvmd: Don't snapshot discarded blocks
  • Do not retry ENOENT on get_ca()
  • gw: Refactor objects and versions list in UI
  • doc: Fix indentation errors
  • conf: Remove the templates and the render.py from etcd
  • Fix a bug where tasks would never be finalized if they contained a value that cannot be JSON serialized
  • Use common's document view component in event info page
  • conf: Do not use hostname as member ID fallback
  • conf: Support config annotations
  • Kubernetes: Extend cluster CRDs with status
  • gw: Fix imports in webpack's dev config
  • Use common form component in Rok UI
  • rdm: Support parsing and applying scripts line-by-line
  • minikf: Some libtask fixes before updating provisioning script
  • Use only scoped imports in UI
  • common: Relax type restriction in format_duration
  • common: Update copyright date in UI
  • lvmd: Add support for replicated volumes
  • Convert utility class in UI
  • Fix a bug when waiting for a failed task in the Rok client
  • Factor out our internal Kubernetes client
  • common: Add Python functions to calculate versions
  • Introduce two new probes to test the readiness of an etcd and PostgreSQL deployment
  • Add options to wait until a readiness probe succeeds or a liveness probe fails
  • conf: Support atomic config apply
  • common: Revamp error service in UI
  • Replace prettytable with printutils
  • Change position strategy in UI
  • libtrpt: Minor performance optimization
  • Fix some minor email issues in Registry
  • Deploy Rok Registry with Istio on Minikube/GKE
  • operator: Rework init container cmds
  • thrower: Work with any lz4 version
  • lvmd: Properly close the data device
  • operator: Graceful termination
  • Make retention policies in the Rok API return accurate information about group members and clean up orphan group members
  • blkutils: Fix using get_disks() with glob
  • lvmd: Fix progress reporting
  • libtrpt: Fix high completion latencies
  • doc: Update LVMD design document
  • common: Create a password helper
  • composer: Use 1MiB chock size as default
  • Always display snapshot policies section in UI
  • common: Use different connection strategy in tooltip
  • Improve Rok UI's loading screen
  • common: Do not autodetect if we are in container, be explicit
  • Close the Pyro daemon before stopping the thrower
  • dm-clone: Backport upstream patches
  • libmap: Support batched epoch updates
  • common: Add missing prefix in UI's HTTP client
  • Add Kubernetes PodSecurityPolicy Integration Design Doc
  • Angular and dependencies upgrade
  • indexer: Remove auth token in some API requests
  • ci: Increase dm-clone region size in lvmd tests
  • blkutils: Fix how we parse mountinfo in get_mountinfo()
  • Disable ASan's LeakSanitizer for tests
  • Do not define min() and max() macros in C++, since they are already defined as functions
  • Update the instructions to build a Jupyter notebook
  • Update the Jupyter notebook Dockerfile to include Tensorflow 1.14.0, Python 3 wheels for the Rok client and the latest Kubeflow ml-pipelines Kale, and Kale Jupyterlab plugin
  • scripts: Allow passing minus tags in rok-buildbot
  • lvmd: Discover volume mount points automatically, instead of providing them explicitly in take_snapshot()
  • Add helper to wait for a task to the Rok client
  • Rename the Rok client and its errors to RokClient and RokClientError respectively
  • Parse credentials and service parameters from a file in the Rok client
  • Parse credentials and service parameters from the environment in the Rok client
  • Provide authentication credentials during initialization in the Rok client
  • lvmd: Support encrypted volumes
  • Fix a bug where Rok API installations would raise internal errors when accessing old delete marker versions due to migration v001400_0002 incorrectly introducing a number of attributes that should only exist in non-delete marker versions
  • Specify the minimum Gevent version that is supported
  • Detect Rok build type and skip lvmd CI tests
  • Add shared memory transport
  • Fix some prettytable dependency issues
  • Improve HTTP response handling in UI
  • minikf: Merge questions with CLI args
  • dlm: Use force_str() on strings passed to C calls
  • Support logging to the frontend from anywhere
  • liod: Rescan SCSI bus periodically
  • Add a role and role binding to MiniKF users to enable Rok API tasks to access Kubeflow resources
  • Add a PodDefault to allow MiniKF's default user to access the Rok API from within the Kubeflow namespace
  • operator: Replace Threads with Greenlets
  • lvmd: Use dm-clone only when restoring a volume from a snapshot, not for fresh volumes
  • conf: Fix a bug where an undefined var was referenced
  • csi: Do not start dm-clone monitor threads on controller
  • Fix the MiniKF deployment and its QA process
  • lvmd: Remove mostly unused dyn_params parameter
  • Make cmdutils compatible with Python 3
  • common: Do not import subprocess32 when using Python3
  • roke: Fix dots in member IDs and stopping md devices
  • Refactor lvmd to improve code readability and maintainability
  • operator: Handle nodeSelector and node labels
  • minikf: Track latest wheels
  • githooks: Fix a bug when checking config version
  • gw: Make the Rok gateway UI pass Prettier checks
  • lvmd: Support variable dm-era tracking granularity
  • Add support for building Python 3 wheels for the Rok client
  • Make the Rok client Python 3 compatible
  • Make the rok_common library Python 3 compatible
  • New icons in Rok UI
  • doc: Add copy buttons in all doc's code blocks
  • pu: Connect all PUs to the external controller by default
  • lvmd: Introduce DM snapshots
  • test: Set start_new_session instead of new_session
  • Use HttpClient in Rok Registry UI
  • ci: Fix hashing test to consider duplicate offsets
  • rdm: Add support for RAID arrays
  • rdm: Export attributes of block devices
  • tests: Fix leaks discovered by ASan
  • cmake: Use correct soversion for libetcd3
  • Add JSON and CSV output format to the Rok client
  • rok-do: Introduce rok-do CLI tool
  • doc: Use correct apt files on source install guides
  • gw: Dynamically resize file chooser window
  • Add extra validation checks in Rok Gateway
  • docker: Add missing syslog argument
  • Implement dialog for copying files
  • cmdutils: Support Popen kwargs and remove some shell=True commands
  • operator: Fix an operator regression wrt platforms
  • operator: Uniformly sync resources
  • conf: Improve diffing support and move it under rok_common
  • gw: Add a Django cache to cache the chock size
  • operator: Produce events on cluster CR
  • cli: Add QuestionContext and expose question threshold through args
  • operator: Always refresh cluster driver cache
  • lvmd: Add support for configuring the snapshot chunk size
  • libtasks: Separate null and empty answers and add boolean-type question
  • Switch to Stretch builds
  • Deploy Rok Registry using Rok Operator
  • Remove deprecated Http class from UI
  • Fix rok-init bugs
  • Allow Fort to filter LDAP users by group
  • doc: Extend Sphinx configuration to include versioned manifests from a specified path

Version 0.14.1 (Nephrite)

Version 0.14 (Nephrite)

Version 0.13 (Marble)

Version 0.12 (Lignite)

Version 0.11.1 (Kryptonite)

Version 0.11 (Kryptonite)

Version 0.10.3 (Jade)

Version 0.10.2 (Jade)

Version 0.10.1 (Jade)

Version 0.10 (Jade)

Version 0.9 (Iron)

Version 0.8.1 (Hematite)

Version 0.8 (Hematite)

Version 0.7.2 (Granite)

Version 0.7.1 (Granite)

Version 0.7 (Granite)

Version 0.6.2 (Flint)

Version 0.6.1 (Flint)

Version 0.6 (Flint)

Version 0.5 (Emerald)

Version 0.4.5 (Diamond)

Version 0.4.4 (Diamond)

Version 0.4.3 (Diamond)

Version 0.4.2 (Diamond)

Version 0.4.1 (Diamond)

Version 0.4 (Diamond)

Version 0.3 (Celestite)

Version 0.2 (Beryl)

Version 0.1 (Amethyst)