Arrikto Enterprise Kubeflow Documentation
Open Source (OSS) Kubeflow enables you to operationalize much of an ML workflow on top of Kubernetes. It comprises a number of ML components and services; SDKs and APIs; integrated development environments (IDEs); and libraries for data science.
The Arrikto Enterprise Kubeflow (EKF) distribution introduces important additional features to address gaps in OSS Kubeflow and commonly expressed needs of MLOps engineers and data scientists.
- Automation: With Arrikto EKF you can orchestrate an end-to-end ML workflow from your IDE. Start by tagging cells in Jupyter Notebooks to define pipeline steps, hyperparameter tuning, GPU usage, and metrics tracking. Then click a button to create the necessary Kubernetes services, run a scalable ML pipeline, and serve the best model. Alternatively, use the EKF Kale SDK to do all of the above from your preferred IDE.
- Reproducibility: Snapshot pipeline code, libraries, and data for every step with the Arrikto Rok data management platform. Roll back to any machine learning pipeline step at its exact execution state for easy debugging. Collaborate with other data scientists through a Git-style publish/subscribe versioning workflow.
- Portability: Arrikto EKF enables you to deploy and upgrade a Kubeflow environment using GitOps processes across all major public clouds and on-prem infrastructure. Move ML workflows seamlessly across these environments with Rok Registry.
- Security: Arrikto EKF security features enable you to manage teams and user access through GitLab or any identity provider using Istio/OIDC. Isolate each user's ML data within their own namespace while enabling notebook and pipeline collaboration in shared namespaces. Manage secrets and credentials securely and efficiently.
Getting Started
The easiest way to start with EKF is to follow one or more of the tutorials below!
Installation
- Features
- GitOps
- Pipelines
- Hyperparameter Tuning
- Kubeflow Notebooks
- Rok Snapshotting
- Kale Kubeflow Pipeline and Rok Snapshots
- Kubeflow Pipeline and Initial Rok Snapshot
- Kubeflow Pipeline Steps and Rok Snapshots
- Rok Snapshot Creation and Rok Buckets
- Rok Snapshots Outside of I/O Path
- Rok Snapshots & Environment Restoration
- Rok Snapshots and Volume Restoration
- Rok Snapshots and Pipeline Restoration
- Scaling
- Operations Guide
- Manage Your EKS Cluster
- Manage Your GKE Cluster
- Manage Authentication
- Create Privileged Notebook Server
- Migrate Notebooks
- Manage Your Kubeflow Deployment
- Manage Networking
- Manage Your Rok Registry Cluster
- Add an internal GitHub repository as a backup GitOps remote
- Set Up Cluster-Wide Authenticated Access to a Docker Registry
- Disable Automatic Profile Creation
- Scale In Kubernetes Cluster
- Protect Pods from OOM conditions and CPU starvation
- Add Static Users in Dex
- Hot-Patch an Arbitrary Image in Your Deployment
- Expose TokenRequest API for External Clients
- Configure Syncing
- Trust Custom CA
- Add Extra Resources To All User Namespaces
- Gather Logs for Troubleshooting
- Recover RWX Volume After Node Failure
- Manage Your Rok Monitoring Stack
- Handle Degraded Nodes