Clean Up KFP Artifacts from MinIO

MinIO is an object-based datastore used by KFP to save

  • pipeline definitions,
  • pipeline step logs,
  • artifacts that are submitted as visualizations, and
  • artifacts passed as Inputs and Outputs in pipelines written with the KFP SDK.

MinIO runs as a deployment in the kubeflow namespace and saves these artifacts to a PVC. Currently, there is no automated garbage collection mechanism in place, so you may end up in one of the following scenarios:

  • The MinIO volume runs out of space due to big artifacts filling it up.
  • The MinIO volume runs out of filesystem inode space due to too many small artifacts.

In both cases, you will not be able to submit pipelines to KFP anymore. You can solve this problem by deleting old artifacts from the MinIO volume.

This guide will walk you through checking the available space on MinIO, as well as cleaning it up by deleting old artifacts.

Check Your Environment

In order to determine whether any cleanup operation is required, you need to inspect the available space and inode space on MinIO. To do that:

  1. Exec into the MinIO container:

    root@rok-tools:~# kubectl exec -it -n kubeflow svc/minio-service -c minio -- sh
  2. Check the available space:

    # df -h /data
    Filesystem                                                    Size   Used  Available  Use%  Mounted on
    /dev/mapper/roklvm-0885ae6a-f7c0-4aa7-805a-d034efed2b7a-era  19.6G  19.6G         0M  100%  /data
  3. Check the available inode space:

    # df -i /data
    Filesystem                                                    Inodes     Used  Available  Use%  Mounted on
    /dev/mapper/roklvm-0885ae6a-f7c0-4aa7-805a-d034efed2b7a-era  1310720  1310720          0  100%  /data

Important

In a healthy cluster, both commands above should return a Use% value that is below 100%. If that is not the case, or if these values are particularly high, run the following procedure to free up space.
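
The two checks above can also be scripted. The following is a minimal sketch, not part of the official procedure: the helper names usage_percent and needs_cleanup and the 80% default threshold are assumptions made for illustration. It also assumes that the df binary in the container accepts -P (POSIX output, intended to keep each filesystem on a single line) together with -i.

```shell
# Hypothetical helper: read `df -P` style output on stdin and print the
# value of the fifth column (Use%) of the first data line, without the % sign.
usage_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed (exit status 0) when the usage is at or above
# the threshold.
#   $1: usage percentage, as printed by usage_percent
#   $2: threshold percentage (assumed default: 80)
needs_cleanup() {
    [ "$1" -ge "${2:-80}" ]
}

# Example usage inside the MinIO container:
#   space=$(df -P /data | usage_percent)
#   inodes=$(df -P -i /data | usage_percent)
#   if needs_cleanup "$space" || needs_cleanup "$inodes"; then
#       echo "cleanup recommended"
#   fi
```

Note that needs_cleanup expects a plain integer. If df reports the inode usage as "-", which some filesystems do, the comparison fails with an error instead of returning a result.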

Procedure

  1. Optional

    Exec into the MinIO container, if you haven’t already:

    root@rok-tools:~# kubectl exec -it -n kubeflow svc/minio-service -c minio -- sh
  2. Set the number of DAYS in an environment variable:

    # export DAYS=<DAYS>

    For example, if you want to delete all artifacts older than 30 days:

    # export DAYS=30

    Note

    This procedure will delete all artifacts older than DAYS days.

  3. Filter artifacts by age, in order to inspect what is going to be deleted in the following steps:

    # find /data/mlpipeline/artifacts -mtime +${DAYS?} -maxdepth 1
    /data/mlpipeline/artifacts/my-pipeline-qdb7c

    Warning

    The actions in the next steps are not reversible. Make sure that no important artifacts are listed in the output of the command above, before proceeding with the next step. If needed, consider adjusting the provided number of DAYS.

  4. Filter artifacts by age and delete the ones that are older than the number of days you defined in step 2:

    # find /data/mlpipeline/artifacts -mtime +${DAYS?} -maxdepth 1 | xargs -n1 rm -rfv
    removed: '/data/mlpipeline/artifacts/my-pipeline-qdb7c/2022/02/23/my-pipeline-qdb7c-420265999/mlpipeline-ui-metadata.tgz'
    removed: '/data/mlpipeline/artifacts/my-pipeline-qdb7c/2022/02/23/my-pipeline-qdb7c-420265999/main.log'
    removed directory: '/data/mlpipeline/artifacts/my-pipeline-qdb7c/2022/02/23/my-pipeline-qdb7c-420265999'
    removed directory: '/data/mlpipeline/artifacts/my-pipeline-qdb7c/2022/02/23'
    removed directory: '/data/mlpipeline/artifacts/my-pipeline-qdb7c/2022/02'
    removed directory: '/data/mlpipeline/artifacts/my-pipeline-qdb7c/2022'
    removed directory: '/data/mlpipeline/artifacts/my-pipeline-qdb7c'
  5. Delete the corresponding metadata:

    # find /data/.minio.sys/buckets/mlpipeline/artifacts -mtime +${DAYS?} -maxdepth 1 | xargs -n1 rm -rfv
    removed '/data/.minio.sys/buckets/mlpipeline/artifacts/my-pipeline-qdb7c/2022/02/23/my-pipeline-qdb7c-420265999/mlpipeline-ui-metadata.tgz/fs.json'
    removed directory: '/data/.minio.sys/buckets/mlpipeline/artifacts/my-pipeline-qdb7c/2022/02/23/my-pipeline-qdb7c-420265999/mlpipeline-ui-metadata.tgz'
    removed '/data/.minio.sys/buckets/mlpipeline/artifacts/my-pipeline-qdb7c/2022/02/23/my-pipeline-qdb7c-420265999/main.log/fs.json'
    removed directory: '/data/.minio.sys/buckets/mlpipeline/artifacts/my-pipeline-qdb7c/2022/02/23/my-pipeline-qdb7c-420265999/main.log'
    removed directory: '/data/.minio.sys/buckets/mlpipeline/artifacts/my-pipeline-qdb7c'

Note

You can provide both a lower and an upper bound for the above operations. For example, to delete all artifacts that were last modified between 30 and 60 days ago:

# find /data/mlpipeline/artifacts -mtime +30 -mtime -60 -maxdepth 1 | xargs -n1 rm -rfv
# find /data/.minio.sys/buckets/mlpipeline/artifacts -mtime +30 -mtime -60 -maxdepth 1 | xargs -n1 rm -rfv
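
The find and xargs pipelines in this procedure split their input on whitespace. With -maxdepth 1 alone, find also tests the starting directory itself, so it would be matched if its own modification time were older than the bound. A more defensive variant might look like the sketch below. The helper names list_old_entries and delete_old_entries are hypothetical, and the sketch assumes that the find binary in the container supports -mindepth, which is an extension to POSIX find.

```shell
# Hypothetical helper: list the top-level entries of a directory that were
# last modified more than the given number of days ago.
#   $1: directory to scan
#   $2: minimum age in days
list_old_entries() {
    # -mindepth 1 excludes the starting directory itself from the results.
    find "$1" -mindepth 1 -maxdepth 1 -mtime +"$2"
}

# Hypothetical helper: delete the entries that list_old_entries would print.
# This is not reversible.
delete_old_entries() {
    # -exec passes each path to rm as a single argument, so names that
    # contain whitespace are handled correctly.
    find "$1" -mindepth 1 -maxdepth 1 -mtime +"$2" -exec rm -rf {} \;
}

# Example usage inside the MinIO container, with DAYS already exported:
#   list_old_entries /data/mlpipeline/artifacts "${DAYS?}"
#   delete_old_entries /data/mlpipeline/artifacts "${DAYS?}"
#   delete_old_entries /data/.minio.sys/buckets/mlpipeline/artifacts "${DAYS?}"
```

As with the original commands, review the output of the listing step before running the deletion step.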

Verify

  1. Optional

    Exec into the MinIO container, if you haven’t already:

    root@rok-tools:~# kubectl exec -it -n kubeflow svc/minio-service -c minio -- sh
  2. Check the available space to determine how much space you freed up. Verify the Use% field:

    # df -h /data
    Filesystem                                                    Size   Used  Available  Use%  Mounted on
    /dev/mapper/roklvm-0885ae6a-f7c0-4aa7-805a-d034efed2b7a-era  19.6G   9.8G       9.8G   50%  /data
  3. Check the available inode space to determine how many inodes you freed up. Verify the Use% field:

    # df -i /data
    Filesystem                                                    Inodes     Used  Available  Use%  Mounted on
    /dev/mapper/roklvm-0885ae6a-f7c0-4aa7-805a-d034efed2b7a-era  1310720   655360     655360   50%  /data

Troubleshooting

Used space at 100% or not low enough

If the used space or the used inode space is still higher than you want, there are probably no artifacts older than DAYS days left on MinIO. Decrease the value of DAYS and follow the Procedure again.
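
Before choosing a new value for DAYS, it can help to see which top-level artifact directories take up the most space. The following is only a sketch: the helper name largest_entries and its default of 10 results are assumptions made for illustration, and it assumes that du, sort, and head are available in the container and that find supports -mindepth.

```shell
# Hypothetical helper: print the largest top-level entries of a directory,
# biggest first, as "<size in KiB><TAB><path>" lines.
#   $1: directory to inspect
#   $2: maximum number of entries to print (assumed default: 10)
largest_entries() {
    find "$1" -mindepth 1 -maxdepth 1 -exec du -sk {} \; |
        sort -rn |
        head -n "${2:-10}"
}

# Example usage inside the MinIO container:
#   largest_entries /data/mlpipeline/artifacts 5
```

This only reports disk space. A directory with many small files can exhaust inodes while appearing small in this listing.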

Summary

You have successfully inspected the available space on MinIO and cleaned up old run artifacts.

What’s Next

Check out the rest of the operations you can perform on your Kubeflow deployment.