Create Short-Lived Token to Authenticate External Client

This section describes how to use service account tokens to generate short-lived tokens and use them to authenticate external clients.

The service account token is long-lived, that is, it does not expire. You cannot use it directly for authentication because it lacks the audience that AuthService expects, namely istio-ingressgateway.istio-system.svc.cluster.local. Its only use here is to call the Kubernetes TokenRequest API and generate a short-lived token with the desired audience and a specific validity period.


The short-lived token expires after at most one day. Your clients must refresh the token before it expires.
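
A client can decide when to refresh by reading the exp claim from the token payload. The following is a minimal sketch using only the standard library; the helper names jwt_payload and needs_refresh and the five-minute safety margin are illustrative assumptions, not part of any documented API. The payload is read without verifying the signature, so this is suitable only for scheduling a refresh, not for trusting the token.

```python
import base64
import json
import time


def jwt_payload(token):
    """Return the payload of a JWT as a dict, WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded without padding; restore the padding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def needs_refresh(token, margin_seconds=300, now=None):
    """Return True if the token expires within margin_seconds of now.

    margin_seconds is a hypothetical safety margin; tune it to your client.
    """
    now = time.time() if now is None else now
    return jwt_payload(token)["exp"] - now <= margin_seconds
```

A client would call needs_refresh(token) before each request (or on a timer) and generate a new short-lived token when it returns True.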

External clients use the short-lived token as a Bearer token, that is, they make requests with the Authorization: Bearer $token header. AuthService authenticates every incoming request: an authenticated request obtains a kubeflow-userid header that maps to the underlying service account, for example, system:serviceaccount:SA_NAMESPACE:SA_NAME. You can restrict or allow access to specific services with:

  • Istio AuthorizationPolicies
  • RoleBindings for services that do SubjectAccessReview
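
When writing such policies it helps to know exactly which identity a request will carry. The sketch below builds the Authorization header and the kubeflow-userid value described above; the function names are illustrative assumptions.

```python
def bearer_headers(token):
    """Headers an external client sends with the short-lived token."""
    return {"Authorization": "Bearer %s" % token}


def expected_userid(sa_namespace, sa_name):
    """Identity that the kubeflow-userid header maps to for a service account."""
    return "system:serviceaccount:%s:%s" % (sa_namespace, sa_name)
```

For example, expected_userid("kubeflow-user", "serving") yields system:serviceaccount:kubeflow-user:serving, which is the identity your access rules would need to refer to.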

What You’ll Need

If you want to access a serving model you also need:



The procedure below walks through, step by step, how to create a short-lived token. The snippets are Python examples that can be translated to any language or CLI tool.

  1. Specify the service account token (long-lived token):

    >>> sa_token = "<TOKEN>"

    Replace <TOKEN> with your service account token.

  2. Decode your service account token. This is a JSON Web Token that includes service account and token info in its payload.

    1. Decode the token to obtain further info:

      >>> import jwt
      >>> sa_token_info = jwt.decode(sa_token, options={"verify_signature": False})

    2. Obtain the namespace, service account name and service account secret:

      >>> sa_namespace = sa_token_info[""]
      >>> sa_name = sa_token_info[""]
      >>> sa_secret = sa_token_info[""]

  3. Specify the Kubernetes endpoint. This is the base URL where the Kubernetes API server is exposed.

    >>> kubernetes_endpoint = "<ENDPOINT>"

    Replace <ENDPOINT> with your Kubernetes endpoint, for example:

    >>> kubernetes_endpoint = ""
  4. Specify the validity period of the short-lived token in seconds:

    >>> expiration = 3600


    The validity period must be at least 10 minutes and at most one day. Your client must refresh the token before it expires.

  5. Prepare the request.

    1. Set the request URL. Use the Kubernetes endpoint and the token info to construct the TokenRequest API endpoint, that is, <KUBERNETES_ENDPOINT>/api/v1/namespaces/<NAMESPACE>/serviceaccounts/<SA>/token. Replace <KUBERNETES_ENDPOINT>, <NAMESPACE>, and <SA> with your Kubernetes endpoint, your namespace, and your service account name, respectively.

      >>> url = "%s/api/v1/namespaces/%s/serviceaccounts/%s/token" % (kubernetes_endpoint, sa_namespace, sa_name)
    2. Set the request headers. You will use the service account token as Bearer Token so that Kubernetes authorizes you to create your short-lived token.

      >>> headers = {"Accept": "application/json",
      ...            "Content-type": "application/json",
      ...            "Authorization": "Bearer %s" % sa_token}

    3. Set the audience. AuthService expects this specific audience; otherwise, it will not allow the request.

      >>> audience = "istio-ingressgateway.istio-system.svc.cluster.local"
    4. Set the request data. This is what the TokenRequest API expects as input.

      >>> data = {"spec": {"audiences": [audience],
      ...                  "expirationSeconds": expiration,
      ...                  "boundObjectRef": {"apiVersion": "v1",
      ...                                     "kind": "Secret",
      ...                                     "name": sa_secret}}}

  6. Make the request.

    >>> import json
    >>> import requests
    >>> resp = requests.post(url, data=json.dumps(data), headers=headers, verify=True)

  7. Ensure that the request succeeded.

    >>> resp.ok
    True

  8. Parse the response to get the short-lived token.

    >>> token = resp.json()["status"]["token"]
  9. Print the short-lived token.

    >>> print(token)
    eyJhbGciOiJSUzI1NiIsImtpZCI6Ijg3YjI...
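
The request preparation in step 5 can be gathered into one helper, which is convenient when the client has to refresh the token repeatedly. This is a sketch under stated assumptions: the function name build_token_request and the constant names are illustrative, and the range check simply encodes the 10-minute and one-day limits mentioned above. It only builds the request and does not send it.

```python
import json

AUDIENCE = "istio-ingressgateway.istio-system.svc.cluster.local"
MIN_EXPIRATION = 10 * 60       # 10 minutes, the lower limit stated in step 4
MAX_EXPIRATION = 24 * 60 * 60  # one day, the upper limit stated in step 4


def build_token_request(kubernetes_endpoint, sa_namespace, sa_name,
                        sa_secret, sa_token, expiration=3600,
                        audience=AUDIENCE):
    """Return (url, headers, body) for a TokenRequest API call."""
    if not MIN_EXPIRATION <= expiration <= MAX_EXPIRATION:
        raise ValueError("expiration must be between %d and %d seconds"
                         % (MIN_EXPIRATION, MAX_EXPIRATION))
    # Tolerate a trailing slash on the endpoint to avoid a double slash.
    url = "%s/api/v1/namespaces/%s/serviceaccounts/%s/token" % (
        kubernetes_endpoint.rstrip("/"), sa_namespace, sa_name)
    headers = {"Accept": "application/json",
               "Content-type": "application/json",
               "Authorization": "Bearer %s" % sa_token}
    data = {"spec": {"audiences": [audience],
                     "expirationSeconds": expiration,
                     "boundObjectRef": {"apiVersion": "v1",
                                        "kind": "Secret",
                                        "name": sa_secret}}}
    return url, headers, json.dumps(data)
```

The returned values can then be passed to an HTTP client, for example requests.post(url, data=body, headers=headers, verify=True), and the short-lived token read from the status.token field of the JSON response.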


  1. Decode the short-lived token (JWT payload) and verify that it has the expected audience:

    >>> import jwt
    >>> audience = "istio-ingressgateway.istio-system.svc.cluster.local"
    >>> jwt.decode(token, options={"verify_signature": False}, audience=audience)
    {'iat': 1649407595, 'exp': 1649443595, 'iss': '', '': {'secret': {'uid': '3ba251a0-27a2-46cc-a619-05a160e6344a', 'name': 'serving-token-99jwj'}, 'serviceaccount': {'uid': '4848db2c-8436-4280-bc95-3878578b5361', 'name': 'serving'}, 'namespace': 'kubeflow-user'}, 'nbf': 1649407595, 'sub': 'system:serviceaccount:kubeflow-user:serving', 'aud': ['istio-ingressgateway.istio-system.svc.cluster.local']}

  2. Specify the external URL of your service.

    >>> url = ""


    If you want to access a serving model, specify the external URL of a ready inference service.

    If the model supports Data Plane v1, use the list models API endpoint, that is, /v1/models, or the Readiness API endpoint, that is, /v1/models/${MODEL_NAME}.

    If the model supports Prediction Protocol v2, use the server ready health API endpoint, that is, /v2/health/ready.

  3. Use the short-lived token as Bearer Token to access the service endpoint.

    >>> import json
    >>> import requests
    >>> headers = {"Authorization": "Bearer %s" % token}
    >>> resp = requests.get(url, headers=headers, verify=True)
    >>> resp.ok
    True


    If the URL is not valid, you will get a 404 error because of the Istio filter chain order, that is, route_not_found comes before ext_authz_denied.
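
If the request is rejected even though the URL is valid, a quick check is whether the token actually carries the expected audience. The sketch below performs that check with the standard library only, as an alternative to a JWT library; the function name has_audience is an illustrative assumption, and the payload is read without signature verification, so treat it as a debugging aid.

```python
import base64
import json

EXPECTED_AUDIENCE = "istio-ingressgateway.istio-system.svc.cluster.local"


def has_audience(token, audience=EXPECTED_AUDIENCE):
    """Return True if the JWT's aud claim contains the given audience."""
    payload_b64 = token.split(".")[1]
    # Restore the base64 padding that JWT encoding strips.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    aud = claims.get("aud", [])
    # The aud claim may be a single string or a list of strings.
    if isinstance(aud, str):
        aud = [aud]
    return audience in aud
```

A token for which has_audience(token) returns False was most likely generated without the audiences field set as described in the procedure.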


You have successfully created a short-lived token to authenticate external clients.

What’s Next

Check out the rest of the documentation regarding external clients.