Serving on EKF

This section describes how Serving works in Arrikto EKF and how you can fine-tune it for better performance. Starting from a ready inference service running in a cluster, we analyze the request path for both internal and external clients and break down the overhead that each involved component adds. Finally, we run a performance evaluation using a TensorFlow model served by a Triton server, and share the testbed so that you can reproduce our findings.
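For context, a "ready inference service" here means an InferenceService resource that has reached the Ready state in the cluster. A minimal sketch of such a resource, assuming the KServe `v1beta1` API and a hypothetical model location (the name and `storageUri` below are placeholders, not values from this guide), could look like this:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: tensorflow-triton            # hypothetical name
spec:
  predictor:
    triton:
      # Placeholder URI; point this at your own exported TensorFlow model repository.
      storageUri: gs://your-bucket/models
```

Once applied, `kubectl get inferenceservice tensorflow-triton` should eventually report `READY: True`, at which point the request-path analysis in the following sections applies.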

Here’s what you’ll need to follow the next sections: