Serverless Kubernetes Cluster on AWS with EKS on Fargate
Containerized applications are rapidly growing in popularity. Container technologies (among others, Kubernetes) are estimated to account for over four billion dollars in market volume in 2022, with a 30 percent year-over-year growth rate for the period 2017 to 2022.
One of the most popular platforms to run containers at scale is Kubernetes, a feature-rich open-source orchestration system that automates the entire lifecycle of containers, including application deployment and health monitoring. Despite all its benefits, Kubernetes adds a level of complexity, as it requires dedicated DevOps resources to keep the cluster healthy and ensure that it scales.
AWS has shown many times that they are customer-centric and want to improve the developer’s experience. With the launch of EKS on Fargate, a serverless Kubernetes service, they showed it again by introducing a service that turned out to be a game-changer for running containerized applications at scale.
The Story Behind Fargate and Kubernetes
Up until recently, there have been many attempts to bring serverless applications to Kubernetes, but most of the frameworks I’ve seen focused on deploying serverless functions (Functions as a Service) to an existing Kubernetes cluster, rather than providing a cloud service that would automatically provision Kubernetes worker nodes (data plane) to run serverless containers.
In December 2019, AWS launched a new offering: EKS on Fargate, which provides a serverless data plane for a Kubernetes cluster. Strictly speaking, it is not a separate service but a combination of two existing ones: EKS (the AWS managed Kubernetes service) and Fargate (the serverless container compute engine that previously worked only with ECS, the AWS-specific container orchestration platform).
Before EKS on Fargate, Elastic Kubernetes Service (EKS) let us enjoy the benefits of Kubernetes, but it still required additional effort to maintain the data plane, i.e. the worker nodes running on Amazon EC2. In contrast, Fargate is completely serverless: it provides an abstraction layer that allows deploying containerized workloads without any worker node maintenance on our side.
ECS on Fargate vs. EKS on Fargate
While Fargate’s abstraction certainly saves a lot of time thanks to its fully managed serverless architecture, using it with ECS requires learning AWS-specific vocabulary, among others task definitions and service definitions. Those concepts are (subjectively) not as intuitive as the declarative language of Kubernetes objects.
Additionally, when using pure ECS on Fargate, we become to some extent dependent on AWS, as this service is only offered by them. This dependence on a single provider is often referred to as vendor lock-in.
In contrast, when using EKS on Fargate, we could later switch to any other cloud provider or to an on-premises Kubernetes cluster with relatively little effort, since our pods and deployments work the same way on any Kubernetes. There are some differences with respect to which storage classes or load balancers are used by each cloud vendor, but in general, Kubernetes is Kubernetes, regardless of where you run your pods. Also, there are currently more people on the market who know how to work with Kubernetes than people who know Fargate.
Monitoring and managing the health and status of containerized workloads is likewise easier with the Kubernetes-native API. This may be my subjective opinion, but interacting with the Kubernetes API via kubectl commands, directly from the terminal, seems more convenient than interacting with Fargate via the AWS management console or the AWS CLI.
Finally, Kubernetes opens up the entire world of applications deployable via Helm charts, such as a Dask Distributed cluster, which would have been much more difficult (if not impossible) to deploy with pure ECS Fargate.
Those are the reasons why it has been such a game-changer that AWS mixed the two services and let us enjoy the benefits of both worlds:
- Fully managed control plane that lets us interact with the Kubernetes API via kubectl and deploy applications packaged as Helm charts,
- Fully managed serverless data plane that lets us rapidly deploy our containerized workloads without having to manage and scale the underlying compute capacity.
What Are the Implications of EKS on Fargate?
Having Fargate as a compute engine not only for ECS but also for EKS changes what we consider a basic unit of work and unit of charge. With pure EKS, our unit of work is an EC2 instance and the unit of charge is the instance price per hour. In contrast, EKS on Fargate uses a pod as the unit of work and the amount of vCPU and memory requested by the pod as the unit of charge. We are billed just $0.10 per hour for running the Kubernetes control plane on AWS, and on top of that we pay only for the vCPU and memory that our pods request for as long as they run.
This is especially attractive for various chargeback scenarios. Imagine that you are a freelance developer and you work for three different clients at the same time. You could use different namespaces or different label selectors for those projects, and AWS provides you with detailed billing information so that you can easily analyze what you spend on each project. The same applies if your company wants to track costs for specific projects.
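To make the billing model concrete, here is a rough back-of-the-envelope sketch. The $0.10 control plane fee comes from the text above; the per-vCPU and per-GB hourly rates are assumptions chosen purely for illustration (actual Fargate prices vary by region and over time), and the function names are hypothetical. The sketch also ignores any rounding of pod sizes that Fargate may apply.

```shell
#!/bin/sh
# Control plane fee quoted in the article.
CONTROL_PLANE_PER_HOUR=0.10
# ASSUMPTION: illustrative Fargate rates, not taken from the article.
VCPU_PER_HOUR=0.04048
GB_PER_HOUR=0.004445

# estimate_pod_cost VCPU MEMORY_GB HOURS -> prints the pod cost in USD
estimate_pod_cost() {
  awk -v c="$1" -v m="$2" -v h="$3" \
      -v pc="$VCPU_PER_HOUR" -v pm="$GB_PER_HOUR" \
      'BEGIN { printf "%.2f\n", (c * pc + m * pm) * h }'
}

# estimate_control_plane_cost HOURS -> prints the control plane cost in USD
estimate_control_plane_cost() {
  awk -v h="$1" -v p="$CONTROL_PLANE_PER_HOUR" \
      'BEGIN { printf "%.2f\n", h * p }'
}

# A pod requesting 0.5 vCPU and 1 GB of memory that runs for 100 hours.
estimate_pod_cost 0.5 1 100
# The control plane running for a 730-hour month.
estimate_control_plane_cost 730
```

With these assumed rates, the pod comes to about $2.47 and the control plane to $73.00 for the month, which shows that for small or intermittent workloads the fixed control plane fee dominates the bill.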
This setup is also attractive for startups that can’t guess in advance how quickly the required compute capacity will grow. They may not have any workloads running at some times, which would entail paying for idle resources with a normal Kubernetes setup.
One obvious benefit mentioned above is costs. By costs, I don’t mean just the costs per compute resources, but also for specialized employees required to maintain a Kubernetes cluster. At the time of writing, DevOps resources are scarce. Many companies don’t have enough engineers who could do this type of work and they struggle to find experienced people to do it.
With respect to compute resources, you could additionally consider AWS Compute Savings Plans, which allow you to further cut compute costs by up to 52% when committing to using Fargate, EC2, or Lambda for a period of one or three years.
Having a serverless Kubernetes cluster means that we no longer need to maintain the worker nodes or configure any autoscaling groups to scale them. Fargate takes care of all that for us, and we can focus our scarce DevOps resources on other issues.
This means that we can spend more time adding value to our business, i.e. programming and deploying our containerized applications by leveraging kubectl and the common YAML syntax to configure pods and deployments.
One additional benefit from the security perspective is that each pod runs in a separate micro-VM. This way, if somebody gained unauthorized access to one of our pods, they couldn’t access any of the other pods, thanks to the complete isolation between pods deployed on Fargate. As a result of this isolation, each pod gets its own Elastic Network Interface (ENI) and its own Security Group (SG). In contrast, all pods deployed on the traditional EC2 data plane share the ENI of their worker node.
Use cases it opens up
So far, running a Kubernetes cluster required a lot of knowledge and was difficult. Now, by being able to spin up an entire production-ready cluster (that scales automatically) with a single eksctl command, Kubernetes can be used for:
- Data science: to run several experiments, possibly in parallel using Dask,
- Data engineering: to be used as an execution layer in workflow management platforms such as Prefect or Apache Airflow
- Web development: AWS EKS integrates well with Application Load Balancer
- Separation of computing resources: for different teams and projects.
What are the downsides of a serverless data plane?
One drawback of almost any serverless platform is potential latency: the orchestration engine needs time to allocate and prepare the compute resources (among others, pulling the image from the container registry onto the allocated worker node and starting the container) before a container (or your K8s pod) reaches the running state.
If this latency is not acceptable for your workloads, using the Amazon EKS cluster with the traditional data plane may work better for your use case. However, you can have both!
Mixing serverless data plane with traditional EC2 worker nodes
AWS allows mixing the two options! This way, we can have a serverless data plane for some use cases, and constantly running Kubernetes worker nodes for low-latency workloads, both within the same cluster.
This separation of serverless and non-serverless worker nodes is achieved by using different namespaces, and optionally also different labels for both.
- Your serverless workloads will use the Kubernetes namespace defined in a Fargate profile, which by default specifies that all resources created in the default namespace will be handled by the Fargate scheduler.
- If your pod specification defines a different namespace than the one specified in a Fargate profile, EKS API will schedule the pod to be run on a non-serverless worker (i.e. EC2 instance).
The possibility to mix the two options demonstrates that AWS built this service with a lot of foresight to save their customers time, money, frustration, and maintenance efforts. It is particularly useful if you have pods that need to run on a GPU, as in many data science use cases. At the time of writing, Fargate doesn’t support GPUs, which makes the combination of both worker node types valuable.
The control plane checks not only the namespace but also the assigned label selectors before deciding where to deploy a specific container. This means that, when you create a deployment for a pod that doesn’t match the namespace and labels defined in the Fargate Profile, it will be scheduled to the EC2 worker nodes which you maintain and which will be able to start running containers with no latency related to provisioning micro-VMs.
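In a mixed cluster, it can be handy to check where each pod actually landed. The sketch below lists every pod together with the node it runs on and classifies the node by its name. It relies on the assumption that Fargate-backed nodes have names starting with fargate-, so treat the classification as a heuristic; the function names are hypothetical.

```shell
#!/bin/sh
# compute_type NODE_NAME -> prints "fargate" or "ec2"
# ASSUMPTION: Fargate-backed nodes are named "fargate-...".
compute_type() {
  case "$1" in
    fargate-*) echo "fargate" ;;
    *)         echo "ec2" ;;
  esac
}

# Prints one "<pod> -> <compute type>" line per pod in the cluster.
show_pod_placement() {
  kubectl get pods --all-namespaces --no-headers \
    -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName \
  | while read -r pod node; do
      echo "$pod -> $(compute_type "$node")"
    done
}
```

Running show_pod_placement against the cluster prints the placement of all pods; nothing is executed until you call it.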
Time for a Demo!
If you want to follow along, make sure that you have an AWS account with either admin access or a user with IAM permissions for creating ECR, EKS, and ECS resources. Additionally, you should download and install the AWS CLI and configure it with your AWS credentials (access key ID and secret access key). Finally, to use EKS, you should install eksctl, as described here: AWS docs.
Deploy your Docker images to ECR
EKS on Fargate provides seamless integration with the AWS-specific container registry called Elastic Container Registry (ECR), which is used to host your Docker images, similarly to Docker Hub. To authenticate a local terminal session with your ECR account, run:
- If you use the new AWS CLI v2:
- If you use the old AWS CLI v1:
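A minimal sketch of both login variants, wrapped in shell functions so that you can call the one matching your CLI version. The account ID placeholder and the function names are hypothetical; replace the placeholders with your own values before calling either function.

```shell
#!/bin/sh
# Placeholders: substitute your own account ID and region.
AWS_ACCOUNT_ID="<YOUR_AWS_ACCOUNT_ID>"
AWS_REGION="<YOUR_AWS_REGION>"
ECR_REGISTRY="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"

# AWS CLI v2: fetch a temporary password and pipe it into docker login.
ecr_login_v2() {
  aws ecr get-login-password --region "$AWS_REGION" \
    | docker login --username AWS --password-stdin "$ECR_REGISTRY"
}

# AWS CLI v1: get-login prints a complete docker login command,
# which the command substitution then executes.
ecr_login_v1() {
  $(aws ecr get-login --no-include-email --region "$AWS_REGION")
}
```

After sourcing the snippet, call ecr_login_v2 or ecr_login_v1 depending on your CLI version.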
Note: <YOUR_AWS_REGION> could be, for example, us-east-1 or eu-central-1. To check which AWS CLI version you have, use aws --version.
If you get a Login Succeeded message, you can create your ECR repositories and push your custom images to the registry. To create a new repository:
Now you only need to build your Docker image and push it to the ECR repository you just created:
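Creating the repository could look roughly like this. The repository name my-app and the function name are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
AWS_REGION="<YOUR_AWS_REGION>"
REPOSITORY_NAME="my-app"   # hypothetical repository name

# Creates an empty ECR repository that images can be pushed to.
create_ecr_repository() {
  aws ecr create-repository \
    --repository-name "$REPOSITORY_NAME" \
    --region "$AWS_REGION"
}
```

Calling create_ecr_repository returns a JSON description of the new repository, including the repository URI you will push to.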
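A sketch of the build-and-push step, assuming a Dockerfile in the current directory. The repository name, image tag, and function name are hypothetical placeholders.

```shell
#!/bin/sh
# Placeholders: substitute your own account ID and region.
AWS_ACCOUNT_ID="<YOUR_AWS_ACCOUNT_ID>"
AWS_REGION="<YOUR_AWS_REGION>"
REPOSITORY_NAME="my-app"   # hypothetical repository name
IMAGE_TAG="latest"         # hypothetical tag

ECR_REGISTRY="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"
IMAGE_URI="${ECR_REGISTRY}/${REPOSITORY_NAME}:${IMAGE_TAG}"

# Builds the image from the Dockerfile in the current directory,
# tagging it with the full ECR URI, then pushes it to the registry.
build_and_push() {
  docker build -t "$IMAGE_URI" . && docker push "$IMAGE_URI"
}
```

Tagging the image with the full registry URI at build time avoids a separate docker tag step.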
Create a serverless Kubernetes cluster on AWS in a single command
Now, all you need to do to actually create the cluster is to run a single eksctl command, which will create the following resources within your VPC:
- a Kubernetes control plane,
- IAM role for pod execution,
- Fargate Profile.
I used the name fargate-eks for the cluster, but you can name it as you like.
The --fargate flag ensures that a new Fargate profile is created for this cluster, so that Fargate provides our data plane. If you prefer to create the profile yourself, you can skip the --fargate flag and create the profile in the management console after your cluster has been created. Alternatively, you could do it from the terminal (there is no need to if you used the --fargate flag):
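The single command described above might look like the following sketch. The cluster name comes from the text; the region placeholder and the wrapping function name are assumptions.

```shell
#!/bin/sh
CLUSTER_NAME="fargate-eks"
AWS_REGION="<YOUR_AWS_REGION>"

# Creates the control plane, the pod execution role,
# and a default Fargate profile in one go.
create_cluster() {
  eksctl create cluster \
    --name "$CLUSTER_NAME" \
    --region "$AWS_REGION" \
    --fargate
}
```

Call create_cluster once the AWS CLI is configured; eksctl prints its progress while the underlying resources are created.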
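Creating the profile from the terminal could be sketched as follows. The profile name fp-default and the function name are hypothetical; the namespace matches the default namespace discussed earlier.

```shell
#!/bin/sh
CLUSTER_NAME="fargate-eks"
PROFILE_NAME="fp-default"      # hypothetical profile name
PROFILE_NAMESPACE="default"

# Creates a Fargate profile that selects pods in PROFILE_NAMESPACE.
# Optionally, add --labels key=value to also require matching pod labels.
create_fargate_profile() {
  eksctl create fargateprofile \
    --cluster "$CLUSTER_NAME" \
    --name "$PROFILE_NAME" \
    --namespace "$PROFILE_NAMESPACE"
}
```

Only pods that match the profile's namespace (and labels, if you set any) will run on Fargate.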
The process of creating a cluster and a Fargate profile may take several minutes. When finished, you should see a similar output confirming that our cluster is ready.
Then, if we check our context: kubectl config current-context, we should get a similar output:
This confirms that we are connected to a Kubernetes cluster running on AWS Fargate and EKS! To prove it further, run kubectl get nodes; it should display at least one Fargate node waiting for our pod deployments.
Note: those nodes run inside our VPC but are not visible in the EC2 dashboard. You cannot SSH into those nodes, as they are fully managed and deployed by Fargate in a serverless fashion. However, you can open a shell inside a specific pod using kubectl exec -it <pod_name> -- /bin/bash.
In order to deploy some containers to this cluster, we could use the default nginx image:
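One way to do this is sketched below. It assumes kubectl already points at the new cluster and that the default namespace is covered by the Fargate profile; the deployment name and function name are hypothetical.

```shell
#!/bin/sh
DEPLOYMENT_NAME="nginx"   # hypothetical deployment name

# Creates a deployment from the public nginx image, waits for the rollout,
# and then lists the pods together with the nodes they run on.
deploy_nginx() {
  kubectl create deployment "$DEPLOYMENT_NAME" --image=nginx &&
    kubectl rollout status "deployment/$DEPLOYMENT_NAME" &&
    kubectl get pods -o wide
}
```

Expect the rollout to take noticeably longer than on a pre-provisioned node, because Fargate first has to provision capacity for the pod.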
How Does EKS on Fargate Work Under the Hood?
You may ask: how is it even possible that we deploy resources to nodes that haven’t been provisioned yet? AWS Fargate adds logic to the control plane to ensure that we don’t have to modify our pod definitions to make it work. Validating and mutating webhooks compare each incoming pod against the Fargate profile and mutate the pod definition on the fly for any pod deployed into the profile’s namespace.
If those webhooks find a match between the namespace and labels specified in the pod and those from a Fargate profile, the pod is sent to the Fargate scheduler. In contrast, if the webhooks don’t find any match, the pod is sent to the standard Kubernetes scheduler, which then places it on the non-serverless EC2-based data plane.
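The matching rule described above can be modeled with a few lines of shell. This is a simplified illustration of the decision, not the actual webhook implementation: the selector (namespace default plus label compute=serverless), the function names, and the scheduler names printed are all assumptions made for the example.

```shell
#!/bin/sh
# pod_matches_selector POD_NS POD_LABELS SEL_NS SEL_LABELS
# Labels are comma-separated key=value pairs.
# Succeeds when the namespaces are equal and every selector label
# is present on the pod; an empty SEL_LABELS matches any labels.
pod_matches_selector() {
  pod_ns=$1 pod_labels=$2 sel_ns=$3 sel_labels=$4
  [ "$pod_ns" = "$sel_ns" ] || return 1
  old_ifs=$IFS
  IFS=,
  for want in $sel_labels; do
    case ",$pod_labels," in
      *",$want,"*) ;;                      # label found, keep checking
      *) IFS=$old_ifs; return 1 ;;         # a required label is missing
    esac
  done
  IFS=$old_ifs
  return 0
}

# pick_scheduler POD_NS POD_LABELS -> prints the scheduler that would be used
# under a hypothetical profile: namespace "default", label compute=serverless.
pick_scheduler() {
  if pod_matches_selector "$1" "$2" "default" "compute=serverless"; then
    echo "fargate-scheduler"
  else
    echo "default-scheduler"
  fi
}

pick_scheduler "default" "app=web,compute=serverless"
pick_scheduler "default" "app=web"
```

The first pod satisfies both the namespace and the label requirement, so it goes to Fargate; the second is in the right namespace but lacks the label, so it falls through to the standard scheduler.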
Resource allocation per pod
When we define our pods, we specify how much vCPU and memory our containers need. This is an important step, because Fargate uses these values to size the micro-VM that it provisions for the pod. We don’t need to be overly precise, but it’s useful to decide which requests and limits to set so that the pod lands on capacity with the required vCPU and memory (when only limits are set, Kubernetes uses them as the requests too). For example, if you specify 4GB of memory, your application may use less in the end, but this way you can ensure that those 4GB are available on the micro-VM.
“Since there are no worker nodes available in the cluster, the pods themselves will size and dictate the underlying capacity required.”
The easiest way to determine the resource allocation is to assign slightly more resources than you expect to need at first and then monitor how many resources each pod uses by leveraging tools such as:
- Datadog, which provides support for EKS on Fargate,
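As an illustration, the sketch below writes a pod manifest that asks for 1 vCPU and 4 GiB of memory and defines a helper to apply it. The file name, pod name, chosen sizes, and function name are hypothetical.

```shell
#!/bin/sh
MANIFEST="nginx-sized-pod.yaml"   # hypothetical file name

# Write a pod definition whose requests tell Fargate how much capacity to provide.
cat > "$MANIFEST" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx-sized
  namespace: default
spec:
  containers:
    - name: nginx
      image: nginx
      resources:
        requests:
          cpu: "1"
          memory: 4Gi
        limits:
          cpu: "1"
          memory: 4Gi
EOF

# Submits the manifest to the cluster that kubectl currently points at.
apply_manifest() {
  kubectl apply -f "$MANIFEST"
}
```

Setting requests equal to limits keeps the sizing predictable: the pod is guaranteed what it asks for and cannot exceed it.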
- open-source technologies such as Prometheus or Grafana, as demonstrated in this AWS blog post.
In this blog post, we discussed EKS on Fargate, a service that lets us run a serverless Kubernetes cluster on AWS. We discussed the differences between ECS and EKS on Fargate and their implications. We also looked at the benefits of running Kubernetes pods in a serverless way and the use cases it opens up.
After that, we looked at the possible drawbacks related to Fargate and how the mix of serverless and non-serverless data plane on EKS can mitigate those disadvantages.
We also conducted a demo showing how to create a serverless Kubernetes cluster on AWS and how to push images to ECR to make our custom images accessible by the Kubernetes API. Finally, we looked at how this service works under the hood and how to allocate resources per pod.
I hope it can help you to determine whether EKS on Fargate can work for your use case and to start using it. Thank you for reading!
 Launch at re:Invent 2019: https://www.youtube.com/watch?v=m-3tMXmWWQw&t=3s
 EKS workshop: https://www.eksworkshop.com/beginner/180_fargate/creating-profile/
 AWS Docs on Fargate profiles: https://docs.aws.amazon.com/eks/latest/userguide/fargate-profile.html
 AWS Blog on Fargate with respect to compute saving plans, security, and more: https://aws.amazon.com/blogs/containers/saving-money-pod-at-time-with-eks-fargate-and-aws-compute-savings-plans/
 Article by Kevin Casey: https://enterprisersproject.com/article/2019/7/kubernetes-statistics-13-compelling