Sharded applications on Kubernetes using Helm, ArgoCD, and Argo-Rollouts
In this story, we aim to deploy a sharded application service into a Kubernetes cluster using Helm and ArgoCD. We will use Argo-Rollouts to deploy the app using the Blue/Green strategy; this can easily be changed to a canary rollout too. This story does not aim to discuss data-side sharding; rather, it suggests how to organize the app tier to talk to a sharded datastore.
What exactly is a shard?
A shard in the context of a multi-tenant application can represent the app and the data tiers either for a single tenant (if that tenant is large), or a group of tenants.
What are we trying to build?
That is best described by the following diagram
We have the application traffic coming in via an API gateway to our EKS cluster. We used Kong as the API gateway in this implementation due to its widespread adoption.
In the API gateway, we have a bunch of plugins such as a Tenant plugin, a ShardID plugin, and an OAuth plugin. Some of these are custom plugins and some are off-the-shelf. The Tenant plugin identifies the tenant-id based on some traffic pattern. The ShardID plugin then maps this tenant-id to the application shard that will service the request. The ShardID plugin in this case sets a custom header on the request of the format x-shard: a, where a is the shard-id. The OAuth plugin handles authentication and authorization, and the request is finally routed to the target cluster. We will not be going into the details of any of these Kong plugins.
On the target cluster, we have a Kong Ingress Controller that receives the request. I chose the Kong Ingress Controller because of its ability to route requests based on an HTTP request header (something that seems to be missing in the community Nginx Ingress Controller for Kubernetes and present only in the enterprise paid version of the Nginx Ingress Controller).
The source code is on GitHub, and a GitHub Actions workflow builds a Docker image and pushes it to the AWS ECR repository upon a merge to the main branch.
The Helm chart for deploying the application is also on GitHub. ArgoCD will be set up to watch the helm chart in GitHub for any changes and deploy the application to the EKS cluster. We will use Argo-Rollouts to achieve Blue/Green deployment; the rollout can also easily be changed to drive a canary deployment.
The applications are connected to sharded datastores.
Tools
Argo-CD, Argo-Rollouts, and Sealed-secrets are installed into the Kubernetes cluster. This article does not discuss the installation of these tools.
On the client side, you will need the ArgoCD CLI, the Kubernetes CLI (kubectl), and the AWS CLI installed and configured for this to work well.
Helm Chart
The helm chart structure looks like -
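A minimal sketch of such a layout (directory and file names here are illustrative, not the original tree):

```
app-chart/                      # parent chart
├── Chart.yaml
├── values-staging.yaml
└── charts/
    └── app-shard/              # sub-chart representing a single shard
        ├── Chart.yaml
        ├── values.yaml
        └── templates/
            ├── rollout-deployment.yaml
            ├── service.yaml
            ├── ingress.yaml
            ├── kong-ingress.yaml
            └── sealed-secret.yaml
```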
We have a parent helm chart that contains a sub-chart. The sub-chart is an independent helm chart and represents an individual shard. The parent chart reuses the same sub-chart as a dependency, under different aliases, to model the different shards in the application.
The shard1.enabled flag controls whether the first shard is deployed and the shard2.enabled flag controls whether the second shard is deployed. So increasing the number of shards becomes as easy as adding more dependencies to the main chart.
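A minimal Chart.yaml sketch for the parent chart, assuming the sub-chart is called app-shard and the flags above map to dependency conditions (names and versions are illustrative):

```yaml
apiVersion: v2
name: app-chart
version: 0.1.0
dependencies:
  # The same sub-chart is declared twice, aliased once per shard.
  - name: app-shard
    version: 0.1.0
    repository: "file://charts/app-shard"
    alias: shard1
    condition: shard1.enabled
  - name: app-shard
    version: 0.1.0
    repository: "file://charts/app-shard"
    alias: shard2
    condition: shard2.enabled
```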
The values YAML is given below:
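A values-staging.yaml along these lines would enable both shards and pass per-shard settings to the sub-chart instances (the keys and values here are illustrative, not the original file):

```yaml
# Each top-level key matches a dependency alias in the parent Chart.yaml.
shard1:
  enabled: true
  shardId: "a"          # used for the x-shard header match
  image:
    repository: 123456789012.dkr.ecr.us-west-1.amazonaws.com/myapp   # illustrative ECR repo
    tag: "1.0.0"
shard2:
  enabled: true
  shardId: "b"
  image:
    repository: 123456789012.dkr.ecr.us-west-1.amazonaws.com/myapp
    tag: "1.0.0"
```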
We will only be describing some of the templates in the sub-chart; the others are left out for brevity.
The Ingress routing by header was adapted from this link. The Ingress spec is as below:
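A sketch of what that Ingress could look like, shown rendered for a single shard rather than as a Helm template (resource and service names such as shard-a-route and shard-a-active are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: shard-a
  namespace: xyz
  annotations:
    konghq.com/override: shard-a-route   # ties this Ingress to the KongIngress below
spec:
  ingressClassName: kong
  rules:
    - http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: shard-a-active
                port:
                  number: 80
```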
The plain vanilla Ingress definition cannot route by request header. We hence decorate it with a KongIngress and tie it to the Ingress using the konghq.com/override annotation. The spec for the KongIngress is as given below:
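A minimal KongIngress sketch that matches on the x-shard header (the resource name and header value are illustrative):

```yaml
apiVersion: configuration.konghq.com/v1
kind: KongIngress
metadata:
  name: shard-a-route
  namespace: xyz
route:
  # Only requests carrying this header value are routed to the Ingress backend.
  headers:
    x-shard:
      - "a"     # must be a string; a bare number does not work (see note below)
```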
One thing that we have noticed is that custom-header-based routing in Kong does not work using numbers alone; the header value needs to be a string.
ArgoCD
ArgoCD is a declarative, GitOps continuous delivery tool for Kubernetes. ArgoCD is used to deploy our application to all the respective EKS clusters in one go, based on the application deployment definition (helm chart) in GitHub. The docker images used by the helm charts come from an AWS ECR repository. We have used Kubernetes-style declarative manifests for creating the ArgoCD artifacts. The most basic artifact in ArgoCD is the concept of an application, which basically tells ArgoCD where the source (i.e. the application deployment definition) and the target (i.e. the target Kubernetes clusters) are. The application.yml for ArgoCD is below:
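A sketch of such an application.yml, with the repo URL, path, and namespace matching the description that follows (the application and project names, branch, and values file are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sharded-app
  namespace: argocd
spec:
  project: my-project
  source:
    repoURL: https://github.com/mygit/myproject.git
    targetRevision: main
    path: app-chart
    helm:
      valueFiles:
        - values-staging.yaml
  destination:
    server: https://kubernetes.default.svc   # the local cluster where ArgoCD runs
    namespace: xyz
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```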
The ArgoCD metadata sets up the helm chart to be watched in the https://github.com/mygit/myproject.git Git repo, under a sub-folder called app-chart, and deploys the chart in the xyz namespace in the local cluster (i.e. the same Kubernetes cluster where ArgoCD is installed).
Another unit of organization in ArgoCD is the concept of a project, which helps us isolate a set of related applications. The project.yaml file for the ArgoCD project is as follows:
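A minimal AppProject sketch matching the application above (the project name and description are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: my-project
  namespace: argocd
spec:
  description: Sharded application deployments
  sourceRepos:
    - https://github.com/mygit/myproject.git
  destinations:
    - namespace: xyz
      server: https://kubernetes.default.svc
```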
Both these Kubernetes-style manifests can be applied using the following commands:
kubectl apply -f project.yaml
kubectl apply -f application.yaml
ArgoCD also requires that we set up the git-repository, so that it can check the source for any updates
argocd repo add https://github.com/mygit/myproject.git --username $argoGithubUser --password $argoGithubPass --project $argoCdProject --insecure-skip-server-verification
Here the username and password are the GitHub username and a GitHub personal access token.
Similarly, we will also need to add the ECR repository to the ArgoCD repository definition
ecrPass=`aws ecr get-login-password --region us-west-1`
argocd repo add $ecrRepo --enable-oci --username AWS --type helm --password $ecrPass --name myrepo
Here $ecrRepo is the AWS ECR repository.
Blue-Green using Argo-Rollouts
Argo Rollouts is a Kubernetes controller that provides advanced deployment capabilities such as blue-green, canary, and other progressive delivery features to Kubernetes. We used Argo-Rollouts to establish a blue-green deployment, and we can easily adapt our existing deployments to use Argo-Rollouts. The Argo-Rollout definition is as follows:
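A blue-green Rollout sketch for one shard (the names, image, and the active/preview service references are illustrative; the Services themselves appear a little further down):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: shard-a
  namespace: xyz
spec:
  replicas: 2
  selector:
    matchLabels:
      app: shard-a
  template:
    metadata:
      labels:
        app: shard-a
    spec:
      containers:
        - name: app
          image: 123456789012.dkr.ecr.us-west-1.amazonaws.com/myapp:1.0.0   # illustrative image
          ports:
            - containerPort: 8080
  strategy:
    blueGreen:
      activeService: shard-a-active    # Service receiving live traffic
      previewService: shard-a-preview  # Service pointing at the new (green) version
      autoPromotionEnabled: false      # promote manually after verification
```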
Notice we have mentioned the strategy.blueGreen and the kind as Rollout. These are the only noticeable differences between a normal Kubernetes Deployment spec and an Argo Rollout.
The Service manifest is as follows
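A sketch of the active and preview Services the Rollout above refers to (names and ports are illustrative; Argo Rollouts manages the pod-template-hash part of the selectors during a rollout):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: shard-a-active
  namespace: xyz
spec:
  selector:
    app: shard-a
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: shard-a-preview
  namespace: xyz
spec:
  selector:
    app: shard-a
  ports:
    - port: 80
      targetPort: 8080
```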
I have not shown the datastore credentials in the rollout / deployment. You could easily add the non-sensitive data into the values-staging.yaml and then inject these in the rollout-deployment.yaml.
Storing sensitive data
For storing sensitive data, we would recommend using sealed-secrets (https://github.com/bitnami-labs/sealed-secrets) to encrypt the secret value against the Kubernetes cluster. Then you can take the encrypted value and stick it into the helm values-staging.yaml.
The process might be as follows:
# Create a secret first
kubectl --kubeconfig $KUBECONFIG create secret generic my-secret --dry-run=client --from-literal=db_password=mypass -n xyz -o yaml > /tmp/secret.yaml
# Use the kubeseal utility to sign/seal the secret
kubeseal -f /tmp/secret.yaml --kubeconfig $KUBECONFIG --scope namespace-wide -o yaml > /tmp/sealed-secret.yaml
The file /tmp/sealed-secret.yaml contains a .spec.encryptedData section that holds the encrypted secret. We can take these encrypted values, put them into our helm values file, and safely check them in to our GitHub repository.
Then we need a SealedSecret spec in the helm chart, as follows:
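A sketch of such a template, assuming the encrypted value is carried in a hypothetical .Values.secrets.dbPassword entry (the secret name and key are illustrative):

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: my-secret
  namespace: xyz
  annotations:
    # Matches the --scope namespace-wide used when sealing.
    sealedsecrets.bitnami.com/namespace-wide: "true"
spec:
  encryptedData:
    db_password: {{ .Values.secrets.dbPassword | quote }}
  template:
    metadata:
      name: my-secret
      namespace: xyz
```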
This will result in the creation of the SealedSecret when the helm chart deploys. As soon as the SealedSecret spec is created, the sealed-secrets controller in the Kubernetes cluster will unseal the secret and create a regular Kubernetes Secret, which can be easily injected into your application.
And now your application is ready!
You can set up a tremendous amount of automation for many of these steps using GitHub Actions too, and enjoy the fruits of such automation!