Elastic stack in AKS. Part 1: Provisioning Kubernetes with Terraform

This is part 1 of a series of blog posts about deploying the Elastic stack in Azure Kubernetes Service.

Related GitHub repository with Terraform source code: K8sAzureTerraform

Infrastructure

We deploy the following infrastructure:

  • Azure Kubernetes Service (AKS) – the environment for running the Elastic stack
  • Ingress Nginx – ingress controller based on Nginx
  • Deployment – any arbitrary deployment inside AKS
  • Inbound IP – public static IP for inbound traffic
  • Outbound IP – public static IP for outbound traffic. A static outbound IP can be whitelisted in any external services or components that receive traffic from our cluster. This is particularly useful for Elastic Heartbeat.
  • Terraform state storage – Azure storage account for the Terraform state. This is obviously not deployed by Terraform; it must be available before the AKS deployment.
  • Backup storage – Azure storage account for Elastic snapshots
  • Log Analytics – Azure Log Analytics workspace for AKS logs and monitoring

Prerequisites

  • Access to the source code K8sAzureTerraform from the deployment environment
  • Azure environment prerequisites (see the sketch after this list):
    • A service principal with the Contributor role on the subscription
    • A storage account to store the Terraform state
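
If these do not exist yet, they can be created with the Azure CLI. A minimal sketch (subscription ID, names and location are placeholders, not the exact values the repository expects):

# Service principal with the Contributor role on the subscription
az ad sp create-for-rbac --name sp-terraform-k8s --role Contributor --scopes /subscriptions/<SUBSCRIPTION_ID>

# Storage account and container for the Terraform state
az group create --name rg-terraform --location westeurope
az storage account create --name terraformstate --resource-group rg-terraform --sku Standard_LRS
az storage container create --name tfstate --account-name terraformstate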

Hardcoded parameters

  • Kubernetes version 1.20.5
  • VM node size: Standard_B2s with 2 vCPUs, 4 GiB memory

These are defined as variables in the deploy.sh script and can be changed there.
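
In deploy.sh they might look roughly like this (the variable names are illustrative; check the script in the repository for the exact ones):

KUBERNETES_VERSION=1.20.5   # hypothetical variable name
NODE_VM_SIZE=Standard_B2s   # hypothetical variable name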

Tools

The easiest way, with no tools required, is to use Azure Cloud Shell.

In order to use this from a local environment, the following tools are required (a quick version check is sketched after the list):

  • Terraform version >= 0.14
  • Azure CLI version >= 2.8
  • git
  • kubectl – Kubernetes command line tool
  • helm
  • Bash shell (the deployment script is written in Bash)
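
A quick way to verify they are all available (output formats vary between versions):

terraform version
az --version
git --version
kubectl version --client
helm version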

Deployment

  1. Log in to Azure Cloud Shell https://shell.azure.com
  2. Clone the repository

git clone https://github.com/mchudinov/K8sAzureTerraform.git

  3. Change to the source code directory

cd K8sAzureTerraform

  4. Run the deploy.sh script

For example:

./deploy.sh -c mytestk8s -n 3 -r westeurope -p XXX -s terraformstate

The options for the deployment script are:

  • -c Cluster name
  • -n Number of nodes (default 1)
  • -r Azure region (default West Europe)
  • -p Azure service principal ID for Terraform
  • -s Storage account name for Terraform state

After a couple of minutes a new Kubernetes cluster will be ready.

Terraform places all the created resources in two resource groups (they can be listed with the command below):

  • rg-<cluster_name> – this group is for the Kubernetes cluster itself
  • rg-node-<cluster_name> – this one is for the VMs of the cluster
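
A quick check that both groups exist, assuming the cluster name from the example above:

az group list --query "[?contains(name, 'mytestk8s')].name" --output table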

Terraform outputs and post-deployment actions

fqdn = "k8s-mytestk8s-a9f94e35.hcp.northeurope.azmk8s.io"
host = "https://k8s-mytestk8s-a9f94e35.hcp.northeurope.azmk8s.io:443"
public_ip_inbound_address = "52.142.123.240"

The script adds the new Kubernetes context to the local .kube config file:

Merged "k8s-mytestk8s" as current context in /home/michael/.kube/config
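
At this point kubectl should be able to reach the cluster:

kubectl config current-context
kubectl get nodes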

The script deploys the Nginx ingress controller in the ingress-nginx namespace:

namespace/ingress-nginx created
NAME: ingress-nginx
LAST DEPLOYED: Sat May  1 16:57:04 2021
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
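
The Helm installation performed by the script is roughly equivalent to the following sketch (the chart values, in particular binding the controller service to the static inbound IP, may differ in the repository):

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx --namespace ingress-nginx --create-namespace --set controller.service.loadBalancerIP=<PUBLIC_IP_INBOUND>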

The script installs the CSI driver for Azure Key Vault using Helm.
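
A sketch of that installation (repo URL and chart name as documented by the Azure provider at the time; the release name, namespace and chart version used by the script may differ):

helm repo add csi-secrets-store-provider-azure https://azure.github.io/secrets-store-csi-driver-provider-azure/charts
helm install csi csi-secrets-store-provider-azure/csi-secrets-store-provider-azure --namespace kube-system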

How-tos

Add Kubernetes credentials to the local .kube config file

az aks get-credentials --name <AKS_NAME> --resource-group <RESOURCE_GROUP>

Run an interactive shell

kubectl apply -f interactive.yaml
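
The interactive.yaml manifest is part of the repository. An equivalent ad-hoc way to get a throwaway shell inside the cluster is:

kubectl run -it --rm debug --image=busybox --restart=Never -- sh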

Check that the CSI driver is running

kubectl get csidrivers
kubectl describe csidriver secrets-store.csi.k8s.io
kubectl get pods -l app=secrets-store-csi-driver

The Log Analytics workspace works immediately after deployment.
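
One way to query it from the command line, assuming the workspace ID is known and your Azure CLI provides the log-analytics commands (the table name depends on which agents are enabled):

az monitor log-analytics query --workspace <WORKSPACE_ID> --analytics-query "ContainerLog | take 10"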

Clean-up

Delete everything created in Azure using the destroy.sh script. It runs terraform destroy.

The destroy.sh script has the same parameters as the deployment script deploy.sh. For example:

./destroy.sh -c mytestk8s -n 3 -r westeurope -p XXX -s terraformstate

Delete everything manually. If the destroy.sh script fails, everything that was created can easily be erased by hand: just delete the resource groups created by Terraform:

az group delete --resource-group rg-<cluster_name>
az group delete --resource-group rg-node-<cluster_name>

Discard all local changes in the git repository

git reset --hard

Delete the source code directory in Cloud Shell

rm -rf K8sAzureTerraform

Pros and cons

Deploying AKS with Terraform has an advantage over deploying with scripts: the Terraform code always represents the latest state of your infrastructure. At a glance, you can tell what’s currently deployed and how it’s configured. You just focus on describing your desired state, and Terraform figures out how to get from one state to the other automatically. However, a deep discussion of procedural vs. declarative infrastructure programming is beyond the scope of this blog post.

Nevertheless, there is one important disadvantage of the current Terraform AzureRM provider (version 2.49) related to Elasticsearch: it does not support AKS custom node configuration (see Customize node configuration for Azure Kubernetes Service (AKS) node pools (preview)). Custom AKS node configuration is particularly useful for running Elasticsearch, as it requires vm.max_map_count=262144.
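
For reference, a node pool with the required sysctl could be added with the Azure CLI instead. A sketch, assuming a linuxosconfig.json file as described in the preview documentation (the feature may require the aks-preview CLI extension):

cat > linuxosconfig.json <<'EOF'
{
  "sysctls": {
    "vmMaxMapCount": 262144
  }
}
EOF

az aks nodepool add --cluster-name mytestk8s --resource-group rg-mytestk8s --name espool --linux-os-config ./linuxosconfig.json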

Without custom node configuration, Elasticsearch pods must run privileged in order to satisfy that virtual memory requirement.
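
A common workaround is a privileged init container that sets the sysctl before Elasticsearch starts. A minimal sketch (pod name, image tag and single-node settings are illustrative only):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: elasticsearch-example
spec:
  initContainers:
    - name: sysctl
      image: busybox
      command: ["sysctl", "-w", "vm.max_map_count=262144"]
      securityContext:
        privileged: true
  containers:
    - name: elasticsearch
      image: docker.elastic.co/elasticsearch/elasticsearch:7.12.1
      env:
        - name: discovery.type
          value: single-node
EOF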

Wrapping up

We have successfully deployed a managed Kubernetes service in Azure using Terraform. This cluster is suitable for deploying Elasticsearch as well as other services.

The newly created Kubernetes cluster has static outbound and inbound IPs, which makes it easy to whitelist these addresses in any external infrastructure.

The Kubernetes cluster got an Nginx-based ingress controller bound to the public static inbound IP. It also got the Container Storage Interface (CSI) driver for Azure Key Vault installed. This is particularly useful for injecting certificates and secrets such as login credentials for Elasticsearch.