Kubernetes is a popular container orchestration engine, but it’s hard, especially for newcomers. If you’re used to running a bunch of virtual machines, connecting them together to get your app running, keeping those machines patched and maintained, and making sure development environments stay consistent, then you’re going to have a bad time. Containers have changed so much about how software is developed, tested and deployed, and they have reduced a lot of the cognitive burden and knowledge which is often locked away in the minds of the people who just “know how to run” the organisation’s applications on a server “somewhere”. This post aims to remove much of that burden and move the locked-away knowledge into documentation by introducing concepts such as infrastructure-as-code and automation.
Terraform
Terraform is a tool by HashiCorp which helps manage cloud infrastructure using a declarative templating language, which is compiled and converted into API calls by each cloud provider’s plugin. In the context of Kubernetes, I have created a base starting point which can be used and expanded on, and which I wanted to share. Let’s get stuck into it.
The repo is available at https://github.com/jgunnink/terraform_gke_sample/ and has a number of configuration files, some of which will need to be customised for your deployment. Go clone the repo and then we can get started.
Getting set up
The first step is to sign into your Google Cloud account and, in the IAM section, create a service account with just enough permissions to complete the tasks required for deploying the infrastructure. Taking the least privilege approach will help reduce the risk of too much power over your GCP resources (and by extension, your own resources - aka your cash!) being misused or stolen if the credentials fall into the wrong hands. Plus, it will make any security and finance people in your organisation happy. Once you’ve created the service account and downloaded the JSON file which contains the authentication information, it’s time to update the terraform.tfvars file which exists at the root of the repo.
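If you prefer the command line, the same service account can be created with the gcloud CLI. This is only a minimal sketch: the account name, the key path and the roles/container.admin role are placeholders of mine, so grant whichever roles actually fit the least-privilege advice above.

# Create the service account (names here are hypothetical).
gcloud iam service-accounts create terraform-deployer \
  --display-name="Terraform deployer"

# Grant only the roles your deployment actually needs.
gcloud projects add-iam-policy-binding MY_PROJECT_ID \
  --member="serviceAccount:terraform-deployer@MY_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/container.admin"

# Download the JSON key that terraform will authenticate with.
gcloud iam service-accounts keys create ./service-account.json \
  --iam-account="terraform-deployer@MY_PROJECT_ID.iam.gserviceaccount.com"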
Configure the two lines with your GCP project ID and the path to the JSON file you’ve just downloaded.
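For example, a filled-in terraform.tfvars might look like this (the values are placeholders; the variable names match those referenced in main.tf):

project          = "my-gcp-project-id"      # your GCP project ID
credentials_file = "./service-account.json" # path to the downloaded JSON key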
Deploy the cluster
Next up, ensure terraform is installed and then run the terraform init command to initialise terraform in the repository, if you haven’t already. Once done, you’ll be able to run the terraform plan command. This prints out all the changes terraform will make to your cloud infrastructure on your behalf. It’s worth reviewing the plan so you’re fully aware of what will be done for you. If you’re ready, you can run terraform apply, which will deploy all of the resources outlined in the previous step. Let’s do that now. The process will take around 8 minutes to create the cluster, network and machines before we can start issuing commands to the cluster.
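In short, the whole flow is three commands run from the root of the repo:

terraform init    # download the google provider and initialise the working directory
terraform plan    # preview the resources terraform will create
terraform apply   # create them (terraform asks for confirmation first)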
Here’s a basic architecture diagram of what we will end up with once everything is deployed and Kubernetes is configured.
Understanding the terraform files
Whilst the resources are being provisioned, we can take a look at each of the files in the repository to see how it all fits together. Let’s start with the variables.tf file.
You’ll see things like:
variable "initial_node_count" {
default = 1
}
variable "region" {
default = "australia-southeast1"
}
variable "machine_type" {
default = "n1-standard-1"
}
In short, this file allows us to declare variables that we can use across our infrastructure, should we want to use them in multiple places. Many resources in GCP require a region, for example. Since we’re providing a default here, our resources will be placed in the australia-southeast1 region unless we say otherwise. We could declare other variables here too; for example, one named united_states with a default value of us-central1, or whatever we wanted. You don’t have to provide these defaults, but it’s a good idea to have a standard, which you can then tweak in an individual resource declaration if you need to, or leave out entirely if you want the default value used everywhere in your app.
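As a quick sketch, the extra variable mentioned above might look like this, and any default can also be overridden at plan/apply time without editing the file:

variable "united_states" {
  default = "us-central1"
}

terraform plan -var="region=us-central1"   # one-off override of a default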
Next up we have the main.tf file, which is the entry point for all the infrastructure declarations. In this case, we’re just using one file, since our infrastructure is relatively straightforward and doesn’t use too many resources. Normally it makes sense to configure things like VMs, networking, IAM, and so on in separate resource files, so the layout stays clear and logical.
In the provider block, we’re declaring that google is the cloud provider for our infrastructure. Terraform can also be used with other clouds, so if you wanted to build out templates for them, you’d change the value here.
provider "google" {
credentials = file(var.credentials_file)
project = var.project
region = var.region
zone = var.zone
}
This block is pretty self-explanatory, so I won’t go into too much detail here.
The two other blocks contain the infrastructure declarations we need to run our app: the cluster and its node pool. Take a look at the resources here, starting with the node pool. Each resource block declaration ends in a label, default, which gives the block a name we can use in other places. One such example is on line 32, where the node pool references the cluster using the resource_type.label syntax; in our case, google_container_cluster.default. We assign whatever the cluster’s name ends up being to the container node pool resource we’re declaring here. This also tells terraform that the cluster must exist before the node pool can be created, since we are using its values in another resource.
Some of the other configuration options you may recognise if you’re familiar with GCP. For example, in this case we’re making use of preemptible VMs and giving the node pool some OAuth scopes, such as monitoring and the ability to write logs to Stackdriver.
The only other thing I’d like to point out is in the google_container_cluster resource: we remove the default node pool that is automatically created along with the cluster. This is because we want to use our own node pool of preemptible VMs, so we drop the default nodes once the cluster has been created in order to not waste resources and to stick with the configuration we want.
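To make that concrete, here is a condensed sketch of how the two resources fit together. The attribute values are illustrative and the repo’s actual file may differ slightly:

# The cluster, with the automatically-created node pool removed.
resource "google_container_cluster" "default" {
  name     = "sample-cluster"
  location = var.region

  remove_default_node_pool = true
  initial_node_count       = 1
}

# Our own pool of preemptible nodes, attached to the cluster above.
resource "google_container_node_pool" "default" {
  name       = "sample-node-pool"
  location   = var.region
  cluster    = google_container_cluster.default.name # creates the dependency
  node_count = var.initial_node_count

  node_config {
    preemptible  = true
    machine_type = var.machine_type

    oauth_scopes = [
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/logging.write",
    ]
  }
}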
Kubectl do the thing
If you’re new to Kubernetes, then kubectl won’t be a tool you’ve had the joy of using yet, but it will be very soon. It lets you use declarative YAML templates to describe how workloads should run inside Kubernetes, and also to configure networking, secrets, load balancing and traffic splitting, amongst many other things. In our example we’ll be focusing on creating a service, a deployment and an ingress.
Let’s take a look at the service file.
apiVersion: v1
kind: Service
metadata:
  name: example
  namespace: default
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    run: example
  type: NodePort
This specification creates a new Service object named “example” in the default namespace, which targets TCP port 80 on any Pod carrying the run: example label.
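Once the manifests have been applied later on, you can verify the Service with standard kubectl commands like these (the names match the manifest above):

kubectl get service example        # shows the Service type and ports
kubectl describe service example   # shows the endpoints matched by the selector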
This is a good segue into the deployment.yml file. From the Kubernetes docs:
A Deployment provides declarative updates for Pods and ReplicaSets. You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
  namespace: default
spec:
  selector:
    matchLabels:
      run: example
  template:
    metadata:
      labels:
        run: example
    spec:
      containers:
        - image: nginx:latest
          imagePullPolicy: IfNotPresent
          name: example
          ports:
            - containerPort: 80
              protocol: TCP
This file describes how we want to run the workload behind the service we just created. The deployment example is created, as indicated by the metadata.name field. The .spec.selector field defines how the Deployment finds which Pods to manage. In this case, you simply match the run: example label that is defined in the Pod template.
The spec.containers section specifically describes the container you want to run and its configuration. In this case, we’re just pulling in the latest nginx image for our demo to prove the concept out.
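After the deployment is applied, a handy way to see the Pods it manages is to list them by that same label:

kubectl get deployment example
kubectl get pods -l run=example   # the Pods carrying the deployment's label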
Side note: if you wanted more than one container serving traffic in your cluster, you could amend the spec at the top level and add the replicas key with a value of your choosing. For example, if you wanted 5 copies of your app running, you could set replicas: 5 like so:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
  namespace: default
spec:
  replicas: 5
  ...
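The replica count can also be changed imperatively, without editing the file, using kubectl’s built-in scale command:

kubectl scale deployment example --replicas=5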
Finally, we’ve got our ingress.yml file, which tells Kubernetes how we can reach our application from the outside world. This will also conveniently create a load balancer for us on Google Cloud, which gives us a public IP address so public internet traffic can start to reach our app!
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-ingress
spec:
  backend:
    serviceName: example
    servicePort: 80
Here you can see that we’ve specified a default backend sending traffic on port 80 to the service we talked about first. It’s extremely straightforward.
Configuring our cluster
By now your cluster may have finished deploying; if not, you may just be a speedy reader and it’s not quite done yet! Once terraform has reported that your cluster is ready, we need to configure it.
The tool I referred to earlier, kubectl, is about to get some use. If you haven’t already, go install it, and whilst you’re at it, install the gcloud command-line tools too. Once installed, you’ll need to authenticate to the cluster. To do that, run the following command (you may need to add a --region or --zone flag matching your cluster’s location):
gcloud container clusters get-credentials sample-cluster
You’ll get a confirmation message, then you should be able to connect with kubectl. To verify it works, run:

kubectl cluster-info
If that works, then you have command-line access to the cluster and should be able to issue commands. To get our app running we need to apply the three configurations we discussed in the last section.
kubectl apply -f app/service.yml
kubectl apply -f app/deployment.yml
kubectl apply -f app/ingress.yml
You’ll get confirmation messages that they’ve been applied. The Kubernetes control plane will now go about realising those desired configurations and creating the load balancer for you.
If you pop into the console and click through to Kubernetes Engine, you’ll see the running cluster bringing up the pod(s) and creating the service and load balancer. If you view the load balancer section in the console, you’ll find a public IP address. Once the service is deployed, you’ll be able to see your app running!
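You can also grab the public IP without leaving the terminal; once the load balancer has been provisioned (it can take a few minutes), the ingress reports it in its ADDRESS column:

kubectl get ingress example-ingress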
All done!
Cleaning up
If you were running through the above to get your hands dirty but don’t need the infrastructure we just created, then you can destroy everything quite easily! Simply run terraform destroy in your terminal, and terraform will ask you to confirm that you want to delete the resources you’ve declared. Confirming this will tear everything down.
Questions? Comments? Let me know! Tweet at me, or hit me up on the contact form.
Resources and useful links
- https://www.terraform.io/docs/configuration/index.html
- https://cloud.google.com/iam/docs/using-iam-securely
- https://kubernetes.io/docs/concepts/services-networking/service/
- https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
- https://kubernetes.io/docs/concepts/services-networking/ingress/
- https://github.com/jgunnink/terraform_gke_sample/