
Getting started with Logging and Kubernetes (Part 2)

by Peyton Vaughn | Tuesday, Sep 3, 2019 | Kubernetes

In our previous post on logging, we took a look at enhancing Kubernetes’ native functionality by adding log aggregation. Along the way (and perhaps more importantly) we discussed how to wade through the overwhelming number of options available and zero in on a single solution.

While getting the cluster logs aggregated to one spot is a great first step, in this post we discuss taking it to the next level. Agile development shops need to be able to react quickly. To do so, they must be able to curate and interrogate data about their applications quickly and easily.

The de facto solution for this (on-prem anyway) is typically Elasticsearch fronted by a dashboard. Elasticsearch provides a scalable, battle-tested solution for indexing data (even non-log data), and a dashboard provides a user-friendly interface for querying.

In this blog post, we'll walk through the different ways to implement this type of monitoring stack, and we'll actually go through a deployment with you step-by-step.

Related: Need your own Kubernetes cluster for this walkthrough? Try Getting Started with Rancher

ELK, EFK, What Does it all Mean?

As before, we have to start by making some decisions.

If you've researched container logging solutions, you've probably come across references to the ELK stack, and more recently the EFK stack. Luckily, these are the two primary options to choose from:

ELK: Elasticsearch, Logstash, Kibana
EFK: Elasticsearch, Fluentd, Kibana

As mentioned, Elasticsearch is where logs are stored and indexed. Kibana provides a dashboard for querying and visualizations. The variable part, Logstash/Fluentd, provides aggregation and forwarding.

Seems simple enough. However, there's a small catch. There is actually another letter hiding out in those acronyms, and really they could be written:

ELfK: Elasticsearch, Logstash, Filebeat, Kibana
EFfK: Elasticsearch, Fluentd, Fluent Bit, Kibana

(Yup, that's right, the f is different for each stack 😭)

And just to complicate things further, sometimes when bloggers reference EFK, they're actually referring to Elastic, Fluent Bit/Filebeat, and Kibana, and leaving Logstash/Fluentd out altogether.

Why is this ELK/EFK Stuff so Complicated?

Why so many options? And how is it that we can leave out the aggregation layer?

To answer, let's have a quick (and riveting!) history recap on logging:

Logstash was written in JRuby to parse and forward logs to Elasticsearch. It's flexible and performant, but since it runs in the JVM, it's a bit of a memory hog.

Later, Fluentd was released. Also written in Ruby, it likewise parses and forwards logs to Elasticsearch, but since it uses MRI Ruby, it consumes less memory than Logstash.

As time goes on and cluster sizes increase, the realization grows that memory consumption across the cluster can be reduced by replacing the heavyweight aggregator on every node with a lightweight process that forwards logs to just a few aggregators.

In answer, the Logstash team at Elastic releases Filebeat, and the Fluentd project releases Fluent Bit.

As time goes on - and this bit is crucial - more and more features are added to Filebeat and Fluent Bit. In fact, they reach the point that, for many organizations, the lightweight forwarders' native parsing abilities are sufficient. So they get updated to forward to Elasticsearch directly - bypassing the aggregation layer altogether.
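
To make that concrete, here's a rough sketch of what a Fluent Bit config forwarding straight to Elasticsearch might look like. The Helm chart we install later generates something along these lines for you, so treat this as illustrative only - the paths, parser, and host name here are assumptions, not the chart's exact output.

[SERVICE]
    # The stock parsers.conf that ships with Fluent Bit defines the docker parser used below
    Parsers_File    parsers.conf

[INPUT]
    # Tail the container log files written on each node
    Name            tail
    Path            /var/log/containers/*.log
    Parser          docker
    Tag             kube.*

[FILTER]
    # Enrich each record with pod, namespace, and label metadata
    Name            kubernetes
    Match           kube.*

[OUTPUT]
    # Ship straight to Elasticsearch - no Logstash/Fluentd hop in between
    Name            es
    Match           *
    Host            elasticsearch-master
    Port            9200
    Logstash_Format On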

Which Kubernetes Logging Stack is Right for You?

The good news is, you can't really make a wrong choice - their functionality is too similar.

If you have older or more esoteric apps deployed, it may be worth checking the plugin list for each stack (ELK vs EFK). One may have better support for your applications than the other.

That said, for many organizations running modern workloads, an EFK stack is sufficient. And for the rest of this post, we're actually going to go with the Elasticsearch, Fluent Bit, Kibana stack.

How to Deploy the EFK Stack to Kubernetes

Decision made. Now for the easy part!

How to Deploy Elasticsearch

We'll first deploy Elasticsearch, since the other two components depend on it. We'll use Helm to make things super simple.

Related: What is Helm and why is it Important for Kubernetes?

Here is the Elasticsearch Helm Chart. We'll use it to deploy Elasticsearch to our Kubernetes cluster.

$ helm repo add elastic https://helm.elastic.co
$ helm repo update
$ helm install elastic/elasticsearch \
       --name elasticsearch \
       --namespace logging \
       --set replicas=1 \
       --set minimumMasterNodes=1 \
       --set volumeClaimTemplate.accessModes[0]=ReadWriteMany \
       --set volumeClaimTemplate.resources.requests.storage=100Gi

And to just confirm we've successfully deployed:

$ kubectl get pods -n logging
NAME                     READY   STATUS    RESTARTS   AGE
elasticsearch-master-0   1/1     Running   0          10s
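
If you want to go a step further and confirm Elasticsearch itself is healthy, a quick port-forward and curl does the trick (assuming the chart's default service name, elasticsearch-master, and default HTTP port 9200):

$ kubectl port-forward -n logging svc/elasticsearch-master 9200:9200 &
$ curl -s 'http://localhost:9200/_cluster/health?pretty'

Expect a status of green here - or yellow later on, which is normal for a single-node install once indices with replicas start arriving.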

How to Deploy Fluent Bit

Fluent Bit Helm Chart

$ helm install stable/fluent-bit \
     --name fluent-bit \
     --namespace logging \
     --set backend.type=es \
     --set backend.es.host=elasticsearch-master

And to just confirm we've successfully deployed:

$ kubectl get pods -n logging -o wide
NAME               READY   STATUS    RESTARTS   AGE   IP             NODE     NOMINATED NODE   READINESS GATES
fluent-bit-7sbpf   1/1     Running   0          20s   10.244.3.187   node-1   <none>           <none>
fluent-bit-7z8zj   1/1     Running   0          20s   10.244.2.138   node-2   <none>           <none>
fluent-bit-bqq65   1/1     Running   0          20s   10.244.0.233   node-3   <none>           <none>
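
With the Elasticsearch port-forward from the previous step still running, you can also confirm logs are actually flowing by listing the indices. You should see one or more new indices created by Fluent Bit (the exact names depend on the chart's index and prefix settings):

$ curl -s 'http://localhost:9200/_cat/indices?v'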

How to Deploy Kibana

Kibana Helm Chart

$ helm install elastic/kibana \
     --name kibana \
     --namespace logging \
     --set ingress.enabled=true \
     --set ingress.hosts[0]=kibana.pv \
     --set service.externalPort=80 

And just to confirm we've successfully deployed:

$ kubectl get pods -n logging
NAME                     READY   STATUS    RESTARTS   AGE
kibana-c597fd4d5-mwsl9   1/1     Running   0          5s

And that's it. We can fire up our favorite web browser and go to the ingress URL we used when deploying Kibana above to check out our logs.
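
If you don't have an ingress controller handy, a quick port-forward to the Kibana pod works just as well for a first look (using the pod name from the output above; Kibana listens on 5601 by default):

$ kubectl port-forward -n logging kibana-c597fd4d5-mwsl9 5601:5601

Then point your browser at http://localhost:5601 instead of the ingress host.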

Kibana Dashboard

With that you've got your logs not only centrally located, but curated and searchable.

Terms and Conditions May Apply

Now for some disclaimers.

The above installs are about the simplest possible. While fine for trying things out, a production deployment would be more complex.

If you follow the links to the individual Helm charts, you'll also discover there are a ton of options available to configure. This is actually great in some respects. Helm charts provide a consistent, opinionated framework that makes discovering and keeping track of all the config options manageable. However, for organizations that are new to Kubernetes, it can be daunting.
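
One way to keep all those options manageable is to collect them in a values file rather than a string of --set flags. As a rough sketch, the Elasticsearch install from earlier could be expressed like this - the keys mirror the chart's values.yaml, so double-check them against the chart's documented defaults:

$ cat > elasticsearch-values.yaml <<'EOF'
# Same settings as the --set flags used earlier, gathered in one place
replicas: 1
minimumMasterNodes: 1
volumeClaimTemplate:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
EOF
$ helm install elastic/elasticsearch \
     --name elasticsearch \
     --namespace logging \
     -f elasticsearch-values.yaml

A values file is also easier to review and keep in version control than a long command line, which pays off as the number of tweaked options grows.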

In the course of composing this post, I actually learned something new myself. What I thought was the official home of the Helm charts for Elasticsearch and Kibana has actually been deprecated. Make no mistake - the projects themselves are still active. But the Helm charts have a new, official (yet oddly still beta?) home under the Elastic GitHub repo.

Keeping Track of It All

Hopefully these posts have helped illustrate the decision-making steps in choosing and deploying an open source logging solution. If it seems a bit overwhelming, quite frankly, that's because it is - at first. There's definitely a learning curve to overcome when starting out with Kubernetes - but the payoff in the long run makes it enormously worthwhile.

And of course, if you need help getting over that initial curve, we're happy to help.