Unlocking Scalable Logging in Your Kubernetes Cluster

Demystifying Kubernetes Logging Architecture

To grasp log aggregation, you need to understand how logging works in Kubernetes. Imagine a cluster with multiple nodes, each running multiple containers within pods. Each container writes output to stdout and stderr, which Kubernetes automatically captures and stores as log files on the node. A log aggregation system merges these log files and forwards their contents to an external log collector. We can achieve this with a DaemonSet, which runs our log aggregator on every node and gives it access to that node’s accumulated log files. You can see the nodes and pods involved with the following commands:

kubectl get nodes
kubectl get pods --all-namespaces
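
If you want to see where those log files end up, most clusters store container output under /var/log on each node (typically /var/log/pods, with symlinks in /var/log/containers), which is the same directory Loggy will read later. From a shell on a node, or a debug pod with that host path mounted, you can list them:

ls /var/log/containers
ls /var/log/pods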

Getting Started with Log Aggregation

To explore log aggregation, you’ll need a Kubernetes cluster, a Linux-style terminal, Docker, and kubectl installed. You can find the example code for this tutorial on GitHub. Before diving into advanced log aggregation, try the basic approaches first. Use the Kubernetes dashboard or the kubectl logs command to view recent logs from any pod.

kubectl logs -f <pod_name>

Building a Simple Log Aggregator

Meet Loggy, a lightweight Node.js microservice designed to forward logs from Kubernetes to an external log collector. We’ll explore how to build Loggy, focusing on finding log files, eliminating system log files, tracking log files, parsing log entries, and sending them to an external log collector.

const globby = require('globby'); // globby v11 or earlier can be loaded with require()
const { Tail } = require('tail'); // the node-tail package is published on npm as "tail"

// Find log files, using a negative glob to eliminate system log files like syslog.
const logFilePaths = globby.sync([
  '/var/log/**/*.log',
  '!/var/log/**/*syslog*',
]);

logFilePaths.forEach(logFilePath => {
  // Track the log file, receiving each new line as it is appended.
  const tail = new Tail(logFilePath);
  tail.on('line', line => {
    // Parse the log entry (assuming each line is a JSON object).
    const logEntry = JSON.parse(line);
    // Send to an external log collector (for now, just print it).
    console.log(logEntry);
  });
  tail.on('error', err => {
    console.error(`Error tailing ${logFilePath}:`, err);
  });
});
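
If you’re following along, the two dependencies install in the usual way. Pinning globby to version 11 is one option, since newer globby releases are ESM-only and can’t be loaded with require:

npm install globby@11 tail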

Tackling Log Aggregation Challenges

To tackle log aggregation, we need to answer several questions:

  • How do we find log files? We’ll use Globby, a great npm package for finding files based on globs.
  • How do we eliminate system log files? We’ll use Globby again to exclude system log files.
  • How do we track log files? We’ll use the node-tail npm package to receive new lines as they come.
  • How do we parse log entries? We’ll use the built-in JavaScript function JSON.parse.
  • Where do we send each log entry? This depends on your chosen external log collector.
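
The last question is the one the example above dodges with console.log. As a minimal sketch, assuming your collector accepts JSON over HTTP (the endpoint URL below is hypothetical), the sending step could look like this:

const axios = require('axios');

// Hypothetical endpoint for your external log collector.
const LOG_COLLECTOR_URL = 'http://log-collector.logging.svc.cluster.local/logs';

// Forward a single parsed log entry to the collector.
function sendLogEntry(logEntry) {
  return axios.post(LOG_COLLECTOR_URL, logEntry)
    .catch(err => {
      // Don't crash the aggregator just because the collector is unreachable.
      console.error('Failed to forward log entry:', err.message);
    });
}

In the tailing code shown earlier, you would call sendLogEntry(logEntry) in place of console.log(logEntry).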

Deploying Loggy to Your Kubernetes Cluster

Once you’ve built Loggy, you can deploy it to your Kubernetes cluster as a DaemonSet. This runs Loggy on every node and, by mounting the node’s /var/log directory, gives it access to that node’s accumulated log files. You can then view the logs Loggy has collected on each node using kubectl logs.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: loggy
spec:
  selector:
    matchLabels:
      app: loggy
  template:
    metadata:
      labels:
        app: loggy
    spec:
      containers:
      - name: loggy
        image: loggy:latest
        volumeMounts:
        - name: log-volume
          mountPath: /var/log
      volumes:
      - name: log-volume
        hostPath:
          path: /var/log
          type: Directory
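
Assuming you’ve saved the manifest above as loggy-daemonset.yaml and the loggy image is available to your cluster (for a local cluster you may need to build it in-cluster or push it to a registry), deploying and checking on Loggy looks something like this:

kubectl apply -f loggy-daemonset.yaml
kubectl get pods -l app=loggy -o wide
kubectl logs -l app=loggy

The -o wide flag shows which node each Loggy pod landed on, and kubectl logs with the label selector pulls the output from all of them.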

Next Steps: Storing and Analyzing Your Logs

Now that you have Loggy collecting logs, you need to decide where to store and analyze them. You could store logs in a database or forward them to an external log collector. The possibilities are endless!
