Photo by Lukas Blazek on Unsplash
Observability of decentralized systems has always been challenging. Dealing with latency, distributed transactions, failures, and so on has become increasingly complex. The more abstraction a decentralized system has, the harder it is to reason about, debug, and troubleshoot.
The main reason Kubernetes observability is so difficult is the volatile and dynamic nature of workloads and resources. Instead of dealing with one server, we now deal with an unknown number of servers (due to autoscaling). Rather than having one monolithic application, we now have multiple distributed services. The same goes for databases, which often reside outside of the cluster.
Let’s imagine making an HTTP(s) call to an API running on a Kubernetes cluster hosted on a cloud provider. Here is a simplified sequence diagram showing critical points where
At any point in this communication chain, things can go wrong, performance can degrade, security issues might occur, etc. Knowledge of what is happening on the cluster and detailed insights into every step of the communication chain are essential for operational performance.
Now we know what to observe, but the question is how to observe it and where to put our observability points, our gateways of insight.
There are a couple of options:
Combine this with exporting more important metrics into Prometheus and you are good to go.
Such a granular level of observability is possible thanks to eBPF (extended Berkeley Packet Filter), a technology that makes the kernel programmable in a safe and performant way.
eBPF is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in an operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules. To learn more about eBPF, visit Introduction to eBPF. Source: https://ebpf.io/what-is-ebpf/
Here is a
The diagram below shows how eBPF works at a high level.
Earlier we saw a diagram with an example of traffic flow in Kubernetes. Every step in this flow should produce valuable insights into our workloads.
Here is a list of typical insights both Dev and Ops will be interested in.
In the demo part, we will be looking into HTTP traffic on a sample app.
The Pixie CLI comes with predefined demo apps that we can install directly from the command line. However, those demos take a long time to load, so we will use a different application instead.
To follow along with the demo, you will need to install the following components:
Pixie is an open-source observability tool for Kubernetes applications. Pixie uses eBPF to automatically capture telemetry data without the need for manual instrumentation.
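Because Pixie relies on eBPF, the kernel running on the cluster nodes matters. Here is a minimal sketch of a version check, assuming a 4.14 minimum (the figure we recall from Pixie's requirements; verify it against the current docs). `kernel_at_least` is a hypothetical helper written for this article, not part of any tool.

```shell
# Hypothetical helper: succeeds if a kernel release string (as printed by
# `uname -r`) is at least MAJOR.MINOR.
# Usage: kernel_at_least RELEASE MAJOR MINOR
kernel_at_least() {
  release=$1 want_major=$2 want_minor=$3
  major=${release%%.*}          # "5.4.0-91-generic" -> "5"
  rest=${release#*.}            # "5.4.0-91-generic" -> "4.0-91-generic"
  minor=${rest%%[!0-9]*}        # "4.0-91-generic"   -> "4"
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

For example, `kernel_at_least "$(uname -r)" 4 14 && echo "kernel looks new enough"` checks the local machine. With minikube, the kernel that counts is the one inside the VM, so the release string should come from there once the cluster from the later steps is running.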
We will choose the Docker option for the Pixie CLI to minimize system clutter.
alias px="docker run -i --rm -v ${HOME}/.pixie:/root/.pixie pixielabs/px"
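Aliases are usually not expanded inside shell scripts, so if we want to call `px` from a script, a function is a possible substitute. This is a sketch that wraps the same `docker run` invocation as the alias above and forwards all arguments. Define one or the other, not both.

```shell
# Function equivalent of the px alias; use it instead of the alias, not
# alongside it. Every argument is passed through to the containerized CLI.
px() {
  docker run -i --rm -v "${HOME}/.pixie:/root/.pixie" pixielabs/px "$@"
}
```

With this in place, commands such as `px auth login` work the same way in an interactive shell and in a script.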
Pixie currently only supports minikube. The following install instructions are for Debian Linux; other installation instructions are available on the minikube page.
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube_latest_amd64.deb
sudo dpkg -i minikube_latest_amd64.deb
This will start minikube with a KVM driver.
minikube start --driver=kvm2 --cni=flannel --cpus=4 --memory=8000 -p=pixie-cluster
If you are running macOS, use the --driver=hyperkit option instead. HyperKit is macOS-only, so on Windows you would need one of the other drivers minikube offers.
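The driver choice can also be scripted. The sketch below maps the output of `uname -s` to the two drivers named in this article; `pick_minikube_driver` is a hypothetical helper, and any other operating system is simply reported as unsupported.

```shell
# Hypothetical helper: print the minikube driver used in this article for a
# given OS name (defaults to the output of `uname -s`).
pick_minikube_driver() {
  os=${1:-$(uname -s)}
  case "$os" in
    Linux)  echo kvm2 ;;
    Darwin) echo hyperkit ;;
    *)      echo "no driver chosen for OS: $os" >&2; return 1 ;;
  esac
}
```

It can then be used in place of the hard-coded value, for example `minikube start --driver="$(pick_minikube_driver)" --cni=flannel --cpus=4 --memory=8000 -p=pixie-cluster`.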
It is possible to self-host Pixie, but for demo purposes, we will create a free account to access the metrics UI.
px auth login
px deploy-key create
export PIXIE_DEPLOY_KEY=<copy key from the command result>
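The helm install command that follows assumes the pixie-operator chart repository is already registered and that the deploy key is set. Here is a hedged sketch of both prerequisites. The repository URL is the one we recall from Pixie's Helm install docs, so treat it as an assumption and verify it; `add_pixie_repo` and `require_deploy_key` are hypothetical helper names.

```shell
# Assumption: this chart repository URL matches Pixie's current Helm docs.
add_pixie_repo() {
  helm repo add pixie-operator https://pixie-operator-charts.storage.googleapis.com
  helm repo update
}

# Hypothetical guard: fail early if the deploy key was never exported.
require_deploy_key() {
  [ -n "${PIXIE_DEPLOY_KEY:-}" ] || {
    echo "PIXIE_DEPLOY_KEY is not set" >&2
    return 1
  }
}
```

Running `require_deploy_key && add_pixie_repo` before the install gives a clearer error than a chart deployed with an empty key.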
helm install pixie pixie-operator/pixie-operator-chart --set deployKey=$PIXIE_DEPLOY_KEY --set clusterName=pixie-cluster --namespace pl --create-namespace
The installation might take a couple of minutes.
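To see when the installation has settled, we can inspect the pods in the `pl` namespace used by the helm command. The sketch below is a hypothetical helper that reads the output of `kubectl get pods --no-headers` on standard input and succeeds only if every pod is either Completed or Running with all of its containers ready.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` rows on stdin
# (columns: NAME READY STATUS ...). Succeeds only when at least one pod is
# listed and each one is Completed, or Running with READY equal to n/n.
all_pods_ready() {
  awk '
    {
      seen = 1
      if ($3 == "Completed") next
      split($2, ready, "/")
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}
```

A typical call would be `kubectl get pods -n pl --no-headers | all_pods_ready && echo "Pixie pods are ready"`, repeated until it succeeds.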
kubectl run --restart=Never --image=gcr.io/kuar-demo/kuard-amd64:blue kuard
Once the pod is ready, forward the port and open the web UI.
kubectl port-forward kuard 8080:8080
open http://localhost:8080/
It is possible to run a query directly from the command line, but we are going to go right into the live UI.
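For completeness, here is a rough sketch of the command-line route. It assumes the `px run` subcommand and the `px/http_data` script behave as we remember them, and it filters the output with plain `grep` rather than relying on any particular output format. `kuard_http_rows` is a hypothetical helper name.

```shell
# Hypothetical helper: run the HTTP data script through the Pixie CLI and
# keep only the lines that mention kuard.
kuard_http_rows() {
  px run px/http_data | grep kuard
}
```

If no traffic has reached the pod yet, `grep` finds nothing and the function returns a non-zero status.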
Navigate to
From the cluster menu, select your cluster.
Click on the script drop-down and select px/http_data.
In the destination filter, type kuard to filter only the traffic to our pod.
Refresh the kuard page a few times and rerun the script using the RUN button on the right-hand side of the Pixie UI.
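Instead of refreshing the page by hand, we can generate the requests with `curl`. This is a small sketch that assumes the port-forward from earlier is still running on localhost:8080; `generate_traffic` is a hypothetical helper, and the default of 20 requests is arbitrary.

```shell
# Hypothetical helper: send COUNT GET requests to URL, discarding the bodies.
# Usage: generate_traffic [URL] [COUNT]
generate_traffic() {
  url=${1:-http://localhost:8080/}
  count=${2:-20}
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null "$url" || return 1   # stop on the first failed request
    i=$((i + 1))
  done
}
```

After `generate_traffic http://localhost:8080/ 20`, rerunning the script in the Pixie UI should show the new requests.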
Feel free to explore the Pixie UI further and find the metrics you are interested in.
If you are considering Pixie for your workloads, there might be architectural considerations that you want to address. Here are a few facts about how Pixie works that might help address some of them.
Pixie is a CNCF sandbox project.
To learn more about Pixie, check out their web page, which has more examples and in-depth explanations, as well as their GitHub repo.