Troubleshooting applications deployed on Kubernetes is notoriously tough. Kubernetes is a complex system with limited visibility and many moving parts, and in production environments these issues compound to create difficult situations for any team.
While debugging applications is slightly easier than solving issues with K8s clusters, it’s hardly a straightforward task.
Most applications are built from microservices developed by separate teams, and DevOps and app development teams collaborate on the same cluster. The result is a lack of clarity and delineation of responsibilities. Ensuring that you don’t waste valuable resources debugging applications is, therefore, a key challenge if you deploy with Kubernetes.
Here are three ways of simplifying the K8s troubleshooting process.
The first step to efficiently debugging your application is to narrow down the problem. Specifically, is it an issue with your pods, replication controller, or service? Begin by taking a look at your pods by running:
kubectl describe pods ${POD_NAME}
Check whether all containers in the pod are “running” or whether any of them have recently restarted. Often, you’ll see a “pending” status along with a description of why the pod cannot be scheduled onto a node. A lack of resources is often the cause: you might have exhausted CPU or memory in your cluster, or the pod might request a hostPort that is already in use.
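If you suspect a resource shortage, one quick check, assuming you can list the nodes in your cluster, is:

kubectl describe nodes

The “Allocated resources” section of the output shows how much of each node’s CPU and memory is already requested by running pods, which tells you whether there is room left to schedule yours.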
Similarly, “waiting” statuses describe the issue in detail, and you can act accordingly. Often a pod will run but produce unexpected results, usually because a key in the manifest was mistyped or nested at the wrong level. Delete the pod and recreate it with the “--validate” option, which flags unrecognized fields. Then check whether the pod stored on the apiserver matches the manifest you intended to create.
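As a rough sketch of that workflow, assuming the manifest lives in a file called mypod.yaml and the pod is named mypod (both placeholder names):

kubectl delete pod mypod
kubectl create --validate -f mypod.yaml
kubectl get pod mypod -o yaml

The last command prints the pod exactly as the apiserver sees it, so you can compare it field by field against your manifest.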
Replication controller issues are relatively straightforward to debug since error messages describe the issue in detail. Debugging services is more convoluted. First, verify that endpoints exist for each service. You can do this by running the following command:
kubectl get endpoints ${SERVICE_NAME}
Check whether the number of endpoints matches the number of pods you expect to be members of the service. If endpoints are missing, try listing pods using the same labels the service uses in its selector. Work through these checks carefully to localize the issue; that makes it much easier to decide on your next steps.
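For example, if the service’s selector matches pods labeled app=hostnames (a placeholder label for illustration), you can list those pods and confirm they are running:

kubectl get pods -l app=hostnames

If the list is empty, or the pods aren’t in a Running and Ready state, the service has nothing to route to, which explains the missing endpoints.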
Often, you’ll find that an application is running but not behaving the way you expect. Retrieving information from a running application can be complicated, but there are a few options you can use.
For starters, you can use kubectl describe pod ${POD_NAME} to retrieve a ton of information related to a pod: its configuration, resource requests, current state, readiness, restart count, and recent events. Take note that “From” indicates the component that logged the event, “SubobjectPath” identifies the container within the pod, and “Reason” and “Message” describe the event itself. While the kubectl describe pod command doesn’t do the debugging for you, it helps you retrieve detailed information easily.
Follow the trail that the command’s output gives you, and you’ll manage to uncover issues in your application.
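If you only care about the event trail, you can also query events for a specific pod directly. As a sketch, assuming the pod is named mypod (a placeholder):

kubectl get events --field-selector involvedObject.name=mypod

This pulls the same events shown at the bottom of the describe output, which is handy when that output gets long.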
Debugging services deserves its own section, since every developer will inevitably run into an issue with them. First, make sure you’re accessing a service that actually exists.
Often, developers forget to create a service when deploying a pod and refer to service names that don’t exist. Typically, you’ll receive an error message like the one below:
Resolving hostnames (hostnames)... failed: Name or service not known.
wget: unable to resolve host address 'hostnames'
Use the “kubectl get svc hostnames” command to check whether the service exists. If your service exists but requests to it still fail, check whether DNS lookups for it work. Begin by looking up the service name from a pod in the same namespace. If the lookup returns nothing, your service and pod probably live in different namespaces. Adjust your app to use a namespace-qualified name or run your app and service in the same namespace.
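As a quick sketch, assuming a pod named mypod and a service named hostnames in the default namespace (both placeholders), and a container image that includes nslookup, the lookups look like this:

kubectl exec -it mypod -- nslookup hostnames
kubectl exec -it mypod -- nslookup hostnames.default

The first lookup only resolves from pods in the same namespace; the second uses the namespace-qualified name and should resolve from anywhere in the cluster.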
If none of these methods work, it’s safe to say DNS lookups aren’t working for your service, and it helps to check what else isn’t working. A lookup of the Kubernetes master service, kubernetes.default, should always work. Once you’ve confirmed it does, debug the DNS service itself using the same steps outlined previously.
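From the same placeholder pod as before, that check is:

kubectl exec -it mypod -- nslookup kubernetes.default

If even this lookup fails, the problem lies with cluster DNS or the pod’s DNS configuration rather than with your service.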
Assuming DNS works, test whether the service responds when you address it by its cluster IP. If it doesn’t, check whether the service is defined correctly, whether it has any endpoints, and whether the pods behind it are working. It seems trivial to say this, but errors in application services are often caused by simple mistakes at these levels.
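A minimal sketch of that check, again using the placeholder hostnames service and mypod pod, and assuming the service listens on port 80:

kubectl get svc hostnames
kubectl exec -it mypod -- wget -qO- http://<cluster-ip>:80

Read the ClusterIP from the first command and substitute it for <cluster-ip> in the second. A successful response tells you the service and its endpoints are wired up correctly, which narrows the problem down to DNS or the name you’re using.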
While the number of errors you’ll encounter when deploying an application on Kubernetes might be large, almost all of them have simple causes. From referencing non-existent services to defining them incorrectly, the range of simple mistakes you can make is huge. It’s helpful to take a step back and review your steps; more likely than not, you’ll eventually unearth the issue and manage to debug your application.