
Kafdrop

by Emil Koutanov, December 13th, 2019

Open-Source Web UI for Apache Kafka

As a messaging platform, Kafka needs no introduction. Since its inception, it has virtually rewritten the book on event streaming and has catalyzed the adoption of the now household design patterns — microservices, event-sourcing, and CQRS.

Being such a godsend, it almost gets away with its notorious lack of tooling. You’d be hard-pressed to find a developer who hasn’t at one time looked at the built-in CLI tools, cupped their face and uttered: “Is this it? Are you kidding me?”

What’s Out There

With the popularity of Kafka, it’s no surprise that several commercial vendors have jumped on the opportunity to monetize Kafka’s apparent lack of tooling by offering their own. Kafka Tool, Landoop, and KaDeck are some examples, but they’re all for personal use only unless you’re willing to pay. (And it’s not to say that you shouldn’t, au contraire, but that’s rather beside the point.)

Any non-trivial use in a commercial setting would be a violation of their licensing terms. It’s one thing using them at home for tutorials or personal projects; when you’re using a commercial tool without the appropriate license, you are putting your employer at risk of litigation and playing Russian Roulette with your career.

But what about open-source?

When it comes to Kafka topic viewers and web UIs, the go-to open-source tool is Kafdrop. With 800K Docker pulls at the time of writing, and growing at a rate of 10K pulls/day, there aren’t many Kafka tools that have enjoyed this level of adoption. And there’s a reason behind that: Kafdrop does an amazing job of filling the apparent gaps in the observability tooling of Kafka, solving problems that the community has been pointing out for too long.

Kafdrop is an Apache 2.0 licensed project, like Apache Kafka itself, so it won’t cost you a penny. If you haven’t used it yet, you probably ought to. So let’s take a deeper look.

What Can It Do?

  • View Kafka brokers — topic and partition assignments, and controller status
  • View topics — partition count, replication status, and custom configuration
  • Browse messages — JSON, plain text and Avro encoding
  • View consumer groups — per-partition parked offsets, combined and per-partition lag
  • Create new topics
  • View ACLs

Getting Started

The Kafdrop web UI project is hosted on GitHub at obsidiandynamics/kafdrop.

There are a couple of options at your disposal. You could show a little bravery by cloning the repository and building from source. It’s a Java (JDK 11) Spring Boot project, and you can build it with a single Maven command, provided you have the JDK installed.

If you want to go down this path, the repo’s README.md file will guide you through the steps. For now, let's take the easy way: Docker. (I sure would.)

Launching Kafdrop

Docker images are hosted on DockerHub. Images are tagged with the Kafdrop release number. The latest tag points to the latest stable release.

To launch the container in the foreground, run the following command:

docker run -it --rm -p 9000:9000 \
    -e KAFKA_BROKERCONNECT=<host:port,host:port> \
    obsidiandynamics/kafdrop

The KAFKA_BROKERCONNECT environment variable must be set to the bootstrap list of brokers.
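If you assemble the broker list in a script before handing it to Docker, a quick sanity check can save a confusing startup failure. The sketch below is not part of Kafdrop; the function name parse_broker_connect is hypothetical, and it assumes plain host:port entries separated by commas (no IPv6 literals).

```python
from typing import List, Tuple


def parse_broker_connect(value: str) -> List[Tuple[str, int]]:
    """Split a 'host:port,host:port' bootstrap list into (host, port) pairs.

    Hypothetical helper for illustration; raises ValueError on malformed input.
    """
    brokers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            raise ValueError("empty broker entry in bootstrap list")
        # Split on the last colon so the port is always the trailing part.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        brokers.append((host, int(port)))
    return brokers
```

Calling parse_broker_connect("kafka:29092") on the value used in the sandbox stack later in this article yields a single ("kafka", 29092) pair.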

That’s it. We should be up and running. Once it starts, you can launch the Kafka web UI by navigating to localhost:9000.

Note: The above example assumes an unauthenticated connection over a plaintext TCP socket. If your cluster is configured to use authentication and/or transport-level encryption, consult the README.md for connection options. It's actually surprisingly easy to configure Kafdrop for a SASL/SSL locked-down cluster.

Running in a Kafka sandbox

Don’t have a Kafka broker running? No worries. Just use the following docker-compose.yaml file to bring up a Kafka + Kafdrop stack:

version: "2"
services:
  kafdrop:
    image: obsidiandynamics/kafdrop
    restart: "no"
    ports:
      - "9000:9000"
    environment:
      KAFKA_BROKERCONNECT: "kafka:29092"
    depends_on:
      - "kafka"
  kafka:
    image: obsidiandynamics/kafka
    restart: "no"
    ports:
      - "2181:2181"
      - "9092:9092"
    environment:
      KAFKA_LISTENERS: "INTERNAL://:29092,EXTERNAL://:9092"
      KAFKA_ADVERTISED_LISTENERS: "INTERNAL://kafka:29092,EXTERNAL://localhost:9092"
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: "INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT"
      KAFKA_INTER_BROKER_LISTENER_NAME: "INTERNAL"

Now launch the stack with docker-compose up. Once started, browse to localhost:9000.

Navigating the Kafka Web UI

Browsing the Cluster

The Cluster Overview screen is the landing page of the web UI.

You get to see the overall layout of the cluster: the individual brokers that make it up, their addresses and some key broker stats, namely whether each broker is a controller and the number of partitions it owns. The latter is quite important. As your cluster size and the number of topics (and therefore partitions) grow, you generally want to see an approximately level distribution of partitions across the cluster.
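To put a number on "approximately level", you could jot down the per-broker partition counts from the overview and compute a simple spread. This is a minimal sketch, not something Kafdrop does for you; the imbalance function and its (max - min) / mean metric are my own choice for illustration.

```python
from statistics import mean
from typing import Dict


def imbalance(partitions_per_broker: Dict[int, int]) -> float:
    """Return (max - min) / mean of the per-broker partition counts.

    0.0 means a perfectly level distribution; larger values mean more skew.
    """
    counts = list(partitions_per_broker.values())
    if not counts:
        raise ValueError("no brokers given")
    avg = mean(counts)
    if avg == 0:
        return 0.0  # no partitions anywhere, so nothing is skewed
    return (max(counts) - min(counts)) / avg


# Broker ID -> number of partitions, as read off the Cluster Overview.
print(imbalance({1: 34, 2: 33, 3: 33}))
```

What counts as too much skew depends on your workload, so treat any threshold as a judgment call rather than a rule.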

Next is the Topics List, which in most cases is what you’re really here for. Any reasonably-sized microservices-based ecosystem might have hundreds, if not thousands of topics. As you’d expect, the list is searchable. The stats displayed alongside each topic are fairly ho-hum. The one worth noting is the under-replicated column.

Essentially, it’s telling us the number of partition replicas that have fallen behind the primary. Zero is a good figure. Anything else is indicative of either a broker or a network issue that requires immediate attention.

Note: Kafdrop is a discovery and exploration tool; it is not a real-time monitoring tool. You should instrument your brokers and raise alerts when things go awry.

Listing Topics

Click on a topic on the list to get to the Topic Overview screen.

The screen is subdivided into four sections.

On the top-left, there is a summary of the topic stats — a handy view, not dissimilar to what you would have seen in the cluster overview.

On the top-right, you can view the custom configuration. If the topic runs a stock-standard config, there’s nothing to see here. Had the configuration been overridden, you’d see a set of custom values instead.

The bottom-left section enumerates the partitions. The partition indexes are links; clicking through will reveal the first 100 messages in that partition.

There are several interesting parameters displayed in this section:

  • Partition ID: The zero-based index of the partition within its encompassing topic.
  • First offset: The offset of the earliest message in the partition. If the partition is empty, the first offset is the same as the high-water mark.
  • Last offset: The high-water mark of the partition, i.e. the offset that will be assigned to the next published message.
  • Size: The number of messages in the partition.
  • Leader Node: The ID of the broker node that is presently acting as the leader.
  • Replica Nodes: The IDs of all broker nodes that hold a replica of the partition. This includes the leader ID.
  • In-sync Replica Nodes: The IDs of replica nodes that are in-sync, inclusive of the leader node.
  • Offline Replica Nodes: The IDs of replica nodes that are currently offline. Under healthy conditions, this should be an empty set.
  • Preferred Leader: Whether the current leader node happens to be the preferred one.
  • Under-replicated: Whether the partition is under-replicated, i.e. there is at least one replica that is not in sync with the primary.
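Several of these fields are derived from the others, and it can help to see the relationships spelled out. The sketch below is a hypothetical model of one row of the table, not Kafdrop's actual code; the PartitionInfo class and its field names are assumptions. It relies on the Kafka convention that the preferred leader is the first broker in the replica assignment list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionInfo:
    """Hypothetical model of one partition row on the Topic Overview screen."""
    partition_id: int
    first_offset: int            # offset of the earliest retained message
    last_offset: int             # high-water mark (offset of the next message)
    leader: int                  # broker ID of the current leader
    replicas: List[int]          # all replica broker IDs, in assignment order
    in_sync_replicas: List[int]  # replicas currently in sync, leader included

    @property
    def size(self) -> int:
        # Offset range; an empty partition has first_offset == last_offset.
        return self.last_offset - self.first_offset

    @property
    def is_preferred_leader(self) -> bool:
        # The preferred leader is the first broker in the assignment list.
        return bool(self.replicas) and self.leader == self.replicas[0]

    @property
    def is_under_replicated(self) -> bool:
        # At least one assigned replica is missing from the in-sync set.
        return len(self.in_sync_replicas) < len(self.replicas)


p = PartitionInfo(0, 100, 250, leader=2, replicas=[1, 2, 3], in_sync_replicas=[1, 2])
print(p.size, p.is_preferred_leader, p.is_under_replicated)
```

Bear in mind that the offset range is only an approximation of the message count on compacted topics, where offsets can have gaps.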

The consumers section on the bottom-right lists the consumer group names as well as their aggregate lag (the sum of all individual partition lags).

Viewing Consumer Groups

Clicking on the consumer group on the Topic Overview gets you into the Consumer View. This screen provides a comprehensive breakdown of a single consumer group.

The view is sectioned by topic. For each topic, a separate table lists the underlying partitions. Against each partition, we see the committed offset, which we can compare against the first and last offsets to see how our consumer is tracking. Conveniently, Kafdrop displays the computed lag for each partition, which is aggregated at the footer of each topic table.
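The arithmetic behind the lag columns is simple enough to sketch. The snippet below assumes lag is the distance between a partition's last offset (the high-water mark) and the group's committed offset; the function names are hypothetical and this is an illustration of the idea rather than Kafdrop's implementation.

```python
from typing import Dict, Tuple


def partition_lag(last_offset: int, committed_offset: int) -> int:
    """Messages published but not yet committed by the group (never negative)."""
    return max(last_offset - committed_offset, 0)


def topic_lag(partitions: Dict[int, Tuple[int, int]]) -> Tuple[Dict[int, int], int]:
    """Take {partition_id: (last_offset, committed_offset)}.

    Returns the per-partition lags and their sum (the combined lag).
    """
    lags = {
        pid: partition_lag(last, committed)
        for pid, (last, committed) in partitions.items()
    }
    return lags, sum(lags.values())


per_partition, combined = topic_lag({0: (250, 240), 1: (300, 300), 2: (120, 95)})
print(per_partition, combined)
```

The same sum, taken across a group's partitions, is what you would expect to see as the aggregate lag in the consumers section of the Topic Overview.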

Note: Some amount of lag is unavoidable. For every message published, there will invariably be a quantum of time between the point of publishing and the point of consumption. In Kafka, this period is usually in the order of tens or hundreds of milliseconds, depending on both the producer and consumer client options, network configuration, broker I/O capabilities, the size of the pagecache and a myriad of other factors.

What you need to look out for is growing lag — suggesting that the consumer is either unable to keep up or has stalled altogether. In the latter case, you’ll also notice that the lag isn’t being depleted even when the producer is idling. This is when you’ll need to ALT-TAB away from Kafdrop into your favourite debugger.
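If you refresh the Consumer View a few times and note the numbers, the distinction between "can't keep up" and "stalled" can be expressed as a rough rule of thumb. The classify_consumer function below is a hypothetical sketch with deliberately crude rules; it is no substitute for proper monitoring.

```python
from typing import List, Tuple


def classify_consumer(samples: List[Tuple[int, int]]) -> str:
    """Classify one partition from chronological (committed_offset, lag) samples.

    Rough heuristic for illustration only:
      - 'stalled': there is lag, but the committed offset has not moved
      - 'falling behind': the consumer is progressing, yet lag has grown
      - 'keeping up': anything else
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to see a trend")
    first_committed, first_lag = samples[0]
    last_committed, last_lag = samples[-1]
    if last_lag > 0 and last_committed == first_committed:
        return "stalled"
    if last_lag > first_lag:
        return "falling behind"
    return "keeping up"


print(classify_consumer([(100, 5), (100, 5), (100, 5)]))
print(classify_consumer([(100, 5), (150, 20)]))
print(classify_consumer([(100, 5), (200, 3)]))
```

The samples should be spaced far enough apart to ride out the unavoidable short-term lag described in the note above.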

Viewing Messages

The Message View screen is the coveted topic viewer that has in all likelihood brought you here. You can get to the message view in one of two ways:

  1. Click the View Messages button on the Topic Overview screen.
  2. Click the individual partition link in the Topic Overview.

It’s exactly what you’d expect — a chronologically-ordered list of messages (or records, in Kafka parlance) for a chosen partition.

Each entry conveniently displays the offset, the record key (if one is set), the timestamp of publication, and any headers that may have been appended by the producer.

There’s another little trick up Kafdrop’s sleeve. If the message happens to be a valid JSON document, the topic viewer can nicely format it. Click on the green arrow on the left of the message to expand it.
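The "format it if it parses" behaviour is easy to picture in code. Here is a minimal sketch of the idea using Python's standard json module; the format_message function is hypothetical and says nothing about how Kafdrop implements it internally.

```python
import json


def format_message(value: str) -> str:
    """Pretty-print the value if it is valid JSON, otherwise return it as-is."""
    try:
        doc = json.loads(value)
    except json.JSONDecodeError:
        return value  # plain text (or anything else) is shown untouched
    return json.dumps(doc, indent=2, ensure_ascii=False)


print(format_message('{"orderId": 42, "status": "SHIPPED"}'))
print(format_message("just a plain text message"))
```

Falling back to the raw value keeps plain-text messages readable in the same viewer.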

In Conclusion

The more you use Kafka, the more you come to discover and appreciate its true potential — not just as a versatile event streaming platform, but as a general-purpose messaging middleware that lets you assemble complex business systems from asynchronous, loosely coupled services.

You’ll invariably experience the frustrations that are to be expected from a technology that has only recently entered the mainstream, by comparison with the more mature MQ brokers of yore. What’s reassuring is that the Open Source community hasn’t stood still, producing an evolving ecosystem, complete with the documentation and tooling necessary for us to get on with our jobs. The least we can do in return is raise a pull request once in a while or maybe answer a StackOverflow question or three.

Was this article useful to you? I’d love to hear your feedback, so don’t hold back! If you are interested in Kafka or event streaming, or just have any questions, follow me on Twitter.