A small introduction to Kafka and the use cases it could fulfill for Vaamo

To keep our architecture up to date with industry standards and to make sure our system can cope with the increasing requirements of a growing system (and user base), we are always looking into tools that could help us accomplish our goals better, faster, and more robustly. One such tool that has established its place in the industry in recent years is Kafka.

Kafka is an open source, distributed, publish-subscribe messaging system. In a nutshell, this means it runs as a cluster of nodes, with producers writing messages to it and consumers reading messages from it. This is cool because it brings several guarantees. Because messages are replicated within the cluster, individual Kafka nodes can go down without messages getting lost. Because messages are stored on disk for a configurable amount of time, message recipients can also be down without missing messages, as long as they come back within that retention period. Furthermore, in contrast to push-based communication between components (e.g. HTTP calls), message recipients choose when to fetch new messages, which protects them against getting overwhelmed by huge numbers of messages.
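
To make the last two points more tangible, here is a toy, in-memory sketch of a retained log that consumers pull from. It is not Kafka's API: RetainedLog, SlowConsumer and PullDemo are hypothetical names made up for this illustration, and replication and disk storage are left out entirely.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy model only (hypothetical, not Kafka's API): an append-only log that
// keeps every message around after it has been written.
final class RetainedLog[A] {
  private val messages = ArrayBuffer.empty[A]

  // Producers only ever append; the return value is the new message's offset.
  def append(message: A): Long = {
    messages += message
    messages.size - 1L
  }

  // Consumers pull: they name the offset to start from and the maximum
  // number of messages they are willing to take in one go.
  def fetch(fromOffset: Long, maxMessages: Int): Vector[A] =
    messages.slice(fromOffset.toInt, fromOffset.toInt + maxMessages).toVector
}

// Hypothetical consumer that reads at its own pace.
final class SlowConsumer[A](log: RetainedLog[A], batchSize: Int) {
  // The read position lives with the consumer, not with the log.
  private var nextOffset = 0L

  def poll(): Vector[A] = {
    val batch = log.fetch(nextOffset, batchSize)
    nextOffset += batch.size
    batch
  }
}

object PullDemo {
  def main(args: Array[String]): Unit = {
    val log = new RetainedLog[String]
    // The consumer does not exist yet ("is down") while these are produced ...
    (1 to 5).foreach(i => log.append(s"event-$i"))
    // ... and still gets all of them later, at most two per poll.
    val consumer = new SlowConsumer(log, batchSize = 2)
    println(consumer.poll())
    println(consumer.poll())
    println(consumer.poll())
  }
}
```

The consumer decides how much it takes per poll, so a burst of a million messages on the producing side never turns into a million simultaneous deliveries on the receiving side.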

Now, you surely wonder: How do I use this cool stuff? Essentially, it is as easy as this:

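import scala.util.Random
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// props (broker address, serializers) and topic are assumed to be defined during setup
val rnd = new Random()
val producer = new KafkaProducer[String, String](props)
for (nEvents <- 0 until 1000000) {
  val randomNumber = rnd.nextInt(255).toString
  val msg = s"Event number $nEvents: I created the random number $randomNumber"
  // The random number doubles as the message key
  val data = new ProducerRecord[String, String](topic, randomNumber, msg)

  producer.send(data) // asynchronous; records are batched in the background
}
producer.close() // waits for buffered records to be sent before shutting down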

And boom, you just sent a million messages to your Kafka cluster. (There is of course some setup involved, but you can see the details of that here.)
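
As a rough idea of what that setup can look like, here is a minimal sketch of producer properties. The broker address localhost:9092 and the names ProducerSetup and producerProps are assumptions for illustration; the configuration keys and serializer class names are the standard ones from the Kafka Java client.

```scala
import java.util.Properties

object ProducerSetup {
  // Assumption: a broker is reachable on localhost:9092; adjust for a real cluster.
  def producerProps(bootstrapServers: String = "localhost:9092"): Properties = {
    val props = new Properties()
    // Initial broker(s) the client contacts to discover the rest of the cluster
    props.setProperty("bootstrap.servers", bootstrapServers)
    // Keys and values are plain Strings, so use the String serializer for both
    props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    // Only treat a write as successful once all in-sync replicas have it
    props.setProperty("acks", "all")
    props
  }
}
```

The result of ProducerSetup.producerProps() could be passed as the props argument when constructing a KafkaProducer[String, String].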

Reading on the other side is similarly straightforward:

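import java.time.Duration
import java.util.Collections
import scala.collection.JavaConverters._ // on Scala 2.13+: scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer

// props (broker address, group id, deserializers) and topic are assumed to be defined during setup
val consumer = new KafkaConsumer[String, String](props)

consumer.subscribe(Collections.singletonList(topic))

while (true) {
  // Wait up to one second for new records (older clients take plain milliseconds: poll(1000))
  val records = consumer.poll(Duration.ofMillis(1000))
  for (record <- records.asScala) {
    println("Received message: (" + record.key() + ", " + record.value() + ")")
  }
}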
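
The consumer needs a bit of setup as well. Here is a minimal sketch of consumer properties, again assuming a broker on localhost:9092; ConsumerSetup, consumerProps and the group id used below are made-up names, while the configuration keys and deserializer class names are the standard ones from the Kafka Java client.

```scala
import java.util.Properties

object ConsumerSetup {
  // Assumption: a broker is reachable on localhost:9092; adjust for a real cluster.
  def consumerProps(groupId: String, bootstrapServers: String = "localhost:9092"): Properties = {
    val props = new Properties()
    props.setProperty("bootstrap.servers", bootstrapServers)
    // Consumers that share a group id split the topic's messages among themselves
    props.setProperty("group.id", groupId)
    // Must mirror the serializers used on the producing side
    props.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    props.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    // With no stored position for this group, start from the oldest retained message
    props.setProperty("auto.offset.reset", "earliest")
    props
  }
}
```

Calling ConsumerSetup.consumerProps("demo-consumer") would give a props object suitable for a KafkaConsumer[String, String].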


Now you probably wonder: What can Kafka do that other message queues cannot? The answer is pretty simple: speed. Kafka gives up certain conveniences, such as keeping track of the read status of each individual message or fancy configuration options, in exchange for blazing speed (and overall resource efficiency).


After learning what Kafka does, we looked into possible use cases for vaamo. In the short term, those could be feeding our reporting and data analytics solution with real-time data, asynchronous inter-service communication, and log/metrics aggregation across services. In the long run, we could look at Kafka as a scalable CommandBus or EventBus (if you wonder what those are, please watch this great talk by Raimo Radczewski). Kafka could then also enable all of our services to react to events that are published in response to user actions in our app.

Since this post just scratches the surface of the surface of what Kafka does, I can highly recommend this post by Kevin Sookocheff to understand how Kafka works under the hood.

Do you already have experience with using Kafka in production? What has it enabled you to do, and what caveats did you encounter? Or are you in a similar situation to ours, wondering whether Kafka is the right tool for the job? Let us know!