Poll batches of Kafka messages with confluent-kafka-python and ThreadPoolExecutor
This article assumes you know how to write Python code and understand how Apache Kafka consumer polling works.
The Challenge
I was tasked with building event-driven microservices that consume messages streamed by Debezium from our MySQL databases into Apache Kafka.
I am going to use Python with confluent-kafka-python for this. All the messages are stored in Avro format, and the library provides an Avro deserializer and a Schema Registry client, which make deserializing and serializing easier without the need to maintain the schema manually.
To speed up the consumption of Kafka messages, and ultimately the microservices themselves, the messages should ideally be polled in batches. Unfortunately, I am not able to retrieve messages in batches with the confluent-kafka-python consumer.
The Solution
We are going to use asyncio.gather() and ThreadPoolExecutor to poll a batch of messages from Apache Kafka.
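To make the idea concrete, here is a minimal sketch of the pattern rather than a finished implementation. It submits several blocking poll() calls to a thread pool and collects them with asyncio.gather(). The names FakeConsumer, poll_batch, and demo are hypothetical and exist only for illustration. FakeConsumer is a stand-in that mimics the poll(timeout) shape of a Kafka consumer so that the example runs without a broker. With a real consumer you would pass that object instead, assuming concurrent poll() calls on a single consumer are acceptable in your setup.

```python
import asyncio
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class FakeConsumer:
    """Hypothetical stand-in with the same poll(timeout) shape as a Kafka consumer."""

    def __init__(self, messages):
        self._messages = deque(messages)
        self._lock = threading.Lock()

    def poll(self, timeout=1.0):
        # Return the next message, or None when nothing is available,
        # similar to a consumer poll() that times out.
        with self._lock:
            return self._messages.popleft() if self._messages else None


async def poll_batch(consumer, executor, batch_size, timeout=1.0):
    """Run batch_size blocking poll() calls on the pool and gather the results."""
    loop = asyncio.get_running_loop()
    # Each poll() call blocks, so it runs in a worker thread, not on the event loop.
    futures = [
        loop.run_in_executor(executor, consumer.poll, timeout)
        for _ in range(batch_size)
    ]
    results = await asyncio.gather(*futures)
    # Drop the polls that returned nothing.
    return [message for message in results if message is not None]


def demo():
    consumer = FakeConsumer([f"msg-{i}" for i in range(5)])
    with ThreadPoolExecutor(max_workers=4) as executor:
        first = asyncio.run(poll_batch(consumer, executor, batch_size=3))
        second = asyncio.run(poll_batch(consumer, executor, batch_size=3))
    return first, second


if __name__ == "__main__":
    first_batch, second_batch = demo()
    print(len(first_batch), len(second_batch))
```

Because the polls run on separate threads, the order of messages inside a batch is not guaranteed to match the order in which they were fetched. If ordering matters for your service, sort the batch (for example by offset) before processing it. A real message object would also need its error state checked before use, which this sketch leaves out.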