Kafka Confluent Certified Administrator (CCOAK) Preparation

Krishna Chaitanya Sarvepalli
May 31, 2021

I could not find many resources while preparing for the Kafka Administrator certification. I have now cleared the exam and would like to share my preparation notes with the community, hoping they will help anyone who is looking for some initial guidance.

You might have already reviewed the study questions or preparation material posted on the Confluent website. That was my first stop for gauging how deeply we need to prepare for the exam. Please refer to this link: https://www.confluent.io/certification/

As for my own experience with Kafka, I have close to 4 years of handling production applications, which helped me understand the material in depth while preparing for the Kafka admin certification. At my company, we use Kafka heavily for all async-based systems, event processing and several other use cases. We have faced a handful of (good) issues, which taught me a lot about troubleshooting and optimizing producer, consumer and broker configurations.

My Preparation Journey:

  1. Kafka: The Definitive Guide: https://www.confluent.io/resources/kafka-the-definitive-guide-v2/ (free ebook with registration). This book covers security and ACLs in depth.
  2. Configuration parameters: https://docs.confluent.io/platform/current/installation/configuration/ Going through all the configs might be a little overwhelming, but you need to start by practicing the high-importance properties.
  3. Kafka Streams: https://www.manning.com/books/kafka-streams-in-action This book explains the KTable and KStream concepts cleanly. Although you won't get many questions on them in the exam, going through this book helped me understand Kafka Streams much better.
  4. Schema Registry, Kafka Connect and ksqlDB: we need a basic understanding of what each one does and how it is used. KSQL is built on top of Kafka Streams.

Udemy Courses:

If you search Udemy for Kafka, you will see a bunch of courses by Stephane Maarek. All of them are hands-on and will help in implementing the concepts. My company provides a Udemy Business account, so I explored almost all of the videos that Stephane has published, and they really helped me understand the material in detail.

Whether to go with the Udemy courses or the recommended books is more of a personal choice and depends on the time you have available. Personally, I went through both books (Definitive Guide and Streams in Action) and practiced with a few courses.

Udemy Practice Tests:

Only two practice tests are available on Udemy.

  1. By Stephane Maarek
  2. By Bhavuk Chawla (highly recommended)

Concepts to learn:

Apart from the basics, the following advanced concepts will help in clearing the exam.

  1. Broker Concepts:
  • What is the purpose of the __consumer_offsets topic?
  • What is the role of the Group Coordinator?
  • What is the role of the Controller?
  • Important broker configurations:
    replica.lag.time.max.ms
    leader.imbalance.check.interval.seconds
    auto.leader.rebalance.enable
    message.max.bytes
    log.retention.bytes vs log.segment.bytes
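
The __consumer_offsets and Group Coordinator questions are connected. To my understanding, the broker maps each consumer group onto one partition of __consumer_offsets by hashing the group id, and the leader of that partition acts as the group's coordinator. The sketch below imitates that mapping in plain Python. The function names are my own, 50 is the default for offsets.topic.num.partitions, and the hash mimics Java's String.hashCode, so treat it as an illustration rather than the broker's actual code.

```python
def java_string_hash(text: str) -> int:
    """Imitate Java's String.hashCode (signed 32-bit, over UTF-16 code units)."""
    data = text.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF  # keep 32 bits, like Java int overflow
    # Reinterpret the unsigned 32-bit value as a signed Java int.
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Partition of __consumer_offsets assumed to hold this group's offsets."""
    h = java_string_hash(group_id)
    # Assumption: mirrors Kafka's abs helper, which maps Integer.MIN_VALUE to 0.
    non_negative = 0 if h == -0x80000000 else abs(h)
    return non_negative % num_partitions


if __name__ == "__main__":
    print(offsets_partition_for("hello"))  # 22
```

The broker that leads the resulting partition is the one that handles commits, heartbeats and rebalances for that group.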

  2. Kafka Cluster Concepts:
  • What is the suggested approach to handling broker failures (replication factor, retries, etc.)?
  • What is ZooKeeper responsible for, and what kind of information does Kafka store in ZK? (This might become obsolete once the KIP-500 implementation is released.)
  • How do you update Kafka configurations dynamically, and where is that configuration stored?
  • What is zero-copy transfer?

  3. Producer Concepts:
  • Idempotent producer
  • How to achieve exactly-once semantics?
  • Transactions and committing to multiple topics
  • ACLs for specifying which operations are allowed on a topic
  • How do brokers handle messages that the producer sent with compression vs without compression?
  • Avro serialization: backward, forward and full compatibility
  • How to achieve maximum throughput on the producer side?
  • How to achieve minimum latency for real-time scenarios?
  • Important producer configurations:
    batch.size and linger.ms
    compression.type
    max.in.flight.requests.per.connection
    enable.idempotence
    acks = [0, 1, all] (very important)
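
To see why enable.idempotence prevents duplicates on retries, it helps to model what the partition leader does with sequence numbers. The class below is a deliberately simplified toy of my own, not broker code: it tracks one sequence number per producer id, while a real broker works on record batches and also checks the producer epoch.

```python
class PartitionLog:
    """Toy model of sequence-number deduplication on a partition leader."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer id -> last sequence number appended

    def append(self, producer_id: int, seq: int, value: str) -> str:
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            # A retry of something already written: acknowledge, do not append again.
            return "duplicate"
        if seq != last + 1:
            # A gap means an earlier send is missing; the real broker reports
            # an out-of-order sequence error in this situation.
            return "out_of_order"
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return "appended"


if __name__ == "__main__":
    log = PartitionLog()
    print(log.append(7, 0, "order-created"))  # appended
    print(log.append(7, 0, "order-created"))  # duplicate (producer retried)
    print(log.records)                        # ['order-created']
```

The retry in the usage example is acknowledged without being written twice, which is the behaviour that makes retries safe with acks=all.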

  4. Consumer Concepts:
  • Group Leader responsibilities
  • Manual commit: sync vs async, and why choose one over the other?
  • Consumer Rebalance Listener
  • Important consumer configurations:
    session.timeout.ms vs heartbeat.interval.ms vs max.poll.interval.ms
    auto.offset.reset = [earliest, latest, none]
  • How do we avoid duplicates for a slow-running consumer, meaning one whose business logic takes too much time?
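
The three consumer timeouts and the slow-consumer question fit together. If processing one poll() batch takes longer than max.poll.interval.ms, the consumer is removed from the group, its partitions are reassigned, and records it had not yet committed are processed again by another consumer. The check below is a rough sketch under my own assumptions: the function name and the per_record_ms estimate are hypothetical, and the one-third heartbeat rule is the guidance given in the Kafka configuration docs.

```python
def check_consumer_timing(session_timeout_ms: int,
                          heartbeat_interval_ms: int,
                          max_poll_interval_ms: int,
                          max_poll_records: int,
                          per_record_ms: int) -> list:
    """Return warnings for consumer settings that are likely to cause rebalances."""
    warnings = []
    if heartbeat_interval_ms * 3 > session_timeout_ms:
        warnings.append(
            "heartbeat.interval.ms should be at most one third of session.timeout.ms")
    # Worst case: a full batch where every record takes per_record_ms to process.
    worst_case_batch_ms = max_poll_records * per_record_ms
    if worst_case_batch_ms >= max_poll_interval_ms:
        warnings.append(
            "a full batch may take %d ms, exceeding max.poll.interval.ms; "
            "lower max.poll.records or raise max.poll.interval.ms" % worst_case_batch_ms)
    return warnings


if __name__ == "__main__":
    # 500 records at roughly 1 second each against a 5 minute poll interval.
    for warning in check_consumer_timing(10000, 3000, 300000, 500, 1000):
        print(warning)
```

In the usage example a full batch would need about 500 seconds, so either a smaller max.poll.records or a larger max.poll.interval.ms is needed to stay in the group.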

It was a long journey with multiple reschedules (I was a little scared), but with support from my family I was able to clear the Kafka administration certification.

Please let me know your feedback in the comments, and tell me if you are interested in learning more or want me to cover any other information. I am happy to take feedback and update any section.

Krishna Chaitanya Sarvepalli

Solution Architect @TSYS. Good @ Java, Kubernetes, Kafka, AWS cloud, DevOps, architecture and complex problems