Apache Kafka Knowledge Quick Reference

Kafka Theory

Cluster
Rack
Broker

Every broker in Kafka is a “bootstrap server” which knows about all brokers, topics and partitions (metadata) that means Kafka client (e.g. producer, consumer etc) only need to connect to one broker in order to connect to entire cluster.
At all times, only one broker should be the controller, and one broker must always be the controller in the cluster

Topic

Kafka takes bytes as input without even loading them into memory (that’s called zero copy)
Brokers have defaults for all the topic configuration parameters

Partition

Topic can have one or more partition.
It is not possible to delete a partition of topic once created.
Order is guaranteed within the partition and once data is written into partition, its immutable!
If producer writes at 1 GB/sec and consumer consumes at 250MB/sec then requires 4 partition!

Segment

Partitions are made of segments (.log files)
At a time only one segment is active in a partition
log.segment.bytes=1 GB (default) Max size of a single segment in bytes
log.segment.ms=1 week (default) Time Kafka will wait before closing the segment if not full
Segment come with two indexes (files)
An offset to position index (.index file): Allows Kafka where to read to find a message
A timestamp to offset index (.timeindex file): Allows Kafka to find a message with a timestamp
log.cleanup.policy=delete (Kafka default for all user topics) Delete data based on age of data (default is 1 week)
log.cleanup.policy=compact Delete based on keys of your messages. Will delete old duplicate keys after the active segment is committed. (Kafka default for topic __consumer_offsets)
Log cleanup happen on partition segments. Smaller/more segments means the log cleanup will happen more often!
The cleaner checks for work every 15 seconds (log.cleaner.backoff.ms)
log.retention.hours= 1 week (default) number of hours to keep data for
log.retention.bytes = -1 (infinite default) max size in bytes for each partition
Old segments will be deleted based on log.retention.hours or log.retention.bytes rule
The offset of message is immutable.
Deleted records can still be seen by consumers for a period of delete.retention.ms=24 hours (default)

Offset

Partition is having its own offset starting from 0.

Topic Replication

Replication factor = 3 and partition = 2 means there will be total 6 partition distributed across Kafka cluster. Each partition will be having 1 leader and 2 ISR (in-sync replica).
Broker contains leader partition called leader of that partition and only leader can receive and serve data for partition.
Replication factor can not be greater then number of broker in the Kafka cluster. If topic is having a replication factor of 3 then each partition will leave on 3 different brokers.

Producer

Automatically recover from errors: LEADER_NOT_AVAILABLE, NOT_LEADER_FOR_PARTITION, REBALANCE_IN_PROGRESS
Non retriable errors: MESSAGE_TOO_LARGE
When produce to a topic which doesn’t exist and auto.create.topic.enable=true then kafka creates the topic automatically with the broker/topic settings num.partition and default.replication.factor

Producer Acknowledgment

acks=0: Producer do not wait for ack (possible data loss)
acks=1: Producer wait for leader ack (limited data loss)
acks=all: Producer wait for leader+replica ack (no data loss)

acks=all must be used in conjunction with min.insync.replicas (can be set at broker or topic level)
min.insync.replica=2 implies that at least 2 brokers that are ISR(including leader) must acknowledge
e.g. replication.factor=3, min.insync.replicas=2, acks=all can only tolerate 1 broker going down, otherwise the producer will receive an exception NOT_ENOUGH_REPLICAS on send

Safe Producer Config

min.insync.replicas=2 (set at broker or topic level)
retries=MAX_INT: number of reties by producer in case of transient failure/exception. (default is 0)
max.in.flight.per.connection number=5: number of producer request can be made in parellel (default is 5)
acks=all
enable.idempotence=true: producer send producerId with each message to identify for duplicate msg at kafka end. When kafka receives duplicate message with same producerId which it already committed. It do not commit it again and send ack to producer (default is false)

High Throughput Producer using compression and batching

compression.type=snappy: value can be none(default), gzip, lz4, snappy. Compression is enabled at the producer level and doesn’t require any config change in broker or consumer Compression is more effective in case of bigger batch of messages being sent to kafka
linger.ms=20:Number of millisecond a producer is willing to wait before sending a batch out. (default 0). Increase linger.ms value increase the chance of batching.
batch.size=32KB or 64KB: Maximum number of bytes that will be included in a batch (default 16KB). Any message bigger than the batch size will not be batched

Message Key

Producer can choose to send a key with message.
If key= null, data is send in round robin
If key is sent, then all message for that key will always go to same partition. This can be used to order the messages for a specific key since order is guaranteed in same partition.
Adding a partition to the topic will loose the guarantee of same key go to same partition.
Keys are hashed using “murmur2” algorithm by default.

Consumer

Per thread one consumer is the rule. Consumer must not be multi threaded.
records-lag-max (monitoring metrics) The maximum lag in terms of number of records for any partition in this window. An increasing value over time is your best indication that the consumer group is not keeping up with the producers.

Consumer Group
Consumer Offset

When consumer in a group has processed the data received from Kafka, it commits the offset in Kafka topic named _consumer_offset which is used when a consumer dies, it will be able to read back from where it left off.

Delivery Semantics

At most once : Offset are committed as soon as message batch is received. If the processing goes wrong, the message will be lost (it won’t be read again)
At least once (default): Offset are committed after the message is processed.If the processing goes wrong, the message will be read again. This can result in duplicate processing of message. Make sure your processing is idempotent. (i.e. re-processing the message won’t impact your systems). For most of the application, we use this and ensure processing are idempotent.
Exactly once: Can only be achieved for Kafka=>Kafka workflows using Kafka Streams API. For Kafka=>Sink workflows, use an idempotent consumer.

Consumer Offset commit strategy

enable.auto.commit=true & synchronous processing of batches**: with auto commit, offset will be committed automatically for you at regular interval (auto.commit.interval.ms=5000 by default) every time you call .poll(). If you don’t use synchronous processing, you will be in “at most once” behavior because offsets will be committed before your data is processed.
enable.auto.commit=false & manual commit of offsets (recommended)

Consumer Offset reset behavior

auto.offset.reset=latest: will read from the end of the log
auto.offset.reset=earliest: will read from the start of the log
auto.offset.reset=none: will throw exception of no offset is found
Consumer offset can be lost if hasn’t read new data in 7 days. This can be controlled by broker setting offset.retention.minutes

Consumer Poll Behaviour

fetch.min.bytes = 1 (default): Control how much data you want to pull at least on each request. Help improving throughput and decreasing request number. At the cost of latency.
max.poll.records = 500 (default): Controls how many records to receive per poll request. Increase if your messages are very small and have a lot of available RAM.
max.partition.fetch.bytes = 1MB (default): Maximum data returned by broker per partition. If you read from 100 partition, you will need a lot of memory (RAM)
fetch.max.bytes = 50MB (default): Maximum data returned for each fetch request (covers multiple partition). Consumer performs multiple fetches in parallel.

Consumer Heartbeat Thread

Heartbeat mechanism is used to detect if consumer application in dead.
session.timeout.ms=10 seconds (default) If heartbeat is not sent in 10 second period, the consumer is considered dead. Set lower value to faster consumer rebalances
heartbeat.interval.ms=3 seconds (default) Heartbeat is sent in every 3 seconds interval. Usually 1/3rd of session.timeout.ms

Consumer Poll Thread

Poll mechanism is also used to detect if consumer application is dead.
max.poll.interval.ms = 5 minute (default) Max amount of time between two .poll() calls before declaring consumer dead. If processing of message batch takes more time in general in application then should increase the interval.

Kafka Guarantees

Messages are appended to a topic-partition in the order they are sent
Consumer read the messages in the order stored in topic-partition
With a replication factor of N, producers and consumers can tolerate upto N-1 brokers being down
As long as number of partitions remains constant for a topic ( no new partition), the same key will always go to same partition

Client Bi-Directional Compatibility

an Older client (1.1) can talk to Newer broker (2.0)
a Newer client (2.0) can talk to Older broker (1.1)

Kafka Connect

Source connect: Get data from common data source to Kafka
Sink connect: Publish data from Kafka to common data source

Zookeeper

ZooKeeper servers will be deployed on multiple nodes. This is called an ensemble. An ensemble is a set of 2n + 1 ZooKeeper servers where n is any number greater than 0. The odd number of servers allows ZooKeeper to perform majority elections for leadership.
At any given time, there can be up to n failed servers in an ensemble and the ZooKeeper cluster will keep quorum. If at any time, quorum is lost, the ZooKeeper cluster will go down.
In Zookeeper multi-node configuration, initLimit and syncLimit are used to govern how long following ZooKeeper servers can take to initialize with the current leader and how long they can be out of sync with the leader.
If tickTime=2000, initLimit=5 and syncLimit=2 then a follower can take (tickTimeinitLimit) = 10000ms to initialize and may be out of sync for up to (tickTimesyncLimit) = 4000ms
In Zookeeper multi-node configuration, The server.* properties set the ensemble membership. The format is server.=::. Some explanation:
myid is the server identification number. In this example, there are three servers, so each one will have a different myid with values 1, 2, and 3 respectively. The myid is set by creating a file named myid in the dataDir that contains a single integer in human readable ASCII text.
This value must match one of the myid values from the configuration file. If another ensemble member has already been started with a conflicting myid value, an error will be thrown upon startup.
leaderport is used by followers to connect to the active leader. This port should be open between all ZooKeeper ensemble members.
electionport is used to perform leader elections between ensemble members. This port should be open between all ZooKeeper ensemble members.

Kafka CLI

Start a zookeeper at default port 2181

1
bin/zookeeper-server-start.sh config/zookeeper.properties

Start a kafka server at default port 9092

1
bin/kafka-server-start.sh config/server.properties

Create a kafka topic with name my-first-topic

1
2
3
4
5
6
bin/kafka-topics.sh `
  --zookeeper localhost:2181 `
  --topic my-first-topic `
  --create `
  --replication-factor 1 `
  --partitions 1

List all kafka topics

1
2
3
bin/kafka-topics.sh `
  --zookeeper localhost:2181 `
  --list

Describe kafka topic my-first-topic

1
2
3
4
bin/kafka-topics.sh `
  --zookeeper localhost:2181 `
  --topic my-first-topic `
  --describe

Delete kafka topic my-first-topic

1
2
3
4
bin/kafka-topics.sh `
  --zookeeper localhost:2181 `
  --topic my-first-topic `
  --delete

Note: This will have no impact if delete.topic.enable is not set to true

Find out all the partitions without a leader

1
2
3
4
bin/kafka-topics.sh `
  --zookeeper localhost:2181 `
  --describe `
  --unavailable-partitions

Produce messages to Kafka topic my-first-topic

1
2
3
4
5
6
7
8
bin/kafka-console-producer.sh `
  --broker-list localhost:9092 `
  --topic my-first-topic `
  --producer-property acks=all

>hello ashish
>learning kafka
>^C

Start Consuming messages from kafka topic my-first-topic

1
2
3
4
5
6
7
bin/kafka-console-consumer.sh `
  --bootstrap-server localhost:9092 `
  --topic my-first-topic `
  --from-beginning

>hello ashish
>learning kafka

Start Consuming messages in a consumer group from kafka topic my-first-topic

1
2
3
4
5
bin/kafka-console-consumer.sh `
  --bootstrap-server localhost:9092 `
  --topic my-first-topic `
  --group my-first-consumer-group `
  --from-beginning

List all consumer groups

1
2
3
bin/kafka-consumer-groups.sh `
  --bootstrap-server localhost:9092 `
  --list

Describe consumer group

1
2
3
4
bin/kafka-consumer-groups.sh `
  --bootstrap-server localhost:9092 `
  --describe `
  -group my-first-consumer-group

Reset offset of consumer group to replay all messages

1
2
3
4
5
6
7
8
bin/kafka-consumer-groups.sh `
  --bootstrap-server localhost:9092 `
  --describe `
  -group my-first-consumer-group `
  --reset-offsets `
  --to-earliest `
  --execute `
  --topic my-first-topic

Shift offsets by 2 (forward) as another strategy

1
2
3
4
5
6
7
bin/kafka-consumer-groups `
  --bootstrap-server localhost:9092 `
  --group my-first-consumer-group `
  --reset-offsets `
  --shift-by 2 `
  --execute `
  --topic my-first_topic

Shift offsets by 2 (backward) as another strategy

1
2
3
4
5
6
7
bin/kafka-consumer-groups `
  --bootstrap-server localhost:9092 `
  --group my-first-consumer-group `
  --reset-offsets `
  --shift-by -2 `
  --execute `
  --topic my-first_topic

Default Ports

Zookeeper:

Zookeeper: 2181
Zookeeper Leader Port: 3888
Zookeeper Election Port (Peer port): 2888

Kafka:

Broker: 9092
REST Proxy: 8082
Schema Registry: 8081
KSQL: 8088

Kafka Theory#

Kafka CLI#

Default Ports#

Kafka Theory

Kafka CLI

Default Ports