Overview
- A Kafka cluster is composed of multiple brokers (servers)
- Each broker is identified by a unique integer ID
- Each broker stores specific topic partitions
- Connecting to any broker (called a bootstrap broker) allows access to the entire cluster
- Kafka clients automatically discover all brokers using metadata
- A typical starting point: 3 brokers
- Large clusters may have 100+ brokers
- In the examples, broker IDs are in the 100s (101, 102, 103); the starting number is arbitrary

Example: Topic Distribution
- Topic-A: 3 partitions
- Topic-B: 2 partitions
- Partitions are spread across brokers, so a broker may hold no data for a given topic
- Example: Topic-B has only 2 partitions, so with 3 brokers at least one broker (here Broker 103) has no Topic-B data
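
The distribution above can be sketched with a toy model. This is only an illustration: the round-robin rule and the `assign_partitions` helper are assumptions made for the example, not Kafka's actual placement algorithm.

```python
from collections import defaultdict

def assign_partitions(topic, num_partitions, broker_ids):
    """Toy placement: partition i goes to broker i modulo the broker count."""
    return {
        f"{topic}-{p}": broker_ids[p % len(broker_ids)]
        for p in range(num_partitions)
    }

brokers = [101, 102, 103]
placement = {}
placement.update(assign_partitions("Topic-A", 3, brokers))
placement.update(assign_partitions("Topic-B", 2, brokers))

# Group partitions by the broker that hosts them.
by_broker = defaultdict(list)
for partition, broker in placement.items():
    by_broker[broker].append(partition)

for broker in brokers:
    print(broker, by_broker[broker])
```

With this rule, Broker 103 ends up with one Topic-A partition and nothing from Topic-B, matching the example.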

Kafka Broker Discovery
- Every broker is also called a bootstrap server
- You only need to connect to one broker — the client will discover the rest
- Each broker stores metadata about:
  - All brokers
  - All topics
  - All partitions
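
A minimal simulation of the discovery idea, assuming a simplified in-memory model. The `Broker` class, the host names, and the `discover` helper are hypothetical and are not part of any Kafka client API. In real clients, the initial broker list is supplied through the `bootstrap.servers` setting.

```python
from dataclasses import dataclass, field

@dataclass
class Broker:
    broker_id: int
    host: str
    # Every broker holds the same cluster-wide metadata (a shared dict here).
    cluster_metadata: dict = field(default_factory=dict)

    def metadata(self):
        """Stand-in for a metadata response from this broker."""
        return self.cluster_metadata

def build_cluster():
    metadata = {
        # Hypothetical host names, for illustration only.
        "brokers": {101: "kafka-1:9092", 102: "kafka-2:9092", 103: "kafka-3:9092"},
        "topics": {"Topic-A": [0, 1, 2], "Topic-B": [0, 1]},
    }
    return [Broker(bid, host, metadata) for bid, host in metadata["brokers"].items()]

def discover(bootstrap_broker):
    """Connect to one broker, then learn every broker ID from its metadata."""
    return sorted(bootstrap_broker.metadata()["brokers"])

cluster = build_cluster()
print(discover(cluster[1]))  # bootstrap via Broker 102
```

Whichever broker is used as the bootstrap broker, the client ends up with the same full list of brokers.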

Replication Factor
- Topics should have replication factor > 1 (commonly 2 or 3)
- This ensures high availability — if one broker goes down, others can serve the data
- Example:
  - Topic-A has 2 partitions
  - Replication factor = 2
- Example: Losing Broker 102
  - Brokers 101 and 103 still serve all data, because each partition has a replica on a surviving broker
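
The example can be checked with a small sketch. The consecutive-broker placement in `assign_replicas` is an assumption chosen to match the example, not Kafka's real replica assignment logic.

```python
def assign_replicas(num_partitions, replication_factor, broker_ids):
    """Toy placement: replicas of partition p go on consecutive brokers."""
    n = len(broker_ids)
    return {
        p: [broker_ids[(p + r) % n] for r in range(replication_factor)]
        for p in range(num_partitions)
    }

def available_partitions(replicas, live_brokers):
    """A partition stays available while at least one replica is on a live broker."""
    return {p for p, hosts in replicas.items() if any(b in live_brokers for b in hosts)}

# Topic-A: 2 partitions, replication factor 2, brokers 101-103.
replicas = assign_replicas(num_partitions=2, replication_factor=2,
                           broker_ids=[101, 102, 103])
print(replicas)

# Broker 102 goes down: both partitions still have a live replica.
print(available_partitions(replicas, live_brokers={101, 103}))
```

In this model Broker 102 holds a replica of both partitions, yet losing it leaves partition 0 on Broker 101 and partition 1 on Broker 103.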

Partition Leaders
- Only one broker can be the leader for a partition at a time
- Producers send data only to the leader broker of a partition
- Other brokers hold replicas of the partition; replicas that are caught up with the leader are called in-sync replicas (ISR)

- Consumers read from the leader broker by default
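
A rough sketch of the "one leader per partition" rule, assuming a simplified model in which a new leader is picked from the in-sync replicas when the current leader's broker is down. The `elect_leader` helper and the list ordering are illustrative assumptions, not Kafka's controller implementation.

```python
def elect_leader(isr, live_brokers):
    """Return the first in-sync replica on a live broker, or None if there is none."""
    for broker in isr:
        if broker in live_brokers:
            return broker
    return None

# In-sync replicas for one partition; the first entry is the preferred leader
# (a convention of this toy model).
isr = [102, 101]

# All brokers up: producers write to Broker 102.
print(elect_leader(isr, live_brokers={101, 102, 103}))

# Broker 102 is down: leadership moves to Broker 101.
print(elect_leader(isr, live_brokers={101, 103}))
```

At any moment the function returns at most one broker, so producers always have a single target for the partition.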

Reading from Closest Replica (Since Kafka 2.4)
- Consumers can be configured to read from the closest replica instead of the leader
- Benefits:
  - Reduced latency
  - Lower network costs (especially in cloud deployments)
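
The selection idea can be sketched as below. In a real deployment this feature is driven by the consumer setting `client.rack` together with the broker settings `broker.rack` and `replica.selector.class` (for example `org.apache.kafka.common.replica.RackAwareReplicaSelector`). The `pick_replica` function and the rack names here are hypothetical and only model the behaviour.

```python
def pick_replica(leader, replica_racks, client_rack=None):
    """Prefer a replica in the client's rack; otherwise fall back to the leader."""
    if client_rack is not None:
        for broker, rack in replica_racks.items():
            if rack == client_rack:
                return broker
    return leader

# Hypothetical rack labels for the brokers holding in-sync replicas of one partition.
replica_racks = {101: "us-east-1a", 102: "us-east-1b"}

print(pick_replica(101, replica_racks))                # no rack set: read from the leader
print(pick_replica(101, replica_racks, "us-east-1b"))  # same rack as Broker 102
print(pick_replica(101, replica_racks, "us-east-1c"))  # no replica in this rack: leader
```

A consumer without a rack, or in a rack with no replica, keeps the default behaviour of reading from the leader.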
