16/03/2026
System Design
ChaiCode
Suppose you are watching a cricket match, IND vs AUS in the 2023 World Cup Final :(. While checking the live score, you expect updates to appear in the correct sequence.
For example:
After 2 overs → IND 10/0
After 5 overs → IND 45/0
After 8 overs → IND 50/2
Now imagine if the updates appeared like this instead:
After 5 overs → IND 45/0
After 2 overs → IND 10/0
After 8 overs → IND 50/2
This would mean the events (live scores) are arriving out of order, which creates confusion for users.
Now consider that millions of people are watching the match live on platforms like Cricbuzz. The score updates are processed by multiple servers running in a distributed system. Despite this complexity, the score almost always appears in the correct order.
A simple analogy to understand multiple servers running in a distributed system:
Imagine a restaurant where thousands of customers arrive at once.
If only one waiter handled all the orders, everything would become slow and chaotic.
Instead, the restaurant uses many waiters working together so orders can be processed faster.
Similarly, apps like Cricbuzz use many computers working together to handle millions of live score requests.
So the question is:
How does Cricbuzz ensure that live score updates always appear in the correct order?
This is where tools like Apache Kafka come in.
Kafka is a distributed event streaming platform designed to handle large amounts of real-time data.
In simple terms, it acts like a central pipeline for events.
Whenever something happens in a system, for example:
a cricket ball is bowled
a run is scored
a wicket falls
that event is sent to Kafka.
Kafka then:
stores the events in order
processes them very quickly
delivers them to multiple services that need the data
This allows platforms like Cricbuzz to process millions of live score updates in real time while keeping the sequence correct.
Kafka mainly consists of three important parts: topics, producers, and consumers.
A topic is simply a name for a type of event.
For example, in a cricket scoring system we might have a topic like score_update.
Every time something happens in the match (a run, a wicket), a message related to the score is sent to this topic.
You can think of a topic like a timeline where related events are recorded.
A producer is the system that sends messages to Kafka.
In our example, the scoring system that records each ball could act as a producer.
Whenever a ball is completed, it sends a message like this:
```
Message {
  topic: "score_update",
  body: {
    over: 5.3,
    runs: 4,
    batsman: "Rohit"
  }
}
```
Kafka receives this message and stores it inside the score_update topic.
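The producer-to-topic flow above can be sketched in a few lines of Python. To keep the example self-contained, this uses a toy in-memory stand-in for a topic (an append-only log) rather than Kafka's real client API; the class names are illustrative.

```python
import json

class InMemoryTopic:
    """A toy stand-in for a Kafka topic: an append-only, ordered log."""
    def __init__(self, name):
        self.name = name
        self.log = []  # messages, in the order they arrived

    def append(self, message):
        self.log.append(message)

class ScoreProducer:
    """Plays the role of the scoring system sending ball-by-ball events."""
    def __init__(self, topic):
        self.topic = topic

    def send(self, over, runs, batsman):
        message = {"over": over, "runs": runs, "batsman": batsman}
        self.topic.append(json.dumps(message))

topic = InMemoryTopic("score_update")
producer = ScoreProducer(topic)
producer.send(5.3, 4, "Rohit")
producer.send(5.4, 0, "Rohit")

# The topic stores events in exactly the order the producer sent them.
print(topic.log)
```

The key property to notice is that the log preserves append order, which is what makes ball-by-ball sequencing possible downstream.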
Now imagine millions of people checking the score at the same time.
One machine processing all the messages would become slow.
So Kafka splits a topic into multiple partitions.
Each partition can be processed by a different machine, which allows the system to handle a large amount of data efficiently.
Example:
topic: score_update
[ partition 0, partition 1, partition 2 ]
Messages are distributed across these partitions.
Kafka often uses a partition key (such as match_id) so that all events for the same match go to the same partition. This ensures that ball-by-ball updates remain in order.
Partition = ordered sequence of messages
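The partition-key idea can be shown directly. Kafka's Java client hashes the key (with murmur2) and takes it modulo the partition count; the sketch below uses crc32 as a stand-in hash, and the match IDs and event strings are made up for illustration.

```python
import zlib

NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

def pick_partition(key: str) -> int:
    # Hash the key, then take it modulo the partition count.
    # (Kafka's Java client uses murmur2; crc32 is a stand-in here.)
    return zlib.crc32(key.encode()) % NUM_PARTITIONS

def send(match_id: str, event: str):
    partitions[pick_partition(match_id)].append((match_id, event))

# Two matches producing events at the same time.
send("IND_vs_AUS", "over 5.3: 4 runs")
send("ENG_vs_NZ", "over 1.2: wicket")
send("IND_vs_AUS", "over 5.4: 0 runs")

# Every IND_vs_AUS event hashes to the same partition,
# so within that partition the ball-by-ball order is preserved.
ind = pick_partition("IND_vs_AUS")
ind_events = [e for m, e in partitions[ind] if m == "IND_vs_AUS"]
print(ind_events)  # → ['over 5.3: 4 runs', 'over 5.4: 0 runs']
```

Because the same key always hashes to the same partition, ordering is guaranteed per key, not across the whole topic.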
A consumer is a system that reads events from Kafka.
For example:
a service that updates the live score API
a service that sends notifications
a service that calculates match statistics
Consumers subscribe to a topic to read its events.
When they connect to Kafka, they also provide a consumer group id.
Example:
consumerGroupId: "SEND_LIVE_SCORE"
topic: "score_update"
The consumer group tells Kafka that these consumers belong to the same processing group.
Kafka then distributes the partitions among the consumers in that group so that the work is shared.
Every message is available to all consumer groups.
Inside each consumer group, partitions are assigned to consumers so that only one consumer processes messages from a partition.
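The assignment rule above can be sketched as a simple round-robin distribution of partitions among the consumers of each group. (Real Kafka assignors, such as the range and round-robin strategies, handle rebalancing and are more involved; the group and consumer names here are illustrative.)

```python
def assign_partitions(partitions, consumers):
    """Round-robin: each partition goes to exactly one consumer in the group."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = [0, 1, 2]

# Two independent consumer groups: each group sees every partition,
# but inside a group each partition is owned by only one consumer.
live_score_group = assign_partitions(partitions, ["consumer-A", "consumer-B"])
notification_group = assign_partitions(partitions, ["consumer-X"])

print(live_score_group)    # → {'consumer-A': [0, 2], 'consumer-B': [1]}
print(notification_group)  # → {'consumer-X': [0, 1, 2]}
```

Note how both groups together receive all three partitions, while within each group no partition is shared, which is what lets Kafka parallelize work without breaking per-partition order.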
This is why Kafka is used in systems where maintaining event order while enabling parallel processing across multiple machines is critical.