Kafka
The following notes are all taken from reading HelloInterview:
When an event happens, the producer creates a message (also called a record) and sends it to a Kafka topic. Each message includes a required value field and three optional fields:
Key: Determines which partition the message goes to.
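Timestamp: Records when the message was created or appended to the log. Ordering within a partition is determined by each message's offset rather than by this field.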
Headers: Key-value pairs, similar to HTTP headers, used to store metadata about the message.
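As a rough illustration of the message structure described above, the record can be pictured as a small data class. This is only a sketch: `Message` and its field names are hypothetical and are not taken from any Kafka client library.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Message:
    """Hypothetical stand-in for a Kafka record (not a real client class)."""
    value: bytes                         # required payload
    key: Optional[bytes] = None          # optional: used to pick the partition
    timestamp_ms: Optional[int] = None   # optional: creation/append time in ms
    # optional: metadata as (name, value) pairs, similar to HTTP headers
    headers: List[Tuple[str, bytes]] = field(default_factory=list)


# Only the value is required; the other three fields can be left out.
minimal = Message(value=b"order-created")
full = Message(
    value=b"order-created",
    key=b"user-42",
    timestamp_ms=1_700_000_000_000,
    headers=[("trace-id", b"abc123")],
)
```

Modeling headers as a list of pairs rather than a dictionary is a design choice made here so that a header name could appear more than once.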
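Partition Assignment: Kafka assigns each message to a partition based on its key, typically by hashing the key and taking the result modulo the number of partitions. If a message has no key, the producer spreads messages across partitions using round-robin or another built-in strategy. Messages with the same key always go to the same partition (as long as the partition count does not change), which keeps them in order.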
Broker Selection: Using cluster metadata, the producer looks up which broker is responsible for that partition and then sends the message directly to that broker.
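The two steps above can be sketched in a few lines of Python. This is a simplified model, not the real producer: the function names, the broker addresses, and the `PARTITION_LEADERS` table are hypothetical, and `zlib.crc32` is only a stand-in for the hash a real client uses (the Java client uses murmur2).

```python
import zlib
from itertools import count
from typing import Optional

NUM_PARTITIONS = 3

# Hypothetical cluster metadata: partition number -> broker handling it.
PARTITION_LEADERS = {
    0: "broker-1:9092",
    1: "broker-2:9092",
    2: "broker-3:9092",
}

_round_robin = count()  # shared counter for messages that have no key


def choose_partition(key: Optional[bytes],
                     num_partitions: int = NUM_PARTITIONS) -> int:
    """Step 1: pick a partition from the key, or round-robin without one."""
    if key is None:
        return next(_round_robin) % num_partitions
    # Same key -> same hash -> same partition (while num_partitions is fixed).
    return zlib.crc32(key) % num_partitions


def choose_broker(partition: int) -> str:
    """Step 2: look up the broker for that partition in the metadata."""
    return PARTITION_LEADERS[partition]


partition = choose_partition(b"user-42")
broker = choose_broker(partition)
```

Because the partition is derived from the key with a modulo, changing the number of partitions would remap keys, which is why the same-key guarantee only holds while the partition count stays fixed.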
Cluster: Composed of multiple brokers. More brokers mean more capacity for storage and for handling clients.
Broker: A server (physical or virtual) that holds the "queue". It stores data and handles client requests.
Partition: An ordered, immutable sequence of messages, like a log file. Partitions are the key to scaling because they allow messages to be consumed in parallel.
Topic: A logical grouping of partitions, used for publishing and subscribing to data. Multiple producers can write to the same topic simultaneously.
Topic vs. partition: A topic is the logical organization of messages, while a partition is the physical organization. A topic's partitions can be spread across multiple brokers.
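To make the topic/partition relationship more concrete, here is a toy in-memory model. It assumes nothing about Kafka's actual storage format; `Partition` and `Topic` are hypothetical classes that only mirror the ideas in these notes: a partition is an append-only log addressed by offset, and a topic is a named group of such logs.

```python
from typing import List, Tuple


class Partition:
    """Append-only log: messages get increasing offsets and never change."""

    def __init__(self) -> None:
        self._log: List[bytes] = []

    def append(self, value: bytes) -> int:
        """Add a message at the end of the log and return its offset."""
        self._log.append(value)
        return len(self._log) - 1

    def read(self, offset: int) -> List[Tuple[int, bytes]]:
        """Return (offset, value) pairs starting from the given offset."""
        return [(i, self._log[i]) for i in range(offset, len(self._log))]


class Topic:
    """Logical grouping of partitions under one name."""

    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]


orders = Topic("orders", num_partitions=2)
orders.partitions[0].append(b"order-1")
orders.partitions[0].append(b"order-2")
orders.partitions[1].append(b"order-3")
```

Since each partition is an independent log, separate consumers could read `partitions[0]` and `partitions[1]` at the same time, which is the parallelism the notes refer to.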
Producers: Write data to topics.
Consumers: Read data from topics.
Kafka provides APIs for both but leaves message creation/processing to developers.
Message Queue: Consumers read each message and acknowledge it after they have processed it.
Stream: Consumers read and process messages without acknowledging each one, which allows more complex processing of the data.
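The difference between the two consumption styles can be illustrated with a toy example over a plain Python list. This is a sketch under simplifying assumptions, not Kafka's consumer API: the function names are hypothetical, the "log" is just a list, and acknowledgment is modeled as advancing a committed offset.

```python
from typing import Callable, Dict, List


def consume_as_queue(log: List[str], committed: int,
                     handle: Callable[[str], None]) -> int:
    """Queue style: handle each message, then acknowledge it."""
    for offset in range(committed, len(log)):
        handle(log[offset])
        committed = offset + 1  # acknowledge only after handling finished
    return committed


def consume_as_stream(log: List[str]) -> Dict[str, int]:
    """Stream style: read through the log and keep a running aggregate."""
    counts: Dict[str, int] = {}
    for message in log:
        counts[message] = counts.get(message, 0) + 1  # no per-message ack
    return counts


events = ["click", "view", "click"]

handled: List[str] = []
# Resume after the first message, which was acknowledged earlier.
new_committed = consume_as_queue(events, committed=1, handle=handled.append)

totals = consume_as_stream(events)
```

In the queue-style function the committed offset is what lets a restarted consumer pick up where it left off, while the stream-style function treats the log as data to be aggregated over.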