👨‍💻
Mike's Notes
Kafka

Last updated 5 months ago

The following notes are all taken from reading HelloInterview.

Overview

When an event happens, the producer creates a message (also called a record) and sends it to a Kafka topic. Each message includes a required value field and three optional fields:

  1. Key: Determines which partition the message goes to.

  2. Timestamp: Records when the message was created or appended to the log. Ordering within a partition itself comes from each message's offset.

  3. Headers: Key-value pairs, similar to HTTP headers, used to store metadata about the message.
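
As a minimal sketch of the record layout described above, the fields can be modeled as a plain Python dataclass. The class name `Record`, its field names, and the sample values are illustrative assumptions, not part of any Kafka client API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Record:
    """Illustrative model of a Kafka message; not a real client class."""
    value: bytes                                  # required payload
    key: Optional[bytes] = None                   # optional: drives partition choice
    timestamp_ms: Optional[int] = None            # optional: epoch milliseconds
    headers: List[Tuple[str, bytes]] = field(default_factory=list)  # optional metadata


# Hypothetical event: only the value is mandatory, the rest is filled in as needed.
order_event = Record(
    value=b'{"order_id": 42, "status": "created"}',
    key=b"customer-7",
    headers=[("source", b"checkout-service")],
)
```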

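Partition Assignment: The producer assigns each message to a partition based on its key, by default hashing the key and taking the result modulo the number of partitions. If a message has no key, the producer spreads messages across partitions using round-robin or another configured rule (newer clients use a "sticky" strategy that fills a batch for one partition before moving on). Messages with the same key always go to the same partition as long as the partition count does not change, which keeps them in order.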
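
The key-to-partition mapping can be sketched as a hash of the key taken modulo the partition count. This is a simplified illustration under stated assumptions: Kafka's Java client hashes keys with murmur2, whereas the sketch substitutes CRC32 from the standard library, and `choose_partition` is a hypothetical helper name.

```python
import itertools
import zlib
from typing import Optional

_round_robin = itertools.count()  # shared counter for keyless messages


def choose_partition(key: Optional[bytes], num_partitions: int) -> int:
    """Pick a partition index for a message (simplified, not Kafka's exact algorithm)."""
    if key is None:
        # No key: rotate through the partitions in turn.
        return next(_round_robin) % num_partitions
    # Same key -> same hash -> same partition, while num_partitions is unchanged.
    return zlib.crc32(key) % num_partitions
```

Because the result depends on `num_partitions`, adding partitions to a topic can move existing keys to a different partition.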

Broker Selection: The producer uses cluster metadata to identify which broker is the leader for the chosen partition, then sends the message directly to that broker.
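
Broker selection amounts to a lookup in the cluster metadata that the producer holds. The sketch below assumes a hypothetical in-memory metadata table; the topic name, broker addresses, and the `leader_for` helper are invented for illustration.

```python
# Hypothetical cached metadata: (topic, partition) -> broker handling that partition.
partition_leaders = {
    ("orders", 0): "broker-1:9092",
    ("orders", 1): "broker-2:9092",
    ("orders", 2): "broker-3:9092",
}


def leader_for(topic: str, partition: int) -> str:
    """Return the broker that currently handles the given partition."""
    try:
        return partition_leaders[(topic, partition)]
    except KeyError:
        # A real client would refresh its metadata here; the sketch just fails.
        raise LookupError(f"no metadata for {topic}-{partition}") from None
```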

Terminology

Kafka Cluster

  • Composed of multiple brokers

  • More brokers = higher scalability for storage and client handling.

Broker

  • The servers (physical/virtual) that hold the "queue".

  • Stores data and manages client requests.

Partition

  • Ordered, immutable sequence of messages, like a log file.

  • Key for scaling, as partitions enable parallel message consumption.
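
The "log file" analogy can be made concrete with a toy append-only structure. This is only a sketch of the idea, not how brokers store data on disk, and `PartitionLog` is a hypothetical name.

```python
class PartitionLog:
    """Toy append-only log: records are never modified once written."""

    def __init__(self) -> None:
        self._records = []

    def append(self, record: bytes) -> int:
        """Append a record and return its offset (its position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 10) -> list:
        """Return up to max_records starting at offset, in write order."""
        return self._records[offset:offset + max_records]


# A topic with three partitions: each log can be read independently,
# which is what lets several consumers work in parallel.
topic_partitions = [PartitionLog() for _ in range(3)]
```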

Topic

  • Logical grouping of partitions.

  • Used for publishing and subscribing to data.

  • Supports multiple producers writing data simultaneously.

Topic vs Partition

  • Topic: Logical organization of messages.

  • Partition: Physical organization of messages. Each partition is stored on a broker, and a topic's partitions can be spread across multiple brokers.

Producers and Consumers

  • Producers: Write data to topics.

  • Consumers: Read data from topics.

  • Kafka provides APIs for both but leaves message creation/processing to developers.

Message Queue vs Stream

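  • Message Queue: Consumers acknowledge messages after processing, and an acknowledged message is typically removed from the queue.

  • Stream: Messages stay in the log after they are read. Consumers track their position by offset instead of acknowledging and removing each message, which allows replay and more complex processing.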
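
One way to picture the difference is a toy sketch in plain Python, with no Kafka involved. The consumer names and the `poll` helper are hypothetical, and the queue side assumes the common behavior where a message disappears once it has been taken and acknowledged.

```python
from collections import deque

# Message queue style: a message is gone once a consumer takes and acknowledges it.
queue = deque(["m1", "m2", "m3"])
first = queue.popleft()  # consumer receives "m1"; it cannot be read again

# Stream style: the log keeps every message; each consumer tracks its own offset.
log = ["m1", "m2", "m3"]
offsets = {"billing": 0, "analytics": 0}  # hypothetical consumer names


def poll(consumer: str) -> str:
    """Read the next message for a consumer and advance only that consumer's offset."""
    message = log[offsets[consumer]]
    offsets[consumer] += 1
    return message


poll("billing")          # billing reads "m1"
offsets["billing"] = 0   # rewinding the offset lets billing read "m1" again
```

In the stream case both consumers can read the full log independently, since reading never removes anything.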

HelloInterview
Kafka Message Structure
Kafka Architecture