Advanced Topics

To scale Kafka, focus on partitioning (key choice and partition count) and on adding brokers. For fault tolerance, rely on replication and committed consumer offsets. For performance, batch and compress messages, and keep load evenly spread across partitions.

Kafka Broker Constraints:

  • As a rough planning estimate, a single broker can store ~1TB and handle ~10,000 messages/sec. Actual limits depend on hardware and message size.

  • Keep Kafka messages small (<1MB) for optimal performance; Kafka is not for storing large files.

  • Use Kafka for small messages like pointers (e.g., store large videos in S3, not in Kafka).
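The pointer pattern above can be sketched as follows. This is a minimal illustration using only the standard library; the bucket name, object key, field names, and the `MAX_MESSAGE_BYTES` budget are assumptions made for the example, not part of any Kafka or S3 API.

```python
import json

# Assumed size budget, mirroring the ~1MB guidance above.
MAX_MESSAGE_BYTES = 1_000_000


def build_pointer_message(bucket: str, object_key: str, size_bytes: int) -> bytes:
    """Return a small JSON payload that references a large object stored elsewhere."""
    payload = {
        "bucket": bucket,          # hypothetical bucket name
        "key": object_key,         # hypothetical object key
        "size_bytes": size_bytes,  # size of the large object, not of this message
    }
    encoded = json.dumps(payload).encode("utf-8")
    if len(encoded) > MAX_MESSAGE_BYTES:
        raise ValueError("pointer message exceeds the assumed size budget")
    return encoded


# A 4GB video is represented by a message of well under 1KB.
message = build_pointer_message("video-uploads", "2024/05/clip-123.mp4", 4_000_000_000)
print(len(message), "bytes")
```

The consumer reads the pointer from Kafka and fetches the large object from storage only when it needs it.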

Scalability

  • Horizontal Scaling: Add more brokers to distribute load. Ensure enough partitions to utilize all brokers.

  • Partitioning Strategy: Choose a good key for partitioning (e.g., ad ID). A bad key can cause "hot partitions" (overloaded).

  • Use random partitioning, salting (adding randomness), or compound keys to handle hot partitions.
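To make the hot-partition problem and the salting fix concrete, here is a small simulation. It is a sketch under stated assumptions: the partition count and number of salt buckets are arbitrary, and MD5 is used only as a stable stand-in hash (Kafka's default partitioner hashes keys with murmur2).

```python
import hashlib
import random
from collections import Counter

NUM_PARTITIONS = 6  # assumed partition count for the illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (MD5 as a stand-in for murmur2)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def salted_key(key: str, salt_buckets: int, rng: random.Random) -> str:
    """Append a random salt so one logical key spreads over several partitions."""
    return f"{key}#{rng.randrange(salt_buckets)}"


rng = random.Random(42)
hot_key = "ad-123"  # hypothetical ad ID receiving most of the traffic

# Without salting, every message for the hot key lands on one partition.
plain = Counter(partition_for(hot_key) for _ in range(10_000))
# With salting, the same traffic is spread across several partitions.
salted = Counter(partition_for(salted_key(hot_key, 16, rng)) for _ in range(10_000))

print("plain:", dict(plain))
print("salted:", dict(salted))
```

The trade-off is that salting gives up per-key ordering, and consumers that aggregate by the original key must combine results across all salted variants.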

Fault Tolerance & Durability

  • Replication: Each partition is replicated across brokers to ensure durability. The replication factor (e.g., 3) defines how many replicas exist, counting the leader.

    • Rule of Thumb: The replication factor cannot exceed the number of brokers in the cluster, because each replica of a partition must live on a different broker.

  • Producer Acknowledgments (acks): Set acks=all for maximum durability. The leader then waits for all in-sync replicas to acknowledge a message before confirming the write.

  • Consumer Recovery: Offsets are tracked to ensure consumers can pick up where they left off if they crash. Rebalancing happens automatically if a consumer fails.
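The consumer recovery behavior described above can be illustrated with an in-memory simulation. This is not a Kafka client: `OffsetStore`, `consume`, and the `crash_at` parameter are hypothetical names invented for the sketch. It does follow Kafka's convention that the committed offset is the offset of the next message to read.

```python
from typing import Dict, List, Optional, Tuple


class OffsetStore:
    """In-memory stand-in for the committed offsets Kafka keeps per consumer group."""

    def __init__(self) -> None:
        self._committed: Dict[Tuple[str, int], int] = {}

    def commit(self, group: str, partition: int, next_offset: int) -> None:
        self._committed[(group, partition)] = next_offset

    def fetch(self, group: str, partition: int) -> int:
        # With nothing committed, this sketch starts from the beginning of the log.
        return self._committed.get((group, partition), 0)


def consume(log: List[str], store: OffsetStore, group: str, partition: int,
            crash_at: Optional[int] = None) -> List[str]:
    """Process messages from the last committed offset; optionally simulate a crash."""
    offset = store.fetch(group, partition)
    processed: List[str] = []
    while offset < len(log):
        if crash_at is not None and offset == crash_at:
            return processed  # simulated crash before this offset is processed
        processed.append(log[offset])  # "process" the message
        offset += 1
        store.commit(group, partition, offset)  # commit the next offset to read
    return processed


log = ["m0", "m1", "m2", "m3", "m4"]
store = OffsetStore()
first_run = consume(log, store, "billing", 0, crash_at=3)   # crashes at offset 3
second_run = consume(log, store, "billing", 0)              # resumes from offset 3
print(first_run, second_run)
```

Because the offset is committed after processing, a crash between the two steps would cause that message to be processed again on restart. This is at-least-once delivery, so handlers should be idempotent.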

Errors & Retries

  • Producer Retries: Kafka producers automatically retry sending failed messages with a configurable number of attempts.

  • Consumer Retries: Kafka doesn't handle retries for consumers natively, but you can set up a separate "dead letter queue" (DLQ) for retrying or logging failed messages.
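One way to structure consumer-side retries with a dead letter queue is sketched below. Everything here is application code rather than Kafka functionality: `process_with_retries`, the `dead_letters` list (standing in for a DLQ topic), and the sample handler are assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple


def process_with_retries(message: str,
                         handler: Callable[[str], None],
                         max_attempts: int,
                         dead_letters: List[Tuple[str, str]]) -> bool:
    """Try the handler up to max_attempts times; on final failure, record in the DLQ."""
    last_error = ""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return True
        except Exception as exc:  # broad catch is deliberate in this sketch
            last_error = f"attempt {attempt}: {exc}"
    # In a real system this would publish to a separate DLQ topic.
    dead_letters.append((message, last_error))
    return False


attempts: Dict[str, int] = {}


def handler(message: str) -> None:
    """Sample handler: 'flaky' succeeds on the third try, 'poison' never succeeds."""
    attempts[message] = attempts.get(message, 0) + 1
    if message == "poison":
        raise ValueError("cannot parse")
    if message == "flaky" and attempts[message] < 3:
        raise TimeoutError("downstream unavailable")


dlq: List[Tuple[str, str]] = []
results = [process_with_retries(m, handler, 3, dlq) for m in ["ok", "flaky", "poison"]]
print(results, dlq)
```

Moving the failed message aside lets the consumer keep advancing through the partition instead of blocking on a message that will never succeed.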

Performance Optimizations

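  • Batching: Send messages in batches to reduce per-request overhead. Tune the batch size and wait-time limits (referred to here as maxSize and maxTime; in the Java producer these are batch.size and linger.ms) for better throughput.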

  • Compression: Compress messages (e.g., GZIP) to reduce the bytes sent over the network and stored on disk. This usually improves throughput at the cost of some CPU.

  • Partitioning Strategy: Ensure even distribution across partitions for better parallelism and throughput.
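The interaction between batching and compression can be shown with a standard-library sketch. It groups messages by a size limit only (a time limit such as linger.ms is left out for simplicity) and compares compressing each message alone with compressing whole batches. The `make_batches` helper and the sample click events are assumptions; 16,384 bytes matches the Java producer's default batch.size.

```python
import gzip
import json
from typing import List


def make_batches(messages: List[bytes], max_batch_bytes: int) -> List[List[bytes]]:
    """Group messages into batches whose total payload stays within max_batch_bytes."""
    batches: List[List[bytes]] = []
    current: List[bytes] = []
    current_bytes = 0
    for msg in messages:
        if current and current_bytes + len(msg) > max_batch_bytes:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(msg)
        current_bytes += len(msg)
    if current:
        batches.append(current)
    return batches


# Hypothetical click events; real payloads will compress differently.
messages = [
    json.dumps({"ad_id": f"ad-{i % 10}", "event": "click", "ts": 1_700_000_000 + i}).encode()
    for i in range(1000)
]
batches = make_batches(messages, max_batch_bytes=16_384)

raw_bytes = sum(len(m) for m in messages)
per_message = sum(len(gzip.compress(m)) for m in messages)
per_batch = sum(len(gzip.compress(b"\n".join(batch))) for batch in batches)

print(f"raw={raw_bytes} per_message_gzip={per_message} per_batch_gzip={per_batch}")
```

Compressing whole batches lets the compressor exploit repetition across messages, which is why batching and compression are usually tuned together.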

Retention Policies

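  • Kafka allows setting a retention limit for messages via retention.ms (time-based) or retention.bytes (size-based, applied per partition). By default, retention.ms is 7 days and retention.bytes is unlimited (-1). The 1GB figure often quoted alongside these is the default log segment size, not a retention limit.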

  • If you need longer storage, adjust retention settings—but be mindful of storage costs and performance trade-offs.
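To get a feel for the storage cost of a retention setting, a back-of-the-envelope estimate helps. The function below is a rough sketch: the name and parameters are hypothetical, the traffic figures are illustrative, and it ignores compression and index overhead.

```python
def estimate_retained_bytes(messages_per_sec: int,
                            avg_message_bytes: int,
                            retention_ms: int,
                            replication_factor: int) -> int:
    """Approximate cluster-wide bytes on disk once the retention window is full."""
    # Integer arithmetic; divide by 1000 last to convert milliseconds to seconds.
    return messages_per_sec * avg_message_bytes * retention_ms * replication_factor // 1000


SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000  # 604,800,000 ms, the default retention.ms

# Illustrative workload: 10,000 messages/sec of 1KB each, replication factor 3.
total = estimate_retained_bytes(10_000, 1_000, SEVEN_DAYS_MS, 3)
print(f"{total / 1e12:.1f} TB")
```

For this workload the estimate comes to about 18 TB, so doubling the retention period roughly doubles the disk needed across the cluster.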
