rabbitMQ & kafka

these services are considered message brokers
facilitate communication between different components of a distributed application by allowing for the asynchronous exchange of messages
used to scale services - scalability for dummies

RabbitMQ (Traditional Q)

take in messages → queue them → send to consumer
handles moderate data volumes
handles complex message routing
typical use-cases: (1) job worker system, (2) message queue, (3) decoupling microservices

Kafka

Kafka is essentially a stream processing system - designed to take in a stream of events, and send them off to a bunch of consumers
features high throughput
retains messages till TTL expires - consumed messages can be replayed after
fan out - each consumer connected to the queue will get one copy of each event
typical use-cases: (1) streaming data processing, (2) event bus, (3) logging systems, (4) real time communication

Kafka vs. RabbitMQ

Kafka is better suited for situations where we want to “fan out”
- eg. logs, data analysis, real-time updates (from the same event)
- RabbitMQ will handle each event individually (one processor for each event)
with Kafka, the producer handles message routing - part of the reason why Kafka allows for such high throughput
- to scale, we can have multiple Kafka partitions, and hash the message on the producer side to determine which partition the message is sent to
- however, there is no control over where the messages go after they have been produced
RabbitMQ uses “exchanges” - takes in all messages and routes them to different queues
- can also duplicate messages, to enable a “fan out” style
- consumers have control over what messages they are consuming
summary: Kafka is better suited for messages that take uniform (and short) time to process, and for messages that “fan out”
- RabbitMQ is better suited for long running (unknown time) tasks, messages with complex routing, and sporadic / bursty data flow

What about ACKs? What if a Consumer Fails.

with Kafka, we don’t use ACKs - we use offsets
- consumers fetch from Kafka, and then commits offset when they’re done
- ie. a consumer fails → offset is not committed → another consumer picks up
while RabbitMQ uses ACKs
- consumers subscribes to Q for new data, and sends an ACK after processing
- ie. if a consumer fails → ACK is not received → after timeout, RabbitMQ sends to another consumer
usually, services that use Kafka can tolerate messages being dropped to some extent

kenf's 2¢s