Sunday, March 23, 2025

Kafka Idempotent Producer


When an application publishes events to a Kafka topic there is a risk that duplicate events can be written in failure scenarios, and consequently message ordering can be lost. This can be avoided by configuring the Kafka Producer to be idempotent. This article describes how duplicate events can be published and how to make the Producer idempotent.

Duplicate Messages

Duplicate messages can occur in the scenario where:

  • A Producer attempts to write a message to a topic partition.
  • The broker does not acknowledge the write due to some transient failure scenario.
  • The Producer retries as it does not know whether the write succeeded or not.
  • If the Producer is not idempotent and the original write did succeed then the message would be duplicated.
Figure: Duplicate message scenario

By configuring the Producer to be idempotent, each Producer is assigned a unique Id (PID) and each message is given a monotonically increasing sequence number. The broker tracks the PID + sequence number combination for each partition, rejecting any duplicate write requests it receives.

Figure: Idempotent producer behavior
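The broker-side deduplication can be illustrated with a simplified sketch. This is not the actual broker code (real brokers track a window of the last five batches per PID per partition); the per-partition state is modeled here as a plain map from PID to the highest sequence number seen:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified illustration of broker-side deduplication for one partition.
public class PartitionDedup {
    private final Map<Long, Integer> lastSeqByPid = new HashMap<>();

    // Returns true if the write is accepted, false if it is a duplicate.
    public boolean tryAppend(long pid, int seq) {
        Integer last = lastSeqByPid.get(pid);
        if (last != null && seq <= last) {
            return false; // retry of an already-acknowledged write: reject
        }
        lastSeqByPid.put(pid, seq);
        return true;
    }

    public static void main(String[] args) {
        PartitionDedup partition = new PartitionDedup();
        System.out.println(partition.tryAppend(42L, 0)); // true: first write
        System.out.println(partition.tryAppend(42L, 1)); // true: next in sequence
        System.out.println(partition.tryAppend(42L, 1)); // false: duplicate retry
    }
}
```

A retried batch carries the same sequence number as the original, so if the first attempt did land, the retry is rejected rather than appended twice.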

Idempotent Producer Configuration

The Kafka Producer configuration parameter enable.idempotence determines whether a retried message may be written to the topic partition more than once when a retryable error occurs.

To ensure idempotent behavior, acks must be set to all. The leader then waits until the minimum required number of in-sync replicas (the topic's min.insync.replicas) have acknowledged the message before responding.

If retries is set to 0, the Producer never retries a failed send, so transient failures surface to the application (or a dead-letter route) unnecessarily. This is not recommended; enabling idempotence in fact requires retries to be greater than 0.

Unlike implementing an idempotent consumer, enabling an idempotent producer requires no code changes—only configuration.

Producer & Consumer Timeouts

It is recommended to leave retries at its default (Integer.MAX_VALUE) and instead bound retrying by time using delivery.timeout.ms (default 120000 ms, i.e. two minutes).

This matters when a service consumes from one topic and produces to another within its poll loop: if the producer's delivery.timeout.ms exceeds the consumer's max.poll.interval.ms, the consumer can be blocked on retries long enough to be removed from the group, its partitions reassigned, and the same events reprocessed, resulting in duplicates downstream.
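The relationship between the two timeouts can be sketched as a simple check. The values below are the client defaults, used here for illustration rather than taken from the original post:

```java
import java.util.Properties;

// Sketch of the timeout relationship between a producer retrying a send and
// the consumer group membership of the service doing the producing.
public class TimeoutCheck {
    // Returns true when the producer can exhaust its retries before the
    // group coordinator would consider the consumer failed.
    static boolean fitsWithinConsumerWindow(long deliveryTimeoutMs, long maxPollIntervalMs) {
        return deliveryTimeoutMs < maxPollIntervalMs;
    }

    public static void main(String[] args) {
        Properties producerProps = new Properties();
        producerProps.setProperty("delivery.timeout.ms", "120000");  // 2 minutes (default)
        Properties consumerProps = new Properties();
        consumerProps.setProperty("max.poll.interval.ms", "300000"); // 5 minutes (default)

        System.out.println(fitsWithinConsumerWindow(
                Long.parseLong(producerProps.getProperty("delivery.timeout.ms")),
                Long.parseLong(consumerProps.getProperty("max.poll.interval.ms"))));
        // prints true: the defaults leave headroom
    }
}
```

With the defaults there is headroom; the risk arises when delivery.timeout.ms is raised (or max.poll.interval.ms lowered) without considering the other.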

Guaranteed Message Ordering

The max.in.flight.requests.per.connection setting increases throughput by allowing multiple unacknowledged requests.

  • If the Producer is not idempotent and this value is greater than 1, a failed and retried request can complete after a later request, so message ordering may break.
  • If the Producer is idempotent, ordering is guaranteed for values up to 5.
Figure: Message ordering and in-flight requests

Recommended Configuration

Figure: Recommended Kafka producer configuration
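The recommended settings appear as an image in the original post. As a sketch, with values assumed from the discussion above rather than read from the image, an equivalent setup in Java might be (string keys are used so the snippet compiles without the kafka-clients dependency; with it, the ProducerConfig constants are the idiomatic choice):

```java
import java.util.Properties;

// Producer settings consistent with the guidance in this article.
public class RecommendedProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");                                 // required for idempotence
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));  // retry until timed out
        props.setProperty("delivery.timeout.ms", "120000");               // bound retries by time
        props.setProperty("max.in.flight.requests.per.connection", "5");  // ordering still guaranteed
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("acks")); // prints "all"
    }
}
```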

Client Library Support

The Kafka Java client changed its defaults in version 3.0.0 to enable.idempotence=true and acks=all, so the producer is idempotent out of the box.

KafkaJS marks idempotence as experimental. librdkafka added full support in v1.4.0.

Problem with Retries

Retrying a message can cause duplicates if the broker wrote the message but the acknowledgment was lost.

Figure: Duplicate writes due to retry


Figure: Idempotent producer internals

When enable.idempotence=true, each producer gets a PID and each message gets a sequence number. The broker tracks the highest sequence number per PID and discards duplicates.

Java example:

Properties properties = new Properties();
properties.setProperty(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
// acks defaults to "all" when idempotence is enabled; setting it explicitly
properties.setProperty(ProducerConfig.ACKS_CONFIG, "all");

Overall, enabling idempotence is recommended for all Kafka producers.
