Saturday, January 3, 2026

Kafka Partition

🧩 What Exactly Is a Partition in Kafka?

A partition is the fundamental unit of storage, parallelism, and scalability in Kafka.

Think of a Kafka topic as a folder, and partitions as the files inside that folder.
Each partition is:

  • An ordered, append‑only log
  • Stored on a single broker (leader replica)
  • Replicated to other brokers (follower replicas)
  • The unit of parallelism for producers and consumers
  • The unit of fault tolerance (via replication)

📌 A Partition Is an Ordered Log

Inside a partition, messages are stored in strict order:

offset 0 → offset 1 → offset 2 → offset 3 → ...

Kafka guarantees ordering only within a partition, not across partitions.

This is why key‑based partitioning matters:
all messages with the same key go to the same partition → ordering preserved.
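
The idea can be sketched in a few lines. Kafka's DefaultPartitioner actually uses murmur2 over the serialized key bytes; this sketch substitutes a simple polynomial hash purely to illustrate the deterministic key → partition mapping (the key and partition count below are made up):

```java
import java.nio.charset.StandardCharsets;

public class KeyPartitioner {
    // Deterministic key -> partition mapping. Kafka's DefaultPartitioner
    // uses murmur2 over the key bytes; this sketch uses a simple
    // polynomial hash to show the idea, not Kafka's exact algorithm.
    static int partitionFor(String key, int numPartitions) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int h = 0;
        for (byte b : bytes) h = 31 * h + b;
        return (h & 0x7fffffff) % numPartitions; // mask the sign bit, then mod
    }

    public static void main(String[] args) {
        // the same key always maps to the same partition
        System.out.println(partitionFor("order-42", 3) == partitionFor("order-42", 3)); // true
    }
}
```

Because the mapping depends only on the key and the partition count, all records for a key land in one partition, which is exactly what preserves their relative order.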


📦 A Topic Is Split Into Multiple Partitions

Example:

topic: payments
partitions: 3

You get:

  • payments‑0
  • payments‑1
  • payments‑2

Each is an independent log.


๐Ÿ” Replication Happens Per Partition

If replication factor = 3, then each partition has 3 copies:

payments-0 → leader on broker 1, followers on broker 2 and 3
payments-1 → leader on broker 2, followers on broker 3 and 1
payments-2 → leader on broker 3, followers on broker 1 and 2

This is why we say:

“A replication factor of 3 means each partition has 3 replicas.”

Not the topic — the partition.


⚙️ Why Partitions Matter

1. Scalability

More partitions → more parallelism.

  • Producers can write in parallel
  • Consumers in a group can read in parallel
  • Throughput scales with partition count, up to broker, disk, and network limits

2. Fault Tolerance

Replication is done per partition, so Kafka can survive broker failures.

3. Ordering Guarantees

Kafka only guarantees ordering within a partition, not across the topic.


🧠 A Simple Visual

Topic: orders
Partition 0: [msg1, msg2, msg3]
Partition 1: [msg4, msg5]
Partition 2: [msg6, msg7, msg8, msg9]

Each partition is its own log, stored on its own leader broker, with follower replicas elsewhere.


๐Ÿ” Want to Go Deeper?

Given your background in OS internals and distributed logs, I can also explain:

  • How partitions map to segment files on disk
  • How page cache interacts with partition reads
  • How leader/follower replication works at the protocol level
  • How partition count affects consumer group rebalancing
  • How to choose the right number of partitions for a topic
  • How partitioning interacts with exactly‑once semantics

Just tell me which direction you want to explore


Let’s go deep — this is one of those areas where your OS‑internals intuition (page cache, mmap, disk layout) gives you a huge advantage. Kafka’s storage engine is beautifully simple and mechanically sympathetic, and understanding how partitions map to segment files is the key to understanding Kafka’s performance model.

Below is the clearest, most accurate explanation you’ll find — the kind you’d use to teach a team.


🧩 How Kafka Partitions Map to Segment Files on Disk

A partition is stored as a directory on disk.

Inside that directory, Kafka stores the partition’s data as a sequence of segment files.


๐Ÿ“ 1. Partition → Directory

For a topic named orders with partition 2, Kafka creates:

/var/lib/kafka/data/orders-2/

This directory contains all data for that partition.


📄 2. A Partition Is Split Into Multiple Segment Files

Kafka does not store the entire partition in one huge file.

Instead, it splits the log into segments, capped by log.segment.bytes (1 GiB by default).

Example contents of orders-2:

00000000000000000000.log
00000000000000000000.index
00000000000000000000.timeindex
00000000000001000000.log
00000000000001000000.index
00000000000001000000.timeindex
00000000000002000000.log
00000000000002000000.index
00000000000002000000.timeindex

Each segment is identified by its base offset.


🧱 3. What’s Inside a Segment?

Each segment consists of three files:

a) .log file

The actual message data (binary records).

b) .index file

Maps relative offsets → byte positions inside the .log file.

c) .timeindex file

Maps timestamps → offsets for time‑based lookups.

Kafka uses sparse indexing, meaning it doesn’t index every message — only periodic entries.

This keeps index files tiny.


๐Ÿ” 4. How Kafka Uses Segments

Appending

Kafka always writes to the active segment (the last one).

When the segment reaches the configured size (e.g., 1 GB), Kafka:

  • closes it
  • creates a new segment with a new base offset
  • continues writing
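
A minimal in-memory model of this roll-on-size behavior (the class and the tiny 16-byte segment cap are invented for illustration; real Kafka rolls on log.segment.bytes, segment age, and index fullness):

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentedLog {
    static final int SEGMENT_BYTES = 16;      // tiny for illustration; default log.segment.bytes is 1 GiB
    final List<List<byte[]>> segments = new ArrayList<>();
    final List<Long> baseOffsets = new ArrayList<>();
    long nextOffset = 0;
    int activeBytes = 0;

    SegmentedLog() { roll(); }

    // Close the active segment and start a new one whose base offset
    // is the next offset that will be written.
    private void roll() {
        segments.add(new ArrayList<>());
        baseOffsets.add(nextOffset);
        activeBytes = 0;
    }

    // Append to the active (last) segment, rolling first if the record
    // would push it past the size cap.
    long append(byte[] record) {
        if (activeBytes + record.length > SEGMENT_BYTES && activeBytes > 0) roll();
        segments.get(segments.size() - 1).add(record);
        activeBytes += record.length;
        return nextOffset++;
    }
}
```

Note how the base offset of each new segment is simply the offset of the first record it will hold, which is exactly the number encoded in Kafka's segment filenames.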

Reading

Consumers read sequentially:

  1. Use .index to find the byte position
  2. Jump into the .log file
  3. Sequentially scan forward

This is extremely efficient because of:

  • sequential disk access
  • the OS page cache and read‑ahead
  • memory‑mapped index files
  • zero‑copy transfer (sendfile) from the page cache to the network

⚙️ 5. Why Kafka Uses Segment Files

This design gives Kafka:

✔ Infinite log size

Old segments can be deleted or compacted without touching newer ones.

✔ Fast recovery

On restart, Kafka only loads segment metadata, not the whole log.

✔ Efficient retention

Kafka can delete entire segment files when retention is exceeded.

✔ High throughput

Sequential writes + page cache + zero‑copy reads = near‑disk‑limit performance.


🧠 6. How Offsets Map to Segments

Offsets are global within a partition, not per segment.

Example:

  • Segment 0: offsets 0–999,999
  • Segment 1: offsets 1,000,000–1,999,999
  • Segment 2: offsets 2,000,000–2,999,999

Kafka finds the right segment by comparing the offset to the segment’s base offset.
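
Conceptually this is a floor lookup over the sorted base offsets. In Java, a TreeMap makes it a one-liner (the class is a sketch; segment names match the earlier example):

```java
import java.util.TreeMap;

public class SegmentIndex {
    // Greatest base offset <= target picks the segment containing it.
    static String findSegment(TreeMap<Long, String> segments, long offset) {
        return segments.floorEntry(offset).getValue();
    }

    public static void main(String[] args) {
        TreeMap<Long, String> segments = new TreeMap<>();
        segments.put(0L, "00000000000000000000.log");
        segments.put(1000000L, "00000000000001000000.log");
        segments.put(2000000L, "00000000000002000000.log");

        // offset 1573000 falls in the segment whose base offset is 1000000
        System.out.println(findSegment(segments, 1573000L));
    }
}
```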


🧩 7. How This Relates to Replication

Each replica (leader or follower) stores its own copy of the same segment files.

Replication is simply:

  • follower fetches from leader
  • appends to its own segment files
  • updates its own indexes

This is why replication is fast and cheap.


🔥 8. The Big Picture

Kafka’s storage design is essentially:

A distributed, replicated, append‑only log built on top of the OS page cache, with memory‑mapped index files and zero‑copy reads.

It’s elegant because it lets the OS do the heavy lifting.





Let’s walk through this step by step, exactly how Kafka does it internally, using a concrete example:

You want offset 1573.

This is a perfect example to illustrate how Kafka finds data inside a partition that is split across multiple segment files.


🧩 Step 1 — Kafka lists all segment base offsets

Suppose the partition directory contains these segment files:

00000000000000000000.log   (base offset = 0)
00000000000000100000.log   (base offset = 100000)
00000000000000200000.log   (base offset = 200000)

But for simplicity, let’s use smaller numbers:

Segment A → base offset 0
Segment B → base offset 1000
Segment C → base offset 2000

These base offsets are encoded in the filenames.

Kafka loads only the base offsets, not the entire file.


🧠 Step 2 — Kafka chooses the correct segment

Kafka applies this rule:

Pick the segment whose base offset is ≤ target offset, and whose next segment’s base offset is > target offset.

For offset 1573:

  • Segment A: base = 0 → 0 ≤ 1573 ✔
  • Segment B: base = 1000 → 1000 ≤ 1573 ✔
  • Segment C: base = 2000 → 2000 ≤ 1573 ✘ (too high)

So Kafka picks:

Segment B (base offset 1000)

Because:

  • 1000 ≤ 1573
  • Next segment starts at 2000, which is > 1573

This is exactly how Kafka narrows down the search.


๐Ÿ“ Step 3 — Kafka opens the .index file for Segment B

Each segment has:

1000.log
1000.index
1000.timeindex

(simplified names; real filenames are zero‑padded 20‑digit base offsets, e.g. 00000000000000001000.log)

Kafka uses the .index file to map:

relative offset → byte position in the .log file

Relative offset = target offset − base offset

So:

1573 − 1000 = 573

Kafka now needs to find relative offset 573 inside the index.


🧭 Step 4 — Kafka uses sparse index lookup

Kafka’s index is sparse, meaning it doesn’t store every offset.

Example index entries:

relative offset → file position
0               → 0 bytes
200             → 8123 bytes
400             → 16200 bytes
600             → 24100 bytes

Kafka finds the largest relative offset ≤ 573.

That is:

400 → 16200 bytes

So Kafka jumps to byte 16200 in the .log file.


📖 Step 5 — Kafka reads sequentially from that byte

Now Kafka scans forward:

  • read record at offset 1400 (base 1000 + relative 400, where the index entry points)
  • read record at offset 1401
  • …
  • read record at offset 1573 ← target found

Kafka stops when it reaches the desired offset.

This sequential scan is extremely fast because:

  • the log is append‑only
  • records are stored contiguously
  • the OS page cache + read‑ahead optimize the scan

🧨 Putting it all together

Here’s the full flow in one shot:

  1. List segment base offsets
    → 0, 1000, 2000

  2. Pick segment where base ≤ 1573 < next base
    → Segment B (base 1000)

  3. Compute relative offset
    → 1573 − 1000 = 573

  4. Use sparse index to find nearest entry ≤ 573
    → relative offset 400 → byte position 16200

  5. Jump to byte 16200 in the .log file
    → sequentially scan until offset 1573

This is exactly how Kafka performs fast lookups without loading entire files.
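
The five steps above fit in a few lines. This sketch reuses the toy numbers from the walkthrough (base offsets 0/1000/2000, the sparse index entries shown earlier) and a TreeMap floor lookup for both the segment choice and the index probe; the class and method names are invented:

```java
import java.util.TreeMap;

public class OffsetLookup {
    // Returns {segment base offset, relative offset, byte position to scan from}.
    static long[] locate(TreeMap<Long, TreeMap<Long, Long>> segments, long target) {
        long base = segments.floorKey(target);                              // step 2: floor over base offsets
        long relative = target - base;                                      // step 3: relative offset
        long bytePos = segments.get(base).floorEntry(relative).getValue();  // step 4: sparse index floor
        return new long[] { base, relative, bytePos };                      // step 5: scan forward from bytePos
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> indexB = new TreeMap<>();   // sparse index of segment B (base 1000)
        indexB.put(0L, 0L);
        indexB.put(200L, 8123L);
        indexB.put(400L, 16200L);
        indexB.put(600L, 24100L);

        TreeMap<Long, TreeMap<Long, Long>> segments = new TreeMap<>();
        segments.put(0L, new TreeMap<>());              // segment A
        segments.put(1000L, indexB);                    // segment B
        segments.put(2000L, new TreeMap<>());           // segment C

        long[] r = locate(segments, 1573);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // 1000 573 16200
    }
}
```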




Kafka Broker mmap File

 





Kafka Idempotence, Transactions, Offsets, and Exactly-Once Semantics

This document ties together the concepts we’ve been discussing: the idempotence protocol, sequence numbers, transactional producers, committing consumer offsets, and the classic failure scenarios that motivate Kafka’s exactly-once semantics (EOS). For a deeper background on EOS and how Kafka implements it with idempotent producers and transactions, see the Confluent article on exactly-once semantics and other EOS guides.

1. Idempotent producer: what it is and what it guarantees

1.1. Core idea

The idempotent producer is a Kafka feature that guarantees:

  • No duplicates: the same message is never written twice, even if the producer retries.
  • Ordering preserved (per partition): messages are written in the order the producer sends them.
  • Message loss is still possible: if an earlier message fails and later ones succeed, the earlier one can be dropped.

Idempotence is one of the building blocks Kafka uses to implement exactly-once semantics, but by itself it only guarantees “no duplicates + ordering,” not “no loss”.

1.2. The idempotence protocol: PID and sequence numbers

The idempotence protocol is implemented via:

  • Producer ID (PID): assigned by the broker when the producer starts. It uniquely identifies a producer instance.
  • Per-partition sequence numbers: for each partition, the producer assigns a monotonically increasing sequence number to each batch of records.
  • Broker-side tracking: the broker tracks, for each (PID, partition), the highest sequence number it has accepted.

The broker then enforces strict rules:

  • Accept new message: if incoming_seq == highest_seq + 1.
  • Accept duplicate retry: if incoming_seq == highest_seq (duplicate is ignored or safely deduplicated).
  • Reject late retry: if incoming_seq < highest_seq (too old; would break ordering and idempotence).
  • Reject gaps: if incoming_seq > highest_seq + 1 (illegal gap in sequence).

Key insight: idempotence depends on strict ordering. The broker must reject any sequence that would violate monotonic progression, even if the application “doesn’t care” about ordering.
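
The four rules are a simple comparison against the highest sequence the broker has tracked. A sketch of the broker-side check (the enum and method names are invented; the real broker also caches metadata for the last few in-flight batches, so its duplicate detection is slightly wider than the single highest_seq shown here):

```java
public class SequenceCheck {
    enum Decision { ACCEPT_NEW, ACCEPT_DUPLICATE, REJECT_LATE_RETRY, REJECT_GAP }

    // highestSeq is the last sequence accepted for this (PID, partition);
    // -1 means nothing has been accepted yet.
    static Decision check(int highestSeq, int incomingSeq) {
        if (incomingSeq == highestSeq + 1) return Decision.ACCEPT_NEW;
        if (incomingSeq == highestSeq)     return Decision.ACCEPT_DUPLICATE;
        if (incomingSeq < highestSeq)      return Decision.REJECT_LATE_RETRY;
        return Decision.REJECT_GAP;        // incomingSeq > highestSeq + 1
    }
}
```

With highestSeq = 2, a late retry of seq=0 lands in REJECT_LATE_RETRY, which is exactly the seq=0 loss scenario discussed below.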

1.3. The seq=0 loss scenario

Consider this sequence on a single partition:

A → seq=0
B → seq=1
C → seq=2

Now imagine:

  1. A(seq=0) fails due to a transient network issue.
  2. B(seq=1) and C(seq=2) are successfully written and acknowledged.
  3. The producer retries A(seq=0) later.

At this point, the broker has:

highest_seq = 2
incoming_seq = 0

According to the idempotence rules:

  • incoming_seq < highest_seq → the broker must reject this message.

Result: A(seq=0) is lost. This is not a bug; it is the correct behavior for an idempotent producer. The broker cannot accept a late retry of an earlier sequence without breaking ordering and duplicate detection.

1.4. Why you cannot “fix” seq=0 by renumbering

A natural thought is: “If ordering doesn’t matter to my application, can I just resend A as a new message with a new sequence number (e.g., seq=3)?” The answer is no:

  • Sequence numbers are protocol metadata: they are not part of the Kafka log and are not controlled by your application.
  • The client library manages them: you cannot manually assign or override them.
  • Renumbering would break idempotence: the broker would treat the message as a new one, and you could end up with duplicates if the original write had partially succeeded.

Therefore, with idempotence alone, the seq=0 loss scenario is unavoidable in the presence of certain failure patterns.

2. Transactions: fixing the limitations of idempotence

2.1. What transactions add on top of idempotence

Kafka transactions build on the idempotent producer to provide full exactly-once semantics (EOS):

  • No duplicates (via idempotence).
  • Ordering preserved (per partition).
  • No message loss within a transaction.
  • Atomic multi-message, multi-partition, multi-topic writes.
  • Atomic commit of consumer offsets with output messages.

Transactions are enabled by configuring a transactional.id on the producer and using the transactional API: initTransactions(), beginTransaction(), commitTransaction(), and abortTransaction().

2.2. Transactional producer flow (high level)

initTransactions()
beginTransaction()

// 1. Read input (via consumer)
// 2. Process
// 3. Produce output records
// 4. Optionally send consumer offsets to the transaction

commitTransaction()   // or abortTransaction() on failure

All writes and offset commits inside a transaction are either fully committed or fully aborted. There are no partial writes visible to consumers in read_committed mode.

2.3. Why transactions eliminate the seq=0 loss scenario

Revisit the earlier example with A, B, C. With a transactional producer:

  1. The producer starts a transaction.
  2. It sends A, B, C as part of the same transaction.
  3. If A fails but B and C succeed, the transaction cannot be committed until A is successfully retried.
  4. If A never succeeds, the transaction is aborted, and none of A, B, or C become visible.

Result:

  • No partial writes.
  • No gaps like “B and C visible, A missing.”
  • No seq=0 loss within a committed transaction.

3. Committing consumer offsets and why it matters

3.1. What is a consumer offset?

A consumer offset is a “bookmark” that indicates how far a consumer has read in a partition. Committing an offset means:

“I have successfully processed all messages up to offset X. If I crash, resume from X+1.”

Kafka stores these offsets in an internal topic called __consumer_offsets. This is how Kafka tracks progress for each consumer group.

3.2. The classic failure problem: read → process → write

Consider this pipeline:

Input Topic A → Consumer → Processor → Producer → Output Topic B

And this sequence:

  1. Consumer reads message A from topic A.
  2. Processor generates output message B for topic B.
  3. Consumer commits the offset for A (marking it as processed).
  4. Producer crashes before writing B.

Result:

  • A is marked as processed and will not be re-read.
  • B was never written.
  • Message A is effectively lost.

This is the fundamental exactly-once problem: if you commit offsets before writing output, you risk loss; if you write output before committing offsets, you risk duplicates. This is why idempotence alone is not enough for end-to-end exactly-once processing.

3.3. Transactions + sendOffsetsToTransaction()

Kafka solves this by allowing the producer to include the consumer’s offsets in the same transaction as the output writes:

beginTransaction()

// 1. Consumer reads A
// 2. Processor generates B
// 3. Producer sends B to output topic
// 4. Producer calls sendOffsetsToTransaction(offset of A, consumerGroupId)

commitTransaction()   // or abortTransaction()

Now:

  • If the transaction commits:
    • B is visible in the output topic.
    • The offset for A is committed.
  • If the transaction aborts:
    • B is discarded (never visible).
    • The offset for A is not committed; A will be re-read.

This atomicity is what gives you true exactly-once semantics in a read → process → write pipeline.

4. Failure scenarios in the read → process → write pipeline

4.1. Non-transactional pipeline (unsafe)

Scenario A — Commit offset before writing output (loss)

1. Consumer reads A
2. Processor generates B
3. Consumer commits offset for A
4. Producer fails before writing B

Result:

  • A is marked processed.
  • B is never written.
  • A is lost forever.

Scenario B — Write output before committing offset (duplicates)

1. Consumer reads A
2. Processor generates B
3. Producer writes B
4. Consumer crashes before committing offset for A
5. Consumer restarts and re-reads A
6. Processor generates B again
7. Producer writes B again

Result:

  • A is processed twice.
  • B is written twice (duplicate output).

Without transactions, you are forced to choose between possible loss or possible duplicates. You cannot avoid both simultaneously.

4.2. Transactional pipeline (safe)

Scenario C — Transaction commits successfully

1. beginTransaction()
2. Consumer reads A
3. Processor generates B
4. Producer writes B
5. Producer sendOffsetsToTransaction(offset of A)
6. commitTransaction()

Result:

  • B is visible in the output topic.
  • Offset for A is committed.
  • A is processed exactly once.
  • No duplicates, no loss.

Scenario D — Failure before commit → transaction aborts

1. beginTransaction()
2. Consumer reads A
3. Processor generates B
4. Producer writes B (pending, not visible)
5. Producer or broker fails before commitTransaction()
6. Transaction coordinator aborts the transaction

Result:

  • B is discarded (never visible to consumers in read_committed mode).
  • Offset for A is not committed.
  • Consumer will re-read A.
  • No duplicates, no loss.

4.3. Summary table: non-transactional vs transactional

Aspect                             | Non-Transactional (Idempotence Only)       | Transactional (Exactly-Once)
No duplicates                      | Yes (producer retries), not end-to-end     | Yes, end-to-end
Ordering preserved (per partition) | Yes                                        | Yes
No message loss                    | No (seq=0 loss, offset-commit race)        | Yes (within transactions)
Atomic write + offset commit       | No                                         | Yes (via sendOffsetsToTransaction)
Exactly-once semantics             | No                                         | Yes

5. Transaction coordinator and broker crash behavior (high level)

5.1. Transaction coordinator state machine (conceptual)

The transaction coordinator manages transactional state for producers. Conceptually, it moves through states like:

  • NO_TXN / READY: no active transaction; producer has a valid PID + epoch.
  • ONGOING: a transaction is in progress; messages are written as pending/uncommitted.
  • PREPARE_COMMIT: coordinator writes a prepare-commit marker to the transaction log.
  • COMPLETE_COMMIT: coordinator writes a commit marker; messages become visible.
  • PREPARE_ABORT / COMPLETE_ABORT: similar flow for abort; pending messages are discarded.

On failures (producer crash, timeouts, fencing), the coordinator will typically move the transaction to ABORT, ensuring no partial writes are visible.

5.2. Broker crash and recovery

When a broker crashes and restarts, it:

  • Replays log segments to restore committed data and high watermarks.
  • Replays the transaction log to determine the state of in-flight transactions.
  • For each transaction:
    • If a prepare-commit marker is present but no final commit, it completes the commit.
    • If a prepare-abort marker or incomplete transaction is found, it aborts the transaction.

This ensures that after recovery, the broker presents a consistent view: no partial transactions, no duplicates, and no reordering for committed data.

6. The core use case: exactly-once processing, not “reading A and B in order”

In the canonical pipeline:

Input Topic A → Consumer → Processor → Producer → Output Topic B

The goal is not for a consumer to read both A and B in order. Typically:

  • One consumer group reads A from the input topic.
  • Another consumer group reads B from the output topic.

The real goals are:

  • Every A is processed exactly once.
  • Every A produces exactly one corresponding B (if that’s your business logic).
  • No A is lost.
  • No B is duplicated.

Committing consumer offsets inside the same transaction as writing B is what guarantees that you never end up in the “A marked processed but B never written” or “B written twice because A was re-read” situations. This is the essence of Kafka’s exactly-once semantics.

A follow-up question pushes into the deepest part of Kafka’s idempotence protocol: what happens if seq‑0 is lost, seq‑1 and seq‑2 succeed, and then the producer crashes and restarts? Can the new producer session send seq‑0 again?

⭐ Short Answer

No. Even after the producer crashes and restarts, Kafka will not accept the original seq‑0; it is permanently lost. The reason is subtle but fundamental: a new producer session gets a new Producer ID (PID) and resets sequence numbers to 0, but the broker does not treat a retry of the old seq‑0 as part of the new session.

🧩 Step 1 — Original producer session

Producer sends:

A → seq=0 (fails)
B → seq=1 (success)
C → seq=2 (success)

Broker state for this PID:

PID = 123
highest_seq = 2

Retry of seq‑0 arrives:

incoming_seq = 0
highest_seq = 2
→ reject (late retry)

So seq‑0 is lost.

🧩 Step 2 — Producer crashes

The producer process dies. The PID is tied to the old producer session, and the broker keeps the PID state for a while (until it expires).

🧩 Step 3 — Producer restarts

When the producer restarts, Kafka assigns:

NEW PID = 456
NEW epoch = 0
Sequence numbers reset to 0

This is a completely new identity. The new producer session cannot reuse the old PID or old sequence numbers.

🧩 Step 4 — Can the new producer send the old seq‑0 message?

The new producer can send a message with sequence number 0, but Kafka treats it as a new message, not as a retry of the old seq‑0. The PID and epoch are different, and the broker’s sequence tracking is per PID + partition:

Old PID = 123 → highest_seq = 2
New PID = 456 → highest_seq = -1 (no messages yet)

So if the new producer sends A' → seq=0, Kafka accepts it, but it is not the original seq‑0. It is a new message, unrelated to the lost one.

🟥 Critical insight

Kafka cannot resurrect the original seq‑0 message after the producer restarts. Even though the new producer can send a message with seq=0, it carries a different PID, a different session, and a different sequence space. Kafka has no way to know it is “the same” message.

🟦 Why Kafka cannot accept the old seq‑0 after restart

Accepting it would break duplicate detection, ordering guarantees, zombie-producer protection, and the correctness of the idempotence protocol. The broker could not tell whether the message is a retry of the old one or a brand-new message, and would not know whether to dedupe, reorder, or reject. This ambiguity is fatal for correctness, so Kafka’s rule is: a new producer session starts a new sequence space, and old messages cannot be retried.

🟩 Timeline summary

Original session (PID=123): seq=0 → lost, seq=1 → committed, seq=2 → committed
Producer crashes
New session (PID=456): sequence numbers reset; seq=0 → accepted, but treated as a NEW message

⭐ Final answer

  • Kafka will not accept the original seq‑0 after the producer restarts.
  • The new producer can send a new message with seq=0, but it is not the same message.
  • The original seq‑0 is permanently lost once seq‑1 and seq‑2 are committed.
  • This is why idempotence alone cannot guarantee no message loss; transactions are required to prevent this scenario.

7. Quick mental model recap

  • Idempotence protocol: PID + per-partition sequence numbers; guarantees no duplicates and ordering, but can drop messages (e.g., seq=0) in certain failure patterns.
  • Transactions: build on idempotence to add atomic commit/abort, no loss, and exactly-once semantics across reads, writes, and offset commits.
  • Committing consumer offsets: is how Kafka tracks progress; doing it inside a transaction with output writes is how you avoid both loss and duplicates.
  • Exactly-once semantics: achieved by combining idempotent producers, transactional writes, and atomic offset commits.

Wednesday, April 16, 2025

Kafka Consumer Fetch Configs

Kafka Consumer Fetch Settings - Interview Prep


Purpose of a Kafka consumer group: to parallelize message consumption and achieve fault tolerance.

⚙️ Core Settings

  • fetch.min.bytes – Minimum data size (bytes) broker should collect before replying to consumer.
  • fetch.max.wait.ms – Maximum time broker will wait to meet the fetch.min.bytes requirement.
  • fetch.max.bytes – Maximum bytes the consumer is willing to receive in a single fetch response.

Kafka sends a fetch response if it has ≥ fetch.min.bytes data available OR if fetch.max.wait.ms expires.

fetch.max.bytes caps the total response size per fetch across all partitions. It is a soft cap: the first record batch is returned even if it alone exceeds the limit.
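
A broker-side sketch of how the three settings interact when assembling one fetch response (pure logic, no Kafka APIs; class and method names invented, and constants set to the example values used below):

```java
import java.util.ArrayList;
import java.util.List;

public class FetchResponseSketch {
    static final int FETCH_MIN_BYTES = 1000;
    static final long FETCH_MAX_WAIT_MS = 3000;
    static final int FETCH_MAX_BYTES = 1500;

    // Should the broker answer now, or park the fetch and keep waiting?
    static boolean shouldRespond(int availableBytes, long elapsedMs) {
        return availableBytes >= FETCH_MIN_BYTES || elapsedMs >= FETCH_MAX_WAIT_MS;
    }

    // Fill the response up to fetch.max.bytes; whatever does not fit is
    // left for the next fetch cycle. The first message is always included
    // (even if oversized) so the consumer can make progress.
    static List<byte[]> fillResponse(List<byte[]> available) {
        List<byte[]> out = new ArrayList<>();
        int used = 0;
        for (byte[] msg : available) {
            if (used + msg.length > FETCH_MAX_BYTES && !out.isEmpty()) break;
            out.add(msg);
            used += msg.length;
        }
        return out;
    }
}
```

With an 800-byte message A and a 900-byte message B queued, fillResponse returns only A, matching the scenario below: A+B = 1700 bytes would exceed fetch.max.bytes = 1500.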

📈 Example Scenario

fetch.min.bytes = 1000
fetch.max.wait.ms = 3000
fetch.max.bytes = 1500

✔ Message A = 800 bytes
✔ Message B = 900 bytes

Case:
- Broker receives Message A → waits for Message B
- A+B = 1700 bytes → exceeds fetch.max.bytes
- Kafka will only return Message A in this fetch cycle

🧠 Diagram - Fetch Behavior Matrix

+--------------------+--------------------------+----------------------+------------------------------+
| Scenario           | fetch.min.bytes          | fetch.max.wait.ms    | fetch.max.bytes              |
+--------------------+--------------------------+----------------------+------------------------------+
| Total data = 400   | < 1000 → wait            | Waits up to 3s       | Not exceeded                 |
| Total data = 1200  | >= min.bytes → ✅ send   | Sends immediately    | Not exceeded                 |
| Message = 2000     | >= min.bytes             | Sends immediately    | Oversized first batch is     |
|                    |                          |                      | still returned (KIP-74)      |
| Total = 1600       | >= min.bytes             | Time ok              | Too large → partial response |
+--------------------+--------------------------+----------------------+------------------------------+

🎯 Interview Questions & Answers

Q1. What does fetch.max.bytes do in Kafka consumer configuration?

It defines the maximum number of bytes a consumer receives in a single fetch response, applied across all partitions in a fetch cycle. It is a soft cap: a single batch larger than the limit is still returned.

Q2. What happens if a record is larger than fetch.max.bytes?

Since Kafka 0.10.1 (KIP-74), the limit is not absolute: if the first record batch in the first non-empty partition is larger than fetch.max.bytes, it is still returned so the consumer can make progress. On older versions, an oversized record could stall the consumer until the limit was raised.

Q3. Will Kafka throw an exception if a message is too large for fetch.max.bytes?

No. Modern consumers simply receive the oversized batch on its own; very old consumers would receive nothing and keep polling.

Q4. What if multiple messages together exceed fetch.max.bytes?

Kafka includes only as many messages as it can fit within the limit. The rest will be delivered in the next fetch call.

Q5. How does fetch.max.bytes differ from max.partition.fetch.bytes?

  • fetch.max.bytes – Total limit across all partitions
  • max.partition.fetch.bytes – Limit per partition
  • Both must be adjusted if your message size increases

📘 Bonus: Java Config Snippet

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");   // assumes a local broker
props.put("group.id", "fetch-demo");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("fetch.min.bytes", "1000");
props.put("fetch.max.wait.ms", "3000");
props.put("fetch.max.bytes", "1500");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(List.of("payments"));
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(5000));

if (records.isEmpty()) {
    System.out.println("No records fetched — possibly due to fetch limits.");
}

💡 Interview Tips

  • Understand that these fetch configs trade off latency against throughput.
  • Mention KIP-74: since 0.10.1 an oversized first batch is still returned, so large messages no longer starve the consumer.
  • Be ready to talk about how fetch.max.bytes and max.partition.fetch.bytes should be tuned together.

Monday, April 14, 2025

Kafka endOffset

endOffsets()

🔢 1. Understanding endOffsets()

When you call:

long endOffset = consumer.endOffsets(List.of(tp)).get(tp);

You get the next offset after the last available message in that partition.

Think of endOffset as a marker:
“If a new message arrives, this is the offset it will be assigned.”


🧠 Example:

Message | Offset
A       | 0
B       | 1
C       | 2
  • endOffset = 3 (no message at offset 3 yet)

  • So to read last message, we must seek to endOffset - 1 = 2
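
Since endOffset is exclusive, computing the last readable offset needs a guard for empty partitions (where beginningOffset == endOffset). A small helper (the class is invented; with a real consumer you would feed it the values from consumer.beginningOffsets(...) and consumer.endOffsets(...), then seek to the result before polling):

```java
import java.util.OptionalLong;

public class LastOffset {
    // endOffset is the offset the NEXT record will get, so the last
    // existing record (if any) lives at endOffset - 1. An empty
    // partition has beginningOffset == endOffset.
    static OptionalLong lastOffset(long beginningOffset, long endOffset) {
        return endOffset > beginningOffset
                ? OptionalLong.of(endOffset - 1)
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        System.out.println(lastOffset(0, 3)); // OptionalLong[2]
    }
}
```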


๐Ÿ” Why Not Use endOffset Directly?

If you call:

consumer.seek(tp, endOffset); // BAD!

It will position the consumer after the last message, and poll() will return nothing, because there's nothing at that offset yet.


✅ Summary

Offset        | Meaning
currentOffset | Where the next poll() will start
endOffset     | The next available offset, NOT the last existing record
endOffset - 1 | Points to the actual last message in the topic/partition

🧠 Interview Line:

“Kafka offsets are forward-pointing. The endOffset indicates where the next message will be written, not where the last message lives. So to retrieve the last message, we seek to endOffset - 1.”



Saturday, April 12, 2025

Kafka Group Coordinator

Kafka Group Coordinator – Interview Perspective

Interview Question:
“What is the Group Coordinator in Kafka? How is it selected, what are its responsibilities, and where can we find its code in Kafka’s source?”

1. What is a Group Coordinator?

The Group Coordinator in Kafka is a designated Kafka broker responsible for managing a consumer group. It handles consumer group membership, rebalancing, and offset commits for the group.

2. How is the Group Coordinator Selected?

  • Every broker in Kafka is capable of being a group coordinator.
  • When a consumer joins a group, it sends a FindCoordinatorRequest to any broker.
  • That broker hashes the group.id modulo the number of partitions of the internal __consumer_offsets topic (offsets.topic.num.partitions, 50 by default).
  • The broker that is the leader of that __consumer_offsets partition becomes the Group Coordinator for the group.
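
The mapping itself is just a modulo over the partition count of __consumer_offsets. A sketch of the computation (class name invented; this mirrors the broker's Utils.abs(groupId.hashCode) % partitionCount, where abs masks the sign bit):

```java
public class CoordinatorLookup {
    // Partition of __consumer_offsets that owns this group; the broker
    // leading that partition acts as the group's coordinator.
    static int coordinatorPartition(String groupId, int offsetsTopicPartitions) {
        return (groupId.hashCode() & 0x7fffffff) % offsetsTopicPartitions;
    }

    public static void main(String[] args) {
        // same group.id -> same partition -> same coordinator broker
        System.out.println(coordinatorPartition("payments-service", 50));
    }
}
```

Because the mapping is deterministic, every broker resolves the same coordinator for a given group, and different groups spread across different coordinators.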

3. Responsibilities of the Group Coordinator

The Group Coordinator performs several key functions:

  • Manage Consumer Membership: Tracks all consumers in a group.
  • Coordinate Rebalances: When consumers join/leave, it triggers a rebalance and assigns partitions.
  • Assign Leader: Appoints one consumer as the group leader to help coordinate partition assignment.
  • Track Offsets: Stores committed offsets in the __consumer_offsets topic.
  • Handle Heartbeats: Keeps track of active consumers to detect failures quickly.

4. Services Provided by the Group Coordinator

It acts as a centralized service point for:

  • Tracking active consumers in the group
  • Reassigning partitions during consumer join/leave events
  • Persisting committed offsets
  • Coordinating heartbeat checks and group timeouts
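The heartbeat and timeout behavior above is driven by consumer configuration. A sketch of the relevant settings (the config keys are real Kafka consumer properties; the values and group name are illustrative, not recommendations):

```java
// Illustrative consumer settings that govern how the Group Coordinator
// detects failed members and triggers rebalances.
import java.util.Properties;

public class GroupTimeouts {
    public static Properties groupConfig() {
        Properties props = new Properties();
        props.setProperty("group.id", "payments-processor"); // assumed group name
        // A background thread sends heartbeats to the coordinator at this interval
        props.setProperty("heartbeat.interval.ms", "3000");
        // The coordinator evicts a member that misses heartbeats for this long,
        // then triggers a rebalance; typically ~3x the heartbeat interval
        props.setProperty("session.timeout.ms", "10000");
        // Max gap between poll() calls before the member is considered failed
        props.setProperty("max.poll.interval.ms", "300000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(groupConfig());
    }
}
```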

5. Internal Communication

  • Consumers communicate with the group coordinator using RPC-style requests:
    • JoinGroupRequest
    • SyncGroupRequest
    • HeartbeatRequest
    • LeaveGroupRequest
    • OffsetCommitRequest

6. Where Is This in Kafka’s Source Code?

The core logic for group coordination lives in the Kafka broker codebase:

  • Class: GroupCoordinator
  • Location: core/src/main/scala/kafka/coordinator/group/GroupCoordinator.scala
  • Handles all consumer group membership and offset commit logic.
  • Works closely with the __consumer_offsets internal topic.

Additional Code References:

  • kafka.server.KafkaApis – entry point that handles requests from consumers
  • FindCoordinatorRequest – initiates coordinator discovery
  • GroupMetadataManager – helps manage group metadata and offset storage

7. What If There Are Multiple Consumer Groups?

Kafka is designed to handle many consumer groups efficiently. Each consumer group is managed independently by potentially different group coordinators, depending on the group ID.

How Kafka Handles Multiple Groups:

  • Each group.id is hashed to a partition of the __consumer_offsets topic; the broker leading that partition coordinates the group.
  • This ensures that different consumer groups can be coordinated by different brokers for load distribution.
  • Even if multiple groups consume from the same topic, they operate independently and can have different offsets and members.

Example Scenario:

  • Group A might be coordinated by Broker 1.
  • Group B might be coordinated by Broker 2.
  • Both could be consuming from the same topic orders, but maintain their own committed offsets.
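The only thing separating the two groups is their group.id. A toy sketch (group names and bootstrap address are assumptions) showing that two consumers of the same topic become independent groups purely through configuration:

```java
// Illustrative only: two groups reading the same topic stay isolated
// because their group.id differs, so their committed offsets differ too.
import java.util.Properties;

public class TwoGroups {
    static Properties configFor(String groupId) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        p.setProperty("group.id", groupId); // this alone determines group membership
        return p;
    }

    public static void main(String[] args) {
        // Both would subscribe to "orders", yet commit offsets independently
        System.out.println(configFor("group-a").getProperty("group.id"));
        System.out.println(configFor("group-b").getProperty("group.id"));
    }
}
```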

Advantages of This Design:

  • Isolation: Each group is logically separated — no interference or data duplication.
  • Scalability: Multiple brokers help distribute coordination load.
  • Flexibility: You can have many groups with different configurations and processing logic.

Interview Summary:
“Kafka supports multiple consumer groups natively. Each group is handled independently, and group coordinators are selected using the group ID’s hash. This ensures that different groups are isolated in terms of membership and offset tracking, even when reading from the same topic. It's a powerful feature that enables multi-team, multi-application consumption from a single Kafka topic.”

8. Summary (Interview Style)

“The Group Coordinator is a Kafka broker responsible for managing a specific consumer group. It is selected by hashing the group ID to a partition of the internal __consumer_offsets topic, and it coordinates group membership, partition assignments, and offset tracking. It plays a critical role in ensuring consumer groups are balanced and fault-tolerant. You can find the implementation in GroupCoordinator.scala within the Kafka broker code.”

Kafka Brokers vs Partitions

Kafka Brokers vs Partitions - Interview Perspective

Kafka Brokers vs Partitions

1. What is a Kafka Partition?

  • A Kafka partition is a unit of parallelism within a topic.
  • Each partition is an ordered, immutable sequence of records (like a log file).
  • Partitions allow Kafka to scale horizontally by distributing data across brokers and parallelizing consumption across consumers.
  • Each partition has exactly one leader and zero or more replicas.

2. What is a Kafka Broker?

  • A Kafka broker is a Kafka server instance.
  • A Kafka cluster consists of multiple brokers (usually 3+ for production).
  • Each broker is responsible for storing and serving partitions.
  • For each consumer group, one broker in the cluster also acts as its Group Coordinator.

3. Key Differences Between Broker and Partition

  • Nature: a broker is a Kafka server (a process running in the cluster); a partition is a data structure (log) inside a topic.
  • Role: a broker stores and manages partitions; a partition contains the actual messages (data).
  • Interaction: a broker communicates with producers and consumers; a partition is the unit of parallelism for writing and reading data.
  • Leadership: a broker can act as leader or follower for many partitions; each partition has a single leader, with the rest as replicas.

4. How Are They Related?

Partitions are stored on brokers. Kafka distributes partitions across brokers for:

  • Scalability: More brokers allow more partitions and greater throughput.
  • Fault Tolerance: Replicas of partitions are kept on different brokers.
  • Load Balancing: Kafka tries to evenly spread partitions among brokers.
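A toy sketch of this distribution (not Kafka's actual assignment algorithm, which also accounts for racks and replicas): placing partition leaders round-robin across brokers, as in the replication example earlier in the post.

```java
// Toy sketch: round-robin placement of partition leaders across brokers.
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionPlacement {
    static Map<String, Integer> assignLeaders(String topic, int partitions, int brokers) {
        Map<String, Integer> leaders = new LinkedHashMap<>();
        for (int p = 0; p < partitions; p++) {
            // broker ids 1..N, so each broker leads roughly partitions/brokers logs
            leaders.put(topic + "-" + p, p % brokers + 1);
        }
        return leaders;
    }

    public static void main(String[] args) {
        System.out.println(assignLeaders("payments", 3, 3));
        // {payments-0=1, payments-1=2, payments-2=3}
    }
}
```

With 3 partitions on 3 brokers, each broker leads exactly one partition, which is what spreads both storage and client traffic.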

5. Common Misunderstanding (and Clarification)

It's common to think: "more partitions = more brokers." While more partitions may require more brokers for performance, it's not a strict 1:1 requirement. Partitions and brokers are related, but multiple brokers do not exist simply because there are multiple partitions; they serve different roles in Kafka’s architecture.

You can have:

  • 1 broker with many partitions (not scalable or fault tolerant)
  • Multiple brokers with a few partitions (more scalable and resilient)

Conclusion (Interview Style):
Multiple brokers don't exist just because there are multiple partitions. Instead, Kafka allows topics to be split into partitions for horizontal scaling, and those partitions are distributed across brokers. Brokers are essential for scalability and fault tolerance, while partitions enable parallel processing. Together, they form Kafka's foundation for high throughput and distributed messaging.
