Read time: ~

Consuming Deeper: Groups, Partitions, Offsets, Concurrency

Consumer groups and partition assignment, the poll loop, offset commits and AckMode, auto.offset.reset, container concurrency, and seeking to replay.

Consuming is where most of the interesting behavior lives. A single @KafkaListener hides a poll loop, a consumer group, and an offset tracked per partition. This module is the canonical home for how offsets and partition assignment work, so the Payment service can scale out, survive restarts, and replay history when it needs to.


What you’ll be able to do after this module

  • Explain how a consumer group splits partitions across its instances.
  • Describe the poll, process, commit loop and where offsets are stored.
  • Choose between auto commit and manual commit with the right AckMode.
  • Set auto.offset.reset correctly and know when it bites.
  • Scale a listener with the container concurrency setting.
  • Seek to an offset to replay or skip records.

1. Consumer groups and partition assignment

A consumer group is a set of instances that share the reading of a topic. Kafka assigns each partition to exactly one instance in the group, so work is divided without any instance processing the same partition as another.

flowchart TD
    subgraph topic [orders topic]
        p0["partition 0"]
        p1["partition 1"]
        p2["partition 2"]
    end
    subgraph group [payment-service group]
        c1["instance 1"]
        c2["instance 2"]
        c3["instance 3"]
    end
    p0 --> c1
    p1 --> c2
    p2 --> c3

The number of partitions caps the useful parallelism. With three partitions, a fourth instance in the same group sits idle, because there is no unassigned partition to give it. This is why partition count, chosen back in Local Lab, sets the ceiling on how far a consumer group can scale.

Different groups are independent. The Notification service reading orders under its own group gets every record too, with its own offsets, which is the fan-out behavior from Core Concepts.


2. The poll, process, commit loop

Under the hood, the listener container runs a loop: fetch a batch of records, hand each to your method, then commit the offset of what it finished. The committed offset is stored back in Kafka, in an internal topic named __consumer_offsets.

sequenceDiagram
    participant Container as Listener container
    participant Broker as Kafka
    participant Handler as @KafkaListener method

    loop poll cycle
        Container->>Broker: poll() for assigned partitions
        Broker-->>Container: batch of records
        Container->>Handler: invoke per record
        Handler-->>Container: return normally
        Container->>Broker: commit offset
    end

The committed offset is the position the group resumes from after a restart or reassignment. Committing means “I am done up to here”, so committing before processing risks losing records, and committing after risks reprocessing them. That timing is exactly what the commit mode controls next.


3. Offset commits and AckMode

By default Spring commits offsets for you after the batch is processed. You can change when that happens with the container ack-mode.

AckModeCommits
BATCH (default)After the whole polled batch is processed
RECORDAfter each record’s handler returns
MANUALWhen you call acknowledge(), batched with the next poll
MANUAL_IMMEDIATEWhen you call acknowledge(), immediately

For most services the default is fine. Use manual acks when a record must only be marked done after a side effect succeeds, such as writing to the database.

spring:
  kafka:
    listener:
      ack-mode: manual
    consumer:
      enable-auto-commit: false
@KafkaListener(topics = "orders", groupId = "payment-service")
public void onOrderCreated(OrderCreated event, Acknowledgment ack) {
    paymentService.charge(event);   // do the work first
    ack.acknowledge();              // then commit the offset
}

Even with careful commits, at-least-once delivery means a record can be processed twice after a crash between the side effect and the commit. Making the consumer safe against that is covered in Idempotent Consumers, Ordering, and Duplicates.


4. auto.offset.reset: where a new group starts

When a group has no committed offset yet, for example a brand new groupId, the broker has no position to resume from. auto.offset.reset decides what happens.

ValueBehavior
latest (default)Start at the end; only read records produced from now on
earliestStart at the beginning; read the entire retained log
noneThrow an error if there is no committed offset
spring:
  kafka:
    consumer:
      auto-offset-reset: earliest

This setting only applies when there is no valid committed offset. It does not run on every restart. A common surprise is a new service that misses historical events because it defaulted to latest, or one that unexpectedly replays everything because someone set earliest on a huge topic.


5. Concurrency: scaling one application

You do not need multiple application instances to consume partitions in parallel. The container concurrency setting runs several consumer threads inside one application, each assigned a share of the partitions.

spring:
  kafka:
    listener:
      concurrency: 3

With concurrency: 3 on a six-partition topic, the application runs three consumer threads, each owning two partitions. Setting concurrency higher than the partition count just leaves the extra threads idle, the same ceiling as adding more instances.

flowchart LR
    subgraph app [payment-service, one app]
        t1["thread 1<br/>partitions 0,1"]
        t2["thread 2<br/>partitions 2,3"]
        t3["thread 3<br/>partitions 4,5"]
    end

Concurrency and instances combine: total consumer threads across all instances in a group share the partitions, up to the partition count.


6. Seeking: replay and skip

Sometimes you need to move a consumer’s position deliberately, to replay after a bug fix or to skip past a bad stretch. Implement ConsumerSeekAware (or use the callback on assignment) to seek when partitions are assigned.

@Component
public class ReplayListener implements ConsumerSeekAware {

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments,
                                     ConsumerSeekCallback callback) {
        // Replay everything from the start of each assigned partition
        assignments.keySet().forEach(callback::seekToBeginning);
    }

    @KafkaListener(topics = "orders", groupId = "orders-replay")
    public void reprocess(OrderCreated event) {
        // reprocess historical orders
    }
}

Because reading never deletes, seeking to the beginning replays the retained history without affecting other groups. This is the practical payoff of the log model: the data is still there to reprocess.


7. Guided practical

Run this against the lab, using an orders topic with at least three partitions.

  1. Start the Payment service with groupId: payment-service, then start a second instance. In the Kafka UI, confirm the partitions split across the two instances.
  2. Stop one instance and confirm its partitions are reassigned to the survivor.
  3. Set concurrency: 3 on a single instance over a six-partition topic and confirm three threads share the partitions.
  4. Switch to ack-mode: manual, acknowledge only after the work succeeds, and confirm offsets advance only then.
  5. Start a fresh groupId with auto-offset-reset: earliest and confirm it replays the whole topic.

Next: Section 4, Schema Registry and Data Contracts, where typed events grow into versioned, cross-team data contracts with Avro, Protobuf, and JSON Schema.