Retry, Error Handling and Backoff

Handle consumer failures with retry interceptors, exponential backoff, and a clear split between transient and deterministic errors.

Prerequisite: Consumer Acknowledgements and Prefetch

You’ll need: A broker running locally and a Spring AMQP consumer.

Consumers fail. A downstream API times out, a record is briefly locked, or a message is simply malformed. Naive requeuing turns a single bad message into an infinite loop. This module builds a disciplined retry strategy that separates errors worth retrying from errors that will never succeed.


What you’ll be able to do after this module

  • Distinguish transient failures from deterministic ones.
  • Configure listener retry with exponential backoff.
  • Decide when a message should be retried, rejected, or dead-lettered.
  • Classify exceptions as fatal or non-fatal so the container treats them correctly.

1. Transient vs deterministic failures

The single most important distinction in error handling:

  • Transient: likely to succeed if tried again later. A network blip, a lock timeout, a temporarily unavailable dependency. Retry these.
  • Deterministic: will fail every time no matter how often you retry. A malformed payload, a validation error, a missing required field. Do not retry these; route them out of the way.
flowchart TD
    Start["Listener throws"]
    Q{"Transient error?"}
    Retry["Retry with backoff"]
    Q2{"Retries exhausted?"}
    DLX["Reject to dead-letter exchange"]
    Done["Ack on success"]
    Start --> Q
    Q -->|no, deterministic| DLX
    Q -->|yes| Retry
    Retry --> Q2
    Q2 -->|no| Retry
    Q2 -->|yes| DLX
    Retry -->|succeeds| Done
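
The decision flow above can be sketched in plain Java. This is a minimal illustration, not Spring AMQP's own logic: the RetryDecision class, the Action enum, and the choice of which exception types count as transient are all assumptions made for the example.

```java
import java.io.IOException;
import java.util.concurrent.TimeoutException;

// Hypothetical helper mirroring the flowchart; not part of Spring AMQP.
class RetryDecision {

    enum Action { RETRY, DEAD_LETTER }

    // Assumption: I/O errors and timeouts are treated as transient here.
    static boolean isTransient(Throwable error) {
        return error instanceof IOException || error instanceof TimeoutException;
    }

    // attempt is 1-based: the number of the attempt that has just failed.
    static Action nextAction(Throwable error, int attempt, int maxAttempts) {
        if (!isTransient(error)) {
            return Action.DEAD_LETTER; // deterministic: retrying cannot help
        }
        return attempt < maxAttempts ? Action.RETRY : Action.DEAD_LETTER;
    }

    public static void main(String[] args) {
        System.out.println(nextAction(new TimeoutException("api timeout"), 1, 4));         // RETRY
        System.out.println(nextAction(new TimeoutException("api timeout"), 4, 4));         // DEAD_LETTER
        System.out.println(nextAction(new IllegalArgumentException("bad payload"), 1, 4)); // DEAD_LETTER
    }
}
```

In a real consumer the list of transient exception types would come from the dependencies the listener actually calls.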

2. Listener retry with backoff

Spring AMQP can retry a failing listener in-memory before the message is nacked. Configure it declaratively:

spring:
  rabbitmq:
    listener:
      simple:
        retry:
          enabled: true
          initial-interval: 1000ms
          multiplier: 2
          max-attempts: 4
          max-interval: 10000ms
        default-requeue-rejected: false

This delivers a failing message to the listener up to four times in total (the first attempt plus three retries), with exponential backoff between attempts (1s, 2s, 4s). default-requeue-rejected: false ensures that once retries are exhausted, the message is rejected without being requeued, so it can be dead-lettered instead of looping.
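
To check what a given set of values means in practice, the waits can be computed by hand. The sketch below is a hypothetical helper (BackoffSchedule is not a Spring class) that assumes the usual exponential rule: each wait is the previous one times the multiplier, capped at the maximum interval, with one wait fewer than the number of attempts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes the waits between attempts for a retry configuration.
class BackoffSchedule {

    static List<Long> delaysMillis(long initialInterval, double multiplier,
                                   int maxAttempts, long maxInterval) {
        List<Long> delays = new ArrayList<>();
        double interval = initialInterval;
        // maxAttempts counts the first delivery, so there are maxAttempts - 1 waits.
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            delays.add((long) Math.min(interval, maxInterval));
            interval *= multiplier;
        }
        return delays;
    }

    public static void main(String[] args) {
        List<Long> delays = delaysMillis(1000, 2.0, 4, 10000);
        long total = delays.stream().mapToLong(Long::longValue).sum();
        System.out.println(delays + " total=" + total + "ms"); // [1000, 2000, 4000] total=7000ms
    }
}
```

The total is the time one consumer thread spends waiting on a single message that fails every attempt, not counting the time the listener itself takes.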

Caution: In-memory retry blocks the consumer thread for the whole backoff window. Keep max-attempts and max-interval modest. For long delays between attempts, use TTL-based delayed retry queues, covered in Dead Letter Exchanges.


3. What happens when retries are exhausted

By default, once retries are exhausted the exception propagates up and the message is rejected. To send exhausted messages somewhere useful, use a RepublishMessageRecoverer, which publishes the failed message (with the exception message and stack trace in headers) to a dedicated error exchange.

@Bean
MessageRecoverer messageRecoverer(RabbitTemplate template) {
    return new RepublishMessageRecoverer(template, "orders.dlx", "order.failed");
}

This is often cleaner than plain dead-lettering because it captures the failure reason in the message headers for later diagnosis.
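
To make the diagnostic value concrete, the sketch below builds a header map similar to the one a republished message carries. The header names follow the x-exception-* and x-original-* convention used by RepublishMessageRecoverer; the FailureHeaders class is a hypothetical stand-in written for illustration, and orders.exchange and order.created are made-up names for the original exchange and routing key.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in that imitates the diagnostic headers; not Spring AMQP code.
class FailureHeaders {

    static Map<String, Object> describe(Throwable cause, String exchange, String routingKey) {
        // Render the stack trace to a string so it can travel as a header value.
        StringWriter trace = new StringWriter();
        cause.printStackTrace(new PrintWriter(trace, true));

        Map<String, Object> headers = new LinkedHashMap<>();
        headers.put("x-exception-message", String.valueOf(cause.getMessage()));
        headers.put("x-exception-stacktrace", trace.toString());
        headers.put("x-original-exchange", exchange);
        headers.put("x-original-routingKey", routingKey);
        return headers;
    }

    public static void main(String[] args) {
        Map<String, Object> headers = describe(
                new IllegalStateException("inventory service unavailable"),
                "orders.exchange", "order.created"); // hypothetical names
        System.out.println(headers.get("x-exception-message"));
        System.out.println(headers.get("x-original-routingKey"));
    }
}
```

A consumer on the error queue can read these headers to group failures by cause or to replay a message to its original exchange once the underlying problem is fixed.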


4. The error handler and fatal exceptions

Each listener container has an error handler. The default, ConditionalRejectingErrorHandler, treats certain exceptions as fatal, meaning another delivery is pointless, so the message is rejected without requeue instead of being delivered again. Message conversion failures (a payload that can never deserialize) are fatal by default, which is exactly what you want: a malformed message should not keep circulating through the queue.

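You can mark your own failures as fatal by throwing AmqpRejectAndDontRequeueException, which tells the container to reject the message without requeue, whatever default-requeue-rejected is set to. (With retry enabled, the interceptor's default policy still retries every exception type in memory first, so skipping those attempts also requires a retry policy that treats the exception as non-retryable.)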

@RabbitListener(queues = "inventory.queue")
public void onOrderCreated(OrderCreated event) {
    if (!isValid(event)) {
        throw new AmqpRejectAndDontRequeueException("invalid order payload");
    }
    inventoryService.reserve(event); // transient failures here get retried
}
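
Listener exceptions usually reach the container wrapped in another exception, so a fatal/non-fatal check has to inspect the cause chain rather than only the top-level type. The sketch below shows that idea in plain Java; FatalClassifier and InvalidPayloadException are hypothetical names, and the set of types treated as fatal is an assumption for the example.

```java
// Hypothetical classifier: walks the cause chain looking for non-retryable failures.
class FatalClassifier {

    // Hypothetical application exception for payloads that can never be processed.
    static class InvalidPayloadException extends RuntimeException {
        InvalidPayloadException(String message) {
            super(message);
        }
    }

    static boolean isFatal(Throwable error) {
        int depth = 0;
        // The depth limit guards against accidental cycles in the cause chain.
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof InvalidPayloadException || t instanceof IllegalArgumentException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException("listener failed",
                new InvalidPayloadException("missing orderId"));
        System.out.println(isFatal(wrapped));                              // true
        System.out.println(isFatal(new RuntimeException("lock timeout"))); // false
    }
}
```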

5. Putting it together

A robust consumer combines the pieces:

  1. Enable retry with exponential backoff for transient failures.
  2. Throw AmqpRejectAndDontRequeueException for deterministic failures so they are rejected without requeue, and classify them as non-retryable so they skip the backoff attempts.
  3. Configure a dead-letter exchange so exhausted or rejected messages are captured, not lost.
  4. Alert on the dead-letter queue depth so failures are visible.

Steps 3 and 4 are the subject of the next module.
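
Steps 1 and 2 can be seen working together in a small, framework-free sketch. MiniRetry is a hypothetical miniature written for this module, not the Spring Retry implementation: it retries transient failures with capped exponential backoff, gives up immediately on failures the fatal predicate matches, and takes the sleep function as a parameter so the example runs without actually waiting.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.function.LongConsumer;
import java.util.function.Predicate;

// Hypothetical miniature of the combined strategy; not the Spring Retry implementation.
class MiniRetry {

    // Signals that the caller should reject or dead-letter the message.
    static class Exhausted extends RuntimeException {
        Exhausted(Throwable cause) {
            super("retries exhausted or fatal error", cause);
        }
    }

    static <T> T call(Callable<T> work, int maxAttempts, long initialMillis, double multiplier,
                      long maxMillis, Predicate<Throwable> fatal, LongConsumer sleeper) {
        double interval = initialMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                // Fatal errors skip the remaining attempts; so does the last attempt.
                if (fatal.test(e) || attempt >= maxAttempts) {
                    throw new Exhausted(e);
                }
                sleeper.accept((long) Math.min(interval, maxMillis));
                interval *= multiplier;
            }
        }
    }

    public static void main(String[] args) {
        List<Long> waits = new ArrayList<>();
        int[] calls = {0};
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new IOException("transient"); // fails twice, then succeeds
            }
            return "reserved";
        }, 4, 1000, 2.0, 10000, e -> e instanceof IllegalArgumentException, ms -> waits.add(ms));
        System.out.println(result + " after waits " + waits); // reserved after waits [1000, 2000]
    }
}
```

Passing Thread::sleep wrapped in a lambda as the sleeper would make the waits real; recording them in a list keeps the behaviour easy to inspect.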


Checkpoint

You should now be able to:

  • Tell a transient failure from a deterministic one.
  • Configure listener retry with exponential backoff and a sane attempt limit.
  • Reject deterministic failures immediately with AmqpRejectAndDontRequeueException.
  • Route exhausted messages to an error destination instead of losing or looping them.