Connection Recovery and Fault Tolerance
Automatic connection recovery, connecting to a cluster with multiple addresses, listener container recovery, and graceful shutdown.
Prerequisite: First Producer and Consumer
You’ll need: A broker running locally.
Networks drop and brokers restart. A resilient client survives this without manual intervention. This concise module covers what Spring AMQP recovers for you automatically and the few settings you should get right.
What you’ll be able to do after this module
- Explain what Spring AMQP recovers automatically after a connection drop.
- Configure a client to connect to a clustered broker with multiple addresses.
- Ensure listener containers reconnect and redeclare topology.
- Shut down cleanly so in-flight messages are not lost.
1. Automatic recovery
Spring AMQP recovers connections and topology automatically, using its own recovery mechanisms rather than the auto-recovery built into the RabbitMQ Java client it builds on. After a connection loss, listener containers retry periodically. Once reconnected, RabbitAdmin re-declares the exchanges, queues, and bindings your app declares as beans, and the containers restart their consumers.
sequenceDiagram
    participant App as Spring Boot app
    participant Broker as RabbitMQ
    App->>Broker: connected, consuming
    Note over Broker: node restarts / network drops
    App--xBroker: connection lost
    loop until reconnected
        App->>Broker: retry connect (backoff)
    end
    App->>Broker: reconnected
    App->>Broker: re-declare topology, restart consumers
    Note over App: resumes publishing and consuming
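Recovery is on by default. The main thing to tune is how long a consumer waits before it tries to restart: the listener container's recovery interval (recoveryInterval, 5 seconds by default), which you set on the listener container factory.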
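To make the retry loop in the diagram concrete, here is a minimal plain-Java sketch of a fixed-interval reconnect loop. It is not Spring AMQP's implementation, which you get without writing any of this. `ReconnectLoop`, `connectWithRetry`, and the injected `pause` callback are hypothetical names used only for illustration.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

/** Illustrative only: a fixed-interval reconnect loop, not Spring AMQP's own code. */
final class ReconnectLoop {

    /**
     * Calls tryConnect until it returns true, pausing intervalMs between attempts.
     * Returns the number of attempts made. The pause is injected so the loop
     * can be exercised without real waiting.
     */
    static int connectWithRetry(BooleanSupplier tryConnect, long intervalMs, LongConsumer pause) {
        int attempts = 0;
        while (true) {
            attempts++;
            if (tryConnect.getAsBoolean()) {
                // Reconnected: this is the point where topology is re-declared
                // and consumers are restarted.
                return attempts;
            }
            pause.accept(intervalMs); // wait before the next attempt
        }
    }

    /** A real pause, for use outside of tests. */
    static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to reconnect", e);
        }
    }
}
```

A caller would pass something like `ReconnectLoop::sleep` as the pause. A shorter interval reconnects sooner after a brief blip but hammers a broker that is still starting up.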
2. Connecting to a cluster
In production the broker is usually a cluster (covered in Cluster Architecture and High Availability). Give the client every node so it can fail over to a surviving one:
spring:
  rabbitmq:
    addresses: rmq-1:5672,rmq-2:5672,rmq-3:5672
The addresses property takes precedence over a single host/port pair. On connect or recovery, the client tries the listed nodes until one accepts. Pair this with quorum queues so that the data, not just the connection, survives a node loss.
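The "try the listed nodes until one accepts" behavior can be pictured with a small plain-Java sketch. This is an illustration of the idea rather than the client's actual code. `AddressFailover`, `parse`, `firstReachable`, and the `canConnect` check are hypothetical, and falling back to port 5672 when none is given is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Illustrative only: ordered failover across a comma-separated address list. */
final class AddressFailover {

    static final int DEFAULT_PORT = 5672; // assumed default for entries without a port

    /** Splits "rmq-1:5672,rmq-2:5672" into normalized "host:port" entries. */
    static List<String> parse(String addresses) {
        List<String> nodes = new ArrayList<>();
        for (String part : addresses.split(",")) {
            String entry = part.trim();
            if (entry.isEmpty()) {
                continue; // tolerate stray commas
            }
            nodes.add(entry.contains(":") ? entry : entry + ":" + DEFAULT_PORT);
        }
        return nodes;
    }

    /** Returns the first node that accepts a connection, trying them in listed order. */
    static Optional<String> firstReachable(List<String> nodes, Predicate<String> canConnect) {
        for (String node : nodes) {
            if (canConnect.test(node)) {
                return Optional.of(node);
            }
        }
        return Optional.empty(); // every node refused: the caller retries later
    }
}
```

With `rmq-1` down, `firstReachable` returns `rmq-2:5672`. If every node refuses, the empty result is where a recovery loop would wait and try again.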
3. Publisher-side resilience
A publish attempted during a connection blip can fail. Combine recovery with the reliability tools from earlier in this section:
- Enable publisher confirms so you know whether a publish during a wobble actually succeeded.
- Retry failed publishes with backoff, and keep them idempotent so that retrying a publish that actually succeeded but appeared to fail does no harm.
Caution: In-flight publishes are not automatically replayed after a reconnect. Confirms plus publish retry are what make publishing reliable across a failover.
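The two bullets above can be sketched together in plain Java: a publish loop that retries with doubling backoff until a confirm arrives, and a consumer-side check that ignores a message ID it has already handled. All names here (`ReliablePublish`, `Message`, `publishWithRetry`, `IdempotentHandler`) are hypothetical, and the `publishAndConfirm` callback stands in for a real publish that waits for a broker confirm.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongConsumer;
import java.util.function.Predicate;

/** Illustrative only: publish retry with backoff plus idempotent handling. */
final class ReliablePublish {

    /** Hypothetical message shape: a stable ID lets consumers detect duplicates. */
    static final class Message {
        final String id;
        final String body;

        Message(String id, String body) {
            this.id = id;
            this.body = body;
        }
    }

    /**
     * Publishes until publishAndConfirm reports a confirm or attempts run out.
     * Returns the number of attempts used, or -1 if no confirm was ever seen.
     */
    static int publishWithRetry(Message message, Predicate<Message> publishAndConfirm,
                                int maxAttempts, long initialBackoffMs, LongConsumer pause) {
        long backoff = initialBackoffMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (publishAndConfirm.test(message)) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                pause.accept(backoff); // wait, then double the delay
                backoff *= 2;
            }
        }
        return -1;
    }

    /** Consumer-side dedupe: each message ID is processed at most once. */
    static final class IdempotentHandler {
        private final Set<String> seen = new HashSet<>();
        int processed = 0;

        boolean handle(Message message) {
            if (!seen.add(message.id)) {
                return false; // duplicate caused by a retried publish
            }
            processed++;
            return true;
        }
    }
}
```

The interesting case is a publish that reaches the broker but whose confirm is lost: the publisher sends it again, and the handler drops the second copy because the ID is unchanged. The in-memory set is only for illustration, since a real service would need a store that survives restarts.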
4. Graceful shutdown
When your service stops, let in-flight messages finish instead of dropping them. Spring AMQP shuts listener containers down gracefully and waits up to a configurable timeout:
spring:
  rabbitmq:
    listener:
      simple:
        shutdown-timeout: 15000ms
On shutdown the container stops accepting new deliveries, lets current handlers finish and ack, then closes. Because delivery is at-least-once, anything not acknowledged in time is requeued by the broker and redelivered, either to another instance or to this one after it restarts. Your idempotent consumers (previous module) handle the possible duplicate safely.
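The stop, wait, then close sequence follows the same pattern as shutting down a plain Java thread pool, which the sketch below uses as a stand-in for a listener container. It is an analogy under that assumption, not the container's real code. `GracefulStop` and `stop` are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Illustrative only: bounded graceful shutdown of a pool of message handlers. */
final class GracefulStop {

    /**
     * Stops accepting new work, waits up to timeoutMs for running handlers,
     * then interrupts whatever is left. Returns true if everything finished in time.
     */
    static boolean stop(ExecutorService handlers, long timeoutMs) {
        handlers.shutdown(); // no new tasks are accepted from here on
        try {
            if (handlers.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
                return true; // all in-flight handlers finished and could ack
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        // Timed out or interrupted: cancel the rest. With at-least-once delivery
        // the unacknowledged messages would be redelivered later.
        handlers.shutdownNow();
        return false;
    }
}
```

Choose a timeout somewhat longer than your slowest normal handler. A value that is too short turns routine shutdowns into redeliveries, while one that is too long delays deployments.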
Checkpoint
You should now be able to:
- Explain what Spring AMQP re-declares and restarts after a connection drop.
- Configure multiple cluster addresses for failover.
- Combine confirms and publish retry for resilient publishing.
- Enable graceful shutdown so in-flight work is not lost.