Performance Tuning and Throughput
A holistic tuning guide: prefetch and concurrency, connection and channel sizing, persistence trade-offs, batching, and publisher throughput, driven by measurement.
Prerequisite: Consumer Acknowledgements and Prefetch, and Observability
You’ll need: A workload you can measure and the metrics pipeline from the previous module.
Performance work is not a list of magic settings; it is a loop of measure, find the bottleneck, change one thing, and measure again. This module pulls the levers together into a decision guide. It references the prefetch and concurrency mechanics from module 7 rather than repeating them.
What you’ll be able to do after this module
- Find the real bottleneck before changing settings.
- Balance prefetch and consumer concurrency for your workload.
- Size connections and channels without leaking them.
- Weigh persistence and durability against speed.
- Use batching and the right converter to raise publisher throughput.
1. Measure first, then tune
Every change below is a trade-off, so never guess. Start from the metrics in Observability and let the numbers point to the bottleneck.
flowchart TD
Start["Throughput too low or latency too high"]
Q{"Queue depth rising?"}
Slow{"Handler time high?"}
Pub{"Publisher confirms slow?"}
Start --> Q
Q -->|Yes| Slow
Q -->|No| Pub
Slow -->|Yes| Fix1["Optimize handler / add concurrency"]
Slow -->|No| Fix2["Raise prefetch / consumer count"]
Pub -->|Yes| Fix3["Batch, async confirms, more channels"]
Pub -->|No| Fix4["Bottleneck is downstream (DB, network)"]
A rising queue with idle consumers is a prefetch problem; a rising queue with busy consumers is a handler problem. These need opposite fixes, which is why you measure before touching config.
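The decision flow above can be written down as a small function, which is handy when you want an alert or runbook to point at the right fix. This is only a sketch: the class, enum, and parameter names are hypothetical, and the three boolean inputs are assumed to come from whatever thresholds you set on your own metrics.

```java
/** Hypothetical helper that mirrors the decision flow above; names are illustrative. */
final class BottleneckGuide {

    /** The four outcomes of the flowchart. */
    enum Action {
        OPTIMIZE_HANDLER_OR_ADD_CONCURRENCY,
        RAISE_PREFETCH_OR_CONSUMER_COUNT,
        BATCH_ASYNC_CONFIRMS_MORE_CHANNELS,
        LOOK_DOWNSTREAM
    }

    static Action diagnose(boolean queueDepthRising,
                           boolean handlerTimeHigh,
                           boolean publisherConfirmsSlow) {
        if (queueDepthRising) {
            // Slow handlers mean busy consumers: the handler is the limit.
            // Fast handlers with a growing queue mean starved consumers.
            return handlerTimeHigh
                    ? Action.OPTIMIZE_HANDLER_OR_ADD_CONCURRENCY
                    : Action.RAISE_PREFETCH_OR_CONSUMER_COUNT;
        }
        // The queue is not backing up, so look at the publishing side.
        return publisherConfirmsSlow
                ? Action.BATCH_ASYNC_CONFIRMS_MORE_CHANNELS
                : Action.LOOK_DOWNSTREAM;
    }
}
```

Note that the two queue-rising branches return opposite fixes, which is the point of measuring handler time before changing anything.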
2. Consumer throughput: prefetch and concurrency
The two biggest consumer levers are prefetch (how many unacked messages a consumer holds) and concurrency (how many consumer threads run). The mechanics live in Consumer Acknowledgements and Prefetch; here is how to tune them.
spring:
  rabbitmq:
    listener:
      simple:
        prefetch: 20          # start modest, raise while latency is stable
        concurrency: 4        # threads per listener
        max-concurrency: 16   # scale up under load
- Prefetch too low: consumers sit idle after each ack, waiting a full network round trip for the next message.
- Prefetch too high: one consumer hoards the backlog, hurting fair distribution and memory. A value in the low tens per consumer is a common sweet spot.
- Concurrency: raise it for I/O-bound handlers (database, HTTP calls). For CPU-bound work, more threads than cores just adds contention.
Tune prefetch and concurrency together, because raising one shifts the ideal value of the other.
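To get a first value to test, you can estimate both settings from timings you have already measured. The sketch below is a rule of thumb, not a formula from RabbitMQ or Spring: it assumes prefetch should cover the messages a consumer can process during one broker round trip, and it uses the common cores * (1 + wait/compute) estimate for thread count. The class and method names are hypothetical.

```java
/** Rule-of-thumb starting points only; measure and adjust. All names are hypothetical. */
final class ConsumerSizing {

    /**
     * Prefetch large enough that the next message is already buffered locally
     * while an ack makes its round trip to the broker, capped to keep distribution fair.
     */
    static int suggestPrefetch(double roundTripMillis, double handlerMillis, int cap) {
        if (roundTripMillis < 0 || handlerMillis <= 0 || cap < 1) {
            throw new IllegalArgumentException("roundTrip >= 0, handler > 0, cap >= 1 required");
        }
        int handledDuringRoundTrip = (int) Math.ceil(roundTripMillis / handlerMillis);
        return Math.max(1, Math.min(cap, handledDuringRoundTrip + 1));
    }

    /**
     * Thread count from the cores * (1 + wait/compute) estimate: CPU-bound handlers
     * (wait near 0) stay at the core count, I/O-bound handlers get more threads.
     */
    static int suggestConcurrency(int cores, double waitMillis, double computeMillis) {
        if (cores < 1 || waitMillis < 0 || computeMillis <= 0) {
            throw new IllegalArgumentException("cores >= 1, wait >= 0, compute > 0 required");
        }
        return (int) Math.max(1, Math.round(cores * (1 + waitMillis / computeMillis)));
    }
}
```

For example, a 20 ms round trip with a 2 ms handler suggests a prefetch of 11, and a 4-core host whose handler waits 90 ms on I/O for every 10 ms of CPU suggests 40 threads. Treat both numbers as the starting point of the measure-and-adjust loop, not as the answer.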
3. Connections and channels
A connection is a TCP socket and is expensive; a channel is a lightweight virtual stream multiplexed over it. The rule: share one connection, use many channels, and never share a channel across threads.
flowchart LR
App["Spring app"]
subgraph conn ["1 CachingConnectionFactory connection"]
ch1["channel (publish)"]
ch2["channel (consumer 1)"]
ch3["channel (consumer 2)"]
end
App --> conn
conn --> Broker["RabbitMQ"]
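CachingConnectionFactory pools channels for you. Size the cache to your peak concurrency so channels are reused rather than repeatedly opened and closed. By default the cache size is not a hard limit; it only becomes one, with publishers blocking while they wait for a free channel, when a channel checkout timeout is configured: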
spring:
  rabbitmq:
    cache:
      channel:
        size: 25   # >= peak concurrent publishers/consumers
    publisher-confirm-type: correlated
Watch the connection and channel gauges from module 18. A steadily climbing count means a leak, usually a channel opened per request instead of reused.
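One way to turn "steadily climbing" into something an alert can evaluate is to look at a window of recent gauge samples. The following is a minimal sketch, assuming you already export the channel or connection count as periodic samples; the class name, method name, and the choice of "never decreases and ends higher" as the definition of a leak are all assumptions for illustration.

```java
/** Hypothetical check over sampled gauge values, oldest first. */
final class LeakCheck {

    /**
     * Returns true when the last `window` samples never decrease and end
     * higher than they started, i.e. the count climbs without ever recovering.
     */
    static boolean isSteadilyClimbing(long[] samples, int window) {
        if (window < 2 || samples.length < window) {
            return false; // not enough data to call it a trend
        }
        int start = samples.length - window;
        for (int i = start + 1; i < samples.length; i++) {
            if (samples[i] < samples[i - 1]) {
                return false; // the count dropped, so channels are being released
            }
        }
        return samples[samples.length - 1] > samples[start];
    }
}
```

A healthy service under bursty load rises and falls, so it fails this check; a channel opened per request and never closed passes it. Pick a window long enough to span several traffic cycles so a single ramp-up is not mistaken for a leak.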
4. Persistence and durability trade-offs
Durability is the classic speed-versus-safety dial. Persistent messages on durable queues survive a broker restart but pay a disk write on every message.
| Choice | Speed | Safety |
|---|---|---|
| Transient message on durable queue | faster | message lost on broker restart |
| Persistent message, durable quorum queue | slower | survives restart and node loss |
| Publisher confirms on | slightly slower | you know the broker got it |
Do not blanket-disable persistence to chase numbers. Instead, match durability to the message: a payment event must be persistent, while a live dashboard tick can be transient. Quorum queues (the course default) add replication cost for their safety, which is usually the right trade for business events.
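Matching durability to the message is easier to enforce when the decision lives in one place instead of at every call site. The sketch below assumes a hypothetical set of message kinds and maps each to an AMQP delivery mode (1 is transient, 2 is persistent in AMQP 0-9-1); the class, enum, and kinds are illustrative, not part of the course code.

```java
/** Hypothetical per-message durability policy; the message kinds are examples only. */
final class DurabilityPolicy {

    enum MessageKind { PAYMENT_EVENT, ORDER_EVENT, DASHBOARD_TICK }

    // AMQP 0-9-1 delivery-mode values: 1 = transient, 2 = persistent.
    static final int TRANSIENT = 1;
    static final int PERSISTENT = 2;

    static int deliveryModeFor(MessageKind kind) {
        switch (kind) {
            case DASHBOARD_TICK:
                return TRANSIENT;   // stale within seconds, cheap to lose
            case PAYMENT_EVENT:
            case ORDER_EVENT:
            default:
                return PERSISTENT;  // business events: when unsure, stay safe
        }
    }
}
```

Defaulting to persistent means a newly added message kind is safe until someone deliberately opts it into the faster path. In a Spring AMQP application the chosen mode would typically be applied through the message properties (for example with a `MessagePostProcessor` that calls `setDeliveryMode`), which is left out here to keep the sketch dependency-free.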
5. Batching and the publisher path
If the bottleneck is the publisher, reduce per-message overhead.
- Batch on the client. BatchingRabbitTemplate groups several small messages into a single AMQP message, cutting per-message overhead for high-volume traffic.
- Use asynchronous confirms. Waiting synchronously for each confirm serializes publishing. Correlated async confirms let you keep publishing while acknowledgements stream back (see Publisher Confirms).
- Keep payloads lean. JSON is convenient, but large or deeply nested payloads cost serialization and bandwidth. Send identifiers and let consumers fetch details when the payload would be heavy.
- Reuse the template and converter. Recreating a RabbitTemplate or converter per call throws away pooling.
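// Release a batch at 100 messages, at 16 KiB buffered, or after 200 ms with no new messages.
BatchingStrategy strategy = new SimpleBatchingStrategy(100, 16_384, 200);
// scheduler is a Spring TaskScheduler (for example ThreadPoolTaskScheduler) that fires the timeout flush.
BatchingRabbitTemplate template =
    new BatchingRabbitTemplate(strategy, scheduler);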
Batching trades a little latency (messages wait to fill a batch) for a large throughput gain, so use it for volume, not for latency-critical single messages.
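The asynchronous-confirm item in the list above depends on remembering which messages are still unconfirmed while publishing continues. A minimal, dependency-free sketch of that bookkeeping follows. It assumes the broker reports a publish sequence number with each ack and may set a "multiple" flag meaning "everything up to and including this number"; the class and method names are hypothetical, and this is not how Spring AMQP implements correlated confirms internally.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical bookkeeping for async confirms; illustrative only. */
final class OutstandingConfirms {

    // Publish sequence number -> message body still awaiting a broker ack.
    private final ConcurrentNavigableMap<Long, String> pending = new ConcurrentSkipListMap<>();

    /** Record a message just before publishing; the publisher does not block here. */
    void track(long sequenceNumber, String body) {
        pending.put(sequenceNumber, body);
    }

    /** Broker confirmed one message, or everything up to and including it when multiple is true. */
    void onAck(long sequenceNumber, boolean multiple) {
        if (multiple) {
            pending.headMap(sequenceNumber, true).clear();
        } else {
            pending.remove(sequenceNumber);
        }
    }

    /** Messages still unconfirmed; candidates for retry or alerting. */
    int outstanding() {
        return pending.size();
    }
}
```

Because the publisher only records the message and moves on, throughput is no longer bounded by one confirm round trip per message. A persistently growing `outstanding()` value is itself a useful signal that the broker, not the publisher, is the limit.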
6. A tuning checklist
Work top-down, re-measuring after each change:
- Confirm the bottleneck is RabbitMQ and not a downstream database or API.
- Fix slow handlers first; no broker setting beats a faster consumer.
- Tune prefetch, then concurrency, to keep consumers busy but fair.
- Size the channel cache to peak concurrency; hunt connection leaks.
- Match persistence and confirms to each message’s importance.
- Batch and go async on the publisher only if it is the proven limit.
Checkpoint
You should now be able to:
- Use metrics to locate the real bottleneck before tuning.
- Balance prefetch and concurrency for I/O vs CPU-bound handlers.
- Share a connection, pool channels, and detect leaks.
- Choose persistence per message instead of globally.
- Apply batching and async confirms where they actually help.
Next: Testing RabbitMQ Apps.