Assessment
Scenario-based self-check spanning all eight sections, from messaging foundations and Spring AMQP to reliability, event-driven design, production readiness, and operations.
Prerequisite: all prior sections
Scenario-based questions grouped by section, covering the full course. Try to answer each one before expanding the answer. If you get one wrong, revisit the linked module rather than just reading the correction: the point is to fix the gap, not just to see the right answer.
This is self-assessment, not a graded test. If you can confidently answer most of these without peeking, you are ready to design and operate event-driven systems with RabbitMQ.
Section 1: Foundations and Setup
1. A teammate asks why your team uses RabbitMQ instead of just having orders-service call shipping-service directly over REST. Give a one-sentence answer.
Answer
Decoupling and load-leveling: the producer doesn't need to know who's listening or wait for them to be available, and traffic spikes are absorbed by a queue instead of overloading the downstream service. See Why Messaging: REST vs RabbitMQ.
2. Why is it standard practice to open a single connection per application and many channels on it, rather than a new connection per worker thread?
Answer
A connection is a TCP socket and is relatively expensive; a channel is a lightweight virtual stream multiplexed over one connection. Sharing one connection and using many channels keeps broker file-descriptor and memory usage low. The one rule: a channel must not be shared across threads. See AMQP and the RabbitMQ Model.
Section 2: Core Messaging and Routing
3. You see this Spring configuration:
@Bean
TopicExchange ordersExchange() { return new TopicExchange("orders.exchange"); }
@Bean
Binding binding(Queue q, TopicExchange e) {
    return BindingBuilder.bind(q).to(e).with("order.*.created");
}
Which of these routing keys would match this binding: order.eu.created, order.created, order.eu.eu.created?
Answer
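The word-by-word comparison behind this question can be sketched in plain Java. This is an illustration of the topic-pattern rules only, not RabbitMQ's own matching code; the class and method names are hypothetical.

```java
/** Illustrative matcher for topic binding patterns; not RabbitMQ's implementation. */
class TopicPatternSketch {

    /** Returns true if the dot-separated routing key matches the binding pattern. */
    static boolean matches(String bindingPattern, String routingKey) {
        return match(bindingPattern.split("\\."), 0, routingKey.split("\\."), 0);
    }

    private static boolean match(String[] p, int pi, String[] k, int ki) {
        if (pi == p.length) {
            return ki == k.length;          // pattern used up: the key must be used up too
        }
        if (p[pi].equals("#")) {
            // '#' may absorb zero or more words, so try every possible split point
            for (int next = ki; next <= k.length; next++) {
                if (match(p, pi + 1, k, next)) {
                    return true;
                }
            }
            return false;
        }
        if (ki == k.length) {
            return false;                   // pattern still expects a word, key has none left
        }
        // '*' accepts exactly one word; anything else must match literally
        boolean wordOk = p[pi].equals("*") || p[pi].equals(k[ki]);
        return wordOk && match(p, pi + 1, k, ki + 1);
    }

    public static void main(String[] args) {
        String binding = "order.*.created";
        String[] keys = {"order.eu.created", "order.created", "order.eu.eu.created"};
        for (String key : keys) {
            System.out.println(key + " -> " + matches(binding, key));
        }
    }
}
```

Swapping the binding for `order.#` in `main` shows how the zero-or-more wildcard changes the outcome for the same three keys.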
Only order.eu.created. The * wildcard matches exactly one word, so order.created has too few segments and order.eu.eu.created has too many. (# would match zero or more words, but this binding uses *.) See Exchanges, Bindings and Routing Topologies.
4. You want every subscriber to receive a copy of an event, and separately you want one specific service to receive a command. Which exchange type and routing approach fits each?
Answer
Broadcast a copy to every subscriber with a fanout exchange (or a topic exchange with each consumer's own queue bound to the event key). Direct a command at one specific handler with a direct (or topic) exchange and a targeted routing key so only that queue is bound. See Exchanges, Bindings and Routing Topologies.
5. For reliable work distribution across consumers you pick one queue type; for rebuilding a read model by replaying a year of events you pick another. Which is which, and why?
Answer
Reliable work distribution: a quorum queue (replicated via Raft, the course default; consumption is destructive, which is fine for task queues). Replay: a stream, an append-only log where consumption is non-destructive so a new consumer can read history from any offset. A quorum queue cannot replay because acked messages are removed. See Queues and Messages Deep Dive and RabbitMQ Streams.
Section 3: Building with Spring AMQP
6. True or false: if your @RabbitListener method returns normally (no exception), Spring automatically acks the message for you.
Answer
True, under the default AcknowledgeMode.AUTO. A thrown exception triggers a nack instead. See Consumer Acknowledgements and Prefetch.
7. One consumer sits idle between messages waiting for the next one; in another service a single consumer hoards the whole backlog while its peers stay empty. Which single setting is wrong in each case, and in which direction?
Answer
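The hoarding case in this question can be illustrated with a deliberately simplified model: consumers attach one after another to a queue that already holds a backlog, and each is handed messages up to its prefetch limit before anything is acked. The class, method, and numbers are hypothetical, and real broker dispatch is more nuanced than this snapshot.

```java
import java.util.Arrays;

/** Simplified snapshot model of prefetch; not a simulation of real broker scheduling. */
class PrefetchSketch {

    /**
     * Unacked messages each consumer holds right after attaching (in attach order),
     * before any ack has been sent back to the broker.
     */
    static int[] initialAllocation(int backlog, int consumers, int prefetch) {
        int[] held = new int[consumers];
        int remaining = backlog;
        for (int i = 0; i < consumers; i++) {
            held[i] = Math.min(prefetch, remaining); // a consumer never holds more than its prefetch
            remaining -= held[i];
        }
        return held;
    }

    public static void main(String[] args) {
        // Prefetch larger than the backlog: the first consumer takes everything.
        System.out.println(Arrays.toString(initialAllocation(200, 3, 250))); // [200, 0, 0]
        // Prefetch in the low tens: every consumer gets a share straight away.
        System.out.println(Arrays.toString(initialAllocation(200, 3, 20)));  // [20, 20, 20]
    }
}
```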
Both are prefetch (QoS). Idle-between-messages means prefetch is too low (the consumer waits a network round trip per message), so raise it. One consumer hoarding the backlog means prefetch is too high, so lower it toward the low tens per consumer for fair distribution. See Consumer Acknowledgements and Prefetch.
8. You need a synchronous-style request/reply over RabbitMQ (send a message, wait for a correlated response). Which Spring AMQP feature gives you this without hand-building reply queues?
Answer
RabbitTemplate.sendAndReceive() / convertSendAndReceive() with direct reply-to, which manages a fast pseudo-queue for the response and correlates it automatically. Use it sparingly; request/reply reintroduces temporal coupling that events avoid. See Core Messaging Patterns.
Section 4: Reliability and Resilience
9. Publisher confirms and publisher returns both give the producer feedback. What does each one actually tell you, and what does neither tell you?
Answer
A confirm tells you the broker accepted (and, for persistent messages on durable queues, persisted) the message. A return (with mandatory) tells you the message was unroutable: no queue was bound for its key. Neither tells you a consumer processed it, and neither ties the publish to your database commit (that needs the outbox). See Reliable Publishing: Publisher Confirms and Returns.
10. A consumer sometimes fails on a temporary network blip to a downstream API, and sometimes fails permanently on a malformed payload. How should each failure be handled differently?
Answer
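The two handling paths in this question can be sketched as a small decision function. Everything here is an assumption made for illustration: the class name, treating IllegalArgumentException as the "malformed payload" signal, the attempt limit, and the delay values. It is plain Java, not Spring AMQP's retry configuration.

```java
import java.io.IOException;

/** Illustrative retry-or-dead-letter decision; names and limits are hypothetical. */
class FailureRoutingSketch {

    enum Action { RETRY, DEAD_LETTER }

    static final int MAX_ATTEMPTS = 3;        // hypothetical attempt budget
    static final long BASE_DELAY_MS = 1_000;  // hypothetical first backoff delay
    static final long MAX_DELAY_MS = 10_000;  // hypothetical backoff ceiling

    /** attempt is 1-based: the number of the attempt that just failed. */
    static Action decide(Exception failure, int attempt) {
        if (failure instanceof IllegalArgumentException) {
            return Action.DEAD_LETTER;        // deterministic failure: retrying cannot help
        }
        // Anything else is treated as possibly transient, within a bounded budget.
        return attempt < MAX_ATTEMPTS ? Action.RETRY : Action.DEAD_LETTER;
    }

    /** Exponential backoff: 1s, 2s, 4s, ... capped at MAX_DELAY_MS. */
    static long backoffMillis(int attempt) {
        int doublings = Math.max(0, Math.min(attempt - 1, 20));
        return Math.min(BASE_DELAY_MS << doublings, MAX_DELAY_MS);
    }

    public static void main(String[] args) {
        System.out.println(decide(new IOException("downstream timeout"), 1));
        System.out.println(decide(new IllegalArgumentException("malformed payload"), 1));
        System.out.println(backoffMillis(2) + " ms before the next attempt");
    }
}
```

In a Spring AMQP listener, the dead-letter branch usually corresponds to rejecting the message without requeue (for example by throwing AmqpRejectAndDontRequeueException) on a queue that has a dead letter exchange configured.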
Transient failures: retry with backoff, because the next attempt is likely to succeed. Deterministic (poison) failures: retrying is futile and blocks the queue, so after limited attempts dead-letter the message to a DLQ/parking-lot for investigation instead of requeuing forever. See Retry, Error Handling and Backoff and Dead Letter Exchanges.
11. A queue has a Dead Letter Exchange configured. You notice its DLQ has been steadily accumulating messages for the past hour, while the main queue’s ready count stays near zero. Is this a good sign or a bad sign, and what should you check next?
Answer
Mixed, but better than the alternative: the DLX safety net is working as designed (rather than an endless redelivery loop clogging the main queue), so the main queue itself is healthy. But an accumulating DLQ still means messages are genuinely failing. Check the DLQ message contents and the consumer logs for the actual exception, and don't just leave the messages there or purge them without investigating. See Dead Letter Exchanges and Playbook 06.
12. A consumer processes the same order twice, a day apart, with no crash or error in between. The dev team insists this must be a RabbitMQ bug. What do you tell them, and what evidence would you look for first?
Answer
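An idempotent consumer keyed on the event id can be sketched as below. This is a minimal in-memory illustration with hypothetical names; it assumes every message carries a stable event id, and in a real service the set of processed ids would need to live in a durable store updated together with the side effect.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory sketch of duplicate suppression; a real service would persist the ids. */
class IdempotentConsumerSketch {

    private final Set<String> processedEventIds = ConcurrentHashMap.newKeySet();
    private final AtomicInteger ordersProcessed = new AtomicInteger();

    /** Returns true if the event was processed, false if it was a duplicate and skipped. */
    boolean handle(String eventId) {
        // add() returns false when the id is already present, i.e. a redelivery
        if (!processedEventIds.add(eventId)) {
            return false;
        }
        ordersProcessed.incrementAndGet();   // stands in for the real side effect
        return true;
    }

    int ordersProcessed() {
        return ordersProcessed.get();
    }

    public static void main(String[] args) {
        IdempotentConsumerSketch consumer = new IdempotentConsumerSketch();
        consumer.handle("evt-42");
        consumer.handle("evt-42");           // redelivered copy of the same event
        System.out.println("side effects applied: " + consumer.ordersProcessed());
    }
}
```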
At-least-once delivery is documented, deliberate broker behavior, not a defect: a redelivery happens when an ack never reached the broker (e.g., a crash or connection drop between processing and acking). The real fix is an idempotent consumer keyed on the event id. Look for the same message/order ID appearing twice in consumer logs, and check whether a crash, nack, or connection blip happened between the two occurrences. See Idempotency, Ordering and Duplicate Handling and Playbook 09, Part B.
13. Your app connects to a 3-node cluster but spring.rabbitmq.host names a single node. Why is this a latent outage, and what is the fix?
Answer
If that one named node is down or unreachable, the app sees total broker unavailability even though two healthy nodes remain. Configure spring.rabbitmq.addresses with all cluster nodes so the client (with automatic recovery) can connect to and fail over to any of them. See Connection Recovery and Fault Tolerance.
Section 5: Event-Driven Architecture and Advanced Patterns
14. Your design has messages named ReserveStock and OrderCreated. Which is a command and which is an event, and how does that choice affect coupling?
Answer
ReserveStock is a command (imperative, "do this", aimed at one known handler, higher coupling). OrderCreated is an event (past tense, "this happened", any number of subscribers, lower coupling). Prefer events for cross-service reactions so the publisher never needs to know who reacts. See Event-Driven Microservices with RabbitMQ.
15. When would you choose choreography over orchestration for a multi-step order workflow, and what is the main downside of each?
Answer
Choreography (services react to each other's events) suits simple fan-out reactions and gives maximum decoupling, but the end-to-end flow is implicit and harder to trace. Orchestration (a coordinator issues commands and tracks state) makes complex, ordered, compensating workflows explicit, but the coordinator is a coupling point and potential bottleneck. See Event-Driven Microservices with RabbitMQ.
16. A service writes an order to its database and then publishes OrderCreated. Why don’t publisher confirms make this safe, and what pattern does?
Answer
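The shape of the outbox pattern can be sketched with in-memory lists standing in for database tables. All names are hypothetical, the synchronized method is only a stand-in for a real database transaction, and the publisher is passed in as a function that returns true once the broker has confirmed the publish.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** In-memory sketch of a transactional outbox; lists stand in for DB tables. */
class OutboxSketch {

    private final List<String> orders = new ArrayList<>();
    private final List<String> outbox = new ArrayList<>();

    /** Stands in for ONE database transaction: the order row and the outbox row commit together. */
    synchronized void placeOrder(String orderId) {
        orders.add(orderId);
        outbox.add("OrderCreated:" + orderId);
    }

    /**
     * One relay pass: publish pending rows in order and delete each row only after
     * the publisher reports a confirm. Returns how many rows were published.
     */
    synchronized int relay(Predicate<String> publishAndAwaitConfirm) {
        int published = 0;
        Iterator<String> pending = outbox.iterator();
        while (pending.hasNext()) {
            String event = pending.next();
            if (!publishAndAwaitConfirm.test(event)) {
                break;                       // no confirm: keep the row and retry on the next pass
            }
            pending.remove();
            published++;
        }
        return published;
    }

    synchronized int pending() {
        return outbox.size();
    }

    public static void main(String[] args) {
        OutboxSketch service = new OutboxSketch();
        service.placeOrder("order-1");
        service.relay(event -> false);       // broker unreachable: nothing is lost
        System.out.println("still pending: " + service.pending());
        service.relay(event -> true);        // broker back: the event goes out
        System.out.println("still pending: " + service.pending());
    }
}
```

A relay that crashes after the confirm but before deleting the row will publish the same event again on its next pass, which is one reason downstream consumers still need to be idempotent.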
The database and the broker are two systems with no shared transaction; a crash between the commit and the publish (or vice versa) leaves them inconsistent, and a confirm only proves the broker got the publish, not that it happened atomically with the DB write. The transactional outbox fixes it: write the event to an outbox table in the same DB transaction, then a relay publishes it with confirms. See Transactional Outbox and Saga Patterns.
17. In a saga, one completed step already sent a customer a confirmation email, but a later step fails. You cannot “roll back” a sent email. How does a saga handle this?
Answer
With a compensating action: a new business-level action that offsets the effect, not a database rollback. You cannot un-send the email, so you compensate by sending a follow-up cancellation notice. Compensations must be idempotent because the triggering event can arrive more than once. See Transactional Outbox and Saga Patterns.
18. Name one thing a RabbitMQ stream can do that a classic or quorum queue cannot, and one reason you would still default to a quorum queue.
Answer
A stream can replay: consumption is non-destructive, so a new consumer can read history from any offset, and many consumers can independently read the same high-volume log. You still default to quorum queues because they are the right tool for reliable, destructive work distribution, and streams add complexity and are only worth it when you specifically need replay or very high fan-out. See RabbitMQ Streams.
Section 6: Production Readiness
19. You are provisioning the broker account for order-service. What tags and permissions should it have, and what should it definitely not be?
Answer
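RabbitMQ permissions are regular expressions matched against resource names, one each for configure, write, and read. The check below mimics that with java.util.regex as a rough illustration; the specific patterns and resource names are hypothetical, and the patterns are anchored explicitly with ^ and $ so the result does not depend on how the broker anchors its match.

```java
import java.util.regex.Pattern;

/** Illustrative check of least-privilege permission patterns; patterns are hypothetical. */
class PermissionPatternSketch {

    // Hypothetical grants for a publish-only service account.
    static final String WRITE = "^orders\\..*$"; // may publish to resources named orders.<something>
    static final String READ = "^$";             // matches only the empty name: no read access
    static final String CONFIGURE = "^$";        // may not declare or delete anything

    /** True if the permission pattern matches the resource name. */
    static boolean allowed(String permissionPattern, String resourceName) {
        return Pattern.compile(permissionPattern).matcher(resourceName).find();
    }

    public static void main(String[] args) {
        System.out.println("write orders.exchange:   " + allowed(WRITE, "orders.exchange"));
        System.out.println("write payments.exchange: " + allowed(WRITE, "payments.exchange"));
        System.out.println("read orders.queue:       " + allowed(READ, "orders.queue"));
    }
}
```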
It should be a dedicated user with no administrative tag, scoped to its vhost, with the narrowest configure/write/read patterns it needs (for the Order Service, publish to orders.* and little or no read). It should not connect as guest or as an administrator. Least privilege limits the blast radius if the credential leaks. See Security: Users, Permissions, vhosts and TLS.
20. What’s the key difference in blast radius between a RabbitMQ broker certificate expiring vs. a single Spring Boot app’s mTLS client certificate expiring?
Answer
A broker-side certificate expiring breaks every client trying to connect over TLS at once (total outage for TLS clients). A single app's client certificate expiring (under mTLS) only breaks that one app; everyone else keeps working fine. This "everyone vs. one app" split is the fastest way to tell which side the problem is on. See Security and Playbook 08.
21. You can only put a handful of RabbitMQ signals on your primary dashboard. Which give the earliest, clearest warning that consumers are falling behind?
Answer
Queue depth (messages_ready) trending up, and the publish rate outpacing the ack rate, are the leading indicators of consumer lag. Pair them with unacked count (stuck consumers), listener processing time (the root cause), and DLQ/redelivery rate (poison messages). See Observability: Metrics, Tracing and Health.
22. Two services both show a rising queue. In one, consumers are idle; in the other, consumers are busy at high CPU. Why do these need opposite fixes?
Answer
Rising queue with idle consumers is a delivery/throughput problem: raise prefetch and/or add consumer concurrency so they stay fed. Rising queue with busy consumers is a handler problem: no broker setting beats making the handler faster (or removing a slow downstream call); adding threads to CPU-bound work just adds contention. Always measure first. See Performance Tuning and Throughput.
23. Why test acknowledgement and dead-letter behavior against a real broker (Testcontainers) instead of a mock, and how do you assert on an async consumer without flaky sleeps?
Answer
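The poll-until idea can be sketched with the JDK alone. This is a stand-in for what a library such as Awaitility does, not Awaitility's API; pollUntil and its parameters are hypothetical names, and the background task merely imitates a listener that finishes some time after the publish returns.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** JDK-only sketch of "poll until the condition holds or the timeout expires". */
class PollUntilSketch {

    /** Returns true as soon as the condition holds, false if the timeout passes first. */
    static boolean pollUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;                         // done early: no fixed wait is wasted
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;                        // gave up: the test should fail here
            }
            try {
                Thread.sleep(intervalMillis);        // short pause between checks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        AtomicBoolean consumed = new AtomicBoolean(false);
        // Imitates an asynchronous consumer that completes after the publish call has returned.
        CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            consumed.set(true);
        });
        System.out.println("consumer observed: " + pollUntil(consumed::get, 2_000, 10));
    }
}
```

With Awaitility on the test classpath, the same intent is usually expressed as await().atMost(...).untilAsserted(...) wrapped around the assertion on the consumer's outcome.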
Acks, requeue, retry, and DLX are broker behaviors a mock cannot reproduce, so only a real broker verifies them. Because the publish returns before the consumer runs, assert with Awaitility (poll until the expected outcome or a timeout) rather than Thread.sleep, which is either flaky or slow. See Testing RabbitMQ Apps.
Section 7: Operations and Troubleshooting
24. You run rabbitmqctl list_queues name messages_ready messages_unacknowledged consumers and get:
name            messages_ready  messages_unacknowledged  consumers
payments.queue  15420           0                        0
What’s your first hypothesis, and what’s the very next thing you’d check?
Answer
Zero consumers with a large ready count means nobody is attached to the queue: the consumer app is likely down, crashed, never deployed, or misconfigured so that it does not point at this queue. Next: check whether the consumer app is even running (health check/Actuator), then its logs for a startup or connection failure. See Playbook 01.
25. Same command, different output:
name            messages_ready  messages_unacknowledged  consumers
payments.queue  40              1                        3
Is this healthy or alerting? Why?
Answer
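A coarse first-pass reading of the list_queues rows from questions 24 and 25 can be sketched in plain Java. The class name, verdict names, and the 1,000-message backlog threshold are hypothetical; sensible thresholds depend on each queue's normal traffic.

```java
/** Coarse triage of one list_queues row; the threshold and verdict names are hypothetical. */
class QueueTriageSketch {

    enum Verdict { NO_CONSUMERS, BACKLOG, HEALTHY }

    static final long BACKLOG_THRESHOLD = 1_000; // hypothetical alerting threshold

    static Verdict triage(long ready, int consumers) {
        if (consumers == 0 && ready > 0) {
            return Verdict.NO_CONSUMERS;          // messages waiting and nobody attached
        }
        if (ready > BACKLOG_THRESHOLD) {
            return Verdict.BACKLOG;               // consumers attached but falling behind
        }
        return Verdict.HEALTHY;
    }

    /** Parses a row in the column order: name, messages_ready, messages_unacknowledged, consumers. */
    static Verdict triageLine(String row) {
        String[] cols = row.trim().split("\\s+");
        long ready = Long.parseLong(cols[1]);
        int consumers = Integer.parseInt(cols[3]);
        // cols[2] (messages_unacknowledged) is not needed for this coarse check
        return triage(ready, consumers);
    }

    public static void main(String[] args) {
        System.out.println(triageLine("payments.queue 15420 0 0"));
        System.out.println(triageLine("payments.queue 40 1 3"));
    }
}
```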
Healthy: a low ready count, consumers attached and matching an expected instance count, and only 1 message currently in flight (unacked). This is normal steady-state traffic, not a backlog. See Tooling Walkthrough.
26. A producer’s rabbitTemplate.convertAndSend() calls start hanging/timing out across multiple unrelated services at the same time. The RabbitMQ Management UI shows a banner alarm on the Overview page. What’s the most likely cause, and why does it affect multiple unrelated producers simultaneously instead of just one?
Answer
A memory or disk high-watermark alarm has tripped, which causes RabbitMQ to block all publishers cluster-wide as a deliberate self-protection mechanism. It's not scoped to one app because the broker doesn't know or care which app is publishing; it simply stops accepting new publishes until the resource pressure clears. See Playbook 02.
27. You find this in a Spring Boot app’s logs:
org.springframework.amqp.AmqpConnectException: Connection refused
Name two plausible causes, one broker/infra-side and one that’s more of a “check this before you assume the broker is down” sanity check.
Answer
Broker/infra-side: the broker node is actually down, or a security group/NACL is blocking port 5672/5671 between the app and broker subnets. Sanity check: confirm the app is pointed at the correct host/port at all (e.g., a config typo or an environment-specific value not being overridden correctly) before escalating a "broker is down" incident. See Tooling Walkthrough and Playbook 07.
28. A RabbitMQ user’s password was rotated in Secrets Manager three days ago. A Spring Boot app that’s been running continuously since before the rotation just started throwing PossibleAuthenticationFailureException, but only just now, not three days ago. Explain the delay.
Answer
RabbitMQ doesn't re-authenticate an already-established connection, so the app kept working on its existing TCP connection for three days. The failure appears now because something caused that connection to drop (for example a node restart or network blip), and when Spring AMQP's CachingConnectionFactory tried to reconnect, it used its cached (now-stale) credential instead of fetching the current one from Secrets Manager. See Playbook 05.
29. You’re investigating a connection count that’s climbing steadily on the broker, with no corresponding increase in traffic. rabbitmqctl list_connections name peer_host state shows dozens of connections all from the same peer_host. What’s the likely app-side code smell, and what’s the correct fix?
Answer
Likely smell: the app is creating a new Connection (or Channel) per message/request, e.g., via new ConnectionFactory().newConnection(), instead of reusing a shared, pooled connection. Fix: inject and reuse Spring's managed RabbitTemplate/ConnectionFactory bean (backed by CachingConnectionFactory) instead of manually creating connections/channels. See Playbook 04.
30. You’re diagnosing intermittent latency spikes. You check the Management UI and see the slowdown is affecting every queue on the cluster, not just one app’s. Which of Playbook 09’s two causes does this scope point toward, and what metric would you check next to confirm it?
Answer
Broker-wide scope points toward broker-side CPU credit exhaustion on a burstable EC2 instance type, not an app-side JVM GC pause (which would only affect that one consumer's queue). Check CPUCreditBalance in CloudWatch for the broker nodes. See Playbook 09, Part A.
31. In a 3-node quorum-queue cluster, one node is terminated by AWS due to underlying hardware failure. What happens to queues that had a replica on that node, and what happens if a second node goes down five minutes later, before the first has been replaced?
Answer
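The majority arithmetic behind this question is small enough to write out. The class and method names are hypothetical; the only assumption is the standard rule that a Raft-replicated queue needs more than half of its members online to accept writes.

```java
/** Majority arithmetic for a replicated queue; names are hypothetical. */
class QuorumSketch {

    /** Smallest number of members that forms a majority. */
    static int majority(int members) {
        return members / 2 + 1;
    }

    /** True if enough members remain online to keep accepting writes. */
    static boolean available(int members, int membersDown) {
        return members - membersDown >= majority(members);
    }

    /** How many members can fail before the queue stops accepting writes. */
    static int tolerableFailures(int members) {
        return members - majority(members);
    }

    public static void main(String[] args) {
        System.out.println("3 members, 1 down, available: " + available(3, 1));
        System.out.println("3 members, 2 down, available: " + available(3, 2));
        System.out.println("5 members tolerate failures:  " + tolerableFailures(5));
    }
}
```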
After the first node loss: no data loss and a brief leader re-election; queues keep operating normally on the remaining 2 nodes (quorum queues tolerate 1-of-3 failures). After a second node loss: the cluster loses quorum (only 1-of-3 nodes left) and those queues stop accepting new writes until a majority is restored. This is deliberate majority-based behavior (reinforced by pause_minority where the cluster is configured with it) that protects against data divergence, not a bug. See AWS Architecture and Playbook 03.
32. You’ve diagnosed an incident: a security group change during an unrelated migration accidentally blocked the broker’s inter-node clustering ports (4369/25672), causing a false-looking network partition. Is this something you fix yourself, or escalate? Justify using the ownership distinction from the operations section.
Answer
Escalate: this is a networking/AWS configuration change (security group rules), not a broker-level or application-level fix within the support tier's remit. This kind of infra change should go through the team that owns network configuration, with your diagnosis (which port, which SG, when the migration happened) attached so they can fix it immediately rather than re-investigate. See Playbook 07 and Escalation and Communication.
Scoring yourself
- 27-32 correct: You can design, build, and operate event-driven systems on RabbitMQ with confidence.
- 20-26 correct: Solid grasp; revisit the specific modules behind whichever questions you missed.
- Below 20: Work back through the sections more slowly, actually running the practicals and the capstone rather than reading past them; the hands-on steps are where this sticks.
Next: the Capstone Project if you haven’t built it yet, or the Quick-Reference Cheat Sheet for daily use.