Schema Registry Incompatibility in Production
Diagnose a breaking schema change rejected at publish or crashing consumers, compatibility-mode mismatches, and rollout order.
1. Symptom
Right after a deploy that changed an event’s shape, one of two things happens: producers start failing to publish because the Schema Registry rejects the new schema, or consumers start throwing deserialization errors and crashing on records they cannot read. Either way, the orders or payments pipeline stalls following a schema change.
The goal is to identify whether the change violated the subject’s compatibility mode, and to fix it with the correct rollout order rather than forcing the change through. This playbook builds on Schema Registry and Data Contracts.
2. Likely causes
| Cause | How it manifests |
|---|---|
| Breaking change under BACKWARD compatibility | Registry rejects the new producer schema at registration |
| Removed/renamed required field | New consumers cannot read old records, or vice versa |
| Wrong compatibility mode for the rollout | The mode does not match whether producers or consumers upgrade first |
| Producer and consumer upgraded in the wrong order | A window where one side cannot read the other’s records |
| Default value missing on a new field | Consumers on the new schema cannot fill the field when reading old records, breaking BACKWARD compatibility |
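The rollout-order rule behind the mode and ordering causes above can be written as a small lookup. This is a minimal sketch, assuming the Confluent Schema Registry mode names; the function name `upgrade_first` is hypothetical and exists only for illustration.

```python
# Which side should deploy first for a given compatibility mode.
# BACKWARD: the new schema can read old data, so consumers upgrade first.
# FORWARD: the old schema can read new data, so producers upgrade first.
# FULL: both directions hold, so either order works.
UPGRADE_FIRST = {
    "BACKWARD": "consumers",
    "BACKWARD_TRANSITIVE": "consumers",
    "FORWARD": "producers",
    "FORWARD_TRANSITIVE": "producers",
    "FULL": "either",
    "FULL_TRANSITIVE": "either",
    "NONE": "undefined",  # no compatibility is checked, so no order is guaranteed safe
}


def upgrade_first(mode: str) -> str:
    """Return which side to deploy first for the given compatibility mode."""
    key = mode.strip().upper()
    if key not in UPGRADE_FIRST:
        raise ValueError(f"unknown compatibility mode: {mode!r}")
    return UPGRADE_FIRST[key]
```

Comparing the result for the subject's mode against the actual deploy order shows quickly whether the rollout itself opened a broken window.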
3. How it manifests to the Spring app
| Cause | What the service sees |
|---|---|
| Registry rejects schema | Producer fails on send with a registry incompatibility error |
| Consumer reads incompatible record | SerializationException / deserialization failure (see Poison Messages) |
| Wrong rollout order | Transient failures during the deploy window that clear once both sides match |
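Telling the two failure sides apart from a log line can be sketched as a string match. This is an illustrative sketch only: the phrases are taken from the exhibits in section 7, the exact wording may differ between client and registry versions, and the function name `classify_failure` is hypothetical.

```python
def classify_failure(message: str) -> str:
    """Guess which side failed from an error message.

    Matches on phrases rather than on the exception class, because the
    class name alone does not say which side raised it.
    """
    text = message.lower()
    if "incompatible with an earlier schema" in text:
        return "producer-registration"
    if "error deserializing" in text:
        return "consumer-deserialization"
    return "unknown"
```

An "unknown" result means the message should be read by hand rather than assumed to be schema-related.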
4. Diagnostic steps
- Read the error. A registration rejection is producer-side and blocks publishing; a deserialization failure is consumer-side and blocks reading.
- Check the compatibility mode for the subject in the registry (BACKWARD, FORWARD, FULL). This defines what changes are legal, as covered in Schema Registry.
- Diff the schemas. Compare the new schema version to the previous: was a required field removed, renamed, or added without a default?
- Check rollout order. Under BACKWARD, consumers upgrade first; under FORWARD, producers first. A mismatch creates a broken window.
- Check for stuck records. Incompatible records already produced may be blocking a partition (cross-check the poison-message playbook).
| Step | Question it answers | Time cost |
|---|---|---|
| 1. Error side | Producer registration or consumer read? | seconds |
| 2. Compatibility mode | What changes are legal? | 1 min |
| 3. Schema diff | What exactly changed? | 2-3 min |
| 4. Rollout order | Did the right side deploy first? | 1-2 min |
| 5. Stuck records | Are bad records blocking a partition? | 2-3 min |
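The schema diff in step 3 can be sketched as a comparison of the top-level fields of two Avro record schemas loaded as dictionaries. This is a minimal sketch under stated assumptions: it ignores nested records, aliases, and type promotions, and the function name `diff_record_schemas` and its result keys are hypothetical. A rename appears as one removed field plus one added field.

```python
def diff_record_schemas(old: dict, new: dict) -> dict:
    """Compare the top-level fields of two Avro record schemas."""
    old_fields = {f["name"]: f for f in old.get("fields", [])}
    new_fields = {f["name"]: f for f in new.get("fields", [])}

    removed = sorted(set(old_fields) - set(new_fields))
    added = sorted(set(new_fields) - set(old_fields))
    shared = set(old_fields) & set(new_fields)

    return {
        "removed": removed,
        "added": added,
        # New readers cannot fill an added field without a default
        # when reading old records.
        "breaks_backward": [n for n in added if "default" not in new_fields[n]],
        # Old readers cannot fill a removed field that had no default
        # when reading new records.
        "breaks_forward": [n for n in removed if "default" not in old_fields[n]],
        # Type changes need manual review; some promotions are legal in Avro.
        "type_changed": sorted(
            n for n in shared if old_fields[n]["type"] != new_fields[n]["type"]
        ),
    }
```

The registry remains the authority on what is legal; a diff like this only makes the rejected change easier to spot.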
5. Safe remediations
| Situation | Safe action |
|---|---|
| Breaking change rejected at registration | Revert to a compatible schema (add fields with defaults instead of removing/renaming) |
| Wrong rollout order | Roll back the side that deployed early, then redeploy in the correct order for the mode |
| Consumers crashing on new records | Roll back the producer change; upgrade consumers first if the mode is BACKWARD |
| Records already stuck | Route via DLT (see Poison Messages) while fixing the contract |
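Before redeploying a corrected schema, it can be tested against the registered versions without registering it, using the compatibility endpoint of the Confluent Schema Registry REST API. The sketch below assumes that endpoint and its `is_compatible` response field; the helper names and the example registry URL are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_compat_check(registry_url: str, subject: str, schema: dict,
                       version: str = "latest") -> urllib.request.Request:
    """Build a POST request that tests a candidate schema against a subject."""
    url = "{}/compatibility/subjects/{}/versions/{}".format(
        registry_url.rstrip("/"),
        urllib.parse.quote(subject, safe=""),
        version,
    )
    # The registry expects the schema itself as a JSON-encoded string.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )


def is_compatible(request: urllib.request.Request, timeout: float = 5.0) -> bool:
    """Send the request and return the registry's verdict."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return bool(json.load(resp)["is_compatible"])
```

For example, `is_compatible(build_compat_check("http://localhost:8081", "orders-value", candidate))` would check a candidate schema against the latest registered version, assuming a registry is reachable at that address.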
6. Escalation trigger
Page on-call engineering and the owning team if:
- A breaking change is already deployed and consumers are crashing in production.
- The correct rollout order requires coordinating multiple teams’ deploys.
- Someone proposes setting compatibility to NONE to force the change.
- Stuck incompatible records require offset or DLT intervention on a production topic.
7. Relevant commands and exhibits
# Producer-side: registry rejects a breaking change
Schema being registered is incompatible with an earlier schema for subject
"orders-value"; error code: 409
Reason: READER_FIELD_MISSING_DEFAULT_VALUE: customerEmail
# Consumer-side: cannot deserialize a record written with an incompatible schema
org.apache.kafka.common.errors.SerializationException:
Error deserializing Avro message for id 7
# Check the compatibility mode for a subject (Confluent Schema Registry REST)
curl -s $SCHEMA_REGISTRY/config/orders-value | jq .
# Compatibility-safe evolution: add a field WITH a default
# (backward compatible: readers on the new schema use the default for old records)
{ "name": "customerEmail", "type": ["null", "string"], "default": null }
Recall from Schema Registry: adding optional fields with defaults is safe; removing or renaming required fields is breaking.
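The role of the default can be shown with a simplified model of how a reader resolves a record written without one of its fields. This is a sketch of the idea only, not the Avro library's resolution code; the function name `resolve_record` is hypothetical and records are modelled as plain dictionaries.

```python
def resolve_record(written: dict, reader_fields: list) -> dict:
    """Resolve a written record against the reader's field list.

    Fields the reader does not know are dropped; fields the writer did not
    supply are filled from the reader's default, or fail if there is none.
    """
    resolved = {}
    for field in reader_fields:
        name = field["name"]
        if name in written:
            resolved[name] = written[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            raise ValueError(
                f"reader field {name!r} has no default and is absent from the written record"
            )
    return resolved
```

With `customerEmail` declared as in the exhibit, an old record resolves with `customerEmail` set to null; without the default, the same read fails, which is the condition the registry rejects up front.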
8. Guided practical
Reproduce an incompatibility in the local lab with Schema Registry running.
- Register the OrderCreated Avro schema and produce a record, as in Schema Registry.
- Attempt to register a new version that adds a required field without a default: the registry rejects it under BACKWARD. (Removing a required field is the mirror case, rejected under FORWARD or FULL.)
- Now add the new field with a default instead and confirm it registers and that consumers on the new schema still read old records.
- Explain the correct producer/consumer rollout order for BACKWARD mode.
Next: Hands-On Lab: Incident Diagnosis, where you diagnose a full incident end to end.