
Schema Registry Incompatibility in Production

Diagnose a breaking schema change that is rejected at publish time or crashes consumers, and resolve compatibility-mode mismatches and rollout-order mistakes.


1. Symptom

Right after a deploy that changed an event’s shape, one of two things happens: producers start failing to publish because the Schema Registry rejects the new schema, or consumers start throwing deserialization errors and crashing on records they cannot read. Either way, the orders or payments pipeline stalls following a schema change.

The goal is to determine whether the change violated the subject's compatibility mode, and to fix it with the correct rollout order rather than forcing the change through. This builds on Schema Registry and Data Contracts.


2. Likely causes

| Cause | How it manifests |
|---|---|
| Breaking change under BACKWARD compatibility | Registry rejects the new producer schema at registration |
| Removed/renamed required field | New consumers cannot read old records, or vice versa |
| Wrong compatibility mode for the rollout | The mode does not match whether producers or consumers upgrade first |
| Producer and consumer upgraded in the wrong order | A window where one side cannot read the other’s records |
| Default value missing on a new field | Readers on the new schema cannot supply a value when reading old records, breaking backward compatibility |

3. How it manifests to the Spring app

| Cause | What the service sees |
|---|---|
| Registry rejects schema | Producer fails on send with a registry incompatibility error |
| Consumer reads incompatible record | SerializationException / deserialization failure (see Poison Messages) |
| Wrong rollout order | Transient failures during the deploy window that clear once both sides match |
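
As a rough illustration of telling the two failure sides apart from a log line, here is a minimal Python sketch. The function name `classify_schema_error` and its substring heuristics are assumptions made for this example, based only on the sample messages shown in this playbook; real messages vary by client and registry version, so treat the output as a first hint, not a verdict.

```python
def classify_schema_error(message: str) -> str:
    """Return 'producer', 'consumer', or 'unknown' for a log line (heuristic only)."""
    text = message.lower()
    # Registration rejections mention incompatibility and usually carry HTTP 409.
    # Checked first, because a producer-side rejection can also be wrapped in a
    # SerializationException by the serializer.
    if "incompatible" in text and ("409" in text or "being registered" in text):
        return "producer"
    # Read failures surface as a deserialization error on the consumer.
    if "error deserializing" in text or "deserializationexception" in text:
        return "consumer"
    return "unknown"


if __name__ == "__main__":
    print(classify_schema_error(
        'Schema being registered is incompatible with an earlier schema '
        'for subject "orders-value"; error code: 409'))
    print(classify_schema_error(
        "org.apache.kafka.common.errors.SerializationException: "
        "Error deserializing Avro message for id 7"))
```

A result of `unknown` simply means the line matched neither pattern and needs a human read.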

4. Diagnostic steps

  1. Read the error. A registration rejection is producer-side and blocks publishing; a deserialization failure is consumer-side and blocks reading.
  2. Check the compatibility mode set for the subject in the registry (BACKWARD, FORWARD, FULL, or a transitive variant). The mode defines which changes are legal, as covered in Schema Registry.
  3. Diff the schemas. Compare the new schema version to the previous: was a required field removed, renamed, or added without a default?
  4. Check rollout order. Under BACKWARD, consumers upgrade first; under FORWARD, producers first. A mismatch creates a broken window.
  5. Check for stuck records. Incompatible records already produced may be blocking a partition (cross-check the poison-message playbook).

| Step | Question it answers | Time cost |
|---|---|---|
| 1. Error side | Producer registration or consumer read? | seconds |
| 2. Compatibility mode | What changes are legal? | 1 min |
| 3. Schema diff | What exactly changed? | 2-3 min |
| 4. Rollout order | Did the right side deploy first? | 1-2 min |
| 5. Stuck records | Are bad records blocking a partition? | 2-3 min |
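
The schema diff in step 3 can be approximated with a few lines of Python. This is a minimal sketch, assuming both versions are flat Avro record schemas available as JSON strings. The helper names and the `OrderCreated` fields in the usage example are hypothetical. It only lists removed fields and added fields with or without a default. Whether the registry rejects a given change still depends on the subject's compatibility mode, and a rename shows up as one removal plus one addition.

```python
import json


def field_map(schema_json: str) -> dict:
    """Map field name -> field definition for a flat Avro record schema."""
    schema = json.loads(schema_json)
    return {f["name"]: f for f in schema.get("fields", [])}


def diff_record_schemas(old_json: str, new_json: str) -> dict:
    """Summarize field-level changes between two schema versions."""
    old, new = field_map(old_json), field_map(new_json)
    added = sorted(n for n in new if n not in old)
    return {
        "removed": sorted(n for n in old if n not in new),
        "added_without_default": [n for n in added if "default" not in new[n]],
        "added_with_default": [n for n in added if "default" in new[n]],
    }


if __name__ == "__main__":
    # Hypothetical schema versions, for illustration only.
    old = json.dumps({"type": "record", "name": "OrderCreated", "fields": [
        {"name": "orderId", "type": "string"},
        {"name": "amount", "type": "double"}]})
    new = json.dumps({"type": "record", "name": "OrderCreated", "fields": [
        {"name": "orderId", "type": "string"},
        {"name": "customerEmail", "type": "string"},
        {"name": "note", "type": ["null", "string"], "default": None}]})
    print(diff_record_schemas(old, new))
```

Nested records, aliases, and type changes are deliberately out of scope here; the registry's own compatibility check remains the authority.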

5. Safe remediations

| Situation | Safe action |
|---|---|
| Breaking change rejected at registration | Revert to a compatible schema (add fields with defaults instead of removing/renaming) |
| Wrong rollout order | Roll back the side that deployed early, then redeploy in the correct order for the mode |
| Consumers crashing on new records | Roll back the producer change; upgrade consumers first if the mode is BACKWARD |
| Records already stuck | Route via DLT (see Poison Messages) while fixing the contract |
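
The rollout-order rule (consumers first under BACKWARD, producers first under FORWARD) can be written down as a small lookup. The sketch below is illustrative only: the names `UPGRADE_FIRST` and `rollout_plan` are hypothetical, and it assumes the standard Confluent mode names, including the transitive variants. FULL modes accept either order, and any other value, such as NONE, is treated as a case to escalate.

```python
# Which side should upgrade first for each compatibility mode.
UPGRADE_FIRST = {
    "BACKWARD": "consumers",
    "BACKWARD_TRANSITIVE": "consumers",
    "FORWARD": "producers",
    "FORWARD_TRANSITIVE": "producers",
    "FULL": "either",
    "FULL_TRANSITIVE": "either",
}


def rollout_plan(mode: str, deployed_first: str) -> str:
    """Suggest a next step given the mode and the side that deployed first."""
    expected = UPGRADE_FIRST.get(mode.upper())
    if expected is None:
        # NONE or an unrecognized mode: no order can be assumed safe.
        return f"{mode}: no safe order can be inferred; escalate"
    if expected == "either" or expected == deployed_first:
        return "order ok"
    return f"roll back {deployed_first}, then deploy {expected} first"


if __name__ == "__main__":
    print(rollout_plan("BACKWARD", "producers"))
    print(rollout_plan("FORWARD", "producers"))
```

Here `deployed_first` is expected to be either `"producers"` or `"consumers"`.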

6. Escalation trigger

Page on-call engineering and the owning team if:

  • A breaking change is already deployed and consumers are crashing in production.
  • The correct rollout order requires coordinating multiple teams’ deploys.
  • Someone proposes setting compatibility to NONE to force the change.
  • Stuck incompatible records require offset or DLT intervention on a production topic.

7. Relevant commands and exhibits

# Producer-side: registry rejects a breaking change
Schema being registered is incompatible with an earlier schema for subject
  "orders-value"; error code: 409
  Reason: READER_FIELD_MISSING_DEFAULT_VALUE: customerEmail

# Consumer-side: cannot deserialize a record written with an incompatible schema
org.apache.kafka.common.errors.SerializationException:
  Error deserializing Avro message for id 7

# Check the compatibility mode for a subject (Confluent Schema Registry REST)
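curl -s "$SCHEMA_REGISTRY/config/orders-value" | jq .
# If this returns a 404, the subject likely has no override and inherits the global mode:
curl -s "$SCHEMA_REGISTRY/config" | jq .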

# Compatibility-safe evolution: add a field WITH a default
# (backward compatible: readers on the new schema use the default for old records that lack the field)
{ "name": "customerEmail", "type": ["null", "string"], "default": null }

Recall from Schema Registry: adding optional fields with defaults is safe; removing or renaming required fields is breaking.
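
Before deploying a producer change, a candidate schema can also be checked against the latest registered version through the registry's compatibility endpoint. The following Python sketch uses only the standard library. The function names and the base URL `http://localhost:8081` are assumptions for illustration, and the schema passed in the usage example is hypothetical. It builds a `POST` to `/compatibility/subjects/{subject}/versions/latest` and reads the `is_compatible` flag from the response.

```python
import json
import urllib.parse
import urllib.request


def build_compat_request(base_url: str, subject: str, schema: dict) -> urllib.request.Request:
    """Build the compatibility-check request without sending it."""
    url = "{}/compatibility/subjects/{}/versions/latest".format(
        base_url.rstrip("/"), urllib.parse.quote(subject, safe=""))
    # The registry expects the schema itself as a JSON-encoded string.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"})


def is_compatible(base_url: str, subject: str, schema: dict) -> bool:
    """Send the check; requires a reachable Schema Registry."""
    request = build_compat_request(base_url, subject, schema)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return bool(json.load(resp).get("is_compatible", False))


if __name__ == "__main__":
    candidate = {"type": "record", "name": "OrderCreated", "fields": [
        {"name": "customerEmail", "type": ["null", "string"], "default": None}]}
    req = build_compat_request("http://localhost:8081", "orders-value", candidate)
    print(req.get_method(), req.full_url)
```

Running a check like this in CI would surface an incompatibility before a producer ever tries to register the schema at publish time.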


8. Guided practical

Reproduce an incompatibility in the local lab with Schema Registry running.

  1. Register the OrderCreated Avro schema and produce a record, as in Schema Registry.
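  2. Attempt to register a new version that adds a required field with no default: the registry rejects it under BACKWARD with READER_FIELD_MISSING_DEFAULT_VALUE. (Removing a field on its own is rejected under FORWARD or FULL, not under BACKWARD.)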
  3. Now add a new field with a default instead and confirm it registers and old consumers still read old records.
  4. Explain the correct producer/consumer rollout order for BACKWARD mode.

Next: Hands-On Lab: Incident Diagnosis, where you diagnose a full incident end to end.