Skip to content

Improve the broker's handling of message corruption#2136

Merged
cshannon merged 4 commits into
apache:mainfrom
cshannon:corrupt-message-handling
Jun 23, 2026
Merged

Improve the broker's handling of message corruption#2136
cshannon merged 4 commits into
apache:mainfrom
cshannon:corrupt-message-handling

Conversation

@cshannon

@cshannon cshannon commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

This PR improves the broker' handling of messages that are corrupt and can't be read, such as when unmarshaling the properties or body. Currently if there is an error an IOException is triggered and can lead to a client connection be closed. Furthermore for queues messages can be stuck and no new messages can be delivered.

To improve things the following changes have been made:

  1. The MarshallingSupport utility that is used to unmarshal message properties and bodies has improve validation to check for errors such is incorrectly encoded size values or a large stack depth of nested objects (limited to 127). This prevents problems such as large buffers being allocated by mistake.

  2. The broker will now handle message format errors and there are two main spots errors had to be handled:

    • The first spot is when messages are published and a topic or queue try to match to consumers. The properties (or body for xpath) could be unmarshaled for for selector matching which could trigger an error before the message has been added to the subscription. The previous behavior just skips the message and continues. For topics this is makes sense because each sub is independently handled and will just ack/remove for an unmatching sub and it won't block any other sub. However for the queue case, we can end up in a loop and be "stuck" and need to DLQ.
    • The second spot is later during actual dispatch attempt to clients from the subscriptions if the message made it. The primary cause of failure here is an error tying to convert for another protocol the headers/body. Currently this just triggers an IOException and closes the consumer but doesn't deal with the message. This PR will fix this so that errors that happen that we know are due to a corrupt message will be acked and DLQ'd so they get removed by the queue or durable
  3. The Stomp protocol converter was fixed to not auto ack or track acks until the message has been converted.

  4. AMQP no longer swallows message format errors and will throw so the errors can be handled by the TransportConnection.

  5. Any compressed streams that are passed to MarshallingSupport now are wrapped using the new FrameSizeLimitedFilterInputStream class from Provide a flexible filter style input stream that limits read amounts #2118 so the utility doesn't break during inflation. As noted below a future PR will handle limiting the inflation amount.

All of these changes allow the broker to deal with corrupt messages and remove them (and possibly DLQ) vs causing the connections to close or the messages to block consumers forever in the queue case.

Note: There will be a follow on PR after this is merged to handle compressed message validation as well that builds on this.

This commit improves the brokers handling of messages that are corrupt
and can't be read, such as unmarshaling the properties or body.
Currently if there is an error an IOException is triggered and can lead
to a client connection be closed. Furthermore for queues messages can be
stuck and no new messages can be delivered.

To improve things the following changes have been made:

* The MarshallingSupport utility that is used to unmarshal message
  properties and bodies has improve validation to check for errors
  such is incorrectly encoded size values.
* The broker will now handle message format errors both when messages
  are evaluated to add to subscriptions and during dispatch to consumers
  when the messages are already on a subscription.
* The Stomp protocol converter was fixed to not auto ack or track acks
  until the message has been converted.
* AMQP no longer swallows message format errors and will throw so the
  erors can be handled by the TransportConnection.

All of these changes allow the broker to deal with corrupt messages and
remove them (and possibly DLQ) vs causing the connections to close or
the messages to block consumers forever in the queue case.
Comment thread activemq-broker/src/main/java/org/apache/activemq/broker/region/Queue.java Outdated
Comment thread activemq-broker/src/main/java/org/apache/activemq/broker/region/Queue.java Outdated
@cshannon cshannon requested a review from mattrpav June 22, 2026 18:36

@tabish121 tabish121 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't spot any obvious issues, after all the fixes that have already been applied it seems reasonable to me.

@cshannon

Copy link
Copy Markdown
Contributor Author

All tests passed, I am going to merge and wrap up my second follow on PR

@cshannon cshannon dismissed mattrpav’s stale review June 23, 2026 00:13

changes were made to address concerns

@cshannon cshannon merged commit b20e61c into apache:main Jun 23, 2026
10 checks passed
@cshannon cshannon deleted the corrupt-message-handling branch June 23, 2026 00:14
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Apache ActiveMQ v6.3.0 Jun 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants