RabbitMQ and transactions

October 25, 2016

RabbitMQ can’t (in general) participate in two phase commit.

From a practical point of view, RabbitMQ can only make a message durable by adding it to a queue. This makes quite a few optimisations possible. Transaction participation would require RabbitMQ to spool messages temporarily on disk before adding them to a queue on transaction commit, doubling the cost of a publish.

RabbitMQ’s TX support doesn’t do this: transactions are not durable or atomic, so a broker failure might result in the partial application of a transaction. The protocol doesn’t allow transactions to be resumed on a new connection in any case. In fact, even publishing is not atomic if messages are routed to multiple queues. This rules out using the last resource commit optimisation, except when publishing to a single queue.
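
For reference, this is roughly what those AMQP transactions look like from a client. A minimal sketch using the Python pika library (the broker address and queue name are assumptions, not anything prescribed here):

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="work")

channel.tx_select()   # put the channel into transactional mode
channel.basic_publish(exchange="", routing_key="work", body=b"work item")
channel.tx_commit()   # the broker applies the buffered publishes here
```

The tx_commit only means the broker accepted the batch on this channel: as described above, it is neither atomic across queues nor resumable on a new connection.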

When you want to send a message as a side effect of a database transaction completing, and the result needs to be reliable, you are going to need to roll something yourself.

There are a few options:

If you want to get messages in an order that’s consistent with the transaction serialisation, you will need to do something like this:

Write a transaction log into a database table, and then have your publisher chase the tail. If you want the order of messages to match the actual transaction order, you are going to have to use a database-generated sequence as one of the columns, or have a single thread perform all transactions and generate the sequence itself. These are the only options for consistent message order. Even then, you are going to have to read the documentation for your database: ‘consistent with’ doesn’t mean ‘in the same order’, because your database may only establish a partial order.

Use the sequence as the message correlation id, and use it in consumers to make them idempotent. You don’t have to use a ‘sent’ flag: your publisher can track the sequence and checkpoint where it’s up to periodically. This will result in some re-sends if the publisher fails, but your consumers are idempotent, and re-sends are inevitable anyway. You will need to use publish acknowledgements to do the checkpoints.
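
A minimal sketch of that tail-chasing publisher, using the pika and psycopg2 Python libraries (the table, exchange and column names are all illustrative assumptions, and error handling is elided):

```python
import time

import pika
import psycopg2

# Illustrative schema:
#   CREATE TABLE tx_log (
#       seq     bigserial PRIMARY KEY,  -- database-generated sequence
#       payload text      NOT NULL
#   );
#   CREATE TABLE publisher_checkpoint (last_seq bigint NOT NULL);  -- single row

db = psycopg2.connect("dbname=app")
amqp = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = amqp.channel()  # assumes an 'events' exchange already exists
channel.confirm_delivery()  # publish acknowledgements: basic_publish raises on failure

with db, db.cursor() as cur:
    cur.execute("SELECT last_seq FROM publisher_checkpoint")
    (checkpoint,) = cur.fetchone()

while True:
    with db, db.cursor() as cur:
        # Chase the tail: everything after the checkpoint, in sequence order.
        # As noted above, sequence values may become visible out of commit
        # order; check what your database actually guarantees.
        cur.execute("SELECT seq, payload FROM tx_log WHERE seq > %s ORDER BY seq",
                    (checkpoint,))
        rows = cur.fetchall()
    for seq, payload in rows:
        # The sequence doubles as the correlation id for consumer idempotency.
        channel.basic_publish(
            exchange="events", routing_key="", body=payload.encode(),
            properties=pika.BasicProperties(correlation_id=str(seq)))
        checkpoint = seq
    if rows:
        # Checkpoint only after the broker has acknowledged the batch; a crash
        # before this point just means some re-sends.
        with db, db.cursor() as cur:
            cur.execute("UPDATE publisher_checkpoint SET last_seq = %s",
                        (checkpoint,))
    else:
        time.sleep(1)
```

Checkpointing after the acknowledged batch, rather than flagging every row as sent, keeps the log table append-only; the price is a few duplicate publishes after a crash, which the idempotent consumers absorb.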

This solution is eventually consistent: transactions can succeed without the side effect of a message being sent immediately. The message will eventually be sent.

If you don’t need total ordering, but perhaps just order for a particular object, you can use the object version in the transaction log instead of a sequence. This will improve database performance a great deal.
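
A sketch of that variant (again with illustrative names): the log is keyed by object rather than globally, and a consumer needs only one counter per object to stay idempotent:

```python
# Illustrative per-object schema; ordering is per object, not global:
#   CREATE TABLE tx_log (
#       object_id bigint NOT NULL,
#       version   bigint NOT NULL,  -- the object's own version counter
#       payload   text   NOT NULL,
#       PRIMARY KEY (object_id, version)
#   );

def apply_change(object_id: int, payload: str) -> None:
    ...  # hypothetical application logic

applied: dict[int, int] = {}  # object_id -> highest version applied

def handle(object_id: int, version: int, payload: str) -> None:
    # Anything at or below the version already applied is a re-send
    # or stale message, and dropping it is harmless.
    if version <= applied.get(object_id, 0):
        return
    apply_change(object_id, payload)
    applied[object_id] = version
```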

Another alternative is atomic, but threads committing transactions race to determine the message order:

Create a transaction queue, and use the last resource commit optimisation to publish to it: the ‘last’ participant in a two-phase commit gets to prepare-and-commit, so make RabbitMQ that last resource. Another process reads messages from the transaction queue and sends them on to multiple recipients, if required. Remember that publishing a single message to an exchange can mean multiple recipients.

This is quite easy to implement using PostgreSQL’s PREPARE TRANSACTION. Once you get the publish acknowledgement, COMMIT PREPARED. One strategy for dealing with unacknowledged transactions is to roll back any transaction older than a time-out value (query pg_prepared_xacts).
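
A minimal sketch of that dance with the same pika and psycopg2 libraries (the queue name and SQL are illustrative; psycopg2’s tpc_* methods issue PREPARE TRANSACTION and COMMIT PREPARED under the hood):

```python
import uuid

import pika
import psycopg2

db = psycopg2.connect("dbname=app")
amqp = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = amqp.channel()
channel.queue_declare(queue="transactions")
channel.confirm_delivery()  # basic_publish now waits for the broker's ack

xid = db.xid(0, str(uuid.uuid4()), "app")
db.tpc_begin(xid)
with db.cursor() as cur:
    cur.execute("UPDATE accounts SET balance = balance - 10 WHERE id = 1")  # hypothetical work
db.tpc_prepare()  # PREPARE TRANSACTION: durable, but not yet decided

try:
    # RabbitMQ goes last: publish, wait for the acknowledgement, then decide.
    channel.basic_publish(exchange="", routing_key="transactions",
                          body=b"account 1 debited")
    db.tpc_commit()    # COMMIT PREPARED
except pika.exceptions.AMQPError:
    db.tpc_rollback()  # ROLLBACK PREPARED
    raise

# A reaper can time out transactions that never got a verdict:
#   SELECT gid FROM pg_prepared_xacts
#   WHERE prepared < now() - interval '5 minutes';
# ...then ROLLBACK PREPARED each gid it returns.
```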

Obviously that way of doing things is tied to PostgreSQL, but any database that implements X/Open XA can theoretically do this. It’s just more complicated to describe…

This option has different semantics: if the message can’t be sent, the transaction fails and will need to be retried. The reason to delay sending a message until the transaction is committed is that you think prepare might fail, which is generally because of a conflict between transactions. In that case you will need to retry the transaction anyway, so you can use your existing retry mechanism to achieve reliability. Having multiple, overlapping reliability mechanisms just means more complexity, more testing and ultimately less reliability, so this is a win.

This is all pretty complicated, particularly the transaction log table. Haven’t we just implemented a queue in our database?

Now that we have our transaction log/queue, we can fairly trivially create a web service that returns the list of messages since transaction X. A consumer can use this web service to stay up to date with the ‘publisher’. There are lots of systems which need an audit log, and the transaction log can serve this purpose, so it need never be deleted. If that’s not possible, just keep the log long enough to give your consumers ‘long enough’ to catch up.
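
Such a service can be a very thin layer over the log table. A minimal sketch using Flask, against the same illustrative tx_log schema as above:

```python
import psycopg2
from flask import Flask, jsonify, request

app = Flask(__name__)
db = psycopg2.connect("dbname=app")

@app.route("/messages")
def messages_since():
    # The consumer passes the last sequence number it has seen.
    since = int(request.args.get("since", 0))
    with db, db.cursor() as cur:
        cur.execute(
            "SELECT seq, payload FROM tx_log WHERE seq > %s ORDER BY seq LIMIT 100",
            (since,))
        rows = cur.fetchall()
    return jsonify([{"seq": seq, "payload": payload} for seq, payload in rows])
```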

This is an end-to-end mechanism: it’s much more reliable than using RabbitMQ alone. In fact it’s one of the mechanisms I intended to write about in a follow-up to the end-to-end article. Better late than never.


  1. Edwin Dalorzo says:

    Very interesting post. I found it very useful because I have been investigating how to use RabbitMQ publish confirms to guarantee delivery of messages.

    Regarding your suggestion:

    > Write a transaction log into a database table, and then have your publisher chase the tail.

    How do you handle the fact that there could be multiple machines playing the role of publisher? I mean, how do you make sure that the publishers don’t step on each other’s toes? That’s what I haven’t been able to figure out, because for this I would need the log tail to be qualified with a unique publisher id, and to make sure that such an id can survive machine restarts.

    Do you have any advice on how to deal with that part of the problem?
