'rabbitmq-upgrade revive' command returns error #12013
-
Describe the bug

When attempting a rolling upgrade of RabbitMQ from 3.13.2 to 3.13.6, the 'rabbitmq-upgrade revive' command errors on the first node in the cluster and the upgrade stalls.

rabbitmq-upgrade revive

Reproduction steps

Expected behavior

We expect the command not to error, as in previous successful rolling upgrades.

Additional context

No response
-
@alphamonkey79 I cannot reproduce this, and it has not been reported elsewhere (that I know of). We need details about the state of the cluster. One easy workaround would be to leave out
-
Most likely a quorum queue was being deleted concurrently with the "revival" operation. So this list of handled errors simply needs one extra clause added. The reason why it is so hard to hit is that
-
#12014 will avoid this exception. To summarize:
-
Hello Michael, This is a 3-node cluster that is currently in a mixed-version state. This is the third environment/cluster to go from 3.13.2 to 3.13.6 and the first to run into this issue, which is strange. The possibility of a quorum queue replica being created or removed concurrently makes sense, given that we only see the error on 2 out of 3 nodes.
-
Thank you @michaelklishin! Tim |
-
I have pushed some Upgrades guide updates to make it clear that `revive` is not necessary in most cases, and its only purpose is to undo (some) effects of `drain`: rabbitmq/rabbitmq-website@bb21df7.

@kjnilsson has identified an unintentional mistake in which queues those commands attempt to operate on, which can produce this and other unnecessary exceptions. Even though #12014 will handle this and most (if not all) other errors our Raft library can return, a separate fix for that is coming in `3.13.7`.

Together with an easy-to-adopt workaround (a node must be restarted one way or another), this hopefully can be considered addressed.
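For readers scripting their own rolling upgrades, here is a minimal sketch of the workaround for a single node: drain, restart onto the new version, and skip `revive`, since the restart already happens as part of the upgrade. The `systemctl` calls and the service name `rabbitmq-server` assume a systemd-based package install, and `upgrade_rabbitmq_package` is a hypothetical placeholder, not a RabbitMQ command. By default the script only prints the steps.

```shell
#!/bin/sh
# Sketch only: upgrade ONE node of a cluster without calling `rabbitmq-upgrade revive`.
# RUN defaults to "echo", so the steps are printed rather than executed.
# Set RUN to an empty string (RUN= ./script.sh) to actually run them.
RUN="${RUN-echo}"

# Hypothetical placeholder: replace the body with your package manager's
# upgrade step. It is not part of RabbitMQ.
upgrade_rabbitmq_package() {
  echo "replace upgrade_rabbitmq_package with your own package upgrade step" >&2
  return 1
}

upgrade_one_node() {
  # Put the node into maintenance mode before taking it down.
  $RUN rabbitmq-upgrade drain || return 1
  # Assumption: systemd service named rabbitmq-server.
  $RUN systemctl stop rabbitmq-server || return 1
  $RUN upgrade_rabbitmq_package || return 1
  $RUN systemctl start rabbitmq-server || return 1
  # Wait for the node to finish booting; no revive step follows.
  $RUN rabbitmqctl await_startup
}

upgrade_one_node
```

Run it as-is to review the printed plan for one node, then repeat node by node once the placeholder has been filled in.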