Creating a RabbitMQ cluster is tricky. It is easy to do manually but hard to automate.
TL;DR: Use this Terraform configuration to create a RabbitMQ cluster in less than 5 minutes.
The simplest cluster requires two nodes and a load balancer. In AWS, we use an ELB as the load balancer and put the nodes in an Auto Scaling group, so that if a node goes down (or becomes unhealthy), it is replaced by a new one.
Our setup will be:
Using Terraform, we can create the Launch Configuration, the Auto Scaling group, and the ELB. This is our ELB configuration:
What about the nodes? We use cloud-init to initialize each node, and there we configure RabbitMQ to run in Docker.
To find the other nodes in the cluster, we prepared a bash script that queries the instances in our Auto Scaling group.
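As a rough illustration of that discovery step, here is a minimal sketch rather than the original script. It assumes the AWS CLI is installed, a default region is configured, and the instance role may call `autoscaling:DescribeAutoScalingGroups` and `ec2:DescribeInstances`. The function names and the group name `my-rabbitmq-asg` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the example group name are made up for illustration.

# Print the IDs of all InService instances in the given Auto Scaling group, one per line.
asg_instance_ids() {
  local group="$1"
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$group" \
    --query "AutoScalingGroups[0].Instances[?LifecycleState=='InService'].InstanceId" \
    --output text | tr '\t' '\n'
}

# Print the private IP of each instance ID passed as an argument, one per line.
private_ips() {
  aws ec2 describe-instances \
    --instance-ids "$@" \
    --query 'Reservations[].Instances[].PrivateIpAddress' \
    --output text | tr '\t' '\n'
}

# Print the private IPs of every group member except this instance.
# Prints nothing when this instance is the only member.
peer_ips() {
  local group="$1" self_id="$2" ids
  ids=$(asg_instance_ids "$group" | grep -v -x "$self_id" || true)
  if [ -z "$ids" ]; then
    return 0
  fi
  # Unquoted on purpose: each instance ID becomes a separate argument.
  private_ips $ids
}

# Example call (instance ID shown is a placeholder):
#   peer_ips my-rabbitmq-asg i-0123456789abcdef0
```

The instance's own ID has to come from somewhere, typically the EC2 instance metadata service. RabbitMQ node names are built from hostnames, so depending on how the nodes are named you may want `PrivateDnsName` instead of `PrivateIpAddress` in the query.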
A second script then joins the new node to those nodes.
The tricky part is that a node has to be stopped before it can join a cluster, so there is a chance that the node it is trying to join is stopped at the same moment. To mitigate this, we sleep for a random number of seconds before stopping the server. Also, in case of errors, we retry a reasonable number of times.
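The random delay and the retries could look roughly like the sketch below. It is an illustration under assumptions, not the original script: the container name `rabbitmq`, the function names, and the example node names are hypothetical, and all nodes are assumed to already share the same Erlang cookie. The `rabbitmqctl` subcommands `stop_app`, `join_cluster`, and `start_app` are the standard ones.

```shell
#!/usr/bin/env bash
# Sketch only: the container name "rabbitmq" and all function names are assumptions.

# Run rabbitmqctl inside the RabbitMQ container.
rmq() {
  docker exec rabbitmq rabbitmqctl "$@"
}

# Sleep for a random whole number of seconds in [0, max).
random_sleep() {
  local max="$1"
  if [ "$max" -le 0 ]; then
    return 0
  fi
  sleep $(( RANDOM % max ))
}

# Stop the local app, try to join the given node, then always start the app
# again so this node stays reachable for peers even if the join failed.
join_peer() {
  local node="$1" status=0
  rmq stop_app || return 1
  rmq join_cluster "$node" || status=1
  rmq start_app || status=1
  return "$status"
}

# Usage: join_cluster_with_retries ATTEMPTS MAX_SLEEP NODE...
# Each attempt waits a random time, then tries every node in turn.
# Returns 0 on the first successful join and 1 if every attempt fails.
join_cluster_with_retries() {
  local attempts="$1" max_sleep="$2" try node
  shift 2
  for try in $(seq 1 "$attempts"); do
    random_sleep "$max_sleep"
    for node in "$@"; do
      if join_peer "$node"; then
        return 0
      fi
    done
  done
  return 1
}

# Example call (node names are placeholders):
#   join_cluster_with_retries 5 30 rabbit@ip-10-0-0-1 rabbit@ip-10-0-0-2
```

Sleeping before each attempt, not only the first, means two nodes that collided once are unlikely to collide again on the retry.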
Using this Terraform configuration, we have successfully deployed many RabbitMQ clusters with up to 4 nodes.
Leave a comment if you find this useful, or a question if you run into trouble. Cheers!