# 🚀 RabbitMQ Cluster on Kubernetes (Complete Setup + Troubleshooting Guide)
---
# 📌 Objective
Deploy a **3-node RabbitMQ Cluster** on Kubernetes with:
* High Availability
* Persistent Storage (NFS)
* Auto Clustering
* Management UI
* Application connectivity (Tomcat)
---
# 🏗️ Components Created
## 1. Persistent Volumes (NFS)
We created 3 PVs:
* pv-rabbitmq1
* pv-rabbitmq2
* pv-rabbitmq3
Each mapped to:
```text
/data/nfsshared/rabbitmq-pv1
/data/nfsshared/rabbitmq-pv2
/data/nfsshared/rabbitmq-pv3
```
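The three PV manifests differ only in their number, so they can be generated rather than copied. A minimal sketch, assuming the values used in the PV files of this guide; the function name `render_pv` is hypothetical:

```bash
# Print a PersistentVolume manifest for volume number $1 (1, 2 or 3).
# Values mirror the PV files in this guide; render_pv is a made-up helper name.
render_pv() {
  n="$1"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-rabbitmq${n}
  labels:
    type: pv-rabbitmq
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - sec=sys
    - nfsvers=4.1
    - hard
  nfs:
    server: controlnode
    path: /data/nfsshared/rabbitmq-pv${n}
    readOnly: false
EOF
}
```

Possible usage: `for n in 1 2 3; do render_pv "$n" | kubectl apply -f -; done`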
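Used:
```yaml
accessModes:
  - ReadWriteOnce
```
👉 `ReadWriteOnce` lets a single node mount the volume read-write. The **1 pod = 1 storage** mapping comes from the StatefulSet's `volumeClaimTemplates`, which create one PVC per pod.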
---
## 2. ConfigMap
Contains:
### enabled_plugins
```erlang
[rabbitmq_management,rabbitmq_peer_discovery_k8s].
```
### rabbitmq.conf
```ini
cluster_formation.peer_discovery_backend = k8s
cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
cluster_formation.k8s.address_type = hostname
cluster_formation.k8s.service_name = service-rabbitmq-headless
cluster_formation.k8s.hostname_suffix = .service-rabbitmq-headless.default.svc.cluster.local
cluster_formation.node_cleanup.interval = 10
cluster_formation.node_cleanup.only_log_warning = true
cluster_partition_handling = autoheal
queue_master_locator=min-masters
```
👉 Enables **auto clustering using Kubernetes**
---
## 3. RBAC (CRITICAL)
```text
ServiceAccount → rabbitmq
Role → get/list/watch on pods, endpoints
RoleBinding → binds the Role to the ServiceAccount
```
👉 Required because:
```text
RabbitMQ calls Kubernetes API → needs permission
```
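The permissions can be checked before starting RabbitMQ. A hedged sketch using `kubectl auth can-i` with impersonation; it assumes your own user is allowed to impersonate service accounts (cluster-admin normally is), and the function name is hypothetical:

```bash
# Ask the API server whether the rabbitmq ServiceAccount may read the
# resources that peer discovery queries. Prints one line per check and
# returns non-zero if any answer is not "yes".
# Argument: namespace (defaults to "default", as in this guide).
check_rabbitmq_rbac() {
  ns="${1:-default}"
  sa="system:serviceaccount:${ns}:rabbitmq"
  rc=0
  for resource in endpoints pods; do
    for verb in get list watch; do
      answer="$(kubectl auth can-i "$verb" "$resource" --as="$sa" -n "$ns" 2>/dev/null)"
      echo "${verb} ${resource}: ${answer:-error}"
      [ "$answer" = "yes" ] || rc=1
    done
  done
  return "$rc"
}
```

Run `check_rabbitmq_rbac default`; any line that does not end in `yes` points at a missing rule in the Role.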
---
## 4. Headless Service
```yaml
name: service-rabbitmq-headless
clusterIP: None
publishNotReadyAddresses: true
```
👉 Enables DNS like:
```text
rabbitmq-0.service-rabbitmq-headless
```
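The full per-pod names follow a fixed pattern, which makes them easy to list and test. A small sketch, assuming the `default` namespace and the `cluster.local` cluster domain used elsewhere in this guide; `pod_fqdns` is a hypothetical helper:

```bash
# Print the fully qualified DNS name of each RabbitMQ pod.
# Arguments: replica count, namespace (both default to this guide's values).
pod_fqdns() {
  replicas="${1:-3}"
  ns="${2:-default}"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "rabbitmq-${i}.service-rabbitmq-headless.${ns}.svc.cluster.local"
    i=$((i + 1))
  done
}
```

Possible usage, assuming `getent` is available in the container image: `for h in $(pod_fqdns); do kubectl exec rabbitmq-0 -- getent hosts "$h"; done`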
---
## 5. NodePort Service (UI)
```yaml
port: 15672
nodePort: 30072
```
👉 Access UI:
```text
http://<NodeIP>:30072
```
---
## 6. ClusterIP Service (App)
```yaml
name: rabbitmq-svc
port: 5672
```
👉 Used by:
```text
Tomcat → rabbitmq-svc:5672
```
---
## 7. StatefulSet
Key points:
```yaml
serviceName: service-rabbitmq-headless
replicas: 3
```
### ENV:
```text
RABBITMQ_DEFAULT_USER=admin
RABBITMQ_DEFAULT_PASS=admin
RABBITMQ_ERLANG_COOKIE=mysecretcookie
RABBITMQ_USE_LONGNAME=true
```
### Volumes:
* PVC → /var/lib/rabbitmq
* ConfigMap → rabbitmq.conf + plugins
👉 Ensures:
* Stable identity
* Persistent data
* Config-driven clustering
---
# ⚙️ FINAL EXECUTION ORDER (VERY IMPORTANT)
👉 Always follow this order:
```bash
kubectl apply -f pv-rabbit-01.yaml
kubectl apply -f pv-rabbit-02.yaml
kubectl apply -f pv-rabbit-03.yaml
kubectl apply -f rbac-rabbitmq.yaml
kubectl apply -f configmap-rabbit.yaml
kubectl apply -f service-rabbitmq-headless.yaml
kubectl apply -f service-rabbitmq-svc.yaml
kubectl apply -f service-rabbitmq-nodeport.yaml
kubectl apply -f StatefulSet-rabbitmq.yaml
```
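The same order can be wrapped in a loop that stops at the first failure, so the StatefulSet is never created on top of a broken prerequisite. A minimal sketch using the file names listed above; the function name is hypothetical:

```bash
# Apply the manifests in dependency order and stop at the first failure,
# so the StatefulSet is never created before its PVs, RBAC and Services.
apply_in_order() {
  for f in \
    pv-rabbit-01.yaml pv-rabbit-02.yaml pv-rabbit-03.yaml \
    rbac-rabbitmq.yaml configmap-rabbit.yaml \
    service-rabbitmq-headless.yaml service-rabbitmq-svc.yaml \
    service-rabbitmq-nodeport.yaml StatefulSet-rabbitmq.yaml
  do
    echo "applying ${f}"
    kubectl apply -f "$f" || { echo "failed on ${f}" >&2; return 1; }
  done
}
```

Run `apply_in_order` from the directory that holds the manifests.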
---
# 🔥 TROUBLESHOOTING JOURNEY
---
## ❌ Issue 1: DNS Not Working
Problem:
```text
rabbitmq-1 not resolving
```
Fix: publish DNS records for pods that are not Ready yet, so starting nodes can resolve each other:
```yaml
publishNotReadyAddresses: true
```
---
## ❌ Issue 2: Service Name Mismatch
Problem:
```text
rabbitmq-headless vs service-rabbitmq-headless
```
Fix:
```text
Service name, StatefulSet serviceName, cluster_formation.k8s.service_name
and hostname_suffix must match EXACTLY
```
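A mismatch like this is easy to catch by comparing the name in the three files. A rough sketch that relies on simple text matching rather than a YAML parser; it assumes the Service manifest's first `name:` line is `metadata.name`, and the function name is hypothetical:

```bash
# Compare the headless Service name used in three places.
# Arguments: StatefulSet manifest, headless Service manifest, ConfigMap manifest.
# Prints what it found and returns non-zero on any mismatch.
check_service_name() {
  sts_name="$(sed -n 's/^ *serviceName: *//p' "$1" | head -n 1)"
  svc_name="$(sed -n 's/^ *name: *//p' "$2" | head -n 1)"
  conf_name="$(sed -n 's/^ *cluster_formation\.k8s\.service_name *= *//p' "$3" | head -n 1)"
  echo "statefulset=${sts_name} service=${svc_name} config=${conf_name}"
  [ -n "$sts_name" ] && [ "$sts_name" = "$svc_name" ] && [ "$sts_name" = "$conf_name" ]
}
```

Possible usage: `check_service_name StatefulSet-rabbitmq.yaml service-rabbitmq-headless.yaml configmap-rabbit.yaml`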
---
## ❌ Issue 3: No rabbitmq.conf
Fix:
Added the `cluster_formation.*` settings to `rabbitmq.conf` through the ConfigMap
---
## ❌ Issue 4: 403 Error (CRITICAL)
Log:
```text
Failed to fetch nodes from Kubernetes API: 403
```
Fix:
Added the ServiceAccount, Role and RoleBinding, and set `serviceAccountName: rabbitmq` in the StatefulSet
---
## ❌ Issue 5: Short vs Long Names
Error:
```text
epmd nxdomain
```
Fix:
```yaml
RABBITMQ_USE_LONGNAME=true
```
---
## ❌ Issue 6: Cluster Join Failure
Error:
```text
tables_not_present
mnesia_not_running
```
👉 Root cause:
```text
Pods not ready at same time (timing issue)
```
---
## ❌ Issue 7: Cluster Not Forming
Final log:
```text
Starting as a blank standalone node
```
👉 Reason:
```text
Retry failed → node becomes standalone
```
---
# 🧠 WHY THIS HAPPENS
RabbitMQ:
```text
Peer discovery runs ONLY when a blank node boots
```
If the peers are not reachable at that moment and the retries run out, the join fails and the node starts standalone.
---
# 🔧 FINAL FIXES APPLIED
* Enabled RBAC ✅
* Enabled longnames ✅
* Fixed serviceName ✅
* Fixed DNS ✅
* Added retry logic ✅
* Restarted pods cleanly ✅
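The guide does not show what its retry logic looked like. One way to retry from the shell side is a generic helper that repeats a check until it passes. This is only a sketch under that assumption, not the original fix; `retry` is a hypothetical helper:

```bash
# Run a command until it succeeds, up to $1 attempts, sleeping $2 seconds
# between attempts. Returns 0 on the first success, 1 when attempts run out.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after ${n} attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

Possible usage: `retry 30 10 kubectl exec rabbitmq-0 -- rabbitmq-diagnostics -q ping` waits up to about five minutes for the first node to answer before the next step runs.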
---
# 📊 FINAL RESULT
```bash
rabbitmqctl cluster_status
```
Output:
```text
rabbit@rabbitmq-0
rabbit@rabbitmq-1
rabbit@rabbitmq-2
```
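For a scripted check, the node names can be counted from that output. A small sketch that counts distinct `rabbit@rabbitmq-<N>` names on stdin; note that it counts every name listed, so a member that is known but stopped is still included. The function name is hypothetical:

```bash
# Count distinct rabbit@rabbitmq-<N> node names in cluster_status output
# read from stdin. Prints 0 when no node name is found.
count_cluster_nodes() {
  grep -o 'rabbit@rabbitmq-[0-9][0-9]*' | sort -u | wc -l | tr -d ' '
}
```

Possible usage: `kubectl exec rabbitmq-0 -- rabbitmqctl cluster_status | count_cluster_nodes` should print `3` for this setup.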
---
# 🎯 WHAT WE ACHIEVED
✅ 3-node RabbitMQ cluster
✅ Auto discovery via Kubernetes
✅ Persistent storage
✅ UI access
✅ App connectivity
✅ HA-ready setup
---
# ⚠️ ALTERNATIVES
| Approach | Result |
| -------------- | ------------------------ |
| No RBAC | No clustering ❌ |
| Manual join | Works but not stable ⚠️ |
| Classic config | Static, not scalable ❌ |
| Helm chart | Best production option ✅ |
---
# 🧠 FINAL LEARNING
* Kubernetes = dynamic → needs API
* RabbitMQ = startup-based clustering
* RBAC = mandatory
* Headless service = must
* Longnames = required
* Timing = critical
---
# 🚀 NEXT STEPS
* Create quorum queues
* Test failover (kill pod)
* Connect Tomcat producer/consumer
* Monitor cluster
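The failover test can be scripted as "delete one pod, then poll until it is back". A hedged sketch: it polls the first container's `ready` flag, and since the StatefulSet in this guide defines no readiness probe, "ready" here only means the container has started. `failover_test` and the `POLL_SECONDS` variable are hypothetical names:

```bash
# Delete one pod and poll until the StatefulSet controller has recreated it.
# Arguments: pod name (default rabbitmq-1), number of polls (default 60).
# POLL_SECONDS (assumed variable) sets the pause between polls, default 5.
failover_test() {
  pod="${1:-rabbitmq-1}"
  tries="${2:-60}"
  kubectl delete pod "$pod" || return 1
  while [ "$tries" -gt 0 ]; do
    ready="$(kubectl get pod "$pod" -o jsonpath='{.status.containerStatuses[0].ready}' 2>/dev/null)"
    if [ "$ready" = "true" ]; then
      echo "${pod} is ready again"
      return 0
    fi
    tries=$((tries - 1))
    sleep "${POLL_SECONDS:-5}"
  done
  echo "${pod} did not become ready" >&2
  return 1
}
```

After it returns, `rabbitmqctl cluster_status` on any node shows whether the pod rejoined the cluster.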
---
# 📌 FINAL CONCLUSION
You successfully built a **production-grade RabbitMQ cluster on Kubernetes**
and solved real-world issues like:
* DNS
* RBAC
* Clustering
* Node naming
* Startup timing
---
==================================configmap-rabbit.yaml========================================
apiVersion: v1
kind: ConfigMap
metadata:
  name: configmap-rabbit
  labels:
    type: configmap-rabbit
data:
  enabled_plugins: |
    [rabbitmq_management,rabbitmq_peer_discovery_k8s].
  rabbitmq.conf: |
    cluster_formation.peer_discovery_backend = k8s
    cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
    cluster_formation.k8s.address_type = hostname
    cluster_formation.k8s.service_name = service-rabbitmq-headless
    cluster_formation.k8s.hostname_suffix = .service-rabbitmq-headless.default.svc.cluster.local
    cluster_formation.node_cleanup.interval = 10
    cluster_formation.node_cleanup.only_log_warning = true
    cluster_partition_handling = autoheal
    queue_master_locator=min-masters
==================================pv-rabbit-01.yaml========================================
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-rabbitmq1
  labels:
    type: pv-rabbitmq
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - sec=sys
    - nfsvers=4.1
    - hard
  nfs:
    server: controlnode
    path: /data/nfsshared/rabbitmq-pv1
    readOnly: false
==================================pv-rabbit-02.yaml========================================
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-rabbitmq2
  labels:
    type: pv-rabbitmq
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - sec=sys
    - nfsvers=4.1
    - hard
  nfs:
    server: controlnode
    path: /data/nfsshared/rabbitmq-pv2
    readOnly: false
==================================pv-rabbit-03.yaml========================================
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-rabbitmq3
  labels:
    type: pv-rabbitmq
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - sec=sys
    - nfsvers=4.1
    - hard
  nfs:
    server: controlnode
    path: /data/nfsshared/rabbitmq-pv3
    readOnly: false
==================================rbac-rabbitmq.yaml========================================
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rabbitmq
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: rabbitmq
  namespace: default
rules:
  - apiGroups: [""]
    resources:
      - endpoints
      - pods
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: rabbitmq
  namespace: default
subjects:
  - kind: ServiceAccount
    name: rabbitmq
    namespace: default
roleRef:
  kind: Role
  name: rabbitmq
  apiGroup: rbac.authorization.k8s.io
==================================service-rabbitmq-headless.yaml========================================
apiVersion: v1
kind: Service
metadata:
  name: service-rabbitmq-headless
  labels:
    type: service-rabbitmq-headless
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  selector:
    app: rabbitmq
  ports:
    - name: amqp
      port: 5672
    - name: management
      port: 15672
    - name: epmd
      port: 4369
    - name: cluster-rpc
      port: 25672
==================================service-rabbitmq-nodeport.yaml========================================
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-nodeport
  labels:
    type: rabbitmq-nodeport
spec:
  type: NodePort
  selector:
    app: rabbitmq
  ports:
    - name: management
      port: 15672
      targetPort: 15672
      nodePort: 30072
==================================service-rabbitmq-svc.yaml========================================
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-svc
  labels:
    type: rabbitmq-svc
spec:
  type: ClusterIP
  selector:
    app: rabbitmq
  ports:
    - name: amqp
      port: 5672
      targetPort: 5672
==================================StatefulSet-rabbitmq.yaml========================================
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
  labels:
    type: rabbitmq
spec:
  serviceName: service-rabbitmq-headless
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      serviceAccountName: rabbitmq
      containers:
        - name: rabbitmq
          image: rabbitmq:3.12-management
          ports:
            - containerPort: 5672
            - containerPort: 15672
          env:
            - name: RABBITMQ_DEFAULT_USER
              value: "admin"
            - name: RABBITMQ_DEFAULT_PASS
              value: "admin"
            - name: RABBITMQ_ERLANG_COOKIE
              value: "mysecretcookie"
            - name: RABBITMQ_USE_LONGNAME
              value: "true"
          volumeMounts:
            - name: data
              mountPath: /var/lib/rabbitmq
            - name: config
              mountPath: /etc/rabbitmq/enabled_plugins
              subPath: enabled_plugins
            - name: rabbitconf
              mountPath: /etc/rabbitmq/rabbitmq.conf
              subPath: rabbitmq.conf
      volumes:
        - name: config
          configMap:
            name: configmap-rabbit
        - name: rabbitconf
          configMap:
            name: configmap-rabbit
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: ""
        resources:
          requests:
            storage: 1Gi
        selector:
          matchLabels:
            type: pv-rabbitmq
# 👍 END
| Task | Imperative Command | Declarative YAML |
| ---- | ------------------ | ---------------- |
| Create Namespace | kubectl create namespace dev | apiVersion: v1 kind: Namespace metadata: name: dev |
| Create Pod | kubectl run nginx-pod --image=nginx:1.25 | apiVersion: v1 kind: Pod metadata: name: nginx-pod spec: containers: - name: nginx image: nginx:1.25 |
| Create Deployment | kubectl create deployment sampledeploy --image=nginx:1.25 --replicas=4 | apiVersion: apps/v1 kind: Deployment metadata: name: sampledeploy spec: replicas: 4 selector: matchLabels: app: sampledeploy template: metadata: labels: app: sampledeploy spec: containers: - name: nginx image: nginx:1.25 |
| Create ClusterIP Service | kubectl expose deployment sampledeploy --port=80 --target-port=80 | apiVersion: v1 kind: Service metadata: name: sampledeploy-service spec: selector: app: sampledeploy ports: - port: 80 targetPort: 80 |
| Create NodePort Service | kubectl expose deployment sampledeploy --type=NodePort --port=80 --target-port=80 | apiVersion: v1 kind: Service metadata: name: sampledeploy-nodeport spec: type: NodePort selector: app: sampledeploy ports: - port: 80 targetPort: 80 nodePort: 30080 |
| Create ConfigMap | kubectl create configmap app-config --from-literal=APP_MODE=production --from-literal=COLOR=blue | apiVersion: v1 kind: ConfigMap metadata: name: app-config data: APP_MODE: production COLOR: blue |
| Create Secret | kubectl create secret generic db-secret --from-literal=username=admin --from-literal=password=pass123 | apiVersion: v1 kind: Secret metadata: name: db-secret type: Opaque stringData: username: admin password: pass123 |
| Create ServiceAccount | kubectl create serviceaccount frontend-sa | apiVersion: v1 kind: ServiceAccount metadata: name: frontend-sa |
| Create Role | kubectl create role pod-reader --verb=get,list,watch --resource=pods | apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: name: pod-reader rules: - apiGroups: [""] resources: ["pods"] verbs: ["get", "list", "watch"] |
| Create RoleBinding | kubectl create rolebinding pod-reader-binding --role=pod-reader --serviceaccount=default:frontend-sa | apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: name: pod-reader-binding subjects: - kind: ServiceAccount name: frontend-sa namespace: default roleRef: kind: Role name: pod-reader apiGroup: rbac.authorization.k8s.io |
| Create ClusterRole | kubectl create clusterrole node-reader --verb=get,list,watch --resource=nodes | apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: node-reader rules: - apiGroups: [""] resources: ["nodes"] verbs: ["get", "list", "watch"] |
| Create ClusterRoleBinding | kubectl create clusterrolebinding node-reader-binding --clusterrole=node-reader --serviceaccount=default:frontend-sa | apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: node-reader-binding subjects: - kind: ServiceAccount name: frontend-sa namespace: default roleRef: kind: ClusterRole name: node-reader apiGroup: rbac.authorization.k8s.io |
| Create PVC | Usually declarative | apiVersion: v1 kind: PersistentVolumeClaim metadata: name: app-pvc spec: accessModes: - ReadWriteOnce resources: requests: storage: 1Gi |
| Create StatefulSet | Usually declarative | apiVersion: apps/v1 kind: StatefulSet metadata: name: mongodb spec: serviceName: mongodb-headless replicas: 1 selector: matchLabels: app: mongodb template: metadata: labels: app: mongodb spec: containers: - name: mongodb image: mongo:6 |
| Create Job | kubectl create job test-job --image=busybox | apiVersion: batch/v1 kind: Job metadata: name: test-job spec: template: spec: containers: - name: busybox image: busybox restartPolicy: Never |
| Create CronJob | kubectl create cronjob backup-job --image=busybox --schedule="*/5 * * * *" | apiVersion: batch/v1 kind: CronJob metadata: name: backup-job spec: schedule: "*/5 * * * *" jobTemplate: spec: template: spec: containers: - name: busybox image: busybox restartPolicy: Never |
###############################
# MYSQL 8.0.x → 8.4.x UPGRADE
# CLEAN INSTALL + RESTORE METHOD
###############################
ENVIRONMENT:
- RHEL8 / Rocky Linux
- Custom datadir: /data/mysql_server
- No GTID
- Old MySQL 8.0.x
- New MySQL 8.4.x
- Using encryption/keyring
- Fresh initialize method
====================================================
STEP 1 — CHECK OLD MYSQL ENVIRONMENT
====================================================
mysql -uroot -p
SHOW PLUGINS;
SELECT TABLE_SCHEMA,TABLE_NAME,CREATE_OPTIONS
FROM information_schema.tables
WHERE CREATE_OPTIONS LIKE '%ENCRYPTION%';
grep -Ri keyring /etc/my.cnf*
====================================================
STEP 2 — TAKE FULL BACKUP
====================================================
mysqldump \
--all-databases \
--routines \
--events \
--triggers \
--single-transaction \
--hex-blob \
-u root -p > /backup/full.sql
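Before removing anything, it is worth confirming that the dump finished. mysqldump writes a `-- Dump completed` comment as its last line unless comments are disabled, so a truncated dump can be detected. A minimal sketch; the function name is hypothetical:

```bash
# Return success only if the dump file ends with mysqldump's completion marker.
# Argument: path to the dump file.
dump_is_complete() {
  tail -n 5 "$1" | grep -q '^-- Dump completed'
}
```

Possible usage: `dump_is_complete /backup/full.sql && echo "dump OK"`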
====================================================
STEP 3 — BACKUP KEYRING (VERY IMPORTANT)
====================================================
tar -cvzf /backup/mysql-keyring.tar.gz \
/data/mysql-keyring
====================================================
STEP 4 — BACKUP CONFIG
====================================================
cp -p /etc/my.cnf /backup/
====================================================
STEP 5 — REMOVE ENCRYPTION FROM DUMP
(RECOMMENDED SAFEST METHOD)
====================================================
cp /backup/full.sql /backup/full_no_encrypt.sql
sed -i "s/ENCRYPTION='Y'/ENCRYPTION='N'/g" \
/backup/full_no_encrypt.sql
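To confirm the substitution caught everything, count the lines that still request encryption. A small sketch; the function name is hypothetical:

```bash
# Print how many lines of a dump file still contain ENCRYPTION='Y'.
# grep -c exits non-zero when the count is 0, so the status is normalised.
count_encrypted() {
  grep -c "ENCRYPTION='Y'" "$1"
  return 0
}
```

Possible usage: `count_encrypted /backup/full_no_encrypt.sql` should print `0`, while the same call on `/backup/full.sql` shows how many statements were changed.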
====================================================
STEP 6 — STOP MYSQL
====================================================
systemctl stop mysqld
ps -ef | grep mysqld
IF STILL RUNNING:
pkill -9 mysqld
====================================================
STEP 7 — REMOVE OLD MYSQL RPMs
====================================================
rpm -qa | grep -i mysql
dnf remove 'mysql*'
OR
rpm -e mysql-community-server \
mysql-community-client \
mysql-community-common \
mysql-community-libs
====================================================
STEP 8 — INSTALL MYSQL 8.4 RPMs
====================================================
cd /data/pkg/mysql84/
yum install mysql-community-*.rpm
OR
rpm -ivh mysql-community-common-8.4*.rpm
rpm -ivh mysql-community-client-plugins-8.4*.rpm
rpm -ivh mysql-community-libs-8.4*.rpm
rpm -ivh mysql-community-client-8.4*.rpm
rpm -ivh mysql-community-server-8.4*.rpm
====================================================
STEP 9 — RENAME OLD DATADIR
====================================================
mv /data/mysql_server \
/data/mysql_server_80_backup
====================================================
STEP 10 — CREATE NEW DATADIR
====================================================
mkdir -p /data/mysql_server
chown -R mysql:mysql /data/mysql_server
chmod 750 /data/mysql_server
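`mysqld --initialize` refuses to run when the data directory already contains files, so it helps to check that the new directory really is empty. A minimal sketch; the function name is hypothetical:

```bash
# Succeed only if the directory exists and contains no entries
# (hidden files included).
dir_is_empty() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}
```

Possible usage: `dir_is_empty /data/mysql_server || echo "datadir is not empty"`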
====================================================
STEP 11 — EDIT /etc/my.cnf
====================================================
REMOVE OLD KEYRING CONFIG:
#early-plugin-load=keyring_file.so
#keyring_file_data=/data/mysql-keyring/keyring
====================================================
STEP 12 — CREATE NEW KEYRING DIRECTORY
====================================================
mkdir -p /data/mysql-keyring
chown -R mysql:mysql /data/mysql-keyring
chmod 750 /data/mysql-keyring
====================================================
STEP 13 — CREATE COMPONENT CONFIG
====================================================
THE COMPONENT READS THIS FILE FROM THE SERVER PLUGIN DIRECTORY
(plugin_dir; /usr/lib64/mysql/plugin ON RPM INSTALLS):
vi /usr/lib64/mysql/plugin/component_keyring_file.cnf
ADD:
{
  "path": "/data/mysql-keyring/keyring",
  "read_only": false
}
SAVE FILE
====================================================
STEP 14 — FIX PERMISSIONS
====================================================
chown mysql:mysql \
/usr/lib64/mysql/plugin/component_keyring_file.cnf
chmod 640 \
/usr/lib64/mysql/plugin/component_keyring_file.cnf
====================================================
STEP 15 — CREATE BOOTSTRAP FILE
(VERY IMPORTANT)
====================================================
vi /usr/sbin/mysqld.my
ADD:
{
"components": "file://component_keyring_file"
}
SAVE FILE
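Writing the manifest with a here-document avoids typing mistakes in the editor. A sketch that produces the same content as above; the function name is hypothetical and the destination path is passed in by the caller:

```bash
# Write the manifest that tells mysqld which keyring component to load.
# Argument: destination path (this runbook uses /usr/sbin/mysqld.my).
write_keyring_manifest() {
  cat > "$1" <<'EOF'
{
  "components": "file://component_keyring_file"
}
EOF
}
```

Possible usage: `write_keyring_manifest /usr/sbin/mysqld.my`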
====================================================
STEP 16 — FIX BOOTSTRAP FILE PERMISSIONS
====================================================
chown mysql:mysql /usr/sbin/mysqld.my
chmod 640 /usr/sbin/mysqld.my
====================================================
STEP 17 — INITIALIZE MYSQL 8.4
====================================================
mysqld \
--defaults-file=/etc/my.cnf \
--initialize \
--user=mysql
====================================================
STEP 18 — START MYSQL
====================================================
systemctl start mysqld
====================================================
STEP 19 — CHECK LOGS
====================================================
journalctl -xeu mysqld
tail -f /data/mysql_server/mysqld.log
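The same log also holds the temporary root password that `--initialize` generated. A small sketch that pulls it out, assuming the error log is the file shown above; the function name is hypothetical:

```bash
# Print the temporary root password logged by `mysqld --initialize`.
# Argument: path to the MySQL error log. The password is the last field
# of the "A temporary password is generated for root@localhost" line.
temp_root_password() {
  grep 'temporary password' "$1" | tail -n 1 | awk '{print $NF}'
}
```

Possible usage: `temp_root_password /data/mysql_server/mysqld.log`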
====================================================
STEP 20 — LOGIN MYSQL
====================================================
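LOG IN WITH THE TEMPORARY ROOT PASSWORD THAT --initialize WROTE TO THE ERROR LOG:
mysql -uroot -p
THE TEMPORARY PASSWORD IS EXPIRED; SET A NEW ONE BEFORE RUNNING ANYTHING ELSE:
ALTER USER 'root'@'localhost' IDENTIFIED BY '<new-password>';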
====================================================
STEP 21 — VERIFY KEYRING COMPONENT
====================================================
SELECT * FROM performance_schema.keyring_component_status;
SUCCESS IF THE Component_status ROW SHOWS Active
====================================================
STEP 22 — REGISTER COMPONENT
====================================================
INSTALL COMPONENT 'file://component_keyring_file';
SELECT * FROM mysql.component;
====================================================
STEP 23 — TEST ENCRYPTION
====================================================
CREATE DATABASE testdb;
USE testdb;
CREATE TABLE t1 (
id INT
) ENCRYPTION='Y';
====================================================
STEP 24 — RESTORE DUMP
====================================================
mysql -uroot -p < /backup/full_no_encrypt.sql
====================================================
STEP 25 — VERIFY DATABASES
====================================================
SHOW DATABASES;
SELECT user,host FROM mysql.user;
====================================================
STEP 26 — OPTIONAL RE-ENABLE ENCRYPTION LATER
====================================================
ALTER TABLE table_name ENCRYPTION='Y';
====================================================
IMPORTANT NOTES
====================================================
1. NEVER DELETE:
/data/mysql-keyring
2. NEVER MIX:
old plugin + new component
3. DO NOT USE:
early-plugin-load=keyring_file.so
4. NEW MYSQL 8.4 USES:
component_keyring_file
5. MOST IMPORTANT FILE:
/usr/sbin/mysqld.my
WITHOUT IT:
- component installs
- but encryption fails
6. IF RESTORE FAILS:
- use ENCRYPTION='N'
- restore first
- re-enable later
###############################
END
###############################