Overview
InnoDB Cluster is a modern solution that provides group replication for MySQL. In this configuration, one node is the primary (R/W) while the rest are secondaries (R/O).
Of course our IPs will look different because I am waaaay too lazy to care for proper pictures.
Installation
I am not a big fan of manual work, ask my grandma about it. So I automated the whole shebang using Terraform + Ansible. I will not show the Terraform because it is strictly environment-specific, but I will show you the Ansible.
Preparation
- name: Configure MySQL Libraries
  vars:
    ansible_user: opc
    ansible_become: yes
    ansible_become_method: sudo
    ansible_python_interpreter: /usr/bin/python
  hosts: dbservers
  tasks:
    - block:
        - name: Install the necessary packages
          yum:
            name:
              - mysql-server
              - mysql-shell
              - mysql-devel
              - mysql-common
              - mysql-libs
              - mysql
            state: present
        - name: Disable Selinux
          ansible.builtin.command: setenforce 0
        - name: Stop firewall
          service:
            name: firewalld
            state: stopped
Configuration
After that, you can configure the rest using the following Ansible playbook:
Configuration
- name: Configure MySQL Libraries
  vars:
    ansible_user: opc
    ansible_become: yes
    ansible_become_method: sudo
  hosts: dbservers
  tasks:
    - block:
        - name: Start the MySQL Service
          ansible.builtin.service:
            name: mysqld
            state: started
        - name: Check the existence of a mysql file
          stat:
            path: /root/.my.cnf
          register: mysql_file
        - name: Obtain & Change temporal password
          shell: |
            temporal_password=`awk '/A temporary password is generated for/ {a=$0} END{ print a }' /var/log/mysqld.log | awk '{print $(NF)}'`
            mysql -uroot -p${temporal_password} -e "SET GLOBAL super_read_only = 0; SET GLOBAL read_only = 0;" --connect-expired-password
            mysql -uroot -p${temporal_password} -e "ALTER USER 'root'@'localhost' IDENTIFIED BY '{{ root_password }}'; flush privileges;" --connect-expired-password
            echo "[client]
            user=root
            password=\"{{ root_password }}\"" > /root/.my.cnf
          when: not mysql_file.stat.exists
        - name: Generate random server_id
          shell: |
            echo $((0x$(echo `hostname` | md5sum | cut -f 1 -d " " | cut -c 1-3))) | cut -c 1-3
          register: mysql_server_id
        - name: Stop MySQL If running
          service:
            name: mysqld
            state: stopped
        - name: Define the primary host (e.g. first host in the group)
          set_fact:
            primary_node: "{{ groups['dbservers'][0] }}"
        - name: Define the secondary nodes
          set_fact:
            secondary_nodes: "{{ groups['dbservers'][1:] }}"
        - name: Get secondary hosts
          debug:
            var: secondary_nodes
        - name: Start MySQL
          service:
            name: mysqld
            state: started
        - name: Template bootstrapping js for mysqlsh
          template:
            dest: "/tmp/bootstrap_cluster.js"
            src: templates/bootstrap_cluster.js.j2
            owner: "mysql"
            group: "mysql"
            mode: '0644'
        - name: Template InnoDB Cluster
          template:
            dest: "/tmp/add_instances.js"
            src: templates/add_instances.js.j2
            owner: "mysql"
            group: "mysql"
            mode: '0644'
        - name: Bootstrap the cluster (new deployment)
          command: mysqlsh -f /tmp/bootstrap_cluster.js
        - name: Wait to start
          wait_for:
            timeout: 5
        - name: Add the other instances
          command: mysqlsh -f /tmp/add_instances.js
          when: inventory_hostname == groups['dbservers'][0]
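By the way, if the bootstrap step ever fails, you can run the same readiness checks by hand from mysqlsh on the misbehaving node. This is just a minimal sketch, not part of the playbooks; the host name is one of ours, and the password placeholder is whatever you set as root_password:
Manual sanity check (optional)
// not part of the playbooks, just a manual check; replace the password placeholder
shell.connect('root@mysql-cluster-b:3306', 'your_root_password');
var result = dba.checkInstanceConfiguration();   // checks the instance of the current session
print(result.status + '\n');                     // "ok" means the node is ready for InnoDB Cluster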
Congrats, now we have the cluster running:
Cluster status
MySQL 10.14.2.193:33060+ ssl JS > var cluster = dba.getCluster()
MySQL 10.14.2.193:33060+ ssl JS > cluster.status();
{
    "clusterName": "InnoDBCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysql-cluster-a:3306",
        "ssl": "REQUIRED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "mysql-cluster-a:3306": {
                "address": "mysql-cluster-a:3306",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            },
            "mysql-cluster-b:3306": {
                "address": "mysql-cluster-b:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            },
            "mysql-cluster-c:3306": {
                "address": "mysql-cluster-c:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "mysql-cluster-a:3306"
}
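If you just want a quick look at the topology without all the health details, the same mysqlsh session can give you a shorter summary (purely optional):
Shorter topology view
// optional: a compact description of the same cluster
var cluster = dba.getCluster();
cluster.describe();   // members and their roles, without status/lag details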
Templates
Because we want to look professional (only to look like one, actually being one is boring), here are the template files:
Templates
# my.cnf.js.j2
[mysqld]
server_id={{ mysql_server_id.stdout }}
binlog_checksum=NONE
gtid_mode=ON
enforce_gtid_consistency=ON
log_bin
log_slave_updates=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
skip-host-cache
skip-name-resolve
binlog_transaction_dependency_tracking=WRITESET

# bootstrap_cluster.js.j2
var cluster_user_password = "{{ cluster_user_password }}";
var clusterName = "{{ cluster_name }}";
var secondary_nodes = "{{ secondary_nodes }}";
var root_password = "{{ root_password }}";
var cluster_user = "{{ cluster_user }}";
var mysql = require('mysql');
try {
    dba.configureInstance('root@localhost', {
        password: `${root_password}`,
        restart: true,
        interactive: false,
        clusterAdmin: '{{ cluster_user }}',
        clusterAdminPassword: `${cluster_user_password}`
    });
} catch(e) {
    print('\n The innodb cluster admin user could not be created\n' + e.message + '\n');
}

# add_instances.js.j2 (adding the instances)
var cluster_user = "{{ cluster_user }}";
var cluster_user_password = "{{ cluster_user_password }}";
var clusterName = "{{ cluster_name }}";
var secondary_nodes = "{{ secondary_nodes }}";
var secondary_nodes = secondary_nodes.replaceAll("u","").replaceAll("'","\"").replaceAll("\"","").replaceAll("[","").replaceAll("]","").replaceAll(" ","").split(',');
var mysql = require('mysql');
try {
    print('Adding instances...\n');
    shell.connect('{{ cluster_user }}@{{ primary_node }}', cluster_user_password);
    var cluster = dba.createCluster(clusterName);
    for (let i = 0; i < {{ secondary_nodes|length }}; i++) {
        cluster.addInstance(`${cluster_user}@${secondary_nodes[i]}`, {
            password: `${cluster_user_password}`,
            recoveryMethod: "clone"
        });
    }
} catch(e) {
    print('\nThe InnoDB cluster could not be created.\n\nError: ' + e.message + '\n');
}
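A small note on the replaceAll() chain in add_instances.js.j2: it only exists to turn the Python-style list that Ansible renders into a JS array. If you don't mind touching the template, a cleaner sketch (an assumption on my side, not what the playbook ships) is to render the list as JSON and parse it:
Alternative list handling
// render the secondaries as proper JSON instead of scrubbing the Python repr by hand
var secondary_nodes = JSON.parse('{{ secondary_nodes | to_json }}');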
Management
It is all cool and stuff, but let's go through several scenarios with our cluster:
1. Loss of a member
2. Backup
3. Recovery
First things first: what do we do if the connection between the primary and a secondary fails? Well, we have to rejoin the lost member, so let's see how.
Loss of a member
To do that, we will simulate someone stopping our service.
Loss of a node
[root@mysql-cluster-b ~]# service mysqld stop
Redirecting to /bin/systemctl stop mysqld.service
[root@mysql-cluster-b ~]#
Our cluster could tolerate one failure; after losing this member, it cannot tolerate any more:
Cluster check
MySQL 10.14.2.193:33060+ ssl JS > var cluster = dba.getCluster();
MySQL 10.14.2.193:33060+ ssl JS > cluster.status();
{
    "clusterName": "InnoDBCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysql-cluster-a:3306",
        "ssl": "REQUIRED",
        "status": "OK_NO_TOLERANCE_PARTIAL",
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active.",
        "topology": {
            "mysql-cluster-a:3306": {
                "address": "mysql-cluster-a:3306",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            },
            "mysql-cluster-b:3306": {
                "address": "mysql-cluster-b:3306",
                "memberRole": "SECONDARY",
                "mode": "n/a",
                "readReplicas": {},
                "role": "HA",
                "shellConnectError": "MySQL Error 2003: Could not open connection to 'mysql-cluster-b:3306': Can't connect to MySQL server on 'mysql-cluster-b:3306' (111)",
                "status": "(MISSING)"
            },
            "mysql-cluster-c:3306": {
                "address": "mysql-cluster-c:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "mysql-cluster-a:3306"
}
We see that the node is down. So let's simulate the connection being restored:
Start the service
[root@mysql-cluster-b ~]# service mysqld start
Redirecting to /bin/systemctl start mysqld.service
[root@mysql-cluster-b ~]#
Now we can check again:
Check again
{ "clusterName": "InnoDBCluster", "defaultReplicaSet": { "name": "default", "primary": "mysql-cluster-a:3306", "ssl": "REQUIRED", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "mysql-cluster-a:3306": { "address": "mysql-cluster-a:3306", "memberRole": "PRIMARY", "mode": "R/W", "readReplicas": {}, "replicationLag": "applier_queue_applied", "role": "HA", "status": "ONLINE", "version": "8.0.37" }, "mysql-cluster-b:3306": { "address": "mysql-cluster-b:3306", "memberRole": "SECONDARY", "mode": "R/O", "readReplicas": {}, "replicationLag": "applier_queue_applied", "role": "HA", "status": "ONLINE", "version": "8.0.37" }, "mysql-cluster-c:3306": { "address": "mysql-cluster-c:3306", "memberRole": "SECONDARY", "mode": "R/O", "readReplicas": {}, "replicationLag": "applier_queue_applied", "role": "HA", "status": "ONLINE", "version": "8.0.37" } }, "topologyMode": "Single-Primary" }, "groupInformationSourceMember": "mysql-cluster-a:3306" }
Now we see that the node rejoined and everything is working again (once an instance has been configured for the cluster, Group Replication starts automatically on boot). If for some reason it doesn't rejoin on its own, we can try to force it using:
Rejoin and rescan
cluster.rejoinInstance(instance)
cluster.rescan()
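For completeness, against the nodes from this walkthrough it would look roughly like this (run from a mysqlsh session connected to the primary with an account that has the cluster admin grants):
Rejoin example
// a rough sketch using the nodes from this walkthrough
var cluster = dba.getCluster();
cluster.rejoinInstance("mysql-cluster-b:3306");   // push the lost member back into the group
cluster.rescan();                                 // refresh the metadata if anything looks stale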
It is important to note that if the server UUID has changed, we have to remove the instance from the cluster metadata before adding it back:
In case UUID is changed
cluster.removeInstance("root@instanceWithOldUUID:3306", {force: true})
cluster.rescan()
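Put together, recovering a member that came back with a new server UUID looks roughly like this; the node name is just the one from this walkthrough, and clone is simply the recovery method the playbook already uses:
Full flow after a UUID change
// a rough sketch: drop the stale metadata entry, then add the node back as a fresh member
var cluster = dba.getCluster();
cluster.removeInstance("root@mysql-cluster-b:3306", {force: true});
cluster.rescan();
cluster.addInstance("root@mysql-cluster-b:3306", {recoveryMethod: "clone"});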