=====Overview=====

InnoDB Cluster is a modern solution that provides group replication for MySQL. In this configuration, one node is the primary (R/W) while the rest are ONLINE secondaries (R/O).

{{:mysql_innodb_cluster_overview.png?800|}}

Of course, our IPs will look different, because I am waaaay too lazy to care about proper pictures.

=====Installation=====

I am not a big fan of manual work, ask my grandma about it. So I automated the whole shebang using Terraform + Ansible. I will not show the Terraform because it is strictly environment-specific, but I will show you the Ansible:

<code yaml>
- name: Configure MySQL Libraries
  vars:
    ansible_user: opc
    ansible_become: yes
    ansible_become_method: sudo
    ansible_python_interpreter: /usr/bin/python
  hosts: dbservers
  tasks:
    - block:
        - name: Install the necessary packages
          yum:
            name:
              - mysql-server
              - mysql-shell
              - mysql-devel
              - mysql-common
              - mysql-libs
              - mysql
            state: present

        - name: Disable SELinux
          ansible.builtin.command: setenforce 0

        - name: Stop firewall
          service:
            name: firewalld
            state: stopped
</code>

=====Configuration=====

After that, you can configure the rest using the following Ansible:

<code yaml>
- name: Configure MySQL InnoDB Cluster
  vars:
    ansible_user: opc
    ansible_become: yes
    ansible_become_method: sudo
  hosts: dbservers
  tasks:
    - block:
        - name: Start the MySQL Service
          ansible.builtin.service:
            name: mysqld
            state: started

        - name: Check the existence of a mysql credentials file
          stat:
            path: /root/.my.cnf
          register: mysql_file

        - name: Obtain & change the temporary password
          shell: |
            temporary_password=`awk '/A temporary password is generated for/ {a=$0} END{ print a }' /var/log/mysqld.log | awk '{print $(NF)}'`
            mysql -uroot -p${temporary_password} -e "SET GLOBAL super_read_only = 0; SET GLOBAL read_only = 0;" --connect-expired-password
            mysql -uroot -p${temporary_password} -e "ALTER USER 'root'@'localhost' IDENTIFIED BY '{{ root_password }}'; FLUSH PRIVILEGES;" --connect-expired-password
            printf '[client]\nuser=root\npassword="{{ root_password }}"\n' > /root/.my.cnf
          when: not mysql_file.stat.exists

        - name: Generate random server_id
          shell: |
            echo $((0x$(hostname | md5sum | cut -f 1 -d " " | cut -c 1-3))) | cut -c 1-3
          register: mysql_server_id

        - name: Stop MySQL if running
          service:
            name: mysqld
            state: stopped

        # NOTE: the playbook as originally published registered mysql_server_id
        # but never wrote the rendered my.cnf anywhere; a task along these
        # lines is assumed (the destination path may differ on your distro).
        - name: Template my.cnf
          template:
            dest: /etc/my.cnf.d/innodb-cluster.cnf
            src: templates/my.cnf.j2
            owner: root
            group: root
            mode: '0644'

        - name: Define the primary host (e.g. first host in the group)
          set_fact:
            primary_node: "{{ groups['dbservers'][0] }}"

        - name: Define the secondary nodes
          set_fact:
            secondary_nodes: "{{ groups['dbservers'][1:] }}"

        - name: Get secondary hosts
          debug:
            var: secondary_nodes

        - name: Start MySQL
          service:
            name: mysqld
            state: started

        - name: Template bootstrapping js for mysqlsh
          template:
            dest: "/tmp/bootstrap_cluster.js"
            src: templates/bootstrap_cluster.js.j2
            owner: "mysql"
            group: "mysql"
            mode: '0644'

        - name: Template InnoDB Cluster instance addition js
          template:
            dest: "/tmp/add_instances.js"
            src: templates/add_instances.js.j2
            owner: "mysql"
            group: "mysql"
            mode: '0644'

        - name: Bootstrap the cluster (new deployment)
          command: mysqlsh -f /tmp/bootstrap_cluster.js

        - name: Wait to start
          wait_for:
            timeout: 5

        - name: Add the other instances
          command: mysqlsh -f /tmp/add_instances.js
          when: inventory_hostname == groups['dbservers'][0]
</code>
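Before (or after) letting the playbook bootstrap everything, you can pre-flight a node by hand from mysqlsh. A minimal sketch in JS mode; ''icadmin'' and the host are stand-ins for whatever your ''cluster_user'' variable and inventory hold:

<code js>
// Pre-flight check: does this server meet Group Replication requirements?
// "icadmin" is a placeholder for the cluster admin user that
// dba.configureInstance() creates in the bootstrap template below;
// mysqlsh will prompt for the password.
var result = dba.checkInstanceConfiguration('icadmin@mysql-cluster-a:3306');

// result is a dictionary: {"status": "ok"} when the node is ready;
// otherwise it carries a config_errors list describing what to fix.
print(result.status + '\n');
</code>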
Congrats, now we have the cluster running:

<code>
 MySQL  10.14.2.193:33060+ ssl  JS > var cluster = dba.getCluster()
 MySQL  10.14.2.193:33060+ ssl  JS > cluster.status();
{
    "clusterName": "InnoDBCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysql-cluster-a:3306",
        "ssl": "REQUIRED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "mysql-cluster-a:3306": {
                "address": "mysql-cluster-a:3306",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            },
            "mysql-cluster-b:3306": {
                "address": "mysql-cluster-b:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            },
            "mysql-cluster-c:3306": {
                "address": "mysql-cluster-c:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "mysql-cluster-a:3306"
}
</code>
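The AdminAPI view can also be cross-checked at the SQL level, since Group Replication publishes its membership in ''performance_schema''. A minimal sketch, assuming a classic (port 3306) session in JS mode:

<code js>
// Cross-check membership straight from Group Replication's own view,
// independently of the InnoDB Cluster metadata.
var res = session.runSql(
    "SELECT member_host, member_state, member_role " +
    "FROM performance_schema.replication_group_members");

// Each row should show ONLINE and the expected PRIMARY/SECONDARY role.
var row;
while ((row = res.fetchOne())) {
    print(row[0] + '\t' + row[1] + '\t' + row[2] + '\n');
}
</code>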
=====Templates=====

Because we want to look (only to look like, being one is boring) professional, here are the template files:

<code ini>
# templates/my.cnf.j2
[mysqld]
server_id={{ mysql_server_id.stdout }}
binlog_checksum=NONE
gtid_mode=ON
enforce_gtid_consistency=ON
log_bin
log_slave_updates=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
skip-host-cache
skip-name-resolve
binlog_transaction_dependency_tracking=WRITESET
</code>

<code js>
// templates/bootstrap_cluster.js.j2
var root_password = "{{ root_password }}";
var cluster_user = "{{ cluster_user }}";
var cluster_user_password = "{{ cluster_user_password }}";

try {
    // Prepare this instance for Group Replication and create the
    // cluster admin user; restart applies the changed settings.
    dba.configureInstance('root@localhost', {
        password: `${root_password}`,
        restart: true,
        interactive: false,
        clusterAdmin: `${cluster_user}`,
        clusterAdminPassword: `${cluster_user_password}`
    });
} catch (e) {
    print('\nThe InnoDB cluster admin user could not be created\n' + e.message + '\n');
}
</code>

<code js>
// templates/add_instances.js.j2
var cluster_user = "{{ cluster_user }}";
var cluster_user_password = "{{ cluster_user_password }}";
var clusterName = "{{ cluster_name }}";
// Render the Ansible list as a comma-separated string; parsing the
// Python repr of the list is fragile (stripping u'' prefixes would
// also eat the letter "u" in hostnames such as "mysql-cluster-b").
var secondary_nodes = "{{ secondary_nodes | join(',') }}".split(',');

try {
    print('Adding instances...\n');
    shell.connect(`${cluster_user}@{{ primary_node }}`, cluster_user_password);
    var cluster = dba.createCluster(clusterName);
    for (var i = 0; i < secondary_nodes.length; i++) {
        cluster.addInstance(`${cluster_user}@${secondary_nodes[i]}`, {
            password: `${cluster_user_password}`,
            recoveryMethod: "clone"
        });
    }
} catch (e) {
    print('\nThe InnoDB cluster could not be created.\n\nError: ' + e.message + '\n');
}
</code>

=====Management=====

It is all cool and stuff, but let's go through several scenarios with our cluster:

  - Loss of a member
  - Backup
  - Restore

First things first: what do we do if the connection between the primary and a secondary fails? Well, we have to rejoin it; let's see how.

====Loss of a member====

To simulate that, we will stop the service: someone stopped our service :-).

<code>
[root@mysql-cluster-b ~]# service mysqld stop
Redirecting to /bin/systemctl stop mysqld.service
[root@mysql-cluster-b ~]#
</code>

Our cluster could tolerate one failure; now that a member is down, it cannot tolerate any more:

<code>
 MySQL  10.14.2.193:33060+ ssl  JS > var cluster = dba.getCluster();
 MySQL  10.14.2.193:33060+ ssl  JS > cluster.status();
{
    "clusterName": "InnoDBCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysql-cluster-a:3306",
        "ssl": "REQUIRED",
        "status": "OK_NO_TOLERANCE_PARTIAL",
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active.",
        "topology": {
            "mysql-cluster-a:3306": {
                "address": "mysql-cluster-a:3306",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            },
            "mysql-cluster-b:3306": {
                "address": "mysql-cluster-b:3306",
                "memberRole": "SECONDARY",
                "mode": "n/a",
                "readReplicas": {},
                "role": "HA",
                "shellConnectError": "MySQL Error 2003: Could not open connection to 'mysql-cluster-b:3306': Can't connect to MySQL server on 'mysql-cluster-b:3306' (111)",
                "status": "(MISSING)"
            },
            "mysql-cluster-c:3306": {
                "address": "mysql-cluster-c:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "mysql-cluster-a:3306"
}
</code>

We see that the node is down. So let's simulate a restoration of the connection:

<code>
[root@mysql-cluster-b ~]# service mysqld start
Redirecting to /bin/systemctl start mysqld.service
[root@mysql-cluster-b ~]#
</code>

Now we can check again:

<code>
{
    "clusterName": "InnoDBCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysql-cluster-a:3306",
        "ssl": "REQUIRED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "mysql-cluster-a:3306": {
                "address": "mysql-cluster-a:3306",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            },
            "mysql-cluster-b:3306": {
                "address": "mysql-cluster-b:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            },
            "mysql-cluster-c:3306": {
                "address": "mysql-cluster-c:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.37"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "mysql-cluster-a:3306"
}
</code>

Now we see that it is working again. If for some reason a member doesn't rejoin on its own, we can try to force it using:

<code js>
cluster.rejoinInstance(instance)
cluster.rescan()
</code>

It is important to note that if the server's UUID has changed, we have to remove the instance from the metadata before adding it back:

<code js>
cluster.removeInstance("root@instanceWithOldUUID:3306", {force: true})
cluster.rescan()
</code>
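Tying this together, the whole check-and-rejoin dance can be scripted from mysqlsh. A minimal sketch, assuming the failed members are reachable again:

<code js>
// Walk the status report and try to bring back every member
// that is not ONLINE.
var cluster = dba.getCluster();
var topology = cluster.status()["defaultReplicaSet"]["topology"];

for (var address in topology) {
    if (topology[address]["status"] != "ONLINE") {
        try {
            print("Rejoining " + address + "...\n");
            cluster.rejoinInstance(address);   // assumes mysqld is back up there
        } catch (e) {
            print(address + " could not rejoin: " + e.message + "\n");
        }
    }
}

// Reconcile the metadata afterwards (e.g. after UUID changes).
cluster.rescan();
</code>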
====Backup====

====Restore====
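Spoiler until I write these two up: MySQL Shell ships dump-and-load utilities that cover both. A minimal sketch only, not a full procedure; the path and thread count are assumptions, and ''util.loadDump()'' requires ''local_infile=ON'' on the target server:

<code js>
// Backup: logical dump of the whole instance, taken on the primary.
// "/backups/full" is an assumed path; any directory writable by mysqlsh works.
util.dumpInstance("/backups/full", {threads: 4});

// Restore: load the dump into a freshly initialized server, then
// rebuild the cluster around it (configureInstance / createCluster /
// addInstance as above). Requires local_infile=ON on the target.
util.loadDump("/backups/full", {threads: 4});
</code>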