====== Replication ======

High Availability Database Replication Setup

===== Replication Setups =====

**A) Master - Slave**

  * 2 servers required
  * Instant backup (the slave may be seconds or hours behind the master)
  * The slave does not impact performance on the master: replication is asynchronous, so the master does not wait for the slave
  * Dumps can be taken on the slave without locking the master
  * Switching the slave to master is easy, but restoring the old master is tricky.

**B) Master - Master**

  * 2 servers required
  * Instant backup (the slave may be seconds or hours behind the master)
  * Performance impact on the active master should be low
  * Setup is easy: it is basically just a Master -> Slave connection in both directions
  * Do not write to both servers!
  * If both servers insert or update rows in the same table, replication will break on one server with a "duplicate entry" error. To resolve this there are the "auto_increment_increment" and "auto_increment_offset" settings.
  * Master-master setup as active-passive replication: if we know that only one node is performing data modifications, we can avoid many possible problems. In case of a failover the "slave" can easily be promoted to the new master; data modifications are automatically replicated to the failed node when it comes back up.

**C) Galera Cluster**

  * Min. 3 servers required (to avoid a split-brain situation with 2 servers)
  * Performance is as slow as the slowest node!! All writes need to be confirmed, because replication is synchronous
  * A quorum-only arbitrator daemon ('garbd') can be set up, which takes part in the quorum vote but does not run an actual database
  * ...

===== Master <-> Master Setup =====

Get two nodes running on two machines with an SSL connection. This works even over internet connections between different datacenters, but the slave lag may be 0 seconds, minutes or even hours, depending on the connection and the load on the servers.

==== my.cnf Options ====

<code>
# on Server A:
server-id      = 1
gtid-domain-id = 1
# on Server B:
# server-id      = 2
# gtid-domain-id = 2

log_bin             = /var/log/mysql/mysql-bin.log
log_bin_index       = /var/log/mysql/mariadb-bin.index
relay_log           = /var/log/mysql/relay-bin
relay_log_index     = /var/log/mysql/relay-bin.index
relay_log_info_file = /var/log/mysql/relay-bin.info
expire_logs_days    = 10
max_binlog_size     = 100M
# binlog_do_db = example,...   # you may set specific dbs for replication; leave it unset to sync all dbs

binlog_format     = ROW
log_slave_updates = 1
</code>

==== Activate replication ====

Import data on the slave, e.g. from an innobackup - ideally it is the same data. At least it should not lag much, so export on the running master server and import it as soon as possible.
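That export/import could look like the following - a minimal sketch assuming mariabackup (MariaDB's build of xtrabackup) is installed on both machines; the paths, the root password and the rsync transfer are placeholders, not part of the setup above:

<code>
# on Server A (the running master): take a full backup and prepare it
mariabackup --backup --target-dir=/tmp/full-backup --user=root --password=xxxx
mariabackup --prepare --target-dir=/tmp/full-backup

# transfer the backup to the slave, e.g. with rsync (hostname is a placeholder)
rsync -a /tmp/full-backup/ serverB:/tmp/full-backup/

# on Server B: restore it - the datadir must be empty for --copy-back
systemctl stop mariadb
rm -rf /var/lib/mysql/*
mariabackup --copy-back --target-dir=/tmp/full-backup
chown -R mysql:mysql /var/lib/mysql
systemctl start mariadb
</code>

The backup also contains the xtrabackup_binlog_info file with the binlog position, which is needed below.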
Log into your mysql console as root for the various SQL commands:

<code>
mysql -u root --password=xxxx
</code>

**Add a replication user on both nodes:**

<code>
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%' IDENTIFIED BY 'xyxyxyxyxyxyxyxyxy' REQUIRE SSL;
</code>

**The password must not be longer than __32 chars__ or you get permission denied after the first restart!!**

**On Server B - the first Slave:**

**Import a fresh xtrabackup from Server A - the first master - as soon as possible.**

**The best way to sync the first slave is to take the binlog position from the backup, because there might be writes on Server A in between, so the slave needs to catch up from the backup's position:**

<code>
cat /var/lib/mysql/xtrabackup_binlog_info
</code>

You get some info like this:

<code>
mysql-bin.000002        9235331
</code>

Set all replication parameters - use the values from above to set the position:

<code>
CHANGE MASTER TO
  MASTER_HOST='1.2.3.4',
  MASTER_USER='replicator',
  MASTER_PASSWORD='xyxyxyxyxyxyxyxyxy',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='mysql-bin.000002',
  MASTER_LOG_POS=9235331,
  MASTER_SSL=1,
  MASTER_SSL_CA='/etc/mysql/ssl/ca-cert.pem',
  MASTER_SSL_CERT='/etc/mysql/ssl/client-cert.pem',
  MASTER_SSL_KEY='/etc/mysql/ssl/client-key.pem';

START SLAVE;
SHOW SLAVE STATUS \G
</code>

The slave should be running now and Seconds_Behind_Master should decrease. \\

**And the reverse connection: Server A becomes a slave, too.**

**On Server B - the first Slave** - stop the slave thread and **lock the tables**, so that neither writes replicated from the master nor local writes change the log position, then read it:

<code>
STOP SLAVE;
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS \G
</code>

<code>
..
File: mysql-bin.000052
Position: 597
..
</code>

**On Server A - the first Master** - insert the values from Server B:

<code>
CHANGE MASTER TO
  MASTER_HOST='1.2.3.4',
  MASTER_USER='replicator',
  MASTER_PASSWORD='xyxyxyxyxyxyxyxyxy',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='mysql-bin.000052',
  MASTER_LOG_POS=597,
  MASTER_SSL=1,
  MASTER_SSL_CA='/etc/mysql/ssl/ca-cert.pem',
  MASTER_SSL_CERT='/etc/mysql/ssl/client-cert.pem',
  MASTER_SSL_KEY='/etc/mysql/ssl/client-key.pem';

START SLAVE;
SHOW SLAVE STATUS \G
</code>

You should see:

<code>
Slave_IO_State: Waiting for master to send event
...
Seconds_Behind_Master: 0
</code>

Finally, release the lock on Server B with UNLOCK TABLES; and resume replication there with START SLAVE;

Seconds_Behind_Master is the slave lag. It may be 0, or it should be decreasing as the slave catches up. \\

FIXME There are discussions about running xtrabackup with --no-lock, because at the end of the process it sets a FTWRL (FLUSH TABLES WITH READ LOCK), which can take 5-10 seconds or even much longer, depending on the running queries. During that time write queries are blocked. Need to test whether the import works with the binlog position from xtrabackup_binlog_pos_innodb, which is generated anyway without a lock. \\

==== GTID ====

Switching to global transaction IDs (GTID) is very easy if you run MariaDB. GTID is the modern, preferred replication method and it should be more reliable... although it might be more complicated to fix if it breaks (I don't know yet).

<code>
STOP SLAVE;
CHANGE MASTER TO master_use_gtid=slave_pos;
START SLAVE;
SHOW SLAVE STATUS \G
</code>

Now you should see the last lines like this:

<code>
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 1-1-4416385,2-2-28619,0-2-8
</code>
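To check whether GTID replication is in sync, the current positions can also be read from the server variables on each node - a small sketch, with placeholder credentials:

<code>
# gtid_slave_pos:   last GTID applied by the slave threads
# gtid_binlog_pos:  last GTID written to the local binlog
# gtid_current_pos: combination of both, usable with master_use_gtid=current_pos
mysql -u root --password=xxxx -e "SELECT @@gtid_slave_pos, @@gtid_binlog_pos, @@gtid_current_pos \G"
</code>

When both nodes are idle and in sync, the positions on Server A and Server B should converge to the same GTIDs.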