10 common replication problems

by nivas on September 23, 2014

Below are 10 common replication problems :

1)Session binlog :

Setting sql_log_bin = 0 will disable binary logging for the session. So any DML/DDL query executed within that session on the master will not be replicated to slave

Example:

Turn off binary logging on master :

mysql> set sql_log_bin = 0 ;
Query OK, 0 rows affected (0.00 sec)

Create table in reptest database ( on master )

mysql> create table reptest(ID int) ;
Query OK, 0 rows affected (0.01 sec)

mysql> show tables ;
+-------------------+
| Tables_in_reptest |
+-------------------+
| reptest           |
+-------------------+
1 row in set (0.00 sec)

Validate to see that table is not created on slave :

mysql> use reptest ;
Database changed
mysql> show tables ;
Empty set (0.00 sec)

2) Updating slave directly

Master and slave will get out of synch  if  the application makes changes to slave directly . It is a good practice to make the slave read_only but this wont stop DBAs from modifying slave directly

3) Binlog corruption

Below is an example of corrupted log :

# Position  Timestamp   Type   Master ID        Size      Master Pos    Flags 
#        4 51 2e 30 4d   0f   01 00 00 00   66 00 00 00   6a 00 00 00   00 00
#       17 04 00 35 2e 32 2e 34 2d  4d 61 72 69 61 44 42 2d |..5.2.4.MariaDB.|
#       27 6d 61 72 69 61 64 62 39  34 2d 6c 6f 67 00 00 00 |mariadb94.log...|
#       37 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|
#       47 00 00 00 00 00 00 00 00  13 38 0d 00 08 00 12 00 |.........8......|
#       57 04 04 04 04 12 00 00 53  00 04 1a 08 00 00 00 08 |.......S........|
#       67 08 08 02  |...|
#       Start: binlog v 4, server v 5.2.4-MariaDB-mariadb94-log created 110114  6:06:57
BINLOG '
US4wTQ8BAAAAZgAAAGoAAAAAAAQANS4yLjQtTWFyaWFEQi1tYXJpYWRiOTQtbG9nAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAUwAEGggAAAAICAgC
'/*!*/;
# at 48954702

In case of such corruption below are steps you can take :

i) Start replication issuing ‘change master’ command using appropriate replication coordinates(next position)

change master to master_log_position = next_pos;
start slave;

ii) You might need to use pt-table-checksum and pt-table-sync to synchronize the slave.

iii) Restore the database if above steps fail

4) Relay log corruption

Relay log corruption can be  caused due to network/transmission issues . Use mysqlbinlog to find the replication coordinates before corruption.  Start replication using ‘CHANGE MASTER’ command with right coordinates

STOP SLAVE;
CHANGE MASTER TO
MASTER_HOST='< master-host ip or DNS >',
MASTER_PORT=3306,
MASTER_USER='< usernmae >',
MASTER_PASSWORD='< password >',
MASTER_LOG_FILE='< MMMM >',
MASTER_LOG_POS=< PPPP >;START SLAVE;

5)Deleting/Purging binary logs

Before you purge binary logs on the master , you have to make sure that those logs are already processed by slave. So , be extra-cautious before you use ‘PURGE BINARY LOGS ‘ on the master server

6) Replication unsafe (non-deterministic function)

Below are some of the examples for non-deterministic functions  . Using these functions will yield a different result on master and slave

FOUND_ROWS(), GET_LOCK(), IS_FREE_LOCK(), IS_USED_LOCK(), LOAD_FILE(), MASTER_POS_WAIT(), PASSWORD(), RAND(), RELEASE_LOCK(), ROW_COUNT(), SESSION_USER(), SLEEP(), SYSDATE(), SYSTEM_USER(), USER(), UUID(), and UUID_SHORT().

Example :
On master :

mysql> insert into timetable values(sysdate()) ;
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> select * from timetable ;
+---------------------+
| tvalue              |
+---------------------+
| 2014-09-21 14:56:05 |
+---------------------+
1 row in set (0.00 sec)

Slave has a different value now:

mysql> select * from timetable ;
+---------------------+
| tvalue              |
+---------------------+
| 2014-09-21 14:56:02 |
+---------------------+
1 row in set (0.00 sec)

 

7) Bad change master to statement

‘Change master to ‘ statement has to be executed with caution. If you supply incorrect binlog position that could miss some of the statements. Other statements like ‘ skip global sql_slave_skip_counter ‘ , ‘reset slave’ , ‘reset master’ have to be used with caution

8)binlog-do-db, binlog-ignore-db, replicate-do-db, and replicate-ignore-db.

These configuration options can be used to replicate selectively. But if these are configured incorrectly it could cause data drift

9)Replication bugs

Below are some replication bugs(currently active and not resolved) that could break replication

http://bugs.mysql.com/bug.php?id=72556

http://bugs.mysql.com/bug.php?id=71859

http://bugs.mysql.com/bug.php?id=70832

http://bugs.mysql.com/bug.php?id=71111

10)Master crash

Master server crash could result in replication errors . Sync_binlog variable helps in this case

If the value of this variable is greater than 0, the MySQL server synchronizes its binary log to disk (using fdatasync()) after every sync_binlog writes to the binary log. There is one write to the binary log per statement if autocommit is enabled, and one write per transaction otherwise. The default value of sync_binlog is 0, which does no synchronizing to disk—in this case, the server relies on the operating system to flush the binary log’s contents from to time as for any other file. A value of 1 is the safest choice because in the event of a crash you lose at most one statement or transaction from the binary log. However, it is also the slowest choice (unless the disk has a battery-backed cache, which makes synchronization very fast).

Comments on this entry are closed.

Previous post:

Next post: