MySQL Replication

Let's suppose a MySQL server failed was catastrophically destroyed by any typical circumstance such as by a stray gun shot in the machine room. Naturally, the first step would be to point all queries at the spare, but then you need to rebuild the failed node. Call the other node source, since it's the one we'll clone to get the failed node running again. The general idea of the procedure is to clone the data, then reconnect replication. I'll use S as a placeholder for the source server and F as the placeholder for failed server.

mysqldump --add-drop-database --events --routines --triggers --add-locks=0 --complete-insert --create-options --disable-keys --extended-insert --lock-tables=0 --single-transaction --master-data=1 -h 127.0.0.1 -u root -p --all-databases CHANGE MASTER TO MASTER_HOST = '192.168.x.x', MASTER_USER = 'replicator', MASTER_PASSWORD = 'xxx', MASTER_CONNECT_RETRY = 3, MASTER_RETRY_COUNT=0, MASTER_HEARTBEAT_PERIOD=1, MASTER_LOG_FILE = 'log-bin.xxx', MASTER_LOG_POS = xxx; START SLAVE;
 * Stop replication on S. The prevents it from automatically restarting replication from the wrong position. On S, run "STOP SLAVE;".[1]
 * Clone the dataset atomically.
 * One can do this by stopping S and using tar+ssh, but this incurs downtime.
 * A better option is to use mysqldump, but if you have MyISAM tables that are taking changes you'll need to omit --single-transaction and instead use --lock-tables, which will pause writes to the database. This may be considered downtime.
 * 1) Note: Need to update these notes with notes for when GTIDs are enabled.
 * mysqldump can take a really long time, so an even better tool is a ZFS snapshot, if the source filesystem is ZFS. Start with "FLUSH LOGS WITH READ LOCK", "zfs snapshot", "UNLOCK TABLES" (or close mysql session), and then at your leisure use "zfs send" and "zfs receive" to transfer the snapshot.
 * Before starting mysql on F, be sure "server-id" and auto-increment-increment are unique within the MySQL cluster. Also, temporarily add "skip-slave-start" to prevent it from trying to start replicating from itself. Also remove auto.cnf and .{err,pid,log}.
 * Record the replication position. It's important to start replication from exactly the right spot. Note that "SHOW MASTER STATUS" offers these values but they will not be from the exact point in time that you took the snapshot. To get that, find the last log-bin.xxx file and its size in the snapshot (or before starting MySQL) on F.
 * Configure replication on both S and F. Since you just cloned it, the starting position will be the same in both directions (although they will quickly diverge). Connect to MySQL on F first to be sure it works correctly, and then on S. (Either order works, but it's safer to do F first in case you make a mistake):
 * Confirm that replication is working by inspecting "SHOW SLAVE STATUS\G" on both. It should report Seconds_Behind_Master as something besides NULL. Normally this should be 0 or maybe a few seconds, but while replication catches up it may get pretty high.
 * Remove skip-slave-start from my.cnf on both.
 * When you're comfortable, remember to destroy the old ZFS snapshots on both nodes to avoid running out of disk space.

In another scenario, let's suppose that the underlying machinery and filesystems and databases are fine, but replication has broken (Seconds_Behind_Master = NULL). Other than connectivity problems, the most common reason for this is a transaction conflict. Recovering from this type of situation is very fiddly and requires careful thought to avoid making the problem worse. "SHOW SLAVE STATUS\G" will report the error. Such an error might look like: "Could not execute Write_rows event on table regcode.soap_audit; Duplicate entry '2202500' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log log-bin.000013, end_log_pos 1263". Your first priority is to restore data integrity. This means you have to think about what the error means. In this case, it means that the slave is trying to insert a row with the same primary key as a row that already exists. You have to figure out whether the new row is merely a duplicate of the existing row, or if its logically different data that you have to patch up. Do this by inspecting each node with the MySQL client (SELECT the row and look at it). In the end, you may need to tell replication to skip the offending transaction, then fix it up after replication recovers. The commands to do this are: "STOP SLAVE; SET GLOBAL sql_slave_skip_counter = 1; START SLAVE;". Another useful command is "SET sql_log_bin = 0;", which will let you issue queries that are not replicated so you can change data on one node to match the other without having that statement replicated which would cause another conflict later; of course, there is danger is that you'll make things even more different if you make a mistake. Another option is to destroy one replica and rebuild it from the other, but you'll need to somehow find the differences between the two first to be sure you don't lose data.

[1] Note that "STOP SLAVE;" is a temporary command. If you were to restart mysql, it would begin replicating again just as if you'd executed "START SLAVE;". Adding skip-start-slave to my.cnf is a persistent way to prevent it from starting replication, but be careful not to forget to remove that when you're done. "CHANGE MASTER TO" is the command for permanently changing replication or turning it off.

Rolling back a single database
Suppose one database in the cluster is screwed up from a bad query or whatever. You can restore from a backup without disrupting the other databases.

First, you need a temporary server with the old data. An easy way to get this is to just restore the whole backup of the production server to a spare machine or jail or VM. Before booting it, or a least before allowing MySQL to start, set skip-start-slave in my.cnf, or else it will immediately start syncing data from the production server and replay the catastrophic queries. Note that when restoring a filesystem-level backup, which is what we're using, you need the same or later version of MySQL running on the same OS and architecture.

The next thing is to use mysqldump to extract the one database you're interested in. Then pipe that into the mysql client. I like to store it in a file so I can inspect it inbetween to be comfortable that it contains what I expect. DO NOT specify --master-data; that's for rebuilding a server from scratch and in this scenario will mess up the production server's replication and replay queries for other databases. mysqldump --add-drop-database --events --routines --triggers --add-locks=0 --complete-insert --create-options --disable-keys --extended-insert --lock-tables=0 --single-transaction --flush-logs --master-data=1 -h 127.0.0.1 -u root -p --databases mydb > /tmp/mydb.mysqldump less /tmp/mydb.mysqldump mysql -h prodserver -u root -p < /tmp/mydb.mysqldump

Debugging replication
Detailed explanation: Debugging Problems With Row Based Replication by Percona
 * SHOW SLAVE STATUS\G
 * mysqlbinlog /mysql/log-relay.000002 --base64-output=decode-rows -v | less
 * --offset doesn't seem to do what I think it should. Instead search: ^# at 42231037$