Replication Threads MySQL

Every time replication is started on the slave mysqld, i.e. when mysqld is started with some replication options (--master-host= this_ hostname etc.) or some existing master.info and relay- log.info files, or when the user does START SLAVE on the slave, two threads are created on the slave, in slave.cc:

The I/O thread connects to the master using a user/password. When it has managed to connect, it asks the master for its binary logs:

Replication Threads

Replication Threads

read_event() calls net_safe_read() to read what the master has sent over the network. queue_event() writes the read event to the relay log, and also updates the counter which keeps track of the space used by all existing relay logs. flush_master_info() writes to the master.info file the new position up to which the thread has read in the master's binlog. Finally, if relay logs take too much space, the I/O thread blocks until the SQL thread signals it's okay to read and queue more events. The <cut> code handles network failures and reconnections.

When the relay log gets too large, it is “rotated”: The I/O thread stops writing to it, closes it, opens a new relay log and writes to the new one. The old one is kept, until the SQL thread (see below) has finished executing it, then it is deleted. The meaning of “too large” is determined as follows:

  • max_relay_log_size, if max_relay_log_size > 0
  • max_binlog_size, if max_relay_log_size = 0 or MySQL is older than 4.0.14

The Slave SQL Thread

The Slave SQL Thread

exec_relay_log_event() reads an event from the relay log (by calling next_event()). next_event() will start reading the next relay log file if the current one is finished; it will also wait if there is no more relay log to read (because the I/O thread is stopped or the master has nothing more to send to the slave). Finally exec_relay_log_event() executes the read event (all ::exec_event() methods in log_event.cc) (mostly this execution goes through sql_parse.cc), thus updating the slave database and writing to relay-log.info the new position up to which it has executed in the relay log. The ::exec_event() methods in log_event.cc will take care of filter options like replicate-do-table and such.

When the SQL thread hits the end of the relay log, it checks whether a new one exists (that is, whether a rotation has occurred). If so, it deletes the already-read relay log and starts reading the new one. Otherwise, it just waits until there's more data in the relay log.

Why 2 Threads?

In MySQL 4.0.2 we split the job into two threads, using a relay log file to exchange between them.

This makes code more complicated. We have to deal with the relay log being written at the end, read at another position, at the same time. Plus handling the detection of EOF on the relay log, switching to the new relay log. Also the SQL thread must do different reads, depending on whether the relay log it is reading

  • is being written to by the I/O thread; then the relay log is partly in memory, not all on disk, and mutexes are needed to avoid confusion between threads.
  • has already been rotated (the I/O thread is not writing to it anymore), in which case it is a normal file that no other threads touches.
  • The advantages of having 2 threads instead of one:

  • Helps having a more up-to-date slave. Reading a query is fast, executing it is slow. If the master dies (burns), there are good chances that the I/O thread has caught almost all updates issued on the master, and saved them in the relay log, for use by the SQL thread.
  • Reduces the required master-slave connection time. If the slave has not been connected for a long time, it is very late compared to the master. It means the SQL thread will have a lot of executing to do. So with the single-thread read-execute-read-execute technique, the slave will have to be connected for a long time to be able to fetch all updates from the master. Which is stupid, as for a significant part of the time, the connection will be idle, because the single thread is busy executing the query. Whereas with 2 threads, the I/O thread will fetch the binlogs from the master in a shorter time. Then the connection is not needed anymore, and the SQL thread can continue executing the relay log.

The Binlog Dump Thread

Binlog Dump ThreadBinlog Dump Thread

If this thread starts reading from the beginning of a binlog, it is possible that the slave does not know the binlog's name (for example it could have just asked “give me the FIRST binlog”). Using fake_rotate_event(), the master tells the slave the binlog's name (required for master.info and SHOW SLAVE STATUS) by building a Rotate_log_event and sending this event to the slave.

In this event the slave will find the binlog's name. This event has zeros in the timestamp (shows up as written in year “1970” when reading the relay log with mysqlbinlog).


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

MySQL Topics