Replication Event Format in Detail MySQL

The Common Header

Each event starts with a header of size LOG_EVENT_HEADER_LEN=19 (defined in log_event.h),which contains:

  • timestamp
  • 4 bytes, seconds since 1970.
  • event type
  • 1 byte. 1 means START_EVENT, 2 means QUERY_EVENT, etc (these numbers are defined in an enum Log_event_type in log_event.h).
  • server ID
    4 bytes. The server ID of the mysqld which created this event. When using circular replication (with option --log-slave-updates on), we use this server ID to avoid endless loops. Suppose that M1, M2, and M3 have server ID values of 1, 2, and 3, and that they are replicating in circular fashion: M1 is the master for M2, M2 is the master for M3, and M3 is that master for M1. The master/ server relationships look like this:

    A client sends an INSERT query to M1. Then this is executed on M1, then written in the binary log of M1, and the event's server ID is 1. The event is sent to M2, which executes it and writes it to the binary log of M2; the event written still has server ID 1 (because that is the ID of the server that originally created the event). The event is sent to M3, which executes it and writes it to the binary log of M3, with server ID 1. This last event is sent to M1, which sees “server ID = 1” and understands this event comes from itself, so has to be ignored.

  • event total size
  • 4 bytes. Size of this event in bytes. Most events are 10-1000 bytes, except when using LOAD DATA INFILE (where events contain the loaded file, so they can be big).
  • position of the event in the binary log
  • 4 bytes. Offset in bytes of the event in the binary log, as returned by tell(). It is the offset in the binary log where this event was created in the first place. That is, it is copied as-is to the relay log. It is used on the slave, for SHOW SLAVE STATUS to be able to show coordinates of the last executed event in the master's coordinate system. If this value were not stored in the event, we could not know these coordinates because the slave cannot invoke tell() for the master's binary log.
  • flags
    2 bytes of flags. Almost unused for now. The only one which is used in 4.1 is LOG_EVENT_THREAD_SPECIFIC_F, which is used only by mysqlbinlog (not by the replication code at all) to be able to deal properly with temporary tables. mysqlbinlog prints queries from the binary log, so that one can feed these queries into mysql (the command-line interpreter), to achieve incremental backup recovery. But if the binary log looks like this:

    (two simultaneous threads used temporary tables with the same name, which is allowed as temporarytables are visible only in the thread which created them), then simply feeding this into mysql will lead to the “table t already exists” error. This is why events which use temporary tables are marked with the flag, so that mysqlbinlog knows it has to set the pseudo_thread_id before, like this:

    This way there is no confusion for the server which receives these queries. Always printing SET PSEUDO_THREAD_ID, even when temporary tables are not used, would cause no bug, it would just slow down.

The “Post-headers” (Event-specific Headers)

After the common header, each event has an event-specific header of fixed size (0 or more bytes) and a variable-sized part (0 or more bytes). It's easy for the slave to know the size of the variable-sized part: it is the event's size (contained in the common header) minus the size of the common header, minus the size of the event-specific header.

  • START_EVENT
  • In MySQL 4.0 and 4.1, such events are written only for the first binary log since mysqld startup. Binlogs created afterwards (by FLUSH LOGS) do not contain this event. In MySQL 5.0 we will change this; all binary logs will start with a START_EVENT, but there will be a way to distinguish between a START_EVENT created at mysqld startup and other START_EVENTs; such distinction is needed because the first category of START_EVENT, which means the master has started, should trigger some cleaning tasks on the slave (suppose the master died brutally and restarted: the slave must delete old replicated temporary tables).
  • 2 bytes: The binary log format version. This is 3 in MySQL 4.0 and 4.1; it will be 4 in MySQL 5.0.
  • 50 bytes: The MySQL server's version (example: 4.0.14-debug-log).
  • 4 bytes: Timestamp in seconds when this event was created (this is the moment when the binary log was created). In fact this is useless information as we already have the timestamp in the common header, so this useless timestamp should NOT be used, because we plan to change its meaning soon.
  • No variable-sized part.
  • QUERY_EVENT
  • 4 bytes: The thread ID of the thread that issued this query. Needed for temporary tables. This is also useful for a DBA for knowing who did what on the master.
  • 4 bytes: The time in seconds which the query took for execution. Only useful for inspection by the DBA.
  • 1 byte: The length of the name of the database which was the default database when the query was executed (later in the event we store this name; this is necessary for queries like INSERT INTO t VALUES(1) which don't specify the database, relying on the default database previously selected by USE).
  • 2 bytes: The error code which the query got on the master. Error codes are defined in include/ mysqld_error.h. 0 means no error. How come queries with a non-zero error code can exist in the binary log? This is mainly due to the non-transactional nature of MyISAM tables. If an INSERT SELECT fails after inserting 1000 rows (for example, with a duplicate-key violation), then we have to write this query to the binary log, because it truly modified the MyISAM table. For transactional tables, there should be no event with a non-zero error code (though it can happen, for example if the connection was interrupted (Control-C)). The slave checks the error code: After executing the query itself, it compares the error code it got with the error code in the event, and if they are different it stops replicating (unless --slave-skip-errors was used).
  • Variable-sized part: The concatenation of the name of the default database (null-terminated) andthe query. The slave knows the size of the name of the default database (it's in the event-specific header) so by subtraction it can know the size of the query.
  • STOP_EVENT
  • No event-specific header, no variable-sized part. It just means “Stop” and the event's type says it all. This event is purely for informational purposes, it is not even queued into the relay log.
  • ROTATE_EVENT
  • This event is information for the slave to know the name of the next binary log it is going to receive.
  • 8 bytes: Useless, alway contains the number 4 (meaning the next event starts at position 4 in the next binary log).
  • variable-sized part: The name of the next binary log.
  • INTVAR_EVENT
  • 8 bytes: the value to be used for the auto_increment counter or LAST_INSERT_ID(). 8 bytes corresponds to the size of MySQL's BIGINT type.
  • No variable-sized part.
  • LOAD_EVENT
  • This is an event for internal use. One should only need to be able to read CREATE_FILE_EVENT (see below).
  • SLAVE_EVENT
  • This event is never written so it cannot exist in a binlog. It was meant for failsafe replication which will be reworked.
  • CREATE_FILE_EVENT
  • LOAD DATA INFILE is not written to the binlog like other queries; it is written in the form of a CREATE_FILE_EVENT; the command does not appear in clear-text in the binlog, it's in a packed format. This event tells the slave to create a temporary file and fill it with a first data block. Later, zero or more APPEND_BLOCK_EVENT events append blocks to this temporary file. EXEC_ LOAD_EVENT tells the slave to load the temporary file into the table, or DELETE_ FILE_EVENT tells the slave not to do the load and to delete the temporary file. DELETE_ FILE_EVENT occurs is when the LOAD DATA failed on the master: on the master we start to write loaded blocks to the binlog before the end of the command. If for some reason there is an error, we have to tell the slave to abort the load. The format for this event is more complicated than for others, because the command has many options. Unlike other events, fixed headers and variablesized parts are intermixed; this is due to the history of the LOAD DATA INFILE command.
  • 4 bytes: The thread ID of the thread that issued this LOAD DATA INFILE. Needed for temporary tables. This is also useful for a DBA for knowing who did what on the master.
  • 4 bytes: The time in seconds which the LOAD DATA INFILE took for execution. Only useful for inspection by the DBA.
  • 4 bytes: The number of lines to skip at the beginning of the file (option IGNORE number LINES of LOAD DATA INFILE).
  • 1 byte: The size of the name of the table which is to be loaded.
  • 1 byte: The size of the name of the database where this table is.
  • 4 bytes: The number of columns to be loaded (option (col_name,...)). Will be non-zero only if the columns to load were explicitly mentioned in the command.
  • 4 bytes: An ID for this file (1, 2, 3, etc). This is necessary in case several LOAD DATA INFILE commands have been run in parallel on the master: in that case the binlog contains events for the first command and for the second command intermixed; the ID is used to resolve to which file the blocks in APPEND_BLOCK_EVENT must be appended, and which file must be loaded by the EXEC_LOAD_EVENT event, and which file must be deleted by the DELETE_FILE_EVENT.
  • 1 byte: The size of the field-terminating string (FIELDS TERMINATED BY option).
  • variable-sized part: The field-terminating string (null-terminated).
  • 1 byte: The size of the field-enclosing string (FIELDS ENCLOSED BY option).
  • variable-sized part: The field-enclosing string (null-terminated).
  • 1 byte: The size of the line-terminating string (LINES TERMINATED BY option).
  • variable-sized part: The line-terminating string (null-terminated).
  • 1 byte: The size of the line-starting string (LINES STARTING BY option).
  • variable-sized part: The line-starting string (null-terminated).
  • 1 byte: The size of the escaping string (FIELDS ESCAPED BY option).
  • variable-sized part: The escaping string (null-terminated).
  • 1 byte: Flags: OPT_ENCLOSED_FLAG (FIELD OPTIONALLY ENCLOSED BY option), REPLACE_FLAG (LOAD DATA INFILE REPLACE), IGNORE_FLAG (LOAD DATA INFILE IGNORE), DUMPFILE_FLAG (unused). All these are defined in log_event.h.
  • 1 byte: The size of the name of the first column to load.
  • etc
  • 1 byte: The size of the name of the last column to load.
  • Variable-sized part: The name of the first column to load (null-terminated).
  • etc
  • Variable-sized part: The name of the last column to load (null-terminated).
  • Variable-sized part: The name of the table which is to be loaded (null-terminated).
  • Variable-sized part: The name of the database containing the table (null-terminated).
  • Variable-sized part: The name of the file which was loaded (that's the original name, from the master) (null-terminated).
  • Variable-sized part: The block of raw data to load.
  • Here is a concrete example:

    concrete example

  • Line 1a0: number of lines to skip at the beginning of the file (02 0000 00), size of the table's name (01), size of the database's name (04), number of columns to load (03 0000 00), the file's id (03 0000 00).
  • Line 1b0: size of the field terminating string (01), field terminating string (2c i.e. ,), size of the field enclosing string (01), field enclosing string (22 i.e. "), size of the line terminating string (01), line terminating string (0a i.e. newline), size of the line starting string (01), line starting string (3e i.e. >), size of the escaping string (01), escaping string (5c i.e. backslash), flags (06)(that's OPT_ENCLOSED_FLAG | REPLACE_FLAG), size of the name of the first column to load (01), size of the name of the second column to load (01), size of the name of the third column to load (01), name of the first column to load (61 00 i.e. "a").
  • Line 1c0: name of the second column to load (62 00), name of the third column to load (63 00), name of the table to load (78 00), name of the database to load (74 6573 7400), name of the file loaded on the master (2f6d 2f74 6d70 2f75 2e74 7874 00).
  • Line 1d0 and following: raw data to load (3e 312c 322c 330a 3e34 2c35 2c36 0a3e 372c 382c 390a 3e31 302c 3131 2c31 32). The next byte is the beginning of the EXEC_LOAD_EVENT event.
  • APPEND_BLOCK_EVENT
  • 4 bytes: The ID of the file this block should be appended to.
  • Variable-sized part: the raw data to load.
  • EXEC_LOAD_EVENT
  • 4 bytes: the ID of the file to be loaded.
  • No variable-sized part.
  • DELETE_FILE_EVENT
  • 4 bytes: The ID of the file to be deleted.
  • No variable-sized part.
  • NEW_LOAD_EVENT
  • For internal use.
  • RAND_EVENT
  • RAND() in MySQL uses 2 seeds to compute the random number.
  • 8 bytes: Value for the first seed.
  • 8 bytes:
  • Value for the second seed.
  • No variable-sized part.
  • USER_VAR_EVENT
  • 4 bytes: the size of the name of the user variable.
  • variable-sized part: A concatenation. First is the name of the user variable. Second is one byte, non-zero if the content of the variable is the SQL value NULL, ASCII 0 otherwise. If this bytes was ASCII 0, then the following parts exist in the event. Third is one byte, the type of the user variable, which corresponds to elements of enum Item_result defined in include mysql_com.h. Fourth is 4 bytes, the number of the character set of the user variable (needed for a string variable). Fifth is 4 bytes, the size of the user variable's value (corresponds to member val_len of class Item_string). Sixth is variable-sized: for a string variable it is the string, for a float or integer variable it is its value in 8 bytes.



Face Book Twitter Google Plus Instagram Youtube Linkedin Myspace Pinterest Soundcloud Wikipedia

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

MySQL Topics