MySQL Backup Code Walkthrough (Pseudo Code)

Level 1:

mysql_execute_command("backup database test to 'test.bak' overwrite")
{
  Perform implicit transaction commit (done for such statements);
  Get BML shared lock if needed:          // blocks DDL (similar to a read lock)
    BML_instance->get_shared_lock(thd);
  Wait for Global Read Lock if needed;    // only for "CREATE USER" etc. commands
  For BACKUP: execute_backup_command();
}

execute_backup_command()
{
  Backup_restore_ctx context(thd);        // create context instance

  For BACKUP:
    // prepare for backup
    Backup_info *info= context.prepare_for_backup(location, orig_loc);
    info->add_dbs();                      // select objects to back up
    info->close();                        // indicate that selection is done
    context.do_backup();                  // perform backup
    context.close();                      // explicit clean-up

  For RESTORE:
    Backup_restore_ctx context(thd);      // create context instance
    // prepare for restore
    Restore_info *info= context.prepare_for_restore(location, orig_loc);
    context.do_restore();                 // perform restore
    context.close();                      // explicit clean-up
}

Backup_info *Backup_restore_ctx::prepare_for_backup(...)
{
  Report starting time;
  If another backup/restore session is running, report an error;
  Compute the full path to the backup file;
  mem_alloc= new Mem_allocator();         // separate memory allocator for backup

  // Get the backup meta lock (BML) to block all DDL (a kind of write lock).
  obs::bml_get(m_thd);

  // Open a new output stream.
  Output_stream *s= new Output_stream(...);

  Backup_info *info= new Backup_info(...)
  {
    Create a linked list of backup engines: default, ConsistentSnapshot, nodata;
    // The list above is used to select a backup engine later.
    Remember whether the server binlog is enabled -- we will store the VP's
    binlog position in the image.
    Mark the start time.
  }
}

int Backup_info::add_dbs(...)
{
  Get the list of (specified) databases and add it to the backup info.
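Before descending into add_dbs(), the context lifecycle above can be sketched as a small state machine. This is a minimal illustration of the call ordering only (prepare -> select -> close selection -> do_backup -> close); the class, its states, and the exception-based error reporting are all invented for the sketch and are not the real server types.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch of the Backup_restore_ctx call sequence.
// Only the ordering contract is modeled, not the real server code.
class BackupContext {
 public:
  enum class State { CREATED, PREPARED, SELECTED, DONE, CLOSED };

  void prepare_for_backup(const std::string& location) {
    if (state_ != State::CREATED) throw std::logic_error("already prepared");
    location_ = location;
    state_ = State::PREPARED;   // BML + output stream happen here in real code
  }
  void add_dbs(const std::vector<std::string>& dbs) {
    if (state_ != State::PREPARED) throw std::logic_error("not prepared");
    dbs_ = dbs;
    state_ = State::SELECTED;   // Backup_info::close() marks selection done
  }
  void do_backup() {
    if (state_ != State::SELECTED) throw std::logic_error("no selection");
    state_ = State::DONE;       // preamble + table data + summary
  }
  void close() { state_ = State::CLOSED; }  // explicit clean-up
  State state() const { return state_; }

 private:
  State state_ = State::CREATED;
  std::string location_;
  std::vector<std::string> dbs_;
};
```

The point of the sketch is that the kernel enforces this ordering: selecting objects before prepare_for_backup(), or calling do_backup() without a closed selection, is an error.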
  Do all of the following for each db:
    // Execute a query via run_service_interface_sql() to get info
    // about the DB from the information schema.
    obs::Obj *obj= get_database_stub(thd, &db_name);
    check_db_existence()
    {
      Run query "SHOW CREATE DATABASE test" and confirm it succeeds;
    }
    Db *db= add_db(obj)
    {
      check_access(m_thd, BACKUP_ACL, ...);   // check BACKUP privileges
      obs::check_user_access(m_thd, name)
      {
        // Check whether the user has access to all DB objects
        // (tables, triggers, events and routines):
        Get the number of (db) objects by running a SELECT query both
        without and with privilege elevation. If the results differ,
        the user lacks a privilege.
      }
      // Note: Backup_info inherits from Image_info.
      Db *db= Image_info::add_db(obs::Obj *, pos);
    }
    add_db_items(*db)
    {
      Obj_iterator *it= get_db_tables(m_thd, &db.name())
      {
        // si_objects.cc: run a SELECT query and iterate over the result.
        create_row_set_iterator(thd, ..query...);
      }
      For each table t
      {
        add_table(db, obj)
        {
          backup::Snapshot_info *snap= find_backup_engine(t)
          {
            Get the storage engine reference for this table.
            // See backup_info.cc: get_storage_engine().
            // It opens the table as a temp table to get SE info !!!
            If it is a partitioned table, consider the SE of the underlying
            table; native drivers don't support partitioning.
            Get the handlerton from the SE reference.
            hton->get_backup_engine provides the native backup engine.
            Go through the list of backup engines and select the
            appropriate one.
          } // End of find_backup_engine()
          Image_info::add_table();            // Backup_info uses Image_info.
        }
        obj= find_tablespace_for_table();
        If there is a tablespace for this table, add the tablespace.
      } // End for each table
      Add all stored procedures & functions for the database:
        get_db_stored_procedures() && add_objects(db, BSTREAM_IT_SPROC, ..);
        get_db_stored_functions() && add_objects(db, BSTREAM_IT_SFUNC, ..);
      Add all views defined in this database.
      Add all DB events in this database.
      Add all DB triggers in this database.
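The count-comparison idea behind obs::check_user_access() above can be sketched in isolation: run the same object-listing query once with privilege elevation and once without, and treat any difference as a missing privilege. DbObject, count_visible() and user_has_full_access() are stand-ins invented for this sketch; the real code issues information-schema queries.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// A database object as the privilege check sees it (illustrative only).
struct DbObject {
  std::string name;
  bool user_can_see;   // would this object appear in the user's I_S query?
};

// Stand-in for the listing query: elevated (definer) sees everything,
// the plain user only sees objects it has privileges on.
static size_t count_visible(const std::vector<DbObject>& objs, bool elevated) {
  if (elevated) return objs.size();
  return std::count_if(objs.begin(), objs.end(),
                       [](const DbObject& o) { return o.user_can_see; });
}

// True when the backup user can access every object in the database:
// the elevated and non-elevated counts agree.
bool user_has_full_access(const std::vector<DbObject>& objs) {
  return count_visible(objs, /*elevated=*/true) ==
         count_visible(objs, /*elevated=*/false);
}
```

Note that this check only detects *that* some object is inaccessible, not *which* one, which matches the coarse-grained error the walkthrough describes.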
      Add all privileges information (using information_schema.schema_privileges);
    } // End add_db_items()
} // End Backup_info::add_dbs

context.do_backup()
{
  write_preamble(info, output_stream)
  {
    bstream_wr_header();
    // flags | start_time | total_snapshots | server_version | snapshot_descriptors
    // A snapshot descriptor is (usually) one per storage engine and
    // includes the total number of tables, etc.
    bstream_wr_catalogue();   // list of db names, table names, etc.
    bstream_wr_meta_data()
    {
      For each global db item (including databases) do
      {
        A single entry has the following format:
          item_entry: type flags catalog_pos [extra_data (unused)] [object_metadata]
          object_metadata: the CREATE statement string for the table, etc.
          catalog_pos: the position (offset?) within the catalogue header.
        Write type, flags, catalog_pos;
          [ bstream_wr_item_def() => bstream_wr_meta_item() ]
        Run query "SHOW CREATE DATABASE $dbname" and write the result to the
        output stream:
          First item, the database (global item): create database test /* charset latin */;
          Second item, table t: "use test; create table t (i integer) engine=MyISAM ..."
        Item privileges are currently not written.
      } // For each db item
    } // End bstream_wr_meta_data()
  } // End write_preamble()

  write_table_data()
  {
    // --- INIT PHASE ---
    Create Backup_pump() for the "MyISAM" snapshot;   // 'At End' drivers first
      Note: one Backup_pump instance is created per driver, and there is
      one Block_writer per Backup_pump instance. Backup_pump is a kernel
      internal class defined in data_backup.cc; the driver is an external
      interface but Backup_pump is not.
      The MyISAM backup pump state is "INIT" now.
    Create Backup_pump() for "InnoDB"; push it onto the inactive list of pumps;
      snap.get_backup_driver() => returns the backup driver per the backup
      engine API (see myisam_backup_engine.cc).
      Note: each driver with unknown initial size gets to be active first.
      There is no specific check for "At End" or "At Begin" drivers here !!!
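The per-item layout written by bstream_wr_meta_data() above can be made visible with a toy serializer. The real backup stream is binary; this sketch uses a pipe-separated text rendering purely to show the field order (type | flags | catalog_pos | metadata), and ItemEntry / write_item() are invented names.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Illustrative model of one metadata item entry in the backup image.
struct ItemEntry {
  uint8_t type;           // item kind, e.g. database vs. table
  uint8_t flags;
  uint32_t catalog_pos;   // position of the item inside the catalogue
  std::string metadata;   // "CREATE DATABASE ..." / "CREATE TABLE ..."
};

// Render the entry in field order.  The real stream writes binary
// fields via bstream_wr_*(); text is used here only for readability.
std::string write_item(const ItemEntry& it) {
  std::ostringstream out;
  out << unsigned(it.type) << '|' << unsigned(it.flags) << '|'
      << it.catalog_pos << '|' << it.metadata;
  return out.str();
}
```

Keeping catalog_pos in the entry is what lets restore match each metadata blob back to the name recorded earlier by bstream_wr_catalogue().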
    Add this pump to the scheduler: sch.add(p)
    {
      // For example, consider MyISAM here:
      p->begin()   // begin pumping, from Backup::begin()
      {
        Allocate hash of tables, memory, etc.
        Unless the environment variable MYISAM_BACKUP_NO_INDEX is set to 1,
        mark a flag to back up indexes as well.
        Set state to dumping data and index files;
      }
    }
    Loop: Scheduler::step()
    {
      // Call the next backup engine's pump() method to get all data.
      pump()
      {
        m_bw->get_buf();
        m_drv->get_data();
      }
      Update statistics about how much pump() has written, etc.
      Compare pump->state before and after the call to pump();
      Update counters to handle the state changes: init_count,
      prepare_count, finish_count, etc.
    }
    The MyISAM backup pump state is now "INIT" -> "WAITING".
    # At this point all MyISAM data has already been written to the archive.

    // Start activating 'At Begin' drivers; call sch.add() for them.
    # Only now do the InnoDB and default drivers get activated.
    The InnoDB (snapshot engine) pump state is "INIT";
    InnoDB data gets written to the archive;
    The InnoDB pump state goes INIT -> WAITING;

    // --- PREPARE PHASE ---
    Prepare for the VP:
    // See WL#4610 for the refined commit blocker -- which is still to do.
    Block commits: block_commits(thd)
    {
      // Step 1: global read lock
      lock_global_read_lock(thd);   // lock mutex &LOCK_global_read_lock
      // Use the global read lock to block commits.
      make_global_read_lock_block_commit();
    }
    # Call prepare() on all drivers:
    sch.prepare() { pump->prepare() => m_drv->prelock(); }
    MyISAM goes to the "PREPARING" state;
    InnoDB goes to the "READY" state;   // nothing to prepare for InnoDB !!!
    MyISAM locks all its tables to prevent writes on them. This may be
    redundant because we already hold the global read lock ?? However, the
    "block commit" logic was originally designed to only block commits,
    not to prevent writes.
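The pump state transitions traced through the INIT and PREPARE phases above can be condensed into one sketch. The Pump class and its method names are invented for illustration; the states and transitions (INIT -> WAITING -> PREPARING -> READY -> FINISHING -> DONE, with snapshot engines skipping PREPARING) follow the walkthrough, not the real data_backup.cc code.

```cpp
#include <cassert>

// Toy model of a Backup_pump's lifecycle as the scheduler drives it.
class Pump {
 public:
  enum class State { INIT, WAITING, PREPARING, READY, FINISHING, DONE };

  // step(): the driver sends data.  An active driver moves INIT -> WAITING
  // once its initial image is written, and FINISHING -> DONE at the end.
  void step() {
    if (state_ == State::INIT) state_ = State::WAITING;
    else if (state_ == State::FINISHING) state_ = State::DONE;
  }
  // prepare(): drivers that must lock tables go to PREPARING;
  // snapshot engines (InnoDB) have nothing to do and jump to READY.
  void prepare(bool needs_table_locks) {
    state_ = needs_table_locks ? State::PREPARING : State::READY;
  }
  void finish_prepare() {            // table locks acquired
    if (state_ == State::PREPARING) state_ = State::READY;
  }
  void lock() {}                     // hold still across the validity point
  void unlock() { state_ = State::FINISHING; }  // VP done, drain the rest
  State state() const { return state_; }

 private:
  State state_ = State::INIT;
};
```

Running the MyISAM path through this model (step, prepare(true), finish_prepare, unlock, step) reproduces the INIT -> WAITING -> PREPARING -> READY -> FINISHING -> DONE trace the walkthrough records.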
    MyISAM goes PREPARING -> READY;
    sch.step() { m_drv->get_data() }

    // --- SYNC PHASE ---
    // VP creation starts.
    sch.lock();     // call lock() on all drivers
    save_vp_info()  { save timestamp and binlog position; }
    sch.unlock()    // call unlock() on all drivers
    {
      For MyISAM: m_drv->unlock() { ... Backup::kill_locking_thread(); ... }
    }
    InnoDB and MyISAM go to the "FINISHING" state one after another.
    unblock_commits();
    report_vp_info();

    // --- FINISHING PHASE ---
    while (sch.finish_count > 0) sch.step() { }
    MyISAM goes FINISHING -> DONE and shuts down.
    InnoDB gets its table data and writes it here (in the FINISHING state) !!!
    InnoDB goes FINISHING -> DONE and shuts down.
  } // End write_table_data()

  Save end time;
  write_summary() { write 0, vptime, endtime, binlog pos, binlog group }
} // End context.do_backup()
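The sync-phase ordering above is the heart of the validity point: commits are blocked, every driver is locked, the VP info (timestamp and binlog position) is saved, then everything is released in reverse. A minimal sketch that just logs the step names makes the invariant checkable; the function and step names are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of the validity-point (VP) ordering enforced by the kernel.
// Returns the sequence of steps so their order can be inspected:
// block commits, lock all drivers, save VP info, unlock, unblock.
std::vector<std::string> create_validity_point(int num_drivers) {
  std::vector<std::string> log;
  log.push_back("block_commits");                 // global read lock + block commit
  for (int i = 0; i < num_drivers; ++i)
    log.push_back("lock_driver");                 // sch.lock() on each driver
  log.push_back("save_vp_info");                  // timestamp + binlog position
  for (int i = 0; i < num_drivers; ++i)
    log.push_back("unlock_driver");               // sch.unlock() on each driver
  log.push_back("unblock_commits");
  return log;
}
```

The invariant worth checking is that save_vp_info happens strictly between the last lock and the first unlock: that window is what makes every driver's snapshot consistent with the recorded binlog position.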