Backup Pointers:
- Architecture of the online backup system (from Rafal): http://forge.mysql.com/wiki/OnlineBackup#Architecture_of_online_backup_system
- MySQL Internals: MySQL Backup (list of WLs etc.): http://forge.mysql.com/wiki/MySQL_Internals_MySQL_Backup
- MySQL Backup Reference Manual: http://dev.mysql.com/doc/mysql-backup/en/index.html
- Replication & Backup Team: https://inside.mysql.com/wiki/Replication

Misc Extra:
- Q3 Goals: https://intranet.mysql.com/~lthalmann/ltplan.pl?action=report_2009_q3
- FAQ for Replication & Backup Team: https://intranet.mysql.com/~lthalmann/ltplan.pl?file=faq.org
- Team Goals: https://intranet.mysql.com/~lthalmann/ltplan.pl?file=projects.org&section=Team-Goals-2009-Q3
- Sprint Proposal: https://intranet.mysql.com/%7Elthalmann/ltplan/sprint_proposal.php
- Pushbuild 2 pushes status: http://pb2.norway.sun.com/web.py?branch=mysql-6.0-backup&template=show_pushes
- Backup Quality Report: https://inside.mysql.com/wiki/BackupQualityReport

Notes from developer manual:

Logical backup:
    select * from yourtable into outfile 'mytable.txt';
To restore:
    load data infile 'mytable.txt' into table yourtable;
The mysqlimport program is just a convenience wrapper that reads my.cnf and runs the LOAD DATA INFILE statement through the mysql client.

Physical backup: how you do a physical backup depends highly on the storage engine:
- For the Memory storage engine, it is better to go with a logical backup.
- For the MyISAM storage engine, copy the files (offline), or use mysqlhotcopy
  (a Perl script to back up a db or table; done online after locking the server;
  supports the MyISAM and Archive engines only). To restore, copy the files back
  to the data directory.
- For InnoDB, use the ibbackup tool (not bundled by default, commercial); use
  ibbackup for restore as well.
- For NDB, use the START BACKUP command (in the ndb_mgm client); ndb_restore for restore. (???)

* Full versus incremental backups
  The server's binary log can be used for incremental backups. Even though MyISAM
  does not support ACID transactions, the binary log can still be used for
  incremental backup and recovery. (?)
* Point-in-time recovery
* Backup scheduling, compression, and encryption
* Table maintenance

How to do an incremental backup?
- Enable binary logging by starting the server with the --log-bin option.
- To recover, apply the relevant binlogs on top of the last full backup, e.g. by
  piping mysqlbinlog output to the mysql client.

What are the different types of logs in MySQL?
  Binary log, general query log, slow query log, error log.
Note: To display a binlog, use mysqlbinlog.
    show variables like '%log%';

    mysqldump --tab=/path/to/some/dir --opt db_name
    mysqldump db_name > backup-file.sql
You can read the dump file back into the server like this:
    shell> mysql db_name < backup-file.sql
Or like this:
    shell> mysql -e "source /path-to-backup/backup-file.sql" db_name
    mysqlhotcopy db_name /path/to/some/dir   (supports MyISAM and Archive engines only)

How to dump the schema only (not data)?
    mysqldump --no-data
// mysqldump writes to stdout in text format.

User Documentation

Reference Manual (notes): http://dev.mysql.com/doc/mysql-backup/en/index.html

-- The BACKUP DATABASE and RESTORE SQL statements.
-- The mysqlbackup program displays info about a backup image file:

shell> mysqlbackup /usr/local/mysql/data/test.bak
Image path: '/usr/local/mysql/data/test.bak'
Image size: 12305 bytes
Image compression: none
Image version: 1
Creation time: 2009-09-08 18:08:43 UTC
Server version: 5.4.4 (5.4.4-beta)
Server byte order: little-endian
-- BACKUP {DATABASE | SCHEMA} { * | db_name [, db_name] ... } TO 'image_file_name'
       [WITH COMPRESSION [COMPRESSION_ALGORITHM [=] algorithm_name]]

BACKUP DATABASE * TO '/tmp/all.backup';
BACKUP DATABASE world TO '/tmp/world.backup';
BACKUP DATABASE db1, db2 TO '/tmp/db1-db2.backup';

mysql> BACKUP DATABASE test TO '/tmp/world.backup';
+-----------+
| backup_id |
+-----------+
|         8 |
+-----------+

Note: backup_id is a server-assigned id; use it to look up the backup_history
table, which contains an entry whose backup_id column has the same value (see
the lookup sketch below).

mysql> RESTORE FROM '/tmp/mybackupfile';

-- BACKUP DATABASE currently does not back up the mysql or INFORMATION_SCHEMA databases.
-- BACKUP DATABASE saves privileges for the backed-up databases in the backup
   image file. The privileges are stored in the form of GRANT statements.
-- Only privileges at the database level or below (table, column, routine) are
   saved. Global privileges are not saved because they are not specific to the
   databases included in the backup.
-- GRANT SELECT, INSERT ON db1.* TO 'someuser'@'localhost'
   The privileges specified by this statement are restored only if the
   'someuser'@'localhost' account exists.
-- Since the "mysql" database is not backed up by the BACKUP command, you can
   back it up using mysqldump:
   shell> mysqldump --databases mysql > path/mysql-db.sql
   mysql> BACKUP DATABASE * TO 'path/other-dbs.bak';
   shell> mysql -u root -p < path/mysql-db.sql
   mysql> RESTORE FROM 'path/other-dbs.bak';
-- The backup system chooses from among the backup engines available to it:
   * There is a default blocking backup engine.
   * A consistent-snapshot engine implements the same kind of backup as that
     made by mysqldump --single-transaction.
   A backup engine is the component that provides backup/restore drivers for a
   storage engine.
-- A backup operation begins at time t1 and ends at time t2, producing a backup
   image that contains the backup (database) state at a time t, where
   t1 < t < t2. The time t is called the validity point of the backup image. It
   is the time at which all storage engines are synchronized for the backup.
   Restoring this image restores the state to what it was at time t.
   Consistency of the backup means that these constraints must hold:
   * Data from transactional tables is included only for committed transactions.
   * Data from nontransactional tables is included only for completed statements.
   * Referential integrity is maintained between all backed-up tables within a
     given backup image.
-- While a backup or restore operation is in progress, modifying the structure
   of database objects is not allowed. The following statements are blocked:
     DROP DATABASE/TABLE/VIEW/FUNCTION/PROCEDURE/EVENT/TRIGGER/INDEX/USER/TABLESPACE
     CREATE DATABASE/TABLE/VIEW/FUNCTION/PROCEDURE/EVENT/TRIGGER/INDEX
     ALTER DATABASE/TABLE/VIEW/FUNCTION/PROCEDURE/EVENT/TABLESPACE
     RENAME TABLE/USER
     GRANT/REVOKE
     TRUNCATE/OPTIMIZE/REPAIR TABLE
   Currently, all statements that change metadata are blocked, even for database
   or table objects that are not included in the backup, but this will be improved.
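The backup_id returned by BACKUP DATABASE can be checked against the history log
described in the next section. A minimal lookup sketch (column names are taken
from the backup_history definition below; the id is the one returned above; a
successful run ends in backup_state 'complete' per the ENUM definition):

    mysql> SELECT backup_state, validity_point_time, total_bytes, backup_file
             FROM mysql.backup_history
            WHERE backup_id = 8;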
-- The server maintains backup_history and backup_progress tables in the mysql
   database; they contain metadata indicating backup status and progress.
-- You can set the log_backup_output system variable to 'TABLE,FILE' to enable
   logging to both (or start the server with --log_backup_output=TABLE,FILE):

mysql> show variables like '%backup%';
+--------------------------+---------------------------------------------------------+
| Variable_name            | Value                                                   |
+--------------------------+---------------------------------------------------------+
| backup_history_log       | ON                                                      |
| backup_history_log_file  | /home/thava/repo/backup/install/var/backup_history.log  |
| backup_progress_log      | ON                                                      |
| backup_progress_log_file | /home/thava/repo/backup/install/var/backup_progress.log |
| backup_wait_timeout      | 50                                                      |
| backupdir                | /home/thava/repo/backup/install/var                     |
| log_backup_output        | TABLE                                                   |
| secure_backup_file_priv  |                                                         |
+--------------------------+---------------------------------------------------------+
8 rows in set (0.01 sec)

mysql> set log_backup_output='TABLE,FILE';

The backup_history table has this structure:

CREATE TABLE backup_history (
  backup_id BIGINT UNSIGNED NOT NULL,
  process_id INT UNSIGNED NOT NULL,
  binlog_start_pos INT UNSIGNED NOT NULL,
  binlog_file CHAR(64) NOT NULL,
  backup_state ENUM('complete', 'starting', 'validity point', 'running',
                    'error', 'cancel') NOT NULL,
  operation ENUM('backup', 'restore') NOT NULL,
  error_num INT NOT NULL,
  num_objects INT UNSIGNED NOT NULL,
  total_bytes BIGINT UNSIGNED NOT NULL,
  validity_point_time DATETIME NOT NULL,
  start_time DATETIME NOT NULL,
  stop_time DATETIME NOT NULL,
  host_or_server_name CHAR(30) NOT NULL,
  username CHAR(30) NOT NULL,
  backup_file CHAR(100) NOT NULL,
  backup_file_path VARCHAR(512) NOT NULL,
  user_comment VARCHAR(200) NOT NULL,
  command VARCHAR(512) NOT NULL,
  engines VARCHAR(100) NOT NULL
) ENGINE=CSV CHARSET=utf8;

-- FLUSH Syntax
   FLUSH [NO_WRITE_TO_BINLOG | LOCAL] flush_option [, flush_option] ...
   The FLUSH statement has the following MySQL Backup-related flush_option value:
     BACKUP LOGS
   Example: flush backup logs;
   (Used just to sync outstanding writes to the backup log files.) Closes and
   reopens the backup log files. This option was added in MySQL 5.4.4.

-- PURGE BACKUP LOGS [ TO id | BEFORE datetime_expr ]
   PURGE BACKUP LOGS TO 143;
     // remove entries from the backup logs whose backup_id is less than 143
   PURGE BACKUP LOGS BEFORE '2009-08-31 14:32:19';
     // remove entries before this time

-- RESTORE Syntax
   RESTORE FROM 'image_file_name' [option] ...
   option = {OVERWRITE | SKIP_GAP_EVENT}

-- Server options:
   --log_backup_output=TABLE,FILE (or NONE)
   --mysql-backup[={0|1}]          /* set to 1 to enable the backup/restore commands */
   --secure-backup-file-priv=path  /* backup image file must reside in this dir
                                      for backup/restore to work (example below) */
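How the last two server options combine in practice; a sketch assuming the
semantics described above (the directory and database are illustrative, and the
exact error behavior for a rejected path is not shown here):

    shell> mysqld --mysql-backup=1 --secure-backup-file-priv=/var/backups &
    mysql> BACKUP DATABASE test TO '/var/backups/test.bak';
             -- accepted: image is inside the allowed directory
    mysql> BACKUP DATABASE test TO '/tmp/test.bak';
             -- rejected: outside secure-backup-file-priv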
-- Session system variables:

-----
Notes from sql/backup/kernel.cc:

Structure of the MySQL Backup System

              |        Online Backup Module
          (1) |   +---------------+       +------------------------+
  Server ---->|   | Backup Kernel | <-->  | Backup/Restore drivers |
              |   +---------------+  (3)  +------------------------+
  Core   <----|   | Stream layer  |
          (2) |   +---------------+
              |        (4) ^  (5) |
              |            |      |
              +------------|------|----------------------------
                           |      v
                    Backup Stream Library

The backup stream library is autonomous, so that external applications can link
against it to be able to read backup images created by the system. The format
of backup images is described @ref stream_format "here".

The components of the system communicate with each other using well-defined interfaces:
-# @ref KernelAPI "Backup Kernel API"
-# Object Services API and Backup Log API
-# Backup Driver API
-# @ref streamlib "Backup Stream Library API"
-# Backup Stream Library Callback API

Backup kernel API. This is how a backup is performed using the context object:

  Backup_restore_ctx context(thd);   // create context instance
  Backup_info *info= context.prepare_for_backup(location, orig_loc); // prepare for backup
  // select objects to back up
  info->add_all_dbs(); or info->add_dbs();
  info->close();          // indicate that selection is done
  context.do_backup();    // perform backup
  context.close();        // explicit clean-up
  } // if code jumps here, context destructor will do the clean-up automatically

Similar code is used for restore (a bit simpler, as selective restores are not
supported yet):

  Backup_restore_ctx context(thd);   // create context instance
  Restore_info *info= context.prepare_for_restore(location, orig_loc); // prepare for restore
  context.do_restore();   // perform restore
  context.close();        // explicit clean-up
  } // if code jumps here, context destructor will do the clean-up automatically

------
Note:
-- backup uses its own backup::Mem_allocator class.
-- Heavy use of String, LEX_STRING.
-- obs is the backup services namespace, mostly defined in si_objects.cc.

Notes on kernel.cc functions:

Backup_restore_ctx::Backup_restore_ctx(thd)   // constructor
{
  Save thd->options to m_thd_options;
  MYSQL_BACKUP_LOG *log_handler = logger.get_backup_history_log_file_handler();
}

Notes on the Backup_restore_ctx class:

// Prepare for backup and restore operations.
Backup_restore_ctx::prepare(backupdir, location string)
{
  if a backup is already running, give an error;
  mem_alloc = new Mem_allocator();
  obs::bml_get(m_thd);   // Get backup meta lock, which blocks all DDL operations.
}

// Prepare actions specific to backup only.
Backup_info* Backup_restore_ctx::prepare_for_backup(...)
{
  Logger::init();
  Report starting time;
  prepare();   // Do the common prepare defined above.
  Output_stream *s= new Output_stream(...);   // Create output stream
  // Create all info about what you are backing up. Also called the "backup catalogue".
  Backup_info *info= new Backup_info(*this, m_thd);
  // Remember whether the binlog is turned on. We will store the validity
  // point's binlog pos in the image.
  info->flags |= (is the binlog turned on?);
  info->save_start_time(now);
}

Restore_info* Backup_restore_ctx::prepare_for_restore(...)
{
  // Block replication from starting.
  obs::block_replication(TRUE, "RESTORE");
  // Refuse to run restore on a slave:
  if (obs::is_slave()) give error;
  Report start time;
  prepare();   // Do the common prepare actions defined earlier.
  // Open input stream
  Input_stream *s= new Input_stream(*this, &m_path);
  // Create all info about what you are restoring. Also called the "restore catalogue".
  Restore_info *info= new Restore_info(*this, m_thd);   // reports errors
  m_catalog= info;
  info->save_start_time(when);
  // Read header from input stream
  read_header(*info, *s);
  // Move to next chunk of input stream
  ret= s->next_chunk();
  // Read remaining catalogue info (after the header) from the input stream.
  ret= read_catalog(*info, *s);
  // Move to next chunk
  ret= s->next_chunk();
  // Disable new slave connections for replication
  obs::disable_slave_connections();
  // Write a "RESTORE_EVENT" to the binlog (also called an "incident" or GAP event),
  // so that existing slaves stop replication after hitting it.
  obs::write_incident_event();
  // Turn off the binlog during restore:
  obs::engage_binlog(FALSE);
}

------
Backup command pseudo code:

sql_parse:
  Lex->db_list    => contains the list of databases.
  Lex->value_list => contains list of items ???

mysql_execute_command("backup database test to '/home/thava/tmp/test.bak2'");

execute_backup_command()
{
  trans_commit_implicit();   // Implicit tx.
  close_thread_tables();     // Release metadata locks.
  If not in locked tables mode, thd->mdl_context.release_all_locks();   // ???
  bml_enter(thd);            // Check (and acquire?) backup meta lock.
  thd->backup_in_progress = sql_command;   // backup or restore
  Backup_restore_ctx context(thd);   // Initialize backup history log file handler.

  From backup/kernel.cc, to report an error do the following:
    DBUG_RETURN(send_error(context, ER_BACKUP_CONTEXT_CREATE));   // normal error
    fatal_error(report_killed() ? ER_QUERY_INTERRUPTED : ER_BACKUP_LOGGER_INIT);   // fatal error
    m_log.report_error(ER_BACKUP_CANNOT_INCLUDE_DB, db_name.c_ptr());

  Backup_info *info= context.prepare_for_backup(...)
  {
    If another backup/restore operation is running, report an error;
    Check if the backup location is valid.
    mem_alloc = new Mem_allocator();   // Separate mem allocator for backup
    obs::bml_get(m_thd);   // Get backup meta lock (blocks DDL); si_objects.cc
    Output_stream *s = new Output_stream(...);
    Backup_info *info= new Backup_info(...)
    {
      Create a linked list of backup engines: default, ConsistentSnapshot, nodata;
        // used to select a backup engine later
      Mark and remember that the server binlog is enabled -- we will store the
        VP's binlog pos in the image.
      Mark the start time.
    }
  } // End prepare_for_backup

  info->add_dbs()
  {
    Get the list of databases (specified) and add it to the backup info.
    For each db:
      // Execute a query using run_service_interface_sql() to get info about
      // the DB from information schema:
      obs::Obj *obj= get_database_stub(thd, &db_name);
      check_db_existence()
      {
        run query: "show create database test"; & confirm
      }
      Db *db= add_db(obj)
      {
        check_access(m_thd, BACKUP_ACL, ...);   // Check BACKUP privileges for the user.
        obs::check_user_access(m_thd, name)
        {
          // Check if the user has access to all DB objects, such as tables,
          // triggers, events and routines:
          Get the number of (db) objects by running a select query without
          privilege elevation and with privilege elevation. If the results
          differ, the user lacks privileges.
        }
        Db *db= Image_info::add_db(obs::Obj*, pos);
      }
      add_db_items(*db)
      {
        Obj_iterator *it= get_db_tables(m_thd, &db.name())
        {
          // si_objects.cc: run a select query and get info:
          create_row_set_iterator(thd, ..query...);
        }
        For each table t
        {
          add_table(db, obj)
          {
            backup::Snapshot_info *snap= find_backup_engine(t)
            {
              Get the storage engine reference for this table.
              // See backup_info.cc: get_storage_engine().
              // It opens the table as a temp table to get SE info !!!
              If it is a partitioned table, consider the SE of the underlying
                table; native drivers don't support partitioning.
              Get the handlerton from the SE reference.
              hton->get_backup_engine provides the native backup engine.
              Go through the list of backup engines and select the appropriate one.
            }
            Image_info::add_table();   // Backup_info uses Image_info.
          }
          obj= find_tablespace_for_table();
          If there is a tablespace for this table, add the tablespace.
        } // End for each table
        Add all stored procedures & functions for the database:
          get_db_stored_procedures() && add_objects(db, BSTREAM_IT_SPROC, ...);
          get_db_stored_functions() && add_objects(db, BSTREAM_IT_SFUNC, ...);
        Add all views defined in this database.
        Add all DB events in this database.
        Add all DB triggers in this database.
        Add all privileges information (using information_schema.schema_privileges).
      } // End add_db_items()
  } // End Backup_info::add_dbs

  if (info->db_count == 0)
    context.report_error(ER_BACKUP_NOTHING_TO_BACKUP);

  context.do_backup()
  {
    write_preamble(info, output_stream)
    {
      bstream_wr_header();
        // flags | start_time | total_snapshots | server_version | snapshot_descriptors
        // One snapshot descriptor per storage engine; includes total no. of tables, etc.
      bstream_wr_catalogue();   // List of db names, table names, etc.
      bstream_wr_metadata()
      {
        For each global db item (including databases) do
        {
          A single entry has the following format:
            item_entry: type flags catalog_pos [extra_data(unused)] [object_metadata]
            object_metadata: create_stmt_string_for_table, etc.
            catalog_pos: the position index (offset?) within the catalogue header.
          Write: type, flags, catalog_pos;
            [ bstream_wr_item_def() => bstream_wr_meta_item() ]
          Run query: show create database $dbname and write it to the output stream;
          First item, database (global item): create database test /* charset latin */;
          Second item, table t: "use test; create table t (i integer) engine=MyISAM ..."
          Item privileges are currently not written.
        }
      } // End bstream_wr_metadata()
    } // End write_preamble

    write_table_data()
    {
      Create Backup_pump() for the "MyISAM" snapshot.
      snap.get_backup_driver() => returns the backup driver as per the backup
        engine API. (See myisam_backup_engine.cc.)
      Add this pump to the scheduler: sch.add(p)
      {
        p->begin()
        {
          // begin pumping from Backup::begin()
          Allocate hash of tables, memory, etc.
          Unless the env variable MYISAM_BACKUP_NO_INDEX is set to 1, mark the
            flag to back up the index as well.
          Set state to dumping data/index files;
        }
      }
      Loop: Scheduler::step()
      {
        // Call the next backup engine's pump() method to get all data.
        pump() { m_bw->get_buf(); m_drv->get_data(); }
      }
      // Start activating 'at begin' drivers; call sch.add() for them.
      Prepare for VP;   // See WL#4610 for the refined commit blocker -- which is TODO.
      Block commits: block_commits(thd)
      {
        // Step 1: Global read lock
        lock_global_read_lock(thd);   // lock mutex &LOCK_global_read_lock
        // Use the global read lock to block commits:
        make_global_read_lock_block_commit();
      }
      sch.prepare() { pump->prepare() => m_drv->prelock(); }
      sch.step()    { m_drv->get_data(); }
      // VP creation start
      sch.lock();
      save_vp_info() { save timestamp; binlog position; }
      sch.unlock()  { m_drv->unlock() { ... Backup::kill_locking_thread(); ... } }
      unblock_commits();
      report_vp_info();
      // Finish phase
      while sch.finish_count > 0:
        sch.step() { }
    } // End write_table_data()

    Save end time;
    write_summary() { write 0, vptime, endtime, binlog pos, binlog group }
  } // End context.do_backup()

  context.close();
} // End execute_backup_command

Note: Use the debug flag "backup" to see all these steps (see the sketch below).

Backup_image := Preamble | [table_data]* | Summary
Preamble     := Header | Catalogue | Metadata
Header       := flags | start_time | total_snapshots | server_version | snapshot_descriptors
Catalogue    := Charset_info | User_info | Tablespace_info | list_of_dbnames | list_table_info | list_dbitem_info
Summary contains the validity point information plus some log info for point-in-time recovery.
Chunk contains snapshots ???
Total number of snapshots <= 256 !
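To actually watch those steps, a debug build of the server can be started with
the "backup" DBUG keyword mentioned above. A sketch using standard DBUG option
syntax (the trace file path is arbitrary):

    shell> mysqld --debug=d,backup:t:o,/tmp/mysqld.trace

Here d selects which DBUG_PRINT keywords are printed, t adds function call/exit
tracing, and o redirects the trace output to the given file.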
Sample stack from a BACKUP DATABASE statement (note run_service_interface_sql()
executing SHOW CREATE DATABASE through Ed_connection):

(gdb) where
#0  mysql_execute_command (thd=0x95acf18) at sql_parse.cc:1797
#1  0x083b9d17 in Execute_sql_statement::execute_server_code (this=0xb7445128, thd=0x95acf18) at sql_prepare.cc:2933
#2  0x083b94c6 in Prepared_statement::execute_server_runnable (this=0xb7444e48, server_runnable=0xb7445128) at sql_prepare.cc:3437
#3  0x083ba27f in Ed_connection::execute_direct (this=0xb74451dc, server_runnable=0xb7445128) at sql_prepare.cc:3967
#4  0x083ba3ad in Ed_connection::execute_direct (this=0xb74451dc, sql_text={str = 0x9625820 "SHOW CREATE DATABASE `test`", length = 27}) at sql_prepare.cc:3933
#5  0x0852113a in run_service_interface_sql (thd=0x95acf18, ed_connection=0xb74451dc, query=0xb74451cc) at si_objects.cc:211
#6  0x08521d6c in obs::check_db_existence (thd=0x95acf18, db_name=0x96257b4) at si_objects.cc:2928
#7  0x08872923 in Backup_info::add_dbs (this=0x9622f80, thd=0x95acf18, dbs=@0x95ae69c) at backup_info.cc:686
#8  0x0885b6db in execute_backup_command (thd=0x95acf18, lex=0x95ade60, backupdir=0xb7445af8, overwrite=false, skip_gap_event=false) at kernel.cc:234
#9  0x08308288 in mysql_execute_command (thd=0x95acf18) at sql_parse.cc:2432
#10 0x08310c7e in mysql_parse (thd=0x95acf18, inBuf=0x95daa10 "backup database test to '/home/thava/tmp/test.bak2'", length=51, found_semicolon=0xb74469c8) at sql_parse.cc:5950
#11 0x08311830 in dispatch_command (command=COM_QUERY, thd=0x95acf18, packet=0x95ceaf1 "backup database test to '/home/thava/tmp/test.bak2'", packet_length=51) at sql_parse.cc:1062
#12 0x08312d9c in do_command (thd=0x95acf18) at sql_parse.cc:744
#13 0x082ff357 in handle_one_connection (arg=0x95acf18) at sql_connect.cc:1163
#14 0xb7f5450f in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#15 0xb7d5c7ee in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb)

-----
Backup FAQ:

1) What operations are blocked during the backup process?
   All DDLs are blocked (via the bml_enter()/bml_get() calls).
   All DMLs on the backed-up database's tables are blocked by the driver
   acquiring read locks on the tables from the prelock() stage onward.
   All writes are blocked during the create-VP lock() stage.

Questions:
- Backup synchronization ???
- Where is myisamlog used? There is no myisam.log file in our system.
- Does 'flush logs' result in a checkpoint? It does increment (and rotate) the
  binary log.
- What is the significance of XID_EVENT in the binary log?
  It is used before the commit of a transaction involving tx-capable SE tables.
- What is the significance of TABLE_MAP_EVENT in the binary log?
  Used when different definitions are used on master and slave.
- During replication, are only committed txns sent to slaves? (delayed relay
  log?) Otherwise the "rollback" statement would be very tricky.
- Does MyISAM suspend checkpoint operations during the entire backup process?

All about the (server) binary log:
- Statements are stored in the form of "events" that describe the modifications.
- The binary log also contains information about how long each statement that
  updated data took, plus:
  * Information about the state of the server that is needed to reproduce
    statements correctly
  * Error codes
  * Metadata needed for the maintenance of the binary log itself (for example,
    rotate events)
- Purpose: replication or recovery.
- The log consists of a set of binary log files, plus an index file.
+ The header bytes provide information about the type of event, when it was
  generated, by which server, and so forth.
+ The data bytes provide information specific to the type of event.
  o The first event is a descriptor event that describes the format version of
    the file (the format used to write events in the file).
  o The remaining events are interpreted according to that version.
  o The final event is a log-rotation event that specifies the next binary log
    filename.
* The index file is a text file that lists the current binary log files.
- Currently, there are three binary log format versions.
    HOSTNAME-bin.0000101, HOSTNAME-bin.0000102, HOSTNAME-bin.0000103, etc.
    HOSTNAME-bin.index
- Log event structure (for the latest version, v4):

+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    |
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    |
|        +----------------------------+
|        | next_position    13 : 4    |
|        +----------------------------+
|        | flags            17 : 2    |
|        +----------------------------+
|        | extra_headers    19 : x-19 |
+=====================================+
| event  | fixed part        x : y    |
| data   +----------------------------+
|        | variable part              |
+=====================================+
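The fixed 19-byte header above maps directly onto a small decoder. A minimal
standalone sketch (a hypothetical helper, not server code; binlog integers are
stored little-endian):

    #include <cstddef>
    #include <cstdint>

    // Decoded v4 event header; offsets and sizes follow the layout above.
    struct Event_header
    {
      uint32_t timestamp;       // offset  0, 4 bytes
      uint8_t  type_code;       // offset  4, 1 byte
      uint32_t server_id;       // offset  5, 4 bytes
      uint32_t event_length;    // offset  9, 4 bytes (whole event, header included)
      uint32_t next_position;   // offset 13, 4 bytes (offset of the next event)
      uint16_t flags;           // offset 17, 2 bytes
    };

    static uint32_t le32(const unsigned char *p)
    {
      return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
             ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
    }

    static uint16_t le16(const unsigned char *p)
    {
      return (uint16_t) (p[0] | (p[1] << 8));
    }

    // Decode the 19 mandatory header bytes of one event. Note: a binlog file
    // itself starts with a 4-byte magic sequence (0xfe 'b' 'i' 'n') before
    // the first event.
    bool decode_event_header(const unsigned char *buf, size_t len, Event_header *h)
    {
      if (len < 19)
        return false;                    // truncated header
      h->timestamp=     le32(buf + 0);
      h->type_code=     buf[4];
      h->server_id=     le32(buf + 5);
      h->event_length=  le32(buf + 9);
      h->next_position= le32(buf + 13);
      h->flags=         le16(buf + 17);
      return true;
    }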
How recovery is done using checkpoints and transaction logs:

Transaction recovery procedures generally make use of deferred-write and
write-through techniques.

When the recovery procedure uses deferred write (deferred update), transaction
operations do not immediately update the physical database; only the
transaction log is updated. The database is physically updated only after the
transaction reaches its commit point, using the transaction log information.
The recovery process for all transactions started and committed before the
failure follows these steps:
1. Identify the last checkpoint in the transaction log.
2. For a transaction that started and committed before the last checkpoint,
   nothing needs to be done, because the data is already saved.
3. For a transaction that performed a commit operation after the last
   checkpoint, the DBMS uses the transaction log records to redo the
   transaction and update the database, using the "after" values in the
   transaction log.
4. For any transaction with a ROLLBACK operation after the last checkpoint, or
   that was left active (with neither a COMMIT nor a ROLLBACK) before the
   failure occurred, nothing needs to be done, because the database was never
   updated; the transaction will be rerun.

When the recovery procedure uses write-through (immediate update), the database
is immediately updated by transaction operations during the transaction's
execution, even before the transaction reaches its commit point. The recovery
process follows these steps:
Cases 1-3 are as above in the deferred-write case.
4. For any transaction with a ROLLBACK operation after the last checkpoint, or
   that was left active (with neither a COMMIT nor a ROLLBACK) before the
   failure occurred, the DBMS uses the transaction log records to roll back
   (undo) the operations, using the "before" values in the transaction log.
   (No need for a separate undo log; the same log serves as the "undo" log!!!)

Note: While a checkpoint is being taken, no updates to the database are allowed.
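The two variants differ only in step 4. A compact illustrative sketch of the
write-through variant (the record layout and helper signatures are invented for
illustration; real DBMS logs are far more involved):

    #include <functional>
    #include <string>
    #include <vector>

    // Illustrative log record: one update to `item` by transaction `tx`,
    // with its before/after images.
    struct Log_record
    {
      long tx;
      std::string item;
      std::string before_val, after_val;
    };

    enum class Tx_outcome { COMMITTED, ROLLED_BACK_OR_ACTIVE };

    // Write-through recovery over the records written after the last
    // checkpoint: redo committed transactions forward using "after" values
    // (step 3), undo the rest backward using "before" values (step 4).
    void recover(const std::vector<Log_record> &log,
                 const std::function<Tx_outcome(long)> &outcome,
                 const std::function<void(const std::string&,
                                          const std::string&)> &write)
    {
      for (const Log_record &r : log)                      // redo pass, forward
        if (outcome(r.tx) == Tx_outcome::COMMITTED)
          write(r.item, r.after_val);

      for (auto it= log.rbegin(); it != log.rend(); ++it)  // undo pass, backward
        if (outcome(it->tx) == Tx_outcome::ROLLED_BACK_OR_ACTIVE)
          write(it->item, it->before_val);
    }

In the deferred-write variant, the undo pass disappears entirely, because
uncommitted transactions never touched the database.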
==================

Notes on the MySQL backup Driver class (backup_engine.h):

class Driver
{
  /// Construct from a list of tables. The list is stored for future use.
  Driver(const Table_list &tables) :m_tables(tables) {};

  // buf_size is the minimum buffer size the backup kernel will provide to
  // the driver. After return from the begin() call, the driver should be
  // ready to serve requests for sending/receiving data.
  virtual result_t begin(const size_t buf_size) =0;

  // Finalize the backup/restore process. What is next?
  virtual result_t end() =0;

  /// Cancel the ongoing backup/restore process.
  virtual result_t cancel() =0;

protected:
  const Table_list &m_tables;
};

Driver phases (mainly from the perspective of the driver):

Phase 1: Idle
  After creation of the driver instance and before the begin() call.
Phase 2: Initial transfer
  When initial data is sent, before the driver can create a VP. "At end"
  drivers send the majority of their data in this phase.
Phase 3: Waiting for prelock
  The driver waits for other drivers to finish their initial transfer. This
  phase is ended by a call to the prelock() method.
Phase 4: Preparing for lock
  The driver does the preparations necessary (if any) to be able to instantly
  create a VP upon request from the kernel.
Phase 5: Waiting for lock
  The driver waits for other drivers to finish their preparations. The phase
  is finished by a call to the @c lock() method.
Phase 6: Synchronization
  The validity point is created inside the lock() method. For synchronization
  reasons, data in all tables being backed up should be frozen during this
  phase. The phase is ended by a call to the @c unlock() method.
Phase 7: Final transfer
  The final backup image data (if any) is sent to the kernel. "At begin"
  drivers send all their data in this phase. This phase is ended by a call to
  the end() method.

Note: In each phase except the synchronization phase (6), the kernel polls at
least one driver using the get_data() method. Thus a driver has a chance to
send data in each phase of the backup process. For example, when waiting in
phase 3 or 5, a driver can send a log recording the changes which happen during
that time. A driver informs the kernel about finishing the initial transfer
phase (2) or the lock preparation phase (4) via the value returned from the
get_data() method.

class Backup_driver : public Driver
{
public:
  /* Return values for get_data():
     OK         - New data in buffer.
     READY      - OK + end of initial transfer OR prepare-lock.
     DONE       - OK + end of final transfer.
     PROCESSING - Request accepted but not complete. Buffer in use.
     BUSY       - Request not accepted. Retry later. Buffer not used.
     ERROR      - Error condition.
  */
  virtual result_t get_data(Buffer &buf) =0;

  /* Prepare for synchronization. Will wait for the lock() call.
     Return values for prelock():
     READY - The driver is ready for synchronization, i.e. it can accept the
             following lock() call.
     OK    - The driver is preparing for synchronization. The kernel should
             call get_data() and wait until the driver is ready.
     ERROR
  */
  virtual result_t prelock() { return READY; };

  // Create the VP inside the lock() method.
  // Returns an error code, or OK upon success.
  // Note: the global block-commit is active now.
  // All engines should create the VP on this latest state.
  virtual result_t lock() =0;

  // Flag the end of VP creation using the unlock() call:
  virtual result_t unlock() =0;   /* Returns OK or ERROR */

  virtual size_t size() =0;        /* Return estimated size. */
  virtual size_t init_size() =0;   /* Return the initial size() estimate. */
};

Note: The buffer is allocated by the kernel and given to the driver.

struct Buffer
{
  size_t size;     /* in/out param. Total size; driver sets it to the used size. */
  uint table_num;  /* table number to which the data belongs */
  bool last;       /* TRUE if this is the last block of data in the stream */
  byte *data;      /* pointer to the data area ... */
};
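Putting the protocol together: a minimal sketch of an "at end" style backup
driver following the phases and return codes above. The fill_*/freeze_* helpers
are stand-ins, not real MySQL APIs, and the real MyISAM driver in
myisam_backup_engine.cc is far more involved:

    // Sketch only: assumes the Driver/Backup_driver declarations and the
    // result_t codes (OK, READY, DONE, ERROR) described above.
    class Toy_backup_driver : public Backup_driver
    {
    public:
      Toy_backup_driver(const Table_list &tables)
        : Backup_driver(tables), m_initial_done(FALSE), m_final_done(FALSE) {}

      virtual result_t begin(const size_t)   // phase 1 -> 2: ready to send data
      { return OK; }

      virtual result_t get_data(Buffer &buf)
      {
        buf.table_num= 0;                    // stream #0: shared data (see below)
        buf.last= FALSE;
        if (!m_initial_done)                 // phase 2: send the bulk of the data
        {
          buf.size= fill_table_data(buf);    // stand-in: copy rows into buf.data
          m_initial_done= TRUE;
          return READY;                      // tells kernel: initial transfer done
        }
        if (!m_final_done)                   // phase 7: send changes since VP prep
        {
          buf.size= fill_change_log(buf);    // stand-in
          m_final_done= TRUE;
          return OK;
        }
        buf.size= 0;
        buf.last= TRUE;
        return DONE;                         // end of final transfer
      }

      virtual result_t prelock() { return READY; }   // phase 4: nothing to prepare
      virtual result_t lock()                        // phase 6: create the VP
      { return freeze_tables() ? OK : ERROR; }
      virtual result_t unlock()                      // end of VP creation
      { unfreeze_tables(); return OK; }
      virtual result_t end()    { return OK; }
      virtual result_t cancel() { return OK; }
      virtual size_t size()      { return 0; }   // size estimates omitted here
      virtual size_t init_size() { return 0; }

    private:
      bool m_initial_done, m_final_done;
      size_t fill_table_data(Buffer&);       // hypothetical helpers
      size_t fill_change_log(Buffer&);
      bool   freeze_tables();
      void   unfreeze_tables();
    };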
For example, consider backing up three tables t1, t2 and t3. Data blocks
produced by a backup driver are divided into four streams:
  #0: shared data
  #1: data for table t1
  #2: data for table t2
  #3: data for table t3

Backup image format, Obj instances, etc. (image_info.h):

struct st_bstream_image_header
{
  unsigned int version;   /* Image's format version number. */
  server_version;
  /* flags indicate (a) whether the image was created big- or little-endian,
     (b) whether the binlog position is stored in the image or not. */
  unsigned int flags;     /* Image options. */
  time_t start_time;
  time_t end_time;
  time_t vp_time;
  /** Position of the last binlog entry at the VP time. */
  struct st_bstream_binlog_info binlog_info;
  /** Number of table data snapshots in the image. One snapshot per driver. */
  unsigned short int snap_count;
  struct st_bstream_snapshot_info snapshot[256];
};

/* What kinds of items are saved in the image? */
enum_bstream_item_type { charset, user, privilege, db, table, view,
                         stored proc, func, event, triggers, tablespace };

/* Basic item info structure. */
struct st_bstream_item_info
{
  item type;         /* db, table, etc. */
  name of the item;
  long int pos;      /* position of the item in the image's catalogue */
};

/* DB-specific item info: nothing but the basic item info.
   The same style is used for tablespace info, etc. */
struct st_bstream_db_info
{
  struct st_bstream_item_info base;   /**< The base of the info class. */
};

/** Describes a per-database item.
    @note The fields snap_num and pos are used for triggers only and are NULL
    otherwise. They contain the coordinates of the table on which the trigger
    fires. All other per-db items are not tied to another per-db item; a
    trigger, however, is tied to a table (another per-db item), hence it needs
    special treatment. So trigger info contains:
      basic item info (name, db catalogue pos number);
      associated db info;
      snapshot number of the associated table;
      position of the table within the snapshot.
    What about the associated table's basic info (catalogue pos no?) ???? */
struct st_bstream_dbitem_info
{
  struct st_bstream_item_info base;   /**< Data common to all items. */
  struct st_bstream_db_info *db;      /**< Database to which this item belongs. */
  unsigned short int snap_num;        /**< Snapshot where the table data is stored. */
  unsigned long int pos;              /**< Table position within the snapshot. */
};

class Image_info : public st_bstream_image_header
{
  /* The following class types are defined only for the purpose of the backup image: */
  class Obj, Ts, Db, Table;
  class Dbobj;   /* any per-db obj other than table: func, event, triggers, ... */
  /* We need these iterators to list the image contents! */
  class Iterator, Ts_iterator, Db_iterator, Dbobj_iterator;

  ulonglong data_size;   /* How much table data is saved in the image? */

  /* What is the diff between master binlog info and the one in the image header? */
  st_bstream_binlog_info master_binlog_info;

  /* Count getters for obj, table, view, routine, priv, db, ts and snapshots: */
  ulong object_count(); ulong table_count(); ...

  /* Find obj given its name: db, ts */
  Db* find_db(const String&) const;
  Ts* find_ts(const String&) const;

  /* Get obj pointer given its pos number: db, ts, db_object, table. */
  Db* get_db(uint pos);
  Ts* get_ts(uint pos);
  Dbobj* get_db_object(uint db_num, ulong pos);
  Table* get_table(ushort snap_num, ulong pos) const;

  /* Iterators for enumerating the contents of the archive: dbs, ts, db_objects */
  Db_iterator* get_dbs();
  Ts_iterator* get_tablespaces();
  Dbobj_iterator* get_db_objects(Db& db);

  /* All info about snapshots: */
  Snapshot_info *m_snap[MAX_SNAP_COUNT];

  /* Routines to save start time, end time, vp time, binlog & master binlog pos. */
  /* Routines to get vp time, end time. */

protected:
  /* Routines to: add a snapshot; add Db* at uint pos; add Ts* at pos;
     add a db object given obj_type, obs::Obj*, at pos: */
  Dbobj* add_db_object(Db &db, const obj_type type, obs::Obj *obj, ulong pos);
  Table* add_table(Db &db, obs::Obj *tbl, Snapshot_info &snap, ulong pos);

protected:
  /* Implementation specific */
  MEM_ROOT mem_root;
  class Tables;   /* implements the Table_list interface */
  ...
private:
  Map m_dbs;      /* list of dbs */
  Map m_ts_map;   /* list of Ts */
};

Note:
- class Image_info::Tables implements the tables list:
    class Image_info::Tables : public Table_list, public Map;
- The list is implemented using a map. Insertion/retrieval happens using the
  pos in the map.
- There is the following typedef for Base defined within this Tables class:
    typedef Map Base;
- Table_list is an abstract class which enforces that operator[](pos) returns a
  Table_ref (which contains basic table name info).

class Snapshot_info
{
public:
  - Enum for snapshot type: Native, Default, CS, No data.
  - Get snapshot type: type().
  - version_t version();   // Get snapshot version (int type).
                           // This is separate from the image version (int).
  - char *name();          // For the "%s backup/restore driver" pattern.
  - Get table count.
  - Determine if a given table can be saved in this snapshot:
      bool accept(Table_ref&, storage_engine_ref);
  - Create backup/restore driver: get_backup_driver(Backup_driver*&) etc.
  - Get table given pos: Image_info::Table* get_table(ulong pos) const;
    - Is this a pos within the snapshot, or a pos in the image catalogue ???
protected:
  - Add table at a given pos: int add_table(Image_info::Table& t, ulong pos);
private:
  Image_info::Tables m_tables;   // list of tables in this snapshot
};   // Note: this is a pure virtual class !!!

class Image_info::Obj : public Sql_alloc
{
  // Image allocation is done using MEM_ROOT. Dummy destructors!
  obj_type type();   // Same as enum_bstream_item_type:
                     // charset, user, privilege, db, table, view, stored proc,
                     // func, event, triggers, tablespace
  // info() returns a pointer to the item info:
  st_bstream_item_info* info();
  // Pointer to the obs::Obj instance, if it is known:
  obs::Obj *m_obj_ptr;
  // Description of the obj (defn??):
  char* describe(describe_buf&) const =0;
protected:
  Obj(obs::Obj *obj_ptr);   // given an obs::Obj
};

// Definition of tablespace
class Image_info::Ts : public st_bstream_ts_info, public Image_info::Obj, public Db_ref;
Note:
- A tablespace is a per-DB item.
- Tablespace implements the generic Image_info::Obj.
- Tablespace implements the ts item info class.

// Definition of database
class Image_info::Db : public st_bstream_db_info, public Image_info::Obj, public Db_ref;
Note:
- Db also contains functions to add/get an obj given its pos. Internally it
  uses Map m_objs; to store the list of Dbobj's. This pos seems to be just a
  pos within the Db rather than a pos in the catalogue or snapshot.

/** Specialization of Image_info::Obj for storing a per-database object.
    @note For tables, there is the dedicated class @c Image_info::Table. */
class Image_info::Dbobj : public st_bstream_dbitem_info, public Image_info::Obj, public Table_ref;
Note:
- Why does it implement Table_ref ???
  Many Dbobj's are not associated with a table; the trigger is the exception,
  being associated with a table. It looks like a hack, since Table_ref provides
  name()-style basic functions which are reused here even for non-tables!

/// Specialization of @c Image_info::Obj for storing info about a table.
class Image_info::Table : public st_bstream_table_info, public Image_info::Obj, public Table_ref
{
  const Db &m_db;        /* associated database */
  // Note: Db_ref just provides name() etc. basic functions.
  //       Db is a class which is ready for storing in the image.
  Table *next_table;     /* can be used to create a linked list in db */
  TABLE_LIST *m_table;   /* if not NULL, points at the opened table */
  .....
  .....
};

/** Base class for all iterators. An iterator is used as follows:

      Iterator_X it;
      Image_info::Obj *obj;
      while ((obj= it++))
      { }

    This is an abstract class. Derived iterators must define the @c get_ptr()
    and @c next() methods, which are used to implement @c operator++(). */
class Image_info::Iterator
{
  Obj* operator++(int);
protected:
  const Image_info &m_info;   <== Note this.
  ....
};

/* Used to iterate over all tablespaces stored in a backup image. */
class Image_info::Ts_iterator : public Image_info::Iterator
{
  Ts_iterator(const Image_info&);
protected:
  uint pos;   // position in the iterator, starting from 0
  ....
};

Similarly:
  class Image_info::Db_iterator : public Image_info::Iterator;
  class Image_info::Dbobj_iterator : public Image_info::Db_iterator;
    Note the constructor: Dbobj_iterator(Image_info &info, const Db &db);

Note: void Image_info::count_object(const enum_bstream_item_type type);
increments the counter for the given type:

  case BSTREAM_IT_SPROC:
  case BSTREAM_IT_SFUNC:
  case BSTREAM_IT_EVENT:
  case BSTREAM_IT_TRIGGER:
    m_routine_count++;   // wow!!!
    break;

ulong Image_info::routine_count()     { return m_routine_count; }
ulong Image_info::db_count() const    { return m_dbs.count(); }
ulong Image_info::ts_count() const    { return m_ts_map.count(); }
ulong Image_info::table_count() const { return m_table_count; }
ulong Image_info::view_count() const  { return m_view_count; }
...

// This pos happens to be the same as the pos in the image catalogue,
// since databases are stored first in the catalogue.
Image_info::Db* Image_info::get_db(uint pos) const { return m_dbs[pos]; }
// This pos starts from 0. Not an absolute pos in the catalogue.
Image_info::Ts* Image_info::get_ts(uint pos) const { return m_ts_map[pos]; }

Note: How to get an instance of obs::Obj? Examples:
  obs::Obj *db_obj=  obs::get_database(&name_str);
  obs::Obj *tbl_obj= obs::get_table(&db->name(), &name_str);

Performance Test Results: https://inside.mysql.com/wiki/PerformanceResults-backup

Efficiency of the backup operation vs mysqldump vs plain copy vs xtrabackup
(xtrabackup is Percona's backup utility):

  Engine   MySQL Backup   Copy        mysqldump   XtraBackup
  MyISAM   6.72 s/GB      6.29 s/GB   29 s/GB     6.30 s
  InnoDB   52 s           ??          28 s        ??

Restore:
  MyISAM: almost the same time as backup.
  InnoDB: takes ~25 times as long as backup.

Large number of DB objects:
-- Results for backing up a database with a big number of DB objects do not
   depend much on the engine, since it is mostly metadata.
-- Restore is 2 to 4 times slower.
-- Dump is around 20 times slower for backup!!! (why ???)
-- Restore from a dump is almost as efficient as native backup.

Impact of the backup operation on the workload:
- During backup there may be a 50% performance drop for InnoDB.
- For MyISAM there may be around a 30% performance drop.

Note: There are no obvious and striking findings in the backup performance
report. The comparison charts are a bit confusing. It would be good to plot the
throughput of one vs. the other.
The restore gives deterministic figures because it is intrusive. The restore
may take longer in the future if it is made non-intrusive. InnoDB engine
statistics about locks/waiting time may be useful to record ??

The parent worklog for online backup testing is WL#4208.

Notes on WL#4406 (Online Backup: Performance Testing):
- Implementor is Ranger.
- Trigger is not mentioned in the WL but is tested.
- The idea is to establish a baseline for performance testing. The figures will
  be compared with file copying time.
- Testing with different database sizes is included.
- Testing with a high amount of metadata is not included (but is tested?).
- The effect of online queries on backup performance is also to be tested.
- Good comprehensive goals.

WL#4220: Backup Stress Testing
- Created by Chuck. Includes perf testing WL#4406 and WL#4218.
- Invariant testing (tx consistency; only for InnoDB?) (patch queued; completed?)

Endurance Testing (WL#4922)
- Highly parallel execution must not fail. (This may expose any backup
  starvation problem.)

Question: Where is the backup_restore.pl script from WL#4044 (large data test
by Elena)?

WL#5214: Test MySQL backup with different database sizes
- Though WL#4406 covers some part of it, the focus of 4406 is general
  performance, whereas this one should focus on database size only (and hence
  can do a better job in this area).
- Should also handle large metadata.
- It would be better to produce a throughput-vs-data-size graph as a result.

MyISAM Backup:
- See worklog 866 for how MyISAM handles backup.