A Japanese Geek's MySQL Blog.

2009-12-30

awk and mysqldump

Awk, which has been existing for long time on UNIX like operating system, may sound legacy for people who live in modern web-based programming world. However, awk is really sophisticated and useful tool for various purposes. Today, I'll show you how to use it with mysqldump ;)

One feature which is missing on the current mysqldump command is to separate dump files per DB or table. It can store all data into a large single file only, whether we run the command with --result-file option or redirect the output to some file. Of course it is possible to run mysqldump command several times per DB or table, but it cannot produce a consistent snapshot at a certain time, each dump may be time-shifted in other words, unless all involved tables are locked during dump. So, we need to backup database using a single mysqldump instance if we need a consistent data.

Now, you can make use of awk! It is a really powerful text processor. As an output of mysqldump is text, you can process the output using awk. See the following awk script:
#!/usr/bin/awk -f

function is_new_db(db_name) {
  for (i = 1; i <= num_db; i++) {
    if (db_name == db[i]) {
      return 0;
    }
  }
  return 1;
}

BEGIN {
  num_db = 0
  num_prelines = 0
  num_postlines = 0
  current_file = "/dev/null"
}

/^\-\-/ {
  if ($2 == "Current" && $3 == "Database:") {
    close(current_file);
    db_name = $4
    gsub("`", "", db_name);
    current_file = db_name ".sql";
    if (is_new_db(db_name)) {
      db[++num_db] = db_name;
      print "--\n" $0 "\n--\n" > current_file;
      for (i = 1; i <= num_prelines; i++)
        print prelines[i] >> current_file;
    }
  } else if (num_db == 0) {
    num_prelines++;
    prelines[num_prelines] = $0;
  } else if ($2 == "Dump" && $3 == "completed") {
    num_postlines++;
    postlines[num_postlines] = "";
    num_postlines++;
    postlines[num_postlines] = $0;
  } else {
    print $0 >> current_file
  }
  next;
}

/^\/\*.+\*\/;/ {
  if (match($0, "character|collation")) {
    print $0 >> current_file;
  } else if (match($0, "SET")) {
    if (num_db == 0) {
      if (match(prelines[num_prelines], "^\-\-")) {
        num_prelines++;
        prelines[num_prelines] = "";
      }
      num_prelines++;
      prelines[num_prelines] = $0;
    } else {
      num_postlines++;
      postlines[num_postlines] = $0;
    }
  } else {
    print $0 >> current_file;
  }
  next;
}

{ print $0 >> current_file }

END {
  for (i = 1; i <= num_db; i++) {
    current_file = db[i] ".sql";
    print "" >> current_file
    for (j = 1; j <= num_postlines; j++) {
      print postlines[j] >> current_file;
    }
  }
}
Save the script into a file named separate-dump.awk etc, and make it exectable (chmod u+rx). You can separate dump files per DB by using the script like below:
shell> mysqldump -A --single-transaction --master-data=2 --flush-logs | ./separate-dump.awk
Then, dump files named like "database_name.sql" are created under your current directory! Of course, you can also process an existing dump file like below:
shell> ./separate-dump.awk < dump.sql
Making use of "good old fashioned" unixy command tools will make us happy ;)

Enjoy!!

3 comments:

  1. I've been looking to do this for a while, and seen a few solutions that looked rather .. umm .. interesting. Even bodged a bash script that would do similar by identifying every database in the server and then calling mysqldump one by one, but this is so much nicer :)

    ReplyDelete
  2. Yesterday in a programming system course test I have a item where the answer had to be this one : 1. shell> mysqldump -A --single-transaction --master-data=2 --flush-logs | ./separate-dump.awk

    ReplyDelete

  3. Its a wonderful post and very helpful, thanks for all this information. You are including better information regarding this topic in an effective way.Thank you so much

    Personal Installment Loans
    Title Car loan
    Cash Advance Loan

    ReplyDelete