A Japanese Geek's MySQL Blog.

2009-12-30

awk and mysqldump

Awk, which has been existing for long time on UNIX like operating system, may sound legacy for people who live in modern web-based programming world. However, awk is really sophisticated and useful tool for various purposes. Today, I'll show you how to use it with mysqldump ;)

One feature which is missing on the current mysqldump command is to separate dump files per DB or table. It can store all data into a large single file only, whether we run the command with --result-file option or redirect the output to some file. Of course it is possible to run mysqldump command several times per DB or table, but it cannot produce a consistent snapshot at a certain time, each dump may be time-shifted in other words, unless all involved tables are locked during dump. So, we need to backup database using a single mysqldump instance if we need a consistent data.

Now, you can make use of awk! It is a really powerful text processor. As an output of mysqldump is text, you can process the output using awk. See the following awk script:
  1. #!/usr/bin/awk -f  
  2.   
  3. function is_new_db(db_name) {  
  4.   for (i = 1; i <= num_db; i++) {  
  5.     if (db_name == db[i]) {  
  6.       return 0;  
  7.     }  
  8.   }  
  9.   return 1;  
  10. }  
  11.   
  12. BEGIN {  
  13.   num_db = 0  
  14.   num_prelines = 0  
  15.   num_postlines = 0  
  16.   current_file = "/dev/null"  
  17. }  
  18.   
  19. /^\-\-/ {  
  20.   if ($2 == "Current" && $3 == "Database:") {  
  21.     close(current_file);  
  22.     db_name = $4  
  23.     gsub("`""", db_name);  
  24.     current_file = db_name ".sql";  
  25.     if (is_new_db(db_name)) {  
  26.       db[++num_db] = db_name;  
  27.       print "--\n" $0 "\n--\n" > current_file;  
  28.       for (i = 1; i <= num_prelines; i++)  
  29.         print prelines[i] >> current_file;  
  30.     }  
  31.   } else if (num_db == 0) {  
  32.     num_prelines++;  
  33.     prelines[num_prelines] = $0;  
  34.   } else if ($2 == "Dump" && $3 == "completed") {  
  35.     num_postlines++;  
  36.     postlines[num_postlines] = "";  
  37.     num_postlines++;  
  38.     postlines[num_postlines] = $0;  
  39.   } else {  
  40.     print $0 >> current_file  
  41.   }  
  42.   next;  
  43. }  
  44.   
  45. /^\/\*.+\*\/;/ {  
  46.   if (match($0, "character|collation")) {  
  47.     print $0 >> current_file;  
  48.   } else if (match($0, "SET")) {  
  49.     if (num_db == 0) {  
  50.       if (match(prelines[num_prelines], "^\-\-")) {  
  51.         num_prelines++;  
  52.         prelines[num_prelines] = "";  
  53.       }  
  54.       num_prelines++;  
  55.       prelines[num_prelines] = $0;  
  56.     } else {  
  57.       num_postlines++;  
  58.       postlines[num_postlines] = $0;  
  59.     }  
  60.   } else {  
  61.     print $0 >> current_file;  
  62.   }  
  63.   next;  
  64. }  
  65.   
  66. { print $0 >> current_file }  
  67.   
  68. END {  
  69.   for (i = 1; i <= num_db; i++) {  
  70.     current_file = db[i] ".sql";  
  71.     print "" >> current_file  
  72.     for (j = 1; j <= num_postlines; j++) {  
  73.       print postlines[j] >> current_file;  
  74.     }  
  75.   }  
  76. }  
Save the script into a file named separate-dump.awk etc, and make it exectable (chmod u+rx). You can separate dump files per DB by using the script like below:
  1. shell> mysqldump -A --single-transaction --master-data=2 --flush-logs | ./separate-dump.awk  
Then, dump files named like "database_name.sql" are created under your current directory! Of course, you can also process an existing dump file like below:
  1. shell> ./separate-dump.awk < dump.sql  
Making use of "good old fashioned" unixy command tools will make us happy ;)

Enjoy!!

4 comments:

  1. I've been looking to do this for a while, and seen a few solutions that looked rather .. umm .. interesting. Even bodged a bash script that would do similar by identifying every database in the server and then calling mysqldump one by one, but this is so much nicer :)

    ReplyDelete
  2. Yesterday in a programming system course test I have a item where the answer had to be this one : 1. shell> mysqldump -A --single-transaction --master-data=2 --flush-logs | ./separate-dump.awk

    ReplyDelete
  3. Thanks for posting the blog. I felt comfortable while reading the post. Keep posting more blogs.
    Click Here For More Information...
    Oracle SQL Training in Al Karama, Dubai.

    ReplyDelete