Backup and restore

mysqldump breaks character encoding?

 
 
Picture of Visvanath Ratnaweera
mysqldump breaks character encoding?
Group Particularly helpful Moodlers
I need to understand how mysqldump (the cli tool) handles the character encoding.

I had a case where taking a db dump and playing it back broke the environment check
(Site administration -> Server -> Environment). As a result the next upgrade also stops
at the initial unicode check. The db dump was taken with the mysqldump cli tool, played
back with mysql cli tool.

I localized the problem in a test setup.  These steps are reproducible - all on Debian
Squeeze, 64 bit:

1. Create a fresh 1.9.16 instance on http://my.site/test/. Worked error free, including
the environment check after install.

2. Check the character set:
mysql> use db_test;
mysql> show local variables like 'character_set_database';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| character_set_database | utf8  |
+------------------------+-------+

3. Take a sql dump,
$ mysqldump -u db_user -pSECRET1 db_test > db_test.sql

4.  drop and recreate the database, play back the sql dump.
$ mysqladmin -pSECRET1 drop -f db_test
$ mysqladmin -pSECRET1 create db_test --default-character-set=utf8
$ mysql --user=db_user --password=SECRET2 db_test < db_test.sql

Now the enviroment check fails!

5. Check the character set:
mysql> use db_test;
mysql> show local variables like 'character_set_database';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| character_set_database | latin1  |
+------------------------+-------+

Character set has changed from utf8 to latin1 !

What am I doing wrong?

P.S. Sorry for the broken formatting. For whatever the reason: '"Plain text format" is not
monospace'  http://moodle.org/mod/forum/discuss.php?d=200351.
 
Average of ratings: -
Picture of Iñaki Arenaza
Re: mysqldump breaks character encoding?
Group DevelopersGroup Documentation writersGroup Particularly helpful Moodlers
Hi Visvanath,

when using mysqldump, I always use the '--opt' option (and '--routines' and '--triggers' if I suspect the database has stored procedures and/or triggers, which I think Moodle doesn't use so far with MySQL). Maybe --opt is the default now, And I always use the mysql root user to dump the database. I also use the mysql root user when restoring it.

That helps restoring the database exactly as it was when the backup was taken (including character set options, even at the column level).

(please, keep me updated on this via your authentication forum post, as I'm not suscribed to this forum)

Saludos.
Iñaki.
 
Average of ratings: -
Picture of Visvanath Ratnaweera
Re: mysqldump breaks character encoding?
Group Particularly helpful Moodlers
@all

The problem still remains. It must be my understanding of MySQL character encoding and the way the tools mysqldump and mysql corporate.

The problem was, if I do
===
$ mysqldump -u db_user -pSECRET1 db_test > db_test.sql
$ mysqladmin -pSECRET1 drop -f db_test
$ mysqladmin -pSECRET1 create db_test --default-character-set=utf8
$ mysql --user=db_user --password=SECRET2 db_test < db_test.sql
===
it break Site Adminstration -> Server -> Environment -> unicode.

If I don't drop and create the database, i.e. not execute the second and third commands above, the site passes the unicode test.

Note: The db dump contains
===
DROP TABLE IF EXISTS 'tablename';
CREATE TABLE `tablename` (
....
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='some comment';
/*!40101 SET character_set_client = @saved_cs_client */;
===
kind of statements for each table.

Apparently I need a different "CREATE db" command from the one above. Any hints?
 
Average of ratings: -
Picture of Iñaki Arenaza
Re: mysqldump breaks character encoding?
Group DevelopersGroup Documentation writersGroup Particularly helpful Moodlers
Hi Visvanath,

this is what I just did with my local test install using 1.9.17+ (Build: 20120419)

mysqldump --opt -u root -pSECRET moodle_19 > moodle_19.dump
mysql -u root -pSECRET -e "drop database moodle_19"
mysql -u root -pSECRET -e "create database moodle_19 default character set utf8 collate utf8_unicode_ci"
mysql -u root -pSECRET moodle_19 < moodle_19.dump

and I don't get any problems in "Site Adminstration -> Server -> Environment -> unicode".

Saludos.
Iñaki.
 
Average of ratings: -
Picture of Visvanath Ratnaweera
Re: mysqldump breaks character encoding?
Group Particularly helpful Moodlers
Hi Iñaki

I read somewhere that --opt is the default and wasn't convinced whether this would make a difference. But you are right.

After changine the two lines:
$ mysqladmin -u root -pSECRET drop -f db_test
$ mysqladmin -u root -pSECRET create db_test --default-character-set=utf8
to
$ mysql -u root -pSECRET -e "drop database db_test"
$ mysql -u root -pSECRET -e "create database db_test default character set utf8 collate utf8_unicode_ci"
"Site Adminstration -> Server -> Environment -> unicode" is OK.

Apart from a different tool (mysql instead of mysqladmin) now there is the additional option "default collate utf8_unicode_ci". I think collate was the crucial thing.

Thank you very much for going out of way to help me solving this problem!
 
Average of ratings: -
Gmads
Re: mysqldump breaks character encoding?
Group Particularly helpful MoodlersGroup Testers

Hi Visvanath,

The problem is caused because the default-character-set option is ignored by mysqladmin (whether that is by design or it is a bug, I don't know).

The mysqladmin program reads and uses the value from character-set-server parameter defined in the my.ini file.

 
Average of ratings: -
Picture of Iñaki Arenaza
Re: mysqldump breaks character encoding?
Group DevelopersGroup Documentation writersGroup Particularly helpful Moodlers
Hi Visvanath,

when using mysqldump, I always use the '--opt' option (and '--routines' and '--triggers' if I suspect the database has stored procedures and/or triggers, which I think Moodle doesn't use so far with MySQL). Maybe --opt is the default now, And I always use the mysql root user to dump the database. I also use the mysql root user when restoring it.

That helps restoring the database exactly as it was when the backup was taken (including character set options, even at the column level).

(please, keep me updated on this via your authentication forum post, as I'm not suscribed to this forum)

Saludos.
Iñaki.
 
Average of ratings: -
Picture of Visvanath Ratnaweera
Re: mysqldump breaks character encoding?
Group Particularly helpful Moodlers
Hi Iñaki

Thanks for the hints. They lead to the tons of information here http://docs.moodle.org/en/Converting_your_MySQL_database_to_UTF8. Sure, I'll keep the other thread "auth/ldap/auth_ldap_sync_users.php behaves differently in 1.9.17+ than in 1.9.16+" http://moodle.org/mod/forum/discuss.php?d=200291 updated.

@all
If you have further hints to narrow down my search, I appreciate.
 
Average of ratings: -
Picture of Howard Miller
Re: mysqldump breaks character encoding?
Group DevelopersGroup Documentation writersGroup Particularly helpful Moodlers
I'm not sure if this helps (have not read this thread in detail) but we have had lots of fun with the collation settings in MySQL.

mysqldump does not comprehensively dump the collation settings for the fields/tables (regardless of --opt as far as I can tell). To make matters worse the way in which the default collation for fields is defined when the table is created is really weird. I assumed that it would come from the parent table/database but this turns out not to be the case. It is derived from (I *think* anyway) whataver locale the MySQL server happened to be built with!

So, it makes it remarkably easy to migrate a database from one machine to another and screw up all the collations.

Our solution was to run a script over the dump file so that the collations are specified explicitly in all cases.

Increasingly annoyances like this are pushing me more and more into the PostgreSQL camp which is altogether a more civilised database.
 
Average of ratings: -
Picture of Visvanath Ratnaweera
Re: mysqldump breaks character encoding?
Group Particularly helpful Moodlers
Hi Howard

I got is solved in the other subthread: http://moodle.org/mod/forum/discuss.php?d=200610#p878358.

Side note: Pity one can not tag discussions as [SOLVED] in moodle.org.

Well, the "solution" didn't explain what was wrong with the old method. Still it is a consolation to know that I'm not alone in this.
;-(
 
Average of ratings: -
Picture of Helen Foster
Re: marking discussions as solved
Group DevelopersGroup Documentation writersGroup Moodle HQGroup Particularly helpful MoodlersGroup Testers

Although it's not possible to tag discussions, [Solved] can be added to the subject line of a post, as suggested in the Moodle.org forums Code of Conduct.

 
Average of ratings: -
Picture of Visvanath Ratnaweera
Re: marking discussions as solved
Group Particularly helpful Moodlers
Hi Helen

That solution has a big handicap that you can't see the "tag" in the forum overview, http://moodle.org/mod/forum/view.php?id=746 for example.

BTW, how often is this "suggestion" followed? http://moodle.org/mod/forum/search.php?id=5&subject=solved gives me 992. Out of how many threads?
 
Average of ratings: -