### Converting to UTF8 - what Moodle Docs procedure to use?

Converting to UTF8 - what Moodle Docs procedure to use?

Hello fellow Moodlers,

I recently upgraded one of our Moodle environments from 2.9.6 to 3.3.3, and with that, wanted to convert our MariaDB database (v 10.2.10) from utf8 to utf8mb4, since I saw a warning on the Server > Environment page.

I ran into trouble because of these separate procedures, since the 2nd procedure specifically says that you should add

﻿skip-character-set-client-handshake

to the my.cnf file, but the 1st procedure does not include that line. Once we added that line, everything worked fine.

We also experienced a separate error because of the 1st procedure, since it says you should add﻿

default-character-set=utf8mb4default-collation=utf8mb4_unicode_ci

to my.cnf, but these options are were apparently deprecated in MySQL 5.1 and removed in 5.5. For more information, read these bug reports and discussions. Once we removed those lines, the error went away.

I admit that I am a mere noob when it comes to these technical details about databases, collations, character sets, etc. (our fantastic IT department helps us manage these aspects), but still I wonder - are these Moodle Docs pages about the same thing? Could they be merged? Of course I could edit the wiki pages or write in the page comments, but given my lack of expertise, I fear I would do more harm than good, and wanted to share this with some expert eyes.

- Ray

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

If you were running 2.9 then your database is already *definitely* UTF8.

What you probably want is the second link about 'full' unicode support (utf8mb4).

Average of ratings: Useful (1)
Re: Converting to UTF8 - what Moodle Docs procedure to use?

Hi Howard,

Thanks for taking a look! Yes, our site was definitely on utf8, and we were moving to utf8mb4.

I guess my question to the community is: what is the appropriate situation for using the 1st procedure, as opposed to the 2nd? In my case, I tried using the 1st procedure, since it was the first one that I found, and nothing in the documentation prompted me to wonder whether I was in the right place. In fact, when I read the sentence, "﻿You need to do two things. 1) Change your mysql to have utf8mb4 as its character set and 2) Change your database to utf8mb4", I thought to myself, "Yes, those are exactly the two things I need to do. I am in the right place!"

Both of these procedures have:

• a step about updating settings in my.cnf
• a step about updating the database character sets,

but one of them has deprecated/missing MySQL settings, and the other does not. Is the 1st procedure just out of date?

I say all of this in the spirit of helping others who might make the same mistake I did, and also with a deep sense of humility if I am missing something obvious about character sets, etc., that necessitates two separate pages and procedures.

- Ray

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

The first docs "Converting your MySQL database to UTF8" clearly says at its start that it "describes how to convert your MySQL database from the latin1 charset to UTF8". It is something we went through around Moodle 1.6 times. As your site has already been running on UTF8, that docs does not apply to you (like Howard already said).

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

Hi David,

Many thanks for taking a look!

When that sentence says "latin1 charset to UTF8", does it mean three-byte UTF8, or four-byte UTF8? The document says lots of things about "utf8", "UTF8", and "utf8mb4". I would think that my concern about deprecated MySQL settings would hold either way.

Also, the second document actually links directly to the first document. Under the section "Steps to upgrade", the bold NOTE says "For large sites, this script will be slow. It is recommended to dump and reimport your data according to https://docs.moodle.org/33/en/Converting_your_MySQL_database_to_UTF8#Default_Mysql_character_set".

Thus, I would still actually end up on the first document if I followed those directions.

I hope that some part of this is helpful!

- Ray

Average of ratings: Useful (1)
Re: Converting to UTF8 - what Moodle Docs procedure to use?

Thanks Ray for highlighting documentation needing improvement, and thanks Howard and David for your advice.

As David mentions, the page "Converting your MySQL database to UTF8" dates back to Moodle 1.6, so I have deleted the page in the 3.3 and 3.4 docs wikis. (It's still available in the 3.2 docs wiki though - https://docs.moodle.org/32/en/Converting_your_MySQL_database_to_UTF8 )

I also noticed that MySQL full unicode support linked to it, but found it unclear which bit of text it referred to. Thus I removed the link and am hopeful that if further information is required, someone knowledgeable will add it.

Average of ratings: Useful (3)
Re: Converting to UTF8 - what Moodle Docs procedure to use?

Hi Helen et al.,

I really appreciate everyone's careful attention to this nitty-gritty issue! Those changes sound about right to me.

The only final thing I would say, which I think you alluded to ("hopeful that if further information is required, someone knowledgeable will add it. "), is that that old page did have examples of some useful find/replace commands that could be used with the "dumpfile" method.

Specifically, I am talking about the commands in the Converting a database containing tables > Linux & Mac section. My understanding is that, to modernize those commands, one would change instances of latin1 to uft8, or something like that. Since I know the old versions of that page exist, I can find the example commands and use them, but other people would not know to look for the old 3.2 version. All that said, the PHP script exists to do the same thing, and perhaps it is appropriate to expect that power-users with huge sites will seek out the appropriate information elsewhere. We can't expect the Moodle Docs to explain everything about everything!

Anyway, thanks again for helping resolve this.

- Ray

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

I'm a bit lost... you shouldn't be seeing 'latin1' encoding at all. It's nearly 2018!!

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

Hi Howard,

Yes, that is correct! (and wow, 2018 snuck up on me!) My database never used latin1 encoding. But if you read this example find/replace command in the Linux & Mac section of "Converting your MySQL database to UTF8" (the old document that is now deleted for 3.3+), you will see that it says:

:%s/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci/
:%s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8mb4/

When we used these commands, we just replaced latin1 with utf8. So, we found those commands useful as a suggestion of what to do, but we needed to tweak them. If a new person comes along and only reads the 3.3+ versions of these wiki pages, they will never see these useful commands (since that page is deleted). Again though, they are directed to use the script, which does the same thing.

Therefore, in the current state, I think the 3.3+ version of these wiki pages is technically 100% correct (in my humble opinion) and does not lead anyone astray, but it could have one more piece of useful information if it somehow included an up-to-date version of the find/replace commands above. Nit-picky, I know, but I just want to make sure everyone understands what I mean.

-Ray

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

I may be missing something... why would you still need these commands?

Moodle has a built-in tool to do the conversion.

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

Because those in-built commands were reported to take a while to run so it may be faster to actually do it this way (that's how I understood the reasoning).

Average of ratings: Useful (1)
Re: Converting to UTF8 - what Moodle Docs procedure to use?

Ye-es. In fact I think it was me that reported it. On a copy of my live site it took several days (and then I gave up). I thought that some work had been done on the script to improve it?

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

I don't know, sorry. I haven't been involved in that issue.

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

Answering my own question - MDL-58729

It would seem that substantial improvements were achieved.

Average of ratings: Useful (1)
Re: Converting to UTF8 - what Moodle Docs procedure to use?

Hi Howard and David,

Yes, I was going by the recommendation on MySQL full unicode support that said "For large sites, this script will be slow" and in the old version of that page, it then included a link to the (now-deleted) page with the find/replace commands. Perhaps that sentence ("... this script will be slow...") could be removed? Then there would be no more possible confusion. Based on that tracker issue, it seems like the tests that David Monllao reported indicate that the script runs much much faster! Taking less than 10% of the unpatched time, that's amazing!

Anyway, I think all of you are caught up as to exactly what I was asking and experiencing. Thanks again!

- Ray

Average of ratings: -
Re: Converting to UTF8 - what Moodle Docs procedure to use?

I think it might still be slow on large sites. But there's slow and then there's S L O W

On the 3.3 and 3.4 doc pages I have updated that line to say...

On very large sites this may take a long time to run. You should probably establish how long on a test install before taking your live site offline. In some cases you might consider dumping and re-importing your data.

Average of ratings: Useful (2)
Re: Converting to UTF8 - what Moodle Docs procedure to use?

The page has now been deleted (thank you Helen) which should avoid confusion.

There can't be anybody left running Moodle <1.6...

Average of ratings: -