When an arbitrary text file is uploaded into the files area in the now all-Unicode version 1.6, how does Moodle handle the file's character encoding? Or does it? Does it assume a particular encoding and convert to UTF-8, or does it do nothing (in which case I guess some files may not be displayed correctly)?
I am basically wondering why no option is provided for the user to select the encoding of the file.
How does uploading a text file work with character encodings in 1.6
by Howard Miller -
Number of replies: 11
In reply to Howard Miller
Re: How does uploading a text file work with character encodings in 1.6
Exactly like in Moodle 1.5? The browser decides (based on the content-type HTML headers).
The *only* difference is that the content-type HTTP headers will default to UTF-8, so some text files and some HTML files (those without a proper HTML content-type header) will need to be converted.
Conversion of text files to UTF-8 should be easy, and I really think (hope) that 99% of current HTML files have had their HTML headers properly defined for a long time now.
Ciao
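As a rough illustration of the conversion step mentioned above, here is a minimal Python sketch. It is not part of Moodle; the function names `convert_to_utf8` and `convert_file` are hypothetical, and it assumes you already know the source encoding of the file.

```python
from pathlib import Path


def convert_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known source encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def convert_file(src: str, dst: str, source_encoding: str) -> None:
    """Read src as raw bytes, convert them, and write the UTF-8 result to dst."""
    Path(dst).write_bytes(convert_to_utf8(Path(src).read_bytes(), source_encoding))


if __name__ == "__main__":
    # Hypothetical sample: a Spanish word saved by a Latin-1 editor.
    legacy = "señal".encode("iso-8859-1")
    print(convert_to_utf8(legacy, "iso-8859-1"))
```

If the source encoding is guessed wrongly, the call either raises `UnicodeDecodeError` or silently produces the wrong characters, so the result is worth checking by eye before uploading.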
In reply to Eloy Lafuente (stronk7)
Re: How does uploading a text file work with character encodings in 1.6
by Howard Miller -
I'm not sure you follow me - or I didn't read you correctly.
I mean, if I type some material into my dumb text editor on my (say) Russian PC, the file contains no encoding information. If I go to the file upload area and upload this file into Moodle, how does Moodle handle this? How does it do the conversion into UTF-8 with no knowledge of the original encoding?
I ask because this has become an issue in quiz imports. If somebody constructs a quiz offline using a foreign encoding, I am thinking that there may be a need for an interface addition to allow the user to specify the encoding. However, before doing this I thought I'd better ask why it isn't necessary in the only place I could think of where a similar thing is being done.
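To make the problem concrete, here is a small Python sketch of why the original encoding cannot be recovered from the bytes alone. The sample word and the two legacy encodings (Windows-1251 and KOI8-R) are assumptions chosen for illustration; nothing here reflects Moodle's own code.

```python
# The same Russian word, as saved by an editor that uses Windows-1251.
text = "Привет"
raw = text.encode("cp1251")

# The bytes carry no label. Decoding them as KOI8-R also "succeeds",
# but it yields different characters from the ones that were typed.
as_cp1251 = raw.decode("cp1251")
as_koi8r = raw.decode("koi8_r")

# Decoding the same bytes as UTF-8 fails outright.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(as_cp1251 == text, as_koi8r == text, utf8_ok)
```

Both legacy decodings run without error, so a program receiving only `raw` has no reliable way to tell which one the author intended.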
In reply to Howard Miller
Re: How does uploading a text file work with character encodings in 1.6
Hi Howard,
Although the two examples are different, they share the same solution:
"Convert ALL your files to UTF-8 before uploading them to Moodle"
This should be the rule to be 100% free of problems. And I must remark that the rule is pretty simple (especially compared with the pre-1.6 rules, when there were a lot of different encodings available).
In my previous post I was just trying to clarify that the rule above, although highly recommended, can be skipped in at least two common situations:
- HTML files whose content-type headers have been properly set. No need to convert such files because web browsers will handle them without problems.
- XML files whose encoding has been properly set. No need to convert such files because PHP (>=5.0.2) will handle them without problems.
All other files, such as:
- Text files uploaded to be used in resources.
- HTML files without the content-type defined.
- XML files without the encoding defined.
- Import/Export plain text files (quiz, enrol...)
must be converted to UTF-8 before using them with Moodle running under UTF-8. That's the rule: a single rule in place of the myriad of *incompatible* rules that existed before 1.6. That was, IMO, one of the main reasons to embrace UTF-8 ASAP:
"To avoid the Babel of previous encodings, being able to talk (communicate, share...) with a common one, aka Unicode".
Ciao
P.S.: The functional question remaining, then, is: should we implement some sort of "select encoding" mechanism inside every import option? My personal opinion is that we don't need it if Moodlers follow (and know!) the rule. Allowing them to select the encoding implies that they know the encoding, so why not simply convert the file before loading it into Moodle? I cannot see any advantage in including such options. Tons of programs can generate UTF-8 files, and utilities to convert between encodings are available everywhere... Anyway, that's only my personal opinion.
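The XML case above (a file that declares its own encoding needs no manual conversion) can be illustrated with a short sketch. This uses Python's standard `xml.etree.ElementTree` rather than the PHP parser Moodle relies on, so it only shows the general idea; the sample document and the `question` element name are made up for the example.

```python
import xml.etree.ElementTree as ET

# A hypothetical legacy file: Latin-1 bytes with a matching XML declaration.
legacy = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    "<question>café</question>"
).encode("iso-8859-1")

# The parser reads the declaration and decodes the bytes accordingly.
root = ET.fromstring(legacy)
print(root.text)

# Re-serialise the same tree as UTF-8, with an updated declaration.
utf8_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)
print(utf8_bytes)
```

Without the `encoding="ISO-8859-1"` declaration the parser would assume UTF-8 and reject these bytes, which matches the "XML files without the encoding defined" item in the list above.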
In reply to Eloy Lafuente (stronk7)
Re: How does uploading a text file work with character encodings in 1.6
by Howard Miller -
...sounds good to me! It's a lot easier to simply add a line to the instructions 
In reply to Eloy Lafuente (stronk7)
Re: How does uploading a text file work with character encodings in 1.6
by Jeff Forssell -
I feel like expecting all Moodlers to KNOW their encoding could lead to a major decimation in the ranks. 
I'm not sure what to suggest (besides very concrete guidance on how people can encode/re-encode their stuff).
The 2 encoding problems I've run into (with Swedish åäö) in Moodle have been when importing functioning SCORM-compatible packages:
- from eXe where some of the ÅÖÄ were right and others (I think in the Moodle navigation that was created) wrong
- a package which had some image files with names including åäö would not display those images
I don't know if the coming changes in Moodle will solve that problem. If they don't, I don't know whether that means I have to use the classic safety rule "don't use any letters an American wouldn't use in a file name (and maybe not even more than 8)" or whether it is possible to UTF-8 encode filenames!
Would it be possible to:
- warn if the document has no declared encoding, and briefly describe the problem with that
- show how the beginning of the document would look with a couple of different encoding filters
- offer a choice to {apply one of those filters} or {CANCEL and give more info on how the person could encode the document locally before the next try}
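The three steps proposed above could look roughly like the following Python sketch. This is only an illustration of the idea, not anything Moodle implements; the function names and the default list of candidate encodings are assumptions. Since a plain text file has no declared encoding, the "warn" step is approximated here by checking whether the bytes are valid UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (plain ASCII also passes)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def preview_encodings(data: bytes, candidates, length: int = 60) -> dict:
    """Decode the start of the data under each candidate encoding.

    Undecodable bytes are shown as U+FFFD so that every preview can be displayed.
    """
    head = data[:length]
    return {name: head.decode(name, errors="replace") for name in candidates}


def check_upload(data: bytes, candidates=("iso-8859-1", "cp1252", "cp1251")) -> dict:
    """Accept valid UTF-8 as-is; otherwise return previews for the user to pick from."""
    if is_valid_utf8(data):
        return {"ok": True, "previews": {}}
    return {"ok": False, "previews": preview_encodings(data, candidates)}


if __name__ == "__main__":
    sample = "Räksmörgås på bordet".encode("iso-8859-1")
    result = check_upload(sample)
    for name, preview in result["previews"].items():
        print(f"{name}: {preview}")
```

The user would then either pick the preview that looks right, after which the file can be decoded with that encoding and stored as UTF-8, or cancel and convert the file locally.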
In reply to Eloy Lafuente (stronk7)
Trouble reading importing or uploading text files using UTF8
by Ricardo De la Garza -
Trouble importing or uploading text files because of UTF-8
I recently installed Moodle 1.6. I’m having trouble uploading users from a text file when they have Spanish names containing the characters á, é, í, ó, ú and ñ. CSV files from Excel are ANSI-encoded, but Moodle reads the names incorrectly, sometimes inserting “,” in the middle of a word and garbling the information.
When I tried saving the text file as UTF-8 and pasting the information into it, Moodle did not read it.
The same thing happens when I try to import a lesson with Spanish text from a PowerPoint, or import questions from a text file. What can I do to fix it? I didn’t have these problems with version 1.4.5.
In reply to Ricardo De la Garza
Re: Trouble reading importing or uploading text files using UTF8
by Res Hotz-Pohlmann -
Same problem, no solution yet!
In reply to Ricardo De la Garza
Re: Trouble reading importing or uploading text files using UTF8
by Res Hotz-Pohlmann -
For user upload I'm now doing it this way:
- An Access database with any data source (text, CSV, Excel, another database or Access's own tables) holding the raw user data
- A query to prepare this raw user data
- Export this query with suitable export specifications, in particular UTF-8 encoding
- Result: a clean text file that can be imported into Moodle 1.6 without further problems
In reply to Res Hotz-Pohlmann
Re: Trouble reading importing or uploading text files using UTF8
by Antonio Del Olmo -
It works! Thank you very much for your solution, Res.
In the Text Export Assistant of Access, click the 'Advanced...' button and then set the 'Code page' drop-down menu to UTF-8.
In reply to Howard Miller
Re: How does uploading a text file work with character encodings in 1.6
by koen roggemans -
Howard, I think that people who deal all the time with what we would call difficult character encodings (I mean non-Latin encodings) are used to taking care of that and have proper UTF-8 editors for it.
We, Latin1 users, are a little bit spoiled and don't have the knowledge/habit of doing so, so I think it will be no problem for them.
Maybe a non-Latin1 Moodler can drop into this discussion and share his/her experience?
In reply to koen roggemans
Re: How does uploading a text file work with character encodings in 1.6
by Theodore Tzidamis -
Well, nice to see lots of people have the same problem... 
We have an old server running Moodle 1.4.2 which we want to upgrade to 1.6.3.
However, everything on the old platform has been written in ISO-8859-7 (Greek). As a result, when a course is moved onto the testbed server, it is not readable.
From what I understand from the moodle fora here, we have to convert everything to UTF-8 before moving it onto the new server (please correct me if I have got this wrong).
Nevertheless, I have another question. I can understand why the new platform doesn't understand ISO-8859-7 since it uses UTF-8. BUT: Shouldn't I be able to read the old course when changing the browser encoding to ISO-8859-7?
Perhaps I have it all mixed in my head, so feel free to correct me, by all means.
Best regards.