Japanese Language Compliance - Please would you merge this with Moodle?

Re: Japanese Language Compliance - New Code for 1.4.x

by Haruhiko Okumura -
Number of replies: 20
The problem with htmlentities(x) is that it is equivalent to htmlentities(x, ENT_COMPAT, 'ISO-8859-1'). This fails for every encoding other than ISO-8859-1. The bottom line is that every occurrence of htmlentities(x) should be replaced by htmlentities(x, ENT_COMPAT, get_string('thischarset')), whatever encoding you use.

Professor Kita kindly provided a useful script: http://t-kita.net/rpm/moodle/scripts/replace-htmlentities.pl
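
To illustrate the point about htmlentities() above, here is a minimal PHP sketch. It assumes the text is UTF-8, and current_charset() is a hypothetical stand-in for Moodle's get_string('thischarset'), which is not available outside Moodle. The charset is passed explicitly in both calls so the result does not depend on the default of the PHP version in use.

```php
<?php
// Hypothetical stand-in for Moodle's get_string('thischarset');
// this sketch simply assumes the site runs in UTF-8.
function current_charset(): string {
    return 'UTF-8';
}

// UTF-8 text (assuming this file itself is saved as UTF-8).
$text = "日本語 <b>bold</b>";

// Forcing ISO-8859-1 treats every byte as a separate Latin-1 character,
// so the bytes of each Japanese character are turned into unrelated entities.
$wrong = htmlentities($text, ENT_COMPAT, 'ISO-8859-1');

// Passing the real charset leaves the Japanese text alone and only
// escapes the HTML markup characters.
$right = htmlentities($text, ENT_COMPAT, current_charset());

echo $wrong, "\n";
echo $right, "\n";
```

The second call is the form recommended in the post above; the first reproduces the kind of garbling that the one-argument form caused on non-ISO-8859-1 sites.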
In reply to Haruhiko Okumura

Re: Japanese Language Compliance - New Code for 1.4.x

by Timothy Takemoto -

Thank you Professor Okumura. You sound like you know what you are talking about. I do not know what I am talking about. But I do hope that the patches being provided by Prof Kita and Mr. Kashiwagi are applied.

By the way though, the script that Professor Kita provides seems to be a perl script. Is that so? If so is there a good reason for re-writing it in php? And is there anyone that can do it?

The ignorant

Timothy

In reply to Timothy Takemoto

Re: Japanese Language Compliance - New Code for 1.4.x

by Haruhiko Okumura -
Yes, replace-htmlentities.pl is a Perl script that converts every occurrence of htmlentities(x) to htmlentities(x,ENT_COMPAT,get_string('thischarset','moodle')). This script is only useful to developers who just want to rewrite htmlentities() and see what happens. There are other encoding-related glitches which cannot be patched by running this script. Prof. Kita's patch data, http://t-kita.net/rpm/moodle/patches/moodle-t-kita.patch, incorporates these changes plus much more. Normally you only need this patch data. In short, there is no good reason for rewriting the Perl script in PHP.
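
For readers curious what such a rewrite involves, the following is a rough PHP sketch of the same idea. It is not a port of replace-htmlentities.pl, whose internals are not shown in this thread; add_charset_to_htmlentities() is a hypothetical name, and the approach (a regular expression with balanced parentheses) is only one assumed way of doing it. It deliberately skips any call whose argument text contains a comma, so it will miss some cases.

```php
<?php
// Rough sketch only; the real replace-htmlentities.pl may work differently.
// Rewrites one-argument htmlentities(x) calls found in a string of PHP source.
function add_charset_to_htmlentities(string $source): string {
    // Group 1 captures the argument text, allowing balanced nested parentheses.
    $pattern = '/\bhtmlentities\s*\(((?:[^()]++|\((?1)\))*+)\)/';

    $result = preg_replace_callback($pattern, function (array $m): string {
        $args = trim($m[1]);
        // Conservative: leave the call untouched if it has no argument or if
        // the argument text contains a comma (it may already name a charset).
        if ($args === '' || strpos($args, ',') !== false) {
            return $m[0];
        }
        return "htmlentities($args, ENT_COMPAT, get_string('thischarset', 'moodle'))";
    }, $source);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $source;
}

echo add_charset_to_htmlentities('echo htmlentities(trim($name));'), "\n";
```

As the post says, a mechanical rewrite like this only touches htmlentities(); the other encoding issues still need the full patch.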
In reply to Haruhiko Okumura

Re: Japanese Language Compliance - New Code for 1.4.x

by Timothy Takemoto -

Dear Professor Okumura
Thank you. I understand. Prof. Kita's script is just a quick way of changing the code.

Dear Developers,

Will the script be applied to the moodle code?

I don't think that it will cure the email problem, but it will solve some issues such as, perhaps, the one with the current wiki.

Since I do not have server access I cannot apply Dr. Kita's patch automatically, and it is too long for me to attempt by hand. I would be sure to miss a ";" somewhere and anyway I would have to do it each time I upgrade moodle.

Importantly, apparently none of the changes would affect moodle in other languages. Nor would they be solved by the move to unicode. 

'My hand is sticking out of my throat' for the functionality: a non-garbling moodle.

Timothy

In reply to Timothy Takemoto

Re: Japanese Language Compliance - New Code for 1.4.x

by Toshihiro KITA -
I believe the htmlentities() modification will be essential for a UTF-8 Moodle if the developers want to avoid corrupted characters. So I do not think it will be long before every htmlentities() call specifies the charset.

The e-mail encoding issue is a bit different, I think. On this issue I hope UTF-8-capable mailers will become so common in Japan that we no longer need to care about converting to JIS. (Some think this is already the case, and it is not impossible to require students to use UTF-8-capable mailers if they want to join your course.)
But for several years from now, it is better to embed e-mail charset conversion to JIS in Moodle, as my patch does.

# The Excel file issue would be fixed if I could modify Kashiwagi's patch to apply to the current version of Moodle. I need some more time for it.

BTW, Takemoto-sensei, please tell me about the environment where you ordinarily hand-patch your Moodle and where you run it.
I can make a patched Moodle or provide some easy way to semi-automatically patch for you (and maybe also for others).

In reply to Toshihiro KITA

Re: Japanese Language Compliance - New Code for 1.4.x

by Paul Shew -
Kita-sensei, I think you're right about Moodle's conversion to Unicode. It will solve a lot of problems for Japanese, but the email and Excel encoding are different matters.

Until these issues are addressed in the standard Moodle package, I think that Japanese Moodle users need a patched version of Moodle for every major release. I know that Mitsuhiro Yoshida-san has made patches for most of the encoding problems and so have others. Rather than duplicating efforts, and making everyone apply patches on their own (which some people have a very hard time with), I think it would be very beneficial to make a patched version that could be downloaded from the Japanese users forum or even at Moodle.jp.

Unfortunately, Moodle still needs more than simple language translation to make it compatible with Japanese, and until those changes are made, Moodle's potential in Japan is limited.

Paul.

In reply to Paul Shew

Re: Japanese Language Compliance - New Code for 1.4.x

by Timothy Takemoto -

Dear Paul,

Bearing in mind the fact that the proposed changes do not (so I am told) affect other languages, it seems to me that it would be appropriate for the stable branch of moodle to be patched. (Assuming that anyone has the time and generosity to do so). I would be very grateful.

As far as I know patch files can be run from the command line to change everything automatically. But I do not have command line access.

At the same time, the suggested patches are mainly or only of use to Japanese language moodle users. So perhaps you are right, for grace and simplicity, the main stable branch of moodle should not be patched to suit the needs of Japanese?

Japanese is the third most used language on the internet. It is a massive market. The Japanese are pants at software (other than game software).

Today I spoke to one of my bosses for the first time in a long time. He is the guy that spent the university budget on getting a programmer to write another moodle. He took the piss out of the name "moodle" (saying it sounded like "noodle" - as in "instant noodles") and closed by saying that I should purchase something from a Japanese company. Frigbat. Ardvark. Why would anyone want to dance with people that say things like that? Well...not all do.

I dream of non-garbling mail. I will try the patch manually soon.

Timothy

In reply to Timothy Takemoto

Re: Japanese Language Compliance - New Code for 1.4.x

by Toshihiro KITA -
I cannot picture your Moodle environment very well, but
maybe you extract all the Moodle files on your PC and upload them to a server, right?  If so, I will make a zip file of patched Moodle for you, though I am a bit worried that someone might be unhappy to see a variant of the official Moodle package.

In reply to Toshihiro KITA

Re: Japanese Language Compliance - New Code for 1.4.x

by Timothy Takemoto -

Thank you very much. You are right. I unzip moodle and then use FTP to upload.

I don't think that there is any problem, under the licence, of making a variant of the official Moodle package, so long as we keep the licence and the copyright notices intact.

Timothy Takemoto

In reply to Toshihiro KITA

Re: Japanese Language Compliance - New Code for 1.4.x

by Paul Shew -
Kita-sensei,

I think it would be great if you can create a patched version. There should be no problem at all with the license, since Moodle is GPL. That's the beauty of the GPL! Plus, we're not proposing the creation of a fork, just a patched version for Japanese language compliance. I'm sure that others would be willing to help too.

Here's a few of my quick thoughts on what would be nice to include in the "Japanese Compatible" patched version:
HTML entities
Email
Excel
Bug #4132
Bug #4156 (include font)

Tim, after the patched version is created, can we include it in the files section of the Japanese forum? Then, can we create a new block on the Japanese forum main page with a link to download it?

In reply to Paul Shew

Re: Japanese Language Compliance - New Code for 1.4.x

by Timothy Takemoto -

Dear Paul,

Thank you very much for taking this up. I can feel resolution --  an end to garbled moodle -- in the air.

It is no problem for me to put the patched moodle in the file space of the Japanese course, but the course is (like this one) on Martin's server so I am not sure how he will feel about it. Martin?

Mitsu wrote to me recently, thanks to you, mentioning your suggestion of moodle.jp or perhaps his site.

But...

I am still failing to understand why the stable moodle branch can not be patched (why there needs to be two moodles). These patches do not affect the functionality of Moodle in any other language.

They do add to the download size (if the fonts are included) but as far as I know, moodle is distributed ready to run in many languages as part of moodle policy. Most of the changes would hardly add to the file size at all. IMHO the patch should be applied to moodle stable.

Timothy

In reply to Timothy Takemoto

Re: Japanese Language Compliance - New Code for 1.4.x

by Paul Shew -
Yes, most things should be in the main branch. I think that full Unicode support in 1.6 will clear up most problems for Japanese unicode sites. But two issues unique to Japan will remain: Email text conversion and Excel file conversion.

I don't know if Martin is willing to make a special exception for converting email to ISO-2022-JP for Japanese. And we need a similar special exception for converting text in the export routines for downloading the grades as an Excel file.

If Martin is willing to incorporate these necessary special considerations for Japanese into the main branch then great. But if not, then we need to make a separate patched distribution for Japanese.

Frankly, since the code has already been developed by various people, I think we should create a patched version of 1.5.2+ now, and immediately make it available through the Japanese forum.
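
As a rough illustration of the email conversion discussed above, here is a minimal PHP sketch. It is not taken from any of the patches mentioned in this thread; convert_from_utf8() and build_jis_mail() are hypothetical names, and the sketch assumes that either the mbstring or the iconv extension is available and that the site stores its text as UTF-8.

```php
<?php
// Sketch only: convert a UTF-8 string to another encoding before it leaves the server.
// convert_from_utf8() and build_jis_mail() are hypothetical names, not Moodle functions.
function convert_from_utf8(string $text, string $target): string {
    // Prefer mbstring when it is loaded; otherwise fall back to iconv.
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($text, $target, 'UTF-8');
    }
    $converted = iconv('UTF-8', $target, $text);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert text to $target");
    }
    return $converted;
}

// Build the body and the matching MIME headers for a plain-text Japanese mail.
function build_jis_mail(string $utf8_body): array {
    return [
        'headers' => [
            // ISO-2022-JP is a 7-bit encoding, so the body can be sent as 7bit.
            'Content-Type: text/plain; charset=ISO-2022-JP',
            'Content-Transfer-Encoding: 7bit',
        ],
        'body' => convert_from_utf8($utf8_body, 'ISO-2022-JP'),
    ];
}

$mail = build_jis_mail("日本語のメール本文");
echo implode("\n", $mail['headers']), "\n";
```

Headers that carry Japanese text, such as Subject, would need MIME header encoding as well, which this sketch leaves out. The same helper could in principle be pointed at whatever encoding the Excel export needs, but that target encoding is not specified in this thread.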
In reply to Paul Shew

Re: Japanese Language Compliance - New Code for 1.4.x

by Timothy Takemoto -
Dear Paul

Oops, sorry, you replied to my earlier question as I was reiterating it above.

As far as I know, the patch to convert email to ISO-2022-JP is quite small in comparison with the size of moodle. But perhaps there are other reasons why the patch should not be incorporated into the main body of the code. I suppose the code might become riddled with
if ($CFG->locale == 'ja_utf8') {
...
}
...
if ($CFG->locale == 'xyz') {
...
}

I wonder what moodle / Martin's policy is on this issue.

Anyway, I am really grateful to hear that I may be in receipt of a patch (both Mitsu and Professor Kita have offered to make me one) but also thinking of
1) Everyone else
2) The problem of having to keep patching new versions
3) The proliferation of localised versions of moodle on other sites
I wonder if a Japan-localised fork is really the way to go.

Tim
In reply to Timothy Takemoto

Incorporating Japanese compatibility in 1.6

by Paul Shew -
Yes I think these definitely can and should go into the main 1.6 release. These are long-term solutions for long-term problems, and it does not create problems for other languages as far as I know. Martin and/or Eloy, can you please comment on incorporating these changes into 1.6?

But it may be a while before 1.6 is released, so I'm proposing that we make available a patched version of 1.5.2+ ASAP as an interim solution.

In reply to Toshihiro KITA

Re: Japanese Language Compliance - New Code for 1.4.x

by Timothy Takemoto -

Dear Dr. Kita,

Thank you very much for your response.

I hope that the developers agree that there is a need to change html entities.
At the same time, I don't think that I am having a problem with htmlentities, other than in the wiki, which I am hoping will be replaced soon with the new DFwiki.

Email encoding and Excel files are the main problems I have.

Excel files: Teachers want to be able to download grades. They do not find it easy to convert the text file from UTF8 to JIS. I have to do it for all 20 or so classes. Some teachers want a copy of the file part way through the term.

Email: Email is the thing that causes ulcers. The special virus proof, 'deliberately low technology,' email client installed on my university's computers, MaiYU, is not UTF8 compatible. I am not sure when they will change but as you say it will be years before there is complete UTF8 compatibility. When one has several hundred students enrolled (e.g. about 600) then if only one in 10 is using an email client that does not cope with UTF8 then that is a lot of complaints. In my case about 50% of the students are using the university's computers and getting garbled mail! Ulcers! I get around the situation by writing forum posts in English and then Japanese so that at least the students can see the English. I can do this since I am mainly teaching English. But moodle is just NOT going to work for normal courses taught in Japanese.

I have several moodles but the main ones are 1.5.2 on FreeBSD and MySQL. I have FTP and SAMBA access.

If you would be so kind as to make a patched moodle that would be great.

But at the same time, I have to patch my moodle for other reasons (the Japanese privacy laws included) so I am really keen that the main branch of moodle become Japanese language tolerant.

Perhaps there might be enough people to put some money together. I am in debt recently but if it were not too much and there were enough of us that wanted a change...

Timothy

In reply to Timothy Takemoto

Re: Japanese Language Compliance - New Code for 1.4.x

by Toshihiro KITA -
I see your situation a bit more clearly. Thank you.
MaiYU will hopefully become UTF8 compatible if many people request it from the author at Yamaguchi University. Recently many applications send e-mails in UTF8; so do WebCT and others.

I guess you use Windows XP to unzip your Moodle, so I will make some semi-automatic batch file to patch the Moodle files in your PC (within a day or two, hopefully...)


In reply to Toshihiro KITA

Re: Japanese Language Compliance - New Code for 1.4.x

by Timothy Takemoto -

Professor Kita

Thank you very much indeed for your response.

I contacted the programmer of MaiYU (one of our staff) about a year ago, but as yet he has not made MaiYU UTF8 compatible.

I use Windows 2000 to unzip moodle and then upload it to a university server using ftp.

I hope that there can be some solution to the garbling problems. Thank you very much indeed for your help.

Timothy

In reply to Timothy Takemoto

Re: Japanese Language Compliance - New Code for 1.4.x

by Eloy Lafuente (stronk7) -
Hi,

interesting discussion about UTF-8 and its problems. As you are moodlers who have been running your servers under UTF-8 for some time, I think that all your experiences and ideas will be really important to improve the UTF-8 migration of Moodle.

Some days ago I started writing the "Migration to UTF-8" wiki page. It's unfinished for now (I hope to write all my initial thoughts ASAP, to discuss them properly) but perhaps it could be a good idea if you describe some of your well-known problems and how to solve them. The solution must work under UTF-8 for everybody, of course.

Perhaps, the Recoding PHP scripts page would be a good place.

TIA and ciao

P.S.: Just trying to get some free hours to finish the initial document. Please be patient...
In reply to Eloy Lafuente (stronk7)

Unicode implementation

by Paul Shew -
Eloy, Good to see the work you've been putting into this.

You should consider adding information about UTF-8 pitfalls or incompatibilities that we're going to run into. That's one of the main problems for Japanese. Unicode is still not widely supported by email clients in Japan, so Unicode sites need to convert all outgoing email into an email-compatible text encoding. Excel files downloaded from the gradebook suffer a similar problem. We're obviously very eager to have these problems addressed as soon as possible.

Really the underlying problem is that the desktop environment (MS Windows or Mac) is not Unicode based, so all text importing (like quiz questions) and exporting routines (email, gradebook, quiz export, etc) need to compensate for that. For languages like English, it may not matter much, but in Japanese it's a show stopper!

I suspect that other languages may run into similar problems.
In reply to Paul Shew

Re: Unicode implementation

by Tim Allen -
"I suspect that other languages may run into similar problems."

I am still testing these problems in Korean to be sure, but I am fairly sure that the same problems will occur.  I suspect that all multi-byte languages are in a similar situation. 
In reply to Tim Allen

Re: Unicode implementation

by Eloy Lafuente (stronk7) -
Hi,

in this discussion an alternative solution for the htmlentities() problem is being discussed. It seems that a simpler htmlspecialchars() will be enough.

There is one small suggested change in weblib.php to test if the solution can be applied without problems to the rest of Moodle.

As you are advanced non-ISO-8859-1 moodlers, it would be really amazing to get some feedback about how such a change works on your test servers.

Ciao

Edited: I've just discovered that BOTH discussions belong to the same forum so, obviously, you have received the other posts too. Sorry!
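
For reference, the htmlspecialchars() alternative mentioned in this post can be sketched as follows. This is a minimal illustration, not the actual change suggested for weblib.php, which is not shown in this thread. It assumes UTF-8 text. htmlspecialchars() only escapes a handful of ASCII markup characters (&, <, > and the double quote with ENT_COMPAT), so Japanese characters pass through unchanged.

```php
<?php
// UTF-8 text (assuming this file itself is saved as UTF-8).
$text = "日本語 & <b>太字</b>";

// Only the HTML markup characters are escaped; the Japanese text is untouched.
$escaped = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');

echo $escaped, "\n";
```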