Moving lang packs out of the main tree

by Martin Dougiamas -
Number of replies: 17
OK, in preparation for the migration to a Unicode Moodle we are going to simultaneously start maintaining UTF8 versions of all language packs alongside the normal ones (using upgraded admin/lang.php and admin/langdoc.php scripts that can edit both at once).

As this will nearly double the total size of the language packs it's time to start moving the languages out of the main CVS tree, before 1.6.

Here is the current plan:
  1. All the languages except "en" will be moved into a new CVS module: cvs:/lang parallel to the main Moodle module cvs:/moodle
  2. In Moodle, all non-en language packs will become part of your moodledata directory, in dataroot/lang.  Note that 1.5 already supports language packs in this location.
  3. The language configuration page will be extended to allow you to download and update these dataroot language packs directly from download.moodle.org. For those that don't have fopen($url) capability in their PHP there will be an alternate web form that leads you through downloading the files from moodle.org to your desktop, and then uploading via the web into dataroot.
  4. We'll also add support for xx_local language packs, which will be the preferred way of implementing local language tweaks. If xx_local packs are found, Moodle will use those in preference to xx (but xx will still be what appears in the menus etc)
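
The lookup order implied by points 2 and 4 can be sketched roughly as follows. This is only an illustration in Python, not Moodle code: the function name `find_lang_file`, the assumption that xx_local packs also live in dataroot/lang, and the final fallback to the bundled "en" pack are all assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def find_lang_file(dataroot: Path, dirroot: Path, lang: str, filename: str) -> Optional[Path]:
    """Return the first existing language file, preferring local tweaks.

    Hypothetical helper: the order is xx_local, then xx (both assumed to be
    in dataroot/lang), then the "en" pack shipped with the code (assumed).
    """
    candidates = [
        dataroot / "lang" / f"{lang}_local" / filename,  # site-specific tweaks
        dataroot / "lang" / lang / filename,             # downloaded pack
        dirroot / "lang" / "en" / filename,              # bundled English
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Note that the menus would still list "xx" in this scheme; only the file lookup consults xx_local first.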
People using CVS to maintain their sites will notice the languages disappear from their installation. All they need to do is go into their dataroot directory and issue:

cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/moodle co lang

The downside is that you will need to issue two CVS update statements now to upgrade a site via CVS. The upside is that there will only be one copy of the language packs and we won't have any more issues about HEAD vs STABLE. Translators will now be able to run a STABLE moodle and use it to translate the HEAD language packs.


Any comments?
In reply to Martin Dougiamas

Re: Moving lang packs out of the main tree

by Samuli Karevaara -
I think this is all great! While downloading the lang packs themselves is/was not much of an issue with gigabit connections, the CVS operations are. With CVS update I can bypass the lang packs, but with local customizations, merging etc. it always takes quite a while.

For the update needing two CVS commands: a shell script / batch file suits nicely.
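
Something along these lines, for example. It is a rough sketch in Python rather than a shell script; the helper names and the example paths are made up, and it assumes the lang module was checked out inside dataroot (so it sits in dataroot/lang) and that `cvs -z3 update -dP` is the update command you normally use.

```python
import subprocess
from pathlib import Path


def build_update_commands(moodle_dir, dataroot):
    """Return (argv, working_directory) pairs for the two CVS updates."""
    update = ["cvs", "-z3", "update", "-dP"]  # -d: new dirs, -P: prune empty ones
    return [
        (update, Path(moodle_dir)),          # main Moodle checkout
        (update, Path(dataroot) / "lang"),   # separate lang checkout (assumed location)
    ]


def run_updates(commands, dry_run=True):
    """Run each update in its own directory, or just print it when dry_run is set."""
    for argv, cwd in commands:
        if dry_run:
            print(f"(cd {cwd} && {' '.join(argv)})")
        else:
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Hypothetical paths; replace with your own before switching off dry_run.
    run_updates(build_update_commands("/var/www/moodle", "/var/moodledata"))
```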

For item three in your plan: isn't it the case that fopen($url) doesn't work in safe mode? fopen($url) is very handy, but would it be much of a hassle to do it with the FTP functions? Is download.moodle.org hosted on sf.net? Does sf.net (free, not premium) support anonymous FTP downloads? It might not...
In reply to Samuli Karevaara

Re: Moving lang packs out of the main tree

by Martin Dougiamas -
download.moodle.org is my own server so I could set that up, however I'm not very keen on running a public FTP server there.  I've had too many problems with them in the past.

From what I understand, FTP in PHP is not on by default for most systems anyway... fopen usually is unless someone has actually disabled it. I suppose we could always try both before falling back to manual. 
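
The "try both before falling back to manual" idea could look roughly like this. It is a Python sketch of the control flow only, not the PHP that would actually go into Moodle; `fetch_with_fallback` and the labels in the usage example are invented for illustration, and the fetchers are passed in so that nothing here depends on a real server.

```python
from typing import Callable, Iterable, Optional, Tuple

Fetcher = Callable[[str], bytes]


def fetch_with_fallback(
    pack_name: str, fetchers: Iterable[Tuple[str, Fetcher]]
) -> Tuple[Optional[str], Optional[bytes]]:
    """Try each download method in order and return (label, data) for the first success.

    Returns (None, None) when every method fails, which is the signal to show
    the manual download-and-upload form instead.
    """
    for label, fetch in fetchers:
        try:
            return label, fetch(pack_name)
        except Exception:
            # Any failure (method disabled, network error) just moves on
            # to the next method in the list.
            continue
    return None, None
```

A caller would pass something like `[("http", fetch_via_http), ("ftp", fetch_via_ftp)]`, where both functions are hypothetical, and show the manual form when the returned label is `None`.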
In reply to Martin Dougiamas

Re: Moving lang packs out of the main tree

by Ralf Hilgenstock -
Hi Martin,

There is a problem in the install process, and I think we need a second step. Today, non-English users can choose the language of the install process. If the lang files are not part of the package, they can only install in English.
I think there are two options:
1. the language files for the installation are part of the Moodle package, or
2. the better way: in the first step they choose the language packages for the installation/the Moodle system, and then the install process goes on.
But there will be one problem: at this step we have no config.php file.

I don't know which way is best.
Ralf
In reply to Ralf Hilgenstock

Re: Moving lang packs out of the main tree

by Samuli Karevaara -
Moving the lang pack out of CVS main doesn't necessarily imply that they won't be part of the download package. There might be an "all languages" download package anyway.

I'm not sure why you mentioned that, in your solution number two, it's a problem that there is no config.php file at this step. There is no config.php file now either while the user is doing the installation and entering database info etc.; it's created at the end of that phase. It could just as well be created after the language selection.
In reply to Samuli Karevaara

Re: Moving lang packs out of the main tree

by Ralf Hilgenstock -
Hi Samuli,
in Martin's solution the lang files go into the data folder. That is no longer the lang subfolder of the moodle folder. We don't know where a user places his data folder and the lang files.
Today you can choose the language for the installation on the first install screen. At that moment no data folder has been defined yet, so we have no place to put the lang folders of the selected language. The data folder is only defined in a later step of the process, when config.php is edited.
If we add the lang files to the data folder, we must define the location of the data folder while unpacking the all-languages download package.
If you run the installation procedure in English there is no problem, but if you want the installation process in German, French or any other language, you must solve this problem.
Ralf
In reply to Ralf Hilgenstock

Re: Moving lang packs out of the main tree

by Martin Dougiamas -
Ralf, you're completely right, thanks! I'd forgotten about the install process!

I think we might have to have a special set of files outside of the lang directory (perhaps in /install) just for the install.php files that we include in the distribution.

The installation script should then attempt to download the FULL language pack for the chosen install language from moodle.org automatically (or provide instructions for it).

(Edit: Ah, I just noticed Koen's earlier post where he says the same thing :-)
In reply to Martin Dougiamas

Re: Moving lang packs out of the main tree

by David Mudrák -
Hi Martin
I welcome these changes, as CVS commits of lang packs will be much easier. I just want to be sure: current translators will have the same CVS permissions as they have now, but in the cvs:/lang module. Right?
And another question: if I run STABLE moodle, I will have "en" STABLE (not HEAD) too, won't I? This would be great as I could focus on "STABLE translation".
And the last one: will you tag cvs:/lang during releases? Can I tag my language pack myself with some reasonable tags?

Thank you not only for this but also for all the time you give to Moodle ;-)
In reply to David Mudrák

Re: Moving lang packs out of the main tree

by Martin Dougiamas -
There shouldn't be any need to tag cvs:/lang since language packs should work on any version of Moodle. If you want to put tags in your own language pack to remember milestones, though, you are most welcome!

And yes, "en" will be closely tied to the code, so stable Moodle means stable "en".
In reply to Martin Dougiamas

Re: Moving lang packs out of the main tree

by Chardelle Busch -
You know I'm all for this--yeah!!! I was wondering if, while you are doing all this, you could keep in mind if this might be something to also do with themes. Maybe only have the standard theme in the core?

And, while I'm here, could someone explain what UTF8 is?
In reply to Chardelle Busch

Re: Moving lang packs out of the main tree

by koen roggemans -
Since computers only know about 0 and 1 and are good at grouping those two and counting with them, they obviously don't know anything about characters (letters). So in order to make writing possible with a computer, every character is represented by a group of 0s and 1s. "Back in the old days", when memory and storage were rare, small and extremely expensive, a character was symbolised by one byte (a set of 8 zeros and ones), of which ASCII uses only 7 bits, which gives the possibility to show only 128 characters. This is enough for Latin characters, but not for Hebrew, Arabic, Farsi, ... at the same time. So those languages received their own character sets: you tell the computer not to use the Latin translation for a range of numbers, but to show different characters FOR THE SAME NUMBER. And that's where the problem starts: if you want to mix languages with characters from different character sets, it's very difficult.

And that's where UTF-8 comes in (there's also a 16-bit version). The length is not limited to 8 bits, but can be 8, 16, 24 or 32 bits, depending on the character.
For ASCII, only 8 bits are used; for other characters, it can be more. For a lot of Asian languages, 24 or 32 bits are necessary, which causes files to be a lot bigger.
Every language which needs it has its own block of numbers to assign characters to, so specifying the language code is not necessary anymore if you know it's UTF-8: every character has its own unique number.
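
To make the variable length concrete, here is a small Python sketch. The sample characters are my own choice and `utf8_bits` is just a helper name for this example; it prints each character's unique number and how many bits its UTF-8 encoding takes.

```python
def utf8_bits(char: str) -> int:
    """Number of bits the UTF-8 encoding of a single character occupies."""
    return len(char.encode("utf-8")) * 8


# One sample each: ASCII, accented Latin, Hebrew, Chinese, emoji.
samples = ["A", "é", "א", "中", "😀"]

for char in samples:
    # ord() gives the unique number (code point) assigned to the character.
    print(f"U+{ord(char):04X} {char!r}: {utf8_bits(char)} bits")
```

The ASCII letter stays at one byte, so plain English text is the same size in UTF-8 as in ASCII.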

This is why it is so good to use UTF8 for multilingual websites, like a moodle site can be. You might have noticed the difficulties in maintaining the langlist on Moodle.org. Bad browsers (guess which one) give a lot of errors on it, not understanding the code to show the different characters all together.

Of course, a UTF8 site can only be shown with a good font, containing a lot of number-to-character translations. Not all fonts are complete. Some languages (like Khmer and Inuit) are not included in general fontsets, like Trebuchet, and require a specially designed font that includes characters for that specific block. I think it is a matter of time before there will be a really complete fontset, but that will be very big.

In reply to koen roggemans

Re: Moving lang packs out of the main tree

by Chardelle Busch -
Okay, thanks so much guys, now I understand, it only affects the way Moodle handles those languages and doesn't have anything to do with the html editor. So, if I am using an english-based language it doesn't affect me as the html editor will still be able to show symbols, etc..
In reply to Chardelle Busch

Re: Moving lang packs out of the main tree

by koen roggemans -
Yes you can. The insert-character symbols are partly from the ASCII table (127 characters, shown here from char 33, "!") and partly from an extended ASCII table (127 characters), which starts here with . Those characters keep their code in UTF-8.

The extended table could be a problem. There are a few different extended sets. At first sight this one is ANSI with a modification for the Euro. This could cause a problem when converting difficult characters like the Euro to Unicode. I'm not sure, but it can only affect those weird character(s).
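
A quick Python sketch of why the Euro is awkward: the same byte means different things in different extended sets, so a conversion to UTF-8 has to know which set the text was written in. The three code pages are chosen for illustration (I have not checked which one the editor actually uses), and `to_utf8` is a made-up helper name.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known legacy character set into UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Windows-1252 ("ANSI") puts the Euro sign at byte 0x80 ...
euro_cp1252 = b"\x80".decode("cp1252")
# ... while ISO-8859-15 puts it at 0xA4, where ISO-8859-1 has the currency sign.
euro_latin9 = b"\xa4".decode("iso8859-15")
currency_latin1 = b"\xa4".decode("latin-1")

print(euro_cp1252, euro_latin9, currency_latin1)
print(to_utf8(b"\x80", "cp1252"))
```

Plain ASCII bytes convert unchanged under all three, which is why only the "weird" characters are at risk.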
In reply to Martin Dougiamas

Re: Moving lang packs out of the main tree

by koen roggemans -
Looks like a fine solution. It is starting to get weird now that a lot of languages are offered in both a UTF-8 and a locally encoded version. The language packs are getting extremely large too.

I do second Ralf's remark on the installation issue. I think, for new moodlers, it is a fine start to have the installation in their own language. You might make an exception for install.php and keep the translated versions somewhere in the distribution. It can't be that hard to move them at release time from the language pack to the distribution.
During the installation, the proper language packs could be selected and downloaded separately.
In reply to Martin Dougiamas

CVS access to "install/lang/xx/"?

by Nicolas Martignoni -
Martin,

As I see in the CVS, the up-to-date install.php lang file is now in the "install/lang/XX/" directory.

Could you please give the "official" translators commit rights to this directory?
In reply to Martin Dougiamas

Re: Moving lang packs out of the main tree

by Martín Langhoff -

Catching up late with this... (thanks to Penny for pointing it out).

I'm divided about this. On one hand, it makes sense to move some langs out of the main tree to cope with the limitations CVS imposes. Of course, CVS imposes other limitations too, and I'm more keen on getting rid of CVS than of lang ;-)

That's a much harder task, so I have to agree: moving lang out of CVS can help us get through in the mid-term.

A bit of an aside follows:

I have to confess, however, that I created a couple of scripts to update/commit all-of-moodle-but-non-en-langpacks and my langpack woes have stopped 100%.

I'm convinced that all of the issues reported against having all languages in the core project can be worked around easily with all but the most broken CVS client programs. Anyone keen on exploring that track? I am quite familiar with most cli and gui cvs clients, and can lend a hand if someone volunteers to do the writing.

As it stands, we will need a little "howto" for langpack maintainers on how to deal with the separate directory, etc. Why not spend that same effort in teaching them how to use their cvs client program effectively with the current arrangement? We can even do smart things with cvswrappers to prevent frequent mistakes (commits to lang on a branch, for instance).

One part I am not convinced about is having languages in moodledata. It does make sense to have it in a separate directory from the "en" that remains in core, so as to make it easier for people to do a nested checkout (cvs, for all its warts, supports nested checkouts), and generally being able to update the langpacks without too much hassle.

Putting them in moodledata means that they are writable by apache. Langpacks are being interpreted as PHP -- having them writable by the webserver means paving the way for security breaches.
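
As a rough way to audit this on an existing site, something like the following Python sketch could list the language files that are writable by more than their owner. The function name is made up, and the check is deliberately simplistic: it only looks at group/other write bits, so it will not flag files that are owned by the web server user itself, which is exactly the moodledata case.

```python
import stat
from pathlib import Path


def writable_lang_files(lang_dir):
    """Return .php files under lang_dir that are group- or world-writable."""
    risky = []
    for path in sorted(Path(lang_dir).rglob("*.php")):
        mode = path.stat().st_mode
        # Flag files that anyone other than the owner may modify.
        if mode & (stat.S_IWGRP | stat.S_IWOTH):
            risky.append(path)
    return risky
```

A fuller check would also compare each file's owner against the account the web server runs as.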