Plural forms of messages - should we fix the code?

by Tomaž Savodnik -
Number of replies: 9

Martin once wrote that Moodle's string mechanism is more flexible than gettext, so I guess plurals should be handled correctly, just as PHP's ngettext would handle them if used strictly enough. (I know this is not the case, at least the second part, so this is more or less rhetorical.)

In addition to singular and plural, Slovenian (and a few other languages I know of) has a separate dual form, and even the plural forms differ depending on "n".

Has anybody faced such issues translating Moodle and managed to solve them?

Martin, will you address this issue in code or is there any workaround? I've been playing with filters but it just doesn't make sense.

// What follows just illustrates the issue

Let's look at message.php:

$string['messages'] = 'Messages'; or
$string['readmessages'] = '$a read messages';

Those strings work in English for 0, 2 or more messages. In Slovenian we would need different forms for:

n=1 sporočilo
n=2 sporočili
n=3 or 4 sporočila
n=0 or n>4 sporočil

with the extra rule that n should be treated MOD 100. This basically means that we have a broken translation for at least 3 values of n (2, 3 and 4) that are quite frequent.

This could be defined as

"Plural-Forms: nplurals=4; \
    plural=(n%100==1 ? 1 : n%100==2 ? 2 : n%100==3 || n%100==4 ? 3 : 0);\n"


in GNU gettext and used with ngettext, but then the application and its messages have to be designed accordingly.
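
The rule in the Plural-Forms header above can also be written as a plain PHP function. This is only a sketch: the function name `slovenian_plural_index()` and the `$forms` array are made up for illustration and are not part of Moodle or gettext. The index numbering follows the header as posted, so index 0 is the "sporočil" form.

```php
<?php
// Hypothetical helper, not a Moodle or gettext function.
// Returns the plural form index for Slovenian using the rule from the
// Plural-Forms header above (n is reduced modulo 100 first).
function slovenian_plural_index(int $n): int {
    $r = $n % 100;
    if ($r == 1) {
        return 1;           // sporočilo
    }
    if ($r == 2) {
        return 2;           // sporočili
    }
    if ($r == 3 || $r == 4) {
        return 3;           // sporočila
    }
    return 0;               // sporočil
}

// Forms listed in the same index order as the rule above.
$forms = ['sporočil', 'sporočilo', 'sporočili', 'sporočila'];

foreach ([0, 1, 2, 3, 5, 101, 102] as $n) {
    echo $n . ' ' . $forms[slovenian_plural_index($n)] . "\n";
}
```

Because of the MOD 100 step, 101 and 102 pick the same forms as 1 and 2.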

In reply to Tomaž Savodnik

Re: Plural forms of messages - should we fix the code?

by Martin Dougiamas -
Can we just get Slovenian changed? ;-) Just joking!

I'm sure we can come up with some sort of special string format to account for this, with a moodlelib function to process it. Perhaps:

   $string['messages'] = '0:sporočil;1:sporočilo;2:sporočili;3:sporočila;4:sporočila;n:sporočil';
   $string['messages'] = '1:message;n:messages';

and

   get_string_plural($stringcode, $module, $count);
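
A minimal sketch of how such a string could be processed. A real `get_string_plural()` would look the string up via `$stringcode` and `$module`; the hypothetical helper `pick_plural_form()` below takes the raw string directly so that the example is self-contained. The parsing rules (exact-count keys, with `n` as the fallback) are one possible reading of the format above, not an existing moodlelib function.

```php
<?php
// Hypothetical helper illustrating the proposed "count:form;...;n:form" format.
// It is not an existing moodlelib function.
function pick_plural_form(string $spec, int $count): string {
    $forms = [];
    foreach (explode(';', $spec) as $pair) {
        // Split only on the first colon so the form itself may contain colons.
        $parts = explode(':', $pair, 2);
        if (count($parts) == 2) {
            $forms[trim($parts[0])] = $parts[1];
        }
    }
    $key = (string) $count;
    if (isset($forms[$key])) {
        return $forms[$key];          // exact match for this count
    }
    // Fall back to the generic "n" form, or to the raw string if there is none.
    return isset($forms['n']) ? $forms['n'] : $spec;
}

$sl = '0:sporočil;1:sporočilo;2:sporočili;3:sporočila;4:sporočila;n:sporočil';
$en = '1:message;n:messages';

echo pick_plural_form($sl, 2) . "\n";   // sporočili
echo pick_plural_form($en, 1) . "\n";   // message
echo pick_plural_form($en, 7) . "\n";   // messages
```

Note that exact-count keys alone do not cover the MOD 100 rule from the first post: with this sketch, 102 falls through to the `n` form.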


What other such "untranslatable" issues do languages have?
In reply to Martin Dougiamas

Re: Plural forms of messages - should we fix the code?

by Samuli Karevaara -
I reported this in the bug forum and it's fixed (in places), but for the record: in Finnish you can't use the same word "students" when you say "5 students". "Students" would be "oppilaat" and "5 students" would be "5 oppilasta".

There are similar difficulties in other places too, when sentences are formed from separate pieces. For example, "to 5 something" would be "5:een johonkin" and "from 5 something" would be "5:stä jostakin".
In reply to Samuli Karevaara

Re: Plural forms of messages - should we fix the code?

by John Papaioannou -
A variation of the second issue exists in Greek and in German as well (excuse me if I'm talking nonsense, it's been a long time). ;-)

If you want to say "5 students" as a title, or e.g. "showing 5 students", you'd use the word "μαθητές". However, if you want to denote e.g. something belonging to 5 students, as in "5 students' grades", you'd say "... 5 μαθητών". This can probably happen a lot more easily in practice than my engineered example may suggest.

Edit: Of course it happens more easily, I forgot the most blatant example! If you want to say "January", it's "Ιανουάριος". However, if you want to say "January 5th" then this is thought of as "the 5th day belonging to January" and is written "5 Ιανουαρίου". This means that, for example, all date selection dropdowns in Moodle don't display "correctly" in Greek. Further, since you may need either the bare "January" string or the bare "(belonging to) January" string (to display in a month selection dropdown, because the day goes in a different field!), there would need to be two different strings. I really can't think of any way to do that correctly cross-language. :-/

The masculine/feminine issue isn't as important. In spoken Greek you would of course use the appropriate form, but in "canned" texts (e.g. forms to fill in or computer programs) you have two choices: either always use the masculine (a very common compromise) or else provide endings for both genders in every instance of the word, like this: "5 μαθητές/τριες" (μαθητές/μαθήτριες). Needless to say this quickly becomes very tiring to read, so I wouldn't suggest it.
In reply to Martin Dougiamas

Re: Plural forms of messages - should we fix the code?

by N Hansen -
Another one I can think of would be gender. Even though you can specify the word for "teacher" and "teachers" or "student" and "students," what do you do with a language that has different words for these depending on the gender?

Case endings might differ too depending on the grammatical construction a word appears in, which I presume is what Samuli is referring to.
In reply to Martin Dougiamas

Re: Plural forms of messages - should we fix the code?

by Tomaž Savodnik -

The syntax you suggest would probably solve most plural issues, and it might be easier than changing the language itself :-)

Nice reading about "plurals" can be found at http://www.gnu.org/software/gettext/manual/html_node/gettext_150.html

There might be a small issue when users enter their own word for e.g. "course": they would have to enter the full set of plural forms.

Gender might be an issue not just for (he/she) "teacher" but also for the different words used for course or (most notably) for addressing the student. While in English "Hello" works for both Martin (he) and Marina (she), in Slovenian different "Hello" strings should be used. To "hide" such things, these strings are usually translated in the plural (like "Sie" and "Du" in German), but as in German you would still need to distinguish between a "he" teacher (Lehrer) and a "she" teacher (Lehrerin).

Stem-changing verbs or adjective endings could be causing problems too, at least in the German translation, but I'll let the Germans speak for themselves :-)

There might be more cases like "Add " + "forum", "Adding " + "forum" and "Read more in " + "forum". It all works in English, but in Slovenian you would need 3 different forms of "forum" (in this case "forum", "foruma" and "forumu").

Basically this means that concatenating strings into messages should be avoided, and I guess other languages might have the same issue. (To avoid Slovenia taking all the blame :-) )
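
To make the point concrete, here is a small sketch in which every message is stored as one whole string per language instead of being glued together from pieces. The string keys (`addforum`, `addingforum`, `readmoreinforum`), the helper `whole_string()` and the Slovenian wording around the noun are assumptions for illustration only; just the three noun forms "forum", "foruma" and "forumu" come from the example above.

```php
<?php
// Illustrative language packs; the keys and the Slovenian verb wording
// are assumptions, only the noun forms come from the post above.
$lang = [
    'en' => [
        'addforum'        => 'Add forum',
        'addingforum'     => 'Adding forum',
        'readmoreinforum' => 'Read more in forum',
    ],
    'sl' => [
        'addforum'        => 'Dodaj forum',
        'addingforum'     => 'Dodajanje foruma',
        'readmoreinforum' => 'Preberi več v forumu',
    ],
];

// Hypothetical lookup, standing in for a get_string()-style call.
function whole_string(array $lang, string $code, string $key): string {
    return $lang[$code][$key];
}

foreach (['addforum', 'addingforum', 'readmoreinforum'] as $key) {
    echo whole_string($lang, 'sl', $key) . "\n";
}
```

With whole strings the translator chooses the noun form per message, which concatenation of "Add " + "forum" cannot offer.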

In reply to Tomaž Savodnik

Re: Plural forms of messages - should we fix the code?

by Martin Dougiamas -
Yes, I'm aware of the concatenating strings issue ... I don't think there's anywhere in Moodle where it happens (at least not my bits of it).

The gender issue, phew, my head hurts ... :-/
In reply to Martin Dougiamas

Re: Plural forms of messages - should we fix the code?

by koen roggemans -
To solve that, the only solution I see is to store the gender as a user field and use that as well to figure out which string to take. Looks like using a cannon to kill a mosquito.
We could do a pre-check of whether the current language needs this.
Another problem is that some strings will exist in only one or two languages and, for example, not in the English parent language, which is supposed to contain every string.

In reply to koen roggemans

Re: Plural forms of messages - should we fix the code?

by Robert Brenstein -
Maybe a solution similar to the one for numbers can be used for gender and case variations, or maybe that solution can be extended to handle more than just quantity variations. You guys just need to agree on a set of keys (or rather a key scheme) that indicate the count, gender, and case combination, with a default value being picked if no matching key is found. I don't think you would need to predefine all possibilities as long as the scheme is flexible enough to accommodate variations. The specific combinations can probably be added on an as-needed basis, and only in languages that need them.

And this approach could be backwards compatible, methinks.
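
One way to read this proposal, as a rough sketch: each string carries variants keyed by an attribute combination such as `gender=f` or `count=2,gender=f`, and the lookup falls back to a `default` entry when no key matches. The key syntax and the function name `pick_variant()` are hypothetical; the Lehrer/Lehrerin pair is the German example mentioned earlier in the thread.

```php
<?php
// Hypothetical variant lookup; the key scheme is an assumption for illustration.
// $variants maps keys like "gender=f" or "count=2,gender=f" to a string form,
// plus a "default" entry used when nothing more specific matches.
function pick_variant(array $variants, array $attrs): string {
    ksort($attrs);                       // fixed attribute order inside keys
    $pairs = [];
    foreach ($attrs as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    // Most specific first: all attributes together, then each one alone.
    $candidates = array_merge([implode(',', $pairs)], $pairs, ['default']);
    foreach ($candidates as $key) {
        if (isset($variants[$key])) {
            return $variants[$key];
        }
    }
    return '';                           // no default defined
}

$teacher = [
    'default'  => 'Lehrer',
    'gender=f' => 'Lehrerin',
];

echo pick_variant($teacher, ['gender' => 'f']) . "\n";               // Lehrerin
echo pick_variant($teacher, ['gender' => 'm']) . "\n";               // Lehrer
echo pick_variant($teacher, ['count' => 1, 'gender' => 'f']) . "\n"; // Lehrerin
```

An existing plain string can be treated as a variants array with only a `default` entry, which is what would keep such a scheme backwards compatible.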

Restrictions:

The role names given by users when assigning instructors would have to be nominative, but I think that is the only way they are used anyway.

The customizable names for teachers and students would probably have to be predefined lists, so they are picked from popup menus rather than entered freely. But by now all variations are probably pretty much known (or can be collected on moodle.org).