Wrong words (eg. "++") are matched as CamelCase in spite of "Disable CamelCase Linking"

by Tatsuya Shirai

Hi, Moodlers.

I have run into two strange behaviours while using ewiki on Moodle 1.8. With "Disable CamelCase Linking" checked on an ewiki (Moodle 1.8, 1.7, ...), CamelCase words (e.g. UsingMoodle, WikiModule) are no longer matched as CamelCase, as expected. On the other hand, unexpected strings (e.g. a++, a]+[) are now matched as if they were CamelCase words.

For example:
      for ($i = 0; $i < 100; $i++)    : HTML editor
    ->  for ($i = 0; $i < 100; $i++?)  : View page (ewiki)

'++' is turned into a link to a new page, which I do not expect.

The second problem is as follows.

 For example, where a[] is an array:
        a[1]   : HTML editor
    ->  a1?    : View page (ewiki) : I want to avoid this.
        a![1]  : HTML editor
    ->  a[1]   : View page (ewiki) : That's right.
        a[]    : HTML editor
    ->  a?     : View page (ewiki) : Oh?
        a![]   : HTML editor
    ->  a![]   : View page (ewiki) : Oh no!
        a![ ]  : HTML editor
    ->  a[ ]   : View page (ewiki) : Hmm... OK, perhaps?


(How to correct these problems?)

The cause of these problems is the following code in mod/wiki/ewiki/ewiki.php:

    if ($moodle_disable_camel_case) {
        define("EWIKI_CHARS_L", "");
        define("EWIKI_CHARS_U", "");
    }  else {
        define("EWIKI_CHARS_L", "a-z_??$\337-\377");
        define("EWIKI_CHARS_U", "A-Z0-9\300-\336");
    }

EWIKI_CHARS_L and EWIKI_CHARS_U are used to detect CamelCase words. These constants are inserted into regular expressions as follows:

        "wiki_link_regex" => "\007 [!~]?(
        \#?\[[^<>\[\]\n]+\] |
        \^[-".EWIKI_CHARS_U.EWIKI_CHARS_L."]{3,} |
        \b([\w]{3,}:)*([".EWIKI_CHARS_U."]+[".EWIKI_CHARS_L."]+){2,}\#?[\w\d]* |
        ([a-z]{2,9}://|mailto:)[^\s\[\]\'\"\)\,<]+ |
        \w[-_.+\w]+@(\w[-_\w]+[.])+\w{2,}   ) \007x",

and

    $value = preg_replace_callback("/((\w+:)?([".EWIKI_CHARS_U."]+[".EWIKI_CHARS_L."]+){2,}[\w\d]*)/", "ewiki_link_regex_callback", $value);

The most important part is ([".EWIKI_CHARS_U."]+[".EWIKI_CHARS_L."]+){2,}.
When "Disable CamelCase Linking" is on, both EWIKI_CHARS_L and EWIKI_CHARS_U become empty strings, so the regular expression becomes ([]+[]+){2,}. Square brackets (e.g. [A-Za-z0-9]) are the metacharacters that delimit a character class, and a character class is not allowed to be empty. PCRE therefore treats a ']' that comes immediately after the opening '[' as a literal character, so []+[] is read as a single character class containing ']', '+' and '[', and the trailing + is its quantifier. In other words, the pattern matches any run of two or more of the characters ']', '+' and '['. This is why the ewiki view page treats strings such as '++' or '[]' as CamelCase words even though "Disable CamelCase Linking" is on.
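
The effect can be checked outside Moodle with a small standalone script. This is only a sketch: camel_fragment() and first_match() are hypothetical helpers written for this post, not ewiki functions, and the "enabled" classes are simplified to ASCII only.

```php
<?php
// Build the CamelCase fragment the same way ewiki.php concatenates it.
// $upper and $lower are local stand-ins for EWIKI_CHARS_U / EWIKI_CHARS_L.
function camel_fragment(string $upper, string $lower): string
{
    return "([" . $upper . "]+[" . $lower . "]+){2,}";
}

// Return the first match of the fragment in $text, or null if there is none.
function first_match(string $fragment, string $text): ?string
{
    return preg_match("/" . $fragment . "/", $text, $m) === 1 ? $m[0] : null;
}

$disabled = camel_fragment("", "");            // becomes ([]+[]+){2,}
$enabled  = camel_fragment("A-Z0-9", "a-z_");  // simplified, ASCII-only classes

foreach (['$i++', 'a[]', 'a]+[', 'UsingMoodle'] as $text) {
    printf(
        "%-12s disabled: %-10s enabled: %s\n",
        $text,
        first_match($disabled, $text) ?? "(none)",
        first_match($enabled, $text) ?? "(none)"
    );
}
```

With the empty classes, the "disabled" column should show matches for the bracket and plus strings and none for UsingMoodle, which is the reverse of what the setting is meant to do.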



I propose a modification for this problem.

    if ($moodle_disable_camel_case) {
        define("EWIKI_CHARS_L", "\t");
        define("EWIKI_CHARS_U", "\r");
    }  else {
        define("EWIKI_CHARS_L", "a-z_??$\337-\377");
        define("EWIKI_CHARS_U", "A-Z0-9\300-\336");
    }

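This patch means that a "CamelCase word" now consists only of carriage returns and tab characters, for example '\r\t\r\t' or '\r\t\r\r\t' (the {2,} requires at least two carriage-return/tab groups, so a single '\r\t' does not match). This expression is safe because such sequences do not normally appear in plain text.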
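
As a quick check of the patched values, the CamelCase fragment can be tested on its own. This is a sketch under assumptions: local variables stand in for the EWIKI_CHARS_* constants, and only the fragment is tested, not the full wiki_link_regex.

```php
<?php
// Patched values for the "Disable CamelCase Linking" case.
// Local variables stand in for the EWIKI_CHARS_* constants.
$upper = "\r";  // EWIKI_CHARS_U: carriage return
$lower = "\t";  // EWIKI_CHARS_L: tab
$pattern = "/([" . $upper . "]+[" . $lower . "]+){2,}/";

// Label => text to test. The last two contain real control characters.
$samples = [
    '$i++'          => '$i++',
    'a[]'           => 'a[]',
    'UsingMoodle'   => 'UsingMoodle',
    'CR TAB'        => "\r\t",
    'CR TAB CR TAB' => "\r\t\r\t",
];

foreach ($samples as $label => $text) {
    $result = preg_match($pattern, $text) === 1 ? "match" : "no match";
    printf("%-14s %s\n", $label, $result);
}
```

Only the last sample, two carriage-return/tab groups in a row, should be reported as a match.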

I'm using ewiki in Japanese. Since applying this modification I have had no trouble with "Disable CamelCase Linking" turned on.

If you are facing this problem too, please try this patch.

 
