Attached files and non-ASCII names for wikis: buggy?

by Enrique Castro -
Number of replies: 14

Hi,
I am experiencing some problems with wikis related to non-ASCII characters. I have learned the hard way how to avoid being bitten by wikis, but I think there is a lot of room to improve the module so that it pro-actively prevents some user errors. Some of these errors can be attributed directly to what I think are bugs.

First of all, the problems I will describe below arise when using group or student wikis, with names in Spanish and used in group mode (either visible or separate). Such a wiki is actually a collection of wikis, one for each group/student. The effect I see is that sometimes binary files attached to the wikipage are uploaded to the server, and links are displayed in the wiki "attachment" tab. But when users try to download the files, they get an empty file (0 bytes). I have traced this problem to an anomaly in the name of the directory where the files are stored.

a) Storage of attached files and non-canonical chars

Wikipages store the dirname of that directory in a field. But sometimes the actual directory name is different. Dirname is taken from the Wiki Name. The problem arises when you use short sentences in Spanish to name the Wiki. For instance "Grupo de trabajo del Caso Práctico" becomes a directory in the moddata filesystem called "Grupo de trabajo del Caso Pr&aacute;ctico" (with spaces, ";" etc.). Since the string literals for the wiki-stored dirname and the actual dirname are not identical, all internal: references will fail. In fact, the actual dirname may be illegal. On some occasions I have ended up with dirs I couldn't remove using the Moodle file manager: I got warnings from the OS saying that the file did not exist or the operation was invalid.

I think that the name used to create the directory to store attached files should be "cleaned" before use, to ensure it is a valid POSIX dirname string. The cleaned string should be stored back in the wiki.
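To make the idea of "cleaning" concrete, here is a minimal sketch in Python (used only for illustration, since Moodle itself is written in PHP). The function name clean_dirname and the fallback value are hypothetical, and this is not how the wiki module actually works; it only shows one way to reduce a name to the POSIX portable filename characters (letters, digits, ".", "_" and "-").

```python
import re
import unicodedata


def clean_dirname(name: str, fallback: str = "wiki") -> str:
    """Reduce an arbitrary wiki name to POSIX portable filename characters.

    Hypothetical helper for illustration; not part of Moodle or ewiki.
    """
    # Split accented letters into base letter + combining mark,
    # then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of characters outside A-Z a-z 0-9 . _ - with "_",
    # and trim separators from both ends.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_only).strip("._-")
    # Never return an empty directory name.
    return cleaned or fallback


print(clean_dirname("Grupo de trabajo del Caso Práctico"))
# Grupo_de_trabajo_del_Caso_Practico
```

The important part is that the cleaned string, not the original Wiki Name, is what gets stored and reused, so the stored dirname and the directory on disk can never drift apart.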

For instance, if you ever visit that directory with the Moodle file manager and click "rename", all spaces in the name automatically become "_". You cannot get the original name back. But the wiki mechanism is still pointing to "Grupo de Trabajo", while the directory is now "Grupo_de_Trabajo": all stored files become inaccessible.

b) Page Name field

When the Page Name field is not empty, this string is used as the name of the wikipage and as the name of the directory to store attachments. Using this field, you can name the wiki whatever you want and still ensure a valid POSIX string for the dirname. Conveniently, once any single page of a wiki is created, this field is locked and the author cannot edit it any more. This is the right behavior to ensure the wiki always points to the right place in the filesystem.

But there is a bug for multiple wikis, group or student wikis. If you create a page for a particular group the Page Name becomes non-editable. If you visit the config form the Page Name field is no longer writable. But when you save that form, the Page Name variable is reset to the full Wiki Name. All new pages created from that moment on will use the illegal string. And you cannot change Page Name. That means that files attached to those pages become inaccessible.

I think that the Page Name variable should be "frozen" upon creating any page in the wiki collection. The bug arises, apparently, because Page Name is not set for wikipages not yet created. So, when re-visiting the config page, Page Name is set to Wiki Name and used for subsequently created pages. Perhaps, in the case of group wikis, initial pages should be created by default, even with blank spaces as content.

(You may need to visit the config page while some users have not yet created their initial pages, for example to correct a typo in the wiki name/description, to activate binary file upload if you forgot to, to change the permissions granted to students, etc.)

c) Wiki Initial Content:

According to the help file, if you specify a file in that field, the content of that file becomes the initial content of all new wikipages. I have never seen this happen! If I put a file name there (either a valid HTML file or just a plain text file), I still get empty (uncreated) wikipages. The only effect I see is that the wikipage is named after the file. The directory name for attachment storage is also the filename. Thus, I end up with directories named "wiki1.html" or "contenido_inicial.txt" (literally, with dots and extensions; at least these are valid POSIX file name strings).

I have observed these behaviors with Moodle v 1.4.3+ AND 1.5 dev. Perhaps I have misunderstood something about wiki usage, but I have the strong feeling that there are a couple of bugs messing this up. Before reporting it in the bugtracker I would like to see if any of you can confirm these are really bugs. I would appreciate advice about how to report this more concisely.

I wonder if anyone has been able to import content into a wiki by pointing to an initial file (and, supposedly, importing all other files in the same dir as other wiki pages). I would love to be able to import a collection of image files (or HTML wrappers for images) and allow groups of students to add texts to them.

TIA,
- Enrique -


In reply to Enrique Castro

Re: Attached files and non-ASCII names for wikis: buggy?

by Françoise Blin -

First of all, the problems I will describe below arise when using group or student wikis, with names in Spanish and used in group mode (either visible or separate). Such a wiki is actually a collection of wikis, one for each group/student. The effect I see is that sometimes binary files attached to the wikipage are uploaded to the server, and links are displayed in the wiki "attachment" tab. But when users try to download the files, they get an empty file (0 bytes). I have traced this problem to an anomaly in the name of the directory where the files are stored.

I have had similar problems in the past and kind of solved them by trial and error...

I found that the location of uploaded files was different depending on which form students used. Basically, if the file was uploaded from the Edit page, it created a lot of problems as the file was uploaded in the main directory and I could not find it in the course directory (moddata). However, if they uploaded the file from the attachment tab, it was uploaded in the group/student 'wiki directory' (in moddata this time) and the problems seem to disappear. Also, I have never been able to delete the files stored on 'main' since I think I need administrator rights, which I don't have.

They still got empty files, though, under two different conditions. The first one was easily solved by using Firefox instead of IE, the latter being unable to find the file... The second one was trickier, and it could be a bug, I don't really know. At the beginning, many students uploaded files that they later decided to modify and upload again. They deleted the original files, uploaded the edited ones again, under the same name, and then got a blank page when trying to download it.

I eventually discovered that even though the original files were not attached to the wiki anymore, they were still in their wiki folder (file manager). I then had to delete those files (with file manager) from the folders corresponding to each group wiki as students cannot access these. Once deleted, there was no problem with uploading and downloading the new files, even if they had the same name.

It was very time-consuming and tedious, though, as in my case the folders in the directory were identified by numbers and not by the name of the wiki or group (which used French phrases). It would be very helpful if files could be fully deleted by the person who uploaded them in the first place.

Incidentally, I found a similar problem with attachments in the glossary. Students cannot delete their attachments. The teacher has to delete the files in file manager.

Françoise

In reply to Françoise Blin

Re: Attached files and non-ASCII names for wikis: buggy?

by Enrique Castro -
Thanks for your comments, Françoise.

All the problems I described above take place when using the "attachments" tab. I have not used the HTML editor for that. And I use only Mozilla (and, for the last 3 months, Firefox).

- Enrique -
In reply to Françoise Blin

Re: Attached files and non-ASCII names for wikis: buggy?

by Sebastian de la Chica -

On the topic of PDF attachments to wikis: I am not having problems with file names, but rather with IE not communicating well with Acrobat Reader. Has anyone looked at the issue of streaming PDF files not working with IE? I am seeing this same problem and most of my students use IE. There is a workaround: save the PDF file to disk, then view it. But I was wondering if there was anything that could be done from the PHP side to stream the PDF file to IE's liking.

Thanks,

Sebastian

In reply to Enrique Castro

Re: Attached files and non-ASCII names for wikis: buggy?

by Jussi Hannunen -
Before reporting it in the bugtracker I would like to see if any of you can confirm these are really bugs.


Not that I can confirm anything to be a bug, but I have experienced the same behaviour and do consider it to be a bug. In our case, using the character 'ä' (common in Finnish) leads to directory listings like:

vpmood:/opt/upload/31/moddata/wiki/63/80# ll
total 256
drwxr-x--- 2 www-data www-data 4096 Jan 24 23:00 Ryhmä 2
drwxr-x--- 2 www-data www-data 4096 Jan 25 10:05 Ryhm? 2
drwxr-x--- 2 www-data www-data 4096 Jan 26 18:04 Ryhmä 1
-rwxr-x--- 1 www-data www-data 60242 Jan 28 09:31 ryhma1raj.jpg
-rwxr-x--- 1 www-data www-data 49544 Jan 28 10:01 ryhma2raj.jpg
-rwxr-x--- 1 www-data www-data 58263 Jan 28 10:03 ryhma3raj.jpg
-rwxr-x--- 1 www-data www-data 38127 Jan 28 10:04 ryhma4raj.jpg
drwxr-x--- 2 www-data www-data 4096 Feb 2 16:02 Ryhmä 3
drwxr-x--- 2 www-data www-data 4096 Feb 7 10:51 Ryhmä 4
drwxr-x--- 2 www-data www-data 4096 Feb 7 10:53 Ryhm? 4
drwxr-x--- 2 www-data www-data 4096 Feb 7 10:59 Ryhm? 3
drwxr-x--- 2 www-data www-data 4096 Feb 7 10:59 Ryhm? 1
 
The attached files are intact but inaccessible in directories like "Ryhmä 2".
In reply to Enrique Castro

Re: non-ASCII in wikis

by David Mudrák -
Hi there,

I've got a lot of problems with Czech letters (ISO-8859-2) in the Wiki module too... I was trying to go through the eWiki source code but I haven't found anything yet :-(

This info from http://erfurtwiki.sourceforge.net/UnicodeSupport makes me sad too:

Forget it! ewiki is very focused on ISO Latin-1 charset and this cannot be changed easily. Also many current PHP versions (like the 4.1.2 here on SourceForge) wouldn't allow that, so we'd lock people out if we transited too early.

So I'd say we had better recommend PhpWiki to users who really need Unicode/UTF-8 support.


In reply to David Mudrák

Re: non-ASCII in wikis

by Jussi Hannunen -
The character causing problems for me, "ä", is part of ISO Latin-1. It's code 228 and called "a umlaut". So charset support in ewiki isn't the problem here, or not the whole problem at least.
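The byte values are easy to check. Here is a small sketch in Python (for illustration only, Moodle is PHP) showing that "ä" is a single byte in Latin-1 but two bytes in UTF-8. That the two sibling directories in my listing come from exactly this kind of mismatch is only an assumption on my part; the sketch just shows why the same name can exist as two different byte strings on disk.

```python
name = "Ryhmä 2"

latin1_bytes = name.encode("latin-1")  # one byte for "ä"
utf8_bytes = name.encode("utf-8")      # two bytes for "ä"

print(ord("ä"))      # 228
print(latin1_bytes)  # b'Ryhm\xe4 2'
print(utf8_bytes)    # b'Ryhm\xc3\xa4 2'

# Reading the Latin-1 bytes as if they were UTF-8 cannot recover the letter,
# so a tool has to display a placeholder character instead.
misread = latin1_bytes.decode("utf-8", errors="replace")
print(misread == name)  # False
```

Two directory names that differ at the byte level are different directories to the filesystem, even if they look almost the same in a listing.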

Has a bug report already been filed for this? I tried to check the bug tracker but didn't find it.


Jussi

In reply to Jussi Hannunen

Re: non-ASCII in wikis

by David Mudrák -
It seems there are some regexps using letters [a-zA-Z] and some html_encode() functions which would have to be changed in the ewiki code to solve this issue. I don't have these problems with pmWiki. I'd like to use the Wiki module in Czech but it is almost impossible :-(
In reply to David Mudrák

Re: non-ASCII in wikis

by Ali Banani -
Hi,

although this discussion is almost 1 year old, I am still facing these problems with wiki pages which have a German umlaut letter in their name. Sometimes two directories are created, one with the umlaut letter in the directory name and one where the letter has been changed to HTML code (ä -> &auml;). Most of the time the files in that directory are intact, but cannot be downloaded. Sometimes they got corrupted while uploading.

I am using moodle version 1.5.2+ (2005060222).

thanks for your help...

best regards .. ali
In reply to Ali Banani

Re: non-ASCII in wikis

by José Moya -
This problem was solved in a release in late December. I have used it in my installation, but I can't show it to you now because I'm re-installing my server due to trouble with the main page.


Try moodle 1.5.3+ current (i.e. 1.5.3+ 20060126)
In reply to José Moya

Re: non-ASCII in wikis

by Ali Banani -
I did that and now most of the links are broken and nobody can create new links. If a link is created it just has the <a> tag without any reference to link to.

This is actually a huge problem for us, since the wiki is very important for us.

any ideas ?

ali
In reply to Ali Banani

Re: non-ASCII in wikis

by Ali Banani -
Hi,

I am still facing problems concerning wiki pages with special characters (for example the German umlauts äöü) and file uploading. Although I am using version 1.5.3+ (200506023), the wiki module creates 2 folders, one with the German umlaut and another with the HTML-coded umlaut (&auml;).
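To illustrate what I mean by two spellings of the same name, here is a minimal sketch in Python (only for illustration, the wiki module itself is PHP). The helper canonical_name and the sample page name are made up, and this is not the module's actual code; it just shows that decoding the HTML entities before building the directory name makes both spellings end up identical.

```python
import html


def canonical_name(name: str) -> str:
    """Decode HTML entities so both spellings of a name compare equal.

    Hypothetical helper for illustration; not part of Moodle or ewiki.
    """
    return html.unescape(name)


raw = "Übung für Anfänger"                    # name as typed
encoded = "&Uuml;bung f&uuml;r Anf&auml;nger"  # same name, HTML-coded

print(raw == encoded)                                  # False
print(canonical_name(raw) == canonical_name(encoded))  # True
```

If every code path that builds the directory name went through one such normalising step, only one folder would be created per page.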

Does anyone have an idea on how to fix this??
Since this is a production server, I did not want to upgrade to the new wiki module, so any help is deeply appreciated :-)

best regards .. ali
In reply to Ali Banani

Re: non-ASCII in wikis

by Paulo Matos -
Just filed a bug report on this, see:

http://moodle.org/bugs/bug.php?op=show&bugid=5067

I'll take a look to see if I can solve it.
In reply to Paulo Matos

Re: non-ASCII in wikis

by Paulo Matos -
I created a patch that solves this issue (at least it's working
for me); please take a look at the bug link that I posted before
for more info. Test it: it won't mess with previously stored data,
and it should work with previously inaccessible files.

Regards,

Paulo.
In reply to Enrique Castro

Re: Attached files and non-ASCII names for wikis: buggy?

by Aracely Gonzalez -

c) Wiki Initial Content:

According to the help file, if you specify a file in that field, the content of that file becomes the initial content of all new wikipages. I have never seen this happen! If I put a file name there (either a valid HTML file or just a plain text file), I still get empty (uncreated) wikipages. The only effect I see is that the wikipage is named after the file. The directory name for attachment storage is also the filename. Thus, I end up with directories named "wiki1.html" or "contenido_inicial.txt" (literally, with dots and extensions; at least these are valid POSIX file name strings).

I have the same problem, have you found out how to work this out ?

Does it have something to do with where the original file to copy is placed ?

Thanx in advance,

Aracely