## General developer forum

I'm trying really hard to get Global Search working on the latest Moodle 1.8. I'm on Debian Linux and installed xpdf-utils and antiword with apt-get install.

This puts both antiword and pdftotext in /usr/bin. I located them on my server to verify the path, ran pdftotext and antiword from the command line, and checked the execute permissions; it all seems correct.

On the Global Search settings page, I've put:
Path to pdftotext command: /usr/bin/pdftotext -enc UTF-8 -eol unix -q

I left
Environment settings for the MSWord converter: ANTIWORDHOME=/var/www/elo/lib/antiword/linux/usr/share/antiword
unchanged.

When I run the test indexer, I see loads of errors like

I know the error message is quite clear, but I can't see what's wrong here. Does anyone have any idea?


Hi Koen,

The fact is that the xxxtotext converters can only be invoked from within the Moodle distribution directory.

Installing antiword and xpdf as standalone packages does not comply with this rule. This was done with a future complete integration of those libraries into the /lib distribution in mind, and in case some PHP or Apache security settings restricted access to executables located elsewhere.

This is a restriction I may lift, but we should have a global developer and architecture discussion first.

In the meantime, make sure the executables are within <%%Moodleroot%%>. I confess I currently have no implementation running on Linux to evaluate deployment issues. This will come soon, I hope.

The rules are:

I take the value given for the path of the executable, and the executable is then located using the following construction:

$text_converter_cmd = "{$CFG->dirroot}/{$CFG->block_search_pdf_to_text_cmd}$file -";

for PDF, and

$text_converter_cmd = "{$CFG->dirroot}/{$CFG->block_search_word_to_text_cmd}$file";

for Word
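To see why an absolute path such as /usr/bin/pdftotext fails under this rule, the concatenation can be mimicked in shell. This is only an illustrative sketch: `build_converter_cmd` is a hypothetical helper, not Moodle code, and `/var/www/elo` merely stands in for `$CFG->dirroot`.

```shell
#!/bin/sh
# Hypothetical helper mimicking the "{$CFG->dirroot}/{$cmd}" concatenation.
# $1 = Moodle root (stand-in for $CFG->dirroot)
# $2 = converter command as entered on the settings page
build_converter_cmd() {
    printf '%s/%s\n' "$1" "$2"
}

# An absolute path in the setting ends up nested under the Moodle root:
build_converter_cmd /var/www/elo "/usr/bin/pdftotext -enc UTF-8 -eol unix -q"
# prints: /var/www/elo//usr/bin/pdftotext -enc UTF-8 -eol unix -q

# A path relative to the Moodle root resolves inside the distribution:
build_converter_cmd /var/www/elo "lib/xpdf/linux/pdftotext -enc UTF-8 -eol unix -q"
# prints: /var/www/elo/lib/xpdf/linux/pdftotext -enc UTF-8 -eol unix -q
```

The first result points at a file that does not exist unless the binary (or a link to it) is placed under the Moodle root, which would explain an "execution failed" message.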

I hope this helps you set up the search engine. I will add some references on this point to Moodledocs.

Ah, thanks for the explanation, Valery. That surely sheds a lot of light.
I don't know if it has been done already, but it might be a good idea to include them in 1.9 in one go.

I did a CVS checkout and copied the files to the right spot, preserving the directory structure. I ran chmod 751 on the files to make them executable, but no luck yet.

Error with MSWord to text converter command : execution failed.
Error with pdf to text converter command : execution failed.

It's late. I'll look further into it tomorrow.
When it works, I'll try to write some docs on this.

No, I can't find it.
Has anyone managed to get Global Search working on Linux? Are any "special tricks" needed?


I've just made it work with Moodle 1.8.3+ (current as of today) on Debian etch. Once I had the required packages installed (xpdf-utils, antiword, etc.), I just executed the following commands as root:

export DIRROOT=/usr/share/moodle
mkdir -p ${DIRROOT}/lib/antiword/linux/usr/bin/
ln -s /usr/bin/antiword ${DIRROOT}/lib/antiword/linux/usr/bin/antiword
mkdir -p ${DIRROOT}/lib/xpdf/linux
ln -s /usr/bin/pdftotext ${DIRROOT}/lib/xpdf/linux/pdftotext


to use the Global Search default paths. I've launched the text indexer and it has run without problems. I've even enabled 'file indexing' and it has run without trouble, and has indexed the only pdf file I had (this is a test setup that is almost empty).
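Before launching the indexer, it may be worth confirming that both links resolve to something executable. The following is a small sketch under that assumption; `check_converter` is a hypothetical helper (not part of Moodle), and the default `DIRROOT` simply mirrors the value used above.

```shell
#!/bin/sh
# check_converter is a hypothetical helper (not part of Moodle): it reports
# whether a path, possibly a symlink, resolves to an executable regular file.
check_converter() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or not executable: $1"
        return 1
    fi
}

# Same default as the commands above; override DIRROOT for another install.
DIRROOT=${DIRROOT:-/usr/share/moodle}
status=0
for converter in \
    "${DIRROOT}/lib/antiword/linux/usr/bin/antiword" \
    "${DIRROOT}/lib/xpdf/linux/pdftotext"
do
    check_converter "$converter" || status=1
done
[ "$status" -eq 0 ] || echo "at least one converter is not usable"
```

Note that `-x` is evaluated for the user running the script, so running it as the account the web server uses is likely to be more telling than running it as root.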

Hope that helps.

Saludos. Iñaki.

Ah, that's a good one - didn't think about that. I'll change my installation accordingly. Thanks!


I realize this was posted months ago, but nothing like a resurrected old thread..

Anyway, I am going to do something similar on Solaris. Solaris has the great Blastwave package repository (a lot like apt-get) and I have installed "xpdf" and "antiword" using blastwave's pkg-get. On Solaris (at least on my solaris), they end up in /opt/csw/bin.

So I'm gonna try a symbolic link as follows:

ln -s /opt/csw/bin/pdftotext <MOODLE>/lib/opt/csw/bin

(and the same for antiword)

I'll post back to let the community know if it worked for me. Now, my issue with the original response -- that the libs have to be in the Moodle lib/ dir, for reasons of future packaging -- is that this logic seems to assume a uniform platform base. Meaning, what if I'm installing on Mac, Windows, Solaris, etc.? The binary xpdf package, for example, can't be included with any Moodle codebase in those cases. Unless I'm missing something.


Well, the current version of Global Search should no longer force you to use such a trick. There is an additional option in the search block's central config that lets you avoid prepending the Moodle root when the executable path is constructed. This should let the libs live anywhere else on the server.

About packaging: I searched for truly open-source and free converters, to match the GPL extensibility of Moodle in general. Some converters were found with two distributions, as generic Linux and generic Windows binaries. Other converters might themselves have more complex distributions.

My opinion is that we should (restricted to GPL-compatible code) integrate libs for the majority cases, and point to the possible availability of other distributions. Small libs in particular cause no problem when added to lib, provided all developers find them valuable, but there would be some (understandable) resistance to integrating distributions of a couple of megabytes, as they are still "external code".


Could you echo something for me, Koen?

In /search/documents/physical_doc.php § 24, add

echo "{$CFG->dirroot}/{$CFG->block_search_word_to_text_cmd}";

and tell me what it outputs?

I solved my problem. I copied the files installed by the Debian installer to the right place in the moodle/lib folder, replacing the ones I had downloaded via CVS from /contrib, and now it works. Very weird though, because both are version 3.02.
On my laptop, running Ubuntu, it worked immediately with the files from CVS. The server has a 64-bit processor. I wonder if that could be the problem ...


for a fix

Cheers.

keep on reporting

I had to raise PHP's memory limit a lot, up to 150M, to get through the test indexer (I kept getting "allowed memory size of 96M exceeded").
Now the indexer is running (it does indeed take a long time). The Apache error log is collecting quite a lot of output. Please find attached the tail -n 100 of the log. Should I worry about that?


Thanks for the log.

It seems that Windows-like filenames containing parentheses () break the Linux command line. I may add something to protect against that and allow all files to be indexed transparently, whatever their names.

Valery.


This is a big security hole. Using shell_exec() (or any other shell-invoking function) without first cleaning parameters with escapeshellcmd()/escapeshellarg() opens the door to shell injection attacks.

In this particular case, $file should be cleaned with escapeshellarg() before using it:

$file = escapeshellarg($CFG->dataroot.'/'.$resource->course.'/'.$resource->reference;);
$text_converter_cmd = "{$CFG->dirroot}/{$CFG->block_search_word_to_text_cmd} $file";
[...]
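The effect of quoting can be reproduced directly in a shell. The sketch below is an illustration only: `shell_quote` is a hypothetical helper that does roughly what escapeshellarg() does on Unix-like systems (wrap the value in single quotes and escape embedded single quotes), and the filename is made up.

```shell
#!/bin/sh
# shell_quote is a hypothetical stand-in for PHP's escapeshellarg():
# wrap the argument in single quotes, rewriting each ' as '\''.
# Caveat: command substitution drops trailing newlines in the name.
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

file="report (final).pdf"   # made-up Windows-style filename

# Unquoted, the parentheses are parsed as shell syntax and the line is rejected:
sh -c "echo $file" 2>/dev/null || echo "unquoted: rejected by the shell"

# Quoted, the whole name reaches the command as one literal argument:
sh -c "echo $(shell_quote "$file")"
```

The same quoting also keeps a name such as `x; rm -rf ~` from being interpreted as a second command, which is the injection risk described above.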


Saludos. Iñaki.


Thanks for the heads-up, Iñaki!!

Fixed in CVS for HEAD, MOODLE_18_STABLE, MOODLE_19_STABLE.

Peace.


I hope you didn't check in the code I wrote in my previous post. I have just seen I didn't remove an extra ';' before the closing parenthesis in the escapeshellarg() call

In any case, it's a pleasure to collaborate on improving the code.

Saludos. Iñaki.


I saw it !
Je l'avais vu !!
Lo habia visto !!!

Is that an international project or not !!!

Thanks guys, updated

Oops, sorry for the delay. I didn't see your post until today.

Yes, I'll add it there. Thanks

Saludos. Iñaki.

Still trying to complete the initial index: I have now increased my memory limit to 500M. I wonder how high it needs to be set. This can't be the right way, I think.
I'm using lynx on the server to trigger the script, so no network problems can disturb the process.

Something good came out of this: I fixed the broken TeX filter the same way, by installing mimetex with the Debian installer and replacing the mimetex.linux distributed with Moodle with the one from the Linux distribution. I'm beginning to wonder whether it is such a good idea to distribute these binaries together with Moodle.


This is a real question. The positive argument is: keep Moodle simple to deploy, without having tens of packages to fetch and install in the correct order. Distributing a suitable version of the additional libraries should work most of the time, once enough people have sent feedback to evaluate stability. The developer will often rely on what a specific version of a distribution offers as an API. If you get the latest updated version, the API might have changed and the integration could suffer from this.

About memory: this is a real problem that I don't yet know how to resolve. The first indexing might need a huge amount of resources, but it will need them only once. I tried to see in Michael's code how to optimize and free some resources. I do not yet have sufficient memory inspection tools to see where the mess is.

It's done: it finally worked with the 500M memory limit. As a reference for the amount of work done by the server, I'm posting my statistics page (apparently I should update my English language pack):

## statistics

- datadirectory: /var/moodledata/search
- filesinindexdirectory: 9
- totalsize: 7.8MB
- createdon: Tue, 04 Dec 2007 16:46:16 +0100

## solutions

- runindexertest: tests/index.php
- runindexer: indexersplash.php

## databasestate

- database: mdl_block_search_documents
- documentsinindex: 6673
- deletionsinindex: -47367
- documentsindatabase: 54040
- documentsfor 'Chats': 98
- documentsfor 'Databases': 20
- documentsfor 'Forums': 44254
- documentsfor 'Glossaries': 1462
- documentsfor 'Resources': 8074
- documentsfor 'modulenameplural': 0
- documentsfor 'Wikis': 0


All the keys are in search/lang/en_utf8/search.php, which should be copied into the standard lang dir.

I saw that the lang files had been updated in the distributions in CVS.


Has anyone gotten Global Search to work with these executables on IIS?

I am running it in my test environment, and consistently get the "execution failed" error.

I have tried using the moodle root option, and disabling it.

I want to see if anyone else has gotten this resolved in a similar environment.

Regards,

John

Version: 1.9.2

Environment:

PHP Version 5.2.1

System Windows NT MEDIA 5.2 build 3790
Build Date Feb 7 2007 23:10:31
Configure Command cscript /nologo configure.js "--enable-snapshot-build" "--with-gd=shared"
Server API ISAPI
Virtual Directory Support enabled
