Global search not working

by Adam Shore -
Number of replies: 30
I've got a bunch of content entered in my moodle site. I can't get the search to bring me back anything. In the block settings for Search, all I see is how to change the text for the caption and button...

How do I make this function index itself or whatever seems to be missing?

Thanks,

Adam
In reply to Adam Shore

Re: Global search not working

by Joseph Rézeau -
Hi Adam,
The answer is... nobody knows! See this thread.
Joseph
In reply to Joseph Rézeau

Re: Global search not working

by Valery Fremaux -

You're not quite up to date! ;-)

The global search works like this. I admit the way to get content indexed is not very explicit. I promise I will fix that by writing documentation in MoodleDocs and in the revised README.

1. Get in administrator mode

2. Edit the sitewide parameters of the search block. You should NOT have to change anything, unless the extra libs for converting files to text have been deployed in an unusual place.

Note that for indexing physical files, you need additional converters, which are in CVS at contrib/patches/global_search_libraires. I collected these converters for Windows and Linux support. Some of them may have additional support for other OS distributions.

Here you can enable or disable indexing of physical files, and change the path settings if needed.

3. Go to the block and run a blank search.

4. Browse to the "statistics". As administrator, you'll have additional links to perform the first-time indexing. Once that is done, the cron should update the indexes with deleted, updated and added keys.

Beware: if you have many documents, this process can be heavy and time-consuming. Run it at night if possible.

5. The indexer will report to you what has been indexed for each supported module.

6. Try a search

7. It should be fine.   
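
To make step 4 a little more concrete, here is a minimal Python sketch of what a cron pass that handles "deleted, updated and added keys" has to work out. It is not Moodle's actual PHP code: the function name `plan_index_sync` and its inputs are hypothetical, and the real indexer works per module through SQL queries rather than in-memory sets.

```python
def plan_index_sync(indexed, current, last_run):
    """Return (deletes, updates, adds) as sorted lists of record ids.

    indexed  -- set of ids already present in the search index
    current  -- dict mapping id -> timemodified for records in the database
    last_run -- timestamp of the previous indexing pass
    (All three inputs are hypothetical stand-ins for Moodle's tables.)
    """
    # Indexed documents whose source record no longer exists.
    deletes = sorted(i for i in indexed if i not in current)
    # Indexed documents whose source record changed since the last pass.
    updates = sorted(i for i in indexed if i in current and current[i] > last_run)
    # Source records that have never been indexed.
    adds = sorted(i for i in current if i not in indexed)
    return deletes, updates, adds
```

For example, `plan_index_sync({7, 8, 9}, {8: 100, 9: 250, 10: 260}, 200)` returns `([7], [9], [10])`: record 7 is gone, record 9 changed after the last pass, and record 10 is new.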

In reply to Valery Fremaux

Re: Global search not working

by Joseph Rézeau -
Thanks for the info, Valery. It sounds quite complicated to implement, though. :-(
Joseph
In reply to Valery Fremaux

Re: Global search not working

by Adam Shore -
Valery - thank you for your reply.

I got in to administrator mode.
I don't know what you mean by editing the site-wide parameters of the search block... Please describe.

My install is all default. My only content is labels and webpages throughout my moodle. Maybe 40 files - all reasonably short.

I just went to search block - made a blank search.
Then clicked statistics
...

When I get to the stats page, it reads like this:




Clicking the Indextester gives me this:
Server Time: Thu, 29 Nov 2007 18:27:34 -0400
Testing global search capabilities:

Success: PHP 5.0.0 or later is installed (5.2.0).

Clicking on the Indexersplash script gives me this:

Server Time: Thu, 29 Nov 2007 18:28:19 -0400

Warning: Indexing was not successfully completed last time, restarting.

Using C:\http\vhosts\brightwhite.ca\private\moodledata/search as data directory.
Database error. Please check settings/files.


I have nooo idea what is going on here... Please advise.

Adam
In reply to Adam Shore

Re: Global search not working

by Valery Fremaux -

I added the documentation page:

http://docs.moodle.org/en/Global_Search_block

It is based on my previous post. I will update it to make it as clear as possible.

The block's site-wide configuration is reached through the Administration->Modules->Block menu, by clicking the "parameters" link next to the Global Search Block entry.

If the database does not seem to be set up, try uninstalling and reinstalling the block. Then run the indexer once more.

Cheers. 

In reply to Valery Fremaux

Re: Global search not working

by Adam Shore -
So, do I have to install those tools in order to see those other options? As you can see in my post, I show some pics (links)... In my search params screen, I don't have all the ones you show in your MoodleDocs page. Is that why?
Perhaps you can be more specific about the CVS whatchamajogger we need to install.
This is a sweet tool - and I hope to get it dialled.

Thanks.
Adam
In reply to Adam Shore

Re: Global search not working

by Valery Fremaux -

Well, both the search dir and blocks/search must be updated to the latest published version in CVS.

The /search dir contains the search engine itself.

lib.php in there should show the updated version :

* @version 2007110400

I didn't make it so clear for the search block, as I only fixed some internationalisation code and added some parameters to the global_config. You're right, I should also have changed more things in the file header to make that upgrade more traceable.

Version of the block should show :

      $this->version = 2007062900;

at the top of the block_search.php file.
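
If you want to check those two version markers without opening the files by hand, a small script can do it. This is only a sketch in Python (not part of Moodle); the helper names `find_version` and `is_up_to_date` are made up for illustration, and the patterns assume the markers look exactly like the two lines quoted above.

```python
import re

# Patterns for the two version markers quoted in this thread.
BLOCK_VERSION_RE = re.compile(r"\$this->version\s*=\s*(\d{10})\s*;")
LIB_VERSION_RE = re.compile(r"@version\s+(\d{10})")


def find_version(source, pattern):
    """Return the first version number matched by pattern, or None."""
    match = pattern.search(source)
    return int(match.group(1)) if match else None


def is_up_to_date(source, pattern, expected):
    """True when the file text declares a version at least as new as expected."""
    found = find_version(source, pattern)
    return found is not None and found >= expected
```

You would read the text of search/lib.php or blocks/search/block_search.php into `source` and compare against 2007110400 or 2007062900 respectively.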

Are you OK with that?

The third thing to get is the library and converter set, but this is only needed for physical documents (which may be most of the indexable material!).

The version in the HEAD of CVS is actually OK. I also committed and merged in the MOODLE_18_STABLE.

Apparently, according to a recent message from Eloy Lafuente, who tracks use of the CVS, some of the latest files were not correctly merged into MOODLE_19_STABLE; this could explain the issue.

Keep me informed. I also maintain a whole pack of the search engine distribution at:

http://www.ethnoinformatique.fr/course/view.php?id=112&lang=en_utf8

Just give me a couple of minutes to translate most of the content into English and check the distribution...

In reply to Valery Fremaux

Re: Global search not working

by Mark Nielsen -
Hi Valery,

> I also commited and merged in the MOODLE_18_STABLE.

Were the changes to the Zend API supposed to be merged into MOODLE_18_STABLE? I noticed that most everything else was back ported from MOODLE_19_STABLE except for the /search/Zend directory. Primary reason for asking is because I appear to be having troubles with the Zend API when using PHP 5.2.5. I haven't done a solid test to see if the PHP version is the actual problem, but when I switch the /search/Zend directory to the code in MOODLE_19_STABLE, the problem disappears. And the problem has to do with reading the index and not finding documents with simple searches like +doctype:resource

Cheers and thanks,
Mark
In reply to Mark Nielsen

Re: Global search not working

by Valery Fremaux -

Hi Mark,

I appreciate the stabilization work you are doing around Global Search.

In fact I checked the Zend version on my copies of 18_STABLE, 19_STABLE and HEAD, and they are exactly the same (as far as I could check), recently updated with a clean copy retrieval.

I know I ran into some issues with PHP 5.2 when I started working on 1.8.1 (as I tried to add eAccelerator). I haven't come back to this matter since.

I still have access denied on MOODLE_19_STABLE for the search engine. I will ask MD if he could change something there.

Had mistaken Root files in my CVS markers. Was pointing to old sourceforge server !!

In reply to Valery Fremaux

Re: Global search not working

by Mark Nielsen -
Hi Valery,

Glad I could help out :-)

> Had mistaken Root files in my CVS markers. Was pointing to old sourceforge server !!

Ah, the cvs server switch got ya! I created a new ticket to update the Zend API in Moodle 1.8 (MDL-12654) and the language problems are in MDL-12577 and MDL-12086.

Cheers and thanks,
Mark
In reply to Valery Fremaux

Global search will not index anything for me

by Laren Droll -

Hi kind helpers.

After carefully following this thread and the installation instructions, I only get this message when running indexersplash...

Warning: Indexing was not successfully completed last time, restarting.
Using /home/courses.kpublic.net/web/moodledata/search as data directory.

Steps taken:

I ran my cron job

I checked moodledata/search dir and made sure it was writeable by webserver.

I installed the lib dir with antiword and xpdf for linux.

I checked the database table: block_search_documents (it is empty)
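
For anyone repeating the "writeable by webserver" check above, here is a rough Python sketch of one way to test it. It is an illustration, not a Moodle tool; `check_index_dir` is a hypothetical helper, and the result only means something if the script runs as the same user as the web server.

```python
import os
import tempfile


def check_index_dir(path):
    """Return a list of problems found with the index directory (empty if none)."""
    problems = []
    if not os.path.isdir(path):
        problems.append("missing directory: " + path)
        return problems
    try:
        # Actually create (and auto-delete) a file, since permission bits
        # alone can be misleading.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append("cannot write to " + path + ": " + str(exc))
    return problems
```

Calling it with the data directory printed by the indexer (for example the moodledata/search path shown above) returns an empty list when the directory exists and a file can be created in it.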

When I ran search/tests/index.php I got...

Testing global search capabilities:
Checking activity modules:

9 modules to search in / 19 modules found.
0 blocks to search in / 33 blocks found.
1 additional to search in.
Success : 'assignment' has nothing to index.
Success : 'chat' has nothing to index.
Success: 'data' module seems to be ready for indexing.
Found 2 discussions to analyse in forum Social forum
*.*..Finished discussion
Success: 'forum' module seems to be ready for indexing.
Success : 'glossary' has nothing to index.
finished label 2
Success: 'label' module seems to be ready for indexing.
Success : 'lesson' has nothing to index.
finished Web page
finished Simple Text
finished New page
Success: 'resource' module seems to be ready for indexing.
Success : 'wiki' has nothing to index.
Success: 'user' module seems to be ready for indexing.

Finished checking activity modules.

======================

My moodle version is Moodle 1.9.11+ (Build: 20110427)

It's a rather new install. That's all I've got for now. Further troubleshooting steps would be appreciated. Thanks for your time!

Laren

In reply to Laren Droll

Re: Global search will not index anything for me

by Robert Crane -

Laren, global search is not bringing up results from lesson content.

It is finding results from other areas, like activities.

Did you manage to fix the global indexer? I would like to make it work. It is causing me a lot of bother and I would appreciate any advice if you managed to get it to work. Skype ray.mizzi1

Thanks in advance.

In reply to Adam Shore

Re: Global search not working

by Martín Langhoff -

Doing a quick review of this... Mark, Valery, great to see some action in describing the setup. What are the roadblocks to a generally usable global search? Is it even feasible to provide a global search to non-admins?

The stumbling blocks I see (that I'm unsure about):

  • Installation issues - I see these are being addressed :-)
  • Unbound memory use. Indexing and searching seem to gobble up memory with no limit. Is that because of how we use the ZF libs or is it due to ZF design issues?
  • How do we check for access rights? MD has explained the problem in MDL-8074, I couldn't say it better. (I do think that this is quite a hard one, but we may be able to find a way...)
  • Is ZF now dealing with adds/removes correctly? Long time ago ZF had some really odd limitations that I took to mean it wasn't quite ready yet...
In reply to Martín Langhoff

Re: Global search not working

by Valery Fremaux -

"Is it even feasible to provide a global search to non-admins?"

In what way would it not be?

If I understand you, you would like users who are Moodle admins, but not physical admins of the server, to be able to install and run the global search.

"Unbound memory use"

I guess this is because, for indexing, PHP needs to process the entire document content, or at least its text-converted version. We get this text conversion in one of two ways: an internal converter (XML, HTML) or externally invoked converters (PDF, DOC).

Either way, PHP itself MUST process the text content and cut it into pieces. The problem is that the amount of memory needed depends on the document size, not on the search engine code proper.

I tried to clean up most of the unneeded memory structures in the external code (I mean our code is external, compared to the ZF code, which I call internal). The issue should be most problematic on first-time indexing, when a huge existing set of documents is stored and the index is empty. We could imagine a command-line, server-side tool for indexing that document set outside the normal functioning of Moodle, but that would prevent non-admins from setting up the search engine.

"How do we check for access rights?"

Do you mean access rights on indexed entries, or access rights on the external text converters?

On index entries, access rights are handled by the document callbacks, so they are part of each module search API implementation. The implementation should find its own way to reproduce the conditions of availability of the target document. In a way, a well-designed document search API should know how each module behaves and encode the appropriate checks (I'm not sure I got it all right!!).

Let's think more deeply about it, Martin!! It's a good way to find solutions. Thanks.

In reply to Valery Fremaux

Re: Global search not working

by Martín Langhoff -

Hi Valery!

I am thinking of Moodle admins (not sysadmins, those can use grep! :-) ) vs normal Moodle users. What I suspect may not be feasible is to run the searches so that they are scalable and fast.

For a modern text-based indexing and searching system there is no good reason (that I know of!) to use unbounded memory. Documents are being processed linearly -- so we don't need the whole document in memory, ever. I am sure we can get the Moodle side of things to be memory-smart rather than memory-bound. But what worries me is the design of ZF - is ZF memory bound in itself? If it is, then it'll be hard to fix...
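
To illustrate the "processed linearly" point, here is a minimal Python sketch of chunked tokenization. It is not how the Moodle or ZF code works; `iter_tokens` and the 4096-character chunk size are illustrative assumptions. The idea is that only one chunk, plus a possibly unfinished word carried over from the previous chunk, is held in memory at a time.

```python
import io
import re

TOKEN_RE = re.compile(r"\w+")


def iter_tokens(stream, chunk_size=4096):
    """Yield lowercase word tokens from a text stream, one chunk at a time."""
    tail = ""  # a word that may continue in the next chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        text = tail + chunk
        matches = list(TOKEN_RE.finditer(text))
        # If the last word touches the end of the buffer it may be cut in
        # half, so hold it back until the next chunk arrives.
        if matches and matches[-1].end() == len(text):
            tail = matches.pop().group()
        else:
            tail = ""
        for match in matches:
            yield match.group().lower()
    if tail:
        yield tail.lower()
```

Feeding it `io.StringIO("Global search indexing")` with `chunk_size=5` yields the same three tokens as tokenizing the whole string at once, even though no read ever returns more than five characters.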

access rights are handled by the document callbacks, so they are part of each module search API implementation

Does this mean that if we find 10,000 documents we'll call 10,000 callbacks? Ouch!

We will need to steal some scalable techniques to do the checks in-place. At least let each module do a bit of setup beforehand, so they get a chance to read in the needed data in one go. (Here, having an OOP module API would help a bit as it would give us a 'natural' persistence model, but we can work around it).

Edit: we can probably reuse the programming techniques we have in accesslib. We used to have a ton of DB traffic, and now we read some data up-front with some smart SQL, and don't touch the DB at all past initialisation. That means that no matter the number of calls to has_capability(), we run a constant number of DB queries.
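
A rough Python sketch of that "read up-front, then never touch the DB" pattern follows. It is not accesslib: the `enrolment` table, the `AccessChecker` class and the rule "a document is visible if the user is enrolled in its course" are all simplifying assumptions made for the example.

```python
import sqlite3


class AccessChecker:
    """Loads a user's viewable course ids once, then answers from memory."""

    def __init__(self, conn, user_id):
        self.queries = 0
        rows = conn.execute(
            "SELECT course_id FROM enrolment WHERE user_id = ?", (user_id,)
        ).fetchall()
        self.queries += 1
        self._courses = {row[0] for row in rows}

    def can_view(self, doc):
        # No database access here: the set was read during initialisation.
        return doc["course_id"] in self._courses


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enrolment (user_id INTEGER, course_id INTEGER)")
    conn.executemany(
        "INSERT INTO enrolment VALUES (?, ?)", [(1, 10), (1, 11), (2, 12)]
    )
    checker = AccessChecker(conn, user_id=1)
    # 10,000 search hits spread over courses 10, 11 and 12.
    docs = [{"id": n, "course_id": 10 + n % 3} for n in range(10000)]
    visible = [d for d in docs if checker.can_view(d)]
    return len(visible), checker.queries
```

`demo()` checks 10,000 documents with a single query; the number of queries stays constant however many hits are checked.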

In reply to Martín Langhoff

Re: Global search not working

by Mark Nielsen -
> Does this mean that if we find 10,000 documents we'll call 10,000 callbacks? Ouch!

I'm pretty sure it only does the callbacks for the set that you are currently viewing. So, if you are viewing page 2 and page size is 50, then it would call the callbacks about 50 times for documents 50-100 (roughly).

Cheers,
Mark
In reply to Mark Nielsen

Re: Global search not working

by Martín Langhoff -

Thinking about this... don't think we can count on that optimisation:

  • When you want to view page 2, it will do the callbacks for documents 1 to 100 to do the pagination "properly". If 4 of those documents are invisible to you, then it will check 104 documents! On page 10, it'll be 500. On the last page, 10K.
  • And if you have to provide a count of the number of results, it's always 10K - there is no other way to know how many documents this user can view.

So it is back to what I was saying earlier - we must find a scalable way to do this. Simple callbacks won't work.
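
The arithmetic above can be reproduced with a small Python sketch. The function names and the per-document `can_view` callback are hypothetical; the point is only to count how many access checks are needed to fill a given page.

```python
def fetch_page(hits, can_view, page, page_size):
    """Return (documents for the page, number of access checks performed)."""
    needed = page * page_size
    visible = []
    checks = 0
    for doc in hits:
        checks += 1
        if can_view(doc):
            visible.append(doc)
            if len(visible) == needed:
                break
    return visible[needed - page_size:needed], checks


def count_visible(hits, can_view):
    """An exact result count has to run the check on every hit."""
    return sum(1 for doc in hits if can_view(doc))
```

With 10,000 hits of which four early ones are hidden from the user, `fetch_page(hits, can_view, 2, 50)` performs 104 checks to show 50 documents, and `count_visible` always performs all 10,000.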

In reply to Martín Langhoff

Re: Global search not working

by Valery Fremaux -

This sounds quite reasonable to me.

The current solution is acceptable for small organisations whose systems will not have thousands upon thousands of entries, but I agree it is not scalable.

I tried to think about storing enough data with the indexing record to let the initial selection process do most of the filtering work, but it was also a bit hard to go deeper this way. The indexer works in the cron context, that is, unaware of the situation the resource author is in. Hence the initial logic of having access resolved by the requester.

Maybe we could find a way to better segment the initial pick from the Zend engine, so that most of the macro-context rules are applied? Another avenue of research: precomputing reverse indexes of who is known to access what. Hmm! That would be a big waste of data, wouldn't it?

Caching access-query results for a user? => Think about how long entries persist and about release timeouts... ???
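
As a sketch of the caching idea, here is a small Python time-limited cache around an access check. Everything in it is hypothetical (the `AccessCache` class, the `check(user_id, doc_id)` callback, the 300-second default); it only shows the trade-off: repeated checks become cheap, but a revoked right can stay visible until the entry expires.

```python
import time


class AccessCache:
    """Caches per-user access decisions for a limited time (TTL in seconds)."""

    def __init__(self, check, ttl=300, clock=time.monotonic):
        self._check = check    # the expensive callback: check(user_id, doc_id)
        self._ttl = ttl
        self._clock = clock    # injectable so that expiry can be tested
        self._entries = {}     # (user_id, doc_id) -> (allowed, expires_at)

    def can_view(self, user_id, doc_id):
        key = (user_id, doc_id)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: no callback needed
        allowed = self._check(user_id, doc_id)
        self._entries[key] = (allowed, now + self._ttl)
        return allowed
```

A second `can_view(1, 2)` call within the TTL does not invoke the callback again; once the TTL has passed, the callback runs once more and the entry is refreshed.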

There is another catch:

A new module may implement some local access strategy. We would prefer that developers rely only on capabilities and do not try to implement other access control strategies, but we can't count on that.

... let's continue arguing ...

In reply to Valery Fremaux

Re: Global search not working

by Adam Shore -
I finally found this old thread. I got the newest search/lib.php as well as blocks/search/search.block.php from CVS. I installed them, re-ran the search process and it still doesn't work. I try and run 'indexersplash.php' and get this error:

Server Time: Thu, 07 Feb 2008 14:13:55 -0400

Warning: Indexing was not successfully completed last time, restarting.

Using C:\http\vhosts\brightwhite.ca\private\moodledata/search as data directory.
Database error. Please check settings/files.

I think that's step 1.

Please help.

Adam
In reply to Adam Shore

Re: Global search not working

by Valery Fremaux -

Did you go through the search block administrative configuration?

In reply to Valery Fremaux

Re: Global search not working

by Adam Shore -
This is what I see when I am in the search block config.

I have no options.

Admin -> Modules -> Blocks -> Search (settings).

I know I've got to be missing something here.

Thanks !

Adam


In reply to Valery Fremaux

Re: Global search not working

by Adam Shore -
Can you send me a link to this? I still haven't gotten this working.

Finding it very difficult to find what I am looking for with Moodle....

Thanks for any help. My deadline is rapidly approaching!

Cheers,

Adam
In reply to Adam Shore

Re: Global search not working

by Valery Fremaux -

This was obviously not the right block. The search block parameters should show (see attachment):

The search version should show 2007081100 in blocks/search/block_search.php. There should be a README.txt in the distribution, as well as a lang dir with en, fr and nl packs.

Note that this code is now "official code of Moodle" and is up to date in Moodle CVS from 1.8 (and should be in later versions!)

Cheers.

Attachment param_global_search_en.jpg
In reply to Adam Shore

Reply: Global search not working

by maria ak -
I would like a little help too, please.
I have done all the steps described above (configuration in the administration menu, downloaded the libraries and new versions from CVS, installed moodlecron.exe etc.), but the problem is that I have to run indexersplash.php every time I upload a new file, because the indexing doesn't happen automatically.

I ran indexersplash.php a couple of times to make sure the indexing was done correctly, and there were no error messages. The search results are OK for the files already indexed, but when I upload new files they aren't indexed automatically.

Any ideas? Thanks in advance!
In reply to maria ak

Re: Reply: Global search not working

by Valery Fremaux -

Hi Maria,

Actually, the Global Search only indexes new resources when the Moodle cron job is launched. This depends on the cron being correctly set up on your Moodle server, and on which period was chosen for that cron.

If that is OK, we should check more precisely on your platform whether the cron indexes new resources or not.

You can check this by triggering the cron manually, opening <%%yourMoodleWWWRoot%%>/admin/cron.php in a browser just after adding new files.

You should see the cron report, with the Global Search attempts to index new stuff inside.

This is a first way to check whether everything seems OK.

Cheers. 

In reply to Valery Fremaux

Reply: Re: Reply: Global search not working

by maria ak -

Thanks for your reply, Valery.

It seems that cron does not index new resources... This is the report from cron.php after having uploaded a PDF file.

Starting activity modules Processing module function assignment_cron ...
done. 
Processing module function forum_cron ...
Starting digest processing... Cleaned old digest records done. 
Processing module function journal_cron ...done. 
Processing module function workshop_cron ...done. 
Finished activity modules Starting blocks Processing cron function 
for search.... 
--DELETE---- 
Starting clean-up of removed records... 
Index size before: 28 
Checking chat module for deletions. No types to delete. Finished chat. 
Checking data module for deletions. Finished data. 
Checking forum module for deletions. Finished forum. 
Checking glossary module for deletions. Finished glossary. 
Checking lesson module for deletions. Finished lesson. 
Checking resource module for deletions. Finished resource. 
Checking wiki module for deletions. Finished wiki. 
Finished 0 removals. 
Index size after: 28 
--UPDATE---- 
<pre>Starting index update (updates)... 
Checking chat module for updates. No types to update. Finished chat. 
Checking data module for updates. Finished data. 
Checking forum module for updates. Finished forum. 
Checking glossary module for updates. Finished glossary. 
Checking lesson module for updates. Finished lesson. 
Checking resource module for updates. Finished resource. 
Checking wiki module for updates. Finished wiki. 
Finished 0 updates.</pre> 
--ADD------- 
<pre>Starting index update (additions)... 
Index size before: 28 
Checking chat module for additions. No types to add. Finished chat. 
Checking data module for additions. Finished data. 
Checking forum module for additions. Finished forum. 
Checking glossary module for additions. Finished glossary. 
Checking lesson module for additions. Finished lesson. 
Checking resource module for additions. Finished resource. 
Checking wiki module for additions. Finished wiki. 
 
Index size after: 28
</pre> ------------ done done. 
Finished blocks
 Updating languages cache 
Removing expired enrolments ...
0 to delete none found Running backups if required... 
Checking backup status...OK 
Getting admin info 
Deleting old data 
Checking courses 
Skipping deleted courses 
0 courses 
ple Next execution: Sunday, 6 April 2008, 12:00 am
eleni Next execution: Sunday, 6 April 2008, 12:00 am
eleni2 Next execution: Sunday, 6 April 2008, 12:00 am
Next execution: Sunday, 6 April 2008, 12:00 am
Backup tasks finished. 
Running rssfeeds if required... Generating rssfeeds... 
assignment: ...NOT SUPPORTED (file) 
chat: ...NOT SUPPORTED (file) 
choice: ...NOT SUPPORTED (file) 
data: generating ...OK 
forum: generating ...OK 
glossary: generating ...OK 
hotpot: ...NOT SUPPORTED (file) 
journal: ...NOT SUPPORTED (file) 
label: ...NOT SUPPORTED (file) 
lams: ...NOT SUPPORTED (file) 
lesson: ...NOT SUPPORTED (file) 
quiz: ...NOT SUPPORTED (file) 
resource: ...NOT SUPPORTED (file) 
scorm: ...NOT SUPPORTED (file) 
survey: ...NOT SUPPORTED (file) 
wiki: ...NOT SUPPORTED (file) 
workshop: ...NOT SUPPORTED (file) 
Ending rssfeeds......OK Rssfeeds finished 
Running auth crons if required... 
Cron script completed correctly Execution took 6.259061 seconds
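
A quick way to read a long report like this one is to pull out only the "Index size before/after" pairs. The following Python sketch does that; `index_size_changes` is a made-up helper and it assumes the report uses exactly the wording shown above.

```python
import re

SIZE_RE = re.compile(r"Index size (before|after):\s*(\d+)")


def index_size_changes(report):
    """Return a list of (before, after) pairs found in a cron report."""
    pairs = []
    before = None
    for kind, value in SIZE_RE.findall(report):
        if kind == "before":
            before = int(value)
        elif before is not None:
            pairs.append((before, int(value)))
            before = None
    return pairs
```

On the report above it returns `[(28, 28), (28, 28)]`, one pair for the delete pass and one for the add pass, which confirms at a glance that nothing was removed or added.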

                                    
In reply to maria ak

Re: Reply: Re: Reply: Global search not working

by Valery Fremaux -

Well, here is a first piece of good news: cron is running, and the index updater is called and checks document sources!! I will dig deeper into the code to see if we can track down your problem.

I'll keep you informed...

... there is a VERY KEY query that could help me a lot:

In /search/add.php at line 95, would you mind adding the following line for a test?

if ($mod->name == "resource") echo $query;

just before the get_records... statement. Try indexing a new file and post the exact SQL it shows there.

(These are the key queries that look for newly registered stuff in the Moodle database.)

Thanks.

Note: if no SQL is shown, that is still information! Remove the test line (or comment it out) after testing!!

In reply to Valery Fremaux

Reply: Re: Reply: Re: Reply: Global search not working

by maria ak -
I've got both good and bad news!

The good news is global search indexes new resources!!!

This is the query from the cron report:
SELECT id, id as docid
FROM mdl_resource
WHERE id NOT IN ('7','8','9','11','12','13','14','15',
'21','23','24','25','27','28','31','32','33','34','38',
'39','40','41','42','43','44','45') and
timemodified > 1206960814
AND ( (alltext != '' AND alltext != ' ' AND alltext != '&nbsp;'
AND TYPE != 'file') OR TYPE = 'file' )
I don't really know what happened, maybe restarting my PC made some difference... Anyway, it works and that's what matters :-)

The bad news now...
When I delete a resource, it is deleted from table mdl_resource but not from mdl_block_search_documents.

This is the cron report after having deleted 3 files:
Index size before: 33
Finished 0 removals.
Index size after: 33

I can see it also from table mdl_block_search_documents, the files are still there!

I've been digging through the code for a while and this is what I've found:
In my database the field itemtype in mdl_block_search_documents has the value 'file' for resources.
But the query in search/delete.php line 78 is this:
SELECT id, docid
FROM mdl_block_search_documents
WHERE doctype = 'resource' AND
itemtype = 'any' AND
docid not in ('7','8','9','10','11','12','13','14',
'15','16','18','21','23','24','25','27','28','29','30',
'31','32','33','34','35','37','38','42','43','45')
The itemtype should be 'file'. Is that only in my database?
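
The mismatch can be reproduced outside Moodle. The following Python sketch uses an in-memory SQLite database with a simplified stand-in for the documents table (the table layout, the sample ids and the `stale_docids` helper are assumptions for illustration only). It shows that a row stored with itemtype 'file' is never matched by a cleanup query that filters on itemtype = 'any'.

```python
import sqlite3


def stale_docids(conn, itemtype, live_ids):
    """Return docids of indexed resources that are no longer in live_ids."""
    placeholders = ",".join("?" for _ in live_ids)
    sql = (
        "SELECT docid FROM block_search_documents "
        "WHERE doctype = 'resource' AND itemtype = ? "
        "AND docid NOT IN (" + placeholders + ") ORDER BY docid"
    )
    return [row[0] for row in conn.execute(sql, [itemtype] + list(live_ids))]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE block_search_documents "
        "(docid INTEGER, doctype TEXT, itemtype TEXT)"
    )
    conn.executemany(
        "INSERT INTO block_search_documents VALUES (?, 'resource', 'file')",
        [(7,), (8,), (39,)],
    )
    live = [7, 8]  # resource 39 has been deleted from the resource table
    return stale_docids(conn, "any", live), stale_docids(conn, "file", live)
```

`demo()` returns `([], [39])`: with 'any' the deleted resource is never found, with 'file' it is.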

So I made a change in the following function (search/documents/resource_document.php line 279), replacing 'any' with 'file':

function resource_db_names() {
    return array(array('id', 'resource', 'timemodified', 'timemodified', 'file', " (alltext != '' AND alltext != ' ' AND alltext != '&nbsp;' AND TYPE != 'file') OR TYPE = 'file' "));
}

Now the files that have been deleted are found, but something goes wrong and they are not deleted...
This is what I've found so far:
Delete.php line 102 calls Lucene.php function find line 551
I added some code to see what's going on:

public function find($query)
{
    echo "query: ".$query;
    if (is_string($query)) {
        $query = Zend_Search_Lucene_Search_QueryParser::parse($query);
    }
    echo "parsed query:".$query;
    ....

These are the results:
query: +docid:39 +doctype:resource +itemtype:file
parsed query:+(<EmptyQuery>) +(doctype:resource) +(itemtype:file)

In reply to maria ak

Re: Reply: Re: Reply: Re: Reply: Global search not working

by Valery Fremaux -

What a nice and efficient issue review!!

Very valuable for me. I will track down that resource type "wildcard" story ASAP. I thought it was done but... never say it's finished!!

I will reconsider the 'any' type and translate it correctly in the query.

Thanks again.

In reply to Valery Fremaux

Reply: Re: Reply: Re: Reply: Re: Reply: Global search not working

by maria ak -
Thanks for fixing the code so fast! I downloaded the new versions from cvs and the query is correct now.

But I still have that parsing problem I told you about.

Delete.php line 102 calls Lucene.php function find line 551
public function find($query)
{
    echo "query: ".$query;
    if (is_string($query)) {
        $query = Zend_Search_Lucene_Search_QueryParser::parse($query);
    }
    echo "parsed query:".$query;
    ....

These are the results:

query: +docid:39 +doctype:resource +itemtype:file
parsed query:+(<EmptyQuery>) +(doctype:resource) +(itemtype:file)

The query is not parsed correctly, therefore the file I have deleted
(docid:39) cannot be found and removed from the database.

What causes this problem is that the Zend search engine tries to parse the string "39" using search\Zend\Search\Lucene\Analysis\Analyzer\Common\utf8.php, which is wrong since it is numeric, so search\Zend\Search\Lucene\Analysis\Analyzer\Common\utf8num.php should be called. "resource" and "file" are alphabetic and there is no problem in the parsing, but in the case of "39" there is!
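
The effect can be imitated with a few lines of Python. This is only an approximation of the two analyzers (the regular expressions and the `parse_terms` helper are assumptions, not the Zend code): a letters-only tokenizer produces nothing for "39", so that clause ends up empty, while a letters-and-digits tokenizer keeps it. In Zend_Search_Lucene the analyzer is normally selected with Zend_Search_Lucene_Analysis_Analyzer::setDefault(); whether and where the copy bundled with Moodle does that would need checking.

```python
import re

LETTERS_ONLY = re.compile(r"[^\W\d_]+")     # stands in for the letters-only analyzer
LETTERS_AND_DIGITS = re.compile(r"[^\W_]+")  # stands in for the "num" analyzer


def tokenize(text, pattern):
    """Return the lowercase tokens that the given pattern keeps."""
    return [t.lower() for t in pattern.findall(text)]


def parse_terms(query, pattern):
    """Split '+field:value' clauses and tokenize each value.

    A value that yields no tokens becomes None, mirroring the
    <EmptyQuery> seen in the output above.
    """
    parsed = []
    for clause in query.split():
        field, _, value = clause.lstrip("+").partition(":")
        tokens = tokenize(value, pattern)
        parsed.append((field, tokens[0] if tokens else None))
    return parsed
```

With the query `+docid:39 +doctype:resource +itemtype:file`, the letters-only pattern gives `None` for `docid`, while the letters-and-digits pattern gives `"39"`; the other two clauses are identical in both cases.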

Maybe you should check this function (C:\wamp\www\moodle\search\Zend\Search\Lucene\Analysis\Analyzer.php line 100):

public function tokenize($data, $encoding = 'UTF-8')
{
    $this->setInput($data, $encoding);
    $tokenList = array();
    while (($nextToken = $this->nextToken()) !== null) {
        $tokenList[] = $nextToken;
    }
    return $tokenList;
}

The function nextToken is defined in both utf8.php and utf8num.php but in every case utf8.php is called. I think that if you could change that the problem would be solved!

Regards