Robots.txt file for Moodle 3.1

by Grigorii Andreev -

Hello everyone.

We integrated the Moodle 3.1 into Joomla site. The Joomla 3.5 site is http://mathematics-at-school.com, and in the folder http://mathematics-at-school.com/interactive/ we installed the Moodle math squizzes.

Recently we received the following errors in Google Webmaster:

Not found 404. (The smartphone)

http://mathematics-at-school.com/interactive/mod/quiz/report.php?sesskey=JySywtv2HM&download=ods&id=22&mode=overview&attempts=enrolled_with&onlygraded=0&onlyregraded=0&slotmarks=1

There are about 900 errors.

So, is it possible to fix it?

And about a special robots.txt file for running Moodle and Joomla together: which Moodle folders should we block from indexing? Is there a manual about it?


In reply to Grigorii Andreev

Re: Robots.txt file for Moodle 3.1

by Howard Miller -

What do you mean by "integrated the Moodle 3.1 into Joomla site"? That sounds worrying.

What is "Google webmaster" and what did you do to get these errors? What does "(The smartphone)" refer to?

I guess my confusion is because I don't know what Google webmaster does, but neither will other people, so it will help if you explain.

What is your concern with robots.txt? What problem do you have that you want to fix?

In reply to Howard Miller

Re: Robots.txt file for Moodle 3.1

by Grigorii Andreev -

Sorry, I usually communicate with tech specialists and they no need to explain more.

1. We have a math site, http://mathematics-at-school.com, running on Joomla 3.5.

2. Into the http://mathematics-at-school.com/interactive/ folder of the Joomla site we installed another system, Moodle. They work perfectly together.

3. We set Moodle to be indexed by Google. Before that, the Joomla site was being indexed well. Google has special tools for webmasters that show all warnings and errors regarding their site.

Questions:

1. Which Moodle folders should be hidden from indexing? Apart from courses and squizzes we use nothing, so the calendar and forum, for example, could be hidden.

2. A couple of days ago we received a huge number of errors from Google Webmaster Tools (404 Not Found) for smartphones. Any suggestions on how to fix them?

In reply to Grigorii Andreev

Re: Robots.txt file for Moodle 3.1

by Ken Task -

Am certain you didn't mean this as it came across:
"Sorry, I usually communicate with tech specialists and they no need to explain more."

In this case, would think more explanation *IS* needed before anyone could provide an accurate response.

Tech specialists in these forums probably don't run Joomla (which is already a version behind and should be upgraded) with Alledia Site Map NOR Install Faster .... whose requirement to set search in my browser is, uhhhh, suspect - not to mention all the ads in the Joomla content pages.

Would think that the Joomla has been set to SEF but has also used the htaccess file provided with the package - in your case, probably advised.   This clipped from your htaccess.txt file found (by anyone with a browser) at your document root:

## Begin - Rewrite rules to block out some common exploits.
# If you experience problems on your site block out the operations listed below
# This attempts to block the most common type of exploit `attempts` to Joomla!

Appears that sample htaccess.txt file is still directly accessible which means there could be bots poking and probing your Joomla already.

There's nothing in the moodle code directory that needs indexing by any search engine.
Not sure what squizzes is as that is not a standard (core) moodle mod/block.
As far as Moodle only using the directories you listed ... not true ... your moodle *does* require
login so the moodlecode/auth/ directory is definitely used but I don't think you want that indexed by Google, do you? 

If #1 below, then in order for Moodle to provide a link to a file to download, it has to make a DB query to acquire the metadata, find the contenthash, and translate that into what appears to the user as a URL in 'english'. The file system of Moodle is more like Google than the file system of a Joomla. That is to say, Moodle code has to use other directories/files and make DB queries.

So what's the goal here when it comes to Moodle?   Are we trying to:
1. turn Moodle into a file distribution system? 
or 2. are you trying to allow visitors to your site to access the quizzes as a preview before downloading?
or 3. are you trying to move away from download and desire to provide a quiz service?
or 4 ... none of the above.

However, to answer your question, which is a guess, I see you have only one rule in the robots.txt file

Disallow: /interactive/calendar/

I'd suggest dropping the /calendar/ and disallowing indexing of any/all of the moodle code.
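To see what the two options would cover, here is a minimal sketch using Python's standard `urllib.robotparser`. The `User-agent: *` line, the blanket rule text, and the shortened sample URLs are assumptions for illustration; only the `Disallow: /interactive/calendar/` line comes from the site's robots.txt. Python's parser does plain prefix matching, so it only approximates how a real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The single rule found on the site (the User-agent line is assumed).
CURRENT_RULES = """\
User-agent: *
Disallow: /interactive/calendar/
"""

# Assumed blanket rule covering everything under the Moodle directory.
BLANKET_RULES = """\
User-agent: *
Disallow: /interactive/
"""

BASE = "http://mathematics-at-school.com"
# Shortened sample URLs modelled on the ones reported in this thread.
SAMPLE_URLS = [
    BASE + "/interactive/calendar/view.php?view=month&course=1&time=907214400",
    BASE + "/interactive/mod/quiz/report.php?download=ods&id=22&mode=overview",
    BASE + "/",
]


def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


for name, rules in (("current", CURRENT_RULES), ("blanket", BLANKET_RULES)):
    print(name, blocked_urls(rules, SAMPLE_URLS))
```

With the current rule only the calendar URL is blocked; the quiz report URL is still crawlable. The blanket rule blocks both, but it also covers the course and quiz pages themselves, so it only fits if nothing under /interactive/ should appear in search results.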

If there are errors, please provide an example of one of those errors.

'spirit of sharing', Ken

In reply to Ken Task

Re: Robots.txt file for Moodle 3.1

by Ken Task -

Will make one more suggestion that will solve the issues (I think) but requires some more work on your end.

Create an Apache virtual host called 'interactive.math... blah.blah' and put all of Moodle in there. This would require the ability to run virtual hosts where you are hosted, and an entry in DNS for the site.

No longer a subdirectory inside a 'wide open' Joomla but in something like /home/sites/interactive/, thus one doesn't have to figure out which is romping on what.

'spirit of sharing', Ken


In reply to Grigorii Andreev

Re: Robots.txt file for Moodle 3.1

by Howard Miller -

"Sorry, I usually communicate with tech specialists and they no need to explain more."

They know everything, yet they still can't solve your problem? Oh well...

1. Ok

2. Very bad idea. You should have installed them separately and done this URL configuration in your web server configuration (virtual directory). I don't know much about Joomla but you can get strange effects if it uses an .htaccess file or similar.

3. ok

....

Other than that, you didn't really answer my questions. Why don't you want folders to be indexed? Who cares? What problem is it causing you?

Perhaps you should ask in Google forums why these tools are doing this. I very much doubt that it's a Moodle issue. 

In reply to Howard Miller

Re: Robots.txt file for Moodle 3.1

by Grigorii Andreev -

Sorry for the long delay, and thanks to everybody for trying to help me.

1. The Joomla sites are working well. They are being indexed by Google and Yandex (in Russian). We have two sites: one in English and the other in Russian. The Russian site was started about 4 years ago and now has 30 000 hosts per day. It is under heavy load and hosted on a virtual server. The site is tuned for SEO as required (robots, htaccess, SEF, mobile-friendly templates, etc.).

2. Six months ago we started the English site http://mathematics-at-school.com (the content is the same as on the Russian site), also on Joomla, and there are no big problems. It is tuned rather well for Google. If there is some trouble, we usually manage it quickly.

3. Two months ago we "involved" Moodle in both sites in the /interactive subfolder and created about 20 math quizzes. They work well, but there are many troubles with indexing. We created the courses with quizzes and wrote the title and description meta tags there (checked with a special Google Chrome plugin).

4. The quizzes should be in free access mode (public), and most users will access them from Google as a quiz service.

Errors:

1. When we scanned our sites for incorrect URLs (with Net spider, for example), there were many errors like:

http://mathematics-at-school.com/interactive/calendar/set.php?return=L2NhbGVuZGFyL3ZpZXcucGhwP3ZpZXc9bW9udGgmdGltZT0yMDUxMjQwNDAwJmNvdXJzZT0x&sesskey=HDdY9r43xT&var=showcourses

http://mathematics-at-school.com/interactive/calendar/view.php?view=month&course=1&time=907214400

http://mathematics-at-school.com/interactive/calendar/set.php?return=L2NhbGVuZGFyL3ZpZXcucGhwP3ZpZXc9bW9udGgmdGltZT05MDk4OTY0MDAmY291cnNlPTE%3D&sesskey=SHzvDM5AAE&var=showglobal

etc.

More than 10 000 entries. We disabled the calendar plugin, and there is the line Disallow: /interactive/calendar/ in robots.txt, but these are not helping.

2. In Google Web Master there are many errors (about 990) like: Not found.

http://mathematics-at-school.com/interactive/mod/quiz/report.php?sesskey=FXFOkSuHQ8&download=ods&id=22&mode=overview&attempts=enrolled_with&onlygraded=0&onlyregraded=0&slotmarks=1

http://mathematics-at-school.com/interactive/mod/quiz/report.php?sesskey=3VlHatLIQR&download=excel&id=10&mode=overview&attempts=enrolled_with&onlygraded=&onlyregraded=&slotmarks=1

http://mathematics-at-school.com/interactive/mod/quiz/report.php?sesskey=7Tu5jDGbzZ&download=ods&id=10&mode=overview&attempts=enrolled_with&onlygraded=0&onlyregraded=0&slotmarks=1

So how to fix them?
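The `return` parameter in the calendar URLs above looks like Base64. A minimal sketch, assuming it is standard Base64 of a relative path (an assumption, not something confirmed in this thread), decodes the first reported URL with Python's standard library to show where that link leads:

```python
import base64
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# First calendar URL from the scan results above, copied as reported.
CALENDAR_URL = (
    "http://mathematics-at-school.com/interactive/calendar/set.php"
    "?return=L2NhbGVuZGFyL3ZpZXcucGhwP3ZpZXc9bW9udGgmdGltZT0yMDUxMjQwNDAwJmNvdXJzZT0x"
    "&sesskey=HDdY9r43xT&var=showcourses"
)


def decode_return(url):
    """Decode the Base64 'return' query parameter (assumed encoding)."""
    encoded = parse_qs(urlsplit(url).query)["return"][0]
    # Restore padding in case it was stripped from the URL.
    encoded += "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded).decode("utf-8")


target = decode_return(CALENDAR_URL)
# The decoded path carries a Unix timestamp for the month being viewed.
month_start = int(parse_qs(urlsplit(target).query)["time"][0])
year = datetime.fromtimestamp(month_start, tz=timezone.utc).year

print(target)
print(year)
```

Under that assumption the link returns to a calendar month view dated decades in the future, which would be consistent with a spider following the calendar's month-to-month links without end rather than hitting real content.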

Questions:

1. I looked through some forums and found many cases of Moodle installed in a subfolder. In our situation, both sites (Joomla and Moodle) work fine, and I do not understand why it is not a good idea. If you have a tech manual regarding this question, I'll appreciate it!

2. Is it possible to set up autologin to Moodle with a special account? You see, many of our visitors are school kids aged 6-9, and it is hard for them to register.

Thanks for any ideas, and sorry for my bad English :)

In reply to Grigorii Andreev

Re: Robots.txt file for Moodle 3.1

by Ken Task -

Please see:

https://docs.moodle.org/31/en/Guest_access

You also might check out:

https://codeboxr.wordpress.com/2013/12/22/seo-for-moodle-how-to-optimize-your-moodle-for-search-engines-by-sadiq-m-alam-co-founder-and-chief-operating-officer-at-codeboxr/

Also:
https://github.com/brendanheywood/moodle-local_cleanurls#installation

Also see: https://h5p.org/
https://h5p.org/arithmetic-quiz

Moodle wasn't designed to be 'search engine friendly' AND run 'wide open' (no login/no session tracking, etc.) so basically, think you are attempting to fit a square peg in a round hole.

Maybe all you need is a WordPress with H5P plugin - rather than Joomla and Moodle.

'spirit of sharing', Ken

In reply to Ken Task

Re: Robots.txt file for Moodle 3.1

by Grigorii Andreev -

Ken, thank you for the links.

Moodle has a very powerful quiz system, but it is not convenient for public access.

After a long discussion, we should take a little time out to make a final decision.

We do not want to change Joomla because it works fine. We have to add the squizzes functionality with public access.


In reply to Grigorii Andreev

Re: Robots.txt file for Moodle 3.1

by Ken Task -

Welcome. Yes, Moodle does have a powerful quiz system and you are 100% right about public access.

Now you've mentioned 'squizzes'.   Am not familiar with the term but from what I can find they are: http://www.squizzes.com/

Maybe, you need to contact the makers of squizzes and ask about LTI (which seems to be evolving and for which many learning management systems now offer either side of that tool ... consumer as well as provider).


http://www.squizzes.com/about/contact-us/

Does your Joomla still have what used to be called a 'wrapper' link? (designed to 'capture' other content from another web site such that Joomla navigation/theme, etc. didn't totally disappear from the user's browser ... an iFrame). Those have issues in that one couldn't strip out navigational links of the 'captured' site and sometimes, clicks wouldn't/won't/will never stay in the iFrame window of a Joomla site - ie, they would break out of the jail Joomla tried to put them in.

'spirit of sharing', Ken