I have cloned the live site's server to give my client a sandbox they can use for their own security tests, and now I am trying to use /admin/tool/replace/index.php to globally change the old URL to the new URL on this new test server (per http://docs.moodle.org/22/en/Moodle_migration). I get the attached "Service Unavailable" page after about 10 seconds.
I checked my server logs and they have no mention of any problem.
I switched on Moodle's maximum debugging but it did not add information when I tried the "replace" script again.
I next tried to work around the replace script: I manually did the old/new swap within the database's contents via pg_dump ... sed ... psql, even swapping in base64-encoded strings to change the many absolute links that exist in blocks (see the note on the migration page). Then I had to rebuild the course cache, so I added the rebuildcoursecache patch (/admin/tool/rebuildcoursecache/) and attempted to run it for all 500 courses, only to encounter the same "Service Unavailable" issue again. I tried the patch script using only a few courses and it ran to completion OK. It turns out I can run the rebuildcoursecache script on up to 50 courses at once; any more than that and the "Service Unavailable" page appears.
But still those links to the old server remain in the blocks. It seems I cannot change those absolute links in the blocks without using that "replace" script, yet for me the script doesn't want to play.
I am out of ideas. Any suggestions?
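A note for anyone trying the manual route described above: whether a base64-encoded copy of the old URL appears verbatim inside a larger encoded value depends on byte alignment, so a plain text substitution can miss links. The sketch below decodes, replaces, and re-encodes instead, and also corrects the string length prefixes. It assumes the block settings are stored as base64-encoded, PHP-serialized data; the function names and example URLs are hypothetical, and it does not handle serialized data nested inside other strings.

```python
import base64
import re

# Matches the start of a PHP-serialized string token, e.g. s:4:"
_STRING_TOKEN = re.compile(rb's:(\d+):"')


def replace_in_php_serialized(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace `old` with `new` inside serialized strings and fix s:N lengths."""
    out = bytearray()
    pos = 0
    while True:
        match = _STRING_TOKEN.search(data, pos)
        if match is None:
            out += data[pos:]          # no more string tokens: copy the tail
            break
        out += data[pos:match.start()]
        length = int(match.group(1))
        start = match.end()
        end = start + length
        if data[end:end + 2] != b'";':
            # Not a well-formed string token; pass the matched text through.
            out += data[match.start():match.end()]
            pos = match.end()
            continue
        value = data[start:end].replace(old, new)
        # Re-emit the token with a length prefix that matches the new value.
        out += b's:%d:"%s";' % (len(value), value)
        pos = end + 2
    return bytes(out)


def rewrite_configdata(encoded: bytes, old: bytes, new: bytes) -> bytes:
    """Decode a base64 blob, swap the URL in the serialized data, re-encode."""
    decoded = base64.b64decode(encoded)
    return base64.b64encode(replace_in_php_serialized(decoded, old, new))
```

Each affected value would be read from the database, passed through `rewrite_configdata`, and written back, ideally on a copy of the data first.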
I just ran the script again and checked the logs:
No entry was added to the error log.
The access log shows the replace script arriving OK:
[14/May/2013:11:53:32 +1000] "GET /admin/tool/replace/index.php HTTP/1.1" 200 10355 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0"
(plus some 304 calls to yui and favicon)
The thought that it is an Apache thing sounds promising. One of the other things I did was to search my Moodle source code (to no avail) for all sorts of subsets of the string:
"The service is temporarily unavailable. Please try again later"
Thanks Ken for lending a hand.
It turns out that the above error page was being served-up by an impatient load balancer in use for this site, upstream from me. That explains why I couldn't find the error message in any source code in Moodle nor on the server.
Since this particular site's tests won't include load/stress tests, and traffic will be light, it makes sense to remove load balancing. After that was done for me, the script now runs (albeit slowly!) and there is hope!
I agree. I work surrounded by .NET devs who use TeamCity, and they just roll their eyes at the battles I have to fight with migrations and upgrades.
I do enjoy the struggle, but I admit it isn't good business to expend so much effort chasing bugs and hacking solutions, and it must be a complete nightmare for teachers and school admins just trying to get their courses online.
Hopefully this description/resolution report helps someone:
After getting past the Service Unavailable problem and starting the replace script, I found that after 15 minutes or so it died without finishing. Before I was served the content encoding error page (below), the screen output had included hundreds of lines in this format:
"UPDATE table_name SET column_name = REPLACE(column_name, $1, $2) [array ( 0 => 'search_string', 1 => 'replacement_string', )]"
In Firefox I only got as far as the event_handlers table (alphabetically). In Chrome I reached the survey table, and in IE I reached event_id. Now it seems every course I check does have links to the new server instead of the old one, but I cannot be sure they're all done because the script never reported success.
The error logs catch two issues when the script dies:
[Tue May 14 14:34:20 2013] [warn] [client 192.168.76.25] (104)Connection reset by peer: mod_fcgid: error reading data from FastCGI server, referer: BLAH BLAH.edu.au/admin/tool/replace/index.php
[Tue May 14 14:34:20 2013] [warn] [client 192.168.76.25] (104)Connection reset by peer: mod_fcgid: ap_pass_brigade failed in handle_request function, referer: BLAH BLAH.edu.au/admin/tool/replace/index.php
After trawling the net I surmised that these problems are related to a lack of resources (time/memory). The server just couldn't run the script to completion, probably because there is so much content to churn through. So I broke the task down by temporarily hacking the core Moodle source code inside /lib/adminlib.php: I found and modified the db_replace function near line 6400, using a home-made function and some commenting-out. (See the attached text file, and note that I returned my core code to its original state when I was done!)
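The idea of breaking the replace into smaller pieces can be sketched as below. This is not the code from my attachment, just a minimal illustration: it builds statements in the same format the script printed, in alphabetical table order, and can skip everything up to the last table that was already processed. The table and column names in the usage lines are hypothetical, and the `$1`/`$2` placeholders follow the PostgreSQL style seen in the output, so other drivers may need a different placeholder syntax.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Only plain identifiers are allowed, since names are interpolated into SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

Statement = Tuple[str, Tuple[str, str]]


def replace_statements(
    columns: Iterable[Tuple[str, str]],
    search: str,
    replacement: str,
    resume_after: Optional[str] = None,
) -> List[Statement]:
    """Build one UPDATE ... REPLACE statement per (table, column) pair.

    Tables are handled alphabetically; any table sorting at or before
    `resume_after` is skipped so an interrupted run can be continued.
    """
    statements: List[Statement] = []
    for table, column in sorted(columns):
        if resume_after is not None and table <= resume_after:
            continue
        if not (_IDENTIFIER.match(table) and _IDENTIFIER.match(column)):
            raise ValueError(f"unsafe identifier: {table}.{column}")
        sql = f"UPDATE {table} SET {column} = REPLACE({column}, $1, $2)"
        statements.append((sql, (search, replacement)))
    return statements


# Hypothetical usage: continue after the last table seen in the output.
pending = replace_statements(
    [("mdl_survey", "intro"), ("mdl_event", "description")],
    "http://old.example.edu",
    "http://new-test.example.edu",
    resume_after="mdl_event",
)
```

Running a short list of statements per request keeps each request well inside whatever time limit the web server or anything upstream of it enforces.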
Then, I used the replace/index.php script as before, only this time with its cut-down source code behind the scenes. It ran to completion OK. I followed up with 50-at-a-time calls to rebuildcoursecache, which I had deliberately disabled in the hacked script. Finally I was able to say, "Yes, I have replaced links to the old server using links to the new server".
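The 50-at-a-time approach amounts to simple batching, which can be sketched as follows. The course ids here are made up for illustration (a real list would come from the course table), and the batch size of 50 is just the limit that happened to work on this server.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical ids for 500 courses, split into runs of 50.
course_ids = list(range(1, 501))
course_batches = list(batches(course_ids, 50))
```

Each batch would then be submitted to the rebuild tool as its own request, so no single request runs long enough to be cut off.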
Now, hopefully, the testers won't get anywhere near the live site!