Session Management Moodle 2.1.x

This forum post has been removed

Number of replies: 28
The content of this forum post has been removed and can no longer be accessed.
In reply to Deleted user

Re: Session Management Moodle 2.1.x

by Tim Hunt -

Memcache sessions are certainly possible, with a bit of code. We have got it working on our servers. When I say we, it was Derek Woolhead who did all the work.

Throw the attached file into the local/ousession folder (you will need to create it), then do what it says about editing your config.php file.

This comes with absolutely no guarantee that it will work for anyone else, nor any support.

In reply to Tim Hunt

This forum post has been removed

The content of this forum post has been removed and can no longer be accessed.
In reply to Deleted user

Re: Session Management Moodle 2.1.x

by Juan Segarra Montesinos -

Hi Jason,

I'm seeing some issues with session management. It seems that session access is serialized using named locks. This implies that a user cannot do more than one thing at a time (or so it seems to me), and problems arise when an action takes longer than is reasonable to complete: you must wait for that task to finish.

This is an example of the connections running in MySQL:

| 19532 | aulavirtual | app1:46650 | aulavirtual_update | Query | 817 | Sending data | SELECT
quiz.id,
quiz.name,
c.shortname,
|
| 19533 | aulavirtual | app1:46652 | aulavirtual_update | Query | 95 | User lock | SELECT GET_LOCK('aulavirtual_update-mdl_-session-154',120) |
| 19582 | aulavirtual | app1:46875 | aulavirtual_update | Query | 102 | User lock | SELECT GET_LOCK('aulavirtual_update-mdl_-session-154',120)

Connection 19532 is executing a query that is not optimized, or is badly indexed, or simply needs time to complete... The other connections belong to other browser tabs opened on other Moodle pages... and they are hung.

Why is the session code implemented like this? Is it really necessary?

I'm not sure if this can lead to bigger problems once in production.

Thanks in advance.

In reply to Juan Segarra Montesinos

This forum post has been removed

The content of this forum post has been removed and can no longer be accessed.
In reply to Deleted user

Re: Session Management Moodle 2.1.x

by Juan Segarra Montesinos -

Hi Jason,

I've not put 2.1.x into production... not yet. This is what I've observed while testing the upgrade process. I don't know what the behaviour will be once in production, and I don't want to put sessions on the filesystem... if I can avoid that.

I think it's the same issue. The locks at 19533 and 19582 time out, but new locks are created and wait again and again... I have to dig more into this to know exactly what's happening.

Why are sessions serialized like this?

Regards :)

In reply to Juan Segarra Montesinos

This forum post has been removed

The content of this forum post has been removed and can no longer be accessed.
In reply to Deleted user

Re: Session Management Moodle 2.1.x

by Petr Skoda -

Hello,

Standard PHP file-based sessions use locking too. Moodle cannot work reliably without session locks, which is why database sessions in Moodle 1.9 cannot be considered fully supported.

It should be technically possible to implement a memcache session driver for Moodle 2.x. The patch above does not support all the features of the current db driver, so there is still a lot of work to be done on it, imho. I am afraid I will not have time to work on this in the near future, sorry.

Petr

In reply to Petr Skoda

This forum post has been removed

The content of this forum post has been removed and can no longer be accessed.
In reply to Deleted user

Re: Session Management Moodle 2.1.x

by Paul Vaughan -

Our Moodle 2.1 installation seems to be running just fine, but I am seeing slow queries of up to 28 seconds (so far!) in MySQL's slow query log. If nobody's complaining of slowness, then I'm not too bothered, however it doesn't seem right:

# Time: 110912 13:42:45
# Query_time: 28.367427  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=1315831365;
SELECT GET_LOCK('moodle2-mdl_-session-140972',120);

# Time: 110912 13:42:47
# Query_time: 26.810449  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=1315831367;
SELECT GET_LOCK('moodle2-mdl_-session-140972',120);

Our database server is generally only under nominal load, yet we still get slow queries... odd. :/

In reply to Paul Vaughan

This forum post has been removed

The content of this forum post has been removed and can no longer be accessed.
In reply to Deleted user

Re: Session Management Moodle 2.1.x

by Paul Vaughan -

Hi Jason, thanks for the reply.

Our db server is shared too (virtually on VMware, but we're seeing performance improvements), but only between Moodle 2 and a bespoke ILP project. All our other web systems use another shared db server.

Likewise, our servers have never, ever seemed slow, although with the first full week of teaching nearly over we're just starting to see it being used properly.

Today I saw that the mysql-slow.log was logging these fairly concerning results, and constantly too:

# Query_time: 95.918481  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=1316162439;
SELECT GET_LOCK('moodle2-mdl_-session-164029',120);

# Query_time: 96.041455  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=1316162439;
SELECT GET_LOCK('moodle2-mdl_-session-164029',120);

96 seconds to perform any kind of db operation is a startlingly large time, although like I said earlier, nobody seems to be complaining of slowness or a lack of responsiveness. Guess I need to read the MySQL manual.

In reply to Paul Vaughan

Re: Session Management Moodle 2.1.x

by Edmund Edgar -

Paul, are you sure it's even the database that's taking time here?

If I've understood this right, all we know is that:

1) A Moodle script is starting and creating its session lock, which in this case happens to be done using the database.

2) The script is then doing something that takes time.

3) The script is eventually terminating or giving up the lock voluntarily at least 96 seconds later.

The something that's taking time might be a bunch of database calls, or it might be a filesystem operation, or it might be a network request to an external web service, or it might be a really, really hard calculation, or it could be something sleep()ing on purpose.

There are lots of things that you could do to track down which script is taking time, but one that might help you is looking for out-of-order timestamps in your web server logs. For example:


123.123.123.123 - - [16/Sep/2011:10:45:39 +0100] "GET /answerimmediately.php HTTP/1.1" 200 10 "-" "Mozilla/5.0 (Windows NT 6.1; rv:6.0.2) Gecko/20100101 Firefox/6.0.2 -"
123.123.123.123 - - [16/Sep/2011:10:45:36 +0100] "GET /sleep.php HTTP/1.1" 200 20 "-" "Mozilla/5.0 (Windows NT 6.1; rv:6.0.2) Gecko/20100101 Firefox/6.0.2 -"

You can see by the timestamps that my script sleep.php was called 3 seconds earlier than answerimmediately.php, but answerimmediately.php appears first in the server log, because although it started later, it finished running first. If you've got some log entries that are seriously out of order, that's a sign of a slow-running script.
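The out-of-order check described above can be automated. What follows is only a minimal sketch in PHP, assuming an Apache-style access log where the bracketed timestamp records when the request arrived and the line is written when the request finishes. The function name find_out_of_order_requests() and the lag threshold are made up for illustration and are not part of Moodle.

```php
<?php
/**
 * Hypothetical helper: flags log lines whose timestamp is older than the
 * newest timestamp already seen, i.e. requests that started earlier but
 * finished later than their neighbours.
 */
function find_out_of_order_requests(array $lines, $minlagseconds = 1) {
    $latest = null;
    $suspects = array();
    foreach ($lines as $line) {
        // Capture the bracketed timestamp and the requested URL.
        if (!preg_match('/\[([^\]]+)\]\s+"\S+\s+(\S+)/', $line, $m)) {
            continue;
        }
        $time = DateTime::createFromFormat('d/M/Y:H:i:s O', $m[1]);
        if ($time === false) {
            continue;
        }
        $ts = $time->getTimestamp();
        if ($latest !== null && $latest - $ts >= $minlagseconds) {
            // The lag is only a lower bound on how much longer this request ran.
            $suspects[] = array('url' => $m[2], 'lag' => $latest - $ts);
        }
        $latest = ($latest === null) ? $ts : max($latest, $ts);
    }
    return $suspects;
}
```

Feeding it the two log lines above should flag /sleep.php with a lag of 3 seconds. On a busy server a lag of a second or two is normal, so a larger threshold is probably more useful.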

In reply to Edmund Edgar

This forum post has been removed

The content of this forum post has been removed and can no longer be accessed.
In reply to Deleted user

This forum post has been removed

The content of this forum post has been removed and can no longer be accessed.
In reply to Petr Skoda

Re: Session Management Moodle 2.1.x

by Juan Segarra Montesinos -

Hi Petr,

I understand that file-based sessions use locking at the application level. They have to manage concurrent writes to a file, or the file may end up corrupted.

But this problem doesn't exist when sessions are stored in a database. Concurrent access is managed by the database server itself.

What problem(s) exactly does GET_LOCK() solve?

Thanks in advance Petr

In reply to Juan Segarra Montesinos

Re: Session Management Moodle 2.1.x

by Petr Skoda -

An exclusive lock is necessary for the whole time a request has the session open. File-based sessions hold the lock for that whole period too, not just while writing the file.

See http://php.net/manual/en/ref.session.php - Session locking (concurrency) notes by brady at volchok dot com 17-Apr-2006 07:15

Moodle stores access control data in the user session, so we cannot risk corruption caused by concurrent requests overwriting each other's session data.
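To make the risk concrete, here is a toy simulation of two unlocked requests sharing one session. It is only a sketch: a plain PHP array stands in for the session store, and the keys canedit and lastpage are invented for illustration, not real Moodle session fields.

```php
<?php
// Simulated session store: one serialized blob per session id.
$store = array('sess1' => serialize(array('canedit' => true)));

// Without a lock, requests A and B both load the session before either saves.
$a = unserialize($store['sess1']);
$b = unserialize($store['sess1']);

// Request A revokes a permission; request B changes something unrelated.
$a['canedit'] = false;
$b['lastpage'] = '/mod/forum/view.php';

// A saves first, then B saves its stale copy over the top of it.
$store['sess1'] = serialize($a);
$store['sess1'] = serialize($b);

// A's change is lost: the permission is back.
$final = unserialize($store['sess1']);
var_dump($final['canedit']);
```

With an exclusive lock, B could not load the session until A had saved it, so B would start from A's data and the revocation would survive.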

Technically you may implement a new session handler that locks on pretty much anything and stores the data anywhere, but requests that use the session still cannot run in parallel.

If I understand the slow queries right, they indicate that some request is waiting some 30 s for another request to finish. That is, imo, expected behaviour for MySQL if you open two browser windows, run something with a complex query in one, and then ask for something else in the other.

Did you consider using PostgreSQL instead of MySQL? Some complex Moodle queries may run orders of magnitude faster on pg...

In reply to Petr Skoda

Re: Session Management Moodle 2.1.x

by Juan Segarra Montesinos -

So we will assume that this serialization is needed for Moodle to work properly.

30 seconds is not expected behaviour ;) Maybe we'll need to work on MySQL-specific query optimizations... or maybe I'm too paranoid ;) PostgreSQL is not an option right now.

Anybody in production having problems with this?

Thanks for the clarification ;)

In reply to Juan Segarra Montesinos

This forum post has been removed

The content of this forum post has been removed and can no longer be accessed.
In reply to Deleted user

Re: Session Management Moodle 2.1.x

by Petr Skoda -

If you insist on MySQL, please also try Oracle's 5.5 and 5.6 beta. MariaDB 5.3 could give some interesting results too (http://kb.askmonty.org/en/what-is-mariadb-53), and Percona is another option (http://www.percona.com/software/percona-server/feature-comparison/).

If you find any problems with these newer versions, please report them as new issues in our tracker and I will fix them (if possible). Thanks.

In reply to Petr Skoda

This forum post has been removed

The content of this forum post has been removed and can no longer be accessed.
In reply to Petr Skoda

Re: Session Management Moodle 2.1.x

by Juan Segarra Montesinos -

We're on MySQL 5.5 running InnoDB 1.1...

We'll report everything as far as we can ;)

Thanks Petr.

In reply to Juan Segarra Montesinos

Re: Session Management Moodle 2.1.x

by Paul Vaughan -

Note: We're running 5.1 on Debian, with the InnoDB engine. I wanted to use 5.5 as I read up on the improvements to InnoDB (and everyone seems to hate MyISAM anyway) but am sticking with the stable version instead of manually downloading 5.5.

In reply to Deleted user

Re: Session Management Moodle 2.1.x

by Juan Segarra Montesinos -

Hi Jason,

I'll try to audit those queries ;) On my 1.9 production Moodle I watch MySQL's slow query log (Moodle is not the only service running there). I log queries that run longer than 2 seconds and try to optimize them if they cause problems. Sometimes a simple change to a query makes all the difference.

I'm going to do that on 2.1.x... if we finally upgrade our installation.

Thanks Jason for your comments ;) 

In reply to Petr Skoda

Re: Session Management Moodle 2.1.x

by Alberto Lorenzo Pulido -

Hi everyone,

We are using PostgreSQL and are having some issues while the servers are under heavy traffic: the database server fills all its connection slots with a lot of "SELECT ... waiting" entries. I think this is related to pg_advisory_lock(), because these are queries waiting for the session lock (which is strange, unless a lot of users were opening multiple tabs all the time...).

It seems that reducing the time Moodle waits to get the session lock (SESSION_ACQUIRE_LOCK_TIMEOUT) in config.php has solved the problem for now. The default value of this parameter tells Moodle to wait 2 minutes before raising a timeout error to the browser. We set it to 30 s, which is by far long enough as a response time for a fully functional website. This stops connections to the database from stacking up while they wait for the session lock.
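For anyone wanting to try the same change, it would look roughly like the fragment below. This is a sketch under two assumptions that should be checked against your own Moodle version: that SESSION_ACQUIRE_LOCK_TIMEOUT is a PHP constant holding a number of seconds, and that Moodle only defines its 2-minute default when config.php has not already defined the constant.

```php
<?php
// config.php fragment (assumed placement: before lib/setup.php is included,
// so that Moodle's own default is not defined first).

// Wait at most 30 seconds for the session lock instead of the default 2 minutes.
define('SESSION_ACQUIRE_LOCK_TIMEOUT', 30);
```

A shorter timeout does not remove the contention. It only makes waiting requests give up sooner, so the database connections they hold are freed earlier.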

I will keep testing and looking for more information about why exactly this happened.

Any feedback is welcome!

Thanks, and sorry for my English :)

P.S.: We have a centralized multi-portal Moodle installation with about 25,000 users and 4,000 active courses, but on Moodle 2.3 we have 17,500 active users and 1,000 active courses.

In reply to Alberto Lorenzo Pulido

Re: Session Management Moodle 2.1.x

by David Ackerman -

I'm still seeing major issues with locking.

Database: PostgreSQL v9.1
Moodle: 2.2.4+ (Build: 20120719)

I turned on statement logging for postgresql on a single forum page hit (attached the sql statement log with identifiable info removed). Between the time that the pg_advisory_lock is obtained and when it is released, there are 59 database queries (not counting postgresql metadata queries).

There is a 2 second window in which lock contention can easily happen. I've been able to do it via holding down refresh on a web browser. I've also been able to do it by clicking on a link to the forum repeatedly (simulating a frustrated user who might keep clicking on a link thinking that this will somehow get them to the page faster – it may seem absurd to a web dev to do that, but in real life, I've seen people do that far too often to simply discount it).

In production, we've seen this stop service to the entire website for up to the lock period. Shortening the lock timeout helps, but only turns one bigger (2-3 minute) outage into several mini-outages (10-15 seconds). Although it's easy to produce lock contention for a single user, I'm still unsure how it translates into bringing down the entire site (other than database connections getting filled). However, every one of the site-wide outages shows several pg_advisory_lock timeouts immediately preceding the recovery. I've looked into as many other causes as possible and keep coming back to this.

There's only so much we can solve at a systems level, making queries more efficient, trying to make the system handle lock contention somewhat gracefully, etc. But it keeps coming back to the fact that there are a lot of queries being executed during a lock. Do all of these really need to be during that phase?

I've scoured the forums for information on this, and it feels like it's not considered a very high priority problem (and/or that it's too hard to solve). I'm not a moodle developer myself, but I'm in charge of making sure the system is running properly, so I get the call when it goes down. That's why I've been looking into it. I've looked at later releases, and it doesn't look like many changes have gone into the session locking code. Is this being addressed in other ways? I see that later releases allow you to turn off session locking for guests and non-logged in users, which will certainly help, but we're seeing it happen with links that are only accessible once logged in.

One suggestion if there's absolutely no way to reduce the number of queries that take place during a lock would be to implement some sort of rate limiting within the framework. This is hard to do outside of the moodle code, as all the available tools I've seen can only really rate limit based on IP address, and if students are accessing the site from within an institution, most hits will come from a small number of IP addresses - rate limiting in that case would be too harsh. But once logged in, the site itself could effectively throw up an error page if it detects a single student going to a particular URL at a rate that would cause locking problems. Better to give them an error right away than to have them frustrated, clicking away, and making the problem worse.
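To show what such a per-user check might look like, here is a rough sketch of a sliding-window limiter keyed on user and URL. Everything in it is hypothetical: allow_request(), the limits, and the in-memory array are made up for illustration and are not Moodle APIs. In a real deployment the counters would have to live in some shared store outside the session, since reading them from the session would itself need the session lock.

```php
<?php
/**
 * Hypothetical limiter: returns false once a user has hit the same URL
 * $maxhits times within the last $window seconds.
 * $hits maps "userid|url" to a list of recent request timestamps.
 */
function allow_request(array &$hits, $userid, $url, $now, $maxhits = 3, $window = 2) {
    $key = $userid . '|' . $url;
    $recent = array();
    foreach (isset($hits[$key]) ? $hits[$key] : array() as $ts) {
        // Keep only the requests that are still inside the window.
        if ($now - $ts < $window) {
            $recent[] = $ts;
        }
    }
    if (count($recent) >= $maxhits) {
        $hits[$key] = $recent;
        return false; // Caller would show an error page instead of queueing.
    }
    $recent[] = $now;
    $hits[$key] = $recent;
    return true;
}
```

With the default limits, a fourth hit on the same forum URL within two seconds would be refused immediately, while a request made after the window has passed would go through again.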

In reply to David Ackerman

Re: Session Management Moodle 2.1.x

by Rebecca O'Connell -

Could you please tell me how you shortened the lock timeout? We are experiencing a similar issue, and even shortening the outage period would be an improvement.

In reply to Petr Skoda

Re: Session Management Moodle 2.1.x

by Tomasz Muras -

"Some complex Moodle queries may run orders of magnitude faster on pg..."

...and some queries will run orders of magnitude slower on pg :)

I believe that swapping DBs will not help here. There are some scripts in Moodle that will take a long time to execute no matter where they run (think of the database activity search, for instance).

The problem of hanging sessions is not unique to Moodle, but it is a big issue in general. You can easily imagine a user opening several tabs at the same time and his connections hanging and just eating resources on the server. It's actually fairly easy for a single user to bring down a whole Moodle server; it's just a matter of finding a script that executes for too long.

Sessions should be locked to avoid the corruption, I agree. But ideally we should reduce the contention - e.g. open session as read-only wherever possible and synchronize write operations. Do you think it's feasible and technically do-able?
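As an illustration of the read-only idea in plain PHP (not Moodle's own session handler, which would need equivalent support built in), PHP 7 and later let a script load the session data and release the lock in one step. The userid key below is invented for the example.

```php
<?php
// Plain PHP 7+ sketch, not Moodle's session API.
// read_and_close loads $_SESSION and releases the session lock immediately.
session_start(array('read_and_close' => true));

// Hypothetical key; reads still work after the session has been closed.
$userid = isset($_SESSION['userid']) ? $_SESSION['userid'] : null;

// Slow, read-only work can run here without blocking the user's other tabs.
// Any change made to $_SESSION from this point on is NOT saved.
```

Pages that do need to write would keep the normal locked behaviour, so the contention is limited to requests that actually change session data.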

Tomek

In reply to Tomasz Muras

Re: Session Management Moodle 2.1.x

by David Ackerman -

So here's something that's admittedly a total hack... but I'm interested in knowing whether (besides the ugliness and location of the code) there's something fundamentally wrong with it. The assumptions are:

  1. The majority of GET requests aren't going to need to write specialized data to the session.
  2. Web users are far less likely to try and submit multiple POST requests at the same time than multiple GET requests (it's quite natural, for example to open a few tabs at once for a site). Of course someone actually trying to be malicious wouldn't restrict themselves to GET requests, but the problem I'm seeing is non-malicious use causing things to lock up.

Here's the code, which can be put in config.php (hey, I did warn that it was an ugly location):

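if ($_SERVER['REQUEST_METHOD'] === 'GET') {
    // Leave requests under /lib/, /theme/ and /login/ untouched.
    if (!preg_match('|^/lib/.*|', $_SERVER['REQUEST_URI'])
            && !preg_match('|^/theme/.*|', $_SERVER['REQUEST_URI'])
            && !preg_match('|^/login/.*|', $_SERVER['REQUEST_URI'])) {
        // Close the session for writing, which releases the session lock early.
        session_get_instance()->write_close();
    }
}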

If you watch a statement log on a normal page with this in place, you'll see the update to mdl_sessions as the only query happening during the lock. All other queries execute after.

Now, I'd think a more integrated solution could use some sort of opt-out declaration at the page level instead of the ugly if-statement filtering. Even making it an opt-in sort of system where basic functionality is unchanged, but you could declare your page to be a "safe read" page, would be nice. I suppose you could argue that there's already an opt-in system: i.e. just use "session_get_instance()->write_close()". I'd argue that the declarative approach might encourage more widespread use and make it easier to manage.

I know that it's near impossible to ensure that independently developed plugins adhere to every "best practice", but is there a specific reason that the write_close() function isn't being used more liberally in the standard codebase? I can't see a reason not to do a write_close(), for example, on a simple forum view. The difference in responsiveness is pretty dramatic in my tests (I certainly can't lock it up using the same simple methods of holding down refresh or rapidly clicking a link).

I'm not trying to suggest that there's a trivial solution, and it would be great to hear specific examples of where the above breaks down (I'm sure it does). I am trying to spark a little more discussion on the basic problem (given the potential impact of site outages, I'm surprised I've only found a few forum posts and bug tickets about it), as I just don't think the fact that a perfect solution may be hard or impossible should prevent workable solutions, even if they're just provided as options.
