Our main Moodle site has been deadlocking in a really strange way, and we may finally have worked out why.
I am writing about it here for two reasons. First, it still does not completely make sense to me, so perhaps someone will be able to explain further. Second, other people might be affected by the same thing, so raising awareness of it can't hurt.
What was happening was that lots of Apache processes were stuck handling requests for
POST /webservice/soap/server.php?wstoken=[TOKEN HERE]
and they were stuck because they were waiting for a database lock on the external_tokens database table (which in our database only has 11 rows).
What happens in webservice/lib.php is that every time webservice::authenticate_user or webservice_server::authenticate_by_token checks a token, it then does $DB->set_field to update the lastaccess column.
This is made worse by the implementation of webservice_soap_server. Because it hard-codes ini_set('soap.wsdl_cache_enabled', '0'); (!?), for every single request served it makes an HTTP request to itself ($CFG->wwwroot/webservice/soap/server.php?token=[...]&wsdl=1) to compute the WSDL. There is no reason at all why an HTTP request should be involved there. In any case, fetching the WSDL verifies the token, and so does another write to the lastaccess column.
Even so, the total number of web service calls our system handles is quite low (only about 20 per minute, while the server is handling over 50 page views per second), so it is not clear why this is enough to completely lock up the server, but it is.
Does this make sense to anyone?
Then, what can we do to fix this? There seem to be some obvious wins (which should probably have been done as part of MDL-52208):
- Only disable the cache_wsdl option if DEBUG_DEVELOPER is on. (In live use, the code won't be changing, so the list of available methods won't change.)
- Change it so that it does not do an HTTP request to get the WSDL. Instead, generate it once (not on every single request) and save it in a temp file.
- The lastaccess value is only used in one place, for one type of token (lib/classes/task/session_cleanup_task.php). However, it is always written for any type of token. Perhaps we should only set it for tokens of type EXTERNAL_TOKEN_EMBEDDED?
- Does anything actually use external_create_service_token, or are embedded tokens a dead concept?
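
To make the second and third suggestions a bit more concrete, here is a rough sketch in plain PHP. It is not a patch against Moodle. All the function names and the SKETCH_* constants are made up for illustration, and the constant values are only stand-ins for Moodle's real token type constants. The 60-second minimum interval between writes is an extra assumption of mine on top of the "embedded tokens only" idea, on the theory that a slightly stale lastaccess would be acceptable. The WSDL generator is passed in as a callback, since how the WSDL gets built without an HTTP request is a separate question.

```php
<?php
// Hypothetical stand-ins for Moodle's token type constants.
// The values are illustrative only.
const SKETCH_TOKEN_PERMANENT = 0;
const SKETCH_TOKEN_EMBEDDED = 1;

/**
 * Decide whether a token check needs to write the lastaccess column.
 *
 * Assumptions: only embedded tokens need lastaccess, and a value that is
 * up to $minwriteinterval seconds stale is good enough.
 */
function sketch_should_write_lastaccess(int $tokentype, ?int $lastaccess,
        int $now, int $minwriteinterval = 60): bool {
    if ($tokentype !== SKETCH_TOKEN_EMBEDDED) {
        // Nothing reads lastaccess for other token types, so skip the write.
        return false;
    }
    if ($lastaccess === null) {
        // Never recorded before, so record it now.
        return true;
    }
    // Skip the write if a recent enough value is already stored.
    return ($now - $lastaccess) >= $minwriteinterval;
}

/**
 * Return the WSDL from a file cache, calling $generate only when no
 * cached copy exists yet.
 */
function sketch_get_cached_wsdl(string $cachedir, string $cachekey,
        callable $generate): string {
    $path = $cachedir . '/wsdl_' . sha1($cachekey) . '.xml';
    if (is_readable($path)) {
        $cached = file_get_contents($path);
        if ($cached !== false) {
            return $cached;
        }
    }
    $wsdl = $generate();
    // Write to a temporary name and then rename, so that concurrent
    // readers never see a partially written file.
    $tmp = tempnam($cachedir, 'wsdl');
    if ($tmp !== false) {
        if (file_put_contents($tmp, $wsdl) !== false) {
            rename($tmp, $path);
        } else {
            unlink($tmp);
        }
    }
    return $wsdl;
}
```

In webservice/lib.php, a check like the first function would sit in front of the existing $DB->set_field call, so that most token checks become read-only. The cache key for the WSDL would need to change whenever the set of available functions changes, and the cache would be bypassed when developer debugging is on.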