Hardware and performance High DB Connections [State sleep]

Hardware and performance High DB Connections [State sleep]

by jahangir aslam -
Number of replies: 4

Hello we are experience a weird situation with our Moodle 3.9 installation and we cannot find out how its happening.

So I will share our experience in the hope that you can shed some light on the situation.

We are hosting our Moodle instance on AWS, using a Load Balancer, EC2 instances behind the ELB, RDS (Aurora Mysql) and Redis ElasticCache for caching.

Our servers are running  Apache/2.4.39, PHP 7.4 with php-fpm.

Randomly during the day Moodle (or WebServers [nothing else is running on the webservers by the way]) are opening too many DB connections to the DB, aroung ~800 connections.

As soon as this starts happening our Moodle becomes unreachable, the response time of the website goes crazy, like 30-40s.

When I do SHOW PROCESSLIST, I see 800 rows with state SLEEP.

I restart php-fpm, and httpd, DB Connections suddenly drop and few minutes later they go up again.

During this period server RAM and CPU are below 30%.

Could you please let us know if you had similar problem and help us resolve it?


P.S:- We are running all crons off peak time around mid night

Average of ratings: -
In reply to jahangir aslam

Re: Hardware and performance High DB Connections [State sleep]

by Ken Task -
Picture of Particularly helpful Moodlers

While you've provided setup info and some graphs, not much technical information concerning many things like # of nodes, logs of those nodes,  the queries seen in server logs, etc.  

So .. maybe this will help:

https://support.sciencelogic.com/s/article/1276

Title is 'Determining the Source of Sleep Queries'

Comment: Really?   Running cron just once a day around midnight?

https://docs.moodle.org/310/en/Cron

'SoS', Ken

In reply to Ken Task

Re: Hardware and performance High DB Connections [State sleep]

by jahangir aslam -

Thanks Ken for your prompt reply, 
Cron is a continues cycle, one loop finish then another start 24/7. So there is no overlapping cron task 

In reply to jahangir aslam

Re: Hardware and performance High DB Connections [State sleep]

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
Moodle Cron is more complicated than that. Cron itself schedules other tasks at various intervals. They are parallellzed and some even get detached from the main php thread (background processing). And those sub-tasks waiting need cron to poll them regularly. That is the reason for this short (1 min) interval.

For the same reason debugging these scheduled tasks become difficult or even impossible. Reduce as many "off-line" tasks and follow https://docs.moodle.org/400/en/Cron#Debugging_Scheduled_Tasks, etc.
In reply to jahangir aslam

Re: Hardware and performance High DB Connections [State sleep]

by Howard Miller -
Picture of Core developers Picture of Documentation writers Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers
High numbers of sleep states is highly unlikely to be a database issue.

Moodle processes start by opening a database connection and it remains open until the process completes. The upshot of this is that anything that delays that process will result in a 'sleep' connection at the database. This isn't normally an issue but if you see lots of them then it might be. You really need to establish what is causing your Moodle processes to stall. Things like file locking issues in moodledata, network problems, cache issues are all possibilities. You probably need to do a lot of monitoring but understand that (almost certainly) the database is the symptom and not the root cause.

For example...
We've seen this with large numbers of students on overly large courses when they all log in at the same time. This became a problem during Covid due to online lectures. No matter what you tell students, they all log in at exactly 9am for a 9am lecture. 

However, we've also seen it with (as above) internal networks running out of capacity, NFS problems, and even MP4 files being "chunked" by overzealous browsers. It's probably the most common symptom of overloaded servers. 
Average of ratings: Useful (2)