Mysql connection/service failure

Mysql connection/service failure

by Proyectos Moodle -
Number of replies: 12

Hi all.

We have a problem with a facility that is made of two Moodle 3.0.1 front-end Oracle servers Linux 7.2 and Linux 7.2 running Oracle backend MySQL 5.6

They are on the same network segment to 1000 Mbps and three servers have 8 GB RAM.

When users access their accounts, servers, and have 30 different users, MySQL collapses and "Error is received: Database connection failed
That it is possible the database is overloaded or Otherwise not running properly. ".

We use php 5.4

The MySQL log indicates the following:

"1/25/2016 3:05:25 0 [Warning] With implicit DEFAULT TIMESTAMP value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
29404 1/25/2016 3:05:25 [Warning] Buffered warning: Changed limits: max_open_files: 1024 (requested 5000)

29404 1/25/2016 3:05:25 [Warning] Buffered warning: Changed limits: table_cache: 431 (requested 2000)

1/25/2016 3:05:25 29404 [Note] Plugin 'FEDERATED' is disabled.
1/25/2016 3:05:25 29404 [Note] InnoDB: Using atomics to ref count buffer pool pages
1/25/2016 3:05:25 29404 [Note] InnoDB: The InnoDB memory heap is disabled
1/25/2016 3:05:25 29404 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
1/25/2016 3:05:25 29404 [Note] InnoDB: Memory barrier is not used
1/25/2016 3:05:25 29404 [Note] InnoDB: Compressed tables use zlib 1.2.3
1/25/2016 3:05:25 29404 [Note] InnoDB: Using Linux native AIO
1/25/2016 3:05:25 29404 [Note] InnoDB: Using CPU instructions crc32
1/25/2016 3:05:25 29404 [Note] InnoDB: Initializing buffer pool, size = 4.9g
1/25/2016 3:05:26 29404 [Note] InnoDB: Completed initialization of buffer pool
1/25/2016 3:05:26 29404 [Note] InnoDB: file format is supported Highest Barracuda.
1/25/2016 3:05:26 29404 [Note] InnoDB: Log scan progressed past the checkpoint LSN 11,490,285,297
1/25/2016 3:05:26 29404 [Note] InnoDB: Database was not shutdown normally!
1/25/2016 3:05:26 29404 [Note] InnoDB: Starting crash recovery.
1/25/2016 3:05:26 29404 [Note] InnoDB: Reading tablespace information from the .ibd files ...
1/25/2016 3:05:29 29404 [Note] InnoDB: Restoring possible half-written data pages
1/25/2016 3:05:29 29404 [Note] InnoDB: buffer ... from the doublewrite
InnoDB: Doing recovery: scanned up to log sequence number 11490430092
1/25/2016 3:05:29 29404 [Note] InnoDB: Starting an apply batch of log records to the database ...
InnoDB: Progress in percent: 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
1/25/2016 3:05:30 29404 [Note] InnoDB: 128 rollback segment (s) are active.
1/25/2016 3:05:30 29404 [Note] InnoDB: Waiting for purge to start
1/25/2016 3:05:30 29404 [Note] InnoDB: 6/5/23 started; log sequence number 11490430092
1/25/2016 3:05:30 29404 [Note] Server hostname (bind-address): '*'; port: 3306
1/25/2016 3:05:30 29404 [Note] IPv6 is available.
1/25/2016 3:05:30 29404 [Note] - '::' Resolves to '::';
1/25/2016 3:05:30 29404 [Note] created on IP socket Server '::'.
1/25/2016 3:05:30 7ff2eadf7740 InnoDB: Error: # sql6459_f88_b` tmp`.` table `does not exist in the InnoDB internal
InnoDB: data dictionary though MySQL is trying to drop it.
InnoDB: Have you copied the .frm file of the table to the
InnoDB MySQL database directory from another database?
InnoDB: You can look for Further help from
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html
1/25/2016 3:05:30 7ff2eadf7740 InnoDB: Error: # sql6459_f87_27` tmp`.` table `does not exist in the InnoDB internal
InnoDB: data dictionary though MySQL is trying to drop it.
InnoDB: Have you copied the .frm file of the table to the
InnoDB MySQL database directory from another database?
InnoDB: You can look for Further help from
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html
1/25/2016 3:05:30 29404 [Note] Event Scheduler: Loaded 0 events
1/25/2016 3:05:30 29404 [Note] / usr / sbin / mysqld: ready for connections.
Version: '6.5.23' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL) "

How can we solve this problem?

Your help will be highly appreciated,

E. Lopez.

Average of ratings: -
In reply to Proyectos Moodle

Re: Mysql connection/service failure

by Leon Stringer -
Picture of Core developers Picture of Particularly helpful Moodlers

Hi,

There are two problems here.

Firstly, is MySQL now up and running following the crash? It looks from the log like some temporary table files were left after the crash. This shouldn't be a problem but there may be some files that can be cleaned up.

Secondly, you need to find what caused the crash. The log shows MySQL starting following the crash, are there any older entries showing details of the crash itself?

I'm not sure what timezone settings you have but the log timestamps like "1/25/2016 3:05:25" look like 03:05:25, i.e. at night time. Does this correlate with the time you're getting the "Database connection failed" message?

Leon

Average of ratings: Useful (1)
In reply to Leon Stringer

Re: Mysql connection/service failure

by Proyectos Moodle -

Hello Leon, and many thanks for your help.

These problem has ocurred two times last week, with only 40-50 people connected to Moodle, as client said.

The three servers has each one 8 GB RAM, the two front-ends are dedicated to Apache+PHP+Moodle, the back end is running only MySQL and SSH services.

The second time the mysql server went down, client told me than MySQL server went completely down, and they could not restart again manually. So they reboot the virtual server in order to operate again.

The three severs are virtualized with VMware, current directives in my.cnf are:

innodb_buffer_pool_size = 5G

innodb_log_file_size = 600M


Which files you suggest have to be cleaned, according to log file?

I don´t have log file with info before the crash, I will ask for it in order to give you more information. I saw on screen with client some info previous to crash, log showed lots of connections to port 3306 in state "time_wait" so I understand Mysql was unable to attend all the concurrent connections of users and queries until it crashed. What do you think?

Is there any solution to increase the serving of multiple connections of database so it can face high workload? We expect more or less 600 users at the same time, trying to access exams and questionaries in one of the servers.

Finally, the timestamps are the same as date/time of server. They are wrong, we have to set this information. The crash ocurred aprox. at 16:30 Hours.

Thanks in advance, for your help.

E. Lopez.




In reply to Proyectos Moodle

Re: Mysql connection/service failure

by Leon Stringer -
Picture of Core developers Picture of Particularly helpful Moodlers
Hi,

How big is the database? And how many CPUs does the server have? 8 GB RAM sounds fine but that depends on the database size. And what version of Moodle is this?

An overloaded MySQL server shouldn't crash, it should just become increasingly slower to respond until its load reduces. If you ran out of memory then maybe a crash would occur but that's normally avoidable. Are there any updates available to the MySQL Server package?

Are you certain the database server itself isn't crashing? If MySQL crashed it should be possible to just start the service again. It will detect that it crashed (as you can see from your logs) and attempt to recover any transaction logs. Rebooting the server should never be required following a MySQL crash, it may indicate the problem is with the server.

Removing the temporary tables (if they still exist) based on the file names from your log is probably a case of:
  1. Take a backup of your database first.
  2. Run statement: DROP TABLE `#mysql50##sql6459_f88_b`
  3. Run statement: DROP TABLE `#mysql50##sql6459_f87_27`

Hope this helps,

Leon
In reply to Proyectos Moodle

Re: Mysql connection/service failure

by Ken Task -
Picture of Particularly helpful Moodlers

Is that all that's in my.cnf (conf for MySQL DB server)????

Would think there would be a setting for max_connections above the default 151!!!

On DB server:

fgrep 'max' /etc/my.cnf

Have you run any sort of tuner app on the DB server to see what it might recommend?   mysqltuner.pl might help.

'spirit of sharing', Ken


In reply to Ken Task

Re: Mysql connection/service failure // Thanks Leon and Ken

by Proyectos Moodle -

Thanks a lot for your help and comments, Leon and Ken.

We have been cheching today for missed configuration parameters such as max_connection, open_file_limit and table_open_cache among others.

We are also use tuning-primer.sh   ... what do you think, Ken? It´s a a good tool?

We are going to make a test in an evaluation MV very similiar to productive server, in order to tune the MySQL server according to your ideas and others we have reviewed today.

I will post our my.cnf config file so you can review and give us more suggestions in order to solve this issue.

Thanks for your help and support   smile


E. Lopez.

In reply to Proyectos Moodle

Re: Mysql connection/service failure // Thanks Leon and Ken

by Ken Task -
Picture of Particularly helpful Moodlers

First, the best resource I've found for MySQL related info is Percona.   Google that. ;) ... that's other than the official MySQL docs.

tuning-primer.sh ok ... does the same thing mysqltuner.pl does me thinks.   It's just a tool to give you some information + make *suggestions* as to config changes.  Why not run both and compare?

See if you can't find the example config files that might have come with your MySQL server on Linux:

find / -name my-huge.cnf

If it's there, compare it with your my.cnf.   On a CentOS 5 running 5.5.37 that file is in:

/usr/share/doc/mysql-server-5.5.37/my-huge.cnf

Newer versions of MySQL (beginning 5.6.x) only provide one example default config file now, but look for it on your system.

Really don't know that testing changes on an evaluation VM would do much good ... what counts is usage/history ... info a used DB server would keep over time. 

And as far as sharing your current my.cnf file here not sure that will do much good either ... unless it was something glaring (like I mentioned already) it would be anyone's guess .... given the technical information shared thus far you'd still be minus the usage info and don't think anyone wants to pour over mysqlquery logs, etc. in these forums.  Proper settings for any MySQL server can only be determined by monitoring that server *on* that server/network/setup.

I do know that I run tuner at beginning of the year ... toward the middle of the first 6 weeks ... just before semester ... and before the end of the academic year on some school servers - those heavily used.   And just about every time tuner is run there is some little tweak recommended.   Some I apply ... some I don't.   In your case .. the size of your implementation, it might worth checking every other day at first and then at the end of every week for a while.   Most changes of significance do require restart of the service ... which wipes the slate clean as far as usage is concerned.   In which case, your next check might have to be after 2-3-4-5 days.  Restarting service disrupts usage until service restarts so it might be a good thing to warn students/faculty of those times.   Wouldn't hurt to put that into a bash shell script that ... put the site into the maintenance mode and ran tuner from the web server.   Then if needed make the tweaks by hand on the DB server and restart the MySQL service there first.   Once up and tested via web server, take site out of maintenance mode and monitor MySQL service a little.

'spirit of sharing', Ken

In reply to Ken Task

Re: Mysql connection/service failure // Thanks Leon and Ken

by Proyectos Moodle -

Hello Ken and friends.


Thanks a lot for your help. I´ve downloaded mysqltuner.pl tonight, and I will use it tomorrow on site.

This evening we have made some tests with a 'copy' of server, an using this my.cnf config file:


table_open_cache = 20000

open_files_limit = 75000

back_log = 3000

max_connections = 3000

max_allowed_packet = 4M

key_buffer_size = 380M

sort_buffer_size = 4M

read_buffer_size = 4M

read_rnd_buffer_size = 8M

thread_cache_size = 8


I´ve tuned some changes in kernel communications, such as net.core.somaxconn, net.ipv4.tcp_max_syn_backlog  , also increased MySQL max_open_files by editing mysqld.service config file

We ran several tests simulating simultaneous access of 300 users (open front page, log-in, search course, enter course, close session) and after some adjustments on this parameters MySQL appears to be working better. The problem now appears to be on apache frontend server configuration file, so the server uses almos all the 8GB RAM trying to serve all the connections.

We expect more or less 600 simultaneous users, and we are asking for 8 GB more on RAM.

What do you think of these parameters on MySQL?

What do you think of increase RAM on frontend in order to have a better response time to users? (simulation requires 5 - 7 minutes to complete, and in this moment we are gettin 25% error in login process, but all the failure messages come from apache).

Thans a lot in advance,

E. Lopez.



In reply to Proyectos Moodle

Re: Mysql connection/service failure // Thanks Leon and Ken

by Ken Task -
Picture of Particularly helpful Moodlers

'What do I think of those parameters?   1) am green with envy!  wish I had a setup like yours 2) thought I said that unless someone had pretty much the same setup as you - that includes network ... it's anyone's guess. 

But, what did top look like on the DB server as you ran the simulation?  Is the DB server using any swap space? (disk IO has to slow it down).   Check out info you can find on skip-networking.

Uhhh the simulation did test login for 300  but not login for 600 then a quiz.  No, I don't have a jmeter plan to answer more than one question in a quiz.  Are you sure that 600 clients will be taking a quiz at the very same time?

Was the 25% error in login due to time outs between Web server and DB server?  What's the settings for php on the web service ... extra memory to use, time a script can run, etc..   We are running files for sessions and NOT DB, right?  Running memcache(d) on web servers?   What else do you have running on those two web servers?   Heck, one can always throw more memory at it but that doesn't mean solution to a problem.   But ... 16 Gig is better than 8 Gig and I've always found it nice to have more than enough RAM.

And you do realize that when you finally get the test environment and production environment tweaked and appear to function ... it ain't over then.

Best resource for MySQL nfo IMHO ... Percona articles.   You are on the right track!!!

'spirit of sharing', Ken

In reply to Ken Task

Mysql connection/service failure --->> Problem Solved !!!

by Proyectos Moodle -

Hello Ken and friends.

Finally, the project has been terminated. We used system´s approach to face this issue. This is a summary:


1) We made some tune in kernel so each front-end server can accept an extraordinary number of packets from their four GB Ethernet NICs. It was tuned to accept 8192 in receive (SYN) buffer of TCP/IP and somaxconn in order to have enough capacity for generating ports to connect each session to MySQL server.

2) We setup Apache to receive simultaneous connections of 600 users. I had to modify httpd.conf and ssl.conf to meet same Timeout for example, and httpd.conf to support adequate number of MinSpareServers, MaxSpareServers, Clients and StartServers.

3) I made some changes in max open files for service mysqld from OS, so it can have 100000 open files, also tuned /etc/security/limits.conf for the same.

4) Finally, we made a small adjustment in certain parameters of our /etc/my.cnf file, based in the same information I posted before. For example, we increased max_connections, table_open_cache, open_files_limit and others, but only small adjustments to suppot the workload.


Today, client ran a simulation with 300 users, and then 600 without errors!!!  smile

As I said, each simulated client: opened Moodle front page, loged-in, search and enter his course and close session.

Report graphs obtained from the stress test software (for example response time) changed from exponential (as before) to linear, with a very small slope. I estimate that with current global configuration we can receive up to 900 users simultaneously, per front-end server (each session separated only by millisecs) without error.

Front-end servers had to be expanded from 8GB RAM to 16GB, and still have room with this configuration to receive even more users. I think we can increase the parameters to support 1000 simultaneous connections before incrementing again the RAM capacity.

Thanks a lot for your help, information and recommendations. If you need detailed info of the process, configuration files or samples of our config files, I can send you a copy or write a document about it.

At this moment, real servers are working only with MySQL tune, and everything is very, very fine. :D

Best regards,

E. Lopez


[English is not my mother language, so forgive my mistakes  smile ]

Average of ratings: Useful (1)
In reply to Proyectos Moodle

Re: Mysql connection/service failure --->> Problem Solved !!!

by Ken Task -
Picture of Particularly helpful Moodlers

Your English is just fine!

And CONGRATS! 

That is an accomplishment!

Don't be surprised if others request seeing those files.

And thanks for sharing back.

'spirit of sharing', Ken

In reply to Proyectos Moodle

Re: Mysql connection/service failure

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
Hi

It is important to be 100% that the problems arises only when there are large number of users connecting to the server, otherwise, at low loads, it functions perfectly. If so, obviously your cluster can not handle the number of visitors you get.

Debugging performance issues is not easy, in a cluster specially so. Whatever, there is a dedicated forum for performance issues, the Hardware and performance forum: https://moodle.org/mod/forum/view.php?id=596. You can see its full set of initial pointers, when you initiate a discussion: https://moodle.org/mod/forum/post.php?forum=94 (and then vanish).
In reply to Visvanath Ratnaweera

Re: Mysql connection/service failure

by Proyectos Moodle -
Hello Visvanath, thanks a lot for your help.
As you said, we figure out that our problem is related to the number of connections we can accept and the buffers to support them (tables, etc.)

We are going to take a look in that forum in order to complement our ideas and make some configuration changes in MySQL, so we can solve this issue. I will keep you posted of our findings.
Regards,
E. Lopez.