Here are a few from the Linux front http://www.cyberciti.biz/tips/top-linux-monitoring-tools.html.
My question is what tools do you use for this purpose? Could you comment on their usefulness, etc.?
A typical output looks like http://demo.munin-monitoring.org/munin-monitoring.org/buildd.munin-monitoring.org/index.html.
Installation instructions for Debian: http://www.howtoforge.com/server_monitoring_monit_munin.
We've been using Munin for some time but I'm sometimes a little skeptical of its value.
Whilst it can generate monitoring alerts and does draw some nice pretty graphs, I've mostly found it useful /after/ the event to see what the lead-up to an event was like, or view historical changes to determine whether something is 'the norm'.
That said, Version 2 (well, 1.99) is a massive improvement over 1.4 and I'd highly recommend using that if you're planning to use Munin. It's backwards compatible with 1.4 clients and offers massive performance benefits over 1.4 too.
It would be interesting to see some Munin Moodle plugins too for things like:
* number of users in the last five minutes
* database size
* moodle disk usage
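For what it's worth, a plugin for the last item could look roughly like the sketch below. It follows the usual Munin plugin convention (print the graph definition when called with `config`, otherwise print `field.value N`). The `/var/moodledata` default, the `DATAROOT` environment variable and the `moodle` graph category are my own assumptions, not taken from any existing plugin.

```python
#!/usr/bin/env python3
"""Sketch of a Munin plugin that graphs Moodle dataroot disk usage."""
import os
import sys

# Assumption: hypothetical default path. Override it with a DATAROOT
# environment variable (Munin can set one per plugin via "env.DATAROOT ...").
DATAROOT = os.environ.get("DATAROOT", "/var/moodledata")


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def config_lines():
    """Graph definition that Munin asks for with the 'config' argument."""
    return [
        "graph_title Moodle dataroot usage",
        "graph_vlabel bytes",
        "graph_category moodle",
        "graph_args --base 1024 -l 0",
        "dataroot.label dataroot",
        "dataroot.draw AREA",
    ]


def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        print("\n".join(config_lines()))
    else:
        print("dataroot.value %d" % dir_size_bytes(DATAROOT))


if __name__ == "__main__":
    main(sys.argv)
```

Walking a large dataroot on every poll can be slow, so on a big site it may be better to cache the figure or read it from `du` output produced by a nightly job.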
In my experience, Munin's numbers are pretty OK. What made you skeptical?
I've been using Pandora FMS to monitor Moodle using a software agent, a web check and the network monitoring options that Pandora offers. If you'd like more information, visit the following webpage: http://pandorafms.com/Home/Home/en
I wouldn't regard most of the tools in the link you've provided as 'Monitoring' tools, but rather as diagnostic utilities. I wouldn't expect any of those tools to alert me to any errors or issues for example.
To answer your question on what we frequently use for monitoring:
* munin as discussed
* nagios with nrpe (snmp sucks)
* lots of custom plugins for both of the above
Using the database long query log is particularly helpful at times and can help determine where an issue lies. We also monitor various additional database metrics.
Also, I'd suggest that one should always monitor system usage, not just during peak load. This helps to determine a minimum baseline and monitor other trends.
Both munin and nagios can graph most of the data that the tools you've suggested obtain.
In terms of looking for bottlenecks, tools such as the following are also pretty useful:
* ab - to help replicate a problem more than anything
* jmeter - mostly for the same reason
* `tail -f` and `less -S` -- the latter is particularly useful for viewing wide apache error logs
We are using Nagios to monitor everything on our Linux box, which we feel is sufficient; we have never faced any problems.
I played with Nagios some time ago and thought it would be overkill for a single server. Could you please add more details on how you use it to monitor Moodle? @Andrew, what are the plug-ins you mentioned? In-house productions?
We have some custom NRPE plugins which we've written to check certain aspects of Moodle.
Nagios is the best way to monitor Moodle. You can get the details from http://xeois.com/blog/monitoring-moodle-server-with-nagios. I found it interesting, so I wrote the blog post. Please let me know if you want more information.
Thanks and regards
We use nagios to monitor:
* the underlying host for all of our physical and virtual servers
* core infrastructure (powerbars, console servers)
* daemons (apache2, postgres, exim, bind, haproxy, keepalived, squid, etc)
* applications (moodle, mahara, etc)
Our networking infrastructure is monitored separately by our networking team (we're an ISP too) using openNMS.
For apache and postgres, we ensure things like:
* we're listening on all relevant ports
* apache retrieval times are good
* SSL certs aren't about to expire
* long running queries
* free connections on the db
* quite a lot more...
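As an illustration of the SSL expiry item, here is a minimal Python sketch. It is not one of the plugins described above; the 30-day and 7-day thresholds and the function names are assumptions chosen for the example, and only standard-library calls are used.

```python
import socket
import ssl
import time


def cert_not_after(host, port=443, timeout=10):
    """Fetch the certificate's 'notAfter' string from a live TLS server."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until(not_after, now=None):
    """Days from 'now' until a notAfter string such as
    'Jan  5 09:34:43 2018 GMT' is reached (negative if already expired)."""
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return (expires - now) / 86400.0


def classify(days_left, warn_days=30, crit_days=7):
    """Map remaining days to a Nagios-style (exit code, label) pair.
    The thresholds are assumptions; pick whatever suits your renewals."""
    if days_left < crit_days:
        return 2, "CRITICAL"
    if days_left < warn_days:
        return 1, "WARNING"
    return 0, "OK"
```

Something like `classify(days_until(cert_not_after("moodle.example.org")))` would then give the status for one host (the hostname is a placeholder). Note that a certificate that has already expired makes the TLS handshake itself fail, so a real check should catch `ssl.SSLError` and report that as critical too.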
We operate a moodle hosting service and use nagios to monitor all application instances for all of our customers (part of our standard service). For moodle, at the application level we monitor things like:
* code version (ensure that we're on the latest version)
* db version (need to run the moodle upgrade scripts)
* cron (that it's running)
* maintenance mode (that we're not in it)
* dataroot (that it's accessible, writeable, etc)
We monitor a few other things too. Some of these may seem overkill but we think it's worth making especially sure that our customers' Moodles are in tip-top condition. All of the above are monitored using nagios nrpe plugins with a variety of alerts (jabber, SMS, e-mail, ticket).
What do you monitor with Nagios? What are the most important things you think you need to graph?
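To give an idea of what such a check can look like, here is a rough Python sketch of the dataroot test only. It is not the actual plugin used by anyone in this thread; the function names and the probe-file prefix are made up for the example. The only fixed part is the standard Nagios plugin convention: one status line on stdout and an exit code of 0, 1, 2 or 3.

```python
import os
import tempfile

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_dataroot(path):
    """Return (exit code, message) for a Moodle dataroot directory."""
    if not os.path.isdir(path):
        return CRITICAL, "dataroot %s is missing or not a directory" % path
    try:
        # Create and automatically remove a scratch file to prove that
        # the directory is writable by the user running the check.
        with tempfile.NamedTemporaryFile(dir=path, prefix=".nagios-probe-"):
            pass
    except OSError as exc:
        return CRITICAL, "dataroot %s is not writable: %s" % (path, exc)
    return OK, "dataroot %s is accessible and writable" % path


def run(argv):
    """Print the one-line status Nagios expects and return the exit code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_moodle_dataroot <dataroot path>")
        return UNKNOWN
    code, message = check_dataroot(argv[1])
    print("%s - %s" % (LABELS[code], message))
    return code
```

Saved as a script that ends with `sys.exit(run(sys.argv))` (plus `import sys`), the exit code is what NRPE passes back to Nagios. Bear in mind that the check runs as the NRPE user, which is usually not the web server user, so "writable" here only tells you about that account unless you run the check through sudo as the web server user.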
I use ganglia - its UI looks similar to munin, but ganglia is a real-time tool like Nagios.
What are some top areas you should be monitoring with cacti for a moodle server?
Are there any templates anywhere? Or guidelines?
We are using cacti to graph snmp data, and smokeping to test network reachability.
Admittedly, SNMP is quite limited (CPU usage, Load Average, Memory Usage, and traffic).
Other things we check periodically are top / htop.
PHP performance is improved by APC (we are switching to OPcache very soon).
We also enable the perfdebug setting in Site administration > Development > Debugging,
and browse the list of users sorted by Last access.
Most Moodle pages usually load in less than a second.
I should add that this is a Virtual Machine on VMware.
Umm, with everyone talking about Munin, Nagios, and others, I thought I would mention a somewhat different approach to monitoring. Certainly one not talked about here.
New Relic - Screens of my demo server provided here http://nginx.ddns.net/moodle-server-monitoring/
Nice to meet you!
As far as my contribution to this thread is concerned, there was nothing original. Therefore it would be enough if you just say in your blog, "... directly related to discussions in moodle.org. Specifically, 'How do you monitor your Moodle server?' http://moodle.org/mod/forum/discuss.php?d=192162 in the 'Hardware and performance' forum http://moodle.org/mod/forum/view.php?id=596." Or something similar, without mentioning me directly.
Hello all, I am joining late. This is a really nice and helpful article, and very well written. People should have a look at http://www.woodstone.nu/salive, which is also a very handy network monitoring tool. Hope you like it.
How do you monitor syslog effectively, freely and easily, across the internet or locally?
A couple of good log aggregators for syslog are:
- splunk (www.splunk.com)
- loggly (www.loggly.com)
A nice monitoring system with built-in graphing and alerting: www.zenoss.com
NewRelic and AppNeta are fairly decent at giving you insight into the web & db traces as well as seeing what the user experience is like.