Suggestion: Alternative logs for fast, efficient reports

by lior gil -
Number of replies: 5

Hi everybody,

I hope I'm in the right forum. I have an idea for another log system and would appreciate your opinion.

We need periodic reports that give basic statistics for the activities in a course, showing the number of views, updates, etc.

As of now, the usual approach is to run through mdl_log and select all records that match the chosen course and time period. The problem is that our site contains hundreds of courses and thousands of users, so the result is a huge amount of data that is still very heavy even after regular backup and cleaning.

One of the ideas is to create a new log table that will contain much less data and will be used for basic information.

Now for some technical stuff...

The table structure is very simple:
id, time, course, module, view_counter, add_counter, edit_counter, delete_counter

The first four values are similar to those in mdl_log and the rest are simple counters.
The idea is that while the usual add_to_log function collects user, time, ip and so on, this table will hold the activity counters for every module.

Another difference is that the time field will not contain the current time but the current month. This way one record will hold the information for a single course module for a whole month (or a whole week, depending on the need).
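
To make the month idea concrete, here is a rough sketch in Python with SQLite (not Moodle code). The column list is the one above; the table name log_counters, the column types, the unique key and the helper month_bucket are only my assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical table name and column types; only the column list comes from the post.
SCHEMA = """
CREATE TABLE IF NOT EXISTS log_counters (
    id             INTEGER PRIMARY KEY,
    time           INTEGER NOT NULL,  -- start of the month, not the event time
    course         INTEGER NOT NULL,
    module         TEXT    NOT NULL,
    view_counter   INTEGER NOT NULL DEFAULT 0,
    add_counter    INTEGER NOT NULL DEFAULT 0,
    edit_counter   INTEGER NOT NULL DEFAULT 0,
    delete_counter INTEGER NOT NULL DEFAULT 0,
    UNIQUE (time, course, module)     -- one row per course module per month
);
"""


def month_bucket(timestamp: int) -> int:
    """Round a Unix timestamp down to 00:00 UTC on the first day of its month."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    start = moment.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return int(start.timestamp())
```

Every event in the same month maps to the same time value, so all of them land in one row. A weekly version would only need a different rounding function.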

The advantage is a very fast, very target specific statistical tool that will show the different activities in one or several courses over a period of time, pointing out the less used instances and so on.

However, there are some serious disadvantages.

  • The developer will need to go through every place in the modules that calls add_to_log and add a call to the new log next to it [for example: log_action(courseid, moduletype, actiontype)]
  • While add_to_log just adds a new record, here we need to look up a record by specific time, course and module values in order to update it. On one hand, this table will be relatively small to work with, but on the other hand it will be called constantly from every page. This is the major problem here.
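
The lookup-then-update step in the second point could look roughly like this. It is a sketch in Python with SQLite, not Moodle code: the extra conn and bucket parameters, the table name log_counters and the action names in COUNTER_COLUMNS are assumptions I added so the example runs on its own.

```python
import sqlite3

# Hypothetical mapping from an action type to its counter column.
COUNTER_COLUMNS = {
    "view": "view_counter",
    "add": "add_counter",
    "edit": "edit_counter",
    "delete": "delete_counter",
}


def create_counter_table(conn):
    """Create the (hypothetical) counter table with one row per month/course/module."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS log_counters (
               id INTEGER PRIMARY KEY,
               time INTEGER NOT NULL,
               course INTEGER NOT NULL,
               module TEXT NOT NULL,
               view_counter INTEGER NOT NULL DEFAULT 0,
               add_counter INTEGER NOT NULL DEFAULT 0,
               edit_counter INTEGER NOT NULL DEFAULT 0,
               delete_counter INTEGER NOT NULL DEFAULT 0,
               UNIQUE (time, course, module))"""
    )


def log_action(conn, bucket, courseid, moduletype, actiontype):
    """Increment one counter, creating the month's row on first use."""
    # The column name comes from a fixed whitelist, never from user input.
    column = COUNTER_COLUMNS[actiontype]
    key = (bucket, courseid, moduletype)
    cur = conn.execute(
        f"UPDATE log_counters SET {column} = {column} + 1 "
        "WHERE time = ? AND course = ? AND module = ?",
        key,
    )
    if cur.rowcount == 0:
        # No row yet for this month/course/module: insert it with the counter at 1.
        conn.execute(
            f"INSERT INTO log_counters (time, course, module, {column}) "
            "VALUES (?, ?, ?, 1)",
            key,
        )
    conn.commit()
```

With the unique key on (time, course, module) the lookup is an index hit, so the cost per page is one small UPDATE in the common case. Two requests arriving at the same moment for a brand new month could both try the INSERT, and a real implementation would have to handle that.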

I looked for similar issues and didn't find anything close enough so I don't know if a similar idea has been discussed here before. If so, I'll be happy if someone will direct me there.

In reply to lior gil

Re: Suggestion: Alternative logs for fast, efficient reports

by Rex Lorenzo -

I am not sure this will make logging any quicker, because you will still have a DB action for every view/edit/delete.

You are just losing some reporting information. For example, you can no longer query for how many times a given resource was reviewed/downloaded.

Also, you lose tracking of what content students have viewed or not viewed.

In reply to Rex Lorenzo

Re: Suggestion: Alternative logs for fast, efficient reports

by lior gil -

The intention is not to make logging quicker, but to make the reporting quicker. I'm not losing any information because this option is not intended to replace the usual log, but to work alongside it. Very specific queries will still use the usual log table.

Plus, something I didn't think about yesterday: maybe using an internal cache to store IDs will make updating this new table faster.
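
What I mean by the cache, as a rough Python/SQLite sketch (not Moodle code). The class CachedCounterLog, the table name log_counters and the method names are made up for the example, and it only handles the view counter to keep it short.

```python
import sqlite3


class CachedCounterLog:
    """Remembers the row id for each (time, course, module) key after the first lookup."""

    def __init__(self, conn):
        self.conn = conn
        self.row_ids = {}  # (bucket, courseid, moduletype) -> id in log_counters
        conn.execute(
            """CREATE TABLE IF NOT EXISTS log_counters (
                   id INTEGER PRIMARY KEY,
                   time INTEGER NOT NULL,
                   course INTEGER NOT NULL,
                   module TEXT NOT NULL,
                   view_counter INTEGER NOT NULL DEFAULT 0,
                   UNIQUE (time, course, module))"""
        )

    def _row_id(self, key):
        """Return the cached row id, looking it up or creating the row only once."""
        if key not in self.row_ids:
            row = self.conn.execute(
                "SELECT id FROM log_counters "
                "WHERE time = ? AND course = ? AND module = ?",
                key,
            ).fetchone()
            if row is None:
                cur = self.conn.execute(
                    "INSERT INTO log_counters (time, course, module) VALUES (?, ?, ?)",
                    key,
                )
                self.row_ids[key] = cur.lastrowid
            else:
                self.row_ids[key] = row[0]
        return self.row_ids[key]

    def log_view(self, bucket, courseid, moduletype):
        """Increment the view counter by primary key, skipping the three-column search."""
        row_id = self._row_id((bucket, courseid, moduletype))
        self.conn.execute(
            "UPDATE log_counters SET view_counter = view_counter + 1 WHERE id = ?",
            (row_id,),
        )
        self.conn.commit()
```

The dictionary here lives in one process. On a real site the cache would probably have to sit somewhere shared between requests, otherwise every page load starts with an empty cache and nothing is saved.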

In reply to lior gil

Re: Suggestion: Alternative logs for fast, efficient reports

by Chris Fryer -

Rather than creating a separate log table and recording actions in that, you could use a similar approach to the (already existing) Moodle "statistics" package in core.  That uses cron to process mdl_log, aggregating the data in that table into daily, weekly, etc. tables.  Offloading the report generation to cron means you won't slow students/teachers down while they're browsing, especially if you have a dedicated cron node.
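
A minimal sketch of that kind of scheduled aggregation, in Python with SQLite rather than the actual statistics code. The mdl_log table here is cut down to four columns, the target table stats_monthly and the function aggregate_monthly are hypothetical, and matching actions with LIKE 'view%' and so on is my assumption about how action names would be grouped.

```python
import sqlite3


def create_tables(conn):
    """Create a cut-down mdl_log and a hypothetical monthly summary table."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS mdl_log (
            id INTEGER PRIMARY KEY,
            time INTEGER NOT NULL,   -- Unix timestamp of the event
            course INTEGER NOT NULL,
            module TEXT NOT NULL,
            action TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS stats_monthly (
            month TEXT NOT NULL,     -- 'YYYY-MM'
            course INTEGER NOT NULL,
            module TEXT NOT NULL,
            views INTEGER NOT NULL,
            adds INTEGER NOT NULL,
            edits INTEGER NOT NULL,
            deletes INTEGER NOT NULL,
            PRIMARY KEY (month, course, module)
        );
        """
    )


def aggregate_monthly(conn, since, until):
    """Summarise log rows with since <= time < until into stats_monthly."""
    # INSERT OR REPLACE makes a re-run over the same window overwrite its rows
    # instead of double counting. The window should cover whole months.
    conn.execute(
        """
        INSERT OR REPLACE INTO stats_monthly
        SELECT strftime('%Y-%m', time, 'unixepoch') AS month,
               course,
               module,
               SUM(action LIKE 'view%'),
               SUM(action LIKE 'add%'),
               SUM(action LIKE 'update%'),
               SUM(action LIKE 'delete%')
        FROM mdl_log
        WHERE time >= ? AND time < ?
        GROUP BY month, course, module
        """,
        (since, until),
    )
    conn.commit()
```

Run from cron once a day or once a month, this does the heavy read of mdl_log in one pass outside of any page request, and the reports then read only the small summary table.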

In reply to Chris Fryer

Re: Suggestion: Alternative logs for fast, efficient reports

by lior gil -

The statistics were found to slow down the whole site and have been disabled. Maybe they are useful on smaller sites, but when dealing with large quantities of data they are ineffective. I know of other institutions that disabled this option for the same reason.

In reply to lior gil

Re: Suggestion: Alternative logs for fast, efficient reports

by Chris Fryer -

Yes, and we have disabled statistics at our site for the same reason.  But I think the approach, i.e. to use the existing data in mdl_log, is not entirely stupid.  If you use a scheduled task to crunch your data on a nearline copy of the db (e.g. a slave), then copy the results back to the master database for presentation to the user, it will have virtually no impact on the user experience.  Having said that, adding one more DB transaction to a page view probably won't affect it too much either.