Feedback requested on new Logging specification

Re: Feedback requested on new Logging specification

by Martín Langhoff -
Number of replies: 4
Performance, performance, performance! :-)

Jokes aside, there is one thing that I don't see discussed there, and that I think is quite central: differentiated handling of old logs vs recent logs (vs very old logs?).

We want to write them out fast and low-overhead. Whatever this storage is, it won't be query-able, or will have restrictions.

So at least we should say: our read/query log API is not guaranteed to see the very latest entries. In fact, you probably can't see the last 5 minutes of activity - deal with it.

Anything that wants to see the very latest entries perhaps needs a different approach. Perhaps the logging layer has some pre-cooked, hardcoded bits that are tailored to run fast (keep a list of recently seen usernames, keep a tally of pages loaded in the last N minutes), or modules could register a callback.

Either way, these are hot paths. A bit of, ahem, not entirely thought through code here can throw the handbrake on. Big time.
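
To make the "pre-cooked bits" idea concrete, here is a rough sketch in Python (not Moodle code) of the two aggregates mentioned above: a tally of page loads in the last N seconds and a bounded list of recently seen usernames. The class name `HotPathStats`, its parameters and the injectable `clock` are all made up for illustration, and it keeps state in one process only; a real deployment with many web workers would presumably need a shared cache instead.

```python
import time
from collections import OrderedDict, deque


class HotPathStats:
    """Cheap in-memory aggregates updated on every log write (hypothetical)."""

    def __init__(self, window_seconds=300, max_users=100, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.max_users = max_users
        self.clock = clock            # injectable so the window can be tested
        self._hits = deque()          # timestamps of recent page loads, oldest first
        self._users = OrderedDict()   # username -> last seen, least recent first

    def record(self, username):
        """Called from the hot path: O(1) amortised, no queries, no indexes."""
        now = self.clock()
        self._hits.append(now)
        self._users[username] = now
        self._users.move_to_end(username)
        if len(self._users) > self.max_users:
            self._users.popitem(last=False)   # forget the least recently seen user
        self._expire(now)

    def _expire(self, now):
        # Drop page loads that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self._hits and self._hits[0] <= cutoff:
            self._hits.popleft()

    def pages_in_window(self):
        self._expire(self.clock())
        return len(self._hits)

    def recent_users(self):
        """Usernames, most recently seen first."""
        return list(reversed(self._users))
```

The point of the sketch is only that each write does a constant amount of work and never touches the log storage itself; anything heavier than this in the write path is where the handbrake comes from.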
In reply to Martín Langhoff

Re: Feedback requested on new Logging specification

by Martín Langhoff -
To add an example or two.

From a scalability perspective, mdl_log is a big bottleneck. It makes sense to log somewhere else, somewhere where we don't have to contend for a lock, with no need to maintain indexes, etc.

Options include logging to a file, logging to in-memory tables (all our RDBMSs have some support), splitting the logging into several files or tables (to reduce contention), etc.

All of these options are supplemented by a cronjob or daemon that feeds the data to a database table (where it gets all the benefits of indexes, etc) in a way that is more DB-friendly.

The data in that short-term pool isn't easily query-able. If we put demands on it being readable, then we paint ourselves into a corner...
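
A minimal sketch of the "log to a file, let cron feed the database" option, in Python with SQLite standing in for the real database. Everything named here is an assumption for illustration: the functions `append_event` and `drain_spool`, the `log_archive` table, the JSON-lines spool format and the event fields `time`, `userid` and `action`.

```python
import json
import os
import sqlite3


def append_event(spool_path, event):
    """Hot path: a single append, no indexes, no queries."""
    with open(spool_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def drain_spool(spool_path, conn):
    """Cron side: move the spool aside, then bulk-insert it in one transaction."""
    batch_path = spool_path + ".batch"
    if not os.path.exists(batch_path):
        try:
            # After the rename, writers create a fresh spool file on next append.
            os.replace(spool_path, batch_path)
        except FileNotFoundError:
            return 0  # nothing has been logged since the last run
    with open(batch_path, encoding="utf-8") as fh:
        rows = [(e["time"], e["userid"], e["action"]) for e in map(json.loads, fh)]
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS log_archive "
            "(time INTEGER, userid INTEGER, action TEXT)"
        )
        conn.executemany("INSERT INTO log_archive VALUES (?, ?, ?)", rows)
    os.remove(batch_path)
    return len(rows)
```

Note what the sketch does not do: nothing ever queries the spool file, which is exactly the restriction argued for above. It also glosses over the hard parts (a writer racing with the rename, a crash between the commit and the file removal), which a real implementation would have to handle.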
In reply to Martín Langhoff

Re: Feedback requested on new Logging specification

by Petr Skoda -
Hello!

I agree with what you say. The API we are proposing should be suitable for any mechanism of log storage because the reading and writing can be fully independent. Nothing with the exception of reports should be reading the data from log storages; that should imho help when dealing with any delay between writing and reading.
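
One way to picture "reading and writing fully independent" is as two separate interfaces, with the reader stating up front how stale it may be. The following Python sketch is not the proposed Moodle API; `LogWriter`, `LogReader`, `max_lag_seconds` and the toy `DelayedMemoryStore` are hypothetical names used only to illustrate the idea.

```python
from abc import ABC, abstractmethod


class LogWriter(ABC):
    """Write side: append-only, never asked to answer queries."""

    @abstractmethod
    def write(self, event: dict) -> None: ...


class LogReader(ABC):
    """Read side: used by reports; may lag behind the writer."""

    max_lag_seconds: int = 0  # entries newer than this may not be visible yet

    @abstractmethod
    def read(self, since: int, until: int) -> list: ...


class DelayedMemoryStore(LogWriter, LogReader):
    """Toy store that hides entries younger than max_lag_seconds from readers."""

    def __init__(self, max_lag_seconds, clock):
        self.max_lag_seconds = max_lag_seconds
        self._clock = clock      # callable returning the current time in seconds
        self._events = []

    def write(self, event):
        self._events.append(event)

    def read(self, since, until):
        # Clamp the upper bound so readers never see the most recent window.
        visible_until = min(until, self._clock() - self.max_lag_seconds)
        return [e for e in self._events if since <= e["time"] <= visible_until]
```

A report written against `LogReader` then works the same whether the storage is a database table, a file that cron drains, or something else; it just has to accept the advertised lag.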
In reply to Petr Skoda

Re: Feedback requested on new Logging specification

by Martín Langhoff -

Nothing with the exception of reports should be reading the data from log storages

Well, that's the easy case. But we have two cases I know off the top of my head that make this a bit more interesting.

  • "Live logs", which is only useful if it can read the recent logs, so it will need some form of API, or get axed. And I do think it is useful.
  • Recent activity block. Also useful, can perhaps cope with a short delay.
In reply to Martín Langhoff

Re: Feedback requested on new Logging specification

by john saylor -

hi

i echo the concern with performance.

also, maybe i just missed it, but i'm wondering about how the existing logging system will co-exist with the new one. i did see the note about doubling the hits on the db if both are enabled, but nothing about how the code and admin interfaces will be set up. and maybe that will happen a bit later ...