auth/db - need testers for scalability enhancements

by Martín Langhoff -
Number of replies: 24

I have some pretty serious enhancements to auth/db that I am working on. If you are using 1.7/1.8, have a large number of users, and use the sync script, you might want to try this ;-)

The changes to auth/db are on a modified 1.8 - if they turn out to be good ;-) I'll merge them into 1.9. The branch is here: http://git.catalyst.net.nz/gitweb?p=moodle-r2.git;a=shortlog;h=mdl18-local

You can get the latest version by clicking on the "snapshot" link on that page.

With this code

  • the sync should be much faster overall (time it before and after!)
  • the sync script execution should take a lot less memory
  • it used to fail if you had a large number of users - it doesn't anymore ;-)
  • if your remote db has a "last time this record changed" field that is a unix epoch, or you can synthesize one using a VIEW, set it in the config page, and watch updates get a hell of a lot faster ;-)
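
To make the last bullet concrete, here is a minimal Python sketch of the idea (not the actual auth/db code, which is PHP). The table name `ext_users`, its columns, and the function name are assumptions made up for illustration; the only thing taken from the post is that the external table exposes a unix-epoch "last changed" field, so the sync can skip rows that have not changed since the previous run.

```python
import sqlite3


def fetch_changed_users(ext_conn, last_sync_epoch):
    """Return (username, email) rows modified after the previous sync."""
    cur = ext_conn.execute(
        "SELECT username, email FROM ext_users "
        "WHERE timemodified > ? ORDER BY username",
        (last_sync_epoch,),
    )
    return cur.fetchall()


# Stand-in for the external DB (hypothetical schema, in-memory SQLite).
ext = sqlite3.connect(":memory:")
ext.execute(
    "CREATE TABLE ext_users ("
    "username TEXT PRIMARY KEY, email TEXT, timemodified INTEGER)"
)
ext.executemany(
    "INSERT INTO ext_users VALUES (?, ?, ?)",
    [
        ("alice", "alice@example.com", 1000),
        ("bob", "bob@example.com", 2000),
        ("carol", "carol@example.com", 3000),
    ],
)

# Only rows touched since the last run (epoch 1500) need to be compared.
changed = fetch_changed_users(ext, 1500)
print(changed)
```

With 43K accounts and a handful of changes per day, the update pass would then only look at that handful instead of every row.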

I am working with 43K user accounts. Both DBs are PostgreSQL 8.1 and both are "local", meaning that I access them via a socket, not ethernet. So I get very low latency.

Initial creation has gone from "does not even finish, dies after 20 minutes" to ~60m. Updates without the timemodified optimisation also take about 1hr. If I use the timemodified optimisation, updates take a couple of minutes.

Unfortunately, I don't have a good "before" example where it works. I only have the "after".

NOTE: if you are testing this code, don't file bugs in the tracker. Just post here - this is experimental code. 8-o

Edit: After the (still slow) initial run, normal daily updates (remove/add/delete a handful of accts) take anywhere between 30 and 60 seconds. I like this ;-)

In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -

OTOH, the branch may conflict with some stuff I think Petr has been working on. Specifically

  • Seiti's patch re MDL-9212 - I don't think we should ever disconnect explicitly. OTOH, I don't think we should use pconnections for auth/db so this should not be a problem. Connection is closed at the end of the HTTP request or cli script execution. Seiti's problem was with running the sync() - and with the proposed code there won't be a connection pileup.

  • The "suspend" (re-set to auth='nologin') feature. I think it was buggy as those suspended users would never be revived, while deleted=1 users would. I think my patches fix this, but maybe I'm just misunderstanding how it's meant to work.

And as of 1.8, I think we should default to saying SET NAMES 'utf-8' -- I think all the engines supported by PHP can deal with it. In any case, it should probably be the default. Even if the DB or table are not in utf-8, what that means is that we tell the ext DB that our SQL conversation with it is in utf-8. The ext DB will then translate if necessary.

In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Petr Skoda -
Yes - the suspended accounts are not revived; suspending just blocks the login for now and prevents creation of a new user account. If you delete the account, another auth plugin could take over the old username.

You can also use suspend to block login while still keeping the account with password in the external db/ldap.

I do not think that the revive should be automatic; I would vote for a configuration option. Another problem is that the suspended account does not know the original auth plugin. There should be only one plugin with suspend reviving active.
In reply to Petr Skoda

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -

But this all leads us to a situation that is not very useful: when the ext auth provider (db/ldap/etc) disables an account and then re-enables it, Moodle gets stuck and needs manual intervention.

Moodle should just follow the ext auth system. So if we disable the acct when the ext auth system tells us to disable, then we must enable when the ext auth provider tells us to enable it.

> Another problem is that the suspended account does not know the original auth plugin.

I agree this is a problem - that's why I think that automatically switching to "nologin" is a bad idea. I have a proposal instead: let's add an option to the user browsing under /admin to see deleted users.

So admins can undelete the user (and reassign to 'manual' for example).

To clarify - I maintain several installs where Moodle is driven from external systems with auth/db or auth/ldap, and the purpose of these plugins is automation. So accounts in moodle should never get stuck like they do now.

With my patches auth/db will revive accounts matching the username. I think I will disable that - but we should also remove the option to set the acct to 'nologin'. Or add an entry to the user_preferences table to track the "original" plugin. But at the end of the day, I believe we want to delete/undelete the acct.

In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Petr Skoda -
The current nologin implementation with no reviving has IMO its own uses - imagine you have an external database and you want to disable an external account without modifying the ext db, but you still want to "loginas" the user. With automatic reviving you could not do that.

My +1 to keep current suspend implementation as is and make new option that does what you want. My -1 for removing the 'nologin' option, instead it should be IMHO better documented.

Browsing of deleted accounts is IMO a good idea, isn't there a SoC project that works on related user management improvements?
In reply to Petr Skoda

Re: auth/db - need testers for scalability enhancements

by Luke Hudson -
I've submitted a patch which provides a preliminary UI for showing and undeleting deleted users on the Browse User Accounts page. The patch works, but the UI is still a bit ugly. Also, minor changes were required to moodlelib and datalib, which could have further-reaching effects.

So, I've created MDL-13377 and attached my patches (against 19_STABLE and CVSHEAD).

I'd be glad to receive comments or improvements, and hope we can get this into 2.0.

Cheers,
-- Luke
In reply to Petr Skoda

Re: auth/db - need testers for scalability enhancements

by Iñaki Arenaza -

> Another problem is that the suspended account does not know the original auth plugin. There should be only one plugin with suspend reviving active.

I don't know why you have to delete the original auth plugin info when you suspend/delete an account. If you don't want to 'float' usernames between authentication plugins (which is what I was told, when discussing the multi-instance authentication design), then just leave the original auth plugin info there.

Either the original auth plugin revives the user, or no-one will (I mean, automatically revive it; giving the admin the option to see deleted/suspended users and revive them manually with any auth plugin s/he wants is a good idea).

Saludos. Iñaki.

In reply to Iñaki Arenaza

Re: auth/db - need testers for scalability enhancements

by Petr Skoda -
The original auth field value is replaced with 'nologin' - in fact it is a switch of auth plugins. The 'nologin' plugin prevents normal logins and blocks the sending of mails, but it keeps all posts, assignments, grading, original password, etc., and the user is still visible in role assignments and the participants list.

It might be much better to add a new 'suspended' field to the user table and tweak the code to respect this flag - no login + no mails.

My proposal:
==for 1.8.1==
* keep nologin related code as is, change name of option to "Change auth to Nologin" (using new lang string) and fix docs if needed explaining that it is not revived automatically

==for 1.9==
* modify user table - add "suspended" field
* tweak the core code to respect this new flag
* add new option to auth sync scripts - suspend/revive (reusing the original lang string)
* add undelete and suspend GUI - as part of SoC project


In reply to Petr Skoda

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -

Petr - can I propose a small change? For 1.8.1, when switching to 'nologin', store "auth_plugin" in user_preferences.

So we can revive the user. Without "revival" (cue 80's music ;-) ) this will cause lots of problems for end users.
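
A rough sketch of what this proposal amounts to, in Python rather than Moodle's PHP. The dict-based `user` and `prefs` structures and the function names are hypothetical stand-ins; only the idea of remembering the original plugin under an "auth_plugin" preference when switching to 'nologin' comes from the post.

```python
PREF_KEY = "auth_plugin"  # name from the post; the final key was still open


def suspend(user, prefs):
    """Switch the user to 'nologin', remembering the original auth plugin."""
    if user["auth"] == "nologin":
        return  # already suspended: do not overwrite the stored plugin
    prefs[(user["id"], PREF_KEY)] = user["auth"]
    user["auth"] = "nologin"


def revive(user, prefs, calling_plugin):
    """Restore the original plugin, but only if calling_plugin suspended it."""
    original = prefs.get((user["id"], PREF_KEY))
    if user["auth"] != "nologin" or original != calling_plugin:
        return False
    user["auth"] = original
    del prefs[(user["id"], PREF_KEY)]
    return True


user = {"id": 1, "username": "alice", "auth": "db"}
prefs = {}

suspend(user, prefs)
print(user["auth"])                 # nologin
print(revive(user, prefs, "ldap"))  # False: ldap did not suspend this user
print(revive(user, prefs, "db"))    # True
print(user["auth"])                 # db
```

Checking the stored plugin in `revive` is one way to address the concern that only the plugin that suspended an account should be allowed to bring it back.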

In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Petr Skoda -
yay, the user preference is a good place to store the original plugin, I did not think about that!

my +1 for that in 1.8.1; we could call it suspended_from_sync (or something like that) so that we know also why/how it was suspended

I would like to commit your changes tomorrow, we could discuss the details when I wake up ;-)
In reply to Petr Skoda

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -
Ok - I'm keen on merging it, but my current code depends on ddllib stuff that is going into HEAD, not on 18_STABLE just yet. Maybe if Eloy agrees...
In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Petr Skoda -
thanks, I will discuss it with Eloy via Skype ASAP ;-)

If we store the original auth in a user pref, it is IMO ok to do only automatic reviving; the only problem might be when somebody recreates an account with the same username, but for a different person, in the external db. Then the suspended local one would have to be deleted manually or renamed.
In reply to Petr Skoda

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -
Ok - I have a patch that implements reviving as we discussed during sync(). This still depends on temp tables. http://git.catalyst.net.nz/gitweb?p=moodle-r2.git;a=shortlog;h=mdl18-local

However, I found a serious problem with the nologin approach - something we hadn't seen before. If we switch the plugin to 'nologin', then auth/db will not get called if the user tries to login. Depending purely on sync() was never a good idea. The sync() is there to "cover the gap" in having up-to-date info in Moodle, and to show those users who have never logged in.

The list of problems with switching users to nologin is growing. Perhaps we should get rid of it and focus on better handling of deleted accts.
In reply to Petr Skoda

Re: auth/db - need testers for scalability enhancements

by Iñaki Arenaza -
I'm really confused. What's the difference between a deleted user and a suspended user? I'm having trouble seeing why we need a suspended field at all.

If a suspended user is one that can't login, then that's what the nologin auth plugin was designed for, wasn't it?

And if suspension/unsuspension is a manual thing (and thus performed by the administrator, who knows what he's doing), why do we need to add this additional field and manage one more special case in the auth plugins?

Saludos. Iñaki.
In reply to Iñaki Arenaza

Re: auth/db - need testers for scalability enhancements

by Iñaki Arenaza -
Btw, thinking a little more about it (35 Km on a mountain road to get back home gives you a lot of time to think :-) ), I'd say this 'suspended' account thing is better served using a 'suspended accounts' auth plugin (you may name it using a more catchy name, though)

Now that we have a multi-auth plugin infrastructure in place, just add a new auth plugin that lets you specify which accounts are not allowed to log in. (similar to nologin, but being configurable and selective).

No need to change the user's authentication plugin: just make the new plugin reject logins for the configured users (return an AUTH_DENIED value).

The plugin configuration page just shows you the same screen you get when assigning a role: a split screen with the suspended/rejected user list on the left and the non-suspended user list on the right. Then you just store the username/id/whatever-you-want rejected user list as a plugin configuration value (a TEXT field should allow for quite a few suspended users ;-) )

Reusing existing code this should be dead easy and there is no need to add additional special-case code to the existing auth plugins.

Just remember to place this auth plugin high up on your enabled plugin lists before the 'real' auth plugins, so you can have a chance to reject user logins.
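
Here is a small Python sketch of that idea, under the assumption that login walks every enabled plugin in order and stops at the first decisive answer. `AUTH_DENIED` is the value named above; `AUTH_OK`, `AUTH_PASS`, the class names and `authenticate()` are hypothetical and are not Moodle's API.

```python
AUTH_OK, AUTH_DENIED, AUTH_PASS = "ok", "denied", "pass"


class SuspendedAccounts:
    """Rejects configured usernames; has no opinion about anyone else."""

    def __init__(self, suspended):
        self.suspended = set(suspended)

    def user_login(self, username, password):
        return AUTH_DENIED if username in self.suspended else AUTH_PASS


class DictAuth:
    """Toy 'real' auth plugin backed by a username -> password dict."""

    def __init__(self, passwords):
        self.passwords = passwords

    def user_login(self, username, password):
        if username not in self.passwords:
            return AUTH_PASS
        return AUTH_OK if self.passwords[username] == password else AUTH_DENIED


def authenticate(plugins, username, password):
    """Ask each enabled plugin in order; the first decisive answer wins."""
    for plugin in plugins:
        result = plugin.user_login(username, password)
        if result != AUTH_PASS:
            return result
    return AUTH_DENIED  # nobody vouched for this user


enabled = [SuspendedAccounts(["bob"]), DictAuth({"alice": "a", "bob": "b"})]
print(authenticate(enabled, "alice", "a"))      # ok
print(authenticate(enabled, "bob", "b"))        # denied
print(authenticate(enabled[::-1], "bob", "b"))  # ok: wrong plugin order
```

The last call shows why the ordering matters: if the 'real' plugin is consulted first, the suspension is never seen.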

What do you think about it?

Saludos. Iñaki.

In reply to Iñaki Arenaza

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -

> is better served using a 'suspended accounts' auth plugin

I think it is exactly what Petr implemented, using the 'nologin' plugin. Usually deleting an acct will remove their enrolments, this 'nologin' plugin doesn't. But I have to say, I am not convinced...

Being "deleted" or "suspended" is a special mode and making it "generic" is wrong. For this mode to be useful we need to do things, like...

  • I'd be happy with a patch to allow admins to see and edit deleted accts. This is important, and we are lacking here.
  • Change things so deleting accts does not delete enrolments, and then we can change the participants list so that "deleted" users appear grayed out.
  • Retain the "nologin" plugin, but remove the option from the auth plugins to "delete by setting to 'nologin'". Using a flag (deleted / suspended) means that the plugin setting remains and this is important so that the plugin can bring the acct back to normal. Switching auth plugins breaks this badly. There is no scenario where this is useful - if you want to be able to "disable user reviving" we can add an option for that.
In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Iñaki Arenaza -
> I think it is exactly what Petr implemented, using the 'nologin' plugin.

As I see it, it's different from what the nologin plugin does. First, with this nologin plugin you _must_ assign this auth plugin to the user, so you lose the user's original auth plugin. Which is a pita.

But the current authentication schema (we just loop over the enabled plugins when the user doesn't have an assigned one, but go directly to the user's plugin when it has one set up) limits us severely in this area.

If we followed the PAM schema (which is a good reference, IMHO), then we would loop over all of them, and an auth plugin would be anything that, given a username and password, would say if s/he was allowed to login or not (based on _any_ internal criteria the auth plugin wished).

This would allow for suspended, rejected, or whatever black-list criteria you wanted to be implemented (can you say no logins between 02:00-06:00 local server time?), without having to tweak every other auth plugin when you add a new 'special mode' (or state, or whatever you want to call it). If we extended it for the password change process, we could even implement password policy support in a pluggable way (Anthony Borrow would especially like it ;-) ).

Then you could see what users were suspended, rejected, blacklisted, ... just by having a look at that plugin config settings (because you'd set the list of suspended users in the suspend plugin and so on).

Of course this is a bit different from the way we do things today, but I'd say it is a more flexible and maintainable way of doing things.

Either that or we set a new user setting: the mode (or state), which can be an enumerated value, for example: active, deleted, suspended and so on, and then let any auth/enrol plugin deal with the mode/state in a way it sees fit.

Saludos. Iñaki.

Just my 0.02 euros ;-)
In reply to Iñaki Arenaza

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -
I like PAM too, and when we were planning things with Jonathan Harker a couple of months ago, the PAM scheme was the first we tried. For some reason, we dropped it :-/ -- I can't remember exactly what the roadblock was.

It wouldn't be hard to bring it back. It _would_, however, change the semantics of what modules are expected to do. Most current modules would have to do something like `if (user->auth == myself) { return "whatever" }` ;-)
In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -
Also - do you know if any of the DBs we interop with don't support `SET NAMES 'utf-8'`? I strongly believe we should make it default, and selectable. :-)
In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Petr Skoda -
Set names is for mysql/pg - Oracle and MS do not use that. I am not even sure that the majority of external databases are already using utf-8 :-(
In reply to Petr Skoda

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -

I am fairly certain that Oracle recognises SET NAMES. Not so sure about MSSQL.

In any case, this does not depend on the DB using utf-8, but on it being able to translate whatever encoding it has to utf-8 on-the-wire.
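
To illustrate the distinction between storage encoding and on-the-wire encoding, here is a tiny Python sketch. It only mimics the transcoding step that a DB server would do itself once the client has declared its encoding; the helper name and the choice of latin-1 as the storage encoding are assumptions for the example.

```python
def to_wire_utf8(stored_bytes, storage_encoding):
    """Transcode bytes from the storage encoding to utf-8 for the client."""
    return stored_bytes.decode(storage_encoding).encode("utf-8")


stored = "Martín".encode("latin-1")     # how a latin-1 table holds the name
wire = to_wire_utf8(stored, "latin-1")  # what a utf-8 client should receive

print(stored)                # b'Mart\xedn'
print(wire)                  # b'Mart\xc3\xadn'
print(wire.decode("utf-8"))  # Martín
```

The table stays in latin-1 throughout; only the bytes handed to the client change.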

In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Chris Brainerd -

Hi all,

I'm running Moodle 2.0.3, and I wasn't aware until all of my users were placed in suspend limbo due to an external database reboot that the suspend & revive logic wasn't finalized.

I reported it a few days ago as MDL-29356 - as I wrote on there, I did some searching, and I saw this thread but didn't read into it (didn't expect this discussion to be under 'need testers for scalability enhancements', and also didn't believe yet this was really an unresolved thing; this thread hasn't been touched since 2007).

The description for the 'Removed ext. user' setting in the auth/db configuration says, "Specify what to do with internal user account during mass synchronization when user was removed from external source. Only suspended users are automatically revived if they reappear in ext source." So I believe my expectation for the reviving functionality to work was reasonable.

Overall, I've patched it myself, by changing the deleted=1 criteria to auth=nologin and changing the action from updating to deleted=0 to updating to auth=db. It's OK for our implementation to be blind in this fashion because I don't use any other auth plugins and nologin is only set by db suspend, so I can and do assume in my case that all nologin users were suspended by the db plugin. 

Although it works, I'm voting for a better designed, tested & certified Moodle.org solution. Most important of all obvious reasons is the fact that auth/db is a critical component of our Moodle and now this custom core patch has to be maintained by us throughout Moodle updates. 

I do see in the Moodle Roadmap that for 2.3 due in June 2012, there's "Oauth2 - cross-system authentication, for integration with other systems and to replace MNet.", but I'm not sure if this would include or could replace external database sync. 

Anyway, the best I can do now is give my testimony that this was a major unexpected issue for me -- it effectively took my Moodle offline to all of my customers the other day until I discovered reviving did not work -- and I strongly recommend that the suspend/revive functionality of auth/db be revisited and finalized in the Moodle release, not in a patch, or if not, the language string in the plugin that says it works be re-phrased to "suspended users will need to be revived manually".

Note that my MDL-29356 does duplicate MDL-9281, however, 9281 is marked as resolved for 2.0, which I don't understand as it's broken in my 2.0.3. Also, MDL-13563 is the most current and only still active (I think?) issue related to this problem, however it is specific to LDAP. So I've figured MDL-29356 might have some justification to exist; if not, please let it get your attention.

As for your patch, Mr. Langhoff, which sounds like it fixes this as well as optimizes the routine, I'm interested to know if you still offer it for Moodle 2.0.3? 

In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -
A few more enhancements, and some more tests. Initial account creation times are the same as vanilla v1.8. In fact, for a small number of accounts we are a bit slower than v1.8 because of some of the initial setup overhead.

However, on vanilla, updates take almost the same time as the initial setup, whereas on 1.8-local they are much faster.

With 100 user accounts:

Vanilla 1.8.x
* sync initial setup: 57s
* sync update on no changes: 57s

1.8-local
* sync initial setup: 1m 22s
* sync update with no changes: 1.2s

I'm a bit puzzled as to why the CASE patch doesn't seem to make a difference. It should be removing one DB query from the process. Hmmmm - probably that query balances out with the overhead of the INSERTs to the temporary table.

Edit: actually - was testing the wrong version of mine - initial setup is as fast as vanilla. Still not happy enough for me.
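
For readers wondering what the temporary-table INSERTs mentioned above buy you, here is a hedged Python/SQLite sketch of the general technique: load the external usernames into a temp table once, then let the DB compute who to add and who to remove with two joins instead of one query per user. The table and column names (`mdl_user`, `tmp_extuser`) and the function are assumptions for illustration, not the code in the branch.

```python
import sqlite3


def diff_users(moodle_conn, ext_usernames):
    """Return (to_add, to_remove) by joining against a temporary table."""
    moodle_conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS tmp_extuser (username TEXT PRIMARY KEY)"
    )
    moodle_conn.execute("DELETE FROM tmp_extuser")
    moodle_conn.executemany(
        "INSERT INTO tmp_extuser (username) VALUES (?)",
        ((name,) for name in ext_usernames),
    )
    # In the external DB but unknown to Moodle: accounts to create.
    to_add = [row[0] for row in moodle_conn.execute(
        "SELECT t.username FROM tmp_extuser t "
        "LEFT JOIN mdl_user u ON u.username = t.username "
        "WHERE u.id IS NULL ORDER BY t.username"
    )]
    # Active auth=db accounts that vanished from the external DB.
    to_remove = [row[0] for row in moodle_conn.execute(
        "SELECT u.username FROM mdl_user u "
        "LEFT JOIN tmp_extuser t ON t.username = u.username "
        "WHERE u.auth = 'db' AND u.deleted = 0 AND t.username IS NULL "
        "ORDER BY u.username"
    )]
    return to_add, to_remove


moodle = sqlite3.connect(":memory:")
moodle.execute(
    "CREATE TABLE mdl_user ("
    "id INTEGER PRIMARY KEY, username TEXT, auth TEXT, deleted INTEGER)"
)
moodle.executemany(
    "INSERT INTO mdl_user (username, auth, deleted) VALUES (?, ?, ?)",
    [("alice", "db", 0), ("bob", "db", 0), ("admin", "manual", 0)],
)

to_add, to_remove = diff_users(moodle, ["alice", "carol"])
print(to_add, to_remove)
```

The INSERTs into the temp table are the fixed setup cost; the payoff is that the comparison itself no longer scales with one round trip per account.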
In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -
Hmmm. There _is_ some overhead, somewhere. Certain versions do the 100 users in a couple of seconds.
In reply to Martín Langhoff

Re: auth/db - need testers for scalability enhancements

by Martín Langhoff -

Ahhh - the "external" DB table I was testing against was actually slowing me down. Great - at least it made me squeeze my brain on this one.

On 1000 users, MOODLE_18_STABLE:

 real    0m28.293s
 user    0m6.396s
 sys     0m0.612s

On 1000 users, mdl18-local:

 real    0m12.670s
 user    0m5.484s
 sys     0m0.384s

What this doesn't show you is that

  • On subsequent runs, if you are doing updates, MOODLE_18_STABLE takes the same amount of time as on the initial run. Even if accounts didn't really change. mdl18-local takes under 5s.
  • With 43K users, the version in MOODLE_18_STABLE has problems with account deletion and overall just does not work. With mdl18-local, it takes 8 minutes on the initial run. And 29s on subsequent runs.
  • If the connection to the external DB has any latency at all, the speed up is significantly more noticeable.