Databases: Database transactions in Moodle

Databases: Database transactions in Moodle

Tim Hunt發表於
Number of replies: 14
Core developers的相片 Documentation writers的相片 Particularly helpful Moodlers的相片 Peer reviewers的相片 Plugin developers的相片
For a long time we have avoided using database transactions at all in Moodle, because MySQL/ISAM does not support them.

However, they can often be used in a way that makes things more robust on databases that support them, without causing problems in places that do not use them.

For example, in the code I am working on, I want an insert_question_attempt method, which inserts a whole bunch of related data in the database.

Using database transactions you can ensure that, even if an unexpected exception occurs, either all the rows have been inserted and the method completes normally, or nothing is added to the database, and the method end with the exception being thrown.

(And, if you are on MySQL/ISAM that cannot do transactions, then if an exception is thrown you will end up with half the data written, and there is really nothing you can do about that. That is what I mean by less robust in databases that do not support them.)


Now, actually, my situation is not that simple, because I have another method, insert_question_attempt_step, and in pseudocode they look like this:

function insert_question_attempt() {
$DB->insert_record(...)
foreach ($steps as $step) {
insert_question_attempt_step($step);
}
}

function insert_question_attempt_step() {
$DB->insert_record(...)
foreach ($data as $name => $value) {
$DB->insert_record(...);
}
}


Now, if you naively try to use transactions here,

function insert_question_attempt() {
$DB->begin_sql();
try {
$DB->insert_record(...)
foreach ($steps as $step) {
insert_question_attempt_step($step);
}
$DB->commit_sql();
} catch ($e) {
$DB->rollback_sql();
throw $e;
}
}

function insert_question_attempt_step() {
$DB->begin_sql();
try {
$DB->insert_record(...)
foreach ($data as $name => $value) {
$DB->insert_record(...);
}
$DB->commit_sql();
} catch ($e) {
$DB->rollback_sql();
throw $e;
}
}

it will not work, because you are not allowed to nest database transactions.


Now, for a long time, at the OU, we have had a nice solution to this that sam invented. The attached patch is me rewriting that to fit into the Moodle 2.0 database API. It implements nested logical transactions, while only using one real database transaction. Using the patch, insert_question_attempt would look like

function insert_question_attempt() {
$transaction = $DB->start_transaction('insert_question_attempt');
try {
$DB->insert_record(...)
foreach ($steps as $step) {
insert_question_attempt_step($step);
}
$transaction->commit();
} catch ($e) {
$transaction->rollback();
throw $e;
}
}

(and insert_question_attempt_step similarly), and that will work. A call to insert_question_attempt will either insert everything into the database, or nothing will be inserted. The same contract applies if you call insert_question_attempt_step on its own.

The clever bit about this, which you may not immediately notice, is the fact that we are using the $transaction object. The reason for this is that it automatically detects situations where people write silly code like

$transaction = $DB->start_transaction('imlazy');
// Do stuff ...
$transaction->commit();

Suppose an exception is thrown in the middle of // Do stuff ... Then the commit is never called, and there is no call to rollback anywhere. You might be left with a dangling database transactions. Bad.

Well, the $transaction object has a destructor. When you get to the end of the function, the $transaction will be destroyed (unless you return it, in which case it will be destroyed elsewhere). When the $transaction is destroyed, it checks to make sure that it has been committed or rolled back, and if not, it outputs a debugging() message telling you that you forgot, and then does a rollback.

The patch also has code to check to make sure the transactions are nested properly, which is why a string is passed into the constructor.


So, as I say, we have used this for ages at the OU and it is really nice. I would like this in Moodle core, but obviously that requires discussion. Hence this forum post.

(Note, the attached patch is not tested yet, and there are no unit tests, but it should make it clear exactly what I am proposing.)
評比平均分數: -
In reply to Tim Hunt

Re: Databases: Database transactions in Moodle

Tim Hunt發表於
Core developers的相片 Documentation writers的相片 Particularly helpful Moodlers的相片 Peer reviewers的相片 Plugin developers的相片
I filed MDLSITE-814 for the fact that Moodle removed all the indents in the code that I carefully added with the HTML editor. I hope it is still readable.

WTF even if you try to fix it by using non-breaking spaces, they get stripped out. I think I should leave now before I say something I will regret later.
In reply to Tim Hunt

Re: Databases: Database transactions in Moodle

Martín Langhoff發表於
Sounds reasonable. To an extent.

In an implementation like that, the 'transaction' object needs to be a variable in the $DB object (which is in fact a glorified DH handle). The transaction state is a per-dbhandle variable. Otherwise you need to treat it as a global that goes "with" $DB.

I also think that nested transactions should at least throw a warning in developer mode, as the nesting should be avoided and could indicate that we are doing things in the wrong order, or our internal APIs are broken.

The main challenge with transactional code (and exceptions in general) is to handle the other states that may be involved -- files, etc.

With ISAM/MyISAM, the thing that worries me is that the error handling is completely different from what you'd do if are using transactions. No?


In reply to Martín Langhoff

Re: Databases: Database transactions in Moodle

Dan Poltawski發表於

With ISAM/MyISAM, the thing that worries me is that the error handling is completely different from what you'd do if are using transactions. No?

Not sure exactly what you mean by that statement. But the way I (naively, perhaps) look at this is that we treat everything as the lowest common demoninator (myisam) like now - e.g. Insert (a,x); Insert (a, y); Insert (a, z);

Except at the moment if (a,z) fails, we have useless data in the database, maybe a single error is triggered but its too late. But if our db engine supports it we avoid that crud of a,x and a,y being added. At the moment that crud just gets added. What is the disadvantage of avoiding that if our db engine supports it?

In reply to Martín Langhoff

Re: Databases: Database transactions in Moodle

Tim Hunt發表於
Core developers的相片 Documentation writers的相片 Particularly helpful Moodlers的相片 Peer reviewers的相片 Plugin developers的相片
Did you look at my patch Martin? $transaction is not actually the transaction, it is just a token that tells you you have a transaction open. (Bad choice of variable name.)

The actual transaction is handled by $DB, which keeps a stack of $transactiontokens it has handed out. As things are pushed onto and popped off the stack, it checks that nesting is correct, and throws a coding exception if not.

As the first token is pushed onto the the stack in $DB, a real database transaction is started.

When the last $transaction token is popped from the stack, then the real database transaction is committed or rolled back. The real database transaction will only commit if all the nested 'transactions' ended with a call to $transactiontoken->commit(). If any part tried to roll back, then the whole thing is rolled back.


My aim here is not to use the whole range of database transaction featured. All I want to do is to build larger atomic operations.

For example, if you do $DB->insert_record(...) then you know that either it failed, or the entire row was added to the database. You would never get a case where the data in three columns was written and then an error occurred and the half-written row was left there.

I want some of my API methods like insert_question_attempt to be atomic like that (on databases that can support it), and I want to be able to make bigger atomic operations like insert_question_attempt by composing smaller atomic operations like insert_question_attempt_step.
In reply to Tim Hunt

Re: Databases: Database transactions in Moodle

Martín Langhoff發表於

You are right, I looked very narrowly at part of your example. Sorry 臉紅

Rephrasing my comment: I do think that a separate transaction 'object' is not what we should have exposing the commit/rollback methods. Those methods belong to the $DB object.

So I am proposing 2 small changes - first, make starttrans/commit/rollback methods of the DB obj:

function insert_question_attempt() {
try {
      // throws exception in case of failure
      $tx = $DB->start_transaction('insert_question_attempt');
     $DB->insert_record(...)
      foreach ($steps as $step) {
         insert_question_attempt_step($step);
     }
     // note the $DB object
    $DB->commit($tx);
    } catch ($e) {
        $DB->rollback($tx);
        throw $e;
    }
 }

Second - warn about nested transaction when in developer debug mode.

There is a third issue: how do we educate fellow programmers in "ah, yes, we have transactions... sometimes... program as if you didn't, erm, and as if you did."

Anyone looking at our code will assume we are counting on transaction support... hilarity (and bugs) will ensue...

Edit: Re-reviewing... I do think it is an excellent idea to return a token (which your patch does) to force the calling code to be designed around that bit of 'state'. But to perform commit/rollback with sanity you need to access the internal state of $DB.

Ah! Eureka, I see now why you want the transaction to be its own object. The destructor is on the transaction object instead of on the DB handle (which was my initial reading). Hmmm. You might get me to concede the point 眨眼

In reply to Martín Langhoff

Re: Databases: Database transactions in Moodle

Tim Hunt發表於
Core developers的相片 Documentation writers的相片 Particularly helpful Moodlers的相片 Peer reviewers的相片 Plugin developers的相片
Ah yes, it is a much better API for the start/commit/rollback methods to all be on $DB. I did not think that would work, until I saw your code that passes $tx to the commit or roll-back method. (That is so obvious now I have seen it!)

Attached is a revised patch that renames some things, based on what confused people in this thread, has some better PHPdoc comments, and uses your suggested API. I am liking this more and more. Thanks for your help.


But we still need to decide whether this is a good idea. I think it this is a good idea and goes into Moodle core, then there is no reason to have a developer debug warning to warn about nested transactions.

But we do have to avoid developers assuming that these are simple DB transactions.
In reply to Tim Hunt

Re: Databases: Database transactions in Moodle

Martín Langhoff發表於

I like the new patch.

I am still uncomfortable with assuming that the nested transactions are "ok". In a few cases they will, but that is not a safe assumption.

The key question is: where do we stop re-throwing the exception? The idiom in your code example does

try {...}
catch {
   $DB->rollback($tx);
   throw $e;
}

now, that will work well in nested transactions if all the callchain to the "top" transaction propagates the exception.

If any function in the callchain between the point where you throw the exception makes the assumption that it has handled the exception and doesn't re-throw it will make a mess of the whole thing. The higher-level transactions will attempt to continue, and it probably won' t be pretty.

If the PDO (or whatever lower layer) reverts automatically to 'autocommit' after the rollback, statements will succeed... we'll have a rather ugly inconsistency... and if the code catches that thigns are wrong and tries a rollback on itself, it'll error out, possibly leaving the changes behind.

Moodle is huge, and the codepaths are often unexpected.

In reply to Martín Langhoff

Re: Databases: Database transactions in Moodle

Petr Skoda發表於
Core developers的相片 Documentation writers的相片 Particularly helpful Moodlers的相片 Peer reviewers的相片 Plugin developers的相片
I personally do not think we should start adding transactions into moodle core, because moodle would behave differently if you have or not have transactions working on your server - or are we going to say that myisam tables are not supported any more?

On he other hand I agree the DML should support the nested transactions because inevitably we will end up with nesting if we start using transactions outside of enrol, auth, external API and similar parts.

I think that the patch Tim posted above has some problems and the changes in API do not bring any significant benefits. I have created a new tracker issue MDL-20625 and posted a different patch there.

skodak
In reply to Tim Hunt

Re: Databases: Database transactions in Moodle

Dan Marsden發表於
Core developers的相片 Particularly helpful Moodlers的相片 Peer reviewers的相片 Plugin developers的相片 Plugins guardians的相片 Testers的相片 Translators的相片
Mahara does this quite nicely too - they have functions called "db_begin() and db_commit() which you place around a group of sql calls. I think We have a patch around somewhere for Moodle 1.9 that adds this same support!

Transaction support would be really good to have in Core!
In reply to Dan Marsden

Re: Databases: Database transactions in Moodle

Tim Hunt發表於
Core developers的相片 Documentation writers的相片 Particularly helpful Moodlers的相片 Peer reviewers的相片 Plugin developers的相片
Moodle has had begin and commit functions around for ages too, but with instructions to only ever use them for performance reasons (deferring index updates until a bunch of rows has been inserted).

Do your db_begin() and db_commit() nest?
In reply to Tim Hunt

Re: Databases: Database transactions in Moodle

Nigel McNie發表於
Yes they do. We don't name the transactions, so don't check for incorrectly nested ones - but so far this hasn't proven to be a major problem, as it's pretty obvious when people forget to call db_commit() (oh noes where did my data go!!)

Mahara also flat-out refuses to use MyISAM tables. They're untrustworthy, and we're not going to waste time adding code to handle cases when only half the data is there. (We also use foreign keys which help a lot here too).

See my blog post on MySQL vs. Postgres for a summary of why MyISAM is awful: http://nigel.mcnie.name/blog/how-bad-is-mysql
In reply to Tim Hunt

Re: Databases: Database transactions in Moodle

Tim Hunt發表於
Core developers的相片 Documentation writers的相片 Particularly helpful Moodlers的相片 Peer reviewers的相片 Plugin developers的相片
Recent discussion of this has moved to MDL-20625, which is now incredibly long and hard to follow.

I think actually we are getting near a consensus, here is my summary of what I think we have agreed.

1. As ML and Eloy, say, there is no such thing as a nested transaction. There is a clearly understood concept of database transaction, which is what $DB->begin(), ->commit() and ->rollback() should implement.

2. Because transactions cannot be nested, you can never use them in library funcitions, because you don't know whether the person who called you will have started a transaction or not.

3. However, when writing library functions, sometimes you would like to be able to require that a bunch of database updates happen in a transaction, so that either they are all done successfully, or none of them are. That is, we want a declarative style of programming:

$tk = $DB->start_of_calls_that_must_be_in_a_transaction();
$DB->do_stuff();
$DB->do_stuff();
$DB->end_of_calls_that_must_be_in_a_transaction($tk);

The point is that, if we make those calls start and commit a transaction if you are not already in on, otherwise do nothing, then you get what was called nested transactions above. However, the point is this is declarative, so the previous name of it as an action, $DB->start_transaction() was misleading.

4. The other case is also important. Sometimes you need to be sure that you are not in a transaction, because you want to ensure some changes are actually written to the Moodle DB before you send a message to an external system. Something like

$DB->must_not_be_in_a_transaction();
$DB->do_stuff(); // Must be sure this has actually written to the DB now.
send_message_to_remote_system();

Actually, you can do that without inventing a new function name by doing

$DB->begin();
$DB->do_stuff(); // Must be sure this has actually written to the DB now.
$DB->commit();
send_message_to_remote_system();

since begin() will throw an exception if you have already in a transaction.


We could live without 3., but I think it is nice to have. By using this idiom to start transactions automatically, you end up with transactions that last only as long as is really necessary, which is good fore liveness.