An update on the gradebook versioning problems we have been discussing (MDL-50522). There are currently two in-progress fixes for gradebook bugs affecting stable versions (MDL-48239 and MDL-49257), and we have been struggling with individual custom solutions that allow:
* no changes to grades during an upgrade step
* grades to be "fixed" when someone accepts a message in the course
* updates to be applied in a fixed order
* not showing 100 separate errors on the gradebook reports
Eventually we need a comprehensive way to "version" the gradebook - but that will require a refactor and will not be safe to backport.
In the meantime we are proposing a way to achieve the same result as versioning the gradebook; the downside is accepting some nastier code in the gradebook.
The idea is:
* introduce a generic course setting for the "version" of the gradebook (not a visible setting, just a flag in the DB)
* Whenever we need to modify gradebook code, we protect the changes with an "if" statement and preserve the old code as well.
* In upgrade scripts, if we detect a course is affected by a bug that is fixed, we set the "version" of that course gradebook to the old version (so the grades are not changed by the upgrade).
* If a gradebook is using an "old" version, we show a generic warning at the top of the gradebook reports with a "fix problems" button, and link to a docs page that lists all the gradebook versions and the bugs fixed in each
* Add behat tests to every issue that verify the old behaviour is not changed, and the new behaviour fixes the bug
* Continue with MDL-50522 to do this in a much nicer way for master by consolidating logic, refactoring etc.
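The version-gating described in the bullets above could look roughly like this minimal sketch. Note the heavy assumptions: Moodle's real implementation would be PHP, and the version number, function name, and the two aggregation behaviours shown here are all hypothetical illustrations of the pattern, not actual gradebook code.

```python
# Sketch of the "version the gradebook behaviour" proposal.
# All names and behaviours are hypothetical; this only illustrates the
# idea of preserving old code paths behind a per-course version flag.

CURRENT_GRADEBOOK_VERSION = 20150619  # version that includes the bug fix


def aggregate_category(grades, course_gradebook_version):
    """Aggregate a list of grades (None = empty), preserving the old
    behaviour for courses pinned to a pre-fix gradebook version."""
    if course_gradebook_version < CURRENT_GRADEBOOK_VERSION:
        # Old (buggy) behaviour, kept verbatim so that upgraded courses
        # see no change to their totals: empty grades count as zero.
        return sum(g or 0 for g in grades) / len(grades)
    # New behaviour: aggregate only non-empty grades.
    nonempty = [g for g in grades if g is not None]
    return sum(nonempty) / len(nonempty) if nonempty else 0.0
```

An upgrade step would then pin affected courses to their old version number, so their results stay stable until someone clicks "fix problems".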
I will be re-posting this message all over the place - and Marina and Adrian are planning to follow this path on their issues mentioned above.
Thanks for reading!
This looks like a complete overreaction to me, brought on by recent history.
My take on the story:
- In Moodle 2.8 major changes were made to the gradebook, with insufficient thought about backwards compatibility.
- And, to make matters worse, there was insufficient testing, so regressions were only found after the release.
- Then that was compounded by the fact that the people responsible for the 2.8 changes largely ignored the regressions until Eric Merrill kicked up a fuss, and they mostly got fixed in 2.8.6/7.
- Hence we are currently completely paranoid about the gradebook, and treating sensible decisions about general principles (MDL-50522) in a totally inappropriate black and white way. (There may also be an element of guilt from the people who ignored bad regressions for so long, but I am speculating.)
So, I think the plan proposed here is extremely bad. The gradebook is already very complex, and it is already hard enough to get anyone to touch the code. The last thing we want to do is make that situation worse. An example would be MDL-48239. The underlying bug could be fixed with about 6 lines of code. The current patch is 462 lines. (It is actually an impressive bit of development Adrian has done there, but sadly I think it will turn out to be wasted effort.)
What happened with the changes in 2.8 is a major exception, and we should not base a whole policy on assuming that is how Moodle is normally changed.
In general, changes are, and should be, forwards and backwards compatible. There is no need for hairy versioning at all.
'Natural' aggregation in 2.8 was a terrible exception. For reasons I don't understand, they modified the existing 'Sum of grades' into 'Natural', rather than making it a new aggregation method. This was a totally unnecessary breaking of backwards compatibility. I trust we won't make that mistake again. If this had been done right, no versioning would be necessary.
Also in 2.8, they changed how grades were displayed, to be out of grade_grades.rawgrademax instead of grade_item.grademax in many cases. However, they did not take sufficient account of the fact that, due to old bugs, many grades have the wrong rawgrademax. The correct solution would have been a database upgrade step as part of 2.8, to fix the old bad data before it was used. Again, no versioning necessary.
The only case when a code change should cause visible changes to grades is when the current behaviour is a bug, and we are fixing it.
Example: the current behaviour is so broken that no-one could possibly want it in 2.8 (please say if you disagree). So we should just fix the bug, and warn about it in the release notes. Again, no versioning necessary.
So, anyway, that is my counter suggestion:
- Don't introduce very complex and unmaintainable versioning logic.
- Give proper thought to gradebook improvements, and implement them properly (like we do everywhere else in Moodle) so they are mostly forwards and backwards compatible.
- Don't be afraid to 'change behaviour' when we are fixing something that is clearly a bug.
- Don't overreact, just because you made one mistake recently. (Even if that was a pretty bad mistake.) Instead learn, and don't make the same mistakes again.
Another point of comparison that occurs to me: how we handle API changes. We don't introduce complex versioning in our APIs. Instead we plan carefully and try to keep compatibility as much as possible. And we communicate necessary changes through upgrade.txt files and the release notes, and so Moodle moves forwards.
"The current behaviour is so broken no-one could possibly want the current behaviour in 2.8 (Please say if you disagree)."
As I have said elsewhere, my institution is not concerned with preserving old gradebook logic, but we do need to preserve old grades. Even if old behavior is badly broken and buggy, we need to have some confidence that grades will not spontaneously change in past courses. I think, however, a much simpler fix will suffice.
The simplest fix, of course, is to archive your old courses on a separate install of Moodle, keeping it on the older version.
Even if we were to ignore the fact that this suggestion does not scale (and it absolutely does not, for Moodle hosts or for institutions that need to keep 5+ years of backups), it also ignores the fact that you would need to host that instance on legacy software, which is likely extremely difficult to set up and maintain, and definitely vulnerable to known exploits. So a host would need to choose between maintaining a separate environment with known exploits, or going through a heavy process to spin up that environment on demand. Neither is feasible.
A lower-cost version of this suggestion is to export the gradebook at the end of the course, either as HTML or Excel, or something, and use that as the definitive record in future.
It should be possible to make a new gradebook report that automates that. Use the existing Moodle code to save the HTML view of the grader report (all rows without paging) to a file, and make it possible to easily access it later.
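A minimal sketch of that archiving idea, assuming nothing about Moodle's actual report API: the table rendering, function name, and file naming below are all invented for illustration. A real report would reuse Moodle's own grader-report rendering rather than building the HTML by hand.

```python
# Hypothetical sketch: snapshot a grader report (all rows, no paging)
# as a static HTML file that can serve as a definitive record later.
import html
import pathlib


def archive_grader_report(course_shortname, header, rows, out_dir="archive"):
    """Write a simple HTML table of the final grades to a file and
    return the path. `header` is a list of column names, `rows` a list
    of row value lists."""
    def cells(tag, values):
        # Escape every value so grades/names cannot break the markup.
        return "".join(f"<{tag}>{html.escape(str(v))}</{tag}>" for v in values)

    body = "".join(f"<tr>{cells('td', row)}</tr>" for row in rows)
    doc = (f"<table><thead><tr>{cells('th', header)}</tr></thead>"
           f"<tbody>{body}</tbody></table>")

    out = pathlib.Path(out_dir)
    out.mkdir(exist_ok=True)
    path = out / f"grader-report-{course_shortname}.html"
    path.write_text(doc, encoding="utf-8")
    return path
```

The design point is that the snapshot is deliberately dumb: plain static HTML with no dependence on gradebook code, so it stays readable no matter how future versions change the calculations.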
Actually, it is very scalable. We are an institution that needs to keep several years of backups and this single-year instance process has helped us avoid the Moodle log bloat issues by splitting up the table space. The key for us is to use VMs for the web servers. The "spin up" process is as simple as starting the VM. The only software that's truly static is the Moodle code, so if we don't feel safe with an old version of the software, we can just turn the web servers off.
"The simplest fix, of course, is to archive your old courses on a separate install of Moodle"
It would be nice if Moodle had more support for archiving (even if only in the form of documentation). For example a way to create a read-only install. I have been experimenting with creating a read-only user in the config.php, but there are other possibilities.
I believe the (over) reaction to freeze everything is due to grade changes that occurred due to SUM being overwritten and reused as Natural instead of creating a new aggregation method.
During all phases of the discussion on Natural Weighting, there were suggestions for implementing it as a new aggregation method. When that suggestion was first not understood, then ignored, promises were made to ensure no grades would be changed during the upgrade process. Maintaining the integrity of grades would be a priority.
This was clearly not what transpired.
First, people used SUM incorrectly and counted on bugs.
Second, developers fixed those bugs without regard for grade integrity.
Natural weighting should have been implemented as a new aggregation method. This would have allowed people to adopt it in their own time. As this was not the case, there seems to be a fundamental disagreement on how to handle things like this in the future.
I'm with Tim on this, don't throw the baby out with the bathwater. We're making this too complicated. If we take care not to make mistakes for the sake of getting things done quicker, we won't have these problems in the future.
Amazing hubris coming from a person who sat on the sidelines while 30+ developers, instructional technologists, partners, et al. came together with 100% consensus on the proposed changes. I may not agree with Damyon's proposal, and I know first-hand how bad the GB code is, but you've taken a wide swath at people who ALSO have brains. For the past couple of years you had seemed to ratchet down your flames, so it'd be great if you went back there.
Bob, that may be how the situation looked to you, but it looks very different to me.
30+ developers, instructional technologists, partners met in a place that I could not feasibly get to myself.
They may have reached a consensus, but if so, it then took them a very long time to write it up and share it with the rest of the community. (And hindsight shows that the consensus did not actually cover all important edge cases.)
When they did, what was written was not at all easy to understand. I was one of the few developers who tried to give feedback at the time, but it was almost impossible to have a sensible debate, or get further information. I apologise again for the fact that my manner of giving feedback was not as constructive as it might have been, but that does not decrease the validity of the points I was trying to make. Anyone can review that forum thread.
You did eventually provide more information in that thread, though I was never convinced that you had correctly considered all edge cases. In the end I gave up trying to give feedback.
Most importantly, after Moodle 2.8 was released, and regressions started to get reported, where the hell were all of you? Making changes to Moodle core is both a privilege and a responsibility. None of us are perfect. We make mistakes, but if you make a mistake in Moodle core, you should expect to jump on it quickly and fix it. As it is, nothing happened after 2.8 until Eric Merrill forced the issue, and things got fixed in 2.8.6. That is not good enough.
(MDL-50373 / https://moodle.org/mod/forum/discuss.php?d=314256 is, I suggest, how regressions in a major Moodle release should be handled. However, I am far from perfect. MDL-27414 & MDL-40753 are two that I am not at all proud of, but that is against a record of 2104 bugs fixed.)
So, anyway, when you are proposing a major change to Moodle core, it is your responsibility to:
- consider all important edge cases.
- articulate your proposed changes in a way that everyone in the community can understand.
- listen and respond to all reasonable comments and suggestions (though ultimately you don't have to accept them all).
- implement and test the changes to ensure, as far as possible, that you don't cause regressions.
- be available after the release to fix the inevitable regressions that do slip through.
I am sorry to preach, but it is sad that a failure in regards to these responsibilities has spoilt the gradebook changes which, at heart, had a lot of sense to them.
I'm a one-man shop so while I can make suggestions I can't do a lot of what you ask because I don't have the resources. North American schools were largely not going to go to 2.8 until Summer, 2015 so their testing commenced accordingly.
I agree with a lot of what you have said, and there was certainly a significant disconnect between HQ and the users, even though Mark Mckay and myself hung out with HQ at least weekly. For testing we need some analysis to provide more use cases. The ones missed were disastrous -- most notably the NW table schema screwup that perpetuated bad practices and, as Elizabeth said above, makes the GB unusable for those who can't apply a patch to their system. NW is a wonderful improvement and I hope its reputation isn't permanently excoriated because of the fumbled introduction.
With regard to edge cases, two things:
- I don't think any LMS vendor should try to engineer a gradebook for all eventualities. As I tell my instructors, the Moodle gradebook is not for everybody.
- We examined many edge-cases at the working group and found means by which they could be worked around
Like you Bob, I hope that the experience with this development does not prevent good things happening in future. After all, the gradebook working group is something of a prototype for how it is proposed that the Moodle association should operate, and we all hope that is a success.
Therefore, we should try to learn from what went not-so-well with this particular development, so we can do better in future.
You are right that not many people upgrade early. In fact, the people who do (e.g. Eric Merrill and the team at Oakland, MI) are performing a valuable service to the community. Their feedback and bug reports are extremely valuable, and should be leapt upon, not ignored.
Re 1. I only partly agree. Moodle should aspire to be a lego set. Lots of simple parts that can be put together to build a wide range of useful constructions. (Or as Martin D would say, Moodle is a platform.) The gradebook has not really managed that yet, but if it did, it should be able to serve most eventualities.
Tim - I'd agree with much of what you said, but I do want to clarify your point on: Don't be afraid to 'change behaviour' when we are fixing something that is clearly a bug.
That's exactly what happened which led to the change to grademin/grademax. The developer thought they were fixing a bug that wasn't experienced "in the wild". In fact, the effects were widespread among institutions (unknowingly), and many would argue that the bug was instead a feature change.
I'd agree that we shouldn't prevent changes that are clearly a bug fix (this example was not), but we also should be very careful and responsible about changes we make to the gradebook. This regression could have been avoided had the decision been supported by actual data, which may require the developer to reach out to community admins, hosts, and Moodle Partners for access to data sets they may not have. It's an important additional step that allows for data-driven decisions.
In regards to moving forward, I do think there's an opportunity for improvement which I have commented on in the ticket.
Good idea Kris! Is there an established channel for developers to communicate with large Moodle installations like us so we can support decision making when addressing these sorts of issues?
Kris, yes, and no.
The change to grademin/grademax should not have led to visible grade changes if the work had been done completely and thoroughly.
The problematic part of the change was changing from using grade_items.grademax to using grade_grades.rawgrademax. That was only a problem because, due to other bugs, for a lot of grades, grade_grades.rawgrademax was wrong.
What should have happened was that testing should have identified this, and as part of the bug fix, there should have been database upgrade steps in lib/db/upgrade.php, to fix the incorrect grade_grades.rawgrademax.
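The kind of data-fixing upgrade step described here can be sketched against a toy schema. This is an assumption-laden illustration: it uses SQLite via Python rather than a real Moodle lib/db/upgrade.php step, the schema is reduced to the two relevant columns, and a real fix might also need to rescale rawgrade alongside rawgrademax.

```python
# Toy sketch of an upgrade step repairing stale grade_grades.rawgrademax
# values so they match grade_items.grademax. Schema is deliberately
# minimal; real Moodle tables have many more columns and conditions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grade_items  (id INTEGER PRIMARY KEY, grademax REAL);
    CREATE TABLE grade_grades (id INTEGER PRIMARY KEY, itemid INTEGER,
                               rawgrademax REAL);
    INSERT INTO grade_items  VALUES (1, 100);
    INSERT INTO grade_grades VALUES (1, 1, 100);  -- already correct
    INSERT INTO grade_grades VALUES (2, 1, 10);   -- stale, from an old bug
""")

# The repair: reset every mismatched rawgrademax to the item's grademax.
conn.execute("""
    UPDATE grade_grades
       SET rawgrademax = (SELECT grademax FROM grade_items
                           WHERE grade_items.id = grade_grades.itemid)
     WHERE rawgrademax <> (SELECT grademax FROM grade_items
                            WHERE grade_items.id = grade_grades.itemid)
""")

fixed = [row[0] for row in conn.execute(
    "SELECT rawgrademax FROM grade_grades ORDER BY id")]
```

Running such a fix in the 2.8 upgrade itself, before the new display code ever read rawgrademax, is exactly what would have made the behaviour change invisible.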
If this had happened before the 2.8.0 release, or even very shortly afterwards, before too many people had upgraded, then there would have been no problem.
It is only really a problem because some people have been experiencing the buggy 2.8.0-5 behaviour for up to six months, and now fixing that bug changes the grades for those people.
So, I don't think this example should be a valid guide to future policy. Do you have other examples?
Tim - I would disagree. The fact that there was a technical solution is slightly separate from the assumption that was made during the decision making process. What would have happened if there was no way to ensure backwards compatibility through data manipulation of the Gradebook tables? In fact, this same technical solution was put forth and discussed at length in the regression bug ticket, though ultimately shot down because it did exactly that.
I don't have examples at my fingertips, but there are definitely ones out there. Typically I find them hiding within lost_functionality tickets, or optimizations that have an oversight and can bring a DB server to its knees. Surely, the OU has experienced their own set of issues that would have been improved if actual end-user data was analyzed at some point in the dev process, and if not then consider yourself lucky.
Even if you discount this example, or other examples, the point remains that there's an opportunity to improve the process in a way that gives insight into actual usage which Moodle HQ may not be able to easily obtain. Data driven decisions are a part of modern software development, the same as focusing on backwards compatibility which you suggested in your original post. In fact, they complement each other. We use these concepts often at Moodlerooms, and they have without a doubt shaped the decision making process for us. Data driven decisions have saved our bacon on more than one occasion, but that positive effect is blunted when there's a miss upstream.
To be absolutely clear, I'm not suggesting that we add a heavy step to the development process. I'm definitely not saying that every change requires an analysis of data. What I am saying is that certain assumptions (e.g. Gradebook usage) should be challenged and certain changes should be validated against large data sets. Like your original suggestions, I am hoping that developers will incorporate this into their decision making process, because making data driven decisions will make Moodle better for everyone.
I completely agree with you on data-driven decision making.
I am not prepared to give up on "What would have happened if there was no way to ensure backwards compatibility?" At the moment that is hypothetical, so not a basis for policy setting.
The problem in 2.8.6 is that some people had been running the buggy code for 6 months, and it was assumed (with no real data) that they would not want the bugs fixed now.
OU gradebook data
As an aside, and in case anyone is interested in the data, here is a summary of all the OU's grade items, which I gathered just before we ran the upgrade. (I was surprised at how much diversity there was in our gradebook. I thought our use was more standardised than that.)
| Grade type | Percentage | Actual grade | Letter | Letter & Actual | Actual & Percentage | Total |
Next are the ones where there was at least one mismatch between grade_items.grademax and grade_grades.rawgrademax (and hence potentially affected by the 2.7 -> 2.8 changes).
| Grade type | Percentage | Actual grade | Letter | Letter & Actual | Actual & Percentage | Total |
Luckily, the ones where there was a real problem (highlighted in pink) were a very small minority, and we could fix them all in one of three ways:
- For aggregations that had been changed from 'Sum of grades' to 'Natural', turn off 'Aggregate only non-empty grades'.
- For calculations, where grade_item.maxgrade was not 100, change it to 100, and change the calculation to account for that. (E.g. if the column was previously [EXPRESSION] out of 200, change it to [EXPRESSION] / 2 out of 100.)
- For manually imported grades where grade_grades.rawgrademax was wrong, update it to equal grade_items.grademax before the upgrade.
Only 3. required direct manipulation in the DB. 1. and 2. could be done through the Moodle UI.
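Option 2 above is just proportional rescaling, which leaves the displayed percentage unchanged. A quick sanity check, using a hypothetical raw result of 150 (the actual [EXPRESSION] values were not given, so the numbers here are invented for illustration):

```python
def rescale(raw, old_max, new_max=100.0):
    """Rescale a grade to a new maximum so the percentage is unchanged.

    e.g. a calculated column showing 150 out of 200 (75%) becomes
    150 / 2 = 75 out of 100, which is still 75%.
    """
    return raw * new_max / old_max
```

This is why changing the calculation to [EXPRESSION] / 2 while dropping grademax from 200 to 100 is safe: both the raw value and the maximum shrink by the same factor.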
Some examples of previous data-driven decisions in Moodle development
So, I completely agree with testing on real data. When I did MDL-20363 I did extensive testing with real data (first the OUs, then other volunteers). Given the scale of the problem, I even built a tool to extract real data from a 2.0 database, and output it as the skeleton of a unit test that could be used to test the upgrade to 2.1 logic.
This is another quite good example of evidence based decision making https://moodle.org/mod/forum/discuss.php?d=175930 (even though the answer it gave me is not one I wanted to hear.)
However, there are logistical difficulties (mainly data protection concerns that have been discussed at various Hackfests). We need to overcome those difficulties.
I think that maybe the major overhaul of the gradebook has just brought some major design errors to the surface. (For example, using one field for multiple things (extra credit/weighting) may save a little space in the database, but the design was short-sighted and has caused huge issues, now and previously.)
I like the overall idea of how the gradebook team worked, but there was very little input from the community at large. If you had no way of connecting with the original team, there was no real way to interact with them. A change like this should have had a lot of beta versions that we could have tested (or did I just miss that phase?), or at least a lot of forum input.
For those of us that have put in input since the release, it has been dismissed because "a great team decided on this and that is the way it is..." which leads me to agree with Tim's comments. When Tim thinks about a change for the quiz area, he takes it to the community first and lets us hash it out (often leading to pages of input that he has to sift through), and I really appreciate that he is truly being community minded. If the gradebook team had done that or more of that earlier, perhaps these issues would have come up before they were integrated to core..
As an original migrant to Moodle for the gradebook, I have followed the progress pretty closely, and I realize that this group took on a lot and really tried to make improvements. But now that the gradebook is out there, it appears that there is little interest in community feedback - this is one of the first posts I have seen requesting that...
For example, a lot of basic functionality has been removed with the removal of the advanced view and admins are screaming to get it back or have a way to see all the settings that are affecting the grade and are being told "No because it was confusing to have two views." Confusing to who, I wonder?
I would hate to see Moodle get caught up in the "Let's simplify everything so everyone can navigate it but at the same time, limit the ability of those who have the ability to go to the next level" philosophy that is invading the technological world. Let's not be Windows 8!
@Emma, two extensive surveys with hundreds if not thousands of respondents would qualify as significant community input. No practical dynamics would have permitted input at the time of the working group.