Trouble with Moodle's "Stats" math.

by Brian Jones -
Number of replies: 5
I'm running Moodle 1.6.2, but I believe I confirmed that 1.7+ has this problem as well.

In short, I'm not sure how Moodle performs its standard deviation calculations at all. I used to think it was just treating dashes as zeroes, but now I'm seeing really *really* odd behavior, and I wonder if someone can help me track down *how* Moodle calculates stddev. Here's what I have:

I have a list of grades for an assignment:

93 91 91 90 88 87 81 81 78 76 75 75 73 73 69 68 67 67 65 58 45 22 2 -

Note there's a grade at the end in the form of a '-'.

Then I wrote a dead simple Python script to calculate the standard deviation. You can cut-n-paste the script below. It should work on any computer that has Python installed, regardless of platform.

If you call that script and pass it all of the above grades replacing the "-" with a zero, the output looks like this:

=========================================
strudel:~ jonesy$ ./stddev.py 93 91 91 90 88 87 81 81 78 76 75 75 73 73 69 68 67 67 65 58 45 22 2 0
[93, 91, 91, 90, 88, 87, 81, 81, 78, 76, 75, 75, 73, 73, 69, 68, 67, 67, 65, 58, 45, 22, 2, 0]
Number of grades: 24
Total sum of all grades: 1615
Mean average: 67
Subtracted mean from each observed value: [26, 24, 24, 23, 21, 20, 14, 14, 11, 9, 8, 8, 6, 6, 2, 1, 0, 0, -2, -9, -22, -45, -65, -67]
Squared each value in above list: [676, 576, 576, 529, 441, 400, 196, 196, 121, 81, 64, 64, 36, 36, 4, 1, 0, 0, 4, 81, 484, 2025, 4225, 4489]
Added all values in above list: 15305
Variance: 665
Standard Deviation: 25.7875939165
Most grades fall between 92.7875939165 and 41.2124060835
=========================================

Now, if you *exclude* the dash from the sample population, your output will look like this:

=========================================
strudel:~ jonesy$ ./stddev.py 93 91 91 90 88 87 81 81 78 76 75 75 73 73 69 68 67 67 65 58 45 22 2
[93, 91, 91, 90, 88, 87, 81, 81, 78, 76, 75, 75, 73, 73, 69, 68, 67, 67, 65, 58, 45, 22, 2]
Number of grades: 23
Total sum of all grades: 1615
Mean average: 70
Subtracted mean from each observed value: [23, 21, 21, 20, 18, 17, 11, 11, 8, 6, 5, 5, 3, 3, -1, -2, -3, -3, -5, -12, -25, -48, -68]
Squared each value in above list: [529, 441, 441, 400, 324, 289, 121, 121, 64, 36, 25, 25, 9, 9, 1, 4, 9, 9, 25, 144, 625, 2304, 4624]
Added all values in above list: 10579
Variance: 480
Standard Deviation: 21.9089023002
Most grades fall between 91.9089023002 and 48.0910976998
=========================================

However, when I click "stats" in Moodle for the same set of grades, I get this:

Highest: 93
Lowest: -
Average: 67.29
Median: 74
Mode: 67, 73, 81, 75, 91
Standard Deviation: 50.50


Can anyone explain why Moodle's standard deviation is 25 to 30 points higher than my own calculations?



#!/usr/bin/python
# Written for Python 2: print is a statement and "/" on integers truncates.

import sys
import math

# Grades are passed as command-line arguments.
grades = [int(x) for x in sys.argv[1:]]
print grades

def getstats(grades):
    ttl = sum(grades)
    numgrades = len(grades)
    avg = ttl/numgrades  # integer division, so the mean is truncated
    avgsub = [val - avg for val in grades]
    square_avgsub = [pow(val,2) for val in avgsub]
    add_squares = sum(square_avgsub)
    variance = add_squares / (numgrades-1)  # sample variance (n-1), also truncated
    stddev = math.sqrt(variance)
    lowbound = avg - stddev
    hibound = avg + stddev
    print "Number of grades: %s" % numgrades
    print "Total sum of all grades: %s" % ttl
    print "Mean average: %s" % avg
    print "Subtracted mean from each observed value: %s" % avgsub
    print "Squared each value in above list: %s" % square_avgsub
    print "Added all values in above list: %s" % add_squares
    print "Variance: %s" % variance
    print "Standard Deviation: %s" % stddev
    print "Most grades fall between %s and %s" % (hibound, lowbound)

getstats(grades)
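
As a cross-check on the script above, here is a minimal sketch in Python 3 that uses the standard `statistics` module with float arithmetic instead of truncated integers. The names `GRADES` and `summarize` are made up for this illustration, and nothing here reflects how Moodle itself computes its stats. It only compares the two ways of handling the dash.

```python
import statistics

# The 23 numeric grades from the post; the trailing "-" is handled two ways below.
GRADES = [93, 91, 91, 90, 88, 87, 81, 81, 78, 76, 75, 75,
          73, 73, 69, 68, 67, 67, 65, 58, 45, 22, 2]

def summarize(grades):
    """Return (mean, median, sample stddev, population stddev)."""
    return (statistics.mean(grades),
            statistics.median(grades),
            statistics.stdev(grades),    # divides by n - 1
            statistics.pstdev(grades))   # divides by n

for label, data in [("dash as zero", GRADES + [0]), ("dash excluded", GRADES)]:
    mean, median, sample_sd, population_sd = summarize(data)
    print("%s: mean=%.2f median=%.1f sample_sd=%.2f population_sd=%.2f"
          % (label, mean, median, sample_sd, population_sd))
```

With the dash counted as zero, the mean (67.29) and median (74) agree with what Moodle reports, and the sample standard deviation comes out near 25.8. Neither treatment of the dash, and neither divisor, gets anywhere close to 50.50.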


In reply to Brian Jones

Re: Trouble with Moodle's "Stats" math.

by Steve Hyndman -

If you take the sum of squares in the first example, 15305, divide it by 6, and take the square root, you get roughly 50.5, which is the standard deviation Moodle reports. Of course, that's using n-18

Steve
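
The arithmetic in this reply can be checked with a short Python 3 sketch. It assumes the dash is counted as zero (n = 24) and uses the unrounded mean, so the sum of squares comes out near 15302.96 rather than the script's 15305. The variable names are illustrative only, and matching the number does not prove that Moodle actually divides by 6.

```python
import math

grades = [93, 91, 91, 90, 88, 87, 81, 81, 78, 76, 75, 75,
          73, 73, 69, 68, 67, 67, 65, 58, 45, 22, 2, 0]  # dash counted as 0

n = len(grades)
mean = sum(grades) / n                          # float mean, about 67.29
sum_sq = sum((g - mean) ** 2 for g in grades)   # sum of squared deviations

sample_sd = math.sqrt(sum_sq / (n - 1))   # conventional divisor, n - 1 = 23
odd_sd = math.sqrt(sum_sq / 6)            # divisor of 6, i.e. n - 18

print("n - 1 divisor: %.2f" % sample_sd)
print("divisor of 6:  %.2f" % odd_sd)
```

The second figure rounds to 50.50, the value Moodle displays, while the conventional n - 1 divisor gives about 25.79.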

In reply to Steve Hyndman

Re: Trouble with Moodle's "Stats" math.

by Brian Jones -
Right. Er. Well. That would seem to be problematic. Anyone know what might make Moodle divide by n-18 instead of, say, I dunno, n-1? :)


In reply to Brian Jones

Re: Trouble with Moodle's "Stats" math.

by Steve Hyndman -

"That would seem to be problematic."

Ha... yeah, that would probably get you an "F" on a stats test :)

In reply to Brian Jones

Re: Trouble with Moodle's "Stats" math.

by Brian Jones -
So it seems that everyone is busy giving input to the 1.9 gradebook, which I'm sure is extremely cool, but I'm not going to be able to move forward with Moodle unless I can demonstrate that it does the simple things properly first. Can anyone in the Moodle development community speak to the issues I'm having here?

In reply to Brian Jones

Re: Trouble with Moodle's "Stats" math.

by Dan Poltawski -
Hi Brian,

I've filed this bug for you: MDL-9472

I'm not clued up enough on standard deviations to debug this myself ;)