Learning analytics training-csv file

by Yuval Jacobi -
Number of replies: 2

Hi,

We are evaluating the "Students at risk of dropping out" model in Moodle 3.8 and are trying to improve its evaluation accuracy.

After running the evaluation process and exporting the training-data file, we always get the same number of lines in the CSV file, regardless of the number of courses or students evaluated. In our case this is 9379 lines, of which 9376 are sample records, divided into 4 ranges, that is, 2344 samples per range.
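As a quick check of the figures quoted above, the arithmetic can be sketched as follows. The three non-sample lines are assumed here to be header or metadata rows; the post does not say what they contain.

```python
# Figures taken from the post above.
total_lines = 9379    # lines in the exported CSV file
sample_lines = 9376   # lines that are sample records
ranges = 4            # number of ranges the samples are divided into

# Assumption: the remaining lines are header/metadata rows.
non_sample_lines = total_lines - sample_lines

# Samples per range, plus any remainder if the split were uneven.
samples_per_range, remainder = divmod(sample_lines, ranges)

print(non_sample_lines, samples_per_range, remainder)
```

The split is exact, so each range holds the same number of sample rows.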

Our data scientist is using the training data externally, in the Azure ML framework, to evaluate the results. However, he is confused by this fixed number.

Can someone elaborate on what each sample line in the CSV file represents in relation to the data set used for evaluation? How come this is a fixed number with no correlation to the number of students evaluated?

My guess is that it is some kind of multiplication of indicators by other parameters. Still, further clarification would be most welcome.

Best Regards,

Yuval.



In reply to Yuval Jacobi

Re: Learning analytics training-csv file

by Elizabeth Dalton -
Hi Yuval,

Is the evaluation process completing? Do you have a log file of its status?

Separately, has the model training task completed? Do you have a log file of its status?

The evaluation process does not actually populate the training data -- evaluation and training are two separate steps. Training prepares the model for use. Evaluation checks the accuracy of a model by breaking up past data into "training data" and "test data" and uses only the "training data" to train the model, then tests its predictions on the "test data". Because the evaluation process only uses some of the available data to train the model, its results are not what are used to make ongoing predictions.
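The general idea of splitting past data into training and test sets can be illustrated with a minimal sketch. This is not Moodle's implementation: the function names, the 20% test fraction, the trivial "majority label" model, and the sample data are all hypothetical and chosen only to show the principle.

```python
import random

def split_samples(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into (training, test) lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def train_majority(training):
    """'Train' a trivial model: always predict the most common label."""
    labels = [label for _, label in training]
    return max(set(labels), key=labels.count)

def accuracy(predicted_label, test):
    """Fraction of test samples whose label matches the prediction."""
    if not test:
        return 0.0
    hits = sum(1 for _, label in test if label == predicted_label)
    return hits / len(test)

# Hypothetical data: (features, label) pairs, where label 1 = dropped out.
samples = [((i,), 1 if i % 4 == 0 else 0) for i in range(100)]

training, test = split_samples(samples)   # only `training` is used to fit
model = train_majority(training)          # the model never sees `test`
score = accuracy(model, test)             # accuracy is measured on `test` only
```

Because the model fitted here sees only part of the data, it is useful for estimating accuracy but is not the one you would keep for ongoing predictions.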

It sounds like your model has looked at 2344 enrolments to train the model. Does this match how many student records you expected for completed enrolments? If not, have you checked the "invalid samples" report to see if there is a problem with your data? For example, if your prior courses don't have start and end dates defined, those courses would not be considered "valid" for training the model.

We probably need clearer displays of the status of models (trained, evaluated, data used, etc.). Here are some proposals you might want to view and vote for in the tracker:
  • MDL-62348  Invalid site elements report needs summary
  • MDL-62302  Improve analytics models display and administration 
If you can provide more information about your evaluation and training process logs, I'd be happy to help you troubleshoot further.

Best regards,

Elizabeth
In reply to Elizabeth Dalton

Re: Learning analytics training-csv file

by Yuval Jacobi -
Hi Elizabeth,

Thank you so much for your detailed answer.

All processes completed successfully.
The "Invalid site elements" report comes back empty.

The last evaluation task I ran was on a single course with fewer than 1000 students.
Before running the task, I cleared all courses from the course table (except for this single course), purged all analytics data from the DB (except for the analytics_models table), and deleted the model files from the moodledata/model folder. I also purged the caches. So this was like a fresh installation run on a single course. Yet I got the same figures in the CSV file, as described above.

Anyway, I'm guessing this is not an issue that should concern us too much. It's not really a problem, but rather an internal part of the algorithm's setup that we do not yet fully understand.

Best regards,
Yuval.