We are evaluating the "Students at risk of dropping out" model in Moodle 3.8, trying to improve the evaluation accuracy.
After running the evaluation process and exporting the training-data file, we always get the same number of lines in the CSV file, regardless of the number of courses/students evaluated. In our case this is 9379 lines, of which 9376 are sample records, divided into 4 ranges, i.e. 2344 samples per range.
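For anyone who wants to check the same figures on their own export, here is a minimal sketch of how we arrive at them. It assumes 3 non-sample lines at the top of the file (inferred only from 9379 - 9376, not from any Moodle documentation) and an even split across the 4 ranges. The `summarize` helper and the synthetic input are purely illustrative, not our real data.

```python
import csv
import io

HEADER_LINES = 3  # assumption: inferred from 9379 total lines - 9376 sample records
N_RANGES = 4      # from our evaluation: the samples are divided into 4 ranges


def summarize(csv_text, header_lines=HEADER_LINES, n_ranges=N_RANGES):
    """Count total lines, sample rows, and samples per range in an exported file."""
    # Skip blank lines so a trailing newline does not inflate the count.
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    samples = len(rows) - header_lines
    per_range, remainder = divmod(samples, n_ranges)
    return {
        "total": len(rows),
        "samples": samples,
        "per_range": per_range,
        "remainder": remainder,
    }


# Synthetic stand-in with the same shape as our export (not real data).
fake = "\n".join(["h"] * HEADER_LINES + ["0,1,0"] * 9376)
print(summarize(fake))
# {'total': 9379, 'samples': 9376, 'per_range': 2344, 'remainder': 0}
```

To run it against a real export, read the file contents into a string and pass that to `summarize` instead of `fake`.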
Our data scientist is using the training data externally in the Azure ML framework in order to evaluate the results. However, he is confused by the fact that the line count never changes.
Can someone elaborate on what each sample line in the CSV file represents in relation to the data set used for evaluation? How come this is a fixed number with no correlation to the number of students evaluated?
My guess is that it is some kind of multiplication of indicators by other parameters. Still, further clarification would be most welcome.