Extending the machine learning classifier to allow multi-class discrimination

by Vlad Apetrei -
Number of replies: 7

Hello, I just wanted to announce that, as part of Google Summer of Code, I will be working on adding multi-class classification to the machine learning backend. This will be done over the course of the summer and will expose functionality from Python TensorFlow and the php-ml library. The project mainly involves writing the train(), classify() and evaluate_classification() methods for both the PHP and Python backends.

The plan is to tackle the PHP backend first, which should be done by July, and then the Python backend by August, so that everything is ready for testing. Everything is expected to be completed by September.

In reply to Vlad Apetrei

Re: Extending the machine learning classifier to allow multi-class discrimination

by David Monllaó -
Welcome Vlad.

The project Vlad will work on will allow us to create predictive models with more than 2 classes. At the moment we only support binary classification (e.g. "is this student at risk?" yes / no). Once this project is completed we will be able to build models like "how often will this user access the course?" never / on a monthly basis / on a weekly basis / on a daily basis.
In reply to Vlad Apetrei

Re: Extending the machine learning classifier to allow multi-class discrimination

by Vlad Apetrei -
A brief report on the progress made so far: it turns out the PHP backend already supported multi-class classification out of the box through the one-vs-rest regression implemented in php-ml. I have therefore started working on unit tests that cover this case. However, they are still a work in progress.
This is a link to my forked repo, and specifically to the branch I made for the PHP side of the multi-class classification work:
https://github.com/valadhi/moodle/tree/binary_classification_php
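For anyone unfamiliar with one-vs-rest, here is a minimal sketch of the idea in Python. It is not the php-ml code: the logistic regression trainer, the function names and the toy data are all assumptions made up for illustration. One binary model is trained per class ("this class" vs "any other"), and the predicted class is the one whose model gives the highest score.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_binary(samples, targets, epochs=2000, lr=0.1):
    """Fit a logistic regression with full-batch gradient descent."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, targets):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def train_one_vs_rest(samples, labels):
    """Train one binary model per class: 'this class' (1) vs 'any other' (0)."""
    models = {}
    for cls in sorted(set(labels)):
        binary_targets = [1 if label == cls else 0 for label in labels]
        models[cls] = train_binary(samples, binary_targets)
    return models


def classify(models, x):
    """Return the class whose binary model is most confident about x."""
    def score(cls):
        weights, bias = models[cls]
        return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return max(models, key=score)


# Toy data (invented): three well-separated clusters, one per class.
samples = [(0, 0), (1, 0), (0, 1), (5, 0), (6, 0), (5, 1), (0, 5), (1, 5), (0, 6)]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
models = train_one_vs_rest(samples, labels)
predictions = [classify(models, x) for x in [(0.5, 0.5), (5.5, 0.5), (0.5, 5.5)]]
print(predictions)
```

The point is that nothing in the binary trainer has to change: the multi-class behaviour comes entirely from the wrapper that relabels the targets and compares the scores.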
In reply to Vlad Apetrei

Re: Extending the machine learning classifier to allow multi-class discrimination

by Vlad Apetrei -
Another little progress report. I have now completely handled the PHP side of the backend. I have written unit tests to check that both training and prediction work well when classification is done through php-ml, by adding custom test targets and indicators specifically for this case. It's on to the Python backend now.
In reply to Vlad Apetrei

Re: Extending the machine learning classifier to allow multi-class discrimination

by Vlad Apetrei -
Hello,

I have now finished rewriting the Python backends. The training and prediction methods were pretty well written and didn't need many changes to the TensorFlow graph. The evaluation side was a bit more involved, as it needed some custom handling for the different reports when n_classes > 2. It should plug into the existing Moodle functionality without further changes if you simply specify the number of classes and their labels in the header of the dataset. I still need to make sure everything is working properly and perhaps write some automated tests, but it seems to be fine based on my manual testing.
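To give an idea of what the evaluation reports have to deal with once n_classes > 2, here is a rough sketch of a confusion matrix with per-class precision and recall. This is not the code from the backend: the function names are hypothetical, the class labels are borrowed from David's example above, and the sample predictions are invented.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {cls: i for i, cls in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def per_class_report(matrix, classes):
    """Precision and recall for each class, treating it as 'positive' in turn."""
    report = {}
    for i, cls in enumerate(classes):
        true_pos = matrix[i][i]
        predicted_as_cls = sum(row[i] for row in matrix)  # column total
        actually_cls = sum(matrix[i])                     # row total
        report[cls] = {
            "precision": true_pos / predicted_as_cls if predicted_as_cls else 0.0,
            "recall": true_pos / actually_cls if actually_cls else 0.0,
        }
    return report


# Invented sample data, for illustration only.
classes = ["never", "monthly", "weekly", "daily"]
y_true = ["never", "never", "monthly", "weekly", "weekly", "daily", "daily", "daily"]
y_pred = ["never", "monthly", "monthly", "weekly", "daily", "daily", "daily", "weekly"]

matrix = confusion_matrix(y_true, y_pred, classes)
report = per_class_report(matrix, classes)
for cls in classes:
    print(cls, report[cls])
```

With two classes a single precision and recall pair is enough, but with more classes each class gets its own pair, so the reports need a rule for combining them into one number.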
In reply to Vlad Apetrei

Re: Extending the machine learning classifier to allow multi-class discrimination

by Vlad Apetrei -
The Google Summer of Code project is now finished, and this is the final description of the project with all the links to the repositories: https://gist.github.com/valadhi/992b808005a2cf7b7988276aa1533c92
I still want to change the scoring method used for model evaluation from MCC to F1, and hopefully also work on identifying and implementing new uses of machine learning, such as new models or activity recommendations.
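For comparison, this is a small sketch of the two scores computed on the same predictions. It assumes macro-averaged F1 (the unweighted mean of the per-class F1 scores) and the usual multi-class generalisation of MCC; the function names and the sample labels are made up, and the averaging actually used in Moodle may differ.

```python
import math
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (range 0 to 1)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient (range -1 to 1)."""
    total = len(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    classes = set(true_counts) | set(pred_counts)
    # Sum over classes of (times the class occurred) * (times it was predicted).
    cross = sum(true_counts[c] * pred_counts[c] for c in classes)
    denom = math.sqrt(
        (total ** 2 - sum(n ** 2 for n in pred_counts.values()))
        * (total ** 2 - sum(n ** 2 for n in true_counts.values()))
    )
    return (correct * total - cross) / denom if denom else 0.0


# Invented labels: five of six predictions are correct.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
print(round(macro_f1(y_true, y_pred), 4))
print(round(multiclass_mcc(y_true, y_pred), 4))
```

One practical difference is the range: MCC runs from -1 to 1 while F1 runs from 0 to 1, so the two cannot be compared directly against the same threshold.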
In reply to Vlad Apetrei

Re: Extending the machine learning classifier to allow multi-class discrimination

by David Monllaó -
Thanks for your work in GSoC 2019, Vlad.

Vlad's GSoC contribution to the Moodle LMS can be found at https://tracker.moodle.org/browse/MDL-58992. The machine learning layer in Moodle 3.8 now has multi-class capabilities, which means that the predictive models run by Moodle are able to classify students into more than 2 classes. An example of this would be a predictive model where students are classified into very low grade, low grade, average grade, high grade and very high grade. Until now we were only able to classify students into 2 classes (e.g. pass / fail).

Vlad not only completed his Google Summer of Code project, but also contributed a second improvement to the Moodle LMS: https://tracker.moodle.org/browse/MDL-66476, where he modified the algorithm that generates an overall score of a model.

It was a pleasure working with you, Vlad.