I'm a new researcher working at one of the schools that use Moodle. My research takes a data set from the Moodle database and uses data mining techniques to classify students. I got a data set from the school I'm working in, but I'd like to support my research with a better data set.
Finally, my request: is there anyone who could send me a Moodle data set from any educational institution, or even a website that provides free test files?
Thanks in advance
It is so nice to hear that others are researching with Moodle, to some degree. I am not sure this is the appropriate place for your question; however, my response includes two immediate points:
- Your project sounds interesting.
- The first hurdle here, it appears to me, is that there are ethical considerations in requesting a data set from elsewhere, outside the context of the place of your research.
What do you think about that?
It would be good to hear more about your study.
P.S. There may be a Technology Enhanced Learning (TEL R&D) forum in the future, you never know...
Dear Asmaa Galal
I am currently working on my research problem in e-learning. I have developed a framework that integrates Moodle with data mining tools in order to make data pre-processing and data mining tasks easier for non-expert users. However, I'm facing the same problem you had when you posted here: I don't have real Moodle data to test with.
I'm wondering if you could provide your database. This would be a huge help for me to proceed with my research. The data would be used for research purposes only.
Kindly help. Many thanks in advance
I am a new researcher focusing on e-learning and data mining. Could you please help me in this context? My mail ID is sushumnaraoatgmaildotcom. Have you got any data sets? Can we discuss this further? Is there a chance to extend your problem statement?
Dear Asmaa Galal,
I'm a student at the Polytechnic University of Catalonia (Spain), and I'm currently doing my final degree project on applying machine learning to the Moodle platform. I'm having some difficulty obtaining valid data for my experiments.
If possible, I'd really appreciate it if you could share that data, or at least point me to somewhere I could obtain another data set. I'm not interested in the students' names, just the performance values.
Thank you in advance.
A dataset I've looked at in the past is the Open University Learning Analytics Dataset.
It doesn't explicitly state that it is from a Moodle site (just a virtual learning environment), but judging by the activity types listed in the dataset I feel it's safe to assume it originally came from a Moodle instance.
It seems to have been restructured and organised though, since it doesn't appear to be structured like it was taken straight out of the database.
Not sure if this helps anyone but I thought it wouldn't hurt to share.
Roughly speaking, we ship our logs out of Moodle, applying some filtering, and dump them in a big data warehouse for the researchers to use. The aim of the filtering is to reduce the total volume of data, while (hopefully) not losing anything important. Also, we have a lot of our own activities (e.g. ouwiki, forumng) rather than the standard ones.
Anyway, I am basically confirming that your guess is right.
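For readers who want to picture the filtering step described above, here is a minimal sketch in Python. It is not the Open University's actual pipeline, which is not described in this thread. The field names follow Moodle's standard log table as I understand it, and the choice of which components to drop and which fields to keep is purely hypothetical.

```python
# Hypothetical filter settings: the real rules are not given in this thread.
KEEP_FIELDS = ("userid", "courseid", "component", "action", "timecreated")
DROP_COMPONENTS = {"core"}  # assumption: site-level events are not needed


def filter_logs(rows, keep_fields=KEEP_FIELDS, drop_components=DROP_COMPONENTS):
    """Yield reduced log rows, skipping unwanted components and extra columns."""
    for row in rows:
        if row.get("component") in drop_components:
            continue
        # Keep only the columns researchers are assumed to need.
        yield {field: row[field] for field in keep_fields if field in row}


# Made-up rows for illustration only.
rows = [
    {"userid": 5, "courseid": 12, "component": "mod_ouwiki", "action": "viewed",
     "timecreated": 1470009600, "ip": "192.0.2.10"},
    {"userid": 5, "courseid": 1, "component": "core", "action": "loggedin",
     "timecreated": 1470009500, "ip": "192.0.2.10"},
    {"userid": 7, "courseid": 12, "component": "mod_forumng", "action": "viewed",
     "timecreated": 1470013200, "ip": "192.0.2.11"},
]

filtered = list(filter_logs(rows))
```

In this sketch the volume reduction comes from two places: whole rows are skipped by component, and columns such as the IP address are left out of every row that remains.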
This is a very common question, and we are working on trying to provide a better answer.
There is a dataset available in our Moodle Research repository: https://research.moodle.net/158/
This data comes from the “Teaching with Moodle August 2016” MOOC. It is not the full database, but does include the full logs and several other key tables. The data is anonymised and participants gave their permission for this data to be used for research.
We are also working on tools to anonymise Moodle data sets so they can be shared between researchers. However, this has been very difficult. While it is fairly simple to strip out or encrypt text field values, it would still be possible to identify a single user by numerical values (such as grades, forum timestamps, etc.) if someone had the full detail for that single user, and from there it would be possible to expose the identities of many or all of the users. This does not meet the requirements of GDPR, or other legal requirements we are aware of around the world.
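The difficulty described above can be illustrated with a small Python sketch. It assumes text fields are replaced with a keyed hash (HMAC-SHA256 here, chosen only for illustration) and then checks which records still have a combination of numerical values that nobody else shares. The key, the field names, and the records are all made up. This is not how any Moodle tool is implemented.

```python
import hashlib
import hmac
from collections import Counter

SECRET_KEY = b"replace-with-a-private-key"  # hypothetical key, never shared


def pseudonymise(value, key=SECRET_KEY):
    """Replace a text value with a truncated keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def unique_fingerprints(records, numeric_fields):
    """Return the records whose numeric values match no other record."""
    def fingerprint(record):
        return tuple(record[field] for field in numeric_fields)

    counts = Counter(fingerprint(r) for r in records)
    return [r for r in records if counts[fingerprint(r)] == 1]


# Made-up records for illustration only.
records = [
    {"username": "alice", "grade": 87.5, "first_post": 1470009600},
    {"username": "bob", "grade": 62.0, "first_post": 1470013200},
    {"username": "carol", "grade": 62.0, "first_post": 1470013200},
]

shared = [{**r, "username": pseudonymise(r["username"])} for r in records]
exposed = unique_fingerprints(shared, ["grade", "first_post"])
```

Here the usernames are no longer readable, but one record remains unique on grade and timestamp alone. Anyone who already knows those two values for that user can pick out the pseudonymised row.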
Researchers may be interested in contributing to the development of this plugin: https://github.com/emdalton/moodle-local_pseudonymise
This plugin is meant to more aggressively obscure identities on a cloned site by introducing a small amount of random variance in numerical values. I have not had time to work on it recently, but when complete, it should make it much safer to share data between researchers.
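As a rough illustration of the idea of adding a small amount of random variance to numerical values, here is a minimal Python sketch. It is not the plugin's code. The function name, the multiplicative noise, and the 5% bound are assumptions made for the example.

```python
import random


def add_variance(values, max_fraction=0.05, seed=None):
    """Scale each value by a random factor within +/- max_fraction."""
    rng = random.Random(seed)
    return [v * (1 + rng.uniform(-max_fraction, max_fraction)) for v in values]


# Made-up grades for illustration only.
grades = [87.5, 62.0, 62.0, 74.25]
noisy = add_variance(grades, max_fraction=0.05, seed=42)
```

With multiplicative noise a value of zero stays zero, so a real tool would probably need a different treatment for such values, and for timestamps, where the order of events may matter for analysis.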