A Data set for a research

Re: A Data set for a research

by Elizabeth Dalton

This is a very common question, and we are working to provide a better answer.

There is a dataset available in our Moodle Research repository: https://research.moodle.net/158/

This data comes from the “Teaching with Moodle August 2016” MOOC. It is not the full database, but it does include the full logs and several other key tables. The data is anonymised, and participants gave permission for it to be used for research.

We are also working on tools to anonymise Moodle data sets so they can be shared between researchers, but this has proved very difficult. It is fairly simple to strip out or encrypt text field values. However, anyone who has the full detail for a single user could still pick that user out by numerical values (such as grades or forum timestamps), and from there it would be possible to expose the identities of many or all of the users. Data anonymised only in this way does not meet the requirements of GDPR, or of other legal requirements we are aware of around the world.
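To make the risk concrete, here is a minimal Python sketch using made-up records. The field names, the salt, and the helper functions `pseudonymise` and `reidentify` are all hypothetical and are not taken from Moodle or from any of our tools. It only illustrates the idea that hashing the text fields leaves the numerical values available for matching.

```python
import hashlib

def pseudonymise(records, salt):
    """Replace each username with a salted SHA-256 digest; grades are left untouched."""
    shared = []
    for rec in records:
        digest = hashlib.sha256((salt + rec["username"]).encode("utf-8")).hexdigest()
        shared.append({"pseudonym": digest[:12], "grades": rec["grades"]})
    return shared

def reidentify(shared, known_grades):
    """Return the pseudonyms whose grades exactly match values the attacker already knows."""
    return [rec["pseudonym"] for rec in shared if rec["grades"] == known_grades]

# Made-up example data (hypothetical users and grades).
records = [
    {"username": "alice", "grades": (87.5, 92.0, 78.25)},
    {"username": "bob", "grades": (64.0, 71.5, 80.0)},
    {"username": "carol", "grades": (87.5, 90.0, 78.25)},
]

shared = pseudonymise(records, salt="example-salt")

# Someone who knows alice's exact grades can find her record without her name.
matches = reidentify(shared, (87.5, 92.0, 78.25))
```

In this toy data set `matches` contains exactly one pseudonym, so every other row linked to that pseudonym (forum posts, log entries and so on) would now be tied to a known person.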

Researchers may be interested in contributing to the development of this plugin: https://github.com/emdalton/moodle-local_pseudonymise

This plugin is meant to obscure identities on a cloned site more aggressively by introducing a small amount of random variance into numerical values. I have not had time to work on it recently, but once complete it should make it much safer to share data between researchers.
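As a rough illustration of the idea only, and not the plugin's actual code, the following Python sketch adds a small uniform random offset to each grade and keeps the result within the grade range. The function name `jitter`, the noise scale of 1.0, and the 0 to 100 range are assumptions chosen for the example.

```python
import random

def jitter(value, scale, lo, hi, rng):
    """Add uniform noise in [-scale, +scale] to value, then clamp to [lo, hi]."""
    noisy = value + rng.uniform(-scale, scale)
    return min(hi, max(lo, noisy))

# A seeded generator keeps the example repeatable; a real tool would not
# publish its seed, since that would let the noise be reversed.
rng = random.Random(42)

grades = [87.5, 92.0, 78.25, 100.0]  # made-up values
noisy_grades = [round(jitter(g, 1.0, 0.0, 100.0, rng), 2) for g in grades]
```

Each value in `noisy_grades` stays within about one point of the original, so aggregate statistics change little, but an exact match against a known grade no longer works. How much variance is needed in practice, and for which fields, is still an open question for the plugin.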