"Learn Moodle August 2016" open data set now available!

by Elizabeth Dalton -
Number of replies: 11
We are pleased to be able to announce that an open, anonymised data set drawn from the "Learn Moodle August 2016" MOOC is now available for learning analytics and other researchers!  The data set contains the following files:

File name | Description
mdl_badge_issued.csv | This file contains the records of all badges issued to users during the Aug 16 session of “Teaching with Moodle.”
mdl_course_modules.csv | This file contains records describing each activity in the "Teaching with Moodle" course.
mdl_course_modules_completion.csv | This file contains records of each user’s completion of each activity in the course.
mdl_grade_grades_history.csv | This table keeps a historical record of individual grades for each user and each item, exactly as imported or submitted by modules.
mdl_logstore_standard_log.csv | This table contains entries for each “event” tracked by the Moodle logging system, and is the source for all the Moodle “log” reports.
mdl_user.csv | This is the table containing user records.

The data set is paired with a readme file that explains its use in more detail. To access, visit:

http://research.moodle.net/158/
In reply to Elizabeth Dalton

Re: "Learn Moodle August 2016" open data set now available!

by Tim Hunt -
Are you on https://oerworldmap.org/? Should you be?

In reply to Elizabeth Dalton

Re: "Learn Moodle August 2016" open data set now available!

by Nadav Kavalerchik -
Beautiful initiative! 

I hope it will set a precedent for a formal, secure workflow that lets institutions share their anonymized users' data. On our side of the globe, it seems we are diving deeper and deeper into privacy laws and concerns, as people generally seem more afraid than before of all the data that is gathered about them and of the unclear implications of sharing it in the short and long run.

In reply to Nadav Kavalerchik

Re: "Learn Moodle August 2016" open data set now available!

by Elizabeth Dalton -

Yes, we are also hoping this will help researchers to learn more about learning without putting individually identifying data at risk. There are reasons for individuals to be concerned about this loss of privacy, but there are still ways to gather useful data that can be used to improve learning outcomes without exposing individual learners and teachers.

In reply to Elizabeth Dalton

Re: "Learn Moodle August 2016" open data set now available!

by Dmytro Kovalchuk -

Hi Elizabeth

I am currently working on my capstone project as part of my Data Science program (http://www.galvanize.com/). I have decided to use this data set as the basis for my project. I have quite a few questions about the data provided, mostly about how to interpret the 'mdl_logstore_standard_log.csv' file correctly.

Please let me know whether this is the right forum for questions about this data set.

Right now my main questions concern 'contextid'. What do the 'contextid' values 1 and 0 stand for? Are these activities related to the course '10464' at all?

Thank you!

Dmytro

In reply to Dmytro Kovalchuk

Re: "Learn Moodle August 2016" open data set now available!

by Elizabeth Dalton -

Hello Dmytro,

I am glad to hear the data set is helping with your capstone project. You can ask questions here and I will do my best to answer.

Contextid 0 and 1 refer to the site level, so these items are not specific to the course 10464.
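
For anyone who wants to set these rows aside before analysing course activity, here is a minimal sketch, not an official recipe. It assumes the log is read with Python's csv module and that the column is named 'contextid', as in the question above. The 'id' and 'courseid' columns and the sample rows are invented for illustration only.

```python
import csv
import io

# Context ids described in this thread as site-level (not specific to course 10464).
SITE_LEVEL_CONTEXT_IDS = {"0", "1"}


def split_by_context(rows):
    """Split log rows into (site_level, other) lists based on 'contextid'."""
    site, other = [], []
    for row in rows:
        if row["contextid"] in SITE_LEVEL_CONTEXT_IDS:
            site.append(row)
        else:
            other.append(row)
    return site, other


# Invented sample data standing in for mdl_logstore_standard_log.csv.
sample = io.StringIO("id,contextid,courseid\n1,0,0\n2,1,0\n3,523,10464\n")
site_rows, other_rows = split_by_context(csv.DictReader(sample))
print(len(site_rows), len(other_rows))
```

To run this on the real file, replace the in-memory sample with an open file handle passed to csv.DictReader.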

Best regards,

Elizabeth

In reply to Elizabeth Dalton

Re: "Learn Moodle August 2016" open data set now available!

by Dmytro Kovalchuk -

Hi Elizabeth,

Thank you so much for your response! Much appreciated.

My next questions are about completions and badges.

1. What criteria were used to mark users as Completed? I am trying to build a prediction algorithm that measures the risk of dropout versus successful completion for each week of the course.

I am trying to define a meaningful dependent variable to measure completion. Can you provide any insight here?

2. Also, there appear to be some discrepancies between the data and the description. The badges table (mdl_badge_issued.csv) does not contain badge names, only unique hash values, which makes it impossible to see what type of badge each student received. Is there a way to get these values?
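
One possible way to frame the dependent variable from question 1 is sketched below. It is only an illustration under stated assumptions, not the course's actual completion rule. It assumes the rows of mdl_course_modules_completion.csv carry 'userid', 'coursemoduleid' and 'completionstate' columns, that any nonzero 'completionstate' counts as completed, and that the analyst supplies a hypothetical list of required activities.

```python
from collections import defaultdict


def completion_labels(completion_rows, required_module_ids):
    """Return {userid: True/False}, True when the user has completed
    every activity in required_module_ids (a hypothetical criterion)."""
    done = defaultdict(set)
    for row in completion_rows:
        modules = done[row["userid"]]  # every user seen gets an entry
        # Assumption: a nonzero completionstate means the activity is complete.
        if int(row["completionstate"]) > 0:
            modules.add(row["coursemoduleid"])
    required = set(required_module_ids)
    return {user: required <= modules for user, modules in done.items()}


# Invented sample rows for illustration.
rows = [
    {"userid": "u1", "coursemoduleid": "10", "completionstate": "1"},
    {"userid": "u1", "coursemoduleid": "11", "completionstate": "1"},
    {"userid": "u2", "coursemoduleid": "10", "completionstate": "1"},
    {"userid": "u2", "coursemoduleid": "11", "completionstate": "0"},
]
labels = completion_labels(rows, ["10", "11"])
print(labels)
```

Restricting the input rows to those recorded before a given week would give a per-week version of the same label.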

Once again, thank you for your help.

P.S. I will be happy to share my results if you are interested.

In reply to Dmytro Kovalchuk

Re: "Learn Moodle August 2016" open data set now available!

by Avi Segal -

Hi Dmytro, 

Did you get an answer regarding the badge names? Knowing which type of badge each student received would also help us in our data analysis.

Thanks,


Avi.


In reply to Avi Segal

Re: "Learn Moodle August 2016" open data set now available!

by Elizabeth Dalton -
For this and all similar questions, please see the "readme" file that accompanies the data set:

https://research.moodle.net/158/3/anonymiseddatasetreadme.pdf

This answers the questions about why badges were awarded, as well as many other details of the data structures.

Regards,

Elizabeth
In reply to Elizabeth Dalton

Re: "Learn Moodle August 2016" open data set now available!

by Avi Segal -

Hello Elizabeth,

Thank you for your reply.

I've read the file carefully when trying to use the data and I may be missing something.

The documentation clearly states that the badge name is supplied in the badges file.

Unfortunately, the data itself seems to contain only anonymized versions of the badge names (a fact not mentioned in the documentation). This seems to be a mistake in the data post-processing, as it effectively prevents any analysis of specific badges (the same badge name seems to be anonymized differently on different occasions).
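
A quick way to check this observation is to count how often each value in the badge column repeats. The sketch below is illustrative only. The column name 'name' and the sample rows are hypothetical, so substitute whichever column in mdl_badge_issued.csv holds the anonymized value.

```python
from collections import Counter


def repeat_summary(rows, column):
    """Summarise how often values in `column` repeat across rows."""
    counts = Counter(row[column] for row in rows)
    return {
        "rows": sum(counts.values()),
        "distinct": len(counts),
        "repeated": sum(1 for n in counts.values() if n > 1),
    }


# Invented sample rows: two issuances share a value, one does not.
sample_rows = [
    {"userid": "u1", "name": "a1f3"},
    {"userid": "u2", "name": "a1f3"},
    {"userid": "u3", "name": "9c0e"},
]
summary = repeat_summary(sample_rows, "name")
print(summary)
```

If "distinct" equals "rows" on the real file, every issuance has its own value and the column cannot be used to group records by badge type.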

I may be missing something here and will be happy to get additional assistance.

Thanks,

Avi.