I just wanted to share an idea from my (big) list that could, perhaps, be an interesting GSOC project.
For a few weeks now, Moodle has included a nice spam checker that can look for certain words in user profiles and posts.
My proposal is to evolve that into something more complete and (I hope) useful for the Community.
Here is the general outline of the idea, in 10 points. Feel free to comment / discuss / modify... whatever:
Spam central server
- 1. each time any registered server marks something as spam, it sends it to the central server (if it has agreed and is configured to do so).
- 2. the information provided could be: originator IP, hashed username and email, spam content.
- 3. the central server hashes it, keeping a counter of occurrences of each text.
- 4. any text with >X occurrences (or selected by another algorithm) is considered "official" spam.
- 5. a list of "official" spam hashes is publicly available.
- 6. any Moodle site can download that list, to be used by the spam report.
- 7. those hashes/texts can be of interest in other FLOSS projects (interchange option).
- 8. can evolve to more complex ways of detection (not only hash based, but looking for other types of similarities - linguistic, users, IP...).
- 9. can evolve beyond the limits of the current spam checker, so anybody in a site has an option to mark any content as spam (via capability): just for review by admins, for sending the information directly to the spam server, or both.
- 10. also, content can be checked each time it is going to be saved to the DB (optional), either as an automated, non-blocking reporting tool that informs admins about suspicious senders/contents, or as a blocking one (configurable) that rejects them.
- ... anything else.
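To make points 1-5 a bit more concrete, here is a minimal sketch in Python (not Moodle code, just an illustration). Everything in it is an assumption of mine: the names `SpamCentral`, `content_hash`, `report` and `official_hashes` are hypothetical, the counters live in memory instead of a real database, SHA-256 is picked arbitrarily, and the light normalisation before hashing (lowercase, collapsed whitespace) is an extra that the outline does not ask for.

```python
import hashlib
from collections import Counter


def content_hash(text: str) -> str:
    """Hash spam text after light normalisation (an assumption, not in the outline)."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


class SpamCentral:
    """Hypothetical in-memory central server: counts reports per content hash."""

    def __init__(self, threshold: int):
        self.threshold = threshold  # the "X" from point 4
        self.counts = Counter()     # content hash -> number of reports received

    def report(self, spam_text: str) -> str:
        """Points 1 and 3: receive a report, hash it, bump its counter."""
        digest = content_hash(spam_text)
        self.counts[digest] += 1
        return digest

    def official_hashes(self) -> set:
        """Points 4 and 5: hashes reported more than `threshold` times."""
        return {h for h, n in self.counts.items() if n > self.threshold}


# Three sites report the same text, one site reports something else.
central = SpamCentral(threshold=2)
for _ in range(3):
    central.report("Buy cheap pills   NOW")
central.report("An ordinary forum post")
published = central.official_hashes()
```

With a threshold of 2, only the text reported three times ends up in `published`. The single report stays out, which is the point of the counter: one site marking something by mistake does not make it "official" spam.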
Note I'm not sure which is the best way to achieve this (so I designed it to work with simple hashes). Surely there are a lot of ways to analyse senders / contents; I just don't know about them. But the tool, as outlined in points 1-6, will certainly detect repeated SPAM content, based only on moodlers' interaction with Moodle.
Also, note that points 7-10 are just potential (and possible) "expansions" for the project. The basic project itself is covered by "only" points 1-6.
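As an illustration of how a site could use the downloaded list (point 6) when content is about to be saved (point 10), here is a rough Python sketch. Again, all names are hypothetical (`check_before_save`, `SaveDecision`, the `blocking` flag), and it assumes the site hashes content exactly the same way the central server does, here SHA-256 over lowercased, whitespace-collapsed text.

```python
import hashlib
from dataclasses import dataclass


def content_hash(text: str) -> str:
    """Assumed to match the hashing used by the central server."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


@dataclass
class SaveDecision:
    allowed: bool  # may the content be written to the DB?
    flagged: bool  # should admins be informed about it?


def check_before_save(text, official_hashes, blocking=False):
    """Compare content against the downloaded list of "official" spam hashes."""
    if content_hash(text) not in official_hashes:
        return SaveDecision(allowed=True, flagged=False)
    # Known spam: always flag it; only refuse the save in blocking mode.
    return SaveDecision(allowed=not blocking, flagged=True)


downloaded = {content_hash("Buy cheap pills NOW")}
report_only = check_before_save("buy cheap pills now", downloaded)
blocked = check_before_save("buy cheap pills now", downloaded, blocking=True)
clean = check_before_save("Hello everyone", downloaded, blocking=True)
```

In report-only mode the spam is still saved but flagged for admins; in blocking mode it is flagged and rejected. Content that is not in the list passes untouched in both modes.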
Finally, note that I cannot volunteer (time availability) for any sort of mentorship in GSOC so, if anybody is interested and this gets enough interest, just take the baton.
And that's all. I hope it can serve, at least, as the origin for better and more elaborate ideas related to the SPAM problem. I just wanted to share it here (clearing it out of my Ideas TODO list).