Hi,
The current spam cleaner is a trivial tool that relies only on keyword matching to detect spam, which is an ineffective approach. With spammers increasingly hunting for less guarded platforms, Moodle could very soon become a playground for them. In addition, many Moodle-based sites are on .edu domains, which are an extremely lucrative target for spammers looking for backlinks. Thus I believe the spam tool needs a major upgrade.
Here are a few suggested solutions:
Intelligent local database
- Keep an up-to-date log of all spamming IPs, emails and hashes of common spam texts.
- Any new post must go through a screening process that tries to match it against the existing database and flags it as clean or "PROBABLE SPAM".
PROS AND CONS
- Can grow more intelligent over time as the database grows.
- If connected to the global database, it can be a deadly spam killer.
- If the administrator chooses not to connect to the global database, it will be less effective.
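To make the screening idea a bit more concrete, here is a minimal sketch in Python (only for illustration, Moodle itself is PHP). The class name `LocalSpamDB`, its methods and the choice of SHA-256 over whitespace- and case-normalized text are all my assumptions, not an existing Moodle API.

```python
import hashlib


def normalize(text):
    # Collapse case and whitespace so trivially altered copies hash alike.
    return " ".join(text.lower().split())


def text_hash(text):
    # Hash of the normalized text; SHA-256 is an arbitrary choice here.
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


class LocalSpamDB:
    """Hypothetical in-memory stand-in for the local spam database."""

    def __init__(self):
        self.ips = set()
        self.emails = set()
        self.text_hashes = set()

    def record_spam(self, ip, email, text):
        # Called when an admin confirms that a post is spam.
        self.ips.add(ip)
        self.emails.add(email.lower())
        self.text_hashes.add(text_hash(text))

    def screen(self, ip, email, text):
        # Flag the post if any of its attributes is already known.
        if (
            ip in self.ips
            or email.lower() in self.emails
            or text_hash(text) in self.text_hashes
        ):
            return "PROBABLE SPAM"
        return "clean"
```

An exact hash only catches copies that are identical after normalization, so a real tool would probably need fuzzier matching on top of this.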
Intelligent global database
- This idea is clearly explained by Eloy Lafuente (stronk7) HERE
Quoting
- 1. each time any registered server marks something as spam... send it to central server (if agreed and configured to do so)
- 2. information provided could be (originator IP, hashed username and email, spam content).
- 3. central server hashes it, keeping one counter of occurrences of each text.
- 4. any text with >X occurrences (or another algorithm), is considered "official" spam.
- 5. one list of "official" spam hashes is publicly available.
- 6. any moodle site can download it and will be used by the spam report.
- 7. those hashes/texts can be of interest in other FLOSS projects (interchange option).
- 8. can evolve to more complex ways of detection (not only hash based, but looking for other types of similarities - linguistic, users, IP...).
- 9. can evolve out from the limits of current spam-checker so anybody in a site has one option to mark as spam any content (via capability), just for review of admins, for direct sending of information to SPAM server or both.
- 10. also, can be checked each time one content is going to be saved to DB (option), using it as an automated non-blocking reporting tool or as a blocking one (configuration), informing admins about suspicious senders/contents or blocking them.
PROS AND CONS
- A combination of a local database and a global database can prove to be extremely effective.
- An API can be coded and provided to users so that they can query the global database in real time and get a flagged result for a particular post. This could also be developed into a fully API-based spam cleaner, removing the local database from the picture (if desired).
- However, we must note that it will increase the load on both the user's and Moodle's servers (worth paying, I guess?).
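Steps 3 to 5 of the quoted proposal (hash each report, count occurrences, publish hashes seen more than X times) could look roughly like the sketch below. It is Python for illustration only; `CentralSpamServer`, `report` and `official_hashes` are hypothetical names, and the normalization and SHA-256 hashing are my assumptions.

```python
import hashlib
from collections import Counter


class CentralSpamServer:
    """Hypothetical central server that counts reported spam texts."""

    def __init__(self, threshold):
        self.threshold = threshold  # the "X" from step 4
        self.counts = Counter()

    def report(self, spam_text):
        # Step 3: hash the reported text and bump its counter.
        normalized = " ".join(spam_text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        self.counts[digest] += 1
        return digest

    def official_hashes(self):
        # Steps 4 and 5: hashes reported more than X times form the public list.
        return {h for h, n in self.counts.items() if n > self.threshold}
```

A Moodle site would then download the result of `official_hashes()` periodically and merge it into its local database. Counting reports per distinct site rather than per report would make it harder for a single server to poison the list.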
Third Party APIs
- There are many third-party APIs which provide excellent spam detection mechanisms.
- Some examples are Akismet (although basically designed to fight comment spam, it can still be configured for use against forum spam), Stopforumspam, or any other API which the Moodle developer team thinks is trustworthy.
PROS AND CONS
- Can kick-start spam killing immediately (the local and global databases will take time to build up).
- Some APIs offer spam detection based on complex algorithms that consider a lot of factors, which can prove to be extremely effective. For example, for me, Akismet kills up to 99% of spam comments on my WP blogs.
- No need to maintain synchronized local and global databases.
- The only catch here is that we will have to depend on a third party for this module to work. However, if we choose a trustworthy third-party API, this shouldn't be an issue.
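One way to soften the third-party dependency would be to hide the remote service behind a small interface and fall back to a local check when it is unreachable. The sketch below is Python for illustration only; `SpamChecker` and both checker functions are hypothetical, and the remote checker stands in for a real client such as an Akismet or Stopforumspam wrapper, whose actual APIs are not shown here.

```python
from typing import Callable

# A checker takes (ip, email, text) and returns True when it thinks it is spam.
Checker = Callable[[str, str, str], bool]


class SpamChecker:
    """Hypothetical wrapper: try the remote service, fall back locally."""

    def __init__(self, remote: Checker, fallback: Checker):
        self.remote = remote
        self.fallback = fallback

    def is_spam(self, ip: str, email: str, text: str) -> bool:
        try:
            return self.remote(ip, email, text)
        except OSError:
            # Network failures (connection errors, timeouts) are OSError
            # subclasses in Python; use the local check instead.
            return self.fallback(ip, email, text)


def keyword_fallback(ip: str, email: str, text: str) -> bool:
    # Stand-in for the current keyword-matching cleaner; keywords are made up.
    return any(word in text.lower() for word in ("viagra", "casino"))
```

With this shape the provider can be swapped without touching the rest of the module, and the site still gets some protection while the service is down.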
Suggestions are welcome.
Thanks
Ankit Agarwal