Link crawler robot

Administration tools ::: tool_crawler
Maintained by Catalyst IT, Brendan Heywood, Daniel Thee Roperto
An admin tool robot crawler that scans your Moodle site for broken, large, or slow links.
164 sites
115 downloads
37 fans
Current versions available: 2

It is an admin tool with a Moodle cron task, but it reaches into your Moodle site via curl, effectively from outside Moodle: it scrapes each page, parses it, and follows its links. Because of this architecture, it only finds broken links that actually matter to students. Because it comes in from outside, it needs to authenticate, and it depends on the moodle-auth_basic plugin. It is recommended that you set up a dedicated 'robot' user with read-only access to all the site pages you wish to crawl. Give the robot capabilities similar to those that real students have.
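
The crawl step described above (authenticate with HTTP basic auth, parse the page, collect the links to follow) can be illustrated with a minimal sketch. This is not the plugin's own code, which is written in PHP and uses curl; it is a Python illustration under stated assumptions. The names `LinkParser`, `extract_links`, `is_internal`, and `build_request`, and the host `moodle.example`, are hypothetical.

```python
import base64
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import Request


class LinkParser(HTMLParser):
    """Collects href/src targets from a page, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links the way a browser would.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in `html`, as absolute URLs, in page order."""
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links


def is_internal(url, site_root):
    """True if `url` is on the same host as the site being crawled."""
    return urlparse(url).netloc == urlparse(site_root).netloc


def build_request(url, username, password):
    """Build a request carrying HTTP basic auth credentials (hypothetical robot user)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return Request(url, headers={"Authorization": f"Basic {token}"})
```

A crawler built along these lines would pass `build_request(...)` to `urllib.request.urlopen`, feed the response body to `extract_links`, and queue internal links for further crawling while only checking external ones for reachability.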

Contributors

Catalyst IT (Lead maintainer)
Daniel Thee Roperto: Coder at Catalyst IT Australia

Comments

  • Mathew Gancarz
    Tue, 17 Jul 2018, 05:19
    Or 3.5?
  • Steve Pollock
    Sat, 10 Nov 2018, 09:14
    Should this be working in 3.3 or 3.4 now? I see you were doing some work on the basic auth so wanted to check in.

    A couple of other questions:
    1. Is a valid cert required? Your curl command fails because of the self-signed cert on my dev machine, though I can override that with -k.
    2. We are running OKTA/SAML as the default login, and again your curl test command returns the OKTA login page rather than the requested page. The user is set to basic auth, but our default is SAML.

    Thanks
  • Ricardo Caiado
    Tue, 8 Jan 2019, 09:16
    Hi,

    Any updates to M3.6?

    Ricardo
  • Ben Haensel
    Thu, 10 Jan 2019, 23:54
    It would be great if this could be updated for 3.6! I can see that the BB community is advocating to get this added to their plugin set as well: https://community.blackboard.com/thread/7082-broken-link-checker-plugin-please - Ben, BlueSky Online School, MN
  • dhirendra singh
    Tue, 10 Sep 2019, 20:59
    Hi,
    Can anyone help me understand why the Progress ETA on the robot status tab is more than 5 years after the crawl start date?

    Progress 1.53% ETA in Monday, 21 April 2025, 4:55 AM | Reset Progress
  • Adam Gogo
    Tue, 4 Feb 2020, 03:51
    Hello,

    I've installed the plugin in my Moodle 3.4.5 environment, which uses a SQL Server database, and I'm getting errors from the plugin. Looking into the code, I found SQL that is specific to MySQL and not supported by SQL Server. I also found that the table mdl_tool_crawler_url has a field called "external", which is a keyword in SQL Server.

    Has anyone got this plugin to work with a SQL Server DB? Did you run into the same issues I have?

    ~Adam
  • Muhammad Sajjad Hussain Abid
    Thu, 26 Mar 2020, 21:26
    Hello Friends,

    Any update for the latest Moodle version, please?

    Sajjad Hussain
  • Phineas Gomez
    Sat, 11 Jul 2020, 23:29
    Does this plugin work with URLs located in quiz feedback?
    I'm testing it, but it's not working :(
  • Greg Myles
    Tue, 5 Jan 2021, 23:52
    Would I be right in thinking that the crawler is unable to check H5P content?
  • pgmoodle dundee
    Fri, 18 Jun 2021, 19:09
    Hello everyone,
    Could someone give me an idea of how to make the bot crawl only a specific module/course rather than the whole Moodle site? I am new to using it and am doing some testing now, so I will appreciate any advice. Thank you!
  • Iron Man
    Thu, 18 Nov 2021, 01:53
    Is this plugin working on Moodle 3.10? We upgraded to 3.10 this summer and it stopped working.
  • Danijel Todic
    Wed, 30 Aug 2023, 21:04
    Moodle 4?
  • Leandro Falcón
    Tue, 24 Oct 2023, 00:06
    Hi, I started a crawling process and it never stopped. This is what the panel shows:

    Crawl start       Crawl end  Duration  Cron ticks  Total URLs  Total links  Broken links  Big / slow links
    oct 21, 19:28:10  -          41:26:49  7           7.825       111.693      379           287
  • Ragan Chastain
    Wed, 13 Dec 2023, 06:44
    Will the link crawler check links in SCORM content?
  • Blair F.
    Fri, 26 Jan 2024, 03:40
    I just found and installed this and am also getting that, but I believe it's because of our single-sign-on setup (SAML?). The redirect is to
    https://login.microsoftonline.com/9d83cfc7-6330-47d5-b18d-45bafe3b1d87/saml2?SAML... blah, blah... Is there any way around this?