Site delisted and never going back up


by DAVID YANG -
Number of replies: 19

My site has been delisted for almost 3 weeks now, and below is a photo of my stats. It is within the requirements of having an active moodle site, and the listing of my site helped my search engine results by quite a bit.

Does anybody know how to get a site back onto the stats.moodle.org list again?

In reply to DAVID YANG

Re: Site delisted and never going back up

by Mary Cooch -
Picture of Documentation writers Picture of Moodle HQ Picture of Particularly helpful Moodlers Picture of Testers Picture of Translators
Hello David. I have looked at your site and can offer some suggestions but would you prefer to contact us directly (ie contact Moodle HQ privately) or are you happy for me to post on here?
In reply to Mary Cooch

Re: Site delisted and never going back up

by DAVID YANG -
Dear Mary,

Hello again. I would be fine with either one, I'd just like the exact same advice. (It appears that you like to make your contributions public, so let's do that)

But then again, how do you contact Moodle HQ privately?


Sincerely,
David Yang
In reply to DAVID YANG

Re: Site delisted and never going back up

by Mary Cooch -
Hello there! So I looked at your site and it seems that on 28 November when our linkchecker tried to access the site it got a 403 Forbidden error message, and was not able to connect. So I suggest you check your webserver rules to see if there was any maintenance on your site around that time.
And as for contacting HQ, if it is about site registration or account problems, you can find the contact link on our Contact page (in the Finally section).
In reply to Mary Cooch

403 error preventing site registration

by DAVID YANG -

Wait, so by my sheer trash luck, every single time the link checker tries to access the site, it gets a 403? 

Could this be Cloudflare getting in the way of the linkchecker?

I... don't know what's going on. 


(Edited by Mary Cooch to change subject title - original submission Friday, 4 December 2020, 7:30 PM)

In reply to DAVID YANG

Re: 403 error preventing site registration

by Mary Cooch -
The link checker checks sites at intervals, and that was the last time it checked it and yes, it got a 403 error. I am not technical but I understand that is an Apache configuration error, so you might like to see if any changes were made, as in my previous post.
In reply to Mary Cooch

Re: 403 error preventing site registration

by DAVID YANG -
Is there a way for me to know how the scraper works or something? Or what time the scraper made those requests? Or if there's a way to scrape the site now? Sorry for all the questions.
And also, my site is like... always online. And then the Moodle scraper says it's... always offline. Which is weird. I checked my Cloudflare uptime and there were requests made on November 28, but without the exact time I don't know what's going on either. :(
In reply to DAVID YANG

Re: 403 error preventing site registration

by David Mudrák -
Picture of Core developers Picture of Documentation writers Picture of Moodle HQ Picture of Particularly helpful Moodlers Picture of Peer reviewers Picture of Plugin developers Picture of Plugins guardians Picture of Testers Picture of Translators

Hello David. Sorry to hear about your trouble getting the site registered. Let's work together to sort it out.

Our site checker is a simple HTTP client that performs some standard requests to your Moodle server and attempts to make sure that it is a valid Moodle installation.

Your site https://usacotutor.com/lms was last successfully checked on 2020-11-01 16:30:27 UTC. After that, whenever it was re-checked, your server responded with a 403 Forbidden HTTP status and did not allow our crawler to evaluate the site. This has happened four times since then: on 2020-11-08 11:00:35 UTC, 2020-11-15 05:30:29 UTC, 2020-11-22 00:00:22 UTC and 2020-11-28 18:30:12 UTC.

Our checker is hosted at AWS. Chances are that you have a protection deployed that blocks access to your site for clients like this?

In reply to David Mudrák

Re: 403 error preventing site registration

by DAVID YANG -
Dear Smart David,

Yea, the crawler probably got blocked by Cloudflare. I disabled it on December 5, 2020, so I will be waiting for the next batch of scrapes. If the issue persists, please let me know.

Also, shouldn't the scraper already have crawled in December? Did it touch mine yet?

Sincerely,
Dumb David
In reply to DAVID YANG

Re: 403 error preventing site registration

by David Mudrák -

Dear Smart David

Haha. I'm not smart. I'm just a janitor who happens to have access to the right databases.

shouldn't the scraper already have crawled in december?

It re-checks sites every 7 days. But if the site is unreachable, it only repeats the check at that interval for the first few attempts. After the 4th unsuccessful check (as in your case), the delay is extended to 90 days. So it would re-check your site again, but only after 3 months.

However, I've reset the unreachable counter for your site so it should try it again soon. I can't tell you any particular time, just that your site is back in the queue of sites to be checked.
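
The re-check schedule described above can be sketched roughly as follows. This is only an illustration of the rule as stated in this post (7 days normally, 90 days after the 4th failed check); it is not the actual linkchecker code, and the names `next_check`, `NORMAL_INTERVAL`, `BACKOFF_INTERVAL` and `MAX_FAILURES` are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Values taken from the description in this thread; names are hypothetical.
NORMAL_INTERVAL = timedelta(days=7)    # usual gap between two checks
BACKOFF_INTERVAL = timedelta(days=90)  # gap once the site looks unreachable
MAX_FAILURES = 4                       # failed checks before backing off


def next_check(last_check: datetime, consecutive_failures: int) -> datetime:
    """Return when a site would be checked again under the stated rule."""
    if consecutive_failures >= MAX_FAILURES:
        return last_check + BACKOFF_INTERVAL
    return last_check + NORMAL_INTERVAL


# The last failed check mentioned in this thread, as a timezone-aware value.
last_failed = datetime(2020, 11, 28, 18, 30, 12, tzinfo=timezone.utc)
after_fourth_failure = next_check(last_failed, consecutive_failures=4)
after_counter_reset = next_check(last_failed, consecutive_failures=0)
```

Resetting the unreachable counter corresponds to calling `next_check` with `consecutive_failures=0`, which puts the site back on the short interval.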

In reply to DAVID YANG

Re: 403 error preventing site registration

by David Mudrák -

Our script checked your site at 17:30 UTC but it still got a 403 Forbidden response :(

Maybe you can configure your site and add an exception for our checker?

It sets the User-Agent to the value MoodleBot-LinkChecker (+https://docs.moodle.org/en/Usage) and the request will come from one of the AWS IP addresses.
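
If it helps with testing, here is a minimal sketch of how a request with that User-Agent could be reproduced using only the Python standard library. The function names `build_request` and `fetch_status` are hypothetical, and this is not how the linkchecker itself is implemented. Note also that a request sent from your own machine will not come from an AWS address, so a 200 here does not prove the real checker would get through.

```python
import urllib.error
import urllib.request

# User-Agent value quoted in this thread.
CHECKER_UA = "MoodleBot-LinkChecker (+https://docs.moodle.org/en/Usage)"


def build_request(url: str, user_agent: str = CHECKER_UA) -> urllib.request.Request:
    """Build a GET request that carries the given User-Agent header."""
    return urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="GET"
    )


def fetch_status(url: str, user_agent: str = CHECKER_UA, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status (for example 403) is in err.code.
        return err.code


# Example call (performs a real network request, so it is left commented out):
# print(fetch_status("https://example.com/lms"))
```

Comparing the status returned for this User-Agent with the status for a normal browser User-Agent shows whether the block depends on the header alone or on something else, such as the source IP.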

In reply to David Mudrák

Re: 403 error preventing site registration

by DAVID YANG -
Ok, yup, I just changed my user agent to a bot (SEMRUSHBot 6 on Mozilla 5) and I get a 403 as well. For some reason only this bot is blacklisted; the ahrefs bot is fine. So I think it has to do with the user agent. I will try to fix that.
In reply to DAVID YANG

Re: 403 error preventing site registration

by DAVID YANG -
I will post ASAP when the 403 stops.
In reply to David Mudrák

Re: 403 error preventing site registration

by DAVID YANG -
Dear Smart David,

Unfortunately, I cannot find out how to whitelist a checker. I host with Plesk, but it doesn't have this option. I tried disabling nginx and Apache, and my site broke. My ModSecurity is already disabled. I tried removing .htaccess and it didn't help. I tried every possible solution I could think of, but nothing works.

The image above is what I see when I spoof my user agent into semrush. I can see the 403, but I can't fix it, at all. I don't know what to do.

Sincerely,
Dumb David
In reply to DAVID YANG

Re: 403 error preventing site registration

by David Mudrák -

Hmm. I don't think it will be caused solely by the User-Agent string value. When I set the user agent to the value that our linkchecker uses, I am still able to access your site:


Screenshot of the HTTP request details

But then, if you are able to reproduce the same 403 error with some other values, it is a clear sign that there is some protection in place. Maybe it is more sophisticated than a simple User-Agent filter.

Are you able to ask your hosting provider for support or info? Do you have access to your server error logs? All these 403 incidents could be logged somewhere, and the log might contain some hint of what blocked the request and where.

In reply to David Mudrák

Re: 403 error preventing site registration

by DAVID YANG -

Yup, the MoodleBot-LinkChecker user agent string is allowed to access it. But the semrush one isn't, so there definitely is some kind of protection.

I do have access to the server error logs. They're quite useless.

2020/12/09 13:04:13 [error] 4575#0: *4724 peer closed connection in SSL handshake (104: Connection reset by peer) while SSL handshaking to upstream, client: 173.245.54.83, server: usacotutor.com, request: "HEAD /wordpress HTTP/1.1", upstream: "https://74.208.250.32:7081/wordpress", host: "usacotutor.com"

That is one I found yesterday. They aren't really special, or maybe I can't understand anything.

My hosting provider is my friend. He owns the server that my site runs on, so I asked him. He doesn't know how to whitelist specific user agents on Plesk, probably because the option doesn't exist. He didn't know how to do IP whitelisting either.

I have given up on how to make my site moodle-scrapeable. :(

In reply to DAVID YANG

Re: 403 error preventing site registration

by David Mudrák -

I have given up

I haven't :) Yesterday evening at 10:30pm (UTC), your site was successfully checked by our linkchecker and is now registered as available \o/

Seriously though, I did not change anything. I only reset the unreachable counter again to re-check your site. Something somewhere decided that the linkchecker was OK to go now.

Glad to see this sorted out. Good luck with administering the site. Take care.

In reply to David Mudrák

Re: 403 error preventing site registration

by DAVID YANG -
Dear Very Very Smart David,

I am beyond words. This is truly a Christmas miracle. Thank you so much for helping out so much, thank you to Mary for her initial help, and just thank you all so much.

This has made my day. Thank you all so so much.

Sincerely,
David Yang
In reply to DAVID YANG

Re: 403 error preventing site registration

by Visvanath Ratnaweera -
Picture of Particularly helpful Moodlers Picture of Translators
Hi

If you are not afraid of spoiling your Christmas miracle, look for any fingerprints of your friend, the hosting provider. I remember the day my kids observed that Santa was wearing the same shoes as a good family friend. ;)
In reply to Visvanath Ratnaweera

Re: 403 error preventing site registration

by DAVID YANG -
haha i don't think it was him, he was in class today taking finals all day long, no way he would deal with this annoying issue
i disabled some random nginx stuff but semrush user agent wasn't working, so i gave up. turns out whatever buttons i pushed worked out.

Also, I have that vague memory too! It was with the tooth fairy, I saw it on a family friend's computer. ;)