Web Dev Account Got "urllib.error.HTTPError: HTTP Error 403: Forbidden" Error

Hi, I have been using PythonAnywhere for several months to scrape a website. A few days ago, I noticed that my program no longer worked.

When I checked my task, I got this error message: "urllib.error.HTTPError: HTTP Error 403: Forbidden". Since I had no time to look into my code further, I let it be.

Today I checked my code on my personal computer, and it worked fine -- it still scraped the website as before. So I suspect that PythonAnywhere may have changed their policies, particularly regarding scraping.

I am paying for a web dev account at the moment. Is anyone else having the same issue?

Thanks and regards, Arnold A.

Hmm, we haven't changed anything in that area -- web dev accounts have unrestricted Internet access. What's the site you're scraping? Have you looked at the response contents? It's possible that they're blocking scraping from cloud services.
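The response contents mentioned above can be inspected from Python by catching the HTTPError, which is itself a file-like response object. A minimal sketch, assuming the standard-library urllib and a placeholder URL:

```python
import urllib.error
import urllib.request

def fetch(url):
    """Return (status_code, body_text) for a URL, even when the
    server answers with an HTTP error such as 403."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as e:
        # HTTPError doubles as a file-like response: both the status
        # code and any body the server sent with the error are on it.
        return e.code, e.read().decode("utf-8", errors="replace")

# Usage: code, body = fetch("https://www.example.com/")  # placeholder URL
```

Printing the body on a 403 will often show a block page or notice explaining why the request was refused.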

Yeah, it seems so -- probably they are blocking scraping from cloud services. I tested it by trying to access the site manually from a bash console and got the 403 error. However, when I tried to access Google, I got a 200 response code.
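If a site returns 403 while others return 200, one thing worth checking is the User-Agent: urllib identifies itself as "Python-urllib/3.x" by default, and some sites reject that outright while serving browsers normally. A minimal sketch -- the URL and header value here are placeholders, not taken from this thread:

```python
import urllib.request

def browser_like_request(url):
    # urllib's default "Python-urllib/3.x" User-Agent is refused
    # outright by some sites; a browser-like value sometimes avoids
    # a 403 (assuming the site permits automated access at all).
    return urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

# Usage (network call left to the caller):
# with urllib.request.urlopen(browser_like_request("https://www.example.com/")) as resp:
#     print(resp.status)
```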

Well, sorry for bothering you, and thanks a lot for your response. Now I need to find out how to regain my access to the website.

It's possible that they're returning a message with some details along with the 403 status -- if from bash you run something like

    curl -i http://www.example.com/

(replacing with the actual URL in question, of course) then you will see whatever text they're returning with the status code, which might be helpful. Or, of course, it might not :-)

Hi Giles, thanks for the clue. It turns out that the site does not block cloud services -- they just changed the URL. Thanks again for your response :) What a great service from the PythonAnywhere team.

Ah, excellent -- that's good news :-)