Web Dev Account Got "urllib.error.HTTPError: HTTP Error 403: Forbidden" Error

Hi, I have been using PythonAnywhere for several months to scrape a website. A few days ago, I noticed that my program no longer worked.

When I checked my task, I got this error message: "urllib.error.HTTPError: HTTP Error 403: Forbidden". Since I had no time to look into my code further, I let it be.

Today I checked my code on my personal computer, and it worked fine -- it still scraped the website as before. So I suspect that PythonAnywhere may have changed their policies, particularly regarding scraping.

I am paying for a web dev account at the moment. Is anyone else having the same issue?

Thanks and regards, Arnold A.

Hmm, we haven't changed anything in that area -- web dev accounts have unrestricted Internet access. What's the site you're scraping? Have you looked at the response contents? It's possible that they're blocking scraping from cloud services.
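The response contents mentioned above can be inspected from Python by catching the HTTPError, which is itself a file-like response object. A minimal sketch, assuming the standard-library urllib and a placeholder URL:

```python
import urllib.error
import urllib.request

def fetch(url):
    """Return (status_code, body_text) for a URL, even when the
    server answers with an HTTP error such as 403."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as e:
        # HTTPError doubles as a file-like response: both the status
        # code and any body the server sent with the error are on it.
        return e.code, e.read().decode("utf-8", errors="replace")

# Usage: code, body = fetch("https://www.example.com/")  # placeholder URL
```

Printing the body on a 403 will often show a block page or notice explaining why the request was refused.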

Yeah, it seems so -- probably they are blocking scraping from cloud services. I tested it by trying to access the site manually from a bash console and got the 403 error. However, when I tried to access Google, I got a 200 response code.
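If a site returns 403 while others return 200, one thing worth checking is the User-Agent: urllib identifies itself as "Python-urllib/3.x" by default, and some sites reject that outright while serving browsers normally. A minimal sketch -- the URL and header value here are placeholders, not taken from this thread:

```python
import urllib.request

def browser_like_request(url):
    # urllib's default "Python-urllib/3.x" User-Agent is refused
    # outright by some sites; a browser-like value sometimes avoids
    # a 403 (assuming the site permits automated access at all).
    return urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

# Usage (network call left to the caller):
# with urllib.request.urlopen(browser_like_request("https://www.example.com/")) as resp:
#     print(resp.status)
```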

Well, sorry for bothering you, and thanks a lot for your response. Now I need to find out how to regain my access to the website.

It's possible that they're returning a message with some details along with the 403 status -- if from bash you run something like

    curl -i http://www.example.com/

(replacing with the actual URL in question, of course) then you will see whatever text they're returning with the status code, which might be helpful. Or, of course, it might not :-)

Hi Giles, thanks for the clue. It turns out that the site does not block cloud services -- they just changed the URL. Thanks again for your response :) What a great service from the PythonAnywhere team.

Ah, excellent -- that's good news :-)