aiohttp + oxylabs proxies = Cannot connect to host

I have an async function that makes HTTP GET requests for a list of URLs and returns a list of dicts with their status codes.

import asyncio
import aiohttp
import sys
from user_agent import generate_user_agent

username = 'username'
password = 'pass'
proxy = 'dc.oxylabs.io:8000'
proxy_url = f"http://{username}:{password}@{proxy}"

# Windows fix for annoying event loop warnings
if sys.platform.startswith('win'):
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

def fetch_urls(urls):
    return asyncio.run(_fetch_all(urls))

async def _fetch_all(urls):
    results = []

    async with aiohttp.ClientSession(trust_env=True) as session:
        tasks = [_fetch(session, url, results) for url in urls]
        await asyncio.gather(*tasks)

    return results

async def _fetch(session, url, results):
    try:
        user_agent = generate_user_agent(os='win', navigator='chrome')
        async with session.get(url, headers={"user-agent": user_agent, "upgrade-insecure-requests": "1"}, proxy=proxy_url) as response:
            status = response.status
            results.append({"url": url, "status": status})
    except Exception as e:
        print(f"[ERROR] {url} -> {e}")
        results.append({"url": url, "status": None})

At first it didn't work due to:

Cannot connect to host www.domain.com:443 ssl:default [Connect call failed ('x.x.x.x', 443)]

I had to pass trust_env=True, as in async with aiohttp.ClientSession(trust_env=True) as session, which I found on the forums. That fixed it.
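
For context, trust_env=True tells aiohttp to take its proxy settings from the environment (variables such as HTTP_PROXY and HTTPS_PROXY) instead of connecting directly. Below is a minimal sketch for checking what those settings are, using only the standard library. The helper name describe_env_proxies is made up for illustration and is not part of aiohttp.

```python
import urllib.request


def describe_env_proxies(proxies=None):
    """Return sorted 'scheme -> proxy URL' strings for the given proxy mapping.

    If no mapping is passed, the settings are read from the environment
    with urllib.request.getproxies().
    """
    if proxies is None:
        proxies = urllib.request.getproxies()
    return [f"{scheme} -> {url}" for scheme, url in sorted(proxies.items())]


if __name__ == "__main__":
    # Prints nothing if no proxy is configured in the environment.
    for line in describe_env_proxies():
        print(line)
```

If this prints a proxy when run on the server but nothing on a local machine, that would explain why the direct connection only failed on the server.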


Later, I implemented proxies and now I get a similar issue:

Cannot connect to host dc.oxylabs.io:8000 ssl:default [Connect call failed ('x.x.x.x', 8000)]

I tried running the script on my local PC and it works flawlessly. Could it be that PythonAnywhere blocks access to oxylabs.io? I checked the whitelist (https://www.pythonanywhere.com/whitelist/), since I'm on a free account for now, and found only residential-api.oxylabs.io.

Free accounts on PythonAnywhere have restricted Internet access: they can only make HTTP or HTTPS requests to the allowlisted sites, and only via our own proxy. What is being blocked here is your attempt to connect to the third-party proxy at oxylabs.io.
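
As far as I understand aiohttp's behaviour, an explicit proxy= argument takes precedence over the environment proxy that trust_env=True would otherwise use, so the request goes straight to the third-party proxy and gets blocked. Here is a rough sketch of one way to fall back to the environment proxy when one is configured. The helper pick_proxy is hypothetical, and it assumes that a configured environment proxy means the script is running in the restricted environment, which would not hold on a machine that has its own system proxy.

```python
import urllib.request


def pick_proxy(third_party_proxy, env_proxies=None):
    """Return the proxy URL to pass to session.get(), or None.

    Returning None leaves the choice to aiohttp, which uses the
    environment proxy when the session was created with trust_env=True.
    """
    if env_proxies is None:
        env_proxies = urllib.request.getproxies()
    # Assumption: an HTTP(S) proxy in the environment means outbound
    # traffic has to go through it, so the third-party proxy is skipped.
    if env_proxies.get("http") or env_proxies.get("https"):
        return None
    return third_party_proxy
```

In the code from the question this would be used as session.get(url, headers=..., proxy=pick_proxy(proxy_url)). On a free account, requests would then still only reach allowlisted sites.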

So do I understand correctly that this is a feature to incentivize getting a paid account? In other words, there is no way to add this to the whitelist or to use other proxies, correct?

Initially we provided unrestricted internet access, but we had to limit it to paid accounts for security reasons, since paid accounts are not anonymous. We do allow open and public APIs; see this help page.