Selenium headless: "Browser appears to have exited before we could connect"

I'm using python 3.6.

Here's the code I'm using:

from pyvirtualdisplay import Display
from selenium import webdriver

display = Display(visible=0, size=(800, 600))
display.start()
browser = webdriver.Firefox()
try:
    browser.get('https://www.api.slack.com')
    print(browser.title)
finally:
    browser.quit()
    display.stop()

Any thoughts? I manually installed Selenium 2 for Python 3.6.

While we're at it... the eventual use-case I'm going for is using a proxy like BrowserMob with Selenium to capture sites that are accessed by an individual page. If we are unable to use proxies on pythonanywhere, it'd be nice to know before I spend too much time on this.

Which exact version of Selenium did you install? It needs to be 2.53.6 in order to work with the (somewhat old) version of Firefox we have available.

You should be able to use proxies on PythonAnywhere, but only from a paid account: free accounts are limited to accessing websites on our whitelist. Actually, that might also cause problems with your code as posted -- api.slack.com is on the whitelist, but www.api.slack.com isn't.
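To illustrate why the www prefix matters: if a whitelist is matched on the exact hostname, api.slack.com and www.api.slack.com count as two different sites. The sketch below assumes exact-hostname matching; ALLOWED_HOSTS and is_allowed are hypothetical names for illustration only, not the actual whitelist or how it is implemented.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist for illustration; the real list is maintained elsewhere.
ALLOWED_HOSTS = {"api.slack.com"}


def is_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True if the URL's hostname is exactly one of the allowed hosts."""
    host = urlsplit(url).hostname  # lowercased hostname without port, or None
    return host is not None and host in allowed_hosts


if __name__ == "__main__":
    for url in ("https://api.slack.com", "https://www.api.slack.com"):
        print(url, is_allowed(url))
```

Running a check like this on the URL before handing it to the browser makes a blocked hostname easier to tell apart from a browser start-up failure.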

It's 2.53.6. I fixed the URL but that didn't change anything.

Is the exception coming from the browser.get line, or a different one?

It's from browser = webdriver.Firefox()

2018-02-13 00:27:14,934: Error running WSGI application
2018-02-13 00:27:14,941: selenium.common.exceptions.WebDriverException: Message: The browser appears to have exited before we could connect. If you specified a log_file in the FirefoxBinary constructor, check it for details.
2018-02-13 00:27:14,941: 
2018-02-13 00:27:14,942:   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1994, in __call__
2018-02-13 00:27:14,942:     return self.wsgi_app(environ, start_response)
2018-02-13 00:27:14,942: 
2018-02-13 00:27:14,942:   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1985, in wsgi_app
2018-02-13 00:27:14,942:     response = self.handle_exception(e)
2018-02-13 00:27:14,942: 
2018-02-13 00:27:14,943:   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1540, in handle_exception
2018-02-13 00:27:14,943:     reraise(exc_type, exc_value, tb)
2018-02-13 00:27:14,943: 
2018-02-13 00:27:14,943:   File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 33, in reraise
2018-02-13 00:27:14,943:     raise value
2018-02-13 00:27:14,943: 
2018-02-13 00:27:14,944:   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1982, in wsgi_app
2018-02-13 00:27:14,944:     response = self.full_dispatch_request()
2018-02-13 00:27:14,944: 
2018-02-13 00:27:14,944:   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1614, in full_dispatch_request
2018-02-13 00:27:14,944:     rv = self.handle_user_exception(e)
2018-02-13 00:27:14,944: 
2018-02-13 00:27:14,944:   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1517, in handle_user_exception
2018-02-13 00:27:14,944:     reraise(exc_type, exc_value, tb)
2018-02-13 00:27:14,944: 
2018-02-13 00:27:14,945:   File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 33, in reraise
2018-02-13 00:27:14,945:     raise value
2018-02-13 00:27:14,945: 
2018-02-13 00:27:14,945:   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1612, in full_dispatch_request
2018-02-13 00:27:14,945:     rv = self.dispatch_request()
2018-02-13 00:27:14,945: 
2018-02-13 00:27:14,945:   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1598, in dispatch_request
2018-02-13 00:27:14,945:     return self.view_functions[rule.endpoint](**req.view_args)
2018-02-13 00:27:14,946: 
2018-02-13 00:27:14,946:   File "/home/analyticsforyourblog/mysite/flask_app.py", line 43, in get_tasks
2018-02-13 00:27:14,946:     browser = webdriver.Firefox()
2018-02-13 00:27:14,946: 
2018-02-13 00:27:14,946:   File "/home/analyticsforyourblog/.local/lib/python3.6/site-packages/selenium/webdriver/firefox/webdriver.py", line 80, in __init__
2018-02-13 00:27:14,946:     self.binary, timeout)
2018-02-13 00:27:14,946: 
2018-02-13 00:27:14,947:   File "/home/analyticsforyourblog/.local/lib/python3.6/site-packages/selenium/webdriver/firefox/extension_connection.py", line 52, in __init__
2018-02-13 00:27:14,947:     self.binary.launch_browser(self.profile, timeout=timeout)
2018-02-13 00:27:14,947: 
2018-02-13 00:27:14,947:   File "/home/analyticsforyourblog/.local/lib/python3.6/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 68, in launch_browser
2018-02-13 00:27:14,947:     self._wait_until_connectable(timeout=timeout)
2018-02-13 00:27:14,947: 
2018-02-13 00:27:14,948:   File "/home/analyticsforyourblog/.local/lib/python3.6/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 99, in _wait_until_connectable
2018-02-13 00:27:14,948:     "The browser appears to have exited "

Are you running selenium from a console, or a scheduled task? It's not recommended to run it directly from a web app...

Ah, I'm running it just by visiting the web app page.

The end result I'm going for is that I'd like to build an API that gets data about a website that a user inputs, then returns it to my own site (a wordpress site). This has been incredibly difficult for me because I've only ever coded local apps, so all this tangle of servers and databases and web apps is killing me.

Anyway, maybe by telling you the problem I'm trying to solve we can find a good way to do that.

Also, I just tried a curl -i request on it and the same error occurred.
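On the earlier point about not running Selenium directly from a web app: one possible arrangement is for the web app to only record the requested URL, and for a console script or scheduled task to do the slow browser work later. The following is a minimal sketch under that assumption, using sqlite3 from the standard library. The jobs table and the init_db, enqueue and process_pending helpers are hypothetical names, and scrape stands for whatever function actually fetches the page.

```python
import sqlite3


def init_db(conn):
    """Create the job table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY, url TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending', result TEXT)"
    )
    conn.commit()


def enqueue(conn, url):
    """Called from the web app: record the request and return immediately."""
    cur = conn.execute("INSERT INTO jobs (url) VALUES (?)", (url,))
    conn.commit()
    return cur.lastrowid


def process_pending(conn, scrape):
    """Called from a console or scheduled task: do the slow work outside the web app."""
    rows = conn.execute(
        "SELECT id, url FROM jobs WHERE status = 'pending'"
    ).fetchall()
    for job_id, url in rows:
        try:
            result, status = scrape(url), "done"
        except Exception as exc:  # record the failure and keep going
            result, status = str(exc), "failed"
        conn.execute(
            "UPDATE jobs SET status = ?, result = ? WHERE id = ?",
            (status, result, job_id),
        )
    conn.commit()
    return len(rows)
```

The web view would call enqueue and return a job id straight away, and a separate endpoint could read the stored result once the job has been processed.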

Just to confirm, when you do a curl -i, you see the same error? That is, "the browser appears to have exited"?

Yes, I get the same error

That sounds really weird. When you say that you got the same error, what did curl print out? It would be really strange if it printed out "The browser appears to have exited" -- it doesn't use a browser.

Also -- what do you get if you run which firefox from a bash console?

Oh, the curl command just spits out the standard "something went wrong" page html, it doesn't specify the error. When I check the error log, I see the same error.

22:55 ~ $ which firefox
/usr/local/bin/firefox

Are you sure you're looking at the right errors? The most recent error in the error log is at the bottom, where it's currently showing a last error of

2018-02-14 22:55:52,140: UnboundLocalError: local variable 'browser' referenced before assignment

It's worth noting that this is from two days ago.
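One way to check which error is the most recent is to scan the log for the last line that names an exception. This is a rough sketch, assuming every log line starts with a timestamp followed by ": " as in the excerpts in this thread. The last_exception_line helper is a hypothetical name, and the regular expression is only a heuristic.

```python
import re

# Matches lines such as "2018-02-14 22:55:52,140: UnboundLocalError: ..."
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}): "
    r"([\w.]*(?:Error|Exception)): (.*)$"
)


def last_exception_line(lines):
    """Return (timestamp, exception_name, message) for the last match, or None."""
    found = None
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            found = match.groups()  # later matches overwrite earlier ones
    return found
```

Passing an open log file to last_exception_line works as well as a list of strings, since it only iterates over lines.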

Yes, I changed the code a bit to try another approach and it caused that new error. I reverted it and ran it just now. First from the website, and second from the console. You can see the errors now.
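For anyone who hits the same UnboundLocalError: one common way it arises with this pattern (an assumption here, since the changed code was not posted) is creating the browser inside the try block, so that when creation fails, the finally clause calls browser.quit() on a name that was never assigned. Below is a sketch of a cleanup pattern that avoids this with contextlib.ExitStack. FakeDisplay and failing_browser are stand-ins so the example runs without Selenium or a display, and get_title is a hypothetical helper name.

```python
from contextlib import ExitStack


class FakeDisplay:
    """Stand-in for pyvirtualdisplay.Display; records start/stop calls."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


def failing_browser():
    """Stand-in for webdriver.Firefox() failing at start-up."""
    raise RuntimeError("The browser appears to have exited")


def get_title(display, make_browser, url):
    """Fetch a page title, cleaning up only the resources that were started."""
    with ExitStack() as stack:
        display.start()
        stack.callback(display.stop)   # registered only after start() succeeded
        browser = make_browser()       # may raise; display.stop still runs
        stack.callback(browser.quit)   # registered only once the browser exists
        browser.get(url)
        return browser.title
```

With the real libraries, display would be a Display instance and make_browser would be webdriver.Firefox. The original start-up exception then propagates unchanged instead of being masked by a second error from the cleanup code.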

See our documentation on the best way that we know how to use selenium on PythonAnywhere and see if that helps.

Well, I used your documentation to come up with the code that I originally posted, but this documentation had slightly different code. I tried it and got the same result. (Both running from web and from the console). Perhaps if you took a look at my file structure or code you'd be able to find something I'm missing?

Unfortunately we are not going to be able to look through your code to help you debug programming issues.

That's understandable. Are there any other steps that I can take with your help to figure this out, or am I on my own now?

Wait, I just took a look at your code. Are you trying to access www.api.slack.com?

That is probably not the correct URL. Also, if it is an API, you probably don't need Selenium and could just use requests.

In particular, as a free user, you can only access sites on our whitelist. api.slack.com (without the www) is on the whitelist, as are a number of other Slack endpoints that exist for API access, but www.api.slack.com is not, and it probably isn't a valid Slack endpoint anyway.

I already took out the www and tried it. I have also tried other websites. The slack api is only an example I was using. Once I figure out how to make this work, I am planning to pay for a basic plan because I need to be able to access any site a user inputs. But I don't want to start paying until I know for sure it works.

OK, in that case you are on your own. You could also pay for one-to-one help with codementors (there is a link on our forums page).

Ok, thanks for your help

I think you guys must have the wrong Firefox version installed. From everything I've looked up, this problem only occurs when there's an incompatibility between Selenium and Firefox. I have the recommended Selenium 2.53.6 version installed, but it's simply not playing nice with Firefox.

def get_tasks():
    display = Display(visible=0, size=(320, 240))
    display.start()
    time.sleep(5)
    browser = webdriver.Firefox()

    try:
        browser.get('https://api.slack.com')
        print(browser.title)
        title = browser.title
    finally:
        browser.quit()
        display.stop()

    return title

No matter what, the line that breaks everything is this one:

browser = webdriver.Firefox()

It's definitely not the Firefox version that we have installed; just to make sure, I created a new free account, and ran the following code (moderately adapted from yours):

from pyvirtualdisplay import Display
from selenium import webdriver
import time

display = Display(visible=0, size=(320, 240))
display.start()
time.sleep(5)
browser = webdriver.Firefox()

try:
    browser.get('https://api.slack.com')
    print(browser.title)
finally:
    browser.quit()
    display.stop()

I got the following results:

14:04 ~ $ python analyticstest.py 
Slack API | Slack

Could you try the same experiment? That is, create a file called analyticstest.py in your home directory, copy/paste the above code into it, and then run it directly using Python from a bash console?

That worked. The only differences I can think of between this working version and my version are that I'm running mine in a Flask app and that it isn't in my base directory.

However, I found a workaround for what I was trying to do that doesn't involve using selenium, and it has been much easier for me to code. I think I'll go with that for now instead of trying to figure out this + how to use a proxy. This thread may be useful if I have to try this method again, or for anyone searching for help for this same problem.

OK, glad you worked something out! It's probably a good idea not to use Selenium if you don't have to, anyway -- it's quite a heavyweight way of scraping websites, so if you don't need a full browser (e.g. there is no JavaScript-generated content) then other tools will probably give you code that runs faster and is easier to read. Out of interest, which libraries did you wind up using?

Giles, I'm actually just using a combination of source code crawling with urllib + info from Builtwith.com instead of getting HTTP information myself.

Thanks!
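For anyone finding this thread later, here is a minimal sketch of the urllib approach mentioned above, fetching a page and reading its title without a browser. It uses only the standard library. TitleParser, extract_title and fetch_title are hypothetical names, not the code actually used in this thread, and pages that build their title with JavaScript would still need a real browser.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the stripped text of the first <title> in an HTML string."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def fetch_title(url, timeout=10):
    """Download the page and return its title (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_title(html)
```

Calling fetch_title('https://api.slack.com') would be the browser-free counterpart of the Selenium script earlier in the thread, subject to the same whitelist restrictions on free accounts.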