Selenium no errors but title is returning an empty string

Hi, I am trying to run this very simple function using Selenium WebDriver, which returns the title of a website.

from selenium import webdriver
from pyvirtualdisplay import Display
import traceback
browser = None

# displayDebug will open google.com and return the title
def displayDebug(o,d):

    try:
        display = Display(visible = 0, size = (800,600)).start()
        browser = webdriver.Firefox()
        browser.get('http://www.google.com')
        myTitle = browser.title

    except Exception as e:
        myTitle = "Title not found"
        traceback.print_exc()

    finally:
        if browser is not None:
            browser.quit()
        display.stop()


    return myTitle

The result of this function is an empty string. If I define myTitle above the browser.get statement (as some random string), myTitle is defined as the random string. I am not sure if browser.title is returning an empty string, or if the try statement is terminated (though I suspect the former, because otherwise, myTitle would be "Title not found"). This leads me to believe this is an issue with the headless display.
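
To make that reasoning concrete, here is a minimal sketch of the same try/except flow with a stand-in for the browser. FakeBrowser and get_title are hypothetical names made up for illustration and are not part of Selenium. It shows that an empty title and a raised exception lead to different return values, so an empty string does suggest that nothing was raised inside the try block.

```python
class FakeBrowser:
    """Hypothetical stand-in for webdriver.Firefox; not part of Selenium."""

    def __init__(self, title="", fail=False):
        self.title = title
        self._fail = fail

    def get(self, url):
        # Simulate a driver error when asked to.
        if self._fail:
            raise RuntimeError("simulated driver failure")


def get_title(browser, url):
    """Same control flow as displayDebug, minus the display handling."""
    try:
        browser.get(url)
        my_title = browser.title
    except Exception:
        my_title = "Title not found"
    return my_title


# Page "loads" but has an empty title: the empty string is returned.
print(repr(get_title(FakeBrowser(title=""), "http://www.google.com")))
# The driver raises: the fallback string is returned instead.
print(repr(get_title(FakeBrowser(fail=True), "http://www.google.com")))
```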

I looked through many of the forum posts on this topic. For reference, I am using a version of Selenium older than version 3. I installed xvfbwrapper. I did not install the PythonAnywhere-compatible Firefox version on my local machine; am I supposed to? Any suggestions would be appreciated.

Thank you

Just for clarity -- where are you running the code when you get this effect? I ask because you mention installing the PythonAnywhere-compatible version of firefox on your local machine -- there's certainly no need to do that if you're running the code on PythonAnywhere.

Hi Giles,

I am running the application on PythonAnywhere. I don't have firefox on my local machine. I only asked because I am unsure why the virtual display is not working.

Thanks, Ivgars

What do you get when you look at the browser body text and so on? Also, are there any error tracebacks?

I tried running this on a local machine today and it worked. From pythonanywhere, I cannot access any information from the browser (I suspect a display issue). There are no error tracebacks.

What does it print if you do

print(repr(browser.find_element_by_tag_name('body').text))

...just after the

browser.get('http://www.google.com')

...?
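
If it helps, the same check can be extended to gather a few more facts in one go. This is only a sketch: describe_page is a hypothetical helper name, and it assumes a Selenium version older than 4, where find_element_by_tag_name is still available.

```python
def describe_page(browser):
    """Collect a few facts about whatever the browser currently has loaded."""
    body = browser.find_element_by_tag_name('body')
    return {
        # repr() makes empty strings and stray whitespace visible in logs.
        'title': repr(browser.title),
        'url': repr(browser.current_url),
        'body_text': repr(body.text),
        'source_length': len(browser.page_source),
    }
```

Calling print(describe_page(browser)) just after browser.get(...) would show whether the page is completely blank or only missing a title.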

Hi Giles,

When I include the line you mentioned, the error logs display the error shown below. Also, the function displayDebug returns "Title not found". When the line is not included, the function returns an empty string and the error logs display no error.

2018-07-11 13:31:12,313: Traceback (most recent call last):
2018-07-11 13:31:12,313:   File "/home/IGLscheduling/mysite/displayDebug.py", line 12, in displayDebug
2018-07-11 13:31:12,314:     browser.get('http://www.google.com')
2018-07-11 13:31:12,314:   File "/home/IGLscheduling/.virtualenvs/my-virtualenv/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 248, in get
2018-07-11 13:31:12,314:     self.execute(Command.GET, {'url': url})
2018-07-11 13:31:12,314:   File "/home/IGLscheduling/.virtualenvs/my-virtualenv/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 234, in execute
2018-07-11 13:31:12,314:     response = self.command_executor.execute(driver_command, params)
2018-07-11 13:31:12,314:   File "/home/IGLscheduling/.virtualenvs/my-virtualenv/lib/python3.6/site-packages/selenium/webdriver/remote/remote_connection.py", line 401, in execute
2018-07-11 13:31:12,315:     return self._request(command_info[0], url, body=data)
2018-07-11 13:31:12,315:   File "/home/IGLscheduling/.virtualenvs/my-virtualenv/lib/python3.6/site-packages/selenium/webdriver/remote/remote_connection.py", line 433, in _request
2018-07-11 13:31:12,315:     resp = self._conn.getresponse()
2018-07-11 13:31:12,315:   File "/usr/lib/python3.6/http/client.py", line 1331, in getresponse
2018-07-11 13:31:12,315:     response.begin()
2018-07-11 13:31:12,315:   File "/usr/lib/python3.6/http/client.py", line 297, in begin
2018-07-11 13:31:12,315:     version, status, reason = self._read_status()
2018-07-11 13:31:12,316:   File "/usr/lib/python3.6/http/client.py", line 266, in _read_status
2018-07-11 13:31:12,316:     raise RemoteDisconnected("Remote end closed connection without"
2018-07-11 13:31:12,316: http.client.RemoteDisconnected: Remote end closed connection without response

Any suggestions?

Thanks, Ivgars

You're going to http://www.google.com. My guess is that Google is giving you a redirect, probably to https://www.google.com/. Try that as your URL instead.

Unfortunately, the url change had no effect.

In that case, I think you're misinterpreting the results in some way. Adding a line after the browser.get would not suddenly cause the browser.get to throw an exception. Perhaps your exception catching is mangling the stacktrace in some way. Try with the code here: http://help.pythonanywhere.com/pages/selenium/
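
One way to rule out a misleading traceback is to run each call as its own step and record which one raised. A minimal sketch with made-up stand-ins (run_steps, load_page and read_body are hypothetical names, not Selenium or PythonAnywhere APIs):

```python
import traceback


def run_steps(steps):
    """Run (label, callable) pairs in order; report the first one that raises."""
    for label, step in steps:
        try:
            step()
        except Exception:
            # format_exc() returns the full traceback text of the active exception.
            return label, traceback.format_exc()
    return None, ""


def load_page():
    pass  # stands in for browser.get(...)


def read_body():
    # stands in for the body lookup failing
    raise ConnectionError("simulated: remote end closed connection")


failed_label, trace_text = run_steps([("get", load_page), ("body", read_body)])
print(failed_label)
print(trace_text)
```

With real Selenium calls in place of the stand-ins, the label would show whether the failure comes from browser.get itself or from the call after it.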

Hi Glenn,

I tried your suggestion, but browser.get is still returning an empty string when using the code provided in the link. I tried including the print statement Giles suggested using this new code and I get a "RemoteDisconnected: Remote end closed connection without response" error. I tried reloading the application a few times, incorporating implicit/explicit waits, searching for all kinds of elements (find_element_by_class, css_selector, etc.) and loading different websites. Occasionally, I'll get the error mentioned above. Occasionally, I'll get empty return values. But I do not get what I'm explicitly searching for.

As mentioned, my application is running perfectly fine on my local machine.

I've seen many pythonanywhere posts and messages (including on the link you sent me) warning users of the issues with selenium on pythonanywhere. However, nobody else seems to be having issues with this initial virtualdisplay setup phase. Again, any help is appreciated. I like the platform and would like to get my application running on PA if possible.

Best, Ivgars

Let's try this as a script from a console first. Does it work from there instead of from a web app?
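
A standalone script for that test might look roughly like the following. It is a sketch, not the code from the help page: fetch_title, run_with_selenium and the file name title_check.py are hypothetical, and it assumes pyvirtualdisplay and a Firefox-compatible Selenium are installed. Exceptions are deliberately not caught, so any traceback appears directly in the console.

```python
import sys


def fetch_title(make_display, make_browser, url):
    """Start a display, load url, and return the page title."""
    display = None
    browser = None
    try:
        display = make_display()
        browser = make_browser()
        browser.get(url)
        return browser.title
    finally:
        # Clean up only what was actually created.
        if browser is not None:
            browser.quit()
        if display is not None:
            display.stop()


def run_with_selenium(url):
    # Imported here so fetch_title can be exercised without Selenium installed.
    from pyvirtualdisplay import Display
    from selenium import webdriver

    return fetch_title(
        lambda: Display(visible=0, size=(800, 600)).start(),
        webdriver.Firefox,
        url,
    )


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python3 title_check.py URL")
    else:
        print(repr(run_with_selenium(sys.argv[1])))
```

From a Bash console this would be run as python3 title_check.py https://www.google.com.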