Selenium downloaded files go into temp folder with .part extension : Forums : PythonAnywhere

Selenium downloaded files go into temp folder with .part extension

Hello, I'm new to the site - awesome service!

I'm using Selenium to download CSV files via an online search tool. For whatever reason, when I download the files, the csv files are all saved as ".csv.part".

I know that I'm supposed to set the preferences of my Firefox webdriver, but how do I do that within PythonAnywhere? How can I avoid files being saved with the .part extension? Code samples would be appreciated.

Bonus question: how can I change the download directory for the browser? I'd prefer to avoid working in the /temp folder...

Thanks,

deleted-user-543201 | 2 posts | Dec. 11, 2014, 3:58 a.m. | permalink

The .part files are an indication that the download did not complete. There may be a timeout or something else that is preventing them from finishing.
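
One way to rule out the script simply exiting (or moving on) before the browser has finished writing the file is to poll the download directory until no partial files remain. This is only a minimal sketch using the standard library; `wait_for_downloads` and its parameters are hypothetical names, and it assumes the directory is used for nothing but these downloads.

```python
import os
import time


def wait_for_downloads(download_dir, timeout=60.0, poll_interval=0.5):
    """Return True once download_dir holds at least one file and no
    in-progress markers (.part for Firefox, .crdownload for Chrome).
    Return False if that has not happened before the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(download_dir)
        in_progress = [n for n in names if n.endswith((".part", ".crdownload"))]
        if names and not in_progress:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

Calling this after the click that triggers the download, and before `driver.quit()`, should show whether the .part file ever turns into a finished file.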

Have a look at this for setting the download directory.
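
As a rough illustration of setting Firefox preferences from Selenium, the sketch below collects the download-related preferences in a dictionary and applies them through `Options.set_preference`. The helper names and the default MIME type are assumptions made for this example, not something taken from the thread, and the right MIME type depends on what the site actually sends.

```python
def firefox_download_prefs(download_dir, mime_types=("text/csv",)):
    """Build Firefox preferences that save downloads without a dialog."""
    return {
        # 2 means "use the custom directory given in browser.download.dir"
        "browser.download.folderList": 2,
        "browser.download.dir": download_dir,
        # Comma-separated MIME types to save straight to disk, no prompt
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
    }


def make_firefox_driver(download_dir):
    """Start Firefox with the preferences above (requires selenium)."""
    # Imported here so the preference helper works without selenium installed
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for name, value in firefox_download_prefs(download_dir).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

The directory passed in should be an absolute path that already exists, for example a folder under your home directory rather than the system temp folder.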

glenn | 9718 posts | PythonAnywhere staff | Dec. 11, 2014, 11:11 a.m. | permalink

I'm having the same problem :(

How did you fix that?

I've already defined the download folder:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
profile = {'download.default_directory' : download_dir}
options.add_experimental_option('prefs', profile)

options.add_argument('--disable-gpu')
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=options)

When I use Firefox with Selenium, the downloaded files are all saved as ".xlsx.part". Now, with Chrome, the files don't appear at all, not even in the /tmp/ folder.

deleted-user-8059396 | 10 posts | July 21, 2020, 5:50 p.m. | permalink

Does it raise an error?

fjl | 4614 posts | PythonAnywhere staff | July 22, 2020, 9:57 a.m. | permalink

No. The files just don't appear. I've tried Selenium with both Firefox and Chrome, and nothing appeared.

deleted-user-8059396 | 10 posts | July 22, 2020, 10:51 p.m. | permalink

What have you set download_dir to?

glenn | 9718 posts | PythonAnywhere staff | July 23, 2020, 9:54 a.m. | permalink

Yes... The same code worked fine in another cloud IDE. How do I fix it?

deleted-user-8059396 | 10 posts | July 23, 2020, 3:22 p.m. | permalink

What is the value of the variable download_dir? Does the directory in question definitely exist?

It might also be worth taking a screenshot of the browser just after you've performed the action that should download the file in order to see if there's some kind of error page:

driver.get_screenshot_as_file("screenshot.png")
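
To answer the directory question quickly, a small standard-library check along these lines can be run before starting the browser. `check_download_dir` is a hypothetical helper written for this example. The absolute-path check is included on the assumption that a relative `download_dir` may not be honoured by the browser.

```python
import os


def check_download_dir(path):
    """Return a list of problems found with path; empty means it looks usable."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not absolute")
    if not os.path.isdir(path):
        problems.append("directory does not exist")
    elif not os.access(path, os.W_OK | os.X_OK):
        # Write and search permission are both needed to create files in it
        problems.append("directory is not writable by this process")
    return problems
```

For example, `print(check_download_dir(download_dir))` printing an empty list would rule out a missing or read-only directory.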

giles | 12095 posts | PythonAnywhere staff | July 23, 2020, 3:41 p.m. | permalink

I managed to enable the download via headless chrome with the following code:

driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
command_result = driver.execute("send_command", params)
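
For comparison, Selenium releases that provide `execute_cdp_cmd` on the Chrome driver may allow the same DevTools command to be sent without registering the private `send_command` endpoint by hand. The sketch below assumes `driver` is a Chrome WebDriver. `allow_headless_downloads` is a hypothetical wrapper name.

```python
def allow_headless_downloads(driver, download_dir):
    """Ask Chrome, via the DevTools protocol, to save downloads to download_dir.

    Assumes driver is a Chrome WebDriver that exposes execute_cdp_cmd.
    """
    params = {"behavior": "allow", "downloadPath": download_dir}
    return driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
```

It would be called once after creating the driver and before triggering the download, for example `allow_headless_downloads(driver, download_dir)`.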

But I was unable to download pdf files, even though I used this as preferences when running chrome:

options = Options()
profile = {"plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}],  # Disable Chrome's PDF Viewer
           "profile.default_content_settings.popups": 0, # Disable download file dialog
           "download.default_directory": download_dir,
           "download.prompt_for_download": False,  # To auto download the file
           "download.extensions_to_open": "applications/pdf",
           "download.directory_upgrade": True,
           "plugins.always_open_pdf_externally": True,  # It will not show PDF directly in chrome
           "safebrowsing.enabled": False,
           "safebrowsing.disable_download_protection": True
           }
options.add_experimental_option('prefs', profile)

So my problem now is to download pdf files :(

deleted-user-8059396 | 10 posts | July 23, 2020, 6:12 p.m. | permalink

Does it happen only for pdfs?

fjl | 4614 posts | PythonAnywhere staff | July 24, 2020, 10:34 a.m. | permalink

Yeah. Maybe the problem is with the website I'm accessing, which may be imposing some restriction because I'm using headless Chrome. But how do I find this out? :(

I'm switching the driver to the new tab that opens when I click to download, and if I try to take a screenshot or get the current URL of the new tab, I get a timeout error. So I think the new tab, which should automatically start the download, is crashing for some reason I don't know.

deleted-user-8059396 | 10 posts | July 24, 2020, 4:24 p.m. | permalink

Could it be running out of memory? If a process hits 3GiB it will run out and exit. But if that were happening, you'd receive an email telling you; I've checked and your account is set up to receive those messages. Have you seen anything like that in your inbox?

giles | 12095 posts | PythonAnywhere staff | July 24, 2020, 5:22 p.m. | permalink

I didn't receive any message saying anything like that :(

What can I do now?

deleted-user-8059396 | 10 posts | July 24, 2020, 5:32 p.m. | permalink

If you're getting a timeout, that suggests that either the tab is doing its job and downloading the file so you can't access it or that it has crashed. I don't really have any idea how you might go about finding that out. If the service that you're trying to use has protections that prevent downloads from automated tools there is, unfortunately, not much you can do about that.

glenn | 9718 posts | PythonAnywhere staff | July 24, 2020, 6:11 p.m. | permalink

The service has no protections that prevent downloads by automated tools. Exactly the same code I am using here works perfectly on another cloud IDE and on my personal computer as well. Perhaps some permission on your platform is preventing my program from downloading the PDF file from the website that I need. The file should be downloaded, but the download does not happen.

deleted-user-8059396 | 10 posts | July 24, 2020, 8:17 p.m. | permalink

The only thing I can think of then is that perhaps you do not have write permissions to the directory that you're trying to download into or that, perhaps, the directory does not exist.

glenn | 9718 posts | PythonAnywhere staff | July 25, 2020, 11:02 a.m. | permalink

I tried everything, but your service did not serve me as I expected. Unfortunately I am going to switch to a competitor that meets my needs.

I have just downgraded my account. It would be nice if you could refund the money. Thanks!

deleted-user-8059396 | 10 posts | July 29, 2020, 2:14 p.m. | permalink

Sure, no problem -- that's done now.

giles | 12095 posts | PythonAnywhere staff | July 29, 2020, 6:52 p.m. | permalink