Forums

Selenium scraping and file storage suddenly going up

Hi all,

I wrote a little scraper with Selenium and pyvirtualdisplay. It ran for about five to six minutes and scraped some data into a CSV that came to about 11 KB.

However, while it was running, my file storage jumped from like 88% full to about 95% full.

I've searched pretty diligently, but I don't think I had any other scrapers downloading anything at the time, and I'm not finding any huge files created during that window.

I'm not super clear on how Selenium and the virtual display work -- is there any chance those processes are creating files somewhere?

FWIW: Python 2.7, driving Firefox, and the code is organized like:

from pyvirtualdisplay import Display
from selenium import webdriver

with Display():
    # create the driver before the try so the finally can't hit an undefined name
    driver = webdriver.Firefox()
    try:
        pass  # scraping stuff here
    finally:
        driver.quit()

Selenium dumps a bunch of files in /tmp. See http://help.pythonanywhere.com/pages/DiskQuotaExceeded/
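
If you want to see what's actually piling up in there (including dotfiles, which a plain ls hides), a quick sketch like this works -- the /tmp path and the 1 MB threshold are just illustrative:

import os

# walk /tmp and print anything bigger than 1 MB, dotfiles included
for root, dirs, files in os.walk('/tmp'):
    for name in files:
        path = os.path.join(root, name)
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # temp files can vanish mid-walk
        if size > 1024 * 1024:
            print('%8.1f MB  %s' % (size / 1048576.0, path))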

Oh dang I had a ton of mess in there. Thanks!

Glad we could help :-)

Hello,

After "rm -rf /tmp/*" my tmp folder size is still 377 Mb (It was 440 Mb)... Folder seems to be empty when I run ls command... How can I clear the tmp folder completely?

Are you using Chromium? I believe it tends to create lots of temporary files with names starting with ".", which the * glob won't match (and which a plain ls won't show -- try ls -A). You can get rid of those by running

rm -rf /tmp/.org.chromium.Chromium*
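
If you'd rather do the same cleanup from Python (say, in the finally block of your scraper), it looks roughly like this -- a sketch assuming the same /tmp pattern as above:

import glob
import os
import shutil

# glob patterns that start with "." do match hidden entries
for path in glob.glob('/tmp/.org.chromium.Chromium*'):
    if os.path.isdir(path):
        shutil.rmtree(path, ignore_errors=True)  # remove leftover profile dirs
    else:
        os.remove(path)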

Thanks!