Flask pandas csv upload problem

I am trying to create a Flask web app using Python 2.7 and pandas that lets the user upload a csv file, which is then read into a dataframe and processed. The program worked with one csv file I tried but not with any of the others. The upload itself seems to succeed, but the program then throws an error message saying (for a file named xyz.csv) "#012IOError: File xyz.csv does not exist". Here's the relevant part of the code:

from flask import Flask, make_response, request, send_file
import pandas as pd

# Initialize the Flask application
app = Flask(__name__)

@app.route('/')
def form():
    return """
        <html>
            <body>
                <h1>YDNA Kit Grouping Program</h1>

                <form action="/main_program" method="post" enctype="multipart/form-data">
                    <input type="file" name="input_file" />
                    <input type="submit" />
                </form>
            </body>
        </html>
    """

@app.route('/main_program', methods=["POST"])
def main_program_view():

    # Input file
    file = request.files['input_file']
    if not file:
        return "No file"

    # Put input file in dataframe
    df = pd.read_csv(file.filename, encoding='cp1252')

I think the problem may be in the last line of code, where file.filename is not giving pd.read_csv the file location information it needs. But I have no idea why it would work for one csv file but not another, or how to fix it so that it can read any uploaded csv file into the dataframe.

Does this help? http://help.pythonanywhere.com/pages/NoSuchFileOrDirectory

Apparently, that is not the problem.

Now instead of "#012IOError: File xyz.csv does not exist" it says "#012IOError: File /home/chaseashley/mysite/xyz.csv does not exist".

Also, now the code no longer works for the one csv file it used to work for.

Here's the relevant part of the revised code:

from flask import Flask, make_response, request, send_file
import os
import pandas as pd

# Initialize the Flask application
app = Flask(__name__)

@app.route('/')
def form():
    return """
        <html>
            <body>
                <h1>YDNA Kit Grouping Program</h1>

                <form action="/main_program" method="post" enctype="multipart/form-data">
                    <input type="file" name="input_file" />
                    <input type="submit" />
                </form>
            </body>
        </html>
    """

@app.route('/main_program', methods=["POST"])
def main_program_view():

    # Input file
    file = request.files['input_file']
    if not file:
        return "No file"

    THIS_FOLDER = os.path.dirname(os.path.abspath(__file__))
    filename_path = os.path.join(THIS_FOLDER, file.filename)

    # Put input file in dataframe
    sheet = pd.read_csv(filename_path, encoding='cp1252')

Do I need to save the file before I can read it into the dataframe? Or, is the file in a temporary file and I need to get the directory that temp files are stored in using tempfile.gettempdir()?

I misread what you were trying to do. You want to read the file directly from the POST request, not save it to disk.

I'm not a flask expert, but according to the flask docs you should be able to do something like this:

sheet = pd.read_csv(file, encoding=...)

If that doesn't work, try a two-step process: first, save the file somewhere on disk; second, use pandas to read the file from that place on disk. Use absolute paths for both. Maybe something like this:

import tempfile
tempfile_path = tempfile.NamedTemporaryFile().name
file.save(path=tempfile_path)
sheet = pd.read_csf(tempfile_path)

Thanks! I had tried the first suggestion before and it hadn't worked, but the second suggestion worked great with a few minor tweaks. Here's the revised code that worked:

import tempfile
tempfile_path = tempfile.NamedTemporaryFile().name
file.save(tempfile_path)
sheet = pd.read_csv(tempfile_path)
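
One caveat with the snippet above: tempfile.NamedTemporaryFile().name only borrows a name. The temporary file itself is deleted once the object is garbage-collected, so nothing cleans up the file that file.save() later writes at that path. Below is a slightly more defensive sketch, assuming Python 3 (tempfile.TemporaryDirectory is not available in 2.7) and assuming the upload is any object with a save(path) method, as Flask's uploaded file objects have. The names read_csv_via_tempfile and FakeUpload are hypothetical and exist only for illustration.

```python
import os
import tempfile

import pandas as pd


def read_csv_via_tempfile(upload, **read_csv_kwargs):
    """Save an upload into a private temp directory, then load it with pandas."""
    # TemporaryDirectory removes the directory and its contents on exit.
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = os.path.join(tmp_dir, "upload.csv")
        upload.save(tmp_path)
        # read_csv loads the whole file eagerly, so the DataFrame
        # remains usable after the temp directory has been removed.
        return pd.read_csv(tmp_path, **read_csv_kwargs)


class FakeUpload:
    """Hypothetical stand-in for an uploaded file (demonstration only)."""

    def __init__(self, data):
        self._data = data

    def save(self, path):
        with open(path, "wb") as fh:
            fh.write(self._data)


if __name__ == "__main__":
    demo = FakeUpload(b"kit,marker\nA1,12\nB2,13\n")
    print(read_csv_via_tempfile(demo, encoding="cp1252"))
```

In the Flask view this would be called as read_csv_via_tempfile(file, encoding='cp1252'), with file taken from request.files as in the code earlier in the thread.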

I had misread the Flask documentation as saying that all Flask file uploads are stored in a temporary location (whose path is returned by tempfile.gettempdir()), but it actually says that reasonably small files are just stored in the webserver’s memory (which is presumably why my code didn't work when I tried to specify the file's absolute path using .gettempdir()). I don't know if there is a way to do a pandas read_csv of the file if it is just in memory, but explicitly saving it to the tempfile worked great and made it readable by pandas read_csv.
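
On the open question of reading an upload that only exists in memory: pd.read_csv accepts file-like objects as well as paths, so one option is to wrap the uploaded bytes in io.BytesIO. This is a minimal sketch, assuming Python 3 and a reasonably recent pandas (the direct read reportedly did not work in the Python 2.7 setup described above). dataframe_from_bytes is a hypothetical helper name, and the sample bytes stand in for what file.read() would return inside the view.

```python
import io

import pandas as pd


def dataframe_from_bytes(raw_bytes, encoding="cp1252"):
    """Parse CSV content held in memory, without writing it to disk."""
    # BytesIO gives pandas a seekable, file-like view of the bytes.
    return pd.read_csv(io.BytesIO(raw_bytes), encoding=encoding)


if __name__ == "__main__":
    # Stand-in for request.files['input_file'].read() in the Flask view.
    sample = "name,city\nZo\xeb,K\xf6ln\n".encode("cp1252")
    print(dataframe_from_bytes(sample))
```

Note that this keeps the whole upload in memory at once, so the temp-file route may still be preferable for large files.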

I used flask-excel instead of this and it worked pretty well. The csv goes right into an in-memory dataframe structure AFAIK, no need for tempfiles.

Great post @chaseashley! Are you able to share your full code? I'm stuck on a similar issue and it would really help! Thank you.