Forums

SQLAlchemy query slow unicode conversion

I asked this question on stackoverflow, but received no answers. I'm asking here as a last resort, because it is just a generic python/sqlalchemy question and not necessarily specific to PA. Here is the question on stack overflow: http://stackoverflow.com/questions/25881443/flask-sqlalchemy-query-cost-unicode-conversion

Basically, I've no idea how to make my queries faster, but I know that they're way slow. I haven't even gotten to using indexes yet because I just want to know why it takes so long to do a simple select all. I've profiled the code, and it looks like the biggest time suck is occuring within "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py:206(string_decoder) ", executing {method 'decode' of 'str' objects}.

I don't know why this is happening or how to stop it. I also tested that just running the raw sql query that sqlalchemy generates inside a MySQL console is much faster, so this is in some way linked to SQLAlchemy or my configuration of it. I would REALLY appreciate any help. Maybe this has something to do with how PA's databases handle strings/unicode?

Thanks,

Seth

I have responded to your StackOverflow question and I'll repeat it here for completeness:

The main thing driving your query time is that you're getting 6000 rows. Do you really need all columns of all 6000 rows? The decode is taking less than 1/5 of the total time. Trying to micro-optimise that away is probably a waste of time. Thinking about the data that you need to move around and getting the database to do the heavy lifting is probably a better use of your time.

I've tried grabbing single columns, but still with lots of entries, and if I remember correctly this is still taking way too long. I would like to do some more analysis of this in the next day or two and I can follow up here. Thanks for your response!

So, I ran some queries again and am still having issues. Grabbing only one column of a 350 row table takes 0.286 seconds, which seems extremely long for such a small query. I ran the same query in a MySQL console and got results in 0.01 seconds.

While the string decode seems to have gone away, I am still having issues. I am struggling with how to debug the profiler. The results are shown here:

31639 function calls (30976 primitive calls) in 0.286 seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)
7 0.141 0.020 0.141 0.020 {method 'query' of '_mysql.connection' objects}
1 0.012 0.012 0.013 0.013 /usr/local/lib/python2.7/dist-packages/MySQLdb/init.py:14(<module>)
2 0.012 0.006 0.018 0.009 /usr/local/lib/python2.7/dist-packages/sqlalchemy/dialects/sqlite/pysqlite.py:289(dbapi)
1 0.006 0.006 0.006 0.006 /usr/lib/python2.7/sqlite3/dbapi2.py:24(<module>)
7 0.006 0.001 0.035 0.005 {import}
1 0.003 0.003 0.006 0.006 /usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py:62(init)
979 0.003 0.000 0.004 0.000 /usr/lib/python2.7/sre_parse.py:183(next)
38/6 0.003 0.000 0.011 0.002 /usr/lib/python2.7/sre_parse.py:380(_parse)
1 0.002 0.002 0.012 0.012 /usr/local/lib/python2.7/dist-packages/sqlalchemy/dialects/mysql/__init
.py:7(<module>)
2571/2538 0.002 0.000 0.002 0.000 {len}
114 0.002 0.000 0.010 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/strategies.py:27(_register_attribute)
1825 0.002 0.000 0.002 0.000 {isinstance}
350 0.002 0.000 0.004 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/util/_collections.py:54(new)
143 0.002 0.000 0.002 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/util/_collections.py:634(iter)
351 0.002 0.000 0.008 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/loading.py:27(instances)
884 0.002 0.000 0.006 0.000 /usr/lib/python2.7/sre_parse.py:202(get)
688 0.002 0.000 0.002 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/util.py:489(hash)
368 0.002 0.000 0.003 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py:211(iterate)
1 0.001 0.001 0.008 0.008 /usr/local/lib/python2.7/dist-packages/sqlalchemy/dialects/mysql/base.py:307(<module>)
26 0.001 0.000 0.004 0.000 {eval}
114 0.001 0.000 0.005 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py:1405(register_attribute_impl)
1 0.001 0.001 0.001 0.001 {_mysql.get_client_info}
70/6 0.001 0.000 0.003 0.001 /usr/lib/python2.7/sre_compile.py:33(_compile)
1246 0.001 0.000 0.001 0.000 {method 'append' of 'list' objects}
422/379 0.001 0.000 0.008 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/util/langhelpers.py:609(get)
350 0.001 0.000 0.002 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py:3208(proc)
444 0.001 0.000 0.001 0.000 {method 'update' of 'dict' objects}
372 0.001 0.000 0.002 0.000 {method 'decode' of 'str' objects}
114 0.001 0.000 0.002 0.000 /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py:408(init)
51 0.001 0.000 0.002 0.000 /usr/lib/python2.7/inspect.py:845(formatargspec)
812/757 0.001 0.000 0.001 0.000 {hash}

I really appreciate any guidance, even if it's just pointing me to a website or something. Thanks!

Alright, I realize I may have been asking too much. As a final follow up, could someone tell me what the difference between running a query in the pythonanywhere mysql console and running a raw sql query through python's mysqldb package is? The console is MUCH faster than running a query through python and I'm driving myself crazy trying to figure out why this is. This doesn't even include an ORM since I am comparing it to mysqldb by itself.

Thanks,

Seth

Hi Seth -- when you run a MySQL console it's not using Python at all, it's a C program connecting directly to the MySQL database. So I guess this is some kind of inefficiency in the mysqldb package. Which is odd -- it's a very popular package, and you'd think it would be pretty efficient.

Thanks for your answer Giles! I was hoping for something more groundbreaking, but for now it looks like I'll just have to deal with the latency in MySQLdb.

Thanks again.