PythonAnywhere Forums

Wrong characterset/collation for MySQL with Python 2.7 webapp and MySQLdb

I moved a webapp that uses Bottle to PythonAnywhere; it worked perfectly locally. When I read data through MySQLdb, I get the wrong characters. The database uses utf8 as the default charset, both for the database itself and for each table, and the Python connection is established this way: conn = db.connect('mysql.server', 'username', 'password', 'database_name', port = 3306, connect_timeout = 10, charset='utf8') but I keep getting the wrong characters: the 'ù' character should be returned as u'\u00f9', but is instead returned as the sequence u'\u00c3\u00b9'.

I can see the correct characters in the MySQL console, but they are wrong when returned by MySQLdb. I also tried cur.execute('''SET NAMES utf8''') and conn.set_character_set('utf8'), with no change in the returned characters.

Note: it's the same if I also add use_unicode=True (which should be implied by charset) when connecting.

Anyway, conn.character_set_name() returns 'utf8'.

Ok, I found the solution. Since I can create new databases on PythonAnywhere only with the command from the database tab, and not with an SQL statement, I had exported only the tables of my local database to a file and imported it into PythonAnywhere from the MySQL console with the source command. Only afterwards did I change the database encoding with the alter schema command.

I then changed the order of the steps and got it right, this way:
- exported my entire local database (not only the tables)
- removed the previously created database in PythonAnywhere and recreated it
- uploaded the SQL file to PythonAnywhere and modified it so that:
  - the create database statement contains the name of the just-created database, with create schema changed to alter database, so that the database charset is set BEFORE creating the tables and entering the data
  - the use statement contains the name of the just-created database
- opened the MySQL console and used the source command to run the SQL commands from the just-modified .sql file
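For anyone who would rather script the edits to the .sql file than make them by hand, here is a minimal sketch of the rewrite described in the steps above. It assumes a mysqldump-style file in which the create database (or create schema) statement and the use statement each start on their own line and end with a semicolon. The function name retarget_dump and the sample names are hypothetical, not something from the original post.

```python
import re

def retarget_dump(sql_text, new_db, charset="utf8"):
    """Point a dump at an already-created database (hypothetical helper).

    Assumes the CREATE DATABASE/SCHEMA and USE statements each start a line
    and end with ';', as in a typical mysqldump file.
    """
    quoted = "`%s`" % new_db
    flags = re.IGNORECASE | re.MULTILINE

    # CREATE DATABASE/SCHEMA ...;  ->  ALTER DATABASE `new_db` CHARACTER SET utf8;
    # so the charset is set before any table is created or row inserted.
    alter = "ALTER DATABASE %s CHARACTER SET %s;" % (quoted, charset)
    sql_text = re.sub(r"^[ \t]*CREATE[ \t]+(?:DATABASE|SCHEMA)\b[^;]*;",
                      lambda m: alter, sql_text, flags=flags)

    # USE `old_db`;  ->  USE `new_db`;
    use = "USE %s;" % quoted
    sql_text = re.sub(r"^[ \t]*USE[ \t]+[^;]+;",
                      lambda m: use, sql_text, flags=flags)
    return sql_text

# Sample input; the database names here are placeholders.
sample = (
    "CREATE DATABASE `localdb` DEFAULT CHARACTER SET utf8;\n"
    "USE `localdb`;\n"
    "CREATE TABLE t (id INT);\n"
)
rewritten = retarget_dump(sample, "database_name")
```

The rewritten text can then be saved and run from the MySQL console with the source command, as in the last step. CREATE TABLE statements are left untouched because the pattern only matches CREATE followed by DATABASE or SCHEMA.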

That's useful, thanks for sharing it. MySQL can be a bit funny with character encodings. I suspect the first time it imported your data as latin-1 or somesuch, and when you changed the schema afterwards it updated the schema only and left the data in the wrong character set.
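The symptom in the original question is consistent with that explanation. As a small sketch (assuming latin-1 was the encoding involved, which is only a guess), taking the UTF-8 bytes of u'\u00f9' and reading them back as latin-1 gives exactly the two-character sequence reported:

```python
# The character the application expected to get back.
correct = u"\u00f9"

# UTF-8 stores it as the two bytes C3 B9; reading those bytes as
# latin-1 turns each byte into its own character.
garbled = correct.encode("utf-8").decode("latin-1")
assert garbled == u"\u00c3\u00b9"

# Reversing the two steps recovers the original character.
repaired = garbled.encode("latin-1").decode("utf-8")
assert repaired == correct
```

This only illustrates the mechanism; it works the same way on Python 2.7 and Python 3.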

Yes, that's what I think, too. Anyway, everything's ok, now! :-)

Adding this link to our help page on MySQL encoding, utf8, and character sets for anyone else who comes across this page!