HTTP 502 using flask-openid

I'm trying to use the flask-openid example (https://github.com/mitsuhiko/flask-openid) with Google as the OpenID provider (Google OpenID URL: https://www.google.com/accounts/o8/id). After authorizing on the Google page and being redirected back to my site, I get an HTTP 502 (Bad Gateway) error. The example works fine on localhost; the problem only occurs on PythonAnywhere.

I assume this is a bug caused by the restrictions on free accounts (to stop people using PythonAnywhere for DDoS attacks and junk mail).

rcs1000, I think you are right, but Google is on the whitelist: https://www.pythonanywhere.com/whitelist/

Being in the whitelist isn't quite the only issue - the filtering is implemented using a proxy server, so it's entirely possible the behaviour of the proxy is causing some subtle breakage.

Can you trace the sequence of requests leading up to the 502 or is it hidden in some library? Remember you can monkey-patch things like urllib to add debug wrappers if that's helpful, but it's a bit of a last resort.
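
As a rough illustration of the debug-wrapper idea, here is a minimal sketch assuming Python 3, where urllib2's urlopen lives in urllib.request. The names add_debug_wrapper and install are made up for this example. The patch only affects code that looks up urllib.request.urlopen after install() has run; a library that captured its own reference earlier would need to be patched at that reference instead.

```python
import functools
import urllib.request


def add_debug_wrapper(func, log):
    """Return a wrapper around func that records each call and its outcome in log."""
    @functools.wraps(func)
    def wrapper(url, *args, **kwargs):
        # urlopen accepts either a URL string or a Request object.
        target = getattr(url, "full_url", url)
        try:
            response = func(url, *args, **kwargs)
        except Exception as exc:
            # Record the failure, then let it propagate unchanged.
            log.append((target, "error: %r" % (exc,)))
            raise
        log.append((target, "status: %s" % getattr(response, "status", "?")))
        return response
    return wrapper


def install(log):
    """Monkey-patch urllib.request.urlopen so later lookups get the wrapper."""
    urllib.request.urlopen = add_debug_wrapper(urllib.request.urlopen, log)
```

Calling install(some_list) early in the web app's startup and printing some_list after a failed login would show which outgoing fetches were attempted and how each one ended.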

One known problem is that the requests and urllib3 libraries don't work very well with proxies if you use https -- perhaps the flask-openid library uses one of those under the hood?

I'm not going to pretend to be familiar with these libraries, but it looks like flask-openid is a fairly thin wrapper around Python OpenID and based on this file it looks like it uses urllib2 as a fallback option, but prefers PycURL if available. Since it doesn't appear that's available on PAW, my assumption is that it's using urllib2.

It also looks like it has some support for httplib2. On a cursory inspection it didn't look as if this was used by default, but I couldn't be sure without going through the code carefully.

As an aside, might be worth sticking PycURL on at some point, I've heard good things about its performance. The API is fairly low-level (and not hugely pleasant by the looks of it), but the flexibility is probably also useful for people doing quirky things and there's a lot to be said for using a library as well tested as libcurl.

DISCLAIMER: I spent under 10 minutes poking around because I was curious, so don't regard my conclusions as in any way reliable (it would be pretty odd if they used different HTTP libraries in different parts of the code, but "odd" and "open source project" coincide more often than we might all like).

Thanks for the answers, I'll try to use PycURL.

@Cartroo -- you're right, it does look like it's not using requests. OTOH the failure mode sounds exactly right.

@tema -- could you show us the code that's giving the error?

I'm using the example as-is. I've just created the database and changed the path in the code.

from flask import Flask, render_template, request, g, session, flash, \
 redirect, url_for, abort
from flaskext.openid import OpenID

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

# setup flask
app = Flask(__name__)
app.config.update(
    DATABASE_URI = 'sqlite:////home/tema/flask/example.db',
    SECRET_KEY = 'development key',
    DEBUG = True
)

# setup flask-openid
oid = OpenID(app)

# setup sqlalchemy
engine = create_engine(app.config['DATABASE_URI'])
db_session = scoped_session(sessionmaker(autocommit=False,
                                     autoflush=False,
                                     bind=engine))
Base = declarative_base()
Base.query = db_session.query_property()

def init_db():
    Base.metadata.create_all(bind=engine)

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String(60))
    email = Column(String(200))
    openid = Column(String(200))

    def __init__(self, name, email, openid):
        self.name = name
        self.email = email
        self.openid = openid

@app.before_request
def before_request():
    g.user = None
    if 'openid' in session:
        g.user = User.query.filter_by(openid=session['openid']).first()

@app.after_request
def after_request(response):
    db_session.remove()
    return response

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/login', methods=['GET', 'POST'])    
@oid.loginhandler
def login():
    """Does the login via OpenID. Has to call into `oid.try_login`
to start the OpenID machinery.
"""
    # if we are already logged in, go back to where we came from
    if g.user is not None:
        return redirect(oid.get_next_url())
    if request.method == 'POST':
        openid = request.form.get('openid')
        if openid:
            return oid.try_login(openid, ask_for=['email', 'fullname',
                                              'nickname'])
    return render_template('login.html', next=oid.get_next_url(),
                       error=oid.fetch_error())

@oid.after_login
def create_or_login(resp):
    """This is called when login with OpenID succeeded and it's not
necessary to figure out if this is the user's first login or not.
This function has to redirect otherwise the user will be presented
with a terrible URL which we certainly don't want.
"""
    session['openid'] = resp.identity_url
    user = User.query.filter_by(openid=resp.identity_url).first()
    if user is not None:
        flash(u'Successfully signed in')
        g.user = user
        return redirect(oid.get_next_url())
    return redirect(url_for('create_profile', next=oid.get_next_url(),
                        name=resp.fullname or resp.nickname,
                        email=resp.email))

@app.route('/create-profile', methods=['GET', 'POST'])
def create_profile():
    """If this is the user's first login, the create_or_login function
will redirect here so that the user can set up his profile.
"""
    if g.user is not None or 'openid' not in session:
        return redirect(url_for('index'))
    if request.method == 'POST':
        name = request.form['name']
        email = request.form['email']
        if not name:
            flash(u'Error: you have to provide a name')
        elif '@' not in email:
            flash(u'Error: you have to enter a valid email address')
        else:
            flash(u'Profile successfully created')
            db_session.add(User(name, email, session['openid']))
            db_session.commit()
            return redirect(oid.get_next_url())
    return render_template('create_profile.html', next_url=oid.get_next_url())

@app.route('/profile', methods=['GET', 'POST'])
def edit_profile():
    """Updates a profile"""
    if g.user is None:
        abort(401)
    form = dict(name=g.user.name, email=g.user.email)
    if request.method == 'POST':
        if 'delete' in request.form:
            db_session.delete(g.user)
            db_session.commit()
            session['openid'] = None
            flash(u'Profile deleted')
            return redirect(url_for('index'))
        form['name'] = request.form['name']
        form['email'] = request.form['email']
        if not form['name']:
            flash(u'Error: you have to provide a name')
        elif '@' not in form['email']:
            flash(u'Error: you have to enter a valid email address')
        else:
            flash(u'Profile successfully created')
            g.user.name = form['name']
            g.user.email = form['email']
            db_session.commit()
            return redirect(url_for('edit_profile'))
    return render_template('edit_profile.html', form=form)

@app.route('/logout')
def logout():
    session.pop('openid', None)
    flash(u'You have been signed out')
    return redirect(oid.get_next_url())

if __name__ == '__main__':
    app.run()

From what I remember of the code when I was browsing through it the other day, the way that the process is handled is a little quirky - the one thing that took me a few minutes to get my head around was the fact that it seems to use decorators as a way to declare callbacks. In particular, the @oid.after_login decorator indicates a function which will be called when authentication succeeds - that's why oid.get_next_url() doesn't need any information about the URL structure of the application, because the function is expected to redirect the user back to the appropriate place in the app.

I guess the complicating factor in tracking down problems like this is that the URL redirects that happen during login are entirely under the control of the library. My typical approach would be to run it under Wireshark, but that's not going to fly on a remote system like PAW.

Gives me an idea, actually - it would be really handy to have a little module which monkey-patches both the socket and ssl modules to add wrappers which record raw socket-level comms in dump files somewhere. Not sure if being C extension modules complicates monkey-patching, but I guess worst case you could have an entire replacement module which just proxies all functions across.
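
A minimal sketch of what the socket half of that could look like, assuming Python 3 and assuming an in-memory list is good enough in place of dump files. RecordingSocket and install are hypothetical names, not an existing module. It only sees sockets created after install() runs, and it does nothing for the ssl module, which would need its own wrapper to capture plaintext.

```python
import socket


class RecordingSocket(socket.socket):
    """A socket that appends every chunk it sends or receives to a shared log."""

    log = []  # hypothetical in-memory store; a real tool might write dump files

    def send(self, data, *args):
        sent = super().send(data, *args)
        # Only the first `sent` bytes actually went out on the wire.
        self.log.append(("send", bytes(data[:sent])))
        return sent

    def sendall(self, data, *args):
        result = super().sendall(data, *args)
        self.log.append(("send", bytes(data)))
        return result

    def recv(self, bufsize, *args):
        data = super().recv(bufsize, *args)
        self.log.append(("recv", data))
        return data


def install():
    """Replace socket.socket so sockets created afterwards are recorded."""
    socket.socket = RecordingSocket
```

Subclassing rather than proxying every function keeps the sketch short; methods that are not overridden, such as recv_into or makefile-based reads, are not recorded here.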

Sorry for being so slow to respond! I was out of the office and we had some crossed wires about who was meant to be dealing with this. I'm doing some further investigation now.

Hmm. I just tried signing on using your web app, and it's generating an error due to the length of the request. Investigating further...

Right, I've found a problem and fixed it, at least temporarily. Right now, we restrict the length of requests made to PythonAnywhere-hosted applications to 4096 bytes. This is the default setting for uwsgi, one of the components we use on our servers for handling web applications.

I changed that temporarily for tema.pythonanywhere.com, and once I'd done that I was able to authenticate on your test application.

Could you try testing the app again and tell me if it works? If so, then we can make the change permanent.

One word of warning -- if you reload the web app, then the change I made will be undone and it won't work again.

What's odd is that the error message I was getting (a "502 Bad Gateway") didn't mention any problems with the Google URL. Where did you see the error message you mentioned in your original post?

Thank you, giles.

I'm testing authorization using a Google account and the app is working fine right now. It would be great if you could make the change permanent.

I got the standard error message "502 Bad Gateway" and it didn't have any other information. Sorry if I didn't explain it clearly in my first post.

@giles: For reference, could you clarify "length of request"? Do you mean the length of the request path including query parameters, or the size of the request body?

Hi Cartroo, I think Giles was talking about the request headers, which I guess would include query parameters. The setting he tweaked is called buffer-size:

https://projects.unbit.it/uwsgi/wiki/Doc#buffer-size
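
To get a feel for how an OpenID callback could exceed a 4096-byte limit, here is a rough sketch that estimates the size of a GET request's request line and headers. The function name, the parameter values and the header set are all made up for illustration. As far as I understand, uwsgi applies the limit to its own internal encoding of the request rather than to the raw HTTP text, so treat the number as a ballpark only.

```python
from urllib.parse import urlencode

LIMIT = 4096  # the per-request limit quoted in this thread


def estimated_header_bytes(path, params, headers):
    """Roughly estimate the bytes in the request line plus headers of a GET request."""
    request_line = "GET %s?%s HTTP/1.1\r\n" % (path, urlencode(params))
    header_lines = "".join("%s: %s\r\n" % item for item in headers.items())
    return len((request_line + header_lines + "\r\n").encode("utf-8"))


# Placeholder values only; a real OpenID response carries many more fields.
callback_params = {
    "openid.mode": "id_res",
    "openid.return_to": "http://example.com/login?next=" + "x" * 200,
    "openid.assoc_handle": "h" * 300,
    "openid.sig": "s" * 60,
}
browser_headers = {"Host": "example.com", "Cookie": "session=" + "c" * 3500}

size = estimated_header_bytes("/login", callback_params, browser_headers)
print(size, "bytes;", "over" if size > LIMIT else "within", "the limit")
```

The point is only that the query string and the cookies both count towards the same budget, so a long redirect URL on top of an existing session cookie can push a request past the limit.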

Yup, that's the one.

I'll prioritise getting it changed, but it might take a little while -- we're being very cautious about introducing changes at the moment, as we don't want to risk delaying the deployment of our performance fixes.

Ahha, that makes sense and it's a very useful setting to know about. The default of 4K seems a little conservative to me so I'm a little surprised they cranked it so low, especially given that users can always customise it themselves. I'd assume that the default configuration for things would be to optimise for compatibility and then allow specific instances to make performance trade-offs themselves.

For example, I think the RFC specifies 4K as a reasonable limit of a single cookie, so I would have thought at least doubling that for the entire request header would be sensible.

Agreed. I think we'll bump it up to 32k.

OK, that's done.