PythonAnywhere Forums

Multi Tenancy on PA

Has anyone achieved anything significant here on PA (working with Django) with regards to multi-tenancy? The same way that tumblr, strikingly, and everyone else is doing (yup, includes PA too I guess).

It seems like a fairly complex task to navigate, due to a combination of factors.

  • 1) PA doesn't yet support pgsql, whose schemas are the most commonly supported for third-party apps which assist with achieving multi tenancy.
  • 2) PA's web app mapping only allows for one project per subdomain, and even if we were to use multiple web apps working together, seems like it would be rather cumbersome to maintain multiple projects with the same source code and database.
  • 3) You can't create a web app on PA which resides at {{ wildcard }}.example.com. This means that even if you were to set up a {{ wildcard }}.example.com CNAME record on your DNS, a user accessing user.example.com would just get an "unconfigured domain" message from PA.
  • 4) Multi-tenancy inception (I don't know if this is a valid point).
  • 5) Django itself isn't especially suited for multi tenant design.

All limitations considered (especially including my ability), I would still much prefer sticking to PA so I don't have to deal with all that icky server stuff. I would probably stick to a non-subdomain way (meaning, using urls) of implementing the same functionality at the moment, but of course that wouldn't be ideal.

I would love to hear any thoughts on the matter!

*** The above is the result of an afternoon's research findings. If I sound like an idiot, let me know. I'll be glad if I've been thinking about things in too complex terms :)

{{ wildcard }} == * // don't know how to get rid of the italics

It's unlikely that you'll be able to make domain-based multi tenancy work on PythonAnywhere, but you could get something working with a WSGI app that acts as a switching point for your other WSGI apps where each one is at its own space (like http://here.com/webapp1 and http://here.com/webapp2). Each app could be entirely stand alone except that they publish their wsgi app with a particular name that the switching app could import. It's not a perfect solution, because unless you do something weird with imports and reloads, adding a new wsgi app or changing one of the apps would require a reload of the switching app.
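A minimal sketch of the switching app described above, assuming each tenant app exposes an ordinary WSGI callable. The names `make_dispatcher`, `make_demo_app` and `not_found` are made up for illustration; in a real setup the dictionary values would be the callables imported from each project (for example something like `from webapp1.wsgi import application`, which is also hypothetical).

```python
def make_dispatcher(apps, default_app):
    """Return a WSGI app that routes on the first path segment."""
    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        # Split "/webapp1/foo" into "webapp1", "/" and "foo".
        first, sep, rest = path.lstrip("/").partition("/")
        app = apps.get(first)
        if app is None:
            return default_app(environ, start_response)
        # Move the matched segment from PATH_INFO to SCRIPT_NAME, which is
        # how the WSGI spec describes an app mounted below the site root.
        environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + "/" + first
        environ["PATH_INFO"] = sep + rest
        return app(environ, start_response)
    return dispatcher


def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]


def make_demo_app(name):
    """Stand-in for a real, separately maintained WSGI app."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        body = "{} saw {}".format(name, environ["PATH_INFO"])
        return [body.encode("utf-8")]
    return app


# The switching app: http://here.com/webapp1/... and http://here.com/webapp2/...
application = make_dispatcher(
    {"webapp1": make_demo_app("webapp1"), "webapp2": make_demo_app("webapp2")},
    not_found,
)
```

Because the mapping is built when the module is imported, adding or changing an app still means reloading the switching app, which is the limitation mentioned above.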

Yup... Thought so :/ Oh well, at least I know my options more clearly now! Thanks glenn!

I've added handling wildcard domains to our to-do list, I don't know how much work it would be but it's something we should at least be thinking about.

Sounds great! :)

+1 for wildcard domains. I'm using web2py and it has great support for multi-tenancy. Right now I'm just detecting the tenant after login but I'd like to use a unique subdomain for each tenant (web2py can also use the first URL arg) to show a personalized home page.

+1 for wildcard domains. Setting up multi tenancy sounds intriguing, especially now that Postgres is available, but I don't see it being as easy as I hoped since I don't think it is possible to set up and use subdomains on the fly on PA.

+1 noted!

+1 for wildcard domains - I have something I want to host on PA that's multitenanted based on subdomain!

+1 noted :-)

Have the same issue. Any progress on the wildcard domains?

We have made some progress and some infrastructure changes to make this more feasible (eg: how you can change your webapps to a different domain now). But we have not implemented wildcard domains yet.

It seems that today it is not possible to use e.g. "django-tenant-schemas" with Postgres on PythonAnywhere due to the wildcard domain limitation?

See: https://django-tenant-schemas.readthedocs.io/en/latest/index.html

Then what is the suggested approach to implement multi-tenancy today without wildcard domains?

If you want to have an unlimited number of x.yourdomain.com and y.yourdomain.com etc., PythonAnywhere doesn't support that right now.

Is there a plan or a solution to work around this?

See also my post here: http://stackoverflow.com/questions/40973589/multi-tenant-django-application-using-url-mappings-instead-of-domain-mappings

If not, I will be pushed towards other SaaS providers.

I've just posted on your Stack Overflow question -- it looks like django-multitenants may have been planned to be a fork of django-tenant-schemas that supported URL-based mapping, but that feature was never added.

It's going to be a while before we support wildcard domains -- we're taking steps in that direction, but it's a major change to the way we route requests around our system, which right now is based on the hostname in the request, so we're taking it one step at a time to make sure we don't break stuff.

Could you give a few more details about why you're looking into multi-tenancy specifically, rather than (say) just extracting a user object from the request's URL? For example, are you planning to use different databases for each user, or something like that? With a few more details we might be able to suggest something.

Why multi-tenancy:

My app is nearly ready for 1 customer. But since the business model doesn't make sense with just 1 customer, I need to foresee the application & the deployment infrastructure for multiple customers.

The most obvious way to support multiple customers without complicating the application code (extra query & additional logic in almost every view) is multi-tenancy.

Django supports this, typically by mapping different sub-domains (hosts) onto 1 Django application while using a different Postgres schema for each customer. So every customer would have a URL like:

customer1.myapp.com
customer2.myapp.com

This requires:

  1. Postgres DB (to have the schemas, which MySQL doesn't support)
  2. wildcard domains (to map multiple customer "hosts" onto the same Django application)

Then the django app managing the multi-tenancy maps the multiple hosts onto regular django views passing an extra parameter 'tenant' and automatically performs the extra filtering to pass the data relevant to the particular tenant to my django application:

customer1.myapp.com/view1/arg1 -> myapp.view1(arg1)  using schema 'customer1'
customer2.myapp.com/view1/arg1 -> myapp.view1(arg1)  using schema 'customer2'
customer3.myapp.com/view1/arg1 -> myapp.view1(arg1)  using schema 'customer3'
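
As a rough illustration of the host-to-schema step in the mapping above, here is a small sketch. The helper name `tenant_from_host` and the rule "the first label of the hostname is the schema name" are assumptions made for this example; my understanding is that django-tenant-schemas itself looks the full hostname up in a tenant table rather than parsing it, so treat this as a simplification of the idea, not the package's code.

```python
def tenant_from_host(host, base_domain):
    """Return the tenant label for `host`, or None if it is not a
    direct subdomain of `base_domain`."""
    hostname = host.split(":")[0].lower()  # drop an optional port
    suffix = "." + base_domain.lower()
    if not hostname.endswith(suffix):
        return None
    label = hostname[:-len(suffix)]
    # Reject empty labels and nested subdomains such as a.b.myapp.com.
    if not label or "." in label:
        return None
    return label


print(tenant_from_host("customer1.myapp.com", "myapp.com"))  # customer1
print(tenant_from_host("myapp.com", "myapp.com"))            # None
```

The returned label would then be used to pick the Postgres schema for the request.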

I hope this clarifies the question.

Hmm, how many customers do you expect to have?

Difficult to predict, but designing for a number between 10 and 100 in the first 2 years.

OK, so not really a case where you could create one website on PythonAnywhere for each user.

As it looks like the best option from the Django perspective is to use django-tenant-schemas, given that django-multitenants doesn't work, perhaps a good approach would be to trick it into thinking that the host header provided is different to what it would normally be.

If you create a Django app on PythonAnywhere, and set it up to use django-tenant-schemas, then you can actually write some WSGI code to take a look at the path part of the URL of an incoming request and change the host header appropriately, and put that in the WSGI file. For example, the following code will do this:

  • http://www.yoursite.com/ -> request sent to Django for "/" on host www.yoursite.com
  • http://www.yoursite.com/user1/ -> request sent to Django for "/" on host user1.www.yoursite.com
  • http://www.yoursite.com/user1/foo -> request sent to Django for "/foo" on host user1.www.yoursite.com


import os
import re
import sys

# add your project directory to the sys.path
project_home = "/home/username/path"
if project_home not in sys.path:
    sys.path.append(project_home)

# set environment variable to tell django where your settings.py is
os.environ['DJANGO_SETTINGS_MODULE'] = 'project_name.settings'

from django.core.wsgi import get_wsgi_application
django_application = get_wsgi_application()

def application(environ, start_response):
    # PATH_INFO may be omitted when it is empty, so default to "" to keep re.match happy
    path = environ.get("PATH_INFO", "")
    user_match = re.match(r'^/([^/]+)(/.*)$', path)
    if user_match:
        user = user_match.group(1)
        path = user_match.group(2)
        environ["HTTP_HOST"] = "{}.{}".format(user, environ["HTTP_HOST"])
        environ["PATH_INFO"] = path

    return django_application(environ, start_response)

I imagine that's not exactly what you need (the extra "www" in the modified hostnames looks wrong) but if you let me know more about what would work for your setup I'd be happy to update it -- it's actually quite an interesting problem :-)

I have now installed Postgres locally and upgraded my account on PythonAnywhere to include Postgres support.

Locally I have multi-tenancy working fine using the package "django-tenant-schemas". I am now trying to deploy this on PythonAnywhere by applying the solution suggested earlier in this thread.

It somewhat works, in the sense that http://<mysite>/<customer1>/user/login/ indeed shows my login page and uses the postgres schema referring to customer1.

But there are many issues, mostly with redirects and links within the site. Those links and redirects do not include the extra level /<customer1>, so they are all wrong. Any suggestions on how to solve this transparently for the application code?

Note: I don't get what is meant by the "extra www" in the comment above. This is not causing issues so far.

I believe host-based multi-tenancy with wildcard domains would be a better solution.

By the "extra www" I meant the one in the hostnames that are coming through, for example:

  • http://www.yoursite.com/user1/foo -> request sent to Django for "/foo" on host user1.www.yoursite.com

-- it's going to user1.www.yoursite.com instead of user1.yoursite.com. But if that's not causing you problems, then that should be OK.

For the redirects and links within the site -- do you mean that you have some URLs that are, for example, http://www.yourdomain.com/x/y, with no customer name, and others that are http://www.yourdomain.com/customer1/a/b, which do?

If so, then the code that I gave would have to be modified to be able to work out when the first part of a URL path was a customer name, and when it wasn't. Is there anything that would allow code to distinguish between "customer1" and "x" in the above examples? For example, if you were happy to make a small change to the WSGI file when you added customers, you could put a list in the code and switch based on whether the first part of the path was in that list.

Agreed that host-based multi-tenancy is a good idea -- it's just going to take a while for us to change the way requests are routed.

Still no solution.

The problem with the links remains: links to internal pages on the Django site are of the form http://yoursite.com/app/link while for this solution to work, they should be: http://yoursite.com/customer/app/link

So while the first request to the manually entered URL http://yoursite.com/customer/app/link works fine (because the WSGI script translates it into http://customer.yoursite.com/app/link), the links on that page do not work, as the <customer> part is not included in them.

I tried playing with HTTP_REFERER to know where the request came from, but that is not a good solution as HTTP_REFERER is not guaranteed to be correct under all browsers / circumstances.

Suggestions?

What I meant in my last post was that you could change the WSGI code that I originally posted so that it had some way of telling whether a particular link, http://yoursite.com/A/B/, should be interpreted as "this is for customer A, so I should hack it to look like a request to http://A.yoursite.com/B", or as "this is an internal link, so I should not hack it at all, and send it to Django as http://yoursite.com/A/B/".

One way of doing that (which might be a pain to maintain) would be to have a list of the customers, so that the WSGI code would look like this:

import os
import re
import sys

# add your project directory to the sys.path
project_home = "/home/username/path"
if project_home not in sys.path:
    sys.path.append(project_home)

# set environment variable to tell django where your settings.py is
os.environ['DJANGO_SETTINGS_MODULE'] = 'project_name.settings'

from django.core.wsgi import get_wsgi_application
django_application = get_wsgi_application()

USERS = ('username1', 'username2', 'username3')

def application(environ, start_response):
    # PATH_INFO may be omitted when it is empty, so default to "" to keep re.match happy
    path = environ.get("PATH_INFO", "")
    user_match = re.match(r'^/([^/]+)(/.*)$', path)
    if user_match:
        user = user_match.group(1)
        if user in USERS:
            path = user_match.group(2)
            environ["HTTP_HOST"] = "{}.{}".format(user, environ["HTTP_HOST"])
            environ["PATH_INFO"] = path

    return django_application(environ, start_response)

Of course, that would have the problem that you'd need to edit the WSGI file when you added a user.

An alternative would be to do it the other way around. For example, if all of the internal links were of the form http://yoursite.com/internal-SOMETHING/B/ then the WSGI code could be:

import os
import re
import sys

# add your project directory to the sys.path
project_home = "/home/username/path"
if project_home not in sys.path:
    sys.path.append(project_home)

# set environment variable to tell django where your settings.py is
os.environ['DJANGO_SETTINGS_MODULE'] = 'project_name.settings'

from django.core.wsgi import get_wsgi_application
django_application = get_wsgi_application()

def application(environ, start_response):
    # PATH_INFO may be omitted when it is empty, so default to "" to keep re.match happy
    path = environ.get("PATH_INFO", "")
    user_match = re.match(r'^/([^/]+)(/.*)$', path)
    if user_match:
        user = user_match.group(1)
        if not user.startswith("internal-"):
            path = user_match.group(2)
            environ["HTTP_HOST"] = "{}.{}".format(user, environ["HTTP_HOST"])
            environ["PATH_INFO"] = path

    return django_application(environ, start_response)

Without knowing more about the actual structure of the URLs you're using, I can't be more specific.

I do not really have a problem with updating the WSGI file for each additional customer. That would cause only a few seconds of downtime (the time to reload the application) when adding a customer, I guess?

The problem is in the internal links. The application is not aware of the multi-tenancy.

So the internal links are of the form: http://yoursite.com/app/link

But to distinguish the tenant, the links should be e.g.: http://customer1.yoursite.com/app/link. This cannot be done, since wildcard domains are not supported.

Alternatively, links could be: http://yoursite.com/customer1/app/link. But it is not clear how to do that in Django, since the application itself is not aware of which "customer" it is serving. There might be a way in Django to make this work, but it should work for all apps.

So the issue is that also internal links must point to the right tenant (customer)...
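
One possibility that was not tried in this thread, so treat it as an untested sketch: the WSGI spec has a SCRIPT_NAME key for "the part of the URL path that the app is mounted under", and Django uses it as the prefix when it builds URLs with reverse() or {% url %}. If the WSGI wrapper moves the customer segment into SCRIPT_NAME instead of just dropping it, generated links should come out as /customer1/app/link. Links that are hard-coded in templates would not be affected, and I have not checked how this interacts with django-tenant-schemas or with a FORCE_SCRIPT_NAME setting. The names `tenant_prefix_middleware` and `demo_app` are made up for the example, and `demo_app` merely stands in for the Django application.

```python
def tenant_prefix_middleware(app, tenants):
    """Wrap a WSGI app so /<tenant>/... looks like <tenant>.<host>/..."""
    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "")
        first, _, rest = path.lstrip("/").partition("/")
        if first in tenants:
            # Fake the per-tenant hostname, as in the earlier WSGI files.
            environ["HTTP_HOST"] = "{}.{}".format(
                first, environ.get("HTTP_HOST", ""))
            # Record the stripped prefix so the app can put it back on links.
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + "/" + first
            environ["PATH_INFO"] = "/" + rest
        return app(environ, start_response)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in for Django: builds an internal link from SCRIPT_NAME."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    link = environ.get("SCRIPT_NAME", "") + "/app/other"
    return [link.encode("utf-8")]


application = tenant_prefix_middleware(demo_app, {"customer1", "customer2"})
```

In a real WSGI file the wrapped app would be the result of get_wsgi_application() rather than `demo_app`, and the tenant set would need updating (plus a reload) whenever a customer is added.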

Ah, I think I see -- so, when a customer is browsing their site, the system assumes that they're on "customer1.yourdomain.com", so internal links within the site are just to (say) "/a/b", rather than to "/customer1/a/b". Is that right?

In that case, and if it's hard to change the internal links (I assume it is, otherwise you wouldn't be asking), maybe this hacking of the hostname in the WSGI file isn't going to work.

I don't know what your revenue model is, or how much traffic a given customer is likely to generate, but perhaps then the simplest option on PythonAnywhere -- until we support wildcard domains -- would be to have a separate web app on the "Web" tab for each customer. A normal "hacker" account can add on web apps for $2/month each. And you can have as many web apps as you want pointing to the same code, so there wouldn't be any duplication there.

Might that work?

Yes, correct

The proposed solution looks acceptable to me as long as the webapps can all point to the same postgres database and use the same code. I'll give it a try.

Thank you for your help.

Yes, they definitely can point to the same DB and use the same code. It should be pretty easy and obvious to set that up, but do let us know if you have any problems.

I also have a multi-tenant requirement, but (in the short term) plan on handling it through multiple webapps pointing to the same Postgres instance with multiple schemas. I don't anticipate more than 5 clients in the first year or so. There are certain advantages in having separate webapps for small numbers (handling of log files and defects, for example).

My only question is handling wildcard certificates -- Are they currently supported on PAW?

Yes, if you have a wildcard cert, we can apply it to the domains that you specify with no problem.