multithreading issue

Hey,

I'm calling a function that lives in another file from a particular route to send an email. The idea is to send the email in the background, rather than making the user wait for it to send before being redirected as normal.

This looks like this...

from threading import Thread
import smtplib

def send_email():
    '''stuff'''

# Start send_email in a background thread so the route can return right away
thr = Thread(target=send_email)
thr.start()

The trouble is that the route's redirect doesn't happen until after the email is sent. I know this because, to test it, I called the function 5 times and timed it.

Is this because of this? *** Python threads support is disabled. You can enable it with --enable-threads ***

Yes. We have disabled threads in web apps.

Hey, thanks for the reply. So where exactly do I set --enable-threads to enable them?

You don't. We're not convinced that allowing threads in web apps is a good idea - at the moment there's too much of a chance that users will accidentally end up hogging resources.

Well that sucks

Hey there,

well, the lack of threads itself kinda sucks, but it would be fine if there was some sort of workaround -- the common requirement here, one that we hear a lot, is that people want to be able to run async tasks from their web apps -- like sending an email, or doing some kind of batch processing. For that, lots of people use queue management tools like Celery, and a pool of worker processes... The trouble is that we really don't have a good way of supporting Celery or batch workers. And that definitely sucks a bit. And it's one of the things we plan to fix soon, if we can.

In the meantime -- you can "roll your own" solution to this problem, by implementing your own queue, and then using a scheduled task as the batch processor. This could be as simple as having a new database table in MySQL where you log jobs to be done (like, send an email to this address with this text), and then a scheduled task that runs every hour, and looks through the table, and processes the jobs one by one. Then your web app just adds a job to the queue by adding a row to the database, which is fast, and the scheduled task will take care of doing the "slow" sending of emails a bit later.
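As a rough sketch of that idea, here is one way the table-as-queue approach might look. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table name email_jobs, the function names, and the send callback are all made up for illustration, so adapt them to your own schema and mail code.

```python
import sqlite3


def init_queue(conn):
    """Create the (hypothetical) email_jobs table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS email_jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "recipient TEXT NOT NULL, "
        "body TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()


def enqueue_email(conn, recipient, body):
    """Called from the web app: only inserts a row, so it returns quickly."""
    conn.execute(
        "INSERT INTO email_jobs (recipient, body) VALUES (?, ?)",
        (recipient, body),
    )
    conn.commit()


def process_pending(conn, send):
    """Called from the scheduled task: sends each pending job, oldest first.

    `send` is any callable taking (recipient, body); it stands in for the
    real (slow) email-sending code. Returns the number of jobs processed.
    """
    rows = conn.execute(
        "SELECT id, recipient, body FROM email_jobs WHERE done = 0 ORDER BY id"
    ).fetchall()
    for job_id, recipient, body in rows:
        send(recipient, body)
        # Mark the job as done only after the send call has returned.
        conn.execute("UPDATE email_jobs SET done = 1 WHERE id = ?", (job_id,))
        conn.commit()
    return len(rows)
```

The web app would call enqueue_email inside the route, and the scheduled task would be a small script that opens a connection and calls process_pending with the real sending function. If a send raises an exception, the job stays marked as pending and gets picked up again on the next run.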

And with all that said -- sending emails really isn't all that slow. Unless you're seeing crazy-high traffic and you really need to start doing some serious optimisation, you probably aren't going to get all that much benefit out of building a complex async processing system (even just dealing with all the fiddly bits of threads) just to handle sending emails. Have you done any sort of measurement of how much time it's taking to send your emails, how much time it's adding to the average HTTP request round trip for your site? My guess is that it'll probably be "in the noise" compared to the time it takes an HTTP packet to cross the Internet, there and back...
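If you want to do that measurement, a minimal sketch might look like the following. The time_call helper and the fake_send_email stand-in are invented for this example; you would pass in your own sending function instead.

```python
import time


def time_call(func, *args, repeats=5, **kwargs):
    """Return the average wall-clock seconds per call over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(*args, **kwargs)
    return (time.perf_counter() - start) / repeats


def fake_send_email():
    # Hypothetical stand-in for the real email code: just waits 10 ms.
    time.sleep(0.01)


if __name__ == "__main__":
    average = time_call(fake_send_email, repeats=5)
    print(f"average time per send: {average * 1000:.1f} ms")
```

Comparing that average against the total time of a typical request should show whether the email step is worth optimising at all.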

Another suggestion would be to use Ajax -- that way you can give the UI back to your user immediately, and you don't need anything fancy on the server side, just a bit of javascript on the client side, and we don't place any restrictions at all on what you can do there...

Hey Harry,

Thanks for your response, and you're right. I was thinking about it last night and it doesn't actually cause much of an issue. Personally I just want to make my site as quick as possible. It's just a personal blog, but I was getting a little obsessed with the speed, so I thought a subprocess would be best to make it that little bit quicker :)

ok, i wanted to use threads to copy files from urls in parallel. i'm not exactly thrilled that i can't do this. also, in my case i can't resume the main thread until all the resources are available. how can i create a queue for the file i/o operations and wait until it's finished to call the subsequent functions?

clyde

glenn- when you say this: 'We're not convinced that allowing threads in web apps is a good idea - at the moment there's too much of a chance that users will accidentally end up hogging resources,' that implies you can't find a billing model to support threads. otherwise, hogging resources is good for your business, after all that's what you sell- resources. from my POV it would be much nicer if you would monetize threads rather than ban them.

clyde

also- if you aren't going to support threads, don't allow users to import the thread and threading modules. i had no idea what was going wrong with my code.

clyde

@clydet, it's not about monetisation, it's about users adversely affecting each other. We don't want poorly written code from one user to influence another and, at the moment, we're not convinced that the OS-level limitations that we can apply will sufficiently protect our users from a finger-slip or from actively malicious code.

As to preventing users from importing threading, that would involve modifying the base Python library install with all sorts of possible unintended consequences.

glenn, normally i don't sound so contentious, but my code was so breathtakingly beautiful that i was taken aback when it didn't run. i get what you're saying about not wanting to modify the base install, but it would have been nicer if i had known about this limitation before i wrote my module.

i do think, however, that 'threads disabled by default' would still be the better approach. did it fail as an experiment?

clyde

Hi Clyde -- just jumping in here while Glenn is busy. The problem with stopping people from importing threading is that it's a base Python module. Stopping people from importing would mean somehow removing part of the base Python installation from the system image available to web apps. There are two problems with that:

  1. Removing stuff from Python isn't really a supported operation by Python itself. There could be any number of dependencies that would break, and those breakages wouldn't necessarily be immediately obvious. As an analogy, imagine removing a random system file from your operating system. It might have the desired effect, and it might pass all the tests you can think of, but you could never be sure that it wouldn't break the OS in a non-obvious way that you'd only discover at a later point. With tens of thousands of web apps on PythonAnywhere, we can't really take that risk.
  2. We'd also have to keep totally separate sets of Python installs for console/scheduled task sandboxes (where you can run threads) and for web application ones. That could double the amount of work required to keep our sandbox install up-to-date, which is already an expensive operation -- we need to keep track of dependencies between hundreds of packages for four different Python versions.

Monetising it, and allowing threading but charging for usage, would actually be somewhat easier. But unfortunately it's still hard: while it's reasonably easy to prevent certain kinds of out-of-control processes from adversely affecting other users (for example, process fork bombs can be stopped with a simple per-user process limit) it's harder to put the same limits in place for threads. Especially with the different virtualisation module we use for web apps versus console apps and scheduled tasks. And the failure mode when threads get out of control is that the machine crashes completely, taking all of the running web apps with it -- not just that other web apps slow down. So it's something where we really need to be completely sure about our solution before going for it.

thank you for filling in the missing pieces! it sounds like a reasoned approach. i didn't realize the problems involved in wrangling threads.

clyde

Another thing you can do (I had to do this myself) is to use the multiprocessing library. If you want to share lists between processes, use a manager (I had a huge problem trying to sync lists). Multiprocessing is far worse than threading, but it is a workaround.

I don't understand the problem. Python's threading is not like C++ or Go: it doesn't use multiple cores, so it shouldn't "hog" resources.

Python threads do take resources. The OS needs to keep track of them and schedule switching between them. That takes time and memory.

It would be nice to just show a warning whenever there is a "from threading import Thread" in the code, for example. It took me some time to figure out what was going wrong with my application too.
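Purely as an illustration of what such a check could look like (this is not something PythonAnywhere is known to do, and the function name is made up), the standard ast module can find those imports without running the code:

```python
import ast


def find_threading_imports(source):
    """Return sorted line numbers of statements that import a threading module."""
    thread_modules = {"threading", "thread", "_thread"}
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Handles "import threading" and "import smtplib, threading".
            if any(alias.name.split(".")[0] in thread_modules for alias in node.names):
                hits.append(node.lineno)
        elif isinstance(node, ast.ImportFrom):
            # Handles "from threading import Thread"; node.module is None
            # for relative imports such as "from . import x".
            if node.module and node.module.split(".")[0] in thread_modules:
                hits.append(node.lineno)
    return sorted(hits)
```

A check like this only sees imports written directly in the file it is given. It would miss threads started by third-party libraries or imports done dynamically.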

Thanks for the suggestion. We will give it some thought.