Python Multiprocessing Process Crashes Silently
Solution 1:
What you really want is some way to pass exceptions up to the parent process, right? Then you can handle them however you want.
If you use concurrent.futures.ProcessPoolExecutor, this is automatic. If you use multiprocessing.Pool, it's trivial. If you use explicit Process and Queue, you have to do a bit of work, but it's not that much.
For example:
def run(self):
    try:
        for i in iter(self.inputQueue.get, 'STOP'):
            # (code that does stuff)
            1 / 0  # Dumb error
            # (more code that does stuff)
            self.outputQueue.put(result)
    except Exception as e:
        self.outputQueue.put(e)
Then, your calling code can just read Exceptions off the queue like anything else. Instead of this:
yield outq.pop()
do this:
result = outq.pop()
if isinstance(result, Exception):
    raise result
yield result
(I don't know what your actual parent-process queue-reading code does, because your minimal sample just ignores the queue. But hopefully this explains the idea, even though your real code doesn't actually work like this.)
This assumes that you want to abort on any unhandled exception that makes it up to run. If you want to pass back the exception and continue on to the next i in iter, just move the try into the for, instead of around it.
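As a minimal sketch of that second variant, the loop below wraps each item in its own try, so one failing item is reported and the worker moves on to the next. The names run_worker and job are hypothetical, and the queues are only assumed to expose get and put the way multiprocessing.Queue does.

```python
import queue


def run_worker(input_queue, output_queue, job):
    """Process items until the 'STOP' sentinel, reporting errors per item.

    `job` is a hypothetical callable standing in for "code that does stuff".
    """
    for item in iter(input_queue.get, 'STOP'):
        try:
            result = job(item)
        except Exception as e:
            # Report the failure and keep going with the next item.
            output_queue.put(e)
        else:
            output_queue.put(result)


if __name__ == '__main__':
    # In-process demo with queue.Queue, which offers the same get/put calls.
    inq, outq = queue.Queue(), queue.Queue()
    for value in (1, 0, 4):
        inq.put(value)
    inq.put('STOP')
    run_worker(inq, outq, lambda x: 8 // x)
    while not outq.empty():
        print(repr(outq.get()))
```

The parent still has to check each value it reads for an Exception, exactly as in the reading code above.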
This also assumes that Exceptions are not valid values. If that's an issue, the simplest solution is to just push (result, exception) tuples:
def run(self):
    try:
        for i in iter(self.inputQueue.get, 'STOP'):
            # (code that does stuff)
            1 / 0  # Dumb error
            # (more code that does stuff)
            self.outputQueue.put((result, None))
    except Exception as e:
        self.outputQueue.put((None, e))
Then, your popping code does this:
result, exception = outq.pop()
if exception:
    raise exception
yield result
You may notice that this is similar to the node.js callback style, where you pass (err, result) to every callback. Yes, it's annoying, and you're going to mess up code in that style. But you're not actually using that anywhere except in the wrapper; all of your "application-level" code that gets values off the queue or gets called inside run just sees normal returns/yields and raised exceptions.
You may even want to consider building a Future to the spec of concurrent.futures (or using that class as-is), even though you're doing your job queuing and executing manually. It's not that hard, and it gives you a very nice API, especially for debugging.
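A rough sketch of the Future idea, assuming the worker sends back (job_id, result, exception) tuples. That tuple layout and the names pending, submit_job, and resolve are made up for illustration; only Future(), set_result(), set_exception(), and result() come from concurrent.futures. The library documentation says Future objects are normally created by an executor, so treat this as a sketch of a hand-rolled executor rather than the recommended way to use the class.

```python
from concurrent.futures import Future

pending = {}  # hypothetical registry: job_id -> Future


def submit_job(job_id, input_queue, payload):
    """Queue a job for the worker and return a Future for its outcome."""
    future = Future()
    pending[job_id] = future
    input_queue.put((job_id, payload))
    return future


def resolve(message):
    """Complete the matching Future from a (job_id, result, exception) tuple."""
    job_id, result, exception = message
    future = pending.pop(job_id)
    if exception is not None:
        future.set_exception(exception)
    else:
        future.set_result(result)
```

The parent's queue-reading loop would call resolve() on each message it reads; application code then just calls future.result(), which returns the value or re-raises the exception.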
Finally, it's worth noting that most code built around workers and queues can be made a lot simpler with an executor/pool design, even if you're absolutely sure you only want one worker per queue. Just scrap all the boilerplate, and turn the loop in the Worker.run method into a function (which just returns or raises as normal, instead of appending to a queue). On the calling side, again scrap all the boilerplate and just submit or map the job function with its parameters.
Your whole example can be reduced to:
import concurrent.futures

def job(i):
    # (code that does stuff)
    1 / 0  # Dumb error
    # (more code that does stuff)
    return result

with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
    results = executor.map(job, range(10))
And it'll automatically handle exceptions properly: an exception raised inside job is re-raised in the parent when you iterate over results.
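To see what that looks like with a concrete job, here is a sketch that uses submit instead of map, so each job's outcome can be inspected separately. The job body (dividing 10 by i) and the helper name run_jobs are placeholders, not the original code.

```python
import concurrent.futures


def job(i):
    # Placeholder for "code that does stuff"; fails when i == 0.
    return 10 // i


def run_jobs(executor, inputs):
    """Submit one job per input; return (input, result, error) triples."""
    futures = [(i, executor.submit(job, i)) for i in inputs]
    outcomes = []
    for i, future in futures:
        try:
            outcomes.append((i, future.result(), None))
        except Exception as e:
            # The exception raised in the worker is re-raised here, in the parent.
            outcomes.append((i, None, e))
    return outcomes


if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
        for i, result, error in run_jobs(executor, range(3)):
            print(i, result, repr(error))
```

Because run_jobs only relies on submit and result, it should work the same way with any executor, which makes it easy to try out with a thread pool first.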
As you mentioned in the comments, the traceback for an exception doesn't trace back into the child process; it only goes as far as the manual raise result call (or, if you're using a pool or executor, the guts of the pool or executor).
The reason is that multiprocessing.Queue is built on top of pickle, and pickling exceptions doesn't pickle their tracebacks. And the reason for that is that you can't pickle tracebacks. And the reason for that is that tracebacks are full of references to the local execution context, so making them work in another process would be very hard.
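A quick in-process check of that claim, round-tripping an exception through pickle the way sending it over a multiprocessing.Queue would. The helper name roundtrip is just for illustration.

```python
import pickle


def roundtrip(exc):
    """Pickle and unpickle an exception, as sending it over a Queue would."""
    return pickle.loads(pickle.dumps(exc))


try:
    1 / 0
except ZeroDivisionError as e:
    original = e  # keep a reference past the end of the except block

copy = roundtrip(original)
print(original.__traceback__ is not None)  # the original carries a traceback
print(copy.__traceback__ is None)          # the unpickled copy does not

try:
    pickle.dumps(original.__traceback__)
except TypeError as err:
    # Traceback objects themselves are rejected by pickle.
    print('cannot pickle traceback:', err)
```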
So… what can you do about this? Don't go looking for a fully general solution. Instead, think about what you actually need. 90% of the time, what you want is "log the exception, with traceback, and continue" or "print the exception, with traceback, to stderr and exit(1) like the default unhandled-exception handler". For either of those, you don't need to pass an exception at all; just format it on the child side and pass a string over. If you do need something fancier, work out exactly what you need, and pass just enough information to manually put that together. If you don't know how to format tracebacks and exceptions, see the traceback module. It's pretty simple. And this means you don't need to get into the pickle machinery at all. (Not that it's very hard to copyreg a pickler or write a holder class with a __reduce__ method or anything, but if you don't need to, why learn all that?)
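A minimal sketch of the "format it on the child side" approach, using traceback.format_exc() to turn the active exception into a string, which pickles like any other str. The function names call_and_report and handle_message and the ('ok', ...) / ('error', ...) message shape are assumptions for illustration.

```python
import sys
import traceback


def call_and_report(job, item, output_queue):
    """Child side: run job(item) and put the result or a formatted traceback."""
    try:
        result = job(item)
    except Exception:
        # format_exc() returns the full traceback text of the active exception.
        output_queue.put(('error', traceback.format_exc()))
    else:
        output_queue.put(('ok', result))


def handle_message(message):
    """Parent side: return results, or print the child's traceback and exit."""
    kind, payload = message
    if kind == 'error':
        sys.stderr.write(payload)
        sys.exit(1)
    return payload
```

For the "log and continue" case, the parent would pass the string to its logger instead of writing it to stderr and exiting.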
Solution 2:
I suggest this workaround for showing a process's exceptions:
from multiprocessing import Process
import sys
import traceback

run_old = Process.run

def run_new(*args, **kwargs):
    try:
        run_old(*args, **kwargs)
    except (KeyboardInterrupt, SystemExit):
        raise
    except:
        # Print the child's traceback instead of letting it vanish
        traceback.print_exc(file=sys.stdout)

Process.run = run_new