I’m sorry if this is too simple for some people, but I still don’t get the trick with Python’s multiprocessing. I’ve read several tutorials and examples that Google gives me — many of them from here, too.
Well, my situation is that I have to compute many numpy matrices and then store them in a single numpy matrix afterwards. Let’s say I want to use 20 cores (or that I can use 20 cores), but I haven’t managed to use the pool resource successfully, since it keeps the processes alive until the pool “dies”. So I thought of doing something like this:
from multiprocessing import Process, Queue
import numpy as np

def f(q, i):
    q.put(np.zeros((4, 4)))

if __name__ == '__main__':
    q = Queue()
    for i in range(30):
        p = Process(target=f, args=(q, i))
        p.start()
        p.join()
    result = q.get()
    while not q.empty():
        result += q.get()
    print(result)
but then it looks like the processes don’t run in parallel; they run sequentially (please correct me if I’m wrong). I also don’t know whether they die after their computation finishes (so that, with more than 20 processes, the ones that did their part leave the core free for another process). Furthermore, for a very large number of iterations (say 100,000), storing all those matrices (which may be really big, too) in a queue would use a lot of memory, rendering the code useless. The idea is instead to add every result on each iteration into the final result, for example by using a lock (with its acquire() and release() methods) — but if this code isn’t doing parallel processing, the lock is useless too.
I hope somebody may help me.
Thanks in advance!
You are correct, they are executing sequentially in your example.
p.join() causes the current thread to block until it is finished executing. You’ll either want to join your processes individually outside of your for loop (e.g., by storing them in a list and then iterating over it) or use something like
apply_async with a callback. That will also let you add it to your results directly rather than keeping the objects around.
from multiprocessing import Pool
import numpy as np

def f(i):
    return i * np.identity(4)

if __name__ == '__main__':
    p = Pool(5)
    result = np.zeros((4, 4))

    def adder(value):
        global result
        result += value

    for i in range(30):
        p.apply_async(f, args=(i,), callback=adder)
    p.close()
    p.join()
    print(result)
Closing and then joining the pool at the end ensures that the pool’s processes have completed and the
result object is finished being computed. You could also investigate using
Pool.imap as a solution to your problem. That particular solution would look something like this:
if __name__ == '__main__':
    p = Pool(5)
    result = np.zeros((4, 4))
    im = p.imap_unordered(f, range(30), chunksize=5)
    for x in im:
        result += x
    print(result)
This is cleaner for your specific situation, but may not be for whatever you are ultimately trying to do.
As to storing all of your varied results: if I understand your question, you can add each one into a running result in the callback method (as above), or item-at-a-time using imap_unordered (which still buffers results internally, but you clear them as they arrive). Then nothing needs to be stored for longer than it takes to add it to the result.