I need a queue which multiple threads can put stuff into, and multiple threads may read from.
Python has at least two queue classes, Queue.Queue and collections.deque, with the former seemingly using the latter internally. Both claim to be thread-safe in the documentation.
However, the Queue docs also state:
collections.deque is an alternative
implementation of unbounded queues
with fast atomic append() and
popleft() operations that do not
Which I guess I don’t quite unterstand: Does this mean deque isn’t fully thread-safe after all?
If it is, I may not fully understand the difference between the two classes. I can see that Queue adds blocking functionality. On the other hand, it loses some deque features like support for the in-operator.
Accessing the internal deque object directly, is
x in Queue().deque
Also, why does Queue employ a mutex for it’s operations when deque is thread-safe already?
collections.deque serve different purposes. Queue.Queue is intended for allowing different threads to communicate using queued messages/data, whereas
collections.deque is simply intended as a datastructure. That’s why
Queue.Queue has methods like
Queue.Queue isn’t intended to be used as a collection, which is why it lacks the likes of the
It boils down to this: if you have multiple threads and you want them to be able to communicate without the need for locks, you’re looking for
Queue.Queue; if you just want a queue or a double-ended queue as a datastructure, use
Finally, accessing and manipulating the internal deque of a
Queue.Queue is playing with fire – you really don’t want to be doing that.
If all you’re looking for is a thread-safe way to transfer objects between threads, then both would work (both for FIFO and LIFO). For FIFO:
- Other operations on
dequemight not be thread safe, I’m not sure.
dequedoes not block on
popleft()so you can’t base your consumer thread flow on blocking till a new item arrives.
However, it seems that deque has a significant efficiency advantage. Here are some benchmark results in seconds using CPython 2.7.3 for inserting and removing 100k items
deque 0.0747888759791 Queue 1.60079066852
Here’s the benchmark code:
import time import Queue import collections q = collections.deque() t0 = time.clock() for i in xrange(100000): q.append(1) for i in xrange(100000): q.popleft() print 'deque', time.clock() - t0 q = Queue.Queue(200000) t0 = time.clock() for i in xrange(100000): q.put(1) for i in xrange(100000): q.get() print 'Queue', time.clock() - t0
For information there is a Python ticket referenced for deque thread-safety (https://bugs.python.org/issue15329).
Title “clarify which deque methods are thread-safe”
Bottom line here: https://bugs.python.org/issue15329#msg199368
The deque’s append(), appendleft(), pop(), popleft(), and len(d)
operations are thread-safe in CPython. The append methods have a
DECREF at the end (for cases where maxlen has been set), but this
happens after all of the structure updates have been made and the
invariants have been restored, so it is okay to treat these operations
Anyway, if you are not 100% sure and you prefer reliability over performance, just put a like Lock 😉
All single-element methods on
deque are atomic and thread-safe. All other methods are thread-safe too. Things like
dq yield momentary correct values. But think e.g. about
dq.extend(mylist): you don’t get a guarantee that all elements in
mylist are filed in a row when other threads also append elements on the same side – but thats usually not a requirement in inter-thread communication and for the questioned task.
deque is ~20x faster than
Queue (which uses a
deque under the hood) and unless you don’t need the “comfortable” synchronization API (blocking / timeout), the strict
maxsize obeyance or the “Override these methods (_put, _get, ..) to implement other queue organizations” sub-classing behavior, or when you take care of such things yourself, then a bare
deque is a good and efficient deal for high-speed inter-thread communication.
In fact the heavy usage of an extra mutex and extra method
._get() etc. method calls in
Queue.py is due to backwards compatibility constraints, past over-design and lack of care for providing an efficient solution for this important speed bottleneck issue in inter-thread communication. A list was used in older Python versions – but even list.append()/.pop(0) was & is atomic and threadsafe …
notify_all() to each
popleft results in far worse results for
deque than the 20x improvement achieved by default
deque + notify_all: 0.469802 Queue: 0.667279
@Jonathan modify his code a little and I get the benchmark using cPython 3.6.2 and add condition in deque loop to simulate the behaviour Queue do.
import time from queue import Queue import threading import collections mutex = threading.Lock() condition = threading.Condition(mutex) q = collections.deque() t0 = time.clock() for i in range(100000): with condition: q.append(1) condition.notify_all() for _ in range(100000): with condition: q.popleft() condition.notify_all() print('deque', time.clock() - t0) q = Queue(200000) t0 = time.clock() for _ in range(100000): q.put(1) for _ in range(100000): q.get() print('Queue', time.clock() - t0)
And it seems the performance limited by
collections.deque is an alternative implementation of unbounded queues with fast atomic append() and popleft() operations that do not require locking.
deque is thread-safe. “operations that do not require locking” means that you don’t have to do the locking yourself, the
deque takes care of it.
Taking a look at the
Queue source, the internal deque is called
self.queue and uses a mutex for accessors and mutations, so
Queue().queue is not thread-safe to use.
If you’re looking for an “in” operator, then a deque or queue is possibly not the most appropriate data structure for your problem.
(seems I don’t have reputation to comment…)
You need to be careful which methods of the deque you use from different threads.
deque.get() appears to be threadsafe, but I have found that doing
for item in a_deque: process(item)
can fail if another thread is adding items at the same time.
I got an RuntimeException that complained “deque mutated during iteration”.
Check collectionsmodule.c to see which operations are affected by this