Question:
There is a more general question here: In what situation should the built-in operator module be used in Python?
The top answer claims that operator.itemgetter(x) is “neater” than, presumably, lambda a: a[x]. I feel the opposite is true.
Are there any other benefits, like performance?
Answer #1:
You shouldn’t worry about performance unless your code is in a tight inner loop, and is actually a performance problem. Instead, use code that best expresses your intent. Some people like lambdas, some like itemgetter. Sometimes it’s just a matter of taste.
itemgetter is more powerful, for example, if you need to get a number of elements at once. For example:
operator.itemgetter(1,3,5)
is the same as:
lambda s: (s[1], s[3], s[5])
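The equivalence above can be checked directly. This is a minimal sketch; the `row` list and the variable names are made up for illustration. It also shows one detail worth knowing: with a single index, itemgetter returns the item itself rather than a one-element tuple.

```python
from operator import itemgetter

# Illustrative data (not from the original answer).
row = ["zero", "one", "two", "three", "four", "five"]

get_three = itemgetter(1, 3, 5)
as_lambda = lambda s: (s[1], s[3], s[5])

# Both callables return the same tuple of selected elements.
picked = get_three(row)
assert picked == as_lambda(row)
print(picked)  # ('one', 'three', 'five')

# With a single index, itemgetter returns the bare item, not a 1-tuple.
single = itemgetter(1)(row)
print(single)  # one
```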
Answer #2:
There are benefits in some situations. Here is a good example:
>>> data = [('a',3),('b',2),('c',1)]
>>> from operator import itemgetter
>>> sorted(data, key=itemgetter(1))
[('c', 1), ('b', 2), ('a', 3)]
This use of itemgetter is great because it makes everything clear while also being faster, as all operations are kept on the C side.
>>> sorted(data, key=lambda x:x[1])
[('c', 1), ('b', 2), ('a', 3)]
Using a lambda is not as clear and is also slower, and it is preferred not to use lambda unless you have to. E.g., list comprehensions are preferred over using map with a lambda.
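To make the map-versus-comprehension remark concrete, here is a small sketch that reuses the `data` list from this answer. The variable names are illustrative only. All three spellings produce the same list; which one reads best is a matter of taste.

```python
from operator import itemgetter

data = [('a', 3), ('b', 2), ('c', 1)]

# map with a lambda: works, but is the form the answer advises against.
with_lambda = list(map(lambda pair: pair[1], data))

# The equivalent list comprehension.
with_comprehension = [pair[1] for pair in data]

# map with itemgetter: no lambda needed.
with_itemgetter = list(map(itemgetter(1), data))

print(with_comprehension)  # [3, 2, 1]
```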
Answer #3:
Performance. It can make a big difference. In the right circumstances, you can get a bunch of stuff done at the C level by using itemgetter.
I think the claim about which is clearer is very subjective and really depends on which one you use most often.
Answer #4:
As performance was mentioned, I’ve compared both methods, operator.itemgetter and lambda, and for a small list it turns out that operator.itemgetter outperforms lambda by 10%. I personally like the itemgetter method as I mostly use it during sort and it became like a keyword for me.
import operator
import timeit

x = [[12, 'tall', 'blue', 1],
     [2, 'short', 'red', 9],
     [4, 'tall', 'blue', 13]]

def sortOperator():
    # Sort in place by columns 1 and 2 using itemgetter.
    x.sort(key=operator.itemgetter(1, 2))

def sortLambda():
    # Same sort key, written as a lambda.
    x.sort(key=lambda x: (x[1], x[2]))

if __name__ == "__main__":
    print(timeit.timeit(stmt="sortOperator()", setup="from __main__ import sortOperator", number=10**7))
    print(timeit.timeit(stmt="sortLambda()", setup="from __main__ import sortLambda", number=10**7))
>>Tuple: 9.79s, Single: 8.835s
>>Tuple: 11.12s, Single: 9.26s
Run on Python 3.6
Answer #5:
Leaving aside performance and code style, itemgetter is picklable, while lambda is not. This is important if the function needs to be saved, or passed between processes (typically as part of a larger object). In the following example, replacing itemgetter with lambda will result in a PicklingError.
from operator import itemgetter

def sort_by_key(sequence, key):
    return sorted(sequence, key=key)

if __name__ == "__main__":
    from multiprocessing import Pool

    # Each item pairs a sequence with the key function used to sort it.
    items = [([(1,2),(4,1)], itemgetter(1)),
             ([(5,3),(2,7)], itemgetter(0))]
    with Pool(5) as p:
        result = p.starmap(sort_by_key, items)
    print(result)
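The same difference can be seen without multiprocessing by calling pickle directly. This is a minimal sketch with illustrative variable names. The exact exception type raised for a lambda can vary with the Python version and with where the lambda is defined, so the sketch catches both PicklingError and AttributeError.

```python
import pickle
from operator import itemgetter

# An itemgetter survives a pickle round trip and still works afterwards.
getter = itemgetter(1)
restored = pickle.loads(pickle.dumps(getter))
restored_value = restored(("a", "b"))
print(restored_value)  # b

# A lambda cannot be pickled by the standard pickle module.
lambda_pickled = True
try:
    pickle.dumps(lambda x: x[1])
except (pickle.PicklingError, AttributeError) as exc:
    lambda_pickled = False
    print("lambda could not be pickled:", type(exc).__name__)
```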
Answer #6:
When using this in the key parameter of sorted() or min(), given the choice between, say, operator.itemgetter(1) and lambda x: x[1], the former is typically significantly faster in both cases:
Using sorted()
The compared functions are defined as follows:
import operator

def sort_key_itemgetter(items, key=1):
    return sorted(items, key=operator.itemgetter(key))

def sort_key_lambda(items, key=1):
    return sorted(items, key=lambda x: x[key])
Result: sort_key_itemgetter() is faster by ~10% to ~15%.
(Full analysis here)
Using min()
The compared functions are defined as follows:
import operator

def min_key_itemgetter(items, key=1):
    return min(items, key=operator.itemgetter(key))

def min_key_lambda(items, key=1):
    return min(items, key=lambda x: x[key])
Result: min_key_itemgetter() is faster by ~20% to ~60%.
(Full analysis here)
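For readers who want to reproduce a comparison like this, here is a rough sketch using timeit. It is not the linked analysis: the input size, the random data, and the repetition count are assumptions chosen only for illustration, and the measured times will differ by machine and Python version.

```python
import operator
import random
import timeit

def min_key_itemgetter(items, key=1):
    return min(items, key=operator.itemgetter(key))

def min_key_lambda(items, key=1):
    return min(items, key=lambda x: x[key])

# Assumed test input: 1000 (index, random float) pairs.
random.seed(0)
items = [(i, random.random()) for i in range(1000)]

# Both versions must agree before their timings are worth comparing.
same_result = min_key_itemgetter(items) == min_key_lambda(items)

t_itemgetter = timeit.timeit(lambda: min_key_itemgetter(items), number=200)
t_lambda = timeit.timeit(lambda: min_key_lambda(items), number=200)
print(f"itemgetter: {t_itemgetter:.4f}s, lambda: {t_lambda:.4f}s")
```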
Answer #7:
Some programmers understand and use lambdas, but there is a population of programmers who perhaps didn’t take computer science and aren’t clear on the concept. For those programmers, itemgetter() can make your intention clearer. (I don’t write lambdas, and any time I see one in code it takes me a little extra time to process what’s going on and understand the code.)
If you’re coding for other computer science professionals, go ahead and use lambdas if they are more comfortable. However, if you’re coding for a wider audience, I suggest using itemgetter().