What are the uses of iter(callable, sentinel)?


Question :

So, I was watching Raymond Hettinger’s talk Transforming Code into Beautiful, Idiomatic Python and he brings up this form of iter which I was never aware of. His example is the following:

Instead of:

blocks = []
while True:
    block = f.read(32)
    if block == '':
        break
    blocks.append(block)

Use:

from functools import partial

blocks = []
read_block = partial(f.read, 32)
for block in iter(read_block, ''):
    blocks.append(block)

After checking the documentation of iter, I found a similar example:

with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):
        process_line(line)
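
As a self-contained sketch of the two snippets above, the following reads fixed-size blocks from an in-memory stream. The io.BytesIO object, its contents and the 32-byte block size are stand-ins chosen for illustration. One detail worth noting: for a binary stream the sentinel has to be b'' rather than '', because read() returns bytes and b'' never compares equal to ''.

```python
import io
from functools import partial

# In-memory stand-in for a file opened in binary mode (illustrative data).
f = io.BytesIO(b"x" * 70)

# f.read(32) returns b"" at end of stream, so b"" is the sentinel here.
# With a text-mode file the sentinel would be "" instead.
blocks = list(iter(partial(f.read, 32), b""))

print([len(block) for block in blocks])  # two full blocks, then a short one
```

If the sentinel has the wrong type (for example '' on a binary stream), the loop never terminates, since read() keeps returning b'' forever.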

This looks pretty useful to me, but I was wondering if any of you Pythonistas know of examples of this construct that don’t involve I/O-read loops? Perhaps in the Standard Library?

I can think of very contrived examples, like the following:

>>> def f():
...     f.count += 1
...     return f.count
... 
>>> f.count = 0
>>> list(iter(f,20))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> 

But obviously this is not any more useful than the built-in iterables. Also, assigning state to a function seems like a code smell to me. At that point I should probably be working with a class, but if I’m going to write a class, I might as well implement the iterator protocol for whatever I want to accomplish.

Answer #1:

As a rule, the main uses I’ve seen for two-arg iter involve converting functions that resemble C APIs (implicit state, no concept of iteration) into iterators. File-like objects are a common example, but the pattern also shows up in other libraries that poorly wrap C APIs. The shape you’d expect is the one seen in APIs like FindFirstFile/FindNextFile, where a resource is opened, and each call advances internal state and returns either a new value or a marker value (like NULL in C). Wrapping such an API in a class that implements the iterator protocol is usually best, but if you have to write that wrapper yourself while the API is a C-level built-in, the wrapper can end up slowing usage; two-arg iter, which is also implemented in C, avoids the expense of additional bytecode execution.
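
To make that shape concrete, here is a minimal sketch. DirectoryScan and its find_next method are hypothetical stand-ins for a C-style handle (they are not a real library API); the only real piece is the two-arg iter call at the end.

```python
class DirectoryScan:
    """Hypothetical C-style handle: each call advances hidden state."""

    def __init__(self, names):
        self._names = list(names)
        self._pos = 0

    def find_next(self):
        # Return the next entry, or None once the scan is exhausted,
        # much like a C function returning NULL.
        if self._pos >= len(self._names):
            return None
        name = self._names[self._pos]
        self._pos += 1
        return name


scan = DirectoryScan(["a.txt", "b.txt", "c.txt"])

# Two-arg iter keeps calling scan.find_next() until it returns None.
found = list(iter(scan.find_next, None))
print(found)
```

No wrapper class with __iter__/__next__ is needed; the bound method plus the sentinel is enough to drive a for loop.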

Other examples involve mutable objects that are changed during the loop itself, for example, looping in reverse order over lines in a bytearray, removing the line only once processing is complete:

>>> from functools import partial
>>> ba = bytearray(b'aaaa\n'*5)
>>> for i in iter(partial(ba.rfind, b'\n'), -1):
...     print(i)
...     ba[i:] = b''
...
24
19
14
9
4

Another case is progressive slicing, for example an efficient (if admittedly ugly) way to split an iterable into groups of n items while allowing the final group to be shorter than n when the input length isn’t an even multiple of n. This one I’ve actually used, though I usually use itertools.takewhile(bool, ...) instead of two-arg iter:

# from future_builtins import map  # Python 2 only
from itertools import starmap, islice, repeat

def grouper(n, iterable):
    '''Returns a generator yielding n sized tuples from iterable

    For iterables not evenly divisible by n, the final group will be undersized.
    '''
    # Keep islicing n items and converting to groups until we hit an empty slice
    return iter(map(tuple, starmap(islice, repeat((iter(iterable), n)))).__next__, ())  # Use .next instead of .__next__ on Py2
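
For comparison, a more readable variation on the same idea is sketched below. It is not the code above: the name chunked is made up for this example, and it swaps the starmap/repeat pipeline for a lambda, which is easier to follow but runs more bytecode per group.

```python
from itertools import islice


def chunked(iterable, n):
    """Yield tuples of up to n items; the last one may be shorter."""
    it = iter(iterable)
    # Each call slices off the next n items; an empty tuple means the
    # underlying iterator is exhausted, so () serves as the sentinel.
    return iter(lambda: tuple(islice(it, n)), ())


print(list(chunked(range(7), 3)))  # [(0, 1, 2), (3, 4, 5), (6,)]
```

The empty tuple is a safe sentinel here because a non-empty slice can never compare equal to ().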

Another use: Writing multiple pickled objects to a single file, followed by a sentinel value (None for example), so when unpickling, you can use this idiom instead of needing to somehow remember the number of items pickled, or needing to call load over and over until EOFError:

import pickle

with open('picklefile', 'rb') as f:
    for obj in iter(pickle.Unpickler(f).load, None):
        ...  # process an object
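
A round-trip sketch of that idiom, using an in-memory io.BytesIO buffer in place of a real file (the buffer and the sample records are assumptions made for the example):

```python
import io
import pickle

records = [{"id": 1}, [1, 2, 3], "text"]  # illustrative payloads

# Write each object as its own pickle, then a None sentinel at the end.
buf = io.BytesIO()
for obj in records:
    pickle.dump(obj, buf)
pickle.dump(None, buf)

# Read back: each load() call returns the next pickled object, and
# iteration stops as soon as load() returns the None sentinel.
buf.seek(0)
loaded = list(iter(pickle.Unpickler(buf).load, None))
print(loaded == records)
```

The obvious limitation is that None can no longer be one of the stored objects; if it can, a dedicated sentinel value is needed.
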
Answered By: ShadowRanger

Answer #2:

Here’s a silly example I came up with:

from functools import partial
from random import randint

pull_trigger = partial(randint, 1, 6)

print('Starting a game of Russian Roulette...')
print('--------------------------------------')

for i in iter(pull_trigger, 6):
    print('I am still alive, selected', i)

print('Oops, game over, I am dead! :(')

Sample output:

$ python3 roulette.py 
Starting a game of Russian Roulette...
--------------------------------------
I am still alive, selected 2
I am still alive, selected 4
I am still alive, selected 2
I am still alive, selected 5
Oops, game over, I am dead! :(

The idea is to have an iterator that yields random values, and you want to stop a process once a particular value has been selected. You could e.g. use this pattern in each run of a simulation that tries to determine the average outcome of a stochastic process.

Of course the process you would be modelling would likely be much more complicated under the hood than a simple dice roll…
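
A rough sketch of such a simulation, reusing the die-roll setup from above. The seed, the trial count and the helper name survived_rounds are arbitrary choices made for illustration:

```python
import random
from functools import partial
from statistics import mean

rng = random.Random(12345)  # fixed seed so runs are repeatable
roll = partial(rng.randint, 1, 6)


def survived_rounds():
    # Count the rolls that come up before the first 6.
    return sum(1 for _ in iter(roll, 6))


trials = [survived_rounds() for _ in range(10_000)]
average = mean(trials)
print(average)  # should land near 5, the expected value for a fair die
```

Each call to iter(roll, 6) is one independent run of the process, so the sentinel marks the stopping condition of a single trial.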

Another example I can think of would be repeatedly performing an operation until it succeeds, indicated by an empty error message (let’s just assume here that some 3rd party function is designed like that instead of e.g. using exceptions):

from foo_lib import guess_password

for msg in iter(guess_password, ''):
    print('Incorrect attempt, details:', msg)

# protection cracked, continue...
Answered By: plamut

Answer #3:

In multiprocessing/multithreading code you will (hopefully) find this construct often for polling a queue or pipe. In the standard lib you’ll also find this in multiprocessing.Pool:

@staticmethod
def _handle_tasks(taskqueue, put, outqueue, pool, cache):
    thread = threading.current_thread()

    for taskseq, set_length in iter(taskqueue.get, None):
        task = None
        try:
            # iterating taskseq cannot fail
            for task in taskseq:
        ...
    else:
        util.debug('task handler got sentinel')
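
A stripped-down sketch of the same polling pattern with the standard queue and threading modules. This is not the Pool code above; the worker function and the squaring task are made up for the example:

```python
import queue
import threading

tasks = queue.Queue()
results = []


def worker():
    # tasks.get() blocks until an item is available; the loop ends
    # when the None sentinel is taken off the queue.
    for n in iter(tasks.get, None):
        results.append(n * n)


t = threading.Thread(target=worker)
t.start()

for n in range(5):
    tasks.put(n)
tasks.put(None)  # tell the worker to stop

t.join()
print(results)  # [0, 1, 4, 9, 16]
```

With several workers, one sentinel per worker has to be put on the queue, since each worker consumes the sentinel it sees.
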

A while ago I came across this blog entry, which IMO sums up really nicely the advantage of iter(callable, sentinel) over while True ... break:

Usually, when we iterate over an object or until a condition happens, we understand the scope of the loop in its first line. e.g., when reading a loop that starts with for book in books we realize we’re iterating over all the books. When we see a loop that starts with while not battery.empty() we realize that the scope of the loop is for as long as we still have battery.
When we say “Do forever” (i.e., while True), it’s obvious that this scope is a lie. So it requires us to hold that thought in our head and search the rest of the code for a statement that’ll get us out of it. We are entering the loop with less information and so it is less readable.

Answered By: Darkonaut
