So, I was watching Raymond Hettinger’s talk *Transforming Code into Beautiful, Idiomatic Python*, and he brings up a form of `iter` that I was never aware of. His example is the following:
```python
blocks = []
while True:
    block = f.read(32)
    if block == '':
        break
    blocks.append(block)
```
```python
blocks = []
read_block = partial(f.read, 32)
for block in iter(read_block, ''):
    blocks.append(block)
```
After checking the documentation of `iter`, I found a similar example:
```python
with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):
        process_line(line)
```
This looks pretty useful to me, but I was wondering if any of you Pythonistas know of examples of this construct that don’t involve I/O read loops. Perhaps in the standard library?
I can think of very contrived examples, like the following:
```python
def f():
    f.count += 1
    return f.count

f.count = 0
list(iter(f, 20))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
```
But obviously this is not any more useful than the built-in iterables. Also, it seems like a code smell to me to be assigning state to a function. At that point I should likely be working with a class, and if I’m going to write a class, I might as well implement the iterator protocol for whatever I want to accomplish.
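For completeness, here is what that contrived counter looks like as a class implementing the iterator protocol (a sketch, not from any library):

```python
class Counter:
    """The same counting behaviour as the stateful function, but as a proper iterator."""

    def __init__(self, stop):
        self.count = 0
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        self.count += 1
        if self.count == self.stop:
            raise StopIteration
        return self.count

print(list(Counter(20)))  # [1, 2, ..., 19], same as list(iter(f, 20))
```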
As a rule, the main uses I’ve seen for two-arg `iter` involve converting functions that resemble C APIs (implicit state, no concept of iteration) into iterators. File-like objects are a common example, but it shows up in other libraries that thinly wrap C APIs. The pattern you’d expect is the one seen in APIs like `FindNextFile`: a resource is opened, and each call advances internal state and returns either a new value or a marker value (like `NULL` in C). Wrapping such an API in a class that implements the iterator protocol is usually best, but if you have to do the wrapping yourself while the underlying API is a C-level built-in, the Python-level wrapping can slow things down, whereas two-arg `iter`, itself implemented in C, avoids the expense of executing additional byte code.
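As a sketch of that pattern, with a hypothetical `find_next_file` function standing in for the C API (a real one would hold its state behind an opaque handle):

```python
from functools import partial

def find_next_file(handle):
    """Hypothetical C-style API: each call advances internal state and
    returns the next filename, or None (the sentinel) when exhausted."""
    return handle.pop(0) if handle else None

handle = ['a.txt', 'b.txt', 'c.txt']  # stands in for an opaque C handle

# Two-arg iter keeps calling find_next_file(handle) until it returns None
for name in iter(partial(find_next_file, handle), None):
    print(name)
```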
Other examples involve mutable objects that are changed during the loop itself, for example, looping in reverse order over lines in a bytearray, removing the line only once processing is complete:
```python
from functools import partial

ba = bytearray(b'aaaa\n' * 5)
for i in iter(partial(ba.rfind, b'\n'), -1):
    print(i)
    ba[i:] = b''
# Output:
# 24
# 19
# 14
# 9
# 4
```
Another case is using slicing in a progressive manner: an efficient (if admittedly ugly) way to group an iterable into groups of `n` items, while allowing the final group to have fewer than `n` items if the input iterable’s length isn’t an even multiple of `n` (this one I’ve actually used, though I usually use `itertools.takewhile(bool, ...)` instead of two-arg `iter`):
```python
# from future_builtins import map  # Python 2 only
from itertools import starmap, islice, repeat

def grouper(n, iterable):
    '''Returns a generator yielding n sized tuples from iterable

    For iterables not evenly divisible by n, the final group will be undersized.
    '''
    # Keep islicing n items and converting to groups until we hit an empty slice
    return iter(map(tuple, starmap(islice, repeat((iter(iterable), n)))).__next__, ())  # Use .next instead of .__next__ on Py2
```
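For reference, here is a standalone run of that grouper (the definition is repeated so the snippet runs on its own, with the Python 2 shims dropped):

```python
from itertools import starmap, islice, repeat

def grouper(n, iterable):
    '''Yield n-sized tuples from iterable; the final group may be undersized.'''
    # islice the same iterator n items at a time until an empty slice appears
    return iter(map(tuple, starmap(islice, repeat((iter(iterable), n)))).__next__, ())

print(list(grouper(3, range(8))))  # [(0, 1, 2), (3, 4, 5), (6, 7)]
```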
Another use: writing multiple pickled objects to a single file, followed by a sentinel value (`None`, for example). When unpickling, you can then use this idiom instead of needing to somehow remember the number of items pickled, or needing to call `load` over and over until it raises an exception:
```python
with open('picklefile', 'rb') as f:
    for obj in iter(pickle.Unpickler(f).load, None):
        ...  # process an object
```
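A minimal round trip of the idea, using an in-memory buffer in place of a real file:

```python
import io
import pickle

buf = io.BytesIO()

# Write side: dump each object, then the sentinel
for obj in ['spam', 42, [1, 2, 3]]:
    pickle.dump(obj, buf)
pickle.dump(None, buf)  # sentinel marks the end

# Read side: two-arg iter stops once load() returns the sentinel
buf.seek(0)
for obj in iter(pickle.Unpickler(buf).load, None):
    print(obj)
```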
Here’s a silly example I came up with:
```python
from functools import partial
from random import randint

pull_trigger = partial(randint, 1, 6)

print('Starting a game of Russian Roulette...')
print('--------------------------------------')

for i in iter(pull_trigger, 6):
    print('I am still alive, selected', i)

print('Oops, game over, I am dead! :(')
```
```
$ python3 roulette.py
Starting a game of Russian Roulette...
--------------------------------------
I am still alive, selected 2
I am still alive, selected 4
I am still alive, selected 2
I am still alive, selected 5
Oops, game over, I am dead! :(
```
The idea is to have a generator that yields random values, and to stop the process once a particular value has been selected. You could, for example, use this pattern in each run of a simulation that tries to determine the average outcome of a stochastic process.
Of course the process you would be modelling would likely be much more complicated under the hood than a simple dice roll…
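For instance, a rough sketch of such a simulation, estimating the expected number of safe pulls before hitting the loaded chamber (for a fair six-sided “revolver” that expectation is 5):

```python
from functools import partial
from random import randint
from statistics import mean

pull_trigger = partial(randint, 1, 6)

def rounds_survived():
    # One simulation run: count pulls until the sentinel value 6 comes up
    return sum(1 for _ in iter(pull_trigger, 6))

runs = [rounds_survived() for _ in range(10_000)]
print('average rounds survived:', mean(runs))  # hovers around 5
```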
Another example I can think of is repeatedly performing an operation until it succeeds, with success indicated by an empty error message (let’s just assume here that some third-party function is designed like that instead of, e.g., using exceptions):
```python
from foo_lib import guess_password

for msg in iter(guess_password, ''):
    print('Incorrect attempt, details:', msg)

# protection cracked, continue...
```
In multiprocessing/multithreading code you will (hopefully) often find this construct used for polling a queue or pipe. In the standard library you’ll also find it in `multiprocessing.pool`:
```python
def _handle_tasks(taskqueue, put, outqueue, pool, cache):
    thread = threading.current_thread()

    for taskseq, set_length in iter(taskqueue.get, None):
        task = None
        try:
            # iterating taskseq cannot fail
            for task in taskseq:
                ...
    else:
        util.debug('task handler got sentinel')
```
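The same pattern in miniature: a worker thread draining a queue until it sees a `None` sentinel (a common hand-rolled variant of what the pool does internally):

```python
import queue
import threading

q = queue.Queue()
processed = []

def worker():
    # Consume items until the None sentinel arrives
    for item in iter(q.get, None):
        processed.append(item)

t = threading.Thread(target=worker)
t.start()
for item in [1, 2, 3]:
    q.put(item)
q.put(None)  # sentinel: tells the worker to shut down
t.join()
print(processed)  # [1, 2, 3]
```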
Some while ago I came across this blog entry, which IMO sums up really nicely the advantage of `iter(callable, sentinel)` over `while True: ... break`:
Usually, when we iterate over objects, or until a condition is met, we understand the scope of the loop from its first line. E.g., when reading a loop that starts with `for book in books`, we realize we’re iterating over all the books. When we see a loop that starts with `while not battery.empty()`, we realize that the scope of the loop is for as long as we still have battery.
When we say “do forever” (i.e., `while True`), it’s obvious that this scope is a lie. So it requires us to hold that thought in our head and search the rest of the code for the statement that gets us out of the loop. We enter the loop with less information, and so it is less readable.
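To make the contrast concrete, here are the two spellings side by side (a self-contained sketch over an in-memory file):

```python
import io

f = io.StringIO('x' * 80)

# The loop's real scope is hidden: you must scan the body to find the break
chunks = []
while True:
    chunk = f.read(32)
    if chunk == '':
        break
    chunks.append(chunk)

# The scope is stated up front: read 32-char chunks until '' is returned
f.seek(0)
chunks2 = list(iter(lambda: f.read(32), ''))

print(chunks == chunks2)  # True
```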