
Using Lookahead With Generators

I have implemented a generator-based scanner in Python that tokenizes a string into tuples of the form (token type, token value): for token in scan('a(b)'): print token would

Solution 1:

Pretty good answers there, but my favorite approach would be to use itertools.tee -- given an iterator, it returns two (or more if requested) that can be advanced independently. It buffers in memory just as much as needed (i.e., not much, if the iterators don't get very "out of step" from each other). E.g.:

import itertools
import collections.abc

class IteratorWithLookahead(collections.abc.Iterator):
    def __init__(self, it):
        # Two independent views of the same underlying iterator.
        self.it, self.nextit = itertools.tee(iter(it))
        self._advance()
    def _advance(self):
        # None signals that there is nothing left to look ahead to.
        self.lookahead = next(self.nextit, None)
    def __next__(self):
        self._advance()
        return next(self.it)

You can wrap any iterator with this class, and then use the .lookahead attribute of the wrapper to know what the next item to be returned in the future will be. I like to leave all the real logic to itertools.tee and just provide this thin glue!-)
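To see what tee contributes on its own, here is a minimal sketch without the wrapper class. The scan function below is only a hypothetical stand-in for the question's scanner (it emits one made-up 'CHAR' token per character); the point is just that the two iterators returned by tee advance independently.

```python
import itertools

def scan(text):
    # Hypothetical stand-in for the question's scanner: one token per character.
    for ch in text:
        yield ('CHAR', ch)

# tee() splits one iterator into two that can be advanced independently.
current, ahead = itertools.tee(scan('a(b)'))
next(ahead, None)  # move the second copy one item ahead of the first

# Pair each token with the one that follows it (None after the last token).
pairs = [(token, next(ahead, None)) for token in current]

for token, lookahead in pairs:
    print(token, lookahead)
```

Because None doubles as the "no more items" marker here, a stream that can legitimately contain None would need a dedicated sentinel object instead.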

Solution 2:

You can write a wrapper that buffers some number of items from the generator, and provides a lookahead() function to peek at those buffered items:

class Lookahead:
    def __init__(self, iter):
        self.iter = iter
        self.buffer = []

    def __iter__(self):
        return self

    def __next__(self):
        # Serve buffered items first, then fall back to the wrapped iterator.
        if self.buffer:
            return self.buffer.pop(0)
        else:
            return next(self.iter)

    next = __next__  # Python 2 spelling of the same method

    def lookahead(self, n):
        """Return an item n entries ahead in the iteration."""
        while n >= len(self.buffer):
            try:
                self.buffer.append(next(self.iter))
            except StopIteration:
                return None
        return self.buffer[n]
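As a usage sketch of the same buffering idea, the variant below swaps the list for a collections.deque, so that removing the oldest buffered item does not shift the rest of the buffer. The class name BufferedLookahead and the default parameter are assumptions made for this illustration and are not part of the answer above.

```python
from collections import deque

class BufferedLookahead:
    """Hypothetical name: same buffering idea, backed by a deque."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = deque()

    def __iter__(self):
        return self

    def __next__(self):
        if self._buffer:
            return self._buffer.popleft()  # constant time, unlike list.pop(0)
        return next(self._it)

    def lookahead(self, n=0, default=None):
        """Return the item n entries ahead, or default past the end."""
        while n >= len(self._buffer):
            try:
                self._buffer.append(next(self._it))
            except StopIteration:
                return default
        return self._buffer[n]

tokens = BufferedLookahead('a(b)')
first_peek = tokens.lookahead(0)   # peek at the next item without consuming it
second_peek = tokens.lookahead(1)  # peek one item further
consumed = next(tokens)            # now actually consume the first item
past_end = tokens.lookahead(5)     # nothing that far ahead, so the default
remaining = list(tokens)           # buffered items are still delivered in order
print(first_peek, second_peek, consumed, past_end, remaining)
```

Peeking never loses items: everything pulled into the buffer is still returned by later calls to next.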

Solution 3:

It's not pretty, but this may do what you want:

def paired_iter(it):
    it = iter(it)
    try:
        token = next(it)
    except StopIteration:
        return  # empty input: yield no pairs at all
    for lookahead in it:
        yield (token, lookahead)
        token = lookahead
    yield (token, None)

def scan(s):
    for c in s:
        yield c

for this_token, next_token in paired_iter(scan("ABCDEF")):
    print("this:%s next:%s" % (this_token, next_token))

Prints:

this:A next:B
this:B next:C
this:C next:D
this:D next:E
this:E next:F
this:F next:None
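As a sketch of how such pairs might be used with tuples of the form (token type, token value), the example below flags every name that is immediately followed by an opening parenthesis. The token type names 'NAME', 'LPAREN' and 'RPAREN' and the toy "call" rule are assumptions for illustration; the question does not show the scanner's real token types.

```python
def paired_iter(it):
    """Yield (item, next_item) pairs; next_item is None after the last item."""
    it = iter(it)
    try:
        token = next(it)
    except StopIteration:
        return  # empty input: no pairs
    for lookahead in it:
        yield token, lookahead
        token = lookahead
    yield token, None

# Hypothetical token stream for the input 'a(b)'.
tokens = [('NAME', 'a'), ('LPAREN', '('), ('NAME', 'b'), ('RPAREN', ')')]

calls = []
for (kind, value), following in paired_iter(tokens):
    # In this toy grammar, a NAME directly followed by '(' counts as a call.
    if kind == 'NAME' and following is not None and following[0] == 'LPAREN':
        calls.append(value)

print(calls)
```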

Solution 4:

Here is an example that allows a single item to be sent back to the generator:

def gen():
    for i in range(100):
        v = yield i           # when you call next(), v will be set to None
        if v is not None:     # a value arrived through send()
            yield None        # this yields None to the send() call
            v = yield v       # so this yield is for the first next() after send()

g = gen()

x = next(g)
print(0, x)

x = next(g)
print(1, x)

x = next(g)
print(2, x)  # oops push it back

x = g.send(x)

x = next(g)
print(3, x)  # x should be 2 again

x = next(g)
print(4, x)
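The same send-based trick can be separated from the counting loop. The sketch below is one possible generalization, not part of the original answer: with_pushback is a hypothetical helper that wraps any iterable, and it assumes None is never a real item, since None is what a plain next() delivers to the generator.

```python
def with_pushback(iterable):
    """Hypothetical helper: iterate, accepting pushed-back items via send()."""
    for item in iterable:
        pushed = yield item
        while pushed is not None:
            yield None             # this value is what send() returns
            pushed = yield pushed  # the following next() sees the item again

g = with_pushback('abc')
seen = [next(g), next(g)]  # consume 'a' and 'b'
reply = g.send('b')        # push 'b' back; send() itself returns None
seen.append(next(g))       # 'b' is delivered a second time
seen.append(next(g))       # iteration then continues with 'c'
print(seen, reply)
```

Using a while loop rather than a single if lets the caller push the same item back more than once in a row.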

Solution 5:

Construct a simple lookahead wrapper using itertools.tee:

from itertools import tee, islice

class LookAhead:
    'Wrap an iterator with lookahead indexing'
    def __init__(self, iterator):
        self.t = tee(iterator, 1)[0]
    def __iter__(self):
        return self
    def next(self):
        return next(self.t)
    def __getitem__(self, i):
        for value in islice(self.t.__copy__(), i, None):
            return value
        raise IndexError(i)

Use the class to wrap an existing iterable or iterator. You can then either iterate normally using next or look ahead with indexed lookups, where it[0] is the item that the next call to next would return.

>>> it = LookAhead([10, 20, 30, 40, 50])
>>> next(it)
10
>>> it[0]
20
>>> next(it)
20
>>> it[0]
30
>>> list(it)
[30, 40, 50]

To run this code under Python 3, simply change the next method to __next__.
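For reference, this is a sketch of the class with that one rename applied, followed by a check of what happens when an index points past the end of the data. It keeps the answer's reliance on the tee object's __copy__ method; whether that method is available is an assumption about the Python version in use.

```python
from itertools import tee, islice

class LookAhead:
    'Wrap an iterator with lookahead indexing'
    def __init__(self, iterator):
        self.t = tee(iterator, 1)[0]
    def __iter__(self):
        return self
    def __next__(self):  # renamed from next for Python 3
        return next(self.t)
    def __getitem__(self, i):
        # Index into an independent copy so the main iterator does not move.
        for value in islice(self.t.__copy__(), i, None):
            return value
        raise IndexError(i)

it = LookAhead([10, 20, 30, 40, 50])
first = next(it)      # consumes 10
peeked = it[2]        # two items past the upcoming one; nothing is consumed
rest = list(it)       # the remaining items are all still there
try:
    it[0]             # nothing left to look at
    exhausted = False
except IndexError:
    exhausted = True
print(first, peeked, rest, exhausted)
```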
