Iterators and generators in Python

Iterators

An iterator is an object that lets you iterate over a container. Python's iterator protocol consists of two methods: __iter__ and __next__. A container needs __iter__ to support iteration; it returns an iterator over the container's items. An iterator must also define __next__, which returns the next item in the container, and its own __iter__ returns the iterator object itself.

You have probably asked yourself: “What is the difference between an iterable and an iterator?”

  • Iterable – an object that has the __iter__ method defined.
  • Iterator – an object that has both __iter__ and __next__ defined, where __iter__ returns the iterator object itself and __next__ returns the next element in the iteration.

As with most magic methods (the methods with double-underscores), you should not call __iter__ or __next__ directly. Instead you can use a for loop or list comprehension and Python will call the methods for you automatically. There are cases when you may need to call them, but you can do so with Python’s built-ins: iter and next.

Python 3 has several sequence types, such as list, tuple, string, and range. These types are iterable, but they are not iterators because they do not define the __next__ method.
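The distinction can be checked at runtime. The sketch below is one way to do it, assuming the abstract base classes Iterable and Iterator from the standard collections.abc module; the helper name describe is made up for this illustration.

```python
from collections.abc import Iterable, Iterator


def describe(obj):
    """Return (is_iterable, is_iterator) for obj. Hypothetical helper."""
    return isinstance(obj, Iterable), isinstance(obj, Iterator)


# The built-in sequence types are iterable but are not iterators.
samples = [[1, 2, 3], (1, 2, 3), "abc", range(3)]
results = [describe(sample) for sample in samples]
print(results)

# Calling iter() on one of them produces an object that is both.
from_list = describe(iter([1, 2, 3]))
print(from_list)
```

Each sequence reports iterable but not iterator, while the object returned by iter() reports both.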

Example 1: iter() and next()

>>> myList = [1, 2, 3]
>>> next(myList)
Traceback (most recent call last):
  Python Shell, prompt 2, line 1
builtins.TypeError: 'list' object is not an iterator
>>> myIter = iter(myList)
>>> next(myIter)
1
>>> next(myIter)
2
>>> next(myIter)
3
>>> next(myIter)
Traceback (most recent call last):
  ...
StopIteration

To get an iterator from the list, pass it to Python’s built-in iter function. You can then call next on the iterator until it runs out of items and StopIteration is raised.
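Two consequences are worth seeing in code. This is a small sketch using only the built-ins iter and next, with illustrative variable names: each call to iter() on a list gives a fresh, independent iterator, while calling iter() on an iterator hands back the same object.

```python
my_list = [1, 2, 3]

first = iter(my_list)
second = iter(my_list)

# Advancing one iterator does not affect the other.
a = next(first)    # first item from `first`
b = next(first)    # second item from `first`
c = next(second)   # `second` still starts at the beginning

# An iterator's __iter__ returns the iterator itself.
same_object = iter(first) is first

print(a, b, c, same_object)
```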

How does the for loop actually work?

A more elegant way to iterate automatically is to use a for loop. With it, we can iterate over any object that can return an iterator, for example a list, a string, or a file.

Example 2:

>>> myList = [1,2,3,4,5]
>>> for i in myList:
...     print(i)
...
1
2
3
4
5

In fact the for loop can iterate over any iterable. Let’s take a closer look at how the for loop is actually implemented in Python.

for element in iterable:
    # do something with element

This loop is roughly equivalent to:

# create an iterator object from that iterable
iter_obj = iter(iterable)

# infinite loop
while True:
    try:
        # get the next item
        element = next(iter_obj)
        # do something with element
    except StopIteration:
        # if StopIteration is raised, break from loop
        break

So internally, the for loop creates an iterator object, iter_obj, by calling iter() on the iterable. Inside the loop, it calls next() to get the next element and executes the body of the for loop with this value. Once all the items are exhausted, StopIteration is raised, caught internally, and the loop ends. Any other kind of exception passes through.
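To make the expansion concrete, here is a small runnable version of it. The function name manual_for and its body callback are assumptions made for this sketch; they are not part of Python.

```python
def manual_for(iterable, body):
    """Call body(element) for each element, mimicking a for loop."""
    iter_obj = iter(iterable)          # create the iterator once
    while True:
        try:
            element = next(iter_obj)   # fetch the next item
        except StopIteration:
            break                      # iterator exhausted: leave the loop
        body(element)                  # run the loop body


collected = []
manual_for("abc", collected.append)
print(collected)


# Exceptions other than StopIteration are not swallowed.
def failing_body(element):
    raise ValueError(element)


try:
    manual_for([1, 2, 3], failing_body)
    propagated = False
except ValueError:
    propagated = True
print(propagated)
```

The body is called outside the try block in this sketch so that only a StopIteration coming from next() ends the loop.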

Creating your own iterators

Building an iterator from scratch is easy in Python. We just have to implement the methods __iter__() and __next__().

The __iter__() method returns the iterator object itself. If required, some initialization can be performed.

The __next__() method must return the next item in the sequence. On reaching the end, and in subsequent calls, it must raise StopIteration.

Example 3: PowerOfTwo

class PowerOfTwo:
    def __init__(self, x):
        self.max = x
    def __iter__(self):
        self.n = 0
        return self
    def __next__(self):
        if self.n <= self.max:
            self.n += 1
            return 2**(self.n-1)
        else:
            raise StopIteration

Let's run the code:

>>> i = iter(PowerOfTwo(4))
>>> next(i)
1
>>> next(i)
2
>>> next(i)
4
>>> next(i)
8
>>> next(i)
16
>>> next(i)
Traceback (most recent call last):
  ...
StopIteration
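As a quick check of the protocol, the sketch below repeats the PowerOfTwo class from Example 3 and consumes it with list() instead of manual next() calls. It also confirms that an exhausted iterator keeps raising StopIteration on later calls.

```python
class PowerOfTwo:
    def __init__(self, x):
        self.max = x

    def __iter__(self):
        self.n = 0
        return self

    def __next__(self):
        if self.n <= self.max:
            self.n += 1
            return 2 ** (self.n - 1)
        raise StopIteration


# list() drives the iterator until StopIteration is raised.
powers = list(PowerOfTwo(4))
print(powers)

it = iter(PowerOfTwo(1))
consumed = [next(it), next(it)]   # 2**0 and 2**1

# Every further call must raise StopIteration again.
stop_count = 0
for _ in range(3):
    try:
        next(it)
    except StopIteration:
        stop_count += 1
print(consumed, stop_count)
```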

Infinite iterators

An iterator does not have to run out of items. There can be infinite iterators, which never end. We must be careful when handling such iterators.

The built-in function iter() can also be called with two arguments: the first must be a callable object (such as a function) and the second is the sentinel. The resulting iterator calls the callable on every step until the returned value equals the sentinel. If the callable never returns the sentinel, the iterator is infinite.
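A minimal sketch of the two-argument form follows. The io.StringIO stream and its contents are made up for the example; readline returns an empty string at the end of the input, which serves as the sentinel. The second part uses a sentinel that is never returned, which gives an infinite iterator, bounded here with itertools.islice.

```python
import io
from itertools import islice

# readline() is called repeatedly until it returns the sentinel "".
stream = io.StringIO("alpha\nbeta\ngamma\n")
lines = [line.rstrip("\n") for line in iter(stream.readline, "")]
print(lines)

# int() always returns 0, which never equals the sentinel 1,
# so this iterator never ends on its own.
zeros = iter(int, 1)
first_three = list(islice(zeros, 3))
print(first_three)
```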

Example 4: infinite iterator

class InfIter:
    def __iter__(self):
        self.num = 1
        return self
    def __next__(self):
        self.num += 2
        return self.num-2

Let's try running this code:

>>> for i in iter(InfIter()):
...     print(i)
...

We have just created an endless loop with that iterator, so pay attention to the exit condition whenever you use an infinite iterator.

One more note: because InfIter.__iter__() resets self.num to 1, the sequence starts over every time the object is used in a new for loop.
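One way to give such a loop an exit condition is sketched below, repeating the InfIter class from Example 4. It shows an explicit break, then itertools.islice from the standard library as an alternative. The limit values are arbitrary.

```python
from itertools import islice


class InfIter:
    """Infinite iterator over the odd numbers 1, 3, 5, ..."""

    def __iter__(self):
        self.num = 1
        return self

    def __next__(self):
        self.num += 2
        return self.num - 2


odd = InfIter()

# Option 1: break out of the loop explicitly.
below_ten = []
for i in odd:
    if i > 10:
        break
    below_ten.append(i)
print(below_ten)

# Option 2: let islice stop after a fixed number of items.
# islice calls iter(odd), which runs __iter__ and resets num to 1,
# so the sequence starts over from the beginning.
first_five = list(islice(odd, 5))
print(first_five)
```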

Generators

Generators simplify creation of iterators. A generator is a function that produces a sequence of results instead of a single value.

Example 5:

def evenGenerator(n):
    number = 0
    while number*2 < n:
        yield number*2
        number += 1

Result:

>>> a = evenGenerator(10)
>>> next(a)
0
>>> next(a)
2
>>> next(a)
4

When the generator function evenGenerator() is called, it returns a generator object without executing any of the function body. When next() is called for the first time, the function runs until it reaches the yield statement, and the yielded value is returned by that next() call.
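The deferred start can be observed directly. In this sketch the generator from Example 5 is renamed even_generator and records a marker in a list called events when its body begins to run; both names are assumptions made for the illustration.

```python
events = []


def even_generator(n):
    events.append("body started")   # runs only on the first next()
    number = 0
    while number * 2 < n:
        yield number * 2
        number += 1


gen = even_generator(10)
started_after_call = list(events)    # snapshot: nothing has run yet

first = next(gen)
started_after_next = list(events)    # the body ran up to the first yield

rest = list(gen)                     # drain the remaining values
print(started_after_call, first, started_after_next, rest)
```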

To better understand how a generator works, let's look at the next example.

Example 6:

>>> def silly_generator():
...     yield "Python"
...     yield "Rocks"
...     yield "So do you!"
...
>>> gen = silly_generator()
>>> next(gen)
'Python'
>>> next(gen)
'Rocks'
>>> next(gen)
'So do you!'
>>> next(gen)
Traceback (most recent call last):
  ...
StopIteration

When next() is called for the first time, silly_generator() starts executing and runs until it reaches the first yield statement. At that point the function pauses, saves its state, and returns the string “Python”. When next() is called again, silly_generator() resumes and runs until the next yield statement, where it again saves its state and returns the string “Rocks”. This continues until the function body ends and StopIteration is raised.
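The pause-and-resume behaviour can be traced with a shared log. This is a sketch only: the generator traced_generator and the log list are hypothetical names, and the log entries are written by both the generator and the caller so that the interleaving is visible.

```python
def traced_generator(log):
    log.append("generator: start")
    yield "Python"
    log.append("generator: resumed after first yield")
    yield "Rocks"
    log.append("generator: resumed after second yield")


log = []
gen = traced_generator(log)

values = [next(gen)]
log.append("caller: got Python")

values.append(next(gen))
log.append("caller: got Rocks")

try:
    next(gen)                      # the body finishes, so no value comes back
except StopIteration:
    log.append("caller: StopIteration")

for entry in log:
    print(entry)
```

The log alternates between generator and caller entries, which shows that the function body runs in slices, one slice per next() call.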

Generator expressions

A generator expression produces the same results as a list comprehension, but to save memory it does not build a physical list holding all the data. Instead, it creates a generator object that returns the results one at a time during iteration.

A generator expression has the same syntax as a list comprehension, but uses (…) instead of […].

Example 7: Generator expression

>>> square = (x**2 for x in range(5))
>>> square
<generator object <genexpr> at 0x01F9D120>
>>> next(square)
0
>>> next(square)
1
>>> next(square)
4
>>> next(square)
9
>>> next(square)
16
>>> next(square)
Traceback (most recent call last):
  ...
StopIteration
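The memory saving can be sketched with sys.getsizeof from the standard library. Exact byte counts vary between Python versions and platforms, so none are quoted here; the element count of 100000 is arbitrary.

```python
import sys

n = 100_000

squares_list = [x ** 2 for x in range(n)]   # builds all n values up front
squares_gen = (x ** 2 for x in range(n))    # builds nothing yet

list_size = sys.getsizeof(squares_list)     # size of the list object itself
gen_size = sys.getsizeof(squares_gen)       # size of the generator object
print(gen_size < list_size)

# Both produce the same values when consumed.
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)

# A generator can be consumed only once.
leftover = list(squares_gen)
print(leftover)
```

The trade-off is that the generator can be consumed only once, while the list can be traversed repeatedly.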
