Generators are defined by a function that generates values on the fly and can be iterated in the same way that we iterate strings, lists, and tuples in Python using the "for" loop.
When the body of a function contains the yield keyword, it is said to be a generator function. Such a function may still contain return, but the presence of yield is what makes it a generator function.
In this article, we’ll look at:
- What are generators and generator functions?
- Why do we need them?
- What does the yield statement do?
- Generator expression
Generator
PEP 255 introduced the generator concept, as well as the yield statement, which is used in the generator function.
When called, the generator function returns a generator object or generator-iterator object, which we can loop over in the same way that we do with lists.
    # Generator function to generate odd numbers
    def gen_odd(num):
        n = 0
        while n <= num:
            if n % 2 == 1:
                yield n
            n += 1

    odd_num = gen_odd(10)
    for i in odd_num:
        print(i)
The above code defines the gen_odd generator function, which accepts an arbitrary number and returns a generator object that can be iterated using either the "for" loop or the built-in next() function.
    1
    3
    5
    7
    9
By iterating over the generator-iterator object, we obtained the odd numbers between 0 and 10.
Why Generators?
We now have a general understanding of generators, but why do we use them? The great thing about generator functions is that they return iterators, and iterators use a strategy known as lazy evaluation, which means that they return values only when requested.
Consider a scenario in which we need to compute very large amounts of data. In that case, we can use generators to help us because generators compute the data on demand, eliminating the need to save data in memory.
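As a rough illustration of computing data on demand, here is a minimal sketch. The squares function and the limit of one million are hypothetical choices made for this example. Each value is produced only when sum() asks for it, so the full sequence is never stored in memory at once.

```python
def squares(limit):
    """Yield the squares of 0 .. limit - 1, one at a time."""
    for n in range(limit):
        yield n * n

# sum() pulls one value at a time from the generator,
# so no list of one million squares is ever built.
total = sum(squares(1_000_000))
print(total)  # 333332833333500000
```

Building the same result with a list would first allocate all one million squares and only then add them up.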
yield – What Does It Do?
The yield statement is what gives generators their allure, but what does it do within the function body? Let’s look at an example to see how the process works.
    def gen_seq(num):
        initial_val = 1
        while initial_val <= num:
            yield initial_val
            initial_val += 1

    sequence = gen_seq(3)
The generator function gen_seq generates a sequence of numbers from 1 up to the specified num argument. Calling the generator function does not run its body; it returns a generator object.
    print(sequence)
    ----------
    <generator object gen_seq at 0x000001FFBDB55770>
We can now use the generator object’s __next__() method. To get the values, use sequence.__next__() or, more idiomatically, the built-in next(sequence).
    print(sequence.__next__())
    print(sequence.__next__())
    print(sequence.__next__())
    ----------
    1
    2
    3
When we call the generator object’s __next__() method, the code inside the function executes until it reaches the yield statement.
What happens when the function code encounters the yield statement? The yield statement operates differently than the return statement.
The yield statement returns the value to the caller of __next__() and, instead of exiting the function, suspends it and retains its state. When we call the __next__() method again, execution resumes where it left off.
Check the code below to gain a better understanding.
    def gen_seq():
        print("Start")
        val = 0
        print("Level 1")
        while True:
            print("Level 2")
            yield val
            print("Level 3")
            val += 1

    sequence = gen_seq()
    print(sequence.__next__())
    print(sequence.__next__())
    print(sequence.__next__())
A print call is placed at every level in the above code to show whether execution continues from the yield statement where it left off.
    Start
    Level 1
    Level 2
    0
    Level 3
    Level 2
    1
    Level 3
    Level 2
    2
The code starts at the top and passes through Level 1 and Level 2 before returning the yielded value to the caller of __next__(). When we call the __next__() method again, execution resumes right after the yield statement: it prints Level 3, increments val by 1, loops back to Level 2, and yields the new value. Start and Level 1 are never printed again, which confirms that the function did not restart.
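The suspended state can also be observed directly. The sketch below is an addition to the article's examples: it uses inspect.getgeneratorstate() from the standard library on a generator shaped like the earlier gen_seq, with the argument 2 chosen arbitrarily.

```python
import inspect

def gen_seq(num):
    initial_val = 1
    while initial_val <= num:
        yield initial_val
        initial_val += 1

sequence = gen_seq(2)
states = [inspect.getgeneratorstate(sequence)]      # body has not started yet
first = next(sequence)
states.append(inspect.getgeneratorstate(sequence))  # paused at the yield
rest = list(sequence)                                # drain the remaining values
states.append(inspect.getgeneratorstate(sequence))  # function body has finished

print(first, rest)  # 1 [2]
print(states)       # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```

The GEN_SUSPENDED state is the point where the function is paused at yield with its local variables intact.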
Exception
Generators, like all iterators, become exhausted once all of their values have been produced. Consider the generator function gen_odd from earlier.
    # Generator function to generate odd numbers
    def gen_odd(num):
        n = 0
        while n <= num:
            if n % 2 == 1:
                yield n
            n += 1

    odd_num = gen_odd(3)
    print(odd_num.__next__())
    print(odd_num.__next__())
    print(odd_num.__next__())
The above code will generate odd numbers up to 3. As a result, the program will only generate 1 and 3, allowing us to call the __next__() method twice. When we run the above code, we will get the following result.
    1
    3
    Traceback (most recent call last):
      ....
    StopIteration
When we called the __next__() method on odd_num the first two times, we got the yielded values, but when we called it a third time, our code raised a StopIteration exception, which indicates that the iterator is exhausted.
If we had used the "for" loop instead, no exception would have reached our code: the loop catches StopIteration internally and simply ends.
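The following sketch shows three ways to consume a generator without an uncaught StopIteration: a "for" loop, the optional default argument of the built-in next(), and an explicit try/except. The variable names collected, with_default, and message are illustrative only.

```python
def gen_odd(num):
    n = 0
    while n <= num:
        if n % 2 == 1:
            yield n
        n += 1

# 1. A for loop ends quietly once the generator is exhausted.
collected = []
for value in gen_odd(3):
    collected.append(value)
print(collected)  # [1, 3]

# 2. next() accepts a default that is returned instead of raising.
odd_num = gen_odd(3)
with_default = [next(odd_num, None) for _ in range(3)]
print(with_default)  # [1, 3, None]

# 3. The exception can also be caught explicitly.
odd_num = gen_odd(1)
next(odd_num)  # consumes the only value, 1
try:
    next(odd_num)
    message = "got a value"
except StopIteration:
    message = "exhausted"
print(message)  # exhausted
```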
yield In try/finally
Take a look at the example below, in which a generator function contains try/except/finally clauses.
    def func():
        try:
            yield 0
            try:
                yield 1
            except:
                raise ValueError
        except:  # the program never gets to this part
            yield 2
            yield 3
        finally:
            yield 4

    x = func()
    for val in x:
        print(val)
If we run the above code, we’ll get the following output:
    0
    1
    4
We can see that we didn’t get the values 2 and 3. No exception occurs inside either try block, so neither except clause runs and the ValueError is never raised. The finally clause, however, always executes, whether or not an error occurs, so the generator yields 4 before it finishes.
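To see an except clause actually run, an exception has to be raised inside the try block. Here is a minimal sketch using a hypothetical func_with_error, which is not part of the original example and raises a ValueError on purpose after the first yield.

```python
def func_with_error():
    try:
        yield 0
        raise ValueError("raised on purpose")
        yield 1  # never reached
    except ValueError:
        # Runs because the try block raised a ValueError
        yield 2
    finally:
        # Runs whether or not an exception occurred
        yield 3

result = list(func_with_error())
print(result)  # [0, 2, 3]
```

Here the value 1 is skipped, the except clause contributes 2, and the finally clause still contributes 3.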
Generator Expression
You have probably used list comprehensions before. A generator expression uses similar syntax and lets us create a generator in a single line of code. Unlike list comprehensions, generator expressions are enclosed in parentheses ().
    gen_odd_exp = (n for n in range(5) if n % 2 == 1)

    print(gen_odd_exp)
    ----------
    <generator object <genexpr> at 0x000001D635E33C30>
The above generator expression gen_odd_exp is roughly equivalent to the generator function gen_odd which we saw at the beginning. We can iterate over it just as we would over the object returned by a generator function.
    print(next(gen_odd_exp))
    print(next(gen_odd_exp))
    ----------
    1
    3
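As a quick check of that equivalence, the sketch below collects the values from both forms into lists and compares them. The argument 4 for gen_odd is chosen here so that it covers the same numbers as range(5).

```python
def gen_odd(num):
    n = 0
    while n <= num:
        if n % 2 == 1:
            yield n
        n += 1

from_function = list(gen_odd(4))
from_expression = list(n for n in range(5) if n % 2 == 1)

print(from_function)    # [1, 3]
print(from_expression)  # [1, 3]
```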
When we compare the memory requirements of the generator expression and list comprehension, we get the following result.
    import sys

    gen_odd_exp = (n for n in range(10000) if n % 2 == 1)
    print(f'{sys.getsizeof(gen_odd_exp)} bytes')

    list_odd_exp = [n for n in range(10000) if n % 2 == 1]
    print(f'{sys.getsizeof(list_odd_exp)} bytes')
    ----------
    104 bytes
    41880 bytes
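The generator object created by the generator expression took 104 bytes of memory, whereas the list created by the list comprehension took 41880 bytes (almost 41 KB). The exact numbers vary with the Python version and platform, and sys.getsizeof() measures only the list object itself, not the integers it refers to.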
Conclusion
A function with the yield keyword in its body is a generator function. A generator function returns a generator-iterator object that can be iterated over to produce a sequence of values.
It is said that generators are a Pythonic way to create iterators, and iterators use a strategy known as lazy evaluation, which means that they only return values when the caller requests them.
Generators come in handy when we need to compute large amounts of data without storing them in memory.
The quickest way to create a generator is to use a generator expression, sometimes called a generator comprehension, whose syntax is similar to that of a list comprehension.
That’s all for now
Keep Coding✌✌