Generators provide a simpler, cleaner way to create iterators. A generator function returns a generator object: an iterator that produces its values one at a time.
1. The yield keyword
A generator function is defined like a normal function, but whenever it needs to generate a value, it does so with the yield keyword rather than return.
If the body of a def contains yield, the function automatically becomes a generator function.
def my_generator():
    yield 1
    yield 2
    yield 3
gen = my_generator()
print(next(gen)) # 1
print(next(gen)) # 2
print(next(gen)) # 3
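Two details the example above leaves implicit: calling a generator function does not run its body until the first next() call, and once the last value has been yielded, a further next() raises StopIteration. The following is a minimal sketch of both; the names three_values and log are illustrative and not part of the original example.

```python
import inspect

log = []  # records when the function body actually runs (illustrative helper)

def three_values():
    log.append("body started")
    yield 1
    yield 2
    yield 3

# The presence of yield makes this a generator function.
print(inspect.isgeneratorfunction(three_values))  # True

gen = three_values()
print(log)  # [] because calling the function has not run its body yet

values = [next(gen), next(gen), next(gen)]
print(values)  # [1, 2, 3]

# A fourth next() call finds nothing left to yield.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True

# next() accepts a default to return instead of raising.
print(next(gen, "done"))  # done
```

A for loop catches StopIteration for you, which is why looping over a generator simply ends when the values run out.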
Difference between Return and Yield
return: Terminates the function entirely and returns a value. The function’s local state is destroyed.
yield: Pauses the function and saves its local state. The next call resumes execution right after the yield.
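To make the contrast concrete, here is a small sketch that runs the same loop body once with return and once with yield. The function names with_return and with_yield are hypothetical, chosen only for this comparison.

```python
def with_return():
    total = 0
    for n in (10, 20, 30):
        total += n
        return total  # exits on the first pass; total is discarded afterwards

def with_yield():
    total = 0
    for n in (10, 20, 30):
        total += n
        yield total  # pauses here; total and the loop position are kept

# Every call starts from scratch, so the result never gets past 10.
returned = [with_return(), with_return()]
print(returned)  # [10, 10]

# Each next() resumes where the previous one paused.
running = with_yield()
yielded = [next(running), next(running), next(running)]
print(yielded)  # [10, 30, 60]
```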
2. Generator with a Loop
Generators pair naturally with loops. You don’t have to write a custom iterator class with __iter__ and __next__; a generator object implements both methods automatically.
def countdown(num):
    print("Starting countdown!")
    while num > 0:
        yield num
        num -= 1

# Using the generator
for i in countdown(5):
    print(i)
Output:
Starting countdown!
5
4
3
2
1
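For comparison, this is roughly what the countdown would look like as a hand-written iterator class. It is a sketch, not a definitive implementation: the class name Countdown and the function name countdown_gen are hypothetical, and the "Starting countdown!" message is left out to keep the two versions short.

```python
class Countdown:
    """Hand-written iterator that counts down from num to 1."""

    def __init__(self, num):
        self.num = num

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.num <= 0:
            raise StopIteration  # signals that the values have run out
        value = self.num
        self.num -= 1
        return value

def countdown_gen(num):
    """Generator version: the same behavior in four lines."""
    while num > 0:
        yield num
        num -= 1

from_class = list(Countdown(5))
from_generator = list(countdown_gen(5))
print(from_class)      # [5, 4, 3, 2, 1]
print(from_generator)  # [5, 4, 3, 2, 1]
```

The class has to store its state in an attribute and raise StopIteration by hand; the generator keeps num as an ordinary local variable and ends when the function body finishes.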
3. Generator Expressions
Just like list comprehensions, generators can be written in a single line. The syntax is identical to list comprehensions, but instead of square brackets [], we use parentheses ().
# List comprehension (creates the whole list in memory)
squared_list = [x**2 for x in range(5)]
print(squared_list) # [0, 1, 4, 9, 16]
# Generator expression (creates an iterator object)
squared_gen = (x**2 for x in range(5))
print(squared_gen) # <generator object <genexpr> at ...>
# We have to iterate to see values
for num in squared_gen:
    print(num)
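One consequence worth seeing in code: unlike a list, a generator expression can be consumed only once. The sketch below also shows that the parentheses can be dropped when the expression is the sole argument of a function call. The variable names are illustrative.

```python
squares = (x**2 for x in range(5))

first_pass = list(squares)
second_pass = list(squares)  # the generator is already exhausted
print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []

# When a generator expression is the only argument to a function call,
# the extra pair of parentheses can be dropped.
total = sum(x**2 for x in range(5))
print(total)  # 30
```

If the values are needed more than once, either build a list or create a fresh generator expression for each pass.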
4. Memory Efficiency of Generators
The primary benefit of generators is their memory efficiency.
A standard list stores every item in memory at once. If you create a list of 10 million items, all 10 million exist in RAM for as long as the list does, even if you only ever look at one item at a time.
A generator, however, only calculates the next item in the sequence when you ask for it. It never stores the entire sequence in memory.
import sys
# A list of 10,000 integers
my_list = [i for i in range(10000)]
print(sys.getsizeof(my_list)) # e.g., 87616 bytes (the list object only, not its items; exact size varies by Python version)
# A generator of 10,000 integers
my_gen = (i for i in range(10000))
print(sys.getsizeof(my_gen)) # e.g., 112 bytes
Generators are the preferred way to process massive data sets, read large files, or consume infinite streams of data.
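A brief sketch of the last two use cases, under stated assumptions: naturals and non_empty_lines are hypothetical helpers written for this example, and io.StringIO stands in for a real file object so the code runs without any file on disk. itertools.islice is the standard-library tool for taking a bounded slice of an unbounded iterator.

```python
import io
from itertools import islice

def naturals(start=0):
    """Infinite stream of integers (illustrative helper)."""
    n = start
    while True:
        yield n
        n += 1

# islice stops after five items, so the infinite loop is never a problem.
first_five_even = list(islice((n for n in naturals() if n % 2 == 0), 5))
print(first_five_even)  # [0, 2, 4, 6, 8]

def non_empty_lines(handle):
    """Yield stripped, non-blank lines one at a time from a file-like object."""
    for line in handle:
        line = line.strip()
        if line:
            yield line

# io.StringIO stands in for a large file opened with open(...).
fake_file = io.StringIO("alpha\n\nbeta\ngamma\n")
lines = list(non_empty_lines(fake_file))
print(lines)  # ['alpha', 'beta', 'gamma']
```

Because file objects are themselves iterated line by line, non_empty_lines holds only one line in memory at a time, regardless of the file's size.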