Course Outline (Part 23)

Generators provide a simpler, cleaner way to create iterators. A generator function returns a generator object, which is an iterator that produces its values one at a time as we iterate over it.


1. The yield keyword

A generator function is defined like a normal function, but whenever it needs to generate a value, it does so with the yield keyword rather than return.

If the body of a def contains yield, the function automatically becomes a generator function.

def my_generator():
    yield 1
    yield 2
    yield 3

gen = my_generator()

print(next(gen)) # 1
print(next(gen)) # 2
print(next(gen)) # 3
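
Once the last yield has been consumed, the generator is exhausted, and asking for another value raises StopIteration. The sketch below repeats my_generator from above; the variable names values and exhausted are only for illustration.

```python
def my_generator():
    yield 1
    yield 2
    yield 3

gen = my_generator()

# Consume the three values the generator can produce
values = [next(gen), next(gen), next(gen)]

# A fourth request has nothing left to yield
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(values)     # [1, 2, 3]
print(exhausted)  # True
```

A for loop catches StopIteration for you, which is why loops over generators simply end.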

Difference between return and yield

  • return: Terminates the function entirely and returns a value. The function’s local state is destroyed.
  • yield: Pauses the function, preserving its local state, and resumes from that point the next time a value is requested.
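
To make the contrast concrete, here is a minimal sketch with two hypothetical functions, running_total and first_total, that differ only in the keyword used inside the loop.

```python
def running_total(amounts):
    total = 0
    for amount in amounts:
        total += amount
        yield total  # pause here; total is kept until the next request

def first_total(amounts):
    total = 0
    for amount in amounts:
        total += amount
        return total  # exits during the first iteration; nothing is resumed

yielded = list(running_total([10, 20, 30]))
returned = first_total([10, 20, 30])

print(yielded)   # [10, 30, 60]
print(returned)  # 10
```

The yield version keeps total alive between values, while the return version ends after the first pass through the loop.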

2. Generator with a Loop

Generators are incredibly powerful when combined with loops. You don’t have to write custom Iterator classes with __iter__ and __next__. The generator handles it automatically.

def countdown(num):
    print("Starting countdown!")
    while num > 0:
        yield num
        num -= 1

# Using the generator
for i in countdown(5):
    print(i)

Output:

Starting countdown!
5
4
3
2
1
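
For comparison, a hand-written iterator class that does roughly the same job as countdown might look like the sketch below. The class name CountdownIterator is hypothetical, and the starting message is left out to keep it short.

```python
class CountdownIterator:
    """Class-based counterpart of the countdown generator (illustrative only)."""

    def __init__(self, num):
        self.num = num

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.num <= 0:
            # Signal that there are no more values
            raise StopIteration
        value = self.num
        self.num -= 1
        return value

class_values = list(CountdownIterator(5))
print(class_values)  # [5, 4, 3, 2, 1]
```

The generator version needs none of this bookkeeping: the loop variable and the position in the function are saved automatically at each yield.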

3. Generator Expressions

Just like list comprehensions, generators can be written in a single line. The syntax is identical to list comprehensions, but instead of square brackets [], we use parentheses ().

# List comprehension (creates the whole list in memory)
squared_list = [x**2 for x in range(5)]
print(squared_list) # [0, 1, 4, 9, 16]

# Generator expression (creates an iterator object)
squared_gen = (x**2 for x in range(5))
print(squared_gen) # <generator object <genexpr> at ...>

# We have to iterate to see values
for num in squared_gen:
    print(num)
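
Two practical details are worth sketching here, under the assumption that you want to reuse or aggregate the values: a generator expression can be consumed only once, and it can be passed directly to a function such as sum() without extra parentheses. The variable names below are illustrative.

```python
squared_gen = (x**2 for x in range(5))

first_pass = list(squared_gen)   # consumes every value
second_pass = list(squared_gen)  # the generator is already exhausted

print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []

# Passed as the only argument, the expression needs no second pair of parentheses
total = sum(x**2 for x in range(5))
print(total)  # 30
```

If you need the values more than once, create a new generator or store them in a list.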

4. Memory Efficiency of Generators

The primary benefit of generators is their memory efficiency.

A standard list stores every item in memory at once. A list of 10 million integers, for example, needs roughly 80 MB on 64-bit CPython for the list's element pointers alone, before counting the integer objects themselves.

A generator, however, only calculates the next item in the sequence when you ask for it. It never stores the entire sequence in memory.

import sys

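# A list of 10,000 integers
# (getsizeof counts only the list object itself, not the integers it references)
my_list = [i for i in range(10000)]
print(sys.getsizeof(my_list)) # e.g., 87616 bytes (exact value varies by Python version)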

# A generator of 10,000 integers
my_gen = (i for i in range(10000))
print(sys.getsizeof(my_gen))  # e.g., 112 bytes

Generators are the preferred way to work with massive data sets, read large files, or process infinite streams of data.
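
As a minimal sketch of the infinite-stream case, the hypothetical generator naturals below never ends on its own, so the consumer decides how many values to take. itertools.islice from the standard library is used to stop after five.

```python
from itertools import islice

def naturals(start=1):
    """Yield start, start + 1, start + 2, ... without ever finishing."""
    n = start
    while True:
        yield n
        n += 1

# Filter the infinite stream lazily, then take only the first five results
evens = (n for n in naturals() if n % 2 == 0)
first_five_even = list(islice(evens, 5))

print(first_five_even)  # [2, 4, 6, 8, 10]
```

Building the same sequence as a list would never finish, because the list would have to hold every value before the first one could be used.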
