Lambdas and comprehensions

Lambdas, comprehensions, and zip() have something in common: they let you express in a single, readable expression ideas that would otherwise take several lines. Used well, they make code shorter and clearer. Used badly, they make it unreadable. This chapter covers when to reach for each one and when to stop.

Lambdas, comprehensions, and zip are three tools that compress common patterns into expressions. They are not required, but they appear throughout Python code and are worth recognising and writing fluently. The guiding principle: use them when they make intent clearer, not just shorter.

Lambda expressions create anonymous function objects at runtime. Comprehensions compile to optimised bytecode that builds a collection without a hand-written loop and append. Generator expressions are lazy: they yield values on demand without materialising the entire sequence. zip returns an iterator of tuples, consuming the input iterables lazily. All of these share the theme of expressing transformations as expressions rather than imperative loops.

Lambda functions

A lambda is a nameless, one-expression function. You create it with the lambda keyword. Its real usefulness is that you can write it inline, right where you need it, without defining a named function first. This is what makes it useful with sorted().

A lambda is an anonymous single-expression function. It can take multiple arguments but its body must be a single expression, not a statement. Its primary use is as an inline key= or callback argument where a full def would add unnecessary indirection. For anything more complex, use def.

lambda args: expression compiles to a code object and creates a function object, identical to def except that it has no name (it appears as <lambda> in tracebacks), cannot contain statements, and has no syntax for docstrings or annotations. Lambdas participate in closures: free variables are captured from the enclosing scope. The common pitfall: lambda: i created in a loop captures the variable i, not its current value, so every such lambda sees the final value of i when it is eventually called; use lambda i=i: i to bind the value at creation time.
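
The late-binding pitfall is easier to see in a small experiment. The following is a minimal sketch; the helper names build_late and build_bound are hypothetical and exist only for this illustration.

```python
def build_late():
    """Return lambdas that all close over the same loop variable i."""
    funcs = []
    for i in range(3):
        funcs.append(lambda: i)        # i is looked up when the lambda is called
    return funcs


def build_bound():
    """Return lambdas that each store their own copy of i as a default argument."""
    funcs = []
    for i in range(3):
        funcs.append(lambda i=i: i)    # i=i is evaluated when the lambda is created
    return funcs


late_results = [f() for f in build_late()]     # [2, 2, 2]
bound_results = [f() for f in build_bound()]   # [0, 1, 2]
```

By the time the first group of lambdas is called, the loop has finished and i is 2, so all three return 2. The default-argument form trades a slightly odd signature for the value you probably intended.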

python
double = lambda x: x * 2
double(5)   # 10

That is equivalent to:

python
def double(x):
    return x * 2

For most cases, use def. Lambdas have one real advantage: you can write them inline, right where you need them, without naming them. This is what makes them useful with sorted(), map(), and filter():

python
players = [("Alice", 87), ("Bob", 74), ("Carol", 92)]

sorted(players, key=lambda p: p[1])              # sort by score (ascending)
sorted(players, key=lambda p: p[1], reverse=True)  # sort by score (descending)

Without a lambda, you would have to define a named function just for the key= argument. The lambda keeps the intent local and visible.

Lambdas can take multiple arguments:

python
add = lambda a, b: a + b
add(3, 4)   # 7

When to use a lambda: only when it is a simple expression used in one place. If it is getting complex, or you need to reuse it, write a proper def. A lambda that spans several operators or requires conditionals is usually a sign to switch to def.
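
As a rough illustration of where that line falls, the sketch below sorts the same data twice: once with a lambda that has grown a conditional, and once with a def. The ranking_key name and the idea of a missing score (None) are assumptions made for this example only.

```python
players = [("Alice", 87), ("Bob", None), ("Carol", 92)]

# Works, but the reader has to decode the tuple and the "or 0" trick.
ranked_lambda = sorted(
    players,
    key=lambda p: (p[1] is None, -(p[1] or 0), p[0]),
)


def ranking_key(player):
    """Scored players first (highest score first), unscored players last."""
    name, score = player
    if score is None:
        return (True, 0, name)
    return (False, -score, name)


ranked_def = sorted(players, key=ranking_key)
# Both give [("Carol", 92), ("Alice", 87), ("Bob", None)]
```

The two versions produce the same order, but the def has a name, a docstring, and room for an explicit if.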

List comprehensions

The most common transformation in Python: take a sequence, do something to each item, get a new list back. A list comprehension does this in one readable line: [expression for item in iterable]. You can also add a filter with if.

List comprehensions are a concise replacement for the build-with-a-loop pattern. They compile to optimised bytecode and are generally faster than equivalent for loops with .append(). The structure is [expression for item in iterable if condition]; the if clause is optional.

List comprehensions compile to a LIST_APPEND loop in dedicated bytecode, faster than repeated list.append() calls in a Python-level loop. They create a new scope in Python 3 (unlike Python 2), so the loop variable does not leak. Nested comprehensions execute from left to right and top to bottom: [expr for x in a for y in b] is equivalent to a nested for loop with x as the outer loop.
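
The scoping difference can be checked directly. This short sketch assumes Python 3 and uses throwaway variable names.

```python
n = "outer"

squares = [n ** 2 for n in range(4)]   # the comprehension has its own n
after_comprehension = n                # still "outer"

for n in range(4):                     # a plain for loop rebinds the outer n
    pass
after_loop = n                         # 3
```

The comprehension leaves the outer n alone, while the ordinary for loop overwrites it.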

The long way:

python
numbers = [1, 2, 3, 4, 5]
squares = []
for n in numbers:
    squares.append(n ** 2)

The list comprehension:

python
squares = [n ** 2 for n in numbers]

The structure is always the same: [expression for item in iterable].

python
scores    = [87, 42, 96, 55, 71]
scaled    = [s * 1.1 for s in scores]       # apply a 10% bonus
as_grades = [f"{s}/100" for s in scores]    # format each one

Filtering with a condition

Add an if clause to include only items that pass a test. The result is a new list with only the items where the condition is True.

The if clause in a comprehension is a filter, not an if/else. It runs once per item and includes only items for which the condition is truthy. For a conditional transform (map one value to another based on a condition), use a ternary expression inside the main expression.

The if filter is distinct from a conditional expression in the output. [x for x in data if x > 0] filters. [x if x > 0 else 0 for x in data] maps (clamps to zero). You can combine both: [x * 2 for x in data if x > 0]. Multiple if clauses chain with implicit and.
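
The four forms described above, side by side on one small list (the data values are made up for illustration):

```python
data = [3, -1, 0, 7, -4]

positives = [x for x in data if x > 0]             # filter only: [3, 7]
clamped = [x if x > 0 else 0 for x in data]        # map only: [3, 0, 0, 7, 0]
doubled = [x * 2 for x in data if x > 0]           # filter, then transform: [6, 14]
small_pos = [x for x in data if x > 0 if x < 5]    # two ifs act like "and": [3]
```

Note where the if sits: after the for clause it drops items, before the for clause (with an else) it changes them.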

python
numbers  = [1, 2, 3, 4, 5, 6, 7, 8]
evens    = [n for n in numbers if n % 2 == 0]    # [2, 4, 6, 8]
odds     = [n for n in numbers if n % 2 != 0]    # [1, 3, 5, 7]

python
scores   = [87, 42, 96, 55, 71, 38]
passing  = [s for s in scores if s >= 60]    # [87, 96, 71]
failing  = [s for s in scores if s < 60]     # [42, 55, 38]

Nested comprehensions

You can nest comprehensions to flatten a list of lists into a single list. Read it left to right: for each row, for each item in that row, include the item.

Nested comprehensions execute from left to right. The first for clause is the outer loop, the second is the inner. They produce a single flat result, not a 2D structure. If the comprehension is hard to read at a glance, write the loops explicitly.

Nested comprehensions execute as nested loops with the first for as the outermost. The scope of each loop variable is available to subsequent clauses. For Cartesian products, itertools.product is often clearer. The key readability rule: if parsing the comprehension takes more than a second, the explicit loop form is better documentation.
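
A small sketch of the equivalences mentioned above, assuming two short example lists: the nested comprehension, the explicit loops, and itertools.product all produce the same pairs in the same order.

```python
from itertools import product

sizes = ["S", "M"]
colours = ["red", "blue"]

nested = [(s, c) for s in sizes for c in colours]

loops = []
for s in sizes:              # first for clause = outer loop
    for c in colours:        # second for clause = inner loop
        loops.append((s, c))

via_product = list(product(sizes, colours))

# A later clause can use a variable bound by an earlier one.
triangle = [(i, j) for i in range(3) for j in range(i)]   # [(1, 0), (2, 0), (2, 1)]
```

product covers the independent case cleanly; the triangle example is one where the inner range depends on the outer variable, so a nested comprehension or explicit loops are still needed.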

python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat   = [item for row in matrix for item in row]
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

Read it left to right: for each row in matrix, for each item in row, include item.

Nested comprehensions can get confusing fast. If it takes more than a moment to parse, write the loops explicitly.

Dict comprehensions

Dict comprehensions build a dictionary in one expression, using the same idea as list comprehensions: {key: value for item in iterable}. Add a filter with if just like with list comprehensions.

Dict comprehensions create a new dict from any iterable producing key-value pairs. The syntax is {key_expr: val_expr for item in iterable if condition}. Duplicate keys from the loop use the last value, silently. .items() on an existing dict is the most common source iterable for dict comprehensions.

Dict comprehensions compile to dedicated MAP_ADD bytecode, analogous to LIST_APPEND for list comprehensions. They create a new scope in Python 3. Key expressions must produce hashable values; if a key expression produces a duplicate, the later value silently wins. For ordered merge semantics, the | operator (Python 3.9+) is cleaner than a comprehension.
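
The last-value-wins behaviour and the | merge can be seen in a few lines. This is a sketch with invented data, and the | operator requires Python 3.9 or later.

```python
pairs = [("a", 1), ("b", 2), ("a", 3)]
last_wins = {k: v for k, v in pairs}    # {"a": 3, "b": 2}; the first "a" is overwritten

defaults = {"theme": "light", "lang": "en"}
overrides = {"theme": "dark"}

merged = defaults | overrides           # right-hand side wins on shared keys
merged_comp = {k: v for d in (defaults, overrides) for k, v in d.items()}
# Both give {"theme": "dark", "lang": "en"}
```

No error or warning is raised for the repeated key, so if duplicates would indicate a data problem, check for them before building the dict.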

python
names  = ["alice", "bob", "carol"]
scores = [87, 74, 92]

score_map = {name: score for name, score in zip(names, scores)}
# {"alice": 87, "bob": 74, "carol": 92}

With a filter:

python
passing = {name: score for name, score in score_map.items() if score >= 80}
# {"alice": 87, "carol": 92}

python
words     = ["apple", "banana", "cherry"]
word_lens = {word: len(word) for word in words}
# {"apple": 5, "banana": 6, "cherry": 6}

Set comprehensions

Set comprehensions build a set in one expression, with curly braces and no colon. Because the result is a set, duplicates are automatically removed.

Set comprehensions use {expression for item in iterable} and produce a set. They deduplicate automatically. Use them when you need a unique collection built from a transformation, where order does not matter.

Set comprehensions compile to SET_ADD bytecode. The result is an unordered set: duplicate values from the expression are silently merged. Set comprehensions are less common than list or dict comprehensions but are the clean way to produce a deduplicated transformation in one expression.
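
A short sketch of a deduplicating transformation, using made-up email addresses. It also shows one easy mistake: empty braces create a dict, not a set.

```python
emails = ["Ann@Example.com", "bob@example.com", "cy@Test.org"]

# Transform (take the domain, lower-case it) and deduplicate in one expression.
domains = {e.split("@")[1].lower() for e in emails}   # {"example.com", "test.org"}

# Sets have no defined order; sort when you need a stable sequence.
ordered_domains = sorted(domains)                     # ["example.com", "test.org"]

empty_braces = {}       # this is an empty dict
empty_set = set()       # this is an empty set
```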

python
words   = ["apple", "Banana", "cherry", "Apple"]
unique  = {w.lower() for w in words}    # {"apple", "banana", "cherry"}

Use set comprehensions when you want unique values and do not care about order.

Generator expressions

Generators look like list comprehensions with parentheses instead of brackets. The key difference: a list comprehension builds the entire list in memory at once. A generator produces values one at a time, only when needed. For large sequences, this uses far less memory.

A generator expression produces an iterator, not a collection. It computes values lazily: the next value is only produced when requested. This is most valuable when the result is consumed immediately by a function like sum(), max(), or any(), so there is no point building the full list first.

Generator expressions compile to a code object and return a generator object. Values are produced lazily via __next__, so the generator itself needs only constant memory regardless of input size. They participate in the iterator protocol and can be chained. When a generator expression is the sole argument to a function call, the outer parentheses can be omitted. Generators cannot be reused after exhaustion; if you need to iterate multiple times, materialise to a list.
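
The sketch below illustrates exhaustion, chaining, and the size difference. The sys.getsizeof comparison measures only the container objects themselves, so treat it as a rough indication rather than a full memory profile.

```python
import sys

squares = (n * n for n in range(5))
first = next(squares)        # 0
rest = list(squares)         # [1, 4, 9, 16]
again = list(squares)        # []  (the generator is exhausted)

# Chaining: the second generator pulls values from the first on demand.
evens = (n for n in range(10) if n % 2 == 0)
total = sum(x * x for x in evens)    # 0 + 4 + 16 + 36 + 64 = 120

list_bytes = sys.getsizeof([n for n in range(100_000)])
gen_bytes = sys.getsizeof(n for n in range(100_000))
# list_bytes is far larger than gen_bytes
```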

python
squares_gen = (n ** 2 for n in range(1000000))

python
total = sum(n ** 2 for n in range(1000000))   # sum() consumes the generator

When passing a generator directly to a function like sum(), max(), min(), or any(), you can drop the extra parentheses:

python
total = sum(n ** 2 for n in range(1000))   # one set of parens, not two

For most everyday code, list comprehensions are fine. Use generators when you are processing large datasets or streaming data where holding everything in memory would be wasteful.

zip()

zip() pairs items from two or more sequences together so you can loop through them in parallel. It stops at the shortest sequence. It is the clean way to avoid managing indexes when two lists correspond to each other.

zip() returns a lazy iterator of tuples, consuming its input iterables in step. It stops at the shortest input: longer sequences are silently truncated. For sequences that may differ in length, itertools.zip_longest() fills shorter ones with a specified value.

zip() returns a zip object, a lazy iterator that calls next() on each input iterator simultaneously. It stops when any iterator raises StopIteration. All inputs are consumed lazily: zip() itself allocates O(1) memory regardless of input size. zip(*iterable) is the standard transpose operation; * unpacks the outer iterable into separate arguments.

python
names  = ["Alice", "Bob", "Carol"]
scores = [87, 74, 92]

for name, score in zip(names, scores):
    print(f"{name}: {score}")
# Alice: 87
# Bob: 74
# Carol: 92

zip() stops at the shortest sequence. If your sequences might be different lengths, use itertools.zip_longest() with a fill value.

To convert a list of pairs back into two separate sequences, use zip(*pairs). Note that the results are tuples, not lists:

python
pairs  = [("Alice", 87), ("Bob", 74), ("Carol", 92)]
names, scores = zip(*pairs)
# names = ("Alice", "Bob", "Carol")
# scores = (87, 74, 92)

What does * do here?

*pairs unpacks the list into separate arguments: zip(*pairs) becomes zip(("Alice", 87), ("Bob", 74), ("Carol", 92)). The * operator is covered in the Functions chapter.

zip() is also the clean way to iterate multiple sequences in parallel without managing indexes manually:

python
before = [10, 20, 30]
after  = [15, 18, 35]

for b, a in zip(before, after):
    change = a - b
    print(f"{b} -> {a} ({'+' if change >= 0 else ''}{change})")

map() and filter()

map() and filter() are older functional-style tools that do what comprehensions do. You will see them in older code, so it is worth knowing what they mean. Prefer comprehensions for new code; they are more readable to most Python developers.

map(func, iterable) returns a lazy iterator that applies func to each item. filter(func, iterable) returns a lazy iterator of items for which func is truthy. Both pre-date comprehensions. Prefer comprehensions in new code; use map() when you already have a named function that does what you need.

map() and filter() return lazy iterators (not lists) in Python 3. map(f, it) is equivalent to (f(x) for x in it). filter(pred, it) is equivalent to (x for x in it if pred(x)). For named functions, list(map(int, strings)) is idiomatic because it reads as "map int over strings"; the equivalent comprehension [int(s) for s in strings] is equally valid.
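
The equivalences above can be checked directly. In this sketch is_even is a hypothetical helper defined only for the example.

```python
def is_even(n):
    """Return True for even integers."""
    return n % 2 == 0


strings = ["1", "2", "3", "4"]

via_map = list(map(int, strings))                    # [1, 2, 3, 4]
via_genexp = list(int(s) for s in strings)           # same result

via_filter = list(filter(is_even, via_map))          # [2, 4]
via_comp = [n for n in via_map if is_even(n)]        # same result

# map() is lazy: nothing is converted until a value is requested.
lazy = map(int, strings)
first = next(lazy)               # 1
remaining = list(lazy)           # [2, 3, 4]
```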

python
numbers = [1, 2, 3, 4, 5]

list(map(lambda x: x ** 2, numbers))         # [1, 4, 9, 16, 25]
list(filter(lambda x: x % 2 == 0, numbers))  # [2, 4]

With a lambda, the comprehension form is usually easier to read. map() reads well when you have a named function that already exists:

python
strings = ["1", "2", "3"]
numbers = list(map(int, strings))   # [1, 2, 3] (cleaner than a comprehension here)

In practice

Filter a player list to passing scores, rank by score with sorted and a lambda, then print with enumerated positions:

python
players = [
    {"name": "Alice", "score": 87},
    {"name": "Bob",   "score": 42},
    {"name": "Carol", "score": 96},
    {"name": "Dave",  "score": 55},
]

passing   = [p for p in players if p["score"] >= 60]
ranked    = sorted(passing, key=lambda p: p["score"], reverse=True)
score_map = {p["name"]: p["score"] for p in ranked}

for i, (name, score) in enumerate(score_map.items(), start=1):
    print(f"{i}. {name}: {score}")

Filter a user list for active admins, build an id-to-name lookup dict, and collect the sorted names of all active users, each in a single pass:

python
raw_users = [
    {"id": 1, "name": "Alice", "role": "admin", "active": True},
    {"id": 2, "name": "Bob",   "role": "user",  "active": False},
    {"id": 3, "name": "Carol", "role": "admin", "active": True},
    {"id": 4, "name": "Dave",  "role": "user",  "active": True},
]

active_admins = [u for u in raw_users if u["active"] and u["role"] == "admin"]
id_map        = {u["id"]: u["name"] for u in raw_users}
names         = sorted(u["name"] for u in raw_users if u["active"])

print(f"Active admins: {[u['name'] for u in active_admins]}")
print(f"All active: {names}")

Pair feature names with importance scores using zip, build a dict comprehension, sort with a lambda, and normalise values in a second comprehension:

python
feature_names = ["age", "income", "score", "tenure"]
importances   = [0.12, 0.34, 0.28, 0.26]

feat_dict = {f: i for f, i in zip(feature_names, importances)}
top_feats = sorted(feat_dict.items(), key=lambda x: x[1], reverse=True)[:2]

print("Top 2 features:")
for name, score in top_feats:
    print(f"  {name}: {score:.2f}")

# Normalise to sum to 1.0 (values already sum to 1 here, but shown as a pattern)
total      = sum(feat_dict.values())
normalised = {k: round(v / total, 4) for k, v in feat_dict.items()}
print(f"Normalised: {normalised}")

zip pairs the two lists lazily, without building an intermediate list of pairs. The dict comprehension builds the mapping in one expression. The sort lambda avoids a named key function. The normalisation comprehension transforms values without mutating the original dict.