Tuples and sets

You know lists. Python has two more collection types that solve problems lists cannot. Tuples hold a fixed group of values that will never change. Sets hold only unique values and let you check membership instantly no matter how large the collection gets.

Python has four core built-in collection types. Lists and dicts handle most general cases. Tuples and sets solve the specific ones: fixed records where immutability is an asset, and unique-value collections where average O(1) membership testing is the priority.

Beyond list and dict, Python provides tuple (immutable fixed-length sequence) and set (hash-table-backed unordered collection of unique hashable objects). Each has a distinct memory model, hashability characteristic, and performance profile worth knowing before choosing.

Tuples

A tuple is an ordered group of values that cannot be changed after you create it. Tuples are usually written in parentheses, but the parentheses are optional in most places. The comma is what actually makes it a tuple. A single-item tuple requires a trailing comma.

Tuples are immutable sequences. The comma, not the parentheses, is what creates a tuple. Immutability makes them hashable when all their elements are too, which opens use cases that lists cannot fill: dict keys, set members, and fixed-structure records.

tuple is an immutable sequence backed by a fixed-size C array. __hash__ is computed from the element hashes when all elements are hashable, making tuples valid as dict keys and set members. __getitem__ supports integers and slices; __setitem__ is not implemented, so any mutation attempt raises TypeError. The single-item form (42,) requires the trailing comma; without it, parentheses are just grouping.

python
point      = (10, 20)
rgb        = (255, 128, 0)
dimensions = (1920, 1080)
single     = (42,)            # trailing comma required for a single-item tuple
also_tuple = 42, 99           # parentheses are optional; the comma makes it a tuple
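
To make the comma rule and the hashability condition concrete, here is a minimal sketch. The variable names are illustrative and not part of the examples above.

```python
# Parentheses alone only group an expression; the comma builds the tuple.
grouped = (42)     # just the int 42
single = (42,)     # one-item tuple
bare = 42,         # also a one-item tuple
empty = ()         # the empty tuple is written with parentheses alone

print(type(grouped).__name__)   # int
print(type(single).__name__)    # tuple
print(single == bare)           # True
print(len(empty))               # 0

# A tuple is hashable only when every element is hashable.
hash((1, "a", (2, 3)))          # works: all elements are hashable

try:
    hash((1, [2, 3]))           # the inner list is unhashable
except TypeError as exc:
    unhashable_error = str(exc)
    print(f"TypeError: {unhashable_error}")
```

The last case is why a tuple that contains a list cannot be used as a dict key even though the tuple itself is immutable.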

Access by index works exactly like a list. Trying to change an item raises a TypeError:

Indexing, slicing, and negative indices all work identically to lists. Any attempt to assign via index raises TypeError; this is intentional, not a limitation.

__getitem__ accepts integers and slice objects. Slices follow the same clamping rules as list, while an out-of-range integer index raises IndexError. There is no __setitem__: the tuple type does not implement it, so TypeError is raised at runtime, not at parse time.

python
point = (10, 20)
point[0]    # 10
point[1]    # 20
point[-1]   # 20

point[0] = 99    # TypeError: 'tuple' object does not support item assignment
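
The following sketch shows how slicing, out-of-range access, and attempted assignment behave, and one common way to get a "changed" tuple by building a new one. The three-element point and the name moved are assumptions made for illustration.

```python
point = (10, 20, 30)

# Slices return a new tuple, and out-of-range slice bounds are clamped.
print(point[1:])      # (20, 30)
print(point[:100])    # (10, 20, 30)

# An out-of-range integer index is not clamped; it raises IndexError.
try:
    point[5]
except IndexError:
    print("IndexError: index out of range")

# Item assignment fails at runtime with TypeError.
try:
    point[0] = 99
except TypeError as exc:
    print(f"TypeError: {exc}")

# To "change" a value, build a new tuple and bind it to a name.
moved = (99,) + point[1:]
print(moved)          # (99, 20, 30)
print(point)          # (10, 20, 30), the original is untouched
```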

When to use a tuple

Use a tuple when you have a small group of related values that belong together and will not change. Coordinates (x, y), a colour (r, g, b), a name-score pair ("Alice", 87). The fixed structure signals to anyone reading the code that this group is treated as a single unit.

Tuples communicate fixed structure: a group of values where position carries meaning and the group is treated as a unit. Their hashability makes them valid as dictionary keys, which lists cannot be. The contract a tuple signals is: these values belong together and are not supposed to change.

Tuples are the idiomatic type for fixed-arity records. Their hashability makes them usable wherever Hashable is required: dict keys, set members, and arguments to functions decorated with functools.lru_cache. The semantic contract differs from a list: tuples typically represent a heterogeneous record where position has meaning, while lists typically represent a homogeneous sequence where length and order can vary.

python
locations = {}
locations[(40, -74)] = "New York"   # tuple as a dict key, works
locations[[40, -74]] = "New York"   # list as a dict key, TypeError
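
Hashability also matters for caching. As a rough sketch, the function below is wrapped in functools.lru_cache, which stores results keyed by the call arguments and therefore needs them to be hashable. The manhattan function is a hypothetical helper invented for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def manhattan(a, b):
    """Manhattan distance between two (x, y) points (hypothetical helper)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


print(manhattan((0, 0), (3, 4)))     # 7
print(manhattan((0, 0), (3, 4)))     # 7, answered from the cache
print(manhattan.cache_info().hits)   # 1

try:
    manhattan([0, 0], [3, 4])        # lists are unhashable, so the cache lookup fails
except TypeError as exc:
    print(f"TypeError: {exc}")
```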

Unpacking

Unpacking pulls values out of a tuple and assigns each to its own name in a single line. The number of names must match the number of values. Use * to capture any remaining items into a list.

Unpacking works on any iterable: tuples, lists, strings. The count of target names must match the iterable's length, unless a starred target catches a variable-length slice. Mismatch raises ValueError. Unpacking is the idiomatic way to consume multiple return values from a function.

Unpacking calls __iter__ on the right-hand side and binds each yielded value to the corresponding target name. A starred target (*rest) collects remaining items into a list. Mismatch raises ValueError at runtime. Extended unpacking also works inside for headers: for x, y in list_of_pairs unpacks each iteration item.

python
point = (10, 20)
x, y  = point

print(x)   # 10
print(y)   # 20

first, *rest = [1, 2, 3, 4, 5]
# first = 1, rest = [2, 3, 4, 5]

head, *middle, tail = [1, 2, 3, 4, 5]
# head = 1, middle = [2, 3, 4], tail = 5
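
The same mechanism covers multiple return values, for headers, and the mismatch error mentioned above. This is a small sketch; min_max and the sample data are made up for illustration.

```python
def min_max(values):
    """Return the smallest and largest value as a 2-tuple (hypothetical helper)."""
    return min(values), max(values)   # the comma builds the tuple


# Unpack the returned tuple straight into two names.
lowest, highest = min_max([4, 9, 1, 7])
print(lowest, highest)                # 1 9

# Unpacking in a for header: each item of the list is a 2-tuple.
pairs = [("Alice", 87), ("Bob", 92)]
for name, score in pairs:
    print(f"{name}: {score}")

# A count mismatch raises ValueError at runtime.
try:
    x, y = (1, 2, 3)
except ValueError as exc:
    print(f"ValueError: {exc}")
```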

Named tuples

A named tuple is a tuple where each position has a name. Instead of remembering that point[0] is the x-coordinate, you write point.x. The values are still immutable; you just get readable attribute names instead of numeric positions.

namedtuple generates a class that behaves exactly like a tuple but adds named attribute access. It is lighter than a full class, immutable, and self-documenting. Use it when a plain tuple's positional access would require a comment to be understood.

collections.namedtuple is a class factory: it generates a tuple subclass at runtime with named attribute access compiled in. The generated class includes _asdict(), _replace(), and _fields. Memory footprint is identical to a plain tuple. For more control (default values, type annotations, optional mutability), dataclasses.dataclass is the modern alternative; for type-annotated tuples, typing.NamedTuple is idiomatic.

Named tuple import

namedtuple is in Python's standard library but needs to be imported. The from collections import namedtuple line is the first import in this course. Imports are covered fully in the Modules chapter.

python
from collections import namedtuple

Point  = namedtuple("Point",  ["x", "y"])
Player = namedtuple("Player", ["name", "score", "level"])

p = Point(10, 20)
p.x    # 10
p.y    # 20

alice = Player("Alice", 87, 5)
alice.name    # "Alice"
alice.score   # 87
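
The helper attributes and the typing.NamedTuple alternative mentioned above can be sketched as follows. The default value for level is an assumption added to show the class syntax; it is not part of the Player definition above.

```python
from collections import namedtuple
from typing import NamedTuple

Point = namedtuple("Point", ["x", "y"])
p = Point(10, 20)

print(p._fields)            # ('x', 'y')
print(p._asdict())          # {'x': 10, 'y': 20}

moved = p._replace(x=99)    # returns a new Point; p itself is unchanged
print(moved)                # Point(x=99, y=20)
print(p == (10, 20))        # True: it still compares like a plain tuple


# The class-based form adds type annotations and default values.
class Player(NamedTuple):
    name: str
    score: int
    level: int = 1          # assumed default, for illustration only


bob = Player("Bob", 70)
print(bob)                  # Player(name='Bob', score=70, level=1)
```

Both forms produce tuple subclasses, so unpacking and indexing keep working alongside attribute access.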

Sets

A set is a collection of unique values with no guaranteed order. Adding the same value twice does nothing: a set keeps only one copy of each item. Use curly braces for a set with items, or set() to create an empty set.

set is an unordered collection that automatically rejects duplicates. Membership testing is O(1) regardless of size, which makes it the right tool whenever you need to check whether a value exists across a large collection. Note: {} creates an empty dict, not an empty set; use set() for that.

set is a hash-table-backed collection of unique hashable objects. Membership testing, insertion, and deletion are all O(1) on average. Iteration order reflects internal hash positions; because string hashes are randomized per process by default, the order of a set of strings can differ between runs. Only hashable objects can be members: int, str, and tuples of hashable elements yes; list, dict, set no. {} is parsed as an empty dict literal, not a set.

python
tags     = {"python", "beginner", "tutorial"}
numbers  = {1, 2, 3, 4, 5}
empty    = set()    # NOT {} (that's an empty dict)

Adding the same value twice does not change the set:

python
tags.add("python")   # tags is unchanged, "python" is already in it
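
A short sketch of the three rules above: duplicates collapse, {} is a dict, and members must be hashable. The names used here are illustrative only.

```python
# Duplicates in a literal collapse to a single member.
letters = {"a", "b", "a", "a"}
print(len(letters))              # 2

# {} is an empty dict; set() is an empty set.
print(type({}).__name__)         # dict
print(type(set()).__name__)      # set

# Members must be hashable: tuples of ints are fine, lists are not.
coords = {(0, 0), (1, 2)}
try:
    bad = {[0, 0]}
except TypeError as exc:
    print(f"TypeError: {exc}")

# set() accepts any iterable, including a string.
print(set("hello") == {"h", "e", "l", "o"})   # True
```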

When to use a set

Sets are the right tool for three things: removing duplicates from a list, checking quickly whether something is in a large collection, and comparing two groups to find what they share or differ on.

Three distinct use cases drive set usage: deduplication (automatic on insertion), O(1) membership testing (versus O(n) for list), and set algebra (|, &, -, ^). When the collection is large and you check membership frequently, the performance difference is substantial.

The three canonical set use cases map directly to hash-table properties: uniqueness (duplicate rejection on insertion), O(1) average __contains__ (hash lookup), and set algebra (bitwise-style operators calling dunder methods). The O(1) membership test is the most practically important: on average, in on a set of 10,000 items takes about as long as on a set of 10.

python
# Remove duplicates from a list
raw    = ["cat", "dog", "cat", "bird", "dog", "cat"]
unique = list(set(raw))   # ["cat", "dog", "bird"] (order not guaranteed)

python
# Fast membership check
valid_codes = {"USD", "EUR", "GBP", "JPY"}
code        = "EUR"

if code in valid_codes:    # instant lookup, even with thousands of codes
    print("Valid")
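
The earlier deduplication example does not guarantee order. When first-seen order matters, two common alternatives are sketched below; they are offered as options, not as a replacement for the set-based version. The first relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onward.

```python
raw = ["cat", "dog", "cat", "bird", "dog", "cat"]

# Option 1: dict keys are unique and keep insertion order (Python 3.7+).
unique_in_order = list(dict.fromkeys(raw))
print(unique_in_order)     # ['cat', 'dog', 'bird']

# Option 2: a set remembers what has been seen; a list keeps the order.
seen = set()
first_seen = []
for animal in raw:
    if animal not in seen:         # O(1) average membership test
        seen.add(animal)
        first_seen.append(animal)
print(first_seen)          # ['cat', 'dog', 'bird']
```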

Set operations

Sets support the same operations you learned in maths: union (everything in either set), intersection (only what both sets share), and difference (what one has that the other does not). Python uses operator symbols for these, and each has a method equivalent.

Python's set operators mirror mathematical notation: | for union, & for intersection, - for difference, ^ for symmetric difference. Each operator has a method form (.union(), .intersection(), etc.) that also accepts any iterable, not just sets.

Set operators call dunder methods: | calls __or__, & calls __and__, - calls __sub__, ^ calls __xor__. The operators require both operands to be set or frozenset instances and raise TypeError otherwise. The method forms accept any iterable and convert it internally. In-place forms (|=, &=, -=, ^=) mutate the left operand, equivalent to .update(), .intersection_update(), .difference_update(), and .symmetric_difference_update().

python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

a | b    # {1, 2, 3, 4, 5, 6}   (union: everything in either)
a & b    # {3, 4}               (intersection: only in both)
a - b    # {1, 2}               (difference: in a but not b)
b - a    # {5, 6}               (difference the other way)
a ^ b    # {1, 2, 5, 6}        (symmetric difference: in one but not both)

These also have method forms: .union(), .intersection(), .difference(), .symmetric_difference().
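
The difference between the operator and method forms can be sketched like this. The names b and alias are illustrative; sorted() is used only to make the printed output deterministic.

```python
a = {1, 2, 3, 4}

# Method forms accept any iterable.
print(a.union([4, 5, 6]) == {1, 2, 3, 4, 5, 6})    # True
print(a.intersection(range(3, 10)) == {3, 4})       # True

# Operator forms need a set (or frozenset) on both sides.
try:
    a | [4, 5, 6]
except TypeError as exc:
    print(f"TypeError: {exc}")

# In-place operators mutate the existing set rather than creating a new one.
b = {1, 2}
alias = b          # second name for the same set object
b |= {3}
print(sorted(alias))   # [1, 2, 3]
print(alias is b)      # True
```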

Modifying sets

Sets are mutable. .add() adds one item. .update() adds several at once from any list or other iterable. .remove() deletes an item but raises an error if it is not there. .discard() deletes silently if the item exists and does nothing if it does not.

.add() is O(1) average. .update() accepts any iterable and is equivalent to calling .add() in a loop. .remove() raises KeyError on a miss, mirroring dict.__delitem__. .discard() is the safe choice when presence is uncertain. .pop() removes an arbitrary element, not the "last" one since sets have no order.

.add(x) hashes x and inserts into the table: O(1) average. .update(iterable) accepts any iterable; with a set argument it is equivalent to |=. .remove() raises KeyError on a miss. .discard() does a hash lookup first and skips removal on a miss. .pop() removes an arbitrary element determined by internal hash table state, not insertion order, and raises KeyError on an empty set.

python
tags = {"python", "beginner"}

tags.add("tutorial")          # add one item
tags.update(["web", "api"])   # add multiple items from any iterable
tags.remove("beginner")       # remove, raises KeyError if not found
tags.discard("missing")       # remove, no error if not found
tags.pop()                    # remove and return an arbitrary item
tags.clear()                  # remove everything

Use .discard() when you are not sure whether the item exists.
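
A small sketch of the error behaviour described above. The tag values are arbitrary sample data.

```python
tags = {"python", "beginner"}

# remove() raises KeyError when the item is missing.
try:
    tags.remove("missing")
except KeyError as exc:
    print(f"KeyError: {exc}")

# discard() does nothing when the item is missing.
tags.discard("missing")
print(len(tags))                 # 2

# update() accepts any iterable; |= needs a set on the right-hand side.
tags.update(["web", "api"])
tags |= {"cli"}
print(sorted(tags))              # ['api', 'beginner', 'cli', 'python', 'web']

# pop() on an empty set raises KeyError.
empty = set()
try:
    empty.pop()
except KeyError:
    print("KeyError: pop from an empty set")
```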

Frozen sets

A frozen set is a set you cannot modify after creation. The main reason to use one: frozen sets are hashable, so they can be used as dictionary keys or stored inside other sets.

frozenset is the immutable counterpart to set. It supports all read operations and set algebra but not mutation. Its immutability makes it hashable, meaning it is valid as a dict key or as a member inside another set.

frozenset implements __hash__ as an order-independent combination of its element hashes, so two frozensets with equal contents hash the same regardless of how they were built. All set algebra operators and methods that return new collections are supported; mutation methods (add, remove, etc.) are not defined. frozenset is the right type for a constant lookup table that must not change at runtime and may need to be used as a dict key.

python
valid_statuses = frozenset({"active", "paused", "deleted"})
valid_statuses.add("archived")    # AttributeError, frozenset is immutable
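
The sketch below shows a frozenset used as a dict key and the type that set algebra returns. READ, WRITE, and labels are hypothetical names chosen for the example.

```python
READ = frozenset({"GET", "HEAD"})
WRITE = frozenset({"POST", "PUT"})

# A frozenset is hashable, so it can be a dict key.
labels = {READ: "read-only", WRITE: "mutating"}
print(labels[frozenset({"HEAD", "GET"})])   # read-only (element order is irrelevant)

# Algebra between two frozensets returns another frozenset.
both = READ | WRITE
print(type(both).__name__)                  # frozenset

# With mixed operands, the result takes the type of the left operand.
print(type(READ | {"PATCH"}).__name__)      # frozenset
print(type({"PATCH"} | READ).__name__)      # set

# Mutation methods do not exist on frozenset.
try:
    READ.add("OPTIONS")
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```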

Choosing the right collection

Four types, each with a clear role. Ask what you need to do with the data and the right choice usually becomes obvious.

The choice between collection types is about what operations matter and what constraints your data has: mutability, ordering, duplicate handling, and lookup strategy.

Collection choice is a performance and semantic decision. dict and set offer O(1) average lookup via hashing. list and tuple offer O(1) indexed access but O(n) membership testing. tuple's immutability buys hashability, provided every element is itself hashable. Of the collection types covered here, tuple and frozenset are the two that can be hashed.

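| Property   | list                         | tuple        | set                            | dict                  |
|------------|------------------------------|--------------|--------------------------------|-----------------------|
| Ordered    | Yes                          | Yes          | No                             | Yes (insertion order) |
| Mutable    | Yes                          | No           | Yes                            | Yes                   |
| Duplicates | Yes                          | Yes          | No                             | No (keys)             |
| Access by  | Index                        | Index        | n/a                            | Key                   |
| Use when   | Ordered, changeable sequence | Fixed record | Unique values, fast membership | Key-value lookup      |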

A quick decision rule:

  • Need to look something up by name? → dict
  • Need an ordered collection you will modify? → list
  • Have a fixed group of related values? → tuple
  • Need unique values or fast membership tests? → set
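
To get a feel for the membership-test difference behind the last rule, a rough timing sketch with timeit follows. The collection size and repeat count are arbitrary assumptions, and the absolute numbers depend on the machine, so no expected output is shown.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
missing = -1    # not present, so the list has to scan every element

# Time 100 membership tests against each collection.
list_time = timeit.timeit(lambda: missing in as_list, number=100)
set_time = timeit.timeit(lambda: missing in as_set, number=100)

print(f"list: {list_time:.6f} s for 100 lookups")
print(f"set:  {set_time:.6f} s for 100 lookups")
```

On typical hardware the set lookups should finish far sooner, and the gap should widen as n grows, since the list scan is O(n) and the hash lookup is O(1) on average.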

In practice

Using tuples to store fixed records and a set to track unique values:

python
home   = (51.5074, -0.1278)   # latitude, longitude
office = (51.5155, -0.0922)

home_lat, home_lon = home
print(f"Home: {home_lat}, {home_lon}")

# Track unique visitors with a set
visitors = set()
visitors.add("alice")
visitors.add("bob")
visitors.add("alice")    # already in set, silently ignored
visitors.add("carol")

print(f"Unique visitors: {len(visitors)}")
print(f"alice visited: {'alice' in visitors}")
print(f"dave visited:  {'dave' in visitors}")

Using sets to track what has already been processed and compute the remaining work:

python
already_processed = {"report_jan.csv", "report_feb.csv"}
all_files         = {"report_jan.csv", "report_feb.csv", "report_mar.csv", "report_apr.csv"}

to_process = all_files - already_processed
print(f"Files to process: {sorted(to_process)}")

for filename in sorted(to_process):
    print(f"Processing {filename}...")
    already_processed.add(filename)

print(f"Done. Total processed: {len(already_processed)}")

Using frozenset for constant lookup tables, combining set algebra with O(1) membership tests:

python
ALLOWED_METHODS = frozenset({"GET", "POST", "PUT", "PATCH", "DELETE"})
SAFE_METHODS    = frozenset({"GET", "HEAD", "OPTIONS"})

# Set algebra on two frozensets returns a new frozenset
unsafe_allowed = ALLOWED_METHODS - SAFE_METHODS
print(f"Non-safe allowed methods: {unsafe_allowed}")

# frozenset is hashable, so it can be stored in a set (a plain set cannot)
method_groups = {
    frozenset({"GET", "HEAD", "OPTIONS"}),
    frozenset({"POST", "PUT", "PATCH"}),
    frozenset({"DELETE"}),
}
print(f"Method groups: {len(method_groups)}")

method = "POST"
print(f"Allowed: {method in ALLOWED_METHODS}")
print(f"Safe:    {method in SAFE_METHODS}")

frozenset carries O(1) average lookup and can be stored anywhere a hashable type is required. Set algebra on two frozenset objects returns a new frozenset; when set and frozenset are mixed, the result takes the type of the left operand.