Modules and the standard library

Python comes with a huge collection of tools ready to use: randomness, maths, dates, file paths, and much more. These tools live in modules, and you bring them into your code with import. You have already used import json in the previous chapter. This chapter covers imports fully and introduces the most useful parts of the standard library.

Python's standard library provides tested, documented solutions for common problems. Modules are the unit of code organisation: each file is a module, each directory with an __init__.py is a package. The import system finds modules, compiles them if needed, and caches them in sys.modules so they are only loaded once.

The import system is a layered machinery: finders locate modules and loaders compile and execute them. Results are cached in sys.modules. import foo runs foo.py once and binds the module object to foo in the current namespace. from foo import bar binds only bar. Understanding sys.path, __init__.py, and relative imports is essential for building packages.

Importing modules

The simplest import brings in a whole module and lets you use its contents with dot notation. You can also import specific names from a module to use them directly without the prefix. Aliases shorten long names.

import module binds the module object to the name module in the current scope. from module import name binds just name. Aliases (import module as alias) are common with third-party libraries. Avoid from module import *: it pollutes the namespace and makes it unclear where names came from.

import module triggers the full import machinery, caches the result in sys.modules, and binds the module object. from module import name is syntactic sugar: it still imports the full module, then extracts name. Circular imports are a common pitfall; the fix is usually to move imports inside functions or restructure the module dependencies. importlib.import_module() allows programmatic imports.

python
import math

math.sqrt(16)     # 4.0
math.pi           # 3.141592653589793
math.floor(3.9)   # 3
math.ceil(3.1)    # 4

Import specific names from a module so you can use them directly:

python
from math import sqrt, pi

sqrt(16)    # 4.0 (no "math." prefix needed)
pi          # 3.141592653589793

Give a module or name an alias to shorten it:

python
import math as m

m.sqrt(16)    # 4.0

from math import sqrt as square_root
square_root(25)    # 5.0

Aliases are common with popular third-party libraries (import numpy as np, import pandas as pd). For standard library modules, prefer using the full name; it makes the code easier to read.
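
The caching behaviour described in this section can be checked directly. The following is a minimal sketch using only standard-library names; the string stored in module_name is just an example value chosen for illustration.

```python
import importlib
import sys

import math

# After the first import, the module object lives in the sys.modules cache.
cached = sys.modules["math"]
same_object = cached is math          # True: the name math points at the cached object

# A second import does not re-run the module; it returns the cached object.
import math as math_again
still_same = math_again is math       # True

# importlib.import_module() imports by name, useful when the name is only known at runtime.
module_name = "json"                  # example value; could come from config or user input
json_module = importlib.import_module(module_name)
encoded = json_module.dumps({"ok": True})   # '{"ok": true}'
```

Because every import of the same module returns the same object, state stored on a module is shared by all code that imports it.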

random

The random module generates random numbers and makes random choices. Use it for games, simulations, random sampling, and anything else that needs unpredictability. Setting a seed makes results reproducible: the same seed produces the same sequence every time.

random uses a Mersenne Twister pseudo-random number generator. The seed determines the full sequence; the same seed always produces the same output. .choice() picks one item, .choices() picks with replacement, .sample() picks without. .shuffle() modifies the list in place and returns None.

random uses a Mersenne Twister (MT19937) PRNG with 624-word state. random.seed() initialises the state; without it, the state is seeded from os.urandom(). For cryptographic purposes, use secrets instead: random is not cryptographically secure. random.SystemRandom() wraps os.urandom() for a secure alternative with the same API.

python
import random

random.random()              # float in [0.0, 1.0): 0.0 is possible, 1.0 is not
random.randint(1, 10)        # integer from 1 to 10 (both inclusive)
random.uniform(1.0, 10.0)    # float between 1.0 and 10.0

colours = ["red", "green", "blue"]
random.choice(colours)       # picks one item
random.choices(colours, k=3) # picks k items (with replacement)
random.sample(colours, k=2)  # picks k items (no replacement)

numbers = [1, 2, 3, 4, 5]
random.shuffle(numbers)      # shuffles in place, returns None

For reproducible results (useful in testing and data science), set a seed before generating:

python
random.seed(42)
random.randint(1, 100)   # always the same value for seed 42

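The same seed produces the same sequence every time, on any machine running the same Python version. Some functions may produce different values for the same seed on a different Python version.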
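
As a rough illustration of seeding, the sketch below compares two separately seeded generators and contrasts them with random.SystemRandom. The variable names are illustrative only, and no specific output values are shown because they depend on the Python version.

```python
import random

# Two independent generators with the same seed yield identical sequences.
rng_a = random.Random(42)
rng_b = random.Random(42)
rolls_a = [rng_a.randint(1, 6) for _ in range(5)]
rolls_b = [rng_b.randint(1, 6) for _ in range(5)]
identical = rolls_a == rolls_b        # True

# Re-seeding the module-level generator restarts its sequence.
random.seed(42)
first = random.random()
random.seed(42)
second = random.random()
repeated = first == second            # True

# SystemRandom has the same API but draws from the OS; seeding has no effect on it.
secure = random.SystemRandom()
secure_roll = secure.randint(1, 6)    # unpredictable value from 1 to 6
```

Using a dedicated random.Random instance keeps a test's sequence isolated from any other code that calls the module-level functions.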

math

The math module adds more advanced mathematical operations beyond the basic arithmetic operators. Square roots, powers, logarithms, trigonometry, and special values like pi and infinity are all here.

math provides C-level implementations of standard mathematical functions. Note that math.pow() always returns a float, while Python's ** operator returns an int when the base is an integer and the exponent is a non-negative integer (2 ** -1 gives the float 0.5). math.log(x, base) computes logarithm to any base; math.log(x) computes the natural logarithm.

math wraps the C <math.h> library functions. These are faster than pure Python implementations and handle edge cases (NaN, infinity) correctly. math.isnan() and math.isinf() check for IEEE 754 special values. For complex numbers, cmath provides the corresponding functions. For array-level maths, numpy is the standard tool.

python
import math

math.sqrt(25)        # 5.0
math.pow(2, 10)      # 1024.0 (same as 2 ** 10 but always returns float)
math.log(100, 10)    # 2.0 (log base 10)
math.log(math.e)     # 1.0 (natural log)

math.sin(math.pi / 2)   # 1.0
math.cos(0)             # 1.0

math.ceil(3.2)    # 4
math.floor(3.9)   # 3
math.trunc(3.9)   # 3 (drops the fractional part, like int(); math.trunc(-3.9) is -3, floor gives -4)

math.inf          # infinity
math.isnan(float("nan"))   # True
math.isinf(math.inf)       # True
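
The differences between math.pow() and **, and the behaviour of NaN, can be sketched as follows. math.isclose() is not mentioned in the text above; it is included here on the assumption that tolerance-based comparison is a useful companion to the special-value checks.

```python
import math

int_power = 2 ** 10              # 1024, an int
float_power = math.pow(2, 10)    # 1024.0, always a float
negative_exp = 2 ** -1           # 0.5: a negative exponent makes ** return a float

nan = float("nan")
nan_equals_itself = nan == nan   # False: NaN never compares equal, even to itself
nan_detected = math.isnan(nan)   # True: use isnan() instead of ==

# Floating-point rounding means == can surprise; isclose() compares with a tolerance.
exact = 0.1 + 0.2 == 0.3                 # False
close = math.isclose(0.1 + 0.2, 0.3)     # True
```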

datetime

The datetime module handles dates and times. datetime.now() gives you the current date and time. strftime() formats it as a string. strptime() parses a string into a datetime. timedelta represents a duration you can add or subtract.

datetime, date, and timedelta are the main classes. strftime() formats a datetime as a string using format codes. strptime() parses a string given a format pattern. timedelta supports arithmetic: you can add or subtract durations from dates, compare datetimes with < and >, and subtract one datetime from another with - to get a timedelta.

datetime objects are naive by default (no timezone). For timezone-aware datetimes, use datetime.now(tz=timezone.utc) or datetime.fromisoformat() with an offset. strftime/strptime use C-library format codes; %f gives microseconds. For high-precision timing, prefer time.perf_counter() over datetime.now(). The zoneinfo module (Python 3.9+) provides IANA timezone support.

python
from datetime import datetime, date, timedelta

now   = datetime.now()           # current date and time
today = date.today()             # current date only

print(now.year, now.month, now.day)
print(now.hour, now.minute, now.second)

# Formatting
print(now.strftime("%Y-%m-%d"))           # "2024-01-15"
print(now.strftime("%d %B %Y, %H:%M"))   # "15 January 2024, 09:42"

# Parsing
deadline = datetime.strptime("2024-12-31", "%Y-%m-%d")

# Arithmetic
tomorrow    = today + timedelta(days=1)
next_week   = today + timedelta(weeks=1)
diff        = deadline - now
print(f"{diff.days} days until deadline")

Common strftime codes:

| Code | Meaning | Example |
|------|---------|---------|
| %Y | 4-digit year | 2024 |
| %m | Month (zero-padded) | 01 |
| %d | Day (zero-padded) | 15 |
| %H | Hour (24h) | 09 |
| %M | Minute | 42 |
| %B | Full month name | January |
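
The difference between naive and timezone-aware datetimes mentioned earlier in this section might look like this in practice. This is a minimal sketch with fixed example dates; it uses only timezone.utc and an ISO offset, not zoneinfo.

```python
from datetime import datetime, timezone

naive = datetime(2024, 1, 15, 9, 42)                        # no tzinfo attached
aware = datetime(2024, 1, 15, 9, 42, tzinfo=timezone.utc)   # explicitly UTC

naive_has_tz = naive.tzinfo is not None    # False
aware_has_tz = aware.tzinfo is not None    # True

# fromisoformat() keeps an offset if the string carries one.
parsed = datetime.fromisoformat("2024-01-15T10:42:00+01:00")
same_instant = parsed == aware             # True: 10:42 at +01:00 is 09:42 UTC

# Subtracting or ordering a naive and an aware value raises TypeError.
try:
    aware - naive
    mixed_ok = True
except TypeError:
    mixed_ok = False                       # this branch runs

now_utc = datetime.now(tz=timezone.utc)    # current time, timezone-aware
```

Picking one convention (for example, aware UTC everywhere) avoids the TypeError shown above.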

os and pathlib

pathlib is the modern way to work with file paths. Path objects let you build, inspect, and navigate paths using the / operator. os gives access to environment variables and lower-level OS operations. Prefer pathlib for new code.

pathlib.Path represents filesystem paths as objects with methods for querying and navigating. The / operator joins path components cleanly, handling OS-specific separators automatically. os.environ is a dict-like object for environment variables; os.environ.get("KEY", "default") is safe for missing variables.

pathlib.Path is a concrete class: instantiating it returns a PosixPath or WindowsPath depending on the OS. PurePosixPath and PureWindowsPath handle path manipulation only, without touching the filesystem, and work on any platform. Methods like .glob(), .rglob(), and .iterdir() return generators. .stat() calls os.stat() and returns a stat_result. os.path functions accept both strings and Path objects since Python 3.6. Prefer pathlib for new code; use os.fspath() to convert Path to str when calling APIs that do not accept Path.

python
from pathlib import Path

p = Path("data/reports")

p.exists()           # True if path exists
p.is_dir()           # True if it's a directory
p.is_file()          # True if it's a file

p.mkdir(parents=True, exist_ok=True)   # create directories

for f in p.glob("*.csv"):              # all CSV files in directory
    print(f.name)                      # just the filename

report = p / "report_jan.csv"          # / operator joins paths
report.stem       # "report_jan" (name without extension)
report.suffix     # ".csv"
report.parent     # Path("data/reports")

content = report.read_text()           # read file contents directly
report.write_text("new content\n")    # write directly

For the os module:

python
import os

os.getcwd()                        # current working directory
os.listdir(".")                    # list directory contents
os.path.exists("data.txt")        # True if path exists
os.path.join("data", "file.txt")  # "data/file.txt" on POSIX, "data\file.txt" on Windows
os.environ.get("HOME")            # read an environment variable

Prefer pathlib for new code. Use os when you need environment variables or are working with older APIs that expect strings.
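
To illustrate how path flavours and os.fspath() fit together, here is a short sketch. The directory and file names are made up, and nothing is read from or written to disk.

```python
import os
from pathlib import Path, PurePosixPath, PureWindowsPath

# Pure paths only manipulate text, so both flavours work on any OS.
posix = PurePosixPath("data") / "reports" / "jan.csv"
windows = PureWindowsPath("data") / "reports" / "jan.csv"
posix_text = str(posix)        # data/reports/jan.csv
windows_text = str(windows)    # data\reports\jan.csv

# Path() picks the concrete flavour for the current OS.
report = Path("data") / "reports" / "jan.csv"
as_string = os.fspath(report)  # plain str for APIs that do not accept Path

# os.path functions accept Path objects directly.
base_name = os.path.basename(report)   # "jan.csv"
```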

timeit

timeit measures how long code takes to run. It is useful when you want to compare two approaches and pick the faster one. Run the code many times to get a stable measurement.

timeit.timeit(stmt, setup, number) times stmt by running it number times and returning the total elapsed time in seconds. The setup string runs once before the timed loop. Divide the result by number to get the per-call time. More repetitions reduce noise from system scheduling.

timeit disables the garbage collector during timing to reduce noise. It uses time.perf_counter() for high-resolution measurement. The globals parameter passes a namespace to the timed statement. For microbenchmarks, timeit is the standard tool; for profiling where time is spent in a larger program, use cProfile.

python
import timeit

# Time a single statement
timeit.timeit("sum(range(1000))", number=10000)

# Time a more complex block
setup = "data = list(range(1000))"
code  = "[x * 2 for x in data]"
elapsed = timeit.timeit(code, setup=setup, number=10000)   # not named "time", which would shadow the time module
print(f"{elapsed:.4f} seconds for 10,000 runs")

number is how many times to repeat. More repetitions give a more stable measurement.
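
Comparing two approaches, as described at the start of this section, could be sketched like this. The two functions and the namespace dictionary are hypothetical names for the demo; the globals parameter is used in place of a setup string. No timings are shown because they vary by machine.

```python
import timeit

def double_with_loop(data):
    result = []
    for x in data:
        result.append(x * 2)
    return result

def double_with_comprehension(data):
    return [x * 2 for x in data]

runs = 200
# Names the timed statements are allowed to see (instead of a setup string).
namespace = {
    "data": list(range(1000)),
    "double_with_loop": double_with_loop,
    "double_with_comprehension": double_with_comprehension,
}

# repeat() returns one total per repetition; taking the minimum filters out
# repetitions that were slowed down by other activity on the machine.
loop_total = min(timeit.repeat("double_with_loop(data)", globals=namespace, number=runs, repeat=3))
comp_total = min(timeit.repeat("double_with_comprehension(data)", globals=namespace, number=runs, repeat=3))

print(f"loop:          {loop_total / runs * 1e6:.1f} microseconds per call")
print(f"comprehension: {comp_total / runs * 1e6:.1f} microseconds per call")
```

Dividing each total by runs gives the per-call time, which is easier to compare than the raw totals.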

string

The string module provides pre-built string constants for letters, digits, and punctuation. Useful when you need to check characters or generate random strings from a specific alphabet.

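string module constants (ascii_letters, digits, punctuation) are plain strings you can index, iterate, or use with in. Combining them with random.choices() is a quick way to generate random identifiers. For passwords or security tokens, draw characters with secrets.choice() instead, because random is not cryptographically secure.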

string module constants are pure Python string literals with no special behaviour. They are not sets, so in is O(n); for frequent membership testing, use set(string.digits). string.Formatter lets you customise formatting using the same implementation as str.format(), and string.Template provides $-style substitution.

python
import string

string.ascii_lowercase   # "abcdefghijklmnopqrstuvwxyz"
string.ascii_uppercase   # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
string.ascii_letters     # both combined
string.digits            # "0123456789"
string.punctuation       # all punctuation characters

Useful when you need to check characters or generate random strings:

python
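import secrets
import string

chars    = string.ascii_letters + string.digits
# secrets.choice() uses the OS's secure random source; random.choices() is fine for non-secret IDs
password = "".join(secrets.choice(chars) for _ in range(12))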
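
For the character-checking use case, a sketch along these lines converts the constants to sets for fast membership tests. The validation rule in is_identifier_like() is a hypothetical example, not a rule from any library.

```python
import string

DIGITS = set(string.digits)             # set gives O(1) membership tests
ALLOWED = set(string.ascii_letters + string.digits + "_")

def is_identifier_like(text: str) -> bool:
    """Hypothetical rule: non-empty, only letters, digits, or _, and no leading digit."""
    return bool(text) and text[0] not in DIGITS and all(ch in ALLOWED for ch in text)

is_identifier_like("game_42")    # True
is_identifier_like("42_game")    # False (starts with a digit)
is_identifier_like("game-42")    # False ("-" is not allowed)

# string.Template performs $-style substitution.
greeting = string.Template("Hello, $name!").substitute(name="Ada")   # "Hello, Ada!"
```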

Creating your own modules

Any Python file is a module. To use it from another file, import it by the filename (without .py). You can import the whole module and use its contents with dot notation, or import specific names directly.

When Python imports a module, it executes the file top to bottom once and caches the result in sys.modules. Subsequent imports of the same module return the cached object without re-running the file. For larger projects, modules are organised into packages: directories with an __init__.py file.

Import resolution uses sys.path: a list of directories searched in order. When you run a script, sys.path[0] is the script's directory. The PYTHONPATH environment variable adds extra directories ahead of the default locations. Regular packages contain an __init__.py (which can be empty); since Python 3.3 a directory without one can still be imported as a namespace package, but including __init__.py remains the usual choice. Relative imports (from . import module) are valid within packages. importlib.reload() re-executes a module, but existing references to old objects are not updated.

python
# utils.py
def clamp(value, lo, hi):
    return max(lo, min(value, hi))

PI = 3.14159

python
# main.py
import utils

utils.clamp(150, 0, 100)   # 100
utils.PI                    # 3.14159

from utils import clamp
clamp(50, 0, 100)           # 50

Python finds the module by looking in the same directory as the importing file (and a few other places). For larger projects, modules are organised into packages: directories with an __init__.py file.
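
The roles of sys.path, the sys.modules cache, and importlib.reload() can be tried out in a throwaway directory. This is only a sketch: scratch_utils is a made-up module name, and the module file is written to a temporary directory that is deleted afterwards.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny module to disk; "scratch_utils" is a made-up name for this demo.
    module_file = Path(tmp) / "scratch_utils.py"
    module_file.write_text("GREETING = 'hi'\n")

    sys.path.insert(0, tmp)                      # add the directory to the search path
    try:
        import scratch_utils
        from scratch_utils import GREETING       # separate reference to the current value

        cached = "scratch_utils" in sys.modules  # True: the import was cached

        # Edit the file, then reload: the module object is re-executed in place...
        module_file.write_text("GREETING = 'hello there'\n")
        importlib.reload(scratch_utils)
        reloaded_value = scratch_utils.GREETING  # "hello there"
        # ...but the name bound by "from ... import" still holds the old object.
        old_reference = GREETING                 # "hi"
    finally:
        # Undo the changes to interpreter state made for the demo.
        sys.path.remove(tmp)
        sys.modules.pop("scratch_utils", None)
```

In everyday projects you rarely edit sys.path by hand; placing the module next to the importing file is usually enough.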

__name__ == "__main__"

When Python runs a file directly, __name__ is set to "__main__". When the same file is imported as a module, __name__ is the module name. This pattern lets you write code that runs when you execute the file directly but is skipped when the file is imported by another module.

if __name__ == "__main__": is the standard guard for executable module code. It lets a module be both importable (exposing its functions) and directly runnable (with test or demo code). Without it, importing the module would execute any top-level code, which is almost never desired.

__name__ is set by the import machinery: "__main__" for the entry-point script, the module's dotted name otherwise. The guard prevents side effects (startup code, argument parsing, test runs) from executing on import. For command-line tools, putting the entry-point logic in a main() function and calling it under the guard is the idiomatic pattern.

python
# utils.py
def clamp(value, lo, hi):
    return max(lo, min(value, hi))

if __name__ == "__main__":
    # this only runs when you do: python utils.py
    # not when you do: import utils
    print(clamp(150, 0, 100))   # 100

This is a standard pattern for any module that is also useful as a standalone script.
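
The main() idiom mentioned above might look like the following sketch. The file name clamp_tool.py and the argument handling are assumptions made for illustration; a real tool would more likely use argparse.

```python
# clamp_tool.py (hypothetical file name)
import sys

def clamp(value, lo, hi):
    return max(lo, min(value, hi))

def main(argv=None):
    """Entry point: clamp VALUE into [LO, HI] and return an exit status."""
    args = sys.argv[1:] if argv is None else argv
    if len(args) != 3:
        print("usage: python clamp_tool.py VALUE LO HI")
        return 2                      # non-zero status signals incorrect usage
    value, lo, hi = (float(a) for a in args)
    print(clamp(value, lo, hi))
    return 0

if __name__ == "__main__":
    # Runs only for: python clamp_tool.py 150 0 100
    main()
```

Accepting argv as a parameter lets other code (and tests) call main(["150", "0", "100"]) without touching sys.argv. Command-line tools often write sys.exit(main()) under the guard so that the return value becomes the process exit status.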

Standard library highlights

A few more modules worth knowing about. Each one solves a common problem that would take significant work to implement yourself.

The standard library is extensive; the highlights below are the ones you will encounter most often in production code. For a complete reference, docs.python.org/3/library is the authoritative source.

The standard library is a curated set of well-tested, documented modules. Before reaching for a third-party package, check whether the standard library has a solution: functools, itertools, contextlib, dataclasses, typing, and abc each provide tools that third-party packages often reinvent.
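
As a brief taste of three of the modules just named, the sketch below uses dataclasses, functools, and contextlib together. Player, fib, and section are illustrative names invented for this example.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from functools import lru_cache

@dataclass
class Player:
    name: str
    score: int = 0          # __init__, __repr__, and __eq__ are generated

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Naive recursion made fast by caching earlier results."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

@contextmanager
def section(title: str, log: list):
    """Record when a block starts and ends, even if it raises."""
    log.append(f"start {title}")
    try:
        yield
    finally:
        log.append(f"end {title}")

events = []
with section("scoring", events):
    player = Player("Ada", score=fib(10))   # fib(10) == 55

# player == Player("Ada", 55); events == ["start scoring", "end scoring"]
```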

collections: specialised container types:

python
from collections import Counter, defaultdict, deque

Counter(["a", "b", "a", "c", "a"])   # Counter({'a': 3, 'b': 1, 'c': 1})
defaultdict(list)                      # dict that auto-creates missing keys
deque([1, 2, 3], maxlen=5)            # fast append/pop from both ends
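
The three containers are easier to appreciate in use. A small sketch, with a made-up word list as input:

```python
from collections import Counter, defaultdict, deque

words = ["red", "green", "red", "blue", "red", "green"]

# Counter: tally items and ask for the most common ones.
counts = Counter(words)
top_two = counts.most_common(2)          # [("red", 3), ("green", 2)]

# defaultdict: group values without checking whether the key exists yet.
by_length = defaultdict(list)
for word in words:
    by_length[len(word)].append(word)
# by_length[3] == ["red", "red", "red"], by_length[4] == ["blue"]

# deque with maxlen: keeps only the most recent items, dropping the oldest.
recent = deque(maxlen=3)
for word in words:
    recent.append(word)
last_three = list(recent)                # ["blue", "red", "green"]
```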

itertools: tools for working with iterables:

python
import itertools

list(itertools.chain([1, 2], [3, 4]))          # [1, 2, 3, 4]
list(itertools.islice(range(100), 5))          # [0, 1, 2, 3, 4]
list(itertools.combinations([1, 2, 3], 2))     # [(1, 2), (1, 3), (2, 3)]
list(itertools.product([0, 1], repeat=2))      # [(0, 0), (0, 1), (1, 0), (1, 1)]

sys: access to the Python interpreter:

python
import sys

sys.argv        # list of command-line arguments
sys.version     # Python version string
sys.exit(1)     # exit with a status code (nothing after this line runs)

Third-party packages: beyond the standard library, pip installs community packages:

bash
pip install requests    # HTTP library
pip install pandas      # data manipulation
pip install numpy       # numerical computing

Third-party packages are out of scope for this guide, but the pattern is always the same: pip install, then import.

In practice

Combining random, string, and datetime to generate unique game IDs with timestamps:

python
import random
import string
from datetime import datetime

def generate_game_id(length: int = 8) -> str:
    chars = string.ascii_uppercase + string.digits
    return "".join(random.choices(chars, k=length))

def timestamp() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

game_id = generate_game_id()
print(f"[{timestamp()}] Starting game {game_id}")

scores = [random.randint(50, 100) for _ in range(5)]
print(f"Round scores: {scores}")
print(f"Best: {max(scores)}")

Using pathlib and datetime to find files in a directory and report their sizes:

python
from pathlib import Path
from datetime import datetime

def find_files(directory: str, pattern: str = "*.csv") -> list[Path]:
    return sorted(Path(directory).glob(pattern))

def timestamp() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

files = find_files(".", "*.md")[:3]
print(f"[{timestamp()}] Found {len(files)} file(s)")
for f in files:
    size = f.stat().st_size if f.exists() else 0
    print(f"  {f.name} ({size} bytes)")

Reading app config from environment variables with typed defaults, and writing structured access log entries as newline-delimited JSON:

python
import os
import json
from datetime import datetime
from pathlib import Path

def load_env_config() -> dict:
    return {
        "debug":     os.environ.get("DEBUG", "false").lower() == "true",
        "port":      int(os.environ.get("PORT", "8080")),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
    }

def write_access_log(method: str, path: str, status: int) -> None:
    log_dir = Path("logs")
    log_dir.mkdir(exist_ok=True)
    entry = {
        "ts":     datetime.now().isoformat(),
        "method": method,
        "path":   path,
        "status": status,
    }
    with open(log_dir / "access.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

config = load_env_config()
print(f"Starting on port {config['port']}, debug={config['debug']}")
write_access_log("GET", "/users", 200)

Newline-delimited JSON (.jsonl) is a common log format: each line is a valid JSON object, which makes it easy to stream, append, and parse line by line without loading the whole file.
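
Reading such a file back line by line could be sketched as follows. The helper read_jsonl() and the sample entries are hypothetical, and the file is written to a temporary directory so the example is self-contained.

```python
import json
import tempfile
from pathlib import Path

def read_jsonl(path):
    """Yield one parsed object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "access.jsonl"
    # Sample entries made up for the demo.
    sample = [
        {"method": "GET", "path": "/users", "status": 200},
        {"method": "POST", "path": "/users", "status": 201},
        {"method": "GET", "path": "/missing", "status": 404},
    ]
    with open(log_file, "a", encoding="utf-8") as f:
        for entry in sample:
            f.write(json.dumps(entry) + "\n")

    # Only one line is held in memory at a time while filtering.
    errors = [e["path"] for e in read_jsonl(log_file) if e["status"] >= 400]
    # errors == ["/missing"]
```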