Word frequency analyzer
Mira: Hello. I've been tasked with pulling insights from the mission logs and there are hundreds of them. I need something that takes a text file, counts how often each word appears, and shows me the top results. I want to find out what topics keep coming up without reading everything manually.
What you're building
Enter text, or a filename to read from: sample.txt
Top 10 words:
the 42
and 31
python 18
is 16
you 14
...
What you'll need
- Strings — splitting text into words, stripping punctuation, lowercasing
- Dictionaries — counting how many times each word appears
- Lists — sorting and slicing the top results
- Files and exceptions — reading from a text file
- Lambda, comprehensions, and zip — list comprehensions and sorted() with a key work well here
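The sample prompt accepts either text or a filename, so the program has to decide which one it was given. Here is one possible sketch, not the only way to do it. The function name load_text and the choice to fall back to an empty string on a read error are assumptions made for illustration.

```python
import os


def load_text(entry):
    """Return the file's contents if entry names a file, otherwise entry itself."""
    if not os.path.isfile(entry):
        # Not an existing file, so treat what was typed as the text to analyse.
        return entry
    try:
        with open(entry, encoding="utf-8") as handle:
            return handle.read()
    except (OSError, UnicodeDecodeError) as error:
        # The file exists but could not be read or decoded.
        print(f"Could not read {entry}: {error}")
        return ""


print(load_text("typed straight into the prompt"))
```

Checking for the file first keeps ordinary sentences from being mistaken for missing filenames. The try/except then only handles real read problems.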
Hints
Normalise before counting. Lowercase everything and strip punctuation before building your count. Otherwise "Python" and "python" and "Python," all count as different words.
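A minimal sketch of that normalising step, assuming words are separated by whitespace and that punctuation only needs removing from the ends of each word. The function name normalise is made up for this example.

```python
import string


def normalise(text):
    """Lowercase the text and return its words with edge punctuation removed."""
    words = []
    for raw in text.lower().split():
        word = raw.strip(string.punctuation)
        if word:  # skip tokens that were nothing but punctuation
            words.append(word)
    return words


print(normalise('Python, python and "Python"!'))
# ['python', 'python', 'and', 'python']
```

Note that strip() only touches the ends of a word, so "don't" keeps its apostrophe.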
A dictionary does the counting. Loop through the words. If the word is already a key, increment its count. If it isn't, add it with a count of 1. .get() with a default value makes this neat.
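The counting loop can be sketched like this. The function name count_words is an assumption; the .get() call is the part the hint describes.

```python
def count_words(words):
    """Return a dictionary mapping each word to how many times it appears."""
    counts = {}
    for word in words:
        # .get() returns 0 for a word we have not seen yet, so one line
        # covers both the "new word" and "seen before" cases.
        counts[word] = counts.get(word, 0) + 1
    return counts


print(count_words(["the", "log", "the"]))
# {'the': 2, 'log': 1}
```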
Sorting a dictionary by value. sorted() accepts a key= argument. Pass a lambda that returns the count for each word to sort by frequency, and add reverse=True so the most frequent words come first.
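One way to put the sorting hint into practice, as a sketch. The function name top_words and its n parameter are assumptions for illustration. reverse=True puts the highest counts first, and slicing keeps only the top n.

```python
def top_words(counts, n=10):
    """Return the n most frequent words as (word, count) pairs, highest first."""
    # Iterating over a dictionary yields its keys; the lambda looks up each
    # key's count so sorted() orders the words by frequency.
    ranked = sorted(counts, key=lambda word: counts[word], reverse=True)
    return [(word, counts[word]) for word in ranked[:n]]


counts = {"the": 42, "python": 18, "and": 31}
print(top_words(counts, 2))
# [('the', 42), ('and', 31)]
```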
Going further
Once the core analysis works:
- Stop words. Ignore common words like "the", "and", "is". Define a set of stop words and skip any word that appears in it.
- Configurable top N. Let the user specify how many results to show instead of always showing 10.
- Visual output. Print each word with a bar made of repeated characters proportional to its count. Even a simple version makes the output much more readable.
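The three extensions above could be sketched as follows. Everything here is illustrative: the names STOP_WORDS, remove_stop_words, parse_top_n and format_bars are made up for this example, the stop word set is deliberately tiny, and the bar width of 30 characters is an arbitrary choice.

```python
STOP_WORDS = {"the", "and", "is"}  # illustrative only; extend to suit your logs


def remove_stop_words(counts, stop_words=STOP_WORDS):
    """Return a copy of counts without any word found in stop_words."""
    return {word: n for word, n in counts.items() if word not in stop_words}


def parse_top_n(answer, default=10):
    """Turn the user's reply into a positive integer, or fall back to default."""
    try:
        n = int(answer)
    except ValueError:
        return default
    return n if n > 0 else default


def format_bars(ranked, width=30, char="#"):
    """Return one line per (word, count) pair with a bar scaled to the largest count."""
    if not ranked:
        return []
    biggest = max(count for _, count in ranked)
    lines = []
    for word, count in ranked:
        # Scale each bar against the largest count; always show at least one character.
        bar = char * max(1, round(count / biggest * width))
        lines.append(f"{word:<12} {bar} {count}")
    return lines


filtered = remove_stop_words({"the": 42, "python": 18, "you": 9})
for line in format_bars(list(filtered.items()), width=10):
    print(line)
```

A set is a good fit for the stop words because membership checks with "in" stay fast however many words you add.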

