Find k longest words in given list in Python

Given a list of words, a common task is to pick out the k longest. Python offers several ways to do this with the built-in functions sorted() and enumerate() and the standard-library heapq module.

Using sorted() with Length as Key

The simplest approach is to sort the words by length in descending order and slice off the first k elements:

def k_longest_words(words, k):
    return sorted(words, key=len, reverse=True)[:k]

words = ['Earth', 'Moonshine', 'Aurora', 'Snowflakes', 'Sunshine']
k = 3
result = k_longest_words(words, k)
print(f"Top {k} longest words: {result}")
Top 3 longest words: ['Snowflakes', 'Moonshine', 'Sunshine']

Using sorted() with Stable Ordering

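When several words have the same length, you usually want them to keep their original relative order. Python's sort is stable, so sorted() already does this, even with reverse=True. The version below makes the tie-break explicit by sorting on the pair (negative length, original index) obtained from enumerate(), which is also a convenient starting point if you need a different tie-breaking rule: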

def k_longest_stable(words, k):
    indexed_words = list(enumerate(words))
    sorted_words = sorted(indexed_words, key=lambda x: (-len(x[1]), x[0]))
    return [word for _, word in sorted_words[:k]]

words = ['Earth', 'Moonshine', 'Aurora', 'Snowflakes', 'Sunshine']
k = 3
result = k_longest_stable(words, k)
print(f"Top {k} longest words (stable): {result}")
Top 3 longest words (stable): ['Snowflakes', 'Moonshine', 'Sunshine']
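
The sample list above has no ties among its three longest words, so it does not exercise the tie-breaking logic. The following is a minimal sketch using a hypothetical word list with several four-letter words. The function k_longest_alphabetical and the list tied_words are illustrative names that are not part of the original examples; the alphabetical rule is just one possible alternative tie-break.

```python
def k_longest_stable(words, k):
    # Sort by descending length, then by original position for ties.
    indexed_words = list(enumerate(words))
    sorted_words = sorted(indexed_words, key=lambda x: (-len(x[1]), x[0]))
    return [word for _, word in sorted_words[:k]]

def k_longest_alphabetical(words, k):
    # Hypothetical variant: break ties alphabetically instead of by position.
    return sorted(words, key=lambda w: (-len(w), w))[:k]

# Hypothetical sample data: 'pear', 'kiwi', 'plum' and 'lime' all have length 4.
tied_words = ['pear', 'fig', 'kiwi', 'banana', 'plum', 'lime']

by_position = k_longest_stable(tied_words, 3)
plain_sorted = sorted(tied_words, key=len, reverse=True)[:3]
alphabetical = k_longest_alphabetical(tied_words, 3)

print(by_position)   # ['banana', 'pear', 'kiwi']
print(plain_sorted)  # ['banana', 'pear', 'kiwi']
print(alphabetical)  # ['banana', 'kiwi', 'lime']
```

The first two results agree because sorted() keeps equal-length words in their input order; only the alphabetical variant reorders the ties.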

Using heapq for Large Lists

For very large lists, heapq.nlargest() can be more efficient when k is small relative to the list size, because it tracks only k candidates in a heap instead of sorting the entire list:

import heapq

def k_longest_heap(words, k):
    return heapq.nlargest(k, words, key=len)

words = ['Earth', 'Moonshine', 'Aurora', 'Snowflakes', 'Sunshine', 'Sky', 'Ocean']
k = 4
result = k_longest_heap(words, k)
print(f"Top {k} longest words using heap: {result}")
Top 4 longest words using heap: ['Snowflakes', 'Moonshine', 'Sunshine', 'Aurora']
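
To see where the O(n log k) behaviour comes from, it can help to write the bounded-heap idea out by hand. The sketch below is an illustration of the general technique, not the actual implementation of heapq.nlargest(); the function name k_longest_bounded_heap and the (length, -index, word) entry layout are assumptions made for this example.

```python
import heapq

def k_longest_bounded_heap(words, k):
    if k <= 0:
        return []
    # Min-heap of (length, -index, word). The root is always the weakest
    # word currently kept, so the heap never grows beyond k entries.
    heap = []
    for index, word in enumerate(words):
        # -index makes earlier words rank higher among equal lengths.
        entry = (len(word), -index, word)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # The new word beats the weakest kept word: swap it in.
            heapq.heapreplace(heap, entry)
    # Longest first; ties come out in original order.
    return [word for _, _, word in sorted(heap, reverse=True)]

words = ['Earth', 'Moonshine', 'Aurora', 'Snowflakes', 'Sunshine', 'Sky', 'Ocean']
print(k_longest_bounded_heap(words, 4))
# ['Snowflakes', 'Moonshine', 'Sunshine', 'Aurora']
```

Each of the n words costs at most one O(log k) heap operation, and the final sort touches only k entries.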

Comparison of Methods

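Method                    | Time Complexity | Best For                        | Keeps Original Order of Ties
sorted()                  | O(n log n)      | Small to medium lists           | Yes (Python's sort is stable)
sorted() with enumerate() | O(n log n)      | Explicit or custom tie-breaking | Yes
heapq.nlargest()          | O(n log k)      | Large lists, small k            | Yes (documented as equivalent to sorted(..., reverse=True)[:k])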
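
The complexity figures are asymptotic, so it is worth measuring on your own data before choosing. The following is a rough timing sketch using timeit on randomly generated words; the list size, word lengths, k, and repeat count are arbitrary assumptions, and the measured times will vary by machine and Python version.

```python
import heapq
import random
import string
import timeit

random.seed(0)  # fixed seed so the generated data is reproducible

# Hypothetical test data: 100,000 random lowercase words of length 1 to 20.
words = [
    ''.join(random.choices(string.ascii_lowercase, k=random.randint(1, 20)))
    for _ in range(100_000)
]
k = 10

def with_sorted():
    return sorted(words, key=len, reverse=True)[:k]

def with_heap():
    return heapq.nlargest(k, words, key=len)

# Total time for 5 runs of each approach.
sorted_time = timeit.timeit(with_sorted, number=5)
heap_time = timeit.timeit(with_heap, number=5)

print(f"sorted():         {sorted_time:.4f} s")
print(f"heapq.nlargest(): {heap_time:.4f} s")
```

For small lists, or when k approaches the list length, the difference may be negligible or may favour sorted().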

Handling Edge Cases

The following function handles common edge cases: an empty list, a non-positive k, and a k larger than the list:

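def find_k_longest_words(words, k):
    # An empty list or a non-positive k yields no words.
    # (Without this check, a negative k would slice from the wrong end.)
    if not words or k <= 0:
        return []

    # Slicing past the end is safe, so k >= len(words) simply
    # returns every word, longest first.
    return sorted(words, key=len, reverse=True)[:k]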

# Test with various inputs
test_cases = [
    (['Python', 'Java', 'C++', 'JavaScript'], 2),
    (['Hi'], 3),  # k greater than list length
    ([], 2),      # empty list
    (['Same', 'Size'], 1)
]

for words, k in test_cases:
    result = find_k_longest_words(words, k)
    print(f"Words: {words}, k={k} -> {result}")
Words: ['Python', 'Java', 'C++', 'JavaScript'], k=2 -> ['JavaScript', 'Python']
Words: ['Hi'], k=3 -> ['Hi']
Words: [], k=2 -> []
Words: ['Same', 'Size'], k=1 -> ['Same']

Conclusion

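Use sorted() with key=len and reverse=True for simple cases; because Python's sort is stable, words of equal length keep their original order. For large datasets with small k values, heapq.nlargest() is more efficient. When you want the tie-breaking rule to be explicit or customizable, combine sorted() with enumerate().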

Updated on: 2026-03-15T18:26:45+05:30
