Python program to find the frequency of the elements which are common in a list of strings
In this Python article, we explore how to find the frequency of elements that are common in a list of strings. We'll demonstrate three different approaches: finding character frequency using reduce(), word frequency using list operations, and word frequency using pandas functions.
Example 1: Character Frequency Using reduce() Function
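This approach counts the characters that appear in every string. The & operator on two Counter objects keeps only the keys present in both and takes the smaller of the two counts, so reduce() can fold it across the whole list: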
from functools import reduce
from collections import Counter
# Sample list of strings
strings = ['Types of Environment - Class Discussion',
           'Management Structure and Nature - Class Discussion',
           'Macro- Demography, Natural, Legal & Political - Class Discussion']
print("Input strings:")
for s in strings:
    print(f" '{s}'")
# Find common characters and their frequency (the comparison is case-sensitive)
char_freq = reduce(lambda m, n: m & n,
                   (Counter(elem) for elem in strings[1:]),
                   Counter(strings[0]))
print("\nCommon characters and their frequency:")
# most_common() orders the pairs by count, highest first
print(dict(char_freq.most_common()))
Input strings:
'Types of Environment - Class Discussion'
'Management Structure and Nature - Class Discussion'
'Macro- Demography, Natural, Legal & Political - Class Discussion'
Common characters and their frequency:
{'s': 5, ' ': 5, 'e': 2, 'i': 2, 'o': 1, 'n': 1, 'r': 1, 'm': 1, 't': 1, '-': 1, 'C': 1, 'l': 1, 'a': 1, 'D': 1, 'c': 1, 'u': 1}
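To see why the intersection yields the shared characters, it can help to look at the & operator on two small Counter objects in isolation. The following is a minimal sketch; the words "banana" and "bandit" are made-up inputs chosen only for illustration and are not part of the article's data.

```python
from collections import Counter

# Two made-up strings, used only to illustrate the operator
first = Counter("banana")   # b: 1, a: 3, n: 2
second = Counter("bandit")  # b: 1, a: 1, n: 1, d: 1, i: 1, t: 1

# '&' keeps only the keys found in both Counters,
# and for each key it keeps the smaller of the two counts
common = first & second
print(dict(common))
```

Here 'a' appears three times in the first string but only once in the second, so its common frequency is 1, and 'd', 'i' and 't' are dropped because the first string does not contain them.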
Example 2: Word Frequency Using List Operations
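This method splits each string on whitespace, combines the words into a single list, and uses Counter to find frequencies. It counts every word, not only the shared ones; in this sample the words present in all three strings ('-', 'Class' and 'Discussion') are the ones with a count of 3: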
from collections import Counter
# Sample list of strings
strings = ['Types of Environment - Class Discussion',
           'Management Structure and Nature - Class Discussion',
           'Macro- Demography, Natural, Legal & Political - Class Discussion']
print("Input strings:")
for s in strings:
    print(f" '{s}'")
# Split each string into words and combine
words_list = []
for string in strings:
    words_list.extend(string.split())
print(f"\nCombined word list: {words_list}")
# Count frequency of words
word_freq = Counter(words_list)
print("\nWord frequency:")
for word, freq in word_freq.most_common():
    print(f" '{word}': {freq}")
Input strings:
'Types of Environment - Class Discussion'
'Management Structure and Nature - Class Discussion'
'Macro- Demography, Natural, Legal & Political - Class Discussion'
Combined word list: ['Types', 'of', 'Environment', '-', 'Class', 'Discussion', 'Management', 'Structure', 'and', 'Nature', '-', 'Class', 'Discussion', 'Macro-', 'Demography,', 'Natural,', 'Legal', '&', 'Political', '-', 'Class', 'Discussion']
Word frequency:
'-': 3
'Class': 3
'Discussion': 3
'Types': 1
'of': 1
'Environment': 1
'Management': 1
'Structure': 1
'and': 1
'Nature': 1
'Macro-': 1
'Demography,': 1
'Natural,': 1
'Legal': 1
'&': 1
'Political': 1
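If only the words shared by every string are of interest, the Counter intersection from Example 1 can be applied at word level. The sketch below is one possible way to do this, not the article's own method; the names per_string, common_words and common_totals are illustrative.

```python
from collections import Counter
from functools import reduce

strings = ['Types of Environment - Class Discussion',
           'Management Structure and Nature - Class Discussion',
           'Macro- Demography, Natural, Legal & Political - Class Discussion']

# One Counter per string, so '&' can intersect them pairwise
per_string = [Counter(s.split()) for s in strings]

# Words that occur in every string
common_words = reduce(lambda a, b: a & b, per_string)

# Total occurrences across the whole list, restricted to the common words
total = Counter(word for s in strings for word in s.split())
common_totals = {word: total[word] for word in common_words}
print(common_totals)
```

The intersection decides which words qualify, while the combined Counter supplies how often each of them occurs across the whole list.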
Example 3: Word Frequency Using Pandas
This approach builds a pandas Series from the combined word list and counts the words with its value_counts() method:
import pandas as pd
# Sample list of strings
strings = ['Types of Environment - Class Discussion',
           'Management Structure and Nature - Class Discussion',
           'Macro- Demography, Natural, Legal & Political - Class Discussion']
print("Input strings:")
for s in strings:
    print(f" '{s}'")
# Split each string into words and combine
words_list = []
for string in strings:
    words_list.extend(string.split())
print(f"\nCombined word list: {words_list}")
# Use pandas Series to count frequencies
word_freq = pd.Series(words_list).value_counts()
print("\nWord frequency using pandas:")
print(word_freq)
Input strings:
'Types of Environment - Class Discussion'
'Management Structure and Nature - Class Discussion'
'Macro- Demography, Natural, Legal & Political - Class Discussion'
Combined word list: ['Types', 'of', 'Environment', '-', 'Class', 'Discussion', 'Management', 'Structure', 'and', 'Nature', '-', 'Class', 'Discussion', 'Macro-', 'Demography,', 'Natural,', 'Legal', '&', 'Political', '-', 'Class', 'Discussion']
Word frequency using pandas:
-              3
Class          3
Discussion     3
Types          1
of             1
Environment    1
Management     1
Structure      1
and            1
Nature         1
Macro-         1
Demography,    1
Natural,       1
Legal          1
&              1
Political      1
dtype: int64
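Both word-based examples split on whitespace only, so tokens such as 'Demography,' and 'Macro-' keep their punctuation and would not match 'Demography' or 'Macro' in another string. The sketch below shows one possible normalization step using only the standard library; the helper name tokenize is hypothetical, and lowercasing is an assumption that may not suit every dataset.

```python
import string
from collections import Counter

strings = ['Types of Environment - Class Discussion',
           'Management Structure and Nature - Class Discussion',
           'Macro- Demography, Natural, Legal & Political - Class Discussion']

def tokenize(text):
    """Lowercase each word and strip leading/trailing punctuation from it."""
    tokens = (word.strip(string.punctuation).lower() for word in text.split())
    # Tokens made only of punctuation (such as '-' or '&') become empty and are dropped
    return [token for token in tokens if token]

word_freq = Counter(token for s in strings for token in tokenize(s))
print(word_freq.most_common(2))
```

With this step, 'class' and 'discussion' remain the only words with a count of 3, and the standalone '-' and '&' no longer appear in the result.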
Working with Excel Data
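To read string data from Excel files, you can use openpyxl and pandas. In the snippet below, "data.xlsx" is a placeholder file name and column index 3 (the fourth column) is assumed to hold the text; note that sheet.values yields every row, including any header row: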
import openpyxl
import pandas as pd
# Load Excel file
workbook = openpyxl.load_workbook("data.xlsx")
sheet = workbook.active
# Convert to DataFrame
df = pd.DataFrame(sheet.values)
# Filter rows containing specific text
filtered_df = df[df.iloc[:,3].str.contains('Discussion', na=False)]
# Extract the strings column
strings = filtered_df.iloc[:,3].values.tolist()
# Now apply any of the frequency methods above
Comparison of Methods
| Method | Element Type | Best For | Performance |
|---|---|---|---|
| reduce() + Counter | Characters | Character-level analysis | Good for small datasets |
| Counter + List | Words | Word frequency analysis | Fast and memory efficient |
| Pandas value_counts() | Words | Data analysis workflows | Optimized for large datasets |
Conclusion
Use reduce() with Counter for character frequency analysis. For word frequency, Counter provides a simple solution while pandas value_counts() integrates well with data analysis workflows.
