Run Length Encoding in Python
Run-length encoding compresses a string by grouping consecutive identical characters and representing them as character + count. For example, "aaabbc" becomes "a3b2c1".
A related but different technique counts the total occurrences of each character rather than the length of each consecutive run. The two are easy to confuse, so let's explore both approaches.
Character Frequency Encoding
This approach counts the total occurrences of each character:
import collections

def character_frequency_encoding(string):
    # Initialize ordered dictionary to maintain character order
    count_dict = collections.OrderedDict.fromkeys(string, 0)

    # Count occurrences of each character
    for char in string:
        count_dict[char] += 1

    # Build encoded string
    encoded_string = ""
    for key, value in count_dict.items():
        encoded_string += key + str(value)
    return encoded_string

# Test examples
string1 = "tutorialspoint"
result1 = character_frequency_encoding(string1)
print(f"'{string1}' -> '{result1}'")

string2 = "aaaaaabbbbbccccccczzzzzz"
result2 = character_frequency_encoding(string2)
print(f"'{string2}' -> '{result2}'")

Output:
'tutorialspoint' -> 't3u1o2r1i2a1l1s1p1n1'
'aaaaaabbbbbccccccczzzzzz' -> 'a6b5c7z6'
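The same counting can be written more compactly with `collections.Counter`. The sketch below is an alternative, not the article's original code: the function name `character_frequency_counter` is made up for illustration, and it assumes Python 3.7 or later, where `Counter` iterates in first-seen order like any other dict.

```python
from collections import Counter

def character_frequency_counter(text):
    # Counter is a dict subclass, so on Python 3.7+ it iterates in
    # first-seen order, matching the OrderedDict version.
    counts = Counter(text)
    return "".join(f"{char}{count}" for char, count in counts.items())

print(character_frequency_counter("tutorialspoint"))
print(character_frequency_counter("aaaaaabbbbbccccccczzzzzz"))
```

Building the result with `"".join(...)` also avoids repeated string concatenation inside a loop.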
True Run-Length Encoding
Traditional run-length encoding groups consecutive identical characters:
def run_length_encoding(string):
    if not string:
        return ""

    encoded = ""
    current_char = string[0]
    count = 1

    for i in range(1, len(string)):
        if string[i] == current_char:
            count += 1
        else:
            encoded += current_char + str(count)
            current_char = string[i]
            count = 1

    # Add the last group
    encoded += current_char + str(count)
    return encoded

# Test examples
test1 = "aaabbc"
result1 = run_length_encoding(test1)
print(f"'{test1}' -> '{result1}'")

test2 = "aabbbaabbcc"
result2 = run_length_encoding(test2)
print(f"'{test2}' -> '{result2}'")

Output:
'aaabbc' -> 'a3b2c1'
'aabbbaabbcc' -> 'a2b3a2b2c2'
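Since run-length encoding keeps the order of the runs, the encoded string can be expanded back to the original. The article does not include a decoder, so the following is only a minimal sketch: `run_length_decoding` is a hypothetical helper, and it assumes the original text contained no digit characters, since otherwise the character + count format would be ambiguous.

```python
import re

def run_length_decoding(encoded):
    # Each group is one non-digit character followed by its decimal count.
    # Assumption: the original text contained no digit characters.
    pairs = re.findall(r"(\D)(\d+)", encoded)
    return "".join(char * int(count) for char, count in pairs)

print(run_length_decoding("a3b2c1"))      # aaabbc
print(run_length_decoding("a2b3a2b2c2"))  # aabbbaabbcc
```

Multi-digit counts are handled too, because `\d+` consumes the whole number before the next character starts.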
Comparison
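| Method | Input | Output | Use Case |
|---|---|---|---|
| Character Frequency | "aabbcc" | "a2b2c2" | Character counting |
| Run-Length Encoding | "aabbcc" | "a2b2c2" | Data compression |

The two outputs match here only because each character in "aabbcc" forms a single run. Once a character recurs in separate runs, the results differ: for "aabbbaabbcc", character frequency encoding gives "a4b5c2", while run-length encoding gives "a2b3a2b2c2".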
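To see where the two methods diverge, the sketch below runs compact versions of both encoders on several inputs. The helper names `freq_encode` and `rle_encode` are illustrative, and the `itertools.groupby` formulation is a swapped-in alternative to the explicit loop used earlier; `freq_encode` again assumes Python 3.7+ ordering.

```python
from collections import Counter
from itertools import groupby

def freq_encode(text):
    # Total count per character, in first-seen order (Python 3.7+).
    return "".join(f"{c}{n}" for c, n in Counter(text).items())

def rle_encode(text):
    # groupby yields one group per run of consecutive equal characters.
    return "".join(f"{c}{len(list(run))}" for c, run in groupby(text))

for text in ("aabbcc", "aabb", "abab", "aabbbaabbcc"):
    print(f"{text!r}: frequency={freq_encode(text)!r}, "
          f"run-length={rle_encode(text)!r}")
```

Note that "aabb" and "abab" produce the same frequency encoding ("a2b2") but different run-length encodings ("a2b2" and "a1b1a1b1"), which is why only the latter can be decoded back to its input.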
Conclusion
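Character frequency encoding counts the total occurrences of each character, using OrderedDict to keep the characters in first-seen order. Because it discards position information, the original string cannot be recovered from its output. True run-length encoding records each run of consecutive identical characters, so it can be decoded (provided the input itself contains no digits) and is the one suited to data compression. It pays off when the input contains long runs; an input with few repeats, such as "abc", grows to "a1b1c1".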
