Program to enclose patterns in bold tags in Python

When working with text processing, you often need to highlight specific patterns by wrapping them in HTML tags. This problem involves finding all occurrences of given patterns in a text and enclosing them in <b> tags, while merging overlapping or adjacent patterns.

Problem Understanding

Given a text string and a list of patterns, we need to:

  • Find all substrings that match any pattern
  • Wrap matching substrings in <b> and </b> tags
  • Merge overlapping or adjacent bold regions
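The merging requirement is the subtle one. As an illustration only (the solution in this article tracks bold characters with a boolean array instead), merging can be pictured as combining half-open [start, end) index ranges. The helper name merge_intervals and the hard-coded ranges are assumptions made for this sketch.

```python
def merge_intervals(intervals):
    """Merge half-open [start, end) intervals that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent: extend the previous region
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]


# Two overlapping ranges collapse into one; the first stays separate
print(merge_intervals([(0, 4), (5, 9), (6, 12)]))  # [(0, 4), (5, 12)]
# Two ranges that only touch (adjacent) also collapse into one
print(merge_intervals([(0, 3), (3, 6)]))           # [(0, 6)]
```

Each merged range corresponds to exactly one pair of bold tags in the final output.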

Algorithm Steps

The solution uses a boolean array to track which characters should be bold:

  1. Create a boolean array bold of the same length as text
  2. For each position in text, check if any pattern starts at that position
  3. Mark all characters of matching patterns as bold
  4. Build the result string by adding <b> tags at the start and </b> tags at the end of bold regions

Implementation

class Solution:
    def solve(self, text, patterns):
        n = len(text)
        bold = [False] * n
        
        # Mark characters that should be bold
        for i in range(n):
            for pattern in patterns:
                if text.startswith(pattern, i):  # match at i without slicing
                    for j in range(len(pattern)):
                        bold[i + j] = True

        # Build result string with bold tags
        result = ""
        for i in range(n):
            # Start bold tag if this is the beginning of a bold region
            if bold[i] and (i == 0 or not bold[i - 1]):
                result += "<b>"
            
            result += text[i]
            
            # End bold tag if this is the end of a bold region
            if bold[i] and (i == n - 1 or not bold[i + 1]):
                result += "</b>"
        
        return result

# Test the solution
solution = Solution()
text = "thisissampleline"
patterns = ["this", "ssam", "sample"]
print(solution.solve(text, patterns))

Output:

<b>this</b>i<b>ssample</b>line

How It Works

Let's trace through the example with text "thisissampleline" and patterns ["this", "ssam", "sample"]:

  1. Pattern "this": Found at index 0, marks positions 0-3 as bold
  2. Pattern "ssam": Found at index 5, marks positions 5-8 as bold
  3. Pattern "sample": Found at index 6, marks positions 6-11 as bold

The bold array becomes: [True, True, True, True, False, True, True, True, True, True, True, True, False, False, False, False]

Positions 5-11 are all marked as bold because "ssam" and "sample" overlap, so they merge into one continuous bold region. Position 4 (the letter "i") is not covered by any pattern, which keeps "this" as a separate bold region.
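A trace like this is easy to get wrong by hand, so it can help to let Python report the match positions. The following is a minimal sketch; the function name match_mask and the "B"/"." mask notation are assumptions introduced here for illustration.

```python
def match_mask(text, patterns):
    """Print every match and return a mask with 'B' for bold characters."""
    bold = [False] * len(text)
    for pattern in patterns:
        if not pattern:
            continue  # an empty pattern marks nothing
        pos = text.find(pattern)
        while pos != -1:
            last = pos + len(pattern) - 1
            print(f"{pattern!r} found at index {pos}, covers {pos}-{last}")
            for i in range(pos, last + 1):
                bold[i] = True
            # Step forward by one so overlapping repeats are also found
            pos = text.find(pattern, pos + 1)
    return "".join("B" if flag else "." for flag in bold)


text = "thisissampleline"
print(text)
print(match_mask(text, ["this", "ssam", "sample"]))
```

Printing the text directly above the mask lines up each "B" with a character that ends up inside a bold region.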

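Alternative Approach Using str.find()

Instead of testing every pattern at every position, this version uses str.find() to jump directly to each occurrence of a pattern. Advancing the search start by one position after each hit ensures that overlapping occurrences of the same pattern are also found.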

def embolden_text(text, patterns):
    n = len(text)
    bold = [False] * n
    
    # Mark all matching positions
    for pattern in patterns:
        start = 0
        while True:
            pos = text.find(pattern, start)
            if pos == -1:
                break
            for i in range(pos, pos + len(pattern)):
                bold[i] = True
            start = pos + 1
    
    # Build result
    result = ""
    for i in range(n):
        if bold[i] and (i == 0 or not bold[i - 1]):
            result += "<b>"
        result += text[i]
        if bold[i] and (i == n - 1 or not bold[i + 1]):
            result += "</b>"
    
    return result

# Test the alternative approach
text = "abcdefghijk"
patterns = ["abc", "def"]
print(embolden_text(text, patterns))

Output:

<b>abcdef</b>ghijk

Key Points

  • The algorithm handles overlapping patterns by merging them into continuous bold regions
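  • Matching takes O(n × m × p) character comparisons in the worst case, where n is the text length, m is the number of patterns, and p is the average pattern length
  • Space complexity is O(n) for the boolean array
  • startswith() accepts a start index, so text.startswith(pattern, i) tests for a pattern at position i without copying the tail of the string the way text[i:].startswith(pattern) does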
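As a possible refinement of these points, the boolean array can be replaced by a single index that records where the current bold region ends, and the output can be collected in a list and joined once. This is a sketch of one such variant, not the article's implementation; the name embolden_single_pass and the variables bold_end and in_bold are assumptions.

```python
def embolden_single_pass(text, patterns):
    """Wrap matched patterns in <b> tags without a per-character array."""
    parts = []
    bold_end = 0     # one past the last index known to be bold
    in_bold = False  # whether a <b> tag is currently open
    for i, ch in enumerate(text):
        for pattern in patterns:
            # A start index avoids building the slice text[i:]
            if pattern and text.startswith(pattern, i):
                bold_end = max(bold_end, i + len(pattern))
        is_bold = i < bold_end
        if is_bold and not in_bold:
            parts.append("<b>")
        elif in_bold and not is_bold:
            parts.append("</b>")
        in_bold = is_bold
        parts.append(ch)
    if in_bold:
        parts.append("</b>")  # close a region that runs to the end
    return "".join(parts)


print(embolden_single_pass("thisissampleline", ["this", "ssam", "sample"]))
# <b>this</b>i<b>ssample</b>line
```

Overlapping and adjacent matches merge naturally here because a new match only ever pushes bold_end further right while a tag is still open.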

Conclusion

This solution efficiently identifies and merges overlapping text patterns using a boolean array to track bold regions. The approach ensures that adjacent or overlapping matches are combined into single bold tags, creating clean HTML output.

---
Updated on: 2026-03-25T12:14:49+05:30
