How to Split Text Using Regex in Golang?


In Golang, splitting text using regular expressions (regex) is a powerful and flexible way to extract information from strings. In this article, we will explore how to split text using regex in Golang.

Using the regexp.Split() function

Golang's standard library includes the regexp package for working with regular expressions. Split() is a method on a compiled pattern (a *regexp.Regexp value) and splits a string wherever the pattern matches.

Example

Here is an example of using the regexp.Split() function to split a string using a regex pattern −

package main

import (
   "fmt"
   "regexp"
)

func main() {
   str := "The quick brown fox jumps over the lazy dog"
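   // Raw string literal (backticks): "\s" is not a valid escape in a
   // double-quoted Go string and would not compile.
   pattern := `\s+`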

   regex := regexp.MustCompile(pattern)
   result := regex.Split(str, -1)

   fmt.Printf("%q\n", result)
}

Output

["The" "quick" "brown" "fox" "jumps" "over" "the" "lazy" "dog"]

In the above example, str holds a sentence that we want to split into words wherever whitespace occurs. The pattern \s+ matches one or more whitespace characters. regexp.MustCompile() compiles the pattern into a *regexp.Regexp value and panics if the pattern is invalid. Finally, we call the Split() method on that value, passing the input string and -1, to obtain a slice of words.

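The second argument of Split() specifies the maximum number of substrings to return, not the number of splits. If the value is positive, at most that many substrings are returned and the last one holds the unsplit remainder. If it is zero, the result is nil. If it is negative, all substrings are returned.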
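To make the effect of the second argument concrete, here is a minimal sketch that calls Split() with a negative, a positive, and a zero limit. The helper name splitWords and the shortened input string are assumptions made for illustration; they are not part of the regexp package.

```go
package main

import (
	"fmt"
	"regexp"
)

// whitespace matches one or more whitespace characters.
var whitespace = regexp.MustCompile(`\s+`)

// splitWords is a hypothetical helper that wraps Split so the
// effect of the limit n is easy to compare.
func splitWords(s string, n int) []string {
	return whitespace.Split(s, n)
}

func main() {
	str := "The quick brown fox"

	fmt.Printf("%q\n", splitWords(str, -1)) // ["The" "quick" "brown" "fox"]
	fmt.Printf("%q\n", splitWords(str, 2))  // ["The" "quick brown fox"]
	fmt.Printf("%q\n", splitWords(str, 0))  // []
}
```

With a limit of 2, the text after the first separator is left unsplit in the final element, which is handy when only the first field of a line is of interest.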

Using the regexp.FindAllString() function

Another way to split a string using regex in Golang is to use the FindAllString() method. Instead of matching the separators, the pattern describes the pieces to keep: the method returns all successive non-overlapping matches of the pattern in a string as a slice of strings.

Example

Here is an example of using the regexp.FindAllString() function to split a string using a regex pattern −

package main

import (
   "fmt"
   "regexp"
)

func main() {
   str := "The quick brown fox jumps over the lazy dog"
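   // Raw string literal (backticks): "\S" is not a valid escape in a
   // double-quoted Go string and would not compile.
   pattern := `\S+`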

   regex := regexp.MustCompile(pattern)
   result := regex.FindAllString(str, -1)

   fmt.Printf("%q\n", result)
}

Output

["The" "quick" "brown" "fox" "jumps" "over" "the" "lazy" "dog"]

In the above example, str holds the same sentence, but this time the pattern describes the words themselves rather than the gaps between them. The pattern \S+ matches one or more non-whitespace characters. regexp.MustCompile() compiles the pattern into a *regexp.Regexp value. Finally, we call the FindAllString() method on that value, passing the input string and -1, to obtain a slice of words.

The second argument of the regex.FindAllString() function specifies the maximum number of matches to be returned. If the value is negative, all matches will be returned.
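The two approaches give the same result for the sentence above, but they can differ when the input starts or ends with a separator. The following sketch compares them on a padded string and also shows a positive match limit. The helper names splitOnSpace and findWords and the sample input are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	separators = regexp.MustCompile(`\s+`) // runs of whitespace
	words      = regexp.MustCompile(`\S+`) // runs of non-whitespace
)

// splitOnSpace (hypothetical helper) splits s at every run of whitespace.
func splitOnSpace(s string) []string {
	return separators.Split(s, -1)
}

// findWords (hypothetical helper) returns up to n runs of
// non-whitespace characters; a negative n returns all of them.
func findWords(s string, n int) []string {
	return words.FindAllString(s, n)
}

func main() {
	str := "  hello world  "

	fmt.Printf("%q\n", splitOnSpace(str))  // ["" "hello" "world" ""]
	fmt.Printf("%q\n", findWords(str, -1)) // ["hello" "world"]
	fmt.Printf("%q\n", findWords(str, 1))  // ["hello"]
}
```

Split() reports an empty string before the leading separator and after the trailing one, while FindAllString() returns only the matched words, so the latter may be the simpler choice when the input can be padded.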

Conclusion

Splitting text using regex in Golang is a flexible way to extract information from strings. Split() works from a pattern that describes the separators, while FindAllString() works from a pattern that describes the pieces to keep; both accept a limit as their second argument.

Updated on: 25-Apr-2023
