Often, after a search returns a match, we need to look one level deeper into parts of that match. For example, given a body of text, we may want to find the web addresses and also extract the different parts of each address, such as the protocol and the domain name. In such a scenario we use the group() method of the match object, which divides the match into groups based on the regular expression. We create these groups by placing parentheses around each part of the pattern we want to capture, leaving the fixed separator text outside the parentheses. Calling group() with no argument returns the whole match, while group(1), group(2) and so on return the individual groups.
import re

text = "The web address is https://www.tutorialspoint.com"

# Use "://" and "." as the fixed text that separates the groups
result = re.search(r'([\w.-]+)://([\w.-]+)\.([\w.-]+)', text)
if result:
    print("The main web address:", result.group())
    print("The protocol:", result.group(1))
    print("The domain name:", result.group(2))
    print("The TLD:", result.group(3))

When we run the above program, we get the following output:

The main web address: https://www.tutorialspoint.com
The protocol: https
The domain name: www.tutorialspoint
The TLD: com
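The same grouping idea can be extended to a body of text that contains more than one web address. The sketch below is only an illustration: the sample text, the group names (protocol, domain, tld) and the helper split_addresses are assumptions made for this example, not part of the program above. It uses named groups, written as (?P<name>...), together with re.finditer so that every address in the text is split into its parts.

```python
import re

# Hypothetical sample text with two addresses (an assumption for this example).
text = "Visit https://www.tutorialspoint.com or http://docs.python.org for details."

# Same pattern as above, but each group is named with (?P<name>...).
pattern = re.compile(
    r'(?P<protocol>[\w.-]+)://(?P<domain>[\w.-]+)\.(?P<tld>[\w.-]+)'
)

def split_addresses(body):
    """Return one dict of named groups for each web address found in body."""
    return [match.groupdict() for match in pattern.finditer(body)]

for parts in split_addresses(text):
    print(parts["protocol"], parts["domain"], parts["tld"])
```

For the sample text this prints "https www.tutorialspoint com" and then "http docs.python org". Named groups can make the code easier to read than numbered groups when a pattern has several parts. Note that this simple pattern would also pick up a full stop placed directly after an address, so it is a starting point rather than a complete URL parser.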