Beautiful Soup - find_all() Method



Method Description

The find_all() method in Beautiful Soup searches the descendants of a tag (or of the whole BeautifulSoup object) for elements that match the given criteria and returns all of the matches as a list.

Syntax

soup.find_all(name, attrs, recursive, string, limit, **kwargs)

Parameters

name − A filter on tag name.

attrs − A dictionary of filters on attribute values.

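recursive − If this is True (the default), a recursive search is performed over all descendants. Otherwise, only the direct children are considered.

string − A filter on the text of an element. It lets you search for strings instead of tags.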

limit − Stop looking after the specified number of matches has been found.

kwargs − Additional keyword arguments. Each one is treated as a filter on the attribute of the same name, for example id='nm'.
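The parameters above can be sketched in a short script. The markup below is hypothetical and chosen only for illustration; the variable names are assumptions, not part of the original tutorial.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only to illustrate the parameters.
html = """
<div id="outer">
  <p class="note">first</p>
  <section>
    <p class="note">second</p>
    <p id="plain">third</p>
  </section>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
outer = soup.find("div")

# name: match every <p> tag anywhere below the starting element.
all_p = soup.find_all("p")

# attrs: a dictionary of attribute filters.
notes = soup.find_all("p", attrs={"class": "note"})

# kwargs: a keyword argument is treated as a filter on that attribute.
plain = soup.find_all(id="plain")

# recursive=False: only direct children of <div id="outer"> are examined,
# so the two <p> tags nested inside <section> are skipped.
direct_p = outer.find_all("p", recursive=False)

print(len(all_p), len(notes), len(plain), len(direct_p))
```

With this markup the script prints 3 2 1 1, because only the first paragraph is a direct child of the outer div.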

Return type

The find_all() method returns a ResultSet object, which is a subclass of Python's list. If nothing matches, the ResultSet is empty.
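As a quick check of the return type, the following sketch inspects the result of two searches. The one-line markup is a made-up example, and ResultSet is imported from bs4.element only so that the isinstance test can be shown.

```python
from bs4 import BeautifulSoup
from bs4.element import ResultSet

# Hypothetical markup with two paragraphs and no table.
soup = BeautifulSoup("<p>one</p><p>two</p>", "html.parser")

found = soup.find_all("p")        # two matches
missing = soup.find_all("table")  # no matches

# A ResultSet supports the usual list operations.
print(isinstance(found, ResultSet), isinstance(found, list))
print(len(found), found[0].get_text())

# No match gives an empty result, not None.
print(len(missing))
```

Because an unsuccessful search returns an empty list rather than None, the result can be looped over without a separate check.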

Example - finding all input tags

When we pass in a value for name, Beautiful Soup only considers tags with that name. Text strings are ignored, as are tags whose names don't match. In this example we pass 'input' to the find_all() method.

from bs4 import BeautifulSoup

html = """
<html>
   <head>
      <title>TutorialsPoint</title>
   </head>
   <body>
      <form>
         <input type = 'text' id = 'nm' name = 'name'>
         <input type = 'text' id = 'age' name = 'age'>
         <input type = 'text' id = 'marks' name = 'marks'>
      </form>
   </body>
</html>
"""

soup = BeautifulSoup(html, 'html.parser')

# Collect every <input> tag in the document
obj = soup.find_all('input')
print(obj)

Output

[<input id="nm" name="name" type="text"/>, <input id="age" name="age" type="text"/>, <input id="marks" name="marks" type="text"/>]

Example - finding matching strings using a filtering function

We can use the string argument of the find_all() method to search for strings instead of tags. You can pass in a string, a regular expression, a list, a function, or the value True.

In this example, a function is passed to the string argument. All the strings starting with 'A' are returned by the find_all() method.

from bs4 import BeautifulSoup

html = """
<html>
   <body>
      <h2>Departmentwise Employees</h2>
      <ul id="dept">
      <li>Accounts</li>
         <ul id='acc'>
         <li>Anand</li>
         <li>Mahesh</li>
         </ul>
      <li>HR</li>
         <ol id="HR">
         <li>Rani</li>
         <li>Ankita</li>
         </ol>
      </ul>
   </body>
</html>
"""

soup = BeautifulSoup(html, 'html.parser')

# Filter function: called with each string, returns True to keep it
def startingwith(ch):
   return ch.startswith('A')

lst = soup.find_all(string=startingwith)

print(lst)

Output

['Accounts', 'Anand', 'Ankita']
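The other kinds of value accepted by the string argument can be sketched in the same way. The single-line markup below is an assumption made for brevity (it avoids whitespace-only strings between tags), and the variable names are illustrative.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup on one line, so there are no whitespace-only strings.
html = "<ul><li>Anand</li><li>Mahesh</li><li>Rani</li><li>Ankita</li></ul>"
soup = BeautifulSoup(html, "html.parser")

# A plain string matches text that is exactly equal to it.
exact = soup.find_all(string="Rani")

# A compiled regular expression matches text it can find a match in.
by_regex = soup.find_all(string=re.compile("^A"))

# A list matches text equal to any of its items.
by_list = soup.find_all(string=["Mahesh", "Rani"])

# True matches every string in the document.
everything = soup.find_all(string=True)

print(exact)
print(by_regex)
print(by_list)
print(len(everything))
```

The regular expression version gives the same kind of result as the startingwith() function above, without defining a separate function.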

Example - Finding first two appearances of a tag

In this example, we pass the limit=2 argument to the find_all() method. The search stops after two matches, so the method returns the first two <li> tags in document order.

from bs4 import BeautifulSoup

html = """
<html>
   <body>
      <h2>Departmentwise Employees</h2>
      <ul id="dept">
      <li>Accounts</li>
         <ul id='acc'>
         <li>Anand</li>
         <li>Mahesh</li>
         </ul>
      <li>HR</li>
         <ol id="HR">
         <li>Rani</li>
         <li>Ankita</li>
         </ol>
      </ul>
   </body>
</html>
"""

soup = BeautifulSoup(html, 'html.parser')

lst = soup.find_all('li', limit=2)

print(lst)

Output

[<li>Accounts</li>, <li>Anand</li>]
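A related point is how limit=1 compares with the find() method. The sketch below uses a small made-up list to show the difference: find_all() always returns a list, while find() returns the first matching element itself, or None when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with three list items and no table.
html = "<ul><li>Accounts</li><li>HR</li><li>Sales</li></ul>"
soup = BeautifulSoup(html, "html.parser")

# limit=2 stops the search after two matches.
first_two = soup.find_all("li", limit=2)

# limit=1 still returns a list, holding a single tag.
single = soup.find_all("li", limit=1)

# find() returns the first matching tag directly.
first = soup.find("li")

# With no match, find_all() gives an empty list and find() gives None.
empty = soup.find_all("table", limit=1)
nothing = soup.find("table")

print(first_two)
print(single[0] == first)
print(len(empty), nothing)
```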