How to find elements and extract their text with Selenium and Python?
We can find elements and extract their text with Selenium webdriver. First, identify the element using any locator like id, class name, CSS selector, or XPath. Then use the text property to obtain the text content.
Syntax
element_text = driver.find_element(By.CSS_SELECTOR, "h4").text
Here driver is the webdriver object. The find_element() method identifies the element using the specified locator, and the text property extracts the text content.
Modern Selenium Approach
Recent Selenium versions use the By class for locators instead of the find_element_by_* methods, which were deprecated and later removed in Selenium 4:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
# Setup Chrome driver
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)
# Navigate to webpage
driver.get("https://www.tutorialspoint.com/index.htm")
# Find element and extract text
element_text = driver.find_element(By.CSS_SELECTOR, "h4").text
print("The text is:", element_text)
driver.quit()
Different Locator Methods
You can use various locator strategies to find elements:
from selenium import webdriver
from selenium.webdriver.common.by import By
# By ID
text_by_id = driver.find_element(By.ID, "element-id").text
# By Class Name
text_by_class = driver.find_element(By.CLASS_NAME, "element-class").text
# By Tag Name
text_by_tag = driver.find_element(By.TAG_NAME, "h1").text
# By XPath
text_by_xpath = driver.find_element(By.XPATH, "//div[@class='content']").text
# By CSS Selector
text_by_css = driver.find_element(By.CSS_SELECTOR, "div.content").text
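Each find_element() call returns only the first matching element. When several elements match the same locator, find_elements() (plural) returns all of them as a list, or an empty list when nothing matches. The helper below is a minimal sketch of that pattern, not part of Selenium itself: the name collect_texts is hypothetical, and driver is assumed to be an already-initialized webdriver object.

```python
def collect_texts(driver, by, value):
    """Return the visible text of every element matching the locator.

    `driver` is assumed to be an initialized Selenium webdriver (any object
    with a compatible find_elements(by, value) method also works).
    Elements with no visible text, such as hidden ones, are skipped.
    """
    texts = []
    # find_elements() returns an empty list instead of raising when nothing matches
    for element in driver.find_elements(by, value):
        text = element.text.strip()
        if text:
            texts.append(text)
    return texts


# Usage with a live session (assumes `driver` already exists):
# from selenium.webdriver.common.by import By
# headings = collect_texts(driver, By.CSS_SELECTOR, "h4")
# print(headings)
```

Because find_elements() does not raise when there is no match, this helper can also serve as a quick existence check: an empty result means the locator matched nothing with visible text.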
Complete Example
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Setup driver
service = Service("path/to/chromedriver")
driver = webdriver.Chrome(service=service)
try:
    # Navigate to webpage
    driver.get("https://www.tutorialspoint.com/index.htm")
    # Wait for element to be present
    wait = WebDriverWait(driver, 10)
    element = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "h4")))
    # Extract text
    element_text = element.text
    print(f"The text is: {element_text}")
except Exception as e:
    print(f"Error: {e}")
finally:
    driver.quit()
Key Points
- Always use driver.quit() to close the browser properly
- Use WebDriverWait for better reliability with dynamic content
- The text property returns visible text only, not HTML content
- Use get_attribute("innerHTML") to get HTML content instead
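To make the difference between visible text and HTML content concrete, the sketch below reads one element three ways. The function name describe_element is hypothetical, and element is assumed to be a WebElement returned by find_element(). Reading "textContent" through get_attribute() is a common way to get text that is hidden by CSS; it is shown here as an assumption about typical usage rather than something the steps above require.

```python
def describe_element(element):
    """Return three views of a single element's content.

    `element` is assumed to be a Selenium WebElement (any object with a
    `text` property and a get_attribute(name) method behaves the same way).
    """
    return {
        # Text as rendered to the user; empty when the element is hidden
        "visible_text": element.text,
        # Markup between the element's opening and closing tags
        "inner_html": element.get_attribute("innerHTML"),
        # All text in the element's DOM subtree, including text hidden by CSS
        "text_content": element.get_attribute("textContent"),
    }


# Usage with a live session (assumes `driver` already exists):
# from selenium.webdriver.common.by import By
# info = describe_element(driver.find_element(By.CSS_SELECTOR, "h4"))
# print(info["visible_text"], info["inner_html"])
```

If visible_text comes back empty while text_content does not, the element is most likely hidden or not yet rendered, which is a hint to add a wait before reading it.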
Conclusion
Use Selenium's find_element() method with appropriate locators and the text property to extract element text. Always use modern Selenium syntax with the By class and implement proper wait conditions for reliable web scraping.
