In this article, we will learn how to scrape web pages using the lxml module available in Python.
Web scraping is used to obtain data from a website with the help of a crawler or scanner. It comes in handy for extracting data from a web page that does not offer an API. In Python, web scraping can be done with various modules, such as Beautiful Soup, Scrapy, and lxml.
Here we will discuss web scraping using the lxml module.
For that, we first need to install lxml.
Type in the terminal or command prompt −
pip install lxml
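To confirm that the installation worked, a quick check like the one below can be run from Python. This is only a minimal sketch: it imports the package and prints the version tuple that lxml exposes as etree.LXML_VERSION. The variable name lxml_version is chosen here for illustration.

```python
# Quick sanity check that lxml is importable after installation.
from lxml import etree

# LXML_VERSION is a tuple of integers describing the installed lxml release.
lxml_version = etree.LXML_VERSION

# Join the parts into a dotted string for display.
print("lxml version:", ".".join(str(part) for part in lxml_version))
```

If the import fails with ModuleNotFoundError, the package was most likely installed for a different Python interpreter than the one being run.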
Here, XPath expressions are used to locate and access the data in the parsed page.
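As a small illustration of how XPath works with lxml, the sketch below parses a made-up HTML string and pulls out text and attribute values. The markup, the ids, and the class names are all assumptions invented for this example; they are not taken from any real website.

```python
from lxml import html

# Hypothetical markup, invented for illustration only.
sample_page = """
<html><body>
  <div id="games">
    <a class="game" href="/app/1"><span class="title">Game One</span><span class="price">$9.99</span></a>
    <a class="game" href="/app/2"><span class="title">Game Two</span><span class="price">Free</span></a>
  </div>
</body></html>
"""

# Parse the string into an element tree.
tree = html.fromstring(sample_page)

# text() selects the text nodes of the matched elements.
titles = tree.xpath('//div[@id="games"]//span[@class="title"]/text()')
prices = tree.xpath('//div[@id="games"]//span[@class="price"]/text()')

# @href selects the value of the href attribute.
links = tree.xpath('//div[@id="games"]/a/@href')

print(titles)
print(prices)
print(links)
```

Each call to xpath() returns a list, so an expression that matches nothing yields an empty list rather than raising an error.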
In this article, we will extract data from the Steam website, which contains information about different games.
On the page, we will try to extract information from the popular new releases section.
Here we will extract the names, prices, associated tags, and target platforms.
On the page, view the HTML code of the new releases tab by using the Inspect Element feature in Chrome. This tells us which tag stores the required information.
On this website, every list element is encapsulated in a div tag with id=tab_content, which is further encapsulated in a div tag with id=tab_select_newreleases.
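The nesting described above can be sketched with a small stand-in document. Only the two ids, tab_select_newreleases and tab_content, come from the description; the inner div elements, their item class, and the game names are hypothetical placeholders, so the real page will differ in its details.

```python
from lxml import html

# Stand-in markup: only the two ids mirror the structure described in the text.
# The inner "item" divs and their contents are hypothetical placeholders.
sample_page = """
<html><body>
  <div id="tab_select_newreleases">
    <div id="tab_content">
      <div class="item">First Game</div>
      <div class="item">Second Game</div>
    </div>
  </div>
</body></html>
"""

tree = html.fromstring(sample_page)

# Select the inner container by walking from the outer div to its child div.
containers = tree.xpath(
    '//div[@id="tab_select_newreleases"]/div[@id="tab_content"]'
)
container = containers[0]

# Run a relative XPath from the container to reach each list element.
items = container.xpath('./div[@class="item"]')
item_texts = [item.text_content().strip() for item in items]

print(item_texts)
```

Selecting the container first and then querying relative to it keeps the later expressions short and limits matches to the section of interest.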
Now let's see the implementation.