
Selenium WebDriver - Introduction



Selenium is an open-source, portable automated testing tool for web applications. It works across different browsers and operating systems. Selenium is not a single tool but a suite of tools that helps testers automate web-based applications more efficiently. It can be used with multiple programming languages, including Java, Python, Ruby, JavaScript, and C#.

Let us now understand each one of the tools available in the Selenium suite and their usage.

  • Selenium IDE − Selenium Integrated Development Environment (IDE) is a Firefox plugin that lets testers record their actions as they follow the workflow they need to test. From Selenium 4, Selenium IDE is also available for Chrome, with parallel execution and metrics that track passed and failed tests.

  • Selenium RC − Selenium Remote Control (RC) was the flagship testing framework that allowed more than simple browser actions and linear execution. It made use of the full power of programming languages such as Java, C#, PHP, Python, Ruby, and Perl to create more complex tests. It worked as a server running in tandem with the browser and the tests.

  • Selenium WebDriver − Selenium WebDriver is the successor to Selenium RC. It sends commands directly to the browser and retrieves the results.

  • Selenium Grid − Selenium Grid is a tool used to run tests in parallel across different machines and browsers, which reduces execution time. Up to Selenium 3, the hub and node jars had to be started separately before tests could run. From Selenium 4, the hub and node are combined into a single jar.

Let us discuss the architecture of Selenium 4. A simplified diagram of the Selenium WebDriver architecture is shown below −

Selenium WebDriver Overview

From Selenium 4, the entire architecture is fully compliant with the W3C (World Wide Web Consortium) standard, meaning Selenium 4 follows the standards and guidelines published by the W3C. More information about the W3C is available at the link below −

https://www.tutorialspoint.com/world-wide-web-consortium-w3c.

The basic difference between Selenium 3 and Selenium 4 is the protocol used between the client and the server. In Selenium 3, communication went through the JSON Wire protocol. From Selenium 4, the client and server communicate directly, following the W3C guidelines.

The Selenium WebDriver API enables interaction between the client libraries, the browser drivers, and the browsers. The architecture consists of four layers: the Selenium Client Library, the W3C Protocol, the Browser Drivers, and the Browsers. Since the browsers, browser drivers, and Selenium WebDriver all comply with the W3C protocol, the interaction between the client libraries and the browser drivers is faster, more reliable, and more stable.

  • The Selenium Client Library provides bindings for languages like Java, Ruby, Python, C#, and so on. A test case written in any of these languages sends commands that interact with the browsers.

  • After the code is triggered, the client converts each command to JSON, as required by the W3C protocol.

  • The W3C protocol carries information between the client and the server. The browser drivers act as the link between the client and the browser: each driver receives the serialized request, interprets the JSON, and executes the command on its respective browser. The result is then returned as an HTTP response.

  • The browser drivers also serialize the responses they receive in the standardized format defined by the W3C protocol and send them back to the client. The client then deserializes each response to confirm whether the command executed successfully.
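The last step in the list, where the client checks a response, can be sketched with the Java standard library alone. Under the W3C WebDriver protocol a response body wraps its result in a "value" field, and an error response carries an "error" code inside that value. The class and method names below are hypothetical, the sample bodies are made up for illustration, and a real client would use a proper JSON parser rather than a regular expression.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResponseCheckSketch {

    // Matches the "error" field that an error response carries inside "value".
    private static final Pattern ERROR_FIELD =
            Pattern.compile("\"error\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the error code if the body carries one, otherwise an empty Optional.
    static Optional<String> errorCode(String responseBody) {
        Matcher m = ERROR_FIELD.matcher(responseBody);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // In this sketch a command succeeded when the HTTP status is 200
    // and the body carries no error code.
    static boolean succeeded(int httpStatus, String responseBody) {
        return httpStatus == 200 && errorCode(responseBody).isEmpty();
    }

    public static void main(String[] args) {
        // Illustrative bodies, not captured from a real driver.
        String ok = "{\"value\":null}";
        String failed = "{\"value\":{\"error\":\"no such element\","
                + "\"message\":\"Unable to locate element\",\"stacktrace\":\"\"}}";

        System.out.println(succeeded(200, ok));
        System.out.println(succeeded(404, failed));
        System.out.println(errorCode(failed).orElse("none"));
    }
}
```

The Selenium client libraries perform this check internally and raise an exception on failure, so test code normally never inspects the raw response.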

Before Selenium 4, the client and server communicated using the JSON Wire protocol over HTTP. From Selenium 4, the JSON Wire protocol is no longer used, and the client and server communicate directly using the W3C protocol.

Let’s consider the below block of code −

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
WebDriver driver = new ChromeDriver();
driver.get("https://www.tutorialspoint.com/selenium/practice/selenium_automation_practice.php");

Once we run this block of code, each command is converted to an HTTP request as per the W3C protocol: the command is identified by the request URL, and its arguments are sent as JSON. The request is then sent to ChromeDriver.
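As a rough illustration of that conversion, the sketch below builds, but does not send, the kind of HTTP request that a call like driver.get(url) corresponds to. The endpoint shape POST /session/{session id}/url and the "url" field follow the W3C WebDriver specification. Port 9515 is ChromeDriver's default; the session id, class name, and method names are hypothetical placeholders and not part of Selenium's API.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class NavigateCommandSketch {

    // Assumption: ChromeDriver is listening locally on its default port.
    static final String DRIVER_BASE = "http://localhost:9515";

    // Builds the JSON body of the "Navigate To" command.
    static String navigateBody(String url) {
        // Escape backslashes and quotes so the URL stays a valid JSON string.
        String escaped = url.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"url\": \"" + escaped + "\"}";
    }

    // Builds, without sending, the request that corresponds to driver.get(url).
    static HttpRequest navigateRequest(String sessionId, String url) {
        URI endpoint = URI.create(DRIVER_BASE + "/session/" + sessionId + "/url");
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(navigateBody(url)))
                .build();
    }

    public static void main(String[] args) {
        String page = "https://www.tutorialspoint.com/selenium/practice/"
                + "selenium_automation_practice.php";
        // "abc123" is a made-up session id; a real one is returned by POST /session.
        HttpRequest request = navigateRequest("abc123", page);

        System.out.println(request.method() + " " + request.uri());
        System.out.println(navigateBody(page));
    }
}
```

In practice the Selenium client library builds and sends this request for us; the sketch only makes the wire format visible.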

The browser driver runs an HTTP server to receive these requests. When the browser driver gets a request, it passes the command on to its browser, which triggers execution of the Selenium instruction on the browser.

If the request is a POST, it triggers an action on the browser. If it is a GET request, the corresponding response is produced at the browser end. In both cases the result is passed back to the browser driver, which in turn sends it over HTTP to the client that issued the command.
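The POST and GET distinction can be made concrete with a small lookup table. The sketch below lists a few commands together with the HTTP method and endpoint template that the W3C WebDriver specification assigns to them. The enum and its member names are hypothetical and only model the mapping; they are not Selenium classes.

```java
class WebDriverEndpoints {

    // A handful of W3C WebDriver commands; the enum itself is illustrative only.
    enum Command {
        NEW_SESSION("POST", "/session"),
        NAVIGATE_TO("POST", "/session/{sessionId}/url"),
        GET_CURRENT_URL("GET", "/session/{sessionId}/url"),
        GET_TITLE("GET", "/session/{sessionId}/title"),
        FIND_ELEMENT("POST", "/session/{sessionId}/element"),
        DELETE_SESSION("DELETE", "/session/{sessionId}");

        final String method;
        final String pathTemplate;

        Command(String method, String pathTemplate) {
            this.method = method;
            this.pathTemplate = pathTemplate;
        }

        // Fills a concrete session id into the endpoint template.
        String pathFor(String sessionId) {
            return pathTemplate.replace("{sessionId}", sessionId);
        }
    }

    public static void main(String[] args) {
        // "abc123" is a made-up session id used only for printing.
        for (Command c : Command.values()) {
            System.out.println(c + " -> " + c.method + " " + c.pathFor("abc123"));
        }
    }
}
```

Note that the same URL can serve two commands: /session/{sessionId}/url navigates when called with POST and returns the current URL when called with GET.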

This sums up the overall explanation of the Selenium WebDriver Architecture.

In the above example, we used the ChromeDriver class, which extends ChromiumDriver from Selenium 4 onwards. Previously, it extended the RemoteWebDriver class. Selenium 4 also adds relative locators (in addition to the regular locators such as id, class, and xpath) through the methods above, below, near, toRightOf, and toLeftOf, which can be chained together. Selenium 4 additionally gives access to Chrome DevTools, which helps with debugging, network traffic analysis, and other features useful in automation.

Let us discuss some of the advantages of Selenium −

  • Selenium is a free and open-source tool.

  • Can be extended for various technologies that expose the DOM.

  • Supports multiple browsers.

  • Supports multiple operating systems and platforms.

  • Supports mobile devices.

  • Supports headless and parallel executions.

  • Has a large community that can help when issues arise.

Let us discuss some of the disadvantages of Selenium −

  • Supports only web-based applications.

  • No support for QR code, CAPTCHA, and barcode scenarios.

  • No built-in features such as an Object Repository or Recovery Scenario.

  • No default test report generation.

  • Programming and technical knowledge required.

  • Takes time to become compatible and stable with new browsers.

  • Can be difficult to set up, since there is no vendor support.
