How does Selenium interact with the web browser?

Selenium is an open-source framework for automating web browsers, most commonly used to test web applications. It can also automate web-based administrative tasks such as monitoring websites.

There are four flavors of Selenium −

  • Selenium IDE
  • Selenium RC
  • Selenium Grid
  • Selenium WebDriver

Let’s have a look at the uses of each of them −

  • Selenium IDE − The Integrated Development Environment offers an easy-to-use interface for building and running Selenium test cases. It is a prototyping tool that records user actions as they are performed and stores them as a script, which can be replayed whenever required.

    Selenium IDE began as a Firefox add-on, but scripts created with it can be run on other browsers by using Selenium RC.

    It is suited to simple test cases with assertions and verifications on selected elements; for advanced test cases, we use either Selenium RC or WebDriver.

  • Selenium RC − RC stands for Remote Control. It takes control of the browser through the Selenium RC Server, which injects JavaScript functions into the browser when the web page is loaded; the injected scripts then carry out the commands written in the test.

    One advantage of RC was that it automatically generated an HTML file of test results, a feature that is not present in WebDriver.

    Selenium RC provides an API and a library for each of its supported languages, such as Java, C#, Perl, Ruby, PHP and Python, making it the first automated web testing tool that let users program in the language they prefer.

    It works across platforms and browsers and can readily support new browsers.

  • Selenium Grid − To run different test cases at the same time on different remote machines, we use Selenium Grid. This speeds up execution. Say we have a test suite containing both complex and simple test cases; we can divide the test cases by complexity and run the groups separately, in parallel.

    Faster, parallel execution of test cases is the main advantage of Selenium Grid. Working together with Selenium RC or WebDriver, it runs multiple test cases in multiple environments on multiple remote machines.

  • Selenium WebDriver − WebDriver works on the browser directly and uses the browser’s built-in automation support to carry out the test written by the tester.

    Another advantage of WebDriver is that it can drive HtmlUnit, a headless browser (one with no GUI, so it is invisible to the user). Testing on a headless browser is faster because no time is spent rendering page elements on screen, which shortens the execution time of test cases.

    Because WebDriver controls the browser at the OS level, it is faster than its predecessor, Selenium RC.
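
The parallel execution described for Selenium Grid can be pictured with a small sketch. It is only an analogy, not Grid itself: a local thread pool stands in for the remote machines, and the test cases (`check_login`, `check_search`, `check_checkout`) are hypothetical placeholders that simply sleep and return a pass flag.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Hypothetical test cases; a real suite would drive a browser here.
def check_login():
    time.sleep(0.1)   # simulate a simple test
    return True

def check_search():
    time.sleep(0.1)   # simulate another simple test
    return True

def check_checkout():
    time.sleep(0.2)   # simulate a more complex test
    return True

def run_in_parallel(test_cases, workers=3):
    """Run each test case on its own worker and collect results by name."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit everything first so the cases overlap in time.
        futures = {case.__name__: pool.submit(case) for case in test_cases}
        return {name: future.result() for name, future in futures.items()}

if __name__ == "__main__":
    print(run_in_parallel([check_login, check_search, check_checkout]))
```

With three workers the cases overlap, so the total run time is roughly that of the slowest case rather than the sum of all three.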

Now let’s look at the architecture of Selenium to understand how it works with different languages and different browsers while producing the same result, namely test case execution.

The picture above gives an idea of how Selenium WebDriver works. The following modules combine to execute automated test scripts.

Selenium Libraries − Because test scripts can be written in different programming languages, developers have built Selenium client libraries (language bindings) for each supported language. For instance, if we write our tests in Java, we use the Java bindings.

Data Communication − To communicate between the client (the test script and its language binding) and the server (the browser driver), Selenium WebDriver uses JSON. The JSON Wire Protocol is a REST-style API that transfers JSON-encoded information over HTTP. Each browser driver has its own HTTP server.
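
As a rough illustration of the data exchanged, the sketch below builds the HTTP method, path and JSON body for three commands. The paths shown (`/session/{id}/url`, `/session/{id}/title`, `/session/{id}`) follow the WebDriver wire protocol; the helper `build_command`, its route table and the session id `abc123` are made up for this example, and a real language binding covers many more commands.

```python
import json

# Small, illustrative subset of commands: name -> (HTTP method, path template).
ROUTES = {
    "get": ("POST", "/session/{sid}/url"),      # navigate to a URL
    "title": ("GET", "/session/{sid}/title"),   # read the page title
    "quit": ("DELETE", "/session/{sid}"),       # end the session
}

def build_command(session_id, name, params):
    """Return (method, path, body) for one command; body is JSON text or None."""
    method, template = ROUTES[name]
    path = template.format(sid=session_id)
    # Only POST requests carry a JSON body in this sketch.
    body = json.dumps(params) if method == "POST" else None
    return method, path, body

if __name__ == "__main__":
    # "abc123" is a placeholder session id, not a real session.
    print(build_command("abc123", "get", {"url": "https://example.com"}))
    print(build_command("abc123", "title", {}))
```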

Browser Drivers − A browser driver acts as an assistant to its browser: it communicates with that browser without revealing the internal logic of the browser’s functionality. When a browser driver receives a command, it executes the command on its browser and sends back the result as an HTTP response. Each driver is specific to one browser and provides a secure connection to it. All browser drivers work directly on top of the OS, which makes WebDriver faster than the older Selenium RC.

Now that we have seen what goes on inside Selenium, let’s look at the typical sequence of steps when we execute a script −

  • The Selenium script creates an HTTP request for each Selenium command and sends it to the browser driver.

  • The browser driver receives the request through its HTTP server.

  • The HTTP server decides which steps are needed, and they are executed on the browser.

  • The execution status is sent back to the HTTP server, which returns it to the automation script in the HTTP response.
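
The steps above can be sketched end to end using only the Python standard library. This is a toy under stated assumptions, not Selenium code: `StubDriverHandler` is a hypothetical stand-in for a browser driver's HTTP server, it does not touch any browser, and the `received` and `path` fields in its reply exist only for this demo.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class StubDriverHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a browser driver's HTTP server."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        command = json.loads(self.rfile.read(length) or b"{}")
        # A real driver would now perform the action in the browser.
        reply = {"value": None, "received": command, "path": self.path}
        payload = json.dumps(reply).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # silence the default request logging

def send_command(host, port, path, params):
    """Client side: send one command as an HTTP request, return (status, JSON reply)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        body = json.dumps(params).encode("utf-8")
        conn.request("POST", path, body=body,
                     headers={"Content-Type": "application/json"})
        response = conn.getresponse()
        return response.status, json.loads(response.read())
    finally:
        conn.close()

def run_demo():
    # Port 0 lets the OS pick a free port for the stub server.
    server = HTTPServer(("127.0.0.1", 0), StubDriverHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return send_command(host, port, "/session/abc123/url",
                            {"url": "https://example.com"})
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    print(run_demo())
```

The script plays the role of the language binding, and the status and JSON reply it gets back correspond to the execution status captured in the last step.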

That, in brief, is how Selenium works, along with its architecture.