"Selenium Series #1: Introduction to Selenium WebDriver — What It Is and Why You Need It"

Understand what Selenium WebDriver is, how it works under the hood, what problems it solves, and get your Java development environment ready to write your first automated test.

The Selenium Series

This is a complete, zero-to-advanced Selenium WebDriver training series. Every post includes working code, real explanations, and practical patterns used in production test suites.

Full Series Roadmap:

  1. Introduction to Selenium WebDriver ← you are here
  2. Project Setup — Maven, Dependencies and First Test
  3. Browser Drivers — ChromeDriver, GeckoDriver and WebDriverManager
  4. Locators — ID, Name, XPath, CSS and Beyond
  5. WebElement Interactions — Click, Type, Select, Hover
  6. Waits — Implicit, Explicit and Fluent
  7. TestNG Integration — Annotations, Groups and Parallel Runs
  8. Page Object Model — Scalable Test Architecture
  9. Handling Windows, Frames, Alerts and Pop-ups
  10. JavaScriptExecutor — Beyond the WebDriver API
  11. Actions Class — Mouse, Keyboard and Drag-Drop
  12. Screenshots, File Downloads and Test Evidence
  13. Data-Driven Testing — Excel, CSV and DataProviders
  14. Cross-Browser Testing — Chrome, Firefox, Safari, Edge
  15. Selenium Grid — Parallel Execution at Scale
  16. CI/CD Integration — Jenkins and Maven Surefire
  17. Headless Browsers and Docker — Containerised Testing
  18. Selenium 4 — New Features, Relative Locators and CDP

What Is Selenium?

Selenium is an open-source suite of tools for automating web browsers. When a Selenium test runs, it opens a real browser window, navigates to pages, clicks buttons, fills forms, and reads content — exactly like a human user would, but at machine speed.

The Selenium suite has three main components:

Selenium Suite
├── Selenium WebDriver   ← what we use for automation (this series)
├── Selenium IDE         ← record-and-playback browser extension
└── Selenium Grid        ← run tests across multiple machines (Part 15)

Selenium WebDriver is the API that lets your code talk to a browser. It superseded the older Selenium RC (Remote Control) starting with Selenium 2.0 and remains one of the most widely used APIs for browser automation.


How WebDriver Works Under the Hood

Understanding the architecture prevents a lot of confusion later.

Your Test Code (Java)
        │
        ▼
   WebDriver API
        │  sends JSON commands over HTTP
        ▼
  Browser Driver         ← ChromeDriver, GeckoDriver, EdgeDriver
  (local process)
        │  uses browser-native protocol
        ▼
  Actual Browser         ← Chrome, Firefox, Edge, Safari
        │
        ▼
   Web Application

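When you write driver.findElement(By.id("login")).click(), two commands go over the wire. Here is what actually happens:

  1. Your Java code calls the WebDriver API
  2. The API sends an HTTP POST request to ChromeDriver to locate the element (under the W3C protocol, the Java bindings send an id locator as a CSS selector):
    POST /session/{sessionId}/element
    {"using": "css selector", "value": "#login"}

  3. ChromeDriver translates this into Chrome DevTools Protocol calls, Chrome finds the element, and ChromeDriver returns an element reference
  4. click() sends a second request that uses that reference:
    POST /session/{sessionId}/element/{elementId}/click

  5. ChromeDriver has Chrome perform the click and sends back the result
  6. Your Java code receives the response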

This is why you need a browser driver executable separate from the browser itself — it acts as the translator.
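
To make the request/response flow concrete, here is a minimal sketch that builds (but does not send) the two HTTP requests behind findElement(...).click(), using only the JDK's java.net.http (Java 11+). The class name, the session id "abc123", and the element id "element-1" are made up for illustration. Port 9515 is ChromeDriver's default. The sketch assumes the W3C dialect, in which an id locator travels as a CSS selector, and it skips the CSS escaping a real client would apply.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of Selenium: it only illustrates the wire format.
final class WireProtocolSketch {
    // ChromeDriver listens on port 9515 by default when started by hand.
    static final String DRIVER_URL = "http://localhost:9515";

    // JSON body for "find element by id" (no CSS escaping in this sketch).
    static String findElementBody(String id) {
        return "{\"using\": \"css selector\", \"value\": \"#" + id + "\"}";
    }

    // Request 1: locate the element; the response carries an element reference.
    static HttpRequest findElement(String sessionId, String id) {
        return post("/session/" + sessionId + "/element", findElementBody(id));
    }

    // Request 2: click the element identified by that reference.
    static HttpRequest click(String sessionId, String elementId) {
        return post("/session/" + sessionId + "/element/" + elementId + "/click", "{}");
    }

    private static HttpRequest post(String path, String json) {
        return HttpRequest.newBuilder(URI.create(DRIVER_URL + path))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest find = findElement("abc123", "login");
        HttpRequest clickRequest = click("abc123", "element-1");
        System.out.println(find.method() + " " + find.uri().getPath());
        System.out.println(clickRequest.method() + " " + clickRequest.uri().getPath());
    }
}
```

Running main prints the method and path of each request. Nothing is sent, so no driver needs to be running. In real tests the Selenium client does all of this for you.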


The W3C WebDriver Standard

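WebDriver became a W3C Recommendation in June 2018, which means the major browser vendors (Google, Mozilla, Microsoft, Apple) implement the same specification. Selenium 3.x could already speak the W3C protocol alongside the older JSON Wire Protocol; Selenium 4 uses the W3C protocol exclusively. This is why the same test code can run on Chrome, Firefox, Edge, and Safari with minimal changes.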

The spec defines commands like:

  • Navigate to URL
  • Find element
  • Click element
  • Get element text
  • Execute script
  • Take screenshot
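
As a rough map from those commands to the wire, the sketch below pairs each one with the HTTP method and endpoint the W3C specification defines for it, plus the Selenium Java call that typically triggers it. The enum, its field names, and the {sessionId}/{elementId} placeholder spelling are illustrative choices and not part of Selenium.

```java
import java.util.Map;

// Illustrative lookup table of W3C WebDriver endpoints; not a Selenium class.
enum W3cCommand {
    NAVIGATE_TO("POST", "/session/{sessionId}/url"),                           // driver.get(url)
    FIND_ELEMENT("POST", "/session/{sessionId}/element"),                      // driver.findElement(by)
    ELEMENT_CLICK("POST", "/session/{sessionId}/element/{elementId}/click"),   // element.click()
    GET_ELEMENT_TEXT("GET", "/session/{sessionId}/element/{elementId}/text"),  // element.getText()
    EXECUTE_SCRIPT("POST", "/session/{sessionId}/execute/sync"),               // JavascriptExecutor.executeScript(...)
    TAKE_SCREENSHOT("GET", "/session/{sessionId}/screenshot");                 // TakesScreenshot.getScreenshotAs(...)

    final String method;
    final String pathTemplate;

    W3cCommand(String method, String pathTemplate) {
        this.method = method;
        this.pathTemplate = pathTemplate;
    }

    // Fills in the {name} placeholders, e.g. sessionId and elementId.
    String pathFor(Map<String, String> params) {
        String path = pathTemplate;
        for (Map.Entry<String, String> entry : params.entrySet()) {
            path = path.replace("{" + entry.getKey() + "}", entry.getValue());
        }
        return path;
    }
}
```

For example, W3cCommand.GET_ELEMENT_TEXT.pathFor(Map.of("sessionId", "s1", "elementId", "e1")) yields /session/s1/element/e1/text. Every browser driver exposes the same endpoints, which is why the same test can target different browsers.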

Prerequisites

Before we install anything, confirm you have the right tools:

Required:

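  • Java JDK 11 or higher: recent Selenium 4 releases (4.14 and later) require Java 11, while older releases also ran on Java 8. You need the full JDK, not just a JRE, because you will be compiling test code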
  • Maven or Gradle — for dependency management
  • An IDE — IntelliJ IDEA (recommended) or Eclipse
  • A browser — Chrome (we'll start with this)

Check your Java version:

java -version
# Expected output:
# java version "11.0.8" 2020-07-14 LTS
# Java(TM) SE Runtime Environment 18.9

javac -version
# Expected output:
# javac 11.0.8

If javac is not found, you have only a JRE installed. Download a JDK from adoptium.net (Eclipse Temurin, the successor to AdoptOpenJDK).
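
The same check can be made from inside Java, which helps when your IDE or build tool may be picking up a different Java installation than your shell. This is a small sketch using only the JDK; the class name JdkCheck is made up for illustration. It reads the java.specification.version system property ("1.8" on Java 8, "11", "17" and so on for later releases) and asks javax.tools.ToolProvider for the system compiler, which is null when only a JRE is present.

```java
import javax.tools.ToolProvider;

// Hypothetical helper: reports which Java the current process is really using.
final class JdkCheck {
    // Converts "1.8" to 8 and "11" to 11.
    static int featureVersion(String specVersion) {
        String version = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        return Integer.parseInt(version);
    }

    // True when a compiler is available, i.e. the runtime is a JDK rather than a JRE.
    static boolean hasCompiler() {
        return ToolProvider.getSystemJavaCompiler() != null;
    }

    public static void main(String[] args) {
        int feature = featureVersion(System.getProperty("java.specification.version"));
        System.out.println("Java feature version: " + feature);
        System.out.println("Compiler available (JDK): " + hasCompiler());
    }
}
```

Run it from the IDE's run configuration or your build tool rather than the terminal to see which Java installation they actually launch.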


Install IntelliJ IDEA

IntelliJ IDEA Community Edition is free and the best Java IDE for Selenium work.

  1. Download from jetbrains.com/idea/download
  2. Install with default settings
  3. On first launch, choose "Do not import settings"
  4. Select your preferred theme (Darcula is popular)

IntelliJ IDEA layout:
┌──────────────────────────────────────────────────┐
│  File  Edit  View  Navigate  Code  Run  Tools    │ ← Menu bar
├──────────┬──────────────────────────┬────────────┤
│          │                          │            │
│ Project  │     Editor               │ Maven      │
│ Explorer │     (your code here)     │ Panel      │
│          │                          │            │
├──────────┴──────────────────────────┴────────────┤
│  Run / Debug console                             │ ← Output
└──────────────────────────────────────────────────┘

What Languages Does Selenium Support?

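Selenium has official bindings for Java, Python, C#, JavaScript, and Ruby. Kotlin has no binding of its own; it runs on the JVM and uses the Java bindings.

Language     Popularity for QA   Notes
Java         ★★★★★               Most common; this series uses Java
Python       ★★★★☆               Growing fast; concise syntax
C#           ★★★☆☆               .NET teams; strong typing
JavaScript   ★★☆☆☆               Use Playwright instead
Ruby         ★★☆☆☆               Legacy teams
Kotlin       ★★★☆☆               Growing in Android shops; uses the Java bindings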

Java remains the most used language for enterprise Selenium suites because of its mature ecosystem, strong typing, and excellent IDE support. That's what we'll use throughout this series.


Key Concepts to Understand Before Coding

The WebDriver Interface

WebDriver is a Java interface. The classes that implement it — ChromeDriver, FirefoxDriver, EdgeDriver — provide the actual browser-specific implementation.

// WebDriver is the type (interface)
WebDriver driver = new ChromeDriver(); // ChromeDriver implements WebDriver

// To swap browsers, change only the right-hand side of that one line:
// WebDriver driver = new FirefoxDriver(); // the rest of the test code stays the same

WebElement

A WebElement represents a single HTML element on the page — a button, input, div, link, etc. Everything you interact with is a WebElement.

WebElement loginButton = driver.findElement(By.id("login-btn"));
loginButton.click();

By

By is a factory class for creating locator strategies. You tell Selenium how to find an element:

By.id("username")           // find by id attribute
By.name("email")            // find by name attribute
By.className("btn-primary") // find by ONE CSS class (compound names like "btn btn-primary" are rejected)
By.xpath("//button[@type='submit']") // find by XPath
By.cssSelector("#form .submit") // find by CSS selector
By.linkText("Sign in")      // find by exact link text
By.tagName("h1")            // find by HTML tag
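
It helps to know that the W3C protocol itself defines only five locator strategies: css selector, link text, partial link text, tag name, and xpath. Locators such as id, name, and class name are therefore rewritten as CSS selectors before they are sent. The sketch below is a simplified illustration of that rewrite, not Selenium's actual code: the class and method names are invented, and the real bindings also escape special characters.

```java
// Simplified illustration of how convenience locators map onto W3C strategies.
final class LocatorTranslationSketch {
    // Returns {strategy, value} roughly as it would appear in the request body.
    static String[] toWire(String strategy, String value) {
        switch (strategy) {
            case "id":
                return new String[] {"css selector", "#" + value};
            case "class name":
                return new String[] {"css selector", "." + value};
            case "name":
                return new String[] {"css selector", "*[name='" + value + "']"};
            default:
                // css selector, xpath, link text, partial link text, tag name:
                // already W3C strategies, so they pass through unchanged.
                return new String[] {strategy, value};
        }
    }

    public static void main(String[] args) {
        String[] wire = toWire("id", "username");
        System.out.println(wire[0] + " -> " + wire[1]);
    }
}
```

This is also why By.className accepts only a single class name: a value with a space in it does not form one valid class selector. Use By.cssSelector(".btn.btn-primary") when you need to match several classes.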

Your Very First Selenium Test (No Maven Yet)

Just to prove it works — download the jars manually and run this:

// File: FirstTest.java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class FirstTest {
    public static void main(String[] args) throws InterruptedException {
        // Tell Selenium where ChromeDriver is.
        // With Selenium 4.6+ this line is optional: the bundled Selenium Manager
        // locates or downloads a matching driver when the property is not set.
        System.setProperty("webdriver.chrome.driver", "/path/to/chromedriver");

        // Open Chrome
        WebDriver driver = new ChromeDriver();
        try {
            // Navigate to a website
            driver.get("https://www.google.com");

            // Print the page title
            System.out.println("Page title: " + driver.getTitle());
            // Output: Page title: Google

            // Wait 2 seconds so you can see the browser (demo only)
            Thread.sleep(2000);
        } finally {
            // Close the browser and stop ChromeDriver, even if a step above fails
            driver.quit();
        }
    }
}

When you run this, a Chrome window opens, navigates to Google, prints the page title to the console, and closes after about two seconds. That's Selenium working.

Note: We're using Thread.sleep() here just to see what's happening. In real tests, never use Thread.sleep() — we cover proper waits in Part 6.
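
To show why a fixed sleep is the wrong tool, here is a minimal sketch of the idea behind a proper wait: poll a condition at short intervals and return as soon as it holds, giving up only after a timeout. It uses only the JDK. The class and method names are invented for illustration, and Selenium's own wait classes (covered in Part 6) do considerably more.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Illustrative polling loop; not Selenium's implementation.
final class PollingWaitSketch {
    // Returns true as soon as the condition holds, false if the timeout expires first.
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A fixed Thread.sleep(2000) always costs two seconds and still fails if the page needs three. A polling wait returns in milliseconds when the page is fast and tolerates slow loads up to the timeout. In Selenium 4 the equivalent looks roughly like new WebDriverWait(driver, Duration.ofSeconds(10)).until(ExpectedConditions.titleContains("Google")).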


driver.close() vs driver.quit()

This trips up many beginners:

driver.close();   // closes the CURRENT browser tab/window only
                  // if it was the last window the browser exits, but the
                  // driver process (e.g. chromedriver) can keep running

driver.quit();    // closes ALL windows, ends the session, and stops the driver process
                  // ALWAYS use this in your teardown (e.g. TestNG @AfterMethod or @AfterClass)

Always call driver.quit() in your test teardown. Failing to do so leaves orphaned browser and driver processes that keep consuming memory and can eventually exhaust the resources of your CI server.
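
Until a test framework manages teardown for you, the plain-Java way to guarantee cleanup is try/finally. The sketch below wraps that pattern in a small generic helper so it can be shown without a browser. The class and method names are hypothetical, and nothing in it is Selenium API.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper showing the guaranteed-teardown pattern.
final class TeardownSketch {
    // Opens a resource, runs the body, and always runs cleanup, even if the body throws.
    static <T> void runWithCleanup(Supplier<T> open, Consumer<T> body, Consumer<T> cleanup) {
        T resource = open.get();
        try {
            body.accept(resource);
        } finally {
            cleanup.accept(resource);
        }
    }
}
```

With Selenium on the classpath, a call might look like TeardownSketch.<WebDriver>runWithCleanup(ChromeDriver::new, d -> d.get("https://example.com"), WebDriver::quit). If the body throws, for instance because an element is missing, quit() still runs and the exception still propagates to fail the test.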


What's Next

In Part 2 we set up a proper Maven project with Selenium dependencies, TestNG, and write a structured first test. We'll never manually download jars again — Maven handles everything.
