
"Selenium Series #18: Selenium 4 — Relative Locators, CDP Integration and What's New"

Selenium 4 is the biggest update in a decade. Learn relative locators, Chrome DevTools Protocol integration, improved window management, native authentication handling, and the new Grid architecture.


Series Navigation

Part 17: Headless Browsers and Docker

This is the final post in the Selenium WebDriver series.

Start from Part 1 if you're new.


What's New in Selenium 4

Selenium 4 (official release: October 2021) is a W3C-compliant rewrite with major new features:

Feature | Selenium 3 | Selenium 4
W3C standard | Partial | Full compliance
Relative locators | ❌ | ✅
CDP integration | ❌ | ✅ Native
New tab/window API | ❌ | ✅
Grid architecture | Hub/Node | Distributed microservices
BiDi protocol | ❌ | ✅ (experimental)

Upgrading from Selenium 3

<!-- Before (Selenium 3) -->
<dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>3.141.59</version>
</dependency>

<!-- After (Selenium 4) -->
<dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>4.1.0</version>
</dependency>
<!-- Update WebDriverManager too -->
<dependency>
    <groupId>io.github.bonigarcia</groupId>
    <artifactId>webdrivermanager</artifactId>
    <version>5.0.3</version>
</dependency>

Breaking changes in Selenium 4:

  • WebDriverWait(driver, seconds) → WebDriverWait(driver, Duration.ofSeconds(n))
  • implicitlyWait(n, TimeUnit.SECONDS) → implicitlyWait(Duration.ofSeconds(n))
  • manage().timeouts().pageLoadTimeout(): same change (takes a Duration)
  • DesiredCapabilities deprecated — use browser-specific Options classes
// Selenium 3
WebDriverWait wait = new WebDriverWait(driver, 10);
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);

// Selenium 4
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
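If your framework wraps timeouts in its own utility methods that still accept a number and a TimeUnit, a small adapter can ease the migration. The sketch below is a hypothetical helper (TimeoutMigration and toDuration are names made up for illustration, not part of Selenium) and it assumes Java 9 or later, because it relies on TimeUnit.toChronoUnit().

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

/** Hypothetical migration helper; not part of Selenium. */
final class TimeoutMigration {
    private TimeoutMigration() {}

    /** Converts a legacy (amount, unit) pair into a java.time.Duration. */
    static Duration toDuration(long amount, TimeUnit unit) {
        // TimeUnit.toChronoUnit() is available from Java 9 onwards.
        return Duration.of(amount, unit.toChronoUnit());
    }

    public static void main(String[] args) {
        System.out.println(toDuration(10, TimeUnit.SECONDS));       // PT10S
        System.out.println(toDuration(500, TimeUnit.MILLISECONDS)); // PT0.5S
    }
}
```

The result can be passed straight to the Selenium 4 calls shown above, for example new WebDriverWait(driver, TimeoutMigration.toDuration(10, TimeUnit.SECONDS)).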

Relative Locators — Finding Elements by Position

Selenium 4 adds locators that find elements relative to other elements — like a human would describe them:

import static org.openqa.selenium.support.locators.RelativeLocator.*;

// Find the input ABOVE another element
WebElement passwordField = driver.findElement(
    with(By.tagName("input")).above(By.id("loginButton"))
);

// Find element BELOW
WebElement errorMessage = driver.findElement(
    with(By.cssSelector(".message")).below(By.id("email"))
);

// Find element to the LEFT
WebElement label = driver.findElement(
    with(By.tagName("label")).toLeftOf(By.id("username"))
);

// Find element to the RIGHT
WebElement submitBtn = driver.findElement(
    with(By.tagName("button")).toRightOf(By.id("cancelBtn"))
);

// NEAR — within 50px distance
WebElement closeBtn = driver.findElement(
    with(By.cssSelector(".close-btn")).near(By.id("modal-header"))
);

// Combining relative locators
WebElement saveBtn = driver.findElement(
    with(By.tagName("button"))
        .below(By.id("formTitle"))
        .toRightOf(By.id("cancelBtn"))
);
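Relative locators work from the rendered bounding rectangles of elements, not from the DOM tree. To build intuition for what "above" means, here is a deliberately simplified model in plain Java. It is not Selenium's implementation: the Box class, the "bottom edge at or above the anchor's top edge" rule and the centre-to-centre distance ordering are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Simplified, hypothetical model of position-based matching; not Selenium's code. */
final class RelativeGeometry {
    /** Axis-aligned bounding box in page pixels, origin at the top-left corner. */
    static final class Box {
        final String name;
        final int x, y, width, height;

        Box(String name, int x, int y, int width, int height) {
            this.name = name;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        int bottom() { return y + height; }
        double centerX() { return x + width / 2.0; }
        double centerY() { return y + height / 2.0; }
    }

    /** Straight-line distance between the centres of two boxes. */
    static double distance(Box a, Box b) {
        return Math.hypot(a.centerX() - b.centerX(), a.centerY() - b.centerY());
    }

    /** Keeps candidates that end at or above the anchor's top edge, nearest first. */
    static List<Box> above(List<Box> candidates, Box anchor) {
        List<Box> result = new ArrayList<>();
        for (Box candidate : candidates) {
            if (candidate.bottom() <= anchor.y) {
                result.add(candidate);
            }
        }
        result.sort(Comparator.comparingDouble(c -> distance(c, anchor)));
        return result;
    }

    public static void main(String[] args) {
        Box login = new Box("login", 100, 260, 100, 30);
        List<Box> inputs = new ArrayList<>();
        inputs.add(new Box("username", 100, 150, 200, 30));
        inputs.add(new Box("footer", 100, 400, 200, 30));
        inputs.add(new Box("password", 100, 200, 200, 30));
        for (Box box : above(inputs, login)) {
            System.out.println(box.name); // password, then username
        }
    }
}
```

The practical takeaway is that relative locators depend on layout: a different viewport size or a responsive breakpoint can move elements and change which one matches, so they work best on pages with a stable layout.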

Practical use case — finding adjacent table cells:

// HTML table row: | Name | Email | Role | Actions |
// Find the email cell that's to the right of "Alice" name cell
WebElement aliceCell  = driver.findElement(By.xpath("//td[text()='Alice']"));
WebElement aliceEmail = driver.findElement(
    with(By.tagName("td")).toRightOf(aliceCell)
);
System.out.println(aliceEmail.getText()); // alice@example.com

Chrome DevTools Protocol (CDP)

CDP gives you direct access to Chrome's internal APIs — the same protocol Chrome DevTools uses. Selenium 4 exposes this through the HasDevTools interface.

Network Monitoring and Request Blocking

import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.devtools.DevTools;
import org.openqa.selenium.devtools.v96.network.Network;
import org.openqa.selenium.devtools.v96.network.model.*;
import java.util.List;
import java.util.Optional;

ChromeDriver driver = new ChromeDriver();
DevTools devTools = driver.getDevTools();
devTools.createSession();

// Enable network monitoring
devTools.send(Network.enable(Optional.empty(), Optional.empty(), Optional.empty()));

// Listen to ALL network requests
devTools.addListener(Network.requestWillBeSent(), request -> {
    System.out.println("Request: " + request.getRequest().getUrl());
});

// Listen to ALL network responses
devTools.addListener(Network.responseReceived(), response -> {
    System.out.println("Response: " + response.getResponse().getStatus()
        + " " + response.getResponse().getUrl());
});

// Block requests to matching URLs (this blocks them rather than mocking a response)
devTools.send(Network.setBlockedURLs(List.of("*.analytics.com/*")));

driver.get("https://example.com");

Console Logs Capture

import org.openqa.selenium.devtools.v96.log.Log;

devTools.send(Log.enable());
devTools.addListener(Log.entryAdded(), logEntry -> {
    System.out.println("[Console " + logEntry.getLevel() + "] " + logEntry.getText());
});

driver.get("https://example.com");
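// Browser log entries (JS errors, failed requests, warnings) now appear in test output.
// Direct console.log() calls are typically reported via Runtime.consoleAPICalled() instead.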

Simulate Device — Mobile Emulation

import org.openqa.selenium.devtools.v96.emulation.Emulation;

// Emulate iPhone 12 Pro
devTools.send(Emulation.setDeviceMetricsOverride(
    390,     // width
    844,     // height
    3.0,     // deviceScaleFactor
    true,    // mobile
    Optional.empty(), Optional.empty(), Optional.empty(),
    Optional.empty(), Optional.empty(), Optional.empty(),
    Optional.empty(), Optional.empty(), Optional.empty()
));

devTools.send(Emulation.setUserAgentOverride(
    "Mozilla/5.0 (iPhone; CPU iPhone OS 15_0 like Mac OS X) AppleWebKit/605.1.15",
    Optional.empty(), Optional.empty(), Optional.empty()
));

driver.get("https://your-app.com");
// Page renders as mobile

Simulate Geolocation

import org.openqa.selenium.devtools.v96.emulation.Emulation;
import java.util.Optional;

devTools.send(Emulation.setGeolocationOverride(
    Optional.of(28.6139),   // latitude  (New Delhi)
    Optional.of(77.2090),   // longitude
    Optional.of(1.0)        // accuracy in meters
));

driver.get("https://your-app.com/location-features");
// App sees New Delhi coordinates

Simulate Network Conditions

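import org.openqa.selenium.devtools.v96.network.Network;
import java.util.Optional;

// Simulate a slow, 3G-like network (CDP expects throughput in bytes per second)
devTools.send(Network.emulateNetworkConditions(
    false,              // offline
    200,                // latency in ms
    750_000 / 8,        // downloadThroughput in bytes/s (~750 Kbps)
    250_000 / 8,        // uploadThroughput in bytes/s (~250 Kbps)
    Optional.empty()    // connectionType
));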

driver.get("https://your-app.com");
// Page loads under 3G conditions — test slow network behaviour
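CDP's Network.emulateNetworkConditions takes throughput in bytes per second, while network speeds are usually quoted in kilobits per second, so a conversion step is easy to get wrong. The helper below is a minimal sketch; ThroughputUnits and kbpsToBytesPerSecond are hypothetical names, and it assumes decimal kilobits (1 Kbps = 1,000 bits per second).

```java
/** Hypothetical unit-conversion helper; not part of Selenium or CDP. */
final class ThroughputUnits {
    private ThroughputUnits() {}

    /** Converts kilobits per second (1 Kbps = 1,000 bit/s) to bytes per second. */
    static long kbpsToBytesPerSecond(long kilobitsPerSecond) {
        return kilobitsPerSecond * 1_000L / 8L;
    }

    public static void main(String[] args) {
        System.out.println(kbpsToBytesPerSecond(750)); // 93750
        System.out.println(kbpsToBytesPerSecond(250)); // 31250
    }
}
```

Passing ThroughputUnits.kbpsToBytesPerSecond(750) as the download value keeps the intent (750 Kbps) visible in the test code instead of a bare number.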

Capture JavaScript Exceptions

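import org.openqa.selenium.devtools.v96.runtime.Runtime;
import org.testng.Assert;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

devTools.send(Runtime.enable());
// Listener callbacks may run on a different thread, so use a thread-safe list
List<String> jsErrors = new CopyOnWriteArrayList<>();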

devTools.addListener(Runtime.exceptionThrown(), exception -> {
    String error = exception.getExceptionDetails().getText();
    jsErrors.add(error);
    System.out.println("[JS Error] " + error);
});

driver.get("https://your-app.com");

// After test: assert no JS errors occurred
Assert.assertTrue(jsErrors.isEmpty(),
    "JavaScript errors found: " + jsErrors);

New Tab and Window APIs

// Open new tab (Selenium 4 only)
driver.switchTo().newWindow(WindowType.TAB);
driver.get("https://example.com");

// Open new window
driver.switchTo().newWindow(WindowType.WINDOW);
driver.get("https://other-site.com");

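Full-Page Capture: Print to PDF (Selenium 4)

Selenium 4 can print an entire page to PDF through the PrintsPage interface, which captures content beyond the visible viewport without a scrolling workaround. Chrome has generally needed to run in headless mode for printing to work. If you need a full-page image rather than a PDF, FirefoxDriver also provides getFullPageScreenshotAs().

import org.openqa.selenium.Pdf;
import org.openqa.selenium.PrintsPage;
import org.openqa.selenium.print.PrintOptions;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

// Print to PDF (captures the full page, not just the viewport)
PrintOptions printOptions = new PrintOptions();
Pdf pdf = ((PrintsPage) driver).print(printOptions);

// getContent() returns the PDF as a Base64-encoded string
String base64Pdf = pdf.getContent();
byte[] bytes = Base64.getDecoder().decode(base64Pdf);
Files.write(Paths.get("full-page.pdf"), bytes);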
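Because the PDF arrives as a Base64 string, it is worth checking the decoded bytes before writing them to disk, so that an error page or an empty response does not end up saved as full-page.pdf. The sketch below uses only the JDK; PdfSaver, decodePdf and save are hypothetical names, and the check relies on the fact that PDF files begin with the marker %PDF-.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

/** Hypothetical helper for saving a Base64-encoded PDF; not part of Selenium. */
final class PdfSaver {
    private PdfSaver() {}

    /** Decodes Base64 content and verifies that it starts with the PDF header. */
    static byte[] decodePdf(String base64) {
        byte[] bytes = Base64.getDecoder().decode(base64);
        String header = new String(bytes, 0, Math.min(bytes.length, 5), StandardCharsets.US_ASCII);
        if (!header.equals("%PDF-")) {
            throw new IllegalArgumentException("Decoded content does not start with a PDF header");
        }
        return bytes;
    }

    /** Decodes, validates and writes the PDF, returning the path that was written. */
    static Path save(String base64, Path target) throws IOException {
        return Files.write(target, decodePdf(base64));
    }
}
```

In the listing above, the last three lines could then become PdfSaver.save(pdf.getContent(), Paths.get("full-page.pdf")).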

What You've Learned in This Series

Across these 18 posts you went from driver.findElement(By.id("x")).click() to production-grade enterprise automation:

  1. ✅ Selenium architecture and browser driver mechanics
  2. ✅ Maven project setup with proper folder structure
  3. ✅ WebDriverManager for zero-maintenance driver management
  4. ✅ All 8 locator strategies with XPath axes and CSS power patterns
  5. ✅ Every WebElement interaction — inputs, dropdowns, checkboxes, uploads
  6. ✅ Implicit, explicit and fluent waits — the end of flaky tests
  7. ✅ TestNG — annotations, groups, listeners, parallel execution, DataProviders
  8. ✅ Page Object Model with PageFactory for scalable architecture
  9. ✅ Windows, frames and JavaScript dialogs
  10. ✅ JavaScriptExecutor for DOM manipulation and complex scenarios
  11. ✅ Actions class for hover, drag-drop and keyboard shortcuts
  12. ✅ Screenshots and file download handling
  13. ✅ Data-driven testing from Excel, CSV and JSON
  14. ✅ Cross-browser testing with parallel execution
  15. ✅ Selenium Grid for distributed multi-machine execution
  16. ✅ Jenkins CI/CD integration with pipeline configuration
  17. ✅ Headless and Docker containerised execution
  18. ✅ Selenium 4 — CDP, relative locators, and new APIs

This foundation handles any real-world Selenium challenge you'll encounter.


Want to go deeper? The Playwright series covers the next generation of browser automation — faster, more reliable, and built for modern web apps.
