This is the final post in the Selenium WebDriver series.
Start from Part 1 if you're new.
What's New in Selenium 4
Selenium 4 (official release: October 2021) fully adopts the W3C WebDriver protocol and adds several major features:
| Feature | Selenium 3 | Selenium 4 |
|---|---|---|
| W3C standard | Partial | Full compliance |
| Relative locators | ❌ | ✅ |
| CDP integration | ❌ | ✅ Native |
| New tab/window API | ❌ | ✅ |
| Grid architecture | Hub/Node | Distributed microservices |
| BiDi protocol | ❌ | ✅ (experimental) |
Upgrading from Selenium 3
<!-- Before (Selenium 3) -->
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-java</artifactId>
<version>3.141.59</version>
</dependency>
<!-- After (Selenium 4) -->
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-java</artifactId>
<version>4.1.0</version>
</dependency>
<!-- Update WebDriverManager too -->
<dependency>
<groupId>io.github.bonigarcia</groupId>
<artifactId>webdrivermanager</artifactId>
<version>5.0.3</version>
</dependency>
Breaking changes in Selenium 4:
- WebDriverWait(driver, seconds) → WebDriverWait(driver, Duration.ofSeconds(n))
- implicitlyWait(n, TimeUnit.SECONDS) → implicitlyWait(Duration.ofSeconds(n))
- manage().timeouts().pageLoadTimeout(): same change
- DesiredCapabilities is deprecated; use the browser-specific Options classes instead
// Selenium 3
WebDriverWait wait = new WebDriverWait(driver, 10);
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
// Selenium 4
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
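If a Selenium 3 code base keeps its timeouts as a number plus a TimeUnit (in config files or shared constants, say), a small adapter can turn them into the Duration values that Selenium 4 expects. The sketch below is one way to do it; TimeoutMigration and toDuration are hypothetical names, not part of Selenium, and TimeUnit.toChronoUnit() needs Java 9 or later.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not a Selenium class) for migrating legacy timeout values. */
final class TimeoutMigration {

    private TimeoutMigration() {
    }

    /** Converts a legacy (amount, unit) pair into the Duration form used by Selenium 4. */
    static Duration toDuration(long amount, TimeUnit unit) {
        // toChronoUnit() maps e.g. TimeUnit.SECONDS to ChronoUnit.SECONDS (Java 9+)
        return Duration.of(amount, unit.toChronoUnit());
    }
}
```

With this in place, a call such as new WebDriverWait(driver, TimeoutMigration.toDuration(10, TimeUnit.SECONDS)) compiles against Selenium 4 without touching the stored values.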
Relative Locators — Finding Elements by Position
Selenium 4 adds locators that find elements relative to other elements — like a human would describe them:
import static org.openqa.selenium.support.locators.RelativeLocator.*;
// Find the input ABOVE another element
WebElement passwordField = driver.findElement(
with(By.tagName("input")).above(By.id("loginButton"))
);
// Find element BELOW
WebElement errorMessage = driver.findElement(
with(By.cssSelector(".message")).below(By.id("email"))
);
// Find element to the LEFT
WebElement label = driver.findElement(
with(By.tagName("label")).toLeftOf(By.id("username"))
);
// Find element to the RIGHT
WebElement submitBtn = driver.findElement(
with(By.tagName("button")).toRightOf(By.id("cancelBtn"))
);
// NEAR — within 50px distance
WebElement closeBtn = driver.findElement(
with(By.cssSelector(".close-btn")).near(By.id("modal-header"))
);
// Combining relative locators
WebElement saveBtn = driver.findElement(
with(By.tagName("button"))
.below(By.id("formTitle"))
.toRightOf(By.id("cancelBtn"))
);
Practical use case — finding adjacent table cells:
// HTML table row: | Name | Email | Role | Actions |
// Find the email cell that's to the right of "Alice" name cell
WebElement aliceCell = driver.findElement(By.xpath("//td[text()='Alice']"));
WebElement aliceEmail = driver.findElement(
with(By.tagName("td")).toRightOf(aliceCell)
);
System.out.println(aliceEmail.getText()); // alice@example.com
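To build intuition for what a relative locator such as toRightOf selects, it can help to model the page as plain bounding boxes. The sketch below is a simplified illustration, not Selenium's implementation (Selenium evaluates element positions inside the browser). Box and RelativeGeometry are hypothetical names, and the rule used here (candidate starts at or beyond the anchor's right edge, nearest centre first) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an element's bounding box (not a Selenium type). */
final class Box {
    final String name;
    final int x;
    final int y;
    final int width;
    final int height;

    Box(String name, int x, int y, int width, int height) {
        this.name = name;
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    int right() {
        return x + width;
    }

    double centerX() {
        return x + width / 2.0;
    }

    double centerY() {
        return y + height / 2.0;
    }
}

/** Simplified geometric model of "to the right of", for illustration only. */
final class RelativeGeometry {

    private RelativeGeometry() {
    }

    /** Returns candidates that start at or beyond the anchor's right edge, nearest first. */
    static List<Box> toRightOf(Box anchor, List<Box> candidates) {
        List<Box> result = new ArrayList<>();
        for (Box candidate : candidates) {
            if (candidate.x >= anchor.right()) {
                result.add(candidate);
            }
        }
        result.sort(Comparator.comparingDouble((Box c) -> distance(anchor, c)));
        return result;
    }

    /** Straight-line distance between the centres of two boxes. */
    static double distance(Box a, Box b) {
        return Math.hypot(a.centerX() - b.centerX(), a.centerY() - b.centerY());
    }
}
```

In a table laid out as a grid, the email cell in the next row is almost as close to the anchor as the one in the same row. When rows are short, it is therefore worth checking which cell a positional locator actually returns, or scoping the search to a single row first.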
Chrome DevTools Protocol (CDP)
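CDP gives you direct access to Chrome's internal APIs through the same protocol Chrome DevTools uses. Selenium 4 exposes it through the HasDevTools interface. The CDP bindings are versioned (v96 in the examples below), so the package you import should match the major version of the browser under test.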
Network Interception and Mocking
import java.util.List;
import java.util.Optional;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.devtools.DevTools;
import org.openqa.selenium.devtools.v96.network.Network;
import org.openqa.selenium.devtools.v96.network.model.*;
ChromeDriver driver = new ChromeDriver();
DevTools devTools = driver.getDevTools();
devTools.createSession();
// Enable network monitoring
devTools.send(Network.enable(Optional.empty(), Optional.empty(), Optional.empty()));
// Listen to ALL network requests
devTools.addListener(Network.requestWillBeSent(), request -> {
System.out.println("Request: " + request.getRequest().getUrl());
});
// Listen to ALL network responses
devTools.addListener(Network.responseReceived(), response -> {
System.out.println("Response: " + response.getResponse().getStatus()
+ " " + response.getResponse().getUrl());
});
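// Block requests to matching URLs. Note that this blocks rather than mocks:
// returning a stubbed response needs the CDP Fetch domain or Selenium's NetworkInterceptor.
devTools.send(Network.setBlockedURLs(List.of("*.analytics.com/*")));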
driver.get("https://example.com");
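Before relying on a blocking pattern like *.analytics.com/*, it can be useful to check offline which URLs it would cover. The sketch below is a simplified model that assumes * matches any run of characters and everything else is literal; the browser's own matcher is authoritative and may differ in detail. UrlGlob and matches are hypothetical names, not part of Selenium or CDP.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Selenium) that models simple '*' wildcard matching. */
final class UrlGlob {

    private UrlGlob() {
    }

    /** Returns true if the whole URL matches the glob, where '*' stands for any characters. */
    static boolean matches(String glob, String url) {
        // Keep trailing empty strings so that a leading or trailing '*' is preserved
        String[] literals = glob.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            // Quote each literal chunk so that '.', '/' and '?' are matched as plain characters
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.matches(regex.toString(), url);
    }
}
```

Under these assumptions the pattern covers https://www.analytics.com/collect.js but not https://analytics.com/collect.js, because the literal ".analytics.com/" needs a dot before the domain name. A second pattern would be needed for the bare domain.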
Console Logs Capture
import org.openqa.selenium.devtools.v96.log.Log;
devTools.send(Log.enable());
devTools.addListener(Log.entryAdded(), logEntry -> {
System.out.println("[Console " + logEntry.getLevel() + "] " + logEntry.getText());
});
driver.get("https://example.com");
// All console.log(), console.error() etc. captured in test output
Simulate Device — Mobile Emulation
import org.openqa.selenium.devtools.v96.emulation.Emulation;
// Emulate iPhone 12 Pro
devTools.send(Emulation.setDeviceMetricsOverride(
390, // width
844, // height
3.0, // deviceScaleFactor
true, // mobile
Optional.empty(), Optional.empty(), Optional.empty(),
Optional.empty(), Optional.empty(), Optional.empty(),
Optional.empty(), Optional.empty(), Optional.empty()
));
devTools.send(Emulation.setUserAgentOverride(
"Mozilla/5.0 (iPhone; CPU iPhone OS 15_0 like Mac OS X) AppleWebKit/605.1.15",
Optional.empty(), Optional.empty(), Optional.empty()
));
driver.get("https://your-app.com");
// Page renders as mobile
Simulate Geolocation
import java.util.Optional;
import org.openqa.selenium.devtools.v96.emulation.Emulation;
devTools.send(Emulation.setGeolocationOverride(
Optional.of(28.6139), // latitude (New Delhi)
Optional.of(77.2090), // longitude
Optional.of(1.0) // accuracy in meters
));
driver.get("https://your-app.com/location-features");
// App sees New Delhi coordinates
Simulate Network Conditions
import org.openqa.selenium.devtools.v96.network.Network;
// Simulate a slow 3G-like network
devTools.send(Network.emulateNetworkConditions(
false, // offline
200, // latency in ms
750_000 / 8, // downloadThroughput in bytes/s (~750 kbps)
250_000 / 8, // uploadThroughput in bytes/s (~250 kbps)
Optional.empty() // connectionType
));
driver.get("https://your-app.com");
// Page loads under 3G conditions — test slow network behaviour
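CDP's throughput parameters are expressed in bytes per second, while link speeds are usually quoted in kilobits per second, so the figures need dividing by 8 before they are passed in. A minimal sketch of that conversion follows; Throughput and kbpsToBytesPerSecond are hypothetical names, and 1 kbps is taken as 1000 bits per second.

```java
/** Hypothetical helper (not part of Selenium) for CDP throughput values. */
final class Throughput {

    private Throughput() {
    }

    /** Converts a link speed in kilobits per second to bytes per second. */
    static long kbpsToBytesPerSecond(long kbps) {
        // 1 kbps = 1000 bits per second, and 8 bits make one byte
        return kbps * 1000 / 8;
    }
}
```

For a 750 kbps downlink and a 250 kbps uplink this gives 93750 and 31250 bytes per second respectively.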
Capture JavaScript Exceptions
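import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import org.openqa.selenium.devtools.v96.runtime.Runtime;
devTools.send(Runtime.enable());
// Listener callbacks run on a different thread from the test, so use a thread-safe list
List<String> jsErrors = new CopyOnWriteArrayList<>();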
devTools.addListener(Runtime.exceptionThrown(), exception -> {
String error = exception.getExceptionDetails().getText();
jsErrors.add(error);
System.out.println("[JS Error] " + error);
});
driver.get("https://your-app.com");
// After test: assert no JS errors occurred
Assert.assertTrue(jsErrors.isEmpty(),
"JavaScript errors found: " + jsErrors);
New Tab and Window APIs
// Open new tab (Selenium 4 only)
driver.switchTo().newWindow(WindowType.TAB);
driver.get("https://example.com");
// Open new window
driver.switchTo().newWindow(WindowType.WINDOW);
driver.get("https://other-site.com");
Full-Page Screenshots (Selenium 4)
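Selenium 4 can print a page to PDF through the PrintsPage interface, which captures the whole document rather than just the visible viewport (no scrolling workaround needed). The result is a PDF, not an image: for a full-page PNG, FirefoxDriver provides getFullPageScreenshotAs().
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;
import org.openqa.selenium.Pdf;
import org.openqa.selenium.PrintsPage;
import org.openqa.selenium.print.PrintOptions;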
// Print to PDF (captures full page)
PrintOptions printOptions = new PrintOptions();
Pdf pdf = ((PrintsPage) driver).print(printOptions);
String base64Pdf = pdf.getContent();
byte[] bytes = Base64.getDecoder().decode(base64Pdf);
Files.write(Paths.get("full-page.pdf"), bytes);
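Because print() hands back the PDF as a Base64 string, a small guard can catch an empty or corrupt payload before it is written to disk. The sketch below is one possible approach; PdfSaver and save are hypothetical names, not part of Selenium, and the only validation is a check for the standard %PDF- file signature.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

/** Hypothetical helper (not part of Selenium) for saving a Base64-encoded PDF. */
final class PdfSaver {

    private PdfSaver() {
    }

    /** Decodes the Base64 content, checks the PDF signature, and writes it to the target path. */
    static Path save(String base64Pdf, Path target) throws IOException {
        byte[] bytes = Base64.getDecoder().decode(base64Pdf);
        // Every PDF file starts with the ASCII signature "%PDF-"
        String header = new String(bytes, 0, Math.min(bytes.length, 5), StandardCharsets.US_ASCII);
        if (!header.equals("%PDF-")) {
            throw new IllegalArgumentException("Decoded content does not start with the %PDF- signature");
        }
        return Files.write(target, bytes);
    }
}
```

In the example above, the last three lines could then be replaced with PdfSaver.save(pdf.getContent(), Paths.get("full-page.pdf")).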
What You've Learned in This Series
Across these 18 posts you went from driver.findElement(By.id("x")).click() to production-grade enterprise automation:
- ✅ Selenium architecture and browser driver mechanics
- ✅ Maven project setup with proper folder structure
- ✅ WebDriverManager for zero-maintenance driver management
- ✅ All 8 locator strategies with XPath axes and CSS power patterns
- ✅ Every WebElement interaction — inputs, dropdowns, checkboxes, uploads
- ✅ Implicit, explicit and fluent waits — the end of flaky tests
- ✅ TestNG — annotations, groups, listeners, parallel execution, DataProviders
- ✅ Page Object Model with PageFactory for scalable architecture
- ✅ Windows, frames and JavaScript dialogs
- ✅ JavaScriptExecutor for DOM manipulation and complex scenarios
- ✅ Actions class for hover, drag-drop and keyboard shortcuts
- ✅ Screenshots and file download handling
- ✅ Data-driven testing from Excel, CSV and JSON
- ✅ Cross-browser testing with parallel execution
- ✅ Selenium Grid for distributed multi-machine execution
- ✅ Jenkins CI/CD integration with pipeline configuration
- ✅ Headless and Docker containerised execution
- ✅ Selenium 4 — CDP, relative locators, and new APIs
This foundation covers most of the real-world Selenium challenges you're likely to encounter.
Want to go deeper? The Playwright series covers the next generation of browser automation — faster, more reliable, and built for modern web apps.