How to Bypass CAPTCHA With Selenium in Java
I’ve found a few tricks that help reduce the hassle of dealing with CAPTCHA while automating things. Whether using Selenium or another tool, these strategies can improve your chances of getting past the barrier. The goal isn’t to break the system but to make automation less complicated and time-consuming when CAPTCHAs are involved. Let’s dive into the options.
Understanding CAPTCHA and Its Types
Before diving into bypass strategies, it’s essential to understand that CAPTCHAs come in various forms:
- Text-based CAPTCHAs: Require solving distorted text.
- Image-based CAPTCHAs: Require selecting certain objects within a set of images.
- reCAPTCHA v2 and v3: Google’s CAPTCHAs. v2 shows a checkbox or an image challenge, while v3 scores user behavior in the background to tell humans from bots.
Bypassing CAPTCHAs is not straightforward, because websites keep evolving their bot detection. The following methods therefore help only in certain scenarios.
The Easy CAPTCHA Solution
Before we dive deeper into handling CAPTCHAs with Selenium in Java, I want to introduce some of the best CAPTCHA solving services currently available.
If you want to skip the long read, here is the final list:
- Bright Data CAPTCHA Solver — Robust proxy-based solver; handles complex CAPTCHAs with high accuracy.
- BypassCaptcha — Simple integration; supports multiple languages and reCAPTCHA handling.
- 2captcha — Human-powered service; supports diverse CAPTCHAs and is cost-effective.
- CapSolver — High-speed solver with affordable pricing for bulk CAPTCHA needs.
- Anticaptcha — Fast response and global human solvers; effective for reCAPTCHA.
- Best Captcha Solver — Human-based solution with 24/7 support; known for reliable results.
Can Selenium and Java Handle CAPTCHA?
Selenium Java can handle CAPTCHAs, but how you approach it depends on the website. Some sites show CAPTCHA challenges only when they detect suspicious bot activity, while others use CAPTCHA immediately to block automated access.
In the first case, you might avoid the CAPTCHA by mimicking natural browsing behavior to stay under the radar. In the second, the CAPTCHA has to be solved, which usually means a human or a solving service does it.
While Selenium can be used for both situations, solving CAPTCHAs consistently is challenging to scale. That’s why it’s more effective to focus on preventing CAPTCHAs from showing up in the first place. Let’s explore how to avoid these challenges while using Selenium for automation.
Approach 1: Using Anti-CAPTCHA Services
One of the more common methods is using third-party anti-CAPTCHA services such as:
- 2Captcha
- AntiCaptcha
- DeathByCaptcha
These services expose APIs that forward CAPTCHA challenges to human solvers and return the answer. Integrating them with Selenium in Java is straightforward. Here’s a simple code snippet, in which callCaptchaService is a placeholder for the actual API call:
// Import required libraries
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class BypassCaptcha {
    public static void main(String[] args) {
        System.setProperty("webdriver.chrome.driver", "path_to_chromedriver");
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://www.example.com/captcha");
            // Assume the CAPTCHA image is here and requires solving
            // Send CAPTCHA image to a third-party CAPTCHA-solving service API
            String captchaSolution = callCaptchaService(driver);
            // Input solved CAPTCHA into the webpage
            driver.findElement(By.id("captchaInput")).sendKeys(captchaSolution);
            driver.findElement(By.id("submit")).click();
        } finally {
            // Always close the browser, even if a step above throws
            driver.quit();
        }
    }

    private static String callCaptchaService(WebDriver driver) {
        // Placeholder for the API call to a CAPTCHA-solving service:
        // send the CAPTCHA image, wait for the solution, and return it
        return "solvedCaptcha";
    }
}
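The placeholder callCaptchaService above leaves out how the answer comes back. Many solver APIs are asynchronous: you submit the challenge, then poll until a worker has answered. Below is a minimal sketch of that polling loop using only the Java standard library. The class name CaptchaPoller, the "OK|<answer>" reply format, and the CAPCHA_NOT_READY marker are assumptions modeled on 2Captcha-style text responses, so check your provider's documentation for the real format. The HTTP call itself is passed in as a Supplier, which keeps the loop independent of any one service.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical helper: polls a solver until it answers or the attempts run out. */
final class CaptchaPoller {

    // Assumed "still working" marker; real services may use a different string.
    static final String NOT_READY = "CAPCHA_NOT_READY";

    /** Returns the answer from an assumed "OK|<answer>" reply, or empty if not ready. */
    static Optional<String> parseAnswer(String response) {
        if (response != null && response.startsWith("OK|")) {
            return Optional.of(response.substring(3));
        }
        if (NOT_READY.equals(response)) {
            return Optional.empty();
        }
        // Anything else is treated as a service error rather than retried forever.
        throw new IllegalStateException("Unexpected solver response: " + response);
    }

    /** Calls fetchResult up to maxAttempts times, pausing delayMillis between tries. */
    static Optional<String> poll(Supplier<String> fetchResult, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<String> answer = parseAnswer(fetchResult.get());
            if (answer.isPresent()) {
                return answer;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

Inside callCaptchaService, the Supplier would wrap the request that asks the service for the result, and an empty Optional would signal that the solver timed out.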
Approach 2: Using Undetected Chromedriver
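Selenium bots are often detected by CAPTCHA scripts, especially Google reCAPTCHA. Undetected ChromeDriver patches the ChromeDriver binary to remove some of the fingerprints websites use to identify Selenium. Note that it is distributed as a Python package rather than a Java library, so from Java the closest equivalent is to point Selenium at a driver binary that has already been patched.
To integrate it:
- Obtain a patched ChromeDriver binary (for example, one produced by Undetected ChromeDriver).
- Point Selenium at it instead of the default driver binary.
Example:
System.setProperty("webdriver.chrome.driver", "path_to_undetected_chromedriver");
WebDriver driver = new ChromeDriver();
// Continue with your usual automation steps
This method works because detection scripts look for automation fingerprints, such as the navigator.webdriver flag and driver-specific JavaScript variables, which a patched driver hides. It does not change mouse movements or typing speed, so CAPTCHAs that track user behavior can still flag the session.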
Approach 3: Pre-Solving CAPTCHA by Collecting Cookies
Some websites require the CAPTCHA to be solved only once per session. You can solve the CAPTCHA manually once, save the session cookies, and reuse them in subsequent Selenium sessions.
Here’s how to do it:
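// After solving the CAPTCHA manually, capture the session cookies
Set<Cookie> cookies = driver.manage().getCookies();
// Save cookies to a file (saveCookiesToFile is a helper you write yourself)
saveCookiesToFile(cookies);

// Later, in a new session: open the same site first, because Selenium
// only accepts cookies for the domain of the current page
driver.get("https://www.example.com");
for (Cookie cookie : loadCookiesFromFile()) {
    driver.manage().addCookie(cookie);
}
// Reload so the site sees the restored cookies
driver.navigate().refresh();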
In future sessions, you may bypass CAPTCHA using cookies from an already verified session.
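The snippet above leaves saveCookiesToFile and loadCookiesFromFile undefined. One way to fill them in is sketched below using only the Java standard library. SavedCookie and CookieStore are hypothetical names, and the tab-separated file layout is an assumption chosen for simplicity. Only name, value, domain, and path are stored, so expiry and security flags would need extra columns.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plain-data copy of the cookie fields we want to keep. */
final class SavedCookie {
    final String name;
    final String value;
    final String domain;
    final String path;

    SavedCookie(String name, String value, String domain, String path) {
        // Missing fields are stored as empty strings so they survive the round trip.
        this.name = name == null ? "" : name;
        this.value = value == null ? "" : value;
        this.domain = domain == null ? "" : domain;
        this.path = path == null ? "" : path;
    }
}

/** Hypothetical helper: writes one cookie per line, fields separated by tabs. */
final class CookieStore {

    static void save(List<SavedCookie> cookies, Path file) {
        List<String> lines = new ArrayList<>();
        for (SavedCookie c : cookies) {
            lines.add(String.join("\t", c.name, c.value, c.domain, c.path));
        }
        try {
            Files.write(file, lines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static List<SavedCookie> load(Path file) {
        List<SavedCookie> cookies = new ArrayList<>();
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                // Limit -1 keeps trailing empty fields such as an empty path.
                String[] f = line.split("\t", -1);
                if (f.length == 4) {
                    cookies.add(new SavedCookie(f[0], f[1], f[2], f[3]));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cookies;
    }
}
```

To connect this to Selenium, each Cookie from driver.manage().getCookies() would be copied into a SavedCookie via its getters, and each loaded entry rebuilt with Selenium's Cookie.Builder before calling addCookie. Treat the saved file like a password, since it grants access to the verified session.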
Approach 4: Using Browser Automation with Human-like Interaction
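Some CAPTCHAs depend on analyzing user interactions. Tools like Selenium Stealth (a Python package) mainly mask browser fingerprints. In Java, human-like behavior is usually simulated by hand with Selenium’s Actions class and randomized waits, which reduces the chances of a CAPTCHA triggering.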
This involves mimicking:
- Natural mouse movement.
- Random delays between actions.
- Scrolling and page interaction.
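The first two items in the list above can be sketched with plain Java before any Selenium call is involved. The helper below is a hypothetical illustration, not part of any library. It picks a random pause within a range and splits one mouse move into small eased steps. The name HumanPacing and the smoothstep easing curve are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical helper for randomized pauses and stepwise mouse movement. */
final class HumanPacing {

    /** Picks a pause length between minMillis and maxMillis, both inclusive. */
    static long randomDelayMillis(Random rng, long minMillis, long maxMillis) {
        if (minMillis < 0 || maxMillis < minMillis) {
            throw new IllegalArgumentException("need 0 <= minMillis <= maxMillis");
        }
        long span = maxMillis - minMillis + 1;
        long offset = (long) (rng.nextDouble() * span);
        // Clamp in case floating-point rounding reaches the upper bound.
        return minMillis + Math.min(offset, span - 1);
    }

    /** Splits a move of (dx, dy) pixels into small offsets that sum to (dx, dy). */
    static List<int[]> stepOffsets(int dx, int dy, int steps) {
        if (steps <= 0) {
            throw new IllegalArgumentException("steps must be positive");
        }
        List<int[]> offsets = new ArrayList<>();
        int prevX = 0;
        int prevY = 0;
        for (int i = 1; i <= steps; i++) {
            // Smoothstep easing: slow start, faster middle, slow finish.
            double t = (double) i / steps;
            double eased = t * t * (3 - 2 * t);
            int x = (int) Math.round(dx * eased);
            int y = (int) Math.round(dy * eased);
            offsets.add(new int[] {x - prevX, y - prevY});
            prevX = x;
            prevY = y;
        }
        return offsets;
    }
}
```

With Selenium, each offset pair could be passed to Actions.moveByOffset, with a pause from randomDelayMillis between steps. Whether this is enough to avoid a CAPTCHA depends entirely on the site's detection logic.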
Conclusion
CAPTCHA bypassing is a complex and evolving challenge. While the methods above can help reduce CAPTCHA disruptions in Selenium-based automation in Java, it’s essential to remember that CAPTCHA is designed to prevent automation. Respecting the website’s policies and terms of service is critical to maintaining a responsible approach.