Cloudflare has introduced Turnstile, a frictionless verification mechanism designed to give users a seamless browsing experience while blocking malicious traffic. For developers who rely on automation tools and web scraping, however, Turnstile makes this kind of verification harder to get past.
By combining Nstbrowser's Browserless cloud service with an automation tool such as Puppeteer, developers can simulate real user behavior, pass Cloudflare Turnstile verification, and keep their data scraping tasks running. This article explains how Cloudflare Turnstile works, how it affects web scraping, and how to use Nstbrowser's Browserless service to handle it.
Cloudflare Turnstile is a verification mechanism designed to replace traditional CAPTCHA. It distinguishes human users from automated traffic through a frictionless verification process, so users rarely have to interact with a challenge.
This makes the mechanism friendlier for regular users, but it is also considerably harder for bots and automation tools to pass.
The introduction of Cloudflare Turnstile poses several challenges for web scraping applications.
For developers who need to scrape large amounts of data or run automation tasks, these challenges can significantly reduce both success rates and efficiency.
With Nstbrowser's Browserless cloud service and an automation tool such as Puppeteer, developers can get past Cloudflare Turnstile's verification mechanism. The strategies and implementation steps are described below:
Nstbrowser's Browserless cloud service is a high-performance headless browser solution specifically designed to handle complex anti-bot mechanisms like Turnstile.
Install Puppeteer:
npm install puppeteer-core
Register and Log in to Nstbrowser:
Visit the official Nstbrowser website and create an account.
Obtain API Key:
When writing business logic with Puppeteer, you don’t need to worry about Cloudflare Turnstile blocking your requests. Nstbrowser's Browserless cloud service will automatically handle the verification, allowing developers to focus on their code logic.
Below is a complete example script:
import puppeteer from 'puppeteer-core';

const API_KEY = "your api key"; // required
const HOST = 'wss://less.nstbrowser.io';

const config = {
  proxy: 'your proxy',
  headless: true,
};

const query = new URLSearchParams({
  "x-api-key": API_KEY, // required
  "config": JSON.stringify(config),
});

const browserWSEndpoint = `${HOST}/connect?${query.toString()}`;

(async () => {
  const browser = await puppeteer.connect({
    browserWSEndpoint: browserWSEndpoint,
    defaultViewport: null,
  });

  try {
    const page = await browser.newPage();
    await page.goto('https://www.scrapingcourse.com/login/cf-turnstile', { waitUntil: 'domcontentloaded' });

    // Wait for turnstile to unlock successfully
    await page.waitForFunction(() => {
      return window.turnstile && window.turnstile.getResponse();
    });

    await page.screenshot({ path: 'turnstile-solved.png' });
  } catch (e) {
    console.error(e);
  } finally {
    await browser.close();
  }
})();
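The endpoint construction in the script can be factored into a small helper so that a missing API key is caught before any connection attempt. The following is a minimal sketch under that assumption: `buildBrowserWSEndpoint` is a hypothetical name, not part of Nstbrowser or Puppeteer, and the host and query parameters are simply the ones used in the script above.

```javascript
// Hypothetical helper (not an Nstbrowser or Puppeteer API): builds the
// WebSocket endpoint string from an API key and a config object.
function buildBrowserWSEndpoint(apiKey, config, host = 'wss://less.nstbrowser.io') {
  if (!apiKey) {
    throw new Error('An Nstbrowser API key is required');
  }
  // URLSearchParams percent-encodes the JSON config for use in a query string.
  const query = new URLSearchParams({
    'x-api-key': apiKey,
    config: JSON.stringify(config),
  });
  return `${host}/connect?${query.toString()}`;
}

// Placeholder values, as in the script above.
const endpoint = buildBrowserWSEndpoint('your api key', {
  proxy: 'your proxy',
  headless: true,
});
console.log(endpoint);
```

The returned string can be passed as `browserWSEndpoint` to `puppeteer.connect`.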
You can also test the code directly in the Playground feature under the Browserless menu in the Nstbrowser client.
Simply add the following code in the Playground, and it will automatically establish the Browserless connection:
const page = await browser.newPage();
await page.goto('https://www.scrapingcourse.com/login/cf-turnstile', {
  waitUntil: 'domcontentloaded'
});

// waitForFunction resolves to a JSHandle, so read the token string out of it
const tokenHandle = await page.waitForFunction(() => {
  return window.turnstile && window.turnstile.getResponse();
});
const token = await tokenHandle.jsonValue();

console.info("Turnstile solved token:", token);
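The wait on `window.turnstile.getResponse()` amounts to polling until a truthy value appears or a deadline passes. As an illustration of that idea only, here is a minimal sketch in plain JavaScript that runs outside Puppeteer. `waitForValue` and the fake getter are hypothetical names introduced for this example, and the default timeout and interval are assumptions.

```javascript
// Hypothetical helper: polls getValue until it returns a truthy value,
// or rejects once the timeout (in milliseconds) has elapsed.
async function waitForValue(getValue, { timeout = 30000, interval = 100 } = {}) {
  const deadline = Date.now() + timeout;
  while (true) {
    const value = await getValue();
    if (value) {
      return value;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeout} ms`);
    }
    // Sleep before the next poll.
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}

// Stand-in for window.turnstile.getResponse(): empty until the third poll.
let polls = 0;
const fakeGetResponse = () => (++polls >= 3 ? 'demo-token' : undefined);

waitForValue(fakeGetResponse, { timeout: 1000, interval: 10 })
  .then((token) => console.log('token:', token));
```

In a real script the same concern is covered by the `timeout` option of `page.waitForFunction`, which bounds how long Puppeteer waits before rejecting.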
This guide has shown how to get past Cloudflare Turnstile using Nstbrowser's Browserless service and Puppeteer, so that developers can keep web scraping tasks running on sites protected by it.