puppeteer-humanize for Web Scraping

How to Use puppeteer-humanize for Web Scraping

In this guide, I’ll walk you through setting up puppeteer-humanize, using it, and understanding its limits, including when to switch to tools like Bright Data.

What is puppeteer-humanize?

Puppeteer-humanize is a Node.js library that makes typing behavior look human when using Puppeteer. It doesn’t change your entire browser behavior, but it improves how your bot fills out text inputs, such as usernames and passwords on login forms.

Here’s what puppeteer-humanize does:

  • Types characters with random delays
  • Adds mistakes while typing and corrects them using backspace
  • Simulates spacebar behavior
  • Makes form filling look more human-like
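
To make the idea concrete, here is a minimal sketch of how humanized typing can be planned. This is not puppeteer-humanize's source code. The function names (randomInt, planKeystrokes, replay) and the default delay range are my own assumptions, chosen only to illustrate random delays and typo-then-backspace corrections.

```javascript
// Hypothetical illustration only, not the library's implementation.

// Return a random integer in [min, max].
function randomInt(min, max, rng = Math.random) {
  return Math.floor(rng() * (max - min + 1)) + min;
}

// Build a list of { key, delay } steps for `text`. With a probability of
// `mistakeChance` percent per character, a wrong letter is typed first
// and then removed with Backspace before the correct character.
function planKeystrokes(
  text,
  { mistakeChance = 8, minDelay = 50, maxDelay = 200 } = {},
  rng = Math.random
) {
  const alphabet = 'abcdefghijklmnopqrstuvwxyz';
  const steps = [];
  for (const char of text) {
    if (rng() * 100 < mistakeChance) {
      const wrong = alphabet[randomInt(0, alphabet.length - 1, rng)];
      steps.push({ key: wrong, delay: randomInt(minDelay, maxDelay, rng) });
      steps.push({ key: 'Backspace', delay: randomInt(minDelay, maxDelay, rng) });
    }
    steps.push({ key: char, delay: randomInt(minDelay, maxDelay, rng) });
  }
  return steps;
}

// Replay a plan to recover the text that ends up in the input field.
function replay(steps) {
  const typed = [];
  for (const { key } of steps) {
    if (key === 'Backspace') typed.pop();
    else typed.push(key);
  }
  return typed.join('');
}
```

Whatever mistakes the plan contains, replaying it always yields the original text, which is the property you want from any typing simulator.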

Why Websites Detect Puppeteer

When using Puppeteer, you’re controlling a browser using code. This can lead to patterns that websites detect easily. Some examples of bot-like behavior include:

  • Always typing with the same speed
  • No typing mistakes
  • Clicking elements instantly
  • Cursor not moving naturally
  • Scrolling in fixed steps
  • No response to popups or dynamic changes

Anti-bot systems can use these signals to block your scraper, even if you’re logged in or using proxies. So, it’s important to make your automation behave more like a real person.
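
As a rough illustration of the first signal in that list, the sketch below flags typing whose gaps between keystrokes are almost perfectly uniform. The function name looksScripted and the 15 ms threshold are assumptions I made up for this example. Real anti-bot systems combine many more signals and are far more sophisticated.

```javascript
// Hypothetical detector heuristic: typing is suspicious when the standard
// deviation of the gaps between keystrokes is very small.
function looksScripted(timestampsMs, minStdDevMs = 15) {
  if (timestampsMs.length < 3) return false; // too little data to judge

  // Gaps between consecutive keystrokes
  const gaps = [];
  for (let i = 1; i < timestampsMs.length; i++) {
    gaps.push(timestampsMs[i] - timestampsMs[i - 1]);
  }

  const mean = gaps.reduce((sum, gap) => sum + gap, 0) / gaps.length;
  const variance =
    gaps.reduce((sum, gap) => sum + (gap - mean) ** 2, 0) / gaps.length;

  return Math.sqrt(variance) < minStdDevMs;
}
```

A bot typing one key every 100 ms produces a standard deviation of zero and gets flagged, while irregular human timing does not.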

Installing puppeteer-humanize

To use puppeteer-humanize, you need to install it along with Puppeteer Extra and a stealth plugin that helps reduce detection.

Here’s how to install everything:

npm install @forad/puppeteer-humanize puppeteer-extra puppeteer-extra-plugin-stealth
  • @forad/puppeteer-humanize: the typing behavior enhancer
  • puppeteer-extra: a wrapper for Puppeteer that allows plugins
  • puppeteer-extra-plugin-stealth: hides Puppeteer’s fingerprint

Setting Up Your Scraper

Let’s build a simple scraper that visits a login page and enters credentials using puppeteer-humanize.

Create a file named scraper.js:

const { typeInto } = require('@forad/puppeteer-humanize');
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

Now, we’ll launch the browser and navigate to a login form:

(async () => {
  let browser;
  try {
    browser = await puppeteer.launch({ headless: true });
    const page = await browser.newPage();
    await page.goto('https://www.scrapingcourse.com/login/cf-turnstile');

    // page.$() returns null when no element matches the selector
    const emailInput = await page.$('#email');
    const passwordInput = await page.$('#password');
    const submitButton = await page.$('#submit-button');

    if (emailInput && passwordInput && submitButton) {
      const config = {
        mistakes: {
          chance: 8,
          delay: {
            min: 50,
            max: 500,
          },
        },
        delays: {
          space: {
            chance: 70,
            min: 10,
            max: 50,
          },
        },
      };

      await typeInto(emailInput, '[email protected]', config);
      await typeInto(passwordInput, 'password123', config);

      // Simulate thinking time. page.waitForTimeout() was removed in
      // recent Puppeteer releases, so use a plain timer instead.
      await new Promise((resolve) => setTimeout(resolve, 2000));
      await submitButton.click();
      await page.screenshot({ path: 'screenshot.png' });
    } else {
      console.log('One or more form fields not found.');
    }
  } catch (error) {
    console.error('Error:', error);
  } finally {
    // Close the browser even if a step above failed
    if (browser) await browser.close();
  }
})();
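
A fixed two-second pause before every submit is itself a pattern. One option is to randomize the thinking time. The helpers below (pickDelay and randomPause) are hypothetical names of my own, not part of Puppeteer or puppeteer-humanize, and the sketch uses only standard JavaScript timers.

```javascript
// Pick a random whole number of milliseconds in [minMs, maxMs].
function pickDelay(minMs, maxMs) {
  return Math.floor(Math.random() * (maxMs - minMs + 1)) + minMs;
}

// Resolve after a random delay and report how long the pause was.
function randomPause(minMs, maxMs) {
  const ms = pickDelay(minMs, maxMs);
  return new Promise((resolve) => setTimeout(() => resolve(ms), ms));
}
```

In the scraper, something like await randomPause(1500, 3000) could stand in for the fixed pause before clicking the submit button.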

Understanding the Configuration

The config object passed to typeInto() defines how human-like your typing should be.

mistakes

This sets how often the script makes a typo.

mistakes: {
  chance: 8, // 8% chance of a mistake
  delay: { min: 50, max: 500 } // delay before correcting
}

The bot will type a wrong letter and correct it using backspace, just like a human.

delays

This controls the delays between keystrokes. The space settings apply to the spacebar:

delays: {
  space: {
    chance: 70,
    min: 10,
    max: 50,
  }
}

This simulates a pause when you press the spacebar or move between words.
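
If you tweak these values often, a small helper can keep the config shape consistent and catch typos in the numbers. This is a sketch under one assumption: that the chance values are percentages between 0 and 100, as the 8% example above suggests. The helper name buildTypingConfig is my own and not part of the library.

```javascript
// Hypothetical helper: build a config object in the shape used above.
// Assumes `chance` values are percentages in the range 0-100.
function buildTypingConfig({ mistakeChance = 8, spaceChance = 70 } = {}) {
  for (const [name, value] of Object.entries({ mistakeChance, spaceChance })) {
    if (!Number.isFinite(value) || value < 0 || value > 100) {
      throw new RangeError(`${name} must be a number between 0 and 100`);
    }
  }
  return {
    mistakes: {
      chance: mistakeChance,
      delay: { min: 50, max: 500 }, // pause before correcting a typo
    },
    delays: {
      space: { chance: spaceChance, min: 10, max: 50 },
    },
  };
}
```

For example, buildTypingConfig({ mistakeChance: 3 }) describes a more careful typist, and the result can be passed to typeInto() as the third argument.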

Screenshots and Results

After running the script, a screenshot showing the filled-out form will be saved. If the website has simple bot detection, puppeteer-humanize might be enough to get through.

However, on sites protected by more advanced anti-bot systems, such as Cloudflare Turnstile, your scraper may still get blocked. This is because puppeteer-humanize only improves typing. It does nothing about browser fingerprints, IP behavior, or JavaScript challenges.

Common Issues and Fixes

Problem: Selectors Not Found

Check the site structure using DevTools. Make sure the IDs or classes for inputs and buttons are correct.
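
Selectors can also fail because the form renders after the page load event. A possible approach is to wait for each element with Puppeteer's page.waitForSelector() instead of looking it up immediately. The helper name findRequired and the 10-second default timeout are assumptions for this sketch.

```javascript
// Wait for every selector and return the element handles keyed by name.
// Throws an error that names the selector that never appeared.
async function findRequired(page, selectors, timeout = 10000) {
  const found = {};
  for (const [name, selector] of Object.entries(selectors)) {
    try {
      found[name] = await page.waitForSelector(selector, { timeout });
    } catch (error) {
      throw new Error(
        `Selector for "${name}" not found: ${selector} (${error.message})`
      );
    }
  }
  return found;
}
```

You could call it as findRequired(page, { email: '#email', password: '#password', submit: '#submit-button' }) and the error message will tell you which selector to re-check in DevTools.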

Problem: Blocked by CAPTCHA

Puppeteer-humanize won’t solve CAPTCHAs. You can try solving them with external services like 2Captcha or Bright Data’s CAPTCHA Solver, but this adds complexity.

Problem: Still Blocked After Humanizing Typing

This usually means the website is using more advanced methods, such as browser fingerprinting, IP tracking, or canvas fingerprinting. You’ll need more than puppeteer-humanize.

Limitations of puppeteer-humanize

While puppeteer-humanize is useful, it’s not a complete anti-bot solution. Here are some key limitations:

  1. Limited Scope
    It only simulates typing. It doesn’t help with mouse movements, scrolling, or interaction with popups.
  2. Outdated
    The library hasn’t been updated in years. New anti-bot technologies are being developed, and puppeteer-humanize may no longer be enough.
  3. No IP Protection
    Sites can block your scraper based on IP address. puppeteer-humanize does nothing to hide or rotate your IP.
  4. No Fingerprint Spoofing
    It doesn’t change the browser fingerprint. Websites that check it can still tell they’re dealing with a bot.
  5. Doesn’t Handle JavaScript Challenges
    If the site uses JavaScript puzzles or security checks, your scraper may fail to load the page.
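
To show what the first limitation means in practice, here is a rough sketch of the kind of mouse movement you would have to add yourself. It bends the path with a quadratic Bezier curve and replays it with Puppeteer's page.mouse.move(). The function names, the 200-pixel bend, and the pause range are all assumptions of mine, and this alone will not defeat serious detection.

```javascript
// Points along a quadratic Bezier curve from `from` to `to`,
// bent by a randomly placed control point.
function curvedPath(from, to, steps = 20, rng = Math.random) {
  const control = {
    x: (from.x + to.x) / 2 + (rng() - 0.5) * 200,
    y: (from.y + to.y) / 2 + (rng() - 0.5) * 200,
  };
  const points = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps;
    const a = (1 - t) ** 2;
    const b = 2 * (1 - t) * t;
    const c = t ** 2;
    points.push({
      x: a * from.x + b * control.x + c * to.x,
      y: a * from.y + b * control.y + c * to.y,
    });
  }
  return points;
}

// Move the mouse along the curve with short, uneven pauses.
async function moveLikeHuman(page, from, to) {
  for (const { x, y } of curvedPath(from, to)) {
    await page.mouse.move(x, y);
    await new Promise((resolve) => setTimeout(resolve, 5 + Math.random() * 15));
  }
}
```

Calling moveLikeHuman(page, { x: 0, y: 0 }, { x: 300, y: 200 }) before a click would replace an instant jump with a curved approach.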

A Better Way: Use Bright Data or Other Proxy Providers

Scrapers who need to work on websites with advanced protection should consider using Bright Data (formerly Luminati). Bright Data is a premium proxy provider and scraping infrastructure company. It offers:

  • Residential and mobile IPs from millions of locations
  • Browser fingerprinting protection
  • Session management
  • Built-in CAPTCHA solving
  • JavaScript rendering

Want to read about other providers? Check out my list of the best proxy providers.

Simple Example Using Bright Data:

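const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    // Chrome ignores credentials embedded in --proxy-server,
    // so pass only the host and port here
    args: ['--proxy-server=http://brd.superproxy.io:33335'],
  });
  const page = await browser.newPage();
  // Supply the proxy credentials before the first request
  await page.authenticate({ username: '<username>', password: '<password>' });
  await page.goto('https://example.com');
  const content = await page.content();
  console.log(content);
  await browser.close();
})();

Replace <username> and <password> with your Bright Data credentials. Routing traffic through the proxy gives you access to real IP addresses, which helps with geo-blocking and IP-based detection.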
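
Rather than hard-coding credentials in the script, you may prefer to read them from environment variables. The sketch below is one way to do that. The variable names (PROXY_HOST, PROXY_PORT, PROXY_USERNAME, PROXY_PASSWORD) and the helper name proxySettings are my own assumptions, not something Bright Data or Puppeteer defines.

```javascript
// Hypothetical helper: build proxy launch arguments and credentials
// from environment variables. The variable names are illustrative.
function proxySettings(env = process.env) {
  const {
    PROXY_HOST = 'brd.superproxy.io',
    PROXY_PORT = '33335',
    PROXY_USERNAME,
    PROXY_PASSWORD,
  } = env;

  if (!PROXY_USERNAME || !PROXY_PASSWORD) {
    throw new Error('Set PROXY_USERNAME and PROXY_PASSWORD before running.');
  }

  return {
    launchArgs: [`--proxy-server=http://${PROXY_HOST}:${PROXY_PORT}`],
    credentials: { username: PROXY_USERNAME, password: PROXY_PASSWORD },
  };
}
```

The launchArgs array would go into puppeteer.launch({ args }), and the credentials object can be passed to page.authenticate() before the first page.goto() call.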

Conclusion

Puppeteer-humanize is a helpful tool that makes your web scraper type like a real person. It adds delays, mistakes, and corrections to make typing look more natural. This can help you avoid simple bot detection systems that look for perfect typing. However, puppeteer-humanize has some limits. It doesn’t hide your IP address, can’t change your browser fingerprint, and won’t solve CAPTCHAs.

So, if you’re scraping websites with strong protection, you’ll need more than just this tool. But for basic scraping tasks like filling out login forms or collecting public data, it’s a simple and valuable way to make your bot more human.
