How to Rotate Proxies in Python
In this guide, I’ll walk you through proxies, why we need proxy rotation, and how to set it up in Python. No complex jargon — just simple, practical steps to keep your scraping running smoothly. Let’s dive in!
What is a Proxy?
A proxy is an intermediary server between your computer and the internet. When you make a request using a proxy, it forwards the request to the target website and then returns the response. This process helps to hide your actual IP address.
Types of Proxies
- Datacenter Proxies — Fast and inexpensive but easily detectable.
- Residential Proxies — Use real IPs assigned to households, making them harder to detect.
- Mobile Proxies — Use IP addresses from mobile devices, providing the highest anonymity.
- ISP Proxies — Similar to residential proxies but provided by ISPs for stable connections.
Proxies help you avoid detection, bypass restrictions, and access region-specific content.
Why Use Proxy Rotation in Python?
When web scraping or automating requests, using a single IP address can lead to rate limits, CAPTCHAs, or IP bans. Proxy rotation helps to:
- Prevent IP Blocks — Distribute requests across multiple IPs to avoid detection.
- Bypass Rate Limits — Websites restrict requests from a single IP, but rotating proxies can bypass this.
- Access Geo-Restricted Content — Some websites display different content based on location. Using proxies from different countries allows you to access this content.
Setting Up Proxy Rotation in Python
Now, let’s look at how to implement proxy rotation using different Python libraries.
Rotating Proxies with the Requests Library
The requests library is one of the simplest ways to make HTTP requests in Python. We can manually rotate proxies using a list of available proxies.
Step 1: Install Requests
First, install the requests library:
pip install requests
Step 2: Define Proxy Rotation Logic
Create a Python file (rotate_requests.py) and add the following code:
import random
import requests

# List of proxies
proxies_list = [
    "http://username:password@PROXY_1:PORT",
    "http://username:password@PROXY_2:PORT",
    "http://username:password@PROXY_3:PORT"
]

# Function to get a random proxy
def get_random_proxy():
    return random.choice(proxies_list)

# Make requests with proxy rotation
for i in range(5):  # Number of requests
    proxy = get_random_proxy()
    proxies = {
        "http": proxy,
        "https": proxy
    }
    try:
        response = requests.get("https://httpbin.io/ip", proxies=proxies, timeout=5)
        print(f"Request {i + 1} from {proxy}: {response.json()}")
    except requests.exceptions.RequestException as e:
        print(f"Request {i + 1} failed with proxy {proxy}: {e}")
Step 3: Run the Script
Run the script using:
python rotate_requests.py
Each request will be sent from a randomly chosen proxy. This method works well for small-scale scraping, but it sends requests one at a time, so it doesn't scale well to a large number of requests.
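Picking a proxy at random can also reuse the same proxy several times in a row. A common refinement is to cycle through the list in order and retire proxies that keep failing. Here's a minimal sketch of that idea (the proxy URLs are placeholders, and `mark_bad` is a hypothetical helper you'd call after repeated timeouts):

```python
import itertools

class ProxyPool:
    """Round-robin proxy pool that can retire failing proxies."""

    def __init__(self, proxies):
        self.proxies = list(proxies)
        self._n = len(self.proxies)           # size of the original list
        self._cycle = itertools.cycle(self.proxies)

    def get(self):
        # Walk the cycle, skipping proxies that have been retired
        for _ in range(self._n):
            proxy = next(self._cycle)
            if proxy in self.proxies:
                return proxy
        raise RuntimeError("No working proxies left")

    def mark_bad(self, proxy):
        # Retire a proxy that failed, e.g. after repeated timeouts
        if proxy in self.proxies:
            self.proxies.remove(proxy)

pool = ProxyPool([
    "http://PROXY_1:PORT",
    "http://PROXY_2:PORT",
    "http://PROXY_3:PORT",
])
```

You'd then call `pool.get()` before each request and `pool.mark_bad(proxy)` inside the `except` block, so a dead proxy stops being handed out.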
Rotating Proxies with AIOHTTP (Asynchronous Requests)
The requests library processes one request at a time. To speed up requests, we can use AIOHTTP, which allows asynchronous proxy rotation.
Step 1: Install AIOHTTP
Install the aiohttp library:
pip install aiohttp
Step 2: Define Proxy Rotation Logic
Create a Python file (rotate_aiohttp.py) and add the following code:
import aiohttp
import asyncio
import random

# List of proxies
proxies_list = [
    "http://username:password@PROXY_1:PORT",
    "http://username:password@PROXY_2:PORT",
    "http://username:password@PROXY_3:PORT"
]

# Function to fetch IP using a proxy
async def fetch_ip(session, proxy, attempt):
    try:
        async with session.get("https://httpbin.io/ip", proxy=proxy) as response:
            json_response = await response.json()
            print(f"Attempt {attempt} using {proxy}: {json_response}")
    except Exception as e:
        print(f"Attempt {attempt} failed with {proxy}: {e}")

# Main function to send requests concurrently
async def main():
    async with aiohttp.ClientSession() as session:
        tasks = []
        for i in range(5):  # Number of requests
            proxy = random.choice(proxies_list)
            tasks.append(fetch_ip(session, proxy, i + 1))
        await asyncio.gather(*tasks)

# Run the async event loop
asyncio.run(main())
Step 3: Run the Script
Run the script using:
python rotate_aiohttp.py
This method sends multiple requests simultaneously, making it faster and more efficient than the requests method.
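One caveat: `asyncio.gather` fires every task at once, which can overload a proxy when the batch is large. A common fix is to cap concurrency with `asyncio.Semaphore`. The sketch below uses a stand-in coroutine instead of a real `session.get(...)` call so it runs without live proxies; the `max_in_flight` counter only exists to demonstrate that the cap holds:

```python
import asyncio
import random

proxies_list = ["http://PROXY_1:PORT", "http://PROXY_2:PORT", "http://PROXY_3:PORT"]

in_flight = 0      # how many requests are running right now
max_in_flight = 0  # peak concurrency observed (for demonstration)

async def fetch_ip(proxy, attempt):
    # Stand-in for the real session.get(...) request
    global in_flight, max_in_flight
    in_flight += 1
    max_in_flight = max(max_in_flight, in_flight)
    await asyncio.sleep(0.01)  # simulate network latency
    in_flight -= 1
    return f"attempt {attempt} via {proxy}"

async def bounded_fetch(sem, proxy, attempt):
    async with sem:  # at most `limit` requests run at once
        return await fetch_ip(proxy, attempt)

async def main(limit=2, num_requests=6):
    sem = asyncio.Semaphore(limit)
    tasks = [bounded_fetch(sem, random.choice(proxies_list), i + 1)
             for i in range(num_requests)]
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
```

In the real script you'd keep the `aiohttp.ClientSession` and simply wrap each `fetch_ip` call in `bounded_fetch`.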
Rotating Proxies with Scrapy
Scrapy is a powerful web scraping framework that supports proxy rotation through middleware.
Step 1: Install Scrapy
Install Scrapy and the rotating proxies middleware:
pip install scrapy scrapy-rotating-proxies
Step 2: Create a Scrapy Project
Inside your working directory, create a new Scrapy project:
scrapy startproject proxy_scraper
cd proxy_scraper
Step 3: Configure Proxy Rotation
Modify settings.py in the Scrapy project:
# Enable proxy middleware
DOWNLOADER_MIDDLEWARES = {
    "rotating_proxies.middlewares.RotatingProxyMiddleware": 610,
    "rotating_proxies.middlewares.BanDetectionMiddleware": 620,
}

# List of rotating proxies
ROTATING_PROXY_LIST = [
    "http://PROXY_1:PORT",
    "http://PROXY_2:PORT",
    "http://PROXY_3:PORT"
]

# Retry settings
RETRY_TIMES = 5
RETRY_HTTP_CODES = [500, 502, 503, 504, 403, 408]
Step 4: Create a Spider
Inside the spiders/ directory, create a file called ip_spider.py:
import scrapy

class IpSpider(scrapy.Spider):
    name = "ip_spider"
    start_urls = ["https://httpbin.io/ip"]

    def parse(self, response):
        ip = response.json().get("origin")
        self.log(f"IP Address: {ip}")
Step 5: Run the Spider
Run the spider using:
scrapy crawl ip_spider
Scrapy will automatically rotate proxies while scraping.
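The scrapy-rotating-proxies middleware also exposes a few optional settings for tuning rotation, which you can add to settings.py. The values below are illustrative, not recommendations:

```python
# Optional scrapy-rotating-proxies tuning (values are illustrative)

# Load proxies from a file (one proxy per line) instead of ROTATING_PROXY_LIST
ROTATING_PROXY_LIST_PATH = "proxies.txt"

# How many times to retry a page with different proxies before giving up
ROTATING_PROXY_PAGE_RETRY_TIMES = 10

# Backoff window (seconds) before a proxy marked as dead is re-checked
ROTATING_PROXY_BACKOFF_BASE = 300
ROTATING_PROXY_BACKOFF_CAP = 3600
```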
Conclusion
By following these proxy rotation techniques, you can scrape data smoothly without worrying about blocks or restrictions. Switching between IPs helps you stay under the radar, bypass rate limits, and access content from different locations.
If you’re working on small projects, simple methods like requests will do the job. But for larger tasks, using AIOHTTP or Scrapy will make things faster and more efficient.
Remember, free proxies can be unreliable, so choosing a good rotating proxy provider can save you time and trouble. Now that you know how to rotate proxies, you’re ready to scrape smarter and avoid getting blocked. Thank you for reading!