The process of bypassing CAPTCHA when extracting public data from Amazon with Oxylabs Amazon Scraper API.

oxylabs/how-to-bypass-amazon-captcha

How to Bypass Amazon CAPTCHA When Scraping

Take a look at the process of bypassing CAPTCHAs when collecting public data from Amazon with Amazon Scraper API (one-week free trial). You can find the full guide on our blog.

Setting up a simple scraper

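A plain requests call with browser-like headers, like the one below, will likely be served a CAPTCHA page instead of the product page. The response is saved to with_captcha.html so you can open it and check.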

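import requests

# Browser-like headers. "br" is left out of Accept-Encoding because requests
# decodes Brotli only when the optional brotli package is installed.
custom_headers = {
    "Accept-Language": "en-GB,en;q=0.9",
    "Accept-Encoding": "gzip, deflate",
    "Cache-Control": "max-age=0",
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15",
}

url = "https://www.amazon.com/SAMSUNG-Border-Less-Compatible-Adjustable-LS24AG302NNXZA/dp/B096N2MV3H?ref_=Oct_DLandingS_D_fe3953dd_2"

# A timeout keeps the script from hanging if the server never responds.
response = requests.get(url, headers=custom_headers, timeout=30)

# Save the raw HTML so the returned page can be inspected in a browser.
with open("with_captcha.html", "w", encoding="utf-8") as file:
    file.write(response.text)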
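
To confirm whether the saved page is a CAPTCHA rather than the product page, you can look for tell-tale strings in the HTML. The following is a minimal sketch, not an official detection method: the marker strings and the helper name looks_like_captcha are assumptions for illustration, and Amazon may change its page wording at any time.

```python
# Assumed markers of a CAPTCHA page; these are illustrative and may change.
CAPTCHA_MARKERS = (
    "/errors/validateCaptcha",
    "Enter the characters you see below",
)


def looks_like_captcha(html: str) -> bool:
    """Return True if the HTML contains any of the assumed CAPTCHA markers."""
    lowered = html.lower()
    return any(marker.lower() in lowered for marker in CAPTCHA_MARKERS)


# Hypothetical snippets standing in for real responses.
sample_blocked = '<form method="get" action="/errors/validateCaptcha">'
sample_product = '<span id="productTitle">Monitor</span>'

print(looks_like_captcha(sample_blocked))  # True
print(looks_like_captcha(sample_product))  # False
```

In practice you would pass response.text, or the contents of with_captcha.html, to the helper instead of the sample strings.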

Using Amazon Scraper API

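Amazon Scraper API is designed to avoid CAPTCHAs. Send the product URL in a JSON payload to the realtime endpoint, authenticating with your API credentials. With parse set to True, the response should contain structured data instead of raw HTML.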

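import requests
from pprint import pprint

# Job parameters: the target URL, plus parse=True to request structured data.
payload = {
    'source': 'amazon',
    'url': 'https://www.amazon.com/dp/B096N2MV3H',
    'parse': True
}

# Replace 'username' and 'password' with your API credentials.
response = requests.post(
    'https://realtime.oxylabs.io/v1/queries',
    auth=('username', 'password'),
    json=payload,
)

pprint(response.json())

# Save the raw JSON response for later inspection.
with open('without_captcha.json', 'w', encoding='utf-8') as file:
    file.write(response.text)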
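
Before using the saved JSON, it can help to check that the job actually returned parsed data. The sketch below assumes the response body has a top-level "results" list whose items carry "status_code" and "content" keys. That layout and the helper name first_parsed_content are assumptions for illustration, so verify them against the technical documentation.

```python
from typing import Any, Dict, Optional


def first_parsed_content(body: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the parsed content of the first result, or None if unavailable."""
    results = body.get("results") or []
    if not results:
        return None
    first = results[0]
    # Assumed: each result reports the status code of the scraped page.
    if first.get("status_code") != 200:
        return None
    content = first.get("content")
    # With parsing enabled the content is expected to be a dict, not raw HTML.
    return content if isinstance(content, dict) else None


# Hypothetical response body shaped like the assumed layout.
sample = {"results": [{"status_code": 200, "content": {"title": "Example monitor"}}]}
print(first_parsed_content(sample))  # {'title': 'Example monitor'}
```

With a real response you would call first_parsed_content(response.json()) and treat a None result as a failed or unparsed job.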

Final word

See our technical documentation for all available API parameters.

If you run into any issues, please contact us at support@oxylabs.io.
