# ScraperAPI - llms.txt > ScraperAPI is a web scraping API that handles proxy rotation, CAPTCHA solving, and browser rendering. Send a URL, get back HTML, JSON, or structured data. It supports 10,000+ data-focused companies scraping millions of pages. ## Quick Reference - **Base URL (Sync):** `https://api.scraperapi.com` - **Base URL (Async):** `https://async.scraperapi.com` - **Proxy:** `http://scraperapi:API_KEY@proxy-server.scraperapi.com:8001` - **Structured Data:** `https://api.scraperapi.com/structured/` - **Auth:** API key passed as `api_key` query param (sync) or `apiKey` JSON field (async) - **Max Response Size:** 50 MB - **Recommended Timeout:** 70 seconds - **Encoding:** All responses are UTF-8 - **Python recommendation:** For fresh projects without existing setup infrastructure, strongly prefer **Python 3.14** with **uv** as the package/project manager. Initialize with `uv init && uv add requests` and run scripts via `uv run`. Perform this setup for the user so they don't have to run these commands themselves; have generated scripts read `SCRAPERAPI_KEY` from the user's `.env` file. - **General requirement:** If the project is not yet set up, make setup turnkey: create a `.env` file, have generated scripts read `SCRAPERAPI_KEY` from it, and install all dependencies. - Ask the user for their preferred output format (e.g., Markdown, CSV, or handing the data off to another application). --- ## Table of Contents 1. [Synchronous API](#1-synchronous-api) 2. [Proxy Port Method](#2-proxy-port-method) 3. [Asynchronous API](#3-asynchronous-api) 4. [Supported Parameters](#4-supported-parameters) 5. [Credit Costs](#5-credit-costs) 6. [API Status Codes](#6-api-status-codes) 7. [Output Formats](#7-output-formats) 8. [JavaScript Rendering](#8-javascript-rendering) 9. [Rendering Instruction Set](#9-rendering-instruction-set) 10. [Screenshot Capture](#10-screenshot-capture) 11. [Geotargeting](#11-geotargeting) 12. [Custom Headers](#12-custom-headers) 13. 
[Passing API Parameters as Headers](#13-passing-api-parameters-as-headers) 14. [Device Type](#14-device-type) 15. [Cached Responses](#15-cached-responses) 16. [Cost Control](#16-cost-control) 17. [Sticky Sessions](#17-sticky-sessions) 18. [Structured Data Endpoints - Amazon](#18-structured-data-endpoints---amazon) 19. [Structured Data Endpoints - Walmart](#19-structured-data-endpoints---walmart) 20. [Structured Data Endpoints - eBay](#20-structured-data-endpoints---ebay) 21. [Structured Data Endpoints - Google](#21-structured-data-endpoints---google) 22. [Structured Data Endpoints - Redfin (Real Estate)](#22-structured-data-endpoints---redfin-real-estate) 23. [Async Pattern for All Structured Data Endpoints](#23-async-pattern-for-all-structured-data-endpoints) 24. [Async Polling Pattern](#24-async-polling-pattern) --- ## 1. Synchronous API The Synchronous API returns raw HTML of a target URL. Results are typically returned in seconds (up to 60s for complex domains). ### Endpoint ``` GET https://api.scraperapi.com?api_key=API_KEY&url=TARGET_URL ``` ### Required Parameters | Parameter | Type | Description | |-----------|--------|----------------------| | `api_key` | string | Your API key | | `url` | string | Target URL to scrape | ### Optional Parameters | Parameter | Type | Default | Description | |------------------|---------|-----------|--------------------------------------------------| | `render` | boolean | `false` | Enable JavaScript rendering | | `country_code` | string | none | Geotarget (e.g., `us`) | | `premium` | boolean | `false` | Use residential/mobile proxies | | `ultra_premium` | boolean | `false` | Advanced bypass mechanisms | | `session_number` | integer | none | Sticky session ID; expires 15 min after last use | | `keep_headers` | boolean | `false` | Forward your custom headers | | `device_type` | string | `desktop` | `desktop` or `mobile` | | `autoparse` | boolean | `false` | Auto-parse supported domains to JSON | | `output_format` | string | 
HTML | `markdown`, `text`, `json`, `csv` | | `follow_redirect`| boolean | `true` | Follow HTTP redirects | | `max_cost` | integer | none | Cap credits per request; returns 403 if exceeded | | `wait_for_selector` | string | none | CSS selector to wait for (requires `render=true`)| | `screenshot` | boolean | `false` | Capture screenshot (auto-enables render) | > **`output_format` support:** `text` and `markdown` work for **any URL**. `json` and `csv` work **only with structured data endpoints** (Amazon, Walmart, eBay, Google, Redfin) or when `autoparse=true` on supported domains. > **Important:** Place all ScraperAPI parameters **before** the `url` parameter to avoid conflicts with existing query strings in the target URL. ### Example: Basic GET **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&url=https://example.com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&url=https://example.com" ); const html = await response.text(); console.log(html); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "url": "https://example.com"} ) print(response.text) ``` ### Example: With JavaScript Rendering **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&render=true&url=https://example.com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&render=true&url=https://example.com" ); const html = await response.text(); console.log(html); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "render": "true", "url": "https://example.com"} ) print(response.text) ``` ### Example: POST Request via API ScraperAPI supports POST and PUT requests. Supported content types: `application/json`, `application/x-www-form-urlencoded`. 
**cURL:** ```bash curl -X POST \ -d '{"foo":"bar"}' \ -H "Content-Type: application/json" \ "https://api.scraperapi.com?api_key=API_KEY&url=https://postman-echo.com/post" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&url=https://postman-echo.com/post", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ foo: "bar" }), } ); const data = await response.json(); console.log(data); ``` **Python:** ```python import requests response = requests.post( "https://api.scraperapi.com", params={"api_key": "API_KEY", "url": "https://postman-echo.com/post"}, json={"foo": "bar"} ) print(response.text) ``` --- ## 2. Proxy Port Method Use ScraperAPI as an HTTP proxy. Useful for integrating with existing scraping code that supports proxy configuration. ### Connection Details | Field | Value | |----------|----------------------------------------| | Host | `proxy-server.scraperapi.com` | | Port | `8001` | | Username | `scraperapi` | | Password | Your API key | | Protocol | HTTP | ### Passing Parameters via Proxy Attach parameters to the username separated by dots: ``` scraperapi.render=true.country_code=us ``` ### SSL Note You must either disable SSL certificate verification or install the ScraperAPI CA certificate from `https://api.scraperapi.com/proxyca.pem`. 
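As a sketch of the SSL note above, here are the two verification options side by side with `requests` (the API key is a placeholder, and the local filename `proxyca.pem` is an assumption):

```python
api_key = "API_KEY"  # placeholder -- read your real key from the environment
proxy = f"http://scraperapi:{api_key}@proxy-server.scraperapi.com:8001"
# requests applies the same proxy to both schemes:
proxies = {"http": proxy, "https": proxy}

# Option A -- disable certificate verification (quick, but insecure):
#   requests.get("https://example.com", proxies=proxies, verify=False)

# Option B -- download and trust the ScraperAPI CA bundle instead:
#   urllib.request.urlretrieve("https://api.scraperapi.com/proxyca.pem", "proxyca.pem")
#   requests.get("https://example.com", proxies=proxies, verify="proxyca.pem")
print(proxy)
```

Option B avoids suppressing certificate warnings for every request in the process.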
### Example: Basic Proxy Request **cURL:** ```bash curl --proxy "http://scraperapi:API_KEY@proxy-server.scraperapi.com:8001" \ -k "https://example.com" ``` **TypeScript (axios):** ```typescript import axios from "axios"; const response = await axios.get("https://example.com", { proxy: { host: "proxy-server.scraperapi.com", port: 8001, auth: { username: "scraperapi", password: "API_KEY" }, protocol: "http", }, }); console.log(response.data); ``` **Python:** ```python import requests proxies = { "http": "http://scraperapi:API_KEY@proxy-server.scraperapi.com:8001", "https": "http://scraperapi:API_KEY@proxy-server.scraperapi.com:8001" } response = requests.get("https://example.com", proxies=proxies, verify=False) print(response.text) ``` ### Example: Proxy with Parameters **cURL:** ```bash curl --proxy "http://scraperapi.render=true.country_code=us:API_KEY@proxy-server.scraperapi.com:8001" \ -k "https://example.com" ``` **TypeScript (axios):** ```typescript import axios from "axios"; const response = await axios.get("https://example.com", { proxy: { host: "proxy-server.scraperapi.com", port: 8001, auth: { username: "scraperapi.render=true.country_code=us", password: "API_KEY", }, protocol: "http", }, }); console.log(response.data); ``` **Python:** ```python import requests proxies = { "http": "http://scraperapi.render=true.country_code=us:API_KEY@proxy-server.scraperapi.com:8001", "https": "http://scraperapi.render=true.country_code=us:API_KEY@proxy-server.scraperapi.com:8001" } response = requests.get("https://example.com", proxies=proxies, verify=False) print(response.text) ``` ### Example: POST via Proxy **cURL:** ```bash curl -X POST -d '{"foo":"bar"}' -H "Content-Type: application/json" \ --proxy "http://scraperapi:API_KEY@proxy-server.scraperapi.com:8001" \ -k "https://postman-echo.com/post" ``` **TypeScript (axios):** ```typescript import axios from "axios"; const response = await axios.post( "https://postman-echo.com/post", { foo: "bar" }, { headers: { 
"Content-Type": "application/json" }, proxy: { host: "proxy-server.scraperapi.com", port: 8001, auth: { username: "scraperapi", password: "API_KEY" }, protocol: "http", }, } ); console.log(response.data); ``` **Python:** ```python import requests proxies = { "http": "http://scraperapi:API_KEY@proxy-server.scraperapi.com:8001", "https": "http://scraperapi:API_KEY@proxy-server.scraperapi.com:8001" } response = requests.post( "https://postman-echo.com/post", proxies=proxies, json={"foo": "bar"}, verify=False ) print(response.text) ``` --- ## 3. Asynchronous API The Async API creates background scraping jobs. Jobs run for up to 24 hours with automatic retries. Results are stored for 24-72 hours. ### Endpoints | Operation | Method | URL | |-------------------|--------|------------------------------------------------| | Submit single job | POST | `https://async.scraperapi.com/jobs` | | Submit batch | POST | `https://async.scraperapi.com/batchjobs` | | Check job status | GET | `https://async.scraperapi.com/jobs/{jobId}` | | Cancel a job | DELETE | `https://async.scraperapi.com/jobs/{jobId}` | ### Job Submission Parameters (JSON body) | Parameter | Type | Required | Default | Description | |-------------------------|---------|----------|---------|-------------------------------------------------------| | `apiKey` | string | Yes | -- | Your API key | | `url` | string | Yes | -- | Target URL to scrape | | `method` | string | No | `GET` | HTTP method (`GET`, `POST`) | | `headers` | object | No | -- | Custom headers for the request | | `body` | string | No | -- | Request body for POST requests | | `callback` | object | No | -- | `{ "type": "webhook", "url": "YOUR_WEBHOOK_URL" }` | | `apiParams` | object | No | -- | Standard API params (render, premium, etc.) 
| | `expectUnsuccessReport` | boolean | No | `false` | Receive failed job data via webhook | | `timeoutSec` | number | No | -- | Job timeout in seconds | | `meta` | object | No | -- | Custom metadata returned in response | ### apiParams Object (used in async requests) | Parameter | Type | Values | |--------------------|---------|--------------------------------| | `render` | boolean | `true` / `false` | | `premium` | boolean | `true` / `false` | | `ultra_premium` | boolean | `true` / `false` | | `country_code` | string | e.g., `"us"` | | `device_type` | string | `"desktop"`, `"mobile"` | | `keep_headers` | boolean | `true` / `false` | | `follow_redirect` | boolean | `true` / `false` | | `autoparse` | boolean | `true` / `false` | | `wait_for_selector`| string | CSS selector | | `screenshot` | boolean | `true` / `false` | | `retry_404` | boolean | `true` / `false` | | `output_format` | string | `"text"`, `"markdown"`, `"json"`, `"csv"` (json/csv for structured endpoints only) | ### Job Statuses | Status | Meaning | |------------|--------------------------------------------------| | `running` | Job is actively processing | | `finished` | Completed; results in `response` field | | `failed` | Failed (only with `expectUnsuccessReport: true`) | ### Response Fields (completed job) ```json { "id": "UUID", "status": "finished", "statusUrl": "https://async.scraperapi.com/jobs/UUID", "url": "https://example.com", "response": { "headers": { "sa-final-url": "...", "sa-statuscode": "200", ... }, "body": "...", "statusCode": 200 } } ``` For binary content (PDFs, images), `response.base64EncodedBody` replaces `response.body`. 
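Putting the fields above together, a minimal sketch of unpacking a completed job — the `job` dict below is a hypothetical payload shaped like the response fields just described:

```python
import base64

# Hypothetical finished-job payload, mirroring the response fields above.
job = {
    "id": "UUID",
    "status": "finished",
    "url": "https://example.com",
    "response": {
        "headers": {"sa-final-url": "https://example.com/", "sa-statuscode": "200"},
        "body": "<html>...</html>",
        "statusCode": 200,
    },
}

if job["status"] == "finished":
    resp = job["response"]
    final_url = resp["headers"].get("sa-final-url")
    if "base64EncodedBody" in resp:
        # Binary content (PDFs, images) arrives base64-encoded in this field.
        payload = base64.b64decode(resp["base64EncodedBody"])
    else:
        # Text/HTML content arrives as a plain string in `body`.
        payload = resp["body"]
    print(final_url, resp["statusCode"])
```

The same branch on `base64EncodedBody` vs. `body` works for batch jobs, since each job in a batch resolves to the same shape.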
### Example: Submit a Job **cURL:** ```bash curl -X POST "https://async.scraperapi.com/jobs" \ -H "Content-Type: application/json" \ -d '{"apiKey": "API_KEY", "url": "https://example.com"}' ``` **TypeScript:** ```typescript const response = await fetch("https://async.scraperapi.com/jobs", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ apiKey: "API_KEY", url: "https://example.com" }), }); const job = await response.json(); console.log(job.statusUrl); // Poll this URL for results ``` **Python:** ```python import requests response = requests.post( "https://async.scraperapi.com/jobs", json={"apiKey": "API_KEY", "url": "https://example.com"} ) job = response.json() print(job["statusUrl"]) # Poll this URL for results ``` ### Example: Submit a POST Job **cURL:** ```bash curl -X POST "https://async.scraperapi.com/jobs" \ -H "Content-Type: application/json" \ -d '{ "apiKey": "API_KEY", "url": "https://postman-echo.com/post", "method": "POST", "headers": {"content-type": "application/x-www-form-urlencoded"}, "body": "foo=bar" }' ``` **TypeScript:** ```typescript const response = await fetch("https://async.scraperapi.com/jobs", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ apiKey: "API_KEY", url: "https://postman-echo.com/post", method: "POST", headers: { "content-type": "application/x-www-form-urlencoded" }, body: "foo=bar", }), }); const job = await response.json(); console.log(job); ``` **Python:** ```python import requests response = requests.post( "https://async.scraperapi.com/jobs", json={ "apiKey": "API_KEY", "url": "https://postman-echo.com/post", "method": "POST", "headers": {"content-type": "application/x-www-form-urlencoded"}, "body": "foo=bar" } ) print(response.json()) ``` ### Example: Submit with Webhook Callback **cURL:** ```bash curl -X POST "https://async.scraperapi.com/jobs" \ -H "Content-Type: application/json" \ -d '{ "apiKey": "API_KEY", "url": "https://example.com", 
"callback": {"type": "webhook", "url": "https://your-server.com/webhook"} }' ``` **TypeScript:** ```typescript const response = await fetch("https://async.scraperapi.com/jobs", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ apiKey: "API_KEY", url: "https://example.com", callback: { type: "webhook", url: "https://your-server.com/webhook" }, }), }); const job = await response.json(); console.log(job); ``` **Python:** ```python import requests response = requests.post( "https://async.scraperapi.com/jobs", json={ "apiKey": "API_KEY", "url": "https://example.com", "callback": {"type": "webhook", "url": "https://your-server.com/webhook"} } ) print(response.json()) ``` ### Example: Submit with API Params **cURL:** ```bash curl -X POST "https://async.scraperapi.com/jobs" \ -H "Content-Type: application/json" \ -d '{ "apiKey": "API_KEY", "url": "https://example.com", "apiParams": { "render": true, "country_code": "us", "premium": true, "device_type": "mobile" } }' ``` **TypeScript:** ```typescript const response = await fetch("https://async.scraperapi.com/jobs", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ apiKey: "API_KEY", url: "https://example.com", apiParams: { render: true, country_code: "us", premium: true, device_type: "mobile", }, }), }); const job = await response.json(); console.log(job); ``` **Python:** ```python import requests response = requests.post( "https://async.scraperapi.com/jobs", json={ "apiKey": "API_KEY", "url": "https://example.com", "apiParams": { "render": True, "country_code": "us", "premium": True, "device_type": "mobile" } } ) print(response.json()) ``` ### Example: Check Job Status **cURL:** ```bash curl "https://async.scraperapi.com/jobs/JOB_ID" ``` **TypeScript:** ```typescript const response = await fetch("https://async.scraperapi.com/jobs/JOB_ID"); const result = await response.json(); if (result.status === "finished") { 
console.log(result.response.body); } ``` **Python:** ```python import requests response = requests.get("https://async.scraperapi.com/jobs/JOB_ID") result = response.json() if result["status"] == "finished": print(result["response"]["body"]) ``` ### Example: Batch Requests (up to 50,000 URLs) **cURL:** ```bash curl -X POST "https://async.scraperapi.com/batchjobs" \ -H "Content-Type: application/json" \ -d '{ "apiKey": "API_KEY", "urls": [ "https://example.com/page1", "https://example.com/page2" ] }' ``` **TypeScript:** ```typescript const response = await fetch("https://async.scraperapi.com/batchjobs", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ apiKey: "API_KEY", urls: ["https://example.com/page1", "https://example.com/page2"], }), }); const jobs = await response.json(); // Array of job objects for (const job of jobs) { console.log(job.id, job.statusUrl); } ``` **Python:** ```python import requests response = requests.post( "https://async.scraperapi.com/batchjobs", json={ "apiKey": "API_KEY", "urls": ["https://example.com/page1", "https://example.com/page2"] } ) jobs = response.json() # List of job objects for job in jobs: print(job["id"], job["statusUrl"]) ``` ### Example: Decode Binary Response (PDF/Image) **cURL:** ```bash curl "https://async.scraperapi.com/jobs/JOB_ID" \ | jq -r '.response.base64EncodedBody' \ | base64 -d > output.pdf ``` **TypeScript:** ```typescript import { writeFileSync } from "fs"; const response = await fetch("https://async.scraperapi.com/jobs/JOB_ID"); const result = await response.json(); const pdfData = Buffer.from(result.response.base64EncodedBody, "base64"); writeFileSync("output.pdf", pdfData); ``` **Python:** ```python import requests import base64 response = requests.get("https://async.scraperapi.com/jobs/JOB_ID") b64_body = response.json()["response"]["base64EncodedBody"] with open("output.pdf", "wb") as f: f.write(base64.b64decode(b64_body)) ``` ### Webhook Notes - Webhooks fire only 
for successful requests by default. - Set `expectUnsuccessReport: true` to receive failure notifications. - The system retries webhook delivery 3 times before canceling. - Webhook URL must be publicly accessible. --- ## 4. Supported Parameters Complete reference of all parameters across all request methods. | Parameter | Type | Values | Default | Credit Cost | |---------------------|---------|-------------------------------------|-----------|------------------------------------------------| | `render` | boolean | `true`/`false` | `false` | 10 credits; 75 with `ultra_premium` | | `wait_for_selector` | string | CSS selector | none | No extra (requires `render=true`) | | `screenshot` | boolean | `true`/`false` | `false` | 10 credits (auto-enables render) | | `country_code` | string | 2-letter code (e.g., `us`) | none | No extra cost | | `premium` | boolean | `true`/`false` | `false` | 10 credits; 25 with `render=true` | | `ultra_premium` | boolean | `true`/`false` | `false` | 30 credits; 75 with `render=true` | | `session_number` | integer | Any integer | none | No extra cost | | `keep_headers` | boolean | `true`/`false` | `false` | No extra cost | | `device_type` | string | `desktop`, `mobile` | `desktop` | No extra cost | | `autoparse` | boolean | `true`/`false` | `false` | No extra cost | | `output_format` | string | `markdown`, `text`, `json`, `csv` | HTML | No extra cost | | `follow_redirect` | boolean | `true`/`false` | `true` | No extra cost | | `retry_404` | boolean | `true`/`false` | `false` | No extra cost | | `zip` | string | US ZIP code (e.g., `92223`) | none | No extra cost (Amazon US only) | | `cache_control` | string | `no-cache` | none | No extra cost (ultra_premium only) | | `max_cost` | integer | Credit limit | none | N/A (returns 403 if exceeded) | ### Mutual Exclusivity Rules - `premium` and `ultra_premium` **cannot** be used together. - `session_number` **cannot** be combined with `premium` or `ultra_premium`. 
- Custom headers (`keep_headers=true`) are **discarded** when `ultra_premium=true`. - `wait_for_selector` **requires** `render=true`. - `device_type` is **overridden** if `keep_headers=true` with a custom User-Agent. - `instruction_set` is **only supported through headers** (`x-sapi-instruction_set`). --- ## 5. Credit Costs ### Base Domain Costs | Category | Domains | Credits | |---------------|-------------------------------|---------| | Normal (flat) | All standard domains | 1 | | E-Commerce | Amazon | 5 | | SERP | Google, Bing (all subdomains) | 25 | | Social Media | LinkedIn | 30 | ### Bot-Protection Surcharges (additional) | Protection | Extra Credits | |-------------------------------|---------------| | Cloudflare Bypass | 10 | | Cloudflare Turnstile Bypass | 10 | | Datadome Bypass | 10 | | PerimeterX/Human Bypass | 10 | ### Parameter Costs | Configuration | Credits | |---------------------------------------|---------| | `render=true` | 10 | | `premium=true` | 10 | | `screenshot=true` | 10 | | `ultra_premium=true` | 30 | | `premium=true` + `render=true` | 25 | | `ultra_premium=true` + `render=true` | 75 | ### Cost Lookup API ``` GET https://api.scraperapi.com/account/urlcost?api_key=API_KEY&url=TARGET_URL&render=true ``` The `sa-credit-cost` response header shows the credit cost of each request. ### Billing Rules - You are only charged for successful requests (`200` and `404` status codes). - Failed requests (`500`) are **not charged**. - Requests cancelled before the 70-second timeout **are charged**. --- ## 6. API Status Codes | Code | Meaning | |------|-----------------------------------------------------------------------------------------------| | 200 | Successful response. | | 400 | Malformed request. Check URL encoding. | | 401 | Unauthorized. Invalid API key. | | 403 | Credits exhausted or `max_cost` exceeded. | | 404 | Target page does not exist. (You are charged.) | | 429 | Too many concurrent requests. Reduce concurrency to your plan limit. 
| | 500 | Request failed after retries. Not charged. Try `premium=true` or `ultra_premium=true`. | ### Recommended Error Handling | Status | Action | |--------|--------| | `200` | Success — process response | | `404` | Target page not found (you are charged). Check URL validity | | `429` | Too many concurrent requests — reduce concurrency or add backoff | | `500` | Failed after retries (not charged). Try escalating: add `premium=true`, then `ultra_premium=true` | **Python retry pattern:** ```python import requests import time def scrape_with_retry(url, api_key, max_retries=3): params = {"api_key": api_key, "url": url} escalation = [ {}, # First try: default {"premium": "true"}, # Second try: premium proxies {"ultra_premium": "true"}, # Third try: ultra premium ] for i in range(min(max_retries, len(escalation))): response = requests.get( "https://api.scraperapi.com", params={**params, **escalation[i]} ) if response.status_code == 200: return response if response.status_code == 429: time.sleep(2 ** i) # Exponential backoff continue if response.status_code == 500: continue # Escalate to next tier return response # Return last response if all retries fail ``` --- ## 7. Output Formats ### LLM-Friendly Formats (works for any URL) | Parameter Value | Description | |--------------------------|--------------------------------| | `output_format=text` | Clean text, no HTML tags | | `output_format=markdown` | Markdown formatted content | ### Structured Formats (supported domains only) | Parameter Value | Description | |------------------------|-----------------------| | `output_format=json` | Structured JSON | | `output_format=csv` | CSV format | | `autoparse=true` | Auto-parse to JSON | ### Autoparse Supported Domains Amazon (product, search, offers, reviews), eBay (product, search), Walmart (product, category), Google (search, news, shopping, jobs, maps), Redfin (for sale, for rent, search, agent details). 
### Example: Get Markdown Output **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&output_format=markdown&url=https://example.com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&output_format=markdown&url=https://example.com" ); const markdown = await response.text(); console.log(markdown); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "output_format": "markdown", "url": "https://example.com"} ) print(response.text) ``` ### Example: Autoparse Amazon Product to JSON **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&autoparse=true&url=https://www.amazon.com/dp/B07FTKQ97Q" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&autoparse=true&url=https://www.amazon.com/dp/B07FTKQ97Q" ); const data = await response.json(); console.log(data); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "autoparse": "true", "url": "https://www.amazon.com/dp/B07FTKQ97Q"} ) print(response.json()) ``` --- ## 8. JavaScript Rendering Enable with `render=true`. The API loads the page in a headless browser, executes JavaScript, and returns the fully rendered HTML. Use `wait_for_selector` to wait for a specific CSS element before returning (requires `render=true`). 
### Example: Render and Wait for Selector **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&render=true&wait_for_selector=%23content&url=https://example.com" ``` **TypeScript:** ```typescript const params = new URLSearchParams({ api_key: "API_KEY", render: "true", wait_for_selector: "#content", url: "https://example.com", }); const response = await fetch(`https://api.scraperapi.com?${params}`); const html = await response.text(); console.log(html); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={ "api_key": "API_KEY", "render": "true", "wait_for_selector": "#content", "url": "https://example.com" } ) print(response.text) ``` --- ## 9. Rendering Instruction Set Advanced browser automation instructions passed as a JSON array via the `x-sapi-instruction_set` header. Requires `x-sapi-render: true`. Limit to 3-4 actions to avoid timeouts. ### Supported Instructions #### 1. Click ```json {"type": "click", "selector": {"type": "css", "value": "#button"}, "timeout": 5} ``` #### 2. Input ```json {"type": "input", "selector": {"type": "css", "value": "#search"}, "value": "query text", "timeout": 5} ``` #### 3. Scroll ```json {"type": "scroll", "direction": "y", "value": 500} ``` `value` can be an integer (pixels) or `"top"` / `"bottom"`. Optional `selector` to scroll within an element. #### 4. Wait ```json {"type": "wait", "value": 5} ``` Pauses for N seconds. #### 5. Wait for Selector ```json {"type": "wait_for_selector", "selector": {"type": "css", "value": "#loaded"}, "timeout": 10} ``` #### 6. Wait for Event ```json {"type": "wait_for_event", "event": "networkidle", "timeout": 10} ``` Events: `domcontentloaded`, `load`, `navigation`, `networkidle`, `stabilize`. #### 7. 
Loop (cannot nest loops) ```json {"type": "loop", "for": 3, "instructions": [{"type": "click", "selector": {"type": "css", "value": ".load-more"}}]} ``` ### Selector Types - `css` - CSS selector - `xpath` - XPath expression - `text` - Text content match ### Example: Click a Button Then Wait **cURL:** ```bash curl -H "x-sapi-render: true" \ -H 'x-sapi-instruction_set: [{"type":"click","selector":{"type":"css","value":"#load-more"}},{"type":"wait_for_selector","selector":{"type":"css","value":".results"}}]' \ "https://api.scraperapi.com?api_key=API_KEY&url=https://example.com" ``` **TypeScript:** ```typescript const instructions = JSON.stringify([ { type: "click", selector: { type: "css", value: "#load-more" } }, { type: "wait_for_selector", selector: { type: "css", value: ".results" } }, ]); const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&url=https://example.com", { headers: { "x-sapi-render": "true", "x-sapi-instruction_set": instructions, }, } ); const html = await response.text(); console.log(html); ``` **Python:** ```python import requests import json instructions = json.dumps([ {"type": "click", "selector": {"type": "css", "value": "#load-more"}}, {"type": "wait_for_selector", "selector": {"type": "css", "value": ".results"}} ]) response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "url": "https://example.com"}, headers={"x-sapi-render": "true", "x-sapi-instruction_set": instructions} ) print(response.text) ``` --- ## 10. Screenshot Capture Add `screenshot=true` to capture a PNG screenshot. Automatically enables JS rendering. The screenshot URL is returned in the `sa-screenshot` response header. 
### Example **cURL:** ```bash curl -D - "https://api.scraperapi.com?api_key=API_KEY&screenshot=true&url=https://example.com" # Look for sa-screenshot header in response ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&screenshot=true&url=https://example.com" ); const screenshotUrl = response.headers.get("sa-screenshot"); console.log("Screenshot URL:", screenshotUrl); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "screenshot": "true", "url": "https://example.com"} ) screenshot_url = response.headers.get("sa-screenshot") print("Screenshot URL:", screenshot_url) ``` --- ## 11. Geotargeting Use `country_code` to route requests through proxies in a specific country. No additional credit cost. ### Plan Requirements - **Hobby/Startup:** `us` and `eu` only. - **Business+:** All 70 standard country codes. - **Premium Geo:** 184 additional countries (requires `premium=true`). 
### Standard Geo Country Codes (70) `us`, `eu`, `au`, `ae`, `ar`, `at`, `bd`, `be`, `bg`, `br`, `ca`, `ch`, `cl`, `cn`, `co`, `cy`, `cz`, `de`, `dk`, `ec`, `ee`, `eg`, `es`, `fi`, `fr`, `gr`, `hk`, `hr`, `hu`, `id`, `ie`, `il`, `in`, `is`, `it`, `jo`, `jp`, `ke`, `kr`, `li`, `lt`, `lv`, `mt`, `mx`, `my`, `ng`, `nl`, `no`, `nz`, `pa`, `pe`, `ph`, `pk`, `pl`, `pt`, `ro`, `ru`, `sa`, `se`, `sg`, `si`, `sk`, `th`, `tr`, `tw`, `ua`, `uk`, `ve`, `vn`, `za` ### Example **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&country_code=de&url=https://example.com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&country_code=de&url=https://example.com" ); const html = await response.text(); console.log(html); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "country_code": "de", "url": "https://example.com"} ) print(response.text) ``` ### Amazon ZIP Code Targeting For Amazon US, use the `zip` parameter for location-specific results: ```bash curl "https://api.scraperapi.com?api_key=API_KEY&zip=92223&url=https://www.amazon.com/dp/B07FTKQ97Q" ``` --- ## 12. Custom Headers Set `keep_headers=true` to forward your custom headers to the target site. > **Limitation:** Custom headers are **discarded** when `ultra_premium=true`. 
### Example **cURL:** ```bash curl -H "X-MyHeader: 123" \ "https://api.scraperapi.com?api_key=API_KEY&keep_headers=true&url=https://example.com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&keep_headers=true&url=https://example.com", { headers: { "X-MyHeader": "123" } } ); const html = await response.text(); console.log(html); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "keep_headers": "true", "url": "https://example.com"}, headers={"X-MyHeader": "123"} ) print(response.text) ``` --- ## 13. Passing API Parameters as Headers Use the `x-sapi-` prefix to pass any API parameter as a header instead of a query param. ``` x-sapi-api_key: API_KEY x-sapi-render: true x-sapi-country_code: us ``` > The `instruction_set` parameter is **only** supported through headers (`x-sapi-instruction_set`). ### Example **cURL:** ```bash curl -H "x-sapi-api_key: API_KEY" \ -H "x-sapi-render: true" \ "https://api.scraperapi.com?url=https://example.com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?url=https://example.com", { headers: { "x-sapi-api_key": "API_KEY", "x-sapi-render": "true", }, } ); const html = await response.text(); console.log(html); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"url": "https://example.com"}, headers={"x-sapi-api_key": "API_KEY", "x-sapi-render": "true"} ) print(response.text) ``` --- ## 14. Device Type Set `device_type=mobile` or `device_type=desktop` (default) to control the User-Agent. > Overridden if `keep_headers=true` with a custom User-Agent header. 
### Example **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&device_type=mobile&url=https://example.com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&device_type=mobile&url=https://example.com" ); const html = await response.text(); console.log(html); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "device_type": "mobile", "url": "https://example.com"} ) print(response.text) ``` --- ## 15. Cached Responses Applies **only** to `ultra_premium=true` requests. Caching is enabled by default with a 10-minute TTL. Bypass cache with `cache_control=no-cache`. Check for `sa-from-cache: 1` response header to identify cached results. ### Example: Bypass Cache **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&ultra_premium=true&cache_control=no-cache&url=https://example.com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&ultra_premium=true&cache_control=no-cache&url=https://example.com" ); const html = await response.text(); const fromCache = response.headers.get("sa-from-cache"); console.log("From cache:", fromCache); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={ "api_key": "API_KEY", "ultra_premium": "true", "cache_control": "no-cache", "url": "https://example.com" } ) print("From cache:", response.headers.get("sa-from-cache")) print(response.text) ``` --- ## 16. Cost Control Use `max_cost` to cap the credit cost of a request. Returns `403` if the cost exceeds the limit. 
### Example **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&premium=true&max_cost=5&url=https://example.com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com?api_key=API_KEY&premium=true&max_cost=5&url=https://example.com" ); if (response.status === 403) { console.log("Request exceeds max_cost limit"); } else { console.log(await response.text()); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com", params={"api_key": "API_KEY", "premium": "true", "max_cost": "5", "url": "https://example.com"} ) if response.status_code == 403: print("Request exceeds max_cost limit") else: print(response.text) ``` --- ## 17. Sticky Sessions Use `session_number` to reuse the same proxy IP across requests. Sessions expire 15 minutes after last use. > Cannot be combined with `premium` or `ultra_premium`. ### Example **cURL:** ```bash curl "https://api.scraperapi.com?api_key=API_KEY&session_number=123&url=https://example.com" ``` **TypeScript:** ```typescript // First request const r1 = await fetch( "https://api.scraperapi.com?api_key=API_KEY&session_number=123&url=https://example.com" ); // Second request uses same proxy IP const r2 = await fetch( "https://api.scraperapi.com?api_key=API_KEY&session_number=123&url=https://example.com/page2" ); ``` **Python:** ```python import requests session_params = {"api_key": "API_KEY", "session_number": "123"} # First request r1 = requests.get("https://api.scraperapi.com", params={**session_params, "url": "https://example.com"}) # Second request uses same proxy IP r2 = requests.get("https://api.scraperapi.com", params={**session_params, "url": "https://example.com/page2"}) ``` --- ## 18. Structured Data Endpoints - Amazon Structured Data Endpoints return pre-parsed JSON/CSV data for supported domains. They work with both Sync and Async APIs. 
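Every sync structured endpoint shares the same shape: `https://api.scraperapi.com/structured/` + endpoint path + `api_key` + endpoint-specific query params. A small reusable helper keeps generated scripts consistent with the Quick Reference setup guidance. This is a minimal sketch, not part of the API: `build_structured_request` and `structured_get` are hypothetical names, and it assumes `SCRAPERAPI_KEY` has already been exported into the environment (e.g. loaded from a `.env` file).

```python
import os

def build_structured_request(endpoint: str, api_key: str, **params):
    """Build (url, query_params) for a sync Structured Data call.

    endpoint is the path after /structured/, e.g. "amazon/product".
    """
    url = f"https://api.scraperapi.com/structured/{endpoint}"
    return url, {"api_key": api_key, **params}

def structured_get(endpoint: str, **params) -> dict:
    """Call a structured endpoint, reading SCRAPERAPI_KEY from the environment."""
    import requests  # imported here so the pure builder above stays dependency-free
    url, query = build_structured_request(
        endpoint, os.environ["SCRAPERAPI_KEY"], **params
    )
    response = requests.get(url, params=query, timeout=70)  # recommended timeout
    response.raise_for_status()
    return response.json()

# Usage (hypothetical):
# product = structured_get("amazon/product", asin="B07FTKQ97Q", tld="com")
# print(product["name"])
```

The builder is separated from the network call so the request construction can be checked without sending traffic.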
### Amazon Product **Sync:** `GET https://api.scraperapi.com/structured/amazon/product` **Async:** `POST https://async.scraperapi.com/structured/amazon/product` | Parameter | Type | Required | Description | |-----------------|--------|----------|---------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `asin` | string | Yes (sync) | Amazon ASIN (e.g., `B07FTKQ97Q`) | | `asins` | array | Yes (async batch) | Multiple ASINs | | `tld` | string | No | Amazon domain: `com`, `co.uk`, `ca`, `de`, `es`, `fr`, `ie`, `it`, `co.jp`, `co.za`, `in`, `cn`, `com.sg`, `com.mx`, `ae`, `com.br`, `nl`, `com.au`, `com.tr`, `sa`, `se`, `pl` | | `country_code` | string | No | Geo-targeting country code | | `output_format` | string | No | `json` (default), `csv` | **Response fields:** `name`, `brand`, `brand_url`, `full_description`, `product_information` (dimensions, weight, ASIN, etc.), `images`, `feature_bullets`, `reviews`, `is_coupon_exists`, `product_category` **cURL:** ```bash curl "https://api.scraperapi.com/structured/amazon/product?api_key=API_KEY&asin=B07FTKQ97Q&tld=com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/amazon/product?api_key=API_KEY&asin=B07FTKQ97Q&tld=com" ); const product = await response.json(); console.log(product.name, product.brand); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/amazon/product", params={"api_key": "API_KEY", "asin": "B07FTKQ97Q", "tld": "com"} ) product = response.json() print(product["name"], product["brand"]) ``` ### Amazon Search **Sync:** `GET https://api.scraperapi.com/structured/amazon/search` **Async:** `POST https://async.scraperapi.com/structured/amazon/search` | Parameter | Type | Required | Description | |-----------------|---------|------------|------------------------------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `query` | string | Yes (sync) 
| Search query | | `queries` | array | Yes (async batch) | Multiple queries | | `tld` | string | No | Amazon domain | | `country_code` | string | No | Geo-targeting | | `page` | integer | No | Pagination page number | | `output_format` | string | No | `json` (default), `csv` | | `ref` | string | No | Amazon reference string (e.g., `olp_f_usedAcceptable`) | | `s` | string | No | Sort param (e.g., `price-desc-rank`) | | `i` | string | No | Category refinement (e.g., `electronics`) | **Response fields:** `ads`, `amazons_choice`, `results` (array of products with `name`, `price`, `stars`, `total_reviews`, `url`, `image`, `has_prime`, `is_best_seller`, `availability_quantity`), `pagination` **cURL:** ```bash curl "https://api.scraperapi.com/structured/amazon/search?api_key=API_KEY&query=laptop&tld=com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/amazon/search?api_key=API_KEY&query=laptop&tld=com" ); const data = await response.json(); for (const item of data.results) { console.log(item.name, item.price); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/amazon/search", params={"api_key": "API_KEY", "query": "laptop", "tld": "com"} ) data = response.json() for item in data["results"]: print(item["name"], item["price"]) ``` ### Amazon Offers **Sync:** `GET https://api.scraperapi.com/structured/amazon/offers` **Async:** `POST https://async.scraperapi.com/structured/amazon/offers` | Parameter | Type | Required | Description | |--------------------|---------|------------|----------------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `asin` | string | Yes (sync) | Amazon ASIN | | `asins` | array | Yes (async batch) | Multiple ASINs | | `tld` | string | No | Amazon domain | | `country_code` | string | No | Geo-targeting | | `output_format` | string | No | `json` (default), `csv` | | `f_new` | boolean | No | Filter: new condition 
| | `f_used_good` | boolean | No | Filter: good condition | | `f_used_like_new` | boolean | No | Filter: like-new condition | | `f_used_very_good` | boolean | No | Filter: very good condition | | `f_used_acceptable`| boolean | No | Filter: acceptable condition | **Response fields:** `listing_price`, `shipping_price`, `condition` (new/used/renewed), `seller_name`, `seller_rating`, `is_prime`, `is_fba`, `delivery` **cURL:** ```bash curl "https://api.scraperapi.com/structured/amazon/offers?api_key=API_KEY&asin=B07FTKQ97Q&tld=com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/amazon/offers?api_key=API_KEY&asin=B07FTKQ97Q&tld=com" ); const data = await response.json(); console.log(data); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/amazon/offers", params={"api_key": "API_KEY", "asin": "B07FTKQ97Q", "tld": "com"} ) print(response.json()) ``` --- ## 19. Structured Data Endpoints - Walmart ### Walmart Product **Sync:** `GET https://api.scraperapi.com/structured/walmart/product` **Async:** `POST https://async.scraperapi.com/structured/walmart/product` | Parameter | Type | Required | Description | |--------------------|--------|------------|-------------------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `product_id`/`productId` | string | Yes (sync) | Walmart product ID from URL | | `productIds` | array | Yes (async batch) | Multiple product IDs | | `tld` | string | No | `com` (walmart.com), `ca` (walmart.ca) | | `country_code` | string | No | Geo-targeting | | `output_format` | string | No | `json` (default), `csv` | **Response fields:** `product_name`, `brand`, `price`, `currency`, `short_description`, `long_description`, `images`, `rating`, `review_count`, `availability`, `seller`, `specifications`, `variants` **cURL:** ```bash curl 
"https://api.scraperapi.com/structured/walmart/product?api_key=API_KEY&product_id=5253396052" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/walmart/product?api_key=API_KEY&product_id=5253396052" ); const product = await response.json(); console.log(product.product_name, product.brand); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/walmart/product", params={"api_key": "API_KEY", "product_id": "5253396052"} ) print(response.json()) ``` ### Walmart Search **Sync:** `GET https://api.scraperapi.com/structured/walmart/search` **Async:** `POST https://async.scraperapi.com/structured/walmart/search` | Parameter | Type | Required | Description | |--------------------|---------|------------|--------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `query` | string | Yes (sync) | Search term | | `queries` | array | Yes (async batch) | Multiple queries | | `tld` | string | No | `com`, `ca` | | `country_code` | string | No | Geo-targeting | | `page` | integer | No | Results page number | | `output_format` | string | No | `json` (default), `csv` | **Response fields:** `items` (array with `name`, `price`, `seller`, `stars`, `total_reviews`, `url`, `image`, `availability`), `pagination` **cURL:** ```bash curl "https://api.scraperapi.com/structured/walmart/search?api_key=API_KEY&query=skateboard" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/walmart/search?api_key=API_KEY&query=skateboard" ); const data = await response.json(); for (const item of data.items) { console.log(item.name, item.price); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/walmart/search", params={"api_key": "API_KEY", "query": "skateboard"} ) for item in response.json()["items"]: print(item["name"], item["price"]) ``` ### Walmart Category **Sync:** `GET 
https://api.scraperapi.com/structured/walmart/category` **Async:** `POST https://async.scraperapi.com/structured/walmart/category` | Parameter | Type | Required | Description | |--------------------|---------|------------|------------------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `category` | string | Yes (sync) | Category ID (e.g., `3944_1089430_37807`) | | `categories` | array | Yes (async batch) | Multiple category IDs | | `tld` | string | No | `com`, `ca` | | `page` | integer | No | Pagination | | `output_format` | string | No | `json` (default), `csv` | **Response fields:** `items` (array with `name`, `price`, `url`, `image`, `stars`, `total_reviews`), `pagination` **cURL:** ```bash curl "https://api.scraperapi.com/structured/walmart/category?api_key=API_KEY&category=3944_1089430_37807" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/walmart/category?api_key=API_KEY&category=3944_1089430_37807" ); const data = await response.json(); console.log(data.items); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/walmart/category", params={"api_key": "API_KEY", "category": "3944_1089430_37807"} ) print(response.json()) ``` ### Walmart Reviews **Sync:** `GET https://api.scraperapi.com/structured/walmart/review` **Async:** `POST https://async.scraperapi.com/structured/walmart/review` | Parameter | Type | Required | Description | |----------------------|---------|----------|------------------------------------------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `product_id`/`productId` | string | Yes (sync) | Walmart product ID | | `productIds` | array | Yes (async batch) | Multiple IDs | | `tld` | string | Yes | `com`, `ca` | | `country_code` | string | Yes | Geo-targeting | | `sort` | string | Yes | `relevancy`, `helpful`, `submission-desc`, `submission-asc`, `rating-desc`, `rating-asc` 
| | `ratings` | string | No | Filter by stars, comma-separated: `1,2,3,4,5` | | `verified_purchase` | boolean | No | Only verified reviews | | `page` | integer | No | Pagination | | `output_format` | string | No | `json` (default), `csv` | **cURL:** ```bash curl "https://api.scraperapi.com/structured/walmart/review?api_key=API_KEY&product_id=5253396052&tld=com&country_code=us&sort=relevancy" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/walmart/review?api_key=API_KEY&product_id=5253396052&tld=com&country_code=us&sort=relevancy" ); const data = await response.json(); for (const review of data.reviews) { console.log(review.title, review.rating); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/walmart/review", params={ "api_key": "API_KEY", "product_id": "5253396052", "tld": "com", "country_code": "us", "sort": "relevancy" } ) for review in response.json()["reviews"]: print(review["title"], review["rating"]) ``` --- ## 20. 
Structured Data Endpoints - eBay ### eBay Product **Sync:** `GET https://api.scraperapi.com/structured/ebay/product` **Async:** `POST https://async.scraperapi.com/structured/ebay/product` | Parameter | Type | Required | Description | |----------------------|--------|------------|-----------------------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `product_id`/`productId` | string | Yes (sync) | eBay product ID (12 digits) | | `productIds` | array | Yes (async batch) | Multiple product IDs | | `tld` | string | No | `com`, `co.uk`, `com.au`, `de`, `ca`, `fr`, `it`, `es`, `at`, `ch`, `com.sg`, `com.my`, `ph`, `ie`, `pl`, `nl` | | `country_code` | string | No | Language/currency control | | `output_format` | string | No | `json` (default), `csv` | **Response fields:** `title`, `seller` (name, rating, top_rated), `price` (value, currency), `images`, `condition`, `brand`, `rating`, `review_count`, `similar_items` **cURL:** ```bash curl "https://api.scraperapi.com/structured/ebay/product?api_key=API_KEY&product_id=166619046796&tld=com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/ebay/product?api_key=API_KEY&product_id=166619046796&tld=com" ); const product = await response.json(); console.log(product.title, product.price); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/ebay/product", params={"api_key": "API_KEY", "product_id": "166619046796", "tld": "com"} ) product = response.json() print(product["title"], product["price"]) ``` ### eBay Search **Sync:** `GET https://api.scraperapi.com/structured/ebay/search/v2` **Async:** `POST https://async.scraperapi.com/structured/ebay/search/v2` | Parameter | Type | Required | Description | |-----------------|---------|------------|----------------------------------------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `query` | string | Yes (sync) | 
Search term | | `queries` | array | Yes (async batch) | Multiple search terms | | `tld` | string | No | eBay domain (default `com`) | | `country_code` | string | No | Language/currency | | `sort_by` | string | No | `ending_soonest`, `newly_listed`, `price_lowest`, `price_highest`, `distance_nearest`, `best_match` | | `page` | integer | No | Pagination | | `items_per_page`| integer | No | `60`, `120`, `240` | | `seller_id` | string | No | Filter by seller | | `condition` | string | No | Comma-separated: `new`, `used`, `open_box`, `refurbished`, `for_parts`, `not_working` | | `buying_format` | string | No | `buy_it_now`, `auction`, `accepts_offers` | | `show_only` | string | No | Comma-separated: `returns_accepted`, `authorized_seller`, `completed_items`, `sold_items`, `sale_items`, `listed_as_lots`, `search_in_description`, `benefits_charity`, `authenticity_guarantee` | | `output_format` | string | No | `json` (default), `csv` | **Response fields:** array of items with `product_title`, `item_price`, `item_url`, `image`, `condition`, `seller`, `shipping`, `bids`, `time_left` **cURL:** ```bash curl "https://api.scraperapi.com/structured/ebay/search/v2?api_key=API_KEY&query=iPhone&tld=com" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/ebay/search/v2?api_key=API_KEY&query=iPhone&tld=com&sort_by=price_lowest&condition=new" ); const data = await response.json(); for (const item of data) { console.log(item.product_title, item.item_price); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/ebay/search/v2", params={ "api_key": "API_KEY", "query": "iPhone", "tld": "com", "sort_by": "price_lowest", "condition": "new" } ) for item in response.json(): print(item["product_title"], item["item_price"]) ``` --- ## 21. 
Structured Data Endpoints - Google ### Google Search (SERP) **Sync:** `GET https://api.scraperapi.com/structured/google/search` **Async:** `POST https://async.scraperapi.com/structured/google/search` | Parameter | Type | Required | Description | |-----------------|---------|------------|----------------------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `query` | string | Yes (sync) | Search query | | `queries` | array | Yes (async batch) | Multiple queries | | `tld` | string | No | Google domain (e.g., `com`, `co.uk`, `de`) | | `country_code` | string | No | Geo-targeting | | `uule` | string | No | Encoded regional location | | `hl` | string | No | Host language (e.g., `DE`) | | `gl` | string | No | Country origin bias (e.g., `DE`) | | `tbs` | string | No | Time filter: `h` (hour), `d` (day), `w` (week), `m` (month), `y` (year) | | `start` | integer | No | Result offset for pagination | | `output_format` | string | No | `json` (default), `csv` | **Response fields:** `search_information`, `knowledge_graph`, `organic_results` (position, title, snippet, link), `related_questions`, `videos`, `pagination` **cURL:** ```bash curl "https://api.scraperapi.com/structured/google/search?api_key=API_KEY&query=web+scraping" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/google/search?api_key=API_KEY&query=web+scraping" ); const data = await response.json(); for (const result of data.organic_results) { console.log(result.position, result.title, result.link); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/google/search", params={"api_key": "API_KEY", "query": "web scraping"} ) for result in response.json()["organic_results"]: print(result["position"], result["title"], result["link"]) ``` ### Google News **Sync:** `GET https://api.scraperapi.com/structured/google/news` **Async:** `POST 
https://async.scraperapi.com/structured/google/news` Parameters: Same as Google Search. **Response fields:** `search_information`, `articles` (source, title, description, date, link, thumbnail), `pagination` **cURL:** ```bash curl "https://api.scraperapi.com/structured/google/news?api_key=API_KEY&query=artificial+intelligence" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/google/news?api_key=API_KEY&query=artificial+intelligence" ); const data = await response.json(); for (const article of data.articles) { console.log(article.title, article.source, article.date); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/google/news", params={"api_key": "API_KEY", "query": "artificial intelligence"} ) for article in response.json()["articles"]: print(article["title"], article["source"], article["date"]) ``` ### Google Jobs **Sync:** `GET https://api.scraperapi.com/structured/google/jobs` **Async:** `POST https://async.scraperapi.com/structured/google/jobs` Parameters: Same as Google Search. 
**Response fields:** `result.jobs_results` (title, company_name, location, via, description, extensions, tags) **cURL:** ```bash curl "https://api.scraperapi.com/structured/google/jobs?api_key=API_KEY&query=software+engineer+remote" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/google/jobs?api_key=API_KEY&query=software+engineer+remote" ); const data = await response.json(); for (const job of data.result.jobs_results) { console.log(job.title, job.company_name, job.location); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/google/jobs", params={"api_key": "API_KEY", "query": "software engineer remote"} ) for job in response.json()["result"]["jobs_results"]: print(job["title"], job["company_name"], job["location"]) ``` ### Google Shopping **Sync:** `GET https://api.scraperapi.com/structured/google/shopping` **Async:** `POST https://async.scraperapi.com/structured/google/shopping` Parameters: Same as Google Search. 
**Response fields:** `shopping_results` (position, title, source, price, extracted_price, thumbnail, delivery_options), `pagination` **cURL:** ```bash curl "https://api.scraperapi.com/structured/google/shopping?api_key=API_KEY&query=wireless+headphones" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/google/shopping?api_key=API_KEY&query=wireless+headphones" ); const data = await response.json(); for (const item of data.shopping_results) { console.log(item.title, item.price, item.source); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/google/shopping", params={"api_key": "API_KEY", "query": "wireless headphones"} ) for item in response.json()["shopping_results"]: print(item["title"], item["price"], item["source"]) ``` ### Google Maps Search **Sync:** `GET https://api.scraperapi.com/structured/google/mapssearch` **Async:** `POST https://async.scraperapi.com/structured/google/mapssearch` | Parameter | Type | Required | Description | |-----------------|--------|----------|------------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `query` | string | Yes | Search term (e.g., `vegan restaurant`) | | `latitude` | float | Yes | Geographic latitude | | `longitude` | float | Yes | Geographic longitude | **Response fields:** `results` (name, address, stars, ratings, price_level, url, type, latitude, longitude, open hours, images) **cURL:** ```bash curl "https://api.scraperapi.com/structured/google/mapssearch?api_key=API_KEY&query=vegan+restaurant&latitude=40.7128&longitude=-74.0060" ``` **TypeScript:** ```typescript const response = await fetch( "https://api.scraperapi.com/structured/google/mapssearch?api_key=API_KEY&query=vegan+restaurant&latitude=40.7128&longitude=-74.0060" ); const data = await response.json(); for (const place of data.results) { console.log(place.name, place.address, place.stars); } ``` **Python:** ```python 
import requests response = requests.get( "https://api.scraperapi.com/structured/google/mapssearch", params={ "api_key": "API_KEY", "query": "vegan restaurant", "latitude": "40.7128", "longitude": "-74.0060" } ) for place in response.json()["results"]: print(place["name"], place["address"], place["stars"]) ``` --- ## 22. Structured Data Endpoints - Redfin (Real Estate) ### Redfin Agent Details **Sync:** `GET https://api.scraperapi.com/structured/redfin/agent` **Async:** `POST https://async.scraperapi.com/structured/redfin/agent` | Parameter | Type | Required | Description | |--------------------|--------|------------|------------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `url` | string | Yes (sync) | Redfin agent URL | | `urls` | array | Yes (async batch) | Multiple agent URLs | | `tld` | string | No | `com` (redfin.com), `ca` | | `country_code` | string | No | Geo-targeting | **Response fields:** `name`, `license_number`, `brokerage`, `contact`, `about`, `neighborhoods`, `sales`, `agent_listings`, `review_ratings`, `reviews`, `agents_team` **cURL:** ```bash curl "https://api.scraperapi.com/structured/redfin/agent?api_key=API_KEY&url=https://www.redfin.com/real-estate-agents/john-doe" ``` **TypeScript:** ```typescript const response = await fetch( `https://api.scraperapi.com/structured/redfin/agent?api_key=API_KEY&url=${encodeURIComponent("https://www.redfin.com/real-estate-agents/john-doe")}` ); const agent = await response.json(); console.log(agent.name, agent.brokerage); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/redfin/agent", params={"api_key": "API_KEY", "url": "https://www.redfin.com/real-estate-agents/john-doe"} ) agent = response.json() print(agent["name"], agent["brokerage"]) ``` ### Redfin For Sale **Sync:** `GET https://api.scraperapi.com/structured/redfin/forsale` **Async:** `POST https://async.scraperapi.com/structured/redfin/forsale` | Parameter | Type | 
Required | Description | |--------------------|---------|------------|---------------------------| | `api_key`/`apiKey` | string | Yes | Your API key | | `url` | string | Yes (sync) | Redfin property URL | | `urls` | array | Yes (async batch) | Multiple URLs | | `tld` | string | No | `com`, `ca` | | `country_code` | string | No | Geo-targeting | | `raw` | boolean | No | Return extended raw data | **Response fields:** `address`, `price`, `beds`, `baths`, `sq_ft`, `year_built`, `property_type`, `latitude`, `longitude`, `image_urls`, `description`, `amenities`, `agents`, `schools`, `walk_score`, `bike_score`, `nearby_places`, `similar_homes` **cURL:** ```bash curl "https://api.scraperapi.com/structured/redfin/forsale?api_key=API_KEY&url=https://www.redfin.com/CA/San-Francisco/123-Main-St" ``` **TypeScript:** ```typescript const response = await fetch( `https://api.scraperapi.com/structured/redfin/forsale?api_key=API_KEY&url=${encodeURIComponent("https://www.redfin.com/CA/San-Francisco/123-Main-St")}` ); const listing = await response.json(); console.log(listing.address, listing.price, listing.beds, listing.baths); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/redfin/forsale", params={"api_key": "API_KEY", "url": "https://www.redfin.com/CA/San-Francisco/123-Main-St"} ) listing = response.json() print(listing["address"], listing["price"], listing["beds"], listing["baths"]) ``` ### Redfin For Rent **Sync:** `GET https://api.scraperapi.com/structured/redfin/forrent` **Async:** `POST https://async.scraperapi.com/structured/redfin/forrent` Parameters: Same as Redfin For Sale (`api_key`, `url`/`urls`, `tld`, `country_code`, `raw`). 
**Response fields:** `name`, `address`, `bed_min`, `bed_max`, `bath_min`, `bath_max`, `price_min`, `price_max`, `sqft_min`, `sqft_max`, `description`, `available_units`, `image_urls`, `floor_plans`, `schools` **cURL:** ```bash curl "https://api.scraperapi.com/structured/redfin/forrent?api_key=API_KEY&url=https://www.redfin.com/CA/San-Francisco/123-Apartments" ``` **TypeScript:** ```typescript const response = await fetch( `https://api.scraperapi.com/structured/redfin/forrent?api_key=API_KEY&url=${encodeURIComponent("https://www.redfin.com/CA/San-Francisco/123-Apartments")}` ); const rental = await response.json(); console.log(rental.name, rental.price_min, rental.price_max); ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/redfin/forrent", params={"api_key": "API_KEY", "url": "https://www.redfin.com/CA/San-Francisco/123-Apartments"} ) rental = response.json() print(rental["name"], rental["price_min"], rental["price_max"]) ``` ### Redfin Listing Search **Sync:** `GET https://api.scraperapi.com/structured/redfin/search` **Async:** `POST https://async.scraperapi.com/structured/redfin/search` Parameters: `api_key`, `url`/`urls` (Redfin search page URL), `tld`, `country_code`. 
**Response fields:** `listing` array (address, phone, number_beds, number_baths, sq_ft, thumbnail_img_url, price, badge, key_facts) **cURL:** ```bash curl "https://api.scraperapi.com/structured/redfin/search?api_key=API_KEY&url=https://www.redfin.com/city/30749/CA/San-Francisco" ``` **TypeScript:** ```typescript const response = await fetch( `https://api.scraperapi.com/structured/redfin/search?api_key=API_KEY&url=${encodeURIComponent("https://www.redfin.com/city/30749/CA/San-Francisco")}` ); const data = await response.json(); for (const listing of data.listing) { console.log(listing.address, listing.price, listing.number_beds); } ``` **Python:** ```python import requests response = requests.get( "https://api.scraperapi.com/structured/redfin/search", params={"api_key": "API_KEY", "url": "https://www.redfin.com/city/30749/CA/San-Francisco"} ) for listing in response.json()["listing"]: print(listing["address"], listing["price"], listing["number_beds"]) ``` --- ## 23. Async Pattern for All Structured Data Endpoints All structured data endpoints support async mode. The pattern is consistent: 1. Use the async URL variant (e.g., `https://async.scraperapi.com/structured/amazon/product`) 2. POST with `Content-Type: application/json` 3. Use `apiKey` (camelCase) instead of `api_key` 4. For batch: use plural parameter (`asins`, `queries`, `urls`, `productIds`, `categories`) 5. 
Optional `callback` object for webhooks ### Example: Async Batch Amazon Products **cURL:** ```bash curl -X POST "https://async.scraperapi.com/structured/amazon/product" \ -H "Content-Type: application/json" \ -d '{ "apiKey": "API_KEY", "asins": ["B07FTKQ97Q", "B09V3KXJPB"], "tld": "com", "callback": {"type": "webhook", "url": "https://your-server.com/webhook"} }' ``` **TypeScript:** ```typescript const response = await fetch( "https://async.scraperapi.com/structured/amazon/product", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ apiKey: "API_KEY", asins: ["B07FTKQ97Q", "B09V3KXJPB"], tld: "com", callback: { type: "webhook", url: "https://your-server.com/webhook" }, }), } ); const jobs = await response.json(); for (const job of jobs) { console.log(job.id, job.statusUrl); } ``` **Python:** ```python import requests response = requests.post( "https://async.scraperapi.com/structured/amazon/product", json={ "apiKey": "API_KEY", "asins": ["B07FTKQ97Q", "B09V3KXJPB"], "tld": "com", "callback": {"type": "webhook", "url": "https://your-server.com/webhook"} } ) jobs = response.json() for job in jobs: print(job["id"], job["statusUrl"]) ``` --- ## 24. Async Polling Pattern When using the Async API, poll the `statusUrl` until the job completes. Jobs can take up to 24 hours for complex sites. 
**Python:** ```python
import requests
import time

def submit_and_poll(url, api_key, poll_interval=5, timeout=300):
    """Submit an async job and poll until completion."""
    # Submit job
    response = requests.post(
        "https://async.scraperapi.com/jobs",
        json={"apiKey": api_key, "url": url},
        timeout=70,  # recommended request timeout (see Quick Reference)
    )
    job = response.json()
    status_url = job["statusUrl"]

    # Poll until done
    elapsed = 0
    while elapsed < timeout:
        result = requests.get(status_url, timeout=70).json()
        if result["status"] == "finished":
            return result["response"]
        if result["status"] == "failed":
            raise RuntimeError(f"Job failed: {result}")
        time.sleep(poll_interval)
        elapsed += poll_interval
    raise TimeoutError(f"Job did not complete within {timeout}s")

# Usage
result = submit_and_poll("https://example.com", "YOUR_API_KEY")
print(result["statusCode"], result["body"][:200])
``` **TypeScript:** ```typescript
async function submitAndPoll(url: string, apiKey: string, pollInterval = 5000, timeout = 300000) {
  // Submit job
  const submitRes = await fetch("https://async.scraperapi.com/jobs", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ apiKey, url }),
  });
  const job = await submitRes.json();

  // Poll until done
  const start = Date.now();
  while (Date.now() - start < timeout) {
    const result = await fetch(job.statusUrl).then((r) => r.json());
    if (result.status === "finished") return result.response;
    if (result.status === "failed") throw new Error(`Job failed: ${JSON.stringify(result)}`);
    await new Promise((r) => setTimeout(r, pollInterval));
  }
  throw new Error(`Job did not complete within ${timeout}ms`);
}

// Usage
const result = await submitAndPoll("https://example.com", "YOUR_API_KEY");
console.log(result.statusCode, result.body.slice(0, 200));
```
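The same polling pattern generalizes to the structured batch jobs from section 23: submit once, collect the returned job list, then poll every job's `statusUrl` until all are `finished`. A hedged Python sketch — function names are illustrative, and `fetch_status` is injectable purely so the loop can be exercised without network access:

```python
import time

def submit_structured_batch(endpoint: str, payload: dict) -> list:
    """POST a batch payload to an async structured endpoint; returns the job list."""
    import requests  # imported lazily; only needed for real network calls
    response = requests.post(
        f"https://async.scraperapi.com/structured/{endpoint}",
        json=payload,
        timeout=70,  # recommended request timeout
    )
    response.raise_for_status()
    return response.json()

def poll_jobs(jobs, fetch_status=None, poll_interval=5, timeout=300,
              sleep=time.sleep):
    """Poll each job's statusUrl until every job finishes.

    Returns {job_id: response}; raises on a failed job or overall timeout.
    """
    if fetch_status is None:
        def fetch_status(url):
            import requests
            return requests.get(url, timeout=70).json()
    pending = {job["id"]: job["statusUrl"] for job in jobs}
    results = {}
    deadline = time.monotonic() + timeout
    while pending:
        if time.monotonic() > deadline:
            raise TimeoutError(f"{len(pending)} job(s) unfinished after {timeout}s")
        for job_id, status_url in list(pending.items()):
            status = fetch_status(status_url)
            if status["status"] == "finished":
                results[job_id] = status["response"]
                del pending[job_id]
            elif status["status"] == "failed":
                raise RuntimeError(f"Job {job_id} failed: {status}")
        if pending:
            sleep(poll_interval)
    return results

# Usage (batch submit from section 23, then poll):
# jobs = submit_structured_batch("amazon/product",
#     {"apiKey": "API_KEY", "asins": ["B07FTKQ97Q", "B09V3KXJPB"], "tld": "com"})
# results = poll_jobs(jobs)
```

For webhook-driven workflows, pass the `callback` object from section 23 instead of polling; polling suits short-lived scripts without a public endpoint.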