Overview

The Scrapengine Proxy allows you to use Scrapengine as a standard HTTP/HTTPS proxy server. This is ideal for integrating with existing tools, libraries, or applications that support proxy configuration. Simply point your HTTP client to our proxy endpoint and authenticate using your API key.
The proxy interface provides the same scraping capabilities as the REST API but through a standard proxy protocol that works with any HTTP client.

Proxy Endpoint

gw.scrapengine.io:8081

Authentication

The proxy authenticates requests with HTTP Basic Auth:
  • Username: Any value (e.g., scrape)
  • Password: Your Scrapengine API key

Quick Start

curl -x http://scrape:[email protected]:8081 \
  https://example.com
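
If you prefer not to embed credentials in the proxy URL, curl can also pass them separately with the -U/--proxy-user flag; the result is the same Basic Auth exchange:
curl -x http://gw.scrapengine.io:8081 \
  -U scrape:YOUR_API_KEY \
  https://example.com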

Configuration Headers

Control scraping behavior by passing custom headers with your request. These headers are parsed by the proxy and configure how the scraping job is executed.
Header                         Type     Default  Description
x-scrapengine-render           boolean  false    Enable JavaScript rendering for dynamic content
x-scrapengine-async            boolean  false    Return immediately with job ID for async processing
x-scrapengine-location         string   us       Proxy country code (e.g., us, uk, de)
x-scrapengine-format           string   raw      Response format: raw, json, or markdown
x-scrapengine-include-headers  boolean  false    Include response headers in the response
x-scrapengine-verbose          boolean  false    Return detailed response headers
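
For example, enabling asynchronous processing makes the proxy return immediately instead of waiting for the scrape to finish (the exact shape of the job-ID response body is not documented on this page):
curl -x http://scrape:[email protected]:8081 \
  -H "x-scrapengine-async: true" \
  https://example.com/large-page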

Examples with Headers

Enable JavaScript Rendering

Scrape pages that require JavaScript to load content:
curl -x http://scrape:[email protected]:8081 \
  -H "x-scrapengine-render: true" \
  https://example.com/dynamic-page

Get Markdown Output

Convert the scraped content to clean Markdown:
curl -x http://scrape:[email protected]:8081 \
  -H "x-scrapengine-format: markdown" \
  https://example.com/article

Geo-Targeted Scraping

Scrape from a specific geographic location:
curl -x http://scrape:[email protected]:8081 \
  -H "x-scrapengine-location: uk" \
  -H "x-scrapengine-render: true" \
  https://example.com/local-prices

Multiple Options Combined

Combine multiple options for advanced scraping:
curl -x http://scrape:[email protected]:8081 \
  -H "x-scrapengine-render: true" \
  -H "x-scrapengine-format: markdown" \
  -H "x-scrapengine-location: us" \
  -H "x-scrapengine-include-headers: true" \
  https://example.com/product

POST Requests

Send POST requests through the proxy:
curl -x http://scrape:[email protected]:8081 \
  -X POST \
  -H "Content-Type: application/json" \
  -H "x-scrapengine-render: true" \
  -d '{"query": "search term"}' \
  https://example.com/api/search

Response Headers

The proxy returns the following headers with each response:
Header               Description
x-trace-id           Unique identifier for request tracing and support
x-remaining-credits  Number of API credits remaining (on successful requests)
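
To inspect these headers with curl, dump the response headers to stdout (discarding the body here is just for illustration):
curl -s -o /dev/null -D - \
  -x http://scrape:[email protected]:8081 \
  https://example.com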

Error Handling

Authentication Errors

If authentication fails, you’ll receive a 407 Proxy Authentication Required response:
HTTP/1.1 407 Proxy Authentication Required
Proxy-Authenticate: Basic realm="Proxy Authentication Required"
Solutions:
  • Verify your API key is correct
  • Ensure the password field contains your API key
  • Check that your API key has not expired
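
To see exactly what the proxy responds with, rerun the Quick Start request with curl's -v flag; a 407 Proxy Authentication Required in the verbose output means the password (your API key) was rejected:
curl -v -x http://scrape:[email protected]:8081 \
  https://example.com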

Common HTTP Errors

Status  Description
400     Bad Request - Invalid URL or parameters
401     Unauthorized - Invalid API key
407     Proxy Authentication Required - Missing credentials
408     Request Timeout - Target site took too long to respond
429     Too Many Requests - Rate limit exceeded
503     Service Unavailable - Temporary service issue

Error Response Format

{
  "status": "error",
  "message": "Error description",
  "traceId": "abc123-def456"
}

Proxy vs REST API

Feature          Proxy                            REST API
Integration      Works with any HTTP client       Requires specific API calls
Authentication   HTTP Basic Auth                  Bearer token
URL              gw.scrapengine.io:8081           api.scrapengine.io/api/v1/scrape
LLM Extraction   Not supported                    Fully supported
Configuration    Via HTTP headers                 Via JSON body
Best for         Existing tools, simple scraping  Advanced features, AI extraction
Use the REST API if you need AI-powered data extraction with schemas or prompts. Use the Proxy for simple scraping or when integrating with existing tools that support proxy configuration.
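
For comparison, a REST call looks roughly like the sketch below. The endpoint and Bearer authentication come from the table above, but the JSON field names (url, render) are illustrative assumptions; check the REST API documentation for the exact schema.
# Sketch only: the JSON field names below are assumed, not confirmed on this page
curl -X POST https://api.scrapengine.io/api/v1/scrape \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "render": true}'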

Use Cases

  • Browser automation tools: Configure Puppeteer, Playwright, or Selenium to use Scrapengine as a proxy
  • CLI tools: Use with wget, curl, or other command-line HTTP clients (see the wget sketch after this list)
  • Existing applications: Add scraping capabilities without code changes
  • Testing tools: Route traffic through Scrapengine for web testing
  • Scripting: Simple one-liner scraping in shell scripts
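
As an example of the CLI use case above, the standard proxy environment variables work with wget and many other clients:
export https_proxy=http://scrape:[email protected]:8081
export http_proxy=http://scrape:[email protected]:8081
wget https://example.com/article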

Best Practices

Always prefer HTTPS URLs for better security and reliability when scraping.
JavaScript rendering (x-scrapengine-render: true) uses more resources. Only enable it for pages that require JavaScript to load content.
Set appropriate timeouts in your HTTP client. Scraping can take longer than typical API calls, especially with rendering enabled.
Check the x-remaining-credits header in responses to monitor your usage and avoid unexpected interruptions.
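
As a concrete example of the last two tips, curl's --max-time raises the client-side timeout (120 seconds here is only an illustrative value) and dumping the headers lets you watch the remaining credits:
curl -s --max-time 120 -o page.html -D - \
  -x http://scrape:[email protected]:8081 \
  -H "x-scrapengine-render: true" \
  https://example.com/dynamic-page | grep -i x-remaining-credits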