Overview
The Scrapengine Proxy allows you to use Scrapengine as a standard HTTP/HTTPS proxy server. This is ideal for integrating with existing tools, libraries, or applications that support proxy configuration. Simply point your HTTP client to our proxy endpoint and authenticate using your API key.
The proxy interface provides the same scraping capabilities as the REST API but through a standard proxy protocol that works with any HTTP client.
Proxy Endpoint
The proxy is reachable at gw.scrapengine.io:8081.
Authentication
Authentication is done via HTTP Basic Auth where:
- Username: Any value (e.g., scrape)
- Password: Your Scrapengine API key
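The credential scheme above can be sketched in Python as follows. This is a minimal illustration, not official client code: the helper names `proxy_url` and `proxy_authorization` are hypothetical, the `http` scheme for the proxy URL is an assumption, and the host and port are taken from the comparison table later on this page.

```python
import base64
from urllib.parse import quote

PROXY_HOST = "gw.scrapengine.io"  # host and port as listed in the comparison table
PROXY_PORT = 8081


def proxy_url(api_key: str, username: str = "scrape", scheme: str = "http") -> str:
    """Build a proxy URL with the Basic Auth credentials embedded.

    The scheme is an assumption; the username can be any value.
    """
    # Percent-encode both parts so special characters cannot break the URL.
    user = quote(username, safe="")
    password = quote(api_key, safe="")
    return f"{scheme}://{user}:{password}@{PROXY_HOST}:{PROXY_PORT}"


def proxy_authorization(api_key: str, username: str = "scrape") -> str:
    """Return a Proxy-Authorization header value for HTTP Basic Auth."""
    raw = f"{username}:{api_key}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

Most HTTP clients accept the URL form directly as their proxy setting; the header form is only needed when a client requires the `Proxy-Authorization` header to be set by hand.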
Quick Start
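A minimal quick-start sketch using only the Python standard library might look like the following. The `http` proxy scheme, the `YOUR_API_KEY` placeholder, and the helper names `make_opener` and `fetch` are assumptions for illustration; how the proxy handles TLS for HTTPS targets is not specified on this page.

```python
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; substitute your Scrapengine API key
# Username can be any value; the http scheme here is an assumption.
PROXY = f"http://scrape:{API_KEY}@gw.scrapengine.io:8081"


def make_opener(proxy: str = PROXY) -> urllib.request.OpenerDirector:
    """Return an opener that routes HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


def fetch(url: str, timeout: float = 60.0) -> bytes:
    """Fetch a URL through the proxy and return the raw response body."""
    with make_opener().open(url, timeout=timeout) as response:
        return response.read()
```

Calling `fetch("https://example.com")` would then send the request through the proxy, with the credentials taken from the proxy URL.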
Configuration Headers
Control scraping behavior by passing custom headers with your request. These headers are parsed by the proxy and configure how the scraping job is executed.
| Header | Type | Default | Description |
|---|---|---|---|
| x-scrapengine-render | boolean | false | Enable JavaScript rendering for dynamic content |
| x-scrapengine-async | boolean | false | Return immediately with job ID for async processing |
| x-scrapengine-location | string | us | Proxy country code (e.g., us, uk, de) |
| x-scrapengine-format | string | raw | Response format: raw, json, or markdown |
| x-scrapengine-include-headers | boolean | false | Include response headers in the response |
| x-scrapengine-verbose | boolean | false | Return detailed response headers |
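The table above can be turned into a small helper that assembles the headers from keyword options. This is a sketch under stated assumptions: the function name `scrape_headers` and its parameter names are hypothetical, booleans are encoded as the lowercase string `true`, and options left at their documented default are simply omitted.

```python
from typing import Dict

VALID_FORMATS = {"raw", "json", "markdown"}  # values listed in the table


def scrape_headers(
    render: bool = False,
    run_async: bool = False,
    location: str = "us",
    fmt: str = "raw",
    include_headers: bool = False,
    verbose: bool = False,
) -> Dict[str, str]:
    """Return the x-scrapengine-* headers that differ from the defaults."""
    if fmt not in VALID_FORMATS:
        raise ValueError(f"fmt must be one of {sorted(VALID_FORMATS)}")

    headers: Dict[str, str] = {}
    # Boolean options: only sent when enabled (encoding as "true" is assumed).
    flags = {
        "x-scrapengine-render": render,
        "x-scrapengine-async": run_async,
        "x-scrapengine-include-headers": include_headers,
        "x-scrapengine-verbose": verbose,
    }
    for name, enabled in flags.items():
        if enabled:
            headers[name] = "true"
    # String options: only sent when they differ from the documented default.
    if location != "us":
        headers["x-scrapengine-location"] = location
    if fmt != "raw":
        headers["x-scrapengine-format"] = fmt
    return headers
```

For instance, `scrape_headers(render=True, fmt="markdown")` yields a dictionary with just the render and format headers, which can be merged into any client's request headers.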
Examples with Headers
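The scenarios in this section all follow one pattern: attach the relevant configuration headers to an ordinary request. The sketch below builds such requests with Python's standard library without sending them. The target URL, the `EXAMPLES` dictionary, and the `build_request` helper are illustrative assumptions, as is the JSON encoding of the POST body.

```python
import json
import urllib.request
from typing import Dict, Optional

TARGET = "https://example.com"  # placeholder target URL

# One header set per scenario in this section (boolean encoding assumed).
EXAMPLES: Dict[str, Dict[str, str]] = {
    "render": {"x-scrapengine-render": "true"},
    "markdown": {"x-scrapengine-format": "markdown"},
    "geo": {"x-scrapengine-location": "de"},
    "combined": {
        "x-scrapengine-render": "true",
        "x-scrapengine-location": "de",
        "x-scrapengine-format": "markdown",
    },
}


def build_request(
    url: str,
    options: Dict[str, str],
    json_body: Optional[dict] = None,
) -> urllib.request.Request:
    """Build a GET request, or a POST request when json_body is given."""
    headers = dict(options)
    data = None
    if json_body is not None:
        data = json.dumps(json_body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if data is not None else "GET"
    # urllib normalizes header names (e.g. "X-scrapengine-render");
    # HTTP header names are case-insensitive, so this is harmless.
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

A request built this way, for example `build_request(TARGET, EXAMPLES["combined"])`, would be sent through an opener or session that is configured to use the proxy.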
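Enable JavaScript Rendering
Scrape pages that require JavaScript to load content by setting x-scrapengine-render to true.
Get Markdown Output
Convert the scraped content to clean Markdown by setting x-scrapengine-format to markdown.
Geo-Targeted Scraping
Scrape from a specific geographic location by setting x-scrapengine-location to a country code such as de.
Multiple Options Combined
Combine multiple options for advanced scraping by sending several configuration headers on the same request.
POST Requests
POST requests can also be sent through the proxy.
Response Headers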
The proxy returns the following headers with each response:
| Header | Description |
|---|---|
| x-trace-id | Unique identifier for request tracing and support |
| x-remaining-credits | Number of API credits remaining (on successful requests) |
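Reading these two headers from a response could look like the sketch below. The names `ProxyMeta` and `read_proxy_meta` are hypothetical, and the code assumes that `x-remaining-credits` carries a plain non-negative integer; anything else is reported as `None`.

```python
from typing import Mapping, NamedTuple, Optional


class ProxyMeta(NamedTuple):
    trace_id: Optional[str]
    remaining_credits: Optional[int]


def read_proxy_meta(headers: Mapping[str, str]) -> ProxyMeta:
    """Extract the trace ID and remaining credits from response headers."""
    # Header names are case-insensitive, so compare on lowercase keys.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_credits = lowered.get("x-remaining-credits")
    credits = None
    # The credits header is only present on successful requests.
    if raw_credits is not None and raw_credits.strip().isdigit():
        credits = int(raw_credits)
    return ProxyMeta(lowered.get("x-trace-id"), credits)
```

Logging the trace ID alongside each request makes it easier to reference a specific request when contacting support.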
Error Handling
Authentication Errors
If authentication fails, you’ll receive a 407 Proxy Authentication Required response. To resolve it:
- Verify your API key is correct
- Ensure the password field contains your API key
- Check that your API key has not expired
Common HTTP Errors
| Status | Description |
|---|---|
| 400 | Bad Request - Invalid URL or parameters |
| 401 | Unauthorized - Invalid API key |
| 407 | Proxy Authentication Required - Missing credentials |
| 408 | Request Timeout - Target site took too long to respond |
| 429 | Too Many Requests - Rate limit exceeded |
| 503 | Service Unavailable - Temporary service issue |
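One way a client might act on these status codes is sketched below. The descriptions come from the table above, but the grouping into retryable and non-retryable codes is an assumed policy rather than documented behavior, and the function name `next_action` is hypothetical.

```python
# Status codes and descriptions as listed in the table above.
PROXY_ERRORS = {
    400: "Bad Request - Invalid URL or parameters",
    401: "Unauthorized - Invalid API key",
    407: "Proxy Authentication Required - Missing credentials",
    408: "Request Timeout - Target site took too long to respond",
    429: "Too Many Requests - Rate limit exceeded",
    503: "Service Unavailable - Temporary service issue",
}

# Assumed policy: transient conditions are worth retrying after a delay.
RETRYABLE = {408, 429, 503}
CREDENTIAL_ERRORS = {401, 407}


def next_action(status: int) -> str:
    """Suggest what a client should do for a given HTTP status code."""
    if 200 <= status < 300:
        return "ok"
    if status in RETRYABLE:
        return "retry"
    if status in CREDENTIAL_ERRORS:
        return "check_credentials"
    # 400 and anything unlisted: retrying the same request is unlikely to help.
    return "fix_request"
```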
Error Response Format
Proxy vs REST API
| Feature | Proxy | REST API |
|---|---|---|
| Integration | Works with any HTTP client | Requires specific API calls |
| Authentication | HTTP Basic Auth | Bearer token |
| URL | gw.scrapengine.io:8081 | api.scrapengine.io/api/v1/scrape |
| LLM Extraction | Not supported | Fully supported |
| Configuration | Via HTTP headers | Via JSON body |
| Best for | Existing tools, simple scraping | Advanced features, AI extraction |
Use Cases
- Browser automation tools: Configure Puppeteer, Playwright, or Selenium to use Scrapengine as a proxy
- CLI tools: Use with wget, curl, or other command-line HTTP clients
- Existing applications: Add scraping capabilities without code changes
- Testing tools: Route traffic through Scrapengine for web testing
- Scripting: Simple one-liner scraping in shell scripts
Best Practices
Use HTTPS targets when possible
Always prefer HTTPS URLs for better security and reliability when scraping.
Enable rendering only when needed
JavaScript rendering (x-scrapengine-render: true) uses more resources. Only enable it for pages that require JavaScript to load content.
Handle timeouts gracefully
Set appropriate timeouts in your HTTP client. Scraping can take longer than typical API calls, especially with rendering enabled.
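A simple way to handle timeouts is to wrap the request in a retry loop with exponential backoff, as in the sketch below. The helper `with_timeout_retries` and its defaults are illustrative assumptions; which exception a timeout surfaces as depends on the HTTP client, so the `except` clause may need adjusting for yours.

```python
import socket
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_timeout_retries(
    fetch: Callable[[], T],
    attempts: int = 3,
    backoff: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on timeouts with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except (TimeoutError, socket.timeout):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last timeout
            # Wait backoff, 2*backoff, 4*backoff, ... between attempts.
            sleep(backoff * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `fetch` argument is any zero-argument callable that performs the request with a client-side timeout set, for example a `lambda` around your client's request call.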
Monitor your credits
Check the x-remaining-credits header in responses to monitor your usage and avoid unexpected interruptions.