ScrapeForge API

Enterprise web scraping that handles even heavily protected sites

Enterprise-Grade Reliability
ScrapeForge handles the most challenging sites with a 99.8% success rate, JavaScript rendering, and residential proxy rotation. Perfect for mission-critical data extraction.
POST
https://www.searchhive.dev/api/v1/scrapeforge

Scrape any website with enterprise-grade reliability and JavaScript support

Status Codes

200: Successfully scraped the target URL
400: Invalid request parameters or malformed URL
403: Target site blocked the request
429: Rate limit exceeded; retry after a delay
500: Internal server error; contact support
504: Timeout; the target site took too long to respond

Quick Start

Get started with ScrapeForge in under 2 minutes. Simply provide a URL and get clean, structured data.

Basic scraping with JavaScript rendering

curl -X POST https://www.searchhive.dev/api/v1/scrapeforge \
  -H "Authorization": "Bearer: sk_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/products",
    "render_js": true,
    "wait_for": "#product-list",
    "extract_links": true,
    "follow_redirects": true
  }'
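
Only url is required. For static pages you can skip the optional flags entirely; per Best Practices below, leaving render_js off saves credits. A minimal request (the target URL is illustrative):

Basic scraping without JavaScript rendering

curl -X POST https://www.searchhive.dev/api/v1/scrapeforge \
  -H "Authorization: Bearer sk_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/about"}'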

Core Features

JavaScript Rendering

Full browser rendering with Chromium for SPAs and dynamic content

• React/Vue/Angular apps
• Dynamic content loading
• AJAX requests

Residential Proxies

Premium residential proxy network for high success rates

• 99.8% success rate
• Global IP rotation
• Anti-detection

Smart Retry Logic

Intelligent retry system with exponential backoff

• Auto-retry failures
• Rate limit handling
• Optimal timing

Enterprise Security

Bank-grade security and compliance for sensitive operations

• SOC 2 compliant
• Data encryption
• Audit logs

Common Use Cases

E-commerce Data

Product details, pricing, inventory, reviews

Lead Generation

Contact information, company data, social profiles

Market Research

Competitor analysis, market trends, industry data

Content Monitoring

Brand mentions, news articles, social media

Key Parameters

Essential ScrapeForge Parameters

url (string, required)
The URL to scrape. Must be a valid HTTP/HTTPS URL.
Example: "https://example.com/products"

render_js (boolean, optional)
Execute JavaScript on the page before scraping.
Example: true

wait_for (string, optional)
CSS selector or XPath to wait for before scraping.
Example: "#product-list"

extract_links (boolean, optional)
Extract all links found on the page.
Example: true

follow_redirects (boolean, optional)
Follow HTTP redirects automatically.
Example: true

Response Format

ScrapeForge Response Fields

content (string)
The scraped HTML content of the page.
Example: "<html><body>...</body></html>"

text_content (string)
Plain text content extracted from the HTML.
Example: "Welcome to our product catalog..."

links (array)
Array of links found on the page (when extract_links is true).
Example: [{"url": "...", "text": "...", "type": "..."}]

load_time (float)
Time taken to load and scrape the page, in seconds.
Example: 2.34

status_code (integer)
HTTP status code returned by the target server.
Example: 200

credits_used (integer)
Number of API credits consumed by this request.
Example: 5
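
Putting those fields together, a successful response looks like the following; the concrete values are the examples above, and elided values are left elided:

{
  "content": "<html><body>...</body></html>",
  "text_content": "Welcome to our product catalog...",
  "links": [{"url": "...", "text": "...", "type": "..."}],
  "load_time": 2.34,
  "status_code": 200,
  "credits_used": 5
}

Assuming the fields are returned at the top level as listed (the docs do not show a wrapper object), a jq one-liner pulls out just the link URLs:

curl -s -X POST https://www.searchhive.dev/api/v1/scrapeforge \
  -H "Authorization: Bearer sk_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/products", "extract_links": true}' \
  | jq -r '.links[].url'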

Bulk Scraping

Process multiple URLs simultaneously with intelligent load balancing and error handling.

Bulk scraping multiple URLs
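
The exact bulk request shape is not shown here, so treat this as a sketch: it assumes the endpoint accepts a urls array (parameter name assumed) alongside the same per-request options as single scrapes.

# NOTE: the "urls" array parameter is an assumption; confirm against the bulk endpoint docs
curl -X POST https://www.searchhive.dev/api/v1/scrapeforge \
  -H "Authorization: Bearer sk_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://example.com/products?page=1",
      "https://example.com/products?page=2",
      "https://example.com/products?page=3"
    ],
    "render_js": true,
    "wait_for": "#product-list"
  }'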

Bulk Scraping Benefits

• Process up to 100 URLs per request
• Intelligent concurrency control
• Automatic retry for failed requests
• Consolidated billing and reporting

99.8% success rate · 0.8s average response time · 50M+ pages scraped · 24/7 uptime

Best Practices

Recommended Practices

• Use specific selectors: wait for specific elements with the wait_for parameter.
• Enable JS rendering selectively: only use render_js when necessary, to save credits.
• Handle failures gracefully: implement proper error handling and retry logic (see the retry sketch under Status Codes).
• Respect rate limits: stay within your plan's concurrent request limits.

Common Pitfalls

• Scraping too frequently: balance data freshness with rate limiting.
• Ignoring robots.txt: respect website policies and terms of service.
• Not handling dynamic content: use render_js for JavaScript-heavy sites.
• Missing error handling: always check status codes and handle failures.