Scrape
Scrape any web page and optionally process it with an LLM. Supports JavaScript rendering, CSS selectors, and multiple output formats (raw HTML, clean HTML, markdown, metadata, links, emails).
Use this to fetch any web page as clean markdown, HTML, or structured data. Add a prompt to extract specific fields with an LLM instead of getting the full page.
Each request costs 1 credit. Setting render_js=true for JavaScript-heavy pages adds 4 credits, and adding a prompt for LLM extraction adds 5 credits.
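To make the pricing concrete, here is a small sketch that estimates the credit cost of one request. The function name `scrape_credits` is hypothetical, and it assumes the two surcharges are additive when both options are used; the page does not state this explicitly.

```shell
# Hypothetical helper: estimate the credit cost of one scrape request.
# Assumption: the render_js and prompt surcharges add together.
# Usage: scrape_credits <render_js: true|false> <has_prompt: true|false>
scrape_credits() {
  credits=1                      # base cost per request
  if [ "${1:-false}" = "true" ]; then
    credits=$((credits + 4))     # JavaScript rendering surcharge
  fi
  if [ "${2:-false}" = "true" ]; then
    credits=$((credits + 5))     # LLM prompt surcharge
  fi
  echo "$credits"
}
```

Under that assumption, `scrape_credits true true` prints 10, and `scrape_credits false false` prints 1.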
Quick example
Scrape a page and get clean markdown:
curl -G "https://api.webscraperapi.ai/v1/scrape" \
  -H "Authorization: Bearer $WEBSCRAPERAPI_API_KEY" \
  --data-urlencode "url=https://news.ycombinator.com" \
  --data-urlencode "output=markdown"
Extract specific data with a prompt:
curl -G "https://api.webscraperapi.ai/v1/scrape" \
  -H "Authorization: Bearer $WEBSCRAPERAPI_API_KEY" \
  --data-urlencode "url=https://en.wikipedia.org/wiki/Anthropic" \
  --data-urlencode "prompt=When was Anthropic founded and who are the founders?"
Use output=markdown when feeding content to an LLM: it is clean and token-efficient. Use css_selector to narrow extraction to a specific part of the page (e.g., css_selector=article or css_selector=.product-details).
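The css_selector tip can be combined with output=markdown in a single request. The sketch below wraps that call in a hypothetical shell function, `scrape_region`; the function name and the `DRY_RUN` switch are illustrative assumptions, not part of the API. Only the endpoint, the header, and the url, css_selector, and output parameters come from this page.

```shell
# Hypothetical helper: fetch one CSS-selected region of a page as markdown.
# Usage: scrape_region <url> <css_selector>
# Set DRY_RUN=1 to print the curl command instead of sending the request.
scrape_region() {
  ${DRY_RUN:+echo} curl -sS -G "https://api.webscraperapi.ai/v1/scrape" \
    -H "Authorization: Bearer $WEBSCRAPERAPI_API_KEY" \
    --data-urlencode "url=$1" \
    --data-urlencode "css_selector=$2" \
    --data-urlencode "output=markdown"
}
```

For example, `scrape_region "https://en.wikipedia.org/wiki/Anthropic" "article"` would request only the content matched by the `article` selector, which keeps the returned markdown smaller than the full page.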
Output formats
| Value | What you get |
|---|---|
| markdown | Clean markdown (great for LLMs) |
| raw_html | Original HTML as-is |
| clean_html | HTML with scripts and styles stripped |
| html_head_metadata_json | Page title, description, Open Graph tags as JSON |
| links | All links on the page |
| emails | Email addresses found on the page |
Full API reference
Query Parameters
The URL to scrape.
Format: uri. Length: 1 <= length <= 2083.
Optional prompt for LLM processing.
Optional CSS selector to narrow content.
LLM model for processing (e.g., 'gpt-4o-mini').
Whether to render JavaScript.
Default: false
Output format: 'raw_html', 'clean_html', 'markdown', 'html_head_metadata_json', 'email_addresses', 'internal_links', 'external_links', 'all_links'.
Request timeout in milliseconds.
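Because the URL to scrape has a documented length constraint (1 to 2083 characters), it can be checked locally before a credit is spent. This is a minimal sketch; the function name `valid_scrape_url` is hypothetical, and it assumes the limit applies to the URL as passed, before URL-encoding.

```shell
# Hypothetical pre-flight check for the documented URL length constraint.
# Returns success (0) when 1 <= length <= 2083, failure otherwise.
# Usage: valid_scrape_url <url>
valid_scrape_url() {
  url_len=${#1}   # character count of the first argument
  [ "$url_len" -ge 1 ] && [ "$url_len" -le 2083 ]
}
```

A caller might use it as `valid_scrape_url "$target" || echo "URL length out of range" >&2`, where `$target` is whatever URL is about to be scraped.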
Response Body
application/json
curl -X GET "https://api.webscraperapi.ai/v1/scrape?url=http%3A%2F%2Fexample.com"
null
{
  "detail": [
    {
      "loc": [
        "string"
      ],
      "msg": "string",
      "type": "string"
    }
  ]
}
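An error body with the shape shown above can be turned into one readable line per problem. The sketch below assumes `jq` is installed and that error responses always carry a `detail` array of objects with `loc` and `msg` fields, as in the schema. The function name `print_scrape_errors` is hypothetical.

```shell
# Hypothetical helper: summarize an error body read from stdin.
# Assumes jq is available and the body matches the "detail" schema above.
# Prints one line per entry in the form "<loc joined by dots>: <msg>".
print_scrape_errors() {
  jq -r '.detail[] | "\(.loc | map(tostring) | join(".")): \(.msg)"'
}
```

Piping a saved response through it, for example `print_scrape_errors < response.json`, prints each error's location and message without the surrounding JSON.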