webscraperapi.ai

Quickstart

Make your first webscraperapi.ai API call in under 60 seconds.

Get your API key

Sign up at webscraperapi.ai and grab your API key from the settings page.

Your API key looks like: wsa_sk_...
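Rather than pasting the key into source files, you may prefer to read it from the environment. The sketch below is one way to do that; the variable name `WEBSCRAPERAPI_KEY` and the `load_api_key` helper are illustrative assumptions, not something the service defines. The only detail taken from this page is the `wsa_sk_` prefix.

```python
import os

API_KEY_PREFIX = "wsa_sk_"      # prefix shown in this quickstart
ENV_VAR = "WEBSCRAPERAPI_KEY"   # hypothetical name; pick whatever suits your setup


def load_api_key(env=None):
    """Return the API key from the environment, or raise if it is missing or malformed."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set the {ENV_VAR} environment variable first")
    if not key.startswith(API_KEY_PREFIX):
        raise RuntimeError(f"Expected a key starting with {API_KEY_PREFIX!r}")
    return key
```

The prefix check is only a client-side sanity test that catches copy-paste mistakes early; the API remains the authority on whether a key is valid.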

Scrape a web page

Let's scrape Hacker News and get the content as clean markdown:

cURL

```shell
curl -G "https://api.webscraperapi.ai/v2/scrape" \
  -d "api_key=YOUR_API_KEY" \
  -d "url=https://news.ycombinator.com" \
  -d "output=markdown"
```

Python

```python
import requests

response = requests.get("https://api.webscraperapi.ai/v2/scrape", params={
    "api_key": "YOUR_API_KEY",
    "url": "https://news.ycombinator.com",
    "output": "markdown",
})

if response.ok:
    data = response.json()
    print(data["markdown"])
else:
    print(f"Error {response.status_code}: {response.text}")
```

JavaScript

```javascript
const params = new URLSearchParams({
  api_key: 'YOUR_API_KEY',
  url: 'https://news.ycombinator.com',
  output: 'markdown',
});

const response = await fetch(
  `https://api.webscraperapi.ai/v2/scrape?${params}`
);

if (response.ok) {
  const data = await response.json();
  console.log(data.markdown);
} else {
  console.error(`Error ${response.status}: ${await response.text()}`);
}
```
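All of the examples above send the same GET request with `api_key`, `url`, and `output` in the query string. When the target URL has its own query string, it needs to be percent-encoded so that its `&` and `=` characters are not read as extra API parameters. `requests` and `URLSearchParams` do this automatically; with curl, `--data-urlencode` encodes the value where plain `-d` does not. The standard-library sketch below shows the encoding step on its own; `build_scrape_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "https://api.webscraperapi.ai/v2/scrape"  # endpoint from the examples above


def build_scrape_url(api_key, target_url, output="markdown"):
    """Return the full GET URL with every parameter percent-encoded."""
    query = urlencode({"api_key": api_key, "url": target_url, "output": output})
    return f"{ENDPOINT}?{query}"


target = "https://example.com/search?q=a b&page=2"
url = build_scrape_url("YOUR_API_KEY", target)
print(url)

# Round-trip check: parsing the query string recovers the original target URL.
decoded = parse_qs(urlsplit(url).query)["url"][0]
print(decoded == target)
```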

Extract structured data with an LLM

Go beyond raw scraping: add a natural-language prompt parameter to extract exactly the fields you need. The result comes back in the llm_extraction field of the response:

cURL

```shell
curl -G "https://api.webscraperapi.ai/v2/scrape" \
  -d "api_key=YOUR_API_KEY" \
  -d "url=https://news.ycombinator.com" \
  -d "output=markdown" \
  --data-urlencode "prompt=Extract the top 5 stories with title, URL, and point count as JSON"
```

Python

```python
import requests

response = requests.get("https://api.webscraperapi.ai/v2/scrape", params={
    "api_key": "YOUR_API_KEY",
    "url": "https://news.ycombinator.com",
    "output": "markdown",
    "prompt": "Extract the top 5 stories with title, URL, and point count as JSON",
})

if response.ok:
    data = response.json()
    print(data["llm_extraction"])
else:
    print(f"Error {response.status_code}: {response.text}")
```
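This page does not say whether `llm_extraction` arrives as already-parsed JSON or as a string containing JSON. A defensive sketch that copes with both cases follows. It also assumes, without confirmation from the docs, that a string result might be wrapped in a Markdown code fence, as LLM output often is. `parse_extraction` is a hypothetical helper name.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_extraction(value):
    """Return `value` as Python data, decoding it first if it is a JSON string."""
    if not isinstance(value, str):
        return value  # already parsed by response.json()
    text = value.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag) and the closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    return json.loads(text)  # raises json.JSONDecodeError if the text is not valid JSON
```

With the request above, `stories = parse_extraction(data["llm_extraction"])` would then give you a list or dict to iterate over, provided the model actually returned valid JSON.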

Use with AI agents (MCP)

webscraperapi.ai supports MCP (Model Context Protocol), so you can connect it as a tool in Claude, ChatGPT, or any other MCP-compatible AI agent and give that agent web scraping through a single integration.

MCP integration documentation is coming soon. In the meantime, the REST API works with any AI agent framework.
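Until the MCP documentation is available, one way to expose the REST API to an agent is to wrap it in a plain function and describe that function to your framework. The sketch below is framework-agnostic and uses only the standard library. The tool name `scrape_page`, the shape of `SCRAPE_TOOL`, and the injectable `get_json` transport are all illustrative assumptions; only the endpoint, the query parameters, and the `markdown` response field come from this page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "https://api.webscraperapi.ai/v2/scrape"

# Hypothetical tool description (JSON Schema style); adapt it to your framework.
SCRAPE_TOOL = {
    "name": "scrape_page",
    "description": "Fetch a web page and return its content as markdown.",
    "parameters": {
        "type": "object",
        "properties": {"url": {"type": "string", "description": "Page to scrape"}},
        "required": ["url"],
    },
}


def _http_get_json(url):
    """Default transport: GET the URL and decode the JSON body."""
    with urlopen(url, timeout=60) as resp:
        return json.load(resp)


def scrape_page(url, api_key, get_json=_http_get_json):
    """Tool body: call the scrape endpoint and return the markdown text."""
    query = urlencode({"api_key": api_key, "url": url, "output": "markdown"})
    data = get_json(f"{ENDPOINT}?{query}")
    return data["markdown"]
```

Passing the transport in as `get_json` keeps the tool easy to test offline: swap in a stub that returns a canned response instead of calling the network.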

Output formats

Use the output parameter to control what you get back:

| Value | Description |
| --- | --- |
| raw_html | Original HTML as-is |
| clean_html | Cleaned HTML with scripts/styles removed |
| markdown | Clean markdown (great for LLMs) |
| html_head_metadata_json | Page metadata as JSON |
| links | All links on the page |
| emails | Email addresses found on the page |
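If you build requests programmatically, it can help to check the `output` value before sending anything. The sketch below hard-codes the values listed above; `check_output` is a hypothetical client-side helper, and the API itself remains the authority on which values it accepts.

```python
# Values copied from the list above; the check itself is a client-side convenience.
OUTPUT_FORMATS = (
    "raw_html",
    "clean_html",
    "markdown",
    "html_head_metadata_json",
    "links",
    "emails",
)


def check_output(value):
    """Return `value` unchanged if it is a documented output format, else raise."""
    if value not in OUTPUT_FORMATS:
        allowed = ", ".join(OUTPUT_FORMATS)
        raise ValueError(f"Unknown output {value!r}; expected one of: {allowed}")
    return value


params = {"url": "https://news.ycombinator.com", "output": check_output("links")}
```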
