Web Scraping
Data Extraction
Extract structured data from any page using CSS or XPath selectors.
POST /v1/scrape/extract
$0.01/call
Usage
const res = await fetch('https://api.yepapi.com/v1/scrape/extract', {
  method: 'POST',
  headers: {
    'x-api-key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://news.ycombinator.com',
    extractRules: {
      titles: { selector: '.titleline > a', type: 'list' },
    },
  }),
});
const { data } = await res.json();
console.log(data.extracted);

curl -X POST https://api.yepapi.com/v1/scrape/extract \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://news.ycombinator.com", "extractRules": {"titles": {"selector": ".titleline > a", "type": "list"}}}'

Request Body
| Parameter | Type | Required | Description | Default |
|---|---|---|---|---|
| url | string | Yes | URL to extract data from | — |
| extractRules | object | Yes | CSS/XPath extraction rules (see below) | — |
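
Since both parameters are required, it can help to check them before spending a call. The following is a minimal sketch, not part of the API: `buildExtractRequest` is a hypothetical helper name, and the checks only reflect the types listed in the table above.

```javascript
// Hypothetical helper (not provided by the API): validates the two required
// parameters and returns an options object suitable for fetch().
function buildExtractRequest(apiKey, url, extractRules) {
  if (typeof url !== 'string' || url.length === 0) {
    throw new TypeError('url is required and must be a non-empty string');
  }
  if (extractRules === null || typeof extractRules !== 'object' || Array.isArray(extractRules)) {
    throw new TypeError('extractRules is required and must be an object');
  }
  return {
    method: 'POST',
    headers: {
      'x-api-key': apiKey,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url, extractRules }),
  };
}

const options = buildExtractRequest('YOUR_API_KEY', 'https://news.ycombinator.com', {
  titles: { selector: '.titleline > a', type: 'list' },
});
```

The result can be passed straight to `fetch('https://api.yepapi.com/v1/scrape/extract', options)`.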
Extract Rules Format
Simple: {"title": "h1"} — extracts text content of the first h1.
List: {"items": {"selector": ".card", "type": "list"}} — extracts all matching elements.
Nested: Extract multiple fields from repeating elements:
{
  "articles": {
    "selector": ".post",
    "type": "list",
    "output": {
      "title": ".post-title",
      "link": { "selector": "a", "output": "@href" }
    }
  }
}
Attributes: Use @attr to extract element attributes: {"image": "img@src"}.
XPath: Selectors starting with / are treated as XPath: {"title": "//h1"}.
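
The rule forms above can be combined in a single `extractRules` object. The sketch below is illustrative only: the key names (`heading`, `headingXPath`, `heroImage`) are made up for the example, and `selectorKind` is a hypothetical client-side helper that mirrors the documented "starts with /" rule rather than anything the API exposes.

```javascript
// Hypothetical helper: mirrors the documented rule that selectors
// starting with "/" are treated as XPath; everything else is CSS.
function selectorKind(selector) {
  return selector.startsWith('/') ? 'xpath' : 'css';
}

// One rules object using each documented form.
const extractRules = {
  heading: 'h1',            // simple: text of the first h1
  headingXPath: '//h1',     // XPath: leading "/" switches the selector type
  heroImage: 'img@src',     // attribute: @attr suffix
  articles: {               // nested: fields extracted per repeating element
    selector: '.post',
    type: 'list',
    output: {
      title: '.post-title',
      link: { selector: 'a', output: '@href' },
    },
  },
};

console.log(selectorKind(extractRules.heading));
console.log(selectorKind(extractRules.headingXPath));
```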
Response
{
  "ok": true,
  "data": {
    "url": "https://news.ycombinator.com",
    "extracted": {
      "titles": [
        "Show HN: Open-source AI code editor",
        "The State of WebAssembly 2026",
        "PostgreSQL 18 Released"
      ]
    }
  }
}
Response Fields
| Field | Type | Description |
|---|---|---|
| ok | boolean | Whether the request succeeded |
| data | object | Response payload |
| data.url | string | The URL that was scraped |
| data.extracted | object | Extracted data matching your extractRules keys. Each key contains the result of the corresponding selector |
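
A small guard around the parsed response keeps callers from reading `data.extracted` on a failed request. This is a sketch under assumptions: `readExtracted` is a hypothetical helper, and because the shape of a failed response is not documented here, it relies only on the `ok` flag and the fields listed above.

```javascript
// Hypothetical helper: returns data.extracted from a parsed response body,
// or throws if the request did not succeed. Only documented fields are used.
function readExtracted(payload) {
  if (!payload || payload.ok !== true) {
    throw new Error('Extraction request did not succeed');
  }
  const extracted = payload.data && payload.data.extracted;
  if (!extracted || typeof extracted !== 'object') {
    throw new Error('Response has no data.extracted object');
  }
  return extracted;
}

// The sample response from this page, used as input.
const sample = {
  ok: true,
  data: {
    url: 'https://news.ycombinator.com',
    extracted: {
      titles: [
        'Show HN: Open-source AI code editor',
        'The State of WebAssembly 2026',
        'PostgreSQL 18 Released',
      ],
    },
  },
};

const { titles } = readExtracted(sample);
console.log(titles.length);
```

In real use, the argument would be the result of `await res.json()` from the request shown under Usage.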
Under the Hood
Pages are rendered with JavaScript enabled before extraction. CSS selectors and XPath expressions both work — use whichever you prefer.