Data Extraction

Extract structured data from any page using CSS or XPath selectors.

POST /v1/scrape/extract
$0.01/call

Usage

const res = await fetch('https://api.yepapi.com/v1/scrape/extract', {
  method: 'POST',
  headers: {
    'x-api-key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://news.ycombinator.com',
    extractRules: {
      titles: { selector: '.titleline > a', type: 'list' },
    },
  }),
});
const { data } = await res.json();
console.log(data.extracted);

curl -X POST https://api.yepapi.com/v1/scrape/extract \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://news.ycombinator.com", "extractRules": {"titles": {"selector": ".titleline > a", "type": "list"}}}'

Request Body

| Parameter | Type | Required | Description | Default |
|---|---|---|---|---|
| url | string | Yes | URL to extract data from | - |
| extractRules | object | Yes | CSS/XPath extraction rules (see below) | - |
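
Since both parameters are required, it can help to check a request body locally before spending a call. The sketch below is one possible client-side check; `findMissingFields` is a hypothetical helper and not part of the API, and the exact validation the server performs is not documented here.

```javascript
// Hypothetical helper (not part of the API): lists required fields
// that are absent or of the wrong type in a request body.
function findMissingFields(body) {
  const missing = [];
  // url must be a non-empty string
  if (typeof body.url !== 'string' || body.url.length === 0) {
    missing.push('url');
  }
  // extractRules must be a plain object (not null, not an array)
  const rules = body.extractRules;
  if (typeof rules !== 'object' || rules === null || Array.isArray(rules)) {
    missing.push('extractRules');
  }
  return missing;
}

// This body lacks extractRules, so the helper reports that field.
console.log(findMissingFields({ url: 'https://news.ycombinator.com' }));
```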

Extract Rules Format

Simple: {"title": "h1"} — extracts text content of the first h1.

List: {"items": {"selector": ".card", "type": "list"}} — extracts all matching elements.

Nested: Extract multiple fields from repeating elements:

{
  "articles": {
    "selector": ".post",
    "type": "list",
    "output": {
      "title": ".post-title",
      "link": { "selector": "a", "output": "@href" }
    }
  }
}

Attributes: Use @attr to extract element attributes: {"image": "img@src"}.

XPath: Selectors starting with / are treated as XPath: {"title": "//h1"}.
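
The rule formats above can be sketched together in a single request. This is an illustration under assumptions: `buildExtractRequest` is a hypothetical helper, the key names and `https://example.com` are placeholders, and combining every format in one `extractRules` object is assumed to work rather than confirmed by this page.

```javascript
// Hypothetical helper: assembles fetch options for POST /v1/scrape/extract.
function buildExtractRequest(apiKey, url, extractRules) {
  return {
    method: 'POST',
    headers: {
      'x-api-key': apiKey,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url, extractRules }),
  };
}

// One rule set using each documented format (key names are placeholders).
const mixedRules = {
  heading: 'h1',                              // simple: text of the first h1
  cards: { selector: '.card', type: 'list' }, // list: all matching elements
  image: 'img@src',                           // attribute via @attr
  xpathHeading: '//h1',                       // XPath: selector starts with /
  articles: {                                 // nested fields per repeating element
    selector: '.post',
    type: 'list',
    output: {
      title: '.post-title',
      link: { selector: 'a', output: '@href' },
    },
  },
};

const requestOptions = buildExtractRequest('YOUR_API_KEY', 'https://example.com', mixedRules);
// To send it:
// const res = await fetch('https://api.yepapi.com/v1/scrape/extract', requestOptions);
```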

Response

{
  "ok": true,
  "data": {
    "url": "https://news.ycombinator.com",
    "extracted": {
      "titles": [
        "Show HN: Open-source AI code editor",
        "The State of WebAssembly 2026",
        "PostgreSQL 18 Released"
      ]
    }
  }
}
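
A minimal sketch of consuming this response, assuming a failed request sets `ok` to something other than `true` (the error shape is not documented on this page). `unwrapExtracted` is a hypothetical helper name.

```javascript
// Hypothetical helper: returns data.extracted, or throws when ok is not true.
function unwrapExtracted(payload) {
  if (!payload || payload.ok !== true) {
    throw new Error('Extraction request did not succeed');
  }
  // Assumption: fall back to an empty object if data.extracted is absent.
  return (payload.data && payload.data.extracted) || {};
}

// Sample payload copied from the response shown above.
const samplePayload = {
  ok: true,
  data: {
    url: 'https://news.ycombinator.com',
    extracted: {
      titles: [
        'Show HN: Open-source AI code editor',
        'The State of WebAssembly 2026',
        'PostgreSQL 18 Released',
      ],
    },
  },
};

const { titles } = unwrapExtracted(samplePayload);
console.log(titles.length); // prints 3
```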

Response Fields

| Field | Type | Description |
|---|---|---|
| ok | boolean | Whether the request succeeded |
| data | object | Response payload |
| data.url | string | The URL that was scraped |
| data.extracted | object | Extracted data matching your extractRules keys. Each key contains the result of the corresponding selector. |

Under the Hood

Pages are rendered with JavaScript enabled before extraction. CSS selectors and XPath expressions both work — use whichever you prefer.
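
Because the selector syntax is chosen by its leading character, the same element can be targeted either way. The sketch below mirrors the documented rule on the client side; `selectorKind` is a hypothetical helper, and the service's actual detection logic may differ in edge cases.

```javascript
// Hypothetical client-side mirror of the documented rule:
// selectors starting with "/" are treated as XPath, all others as CSS.
function selectorKind(selector) {
  return selector.startsWith('/') ? 'xpath' : 'css';
}

// Two rules that target the first h1, one per syntax (key names are placeholders).
const equivalentRules = {
  titleCss: 'h1',
  titleXpath: '//h1',
};

for (const [key, selector] of Object.entries(equivalentRules)) {
  console.log(`${key}: ${selectorKind(selector)}`);
}
```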
