ScrapEdge returns structured JSON — title, author, published date, content, tags, word count — from any URL. No parsers, no cleaning, no post-processing.
// Send a URL
{ "url": "https://vercel.com/blog/turborepo-1-5" }

// Get structured JSON back
{
  "title": "Introducing Turborepo 1.5",
  "author": "Jared Palmer",
  "published_date": "2023-03-15T00:00:00Z",
  "word_count": 1247,
  "tags": ["engineering", "monorepo"],
  "content": "Today we're releasing..."
}
POST your target URL with your API key. Optionally pass cache: false to bypass the 30-minute result cache.
ScrapEdge handles JavaScript rendering, anti-bot measures, and structural parsing, then returns structured JSON.
Feed the result into your AI pipeline, RAG system, or content generation workflow. No post-processing needed.
# Pass your API key in the X-API-Key header
curl -X POST https://scrapedge.polsia.app/api/scrape \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{"url": "https://example.com/blog/my-post"}'
{
  "url": "https://example.com/blog/my-post",
  "title": "My Post Title",
  "author": "Jane Smith",
  "published_date": "2025-01-15T00:00:00Z",
  "word_count": 1847,
  "tags": ["marketing", "growth"],
  "content": "Article body text goes here...",
  "cached": false,
  "scraped_at": "2025-01-22T10:30:00Z"
}
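The curl call above can also be made from Python. The following is a minimal sketch using only the standard library; the endpoint, headers, and body fields come from the example above, while the function names `build_scrape_request` and `scrape` and the 30-second timeout are illustrative assumptions, not part of the ScrapEdge API.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://scrapedge.polsia.app/api/scrape"


def build_scrape_request(target_url: str, api_key: str,
                         use_cache: bool = True) -> urllib.request.Request:
    """Build the POST request without sending it (hypothetical helper)."""
    payload = {"url": target_url}
    if not use_cache:
        # cache: false bypasses the 30-minute result cache.
        payload["cache"] = False
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def scrape(target_url: str, api_key: str, use_cache: bool = True,
           timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (timeout is an assumption)."""
    request = build_scrape_request(target_url, api_key, use_cache)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `scrape("https://example.com/blog/my-post", "YOUR_API_KEY")` should return a dictionary shaped like the response shown above. Keeping request construction separate from sending makes the payload easy to inspect without touching the network.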
Feed ScrapEdge output directly into fine-tuning pipelines and RAG systems. Every field is structured — no HTML parsing, no regex cleanup, no post-processing scripts.
Scrape competitor blogs, industry publications, and product pages. Get author, date, tags, and content in one call — ready for your content brief or SEO audit.
Paste any blog post, product page, or article URL. Get structured JSON back in seconds.
POST /api/scrape accepts a URL and returns structured JSON. It requires the X-API-Key header. Results are cached for 30 minutes.
| Field | Type | Description |
|---|---|---|
| url (required) | string | Full URL of the page to scrape. Must use http or https protocol. |
| cache | boolean | Pass false to bypass the 30-minute cache. Defaults to true. |
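The request fields above can be checked on the client before a request is spent. This is a rough sketch, assuming the server's checks resemble the documented rules; the helper names `validate_target_url` and `build_request_body` are hypothetical, and the real server-side validation may differ in edge cases.

```python
from urllib.parse import urlsplit


def validate_target_url(url: str) -> str:
    """Reject URLs the API would likely refuse (client-side approximation)."""
    if not url:
        raise ValueError("url is required")
    parts = urlsplit(url)
    # A usable URL needs both a scheme and a host.
    if not parts.scheme or not parts.netloc:
        raise ValueError("Invalid URL format")
    # urlsplit lowercases the scheme, so "HTTPS://..." passes as well.
    if parts.scheme not in ("http", "https"):
        raise ValueError("Only http and https URLs are supported")
    return url


def build_request_body(url: str, cache: bool = True) -> dict:
    """Build the JSON body; cache is omitted when left at its default of true."""
    body = {"url": validate_target_url(url)}
    if not cache:
        body["cache"] = False
    return body
```

For example, `build_request_body("https://example.com/blog/my-post", cache=False)` produces a body with both `url` and `cache` set, while a URL such as `ftp://example.com/file` raises `ValueError` before any request is sent.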
| Field | Type | Description |
|---|---|---|
| url | string | The URL that was scraped. |
| title | string or null | Page title, extracted from the <title> tag or first <h1>. |
| author | string or null | Article author, from byline, meta tags, or JSON-LD structured data. |
| published_date | string or null | ISO 8601 publication date, from meta tags or <time datetime>. |
| content | string or null | Clean article text. Noise elements (nav, scripts, ads) are stripped. |
| tags | string[] | Topic tags, up to 20, from meta keywords or rel="tag" links. |
| word_count | number | Word count of the extracted content. |
| cached | boolean | true if this response came from cache; false for a live scrape. |
| scraped_at | string | ISO 8601 timestamp of when this result was generated. |
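Because several response fields can be null, it may help to parse the response into a typed object before using it. The sketch below is one possible approach; the `ScrapeResult` class and the `parse_scrape_result` and `parse_iso8601` helpers are hypothetical names, and it assumes timestamps arrive in the `YYYY-MM-DDTHH:MM:SSZ` form shown in the example response.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ScrapeResult:
    """Typed view of the response fields listed in the table above."""
    url: str
    title: Optional[str]
    author: Optional[str]
    published_date: Optional[datetime]
    content: Optional[str]
    tags: list[str]
    word_count: int
    cached: bool
    scraped_at: datetime


def parse_iso8601(value: str) -> datetime:
    # Before Python 3.11, fromisoformat rejects a trailing "Z",
    # so it is rewritten as an explicit UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_scrape_result(data: dict) -> ScrapeResult:
    """Convert a decoded JSON response into a ScrapeResult."""
    published = data.get("published_date")
    return ScrapeResult(
        url=data["url"],
        title=data.get("title"),
        author=data.get("author"),
        published_date=parse_iso8601(published) if published else None,
        content=data.get("content"),
        tags=list(data.get("tags") or []),
        word_count=int(data["word_count"]),
        cached=bool(data["cached"]),
        scraped_at=parse_iso8601(data["scraped_at"]),
    )
```

Downstream code can then check `result.content is None` explicitly instead of discovering a missing field deep inside a pipeline.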
| Status | Error | Cause |
|---|---|---|
| 400 | url is required | No URL field in request body. |
| 400 | Invalid URL format | URL is malformed or missing protocol. |
| 400 | Only http and https URLs are supported | URL uses an unsupported protocol. |
| 401 | API key required | Missing or invalid X-API-Key header. |
| 429 | Rate limit exceeded | Daily request quota exceeded for your plan. |
| 502 | Scrape failed | Target site blocked the request, timed out, or returned non-HTML content. |
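A client will typically want to react differently to each status in the table above. The sketch below shows one way to map status codes to a follow-up action. The `Action` enum, `action_for_status`, and `backoff_delays` are hypothetical names, and the retry policy (exponential backoff for 502 only) is an assumption rather than documented ScrapEdge guidance.

```python
from enum import Enum


class Action(Enum):
    FIX_REQUEST = "fix_request"                    # 400: the request body is wrong
    CHECK_API_KEY = "check_api_key"                # 401: key missing or invalid
    WAIT_FOR_QUOTA_RESET = "wait_for_quota_reset"  # 429: daily quota exhausted
    RETRY_WITH_BACKOFF = "retry_with_backoff"      # 502: target may recover
    UNKNOWN = "unknown"


_ACTIONS = {
    400: Action.FIX_REQUEST,
    401: Action.CHECK_API_KEY,
    429: Action.WAIT_FOR_QUOTA_RESET,
    502: Action.RETRY_WITH_BACKOFF,
}


def action_for_status(status: int) -> Action:
    """Map a documented status code to a suggested client action."""
    return _ACTIONS.get(status, Action.UNKNOWN)


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential delays in seconds, doubling each attempt and capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

Retrying a 400 or 401 unchanged will fail again, and a 429 here reflects a daily quota, so only the 502 case is treated as worth retrying in this sketch.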
Create a new API key. The key is returned immediately and won't be shown again, so store it securely.
| Field | Type | Description |
|---|---|---|
| email (required) | string | Your email address. Used for account identification and rate limit tracking. |
{
  "api_key": "sk_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "email": "you@example.com",
  "plan": "free"
}
Returns current usage stats for the authenticated user. Pass your API key in the X-API-Key header.
{
  "requests_today": 23,
  "daily_limit": 50,
  "plan": "free"
}
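The usage response above is enough to work out how much quota is left before starting a batch. A small sketch, assuming the response has already been decoded into a dictionary; `remaining_requests`, `should_throttle`, and the default reserve of 5 are illustrative choices, not part of the API.

```python
def remaining_requests(usage: dict) -> int:
    """Requests left today, clamped so it never goes negative."""
    return max(0, int(usage["daily_limit"]) - int(usage["requests_today"]))


def should_throttle(usage: dict, reserve: int = 5) -> bool:
    """True when the remaining quota is at or below a safety reserve."""
    return remaining_requests(usage) <= reserve


# Values from the example response above.
usage = {"requests_today": 23, "daily_limit": 50, "plan": "free"}
print(remaining_requests(usage))  # 27
```

Holding back a small reserve leaves room for a few manual or urgent requests after a batch job has run.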
Start free. Scale as you grow. No lock-in.
For exploring and testing the API.
For teams and production AI pipelines.
For high-volume data pipelines and teams.
All plans include access to all API response fields. Daily requests reset at midnight UTC. Cached responses do not count against your limit.
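Since daily requests reset at midnight UTC, a client that has run out of quota can compute how long to wait. The sketch below is one way to do that with the standard library; `next_quota_reset` and `seconds_until_reset` are hypothetical helper names, and both expect a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def next_quota_reset(now: datetime) -> datetime:
    """Return the next midnight UTC strictly after `now` (must be timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    tomorrow = now_utc.date() + timedelta(days=1)
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day, tzinfo=timezone.utc)


def seconds_until_reset(now: datetime) -> float:
    """Seconds from `now` until the daily quota resets."""
    return (next_quota_reset(now) - now.astimezone(timezone.utc)).total_seconds()


# 10:30 UTC leaves 13.5 hours until midnight UTC.
example = datetime(2025, 1, 22, 10, 30, tzinfo=timezone.utc)
print(seconds_until_reset(example))  # 48600.0
```

In practice this would be called with `datetime.now(timezone.utc)`. A naive datetime would be interpreted as local time by `astimezone`, which is why an aware value is required.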
The web has the data your AI needs. ScrapEdge gets it to you in the format that matters — structured, consistent, and ready to use.