Marketing data, AI-ready

Marketing data API for AI training and blog writing.

ScrapEdge returns structured JSON — title, author, published date, content, tags, word count — from any URL. No parsers, no cleaning, no post-processing.

POST /api/scrape
// Send a URL
{ "url": "https://vercel.com/blog/turborepo-1-5" }

// Get structured JSON back
{
  "title":          "Introducing Turborepo 1.5",
  "author":         "Jared Palmer",
  "published_date": "2023-03-15T00:00:00Z",
  "word_count":     1247,
  "tags":           ["engineering", "monorepo"],
  "content":        "Today we're releasing..."
}

From URL to AI-ready data in one call

01

Call the API

POST your target URL with your API key. Optionally pass cache: false to bypass the 30-minute result cache.

02

We scrape and parse

ScrapEdge handles JavaScript rendering, anti-bot measures, and structural parsing, then returns structured JSON.

03

Use the data

Feed into your AI pipeline, RAG system, or content generation workflow. No post-processing needed.

Request
# Pass your API key in the X-API-Key header
curl -X POST https://scrapedge.polsia.app/api/scrape \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{"url": "https://example.com/blog/my-post"}'
Response
{
  "url":           "https://example.com/blog/my-post",
  "title":         "My Post Title",
  "author":        "Jane Smith",
  "published_date":"2025-01-15T00:00:00Z",
  "word_count":    1847,
  "tags":          ["marketing", "growth"],
  "content":       "Article body text goes here...",
  "cached":        false,
  "scraped_at":    "2025-01-22T10:30:00Z"
}
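
The same call can be sketched in Python using only the standard library. The endpoint, header name, and body fields are taken from the curl example above; the function names and the 30-second timeout are illustrative assumptions, not part of the ScrapEdge API.

```python
import json
import urllib.request

# Endpoint and header name come from the curl example above.
SCRAPE_ENDPOINT = "https://scrapedge.polsia.app/api/scrape"


def build_scrape_request(url: str, api_key: str, cache: bool = True) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/scrape."""
    payload = {"url": url}
    if not cache:
        # Only send the flag when bypassing the cache; the API defaults to true.
        payload["cache"] = False
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def scrape(url: str, api_key: str, cache: bool = True, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body.

    The timeout value is an assumption; pick one that suits your pipeline.
    """
    request = build_scrape_request(url, api_key, cache)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling scrape("https://example.com/blog/my-post", "YOUR_API_KEY") performs the network request; build_scrape_request alone is handy for inspecting what would be sent.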

Built for two workflows

AI Training Data

Build better models with clean, structured content

Feed ScrapEdge output directly into fine-tuning pipelines and RAG systems. Every field is structured — no HTML parsing, no regex cleanup, no post-processing scripts.

  • Consistent JSON schema across all sources
  • Resolves JavaScript-rendered pages automatically
  • 30-minute result cache avoids redundant scraping
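
As a rough illustration of feeding a response into a RAG or fine-tuning pipeline, the sketch below splits the content field into word-bounded chunks and emits one JSONL line per chunk. The record layout (source_url, chunk_index, text) and the 200-word chunk size are assumptions chosen for the example; ScrapEdge does not prescribe them.

```python
import json


def chunk_words(text: str, max_words: int = 200) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def to_jsonl_records(result: dict, max_words: int = 200) -> list[str]:
    """Turn one scrape response into JSONL lines, one per content chunk."""
    content = result.get("content") or ""  # content may be null in the response
    lines = []
    for index, chunk in enumerate(chunk_words(content, max_words)):
        record = {
            "source_url": result["url"],   # hypothetical field names for the output record
            "title": result.get("title"),
            "chunk_index": index,
            "text": chunk,
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return lines
```

A response whose content is null simply yields no records, so the caller can skip it without special handling.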
Blog Research

Write faster with structured competitive intelligence

Scrape competitor blogs, industry publications, and product pages. Get author, date, tags, and content in one call — ready for your content brief or SEO audit.

  • Author and publication date extraction included
  • Tag extraction for topic clustering and SEO
  • Word count for scope estimation and planning
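
One way to use tags and word counts for topic clustering is to group scraped posts by tag and total their length. This is a minimal sketch over already-fetched responses; the summary shape and the lowercase normalization of tags are assumptions made for the example.

```python
from collections import defaultdict


def cluster_by_tag(results: list[dict]) -> dict[str, dict]:
    """Group scrape responses by tag, collecting URLs and summing word counts."""
    summary = defaultdict(lambda: {"urls": [], "total_words": 0})
    for result in results:
        for tag in result.get("tags", []):
            key = tag.strip().lower()  # normalize so "SEO" and "seo" share a cluster
            summary[key]["urls"].append(result["url"])
            summary[key]["total_words"] += result.get("word_count", 0)
    return dict(summary)
```

Sorting the clusters by total_words gives a quick view of which topics a competitor covers in the most depth.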

Endpoints & field reference

POST /api/scrape

Accepts a URL, returns structured JSON. Requires X-API-Key header. Results are cached for 30 minutes.

Request body
Field           Type     Description
url (required)  string   Full URL of the page to scrape. Must use the http or https protocol.
cache           boolean  Pass false to bypass the 30-minute cache. Defaults to true.
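
A client can catch the most common mistakes before spending a request. The sketch below mirrors the three 400 messages listed under Error responses in this reference; how the server actually validates URLs is not documented here, so treat this as an approximation rather than the service's own logic.

```python
from typing import Optional
from urllib.parse import urlparse


def check_scrape_url(url: Optional[str]) -> Optional[str]:
    """Return None if the URL looks acceptable, else the matching 400 message."""
    if not url:
        return "url is required"
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        # No protocol or no host, e.g. "example.com/blog".
        return "Invalid URL format"
    if parsed.scheme not in ("http", "https"):
        return "Only http and https URLs are supported"
    return None
```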
Response fields
Field           Type           Description
url             string         The URL that was scraped.
title           string | null  Page title, extracted from the <title> tag or first <h1>.
author          string | null  Article author, from byline, meta tags, or JSON-LD structured data.
published_date  string | null  ISO 8601 publication date, from meta tags or <time datetime>.
content         string | null  Clean article text. Noise elements (nav, scripts, ads) are stripped.
tags            string[]       Topic tags, up to 20, from meta keywords or rel="tag" links.
word_count      number         Word count of the extracted content.
cached          boolean        true if this response came from cache; false for a live scrape.
scraped_at      string         ISO 8601 timestamp of when this result was generated.
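
Because several fields may be null, it can help to load a response into a typed structure before using it. The dataclass below is a sketch built from the field table above; the class and helper names are hypothetical, and it assumes timestamps arrive in the form shown in the example responses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string, passing null values through as None."""
    if value is None:
        return None
    # Older Python versions reject a trailing "Z", so map it to an explicit offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class ScrapeResult:
    url: str
    title: Optional[str]
    author: Optional[str]
    published_date: Optional[datetime]
    content: Optional[str]
    tags: List[str]
    word_count: int
    cached: bool
    scraped_at: Optional[datetime]

    @classmethod
    def from_json(cls, data: dict) -> "ScrapeResult":
        """Build a ScrapeResult from a decoded /api/scrape response."""
        return cls(
            url=data["url"],
            title=data.get("title"),
            author=data.get("author"),
            published_date=parse_timestamp(data.get("published_date")),
            content=data.get("content"),
            tags=list(data.get("tags") or []),
            word_count=int(data.get("word_count", 0)),
            cached=bool(data.get("cached", False)),
            scraped_at=parse_timestamp(data["scraped_at"]),
        )
```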
Error responses
Status  Error                                   Cause
400     url is required                         No URL field in request body.
400     Invalid URL format                      URL is malformed or missing protocol.
400     Only http and https URLs are supported  URL uses an unsupported protocol.
401     API key required                        Missing or invalid X-API-Key header.
429     Rate limit exceeded                     Daily request quota exceeded for your plan.
502     Scrape failed                           Target site blocked the request, timed out, or returned non-HTML content.
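
The status codes above suggest different client reactions. The mapping below is one possible reading of the listed causes, not documented ScrapEdge behavior: 400 and 401 need a corrected request, 429 needs a wait until the quota resets, and 502 may succeed on a later attempt. The exponential backoff helper and its 1-second base are likewise assumptions.

```python
from enum import Enum


class Action(Enum):
    FIX_REQUEST = "fix the request body"
    FIX_API_KEY = "check the X-API-Key header"
    WAIT_FOR_RESET = "wait for the daily quota to reset"
    RETRY = "retry after a delay"
    UNKNOWN = "inspect the response"


# Status codes come from the error table; the chosen actions are an interpretation.
ACTIONS = {
    400: Action.FIX_REQUEST,
    401: Action.FIX_API_KEY,
    429: Action.WAIT_FOR_RESET,
    502: Action.RETRY,
}


def action_for_status(status: int) -> Action:
    """Map an HTTP status code to a suggested client-side reaction."""
    return ACTIONS.get(status, Action.UNKNOWN)


def backoff_delays(attempts: int, base: float = 1.0) -> list[float]:
    """Exponential delays in seconds for retrying 502 responses."""
    return [base * (2 ** i) for i in range(attempts)]
```

With urllib, the status code is available as the code attribute of urllib.error.HTTPError, so action_for_status(err.code) can drive the retry decision.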
POST /api/keys

Creates a new API key. The key is returned once, in the response, so store it securely; it will not be shown again.

Request body
Field             Type    Description
email (required)  string  Your email address. Used for account identification and rate limit tracking.
Response
{
  "api_key": "sk_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "email":    "you@example.com",
  "plan":     "free"
}
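
Since the key is shown only once, a client may want to persist it immediately. The sketch below builds the key-creation request and writes the returned key to a file created with owner-only permissions. It assumes /api/keys is served from the same host as the /api/scrape curl example and accepts a JSON body; neither is stated explicitly in this reference, and the function names are hypothetical.

```python
import json
import os
import urllib.request

# ASSUMPTION: /api/keys shares its host with the /api/scrape curl example.
KEYS_ENDPOINT = "https://scrapedge.polsia.app/api/keys"


def build_create_key_request(email: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/keys."""
    return urllib.request.Request(
        KEYS_ENDPOINT,
        data=json.dumps({"email": email}).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, as for /api/scrape
        method="POST",
    )


def save_api_key(path: str, api_key: str) -> None:
    """Write the key to a file created with mode 0600 (owner read/write on POSIX).

    If the file already exists, its current permissions are left unchanged.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(api_key)
```

A secrets manager or environment variable is usually a better home for the key than a file; the file version just keeps the example self-contained.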
GET /api/usage

Returns current usage stats for the authenticated user. Pass your API key in the X-API-Key header.

Response
{
  "requests_today": 23,
  "daily_limit":    50,
  "plan":           "free"
}
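
The usage response makes it straightforward to check remaining quota before starting a batch. A minimal sketch, using the field names from the response above; the helper names are illustrative.

```python
def remaining_requests(usage: dict) -> int:
    """Requests left today, never negative."""
    return max(usage["daily_limit"] - usage["requests_today"], 0)


def can_run_batch(usage: dict, batch_size: int) -> bool:
    """True if a batch of uncached scrapes fits in today's remaining quota."""
    return batch_size <= remaining_requests(usage)


usage = {"requests_today": 23, "daily_limit": 50, "plan": "free"}
print(remaining_requests(usage))  # prints 27
```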

Simple, usage-based pricing

Start free. Scale as you grow. No lock-in.

Free
$0 / month

For exploring and testing the API.

  • 50 requests / day
  • All response fields
  • 30-min result cache
  • Community support
Scale
$99 / month

For high-volume data pipelines and teams.

  • 10,000 requests / day
  • All response fields
  • 30-min result cache
  • Dedicated rate limits
  • Usage dashboard
  • Dedicated support

All plans include access to all API response fields. Daily requests reset at midnight UTC. Cached responses do not count against your limit.
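
Because quotas reset at midnight UTC, a client that hits its limit can compute how long to wait. The sketch below is a small helper for that; it expects a timezone-aware datetime, and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def next_reset(now: datetime) -> datetime:
    """Return the next midnight UTC after the given timezone-aware moment."""
    now_utc = now.astimezone(timezone.utc)
    tomorrow = now_utc.date() + timedelta(days=1)
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day, tzinfo=timezone.utc)


def seconds_until_reset(now: datetime) -> float:
    """Seconds from now until the daily quota resets."""
    return (next_reset(now) - now).total_seconds()


# 10:30 UTC leaves 13.5 hours until midnight UTC.
example = datetime(2025, 1, 22, 10, 30, tzinfo=timezone.utc)
print(seconds_until_reset(example))  # prints 48600.0
```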

Stop writing scrapers.
Start building products.

The web has the data your AI needs. ScrapEdge gets it to you in the format that matters — structured, consistent, and ready to use.

// Data for machines. Speed for humans.