Extrapify Docs

Structured web context infrastructure for developer teams. Use one schema-guided extraction endpoint to turn public webpages into reliable JSON for APIs, jobs, and agent workflows.

Version: v1. Base URL: https://extrapify.com

Quick start

Go from API key to structured web context in a few minutes. No SDK required.

1. Create an API key backed by prepaid credits

Purchase credits through the pricing flow. Extrapify delivers the API key only once, so store it somewhere safe. Quota is bound to that key, and usage is enforced server-side.

2. Submit a schema-guided extraction request: POST /api/v1/extract

extract.sh
curl -X POST https://extrapify.com/api/v1/extract \
  -H "Content-Type: application/json" \
  -H "x-api-key: sk_live_..." \
  -d '{
    "url": "https://example.com",
    "schema": { "title": "string" },
    "mode": "auto"
  }'
response.json
{
  "extracted": {
    "title": "Example Domain"
  },
  "type": "single",
  "count": 1,
  "confidence": 1,
  "tokens_used": 104
}
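
To make the response shape concrete, here is a minimal Python sketch that parses the sample response above into a typed record. The field names come from response.json; the `ExtractResult` class and `parse_extract_response` function are hypothetical names introduced for illustration, and the sketch assumes every field shown in the sample is always present.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ExtractResult:
    """Typed view of the response fields shown in response.json."""
    extracted: Any       # an object in the sample; other shapes are not documented here
    type: str            # "single" in the sample
    count: int
    confidence: float
    tokens_used: int


REQUIRED_FIELDS = ("extracted", "type", "count", "confidence", "tokens_used")


def parse_extract_response(body: str) -> ExtractResult:
    """Parse a JSON response body and fail loudly if a sampled field is absent."""
    data = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return ExtractResult(
        extracted=data["extracted"],
        type=data["type"],
        count=int(data["count"]),
        confidence=float(data["confidence"]),
        tokens_used=int(data["tokens_used"]),
    )


sample = """
{
  "extracted": { "title": "Example Domain" },
  "type": "single",
  "count": 1,
  "confidence": 1,
  "tokens_used": 104
}
"""
result = parse_extract_response(sample)
print(result.extracted["title"], result.confidence)
```

Failing early on a missing field keeps a malformed response from leaking into downstream code as a partially filled record.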

API reference

A compact contract designed for predictable integration.

POST /api/v1/extract

Headers

x-api-key (string, required)

Secret API key used for authentication, quota, and distributed rate limiting.

Content-Type (string, required)

Must be application/json.

Body

url (string, required)

Public webpage URL to resolve, fetch, and normalize into structured output.

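schema (object, required)

Schema-guided extraction definition built from supported leaf types and nested objects. The examples on this page use the leaf types string, number, integer, float, date, and url, plus the array form string[].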

mode ("auto" | "single" | "list", optional)

Extraction mode. Defaults to "auto", which lets Extrapify choose the output shape.

Example schema

schema.json
{
  "title": "string",
  "author": "string",
  "published_at": "date",
  "tags": "string[]"
}
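
A schema can be checked locally before spending a credit on a request that would come back as 422. The sketch below is one possible pre-flight check, not Extrapify's own validator: `KNOWN_LEAF_TYPES` lists only the leaf types that appear in this page's examples (the full supported set is an assumption), it assumes the `[]` suffix may follow any of them, and `find_schema_problems` is a hypothetical helper name.

```python
# Leaf types seen in this page's examples. Assumption: the real service
# may accept more (or fewer) than these.
KNOWN_LEAF_TYPES = {"string", "number", "integer", "float", "date", "url"}


def find_schema_problems(schema, path=""):
    """Return a list of human-readable problems; an empty list means none were found."""
    if not isinstance(schema, dict) or not schema:
        return [f"{path or '<root>'}: expected a non-empty object"]
    problems = []
    for key, value in schema.items():
        here = f"{path}.{key}" if path else key
        if isinstance(value, dict):
            # Nested objects are validated recursively.
            problems.extend(find_schema_problems(value, here))
        elif isinstance(value, str):
            # Assumption: "type[]" denotes an array of that leaf type.
            base = value[:-2] if value.endswith("[]") else value
            if base not in KNOWN_LEAF_TYPES:
                problems.append(f"{here}: unknown leaf type {value!r}")
        else:
            problems.append(f"{here}: expected a type name or nested object")
    return problems


example_schema = {
    "title": "string",
    "author": "string",
    "published_at": "date",
    "tags": "string[]",
}
print(find_schema_problems(example_schema))
print(find_schema_problems({"meta": {"score": "strng"}}))
```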

Schema templates

Reusable starting points for common structured web context workloads.

Article

article.schema.json
{
  "title": "string",
  "summary": "string",
  "author": "string",
  "published_at": "date",
  "section": "string",
  "canonical_url": "url",
  "tags": "string[]"
}

Product

product.schema.json
{
  "name": "string",
  "brand": "string",
  "price": "number",
  "currency": "string",
  "availability": "string",
  "rating": "float",
  "review_count": "integer",
  "image_url": "url"
}

Job listing

job.schema.json
{
  "title": "string",
  "company": "string",
  "location": "string",
  "employment_type": "string",
  "salary_range": "string",
  "posted_at": "date",
  "apply_url": "url"
}

MCP integration

Extrapify ships with a thin, stateless MCP access layer for agent-native workflows. The MCP server exposes only protocol operations and forwards each request to Extrapify, which remains the authority for extraction, quota, analytics, Browserless rendering, and billing.

Tool: extract_structured_data

Retrieve structured web context from a public webpage using the same schema-guided extraction engine that powers the HTTP API.

Both Claude Desktop configurations below launch mcp/server.mjs with node. The path in args is an example from one machine, so point it at wherever that file lives on your system.

Claude Desktop: local API target

claude-desktop.local.json
{
  "mcpServers": {
    "extrapify": {
      "command": "node",
      "args": ["C:/Users/Chris/OneDrive/Desktop/extrapify/mcp/server.mjs"],
      "env": {
        "EXTRAPIFY_API_BASE_URL": "http://localhost:3000",
        "EXTRAPIFY_API_KEY": "sk_live_..."
      }
    }
  }
}

Claude Desktop: production API target

claude-desktop.production.json
{
  "mcpServers": {
    "extrapify": {
      "command": "node",
      "args": ["C:/Users/Chris/OneDrive/Desktop/extrapify/mcp/server.mjs"],
      "env": {
        "EXTRAPIFY_API_BASE_URL": "https://extrapify.com",
        "EXTRAPIFY_API_KEY": "sk_live_..."
      }
    }
  }
}
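
The two configurations above differ only in EXTRAPIFY_API_BASE_URL. As a sketch of that relationship, the helper below generates either file from three inputs. `build_mcp_config` is a hypothetical function, and `/path/to/extrapify/mcp/server.mjs` is a placeholder path rather than a real install location.

```python
import json


def build_mcp_config(server_path, api_base_url, api_key):
    """Build a Claude Desktop config dict with the same keys as the examples above."""
    return {
        "mcpServers": {
            "extrapify": {
                "command": "node",
                "args": [server_path],
                "env": {
                    "EXTRAPIFY_API_BASE_URL": api_base_url,
                    "EXTRAPIFY_API_KEY": api_key,
                },
            }
        }
    }


SERVER_PATH = "/path/to/extrapify/mcp/server.mjs"  # placeholder, adjust locally

local = build_mcp_config(SERVER_PATH, "http://localhost:3000", "sk_live_...")
production = build_mcp_config(SERVER_PATH, "https://extrapify.com", "sk_live_...")

print(json.dumps(production, indent=2))
```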

Operational model

  • MCP remains stateless and easy to maintain.
  • Extrapify remains the extraction engine, analytics authority, and quota authority.
  • Future tools can be added without changing the current extraction contract.

Code examples

Drop-in examples for application code, jobs, and infrastructure services.

extract.sh
curl -X POST https://extrapify.com/api/v1/extract \
  -H "Content-Type: application/json" \
  -H "x-api-key: sk_live_..." \
  -d '{
    "url": "https://example.com",
    "schema": { "title": "string" },
    "mode": "auto"
  }'
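
The same request can be issued from Python with only the standard library. This is a minimal sketch that mirrors the curl call above: the endpoint, headers, and body fields come from this page, while the function names `build_extract_request` and `extract` and the 30-second timeout are assumptions made for illustration.

```python
import json
import urllib.request

EXTRACT_URL = "https://extrapify.com/api/v1/extract"


def build_extract_request(api_key, url, schema, mode="auto"):
    """Build the POST request without sending it, so it can be inspected or tested."""
    payload = json.dumps({"url": url, "schema": schema, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        EXTRACT_URL,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )


def extract(api_key, url, schema, mode="auto", timeout=30.0):
    """Send the request and return the decoded JSON body.

    The timeout value is an arbitrary choice for this sketch.
    """
    request = build_extract_request(api_key, url, schema, mode)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real network call and consumes a credit):
# data = extract("sk_live_...", "https://example.com", {"title": "string"})
# print(data["extracted"])
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so callers that want to read the error body need to catch that exception.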

Error handling

Extrapify uses standard HTTP status codes with a stable JSON error contract. Every error body includes { error: string }, and successful responses include an x-request-id header for tracing.

Code  Name                   Description
401   Invalid API key        Missing, invalid, or unauthorized x-api-key header.
402   Credits exhausted      The API key has no remaining extraction credits.
422   Invalid schema         The schema or model output does not satisfy the extraction contract.
429   Too many requests      The API key exceeded the distributed per-key rate limit window.
500   Internal server error  Unexpected upstream failure. Retry with backoff and preserve x-request-id for support.
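
The status codes above split naturally into errors worth retrying and errors that need a fix first. The sketch below shows one way to apply "retry with backoff" using exponential backoff with full jitter. Treating 429 as retryable alongside 500 is an assumption (this page only recommends retrying 500), and `ExtractError`, `backoff_delay`, `call_with_retry`, and the `send` callback are hypothetical names. `send` is expected to return a `(status, body)` pair.

```python
import random
import time


class ExtractError(Exception):
    """Raised when a request fails with a non-retryable status or retries run out."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


# Assumption: 429 is retried as well as 500. 401, 402, and 422 need a
# different key, more credits, or a corrected schema, so they fail at once.
RETRYABLE_STATUSES = {429, 500}


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt) seconds."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(send, max_attempts=4, sleep=time.sleep):
    """Call send() until it succeeds, hits a fatal status, or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or last_attempt:
            raise ExtractError(status, body.get("error", "unknown error"))
        sleep(backoff_delay(attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` and `rng` as parameters keeps the retry loop testable without real waiting or real randomness.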

Notes & limitations

  • Extrapify only fetches public http and https targets and rejects private-network destinations.
  • Browserless fallback is applied automatically for JavaScript-heavy pages when lightweight extraction is insufficient.
  • Responses are generated using an LLM. Use the confidence field to gate downstream logic.
  • Default distributed rate limit: 30 requests per minute per API key.
  • Quota accounting, refund handling, observability, and extraction analytics are enforced server-side by Extrapify.
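
Gating on the confidence field, as suggested in the notes above, might look like the following sketch. The thresholds 0.9 and 0.5 are illustrative assumptions, not values recommended by Extrapify; the sketch also assumes confidence ranges from 0 to 1, which matches the sample response but is not stated on this page. `route_by_confidence` is a hypothetical helper name.

```python
def route_by_confidence(response, accept_at=0.9, review_at=0.5):
    """Map a response's confidence to a downstream action.

    Returns "accept", "review", or "reject". A missing confidence
    field is treated as 0 so that it is never accepted silently.
    """
    confidence = float(response.get("confidence", 0.0))
    if confidence >= accept_at:
        return "accept"
    if confidence >= review_at:
        return "review"
    return "reject"


sample_response = {
    "extracted": {"title": "Example Domain"},
    "type": "single",
    "count": 1,
    "confidence": 1,
    "tokens_used": 104,
}
print(route_by_confidence(sample_response))
```

A three-way split lets borderline results go to a manual review queue rather than being stored or discarded outright; suitable thresholds depend on the workload.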
