Skip to main content
POST
/
parse
Parse a local file upload using the scrape pipeline
curl --request POST \
  --url https://api.firecrawl.dev/v2/parse \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form file='@example-file' \
  --form 'options={
  "formats": [
    "markdown"
  ],
  "onlyMainContent": true,
  "includeTags": [
    "<string>"
  ],
  "excludeTags": [
    "<string>"
  ],
  "timeout": 123,
  "parsers": [
    "pdf"
  ],
  "removeBase64Images": true
}' \
  --form 'origin=<string>' \
  --form 'integration=<string>' \
  --form zeroDataRetention=true
{
  "success": true,
  "data": {
    "markdown": "<string>",
    "summary": "<string>",
    "html": "<string>",
    "rawHtml": "<string>",
    "screenshot": "<string>",
    "links": [
      "<string>"
    ],
    "actions": {
      "screenshots": [
        "<string>"
      ],
      "scrapes": [
        {
          "url": "<string>",
          "html": "<string>"
        }
      ],
      "javascriptReturns": [
        {
          "type": "<string>",
          "value": "<unknown>"
        }
      ],
      "pdfs": [
        "<string>"
      ]
    },
    "metadata": {
      "title": "<string>",
      "description": "<string>",
      "language": "<string>",
      "sourceURL": "<string>",
      "keywords": "<string>",
      "ogLocaleAlternate": [
        "<string>"
      ],
      "<any other metadata> ": "<string>",
      "statusCode": 123,
      "error": "<string>"
    },
    "warning": "<string>",
    "changeTracking": {
      "previousScrapeAt": "2023-11-07T05:31:56Z",
      "changeStatus": "new",
      "visibility": "visible",
      "diff": "<string>",
      "json": {}
    },
    "branding": {}
  }
}
Use /v2/parse to upload a local file and run it through the scrape pipeline (PDF/document parsing, markdown conversion, metadata extraction, and transformers).

Multipart Fields

  • file (required): The file to parse.
  • options (optional): JSON string of parse options.
  • origin, integration, zeroDataRetention (optional): Same semantics as /v2/scrape.

Allowed Options

  • formats: markdown, html, rawHtml, links, images, summary, json, attributes
  • onlyMainContent, includeTags, excludeTags, parsers (pdf only), removeBase64Images, timeout

Supported File Types

  • PDF (application/pdf or %PDF signature)
  • Office documents: .docx, .odt, .rtf, .xlsx, .xls
  • HTML: .html, .htm, text/html, application/xhtml+xml
  • Markdown: .md, text/markdown
  • Plain text: text/plain

Example

curl -X POST "https://api.firecrawl.dev/v2/parse" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@./sample.pdf" \
  -F 'options={"formats":["markdown"]}'

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

multipart/form-data
file
file
required

File to parse

options
object

Parse options to apply (send as JSON in the multipart part).

origin
string

Request origin identifier

integration
string

Integration identifier

zeroDataRetention
boolean

If true, enable zero data retention for this parse. Contact support to enable.

Response

Successful response

success
boolean
data
object