PDF to Markdown API

Convert PDF documents to Markdown format with automatic image extraction

Overview

The PDF to Markdown API converts PDF documents to Markdown format with automatic image extraction and hosting. It uses a unified JSON request format, accepting either URL references or base64-encoded document data.

Authentication

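The API supports two authentication methods:

  • API Key: Pass your API key as the query parameter ?key=YOUR_API_KEY
  • Login session: Requests from logged-in users are authenticated without a key, so the key parameter can be omitted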
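
As a minimal sketch of the API key method, the Python snippet here appends the key to the endpoint as a query parameter. The host is taken from the curl examples in this guide; the names BASE_URL and endpoint_url are illustrative and not part of the API.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the curl examples of this guide.
BASE_URL = "https://llmocr.com/api/pdf-to-markdown"


def endpoint_url(api_key: str) -> str:
    """Return the endpoint URL with the API key attached as ?key=...

    urlencode percent-escapes the key, so characters such as '&' or
    spaces cannot break the query string.
    """
    return f"{BASE_URL}?{urlencode({'key': api_key})}"
```

Escaping the key is a defensive choice; keys made only of letters and digits pass through unchanged.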

Convert PDF to Markdown

Convert a PDF document to Markdown format with automatic image extraction.

Request

POST /api/pdf-to-markdown

Parameters:

Parameter              Type    Required  Description
document               object  Yes       Document object
document.type          string  Yes       Fixed value "document_url"
document.document_url  string  Yes       PDF document URL or base64 data
filename               string  No        Filename (recommended for base64 data)
key                    string  No        API key (query parameter, optional for logged-in users)

Examples:

Using PDF URL:

curl -X POST "https://llmocr.com/api/pdf-to-markdown?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": {
      "type": "document_url",
      "document_url": "https://llmocr.com/document.pdf"
    }
  }'
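
The same request can be sketched in Python using only the standard library. This is an illustration under assumptions, not an official client: the helper names build_request and convert are hypothetical, the 60-second timeout is an arbitrary choice, and error handling is left out.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode

# Endpoint as it appears in the curl examples of this guide.
ENDPOINT = "https://llmocr.com/api/pdf-to-markdown"


def build_request(document_url: str, api_key: str,
                  filename: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request for one document."""
    payload = {
        "document": {
            "type": "document_url",          # fixed value per the parameter table
            "document_url": document_url,    # PDF URL or base64 data
        }
    }
    if filename is not None:
        payload["filename"] = filename       # optional, recommended for base64 data
    return urllib.request.Request(
        f"{ENDPOINT}?{urlencode({'key': api_key})}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def convert(document_url: str, api_key: str,
            filename: Optional[str] = None, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_request(document_url, api_key, filename)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping build_request separate from convert makes it possible to inspect the URL, headers, and body without touching the network.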

Using Base64 PDF Data:

curl -X POST "https://llmocr.com/api/pdf-to-markdown?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": {
      "type": "document_url",
      "document_url": "data:application/pdf;base64,JVBERi0xLjQK..."
    },
    "filename": "my-document.pdf"
  }'
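
To produce the base64 value shown above from raw PDF bytes, a short Python sketch follows. It assumes the data:application/pdf;base64, prefix from the example is the expected form; the function names pdf_data_uri and base64_payload are illustrative only.

```python
import base64


def pdf_data_uri(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes as a data URI in the form used by the example."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return "data:application/pdf;base64," + encoded


def base64_payload(pdf_bytes: bytes, filename: str) -> dict:
    """Build the JSON body for a base64 upload, including the filename."""
    return {
        "document": {
            "type": "document_url",   # same fixed value as for URL input
            "document_url": pdf_data_uri(pdf_bytes),
        },
        "filename": filename,
    }
```

For a file on disk, the bytes would typically come from open(path, "rb").read(). The PDF header bytes %PDF-1.4 followed by a newline encode to JVBERi0xLjQK, which is the prefix visible in the curl example.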

Response

Parameters:

Parameter        Type     Description
id               string   Database record ID
filename         string   Filename
content          string   Complete Markdown content (with embedded images)
format           string   Output format, fixed as "markdown"
total_pages      number   Total number of pages in the PDF
pages_shown      number   Number of pages included in the response
is_partial       boolean  Whether the response contains partial content due to subscription limits
remaining_pages  number   Number of pages not shown due to limits (only present when is_partial is true)
message          string   Information message about subscription limits (only present when is_partial is true)
timestamp        number   Processing completion timestamp
payload          string   API endpoint URL

Example:

{
  "id": "67890",
  "filename": "document.pdf",
  "content": "# Document Title\n\nDocument content with images...\n\n![image](https://storage.llmocr.com/image.jpg)",
  "format": "markdown",
  "total_pages": 29,
  "pages_shown": 21,
  "is_partial": true,
  "remaining_pages": 8,
  "message": "Showing 21 out of 29 pages based on your available subscription. All 29 pages have been saved and you can unlock the remaining 8 pages when you have more subscription pages.",
  "timestamp": 1758871660489,
  "payload": "https://llmocr.com/api/pdf-to-markdown?key=YOUR_API_KEY"
}
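
A client will usually want to check is_partial before using the content. The Python sketch below works on a response already decoded into a dict; the helper names summarize and image_urls are hypothetical, and the image pattern only covers the simple ![alt](url) form seen in the example.

```python
import re

# Matches Markdown images of the form ![alt](url) and captures the URL.
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")


def summarize(result: dict) -> str:
    """Describe how much of the document the response covers."""
    shown = result["pages_shown"]
    total = result["total_pages"]
    if result.get("is_partial"):
        # remaining_pages is only present on partial responses;
        # fall back to the difference if it is missing.
        remaining = result.get("remaining_pages", total - shown)
        return f"partial: {shown}/{total} pages, {remaining} remaining"
    return f"complete: {shown}/{total} pages"


def image_urls(result: dict) -> list:
    """Return the URLs of images embedded in the Markdown content."""
    return IMAGE_PATTERN.findall(result.get("content", ""))
```

Reading remaining_pages and message with .get avoids a KeyError on complete responses, where the table above says those fields are absent.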