Advanced Recognition API

High-precision text recognition with position detection. Extracts text content and provides detailed coordinate information for each text block.

Overview

The Advanced Recognition API provides high-precision text recognition with detailed position information. Unlike standard text recognition, this API returns not only the extracted text but also precise coordinates for each text block, including rotation rectangles and four-point coordinates.

It uses a unified JSON request format, accepting either URL references or base64-encoded image data.

Authentication

The API supports the following authentication method:

  • API Key: Pass your API key as a query parameter ?key=YOUR_API_KEY

Extract Text with Position Data

Extract text from an image file and get detailed position information for each text block, including rotation rectangles and four-point coordinates.

Request

POST /api/advanced-recognition

Parameters:

Parameter            Type     Required   Description
document             object   Yes        Document object
document.type        string   Yes        Fixed value "image_url"
document.image_url   string   Yes        Image URL or base64 data
filename             string   No         Filename (recommended for base64 data)
key                  string   No         API key (query parameter, optional for logged-in users)

Examples:

Using Image URL:

curl -X POST "https://llmocr.com/api/advanced-recognition?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": {
      "type": "image_url",
      "image_url": "https://llmocr.com/image.jpg"
    }
  }'

Using Base64 Image Data:

curl -X POST "https://llmocr.com/api/advanced-recognition?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": {
      "type": "image_url",
      "image_url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEA..."
    },
    "filename": "document.jpg"
  }'
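
For local files, the base64 variant of the request can be assembled programmatically. The following is a minimal Python sketch using only the standard library. The helper names (build_url, build_payload, recognize) are illustrative and not part of the API, the JPEG fallback for unknown file types is an assumption, and the error handling a production client would need is omitted.

```python
import base64
import json
import mimetypes
import urllib.parse
import urllib.request

# Endpoint taken from the curl examples above.
ENDPOINT = "https://llmocr.com/api/advanced-recognition"


def build_url(api_key: str) -> str:
    """Append the API key as the `key` query parameter."""
    return f"{ENDPOINT}?{urllib.parse.urlencode({'key': api_key})}"


def build_payload(image_bytes: bytes, filename: str) -> dict:
    """Wrap raw image bytes in the documented JSON request body."""
    # Guess the MIME type from the filename; falling back to JPEG is an assumption.
    mime, _ = mimetypes.guess_type(filename)
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "document": {
            "type": "image_url",
            "image_url": f"data:{mime or 'image/jpeg'};base64,{encoded}",
        },
        "filename": filename,
    }


def recognize(image_bytes: bytes, filename: str, api_key: str) -> dict:
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        build_url(api_key),
        data=json.dumps(build_payload(image_bytes, filename)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping payload construction separate from the network call makes it possible to inspect or log the request body before anything is sent.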

Response

Parameters:

Parameter   Type     Description
id          string   Database record ID
filename    string   Filename
content     string   Extracted text content (all text blocks joined by newlines)
ocrResult   object   Detailed OCR results with position information
format      string   Output format, fixed as "json"
timestamp   number   Processing completion timestamp
payload     string   API endpoint URL

ocrResult.words_info Structure:

Each item in the words_info array contains:

Field         Type       Description
text          string     Text content of the block
location      number[]   Four-point coordinates [x1,y1,x2,y2,x3,y3,x4,y4] (top-left → top-right → bottom-right → bottom-left)
rotate_rect   number[]   Rotation rectangle [center_x, center_y, width, height, angle], angle range: [-90, 90]
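
To make the two coordinate formats concrete, the Python sketch below regroups location into four (x, y) points and reconstructs the corners implied by rotate_rect. The function names are illustrative. This document does not state the angle unit or the rotation direction, so the code assumes degrees and a rotation about the center; treat the non-zero-angle case as an assumption to verify against real responses.

```python
import math


def location_to_points(location):
    """Group the flat [x1, y1, ..., x4, y4] list into four (x, y) tuples."""
    return [(location[i], location[i + 1]) for i in range(0, 8, 2)]


def rotate_rect_to_corners(rotate_rect):
    """Return the corners of [center_x, center_y, width, height, angle].

    Assumes the angle is in degrees and rotates about the center;
    the sign convention is an assumption, not documented behavior.
    """
    cx, cy, width, height, angle = rotate_rect
    cos_a = math.cos(math.radians(angle))
    sin_a = math.sin(math.radians(angle))
    half_w, half_h = width / 2, height / 2
    # Offsets in the order top-left, top-right, bottom-right, bottom-left.
    offsets = [(-half_w, -half_h), (half_w, -half_h), (half_w, half_h), (-half_w, half_h)]
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in offsets
    ]
```

For an unrotated block such as location [150, 80, 400, 80, 400, 120, 150, 120] with rotate_rect [275, 100, 250, 40, 0], both functions return the same four corners.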

Example:

{
  "id": "12345",
  "filename": "document.jpg",
  "content": "Line 1 text\nLine 2 text",
  "ocrResult": {
    "words_info": [
      {
        "text": "Line 1 text",
        "location": [150, 80, 400, 80, 400, 120, 150, 120],
        "rotate_rect": [275, 100, 250, 40, 0]
      },
      {
        "text": "Line 2 text",
        "location": [150, 150, 400, 150, 400, 190, 150, 190],
        "rotate_rect": [275, 170, 250, 40, 0]
      }
    ]
  },
  "format": "json",
  "timestamp": 1640995200000,
  "payload": "https://llmocr.com/api/advanced-recognition?key=YOUR_API_KEY"
}
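
A short Python sketch of reading such a response follows. It starts from an abbreviated copy of the example above as an already-decoded dictionary, rebuilds content from the individual text blocks, and converts the timestamp. That the timestamp is in milliseconds since the Unix epoch is an assumption based on the magnitude of the example value, not something this document states.

```python
from datetime import datetime, timezone

# Abbreviated copy of the example response, as decoded from JSON.
response = {
    "content": "Line 1 text\nLine 2 text",
    "ocrResult": {
        "words_info": [
            {"text": "Line 1 text", "location": [150, 80, 400, 80, 400, 120, 150, 120]},
            {"text": "Line 2 text", "location": [150, 150, 400, 150, 400, 190, 150, 190]},
        ]
    },
    "timestamp": 1640995200000,
}

blocks = response["ocrResult"]["words_info"]

# `content` is documented as all text blocks joined by newlines.
joined = "\n".join(block["text"] for block in blocks)

# Assumption: the timestamp is in milliseconds since the Unix epoch.
completed = datetime.fromtimestamp(response["timestamp"] / 1000, tz=timezone.utc)

for block in blocks:
    x1, y1 = block["location"][:2]  # top-left corner of the block
    print(f"{block['text']!r} starts at ({x1}, {y1})")
```

Under the millisecond assumption, the example timestamp corresponds to 2022-01-01 00:00:00 UTC.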