Advanced Recognition API

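Overview

The Advanced Recognition API provides high-accuracy text recognition with detailed position information. Unlike standard text recognition, it returns not only the extracted text but also the precise coordinates of each text block, expressed both as a rotated rectangle and as four corner points.

It uses a unified JSON request format and accepts either a URL reference or base64-encoded image data.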

Authentication

The API supports the following authentication method:

API key: pass your API key as a query parameter, ?key=YOUR_API_KEY
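
As a minimal sketch, the key can be appended to the endpoint URL as shown here. It assumes the endpoint URL that appears in this document's request examples; build_url is a hypothetical helper name used for illustration, not part of the API.

```python
from typing import Optional
from urllib.parse import urlencode

# Endpoint taken from the request examples in this document.
BASE_URL = "https://llmocr.com/api/advanced-recognition"


def build_url(api_key: Optional[str] = None) -> str:
    """Return the endpoint URL, adding ?key=... only when a key is supplied."""
    if not api_key:
        return BASE_URL
    # urlencode percent-escapes characters that are unsafe in a query string.
    return f"{BASE_URL}?{urlencode({'key': api_key})}"
```

Escaping the key with urlencode keeps the URL valid even if the key contains characters such as & or spaces.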

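Text extraction with position data

Extracts text from an image file and returns detailed position information for each text block, including a rotated rectangle and four corner points.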

Request

POST /api/advanced-recognition

Parameters:

Parameter	Type	Required	Description
document	object	Yes	Document object
document.type	string	Yes	Fixed value "image_url"
document.image_url	string	Yes	Image URL or base64 data
filename	string	No	File name (recommended when sending base64 data)
key	string	No	API key (query parameter; optional for logged-in users)

Examples:

Using an image URL:

curl -X POST "https://llmocr.com/api/advanced-recognition?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": {
      "type": "image_url",
      "image_url": "https://llmocr.com/image.jpg"
    }
  }'

Using base64 image data:

curl -X POST "https://llmocr.com/api/advanced-recognition?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": {
      "type": "image_url",
      "image_url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEA..."
    },
    "filename": "document.jpg"
  }'
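
The two request bodies above can also be built programmatically. The following is a sketch using only the Python standard library; the helper names (payload_from_url, payload_from_bytes, build_request) are hypothetical, and the MIME type is guessed from the file name, which is an assumption about what the service expects in the data URL.

```python
import base64
import json
import mimetypes
import urllib.request


def payload_from_url(image_url: str) -> dict:
    """Request body that references an image by URL."""
    return {"document": {"type": "image_url", "image_url": image_url}}


def payload_from_bytes(data: bytes, filename: str) -> dict:
    """Request body that embeds the image as a base64 data URL."""
    # Assumption: the MIME type can be inferred from the file extension.
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "document": {
            "type": "image_url",
            "image_url": f"data:{mime};base64,{encoded}",
        },
        "filename": filename,
    }


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST equivalent to the curl examples."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the prepared request to urllib.request.urlopen would send it; that step is left out here so the sketch can run without network access.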

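Response

Parameters:

Parameter	Type	Description
id	string	Database record ID
filename	string	File name
content	string	Extracted text content (all text blocks joined with newlines)
ocrResult	object	Detailed OCR result, including position information
format	string	Output format, fixed value "json"
timestamp	number	Timestamp at which processing completed
payload	string	API endpoint URL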

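ocrResult.words_info structure:

Each item in the words_info array contains:

Field	Type	Description
text	string	Text content of the block
location	number[]	Four corner points [x1,y1,x2,y2,x3,y3,x4,y4] (top-left → top-right → bottom-right → bottom-left)
rotate_rect	number[]	Rotated rectangle [center_x, center_y, width, height, angle]; angle range: [-90, 90]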
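
To make the flat location array easier to work with, it can be split into (x, y) pairs. The sketch below also derives an axis-aligned bounding box and the centroid of the four corners. The function names are hypothetical. rotate_rect is deliberately left alone, because the table above does not say in which direction the angle is measured.

```python
def location_to_points(location):
    """Split [x1,y1,...,x4,y4] into four (x, y) tuples in the documented order."""
    if len(location) != 8:
        raise ValueError("location must contain exactly 8 numbers")
    return [(location[i], location[i + 1]) for i in range(0, 8, 2)]


def bounding_box(location):
    """Axis-aligned box (min_x, min_y, max_x, max_y) enclosing the four points."""
    points = location_to_points(location)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def centroid(location):
    """Average of the four corner points."""
    points = location_to_points(location)
    return (sum(x for x, _ in points) / 4, sum(y for _, y in points) / 4)
```

For a location of [150, 80, 400, 80, 400, 120, 150, 120], the centroid is (275.0, 100.0), which matches the center_x and center_y of a rotate_rect of [275, 100, 250, 40, 0].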

Example:

{
  "id": "12345",
  "filename": "document.jpg",
  "content": "Line 1 text\nLine 2 text",
  "ocrResult": {
    "words_info": [
      {
        "text": "Line 1 text",
        "location": [150, 80, 400, 80, 400, 120, 150, 120],
        "rotate_rect": [275, 100, 250, 40, 0]
      },
      {
        "text": "Line 2 text",
        "location": [150, 150, 400, 150, 400, 190, 150, 190],
        "rotate_rect": [275, 170, 250, 40, 0]
      }
    ]
  },
  "format": "json",
  "timestamp": 1640995200000,
  "payload": "https://llmocr.com/api/advanced-recognition?key=YOUR_API_KEY"
}
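
A response like the one above could be consumed as in the following sketch. The helper names are hypothetical. Treating timestamp as milliseconds since the Unix epoch is an assumption based on the magnitude of the example value, not something the parameter table states.

```python
from datetime import datetime, timezone

# Trimmed copy of the example response above, as an already-decoded dict.
sample = {
    "content": "Line 1 text\nLine 2 text",
    "ocrResult": {
        "words_info": [
            {"text": "Line 1 text",
             "location": [150, 80, 400, 80, 400, 120, 150, 120],
             "rotate_rect": [275, 100, 250, 40, 0]},
            {"text": "Line 2 text",
             "location": [150, 150, 400, 150, 400, 190, 150, 190],
             "rotate_rect": [275, 170, 250, 40, 0]},
        ]
    },
    "timestamp": 1640995200000,
}


def extract_blocks(response: dict) -> list:
    """Return (text, location, rotate_rect) for every text block, if any."""
    words = response.get("ocrResult", {}).get("words_info", [])
    return [(w["text"], w["location"], w["rotate_rect"]) for w in words]


def completed_at(response: dict) -> datetime:
    """Convert the timestamp to UTC, ASSUMING it is in milliseconds."""
    return datetime.fromtimestamp(response["timestamp"] / 1000, tz=timezone.utc)


blocks = extract_blocks(sample)
# content is documented as all block texts joined with newlines.
joined = "\n".join(text for text, _, _ in blocks)
```

In this example, joined equals the content field, which is consistent with the description of content in the response parameter table.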
