ocr_image

ocr_image (read-only, extension)
OCR an image (server-side; requires an image URL or the file_id of an already-cached image).
Alias: .ocr_image

Call via POST http://127.0.0.1:3000/ocr_image (JSON params as the body) or WebSocket.

Parameters

Param Type Required Default Description
image string

Returns (data)

{ texts, language }: an array of recognized text blocks (each with a confidence score and coordinates) and the recognized language.

Every response is wrapped in the standard envelope: { "status": "ok", "retcode": 0, "data": ... }. The table below describes data.

Field Type Required Description
texts object[] Recognized text blocks
language string Recognized language
Raw JSON Schema
{
  "type": "object",
  "properties": {
    "texts": {
      "type": "array",
      "description": "Recognized text blocks",
      "items": {
        "type": "object",
        "properties": {
          "text": {
            "type": "string",
            "description": "Text content"
          },
          "confidence": {
            "type": "number",
            "description": "Confidence score"
          },
          "coordinates": {
            "type": "array",
            "description": "Vertex coordinates of the text box",
            "items": {
              "type": "object",
              "properties": {
                "x": {
                  "type": "number",
                  "description": "X coordinate"
                },
                "y": {
                  "type": "number",
                  "description": "Y coordinate"
                }
              },
              "required": [
                "x",
                "y"
              ]
            }
          }
        },
        "required": [
          "text",
          "confidence",
          "coordinates"
        ]
      }
    },
    "language": {
      "type": "string",
      "description": "Recognized language"
    }
  },
  },
  "required": [
    "texts",
    "language"
  ]
}
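
As a rough sketch of how a client might consume this payload, the Python snippet below unwraps the standard envelope and reduces each block's vertex list to an axis-aligned bounding box. The helper names (unwrap, bounding_box, summarize) and the sample response are illustrative assumptions, not part of the API. The failure check assumes that anything other than status "ok" with retcode 0 is an error, and the confidence values assume a 0 to 1 scale; this page confirms neither.

```python
def unwrap(response: dict) -> dict:
    """Return the `data` payload of a response envelope, or raise on failure."""
    # Assumption: anything other than status "ok" with retcode 0 is an error.
    if response.get("status") != "ok" or response.get("retcode") != 0:
        raise RuntimeError(f"ocr_image failed: {response!r}")
    return response["data"]


def bounding_box(coordinates: list):
    """Collapse one block's vertices into (min_x, min_y, max_x, max_y).

    Returns None for an empty vertex list, which the schema does not rule out.
    """
    if not coordinates:
        return None
    xs = [point["x"] for point in coordinates]
    ys = [point["y"] for point in coordinates]
    return (min(xs), min(ys), max(xs), max(ys))


def summarize(data: dict, min_confidence: float = 0.0) -> list:
    """List (text, confidence, bounding box) for blocks at or above a floor."""
    return [
        (block["text"], block["confidence"], bounding_box(block["coordinates"]))
        for block in data["texts"]
        if block["confidence"] >= min_confidence
    ]


# Hypothetical response shaped after the schema above; every value is invented.
SAMPLE = {
    "status": "ok",
    "retcode": 0,
    "data": {
        "texts": [
            {
                "text": "Hello",
                "confidence": 0.98,
                "coordinates": [
                    {"x": 10, "y": 20}, {"x": 110, "y": 20},
                    {"x": 110, "y": 50}, {"x": 10, "y": 50},
                ],
            },
            {
                "text": "w0rld",
                "confidence": 0.41,
                "coordinates": [
                    {"x": 120, "y": 22}, {"x": 200, "y": 22},
                    {"x": 200, "y": 52}, {"x": 120, "y": 52},
                ],
            },
        ],
        "language": "en",
    },
}

if __name__ == "__main__":
    for text, confidence, box in summarize(unwrap(SAMPLE), min_confidence=0.5):
        print(text, confidence, box)
```

Filtering on confidence before using the text is a design choice rather than something the API requires; a suitable threshold depends on the scale the server actually reports.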

Examples

curl
curl -X POST http://127.0.0.1:3000/ocr_image \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <access-token>' \
  -d '{"image":""}'
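
For readers who prefer Python, the same request might look roughly like the sketch below, which uses only the standard library. The function names build_ocr_request and ocr_image (the Python wrapper, not the action itself) are assumptions made for illustration, as is the 30-second timeout; the URL, headers, and body shape mirror the curl command above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:3000"  # address taken from the curl example


def build_ocr_request(image: str, token: str,
                      base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for the ocr_image action."""
    body = json.dumps({"image": image}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ocr_image",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def ocr_image(image: str, token: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON envelope."""
    request = build_ocr_request(image, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect or log before it goes over the wire. The image argument takes an image URL or a cached file_id, as described at the top of this page.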
Using an AI assistant?

MCP-capable clients can discover and call this action directly — no hand-written HTTP. See MCP.