Structured Extraction API
Overview
The Structured Extraction API extracts key information from a PDF file and returns it as structured data. The fields to extract are defined by a custom JSON Schema supplied in the request.
Primary Use Cases
- Resume information extraction from PDFs
- Key information extraction from contract documents
- Structured data extraction from invoice PDFs
- Organizing information from report documents
- Digitization of certificate documents
API Endpoint
POST https://freetier-01.cn-hangzhou.cluster.matrixonecloud.cn/byoa/api/v1/explore/extract
Request Headers
| Parameter | Type | Required | Description |
|---|---|---|---|
| moi-key | String | Yes | API key |
| Content-Type | String | Yes | Fixed value: application/json |
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_path | String | Yes | URL of the PDF file |
| json_schema | Object | Yes | JSON Schema that defines the fields to extract |
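As a rough illustration of how the two request parameters fit together, the sketch below assembles a request body from a plain mapping of field names to descriptions. The helper `build_payload` and its arguments are hypothetical names introduced here, not part of the API, and the sketch assumes every field is a required string, as in the example in this document.

```python
def build_payload(file_url, fields, title="ExtractInfo"):
    """Assemble a request body for the extract endpoint.

    `fields` maps each field name to a human-readable description.
    Assumption: every field is a string and is marked as required.
    """
    properties = {
        name: {"type": "string", "description": description}
        for name, description in fields.items()
    }
    return {
        "file_path": file_url,
        "json_schema": {
            "title": title,
            "type": "object",
            "properties": properties,
            "required": list(fields),
        },
    }


payload = build_payload(
    "http://120.26.117.79:8080/files/xiyou.pdf",
    {"email": "Contact email", "book_name": "Book name"},
)
```

The resulting `payload` dictionary can be passed as the JSON body of the POST request. Fields of other types, or optional fields, would need a richer input than this simple mapping.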
Request Example
Python Example
import requests

url = "https://freetier-01.cn-hangzhou.cluster.matrixonecloud.cn/byoa/api/v1/explore/extract"

# Replace "your-api-key" with your actual API key.
headers = {
    "moi-key": "your-api-key",
    "Content-Type": "application/json"
}

data = {
    # URL of the PDF file to process
    "file_path": "http://120.26.117.79:8080/files/xiyou.pdf",
    # JSON Schema describing the fields to extract
    "json_schema": {
        "title": "ExtractInfo",
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "description": "Contact email"
            },
            "book_name": {
                "type": "string",
                "description": "Book name"
            }
        },
        "required": [
            "email",
            "book_name"
        ]
    }
}

# json=data serializes the dictionary as the JSON request body.
response = requests.post(url, headers=headers, json=data)
print(response.json())
cURL Example
curl --location --request POST 'https://freetier-01.cn-hangzhou.cluster.matrixonecloud.cn/byoa/api/v1/explore/extract' \
  --header 'moi-key: your-api-key' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "file_path": "http://120.26.117.79:8080/files/xiyou.pdf",
    "json_schema": {
      "title": "ExtractInfo",
      "type": "object",
      "properties": {
        "email": {
          "type": "string",
          "description": "Contact email"
        },
        "book_name": {
          "type": "string",
          "description": "Book name"
        }
      },
      "required": [
        "email",
        "book_name"
      ]
    }
  }'
Response Format
{
  "code": "ok",
  "msg": "ok",
  "data": {
    "req_id": "607654fa-f12b-4415-a4c4-38dd07c7926e",
    "msg": "success",
    "file_path": "http://120.26.117.79:8080/files/xiyou.pdf",
    "file_size_bytes": 1562886,
    "results": {
      "email": "466698432@qq.com",
      "book_name": "Journey to the West"
    }
  }
}
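
One way to consume a response of this shape is sketched below. The function name `get_results` is hypothetical, and the error handling is an assumption: this document shows only a successful response, so the sketch simply treats any top-level `code` other than "ok" as a failure.

```python
def get_results(body):
    """Return the extracted fields from a decoded response body (a dict).

    Assumption: a top-level "code" other than "ok" signals failure; the
    error format is not documented here, so "msg" is only echoed back.
    """
    if body.get("code") != "ok":
        raise RuntimeError(f"extraction failed: {body.get('msg')}")
    return body["data"]["results"]


# Sample body copied from the response shown in this document.
sample = {
    "code": "ok",
    "msg": "ok",
    "data": {
        "req_id": "607654fa-f12b-4415-a4c4-38dd07c7926e",
        "msg": "success",
        "file_path": "http://120.26.117.79:8080/files/xiyou.pdf",
        "file_size_bytes": 1562886,
        "results": {
            "email": "466698432@qq.com",
            "book_name": "Journey to the West",
        },
    },
}

results = get_results(sample)
print(results["book_name"])  # prints: Journey to the West
```

The keys of `results` mirror the `properties` declared in the request's `json_schema`, so the caller can read each extracted field by the name it chose.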