DeepSeek-OCR Model

DeepSeek-OCR is an advanced OCR model that recognizes text in images and converts it into a specified text format.

Request Examples

You can use the DeepSeek-OCR model by sending requests to the https://api.modelverse.cn/v1/chat/completions endpoint.

Note: DeepSeek-OCR accepts a max_tokens value of up to 8192. The model is currently free to use.

Important: The model accepts only base64-encoded images as input (the "data:image/…" format). Passing a remote image address directly in image_url is not supported. If your image is hosted at a remote address, you can obtain its base64 string in one step with the following command:

    curl -s https://umodelverse-inference.cn-wlcb.ufileos.com/亿恩科技-maxcot.jpg | base64 | tr -d '\n'

Non-Streaming Request

cURL

    curl https://api.modelverse.cn/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $YOUR_API_KEY" \
      -d '{
        "model": "deepseek-ai/DeepSeek-OCR",
        "messages": [
          {
            "role": "user",
            "content": [
              {
                "type": "text",
                "text": "convert to markdown"
              },
              {
                "type": "image_url",
                "image_url": {
                  "url": "data:image/jpeg;base64,'$(curl -s https://umodelverse-inference.cn-wlcb.ufileos.com/亿恩科技-maxcot.jpg | base64 | tr -d '\n')'"
                }
              }
            ]
          }
        ]
      }'

Python

    import base64
    import os

    from openai import OpenAI


    # Function to encode the image
    def encode_image(image_path):
        with open(image_path, "rb") as image_file:
            return base64.b64encode(image_file.read()).decode("utf-8")


    # Path to your image
    image_path = os.path.expanduser("亿恩科技.png")

    # Getting the base64 string
    base64_image = encode_image(image_path)

    client = OpenAI(
        api_key=os.getenv("MODELVERSE_API_KEY", "<YOUR_MODELVERSE_API_KEY>"),
        base_url="https://api.modelverse.cn/v1/",
    )

    response = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-OCR",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "convert to markdown"},
                    {
                        "type": "image_url",
                        # The MIME type should match the image file (a PNG here)
                        "image_url": {"url": f"data:image/png;base64,{base64_image}"},
                    },
                ],
            }
        ],
    )

    print(response.choices[0].message.content)

Streaming Request

Set the stream parameter to true to receive a streaming response.

cURL

    curl https://api.modelverse.cn/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $YOUR_API_KEY" \
      -d '{
        "model": "deepseek-ai/DeepSeek-OCR",
        "messages": [
          {
            "role": "user",
            "content": [
              {
                "type": "text",
                "text": "convert to markdown"
              },
              {
                "type": "image_url",
                "image_url": {
                  "url": "data:image/jpeg;base64,'$(curl -s https://umodelverse-inference.cn-wlcb.ufileos.com/亿恩科技-maxcot.jpg | base64 | tr -d '\n')'"
                }
              }
            ]
          }
        ],
        "stream": true
      }'

Python

    import base64
    import os

    from openai import OpenAI


    # Function to encode the image
    def encode_image(image_path):
        with open(image_path, "rb") as image_file:
            return base64.b64encode(image_file.read()).decode("utf-8")


    # Path to your image
    image_path = os.path.expanduser("亿恩科技.png")

    # Getting the base64 string
    base64_image = encode_image(image_path)

    client = OpenAI(
        api_key=os.getenv("MODELVERSE_API_KEY", "<YOUR_MODELVERSE_API_KEY>"),
        base_url="https://api.modelverse.cn/v1/",
    )

    stream = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-OCR",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "convert to markdown"},
                    {
                        "type": "image_url",
                        # The MIME type should match the image file (a PNG here)
                        "image_url": {"url": f"data:image/png;base64,{base64_image}"},
                    },
                ],
            }
        ],
        stream=True,
    )

    for chunk in stream:
        # Skip any chunk that carries no choices before reading the delta
        if chunk.choices:
            print(chunk.choices[0].delta.content or "", end="")
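
Because the model only accepts base64 data URLs and limits max_tokens to 8192, it can be convenient to keep both rules in one small helper. The following is a minimal sketch, not part of the official SDK: the names `to_data_url`, `build_request`, and `MAX_TOKENS_LIMIT` are hypothetical, and it assumes the MIME type can be inferred from the file extension.

```python
import base64
import mimetypes

# Upper bound for max_tokens as stated in this document (hypothetical constant name)
MAX_TOKENS_LIMIT = 8192


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Encode raw image bytes as a data URL, guessing the MIME type from the filename."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_request(data_url: str,
                  prompt: str = "convert to markdown",
                  max_tokens: int = MAX_TOKENS_LIMIT) -> dict:
    """Build keyword arguments for chat.completions.create, clamping max_tokens."""
    return {
        "model": "deepseek-ai/DeepSeek-OCR",
        "max_tokens": min(max_tokens, MAX_TOKENS_LIMIT),
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": data_url}},
                ],
            }
        ],
    }
```

With a client configured as in the Python examples, the result could be passed as `client.chat.completions.create(**build_request(to_data_url(data, "scan.png")))`, where `data` holds the image bytes read from disk or downloaded beforehand.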