Document AI - OCR Processor

The Mistral Document AI API includes a Document OCR (Optical Character Recognition) processor, powered by our latest OCR model, mistral-ocr-latest, which extracts text and structured content from PDF documents and images.

Before You Start

Key Features

Extracts text content while maintaining document structure and hierarchy
Preserves formatting like headers, paragraphs, lists and tables
Returns results in markdown format for easy parsing and rendering
Handles complex layouts including multi-column text and mixed content
Processes documents at scale with high accuracy
Supports multiple document formats including:
- image_url: png, jpeg/jpg, avif and more...
- document_url: pdf, pptx, docx and more...
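
As a rough illustration of the two input types listed above, the sketch below picks `image_url` or `document_url` from a URL's file extension. The helper name `build_document_payload` and the extension sets are assumptions made for this example; they cover only the formats named in the list, not the "and more..." ones.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the list above; other supported formats are not covered here.
IMAGE_EXTENSIONS = {".png", ".jpeg", ".jpg", ".avif"}
DOCUMENT_EXTENSIONS = {".pdf", ".pptx", ".docx"}


def build_document_payload(url: str) -> dict:
    """Hypothetical helper: choose the payload type from the URL's file extension."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return {"type": "image_url", "image_url": url}
    if suffix in DOCUMENT_EXTENSIONS:
        return {"type": "document_url", "document_url": url}
    # URLs without a recognizable extension need the type chosen by hand.
    raise ValueError(f"Unrecognized extension {suffix!r}; set the type explicitly")


payload = build_document_payload("https://example.com/report.pdf")
```

A URL such as https://arxiv.org/pdf/2201.04234 has no recognizable extension, so this helper raises and the type has to be set explicitly, as in the PDF example further down.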

The OCR processor returns the extracted text content, image bounding boxes and metadata about the document structure, making it easy to work with the recognized content programmatically.
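
To make the shape of that result more concrete, here is a minimal sketch that collects the markdown and image bounding boxes from a response. The attribute names (`pages`, `index`, `markdown`, `images`, `id`, `top_left_x` and so on) are assumptions about the response object and should be checked against the SDK version you use; the `fake_response` stand-in exists only so the example runs without an API call.

```python
from types import SimpleNamespace


def summarize_ocr_response(ocr_response):
    """Return (full_markdown, bounding_boxes) from an OCR response-like object.

    Assumes each page exposes .index, .markdown and .images, and each image
    exposes .id plus four corner coordinates (assumed field names).
    """
    markdown_parts = []
    boxes = []
    for page in ocr_response.pages:
        markdown_parts.append(page.markdown)
        for image in page.images:
            boxes.append({
                "page": page.index,
                "id": image.id,
                "bbox": (image.top_left_x, image.top_left_y,
                         image.bottom_right_x, image.bottom_right_y),
            })
    return "\n\n".join(markdown_parts), boxes


# Stand-in object with the assumed shape, used instead of a real API response.
fake_response = SimpleNamespace(pages=[
    SimpleNamespace(index=0, markdown="# Title", images=[
        SimpleNamespace(id="img-0.jpeg", top_left_x=10, top_left_y=20,
                        bottom_right_x=110, bottom_right_y=220),
    ]),
    SimpleNamespace(index=1, markdown="Body text", images=[]),
])

full_markdown, image_boxes = summarize_ocr_response(fake_response)
```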

OCR with Images and PDFs

OCR your Documents

We provide different methods to OCR your documents. You can run OCR on either a PDF or an image.

PDFs

For PDFs, you can pass a publicly available URL, pass a Base64-encoded PDF, or upload the PDF to our cloud.

Be sure the URL is public and accessible by our API.

import os
from mistralai import Mistral

api_key = os.environ["MISTRAL_API_KEY"]

client = Mistral(api_key=api_key)

ocr_response = client.ocr.process(
    model="mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": "https://arxiv.org/pdf/2201.04234"
    },
    include_image_base64=True
)
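
For the Base64 option mentioned above, one possible approach is to wrap the encoded bytes in a data URI and pass it in the same `document_url` field. This is a sketch under that assumption; `pdf_bytes_to_document` is a hypothetical helper, not part of the SDK.

```python
import base64


def pdf_bytes_to_document(pdf_bytes: bytes) -> dict:
    """Hypothetical helper: wrap raw PDF bytes in a Base64 data URI payload."""
    encoded = base64.b64encode(pdf_bytes).decode("utf-8")
    return {
        "type": "document_url",
        # Assumption: the API accepts a data URI wherever it accepts a public URL.
        "document_url": f"data:application/pdf;base64,{encoded}",
    }


# In practice the bytes would come from a file, e.g. open(path, "rb").read().
document = pdf_bytes_to_document(b"%PDF-1.4")
```

The resulting dictionary can then be passed as the `document` argument of `client.ocr.process` in place of the URL-based one shown above.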

Images

To perform OCR on an image, you can either pass a URL to the image or pass a Base64-encoded image directly.

You can perform OCR on any publicly available image as long as a direct URL is available.

import os
from mistralai import Mistral

api_key = os.environ["MISTRAL_API_KEY"]

client = Mistral(api_key=api_key)

ocr_response = client.ocr.process(
    model="mistral-ocr-latest",
    document={
        "type": "image_url",
        "image_url": "https://raw.githubusercontent.com/mistralai/cookbook/refs/heads/main/mistral/ocr/receipt.png"
    },
    include_image_base64=True
)
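
The Base64 route for images can be sketched in the same spirit: encode the bytes, infer the MIME type from the file name, and place the resulting data URI in `image_url`. The helper name `image_bytes_to_document` is hypothetical, and the data URI form is an assumption to verify against the API reference.

```python
import base64
import mimetypes


def image_bytes_to_document(image_bytes: bytes, filename: str) -> dict:
    """Hypothetical helper: build an image_url payload from raw image bytes."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type from {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    # Assumption: the API accepts a data URI in place of a public image URL.
    return {"type": "image_url", "image_url": f"data:{mime_type};base64,{encoded}"}


# Placeholder bytes; a real call would read them from the image file.
image_document = image_bytes_to_document(b"abc", "receipt.png")
```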

Cookbooks

For more information and guides on how to make use of OCR, we have the following cookbooks:

FAQ