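Basic Usage

Module Overview

Full OCR pipeline: PaddleOCR combines text detection, text recognition, and optional text direction classification, so it can locate text regions in an image and recognize their content automatically.
End-to-end solution: PaddleOCR covers the whole flow from an input image to structured OCR output.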

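Key Features

Multilingual support: beyond Chinese, it supports English, French, German, and other languages, which suits internationalized applications.
High accuracy and robustness: deep learning techniques such as CRNN with CTC decoding keep recognition accuracy high even in difficult conditions such as natural scenes and handwriting.
Easy to use: a simple API lets developers integrate it into applications quickly.
Extensibility: it is built on the PaddlePaddle platform, so users can fine-tune the models or train custom OCR models as needed.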

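Supported Scenarios

Document recognition: automatic recognition and information extraction for printed documents such as contracts, invoices, and ID documents.
Scene text detection: detecting text against complex backgrounds, such as street views, billboards, and traffic signs.
Mobile applications: the lightweight models are suitable for real-time text recognition on mobile devices.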

Installation

pip

# CPU version
pip install setuptools paddlepaddle paddleocr

pdm

pdm add setuptools paddlepaddle paddleocr

# Check that setuptools is installed
pdm run python -c "import setuptools; print(setuptools.__version__)"
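
As a complement to the setuptools check above, the following sketch looks up the installed version of all three packages from Python, using the standard library's importlib.metadata. The helper name installed_versions is hypothetical; the distribution names are the ones used in the install commands.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if it is missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    found = installed_versions(["setuptools", "paddlepaddle", "paddleocr"])
    for name, version in found.items():
        print(f"{name}: {version or 'not installed'}")
```

Under pdm, run the script with pdm run python so that it inspects the project environment rather than the system interpreter.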

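Basic Usage

# 1. Create an instance
from paddleocr import PaddleOCR

ocr = PaddleOCR(
    use_gpu=False,         # whether to use the GPU (requires the GPU build of paddlepaddle)
    lang='ch',             # recognition language: 'ch' (Chinese), 'en' (English), etc.
    det_model_dir=None,    # path to a custom detection model
    rec_model_dir=None,    # path to a custom recognition model
    use_angle_cls=True,    # whether to enable text direction classification
    enable_mkldnn=True,    # enable MKL-DNN CPU acceleration (requires an Intel processor)
    show_log=False         # whether to print DEBUG log output
)

# 2. Run recognition
result = ocr.ocr("test.jpg", cls=True)

# 3. Print the result
print("result: ", result)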
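
The value printed above is a nested list, which is awkward to read directly. As a minimal sketch, the hypothetical helper extract_texts below pulls out only the recognized strings. It assumes the per-image layout used by the mock data later in this document (an outer list with one entry per image), and it also assumes that an entry may be None or empty when nothing is detected; that second point is an assumption about PaddleOCR 2.x behavior, not something the source states.

```python
from typing import Any, List, Optional, Sequence


def extract_texts(result: Optional[Sequence[Any]], min_confidence: float = 0.0) -> List[str]:
    """Collect recognized strings whose confidence is at least min_confidence."""
    texts: List[str] = []
    for page in result or []:          # one entry per input image
        for _box, (text, confidence) in page or []:   # page may be None
            if confidence >= min_confidence:
                texts.append(text)
    return texts


if __name__ == "__main__":
    sample = [
        [
            [[[3599.0, 108.0], [3872.0, 91.0], [3882.0, 249.0], [3609.0, 266.0]], ("11", 0.9978)],
            [[[2970.0, 298.0], [3578.0, 298.0], [3578.0, 438.0], [2970.0, 438.0]], ("BS·22·0l- 00 2", 0.8297)],
        ]
    ]
    print(extract_texts(sample, min_confidence=0.9))
```

Raising min_confidence is a simple way to drop doubtful reads such as the second sample entry.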

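Result Format

[
    [
        # Coordinates of the four vertices of the text region (in a fixed order, usually clockwise or counterclockwise)
        [
            [3599.0, 108.0],
            [3872.0, 91.0],
            [3882.0, 249.0],
            [3609.0, 266.0]
        ],
        # Tuple containing the recognized text and its confidence
        ('11', 0.997830867767334)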
    ],
    ......,
    [
        [
            [1691.0, 4459.0],
            [2274.0, 4473.0],
            [2271.0, 4613.0],
            [1688.0, 4600.0]
        ],
        ('2222222222222222', 0.9998730421066284)
    ]
]

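Structure

The result for one image is a list in which each element represents one detected text region. (Depending on the PaddleOCR version, ocr.ocr() may wrap this list in an outer list with one entry per input image; the code example in this document reads it from result[0].) Each element has two parts:

First part (coordinate list)
- A list containing four coordinate points.
- Each point is a two-element list; for example, [3599.0, 108.0] is the point's (x, y) coordinate.
- The four points are usually ordered clockwise or counterclockwise and outline the rectangle or quadrilateral that encloses the detected text.
Second part (text and confidence tuple)
- A tuple of the form (text, confidence).
- Text: the recognized string, such as '序号' or '档号'.
- Confidence: a float giving the confidence score of the recognition; the closer it is to 1, the more confident the model is in the result.

This structure gives you both the position of the text (useful for further localization or for drawing detection boxes) and the recognized content together with its confidence.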
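
Because the four points can form a slightly rotated quadrilateral, code that needs an axis-aligned rectangle (for cropping, for instance) has to take the minimum and maximum of the coordinates. The sketch below does that for the first box of the sample result; quad_to_bbox is a hypothetical helper, not a PaddleOCR function.

```python
from typing import Sequence, Tuple


def quad_to_bbox(quad: Sequence[Sequence[float]]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) of the rectangle enclosing the points."""
    xs = [point[0] for point in quad]
    ys = [point[1] for point in quad]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    quad = [[3599.0, 108.0], [3872.0, 91.0], [3882.0, 249.0], [3609.0, 266.0]]
    print(quad_to_bbox(quad))  # (3599.0, 91.0, 3882.0, 266.0)
```

The helper does not depend on the order of the points, so it works whether they are listed clockwise or counterclockwise.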

Code Examples

With pydantic

from typing import List, Tuple

from paddleocr import PaddleOCR
from pydantic import BaseModel


class Points(BaseModel):
    x: float
    y: float


class PaddleOCRTextPoints(BaseModel):
    left_top: Points
    right_top: Points
    left_buttom: Points
    right_buttom: Points
    center: Points


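# Data model for a single OCR recognition result
class PaddleOCRResultT(BaseModel):
    points: PaddleOCRTextPoints  # the four corner points and the center of the text box
    text: str  # recognized text
    confidence: float  # recognition confidence

# Mock result, assuming ./test.jpg is not available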
test_result = [
    [
        [[[3599.0, 108.0], [3872.0, 91.0], [3882.0, 249.0], [3609.0, 266.0]], ("11", 0.997830867767334)],
        [[[3110.0, 130.0], [3468.0, 115.0], [3474.0, 244.0], [3116.0, 259.0]], ("22", 0.9990027546882629)],
        [[[2970.0, 298.0], [3578.0, 298.0], [3578.0, 438.0], [2970.0, 438.0]], ("BS·22·0l- 00 2", 0.8297496438026428)],
        [[[3714.0, 351.0], [3761.0, 351.0], [3761.0, 409.0], [3714.0, 409.0]], ("2", 0.9975185394287109)],
        [
            [[744.0, 1152.0], [3354.0, 1152.0], [3354.0, 1269.0], [744.0, 1269.0]],
            ("3333333333333", 0.9934383034706116),
        ],
        [[[1199.0, 1338.0], [2793.0, 1351.0], [2792.0, 1491.0], [1198.0, 1479.0]], ("4444", 0.9937556982040405)],
        [[[1512.0, 1806.0], [2533.0, 1806.0], [2533.0, 1970.0], [1512.0, 1970.0]], ("55555", 0.9997148513793945)],
        [
            [[1287.0, 4267.0], [2681.0, 4267.0], [2681.0, 4384.0], [1287.0, 4384.0]],
            ("66666666666666666666", 0.989166259765625),
        ],
        [[[1691.0, 4459.0], [2274.0, 4473.0], [2271.0, 4613.0], [1688.0, 4600.0]], ("2018年3月", 0.9998730421066284)],
    ]
]


def format_paddle_ocr_result(result: List) -> List[PaddleOCRResultT]:
    # result holds one entry per image; treat a missing or empty entry as "no text found"
    if not result or not result[0]:
        return []

    formatted_results = []

    # Each entry is expected to be [four corner points, (text, confidence)]
    if len(result[0][0]) == 2:
        for xy, content in result[0]:
            text, confidence = content
            # Center of the box: the mean of the four corner coordinates
            center_x = sum(point[0] for point in xy) / 4
            center_y = sum(point[1] for point in xy) / 4
            # Build a Points object for each corner
            # (order taken as top-left, top-right, bottom-right, bottom-left)
            left_top = Points(x=xy[0][0], y=xy[0][1])
            right_top = Points(x=xy[1][0], y=xy[1][1])
            right_buttom = Points(x=xy[2][0], y=xy[2][1])
            left_buttom = Points(x=xy[3][0], y=xy[3][1])
            center = Points(x=center_x, y=center_y)
            # Build the PaddleOCRTextPoints object
            points_obj = PaddleOCRTextPoints(
                left_top=left_top,
                right_top=right_top,
                left_buttom=left_buttom,
                right_buttom=right_buttom,
                center=center,
            )
            # Build the PaddleOCRResultT object and append it to the list
            formatted_results.append(PaddleOCRResultT(points=points_obj, text=text, confidence=confidence))

    return formatted_results

# Usage
if __name__ == "__main__":
    tar = "./test.jpg"
    search_str = "66666666666666666666"

    # With a real image, run OCR and format its output:
    # ocr = PaddleOCR(use_angle_cls=True, lang="ch", show_log=False)
    # result = format_paddle_ocr_result(ocr.ocr(tar, cls=True))
    # The mock result is used here instead, since ./test.jpg is assumed to be missing.
    result = format_paddle_ocr_result(test_result)

    for each in result:
        if search_str in each.text:
            # On pydantic v2, model_dump() replaces the deprecated dict()
            print("Found string: ", each.dict())
            break

Formatting the Result

def format_paddle_ocr_result(result: List) -> List[PaddleOCRResultT]:
    # result holds one entry per image; treat a missing or empty entry as "no text found"
    if not result or not result[0]:
        return []

    formatted_results = []

    # Each entry is expected to be [four corner points, (text, confidence)]
    if len(result[0][0]) == 2:
        for xy, content in result[0]:
            text, confidence = content
            # Center of the box: the mean of the four corner coordinates
            center_x = sum(point[0] for point in xy) / 4
            center_y = sum(point[1] for point in xy) / 4
            # Build a Points object for each corner
            # (order taken as top-left, top-right, bottom-right, bottom-left)
            left_top = Points(x=xy[0][0], y=xy[0][1])
            right_top = Points(x=xy[1][0], y=xy[1][1])
            right_buttom = Points(x=xy[2][0], y=xy[2][1])
            left_buttom = Points(x=xy[3][0], y=xy[3][1])
            center = Points(x=center_x, y=center_y)
            # Build the PaddleOCRTextPoints object
            points_obj = PaddleOCRTextPoints(
                left_top=left_top,
                right_top=right_top,
                left_buttom=left_buttom,
                right_buttom=right_buttom,
                center=center,
            )
            # Build the PaddleOCRResultT object and append it to the list
            formatted_results.append(PaddleOCRResultT(points=points_obj, text=text, confidence=confidence))

    return formatted_results
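
The center point computed by the formatter is also handy for ordering regions. The following is a rough sketch, not part of PaddleOCR: it works on the raw [points, (text, confidence)] entries, groups regions whose centers lie within a vertical tolerance into the same row, and reads each row from left to right. The names center_of and reading_order and the default row_tolerance of 50 pixels are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def center_of(quad: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Mean of the corner coordinates, as in the formatter above."""
    n = len(quad)
    return sum(p[0] for p in quad) / n, sum(p[1] for p in quad) / n


def reading_order(regions: Sequence[Sequence], row_tolerance: float = 50.0) -> List[str]:
    """Return texts top to bottom, and left to right within each row."""
    items = sorted(
        ((center_of(quad), text) for quad, (text, _conf) in regions),
        key=lambda item: item[0][1],  # sort by center y
    )
    rows: List[Tuple[float, List[Tuple[float, str]]]] = []
    for (cx, cy), text in items:
        # Join the current row if this center is close to the row's first center.
        if rows and abs(cy - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append((cx, text))
        else:
            rows.append((cy, [(cx, text)]))
    return [text for _cy, row in rows for _cx, text in sorted(row)]


if __name__ == "__main__":
    regions = [
        [[[3599.0, 108.0], [3872.0, 91.0], [3882.0, 249.0], [3609.0, 266.0]], ("11", 0.9978)],
        [[[3110.0, 130.0], [3468.0, 115.0], [3474.0, 244.0], [3116.0, 259.0]], ("22", 0.9990)],
        [[[1512.0, 1806.0], [2533.0, 1806.0], [2533.0, 1970.0], [1512.0, 1970.0]], ("55555", 0.9997)],
    ]
    print(reading_order(regions))  # ['22', '11', '55555']
```

A fixed pixel tolerance is a simplification; for images of varying resolution, a tolerance derived from the box heights would likely be more robust.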
