Clearly, PaddleOCR does not perform well in scenarios other than Chinese and English. We plan to incorporate additional OCR methods in the future to improve recognition quality for non-Chinese and non-English texts. However, due to limited development resources, this process may take some time.
Description of the bug | 错误描述
Hello,
First of all, thank you for MinerU.
However, recognition of German text is very poor.
The German umlauts and the eszett (ö, ü, ä, ß) are not recognized.
I have set the language to "german", but nothing really changed.
For example:
This is in the text: Köln Dünnwald
This is the output: K8ln Dünnwald
In addition:
This is in the text: Köln-Dünnwald
This is the output: Koln-Dunnwald
Maybe you have a solution for this problem.
Thank you!
How to reproduce the bug | 如何复现
You can reproduce it with any German-language document that contains umlauts or ß.
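To make the failure easier to quantify when reproducing, the two examples above can be checked with a small script. This is only a sketch using the Python standard library, not part of MinerU or PaddleOCR; the helper names `strip_diacritics` and `classify_ocr_word` are hypothetical, and the sample pairs are the ones quoted in this report.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks (ö -> o). ß has no decomposition and is kept."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify_ocr_word(expected: str, actual: str) -> str:
    """Label an OCR result as 'ok', 'diacritics_lost', or 'misrecognized'."""
    # Normalize both sides so composed and decomposed umlauts compare equal.
    expected = unicodedata.normalize("NFC", expected)
    actual = unicodedata.normalize("NFC", actual)
    if actual == expected:
        return "ok"
    if actual == strip_diacritics(expected):
        return "diacritics_lost"
    return "misrecognized"


# (expected text, OCR output) pairs taken from the report above.
samples = [
    ("Köln Dünnwald", "K8ln Dünnwald"),
    ("Köln-Dünnwald", "Koln-Dunnwald"),
]

for expected, actual in samples:
    print(f"{expected!r} -> {actual!r}: {classify_ocr_word(expected, actual)}")
```

The first pair is labeled `misrecognized` (ö became 8) and the second `diacritics_lost` (all umlauts were reduced to their base letters), which separates the two kinds of error seen in the output.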
Operating system | 操作系统
Windows
Python version | Python 版本
3.10
Software version | 软件版本 (magic-pdf --version)
0.10.x
Device mode | 设备模式
cuda