From a531f9de1020d5d252bc2a97d1abb5ebaca89425 Mon Sep 17 00:00:00 2001 From: Muhammad Adil Date: Sun, 7 Jun 2026 03:33:06 +0000 Subject: [PATCH] Add 4 ocr python tutorials MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Categories: general Source: AI Search API Tutorials: - Perform OCR on Image with Aspose OCR & LLM – Complete Guide - Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide - OCR Character Set – Retrieve and Use OCR Engine in Python - recognize text from image using Aspose OCR & AI Auto-generated by Professionalize.Tutorials Agent --- .../_index.md | 215 +++++++++++++ .../_index.md | 242 +++++++++++++++ .../_index.md | 262 ++++++++++++++++ .../_index.md | 284 ++++++++++++++++++ .../_index.md | 210 +++++++++++++ .../_index.md | 239 +++++++++++++++ .../_index.md | 254 ++++++++++++++++ .../_index.md | 280 +++++++++++++++++ .../_index.md | 216 +++++++++++++ .../_index.md | 241 +++++++++++++++ .../_index.md | 237 +++++++++++++++ .../_index.md | 282 +++++++++++++++++ .../_index.md | 216 +++++++++++++ .../_index.md | 231 ++++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 284 ++++++++++++++++++ .../_index.md | 217 +++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 257 ++++++++++++++++ .../_index.md | 284 ++++++++++++++++++ .../_index.md | 216 +++++++++++++ .../_index.md | 243 +++++++++++++++ .../_index.md | 261 ++++++++++++++++ .../_index.md | 284 ++++++++++++++++++ .../_index.md | 215 +++++++++++++ .../_index.md | 258 ++++++++++++++++ .../_index.md | 254 ++++++++++++++++ .../_index.md | 283 +++++++++++++++++ .../_index.md | 215 +++++++++++++ .../_index.md | 218 ++++++++++++++ .../_index.md | 236 +++++++++++++++ .../_index.md | 284 ++++++++++++++++++ .../_index.md | 213 +++++++++++++ .../_index.md | 228 ++++++++++++++ .../_index.md | 236 +++++++++++++++ .../_index.md | 283 +++++++++++++++++ .../_index.md | 210 +++++++++++++ .../_index.md | 228 ++++++++++++++ 
.../_index.md | 260 ++++++++++++++++ .../_index.md | 281 +++++++++++++++++ .../_index.md | 215 +++++++++++++ .../_index.md | 232 ++++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 283 +++++++++++++++++ .../_index.md | 216 +++++++++++++ .../_index.md | 218 ++++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 277 +++++++++++++++++ .../_index.md | 214 +++++++++++++ .../_index.md | 257 ++++++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 284 ++++++++++++++++++ .../_index.md | 210 +++++++++++++ .../_index.md | 254 ++++++++++++++++ .../_index.md | 254 ++++++++++++++++ .../_index.md | 281 +++++++++++++++++ .../_index.md | 214 +++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 254 ++++++++++++++++ .../_index.md | 282 +++++++++++++++++ .../_index.md | 216 +++++++++++++ .../_index.md | 257 ++++++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 284 ++++++++++++++++++ .../_index.md | 214 +++++++++++++ .../_index.md | 232 ++++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 283 +++++++++++++++++ .../_index.md | 217 +++++++++++++ .../_index.md | 258 ++++++++++++++++ .../_index.md | 254 ++++++++++++++++ .../_index.md | 284 ++++++++++++++++++ .../_index.md | 215 +++++++++++++ .../_index.md | 258 ++++++++++++++++ .../_index.md | 261 ++++++++++++++++ .../_index.md | 283 +++++++++++++++++ .../_index.md | 215 +++++++++++++ .../_index.md | 241 +++++++++++++++ .../_index.md | 261 ++++++++++++++++ .../_index.md | 282 +++++++++++++++++ .../_index.md | 212 +++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 252 ++++++++++++++++ .../_index.md | 282 +++++++++++++++++ .../_index.md | 215 +++++++++++++ .../_index.md | 242 +++++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 283 +++++++++++++++++ .../_index.md | 215 +++++++++++++ .../_index.md | 256 ++++++++++++++++ .../_index.md | 257 ++++++++++++++++ .../_index.md | 283 +++++++++++++++++ 92 files changed, 22874 
insertions(+) create mode 100644 ocr/arabic/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/arabic/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/arabic/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/arabic/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/chinese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/chinese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/chinese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/chinese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/czech/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/czech/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/czech/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/czech/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/dutch/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/dutch/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/dutch/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/dutch/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/english/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/english/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create 
mode 100644 ocr/english/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/english/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/french/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/french/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/french/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/french/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/german/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/german/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/german/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/german/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/greek/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/greek/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/greek/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/greek/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/hindi/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/hindi/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/hindi/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/hindi/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 
ocr/hongkong/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/hongkong/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/hongkong/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/hongkong/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/hungarian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/hungarian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/hungarian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/hungarian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/indonesian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/indonesian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/indonesian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/indonesian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/italian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/italian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/italian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/italian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/japanese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 
ocr/japanese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/japanese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/japanese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/korean/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/korean/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/korean/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/korean/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/polish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/polish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/polish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/polish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/portuguese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/portuguese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/portuguese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/portuguese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/russian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/russian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/russian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 
ocr/russian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/spanish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/spanish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/spanish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/spanish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/swedish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/swedish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/swedish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/swedish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/thai/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/thai/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/thai/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/thai/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/turkish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 ocr/turkish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md create mode 100644 ocr/turkish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md create mode 100644 ocr/turkish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md create mode 100644 ocr/vietnamese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md create mode 100644 
ocr/vietnamese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
 create mode 100644 ocr/vietnamese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
 create mode 100644 ocr/vietnamese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md

diff --git a/ocr/arabic/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/arabic/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md
new file mode 100644
index 000000000..01d249836
--- /dev/null
+++ b/ocr/arabic/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md
@@ -0,0 +1,215 @@
+---
+category: general
+date: 2026-06-06
+description: Extract text from a handwritten image using OCR in Python. Learn how
+  to convert a handwritten photo to text quickly and reliably.
+draft: false
+keywords:
+- extract text from handwritten image
+- convert handwritten photo to text
+- how to recognize handwritten text
+- python ocr handwritten recognition
+language: ar
+og_description: Extract text from a handwritten image using Python. This guide shows
+  how to convert a handwritten photo to text and answers how to recognize handwritten
+  text.
+og_title: Extract Text from Handwritten Image – Python OCR Guide
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Extract text from handwritten image using Python OCR. Learn how to
+    convert handwritten photo to text quickly and reliably.
+  headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Handwriting
+title: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide
+url: /ar/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide
+
+Ever wondered **how to recognize handwritten text** in a photo you took with your phone? You're not alone. Many projects, from digitizing lecture notes to pulling data out of signed forms, need to **extract text from handwritten image** files quickly and without fuss.
+
+In this tutorial we'll walk through a complete, ready-to-run example that shows exactly how to **convert handwritten photo to text** using an OCR library in Python. No vague references, just concrete code, explanations, and tips you can copy and paste today.
+
+![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image")
+
+## What You'll Need
+
+Before we start, make sure you have the following:
+
+| Requirement | Why it matters |
+|-------------|----------------|
+| Python 3.9 or later | Support for modern syntax and libraries |
+| `pip` (Python package manager) | To install the OCR package |
+| A clear photo of handwritten notes (JPEG/PNG) | OCR accuracy drops on blurry images |
+| Basic familiarity with Python functions | Helps you adapt the example later |
+
+If any of these is missing, download the latest Python release and install it; nothing more is required.
+
+## Install the Python OCR Library
+
+The snippets below rely on the `ocr` package (a lightweight wrapper around Tesseract that adds useful settings). Install it with a single command:
+
+```bash
+pip install ocr
+```
+
+> **Pro tip:** After installation, run `tesseract --version` in a terminal to confirm that the underlying engine is present. If you don't have Tesseract, follow the official guide for your system; most package managers ship it (`apt-get install tesseract-ocr` on Ubuntu, `brew install tesseract` on macOS).
+
+## Step 1: Create the OCR Engine Object
+
+Creating the engine is the first building block of **python ocr handwritten recognition**. Think of the engine as the brain that will later read the scribbles.
+
+```python
+# Step 1: Import the library and instantiate the OCR engine
+import ocr
+
+# The OcrEngine class gives us access to all the low‑level settings.
+engine = ocr.OcrEngine()
+```
+
+Why this matters: without an engine you cannot adjust the recognition pipeline. The defaults are tuned for printed text, so we'll change them in the next step.
+
+## Step 2: Enable Handwriting Recognition
+
+By default the engine assumes the characters are printed. Turning on handwritten mode changes the setting that tells Tesseract to use the LSTM model trained on cursive handwriting.
+
+```python
+# Step 2: Turn on handwritten mode
+engine.ocr_settings.enable_handwritten_recognition = True
+```
+
+> **What happens if you skip this step?** The OCR treats the strokes as noise and produces unintelligible output. Enabling this flag is the core of **how to recognize handwritten text**.
+
+## Step 3: Load the Handwritten Image
+
+Now we point the engine at the image file. The path can be absolute or relative; just make sure the file exists.
+
+```python
+# Step 3: Load the image containing the handwritten notes
+image_path = "YOUR_DIRECTORY/handwritten_note.jpg"
+engine.load_image(image_path)
+```
+
+Quick check: open the image in your system viewer. If the writing is hard to read, consider preprocessing (boosting contrast, rotating) before passing it to the engine; these adjustments often improve the success rate when you **extract text from handwritten image** files.
+
+## Step 4: Run the Recognition
+
+With everything configured, we now ask the engine to do its job. The `recognize()` method returns a result object containing the extracted string and confidence scores.
+
+```python
+# Step 4: Execute the OCR process
+handwritten_result = engine.recognize()
+```
+
+Behind the scenes, the engine converts the bitmap into a sequence of feature vectors, feeds them through an LSTM network, and then stitches the characters together. This is the heart of **python ocr handwritten recognition**.
+
+## Step 5: Display the Extracted Text
+
+The result object exposes a `.text` property holding a plain Unicode string. Print it, save it to a file, or pass it to another pipeline; the choice is yours.
+
+```python
+# Step 5: Print the extracted text to the console
+print("=== Extracted Text ===")
+print(handwritten_result.text)
+```
+
+### Expected Result
+
+If the source image contains the note “Buy milk, eggs, and bread”, you'll see something like:
+
+```
+=== Extracted Text ===
+Buy milk, eggs, and bread
+```
+
+Notice how the result preserves punctuation and line breaks (if any). If you get unintelligible text, re-check the image quality and the `enable_handwritten_recognition` flag.
+
+## Troubleshooting Common Issues
+
+| Problem | Symptom | Fix |
+|---------|---------|-----|
+| Low confidence scores | Many “?” or nonsensical characters | Increase the image DPI to ≥300, apply binarization (`opencv`), or crop to the region of interest. |
+| Mixed languages | Output mixes English with another script | Set `engine.ocr_settings.language = "eng"` (or another ISO code) before `recognize()`. |
+| Large files | Long processing time or an out-of-memory error | Downscale the image to a reasonable size (for example, a maximum width of 1200 px) before loading. |
+| Tesseract missing | `ImportError` or `FileNotFoundError` | Install Tesseract separately and make sure it is on the system PATH. |
+
+These adjustments keep the **convert handwritten photo to text** workflow stable across varied datasets.
+
+## The Complete Program You Can Run Now
+
+Below is the full, self-contained program that ties all the pieces together. Copy it into a file named `handwritten_ocr.py` and run it with `python handwritten_ocr.py`.
+
+```python
+#!/usr/bin/env python3
+"""
+handwritten_ocr.py
+
+A minimal example that demonstrates how to extract text from handwritten image
+using Python OCR (handwritten recognition mode). Adjust `image_path` to point
+to your own file before running.
+"""
+
+import ocr  # pip install ocr
+
+def main():
+    # 1️⃣ Create the OCR engine
+    engine = ocr.OcrEngine()
+
+    # 2️⃣ Enable the handwritten recognition flag
+    engine.ocr_settings.enable_handwritten_recognition = True
+
+    # 3️⃣ Load the target image
+    image_path = "YOUR_DIRECTORY/handwritten_note.jpg"
+    engine.load_image(image_path)
+
+    # 4️⃣ Run the OCR process
+    result = engine.recognize()
+
+    # 5️⃣ Output the extracted text
+    print("=== Extracted Text ===")
+    print(result.text)
+
+if __name__ == "__main__":
+    main()
+```
+
+Run it, and you'll see the text printed in the terminal, which is exactly the result you need when you want to **convert handwritten photo to text**.
+
+## Going Deeper
+
+Now that you've covered the basics of how to **extract text from handwritten image** files, consider these next steps:
+
+- **Batch processing:** Loop over a folder of images and save each result to a CSV file.
+- **Post-processing:** Use regular expressions to clean up common OCR errors (such as “1” versus “l”).
+- **Integration:** Feed the extracted strings into an NLP pipeline for sentiment analysis or keyword extraction.
+- **Alternative libraries:** If you need higher accuracy, explore `easyocr` or `pytesseract` with custom LSTM models; both support **python ocr handwritten recognition**.
+
+Remember that the quality of the source image often determines success, so spend a few minutes on preprocessing. A little effort now saves a lot of debugging later.
+
+## Conclusion
+
+We walked through a complete end-to-end example showing **how to recognize handwritten text** and, more importantly, how to **extract text from handwritten image** files with Python. By installing the `ocr` package, enabling handwritten mode, loading the image, and calling `recognize()`, you can **convert handwritten photo to text** in just a few lines.
+
+Try it on your own notes, tweak the preprocessing steps, and let OCR do the heavy lifting. If you run into trouble, go back to the “Troubleshooting Common Issues” table or try an alternative OCR engine. Happy coding, and may your handwritten data become instantly searchable!
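
The batch-processing suggestion from "Going Deeper" could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not part of the tutorial's own code: `batch_ocr_to_csv` and `IMAGE_SUFFIXES` are hypothetical names, and the `recognize` parameter stands for any callable that takes an image path and returns the recognized string (for example, a small wrapper around the engine calls from Steps 1 to 4).

```python
import csv
from pathlib import Path
from typing import Callable

# Hypothetical constant: file extensions treated as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def batch_ocr_to_csv(folder, csv_path, recognize: Callable[[Path], str]) -> int:
    """Run `recognize` on every image in `folder` and write (file, text) rows.

    `recognize` is an assumed stand-in for the OCR call; it receives a Path
    and must return the extracted text. Returns the number of images processed.
    """
    # Collect the image files first so the output CSV is never picked up.
    images = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "text"])
        for image in images:
            writer.writerow([image.name, recognize(image)])
    return len(images)
```

Passing the recognizer in as a callable keeps the loop independent of any particular OCR package, so the same helper works whether the text comes from the engine shown above or from one of the alternative libraries.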
+
+## What Should You Learn Next?
+
+The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [How to Recognize Page Rectangles for OCR Text Recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/)
+- [How to Use OCR - Recognize Image without Text Area Detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/arabic/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/arabic/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
new file mode 100644
index 000000000..f41f031f8
--- /dev/null
+++ b/ocr/arabic/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
@@ -0,0 +1,242 @@
+---
+category: general
+date: 2026-06-06
+description: 'OCR character set guide: learn how to use OCR engine to list supported
+  characters for Latin and Cyrillic scripts with full Python examples.'
+draft: false
+keywords:
+- ocr character set
+- use OCR engine
+- supported OCR characters
+- script detection OCR
+- OCR library Python
+language: ar
+og_description: 'OCR character set guide: discover how to use an OCR engine in Python
+  to list the supported characters for different scripts, with full code and tips.'
+og_title: OCR Character Set – Retrieve and Use OCR Engine
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: 'OCR character set guide: learn how to use OCR engine to list supported
+    characters for Latin and Cyrillic scripts with full Python examples.'
+  headline: OCR Character Set – Retrieve and Use OCR Engine in Python
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Character Sets
+title: OCR Character Set – Retrieve and Use OCR Engine in Python
+url: /ar/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# OCR Character Set – Retrieve and Use OCR Engine in Python
+
+Want to know which **ocr character set** your library supports? In this tutorial we'll show you how to **use OCR engine** calls to query the supported characters for different scripts, and why that matters in real projects.
+
+If you've ever stared at a blank screen wondering why certain letters never appear after OCR processing, you're not alone. The answer is often hidden in the character set the engine knows. By the end of this guide you'll be able to:
+
+* List every character the OCR engine can recognize for a given script.
+* Handle cases where a script is not supported.
+* Integrate character-set information into downstream validation or UI logic.
+
+No black-box magic, just plain Python, a few lines of code, and a little explanation.
+
+---
+
+## Prerequisites
+
+Before we start, make sure you have:
+
+* Python 3.8+ installed (the code uses f-strings).
+* An OCR library that provides `ocr.OcrEngine`. In this example we assume the package named `ocr` is already installed via `pip install ocr-lib`.
+* Basic familiarity with Python's import system and with printing values.
+
+If you're missing the library, run:
+
+```bash
+pip install ocr-lib
+```
+
+That's all; no extra binaries or native dependencies are needed for the basic character-set queries we'll run.
+
+## Step 1: Initialize the OCR Engine Object
+
+Creating the engine object is the first thing you do when you **use OCR engine** features. Think of it as switching on the scanner; without it you can't ask any questions.
+
+```python
+import ocr
+
+# Step 1: Create an OCR engine instance
+engine = ocr.OcrEngine()
+```
+
+The `OcrEngine` class is a lightweight wrapper around the underlying recognition engine. It doesn't start heavy processing until you request something, so the startup cost is minimal.
+
+> **Pro tip:** If your application creates many engine objects, reuse a single one. That saves memory and reduces initialization time.
+
+## Step 2: Query the OCR Character Set for a Specific Script
+
+Now that we have an engine, we can ask it which characters it knows for a given script. The `get_supported_characters(script_name)` method returns a Python `list` of Unicode characters.
+
+```python
+# Step 2: Get the list of characters the engine can recognize for the Latin script
+latin_chars = engine.get_supported_characters("Latin")
+print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}")
+```
+
+Running the snippet above prints something like:
+
+```
+Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+```
+
+The exact output depends on the OCR library version, but you always get a flat list of characters. If you need both upper and lower case, request the `"Latin"` script and apply `.lower()` to each element, or query a separate `"Latin‑lower"` script if the library distinguishes them.
+
+### Why does this matter?
+
+When you pass in an image containing less common diacritics (such as “ñ” or “ø”), the OCR engine may silently replace those characters with a placeholder or drop them entirely. Knowing the **ocr character set** in advance lets you pre-validate input, warn users, or fall back to a different engine.
+
+## Step 3: Retrieve Characters for Another Script – Cyrillic Example
+
+The same method works for any script the engine claims to support. Let's see what happens with Cyrillic.
+
+```python
+# Step 3: Get the list of characters the engine can recognize for the Cyrillic script
+cyrillic_chars = engine.get_supported_characters("Cyrillic")
+print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}")
+```
+
+Typical output:
+
+```
+Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я']
+```
+
+If the engine does **not** support the requested script, it usually raises a `ValueError` (or returns an empty list, depending on the implementation). Let's guard the code against that.
+
+```python
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+# Example usage:
+safe_get_characters(engine, "Arabic")  # Might trigger the error branch
+```
+
+## Step 4: Visualize the OCR Character Set (Optional)
+
+Sometimes a quick picture helps, especially when you need to show stakeholders which characters are covered. Below is a simple example that uses `matplotlib` to draw a grid of characters.
+
+```python
+import matplotlib.pyplot as plt
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Visualize Latin characters
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+> **Image alt text:** ![OCR character set chart](ocr_character_set.png){alt="OCR character set overview"}
+
+The chart isn't essential to the core solution, but it shows how you can **use OCR engine** metadata beyond plain printing.
+
+## Step 5: Integrate the Character Set into a Real-World Workflow
+
+Now that we can retrieve the **ocr character set**, let's look at a practical scenario: validating OCR output before storing it in a database.
+
+```python
+def validate_ocr_output(text, allowed_chars):
+    """Return True if every character in `text` exists in `allowed_chars`."""
+    invalid = [c for c in text if c not in allowed_chars]
+    if invalid:
+        print(f"Invalid characters found: {invalid}")
+        return False
+    return True
+
+# Suppose we ran OCR on an image and got this result:
+ocr_result = "HELLO WORLD!"
+
+# Use the previously fetched Latin set for validation:
+if validate_ocr_output(ocr_result, set(latin_chars)):
+    print("All characters are supported – safe to store.")
+else:
+    print("Result contains unsupported symbols – consider manual review.")
+```
+
+This pattern keeps invalid data from leaking into downstream pipelines, a common pain point when dealing with multilingual documents.
+
+## Common Pitfalls and Edge Cases
+
+| Problem | Why it happens | How to avoid it |
+|---------|----------------|-----------------|
+| **Misspelled script name** (for example, `"Cyrillic "` with a trailing space) | The engine treats the string literally and cannot find the script. | Strip whitespace with `script.strip()` before calling `get_supported_characters`. |
+| **Empty character list** | Some engines expose a script whose language model has not been loaded yet. | Call `engine.load_language_model(script)` if the library provides such a method, or make sure the model files are present. |
+| **Unicode normalization issues** | Characters such as “é” can appear composed (`\u00E9`) or decomposed (`e\u0301`). | Normalize strings with `unicodedata.normalize('NFC', text)` before validating. |
+| **Performance with large scripts** (for example, Chinese) | Retrieving thousands of characters can be slow. | Cache the result after the first call; most applications need the list only once. |
+
+## Complete Working Example
+
+Putting it all together, here is a single script you can copy, paste, and run:
+
+```python
+import ocr
+import matplotlib.pyplot as plt
+
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+if __name__ == "__main__":
+    # Minimal driver that reuses the calls from Steps 1-4
+    engine = ocr.OcrEngine()
+    latin_chars = safe_get_characters(engine, "Latin")
+    if latin_chars:
+        plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+## What Should You Learn Next?
+
+The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+ +- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/) +- [OCR Post Processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [How to Set Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/arabic/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/arabic/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..6ba9649d1 --- /dev/null +++ b/ocr/arabic/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,262 @@ +--- +category: general +date: 2026-06-06 +description: قم بإجراء التعرف الضوئي على الحروف (OCR) على الصورة باستخدام Aspose OCR + ونموذج من Hugging Face. تعلم كيفية تنزيل نموذج Hugging Face، استخراج النص من الفاتورة، + وتحرير موارد وحدة معالجة الرسومات. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: ar +og_description: قم بإجراء التعرف الضوئي على الأحرف (OCR) على الصورة باستخدام Aspose + OCR ونموذج Hugging Face. يوضح هذا الدرس كيفية تنزيل النموذج، استخراج النص من الفاتورة، + وتحرير موارد وحدة معالجة الرسومات. +og_title: إجراء التعرف الضوئي على الأحرف في الصورة باستخدام Aspose OCR و LLM – دليل + كامل +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: إجراء التعرف الضوئي على الأحرف في الصورة باستخدام Aspose OCR و LLM – دليل شامل +url: /ar/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# تنفيذ OCR على الصورة باستخدام Aspose OCR و LLM – دليل كامل + +هل رغبت يومًا في **perform OCR on image** للملفات لكن شعرت بالحيرة عند سؤال “من أين أبدأ؟”؟ لست وحدك—العديد من المطورين يواجهون هذا العائق عندما يتعاملون لأول مرة مع أتمتة المستندات. الخبر السار هو أنه باستخدام Aspose OCR و LLM خفيف الوزن من Hugging Face، يمكنك تحويل مسح ضوئي خام لفاتورة إلى نص نظيف وقابل للبحث في بضع أسطر من بايثون فقط. + +في هذا الدرس سنستعرض كل ما تحتاجه: من **loading image for OCR**، إلى **downloading the Hugging Face model**، إلى **extracting text from invoice**، وأخيرًا **free GPU resources** حتى يبقى تطبيقك خفيفًا. بنهاية الدرس ستحصل على سكريبت مستقل يمكنك إدراجه في أي مشروع. + +--- + +## ما ستتعلمه + +- كيف تقوم بـ **perform OCR on image** باستخدام `OcrEngine` من Aspose. +- الخطوات الدقيقة لـ **download Hugging Face model** تلقائيًا. +- تقنيات لـ **extract text from invoice** من ملفات PDF أو PNG باستخدام معالجة ما بعد الذكاء الاصطناعي. +- أفضل الممارسات لـ **free GPU resources** بعد الاستدلال. +- نصائح لـ **load image for OCR** بفعالية وتجنب الأخطاء الشائعة. + +لا حاجة لأي وثائق خارجية—كل ما تحتاجه موجود هنا، مع الشيفرة الكاملة، الشروحات، والناتج المتوقع. 
+ +--- + +## المتطلبات المسبقة + +قبل أن نبدأ، تأكد من وجود ما يلي: + +| المتطلب | السبب | +|-------------|--------| +| Python 3.9+ | بنية حديثة وتلميحات نوع | +| حزمة `asposeocr` (`pip install asposeocr`) | محرك OCR الأساسي | +| الوصول إلى GPU (اختياري لكن يُنصح به) | يسرّع معالج ما بعد الـ LLM | +| صورة فاتورة (`sample_invoice.png`) | حالة اختبار واقعية | + +إذا كان أي من هذه غير متوفر، قم بتثبيته الآن؛ سيقوم السكريبت أيضًا **download Hugging Face model** تلقائيًا، لذا لن تحتاج للبحث عن الملفات يدويًا. + +--- + +## الخطوة 1: Perform OCR on Image – إنشاء المحرك + +أول شيء تحتاج إلى فعله هو تشغيل محرك OCR من Aspose. فكر فيه كقماش فارغ سيتحول فيه الصورة إلى نص لاحقًا. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **لماذا هذا مهم:** `OcrEngine` يختصر كل عمليات ما قبل المعالجة منخفضة المستوى، لتتمكن من التركيز على سير العمل عالي المستوى. كما يوفر طريقة `set_post_processor` التي ستسمح لنا لاحقًا بربط LLM للحصول على مخرجات أذكى. + +--- + +## الخطوة 2: Load Image for OCR – اختيار الملف المناسب + +الآن بعد أن تم إنشاء المحرك، نحتاج إلى **load image for OCR**. يدعم Aspose صيغ PNG، JPG، TIFF، وعدد قليل من الصيغ الأخرى. تأكد أن المسار إما مطلق أو نسبى لموقع السكريبت. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **نصيحة:** إذا كانت الصورة كبيرة، فكر في تصغير حجمها مسبقًا لتقليل الضغط على الذاكرة. يستطيع محرك OCR التعامل مع مسحات عالية الدقة، لكن صورة بدقة 300 DPI عادةً ما تكون مثالية للفواتير. + +--- + +## الخطوة 3: Perform Raw OCR and View the Extracted Text + +مع تحميل الصورة، يمكننا الآن **perform OCR on image** ورؤية ما ينتجه المحرك الخام. هذه الخطوة تعطيك قاعدة قبل إضافة سحر الذكاء الاصطناعي. 
+ +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**الناتج المتوقع (مقتطع):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +غالبًا ما يحتوي الناتج الخام على فواصل أسطر، أحرف غير صحيحة، أو حقول مفقودة—وهذا بالضبط السبب في أننا سنستدعي نموذج اللغة في الخطوة التالية. + +--- + +## الخطوة 4: Download Hugging Face Model – إعداد معالج ما بعد الـ LLM + +هنا يبرز دور خطوة **download Hugging Face model**. يمكن لـ Aspose AI سحب نموذج تلقائيًا من مستودع Hugging Face إذا لم يكن موجودًا على القرص. سنستخدم نموذج Qwen2.5‑3B‑Instruct‑GGUF، الذي يوازن بين الدقة وحجم الذاكرة. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **لماذا يعمل ذلك:** `allow_auto_download` يوفر عليك تحميل ملف `.gguf` يدويًا. التقليل الكمي (`int8`) يقلل حجم النموذج إلى حوالي 3 GB، مما يجعله قابلًا للتشغيل على معظم بطاقات الرسوميات الاستهلاكية. عدّل `gpu_layers` وفقًا لمواصفات جهازك—المزيد من الطبقات على الـ GPU يعني استدلال أسرع. + +--- + +## الخطوة 5: Extract Text from Invoice Using AI‑Enhanced Post‑Processing + +الآن نرفق الـ LLM بمحرك OCR وننفذ **post‑processor** ينظف الناتج الخام، يصحح أخطاء OCR، ويُنسق حقول الفاتورة بشكل جميل. 
+ +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**نموذج ناتج مُحسّن:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **ماذا حدث؟** تعرف الـ LLM أن “Invoice #12345” يجب أن تكون “Invoice Number: 12345”، صحّح تنسيق التاريخ، وحتى استنتج حقل “Bill To” الذي فاته المحرك الخام. هذا هو جوهر أتمتة **extract text from invoice**. + +--- + +## الخطوة 6: Free GPU Resources – تنظيف بعد المعالجة + +إذا كنت تشغل هذا في خدمة طويلة العمر (مثل Flask API)، يجب عليك **free GPU resources** بعد كل استدلال لتجنب تعطل الذاكرة. توفر Aspose AI طريقة مباشرة لذلك. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **نصيحة احترافية:** استدعِ `free_resources()` داخل كتلة `finally:` إذا كنت تغلف استدعاء OCR بـ `try/except`. هذا يضمن التنظيف حتى في حال حدوث استثناء. + +--- + +## الخطوة 7: Full Script – جمع كل الأجزاء معًا + +فيما يلي السكريبت الكامل الجاهز للتنفيذ. انسخه، عدّل المسارات حسب حاجتك، وستكون جاهزًا. 
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +شغّل السكريبت وشاهد التحول من OCR صاخب إلى بيانات فاتورة نظيفة ومهيكلة. 🎉 + +--- + +## أسئلة شائعة وحالات طرفية + +| السؤال | الجواب | +|----------|--------| +| **ماذا لو فشل تحميل النموذج؟** | تأكد من أن جهازك متصل بالإنترنت وأن `hugging_face_repo_id` صحيح. يمكنك أيضًا تحميل الملف يدويًا. | +| **هل يمكن تشغيل النموذج بدون GPU؟** | نعم، لكن الأداء سيكون أبطأ؛ يمكنك ضبط `gpu_layers` إلى 0 لاستخدام CPU فقط. | +| **كيف أتعامل مع صور متعددة الصفحات؟** | كرّر عملية **load image for OCR** لكل صفحة، ثم اجمع النتائج قبل تمريرها إلى الـ LLM. | +| **هل يدعم Aspose OCR لغات أخرى؟** | يدعم العديد من اللغات؛ استخدم `set_language` لتحديد اللغة المطلوبة قبل الاستدلال. | + +--- + +## ماذا يجب أن تتعلمه بعد ذلك؟ + +الدروس التالية تغطي مواضيع ذات صلة وثيقة تبني على التقنيات التي تم توضيحها في هذا الدليل. كل مورد يتضمن أمثلة شفرة كاملة مع شروحات خطوة بخطوة لمساعدتك على إتقان ميزات API إضافية واستكشاف أساليب تنفيذ بديلة في مشاريعك. 
+ +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/arabic/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/arabic/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..2f30056d5 --- /dev/null +++ b/ocr/arabic/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,284 @@ +--- +category: general +date: 2026-06-06 +description: التعرف على النص من الصورة باستخدام Aspose OCR – تعلم كيفية تحميل الصورة + للتعرف الضوئي على الحروف وتنفيذ التعرف الضوئي على الحروف على الصورة باستخدام المعالجة + اللاحقة بالذكاء الاصطناعي في بايثون. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: ar +og_description: التعرف على النص من الصورة بسرعة. يوضح هذا الدليل كيفية تحميل الصورة + للتعرف الضوئي على الأحرف، إجراء التعرف الضوئي على الأحرف على الصورة، وتعزيز النتائج + باستخدام المعالجة اللاحقة بالذكاء الاصطناعي. +og_title: التعرف على النص من الصورة باستخدام Aspose OCR والذكاء الاصطناعي +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU 
memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: التعرف على النص من الصورة باستخدام Aspose OCR والذكاء الاصطناعي +url: /ar/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# التعرف على النص من الصورة باستخدام Aspose OCR & AI + +هل احتجت يوماً إلى التعرف على النص من صورة لكنك لم تكن متأكدًا أي مكتبة ستوفر لك السرعة والدقة معًا؟ لست وحدك. في هذا الدليل سنستعرض مثالًا كاملاً من البداية إلى النهاية يوضح **كيفية تحميل الصورة للـ OCR**، **إجراء الـ OCR على الصورة**، ثم تحسين النتيجة باستخدام معالج Aspose AI. في النهاية ستحصل على سكريبت جاهز للتنفيذ يحول ملف PNG إلى نص نظيف وقابل للبحث. + +## ما ستتعلمه + +سنغطي كل شيء بدءًا من تثبيت حزمة Aspose OCR حتى تحرير الموارد في نهاية التنفيذ. ستعرف لماذا يهم تمكين التعرف على النص المكتوب يدويًا، وكيفية ضبط نموذج Qwen 2.5 LLM للمعالجة اللاحقة، وما هو شكل المخرجات النهائية. لا تحتاج إلى مراجع خارجية—فقط انسخ، الصق، وشغّل. 
+ +### المتطلبات المسبقة + +- Python 3.8 أو أحدث +- حزمة `asposeocr` (`pip install asposeocr`) +- ملف صورة (مثال: `doc.png`) يحتوي على نص مطبوع أو مكتوب يدويًا +- اختياري: وحدة معالجة رسومية (GPU) لتسريع استدلال الـ LLM (السكريبت يعمل على CPU أيضًا) + +--- + +## التعرف على النص من الصورة – خطوة بخطوة + +أسفل كل كتلة شفرة ستجد شرحًا قصيرًا **لماذا** نقوم بهذا الإجراء، وليس مجرد **ماذا** تفعل السطر. + +### الخطوة 1: تثبيت واستيراد الوحدات المطلوبة + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*لماذا؟* استيراد `asposeocr` يزودنا بفئة `OcrEngine`، بينما يوفر الموديل الفرعي `ai` معالجًا قائمًا على الـ LLM يحسن بشكل كبير مخرجات الـ OCR الخام. + +### الخطوة 2: إنشاء محرك OCR وتمكين التعرف على النص المكتوب يدويًا + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +تمكين التعرف على النص المكتوب يدويًا يوسع مجموعة الأحرف في المحرك، لذا لن تفقد الكتابات المتعرجة عندما **تجري OCR على ملفات الصور** التي تحتوي على نص مطبوع ومكتوب يدويًا معًا. + +### الخطوة 3: تحميل الصورة للـ OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +استدعاء `load_image` هو اللحظة التي **تحمل فيها الصورة للـ OCR**؛ إذا كان المسار غير صحيح، سيطرح المحرك استثناءً توضيحيًا، مما يحفظك من أخطاء غامضة لاحقًا. + +### الخطوة 4: تشغيل مرحلة الـ OCR الخام + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +في هذه المرحلة تحصل على كائن `RecognitionResult` يحتوي على النص غير المصفى، درجات الثقة، وبيانات تخطيط الصفحة. غالبًا ما يكون النص مشوشًا—ولهذا نحتاج إلى التنظيف المدعوم بالذكاء الاصطناعي. 
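
Steps 3 and 4 assume that the path given to `load_image` points at a real image file. The sketch below is one way to check that before the engine sees the path. `checked_image_path` and `SUPPORTED_SUFFIXES` are hypothetical names that are not part of `asposeocr`, and the suffix list is an assumption to adjust for the formats your installation accepts.

```python
from pathlib import Path

# Assumption: adjust this set to the image formats your OCR setup accepts.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def checked_image_path(path):
    """Return an absolute path string, or raise if the file is unusable."""
    candidate = Path(path).expanduser()
    if not candidate.is_file():
        raise FileNotFoundError(f"Image not found: {candidate}")
    if candidate.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"Unexpected image type '{candidate.suffix}' for {candidate}"
        )
    return str(candidate.resolve())
```

With a helper like this, the Step 3 call could become `ocr_engine.load_image(checked_image_path("YOUR_DIRECTORY/doc.png"))`, so a wrong path fails before recognition starts.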
+ +### الخطوة 5: ضبط نموذج Aspose AI للمعالجة اللاحقة باستخدام LLM + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +لماذا نهتم بهذه الإعدادات؟ +- **auto‑download** يضمن توفر النموذج عند أول تشغيل. +- **int8 quantization** يقلل استهلاك الذاكرة دون خسارة كبيرة في الدقة. +- **gpu_layers** يسمح لك بالاستفادة من GPU متوافق لتسريع الاستدلال. + +### الخطوة 6: تهيئة معالج AI وربطه كمعالج لاحق + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +ربط المعالج يعني أنه في كل مرة تستدعي فيها `run_postprocessor`، سيقوم الـ LLM بتنظيف الإملاء، دمج الكلمات المتقطعة، وحتى استنتاج علامات الترقيم المفقودة. + +### الخطوة 7: تشغيل المعالج اللاحق لتحسين مخرجات الـ OCR + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +عادةً ما يكون `enhanced_result.text` أكثر قابلية للقراءة من السلسلة الخام—فكر فيه كمدقق إملائي يفهم السياق أيضًا. + +### الخطوة 8: تحرير الموارد + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +تنظيف الموارد أمر حاسم في الخدمات طويلة التشغيل؛ وإلا ستسرب ذاكرة الـ GPU وتتعرض لتعطل التطبيق في النهاية. 
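
The cleanup in Step 8 only runs if nothing fails before it. A minimal sketch of one way to guarantee it, using only the standard library, is shown below. `released_afterwards` is a hypothetical helper, not an Aspose API; it simply calls whatever cleanup functions it is given, in order, when the `with` block exits.

```python
from contextlib import contextmanager


@contextmanager
def released_afterwards(*cleanup_callbacks):
    """Run every cleanup callback on exit, even if the body raises."""
    try:
        yield
    finally:
        failures = []
        for callback in cleanup_callbacks:
            try:
                callback()
            except Exception as exc:
                # Keep going so the remaining cleanups still run.
                failures.append(exc)
        if failures:
            print(f"{len(failures)} cleanup step(s) failed: {failures}")


# Possible usage with the objects from the steps above:
#
# with released_afterwards(ai_processor.free_resources, ocr_engine.dispose):
#     enhanced_result = ocr_engine.run_postprocessor(raw_result)
#     print(enhanced_result.text)
```

Passing the two cleanup methods in the same order as Step 8 keeps the behaviour identical on the success path, while an exception inside the block still releases the model and the engine.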
+ +--- + +## السكريبت الكامل الذي يمكنك تشغيله اليوم + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**المخرجات المتوقعة** (مثال على صورة فاتورة بسيطة): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +لاحظ كيف صحّح طبقة AI كلمة “Inv0ice” → “Invoice” وأضاف علامات الترقيم المفقودة. + +--- + +## الأسئلة المتكررة (وأجوبة سريعة) + +- **هل أحتاج إلى GPU؟** لا. السكريبت ينتقل إلى CPU إذا لم يتوفر GPU، لكن الاستدلال سيكون أبطأ. +- **هل يمكنني استخدام نموذج LLM مختلف؟** بالطبع—فقط غيّر `hugging_face_repo_id` واضبط `gpu_layers` وفقًا لذلك. +- **ماذا لو كانت صورتي ضخمة؟** قم بتصغيرها أولًا (مثلاً باستخدام Pillow) لتقليل استهلاك الذاكرة. +- **هل التعرف على النص المكتوب يدويًا مفعل دائمًا؟** يمكنك تشغيل أو إيقاف `enable_handwritten_recognition` حسب احتياجاتك. 
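
To see which words the post-processor changed, such as the “Inv0ice” → “Invoice” fix mentioned above, the raw and enhanced strings can be compared with the standard library. This is a sketch and not part of Aspose OCR: `word_level_changes` is a hypothetical helper, and the sample strings are illustrative rather than real engine output.

```python
import difflib


def word_level_changes(raw_text, enhanced_text):
    """Return (before, after) pairs for the words that differ."""
    raw_words = raw_text.split()
    enhanced_words = enhanced_text.split()
    matcher = difflib.SequenceMatcher(
        a=raw_words, b=enhanced_words, autojunk=False
    )
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            before = " ".join(raw_words[i1:i2])
            after = " ".join(enhanced_words[j1:j2])
            changes.append((before, after))
    return changes


if __name__ == "__main__":
    # Illustrative strings; in the script above you would pass
    # raw_result.text and enhanced_result.text instead.
    sample_raw = "Inv0ice #12345 Tota1 Amount"
    sample_enhanced = "Invoice #12345 Total Amount"
    for before, after in word_level_changes(sample_raw, sample_enhanced):
        print(f"{before!r} -> {after!r}")
```

Logging these pairs during testing makes it easier to spot cases where the LLM rewrote something that the raw OCR pass had actually read correctly.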
+ +--- + +## الخلاصة + +أصبحت الآن تعرف كيف **تتعرف على النص من الصورة** باستخدام Aspose OCR، وكيف **تحمل الصورة للـ OCR**، وكيف **تجري OCR على الصورة** مع معالجة لاحقة محسنة بالذكاء الاصطناعي. المثال الكامل القابل للتنفيذ أعلاه يمنحك أساسًا قويًا لدمج OCR في أي مشروع بايثون—سواء كان مسح إيصالات، رقمنة عقود، أو استخراج بيانات من نماذج مكتوبة يدويًا. + +هل أنت مستعد للخطوة التالية؟ جرّب استبدال نموذج Qwen بنموذج أكبر، جرب مخططات تقليل مختلفة، أو ربط عدة صور معًا للمعالجة الدفعة. الاحتمالات لا حصر لها، والكود الذي أنشأته سيتعامل معها بسلاسة. + +برمجة سعيدة، ولتكن نتائج OCR دائمًا واضحة كالكريستال! + +![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="لقطة شاشة تُظهر كيفية التعرف على النص من الصورة باستخدام Aspose OCR"} + +## ماذا يجب أن تتعلم بعد ذلك؟ + +الدروس التالية تغطي مواضيع ذات صلة وثيقة تبني على التقنيات التي تم توضيحها في هذا الدليل. كل مورد يتضمن أمثلة شاملة مع شروح خطوة بخطوة لمساعدتك على إتقان ميزات API إضافية واستكشاف أساليب تنفيذ بديلة في مشاريعك. 
+ +- [استخراج النص من الصورة باستخدام Aspose OCR – دليل خطوة بخطوة](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [تحويل الصورة إلى نص – إجراء OCR على صورة من URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [استخراج نص الصورة بلغة C# مع اختيار اللغة باستخدام Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/chinese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/chinese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..07ecdcb2b --- /dev/null +++ b/ocr/chinese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,210 @@ +--- +category: general +date: 2026-06-06 +description: 使用 Python OCR 从手写图像中提取文本。学习如何快速可靠地将手写照片转换为文本。 +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: zh +og_description: 使用 Python 从手写图像中提取文本。本指南展示如何将手写照片转换为文本,并解答如何识别手写文本。 +og_title: 从手写图像中提取文本 – Python OCR 教程 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: 使用 Python OCR 从手写图像提取文本 – 步骤指南 +url: /zh/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 使用 Python OCR 从手写图像提取文本 – 步骤指南 + +是否曾好奇 **如何识别手机拍摄的手写文字**?你并不孤单。在许多项目中——无论是数字化课堂笔记还是从签名表单中提取数据——你都需要 **快速、轻松地从手写图像中提取文本**。 + +![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image") + +## 您需要准备的内容 + +在开始之前,请确保具备以下条件: + +| Requirement | Why it matters | +|-------------|----------------| +| Python 3.9 或更高版本 | 支持现代语法和库 | +| `pip`(Python 包管理器) | 用于安装 OCR 包 | +| 清晰的手写笔记图像(JPEG/PNG) | 模糊图像会降低 OCR 准确度 | +| 对 Python 函数的基本了解 | 方便后续自行改写示例 | + +如果缺少上述任意项,请前往 下载最新的 Python 并安装——过程非常简单。 + +## 安装 Python OCR 库 + +我们将使用 `ocr` 包(对 Tesseract 的轻量封装并提供实用设置)。使用以下命令一次性安装: + +```bash +pip install ocr +``` + +> **Pro tip:** 安装完成后,在终端运行 `tesseract --version` 以确认底层引擎已就绪。如果系统中没有 Tesseract,请参考官方指南进行安装——大多数包管理器都有对应的包(Ubuntu 上 `apt-get install tesseract-ocr`,macOS 上 `brew install tesseract`)。 + +## 第一步:创建 OCR 引擎实例 + +创建引擎是 **python ocr handwritten recognition** 的第一块砖。可以把引擎想象成稍后读取涂鸦的大脑。 + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. 
+engine = ocr.OcrEngine() +``` + +为什么重要:没有引擎就无法微调识别流水线。默认设置针对印刷体文本进行优化,我们需要在下一步进行调整。 + +## 第二步:启用手写识别 + +默认情况下,引擎假设是印刷字符。开启手写模式会切换一个开关,告诉 Tesseract 使用其针对连笔笔画训练的 LSTM 模型。 + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **如果跳过这一步会怎样?** OCR 会把笔画当作噪声,导致输出乱码。开启该标志是 **如何识别手写文字** 的核心。 + +## 第三步:加载手写照片 + +现在把引擎指向图像文件。路径可以是绝对路径也可以是相对路径,只要确保文件真实存在即可。 + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +快速检查:在操作系统的图片查看器中打开该图像。如果文字模糊,建议在送入引擎前进行预处理(提升对比度、旋转校正),这些技巧常能提升 **extract text from handwritten image** 的成功率。 + +## 第四步:运行识别过程 + +一切就绪后,调用引擎执行识别。`recognize()` 方法返回一个结果对象,包含提取的字符串和置信度分数。 + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +在幕后,引擎会把位图转换为特征向量序列,送入 LSTM 网络,并将字符拼接起来。这就是 **python ocr handwritten recognition** 的魔法所在。 + +## 第五步:显示提取的文本 + +结果对象提供 `.text` 属性,里面是普通的 Unicode 字符串。你可以打印它、写入文件,或传递给其他流水线——随你决定。 + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### 预期输出 + +如果源图像中的笔记是 “Buy milk, eggs, and bread”,则可能得到如下输出: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +请注意,输出会保留标点和换行(如果有的话)。如果出现乱码,请再次检查图像质量以及 `enable_handwritten_recognition` 标志是否已开启。 + +## 常见问题处理 + +| Issue | Symptom | Fix | +|-------|---------|-----| +| 低置信度分数 | 出现大量 “?” 或无意义字符 | 将图像 DPI 提升至 ≥300,使用二值化(`opencv`),或裁剪至感兴趣区域。 | +| 混合语言 | 输出中混杂英文和其他脚本 | 在 `recognize()` 前设置 `engine.ocr_settings.language = "eng"`(或其他 ISO 代码)。 | +| 文件过大 | 处理时间长或出现内存错误 | 在加载前将图像缩放至合理尺寸(例如最大宽度 1200 px)。 | +| 缺少 Tesseract | `ImportError` 或 `FileNotFoundError` | 单独安装 Tesseract 并确保其路径已加入系统 PATH。 | + +这些调整可以让你的 **convert handwritten photo to text** 工作流在各种数据集上保持稳健。 + +## 完整脚本(可直接运行) + +下面是完整的、独立的程序示例,整合了上述所有步骤。将其复制到名为 `handwritten_ocr.py` 的文件中,然后执行 `python handwritten_ocr.py`。 + +```python +#!/usr/bin/env python3 
+""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +运行后,你将在控制台看到文本输出——这正是当你想 **convert handwritten photo to text** 时所需要的结果。 + +## 进一步探索 + +在掌握了 **extract text from handwritten image** 的基础后,你可以尝试以下进阶方向: + +- **批量处理:**遍历文件夹中的所有图像,并将每个结果保存到 CSV 文件中。 +- **后处理:**使用正则表达式清理常见的 OCR 错误(如 “1” 与 “l” 的混淆)。 +- **集成:**将提取的字符串送入自然语言处理流水线进行情感分析或关键词抽取。 +- **替代库:**如果需要更高的准确率,可探索 `easyocr` 或 `pytesseract` 并使用自定义 LSTM 模型——它们同样支持 **python ocr handwritten recognition**。 + +记住,源图像的质量往往决定成功与否,花几分钟进行预处理可以为后续节省大量调试时间。 + +## 结论 + +我们完整演示了一个端到端的示例,展示了 **如何识别手写文字**,更重要的是,如何使用 Python **从手写图像中提取文本**。只需安装 `ocr` 包、打开手写标志、加载图片并调用 `recognize()`,就能在几行代码内 **convert handwritten photo to text**。 + +尝试在自己的笔记上运行,调节预处理步骤,让 OCR 为你完成繁重的工作。如果遇到障碍,回顾 “常见问题处理” 表格或尝试其他 OCR 后端。祝编码愉快,愿你的手写数据瞬间可检索! + +## 接下来该学习什么? 
+ +以下教程涵盖了与本指南技术紧密相关的主题,帮助你进一步掌握 API 功能并探索替代实现方式: + +- [使用 Aspose OCR 从图像中提取文本 – 步骤指南](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [在 Aspose.OCR 中为 OCR 文本识别准备页面矩形](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [如何使用 OCR - 在不检测文本区域的情况下识别图像](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/chinese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/chinese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..04852ce75 --- /dev/null +++ b/ocr/chinese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,239 @@ +--- +category: general +date: 2026-06-06 +description: OCR字符集指南:学习如何使用OCR引擎列出拉丁文和西里尔文脚本的支持字符,附完整的Python示例。 +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: zh +og_description: OCR字符集指南:了解如何在Python中使用OCR引擎列出各种文字的支持字符,附带代码和技巧。 +og_title: OCR字符集 – 检索并使用OCR引擎 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR字符集 – 在Python中检索和使用OCR引擎 +url: /zh/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR字符集 – 在Python中检索和使用OCR引擎 + +想了解您的库支持的 **ocr character set** 吗?在本教程中,我们将向您展示如何 **use OCR engine** 查询不同脚本的支持字符,以及这对实际项目为何重要。 + +如果您曾经盯着空白屏幕,疑惑为何某些字母在 OCR 处理后从未出现,您并不孤单。答案常常隐藏在引擎所了解的字符集中。阅读完本指南,您将能够: + +* 列出 OCR 引擎在给定脚本下能够识别的所有字符。 +* 处理脚本不受支持的情况。 +* 将字符集信息嵌入下游验证或 UI 逻辑中。 + +没有神奇的黑盒技巧——只需普通的 Python、几行代码以及一点说明。 + +--- + +## 前置条件 + +在开始之前,请确保您已具备以下条件: + +* 已安装 Python 3.8+(代码使用 f‑strings)。 +* 提供 `ocr.OcrEngine` 的 OCR 库——在本示例中,我们假设已通过 `pip install ocr-lib` 安装了名为 `ocr` 的包。 +* 对 Python 的导入系统和打印有基本了解。 + +如果缺少该库,请运行: + +```bash +pip install ocr-lib +``` + +就是这样——进行基本字符集查询无需额外的二进制文件或本地依赖。 + +## 步骤 1:初始化 OCR 引擎实例 + +创建引擎对象是使用 **use OCR engine** 功能的第一步。可以将其视为打开扫描仪;没有它,您无法提出任何查询。 + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +`OcrEngine` 类是底层识别引擎的轻量包装器。它在您请求之前不会启动繁重的处理,因此启动成本可以忽略不计。 + +> **技巧提示:** 如果您的应用创建了多个引擎实例,请复用同一个实例。这可以节省内存并降低初始化延迟。 + +## 步骤 2:查询特定脚本的 OCR 字符集 + +现在我们已有引擎,可以询问它对特定书写系统了解哪些字符。方法 `get_supported_characters(script_name)` 返回一个 Python `list`,其中包含 Unicode 字符。 + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +运行上述代码片段会输出类似以下内容: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +确切的输出取决于 OCR 库的版本,但您始终会得到一个扁平的字符列表。如果需要大写和小写两者,只需请求 `"Latin"` 脚本并对每个元素使用 `.lower()`,或者如果库区分,则查询单独的 `"Latin‑lower"` 脚本。 + +### 为什么这很重要? 
+ +当您提供包含不常见变音符号的图像(例如 “ñ” 或 “ø”)时,OCR 引擎可能会悄悄将其替换为占位符或直接丢弃。提前了解 **ocr character set** 可以让您预先验证输入、警告用户或回退到其他引擎。 + +## 步骤 3:检索另一个脚本的字符——以西里尔字母为例 + +相同的方法适用于引擎声称支持的任何脚本。让我们看看西里尔字母的情况。 + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +典型输出: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +如果引擎 **不** 支持请求的脚本,通常会抛出 `ValueError`(或根据实现返回空列表)。让我们对此进行防护。 + +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +## 步骤 4:可视化 OCR 字符集(可选) + +有时快速的可视化会有所帮助,尤其是当您需要向利益相关者展示覆盖了哪些字符时。下面是一个使用 `matplotlib` 渲染字符网格的最小示例。 + +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **图片说明:** ![OCR字符集图示](ocr_character_set.png){alt="OCR字符集概览"} + +该图示并非核心解决方案所必需,但它演示了如何 **use OCR engine** 元数据超出普通打印的使用方式。 + +## 步骤 5:将字符集集成到实际工作流中 + +现在我们能够检索 **ocr character 
set**,让我们看一个实际场景:在将 OCR 输出存入数据库之前进行验证。 + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +此模式可防止垃圾数据进入下游管道,这是处理多语言文档时的常见痛点。 + +## 常见陷阱与边缘情况 + +| 陷阱 | 原因 | 避免方法 | +|------|------|----------| +| **脚本名称拼写错误**(例如带有尾随空格的 `"Cyrillic "`) | 引擎会字面匹配该字符串,导致找不到脚本。 | 在调用 `get_supported_characters` 前使用 `script.strip()` 去除空白。 | +| **字符列表为空** | 某些引擎虽然公开了脚本,但尚未加载语言模型。 | 如果库提供此方法,调用 `engine.load_language_model(script)`,或确保模型文件已存在。 | +| **Unicode 规范化问题** | 字符如 “é” 可能以组合形式 (`\u00E9`) 或分解形式 (`e\u0301`) 出现。 | 在验证前使用 `unicodedata.normalize('NFC', text)` 对字符串进行规范化。 | +| **大型脚本的性能问题**(例如中文) | 检索数千个字符可能会很慢。 | 首次调用后缓存结果;大多数应用只需要一次列表。 | + +## 完整工作示例 + +将所有内容整合在一起,以下是一个可以复制粘贴并运行的完整脚本: + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() +``` + +## 接下来应该学习什么?
+ +以下教程涵盖与本指南演示的技术密切相关的主题。每个资源都包含完整的可运行代码示例和逐步说明,帮助您掌握更多 API 功能并在项目中探索替代实现方案。 + +- [Aspose OCR 教程 – 光学字符识别](/ocr/english/) +- [OCR 后处理 – 获取字符选项](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [如何在 OCR 图像识别中设置阈值](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/chinese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/chinese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..53b057fc5 --- /dev/null +++ b/ocr/chinese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,254 @@ +--- +category: general +date: 2026-06-06 +description: 使用 Aspose OCR 和 Hugging Face 模型对图像进行 OCR。学习如何下载 Hugging Face 模型、从发票中提取文本以及释放 + GPU 资源。 +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: zh +og_description: 使用 Aspose OCR 和 Hugging Face 模型对图像进行 OCR。本教程展示了如何下载模型、从发票中提取文本以及释放 + GPU 资源。 +og_title: 使用 Aspose OCR 与 LLM 对图像执行 OCR – 完整指南 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: 使用 Aspose OCR 与 LLM 对图像进行 OCR – 完整指南 +url: /zh/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 使用 Aspose OCR 与 LLM 对图像执行 OCR – 完整指南 + +是否曾想 **对图像文件执行 OCR**,却卡在“从哪里开始?”的问题上?你并不孤单——许多开发者在首次接触文档自动化时都会遇到这道墙。好消息是,借助 Aspose OCR 和 Hugging Face 的轻量级 LLM,你只需几行 Python 代码,就能将发票的原始扫描件转换为干净、可搜索的文本。 + +在本教程中,我们将一步步演示你需要的全部内容:从 **加载图像进行 OCR**,到 **自动下载 Hugging Face 模型**,再到 **从发票中提取文本**,最后 **释放 GPU 资源**,让你的应用保持轻量。完成后,你将拥有一个可直接放入任何项目的独立脚本。 + +--- + +## 你将学到的内容 + +- 如何使用 Aspose 的 `OcrEngine` **对图像执行 OCR**。 +- 自动 **下载 Hugging Face 模型** 文件的完整步骤。 +- 使用 AI 增强的后处理技术 **从发票 PDF 或 PNG 中提取文本**。 +- 推理结束后 **释放 GPU 资源** 的最佳实践。 +- 高效 **加载图像进行 OCR** 的技巧,避免常见陷阱。 + +无需外部文档——所有代码、解释和预期输出都在这里。 + +--- + +## 前置条件 + +在开始之前,请确保你具备以下条件: + +| 要求 | 原因 | +|------|------| +| Python 3.9+ | 支持现代语法和类型提示 | +| `asposeocr` 包(`pip install asposeocr`) | 核心 OCR 引擎 | +| GPU 访问权限(可选但推荐) | 加速 LLM 后处理 | +| 发票图像(`sample_invoice.png`) | 真实案例测试 | + +如果缺少任何项,请立即安装;脚本还会 **自动下载 Hugging Face 模型**,无需你自行寻找文件。 + +--- + +## 步骤 1:执行 OCR – 创建引擎 + +首先需要实例化 Aspose 的 OCR 引擎。可以把它想象成一块空白画布,稍后图像会被“绘制”为文本。 + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **为什么重要:** `OcrEngine` 把所有低层图像预处理都封装起来,让你专注于更高层的工作流。它还提供 `set_post_processor` 方法,后续可挂载 LLM 实现更智能的输出。 + +--- + +## 步骤 2:加载图像进行 OCR – 选择正确的文件 + +引擎创建好后,需要 **加载图像进行 OCR**。Aspose 支持 PNG、JPG、TIFF 等多种格式。请确保路径是绝对路径或相对于脚本的位置。 + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **提示:** 如果图像体积较大,建议先缩放以降低内存压力。OCR 引擎可以处理高分辨率扫描,但 300 DPI 的图像通常是发票的最佳选择。 + +--- + +## 步骤 3:执行原始 OCR 并查看提取的文本 + 
+图像加载完成后,终于可以 **对图像执行 OCR**,并查看引擎的原始输出。这一步为后续加入 AI 做基线。 + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**预期输出(截断示例):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +原始输出往往包含换行、误识别字符或缺失字段——这正是我们随后引入语言模型的原因。 + +--- + +## 步骤 4:下载 Hugging Face 模型 – 配置 LLM 后处理器 + +这里展示 **下载 Hugging Face 模型** 的关键步骤。Aspose AI 能在本地不存在模型时自动从 Hugging Face Hub 拉取。我们使用 Qwen2.5‑3B‑Instruct‑GGUF 模型,它在准确度与内存占用之间取得了平衡。 + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **为什么可行:** `allow_auto_download` 免去了手动下载 `.gguf` 文件的麻烦。量化为 `int8` 后模型大小约为 3 GB,能够在大多数消费级 GPU 上运行。根据硬件情况调整 `gpu_layers`——GPU 上的层数越多,推理越快。 + +--- + +## 步骤 5:使用 AI 增强的后处理从发票中提取文本 + +现在将 LLM 挂载到 OCR 引擎,运行 **后处理器**,对原始输出进行清洗、纠错并将发票字段格式化。 + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**示例增强输出:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. 
+Amount Due: $1,250.00 +Status: Paid +``` + +> **发生了什么?** LLM 将 “Invoice #12345” 识别为 “Invoice Number: 12345”,修正了日期格式,甚至推断出了原始引擎漏掉的 “Bill To” 字段。这正是 **从发票中提取文本** 自动化的核心。 + +--- + +## 步骤 6:释放 GPU 资源 – 处理完毕后的清理 + +如果你在长期运行的服务(如 Flask API)中使用本脚本,必须在每次推理后 **释放 GPU 资源**,否则会出现显存不足的崩溃。Aspose AI 提供了简洁的清理方法。 + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **专业提示:** 将 `free_resources()` 放在 `finally:` 块中,即使在 `try/except` 捕获异常后也能保证资源被释放。 + +--- + +## 步骤 7:完整脚本 – 整合所有步骤 + +下面是完整、可直接运行的脚本。复制粘贴后,修改路径即可使用。 + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +运行脚本,观察从嘈杂的 OCR 输出到整洁结构化发票数据的转变。 🎉 + +--- + +## 常见问题与边缘情况 + +| 问题 | 解答 | +|------|------| +| **模型下载失败怎么办?** | 确保机器已连接互联网且 `hugging_face_repo_id` 正确。你也可以手动下载的 | + +## 接下来该学习什么? 
+ +以下教程与本指南紧密相关,帮助你进一步掌握 API 功能并探索其他实现方式,每篇都包含完整可运行的代码示例和逐步解释。 + +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/chinese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/chinese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..ea04fd364 --- /dev/null +++ b/ocr/chinese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,280 @@ +--- +category: general +date: 2026-06-06 +description: 使用 Aspose OCR 识别图像中的文本——学习如何加载图像进行 OCR,并在 Python 中使用 AI 后处理执行图像 OCR。 +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: zh +og_description: 快速识别图像中的文字。本指南展示了如何加载图像进行 OCR、在图像上执行 OCR,以及通过 AI 后处理提升结果。 +og_title: 使用 Aspose OCR 与 AI 识别图像中的文本 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? 
+ - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: 使用 Aspose OCR 与 AI 识别图像中的文本 +url: /zh/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 使用 Aspose OCR 与 AI 识别图像中的文本 + +是否曾经需要从图像中识别文本,却不确定哪个库既快又准?你并不孤单。在本指南中,我们将完整演示一个端到端的示例,展示 **如何加载图像进行 OCR**、**如何在图像上执行 OCR**,并使用 Aspose 的 AI 后处理器对输出进行润色。完成后,你将拥有一个可直接运行的脚本,将 PNG 转换为干净、可搜索的文本。 + +## 你将学到的内容 + +我们将从安装 Aspose OCR 包到运行结束时释放资源全部覆盖。你会了解为何启用手写文本识别很重要,如何为后处理配置 Qwen 2.5 LLM,以及最终输出的样子。无需外部引用——只需复制、粘贴并运行。 + +### 前置条件 + +- Python 3.8 或更高版本 +- `asposeocr` 包(`pip install asposeocr`) +- 包含印刷或手写文本的图像文件(例如 `doc.png`) +- 可选:用于加速 LLM 推理的 GPU(脚本也可在 CPU 上运行) + +--- + +## 从图像识别文本 – 步骤详解 + +在每个代码块下方,你会看到对我们执行该操作的 **原因** 的简短说明,而不仅仅是 **代码做了什么**。 + +### 步骤 1:安装并导入所需模块 + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*为什么?* 导入 `asposeocr` 可获得 `OcrEngine` 类,而 `ai` 子模块提供基于 LLM 的后处理器,显著提升原始 OCR 输出。 + +### 步骤 2:创建 OCR 引擎并启用手写文本识别 + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. 
+ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +启用手写识别会扩展引擎的字符集,这样在对包含印刷体和手写体混合的图像文件 **执行 OCR** 时,就不会丢失涂鸦。 + +### 步骤 3:加载图像进行 OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +`load_image` 调用即为 **加载图像进行 OCR** 的时刻;如果路径错误,引擎会抛出详细异常,帮助你避免后续难以理解的错误。 + +### 步骤 4:执行原始 OCR 过程 + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +此时你会得到一个 `RecognitionResult` 对象,包含未过滤的文本、置信度分数和布局元数据。结果通常噪声较多——因此需要 AI 驱动的清理。 + +### 步骤 5:为 LLM 后处理配置 Aspose AI 模型 + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +为什么要设置这些? +- **auto‑download** 确保模型在首次运行时即可下载。 +- **int8 quantization** 大幅降低内存需求,且对精度影响不大。 +- **gpu_layers** 让你利用兼容的 GPU 加速推理。 + +### 步骤 6:初始化 AI 处理器并将其作为后处理器附加 + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +附加处理器后,每次调用 `run_postprocessor` 时,LLM 都会纠正拼写、合并断开的单词,甚至推断缺失的标点。 + +### 步骤 7:运行后处理器以提升 OCR 输出 + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` 通常比原始字符串更易读——可以把它看作既能纠错又能理解上下文的拼写检查器。 + +### 步骤 8:释放资源 + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. 
+ocr_engine.dispose() +``` + +在长期运行的服务中清理资源至关重要;否则会导致 GPU 内存泄漏,最终使应用崩溃。 + +--- + +## 你今天即可运行的完整脚本 + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**预期输出**(简单发票图像示例): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! 
+``` + +请注意 AI 层将 “Inv0ice” 修正为 “Invoice”,并补全了缺失的标点。 + +--- + +## 常见问题(以及快速答案) + +- **我需要 GPU 吗?** 不需要。脚本会回退到 CPU,但推理速度会更慢。 +- **我可以使用其他 LLM 吗?** 当然——只需更改 `hugging_face_repo_id` 并相应调整 `gpu_layers`。 +- **如果我的图像很大怎么办?** 首先对其进行缩放(例如使用 Pillow),以保持内存使用在合理范围。 +- **手写识别是否始终开启?** 你可以根据工作负载切换 `enable_handwritten_recognition`。 + +--- + +## 结论 + +现在你已经了解如何使用 Aspose OCR **识别图像中的文本**、如何 **加载图像进行 OCR**,以及如何通过 AI 增强的后处理 **在图像上执行 OCR**。上面的完整可运行示例为你在任何 Python 项目中集成 OCR 打下了坚实基础——无论是扫描收据、数字化合同,还是提取手写表单中的数据。 + +准备好下一步了吗?尝试将 Qwen 模型替换为更大的模型,实验不同的量化方案,或将多张图像串联进行批处理。可能性无限,而你刚编写的代码能够优雅地应对这些场景。 + +祝编码愉快,愿你的 OCR 结果始终清晰如晶! + +![展示增强 OCR 输出的 Python 控制台截图](/images/ocr_output.png){alt="展示如何使用 Aspose OCR 识别图像文本的截图"} + +## 接下来你应该学习什么? + +以下教程涵盖与本指南技术密切相关的主题,构建在所示技巧之上。每个资源都提供完整的可运行代码示例和逐步解释,帮助你掌握更多 API 功能并在项目中探索替代实现方案。 + +- [使用 Aspose OCR 从图像提取文本 – 步骤指南](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [将图像转换为文本 – 从 URL 执行 OCR](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [使用 Aspose.OCR 的语言选择提取图像文本(C#)](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/czech/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/czech/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..e3856274d --- /dev/null +++ b/ocr/czech/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,216 @@ +--- +category: general +date: 2026-06-06 +description: Extrahujte text z ručně psaného obrázku pomocí Python OCR. Naučte se, + jak rychle a spolehlivě převést ručně psanou fotografii na text. 
+draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: cs +og_description: Extrahujte text z ručně psaného obrázku pomocí Pythonu. Tento průvodce + ukazuje, jak převést ručně psanou fotografii na text a odpovídá na otázku, jak rozpoznat + ručně psaný text. +og_title: Extrahovat text z ručně psaného obrázku – Python OCR tutoriál +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. + headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Extrahujte text z ručně psaného obrázku pomocí Python OCR – krok za krokem + průvodce +url: /cs/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Extrahování textu z ručně psaného obrázku pomocí Python OCR – krok za krokem + +Už jste někdy přemýšleli, **jak rozpoznat ručně psaný text** na fotografii, kterou jste pořídili telefonem? Nejste v tom sami. V mnoha projektech – ať už jde o digitalizaci poznámek z přednášek nebo získávání dat z podepsaných formulářů – potřebujete **extrahovat text z ručně psaného obrázku** rychle a bez zbytečných komplikací. + +V tomto tutoriálu projdeme kompletní, hotový příklad, který vám ukáže, jak **převést ručně psanou fotografii na text** pomocí populární Python OCR knihovny. Žádné vágní odkazy, jen konkrétní kód, vysvětlení a tipy, které můžete dnes zkopírovat a vložit.
+ +![extrahování textu z ručně psaného obrázku](https://example.com/placeholder-handwritten.jpg "extrahování textu z ručně psaného obrázku") + +## Co budete potřebovat + +Než se pustíme do práce, ujistěte se, že máte následující: + +| Požadavek | Proč je důležitý | +|-----------|-------------------| +| Python 3.9 nebo novější | Moderní syntaxe a podpora knihoven | +| `pip` (správce balíčků pro Python) | Pro instalaci OCR balíčku | +| Jasný obrázek ručně psaných poznámek (JPEG/PNG) | Přesnost OCR klesá u rozmazaných obrázků | +| Základní znalost Python funkcí | Pomůže vám později přizpůsobit příklad | + +Pokud vám něco chybí, stáhněte si nejnovější Python z oficiálních stránek a nainstalujte jej – žádné komplikace. + +## Instalace Python OCR knihovny + +Úryvek kódu, který použijeme, závisí na balíčku `ocr` (tenký wrapper kolem Tesseractu, který přidává užitečná nastavení). Nainstalujte jej jedním příkazem: + +```bash +pip install ocr +``` + +> **Tip:** Po instalaci spusťte `tesseract --version` v terminálu, abyste ověřili, že je podkladový engine přítomen. Pokud Tesseract nemáte, postupujte podle oficiálního průvodce pro váš OS – většina správců balíčků jej obsahuje (`apt-get install tesseract-ocr` na Ubuntu, `brew install tesseract` na macOS). + +## Krok 1: Vytvoření instance OCR enginu + +Vytvoření enginu je první cihla ve zdi **python ocr handwritten recognition**. Představte si engine jako mozek, který bude později číst vaše škrábance. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Proč je to důležité: bez enginu nemůžete ladit rozpoznávací pipeline. Výchozí nastavení jsou optimalizována pro tištěný text, takže je bude potřeba v dalším kroku upravit. + +## Krok 2: Aktivace rozpoznávání ručně psaného textu + +Ve výchozím nastavení engine předpokládá tištěné znaky.
Aktivace režimu pro ručně psaný text přepne přepínač, který řekne Tesseractu, aby použil svůj LSTM model trénovaný na kurzivní tahy. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Co se stane, když to přeskočíte?** OCR bude považovat tahy za šum a výstup bude nečitelný. Aktivace tohoto příznaku je jádrem **jak rozpoznat ručně psaný text**. + +## Krok 3: Načtení vaší ručně psané fotografie + +Nyní nasměrujeme engine na soubor s obrázkem. Cesta může být absolutní i relativní; jen se ujistěte, že soubor existuje. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Rychlá kontrola: otevřete obrázek ve výchozím prohlížeči systému. Pokud je text rozmazaný, zvažte předzpracování (zvýšení kontrastu, otočení) před předáním enginu – tyto úpravy často zvyšují úspěšnost **extrahování textu z ručně psaného obrázku**. + +## Krok 4: Spuštění rozpoznávacího procesu + +Vše je připraveno, a tak nakonec požádáme engine, aby odvedl svou práci. Metoda `recognize()` vrací objekt výsledku, který obsahuje extrahovaný řetězec a skóre důvěry. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Za scénou engine převádí bitmapu na sérii feature vektorů, projde je LSTM sítí a spojí znaky dohromady. To je kouzlo **python ocr handwritten recognition**. + +## Krok 5: Zobrazení extrahovaného textu + +Objekt výsledku má atribut `.text`, který obsahuje prostý Unicode řetězec. Vytiskněte jej, zapište do souboru nebo pošlete do dalšího pipeline – jak chcete. 
+ +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Očekávaný výstup + +Pokud zdrojový obrázek obsahuje poznámku „Buy milk, eggs, and bread“, uvidíte něco jako: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Všimněte si, že výstup zachovává interpunkci a zalomení řádků (pokud jsou). Pokud dostanete nesmyslný text, zkontrolujte kvalitu obrázku a příznak `enable_handwritten_recognition`. + +## Řešení běžných problémů + +| Problém | Příznak | Řešení | +|---------|---------|--------| +| Nízké skóre důvěry | Mnoho „?“ nebo nesmyslných znaků | Zvyšte DPI obrázku na ≥300, aplikujte binarizaci (`opencv`) nebo ořízněte oblast zájmu. | +| Smíšené jazyky | Výstup kombinuje angličtinu s jiným skriptem | Nastavte `engine.ocr_settings.language = "eng"` (nebo jiný ISO kód) před voláním `recognize()`. | +| Velké soubory | Dlouhá doba zpracování nebo chyba paměti | Zmenšete rozměry obrázku na rozumnou velikost (např. max. šířka 1200 px) před načtením. | +| Chybějící Tesseract | `ImportError` nebo `FileNotFoundError` | Nainstalujte Tesseract samostatně a ujistěte se, že je v systémové PATH. | + +Tyto úpravy udrží váš workflow **convert handwritten photo to text** robustní napříč různými datovými sadami. + +## Kompletní skript, který můžete spustit dnes + +Níže je kompletní, samostatný program, který spojuje všechny části dohromady. Zkopírujte jej do souboru pojmenovaného `handwritten_ocr.py` a spusťte `python handwritten_ocr.py`. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. 
+""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Spusťte jej a uvidíte text vytištěný v konzoli – přesně výsledek, který potřebujete, když chcete **convert handwritten photo to text**. + +## Kam dál + +Nyní, když ovládáte základy **extrahování textu z ručně psaného obrázku**, zvažte následující kroky: + +- **Dávkové zpracování:** Procházejte složku obrázků a ukládejte každý výsledek do CSV souboru. +- **Post‑zpracování:** Použijte regulární výrazy k vyčištění častých OCR chyb (např. „1“ vs „l“). +- **Integrace:** Vložte extrahované řetězce do pipeline zpracování přirozeného jazyka pro analýzu sentimentu nebo extrakci klíčových slov. +- **Alternativní knihovny:** Pokud potřebujete vyšší přesnost, prozkoumejte `easyocr` nebo `pytesseract` s vlastním LSTM modelem – oba podporují **python ocr handwritten recognition**. + +Pamatujte, že kvalita vstupního obrázku často určuje úspěch, takže věnujte pár minut předzpracování. Malé úsilí nyní ušetří spoustu ladění později. + +## Závěr + +Prošli jsme kompletním, end‑to‑end příkladem, který ukazuje **jak rozpoznat ručně psaný text** a, co je ještě důležitější, **extrahovat text z ručně psaného obrázku** pomocí Pythonu. Instalací balíčku `ocr`, zapnutím příznaku pro ručně psaný text, načtením obrázku a voláním `recognize()` můžete **convert handwritten photo to text** během několika řádků kódu. + +Vyzkoušejte to na svých poznámkách, dolaďte předzpracování a nechte OCR udělat těžkou práci. 
Pokud narazíte na potíže, podívejte se znovu na tabulku „Řešení běžných problémů“ nebo experimentujte s alternativními OCR back‑endy. Šťastné kódování a ať se vaše ručně psaná data okamžitě dají prohledávat! + +## Co byste se měli naučit dál? + +Následující tutoriály pokrývají úzce související témata, která staví na technikách předvedených v tomto průvodci. Každý zdroj obsahuje kompletní funkční ukázky kódu s podrobnými vysvětleními, aby vám pomohl zvládnout další funkce API a prozkoumat alternativní implementační přístupy ve vašich projektech. + +- [Extrahování textu z obrázku pomocí Aspose OCR – krok za krokem](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Jak připravit obdélníky stránky pro OCR rozpoznávání textu v Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [Jak použít OCR – rozpoznat obrázek bez detekce textové oblasti](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/czech/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/czech/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..b3c04b9ae --- /dev/null +++ b/ocr/czech/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,241 @@ +--- +category: general +date: 2026-06-06 +description: 'Průvodce znakovou sadou OCR: naučte se, jak použít OCR engine k výpisu + podporovaných znaků pro latinské a cyrilské skripty s kompletními příklady v Pythonu.' 
+draft: false
+keywords:
+- ocr character set
+- use OCR engine
+- supported OCR characters
+- script detection OCR
+- OCR library Python
+language: cs
+og_description: 'OCR character set guide: discover how to use the OCR engine in Python
+  to list supported characters for different scripts, with code and tips.'
+og_title: OCR Character Set – Retrieve and Use the OCR Engine
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: 'OCR character set guide: learn how to use OCR engine to list supported
+    characters for Latin and Cyrillic scripts with full Python examples.'
+  headline: OCR Character Set – Retrieve and Use OCR Engine in Python
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Character Sets
+title: OCR Character Set – Retrieve and Use OCR Engine in Python
+url: /cs/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# OCR Character Set – Retrieve and Use OCR Engine in Python
+
+Want to know which **ocr character set** your library supports? In this tutorial we show how to **use OCR engine** to query the supported characters for different scripts, and why that matters in real projects.
+
+If you have ever stared at a blank screen wondering why certain letters never appeared after OCR processing, you are not alone. The answer is often hidden in the set of characters the engine knows. By the end of this guide you will be able to:
+
+* List every character the OCR engine can recognize for a given script.
+* Handle cases where a script is not supported.
+* Feed character set information into downstream validation or UI logic.
+
+No black-box magic: just plain Python, a few lines of code, and a bit of explanation.
+
+---
+
+## Prerequisites
+
+Before going further, make sure you have:
+
+* Python 3.8+ installed (the code uses f-strings).
+* An OCR library that exposes `ocr.OcrEngine`; for this example we assume a package imported as `ocr` has already been installed with `pip install ocr-lib`.
+* Basic familiarity with Python imports and printing.
+
+If the library is missing, run:
+
+```bash
+pip install ocr-lib
+```
+
+That is all: no extra binaries or native dependencies are needed for the basic character set queries we will run.
+
+## Step 1: Initialize an OCR Engine Instance
+
+Creating the engine object is the first thing you do when you **use OCR engine** functionality. Think of it as switching on a scanner; without it you cannot ask any questions.
+
+```python
+import ocr
+
+# Step 1: Create an OCR engine instance
+engine = ocr.OcrEngine()
+```
+
+The `OcrEngine` class is a thin wrapper around the underlying recognition engine. It does not start heavy processing until you ask for something, so the startup cost is negligible.
+
+> **Pro tip:** If your application creates many engine instances, reuse a single one. It saves memory and reduces initialization latency.
+
+## Step 2: Query the OCR Character Set for a Specific Script
+
+Now that we have an engine, we can ask which characters it knows for a particular writing system. The `get_supported_characters(script_name)` method returns a Python `list` of Unicode characters.
+
+```python
+# Step 2: Get the list of characters the engine can recognize for the Latin script
+latin_chars = engine.get_supported_characters("Latin")
+print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}")
+```
+
+Running the snippet above prints something like:
+
+```
+Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+```
+
+The exact output depends on the OCR library version, but you always get a flat list of characters.
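Because the result is a plain list, ordinary Python is enough to inspect it. The following is a minimal sketch that assumes a hard-coded stand-in list (`sample_chars` is hypothetical and does not come from a real engine call). It counts the entries by Unicode general category with the standard `unicodedata` module and builds a `frozenset` for fast membership checks.

```python
import unicodedata
from collections import Counter

# Hypothetical stand-in for the list returned by the engine in Step 2.
sample_chars = ['A', 'B', 'C', 'a', 'b', '1', '2', '!']

def summarize_by_category(chars):
    """Count characters per Unicode general category (e.g. 'Lu', 'Ll', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in chars)

summary = summarize_by_category(sample_chars)
allowed = frozenset(sample_chars)  # constant-time membership checks

print(dict(summary))
print('B' in allowed, 'Z' in allowed)
```

A summary like this makes it easy to spot at a glance whether a script's list covers lowercase letters, digits, or punctuation at all.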
+If you need both upper- and lower-case letters, request the `"Latin"` script and apply `.lower()` to each element, or query a separate `"Latin‑lower"` script if the library distinguishes between them.
+
+### Why does this matter?
+
+When you pass in an image containing unusual diacritics (e.g. "ñ" or "ø"), the OCR engine may silently replace them with a placeholder character or drop them entirely. Knowing the **ocr character set** up front lets you pre-validate input, warn users, or fall back to another engine.
+
+## Step 3: Get Characters for Another Script – Cyrillic Example
+
+The same method works for any script the engine claims to support. Let's see what happens with Cyrillic.
+
+```python
+# Step 3: Get the list of characters the engine can recognize for the Cyrillic script
+cyrillic_chars = engine.get_supported_characters("Cyrillic")
+print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}")
+```
+
+Typical output:
+
+```
+Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я']
+```
+
+If the engine does **not** support the requested script, it usually raises a `ValueError` (or returns an empty list, depending on the implementation). Let's guard against that.
+
+```python
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+# Example usage:
+safe_get_characters(engine, "Arabic")  # Might trigger the error branch
+```
+
+## Step 4: Visualize the OCR Character Set (Optional)
+
+A quick visualization sometimes helps, especially when you need to show stakeholders which characters are covered.
+Below is a minimal example that uses `matplotlib` to render a grid of characters.
+
+```python
+import matplotlib.pyplot as plt
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Visualize Latin characters
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+> **Image alt text:** ![OCR character set diagram](ocr_character_set.png){alt="OCR character set overview"}
+
+The diagram is not required for the core solution, but it shows how you can put **use OCR engine** metadata to work beyond a plain listing.
+
+## Step 5: Integrate the Character Set into Real Workflows
+
+Now that we can retrieve the **ocr character set**, let's look at a practical scenario: validating OCR output before storing it in a database.
+
+```python
+def validate_ocr_output(text, allowed_chars):
+    """Return True if every character in `text` exists in `allowed_chars`."""
+    invalid = [c for c in text if c not in allowed_chars]
+    if invalid:
+        print(f"Invalid characters found: {invalid}")
+        return False
+    return True
+
+# Suppose we ran OCR on an image and got this result:
+ocr_result = "HELLO WORLD!"
+
+# Use the previously fetched Latin set for validation.
+# Note: a space and "!" are not in the 26-letter sample list from Step 2,
+# so with that list this falls through to the manual-review branch.
+if validate_ocr_output(ocr_result, set(latin_chars)):
+    print("All characters are supported – safe to store.")
+else:
+    print("Result contains unsupported symbols – consider manual review.")
+```
+
+This pattern keeps bad data out of downstream pipelines, a common problem when working with multilingual documents.
+
+## Common Pitfalls and Edge Cases
+
+| Pitfall | Why it Happens | How to Avoid |
+|---------|----------------|--------------|
+| **Typo in the script name** (e.g.
`"Cyrillic "` with a trailing space) | The engine interprets the string literally and cannot find the script. | Strip whitespace with `script.strip()` before calling `get_supported_characters`. |
+| **Empty character list** | Some engines expose a script but have not yet loaded its language model. | Call `engine.load_language_model(script)` if the library provides such a method, or make sure the model files are present. |
+| **Unicode normalization issues** | Characters like "é" can appear precomposed (`\u00E9`) or decomposed (`e\u0301`). | Normalize strings with `unicodedata.normalize('NFC', text)` before validation. |
+| **Performance with large scripts** (e.g. Chinese) | Fetching thousands of characters can be slow. | Cache the result after the first call; most applications need the list only once. |
+
+## Complete Working Example
+
+Putting it all together, here is a single script you can copy and run:
+
+```python
+import ocr
+import matplotlib.pyplot as plt
+
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+```
+
+## What Should You Learn Next?
+
+The following tutorials cover closely related topics that build on the techniques shown in this guide.
+Each resource includes complete working code samples with detailed explanations to help you master additional API features and explore alternative implementation approaches in your projects.
+
+- [Aspose OCR tutorial – Optical character recognition](/ocr/english/)
+- [OCR post processing – Get choices for recognized characters](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/)
+- [How to set the threshold value in OCR image recognition](/ocr/english/net/ocr-settings/set-threshold-value/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/czech/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/czech/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
new file mode 100644
index 000000000..d30460edd
--- /dev/null
+++ b/ocr/czech/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
@@ -0,0 +1,237 @@
+---
+category: general
+date: 2026-06-06
+description: Perform OCR on an image using Aspose OCR and a Hugging Face model. Learn
+  how to download the Hugging Face model, extract text from an invoice, and free GPU resources.
+draft: false
+keywords:
+- perform OCR on image
+- download Hugging Face model
+- extract text from invoice
+- free GPU resources
+- load image for OCR
+language: cs
+og_description: Perform OCR on an image using Aspose OCR and a Hugging Face model. This
+  tutorial shows how to download the model, extract text from an invoice, and free GPU
+  resources.
+og_title: Perform OCR on Image with Aspose OCR and LLM – Complete Guide
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn
+    how to download Hugging Face model, extract text from invoice, and free GPU resources.
+  headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+  type: TechArticle
+tags:
+- OCR
+- Aspose
+- Python
+- AI
+- HuggingFace
+title: Perform OCR on Image with Aspose OCR and LLM – Complete Guide
+url: /cs/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+
+Ever wanted to **perform OCR on image** files but got stuck on "where do I start?" You are not alone: many developers hit this wall the first time they tackle document automation. The good news is that with Aspose OCR and a lightweight LLM from Hugging Face you can turn a raw invoice scan into clean, searchable text in a few lines of Python.
+
+In this tutorial we cover everything you need: from **loading image for OCR**, through **download the Hugging Face model**, to **extract text from invoice** data, and finally **free GPU resources** so your application stays lean. By the end you will have a standalone script you can drop into any project.
+
+---
+
+## What You’ll Learn
+
+- How to **perform OCR on image** with the Aspose `OcrEngine`.
+- The exact steps to **download Hugging Face model** files automatically.
+- Techniques to **extract text from invoice** PDFs or PNGs with AI‑enhanced post‑processing.
+- Best practices to **free GPU resources** after inference.
+- Tips to **load image for OCR** efficiently and avoid common pitfalls.
+
+No external documentation is needed: everything you need is here, with complete code, explanations, and expected output.
+
+## Prerequisites
+
+Before we dive in, make sure you have:
+
+| Requirement | Reason |
+|-------------|--------|
+| Python 3.9+ | Modern syntax and type hints |
+| `asposeocr` package (`pip install asposeocr`) | Core OCR engine |
+| Access to a GPU (optional but recommended) | Speeds up the LLM post‑processor |
+| An invoice image (`sample_invoice.png`) | A real-world test case |
+
+If anything is missing, install it now; the script will also **download Hugging Face model** files automatically, so you will not have to hunt for them yourself.
+
+## Step 1: Perform OCR on Image – Create the Engine
+
+The first step is to start the Aspose OCR engine. Think of it as opening a blank canvas on which the image will later be turned into text.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# Step 1: Initialize the OCR engine
+engine = ocr.OcrEngine()
+```
+
+> **Why it matters:** `OcrEngine` abstracts all the low-level image preprocessing so you can focus on the higher-level workflow. It also provides the `set_post_processor` method, which later lets us attach an LLM for smarter output.
+
+## Step 2: Load Image for OCR – Choose the Right File
+
+Now that the engine exists, we need to **load image for OCR**. Aspose supports PNG, JPG, TIFF, and several other formats. Make sure the path is absolute, or relative to the working directory you run the script from.
+
+```python
+# Step 2: Load the image you want to process
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+```
+
+> **Tip:** If your image is large, consider downscaling it first to reduce memory pressure. The OCR engine can handle high-resolution scans, but a 300 DPI image is usually optimal for invoices.
+
+## Step 3: Perform Raw OCR and View the Extracted Text
+
+With the image loaded, we can finally **perform OCR on image** and see what the raw engine returns. This step gives us a baseline before adding the AI magic.
+
+```python
+# Step 3: Run the built‑in OCR and capture raw results
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+```
+
+**Expected output (truncated):**
+
+```
+Raw text:
+Invoice #12345
+Date: 2024‑04‑01
+Total: $1,250.00
+...
+```
+
+Raw output often contains stray line breaks, misrecognized characters, or missing fields, which is exactly why we apply a language model next.
+
+## Step 4: Download Hugging Face Model – Configure the LLM Post‑Processor
+
+This is where **download Hugging Face model** shines. Aspose AI can automatically fetch the model from the Hugging Face hub if it is not already on disk. We will use the Qwen2.5‑3B‑Instruct‑GGUF model, which balances accuracy and memory footprint.
+
+```python
+# Step 4: Set up the AI model configuration
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",  # automatically fetch model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",  # int8 quantization cuts RAM usage dramatically
+    gpu_layers=20,  # first 20 layers run on GPU, rest on CPU
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+```
+
+> **Why this works:** `allow_auto_download` saves you from manually downloading the `.gguf` file. Quantization (`int8`) reduces the model size to roughly 3 GB, which makes it feasible on most consumer GPUs. Set `gpu_layers` to match your hardware: more layers on the GPU means faster inference.
+
+## Step 5: Extract Text from Invoice Using AI‑Enhanced Post‑Processing
+
+Now we attach the LLM to the OCR engine and run the **post‑processor**, which cleans up the raw output, fixes OCR errors, and neatly formats the invoice fields.
+
+```python
+# Step 5: Initialize the AI engine and bind it to the OCR engine
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)  # implicit when using set_post_processor in newer versions
+engine.set_post_processor(ai_engine)
+
+# Run the AI‑enhanced post‑processor
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+```
+
+**Sample enhanced output:**
+
+```
+Enhanced text:
+Invoice Number: 12345
+Date: 2024-04-01
+Bill To: Acme Corp.
+Amount Due: $1,250.00
+Status: Paid
+```
+
+> **What happened?** The LLM recognized that "Invoice #12345" should be "Invoice Number: 12345", fixed the date format, and even inferred the "Bill To" field that the raw engine missed. This is the core of **extract text from invoice** automation.
+
+## Step 6: Free GPU Resources – Clean Up After Processing
+
+If you run this in a long-lived service (e.g. a Flask API), you must **free GPU resources** after each inference to avoid out-of-memory failures. Aspose AI provides a simple method for that.
+
+```python
+# Step 6: Release GPU memory and dispose of the engine
+ai_engine.free_resources()  # frees GPU layers and model tensors
+engine.dispose()  # cleans up internal buffers
+```
+
+> **Pro tip:** Call `free_resources()` inside a `finally:` block if you wrap the OCR calls in try/except. That guarantees cleanup even when an exception occurs.
+
+## Step 7: Full Script – Put It All Together
+
+Below is the complete, ready-to-run script. Copy it, adjust the paths, and run it.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Create OCR engine
+engine = ocr.OcrEngine()
+
+# 2️⃣ Load image for OCR
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+
+# 3️⃣ Raw OCR
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+
+# 4️⃣ Download Hugging Face model (auto‑download enabled)
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+
+# 5️⃣ Attach AI post‑processor and enhance text
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)  # optional in newer versions
+engine.set_post_processor(ai_engine)
+
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+
+# 6️⃣ Free GPU resources
+ai_engine.free_resources()
+engine.dispose()
+```
+
+Run the script and watch the transformation from noisy OCR to clean, structured invoice data. 🎉
+
+## Common Questions & Edge Cases
+
+| Question | Answer |
+|----------|--------|
+| **What if the model fails to download?** | Ensure your machine has internet access and that the `hugging_face_repo_id` is correct. You can also manually download the |
+
+## What Should You Learn Next?
+
+The following tutorials cover closely related topics that build on the techniques shown in this guide. Each resource includes complete working code samples with detailed explanations to help you master additional API features and explore alternative implementation approaches in your projects.
+
+- [Extract text from an image in C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/)
+- [Extract text from an image – OCR optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/)
+- [Convert image to text – perform OCR on an image from a URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/czech/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/czech/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
new file mode 100644
index 000000000..b77176e83
--- /dev/null
+++ b/ocr/czech/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
@@ -0,0 +1,282 @@
+---
+category: general
+date: 2026-06-06
+description: Recognize text from an image with Aspose OCR – learn how to load an
+  image for OCR and perform OCR on an image with AI post‑processing in Python.
+draft: false
+keywords:
+- recognize text from image
+- perform ocr on image
+- load image for ocr
+language: cs
+og_description: Quickly recognize text from an image. This guide shows how to load
+  an image for OCR, perform OCR on the image, and improve the results with AI post‑processing.
+og_title: recognize text from image using Aspose OCR and AI
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: recognize text from image with Aspose OCR – learn how to load image
+    for OCR and perform OCR on image using AI post‑processing in Python.
+  headline: recognize text from image using Aspose OCR & AI
+  type: TechArticle
+- description: recognize text from image with Aspose OCR – learn how to load image
+    for OCR and perform OCR on image using AI post‑processing in Python.
+  name: recognize text from image using Aspose OCR & AI
+  steps:
+  - name: Prerequisites
+    text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) -
+      An image file (e.g., `doc.png`) that contains printed or handwritten text -
+      Optional: a GPU for faster LLM inference (the script works on CPU too)'
+  - name: Install and import the required modules
+    text: '```python # Install the library (run once in your environment) # pip install
+      asposeocr'
+  - name: Create the OCR engine and enable handwritten text recognition
+    text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()'
+  - name: Load the image for OCR
+    text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png")
+      ```'
+  - name: Run the raw OCR pass
+    text: '```python # Execute the recognition pipeline and capture the raw result
+      raw_result = ocr_engine.recognize() ```'
+  - name: Configure the Aspose AI model for LLM post‑processing
+    text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true",
+      # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+      hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20
+      GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo'
+  - name: Initialise the AI processor and attach it as a post‑processor
+    text: '```python # Instantiate the AI processor with the configuration above ai_processor
+      = AsposeAI() ai_processor.initialize(ai_config)'
+  - name: Run the post‑processor to enhance the OCR output
+    text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)'
+  - name: Release resources
+    text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()'
+  type: HowTo
+- questions:
+  - answer: No. The script falls back to CPU, but inference will be slower.
+    question: Do I need a GPU?
+  - answer: Absolutely, just change `hugging_face_repo_id` and adjust `gpu_layers`
+      accordingly.
+    question: Can I use a different LLM?
+  - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable.
+    question: What if my image is huge?
+  - answer: You can toggle `enable_handwritten_recognition` depending on your workload.
+    question: Is handwritten recognition always on?
+  type: FAQPage
+tags:
+- OCR
+- Aspose
+- Python
+title: recognize text from image using Aspose OCR and AI
+url: /cs/python/general/recognize-text-from-image-using-aspose-ocr-ai/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Recognize Text from Image Using Aspose OCR & AI
+
+Ever needed to recognize text from an image but were not sure which library would give you both speed and accuracy? You are not alone. In this guide we walk through a complete, end-to-end example that shows **how to load image for OCR**, **perform OCR on image**, and then improve the output with the Aspose AI post-processor. By the end you will have a ready-to-run script that converts a PNG into clean, searchable text.
+
+## What You Will Learn
+
+We cover everything from installing the Aspose OCR package to releasing resources at the end of the run. You will see why it makes sense to enable handwritten text recognition, how to configure the Qwen 2.5 LLM for post-processing, and what the final output looks like. No external links are needed: just copy, paste, and run.
+
+### Prerequisites
+
+- Python 3.8 or newer
+- The `asposeocr` package (`pip install asposeocr`)
+- An image file (e.g. `doc.png`) containing printed or handwritten text
+- Optional: a GPU for faster LLM inference (the script also works on CPU)
+
+---
+
+## Recognize Text from Image – Step by Step
+
+Below each code block you will find a short explanation of **why** we perform the action, not just **what** the line does.
+
+### Step 1: Install and Import the Required Modules
+
+```python
+# Install the library (run once in your environment)
+# pip install asposeocr
+
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+```
+
+*Why?* Importing `asposeocr` gives us the `OcrEngine` class, while the `ai` submodule provides the LLM-based post-processor that dramatically improves the raw OCR output.
+
+### Step 2: Create the OCR Engine and Enable Handwritten Text Recognition
+
+```python
+# Initialise the OCR engine
+ocr_engine = ocr.OcrEngine()
+
+# Turn on handwritten‑text support – useful for notes, signatures, etc.
+ocr_engine.ocr_settings.enable_handwritten_recognition = True
+```
+
+Enabling handwritten recognition widens the engine's character set, so you do not lose scribbles when you **perform OCR on image** files that mix printed and cursive text.
+
+### Step 3: Load the Image for OCR
+
+```python
+# Point the engine at the file you want to process
+ocr_engine.load_image("YOUR_DIRECTORY/doc.png")
+```
+
+The `load_image` call is the moment you **load image for OCR**; if the path is wrong, the engine raises an informative exception, which protects you from obscure errors in the next step.
+
+### Step 4: Run the Raw OCR Pass
+
+```python
+# Execute the recognition pipeline and capture the raw result
+raw_result = ocr_engine.recognize()
+```
+
+At this stage you get a `RecognitionResult` object that holds the unfiltered text, confidence scores, and layout metadata. It is often noisy, which is why the AI-driven cleanup is needed.
+
+### Step 5: Configure the Aspose AI Model for LLM Post‑Processing
+
+```python
+ai_config = AsposeAIModelConfig(
+    allow_auto_download="true",  # Pull the model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",  # Reduce RAM usage
+    gpu_layers=20,  # Use 20 GPU layers if a GPU is present
+    directory_model_path="YOUR_DIRECTORY/models/ocr_ai"
+)
+```
+
+Why bother with these settings?
+- **auto‑download** ensures the model is available on the first run.
+- **int8 quantization** significantly reduces memory requirements without much loss of accuracy.
+- **gpu_layers** lets you use a compatible GPU for faster inference.
+
+### Step 6: Initialize the AI Processor and Attach It as a Post‑Processor
+
+```python
+# Instantiate the AI processor with the configuration above
+ai_processor = AsposeAI()
+ai_processor.initialize(ai_config)
+
+# Hook the AI processor into the OCR engine
+ocr_engine.set_post_processor(ai_processor, None)
+```
+
+Attaching the processor means that every time you call `run_postprocessor`, the LLM cleans up spelling, joins split words, and even fills in missing punctuation.
+
+### Step 7: Run the Post‑Processor to Enhance the OCR Output
+
+```python
+# Apply AI‑driven post‑processing
+enhanced_result = ocr_engine.run_postprocessor(raw_result)
+
+# Print the polished text
+print(enhanced_result.text)
+```
+
+`enhanced_result.text` is usually much more readable than the raw string; think of it as a spell checker that also understands context.
+
+### Step 8: Release Resources
+
+```python
+# Free GPU/CPU memory held by the AI model
+ai_processor.free_resources()
+
+# Dispose of the OCR engine to close file handles, etc.
+ocr_engine.dispose()
+```
+
+Cleanup is critical in long-running services; otherwise you leak GPU memory and eventually crash the application.
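
The cleanup order from Step 8 can be made robust against exceptions with `try`/`finally`. The sketch below uses two hypothetical stand-in classes (`StubAI` and `StubEngine`, which are not part of any OCR library) purely to show the control flow; in a real service you would pass in the actual engine and AI processor objects instead.

```python
# Hypothetical stand-ins; they only record which cleanup calls ran.
calls = []

class StubAI:
    def free_resources(self):
        calls.append("free_resources")

class StubEngine:
    def recognize(self):
        # Simulate a failure in the middle of the OCR pass.
        raise RuntimeError("simulated OCR failure")

    def dispose(self):
        calls.append("dispose")

def run_once(engine, ai):
    """Run one recognition pass and always release resources afterwards."""
    try:
        return engine.recognize()
    finally:
        # Same order as Step 8: model memory first, then the engine itself.
        ai.free_resources()
        engine.dispose()

try:
    run_once(StubEngine(), StubAI())
except RuntimeError as exc:
    print(f"Recognition failed: {exc}")

print(calls)
```

Even though `recognize()` raises here, both cleanup calls are recorded, which is the behavior a long-running service depends on.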
+
+---
+
+## Complete Script You Can Run Today
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Initialise OCR engine with handwritten support
+ocr_engine = ocr.OcrEngine()
+ocr_engine.ocr_settings.enable_handwritten_recognition = True
+
+# 2️⃣ Load the image you want to analyse
+ocr_engine.load_image("YOUR_DIRECTORY/doc.png")  # <-- replace with your path
+
+# 3️⃣ Perform the basic OCR pass
+raw_result = ocr_engine.recognize()
+
+# 4️⃣ Set up the AI model for smarter output
+ai_config = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/models/ocr_ai"
+)
+
+# 5️⃣ Initialise and attach the AI post‑processor
+ai_processor = AsposeAI()
+ai_processor.initialize(ai_config)
+ocr_engine.set_post_processor(ai_processor, None)
+
+# 6️⃣ Enhance the OCR result with the LLM
+enhanced_result = ocr_engine.run_postprocessor(raw_result)
+
+# 7️⃣ Show the final, cleaned‑up text
+print("=== Enhanced OCR Output ===")
+print(enhanced_result.text)
+
+# 8️⃣ Clean up
+ai_processor.free_resources()
+ocr_engine.dispose()
+```
+
+**Expected output** (example for a simple invoice image):
+
+```
+=== Enhanced OCR Output ===
+Invoice #12345
+Date: 2024‑04‑01
+Total Amount: $1,250.00
+Thank you for your business!
+```
+
+Notice how the AI layer corrected "Inv0ice" → "Invoice" and added the missing punctuation.
+
+---
+
+## Frequently Asked Questions (and Quick Answers)
+
+- **Do I need a GPU?** No. The script falls back to CPU, but inference will be slower.
+- **Can I use a different LLM?** Absolutely, just change `hugging_face_repo_id` and adjust `gpu_layers` accordingly.
+- **What if my image is huge?** Resize it first (e.g. with Pillow) to keep memory usage reasonable.
+- **Is handwritten recognition always on?** You can toggle `enable_handwritten_recognition` depending on your workload.
+
+---
+
+## Conclusion
+
+You now know how to **recognize text from image** with Aspose OCR, how to **load image for OCR**, and how to **perform OCR on image** with AI-enhanced post-processing. The complete, runnable example above gives you a solid foundation for integrating OCR into any Python project, whether you are scanning receipts, digitizing contracts, or extracting data from handwritten forms.
+
+Ready for the next step? Try swapping the Qwen model for a larger one, experiment with different quantization schemes, or chain several images together for batch processing. The code you just wrote is a reasonable starting point for all of these.
+
+Happy coding, and may your OCR results always be crystal clear!
+
+![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="Screenshot showing how to recognize text from image using Aspose OCR"}
+
+## What Should You Learn Next?
+
+The following tutorials cover closely related topics that build on the techniques shown in this guide. Each resource includes complete working code samples with detailed explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+ +- [Extrahovat text z obrázku pomocí Aspose OCR – průvodce krok za krokem](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Převést obrázek na text – provést OCR na obrázku z URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extrahovat text z obrázku v C# s výběrem jazyka pomocí Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/dutch/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/dutch/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..bd3536b6d --- /dev/null +++ b/ocr/dutch/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,216 @@ +--- +category: general +date: 2026-06-06 +description: Extraheer tekst uit een handgeschreven afbeelding met Python OCR. Leer + hoe je een handgeschreven foto snel en betrouwbaar naar tekst kunt omzetten. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: nl +og_description: Tekst extraheren uit handgeschreven afbeelding met Python. Deze gids + laat zien hoe je een handgeschreven foto naar tekst converteert en beantwoordt hoe + je handgeschreven tekst kunt herkennen. +og_title: Tekst extraheren uit handgeschreven afbeelding – Python OCR-tutorial +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Tekst extraheren uit handgeschreven afbeelding met Python OCR – Stapsgewijze + handleiding +url: /nl/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Extract Text from a Handwritten Image with Python OCR – Step‑by‑Step Guide + +Have you ever wondered **how to recognize handwritten text** in a photo you took with your phone? You are not the only one. Many projects, from digitizing lecture notes to pulling data from signed forms, need to **extract text from a handwritten image** quickly and without hassle. + +In this tutorial we walk through a complete, ready‑to‑run example that shows exactly how to **convert a handwritten photo to text** with a popular Python OCR library. No vague references, just concrete code, explanations, and tips you can copy and paste today. + +![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image") + +## What you need + +Before we start, make sure you have the following: + +| Requirement | Why it matters | +|----------|--------------------------| +| Python 3.9 or newer | Modern syntax and library support | +| `pip` (Python package manager) | To install the OCR package | +| A clear image of handwritten notes (JPEG/PNG) | OCR accuracy drops on blurry images | +| Basic knowledge of Python functions | Helps you adapt the example later | + +If any of these is missing, start by downloading the latest Python from the official Python website and installing it.
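The requirements in the table above can be checked before any OCR code runs. The following is a minimal sketch that uses only the standard library; the helper name `check_prerequisites` and the accepted-extension set are illustrative assumptions and are not part of any OCR package.

```python
import sys
from pathlib import Path

# Assumption: the table above asks for Python 3.9+ and JPEG/PNG input.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png"}
MIN_PYTHON = (3, 9)


def check_prerequisites(image_path):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        found = sys.version.split()[0]
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {found}"
        )
    path = Path(image_path)
    # Compare the extension case-insensitively so "NOTE.PNG" is accepted.
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"Unsupported image type: {path.suffix or '(none)'}")
    if not path.is_file():
        problems.append(f"Image not found: {path}")
    return problems


if __name__ == "__main__":
    # Hypothetical path, matching the placeholder used later in the tutorial.
    for issue in check_prerequisites("YOUR_DIRECTORY/handwritten_note.jpg"):
        print("WARNING:", issue)
```

Returning a list of problems instead of raising on the first one lets a script report everything that needs fixing in a single run.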
+ +## Installeer de Python OCR‑bibliotheek + +De code‑snippet die we gaan gebruiken maakt gebruik van het `ocr`‑pakket (een lichte wrapper rond Tesseract die handige instellingen toevoegt). Installeer het met één commando: + +```bash +pip install ocr +``` + +> **Pro tip:** Na de installatie, voer `tesseract --version` uit in je terminal om te bevestigen dat de onderliggende engine aanwezig is. Als je Tesseract niet hebt, volg dan de officiële gids voor je besturingssysteem—de meeste pakketbeheerders hebben het (`apt-get install tesseract-ocr` op Ubuntu, `brew install tesseract` op macOS). + +## Stap 1: Maak een OCR‑engine‑instantie + +Het maken van de engine is de eerste steen in de muur van **python ocr handwritten recognition**. Beschouw de engine als het brein dat later de krabbels zal lezen. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Waarom dit belangrijk is: zonder een engine kun je de herkennings‑pipeline niet aanpassen. De standaardinstellingen zijn afgestemd op gedrukte tekst, dus we moeten ze in de volgende stap aanpassen. + +## Stap 2: Schakel handgeschreven herkenning in + +Standaard gaat de engine uit van gedrukte tekens. Het inschakelen van de handgeschreven modus zet een schakelaar die Tesseract vertelt zijn LSTM‑model te gebruiken dat getraind is op cursieve streken. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Wat als je dit overslaat?** De OCR zal de streken als ruis behandelen, wat leidt tot onsamenhangende output. Het inschakelen van de vlag is de kern van **hoe je handgeschreven tekst herkent**. + +## Stap 3: Laad je handgeschreven foto + +Nu wijzen we de engine op het afbeeldingsbestand. Het pad kan absoluut of relatief zijn; zorg er gewoon voor dat het bestand bestaat. 
+ +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Een snelle controle: open de afbeelding in de viewer van je besturingssysteem. Als de tekst onscherp is, overweeg dan pre‑processing (contrastverhoging, rotatie) voordat je het aan de engine geeft—die aanpassingen verhogen vaak het succespercentage van **tekst extraheren uit handgeschreven afbeelding**. + +## Stap 4: Voer het herkenningsproces uit + +Met alles ingesteld, vragen we de engine eindelijk om zijn werk te doen. De `recognize()`‑methode retourneert een resultaatobject dat de geëxtraheerde string en vertrouwensscores bevat. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Achter de schermen converteert de engine de bitmap naar een reeks feature‑vectoren, voert ze door het LSTM‑netwerk en zet de tekens aan elkaar. Dat is de magie van **python ocr handwritten recognition**. + +## Stap 5: Toon de geëxtraheerde tekst + +Het resultaatobject biedt een `.text`‑attribuut dat de platte Unicode‑string bevat. Print het, schrijf het naar een bestand, of voer het in een andere pipeline in—jij beslist. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Verwachte output + +Als de bronafbeelding de notitie “Buy milk, eggs, and bread” bevat, zie je iets als: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Let op hoe de output interpunctie en regeleinden behoudt (indien aanwezig). Als je onzin krijgt, controleer dan de beeldkwaliteit en de `enable_handwritten_recognition`‑vlag. + +## Veelvoorkomende valkuilen behandelen + +| Probleem | Symptoom | Oplossing | +|----------|----------|-----------| +| Lage vertrouwensscores | Veel “?” of onsamenhangende tekens | Verhoog de beeld‑DPI tot ≥300, pas binarisatie toe (`opencv`), of snijd bij tot het interessegebied. 
| +| Gemengde talen | Output mengt Engels met een ander script | Stel `engine.ocr_settings.language = "eng"` (of een andere ISO‑code) in vóór `recognize()`. | +| Grote bestanden | Lange verwerkingstijd of geheugenfout | Verklein de afbeelding tot een redelijke dimensie (bijv. max breedte 1200 px) vóór het laden. | +| Ontbrekende Tesseract | `ImportError` of `FileNotFoundError` | Installeer Tesseract apart en zorg dat het in je systeem‑PATH staat. | + +Deze aanpassingen houden je **convert handwritten photo to text** workflow robuust over diverse datasets. + +## Volledig script dat je vandaag kunt uitvoeren + +Hieronder staat het volledige, zelfstandige programma dat alle onderdelen samenvoegt. Kopieer het naar een bestand genaamd `handwritten_ocr.py` en voer `python handwritten_ocr.py` uit. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Voer het uit, en je ziet de tekst in de console afgedrukt—precies het resultaat dat je nodig hebt wanneer je **convert handwritten photo to text** wilt. + +## Verder gaan + +Nu je de basis van **extract text from handwritten image** onder de knie hebt, overweeg deze volgende stappen: + +- **Batchverwerking:** Loop over een map met afbeeldingen en sla elk resultaat op in een CSV‑bestand. 
+- **Post‑processing:** Gebruik reguliere expressies om veelvoorkomende OCR‑fouten op te schonen (bijv. “1” vs “l”). +- **Integratie:** Voer de geëxtraheerde strings in een Natural Language Processing‑pipeline voor sentimentanalyse of trefwoordextractie. +- **Alternatieve bibliotheken:** Als je hogere nauwkeurigheid nodig hebt, verken `easyocr` of `pytesseract` met aangepaste LSTM‑modellen—beide ondersteunen ook **python ocr handwritten recognition**. + +Onthoud dat de kwaliteit van de bronafbeelding vaak het succes bepaalt, dus besteed een paar minuten aan pre‑processing. Een beetje extra moeite nu bespaart je later veel debuggen. + +## Conclusie + +We hebben een compleet, end‑to‑end voorbeeld doorlopen dat laat zien **hoe je handgeschreven tekst herkent** en, nog belangrijker, hoe je **tekst uit handgeschreven afbeelding** kunt extraheren met Python. Door het `ocr`‑pakket te installeren, de handgeschreven vlag in te schakelen, je afbeelding te laden en `recognize()` aan te roepen, kun je **convert handwritten photo to text** in slechts een handvol regels. + +Probeer het met je eigen notities, pas de pre‑processing stappen aan, en laat de OCR het zware werk doen. Als je tegen problemen aanloopt, bekijk dan opnieuw de tabel “Veelvoorkomende valkuilen behandelen” of experimenteer met alternatieve OCR‑back‑ends. Veel plezier met coderen, en moge je handgeschreven data direct doorzoekbaar worden! + +## Wat moet je hierna leren? + +De volgende tutorials behandelen nauw verwante onderwerpen die voortbouwen op de technieken die in deze gids worden getoond. Elke bron bevat volledige werkende code‑voorbeelden met stapsgewijze uitleg om je te helpen extra API‑functies onder de knie te krijgen en alternatieve implementatie‑benaderingen in je eigen projecten te verkennen. 
+ +- [Tekst extraheren uit afbeelding met Aspose OCR – Stapsgewijze gids](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Hoe pagina‑rechthoeken te herkennen voor OCR‑tekstherkenning in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [Hoe OCR te gebruiken – Afbeelding herkennen zonder detectie van tekstgebied](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/dutch/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/dutch/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..d0f40ff57 --- /dev/null +++ b/ocr/dutch/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,231 @@ +--- +category: general +date: 2026-06-06 +description: 'OCR-tekensetgids: leer hoe je de OCR-engine gebruikt om ondersteunde + tekens voor Latijnse en Cyrillische scripts te tonen met volledige Python‑voorbeelden.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: nl +og_description: 'OCR-tekensetgids: ontdek hoe je de OCR-engine in Python gebruikt + om ondersteunde tekens voor verschillende scripts weer te geven, compleet met code + en tips.' +og_title: OCR-tekenset – Ophalen en gebruiken van OCR-engine +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR-tekenset – Ophalen en gebruiken van OCR‑engine in Python +url: /nl/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR Character Set – Retrieve and Use the OCR Engine in Python + +Want to know the **ocr character set** your library supports? In this tutorial we show how to **use the OCR engine** to query the supported characters for different scripts, and why that matters for real‑world projects. + +If you have ever stared at a blank screen wondering why some letters never show up after OCR processing, you are not the only one. The answer is often hidden in the character set the engine knows. By the end of this guide you will be able to: + +* List every character an OCR engine can recognize for a given script. +* Handle cases where a script is not supported. +* Integrate the character‑set information into downstream validation or UI logic. + +No black‑box magic, just plain Python, a few lines of code, and a bit of explanation. + +--- + +## Prerequisites + +Before we start, make sure you have the following: + +* Python 3.8+ installed (the code uses f‑strings). +* The OCR library that provides `ocr.OcrEngine`. For this example we assume that a package exposing the `ocr` module has already been installed via `pip install ocr-lib`. +* Basic familiarity with Python's import system and with printing output. + +If the library is missing, run: + +```bash +pip install ocr-lib +``` + +That is all. No extra binaries or native dependencies are needed for the basic character‑set queries we are going to run.
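To confirm the prerequisite above before running the later steps, you can ask Python whether the module is importable without actually importing it. This is a small sketch using the standard-library `importlib.util.find_spec`; the helper name `is_module_available` is an illustrative assumption, and the module name `ocr` is taken from the tutorial's own import statement.

```python
import importlib.util


def is_module_available(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the OCR package exposes a top-level module named `ocr`.
    if is_module_available("ocr"):
        print("ocr module found")
    else:
        print("ocr module missing; install it as described above")
```

Checking with `find_spec` avoids triggering any import-time side effects of the package, which keeps the check cheap enough to run at application start-up.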
+ +--- + +## Stap 1: Initialiseer de OCR‑engine instantie + +Het maken van een engine‑object is het eerste wat je doet wanneer je de **use OCR engine** functionaliteit gebruikt. Beschouw het als het inschakelen van een scanner; zonder deze kun je geen vragen stellen. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +De `OcrEngine`‑klasse is een lichtgewicht wrapper rond de onderliggende herkenningsengine. Het start geen zware verwerking totdat je iets vraagt, dus de opstartkosten zijn verwaarloosbaar. + +> **Pro tip:** Als je applicatie veel engine‑instanties maakt, hergebruik er dan één. Het bespaart geheugen en vermindert de initialisatielatentie. + +--- + +## Stap 2: Vraag de OCR‑tekenreeks op voor een specifiek script + +Nu we een engine hebben, kunnen we hem vragen welke tekens hij kent voor een bepaald schrijfsysteem. De methode `get_supported_characters(script_name)` retourneert een Python `list` van Unicode‑tekens. + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +Het uitvoeren van de bovenstaande code print iets als: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +De exacte output hangt af van de versie van de OCR‑bibliotheek, maar je krijgt altijd een platte lijst van tekens. Als je zowel hoofdletters als kleine letters nodig hebt, vraag dan simpelweg het `"Latin"`‑script op en pas vervolgens `.lower()` toe op elk element, of vraag een apart `"Latin‑lower"`‑script op als de bibliotheek ze onderscheidt. + +### Waarom is dit belangrijk? + +Wanneer je een afbeelding met ongebruikelijke diakritische tekens (bijv. 
“ñ” of “ø”) invoert, kan de OCR‑engine ze stilletjes vervangen door een placeholder of ze volledig weglaten. Het kennen van de **ocr character set** van tevoren stelt je in staat om invoer vooraf te valideren, gebruikers te waarschuwen, of terug te vallen op een andere engine. + +--- + +## Stap 3: Haal tekens op voor een ander script – Cyrillic‑voorbeeld + +Dezelfde methode werkt voor elk script dat de engine claimt te ondersteunen. Laten we zien wat er gebeurt met Cyrillic. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +Typische output: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +Als de engine **niet** het gevraagde script ondersteunt, werpt hij meestal een `ValueError` (of retourneert een lege lijst, afhankelijk van de implementatie). Laten we daar tegen beschermen. + +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## Stap 4: Visualiseer de OCR‑tekenreeks (optioneel) + +Soms helpt een snelle visualisatie, vooral wanneer je belanghebbenden wilt laten zien welke tekens gedekt zijn. Hieronder staat een minimaal voorbeeld met `matplotlib` om een raster van tekens weer te geven. 
+ +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Image alt text:** ![OCR‑tekenreeks diagram](ocr_character_set.png){alt="Overzicht van OCR‑tekenreeks"} + +Het diagram is niet vereist voor de kernoplossing, maar het toont hoe je de **use OCR engine** metadata kunt gebruiken buiten eenvoudige afdrukken. + +--- + +## Stap 5: Integreer de tekenreeks in real‑world workflows + +Nu we de **ocr character set** kunnen ophalen, bekijken we een praktisch scenario: het valideren van OCR‑output voordat deze in een database wordt opgeslagen. + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +Dit patroon voorkomt dat vuildata in downstream‑pijplijnen terechtkomt, een veelvoorkomend pijnpunt bij het omgaan met meertalige documenten. + +--- + +## Veelvoorkomende valkuilen en randgevallen + +| Valkuil | Waarom het gebeurt | Hoe te vermijden | +|---------|--------------------|------------------| +| **Script name typo** (bijv. 
`"Cyrillic "` met een spatie aan het einde) | De engine behandelt de string letterlijk en kan het script niet vinden. | Verwijder witruimte: `script.strip()` voordat `get_supported_characters` wordt aangeroepen. | +| **Empty character list** | Sommige engines tonen een script maar hebben het taalmodel nog niet geladen. | Roep `engine.load_language_model(script)` aan als de bibliotheek zo’n methode biedt, of zorg dat de modelbestanden aanwezig zijn. | +| **Unicode normalization issues** | Tekens zoals “é” kunnen verschijnen als samengesteld (`\u00E9`) of gedecomposeerd (`e\u0301`). | Normaliseer strings met `unicodedata.normalize('NFC', text)` vóór validatie. | +| **Performance on huge scripts** (bijv. Chinees) | Het ophalen van duizenden tekens kan traag zijn. | Cache het resultaat na de eerste oproep; de meeste applicaties hebben de lijst slechts één keer nodig. | + +--- + +## Volledig werkend voorbeeld + +Alles samengevoegd, hier is een enkel script dat je kunt kopiëren‑plakken en uitvoeren: + + + +## Wat moet je hierna leren? + +De volgende tutorials behandelen nauw verwante onderwerpen die voortbouwen op de technieken die in deze gids worden getoond. Elke bron bevat volledige werkende code‑voorbeelden met stap‑voor‑stap uitleg om je te helpen extra API‑functies onder de knie te krijgen en alternatieve implementatie‑benaderingen in je eigen projecten te verkennen. 
+ +- [Aspose OCR Tutorial – Optische tekenherkenning](/ocr/english/) +- [OCR‑nabewerking – Haal tekenkeuzes op](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [Hoe drempelwaarde in te stellen bij OCR‑beeldherkenning](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/dutch/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/dutch/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..f27aa254a --- /dev/null +++ b/ocr/dutch/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,256 @@ +--- +category: general +date: 2026-06-06 +description: Voer OCR uit op een afbeelding met Aspose OCR en een Hugging Face‑model. + Leer hoe je een Hugging Face‑model downloadt, tekst uit een factuur extraheert en + GPU‑bronnen vrijmaakt. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: nl +og_description: Voer OCR uit op een afbeelding met Aspose OCR en een Hugging Face‑model. + Deze tutorial laat zien hoe je het model downloadt, tekst uit een factuur extraheert + en GPU‑resources vrijmaakt. +og_title: Voer OCR uit op afbeelding met Aspose OCR & LLM – Complete gids +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Voer OCR uit op afbeelding met Aspose OCR & LLM – Complete gids +url: /nl/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Perform OCR on Image with Aspose OCR & LLM – Complete Guide + +Have you ever wanted to **perform OCR on image** files but got stuck on the question “where do I start?” You are not alone: many developers hit that wall when they first tackle document automation. The good news is that with Aspose OCR and a lightweight LLM from Hugging Face you can turn a raw invoice scan into clean, searchable text in just a few lines of Python. + +In this tutorial we walk through everything you need: how to **load image for OCR**, how to **download the Hugging Face model**, how to **extract text from invoice** data, and finally how to **free GPU resources** so your app stays lean. By the end you will have a self‑contained script you can use in any project. + +--- + +## What you will learn + +- How to **perform OCR on image** files with Aspose’s `OcrEngine`. +- The exact steps to **download Hugging Face model** files automatically. +- Techniques to **extract text from invoice** PDFs or PNGs with AI‑enhanced post‑processing. +- Best practices to **free GPU resources** after inference. +- Tips to **load image for OCR** efficiently and avoid common pitfalls. + +No external documentation is needed. Everything you need is here, with full code, explanations, and expected output.
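The list above mentions releasing GPU resources after inference. The usual way to make that reliable is a `try`/`finally` block, sketched here with a stand-in class so the example runs without any OCR package. `DemoProcessor` and `run_with_cleanup` are hypothetical names introduced for illustration; only the `free_resources()` method name mirrors the one used in this tutorial.

```python
class DemoProcessor:
    """Hypothetical stand-in for an AI post-processor that holds GPU memory."""

    def __init__(self):
        self.released = False

    def free_resources(self):
        # A real processor would unload model weights here.
        self.released = True


def run_with_cleanup(processor, work):
    """Call `work(processor)` and always release the processor afterwards."""
    try:
        return work(processor)
    finally:
        # Runs whether `work` returned normally or raised an exception.
        processor.free_resources()


if __name__ == "__main__":
    demo = DemoProcessor()
    result = run_with_cleanup(demo, lambda p: "enhanced text")
    print(result, demo.released)
```

Putting the release call in `finally` means a failed recognition pass cannot leave the model loaded, which matters most in long-running services.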
+ +--- + +## Vereisten + +Voordat we beginnen, zorg dat je het volgende hebt: + +| Vereiste | Reden | +|----------|-------| +| Python 3.9+ | Moderne syntaxis en type‑hints | +| `asposeocr` package (`pip install asposeocr`) | Kern‑OCR‑engine | +| Toegang tot een GPU (optioneel maar aanbevolen) | Versnelt de LLM‑post‑processor | +| Een factuurafbeelding (`sample_invoice.png`) | Praktijkvoorbeeld | + +Als je iets mist, installeer het nu; het script zal ook **Hugging Face‑model** automatisch **downloaden**, zodat je zelf niet naar bestanden hoeft te zoeken. + +--- + +## Stap 1: OCR uitvoeren op afbeelding – Maak de engine + +Het eerste wat je moet doen is de OCR‑engine van Aspose starten. Zie het als een leeg canvas waarop later de afbeelding wordt omgezet in tekst. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Waarom dit belangrijk is:** `OcrEngine` abstraheert alle low‑level beeldvoorbewerking, zodat je je kunt richten op de hogere‑level workflow. Het biedt ook een `set_post_processor`‑methode waarmee we later een LLM kunnen koppelen voor slimmer resultaat. + +--- + +## Stap 2: Afbeelding laden voor OCR – Kies het juiste bestand + +Nu de engine bestaat, moeten we **afbeelding laden voor OCR**. Aspose ondersteunt PNG, JPG, TIFF en een handvol andere formaten. Zorg dat het pad absoluut is of relatief ten opzichte van de locatie van je script. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Tip:** Als je afbeelding groot is, overweeg dan om deze van tevoren te verkleinen om geheugenbelasting te verminderen. De OCR‑engine kan hoge resolutie scans aan, maar een 300 DPI‑afbeelding is meestal een goed compromis voor facturen. 
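The tip above suggests shrinking large images before loading them. The size calculation itself is simple arithmetic, sketched below in plain Python. The function name `fit_within` and the 2000-pixel cap are illustrative assumptions; the tutorial does not prescribe a specific limit.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) down so the longer side is at most `max_side`.

    The aspect ratio is preserved, and images that already fit are unchanged.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Never let a dimension collapse to zero on extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(4000, 3000, 2000))  # (2000, 1500)
    print(fit_within(800, 600, 2000))    # (800, 600), already small enough
```

The resulting tuple can then be handed to an imaging library of your choice, for example Pillow's `Image.resize`, before the file is passed to the OCR engine.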
+ +--- + +## Stap 3: Raw OCR uitvoeren en de geëxtraheerde tekst bekijken + +Met de afbeelding geladen, kunnen we eindelijk **OCR uitvoeren op afbeelding** en zien wat de ruwe engine teruggeeft. Deze stap geeft ons een basislijn voordat we AI‑magie toevoegen. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Verwachte output (afgekapt):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +De ruwe output bevat vaak regeleinden, verkeerd herkende tekens of ontbrekende velden—precies waarom we vervolgens een taalmodel inschakelen. + +--- + +## Stap 4: Hugging Face‑model downloaden – Configureer de LLM‑post‑processor + +Hier komt de **download Hugging Face model**‑stap om de hoek kijken. Aspose AI kan automatisch een model van de Hugging Face‑hub ophalen als het nog niet op schijf staat. We gebruiken het Qwen2.5‑3B‑Instruct‑GGUF‑model, dat een goede balans biedt tussen nauwkeurigheid en geheugenfootprint. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Waarom dit werkt:** `allow_auto_download` bespaart je het handmatig downloaden van het `.gguf`‑bestand. Kwantisatie (`int8`) verkleint het model tot ongeveer 3 GB, waardoor het op de meeste consumenten‑GPU’s haalbaar is. Pas `gpu_layers` aan op basis van je hardware—meer lagen op de GPU = snellere inferentie. 
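The roughly 3 GB figure quoted above follows from simple arithmetic: weight storage is approximately the parameter count times the bits per weight, divided by eight. The sketch below makes that explicit. The 3-billion parameter count is an assumption read off the "3B" in the model name, and the helper name `estimate_weight_bytes` is illustrative.

```python
def estimate_weight_bytes(parameter_count, bits_per_weight):
    """Approximate storage for the model weights alone, in bytes."""
    return parameter_count * bits_per_weight // 8


if __name__ == "__main__":
    params = 3_000_000_000  # assumption: "3B" taken as about 3 billion weights
    for label, bits in (("fp16", 16), ("int8", 8), ("4-bit", 4)):
        gigabytes = estimate_weight_bytes(params, bits) / 1e9
        print(f"{label}: ~{gigabytes:.1f} GB")
```

This estimate covers the weights only. Actual memory use during inference is higher because of the context cache, activations, and quantization metadata, so treat the number as a lower bound when choosing `gpu_layers`.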
+ +--- + +## Stap 5: Tekst uit factuur extraheren met AI‑verbeterde post‑processing + +Nu koppelen we de LLM aan de OCR‑engine en voeren een **post‑processor** uit die de ruwe output opschoont, OCR‑fouten corrigeert en de factuurvelden netjes formatteert. + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Voorbeeld van verbeterde output:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **Wat gebeurde er?** De LLM herkende dat “Invoice #12345” moet zijn “Invoice Number: 12345”, corrigeerde het datumformaat en inferreerde zelfs het “Bill To”‑veld dat de ruwe engine miste. Dit is de kern van **extract text from invoice**‑automatisering. + +--- + +## Stap 6: GPU‑bronnen vrijgeven – Opruimen na verwerking + +Als je dit draait in een langdurige service (bijv. een Flask‑API), moet je **GPU‑bronnen vrijgeven** na elke inferentie om out‑of‑memory‑crashes te voorkomen. Aspose AI biedt een eenvoudige methode hiervoor. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro‑tip:** Roep `free_resources()` aan binnen een `finally:`‑blok als je de OCR‑aanroep in een try/except wikkelt. Zo wordt altijd opgeruimd, zelfs bij een uitzondering. + +--- + +## Stap 7: Volledig script – Alles samenvoegen + +Hieronder vind je het complete, kant‑klaar script. Kopieer‑plak het, pas de paden aan, en je bent klaar om te gaan. 
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Voer het script uit en zie de transformatie van ruisende OCR naar nette, gestructureerde factuurdata. 🎉 + +--- + +## Veelgestelde vragen & randgevallen + +| Vraag | Antwoord | +|-------|----------| +| **Wat als het model niet kan downloaden?** | Zorg dat je machine internettoegang heeft en dat de `hugging_face_repo_id` correct is. Je kunt het bestand ook handmatig downloaden. | + +## Wat moet je hierna leren? + +De volgende tutorials behandelen nauw verwante onderwerpen die voortbouwen op de technieken die in deze gids zijn gedemonstreerd. Elke bron bevat volledige werkende code‑voorbeelden met stap‑voor‑stap uitleg om je te helpen extra API‑functies onder de knie te krijgen en alternatieve implementaties in je eigen projecten te verkennen. 
+ +- [Afbeeldingstekst extraheren C# met taalkeuze met Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Tekst uit afbeelding extraheren – OCR‑optimalisatie met Aspose.OCR voor .NET](/ocr/english/net/ocr-optimization/) +- [Afbeelding naar tekst converteren – OCR uitvoeren op afbeelding vanaf URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/dutch/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/dutch/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..609f4a161 --- /dev/null +++ b/ocr/dutch/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,284 @@ +--- +category: general +date: 2026-06-06 +description: herken tekst van afbeelding met Aspose OCR – leer hoe je een afbeelding + laadt voor OCR en OCR uitvoert op een afbeelding met AI‑nabewerking in Python. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: nl +og_description: herken snel tekst van een afbeelding. Deze gids laat zien hoe je een + afbeelding laadt voor OCR, OCR op de afbeelding uitvoert en de resultaten verbetert + met AI-nabewerking. +og_title: herken tekst uit afbeelding met Aspose OCR & AI +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? 
+ - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Tekst herkennen uit afbeelding met Aspose OCR & AI +url: /nl/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# tekst herkennen uit afbeelding met Aspose OCR & AI + +Heb je ooit tekst uit een afbeelding moeten herkennen maar wist je niet welke bibliotheek zowel snelheid als nauwkeurigheid biedt? Je bent niet de enige. In deze gids lopen we een volledig end‑to‑end voorbeeld door dat laat zien **hoe je een afbeelding laadt voor OCR**, **OCR uitvoert op een afbeelding**, en vervolgens de output verfijnt met Aspose’s AI‑post‑processor. Aan het einde heb je een kant‑klaar script dat een PNG omzet in schone, doorzoekbare tekst. + +## Wat je zult leren + +We behandelen alles, van het installeren van het Aspose OCR‑pakket tot het vrijgeven van bronnen aan het einde van de uitvoering. Je ziet waarom handgeschreven‑teksterkenning belangrijk is, hoe je een Qwen 2.5 LLM configureert voor post‑processing, en hoe de uiteindelijke output eruitziet. Geen externe referenties nodig—kopieer, plak en voer uit. + +### Vereisten + +- Python 3.8 of nieuwer +- `asposeocr`‑package (`pip install asposeocr`) +- Een afbeeldingsbestand (bijv. 
`doc.png`) dat gedrukte of handgeschreven tekst bevat +- Optioneel: een GPU voor snellere LLM‑inference (het script werkt ook op CPU) + +--- + +## Tekst herkennen uit afbeelding – Stap‑voor‑stap + +Onder elk code‑blok vind je een korte uitleg **waarom** we die specifieke actie uitvoeren, niet alleen **wat** de regel doet. + +### Stap 1: Installeer en importeer de benodigde modules + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Waarom?* Het importeren van `asposeocr` levert de `OcrEngine`‑klasse, terwijl de `ai`‑submodule de LLM‑gebaseerde post‑processor biedt die de ruwe OCR‑output drastisch verbetert. + +### Stap 2: Maak de OCR‑engine aan en schakel handgeschreven teksterkenning in + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Handgeschreven herkenning breidt de tekenset van de engine uit, zodat je geen krabbels verliest wanneer je **OCR uitvoert op afbeelding**‑bestanden die zowel gedrukte als cursieve tekst bevatten. + +### Stap 3: Laad de afbeelding voor OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +De `load_image`‑aanroep is het moment waarop je **afbeelding laadt voor OCR**; als het pad onjuist is, zal de engine een informatieve uitzondering werpen, waardoor je cryptische downstream‑fouten voorkomt. + +### Stap 4: Voer de ruwe OCR‑passage uit + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +Op dit punt krijg je een `RecognitionResult`‑object dat de ongefilterde tekst, vertrouwensscores en lay‑out‑metadata bevat. Het is vaak rommelig—vandaar de noodzaak voor AI‑gedreven opschoning. 
+ +### Stap 5: Configureer het Aspose AI‑model voor LLM‑post‑processing + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Waarom deze instellingen? +- **auto‑download** garandeert dat het model bij de eerste uitvoering beschikbaar is. +- **int8 quantization** verlaagt het geheugenverbruik zonder grote nauwkeurigheidsverlies. +- **gpu_layers** laat je een compatibele GPU benutten voor snellere inference. + +### Stap 6: Initialiseert de AI‑processor en koppel deze als post‑processor + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +Het koppelen van de processor betekent dat elke keer dat je `run_postprocessor` aanroept, de LLM spelling corrigeert, gebroken woorden samenvoegt en zelfs ontbrekende interpunctie invult. + +### Stap 7: Voer de post‑processor uit om de OCR‑output te verbeteren + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` is doorgaans veel leesbaarder dan de ruwe string—beschouw het als een spellingscontrole die ook de context begrijpt. + +### Stap 8: Vrijgeven van bronnen + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Opschonen is cruciaal in langdurige services; anders lekt GPU‑geheugen en crasht je applicatie uiteindelijk. 
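In a long-running service the cleanup in Step 8 should also run when recognition or post-processing raises an exception. The following is a minimal sketch of that idea, not an Aspose API: `run_with_cleanup` is a hypothetical helper name, and it relies only on `contextlib.ExitStack` from the standard library. The demo uses stand-in callables so it runs without the OCR package installed.

```python
from contextlib import ExitStack


def run_with_cleanup(work, *cleanups):
    """Call work() and then every cleanup callable, even if work() raises.

    Cleanups run in the order given. The return value of work() is passed
    through; an exception from work() propagates after the cleanups ran.
    """
    with ExitStack() as stack:
        # ExitStack unwinds callbacks last-in, first-out, so register them
        # in reverse to have them run in the order the caller listed them.
        for cleanup in reversed(cleanups):
            stack.callback(cleanup)
        return work()


if __name__ == "__main__":
    # Stand-in callables (assumptions for illustration only).
    log = []
    result = run_with_cleanup(
        lambda: "enhanced text",
        lambda: log.append("free AI resources"),
        lambda: log.append("dispose OCR engine"),
    )
    print(result, log)
```

With the objects from this tutorial, the call would presumably look like `run_with_cleanup(lambda: ocr_engine.run_postprocessor(raw_result), ai_processor.free_resources, ocr_engine.dispose)`, assuming both cleanup methods take no arguments, as in Step 8.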
+ +--- + +## Volledig script dat je vandaag kunt uitvoeren + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Verwachte output** (voorbeeld voor een eenvoudige factuurafbeelding): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Merk op hoe de AI‑laag “Inv0ice” → “Invoice” corrigeerde en ontbrekende interpunctie toevoegde. + +--- + +## Veelgestelde vragen (en snelle antwoorden) + +- **Heb ik een GPU nodig?** Nee. Het script valt terug op CPU, maar inference zal trager zijn. +- **Kan ik een ander LLM gebruiken?** Absoluut—verander gewoon `hugging_face_repo_id` en pas `gpu_layers` dienovereenkomstig aan. +- **Wat als mijn afbeelding enorm is?** Schaal deze eerst (bijv. met Pillow) om het geheugenverbruik redelijk te houden. 
+- **Is handgeschreven herkenning altijd aan?** Je kunt `enable_handwritten_recognition` in- of uitschakelen afhankelijk van je workload. + +--- + +## Conclusie + +Je weet nu hoe je **tekst herkent uit afbeelding** met Aspose OCR, hoe je **afbeelding laadt voor OCR**, en hoe je **OCR uitvoert op afbeelding** met AI‑verbeterde post‑processing. Het volledige, uitvoerbare voorbeeld hierboven biedt een solide basis om OCR in elk Python‑project te integreren—of je nu bonnen scant, contracten digitaliseert, of gegevens uit handgeschreven formulieren haalt. + +Klaar voor de volgende stap? Probeer het Qwen‑model te vervangen door een groter model, experimenteer met verschillende kwantisatieschema's, of koppel meerdere afbeeldingen voor batchverwerking. De mogelijkheden zijn eindeloos, en de code die je net hebt gebouwd zal ze moeiteloos aan kunnen. + +Veel programmeerplezier, en moge je OCR‑resultaten altijd kristalhelder zijn! + +![Screenshot van Python‑console met verbeterde OCR‑output](/images/ocr_output.png){alt="Screenshot die laat zien hoe tekst uit afbeelding te herkennen met Aspose OCR"} + +## Wat moet je hierna leren? + + +De volgende tutorials behandelen nauw verwante onderwerpen die voortbouwen op de technieken die in deze gids worden getoond. Elke bron bevat complete werkende code‑voorbeelden met stap‑voor‑stap uitleg om je te helpen extra API‑functies onder de knie te krijgen en alternatieve implementatie‑benaderingen in je eigen projecten te verkennen. 
+ +- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/english/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/english/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..08d2fc7b9 --- /dev/null +++ b/ocr/english/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,217 @@ +--- +category: general +date: 2026-06-06 +description: Extract text from handwritten image using Python OCR. Learn how to convert + handwritten photo to text quickly and reliably. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: en +og_description: Extract text from handwritten image with Python. This guide shows + how to convert handwritten photo to text and answers how to recognize handwritten + text. +og_title: Extract Text from Handwritten Image – Python OCR Tutorial +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide +url: /python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + +Ever wondered **how to recognize handwritten text** in a photo you snapped on your phone? You're not alone. In many projects—whether it's digitizing lecture notes or pulling data from signed forms—you need to **extract text from handwritten image** fast and without headaches. + +In this tutorial we’ll walk through a complete, ready‑to‑run example that shows you exactly how to **convert handwritten photo to text** using a popular Python OCR library. No vague references, just concrete code, explanations, and tips you can copy‑paste today. + +![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image") + +## What You’ll Need + +Before we dive in, make sure you have the following: + +| Requirement | Why it matters | +|-------------|----------------| +| Python 3.9 or newer | Modern syntax & library support | +| `pip` (Python package manager) | To install the OCR package | +| A clear image of handwritten notes (JPEG/PNG) | OCR accuracy drops on blurry images | +| Basic familiarity with Python functions | Helps you adapt the example later | + +If you’re missing any of these, install the latest Python release before continuing. + +## Install the Python OCR Library + +The code snippet we’ll use relies on the `ocr` package (a thin wrapper around Tesseract that adds handy settings). 
Install it with a single command: + +```bash +pip install ocr +``` + +> **Pro tip:** After installation, run `tesseract --version` in your terminal to confirm the underlying engine is present. If you don’t have Tesseract, follow the official guide for your OS—most package managers have it (`apt-get install tesseract-ocr` on Ubuntu, `brew install tesseract` on macOS). + +## Step 1: Create an OCR Engine Instance + +Creating the engine is the first brick in the wall of **python ocr handwritten recognition**. Think of the engine as the brain that will later read the scribbles. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Why this matters: without an engine you can’t tweak the recognition pipeline. The default settings are tuned for printed text, so we’ll need to adjust them in the next step. + +## Step 2: Enable Handwritten Recognition + +By default the engine assumes printed characters. Enabling handwritten mode flips a switch that tells Tesseract to use its LSTM model trained on cursive strokes. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **What if you skip this?** The OCR will treat the strokes as noise, resulting in garbled output. Enabling the flag is the core of **how to recognize handwritten text**. + +## Step 3: Load Your Handwritten Photo + +Now we point the engine at the image file. The path can be absolute or relative; just be sure the file exists. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +A quick sanity check: open the image in your OS’s viewer. 
If the text is blurry, consider pre‑processing (contrast boost, rotation) before feeding it to the engine—those tweaks often raise the success rate of **extract text from handwritten image**. + +## Step 4: Run the Recognition Process + +With everything set, we finally ask the engine to do its job. The `recognize()` method returns a result object that holds the extracted string and confidence scores. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Behind the scenes, the engine converts the bitmap into a series of feature vectors, runs them through the LSTM network, and stitches the characters together. That’s the magic of **python ocr handwritten recognition**. + +## Step 5: Display the Extracted Text + +The result object exposes a `.text` attribute that contains the plain Unicode string. Print it, write it to a file, or feed it into another pipeline—your call. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Expected Output + +If the source image contains the note “Buy milk, eggs, and bread”, you’d see something like: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Notice how the output preserves punctuation and line breaks (if any). If you get gibberish, double‑check the image quality and the `enable_handwritten_recognition` flag. + +## Handling Common Pitfalls + +| Issue | Symptom | Fix | +|-------|---------|-----| +| Low confidence scores | Many “?” or nonsensical characters | Increase image DPI to ≥300, apply binarization (`opencv`), or crop to the region of interest. | +| Mixed languages | Output mixes English and another script | Set `engine.ocr_settings.language = "eng"` (or another ISO code) before `recognize()`. | +| Large files | Long processing time or memory error | Resize the image to a reasonable dimension (e.g., max width 1200 px) before loading. 
| +| Missing Tesseract | `ImportError` or `FileNotFoundError` | Install Tesseract separately and ensure it’s on your system PATH. | + +These adjustments keep your **convert handwritten photo to text** workflow robust across diverse datasets. + +## Full Script You Can Run Today + +Below is the complete, self‑contained program that puts all the pieces together. Copy it into a file named `handwritten_ocr.py` and execute `python handwritten_ocr.py`. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Run it, and you’ll see the text printed to the console—exactly the result you need when you want to **convert handwritten photo to text**. + +## Going Further + +Now that you’ve mastered the basics of **extract text from handwritten image**, consider these next steps: + +- **Batch processing:** Loop over a folder of images and store each result in a CSV file. +- **Post‑processing:** Use regular expressions to clean up common OCR errors (e.g., “1” vs “l”). +- **Integration:** Feed the extracted strings into a Natural Language Processing pipeline for sentiment analysis or keyword extraction. 
+- **Alternative libraries:** If you need higher accuracy, explore `easyocr` or `pytesseract` with custom LSTM models—both support **python ocr handwritten recognition** as well. + +Remember, the quality of the source image often dictates success, so spend a few minutes on preprocessing. A little extra effort now saves you a lot of debugging later. + +## Conclusion + +We’ve walked through a complete, end‑to‑end example that shows **how to recognize handwritten text** and, more importantly, how to **extract text from handwritten image** using Python. By installing the `ocr` package, toggling the handwritten flag, loading your picture, and calling `recognize()`, you can **convert handwritten photo to text** in just a handful of lines. + +Give it a try with your own notes, tweak the preprocessing steps, and let the OCR do the heavy lifting. If you hit any snags, revisit the “Handling Common Pitfalls” table or experiment with alternative OCR back‑ends. Happy coding, and may your handwritten data become instantly searchable! + + +## What Should You Learn Next? + + +The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects. 
+ +- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [How to Recognize Page Rectangles for OCR Text Recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [How to Use OCR - Recognize Image without Text Area Detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/english/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/english/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..c814bc256 --- /dev/null +++ b/ocr/english/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,256 @@ +--- +category: general +date: 2026-06-06 +description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: en +og_description: 'OCR character set guide: discover how to use OCR engine in Python + to list supported characters for various scripts, complete with code and tips.' +og_title: OCR Character Set – Retrieve and Use OCR Engine +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR Character Set – Retrieve and Use OCR Engine in Python +url: /python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR Character Set – Retrieve and Use OCR Engine in Python + +Want to know the **ocr character set** that your library supports? In this tutorial we'll show you how to **use OCR engine** to query supported characters for different scripts, and why that matters for real‑world projects. + +If you’ve ever stared at a blank screen wondering why some letters never appear after OCR processing, you’re not alone. The answer is often hidden in the character set the engine knows about. By the end of this guide you’ll be able to: + +* List every character an OCR engine can recognize for a given script. +* Handle cases where a script isn’t supported. +* Plug the character‑set information into downstream validation or UI logic. + +No magical black‑box tricks—just plain Python, a couple of lines of code, and a bit of explanation. + +--- + +## Prerequisites + +Before we jump in, make sure you have: + +* Python 3.8+ installed (the code uses f‑strings). +* The OCR library that provides `ocr.OcrEngine`—for this example we’ll assume a package named `ocr` is already installed via `pip install ocr-lib`. +* Basic familiarity with Python’s import system and printing. + +If you’re missing the library, run: + +```bash +pip install ocr-lib +``` + +That’s it—no extra binaries or native dependencies needed for the basic character‑set queries we’ll perform. + +--- + +## Step 1: Initialize the OCR Engine Instance + +Creating an engine object is the first thing you do when you **use OCR engine** functionality. 
Think of it as turning on a scanner; without it, you can’t ask any questions. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +The `OcrEngine` class is a lightweight wrapper around the underlying recognition engine. It doesn’t start heavy processing until you request something, so the startup cost is negligible. + +> **Pro tip:** If your application creates many engine instances, reuse a single one. It saves memory and reduces initialization latency. + +--- + +## Step 2: Query the OCR Character Set for a Specific Script + +Now that we have an engine, we can ask it which characters it knows for a particular writing system. The method `get_supported_characters(script_name)` returns a Python `list` of Unicode characters. + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +Running the snippet above prints something like: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +The exact output depends on the OCR library version, but you’ll always get a flat list of characters. If you need both uppercase and lowercase, simply request the `"Latin"` script and then apply `.lower()` to each element, or query a separate `"Latin‑lower"` script if the library distinguishes them. + +### Why does this matter? + +When you feed an image containing unusual diacritics (e.g., “ñ” or “ø”), the OCR engine may silently replace them with a placeholder or drop them entirely. Knowing the **ocr character set** ahead of time lets you pre‑validate input, warn users, or fall back to a different engine. + +--- + +## Step 3: Retrieve Characters for Another Script – Cyrillic Example + +The same method works for any script the engine claims to support. 
Let’s see what happens with Cyrillic. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +Typical output: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +If the engine does **not** support the requested script, it usually raises a `ValueError` (or returns an empty list depending on the implementation). Let’s guard against that. + +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## Step 4: Visualize the OCR Character Set (Optional) + +Sometimes a quick visual helps, especially when you need to show stakeholders which characters are covered. Below is a minimal example using `matplotlib` to render a grid of characters. 
+ +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Image alt text:** ![OCR character set diagram](ocr_character_set.png){alt="OCR character set overview"} + +The diagram isn’t required for the core solution, but it demonstrates how you can **use OCR engine** metadata beyond plain printing. + +--- + +## Step 5: Integrate the Character Set into Real‑World Workflows + +Now that we can retrieve the **ocr character set**, let’s see a practical scenario: validating OCR output before storing it in a database. + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +This pattern prevents garbage data from slipping into downstream pipelines, a common pain point when dealing with multilingual documents. 
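The check above compares raw code points, so two things can trip it up. If the engine's list holds only letters, as in the sample output earlier, the space and `!` in `"HELLO WORLD!"` count as invalid. A decomposed character such as `E` followed by U+0301 also fails to match a composed `É`. Here is a minimal sketch of a more forgiving validator, assuming the allowed characters arrive as a plain list of strings like `latin_chars`; the name `normalize_and_validate` and its `extra_allowed` parameter are illustrative and not part of any OCR library.

```python
import unicodedata


def normalize_and_validate(text, allowed_chars, extra_allowed=" \t\n"):
    """Return (is_valid, invalid_chars) after NFC-normalising both sides.

    `allowed_chars` is any iterable of single-character strings, for example
    the list returned for a script. `extra_allowed` holds characters such as
    whitespace that should pass even if the engine's list omits them.
    """
    allowed = {unicodedata.normalize("NFC", ch) for ch in allowed_chars}
    allowed.update(extra_allowed)
    normalized = unicodedata.normalize("NFC", text)
    # Collect each offending character once, in a stable sorted order.
    invalid = sorted({ch for ch in normalized if ch not in allowed})
    return (not invalid, invalid)


if __name__ == "__main__":
    # Stand-in list (an assumption): uppercase A-Z plus composed É (U+00C9).
    sample_allowed = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ") + ["\u00c9"]
    print(normalize_and_validate("HELLO WORLD", sample_allowed))
    print(normalize_and_validate("CAFE\u0301!", sample_allowed))
```

Returning the offending characters alongside the boolean lets the caller decide whether to reject the text, log it, or route it to manual review.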
+ +--- + +## Common Pitfalls and Edge Cases + +| Pitfall | Why it Happens | How to Avoid | +|---------|----------------|--------------| +| **Script name typo** (e.g., `"Cyrillic "` with trailing space) | The engine treats the string literally and can’t find the script. | Strip whitespace: `script.strip()` before calling `get_supported_characters`. | +| **Empty character list** | Some engines expose a script but haven’t loaded the language model yet. | Call `engine.load_language_model(script)` if the library provides such a method, or ensure the model files are present. | +| **Unicode normalization issues** | Characters like “é” may appear as composed (`\u00E9`) or decomposed (`e\u0301`). | Normalise strings with `unicodedata.normalize('NFC', text)` before validation. | +| **Performance on huge scripts** (e.g., Chinese) | Retrieving thousands of characters can be slow. | Cache the result after the first call; most applications only need the list once. | + +--- + +## Full Working Example + +Putting everything together, here are the helper functions from the steps above collected in one file: + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + # At least one row, so an empty list does not produce a zero-height figure + rows = max(1, (len(chars) + cols - 1) // cols) + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() +``` + + +## What Should You Learn Next? + + +The following tutorials cover closely related topics that build on the techniques demonstrated in this guide.
Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects. + +- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/) +- [OCR Post Processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [How to Set Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/english/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/english/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..583a6495d --- /dev/null +++ b/ocr/english/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,257 @@ +--- +category: general +date: 2026-06-06 +description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: en +og_description: Perform OCR on image using Aspose OCR and a Hugging Face model. This + tutorial shows how to download the model, extract text from invoice, and free GPU + resources. +og_title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide +url: /python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Perform OCR on Image with Aspose OCR & LLM – Complete Guide + +Ever wanted to **perform OCR on image** files but felt stuck at the “where do I start?” question? You’re not alone; many developers hit that wall when they first tackle document automation. The good news is that with Aspose OCR and a lightweight LLM from Hugging Face, you can turn a raw scan of an invoice into clean, searchable text with a short Python script. + +In this tutorial we’ll walk through everything you need: **loading an image for OCR**, **downloading the Hugging Face model**, **extracting text from an invoice**, and finally **freeing GPU resources** so your app stays lean. By the end you’ll have a self‑contained script you can drop into any project. + +--- + +## What You’ll Learn + +- How to **perform OCR on an image** using Aspose’s `OcrEngine`. +- The exact steps to **download Hugging Face model** files automatically. +- Techniques to **extract text from invoice** PDFs or PNGs with AI‑enhanced post‑processing. +- Best practices to **free GPU resources** after inference. +- Tips to **load an image for OCR** efficiently and avoid common pitfalls. + +No external documentation is required: everything you need is right here, with full code, explanations, and expected output.
+ +--- + +## Prerequisites + +Before we dive in, make sure you have: + +| Requirement | Reason | +|-------------|--------| +| Python 3.9+ | Modern syntax and type hints | +| `asposeocr` package (`pip install asposeocr`) | Core OCR engine | +| Access to a GPU (optional but recommended) | Speeds up the LLM post‑processor | +| An invoice image (`sample_invoice.png`) | Real‑world test case | + +If you’re missing any of these, install them now; the script will also **download Hugging Face model** automatically, so you don’t have to hunt for files yourself. + +--- + +## Step 1: Perform OCR on Image – Create the Engine + +The first thing you need to do is spin up Aspose’s OCR engine. Think of it as opening a blank canvas where the image will later be painted into text. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Why this matters:** `OcrEngine` abstracts all the low‑level image preprocessing, so you can focus on the higher‑level workflow. It also exposes a `set_post_processor` method that will later let us hook an LLM for smarter output. + +--- + +## Step 2: Load Image for OCR – Choose the Right File + +Now that the engine exists, we need to **load image for OCR**. Aspose supports PNG, JPG, TIFF, and a handful of other formats. Make sure the path is absolute or relative to your script’s location. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Tip:** If your image is large, consider resizing it beforehand to reduce memory pressure. The OCR engine can handle high‑resolution scans, but a 300 DPI image is usually a sweet spot for invoices. + +--- + +## Step 3: Perform Raw OCR and View the Extracted Text + +With the image loaded, we can finally **perform OCR on image** and see what the raw engine spits out. This step gives us a baseline before we add AI magic. 
+ +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Expected output (truncated):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +The raw output often contains line breaks, mis‑recognized characters, or missing fields—exactly why we’ll bring in a language model next. + +--- + +## Step 4: Download Hugging Face Model – Configure the LLM Post‑Processor + +Here’s where the **download Hugging Face model** step shines. Aspose AI can automatically pull a model from the Hugging Face hub if it isn’t already on disk. We’ll use the Qwen2.5‑3B‑Instruct‑GGUF model, which balances accuracy and memory footprint. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Why this works:** `allow_auto_download` saves you from manually downloading the `.gguf` file. Quantization (`int8`) reduces the model size to roughly 3 GB, making it feasible on most consumer GPUs. Adjust `gpu_layers` based on your hardware—more layers on GPU = faster inference. + +--- + +## Step 5: Extract Text from Invoice Using AI‑Enhanced Post‑Processing + +Now we attach the LLM to the OCR engine and run a **post‑processor** that cleans up the raw output, corrects OCR mistakes, and formats the invoice fields nicely. 
+ +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Sample enhanced output:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **What happened?** The LLM recognized that “Invoice #12345” should be “Invoice Number: 12345”, fixed the date format, and even inferred the “Bill To” field that the raw engine missed. This is the core of **extract text from invoice** automation. + +--- + +## Step 6: Free GPU Resources – Clean Up After Processing + +If you’re running this in a long‑living service (e.g., a Flask API), you must **free GPU resources** after each inference to avoid out‑of‑memory crashes. Aspose AI provides a straightforward method for that. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro tip:** Call `free_resources()` inside a `finally:` block if you’re wrapping the OCR call in a try/except. That guarantees cleanup even when an exception occurs. + +--- + +## Step 7: Full Script – Put It All Together + +Below is the complete, ready‑to‑run script. Copy‑paste it, adjust the paths, and you’re good to go. 
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Run the script and watch the transformation from noisy OCR to clean, structured invoice data. 🎉 + +--- + +## Common Questions & Edge Cases + +| Question | Answer | +|----------|--------| +| **What if the model fails to download?** | Ensure your machine has internet access and that the `hugging_face_repo_id` is correct. You can also manually download the + + +## What Should You Learn Next? + + +The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects. 
+ +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/english/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/english/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..0e6e11a9f --- /dev/null +++ b/ocr/english/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,284 @@ +--- +category: general +date: 2026-06-06 +description: recognize text from image with Aspose OCR – learn how to load image for + OCR and perform OCR on image using AI post‑processing in Python. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: en +og_description: recognize text from image quickly. This guide shows how to load image + for OCR, perform OCR on image, and boost results with AI post‑processing. +og_title: recognize text from image using Aspose OCR & AI +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? 
+ - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: recognize text from image using Aspose OCR & AI +url: /python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# recognize text from image using Aspose OCR & AI + +Ever needed to recognize text from image but weren’t sure which library would give you both speed and accuracy? You're not alone. In this guide we’ll walk through a complete, end‑to‑end example that shows **how to load image for OCR**, **perform OCR on image**, and then polish the output with Aspose’s AI post‑processor. By the end you’ll have a ready‑to‑run script that turns a PNG into clean, searchable text. + +## What you’ll learn + +We’ll cover everything from installing the Aspose OCR package to releasing resources at the end of the run. You’ll see why enabling handwritten‑text recognition matters, how to configure a Qwen 2.5 LLM for post‑processing, and what the final output looks like. No external references required—just copy, paste, and run. + +### Prerequisites + +- Python 3.8 or newer +- `asposeocr` package (`pip install asposeocr`) +- An image file (e.g., `doc.png`) that contains printed or handwritten text +- Optional: a GPU for faster LLM inference (the script works on CPU too) + +--- + +## Recognize text from image – Step‑by‑Step + +Below each code block you’ll find a short explanation of **why** we’re doing that particular action, not merely **what** the line does. 
+ +### Step 1: Install and import the required modules + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Why?* Importing `asposeocr` gives us the `OcrEngine` class, while the `ai` submodule provides the LLM‑based post‑processor that dramatically improves raw OCR output. + +### Step 2: Create the OCR engine and enable handwritten text recognition + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Enabling handwritten recognition expands the engine’s character set, so you won’t lose scribbles when you **perform OCR on image** files that contain mixed printed and cursive text. + +### Step 3: Load the image for OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +The `load_image` call is the moment where you **load image for OCR**; if the path is wrong, the engine will raise an informative exception, saving you from cryptic downstream errors. + +### Step 4: Run the raw OCR pass + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +At this stage you get a `RecognitionResult` object that contains the unfiltered text, confidence scores, and layout metadata. It’s often noisy—hence the need for AI‑driven cleanup. 
+ +### Step 5: Configure the Aspose AI model for LLM post‑processing + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Why bother with these settings? +- **auto‑download** guarantees the model is available on first run. +- **int8 quantization** slashes memory demand without a huge accuracy hit. +- **gpu_layers** lets you leverage a compatible GPU for faster inference. + +### Step 6: Initialise the AI processor and attach it as a post‑processor + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +Attaching the processor means every time you call `run_postprocessor`, the LLM will clean up spelling, merge broken words, and even infer missing punctuation. + +### Step 7: Run the post‑processor to enhance the OCR output + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +The `enhanced_result.text` is usually far more readable than the raw string—think of it as a spell‑checker that also understands context. + +### Step 8: Release resources + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Cleaning up is crucial in long‑running services; otherwise you’ll leak GPU memory and eventually crash your application. 
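To make the cleanup in Step 8 run even when recognition fails, the two release calls can be wrapped in a context manager. This is a sketch under stated assumptions: it relies only on the `free_resources()` and `dispose()` method names used in this guide, and the `FakeAIProcessor` and `FakeOcrEngine` classes are hypothetical stand-ins so the example runs without the library installed.

```python
from contextlib import contextmanager


class FakeAIProcessor:
    """Hypothetical stand-in exposing the free_resources() call from this guide."""

    def __init__(self):
        self.released = False

    def free_resources(self):
        self.released = True


class FakeOcrEngine:
    """Hypothetical stand-in exposing the dispose() call from this guide."""

    def __init__(self):
        self.disposed = False

    def dispose(self):
        self.disposed = True


@contextmanager
def ocr_session(engine, ai_processor):
    """Yield the pair and always release both, even if the body raises."""
    try:
        yield engine, ai_processor
    finally:
        try:
            ai_processor.free_resources()
        finally:
            # Runs even if free_resources() itself raises
            engine.dispose()


engine, processor = FakeOcrEngine(), FakeAIProcessor()
try:
    with ocr_session(engine, processor):
        raise RuntimeError("simulated recognition failure")
except RuntimeError as exc:
    print(f"Recovered from: {exc}")

print(processor.released, engine.disposed)
```

With real objects, the body of the `with` block would hold the load, recognize, and post-process calls, so a long-running service releases memory on every request regardless of outcome.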
+ +--- + +## Full script you can run today + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Expected output** (example for a simple invoice image): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Notice how the AI layer corrected “Inv0ice” → “Invoice” and added missing punctuation. + +--- + +## Frequently asked questions (and quick answers) + +- **Do I need a GPU?** No. The script falls back to CPU, but inference will be slower. +- **Can I use a different LLM?** Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` accordingly. +- **What if my image is huge?** Resize it first (e.g., using Pillow) to keep memory usage reasonable. +- **Is handwritten recognition always on?** You can toggle `enable_handwritten_recognition` depending on your workload. 
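The resize advice in the list above can be sketched as follows. The helper names `fit_within` and `downscale_file` and the 2000-pixel default are assumptions chosen for illustration, not values recommended by the OCR library; the second function uses Pillow, a third-party package, and is imported lazily so the size calculation runs without it.

```python
def fit_within(width, height, max_side=2000):
    """Return (new_width, new_height) scaled so the longer side is <= max_side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, keep as-is
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def downscale_file(src_path, dst_path, max_side=2000):
    """Resize an image on disk with Pillow (third-party: pip install Pillow)."""
    from PIL import Image  # lazy import so fit_within works without Pillow

    with Image.open(src_path) as img:
        new_size = fit_within(*img.size, max_side=max_side)
        img.resize(new_size).save(dst_path)
    return new_size


print(fit_within(4000, 3000))
print(fit_within(800, 600))
```

Keeping the arithmetic separate from the file handling makes it easy to unit-test the scaling rule and to swap in a different imaging library later.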
+ +--- + +## Conclusion + +You now know how to **recognize text from an image** using Aspose OCR: how to **load an image for OCR**, how to **perform OCR on the image**, and how to refine the result with AI‑enhanced post‑processing. The complete, runnable example above gives you a foundation for integrating OCR into a Python project, whether you are scanning receipts, digitizing contracts, or extracting data from handwritten forms. + +Ready for the next step? Try swapping the Qwen model for a larger one, experiment with different quantization schemes, or loop over multiple images for batch processing. The load, recognize, and post‑process steps shown above are the starting point for each of these. + +Happy coding, and may your OCR results always be crystal‑clear! + +![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="Screenshot showing how to recognize text from image using Aspose OCR"} + + +## What Should You Learn Next? + + +The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+ +- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/french/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/french/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..849aa287c --- /dev/null +++ b/ocr/french/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,216 @@ +--- +category: general +date: 2026-06-06 +description: Extract text from a handwritten image using Python OCR. Learn + how to convert a handwritten photo to text quickly and reliably. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: fr +og_description: Extract text from a handwritten image with Python. This guide shows + how to convert a handwritten photo to text and explains how to recognize + handwritten text. +og_title: Extract Text from a Handwritten Image – Python OCR Tutorial +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably.
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide +url: /fr/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + +Have you ever wondered **how to recognize handwritten text** in a photo taken with your phone? You are not alone. In many projects, whether you are digitizing lecture notes or pulling data from signed forms, you need to **extract text from a handwritten image** quickly and without hassle. + +In this tutorial we walk through a complete, ready‑to‑use example that shows exactly how to **convert a handwritten photo to text** with a Python OCR library. No vague references, just concrete code, explanations, and tips you can copy and paste today.
+ +![extract text from a handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from a handwritten image") + +## What You Need + +Before you start, make sure you have the following: + +| Requirement | Why it matters | +|-------------|----------------| +| Python 3.9 or newer | Modern syntax and library support | +| `pip` (Python package manager) | To install the OCR package | +| A clear image of handwritten notes (JPEG/PNG) | OCR accuracy drops on blurry images | +| Basic familiarity with Python functions | Helps you adapt the example later | + +If any of these is missing, download and install the latest Python release before continuing. + +## Install the Python OCR Library + +The code in this guide relies on the `ocr` package (a thin wrapper around Tesseract that adds convenient settings). Install it with a single command: + +```bash +pip install ocr +``` + +> **Tip:** After installing, run `tesseract --version` in your terminal to confirm that the underlying engine is present. If Tesseract is missing, follow the official guide for your OS; most package managers provide it (`apt-get install tesseract-ocr` on Ubuntu, `brew install tesseract` on macOS). + +## Step 1: Create an OCR Engine Instance + +Creating the engine is the first building block of **python ocr handwritten recognition**. Think of the engine as the brain that will later read the scribbles. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Why this matters: without an engine, you cannot adjust the recognition pipeline.
The default settings are tuned for printed text, so we will change them in the next step. + +## Step 2: Enable Handwritten Recognition + +By default, the engine assumes printed characters. Enabling handwritten mode flips a switch that tells Tesseract to use its LSTM model trained on cursive strokes. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **What if you skip this step?** The OCR will treat the strokes as noise and produce unreadable output. Setting this flag is the core of **how to recognize handwritten text**. + +## Step 3: Load Your Handwritten Photo + +Next, point the engine at the image file. The path can be absolute or relative; just make sure the file exists. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Quick check: open the image in your OS image viewer. If the text is blurry, consider preprocessing (contrast boost, rotation) before passing it to the engine; these adjustments often improve the success rate when you **extract text from a handwritten image**. + +## Step 4: Run the Recognition Process + +With everything in place, we finally ask the engine to do its job. The `recognize()` method returns a result object containing the extracted string and the confidence scores. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Behind the scenes, the engine converts the bitmap into a series of feature vectors, passes them through the LSTM network, and then assembles the characters. That is how **python ocr handwritten recognition** works. + +## Step 5: Display the Extracted Text + +The result object exposes a `.text` attribute that holds the raw Unicode string.
Print it, write it to a file, or feed it into another pipeline; the choice is yours.
+
+```python
+# Step 5: Print the extracted text to the console
+print("=== Extracted Text ===")
+print(handwritten_result.text)
+```
+
+### Expected output
+
+If the source image contains the note "Buy milk, eggs, and bread", you will see something like:
+
+```
+=== Extracted Text ===
+Buy milk, eggs, and bread
+```
+
+Note that the output preserves punctuation and line breaks (if any). If you get gibberish, recheck the image quality and the `enable_handwritten_recognition` flag.
+
+## Handling common problems
+
+| Problem | Symptom | Solution |
+|---------|---------|----------|
+| Low confidence scores | Many "?" or garbled characters | Raise the image DPI to ≥300, apply binarization (`opencv`), or crop to the region of interest. |
+| Mixed languages | The output mixes English and another script | Set `engine.ocr_settings.language = "eng"` (or another ISO code) before `recognize()`. |
+| Large files | Long processing time or memory error | Resize the image to a reasonable dimension (e.g., a maximum width of 1200 px) before loading. |
+| Missing Tesseract | `ImportError` or `FileNotFoundError` | Install Tesseract separately and make sure it is on the system PATH. |
+
+These adjustments keep your **convert handwritten photo to text** workflow robust across varied datasets.
+
+## Full script to run today
+
+Here is the complete, self-contained program that puts all the pieces together. Copy it into a file named `handwritten_ocr.py` and run `python handwritten_ocr.py`.
+
+```python
+#!/usr/bin/env python3
+"""
+handwritten_ocr.py
+
+A minimal example that demonstrates how to extract text from handwritten image
+using Python OCR (handwritten recognition mode).
Adjust `image_path` to point
+to your own file before running.
+"""
+
+import ocr  # pip install ocr
+
+def main():
+    # 1️⃣ Create the OCR engine
+    engine = ocr.OcrEngine()
+
+    # 2️⃣ Enable the handwritten recognition flag
+    engine.ocr_settings.enable_handwritten_recognition = True
+
+    # 3️⃣ Load the target image
+    image_path = "YOUR_DIRECTORY/handwritten_note.jpg"
+    engine.load_image(image_path)
+
+    # 4️⃣ Run the OCR process
+    result = engine.recognize()
+
+    # 5️⃣ Output the extracted text
+    print("=== Extracted Text ===")
+    print(result.text)
+
+if __name__ == "__main__":
+    main()
+```
+
+Run it and the text is printed to the console, which is exactly the result you need when you want to **convert handwritten photo to text**.
+
+## Going further
+
+Now that you have mastered the basics of **extract text from handwritten image**, consider the following next steps:
+
+- **Batch processing:** Loop over a folder of images and store each result in a CSV file.
+- **Post‑processing:** Use regular expressions to clean up frequent OCR errors (e.g., "1" vs "l").
+- **Integration:** Feed the extracted strings into a natural language processing pipeline for sentiment analysis or keyword extraction.
+- **Alternative libraries:** If you need higher accuracy, explore `easyocr` or `pytesseract` with custom LSTM models; both also support **python ocr handwritten recognition**.
+
+Remember that the quality of the source image often determines success, so spend a few minutes on preprocessing. A little extra effort now saves a lot of debugging later.
+
+## Conclusion
+
+We walked through a complete, end-to-end example that shows **how to recognize handwritten text** and, above all, how to **extract text from handwritten image** with Python.
By installing the `ocr` package, enabling the handwritten flag, loading your image, and calling `recognize()`, you can **convert handwritten photo to text** in just a few lines.
+
+Try it with your own notes, adjust the preprocessing steps, and let OCR do the heavy lifting. If you run into trouble, consult the common problems table or experiment with other OCR back ends. Happy coding, and may your handwritten data become instantly searchable!
+
+## What should you learn next?
+
+The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Extract text from an image with Aspose OCR – Step-by-step guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [How to recognize page rectangles for OCR text recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/)
+- [How to use OCR - Recognize an image without text area detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/french/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/french/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
new file mode 100644
index 000000000..56e8e0b40
--- /dev/null
+++
b/ocr/french/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
@@ -0,0 +1,243 @@
+---
+category: general
+date: 2026-06-06
+description: 'OCR character set guide: learn how to use the OCR engine to list the
+  supported characters for the Latin and Cyrillic scripts with complete Python
+  examples.'
+draft: false
+keywords:
+- ocr character set
+- use OCR engine
+- supported OCR characters
+- script detection OCR
+- OCR library Python
+language: fr
+og_description: 'OCR character set guide: discover how to use the OCR engine in Python
+  to list the supported characters for various scripts, with code and tips.'
+og_title: OCR Character Set – Retrieve and Use the OCR Engine
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: 'OCR character set guide: learn how to use OCR engine to list supported
+    characters for Latin and Cyrillic scripts with full Python examples.'
+  headline: OCR Character Set – Retrieve and Use OCR Engine in Python
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Character Sets
+title: OCR Character Set – Retrieve and Use OCR Engine in Python
+url: /fr/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# OCR Character Set – Retrieve and Use OCR Engine in Python
+
+Want to know which **ocr character set** your library supports? In this tutorial, we show how to **use OCR engine** to query the supported characters for different scripts, and why that matters for real-world projects.
+
+If you have ever stared at a blank screen wondering why some letters never appear after OCR processing, you are not alone.
The answer often lies in the character set the engine knows. By the end of this guide, you will be able to:
+
+* List every character an OCR engine can recognize for a given script.
+* Handle cases where a script is not supported.
+* Integrate character set information into downstream validation or UI logic.
+
+No black-box magic tricks, just plain Python, a few lines of code, and a little explanation.
+
+---
+
+## Prerequisites
+
+Before you begin, make sure you have:
+
+* Python 3.8+ installed (the code uses f‑strings).
+* The OCR library that provides `ocr.OcrEngine`. For this example, we assume a package that imports as `ocr` and is installed with `pip install ocr-lib`.
+* Basic knowledge of Python's import system and printing.
+
+If the library is missing, run:
+
+```bash
+pip install ocr-lib
+```
+
+That is all: no extra binaries or native dependencies are needed for the basic character set queries we are going to make.
+
+## Step 1: Initialize the OCR engine instance
+
+Creating an engine object is the first thing to do when you **use OCR engine**. Think of it as switching on a scanner; without it, you cannot ask any questions.
+
+```python
+import ocr
+
+# Step 1: Create an OCR engine instance
+engine = ocr.OcrEngine()
+```
+
+The `OcrEngine` class is a thin wrapper around the underlying recognition engine. It does no heavy processing until you ask for something, so the startup cost is negligible.
+
+> **Tip:** If your application creates many engine instances, reuse a single one. This saves memory and reduces initialization latency.
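
One way to follow the reuse tip is to hide engine creation behind a cached factory. The sketch below is only an illustration: `FakeEngine` is a hypothetical stand-in for `ocr.OcrEngine` so that the snippet runs without the OCR library, and `get_engine` is a helper name invented here.

```python
from functools import lru_cache


class FakeEngine:
    """Hypothetical stand-in for ocr.OcrEngine so the sketch runs anywhere."""

    instances_created = 0

    def __init__(self):
        # Count constructions so we can see how often the engine is built.
        FakeEngine.instances_created += 1


@lru_cache(maxsize=1)
def get_engine():
    """Create the engine on first use and return the same object afterwards."""
    return FakeEngine()


first = get_engine()
second = get_engine()
print(first is second)               # both names refer to the same object
print(FakeEngine.instances_created)  # number of engines actually constructed
```

In a real application you would return `ocr.OcrEngine()` from `get_engine()` instead; whether a single engine can safely be shared across threads depends on the library and is not something this sketch can confirm.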
+
+## Step 2: Query the OCR character set for a specific script
+
+Now that we have an engine, we can ask it which characters it knows for a particular writing system. The `get_supported_characters(script_name)` method returns a Python `list` of Unicode characters.
+
+```python
+# Step 2: Get the list of characters the engine can recognize for the Latin script
+latin_chars = engine.get_supported_characters("Latin")
+print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}")
+```
+
+Running the snippet above prints something like:
+
+```
+Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+```
+
+The exact output depends on the version of the OCR library, but you will always get a flat list of characters. If you need both uppercase and lowercase, simply request the `"Latin"` script and apply `.lower()` to each element, or query a separate `"Latin‑lower"` script if the library distinguishes them.
+
+### Why does this matter?
+
+When you supply an image containing unusual diacritics (e.g., "ñ" or "ø"), the OCR engine may silently replace them with a placeholder or drop them entirely. Knowing the **ocr character set** in advance lets you pre-validate input, warn users, or fall back to another engine.
+
+## Step 3: Retrieve characters for another script – Cyrillic example
+
+The same method works for any script the engine claims to support. Let's see what happens with Cyrillic.
+
+```python
+# Step 3: Get the list of characters the engine can recognize for the Cyrillic script
+cyrillic_chars = engine.get_supported_characters("Cyrillic")
+print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}")
+```
+
+Typical output:
+
+```
+Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я']
+```
+
+If the engine does **not** support the requested script, it typically raises a `ValueError` (or returns an empty list, depending on the implementation). Let's guard against that.
+
+```python
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+# Example usage:
+safe_get_characters(engine, "Arabic")  # Might trigger the error branch
+```
+
+## Step 4: Visualize the OCR character set (optional)
+
+Sometimes a quick visualization helps, especially when you need to show stakeholders which characters are covered. Below is a minimal example that uses `matplotlib` to render a grid of characters.
+
+```python
+import matplotlib.pyplot as plt
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Visualize Latin characters
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+> **Image alt text:** ![OCR character set diagram](ocr_character_set.png){alt="OCR character set overview"}
+
+The diagram is not required for the core solution, but it shows how you can **use OCR engine** metadata for more than a simple printout.
+
+## Step 5: Integrate the character set into real workflows
+
+Now that we can retrieve the **ocr character set**, let's look at a practical scenario: validating OCR output before storing it in a database.
+
+```python
+def validate_ocr_output(text, allowed_chars):
+    """Return True if every character in `text` exists in `allowed_chars`."""
+    invalid = [c for c in text if c not in allowed_chars]
+    if invalid:
+        print(f"Invalid characters found: {invalid}")
+        return False
+    return True
+
+# Suppose we ran OCR on an image and got this result:
+ocr_result = "HELLO WORLD!"
+
+# Use the previously fetched Latin set for validation:
+if validate_ocr_output(ocr_result, set(latin_chars)):
+    print("All characters are supported – safe to store.")
+else:
+    print("Result contains unsupported symbols – consider manual review.")
+```
+
+This pattern keeps unwanted data from slipping into downstream pipelines, a frequent problem when you process multilingual documents.
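
One caveat for this kind of validation: an accented letter can arrive either as a single composed code point or as a base letter followed by a combining mark, and a plain membership test treats the two as different characters. The sketch below shows the effect and one possible fix using the standard `unicodedata` module. The allow-list and the helper name `validate_normalized` are illustrative assumptions, not part of any OCR library.

```python
import unicodedata


def validate_normalized(text, allowed_chars):
    """Return True if every character of the NFC-normalized text is allowed."""
    allowed = {unicodedata.normalize("NFC", c) for c in allowed_chars}
    normalized = unicodedata.normalize("NFC", text)
    return all(ch in allowed for ch in normalized)


# Illustrative allow-list that contains the composed form of "É" (U+00C9).
allowed = set("CAF\u00c9")

# The same word, but with "E" followed by a combining acute accent (U+0301).
decomposed = "CAFE\u0301"

naive_ok = all(ch in allowed for ch in decomposed)        # compares raw code points
normalized_ok = validate_normalized(decomposed, allowed)  # normalizes first
print(naive_ok, normalized_ok)
```

Normalizing both the allow-list and the text to the same form (NFC here) before comparing keeps the check independent of how the OCR engine happens to encode accents.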
+
+## Common pitfalls and edge cases
+
+| Pitfall | Why it happens | How to avoid it |
+|---------|----------------|-----------------|
+| **Script name error** (e.g., `"Cyrillic "` with a trailing space) | The engine treats the string literally and cannot find the script. | Strip whitespace: `script.strip()` before calling `get_supported_characters`. |
+| **Empty character list** | Some engines expose a script but have not yet loaded the language model. | Call `engine.load_language_model(script)` if the library provides such a method, or make sure the model files are present. |
+| **Unicode normalization issues** | Characters such as "é" can appear in composed (`\u00E9`) or decomposed (`e\u0301`) form. | Normalize strings with `unicodedata.normalize('NFC', text)` before validation. |
+| **Performance on large scripts** (e.g., Chinese) | Retrieving thousands of characters can be slow. | Cache the result after the first call; most applications need the list only once.
|
+
+## Complete working example
+
+Putting all the pieces together, here is a single script you can copy, paste, and run:
+
+```python
+import ocr
+import matplotlib.pyplot as plt
+
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+if __name__ == "__main__":
+    # Usage that mirrors Steps 1, 2 and 4 above
+    engine = ocr.OcrEngine()
+    latin_chars = safe_get_characters(engine, "Latin")
+    if latin_chars:
+        plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+
+## What should you learn next?
+
+The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/)
+- [OCR Post‑processing – Get character choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/)
+- [How to set the threshold value in OCR image recognition](/ocr/english/net/ocr-settings/set-threshold-value/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/french/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/french/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
new file mode 100644
index 000000000..b7111d404
--- /dev/null
+++ b/ocr/french/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
@@ -0,0 +1,261 @@
+---
+category: general
+date: 2026-06-06
+description: Perform OCR on an image using Aspose OCR and a Hugging Face model. Learn
+  how to download the Hugging Face model, extract text from an invoice, and free
+  GPU resources.
+draft: false
+keywords:
+- perform OCR on image
+- download Hugging Face model
+- extract text from invoice
+- free GPU resources
+- load image for OCR
+language: fr
+og_description: Perform optical character recognition (OCR) on an image using Aspose
+  OCR and a Hugging Face model. This tutorial shows how to download the model, extract
+  text from an invoice, and free GPU resources.
+og_title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn
+    how to download Hugging Face model, extract text from invoice, and free GPU resources.
+  headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+  type: TechArticle
+tags:
+- OCR
+- Aspose
+- Python
+- AI
+- HuggingFace
+title: Perform OCR on Image with Aspose OCR and LLM – Complete Guide
+url: /fr/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+
+Ever wanted to **perform OCR on image** files but got stuck at "where do I start"? You are not alone; many developers hit this wall when they first tackle document automation. The good news is that with Aspose OCR and a lightweight LLM from Hugging Face, you can turn a raw invoice scan into clean, searchable text in a few lines of Python.
+
+In this tutorial, we go through everything you need: **loading an image for OCR**, **downloading the Hugging Face model**, **extracting text from an invoice**, and **freeing GPU resources** so your application stays lean. By the end, you will have a self-contained script that you can drop into any project.
+
+---
+
+## What you will learn
+
+- How to **perform OCR on image** using Aspose's `OcrEngine`.
+- The exact steps to **download the Hugging Face model** automatically.
+- Techniques to **extract text from an invoice** in PDF or PNG form with AI-enhanced post-processing.
+- Best practices for **freeing GPU resources** after inference.
+- Tips to **load the image for OCR** efficiently and avoid common pitfalls.
+
+No external documentation is required; everything you need is here, with the complete code, the explanations, and the expected output.
+
+---
+
+## Prerequisites
+
+Before you begin, make sure you have:
+
+| Requirement | Reason |
+|-------------|--------|
+| Python 3.9+ | Modern syntax and type annotations |
+| `asposeocr` package (`pip install asposeocr`) | Core OCR engine |
+| Access to a GPU (optional but recommended) | Speeds up the LLM post-processor |
+| An invoice image (`sample_invoice.png`) | Real-world test case |
+
+If any of these is missing, install it now; the script will also **download the Hugging Face model** automatically, so you won't have to hunt for the files yourself.
+
+---
+
+## Step 1: Perform OCR on image – Create the engine
+
+The first thing to do is instantiate the Aspose OCR engine. Think of it as opening a blank canvas on which the image will later be painted as text.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# Step 1: Initialize the OCR engine
+engine = ocr.OcrEngine()
+```
+
+> **Why this matters:** `OcrEngine` abstracts away all the low-level image preprocessing, letting you focus on the high-level workflow. It also exposes a `set_post_processor` method that will later let us plug in an LLM for smarter output.
+
+---
+
+## Step 2: Load image for OCR – Choose the right file
+
+Now that the engine exists, we need to **load the image for OCR**. Aspose supports PNG, JPG, TIFF, and a few other formats. Make sure the path is absolute or relative to your script's location.
+
+```python
+# Step 2: Load the image you want to process
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+```
+
+> **Tip:** If your image is large, consider resizing it beforehand to reduce memory pressure. The OCR engine can handle high-resolution scans, but a 300 DPI image is usually the best trade-off for invoices.
+
+---
+
+## Step 3: Run raw OCR and view the extracted text
+
+With the image loaded, we can finally **perform OCR on image** and see what the raw engine returns. This step gives us a baseline before adding the AI layer.
+
+```python
+# Step 3: Run the built‑in OCR and capture raw results
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+```
+
+**Expected output (truncated):**
+
+```
+Raw text:
+Invoice #12345
+Date: 2024‑04‑01
+Total: $1,250.00
+...
+```
+
+The raw output often contains stray line breaks, misrecognized characters, or missing fields, which is exactly why we bring in a language model next.
+
+---
+
+## Step 4: Download Hugging Face model – Configure the LLM post-processor
+
+This is where the **download Hugging Face model** step shines. Aspose AI can automatically fetch a model from the Hugging Face hub if it is not already on disk. We will use the **Qwen2.5‑3B‑Instruct‑GGUF** model, which offers a good balance between accuracy and memory footprint.
+
+```python
+# Step 4: Set up the AI model configuration
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",  # automatically fetch model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",  # int8 quantization cuts RAM usage dramatically
+    gpu_layers=20,  # first 20 layers run on GPU, rest on CPU
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+```
+
+> **Why this works:** `allow_auto_download` saves you from downloading the `.gguf` file by hand. Quantization (`int8`) shrinks the model to roughly 3 GB, which makes it viable on most consumer GPUs. Adjust `gpu_layers` to your hardware: more layers on the GPU means faster inference.
+
+---
+
+## Step 5: Extract text from invoice with AI-enhanced post-processing
+
+We now attach the LLM to the OCR engine and run a **post-processor** that cleans up the raw output, corrects OCR errors, and formats the invoice fields properly.
+
+```python
+# Step 5: Initialize the AI engine and bind it to the OCR engine
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)  # implicit when using set_post_processor in newer versions
+engine.set_post_processor(ai_engine)
+
+# Run the AI‑enhanced post‑processor
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+```
+
+**Example enhanced output:**
+
+```
+Enhanced text:
+Invoice Number: 12345
+Date: 2024-04-01
+Bill To: Acme Corp.
+Amount Due: $1,250.00
+Status: Paid
+```
+
+> **What happened?** The LLM recognized that "Invoice #12345" should be "Invoice Number: 12345", fixed the date format, and even inferred the "Bill To" field that the raw engine had missed. This is the heart of automated **invoice text extraction**.
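
Because the post-processor is a language model that can rewrite or infer fields, it can be worth checking the critical values mechanically before trusting them. The sketch below applies regular expressions to the sample enhanced output shown above. The patterns and the helper name `extract_fields` are illustrative assumptions and would need adjusting to your own invoice layout.

```python
import re

# Sample text copied from the enhanced output shown above.
enhanced_text = """Invoice Number: 12345
Date: 2024-04-01
Bill To: Acme Corp.
Amount Due: $1,250.00
Status: Paid"""

# Illustrative patterns: one capture group per critical field.
FIELD_PATTERNS = {
    "invoice_number": r"^Invoice Number:\s*(\d+)$",
    "date": r"^Date:\s*(\d{4}-\d{2}-\d{2})$",
    "amount_due": r"^Amount Due:\s*\$([\d,]+\.\d{2})$",
}


def extract_fields(text):
    """Return a dict of matched fields; fields that do not match map to None."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        fields[name] = match.group(1) if match else None
    return fields


fields = extract_fields(enhanced_text)
missing = [name for name, value in fields.items() if value is None]
print(fields)
print("Fields needing manual review:", missing)
```

A field that comes back as `None` is a cue to fall back to the raw OCR text or to flag the document for manual review rather than store a guessed value.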
+
+---
+
+## Step 6: Free GPU resources – Clean up after processing
+
+If you run this in a long-lived service (e.g., a Flask API), you must **free GPU resources** after each inference to avoid out-of-memory crashes. Aspose AI provides a simple method for this.
+
+```python
+# Step 6: Release GPU memory and dispose of the engine
+ai_engine.free_resources()  # frees GPU layers and model tensors
+engine.dispose()  # cleans up internal buffers
+```
+
+> **Pro tip:** Call `free_resources()` in a `finally:` block if you wrap the OCR call in a `try/except`. That guarantees cleanup even when an exception is raised.
+
+---
+
+## Step 7: Full script – Putting it all together
+
+Here is the complete, ready-to-run script. Copy and paste it, adjust the paths, and you are good to go.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Create OCR engine
+engine = ocr.OcrEngine()
+
+# 2️⃣ Load image for OCR
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+
+# 3️⃣ Raw OCR
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+
+# 4️⃣ Download Hugging Face model (auto‑download enabled)
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+
+# 5️⃣ Attach AI post‑processor and enhance text
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)  # optional in newer versions
+engine.set_post_processor(ai_engine)
+
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+
+# 6️⃣ Free GPU resources
+ai_engine.free_resources()
+engine.dispose()
+```
+
+Run the script and watch noisy OCR output turn into clean, structured invoice data.
🎉
+
+---
+
+## Frequently asked questions and edge cases
+
+| Question | Answer |
+|----------|--------|
+| **What if the model fails to download?** | Check that your machine has Internet access and that the `hugging_face_repo_id` is correct. You can also download the file manually. |
+| **My GPU is not powerful enough; can I use the CPU?** | Yes, set `gpu_layers=0` or omit the parameter; the model will then run on the CPU, although more slowly. |
+| **How do I handle several invoices in one directory?** | Loop over the directory with `os.listdir()` and apply the same pipeline to each image file. |
+| **What if the extracted text still contains errors?** | You can refine the LLM prompt or add a regex validation step for the critical fields (invoice number, date, total). |
+
+---
+
+## What should you learn next?
+
+The following tutorials cover closely related topics that extend the techniques demonstrated in this guide. Each resource includes complete code examples with step-by-step explanations to help you master other API features and explore alternative implementation approaches in your own projects.
+
+- [Extract image text in C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/)
+- [Extract text from an image – OCR optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/)
+- [Convert an image to text – Perform OCR on an image from a URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/french/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/french/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
new file mode 100644
index 000000000..a79e02e2a
--- /dev/null
+++ b/ocr/french/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
@@ -0,0 +1,284 @@
+---
+category: general
+date: 2026-06-06
+description: recognize text from image with Aspose OCR – learn how to load an image
+  for OCR and perform OCR on the image using AI post‑processing in Python.
+draft: false
+keywords:
+- recognize text from image
+- perform ocr on image
+- load image for ocr
+language: fr
+og_description: Recognize text from image quickly. This guide shows how to load an
+  image for OCR, perform OCR on the image, and improve the results with AI post‑processing.
+og_title: recognize text from image with Aspose OCR and AI
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: recognize text from image with Aspose OCR – learn how to load image
+    for OCR and perform OCR on image using AI post‑processing in Python.
+ headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU 
memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Reconnaître le texte à partir d’une image avec Aspose OCR et IA +url: /fr/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Reconnaître du texte à partir d'une image avec Aspose OCR & AI + +Vous avez déjà eu besoin de reconnaître du texte à partir d'une image mais vous ne saviez pas quelle bibliothèque offrirait à la fois rapidité et précision ? Vous n'êtes pas seul. Dans ce guide, nous parcourrons un exemple complet, de bout en bout, qui montre **comment charger une image pour l'OCR**, **effectuer l'OCR sur une image**, puis affiner le résultat avec le post‑processeur IA d'Aspose. À la fin, vous disposerez d'un script prêt à l'emploi qui transforme un PNG en texte propre et interrogeable. + +## Ce que vous allez apprendre + +Nous couvrirons tout, de l'installation du package Aspose OCR à la libération des ressources à la fin de l'exécution. Vous verrez pourquoi l'activation de la reconnaissance de texte manuscrit est importante, comment configurer un LLM Qwen 2.5 pour le post‑traitement, et à quoi ressemble le résultat final. Aucun référentiel externe requis — il suffit de copier, coller et exécuter. 
+ +### Prérequis + +- Python 3.8 ou plus récent +- `asposeocr` package (`pip install asposeocr`) +- Un fichier image (par ex., `doc.png`) contenant du texte imprimé ou manuscrit +- Optionnel : un GPU pour une inférence LLM plus rapide (le script fonctionne également sur CPU) + +--- + +## Reconnaître du texte à partir d'une image – Étape par étape + +Sous chaque bloc de code, vous trouverez une brève explication du **pourquoi** nous effectuons cette action particulière, et non simplement du **quoi** fait la ligne. + +### Étape 1 : Installer et importer les modules requis + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Pourquoi ?* L'importation de `asposeocr` nous fournit la classe `OcrEngine`, tandis que le sous‑module `ai` fournit le post‑processeur basé sur LLM qui améliore considérablement le résultat brut de l'OCR. + +### Étape 2 : Créer le moteur OCR et activer la reconnaissance de texte manuscrit + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Activer la reconnaissance manuscrite élargit le jeu de caractères du moteur, ainsi vous ne perdrez pas les griffonnages lorsque vous **effectuez l'OCR sur une image** contenant du texte imprimé et cursif mélangés. + +### Étape 3 : Charger l'image pour l'OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +L'appel `load_image` est le moment où vous **chargez l'image pour l'OCR** ; si le chemin est incorrect, le moteur lèvera une exception informative, vous évitant ainsi des erreurs cryptiques en aval. 
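
Si vous préférez détecter un chemin erroné avant même de solliciter le moteur, un contrôle préalable est possible. L'esquisse suivante repose sur des hypothèses : la fonction `validate_image_path` et la liste `ALLOWED_EXTENSIONS` sont des noms illustratifs qui ne font pas partie d'Aspose OCR, et les extensions retenues sont à adapter à vos fichiers.

```python
from pathlib import Path

# Hypothetical helper (not part of Aspose OCR): the accepted extensions are an
# assumption for this sketch; adjust them to the formats you actually use.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def validate_image_path(path_str: str) -> Path:
    """Return the resolved path, or raise before the OCR engine is involved."""
    path = Path(path_str)
    if not path.is_file():
        # Covers both a missing file and a path that points at a directory.
        raise FileNotFoundError(f"Image not found: {path}")
    if path.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported image extension: {path.suffix!r}")
    return path.resolve()
```

Le chemin validé peut ensuite être transmis au moteur, par exemple avec `ocr_engine.load_image(str(validate_image_path("YOUR_DIRECTORY/doc.png")))`.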
+ +### Étape 4 : Exécuter le passage OCR brut + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +À ce stade, vous obtenez un objet `RecognitionResult` contenant le texte non filtré, les scores de confiance et les métadonnées de mise en page. Il est souvent bruyant — d'où la nécessité d'un nettoyage piloté par l'IA. + +### Étape 5 : Configurer le modèle IA d'Aspose pour le post‑traitement LLM + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Pourquoi se soucier de ces paramètres ? +- **auto‑download** garantit que le modèle est disponible lors du premier lancement. +- **int8 quantization** réduit la consommation de mémoire sans perte d'exactitude majeure. +- **gpu_layers** vous permet d'exploiter un GPU compatible pour une inférence plus rapide. + +### Étape 6 : Initialiser le processeur IA et l'attacher comme post‑processeur + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +Attacher le processeur signifie que chaque fois que vous appelez `run_postprocessor`, le LLM corrigera l'orthographe, fusionnera les mots coupés et même inférera la ponctuation manquante. 
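
Pour vérifier concrètement ce que le LLM modifie, on peut comparer mot à mot le texte brut et le texte corrigé une fois les deux chaînes disponibles. L'esquisse ci‑dessous utilise uniquement `difflib` de la bibliothèque standard ; la fonction `changed_words` est un nom hypothétique, indépendant de l'API Aspose, et elle suppose que vous lui passez deux chaînes de caractères ordinaires.

```python
import difflib
from typing import List, Tuple


def changed_words(raw_text: str, enhanced_text: str) -> List[Tuple[str, str]]:
    """Return (before, after) pairs for the word runs that differ.

    Hypothetical helper for illustration; it only compares two plain strings
    and knows nothing about the OCR engine or the AI post-processor.
    """
    raw_words = raw_text.split()
    new_words = enhanced_text.split()
    matcher = difflib.SequenceMatcher(a=raw_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Insertions have an empty "before", deletions an empty "after".
            changes.append((" ".join(raw_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes
```

Avec une chaîne brute `"Inv0ice #12345"` et une chaîne corrigée `"Invoice #12345"`, la fonction signale la seule paire `("Inv0ice", "Invoice")`, ce qui permet de repérer rapidement les corrections inattendues.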
+ +### Étape 7 : Exécuter le post‑processeur pour améliorer le résultat OCR + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` est généralement beaucoup plus lisible que la chaîne brute : voyez-le comme un correcteur orthographique qui comprend également le contexte. + +### Étape 8 : Libérer les ressources + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Le nettoyage est crucial dans les services de longue durée ; sinon, des fuites de mémoire GPU finiront par faire planter votre application. + +--- + +## Script complet que vous pouvez exécuter dès aujourd'hui + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources()
+ocr_engine.dispose() +``` + +**Sortie attendue** (exemple pour une image de facture simple) : + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Remarquez comment la couche IA a corrigé « Inv0ice » → « Invoice » et ajouté la ponctuation manquante. + +--- + +## Questions fréquemment posées (et réponses rapides) + +- **Ai‑je besoin d'un GPU ?** Non. Le script revient au CPU, mais l'inférence sera plus lente. +- **Puis‑je utiliser un autre LLM ?** Absolument — il suffit de changer `hugging_face_repo_id` et d'ajuster `gpu_layers` en conséquence. +- **Et si mon image est très grande ?** Redimensionnez‑la d'abord (par ex., avec Pillow) pour garder une utilisation de mémoire raisonnable. +- **La reconnaissance manuscrite est‑elle toujours activée ?** Vous pouvez basculer `enable_handwritten_recognition` selon votre charge de travail. + +--- + +## Conclusion + +Vous savez maintenant comment **reconnaître du texte à partir d'une image** avec Aspose OCR, comment **charger une image pour l'OCR**, et comment **effectuer l'OCR sur une image** avec un post‑traitement amélioré par l'IA. L'exemple complet et exécutable ci‑dessus vous fournit une base solide pour intégrer l'OCR dans n'importe quel projet Python — que ce soit pour scanner des reçus, numériser des contrats ou extraire des données de formulaires manuscrits. + +Prêt pour l'étape suivante ? Essayez de remplacer le modèle Qwen par un plus grand, expérimentez différents schémas de quantification, ou enchaînez plusieurs images pour un traitement par lots. Les possibilités sont infinies, et le code que vous venez de créer les gérera avec souplesse. + +Bon codage, et que vos résultats OCR soient toujours d'une clarté cristalline ! 
+ +![Capture d'écran de la console Python affichant le résultat OCR amélioré](/images/ocr_output.png){alt="Capture d'écran montrant comment reconnaître du texte à partir d'une image avec Aspose OCR"} + +## Que devriez‑vous apprendre ensuite ? + +Les tutoriels suivants couvrent des sujets étroitement liés qui s'appuient sur les techniques démontrées dans ce guide. Chaque ressource comprend des exemples de code complets et fonctionnels avec des explications étape par étape pour vous aider à maîtriser des fonctionnalités API supplémentaires et explorer des approches d'implémentation alternatives dans vos propres projets. + +- [Extraire du texte d'une image avec Aspose OCR – Guide étape par étape](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Convertir une image en texte – Effectuer l'OCR sur une image depuis une URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extraire le texte d'une image en C# avec sélection de langue en utilisant Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/german/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/german/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..e5e0ae033 --- /dev/null +++ b/ocr/german/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,215 @@ +--- +category: general +date: 2026-06-06 +description: Extrahiere Text aus handgeschriebenen Bildern mit Python‑OCR. Erfahre, + wie du handgeschriebene Fotos schnell und zuverlässig in Text umwandelst. 
+draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: de +og_description: Extrahiere Text aus handgeschriebenem Bild mit Python. Dieser Leitfaden + zeigt, wie man ein handgeschriebenes Foto in Text umwandelt und erklärt, wie man + handgeschriebenen Text erkennt. +og_title: Text aus handgeschriebenem Bild extrahieren – Python OCR‑Tutorial +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. + headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Text aus handschriftlichem Bild mit Python‑OCR extrahieren – Schritt‑für‑Schritt‑Anleitung +url: /de/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Text aus handgeschriebenem Bild mit Python OCR extrahieren – Schritt‑für‑Schritt‑Anleitung + +Haben Sie sich schon einmal gefragt, **wie man handgeschriebenen Text** in einem Foto erkennt, das Sie mit Ihrem Handy aufgenommen haben? Sie sind nicht allein. In vielen Projekten – sei es beim Digitalisieren von Vorlesungsnotizen oder beim Auslesen von unterschriebenen Formularen – müssen Sie **Text aus handgeschriebenem Bild** schnell und problemlos extrahieren. + +In diesem Tutorial gehen wir Schritt für Schritt durch ein vollständiges, sofort ausführbares Beispiel, das Ihnen genau zeigt, **wie man ein handgeschriebenes Foto in Text umwandelt** mithilfe einer populären Python‑OCR‑Bibliothek. Keine vagen Verweise, sondern konkreter Code, Erklärungen und Tipps, die Sie noch heute copy‑pasten können. 
+ +![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image") + +## Was Sie benötigen + +Bevor wir starten, stellen Sie sicher, dass Sie Folgendes haben: + +| Anforderung | Warum wichtig | +|-------------|----------------| +| Python 3.9 oder neuer | Moderne Syntax & Bibliotheksunterstützung | +| `pip` (Python‑Paketmanager) | Zum Installieren des OCR‑Pakets | +| Ein klares Bild von handgeschriebenen Notizen (JPEG/PNG) | OCR‑Genauigkeit leidet bei verschwommenen Bildern | +| Grundlegende Vertrautheit mit Python‑Funktionen | Hilft, das Beispiel später anzupassen | + +Falls Ihnen etwas fehlt, installieren Sie zunächst die neueste Python‑Version – ganz unkompliziert. + +## Installieren der Python‑OCR‑Bibliothek + +Das Code‑Snippet, das wir verwenden, beruht auf dem Paket `ocr` (ein leichter Wrapper um Tesseract, der praktische Einstellungen hinzufügt). Installieren Sie es mit einem einzigen Befehl: + +```bash +pip install ocr +``` + +> **Pro‑Tipp:** Führen Sie nach der Installation `tesseract --version` in Ihrem Terminal aus, um zu bestätigen, dass die zugrundeliegende Engine vorhanden ist. Wenn Sie Tesseract nicht haben, folgen Sie der offiziellen Anleitung für Ihr Betriebssystem – die meisten Paketmanager bieten es an (`apt-get install tesseract-ocr` unter Ubuntu, `brew install tesseract` unter macOS). + +## Schritt 1: Erstellen einer OCR‑Engine‑Instanz + +Das Erstellen der Engine ist der erste Baustein der **python ocr handwritten recognition**. Stellen Sie sich die Engine als das Gehirn vor, das später die Kritzeleien liest. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Warum das wichtig ist: Ohne eine Engine können Sie die Erkennungs‑Pipeline nicht anpassen.
Die Standardeinstellungen sind für Drucktext optimiert, sodass wir sie im nächsten Schritt anpassen müssen. + +## Schritt 2: Handgeschriebene Erkennung aktivieren + +Standardmäßig geht die Engine von gedruckten Zeichen aus. Das Aktivieren des handgeschriebenen Modus schaltet einen Schalter um, der Tesseract anweist, sein LSTM‑Modell zu verwenden, das auf kursive Striche trainiert ist. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Was passiert, wenn Sie das überspringen?** Die OCR behandelt die Striche als Rauschen, was zu wirrem Output führt. Das Setzen des Flags ist das Kernstück von **how to recognize handwritten text**. + +## Schritt 3: Laden Ihres handgeschriebenen Fotos + +Jetzt zeigen wir der Engine die Bilddatei. Der Pfad kann absolut oder relativ sein; stellen Sie nur sicher, dass die Datei existiert. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Ein kurzer Plausibilitäts‑Check: Öffnen Sie das Bild im Viewer Ihres Betriebssystems. Wenn der Text verschwommen ist, überlegen Sie, das Bild vorher zu bearbeiten (Kontrast erhöhen, drehen), bevor Sie es an die Engine übergeben – solche Anpassungen erhöhen häufig die Erfolgsrate von **extract text from handwritten image**. + +## Schritt 4: Erkennungsprozess ausführen + +Mit allen Einstellungen fragen wir die Engine schließlich nach ihrer Arbeit. Die Methode `recognize()` liefert ein Ergebnis‑Objekt, das den extrahierten String und Konfidenzwerte enthält. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Im Hintergrund wandelt die Engine das Bitmap in eine Reihe von Feature‑Vektoren um, lässt sie durch das LSTM‑Netz laufen und fügt die Zeichen zusammen. Das ist die Magie der **python ocr handwritten recognition**. 
+ +## Schritt 5: Extrahierten Text anzeigen + +Das Ergebnis‑Objekt stellt ein Attribut `.text` bereit, das den reinen Unicode‑String enthält. Drucken Sie ihn, schreiben Sie ihn in eine Datei oder leiten Sie ihn in eine andere Pipeline weiter – Sie entscheiden. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Erwartete Ausgabe + +Enthält das Quellbild die Notiz „Buy milk, eggs, and bread“, sehen Sie etwa Folgendes: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Beachten Sie, dass die Ausgabe Interpunktion und Zeilenumbrüche (falls vorhanden) beibehält. Wenn Sie Kauderwelsch erhalten, überprüfen Sie die Bildqualität und das Flag `enable_handwritten_recognition`. + +## Häufige Stolperfallen behandeln + +| Problem | Symptom | Lösung | +|---------|---------|--------| +| Niedrige Konfidenzwerte | Viele „?“ oder unsinnige Zeichen | Bild‑DPI auf ≥300 erhöhen, Binarisierung (`opencv`) anwenden oder auf den interessierenden Bereich zuschneiden. | +| Gemischte Sprachen | Ausgabe mischt Englisch mit einer anderen Schrift | Setzen Sie `engine.ocr_settings.language = "eng"` (oder einen anderen ISO‑Code) vor `recognize()`. | +| Große Dateien | Lange Verarbeitungszeit oder Speicherfehler | Bild vor dem Laden auf vernünftige Abmessungen verkleinern (z. B. maximale Breite 1200 px). | +| Fehlendes Tesseract | `ImportError` oder `FileNotFoundError` | Tesseract separat installieren und sicherstellen, dass es im System‑PATH liegt. | + +Diese Anpassungen halten Ihren **convert handwritten photo to text**‑Workflow robust über verschiedene Datensätze hinweg. + +## Vollständiges Skript, das Sie noch heute ausführen können + +Unten finden Sie das komplette, eigenständige Programm, das alle Bausteine zusammenführt. Kopieren Sie es in eine Datei namens `handwritten_ocr.py` und führen Sie `python handwritten_ocr.py` aus. 
+ +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Starten Sie es, und Sie sehen den Text in der Konsole – exakt das Ergebnis, das Sie benötigen, wenn Sie **convert handwritten photo to text** wollen. + +## Weiterführende Schritte + +Jetzt, wo Sie die Grundlagen von **extract text from handwritten image** beherrschen, denken Sie an folgende nächste Schritte: + +- **Batch‑Verarbeitung:** Durchlaufen Sie einen Ordner mit Bildern und speichern Sie jedes Ergebnis in einer CSV‑Datei. +- **Nachbearbeitung:** Verwenden Sie reguläre Ausdrücke, um typische OCR‑Fehler zu bereinigen (z. B. „1“ vs „l“). +- **Integration:** Füttern Sie die extrahierten Zeichenketten in eine Natural‑Language‑Processing‑Pipeline für Sentiment‑Analyse oder Stichwort‑Extraktion. +- **Alternative Bibliotheken:** Wenn Sie höhere Genauigkeit benötigen, probieren Sie `easyocr` oder `pytesseract` mit eigenen LSTM‑Modellen – beide unterstützen **python ocr handwritten recognition** ebenfalls. + +Denken Sie daran, dass die Qualität des Ausgangsbildes häufig den Erfolg bestimmt; investieren Sie ein paar Minuten in die Vorverarbeitung. Ein wenig zusätzlicher Aufwand jetzt spart später viel Debugging‑Zeit. 
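
Die beiden ersten Punkte der obigen Liste (Batch‑Verarbeitung und Nachbearbeitung) lassen sich in wenigen Zeilen skizzieren. Das folgende Beispiel ist nur eine Annahme, wie so etwas aussehen könnte: `fix_common_confusions` und `write_results_csv` sind frei gewählte Namen, die Ersetzungsregeln sind bewusst einfach gehalten, und `recognize` steht für eine von Ihnen bereitgestellte Funktion, die einen Bildpfad entgegennimmt und den erkannten Text zurückgibt.

```python
import csv
import re
from pathlib import Path
from typing import Callable, Iterable

# Assumption for this sketch: a digit 0 or 1 directly between two letters is
# treated as a misread "o" or "l". Real documents may need different rules.
_CONFUSIONS = (
    (re.compile(r"(?<=[A-Za-z])0(?=[A-Za-z])"), "o"),
    (re.compile(r"(?<=[A-Za-z])1(?=[A-Za-z])"), "l"),
)


def fix_common_confusions(text: str) -> str:
    """Apply the simple substitution rules above to one OCR string."""
    for pattern, replacement in _CONFUSIONS:
        text = pattern.sub(replacement, text)
    return text


def write_results_csv(image_paths: Iterable[Path],
                      recognize: Callable[[Path], str],
                      csv_path: Path) -> int:
    """Recognize each image, clean the text, and write one CSV row per file.

    `recognize` is a caller-supplied callable (hypothetical here) that wraps
    whatever OCR steps you use and returns plain text for one image path.
    """
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "text"])
        for image_path in image_paths:
            cleaned = fix_common_confusions(recognize(image_path))
            writer.writerow([image_path.name, cleaned])
            count += 1
    return count
```

Da die Erkennung als Funktion übergeben wird, bleibt die Skizze unabhängig von der verwendeten OCR‑Bibliothek; die Ersetzungsregeln können Fehlkorrekturen erzeugen (etwa bei Artikelnummern wie `A1B`) und sollten an Ihre Daten angepasst werden.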
+ +## Fazit + +Wir haben ein vollständiges End‑to‑End‑Beispiel durchgearbeitet, das zeigt, **wie man handgeschriebenen Text erkennt** und, noch wichtiger, **wie man Text aus einem handgeschriebenen Bild** mit Python extrahiert. Durch die Installation des `ocr`‑Pakets, das Setzen des Handwritten‑Flags, das Laden Ihres Bildes und den Aufruf von `recognize()` können Sie ein **handgeschriebenes Foto in Text umwandeln**, und das in nur wenigen Zeilen. + +Probieren Sie es mit Ihren eigenen Notizen aus, passen Sie die Vorverarbeitungsschritte an und lassen Sie die OCR die schwere Arbeit übernehmen. Wenn Sie auf Probleme stoßen, schauen Sie noch einmal in die Tabelle „Häufige Stolperfallen“ oder experimentieren Sie mit alternativen OCR‑Backends. Viel Spaß beim Coden, und mögen Ihre handgeschriebenen Daten sofort durchsuchbar werden! + +## Was sollten Sie als Nächstes lernen? + +Die folgenden Tutorials behandeln eng verwandte Themen, die auf den in diesem Leitfaden gezeigten Techniken aufbauen. Jede Ressource enthält vollständige, funktionierende Code‑Beispiele mit Schritt‑für‑Schritt‑Erklärungen, damit Sie weitere API‑Funktionen meistern und alternative Implementierungsansätze in Ihren eigenen Projekten erkunden können.
+ +- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [How to Recognize Page Rectangles for OCR Text Recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [How to Use OCR - Recognize Image without Text Area Detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/german/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/german/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..86e3e5243 --- /dev/null +++ b/ocr/german/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,258 @@ +--- +category: general +date: 2026-06-06 +description: 'OCR‑Zeichensatz‑Leitfaden: Erfahren Sie, wie Sie die OCR‑Engine nutzen, + um unterstützte Zeichen für lateinische und kyrillische Schriften aufzulisten, mit + vollständigen Python‑Beispielen.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: de +og_description: 'OCR‑Zeichensatz‑Leitfaden: Entdecken Sie, wie Sie die OCR‑Engine + in Python verwenden, um unterstützte Zeichen für verschiedene Schriftsysteme aufzulisten, + inklusive Code und Tipps.' +og_title: OCR‑Zeichensatz – OCR‑Engine abrufen und nutzen +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR‑Zeichensatz – OCR‑Engine in Python abrufen und verwenden +url: /de/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR-Zeichensatz – OCR‑Engine in Python abrufen und verwenden + +Möchten Sie den **OCR‑Zeichensatz** kennen, den Ihre Bibliothek unterstützt? In diesem Tutorial zeigen wir Ihnen, wie Sie die **OCR‑Engine nutzen**, um die unterstützten Zeichen für verschiedene Schriftsysteme abzufragen, und warum das für reale Projekte wichtig ist. + +Wenn Sie schon einmal auf einen leeren Bildschirm gestarrt und sich gefragt haben, warum manche Buchstaben nach der OCR‑Verarbeitung nie erscheinen, sind Sie nicht allein. Die Antwort liegt oft im Zeichensatz, den die Engine kennt. Am Ende dieses Leitfadens können Sie: + +* Jedes Zeichen auflisten, das eine OCR‑Engine für ein bestimmtes Schriftsystem erkennen kann. +* Fälle behandeln, in denen ein Schriftsystem nicht unterstützt wird. +* Die Zeichensatz‑Informationen in nachgelagerte Validierungen oder UI‑Logik einbinden. + +Keine magischen Black‑Box‑Tricks – nur reines Python, ein paar Code‑Zeilen und ein wenig Erklärung. + +--- + +## Voraussetzungen + +Bevor wir loslegen, stellen Sie sicher, dass Folgendes vorhanden ist: + +* Python 3.8 oder neuer (der Code verwendet f‑Strings). +* Die OCR‑Bibliothek, die `ocr.OcrEngine` bereitstellt – für dieses Beispiel gehen wir davon aus, dass ein Paket namens `ocr` bereits über `pip install ocr-lib` installiert ist. +* Grundlegende Vertrautheit mit Pythons Importsystem und dem Ausgeben von Text.
+ +Falls Ihnen die Bibliothek fehlt, führen Sie aus: + +```bash +pip install ocr-lib +``` + +Das war’s – keine zusätzlichen Binärdateien oder nativen Abhängigkeiten nötig für die einfachen Zeichen‑Set‑Abfragen, die wir durchführen werden. + +--- + +## Schritt 1: Initialisieren der OCR‑Engine‑Instanz + +Ein Engine‑Objekt zu erstellen ist das Erste, was Sie tun, wenn Sie **use OCR engine**‑Funktionalität nutzen. Denken Sie daran wie das Einschalten eines Scanners; ohne ihn können Sie keine Fragen stellen. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +Die Klasse `OcrEngine` ist ein leichter Wrapper um die zugrunde liegende Erkennungs‑Engine. Sie startet keine aufwändige Verarbeitung, bis Sie etwas anfordern, sodass die Startkosten vernachlässigbar sind. + +> **Pro Tipp:** Wenn Ihre Anwendung viele Engine‑Instanzen erstellt, verwenden Sie eine einzelne wieder. Das spart Speicher und reduziert die Initialisierungslatenz. + +--- + +## Schritt 2: Abfragen des OCR‑Zeichensatzes für ein bestimmtes Skript + +Jetzt, wo wir eine Engine haben, können wir sie fragen, welche Zeichen sie für ein bestimmtes Schriftsystem kennt. Die Methode `get_supported_characters(script_name)` gibt eine Python‑`list` von Unicode‑Zeichen zurück. + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +Das Ausführen des obigen Snippets gibt etwa Folgendes aus: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +Die genaue Ausgabe hängt von der Version der OCR‑Bibliothek ab, aber Sie erhalten immer eine flache Liste von Zeichen. 
Wenn Sie sowohl Groß‑ als auch Kleinbuchstaben benötigen, fragen Sie einfach das Skript `"Latin"` an und wenden anschließend `.lower()` auf jedes Element an, oder fragen Sie ein separates Skript `"Latin‑lower"` ab, falls die Bibliothek diese unterscheidet. + +### Warum ist das wichtig? + +Wenn Sie ein Bild mit ungewöhnlichen Diakritika (z. B. „ñ“ oder „ø“) einspeisen, kann die OCR‑Engine diese stillschweigend durch einen Platzhalter ersetzen oder ganz weglassen. Das Vorab‑Wissen über den **ocr character set** ermöglicht Ihnen, Eingaben zu validieren, Benutzer zu warnen oder auf eine andere Engine zurückzugreifen. + +--- + +## Schritt 3: Abrufen von Zeichen für ein weiteres Skript – Kyrillisches Beispiel + +Die gleiche Methode funktioniert für jedes Skript, das die Engine zu unterstützen behauptet. Schauen wir uns an, was bei Kyrillisch passiert. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +Typische Ausgabe: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +Wenn die Engine das angeforderte Skript **nicht** unterstützt, wirft sie in der Regel einen `ValueError` (oder gibt je nach Implementierung eine leere Liste zurück). Lassen Sie uns dagegen absichern. 
+ +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## Schritt 4: Visualisieren des OCR‑Zeichensatzes (optional) + +Manchmal hilft eine schnelle Visualisierung, besonders wenn Sie Stakeholdern zeigen möchten, welche Zeichen abgedeckt sind. Unten ein minimales Beispiel mit `matplotlib`, das ein Raster von Zeichen rendert. + +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Bild‑Alt‑Text:** ![OCR‑Zeichensatz‑Diagramm](ocr_character_set.png){alt="Übersicht des OCR‑Zeichensatzes"} + +Das Diagramm ist für die Kernlösung nicht erforderlich, demonstriert aber, wie Sie **use OCR engine**‑Metadaten über reines Ausdrucken hinaus nutzen können. + +--- + +## Schritt 5: Integrieren des Zeichensatzes in reale Workflows + +Jetzt, wo wir den **ocr character set** abrufen können, sehen wir uns ein praktisches Szenario an: Validierung von OCR‑Ausgaben, bevor sie in einer Datenbank gespeichert werden. 
+ +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +Dieses Muster verhindert, dass Mülldaten in nachgelagerte Pipelines gelangen – ein häufiges Schmerz‑Punkt‑Problem bei mehrsprachigen Dokumenten. + +--- + +## Häufige Fallstricke und Randfälle + +| Fallstrick | Warum es passiert | Wie zu vermeiden | +|------------|-------------------|------------------| +| **Tippfehler im Skriptnamen** (z. B. `"Cyrillic "` mit nachfolgendem Leerzeichen) | Die Engine behandelt den String wörtlich und kann das Skript nicht finden. | Entfernen Sie Leerzeichen: `script.strip()` bevor Sie `get_supported_characters` aufrufen. | +| **Leere Zeichenliste** | Einige Engines geben ein Skript zurück, haben das Sprachmodell aber noch nicht geladen. | Rufen Sie `engine.load_language_model(script)` auf, falls die Bibliothek eine solche Methode bereitstellt, oder stellen Sie sicher, dass die Modelldateien vorhanden sind. | +| **Unicode‑Normalisierungs‑Probleme** | Zeichen wie „é“ können als zusammengesetzt (`\u00E9`) oder zerlegt (`e\u0301`) auftreten. | Normalisieren Sie Zeichenketten mit `unicodedata.normalize('NFC', text)` vor der Validierung. | +| **Performance bei riesigen Skripten** (z. B. Chinesisch) | Das Abrufen von Tausenden von Zeichen kann langsam sein. | Cachen Sie das Ergebnis nach dem ersten Aufruf; die meisten Anwendungen benötigen die Liste nur einmal. 
| + +--- + +## Complete Working Example + +Putting it all together, here is a single script you can copy and run: + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() +``` + + +## What Should You Learn Next? + + +The following tutorials cover closely related topics that build on the techniques shown in this guide. Each resource includes complete, working code examples with step-by-step explanations to help you master further API features and explore alternative implementation approaches in your own projects.
+ +- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/) +- [OCR Post-Processing – Get Choices for Recognized Characters](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [How to Set the Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/german/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/german/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..c94f6f3ad --- /dev/null +++ b/ocr/german/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,254 @@ +--- +category: general +date: 2026-06-06 +description: Perform OCR on an image with Aspose OCR and a Hugging Face model. + Learn how to download the Hugging Face model, extract text from an invoice, + and free GPU resources. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: de +og_description: Perform OCR on an image with Aspose OCR and a Hugging Face model. + This tutorial shows how to download the model, extract text from an invoice, + and free GPU resources. +og_title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources.
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide +url: /de/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Perform OCR on Image with Aspose OCR & LLM – Complete Guide + +Have you ever wanted to **perform OCR on image** files but got stuck on the question "Where do I start?" You are not alone; many developers hit this hurdle when they first tackle document automation. The good news is that with Aspose OCR and a lightweight LLM from Hugging Face you can turn a raw invoice scan into clean, searchable text in just a few lines of Python. + +In this tutorial we walk through everything you need: **loading an image for OCR**, **downloading the Hugging Face model**, **extracting text from invoice** data, and finally **freeing GPU resources** so your application stays lean. By the end you will have a standalone script that you can drop into any project. + +--- + +## What You Will Learn + +- How to **perform OCR on an image** with Aspose's `OcrEngine`. +- The exact steps to **download Hugging Face model** files automatically. +- Techniques to **extract text from invoice** PDFs or PNGs with AI-enhanced post-processing. +- Best practices to **free GPU resources** after inference. +- Tips on how to **load an image for OCR** efficiently and avoid common pitfalls. + +No external documentation is required; everything you need is here, including complete code, explanations, and expected output.
+ +--- + +## Prerequisites + +| Requirement | Reason | +|-------------|--------| +| Python 3.9+ | Modern syntax and type hints | +| `asposeocr` package (`pip install asposeocr`) | Core OCR engine | +| Access to a GPU (optional but recommended) | Speeds up the LLM post-processor | +| An invoice image (`sample_invoice.png`) | Realistic test sample | + +If any of these are missing, install them now. The script also **downloads the Hugging Face model** automatically, so you do not have to hunt for files yourself. + +--- + +## Step 1: Perform OCR on Image – Create the Engine + +The first thing to do is start Aspose's OCR engine. Think of it as opening a blank canvas onto which the image will later be "painted" as text. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Why this matters:** `OcrEngine` abstracts all the low‑level image preprocessing, so you can focus on the higher‑level workflow. It also exposes a `set_post_processor` method that will later let us hook an LLM for smarter output. + +--- + +## Step 2: Load Image for OCR – Choose the Right File + +Now that the engine exists, we need to **load the image for OCR**. Aspose supports PNG, JPG, TIFF, and several other formats. Make sure the path is either absolute or correct relative to the directory you run the script from. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Tip:** If your image is large, consider resizing it beforehand to reduce memory pressure. The OCR engine can handle high‑resolution scans, but a 300 DPI image is usually a sweet spot for invoices. + +--- + +## Step 3: Run Raw OCR and Inspect the Extracted Text + +With the image loaded, we can finally **perform OCR on the image** and see what the raw engine produces.
This step gives us a baseline before we add any AI magic. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Expected output (truncated):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +The raw output often contains stray line breaks, misrecognized characters, or missing fields, which is exactly why we bring in a language model next. + +--- + +## Step 4: Download Hugging Face Model – Configure the LLM Post-Processor + +This is where the **download Hugging Face model** step comes in. Aspose AI can fetch a model from the Hugging Face Hub automatically if it is not already on disk. We use the Qwen2.5‑3B‑Instruct‑GGUF model, which balances accuracy and memory footprint well. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Why this works:** `allow_auto_download` saves you from manually downloading the `.gguf` file. Quantization (`int8`) reduces the model size to roughly 3 GB, making it feasible on most consumer GPUs. Adjust `gpu_layers` based on your hardware: the more layers that run on the GPU, the faster the inference. + +--- + +## Step 5: Extract Text from Invoice with AI-Enhanced Post-Processing + +Now we connect the LLM to the OCR engine and run a **post-processor** that cleans up the raw output, corrects OCR errors, and formats the invoice fields neatly.
+ +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Example of enhanced output:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **What happened?** The LLM recognized that “Invoice #12345” should be “Invoice Number: 12345”, fixed the date format, and even inferred the “Bill To” field that the raw engine missed. This is the core of **extract text from invoice** automation. + +--- + +## Step 6: Free GPU Resources – Clean Up After Processing + +If you run this inside a long-lived service (for example a Flask API), you need to **free GPU resources** after each inference to avoid out-of-memory crashes. Aspose AI provides a straightforward method for this. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro tip:** Call `free_resources()` inside a `finally:` block if you’re wrapping the OCR call in a try/except. That guarantees cleanup even when an exception occurs. + +--- + +## Step 7: Complete Script – Putting It All Together + +Below is the complete, ready-to-run script. Copy it, adjust the paths, and you are good to go.
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Run the script and watch noisy OCR output turn into clean, structured invoice data. 🎉 + +--- + +## Common Questions & Edge Cases + +| Question | Answer | +|----------|--------| +| **What happens if the model cannot be downloaded?** | Make sure your machine has internet access and that the `hugging_face_repo_id` is correct. You can also download the model manually. | + +## What Should You Learn Next? + +The following tutorials cover closely related topics that build on the techniques shown in this guide. Each resource includes complete, working code examples with step-by-step explanations to help you master further API features and explore alternative implementation approaches in your own projects.
+ +- [Extract Image Text in C# with Language Selection Using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/german/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/german/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..5958b01b4 --- /dev/null +++ b/ocr/german/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,283 @@ +--- +category: general +date: 2026-06-06 +description: Recognize text from an image with Aspose OCR – learn how to load an image + for OCR and perform OCR on the image with AI post-processing in Python. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: de +og_description: Recognize text from images quickly. This guide shows how to + load an image for OCR, perform OCR on the image, and improve the results with AI post-processing + . +og_title: Recognize Text from Image Using Aspose OCR & AI +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python.
+ name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? 
+ - answer: Absolutely. Just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Recognize Text from Image Using Aspose OCR & AI +url: /de/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Recognize Text from Image Using Aspose OCR & AI + +Have you ever needed to recognize text from an image but were not sure which library offers both speed and accuracy? You are not alone. In this guide we walk through a complete end-to-end example that shows **how to load an image for OCR**, **perform OCR on an image**, and then refine the result with Aspose's AI post-processor. By the end you will have a ready-to-run script that turns a PNG into clean, searchable text. + +## What You Will Learn + +We cover everything from installing the Aspose OCR package to releasing resources at the end of the run. You will see why enabling handwriting recognition matters, how to configure a Qwen 2.5 LLM for post-processing, and what the final result looks like. No external references are needed; just copy, paste, and run. + +### Prerequisites + +- Python 3.8 or newer +- `asposeocr` package (`pip install asposeocr`) +- An image file (e.g. `doc.png`) that contains printed or handwritten text +- Optional: a GPU for faster LLM inference (the script works on CPU too) + +--- + +## Recognize Text from Image – Step by Step + +Below each code block you will find a short explanation of **why** we take that particular action, not just **what** the line does. + +### Step 1: Install and Import the Required Modules + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Why?* Importing `asposeocr` gives us the `OcrEngine` class, while the `ai` submodule provides the LLM-based post-processor that substantially improves the raw OCR result. + +### Step 2: Create the OCR Engine and Enable Handwriting Recognition + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Enabling handwriting recognition broadens what the engine can read, so you do not lose scribbles when you **perform OCR on image** files that mix printed and cursive text. + +### Step 3: Load the Image for OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +The `load_image` call is the moment where you **load the image for OCR**. If the path is wrong, the engine raises an informative exception that saves you from cryptic errors further down the line.
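
Since a wrong path is the most likely failure at this step, it can help to validate the file before handing it to the engine. The following is a minimal sketch using only the standard library; the helper name `resolve_image_path` and the extension list are illustrative assumptions, not part of the Aspose API.

```python
from pathlib import Path

# Illustrative assumption: extensions worth accepting.
# Adjust this set to the formats your OCR engine actually supports.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def resolve_image_path(path_str: str) -> Path:
    """Return an absolute path to an existing image file, or raise a clear error."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    if path.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported image extension: {path.suffix!r}")
    return path
```

The returned path could then be passed on as a string, for example `ocr_engine.load_image(str(resolve_image_path("YOUR_DIRECTORY/doc.png")))`.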
+ +### Step 4: Run the Raw OCR Pass + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +At this stage you get a `RecognitionResult` object that holds the unfiltered text, confidence scores, and layout metadata. It is often noisy, hence the need for AI-assisted cleanup. + +### Step 5: Configure the Aspose AI Model for LLM Post-Processing + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Why these settings? +- **auto-download** makes sure the model is available on the first run. +- **int8 quantization** cuts memory usage sharply without significantly hurting accuracy. +- **gpu_layers** lets you use a compatible GPU for faster inference. + +### Step 6: Initialise the AI Processor and Attach It as a Post-Processor + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +Attaching the processor means that every time you call `run_postprocessor`, the LLM cleans up spelling, merges split words, and even adds missing punctuation.
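
To see what a post-processing pass actually changed, one option is to diff the raw and enhanced strings once both are available. The sketch below uses Python's standard `difflib`; the two sample strings are made up for illustration, and the helper name `changed_lines` is hypothetical.

```python
import difflib
from typing import List


def changed_lines(raw_text: str, enhanced_text: str) -> List[str]:
    """Return only the removed ('-') and added ('+') lines between two OCR outputs."""
    diff = list(
        difflib.unified_diff(
            raw_text.splitlines(),
            enhanced_text.splitlines(),
            fromfile="raw",
            tofile="enhanced",
            lineterm="",
        )
    )
    # The first two entries are the '--- raw' / '+++ enhanced' file headers;
    # skip them, then keep only real additions and removals.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Made-up sample strings standing in for raw_result.text and enhanced_result.text.
raw = "Inv0ice #12345\nTotal Amount $1,250.00"
enhanced = "Invoice #12345\nTotal Amount: $1,250.00"
for line in changed_lines(raw, enhanced):
    print(line)
```

A line-level diff is coarse, but it is usually enough to spot whether the post-processor only fixed characters or also rewrote content that should be reviewed by hand.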
+ +### Step 7: Run the Post-Processor to Enhance the OCR Result + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +The `enhanced_result.text` is usually far more readable than the raw string; think of it as a spell checker that also understands context. + +### Step 8: Release Resources + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Cleanup is critical in long-running services; otherwise you leak GPU memory and your application eventually crashes. + +--- + +## Complete Script You Can Run Today + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources()
+ocr_engine.dispose() +``` + +**Expected output** (example for a simple invoice image): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Note how the AI layer corrected "Inv0ice" to "Invoice" and added missing punctuation. + +--- + +## Frequently Asked Questions (and Quick Answers) + +- **Do I need a GPU?** No. The script falls back to the CPU, but inference is slower. +- **Can I use a different LLM?** Yes. Change `hugging_face_repo_id` and adjust `gpu_layers` accordingly. +- **What if my image is huge?** Resize it first (for example with Pillow) to keep memory usage reasonable. +- **Is handwriting recognition always on?** You can switch `enable_handwritten_recognition` on or off depending on your workload. + +--- + +## Conclusion + +You now know how to **recognize text from an image** with Aspose OCR, how to **load an image for OCR**, and how to **perform OCR on an image** with AI-enhanced post-processing. The complete, runnable example above gives you a solid foundation for integrating OCR into any Python project, whether you are scanning receipts, digitizing contracts, or extracting data from handwritten forms. + +Ready for the next step? Try swapping the Qwen model for a larger one, experiment with different quantization schemes, or chain several images together for batch processing. The script above is a reasonable starting point for each of these. + +Happy coding, and may your OCR results always be crystal clear!
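
The batch-processing idea mentioned above, combined with the cleanup from Step 8, can be sketched as follows. This is a minimal outline under stated assumptions: `recognize` and `cleanup` are hypothetical callables that you would back with your own engine calls (for example, one function wrapping `load_image`, `recognize`, and `run_postprocessor`, and another wrapping `free_resources` and `dispose`). The sketch itself does not call the Aspose API.

```python
from typing import Callable, Dict, Iterable


def process_batch(
    image_paths: Iterable[str],
    recognize: Callable[[str], str],
    cleanup: Callable[[], None],
) -> Dict[str, str]:
    """Run `recognize` on each path and always call `cleanup` afterwards."""
    results: Dict[str, str] = {}
    try:
        for path in image_paths:
            try:
                results[path] = recognize(path)
            except Exception as exc:
                # Record the failure and keep going with the remaining images.
                results[path] = f"ERROR: {exc}"
    finally:
        # Runs even if the loop is interrupted, so resources are not leaked.
        cleanup()
    return results
```

Keeping the engine calls behind two plain callables also makes the loop easy to test without a GPU or model download, since a stub function can stand in for the real recognizer.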
+ +![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="Screenshot showing how to recognize text from an image with Aspose OCR"} + +## What You Should Learn Next + +The following tutorials cover closely related topics that build on the techniques shown in this guide. Each resource includes complete, working code examples with step-by-step explanations to help you master further API features and explore alternative implementation approaches in your own projects. + +- [Extract Text from Image with Aspose OCR – Step-by-Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extract Image Text in C# with Language Selection Using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/greek/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/greek/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..874434531 --- /dev/null +++ b/ocr/greek/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,215 @@ +--- +category: general +date: 2026-06-06 +description: Extract text from a handwritten image using Python OCR. Learn + how to convert a handwritten photo to text quickly and reliably.
+draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: el +og_description: Extract text from a handwritten image with Python. This guide shows + how to convert a photo of handwriting to text and explains how to recognize + handwritten text. +og_title: Extract Text from Handwritten Image – Python OCR Guide +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. + headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Extract Text from Handwritten Image with Python OCR – Step-by-Step Guide +url: /el/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Extract Text from Handwritten Image with Python OCR – Step-by-Step Guide + +Have you ever wondered **how to recognize handwritten text** in a photo you took with your phone? You are not alone. In many projects, whether you are digitizing lecture notes or pulling data from signed forms, you need to **extract text from a handwritten image** quickly and without trouble. + +In this tutorial we walk through a complete, ready-to-run example that shows exactly how to **convert a handwritten photo to text** using a popular Python OCR library. No vague references, just concrete code, explanations, and tips that you can copy and paste today.
+ +![Extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "Extract text from handwritten image") + +## What You Will Need + +Before we dive in, make sure you have the following: + +| Requirement | Why it matters | +|----------|------------------------| +| Python 3.9 or newer | Modern syntax & library support | +| `pip` (Python package manager) | To install the OCR package | +| A clear image of handwritten notes (JPEG/PNG) | OCR accuracy drops on blurry images | +| Basic familiarity with Python functions | Helps you adapt the example later | + +If any of these are missing, download the latest version of Python and install it before you continue. + +## Installing the Python OCR Library + +The code snippet we will use relies on the `ocr` package (a lightweight wrapper around Tesseract that adds useful settings). Install it with a single command: + +```bash +pip install ocr +``` + +> **Pro tip:** After installation, run `tesseract --version` in your terminal to confirm the underlying engine is present. If you don’t have Tesseract, follow the official guide for your OS. Most package managers have it (`apt-get install tesseract-ocr` on Ubuntu, `brew install tesseract` on macOS). + +## Step 1: Create an OCR Engine Object + +Creating the engine is the first brick in the wall of **python ocr handwritten recognition**. Think of the engine as the brain that will later read the scribbles. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Why this matters: without an engine you can’t tweak the recognition pipeline. The default settings are tuned for printed text, so we’ll need to adjust them in the next step.
+ +## Step 2: Enable Handwriting Recognition + +By default the engine assumes printed characters. Enabling handwritten mode flips a switch that tells Tesseract to use its LSTM model trained on cursive strokes. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **What if you skip this?** The OCR will treat the strokes as noise, resulting in garbled output. Enabling the flag is the core of **how to recognize handwritten text**. + +## Step 3: Load Your Handwritten Photo + +Now we point the engine at the image file. The path can be absolute or relative; just be sure the file exists. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +A quick sanity check: open the image in your OS’s viewer. If the text is blurry, consider pre‑processing (contrast boost, rotation) before feeding it to the engine. Those tweaks often raise the success rate when you **extract text from handwritten image** files. + +## Step 4: Run the Recognition Process + +With everything set, we finally ask the engine to do its job. The `recognize()` method returns a result object that holds the extracted string and confidence scores. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Behind the scenes, the engine converts the bitmap into a series of feature vectors, runs them through the LSTM network, and stitches the characters together. That’s the magic of **python ocr handwritten recognition**. + +## Step 5: Display the Extracted Text + +The result object exposes a `.text` attribute that contains the plain Unicode string. Print it, write it to a file, or feed it into another pipeline, whichever suits your workflow.
+
+```python
+# Step 5: Print the extracted text to the console
+print("=== Extracted Text ===")
+print(handwritten_result.text)
+```
+
+### Expected Output
+
+If the source image contains the note “Buy milk, eggs, and bread”, you'd see something like:
+
+```
+=== Extracted Text ===
+Buy milk, eggs, and bread
+```
+
+Notice how the output preserves punctuation and line breaks (if any). If you get gibberish, double-check the image quality and the `enable_handwritten_recognition` flag.
+
+## Handling Common Pitfalls
+
+| Problem | Symptom | Fix |
+|---------|---------|-----|
+| Low confidence scores | Many “?” or nonsensical characters | Increase image DPI to ≥300, apply binarization (`opencv`), or crop to the region of interest. |
+| Mixed languages | Output mixes English and another script | Set `engine.ocr_settings.language = "eng"` (or another ISO code) before `recognize()`. |
+| Large files | Long processing time or memory error | Resize the image to a reasonable dimension (e.g., max width 1200 px) before loading. |
+| Missing Tesseract | `ImportError` or `FileNotFoundError` | Install Tesseract separately and ensure it's on your system PATH. |
+
+These adjustments keep your **convert handwritten photo to text** workflow robust across diverse datasets.
+
+## Full Script You Can Run Today
+
+Below is the complete, self-contained program that puts all the pieces together. Copy it into a file named `handwritten_ocr.py` and execute `python handwritten_ocr.py`.
+
+```python
+#!/usr/bin/env python3
+"""
+handwritten_ocr.py
+
+A minimal example that demonstrates how to extract text from handwritten image
+using Python OCR (handwritten recognition mode). Adjust `image_path` to point
+to your own file before running.
+"""
+
+import ocr  # pip install ocr
+
+def main():
+    # 1️⃣ Create the OCR engine
+    engine = ocr.OcrEngine()
+
+    # 2️⃣ Enable the handwritten recognition flag
+    engine.ocr_settings.enable_handwritten_recognition = True
+
+    # 3️⃣ Load the target image
+    image_path = "YOUR_DIRECTORY/handwritten_note.jpg"
+    engine.load_image(image_path)
+
+    # 4️⃣ Run the OCR process
+    result = engine.recognize()
+
+    # 5️⃣ Output the extracted text
+    print("=== Extracted Text ===")
+    print(result.text)
+
+if __name__ == "__main__":
+    main()
+```
+
+Run it, and you'll see the text printed to the console, which is exactly what you need when you want to **convert handwritten photo to text**.
+
+## Going Further
+
+Now that you've covered the basics of **extract text from handwritten image**, consider these next steps:
+
+- **Batch processing:** Loop over a folder of images and store each result in a CSV file.
+- **Post-processing:** Use regular expressions to clean up common OCR errors (e.g., “1” vs “l”).
+- **Integration:** Feed the extracted strings into a Natural Language Processing pipeline for sentiment analysis or keyword extraction.
+- **Alternative libraries:** If you need higher accuracy, explore `easyocr` or `pytesseract` with custom LSTM models; both can be applied to **python ocr handwritten recognition** as well.
+
+Remember, the quality of the source image often dictates success, so spend a few minutes on preprocessing. A little extra effort now saves you a lot of debugging later.
+
+## Conclusion
+
+We've walked through a complete, end-to-end example that shows **how to recognize handwritten text** and, more importantly, how to **extract text from handwritten image** using Python. By installing the `ocr` package, toggling the handwritten flag, loading your picture, and calling `recognize()`, you can **convert handwritten photo to text** in just a handful of lines.
+
+Give it a try with your own notes, tweak the preprocessing steps, and let the OCR do the heavy lifting.
If you hit any snags, revisit the “Handling Common Pitfalls” table or experiment with alternative OCR back-ends. Happy coding, and may your handwritten data become instantly searchable!
+
+## What Should You Learn Next?
+
+The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Extract Text from Image with Aspose OCR – Step-by-Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [How to Recognize Page Rectangles for OCR Text Recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/)
+- [How to Use OCR - Recognize Image without Text Area Detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/greek/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/greek/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
new file mode 100644
index 000000000..79f90f648
--- /dev/null
+++ b/ocr/greek/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
@@ -0,0 +1,218 @@
+---
+category: general
+date: 2026-06-06
+description: 'OCR character set guide: learn how to use the OCR engine to list the
+  supported characters for the Latin and Cyrillic scripts, with full Python
+  examples.'
+draft: false
+keywords:
+- ocr character set
+- use OCR engine
+- supported OCR characters
+- script detection OCR
+- OCR library Python
+language: el
+og_description: 'OCR character set guide: discover how to use the OCR engine in Python
+  to list the supported characters for different writing systems, with full code
+  and tips.'
+og_title: OCR Character Set – Retrieve and Use the OCR Engine
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: 'OCR character set guide: learn how to use OCR engine to list supported
+    characters for Latin and Cyrillic scripts with full Python examples.'
+  headline: OCR Character Set – Retrieve and Use OCR Engine in Python
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Character Sets
+title: OCR Character Set – Retrieve and Use the OCR Engine in Python
+url: /el/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# OCR Character Set – Retrieve and Use the OCR Engine in Python
+
+Want to know the **ocr character set** your library supports? In this tutorial we'll show you how to **use OCR engine** to query the supported characters for different scripts, and why that matters for real projects.
+
+If you've ever stared at a blank screen wondering why certain letters never appear after OCR processing, you're not alone. The answer is often hidden in the character set the engine knows. By the end of this guide you'll be able to:
+
+* List every character an OCR engine can recognize for a specific script.
+* Handle cases where a script is not supported.
+* Feed the character-set information into downstream validation or UI logic.
+
+No black-box magic: just plain Python, a few lines of code, and a little explanation.
+
+---
+
+## Prerequisites
+
+Before moving on, make sure you have:
+
+* Python 3.8+ installed (the code uses f-strings).
+* The OCR library that provides `ocr.OcrEngine`; for this example we'll assume a package that is imported as `ocr` and has already been installed via `pip install ocr-lib`.
+* Basic familiarity with Python's import system and printing.
+
+If the library is missing, run:
+
+```bash
+pip install ocr-lib
+```
+
+That's all: no extra binaries or native dependencies are needed for the basic character-set queries we'll run.
+
+## Step 1: Initialize the OCR Engine Object
+
+Creating an engine object is the first step when you **use OCR engine** functionality. Think of it as powering on a scanner; without it, you can't ask any questions.
+
+```python
+import ocr
+
+# Step 1: Create an OCR engine instance
+engine = ocr.OcrEngine()
+```
+
+The `OcrEngine` class is a lightweight wrapper around the core recognition engine. It doesn't start heavy processing until you request something, so the startup cost is negligible.
+
+> **Tip:** If your application creates many engine instances, reuse a single one instead. It saves memory and reduces startup latency.
+
+## Step 2: Query the OCR Character Set for a Specific Script
+
+Now that we have an engine, we can ask it which characters it knows for a given writing system. The `get_supported_characters(script_name)` method returns a Python `list` of Unicode characters.
+
+```python
+# Step 2: Get the list of characters the engine can recognize for the Latin script
+latin_chars = engine.get_supported_characters("Latin")
+print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}")
+```
+
+Running the snippet above prints something like:
+
+```
+Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+```
+
+The exact output depends on the OCR library version, but you will always get a flat list of characters. If you need both uppercase and lowercase, request the `"Latin"` script and then apply `.lower()` to each element, or query a separate `"Latin‑lower"` script if the library distinguishes them.
+
+### Why does this matter?
+
+When you feed in an image containing unusual diacritics (e.g., “ñ” or “ø”), the OCR engine may silently replace them with a placeholder or drop them entirely. Knowing the **ocr character set** up front lets you pre-validate input, warn users, or switch to a different engine.
+
+## Step 3: Retrieve Characters for Another Script – Cyrillic Example
+
+The same method works for any script the engine supports. Let's see what happens with Cyrillic.
+
+```python
+# Step 3: Get the list of characters the engine can recognize for the Cyrillic script
+cyrillic_chars = engine.get_supported_characters("Cyrillic")
+print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}")
+```
+
+Typical output:
+
+```
+Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я']
+```
+
+If the engine does **not** support the requested script, it usually raises a `ValueError` (or returns an empty list, depending on the implementation).
Let's guard against that.
+
+```python
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+# Example usage:
+safe_get_characters(engine, "Arabic")  # Might trigger the error branch
+```
+
+## Step 4: Visualize the OCR Character Set (Optional)
+
+Sometimes a quick visual aid is useful, especially when you need to show stakeholders which characters you cover. Below is a minimal example that uses `matplotlib` to render a grid of characters.
+
+```python
+import matplotlib.pyplot as plt
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Visualize Latin characters
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+![OCR character set overview](ocr_character_set.png)
+
+The chart is not required for the core solution, but it shows how you can **use OCR engine** metadata beyond simple printing.
+
+## Step 5: Integrate the Character Set into Real Workflows
+
+Now that we can retrieve the **ocr character set**, let's look at a practical scenario: validating the OCR result before storing it in a database.
+
+```python
+def validate_ocr_output(text, allowed_chars):
+    """Return True if every character in `text` exists in `allowed_chars`."""
+    invalid = [c for c in text if c not in allowed_chars]
+    if invalid:
+        print(f"Invalid characters found: {invalid}")
+        return False
+    return True
+
+# Suppose we ran OCR on an image and got this result:
+ocr_result = "HELLO WORLD!"
+
+# Use the previously fetched Latin set for validation.
+# With the 26-letter example set shown earlier, the space and '!' would be flagged.
+if validate_ocr_output(ocr_result, set(latin_chars)):
+    print("All characters are supported – safe to store.")
+else:
+    print("Result contains unsupported symbols – consider manual review.")
+```
+
+This pattern keeps junk data out of downstream pipelines, a common problem when working with multilingual documents.
+
+## Common Pitfalls and Edge Cases
+
+| Pitfall | Why it Happens | How to Avoid |
+|---------|----------------|--------------|
+| **Misspelled script name** (e.g., `"Cyrillic "` with a trailing space) | The engine treats the string literally and cannot find the script. | Strip whitespace with `script.strip()` before calling `get_supported_characters`. |
+| **Empty character list** | Some engines expose a script but have not yet loaded the language model. | Call `engine.load_language_model(script)` if the library provides such a method, or make sure the model files are present. |
+| **Unicode normalization issues** | Characters such as “é” may appear precomposed (`\u00E9`) or decomposed (`e\u0301`). | Normalize strings with `unicodedata.normalize('NFC', text)` before validation. |
+| **Performance on huge scripts** (e.g., Chinese) | Retrieving thousands of characters can be slow. | Cache the result after the first call; most applications need the list only once.
|
+
+## Full Working Example
+
+Putting it all together, here is a single script you can copy, paste, and run:
+
+
+
+## What to Learn Next?
+
+The following tutorials cover closely related topics that build on the techniques presented in this guide. Each resource includes complete code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/)
+- [OCR Post Processing – Get Choices for Recognized Characters](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/)
+- [How to Set a Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/greek/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/greek/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
new file mode 100644
index 000000000..da57b87fa
--- /dev/null
+++ b/ocr/greek/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
@@ -0,0 +1,236 @@
+---
+category: general
+date: 2026-06-06
+description: Perform OCR on an image using Aspose OCR and a Hugging Face model. Learn
+  how to download the Hugging Face model, extract text from an invoice, and free
+  GPU resources.
+draft: false
+keywords:
+- perform OCR on image
+- download Hugging Face model
+- extract text from invoice
+- free GPU resources
+- load image for OCR
+language: el
+og_description: Perform OCR on an image using Aspose OCR and a
+  Hugging Face model.
This tutorial shows how to download the model,
+  extract text from an invoice, and free GPU resources.
+og_title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn
+    how to download Hugging Face model, extract text from invoice, and free GPU resources.
+  headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+  type: TechArticle
+tags:
+- OCR
+- Aspose
+- Python
+- AI
+- HuggingFace
+title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+url: /el/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+
+Have you ever wanted to **perform OCR on image** files but got stuck on the question "where do I start?" You're not alone; many developers hit this hurdle when they first tackle document automation. The good news is that with Aspose OCR and a lightweight LLM from Hugging Face, you can turn a raw invoice scan into clean, searchable text with a few lines of Python.
+
+In this tutorial we'll walk through everything you need: how to **load image for OCR**, how to **download Hugging Face model** files, how to **extract text from invoice** data, and finally how to **free GPU resources** so your application stays lean. By the end you'll have a self-contained script you can drop into any project.
+
+---
+
+## What You'll Learn
+
+- How to **perform OCR on image** using Aspose's `OcrEngine`.
+- The exact steps to **download Hugging Face model** files automatically.
+- Techniques to **extract text from invoice** PDFs or PNGs with AI-enhanced post-processing.
+- Good practices to **free GPU resources** after inference.
+- Tips to **load image for OCR** efficiently and avoid common pitfalls.
+
+No external documentation is required; everything you need is here, with full code, explanations, and expected results.
+
+## Prerequisites
+
+| Requirement | Reason |
+|-------------|--------|
+| Python 3.9+ | Modern syntax and type hints |
+| `asposeocr` package (`pip install asposeocr`) | Core OCR engine |
+| Access to a GPU (optional but recommended) | Speeds up the LLM post-processor |
+| An invoice image (`sample_invoice.png`) | Real-world example |
+
+If you're missing any of these, install it now; the script will **download Hugging Face model** files automatically, so you don't have to hunt for them yourself.
+
+## Step 1: Perform OCR on Image – Create the Engine
+
+The first thing to do is start the Aspose OCR engine. Think of it as opening a blank canvas where the image will later be turned into text.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# Step 1: Initialize the OCR engine
+engine = ocr.OcrEngine()
+```
+
+> **Why it matters:** `OcrEngine` abstracts away the low-level image preprocessing, so you can focus on the higher-level workflow. It also exposes the `set_post_processor` method that will later let us plug in an LLM for smarter output.
+
+## Step 2: Load Image for OCR – Choose the Right File
+
+Now that the engine exists, we need to **load image for OCR**. Aspose supports PNG, JPG, TIFF, and a few other formats. Make sure the path is absolute or relative to your script's location.
+
+```python
+# Step 2: Load the image you want to process
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+```
+
+> **Tip:** If your image is large, consider resizing it beforehand to reduce memory pressure. The OCR engine can handle high-resolution scans, but a 300 DPI image is usually the sweet spot for invoices.
+
+## Step 3: Run Raw OCR and View the Extracted Text
+
+With the image loaded, we can finally **perform OCR on image** and see what the raw engine extracts. This step gives us a baseline before we add the AI layer.
+
+```python
+# Step 3: Run the built‑in OCR and capture raw results
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+```
+
+**Expected output (truncated):**
+
+```
+Raw text:
+Invoice #12345
+Date: 2024‑04‑01
+Total: $1,250.00
+...
+```
+
+The raw output often contains stray line breaks, misrecognized characters, or missing fields, which is exactly why we bring in a language model next.
+
+## Step 4: Download Hugging Face Model – Configure the LLM Post-Processor
+
+This is where the **download Hugging Face model** step comes in. Aspose AI can automatically download a model from the Hugging Face hub if it isn't already on disk. We'll use the Qwen2.5‑3B‑Instruct‑GGUF model, which balances accuracy and memory footprint.
+
+```python
+# Step 4: Set up the AI model configuration
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",          # automatically fetch model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",    # int8 quantization cuts RAM usage dramatically
+    gpu_layers=20,                       # first 20 layers run on GPU, rest on CPU
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+```
+
+> **Why this works:** `allow_auto_download` saves you from manually downloading the `.gguf` file.
Quantization (`int8`) shrinks the model to roughly 3 GB, making it feasible on most consumer GPUs. Adjust `gpu_layers` to your hardware: more layers on the GPU means faster inference.
+
+## Step 5: Extract Text from Invoice Using AI-Enhanced Post-Processing
+
+Now we attach the LLM to the OCR engine and run a **post-processor** that cleans up the raw output, corrects OCR mistakes, and formats the invoice fields neatly.
+
+```python
+# Step 5: Initialize the AI engine and bind it to the OCR engine
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)          # implicit when using set_post_processor in newer versions
+engine.set_post_processor(ai_engine)
+
+# Run the AI‑enhanced post‑processor
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+```
+
+**Sample enhanced output:**
+
+```
+Enhanced text:
+Invoice Number: 12345
+Date: 2024-04-01
+Bill To: Acme Corp.
+Amount Due: $1,250.00
+Status: Paid
+```
+
+> **What happened?** The LLM recognized that “Invoice #12345” should be “Invoice Number: 12345”, corrected the date format, and even captured the “Bill To” field that the raw engine missed. This is the key ingredient of automating **extract text from invoice** tasks.
+
+## Step 6: Free GPU Resources – Clean Up After Processing
+
+If you run this in a long-lived service (e.g., a Flask API), you should **free GPU resources** after each inference to avoid out-of-memory crashes. Aspose AI provides a simple method for this.
+
+```python
+# Step 6: Release GPU memory and dispose of the engine
+ai_engine.free_resources()   # frees GPU layers and model tensors
+engine.dispose()             # cleans up internal buffers
+```
+
+> **Pro tip:** Call `free_resources()` inside a `finally:` block if you wrap the OCR call in try/except.
This guarantees cleanup even when an exception is raised.
+
+## Step 7: Full Script – Putting It All Together
+
+Below is the complete, ready-to-run script. Copy and paste it, adjust the paths, and you're good to go.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Create OCR engine
+engine = ocr.OcrEngine()
+
+# 2️⃣ Load image for OCR
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+
+# 3️⃣ Raw OCR
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+
+# 4️⃣ Download Hugging Face model (auto‑download enabled)
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+
+# 5️⃣ Attach AI post‑processor and enhance text
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)          # optional in newer versions
+engine.set_post_processor(ai_engine)
+
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+
+# 6️⃣ Free GPU resources
+ai_engine.free_resources()
+engine.dispose()
+```
+
+Run the script and watch the transformation from noisy OCR to clean, structured invoice data. 🎉
+
+## Frequently Asked Questions & Edge Cases
+
+| Question | Answer |
+|----------|--------|
+| **What if the model fails to download?** | Make sure your machine has internet access and that `hugging_face_repo_id` is correct. You can also manually download the |
+
+## What Should You Learn Next?
+
+The following tutorials cover closely related topics that extend the techniques presented in this guide.
Each resource includes complete code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Extract Image Text in C# with Language Selection Using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/)
+- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/)
+- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/greek/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/greek/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
new file mode 100644
index 000000000..728fca078
--- /dev/null
+++ b/ocr/greek/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
@@ -0,0 +1,284 @@
+---
+category: general
+date: 2026-06-06
+description: recognize text from image with Aspose OCR – learn how to load image
+  for OCR and perform OCR on image using AI post‑processing in Python.
+draft: false
+keywords:
+- recognize text from image
+- perform ocr on image
+- load image for ocr
+language: el
+og_description: recognize text from image quickly. This guide shows how to load image
+  for OCR, perform OCR on image, and enhance the results with AI post‑processing.
+og_title: Recognize text from image using Aspose OCR & AI
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: recognize text from image with Aspose OCR – learn how to load image
+    for OCR and perform OCR on image using AI post‑processing in Python.
+  headline: recognize text from image using Aspose OCR & AI
+  type: TechArticle
+- description: recognize text from image with Aspose OCR – learn how to load image
+    for OCR and perform OCR on image using AI post‑processing in Python.
+  name: recognize text from image using Aspose OCR & AI
+  steps:
+  - name: Prerequisites
+    text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) -
+      An image file (e.g., `doc.png`) that contains printed or handwritten text -
+      Optional: a GPU for faster LLM inference (the script works on CPU too)'
+  - name: Install and import the required modules
+    text: '```python # Install the library (run once in your environment) # pip install
+      asposeocr'
+  - name: Create the OCR engine and enable handwritten text recognition
+    text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()'
+  - name: Load the image for OCR
+    text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png")
+      ```'
+  - name: Run the raw OCR pass
+    text: '```python # Execute the recognition pipeline and capture the raw result
+      raw_result = ocr_engine.recognize() ```'
+  - name: Configure the Aspose AI model for LLM post‑processing
+    text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true",
+      # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+      hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20
+      GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo'
+  - name: Initialise the AI processor and attach it as a post‑processor
+    text: '```python # Instantiate the AI processor with the configuration above ai_processor +
= AsposeAI() ai_processor.initialize(ai_config)'
+  - name: Run the post‑processor to enhance the OCR output
+    text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)'
+  - name: Release resources
+    text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()'
+  type: HowTo
+- questions:
+  - answer: No. The script falls back to CPU, but inference will be slower.
+    question: Do I need a GPU?
+  - answer: Absolutely, just change `hugging_face_repo_id` and adjust `gpu_layers`
+      accordingly.
+    question: Can I use a different LLM?
+  - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable.
+    question: What if my image is huge?
+  - answer: You can toggle `enable_handwritten_recognition` depending on your workload.
+    question: Is handwritten recognition always on?
+  type: FAQPage
+tags:
+- OCR
+- Aspose
+- Python
+title: Recognize text from image using Aspose OCR & AI
+url: /el/python/general/recognize-text-from-image-using-aspose-ocr-ai/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Recognize text from image using Aspose OCR & AI
+
+Ever needed to recognize text from image but weren't sure which library would give you both speed and accuracy? You're not alone. In this guide we'll walk through a complete, end-to-end example that shows **how to load image for OCR**, **how to perform OCR on image**, and then how to polish the result with Aspose's AI post-processor. By the end you'll have a ready-to-run script that turns a PNG into clean, searchable text.
+
+## What you'll learn
+
+We'll cover everything from installing the Aspose OCR package to releasing resources at the end of the run.
Θα δείτε γιατί η ενεργοποίηση της αναγνώρισης χειρόγραφου κειμένου είναι σημαντική, πώς να ρυθμίσετε ένα Qwen 2.5 LLM για post‑processing, και πώς φαίνεται το τελικό αποτέλεσμα. Δεν απαιτούνται εξωτερικές αναφορές—απλώς αντιγράψτε, επικολλήστε και τρέξτε. + +### Προαπαιτήσεις + +- Python 3.8 ή νεότερη +- πακέτο `asposeocr` (`pip install asposeocr`) +- Ένα αρχείο εικόνας (π.χ., `doc.png`) που περιέχει τυπωμένο ή χειρόγραφο κείμενο +- Προαιρετικά: μια GPU για ταχύτερη εκτέλεση LLM (το script λειτουργεί και σε CPU) + +--- + +## Αναγνώριση κειμένου από εικόνα – Βήμα‑βήμα + +Κάτω από κάθε μπλοκ κώδικα θα βρείτε μια σύντομη εξήγηση του **γιατί** κάνουμε αυτή τη συγκεκριμένη ενέργεια, όχι μόνο του **τι** κάνει η γραμμή. + +### Βήμα 1: Εγκατάσταση και εισαγωγή των απαιτούμενων modules + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Γιατί;* Η εισαγωγή του `asposeocr` μας δίνει την κλάση `OcrEngine`, ενώ το υπο-μοντέλο `ai` παρέχει τον LLM‑based post‑processor που βελτιώνει δραματικά το ακατέργαστο αποτέλεσμα OCR. + +### Βήμα 2: Δημιουργία του OCR engine και ενεργοποίηση της αναγνώρισης χειρόγραφου κειμένου + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Η ενεργοποίηση της αναγνώρισης χειρόγραφου επεκτείνει το σύνολο χαρακτήρων του engine, ώστε να μην χάσετε τις σημειώσεις όταν **εκτελείτε OCR σε εικόνα** που περιέχει μικτό τυπωμένο και καλλιγραφικό κείμενο. 
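
Before handing a path to the engine, it can help to confirm that the file exists and has an image-like extension, so a typo surfaces immediately. The helper below is a minimal sketch that uses only the standard library. The name `check_image_path` and the extension list are illustrative assumptions, not part of Aspose OCR.

```python
from pathlib import Path

# Extensions assumed for illustration; adjust to the formats you actually use.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def check_image_path(path_str):
    """Return the resolved Path if the file exists and looks like an image.

    Raises FileNotFoundError for a missing file and ValueError for an
    unexpected extension. This is a hypothetical helper, not a library call.
    """
    path = Path(path_str).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unexpected image extension: {path.suffix!r}")
    return path.resolve()
```

You could then pass `str(check_image_path("YOUR_DIRECTORY/doc.png"))` to the loading call, keeping path problems separate from recognition problems.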
+
+### Step 3: Load the image for OCR
+
+```python
+# Point the engine at the file you want to process
+ocr_engine.load_image("YOUR_DIRECTORY/doc.png")
+```
+
+The `load_image` call is where you **load the image for OCR**; if the path is wrong, the engine raises an informative exception, saving you from obscure errors in later steps.
+
+### Step 4: Run the raw OCR pass
+
+```python
+# Execute the recognition pipeline and capture the raw result
+raw_result = ocr_engine.recognize()
+```
+
+At this stage you get a `RecognitionResult` object containing the raw text, confidence scores, and layout metadata. It is often noisy, which is why the AI cleanup is needed.
+
+### Step 5: Configure the Aspose AI model for LLM post‑processing
+
+```python
+ai_config = AsposeAIModelConfig(
+    allow_auto_download="true",                       # Pull the model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",                 # Reduce RAM usage
+    gpu_layers=20,                                    # Use 20 GPU layers if a GPU is present
+    directory_model_path="YOUR_DIRECTORY/models/ocr_ai"
+)
+```
+
+Why bother with these settings?
+- **auto‑download** ensures the model is available on the first run.
+- **int8 quantization** reduces memory usage without significant loss of accuracy.
+- **gpu_layers** lets you use a compatible GPU for faster inference.
+
+### Step 6: Initialise the AI processor and attach it as a post‑processor
+
+```python
+# Instantiate the AI processor with the configuration above
+ai_processor = AsposeAI()
+ai_processor.initialize(ai_config)
+
+# Hook the AI processor into the OCR engine
+ocr_engine.set_post_processor(ai_processor, None)
+```
+
+Attaching the processor means that whenever you call `run_postprocessor`, the LLM corrects spelling, joins broken words, and even suggests missing punctuation.
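
The `gpu_layers` value from Step 5 is hard-coded to 20, which may not suit a machine without a GPU. The sketch below shows one hedged way to pick the value at runtime. It rests on two assumptions that the tutorial does not confirm: that the presence of the `nvidia-smi` tool is a reasonable hint that an NVIDIA GPU is usable, and that passing `0` keeps inference on the CPU (a common convention in GGUF runtimes). `pick_gpu_layers` is a hypothetical helper, not an Aspose API.

```python
import shutil


def pick_gpu_layers(requested=20, has_gpu=None):
    """Return `requested` when a GPU appears to be available, otherwise 0.

    `has_gpu` can be passed explicitly; when it is None we fall back to a
    heuristic that only checks whether `nvidia-smi` is on the PATH.
    """
    if has_gpu is None:
        # Heuristic only: NVIDIA drivers normally install the nvidia-smi tool.
        has_gpu = shutil.which("nvidia-smi") is not None
    return requested if has_gpu else 0


if __name__ == "__main__":
    print(f"gpu_layers to use: {pick_gpu_layers()}")
```

The result could be passed as `gpu_layers=pick_gpu_layers()` when building the configuration, provided the two assumptions above hold for your setup.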
+
+### Step 7: Run the post‑processor to enhance the OCR output
+
+```python
+# Apply AI‑driven post‑processing
+enhanced_result = ocr_engine.run_postprocessor(raw_result)
+
+# Print the polished text
+print(enhanced_result.text)
+```
+
+`enhanced_result.text` is usually far more readable than the raw string; think of it as a spell checker that also understands context.
+
+### Step 8: Release resources
+
+```python
+# Free GPU/CPU memory held by the AI model
+ai_processor.free_resources()
+
+# Dispose of the OCR engine to close file handles, etc.
+ocr_engine.dispose()
+```
+
+Cleanup is critical in long‑running services; otherwise GPU memory leaks and your application eventually crashes.
+
+---
+
+## Full script you can run today
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Initialise OCR engine with handwritten support
+ocr_engine = ocr.OcrEngine()
+ocr_engine.ocr_settings.enable_handwritten_recognition = True
+
+# 2️⃣ Load the image you want to analyse
+ocr_engine.load_image("YOUR_DIRECTORY/doc.png")   # <-- replace with your path
+
+# 3️⃣ Perform the basic OCR pass
+raw_result = ocr_engine.recognize()
+
+# 4️⃣ Set up the AI model for smarter output
+ai_config = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/models/ocr_ai"
+)
+
+# 5️⃣ Initialise and attach the AI post‑processor
+ai_processor = AsposeAI()
+ai_processor.initialize(ai_config)
+ocr_engine.set_post_processor(ai_processor, None)
+
+# 6️⃣ Enhance the OCR result with the LLM
+enhanced_result = ocr_engine.run_postprocessor(raw_result)
+
+# 7️⃣ Show the final, cleaned‑up text
+print("=== Enhanced OCR Output ===")
+print(enhanced_result.text)
+
+# 8️⃣ Clean up
+ai_processor.free_resources()
+ocr_engine.dispose()
+```
+
+**Expected output** (example for a simple invoice image):
+
+```
+=== Enhanced OCR Output ===
+Invoice #12345
+Date: 2024‑04‑01
+Total Amount: $1,250.00
+Thank you for your business!
+```
+
+Notice how the AI layer corrected “Inv0ice” → “Invoice” and added the missing punctuation.
+
+---
+
+## Frequently Asked Questions (and short answers)
+
+- **Do I need a GPU?** No. The script falls back to CPU, but inference will be slower.
+- **Can I use a different LLM?** Absolutely. Just change `hugging_face_repo_id` and adjust `gpu_layers` accordingly.
+- **What if my image is huge?** Resize it first (e.g., using Pillow) to keep memory usage reasonable.
+- **Is handwritten recognition always on?** You can toggle `enable_handwritten_recognition` depending on your workload.
+
+---
+
+## Conclusion
+
+You now know how to **recognize text from an image** using Aspose OCR, how to **load an image for OCR**, and how to **perform OCR on an image** with AI‑enhanced post‑processing. The complete, runnable example above gives you a solid base for integrating OCR into any Python project, whether you're scanning receipts, digitizing contracts, or extracting data from handwritten forms.
+
+Ready for the next step? Try swapping the Qwen model for a larger one, experiment with different quantization schemes, or chain multiple images for batch processing. The script above is a reasonable starting point for each of these.
+
+Happy coding, and may your OCR results always be crystal clear!
+
+![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="Screenshot showing how to recognize text from an image using Aspose OCR"}
+
+## What Should You Learn Next?
+
+The following tutorials cover closely related topics that build on the techniques presented in this guide. Each resource includes complete working code with step‑by‑step explanations to help you master additional API features and explore alternative approaches in your own projects.
+
+- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/)
+- [Extract Image Text in C# with Language Selection Using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/hindi/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/hindi/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md
new file mode 100644
index 000000000..958715d52
--- /dev/null
+++ b/ocr/hindi/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md
@@ -0,0 +1,213 @@
+---
+category: general
+date: 2026-06-06
+description: Extract text from handwritten image using Python OCR. Learn how to
+  convert handwritten photo to text quickly and reliably.
+draft: false
+keywords:
+- extract text from handwritten image
+- convert handwritten photo to text
+- how to recognize handwritten text
+- python ocr handwritten recognition
+language: hi
+og_description: Extract text from handwritten image with Python. This guide shows
+  how to convert handwritten photo to text and explains how to recognize
+  handwritten text.
+og_title: Extract Text from Handwritten Image – Python OCR Tutorial
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Extract text from handwritten image using Python OCR. Learn how to
+    convert handwritten photo to text quickly and reliably.
+  headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Handwriting
+title: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide
+url: /hi/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide
+
+Ever wondered **how to recognize handwritten text** when all you have is a photo from your phone? You're not alone. In many projects, whether digitizing lecture notes or pulling data from signed forms, you need to **extract text from a handwritten image** quickly and without hassle.
+
+In this tutorial we walk through a complete, ready‑to‑run example that shows how to **convert a handwritten photo to text** using a popular Python OCR library. No vague references, just concrete code, explanations, and tips you can copy and paste today.
+
+![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image")
+
+## What You'll Need
+
+| Requirement | Why it matters |
+|-------------|----------------|
+| Python 3.9 or newer | Modern syntax and library support |
+| `pip` (Python package manager) | Needed to install the OCR package |
+| A clear image of handwritten notes (JPEG/PNG) | OCR accuracy drops on blurry images |
+| Basic familiarity with Python functions | Helps you adapt the example later |
+
+If any of these is missing, download and install the latest Python release before continuing.
+
+## Install the Python OCR Library
+
+The code snippets we use rely on the `ocr` package (a lightweight wrapper around Tesseract that adds useful settings). Install it with a single command:
+
+```bash
+pip install ocr
+```
+
+> **Pro tip:** After installing, run `tesseract --version` in a terminal to confirm that the base engine is present. If you don't have Tesseract, follow the official guide for your OS; most package managers carry it (`apt-get install tesseract-ocr` on Ubuntu, `brew install tesseract` on macOS).
+
+## Step 1: Create an OCR Engine Instance
+
+Creating the engine is the first step in **python ocr handwritten recognition**. Think of the engine as the brain that will later read the scribbles.
+
+```python
+# Step 1: Import the library and instantiate the OCR engine
+import ocr
+
+# The OcrEngine class gives us access to all the low‑level settings.
+engine = ocr.OcrEngine()
+```
+
+Why this matters: without an engine you cannot tune the recognition pipeline. The default settings are tuned for printed text, so we adjust them in the next step.
+
+## Step 2: Enable Handwritten Recognition
+
+By default the engine assumes printed characters. Enabling handwritten mode flips a switch that tells the underlying engine to use a model suited to handwritten strokes.
+
+```python
+# Step 2: Turn on handwritten mode
+engine.ocr_settings.enable_handwritten_recognition = True
+```
+
+> **What if you skip this?** The OCR treats the strokes as noise and produces garbled output. Enabling this flag is the core of **how to recognize handwritten text**.
+
+## Step 3: Load Your Handwritten Photo
+
+Now we point the engine at the image file. The path can be absolute or relative; just be sure the file exists.
+
+```python
+# Step 3: Load the image containing the handwritten notes
+image_path = "YOUR_DIRECTORY/handwritten_note.jpg"
+engine.load_image(image_path)
+```
+
+A quick sanity check: open the image in your OS viewer. If the text is blurry, consider preprocessing (contrast boost, rotation) before feeding it to the engine; these tweaks often raise the success rate when you **extract text from handwritten image**.
+
+## Step 4: Run the Recognition Process
+
+With everything set, we finally ask the engine to do its job. The `recognize()` method returns a result object that holds the extracted string and confidence scores.
+
+```python
+# Step 4: Execute the OCR process
+handwritten_result = engine.recognize()
+```
+
+Behind the scenes, the engine converts the bitmap into a series of feature vectors, runs them through the LSTM network, and stitches the characters together. That is the core of **python ocr handwritten recognition**.
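
Phone photos are often far larger than recognition needs, and scaling them down before loading keeps processing time and memory in check. The sketch below only computes the target dimensions while preserving the aspect ratio; the actual resize would be done with an imaging library of your choice. The helper name `fit_within_width` and the 1200 px default are illustrative assumptions.

```python
def fit_within_width(width, height, max_width=1200):
    """Return (new_width, new_height) scaled down to max_width.

    The aspect ratio is preserved and images that are already narrow
    enough are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width <= max_width:
        return width, height
    # Integer arithmetic with rounding to the nearest pixel.
    new_height = max(1, (height * max_width + width // 2) // width)
    return max_width, new_height


if __name__ == "__main__":
    print(fit_within_width(4000, 3000))  # (1200, 900)
    print(fit_within_width(800, 600))    # (800, 600)
```

Computing the size separately keeps the rule easy to test and lets you reuse it regardless of which library performs the resize.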
+
+## Step 5: Display the Extracted Text
+
+The result object exposes a `.text` attribute that contains the plain Unicode string. Print it, write it to a file, or feed it into another pipeline; your call.
+
+```python
+# Step 5: Print the extracted text to the console
+print("=== Extracted Text ===")
+print(handwritten_result.text)
+```
+
+### Expected Output
+
+If the source image contains the note “Buy milk, eggs, and bread”, you'd see something like:
+
+```
+=== Extracted Text ===
+Buy milk, eggs, and bread
+```
+
+Notice how the output preserves punctuation and line breaks (if any). If you get gibberish, double‑check the image quality and the `enable_handwritten_recognition` flag.
+
+## Handling Common Pitfalls
+
+| Problem | Symptom | Fix |
+|-------|---------|-----|
+| Low confidence scores | Many “?” or nonsense characters | Raise the image DPI to ≥300, apply binarization (`opencv`), or crop the region of interest. |
+| Mixed languages | Output mixes English with another script | Set `engine.ocr_settings.language = "eng"` (or another ISO code) before `recognize()`. |
+| Large files | Long processing time or memory errors | Resize the image to a reasonable size (e.g., max width 1200 px) before loading. |
+| Tesseract missing | `ImportError` or `FileNotFoundError` | Install Tesseract separately and make sure it is on your system PATH. |
+
+These adjustments keep your **convert handwritten photo to text** workflow robust across diverse datasets.
+
+## Full Script You Can Run Today
+
+Below is the complete, self‑contained program that puts all the pieces together. Copy it into a file named `handwritten_ocr.py` and execute `python handwritten_ocr.py`.
+
+```python
+#!/usr/bin/env python3
+"""
+handwritten_ocr.py
+
+A minimal example that demonstrates how to extract text from handwritten image
+using Python OCR (handwritten recognition mode). Adjust `image_path` to point
+to your own file before running.
+"""
+
+import ocr  # pip install ocr
+
+def main():
+    # 1️⃣ Create the OCR engine
+    engine = ocr.OcrEngine()
+
+    # 2️⃣ Enable the handwritten recognition flag
+    engine.ocr_settings.enable_handwritten_recognition = True
+
+    # 3️⃣ Load the target image
+    image_path = "YOUR_DIRECTORY/handwritten_note.jpg"
+    engine.load_image(image_path)
+
+    # 4️⃣ Run the OCR process
+    result = engine.recognize()
+
+    # 5️⃣ Output the extracted text
+    print("=== Extracted Text ===")
+    print(result.text)
+
+if __name__ == "__main__":
+    main()
+```
+
+Run it, and you'll see the text printed to the console, exactly the result you need when you want to **convert handwritten photo to text**.
+
+## Going Further
+
+Now that you've mastered the basics of **extract text from handwritten image**, consider these next steps:
+
+- **Batch processing:** Loop over a folder of images and save each result to a CSV file.
+- **Post‑processing:** Use regular expressions to clean up common OCR errors (e.g., “1” vs “l”).
+- **Integration:** Feed the extracted strings into a Natural Language Processing pipeline for sentiment analysis or keyword extraction.
+- **Alternative libraries:** If you need higher accuracy, explore `easyocr` or `pytesseract` with a custom LSTM model; both support **python ocr handwritten recognition**.
+
+Remember, the quality of the source image often dictates success, so spend a few minutes on preprocessing. A little extra effort now saves you a lot of debugging later.
+
+## Conclusion
+
+We've walked through a complete, end‑to‑end example that shows **how to recognize handwritten text** and, more importantly, how to **extract text from handwritten image** using Python. By installing the `ocr` package, toggling the handwritten flag, loading your picture, and calling `recognize()`, you can **convert handwritten photo to text** in just a handful of lines.
+
+Give it a try with your own notes, tweak the preprocessing steps, and let the OCR do the heavy lifting. If you hit any snags, revisit the “Handling Common Pitfalls” table or experiment with alternative OCR back‑ends. Happy coding, and may your handwritten data become instantly searchable!
+
+## What Should You Learn Next?
+
+The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [How to Recognize Page Rectangles for OCR Text Recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/)
+- [How to Use OCR – Recognize an Image Without Text Area Detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/hindi/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/hindi/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
new file mode 100644
index 000000000..55d48ff44
--- /dev/null
+++ b/ocr/hindi/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
@@ -0,0 +1,228 @@
+---
+category: general
+date: 2026-06-06
+description: 'OCR character set guide: learn how to use OCR engine to list supported
+  characters for Latin and Cyrillic scripts with full Python examples.'
+draft: false
+keywords:
+- ocr character set
+- use OCR engine
+- supported OCR characters
+- script detection OCR
+- OCR library Python
+language: hi
+og_description: 'OCR character set guide: learn how to use the OCR engine in Python
+  to list supported characters for different scripts, complete with code and tips.'
+og_title: OCR Character Set – Retrieve and Use OCR Engine
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: 'OCR character set guide: learn how to use OCR engine to list supported
+    characters for Latin and Cyrillic scripts with full Python examples.'
+  headline: OCR Character Set – Retrieve and Use OCR Engine in Python
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Character Sets
+title: OCR Character Set – Retrieve and Use OCR Engine in Python
+url: /hi/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# OCR Character Set – Retrieve and Use OCR Engine in Python
+
+Want to know which **ocr character set** your library supports? In this tutorial we show you how to **use OCR engine** calls to query the supported characters for different scripts, and why that matters for real‑world projects.
+
+If you've ever stared at a blank screen wondering why some letters are missing after OCR processing, you're not alone. The answer often lies in the character set the engine knows. By the end of this guide you'll be able to:
+
+* List every character the OCR engine can recognize for a given script.
+* Handle cases where a script is not supported.
+* Integrate character‑set information into downstream validation or UI logic.
+
+No magic black‑box tricks: just plain Python, a few lines of code, and a bit of explanation.
+
+---
+
+## Prerequisites
+
+Before you start, make sure you have:
+
+* Python 3.8+ installed (the code uses f‑strings).
+* An OCR library that exposes `ocr.OcrEngine`; for this example we assume a package named `ocr` is already installed via `pip install ocr-lib`.
+* A basic understanding of Python's import system and printing.
+
+If you don't have the library, run:
+
+```bash
+pip install ocr-lib
+```
+
+That's it: basic character‑set queries need no additional binaries or native dependencies.
+
+---
+
+## Step 1: Initialize the OCR Engine Instance
+
+Creating an engine object is the first step whenever you **use OCR engine** functionality. Think of it as switching on a scanner; without it you can't ask any questions.
+
+```python
+import ocr
+
+# Step 1: Create an OCR engine instance
+engine = ocr.OcrEngine()
+```
+
+The `OcrEngine` class is a lightweight wrapper around the underlying recognition engine. It doesn't start heavy processing until you request something, so the startup cost is negligible.
+
+> **Pro tip:** If your application would otherwise create many engine instances, reuse a single one. This saves memory and reduces initialization latency.
+
+---
+
+## Step 2: Query the OCR Character Set for a Specific Script
+
+Now that we have an engine, we can ask it which characters it knows for a particular writing system. The `get_supported_characters(script_name)` method returns the Unicode characters as a Python `list`.
+
+```python
+# Step 2: Get the list of characters the engine can recognize for the Latin script
+latin_chars = engine.get_supported_characters("Latin")
+print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}")
+```
+
+Running the snippet above produces output along these lines:
+
+```
+Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+```
+
+The exact output depends on the version of the OCR library, but you always get a flat list of characters. If you need both uppercase and lowercase, request the `"Latin"` script and apply `.lower()` to each element, or query a separate `"Latin‑lower"` script if the library splits them.
+
+### Why does this matter?
+
+When you feed in an image containing unusual diacritics (such as “ñ” or “ø”), the OCR engine may silently replace them with a placeholder or drop them entirely. Knowing the **ocr character set** up front lets you pre‑validate input, warn users, or fall back to another engine.
+
+---
+
+## Step 3: Retrieve Characters for Another Script – Cyrillic Example
+
+The same method works for any script the engine supports. Let's see what happens with Cyrillic.
+
+```python
+# Step 3: Get the list of characters the engine can recognize for the Cyrillic script
+cyrillic_chars = engine.get_supported_characters("Cyrillic")
+print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}")
+```
+
+Typical output:
+
+```
+Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я']
+```
+
+If the engine does **not** support the requested script, it typically raises a `ValueError` (or returns an empty list, depending on the implementation). Let's guard against that.
+
+```python
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+# Example usage:
+safe_get_characters(engine, "Arabic")   # Might trigger the error branch
+```
+
+---
+
+## Step 4: Visualize the OCR Character Set (Optional)
+
+Sometimes a quick visual helps, especially when you need to show stakeholders which characters are covered. Below is a minimal example that renders a grid of characters with `matplotlib`.
+
+```python
+import matplotlib.pyplot as plt
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text((col + 0.5) / cols, 1 - (row + 0.5) / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Visualize Latin characters
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+![OCR character set diagram](ocr_character_set.png){alt="Overview of the OCR character set"}
+
+The diagram is not required for the core solution, but it shows how the metadata you get when you **use OCR engine** queries can go beyond simple printing.
+
+---
+
+## Step 5: Integrate the Character Set into a Real‑World Workflow
+
+Now that we can retrieve the **ocr character set**, let's look at a practical scenario: validating OCR output before saving it to a database.
+
+```python
+def validate_ocr_output(text, allowed_chars):
+    """Return True if every character in `text` exists in `allowed_chars`."""
+    invalid = [c for c in text if c not in allowed_chars]
+    if invalid:
+        print(f"Invalid characters found: {invalid}")
+        return False
+    return True
+
+# Suppose we ran OCR on an image and got this result:
+ocr_result = "HELLO WORLD!"
+
+# Use the previously fetched Latin set for validation:
+if validate_ocr_output(ocr_result, set(latin_chars)):
+    print("All characters are supported – safe to store.")
+else:
+    print("Result contains unsupported symbols – consider manual review.")
+```
+
+This pattern keeps garbage data from entering the downstream pipeline, a common problem when working with multilingual documents.
+
+---
+
+## Common Pitfalls and Edge Cases
+
+| Problem | Why it happens | How to avoid it |
+|---------|----------------|--------------|
+| **Script name typo** (e.g. `"Cyrillic "` with a trailing space) | The engine takes the string literally and can't find the script. | Strip whitespace with `script.strip()` before calling `get_supported_characters`. |
+| **Empty character list** | Some engines list a script but haven't loaded its language model yet. | Call `engine.load_language_model(script)` if the library offers such a method, or make sure the model files are present. |
+| **Unicode normalization issues** | Characters like “é” can be composed (`\u00E9`) or decomposed (`e\u0301`). | Normalize strings with `unicodedata.normalize('NFC', text)` before validation. |
+| **Performance on large scripts** (e.g. Chinese) | Retrieving thousands of characters can be slow. | Cache the result after the first call; most applications need the list only once. |
+
+---
+
+## Full Working Example
+
+Putting it all together, the snippets from Steps 1 to 5 can be combined into a single script that you copy, paste, and run.
+
+
+
+## What Should You Learn Next?
+
+- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/)
+- [OCR Post‑Processing – Get Choices for Recognized Characters](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/)
+- [How to Set the Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/hindi/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/hindi/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
new file mode 100644
index 000000000..e35e2c554
--- /dev/null
+++ b/ocr/hindi/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
@@ -0,0 +1,236 @@
+---
+category: general
+date: 2026-06-06
+description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn
+  how to download Hugging Face model, extract text from invoice, and free GPU
+  resources.
+draft: false
+keywords:
+- perform OCR on image
+- download Hugging Face model
+- extract text from invoice
+- free GPU resources
+- load image for OCR
+language: hi
+og_description: Perform OCR on image using Aspose OCR and a Hugging Face model. This
+  tutorial shows how to download the model, extract text from invoice, and
+  free GPU resources.
+og_title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn
+    how to download Hugging Face model, extract text from invoice, and free GPU resources.
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Aspose OCR और LLM के साथ छवि पर OCR करें – पूर्ण गाइड +url: /hi/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# इमेज पर OCR करें Aspose OCR & LLM – पूर्ण गाइड + +क्या आप कभी **इमेज पर OCR करना** चाहते थे लेकिन “मैं कहाँ से शुरू करूँ?” सवाल पर अटके हुए महसूस किया? आप अकेले नहीं हैं—कई डेवलपर्स दस्तावेज़ ऑटोमेशन को पहली बार संभालते समय इसी समस्या का सामना करते हैं। अच्छी खबर यह है कि Aspose OCR और Hugging Face का हल्का LLM उपयोग करके आप एक इनवॉइस की कच्ची स्कैन को कुछ ही पायथन लाइनों में साफ़, खोज योग्य टेक्स्ट में बदल सकते हैं। + +इस ट्यूटोरियल में हम आपको वह सब कुछ दिखाएंगे जो आपको चाहिए: **OCR के लिए इमेज लोड करना**, **Hugging Face मॉडल डाउनलोड करना**, **इनवॉइस डेटा से टेक्स्ट निकालना**, और अंत में **GPU संसाधनों को मुक्त करना** ताकि आपका ऐप हल्का रहे। अंत तक आपके पास एक स्व-निहित स्क्रिप्ट होगी जिसे आप किसी भी प्रोजेक्ट में उपयोग कर सकते हैं। + +--- + +## आप क्या सीखेंगे + +- Aspose के `OcrEngine` का उपयोग करके **इमेज पर OCR करना** कैसे करें। +- स्वचालित रूप से **Hugging Face मॉडल** फ़ाइलें **डाउनलोड** करने के सटीक चरण। +- AI‑सहायता प्राप्त पोस्ट‑प्रोसेसिंग के साथ **इनवॉइस** PDFs या PNGs से **टेक्स्ट निकालने** की तकनीकें। +- इनफ़रेंस के बाद **GPU संसाधनों को मुक्त** करने की सर्वोत्तम प्रथाएँ। +- **OCR के लिए इमेज लोड** करने के लिए प्रभावी टिप्स और सामान्य समस्याओं से बचें। + +कोई बाहरी दस्तावेज़ आवश्यक नहीं है—आपको जो कुछ भी चाहिए वह यहाँ ही है, पूर्ण कोड, व्याख्याएँ, और अपेक्षित आउटपुट के साथ। + +## आवश्यकताएँ + +शुरू करने से पहले, सुनिश्चित करें कि आपके पास ये हैं: + +| आवश्यकता | कारण | +|-------------|--------| +| Python 3.9+ | आधुनिक सिंटैक्स और टाइप हिंट्स | +| `asposeocr` पैकेज (`pip install asposeocr`) | मुख्य OCR इंजन | +| GPU तक 
पहुँच (वैकल्पिक लेकिन अनुशंसित) | LLM पोस्ट‑प्रोसेसर को तेज़ करता है | +| एक इनवॉइस इमेज (`sample_invoice.png`) | वास्तविक‑दुनिया परीक्षण केस | + +यदि इनमें से कोई भी आपके पास नहीं है, तो अभी इंस्टॉल करें; स्क्रिप्ट भी **Hugging Face मॉडल** को स्वचालित रूप से **डाउनलोड** करेगी, इसलिए आपको फ़ाइलें स्वयं खोजने की ज़रूरत नहीं पड़ेगी। + +## चरण 1: इमेज पर OCR करें – इंजन बनाएं + +पहला काम है Aspose का OCR इंजन शुरू करना। इसे ऐसे समझें जैसे एक खाली कैनवास खोलना जहाँ बाद में इमेज को टेक्स्ट में बदल दिया जाएगा। + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **यह क्यों महत्वपूर्ण है:** `OcrEngine` सभी लो‑लेवल इमेज प्री‑प्रोसेसिंग को एब्स्ट्रैक्ट करता है, इसलिए आप उच्च‑स्तरीय वर्कफ़्लो पर ध्यान दे सकते हैं। यह एक `set_post_processor` मेथड भी प्रदान करता है जो बाद में हमें स्मार्ट आउटपुट के लिए LLM को जोड़ने की सुविधा देता है। + +## चरण 2: OCR के लिए इमेज लोड करें – सही फ़ाइल चुनें + +अब जब इंजन मौजूद है, हमें **OCR के लिए इमेज लोड** करनी है। Aspose PNG, JPG, TIFF और कुछ अन्य फ़ॉर्मेट्स को सपोर्ट करता है। सुनिश्चित करें कि पाथ आपके स्क्रिप्ट के स्थान के सापेक्ष या पूर्ण (absolute) हो। + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **टिप:** यदि आपकी इमेज बड़ी है, तो मेमोरी दबाव कम करने के लिए पहले उसका आकार बदलने पर विचार करें। OCR इंजन हाई‑रेज़ोल्यूशन स्कैन संभाल सकता है, लेकिन 300 DPI की इमेज आमतौर पर इनवॉइस के लिए उपयुक्त होती है। + +## चरण 3: रॉ OCR करें और निकाले गए टेक्स्ट को देखें + +इमेज लोड होने के बाद, हम अंततः **इमेज पर OCR कर सकते** हैं और देख सकते हैं कि रॉ इंजन क्या आउटपुट देता है। यह चरण AI जादू जोड़ने से पहले हमें एक बेसलाइन देता है। + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**अपेक्षित आउटपुट (संक्षिप्त):** + +``` +Raw text: +Invoice #12345 +Date: 
2024‑04‑01 +Total: $1,250.00 +... +``` + +रॉ आउटपुट में अक्सर अनचाहे लाइन ब्रेक, गलत पहचाने गए अक्षर, या गायब फ़ील्ड होते हैं; इसी कारण अगले चरण में हम एक भाषा मॉडल जोड़ेंगे। + +## चरण 4: Hugging Face मॉडल डाउनलोड करें – LLM पोस्ट‑प्रोसेसर कॉन्फ़िगर करें + +यहीं पर **Hugging Face मॉडल डाउनलोड** करने का चरण चमकता है। यदि मॉडल डिस्क पर नहीं है तो Aspose AI स्वचालित रूप से Hugging Face हब से मॉडल खींच सकता है। हम Qwen2.5‑3B‑Instruct‑GGUF मॉडल का उपयोग करेंगे, जो सटीकता और मेमोरी फुटप्रिंट के बीच संतुलन बनाता है। + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **यह क्यों काम करता है:** `allow_auto_download` आपको `.gguf` फ़ाइल मैन्युअली डाउनलोड करने से बचाता है। क्वांटाइज़ेशन (`int8`) मॉडल का आकार लगभग 3 GB तक घटा देता है, जिससे यह अधिकांश कंज्यूमर GPU पर संभव हो जाता है। अपने हार्डवेयर के अनुसार `gpu_layers` को समायोजित करें: GPU पर जितनी अधिक लेयर, उतना तेज़ इनफ़रेंस। + +## चरण 5: AI‑सहायता प्राप्त पोस्ट‑प्रोसेसिंग के साथ इनवॉइस से टेक्स्ट निकालें + +अब हम OCR इंजन के साथ LLM को जोड़ते हैं और एक **पोस्ट‑प्रोसेसर** चलाते हैं जो रॉ आउटपुट को साफ़ करता है, OCR त्रुटियों को सुधारता है, और इनवॉइस फ़ील्ड्स को सुंदर रूप से फॉर्मेट करता है। + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**उदाहरण उन्नत आउटपुट:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme 
Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **क्या हुआ?** LLM ने पहचाना कि “Invoice #12345” को “Invoice Number: 12345” होना चाहिए, तारीख फ़ॉर्मेट ठीक किया, और यहाँ तक कि “Bill To” फ़ील्ड का अनुमान लगाया जो रॉ इंजन ने मिस किया था। यही **इनवॉइस से टेक्स्ट निकालने** ऑटोमेशन का मूल है। + +## चरण 6: GPU संसाधनों को मुक्त करें – प्रोसेसिंग के बाद साफ़‑सफ़ाई + +यदि आप इसे एक दीर्घकालिक सेवा (जैसे Flask API) में चला रहे हैं, तो प्रत्येक इनफ़रेंस के बाद **GPU संसाधनों को मुक्त** करना आवश्यक है ताकि मेमोरी‑ओवरफ़्लो क्रैश से बचा जा सके। Aspose AI इसके लिए एक सरल मेथड प्रदान करता है। + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **प्रो टिप:** यदि आप OCR कॉल को try/except में लपेट रहे हैं तो `free_resources()` को `finally:` ब्लॉक के अंदर कॉल करें। इससे अपवाद होने पर भी सफ़ाई सुनिश्चित होती है। + +## चरण 7: पूर्ण स्क्रिप्ट – सब कुछ एक साथ रखें + +नीचे पूर्ण, चलाने‑के‑लिए‑तैयार स्क्रिप्ट दी गई है। इसे कॉपी‑पेस्ट करें, पाथ्स को समायोजित करें, और आप तैयार हैं। + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") 
+print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +स्क्रिप्ट चलाएँ और शोरयुक्त OCR से साफ़, संरचित इनवॉइस डेटा में परिवर्तन देखें। 🎉 + +## सामान्य प्रश्न और किनारे के मामले + +| प्रश्न | उत्तर | +|----------|--------| +| **यदि मॉडल डाउनलोड नहीं होता तो क्या करें?** | सुनिश्चित करें कि आपके मशीन में इंटरनेट एक्सेस है और `hugging_face_repo_id` सही है। आप मैन्युअली भी फ़ाइल डाउनलोड कर सकते हैं। | + +## आगे आप क्या सीखें? + +- [Aspose.OCR का उपयोग करके भाषा चयन के साथ इमेज टेक्स्ट निकालें (C#)](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [इमेज से टेक्स्ट निकालें – .NET के लिए Aspose.OCR के साथ OCR अनुकूलन](/ocr/english/net/ocr-optimization/) +- [इमेज को टेक्स्ट में बदलें – URL से इमेज पर OCR करें](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/hindi/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/hindi/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..a6f70eadc --- /dev/null +++ b/ocr/hindi/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,283 @@ +--- +category: general +date: 2026-06-06 +description: Aspose OCR के साथ छवि से टेक्स्ट पहचानें – सीखें कि OCR के लिए छवि कैसे + लोड करें और Python में AI पोस्ट‑प्रोसेसिंग का उपयोग करके छवि पर OCR कैसे करें। +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: hi +og_description: छवि से तेज़ी से टेक्स्ट पहचानें। यह गाइड दिखाता है कि OCR के लिए छवि + कैसे लोड करें, छवि पर OCR कैसे करें, और AI पोस्ट‑प्रोसेसिंग से परिणामों को कैसे + बढ़ाएँ। +og_title: Aspose OCR और AI का उपयोग करके छवि से टेक्स्ट पहचानें 
+schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the 
post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Aspose OCR और AI का उपयोग करके छवि से टेक्स्ट पहचानें +url: /hi/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Aspose OCR & AI का उपयोग करके छवि से टेक्स्ट पहचानें + +क्या आपको कभी छवि से टेक्स्ट पहचानने की ज़रूरत पड़ी, लेकिन यह नहीं पता था कि कौन‑सी लाइब्रेरी गति और सटीकता दोनों देगी? 
आप अकेले नहीं हैं। इस गाइड में हम एक पूर्ण, एंड‑टू‑एंड उदाहरण के माध्यम से चलेंगे जो दिखाता है **कैसे इमेज को OCR के लिए लोड करें**, **इमेज पर OCR करें**, और फिर Aspose के AI पोस्ट‑प्रोसेसर से आउटपुट को पॉलिश करें। अंत तक आपके पास एक तैयार‑स्क्रिप्ट होगी जो PNG को साफ़, सर्चेबल टेक्स्ट में बदल देगी। + +## आप क्या सीखेंगे + +हम सब कुछ कवर करेंगे, Aspose OCR पैकेज को इंस्टॉल करने से लेकर रन के अंत में रिसोर्सेज़ रिलीज़ करने तक। आप देखेंगे कि हैंडराइटन‑टेक्स्ट रिकग्निशन को एनेबल करना क्यों महत्वपूर्ण है, पोस्ट‑प्रोसेसिंग के लिए Qwen 2.5 LLM को कैसे कॉन्फ़िगर करें, और अंतिम आउटपुट कैसा दिखता है। कोई बाहरी रेफ़रेंस नहीं चाहिए—सिर्फ कॉपी, पेस्ट और रन करें। + +### पूर्वापेक्षाएँ + +- Python 3.8 या नया +- `asposeocr` पैकेज (`pip install asposeocr`) +- एक इमेज फ़ाइल (जैसे `doc.png`) जिसमें प्रिंटेड या हैंडराइटन टेक्स्ट हो +- वैकल्पिक: तेज़ LLM इन्फ़रेंस के लिए GPU (स्क्रिप्ट CPU पर भी काम करती है) + +--- + +## छवि से टेक्स्ट पहचानें – चरण‑दर‑चरण + +नीचे प्रत्येक कोड ब्लॉक के बाद एक छोटा स्पष्टीकरण मिलेगा कि **क्यों** हम वह विशेष कार्रवाई कर रहे हैं, न कि केवल **क्या** वह लाइन करती है। + +### चरण 1: आवश्यक मॉड्यूल इंस्टॉल और इम्पोर्ट करें + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*क्यों?* `asposeocr` को इम्पोर्ट करने से हमें `OcrEngine` क्लास मिलती है, जबकि `ai` सबमॉड्यूल LLM‑आधारित पोस्ट‑प्रोसेसर प्रदान करता है जो कच्चे OCR आउटपुट को नाटकीय रूप से सुधारता है। + +### चरण 2: OCR इंजन बनाएं और हैंडराइटन टेक्स्ट रिकग्निशन एनेबल करें + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. 
+ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +हैंडराइटन रिकग्निशन को एनेबल करने से इंजन का कैरेक्टर सेट विस्तृत हो जाता है, इसलिए जब आप **इमेज पर OCR करें** जिसमें प्रिंटेड और कर्सिव दोनों टेक्स्ट हो, तो स्क्रिबल्स नहीं खोएँगे। + +### चरण 3: OCR के लिए इमेज लोड करें + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +`load_image` कॉल वह क्षण है जब आप **इमेज को OCR के लिए लोड** करते हैं; यदि पाथ गलत है, तो इंजन एक सूचनात्मक एक्सेप्शन उठाएगा, जिससे बाद में आने वाली गड़बड़ी से बचा जा सकेगा। + +### चरण 4: रॉ OCR पास चलाएँ + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +इस चरण पर आपको एक `RecognitionResult` ऑब्जेक्ट मिलेगा जिसमें अनफ़िल्टर किया गया टेक्स्ट, कॉन्फिडेंस स्कोर, और लेआउट मेटाडेटा शामिल है। अक्सर यह शोरयुक्त होता है—इसीलिए AI‑ड्रिवेन क्लीन‑अप की ज़रूरत पड़ती है। + +### चरण 5: LLM पोस्ट‑प्रोसेसिंग के लिए Aspose AI मॉडल कॉन्फ़िगर करें + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +इन सेटिंग्स की ज़रूरत क्यों? 
+- **auto‑download** सुनिश्चित करता है कि मॉडल पहली रन पर उपलब्ध हो। +- **int8 quantization** मेमोरी की मांग को घटाता है बिना बड़ी सटीकता हानि के। +- **gpu_layers** संगत GPU का उपयोग करके तेज़ इन्फ़रेंस की अनुमति देता है। + +### चरण 6: AI प्रोसेसर इनिशियलाइज़ करें और पोस्ट‑प्रोसेसर के रूप में अटैच करें + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +प्रोसेसर को अटैच करने का मतलब है कि हर बार जब आप `run_postprocessor` कॉल करेंगे, LLM स्पेलिंग को ठीक करेगा, टूटे हुए शब्दों को जोड़ देगा, और यहाँ तक कि गायब पंक्चुएशन का अनुमान भी लगाएगा। + +### चरण 7: OCR आउटपुट को एन्हांस करने के लिए पोस्ट‑प्रोसेसर चलाएँ + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` आमतौर पर कच्चे स्ट्रिंग से कहीं अधिक पढ़ने योग्य होता है—इसे एक स्पेल‑चेकर समझें जो संदर्भ भी समझता है। + +### चरण 8: रिसोर्सेज़ रिलीज़ करें + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. 
+ocr_engine.dispose() +``` + +लॉन्ग‑रनिंग सर्विसेज़ में क्लीन‑अप बहुत ज़रूरी है; नहीं तो GPU मेमोरी लीक होगी और अंत में आपका एप्लिकेशन क्रैश हो जाएगा। + +--- + +## आज ही चलाने योग्य पूरा स्क्रिप्ट + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**अपेक्षित आउटपुट** (एक साधारण इनवॉइस छवि का उदाहरण): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! 
+``` + +ध्यान दें कि AI लेयर ने “Inv0ice” → “Invoice” को सही किया और गायब पंक्चुएशन जोड़ दिया। + +--- + +## अक्सर पूछे जाने वाले प्रश्न (और त्वरित उत्तर) + +- **क्या मुझे GPU चाहिए?** नहीं। स्क्रिप्ट CPU पर फ़ॉलबैक करती है, लेकिन इन्फ़रेंस धीमा होगा। +- **क्या मैं कोई अलग LLM इस्तेमाल कर सकता हूँ?** बिल्कुल, सिर्फ़ `hugging_face_repo_id` बदलें और `gpu_layers` को उसी अनुसार एडजस्ट करें। +- **अगर मेरी इमेज बहुत बड़ी है तो?** पहले उसे रिसाइज़ करें (जैसे Pillow का उपयोग करके) ताकि मेमोरी उपयोग उचित रहे। +- **क्या हैंडराइटन रिकग्निशन हमेशा ऑन रहता है?** आप अपने वर्कलोड के अनुसार `enable_handwritten_recognition` को टॉगल कर सकते हैं। + +--- + +## निष्कर्ष + +अब आप जानते हैं कि **Aspose OCR का उपयोग करके छवि से टेक्स्ट कैसे पहचानें**, **इमेज को OCR के लिए कैसे लोड करें**, और **AI‑एन्हांस्ड पोस्ट‑प्रोसेसिंग के साथ इमेज पर OCR कैसे करें**। ऊपर दिया गया पूर्ण, रन‑योग्य उदाहरण आपको किसी भी Python प्रोजेक्ट में OCR को इंटीग्रेट करने के लिए एक ठोस आधार देता है, चाहे वह रसीद स्कैन करना हो, कॉन्ट्रैक्ट्स को डिजिटल बनाना हो, या हैंडराइटन फॉर्म्स से डेटा निकालना हो। + +अगला कदम क्या है? Qwen मॉडल को बड़े मॉडल से बदलें, विभिन्न क्वांटाइज़ेशन स्कीम्स के साथ प्रयोग करें, या बैच प्रोसेसिंग के लिए कई इमेजेज़ को चेन करें। संभावनाएँ अनंत हैं, और आपका कोड उन्हें सहजता से संभाल लेगा। + +कोडिंग का आनंद लें, और आपके OCR परिणाम हमेशा स्पष्ट रहें! + +![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="Aspose OCR का उपयोग करके छवि से टेक्स्ट पहचानने का स्क्रीनशॉट"} + +## आगे आप क्या सीखें? 
+ +निम्नलिखित ट्यूटोरियल्स उन विषयों को कवर करते हैं जो इस गाइड में दिखाए गए तकनीकों पर आधारित हैं। प्रत्येक संसाधन में पूर्ण कार्यशील कोड उदाहरण और चरण‑दर‑चरण व्याख्याएँ शामिल हैं, जिससे आप अतिरिक्त API फीचर्स में महारत हासिल कर सकें और अपने प्रोजेक्ट्स में वैकल्पिक इम्प्लीमेंटेशन अप्रोचेज़ का अन्वेषण कर सकें। + +- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/hongkong/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/hongkong/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..54100e836 --- /dev/null +++ b/ocr/hongkong/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,210 @@ +--- +category: general +date: 2026-06-06 +description: 使用 Python OCR 從手寫圖像中提取文字。快速且可靠地將手寫照片轉換為文字。 +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: zh-hant +og_description: 使用 Python 從手寫圖像中提取文字。本指南說明如何將手寫相片轉換為文字,並解答如何辨識手寫文字。 +og_title: 從手寫圖像提取文字 – Python OCR 教學 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: 使用 Python OCR 從手寫圖像提取文字 – 步驟指南 +url: /zh-hant/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 使用 Python OCR 從手寫圖像提取文字 – 步驟指南 + +有沒有想過 **如何辨識手寫文字**,就在手機拍的相片裡?你並不孤單。在許多專案中——無論是數位化課堂筆記或從簽名表格中提取資料——你都需要快速且毫無困擾地 **從手寫圖像提取文字**。 + +在本教學中,我們將一步步示範完整、可直接執行的範例,告訴你如何使用流行的 Python OCR 函式庫 **將手寫照片轉換為文字**。不會有模糊的參考,只有具體的程式碼、說明與可立即複製貼上的技巧。 + +![從手寫圖像提取文字](https://example.com/placeholder-handwritten.jpg "從手寫圖像提取文字") + +## 您需要的條件 + +| 需求 | 為何重要 | +|------|----------| +| Python 3.9 或更新版 | 現代語法與函式庫支援 | +| `pip`(Python 套件管理員) | 用於安裝 OCR 套件 | +| 清晰的手寫筆記圖像(JPEG/PNG) | 模糊圖像會降低 OCR 準確度 | +| 基本了解 Python 函式 | 有助於日後調整範例 | + +如果缺少上述任何項目,請從 下載最新的 Python 並安裝——毫無困難。 + +## 安裝 Python OCR 函式庫 + +我們將使用的程式碼片段依賴 `ocr` 套件(它是 Tesseract 的薄層封裝,加入了便利設定)。只要一條指令即可安裝: + +```bash +pip install ocr +``` + +> **Pro tip:** 安裝完成後,在終端機執行 `tesseract --version` 以確認底層引擎已安裝。若未安裝 Tesseract,請依照官方指南為你的作業系統安裝——大多數套件管理員都有提供(Ubuntu 使用 `apt-get install tesseract-ocr`,macOS 使用 `brew install tesseract`)。 + +## 步驟 1:建立 OCR 引擎實例 + +建立引擎是 **python ocr handwritten recognition** 這座牆的第一塊磚。可以把引擎想像成之後閱讀塗鴉的“大腦”。 + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. 
+engine = ocr.OcrEngine() +``` + +為什麼這很重要:沒有引擎就無法調整辨識流程。預設設定是針對印刷文字調校的,我們需要在下一步進行調整。 + +## 步驟 2:啟用手寫辨識 + +預設情況下,引擎會假設是印刷字元。啟用手寫模式會切換開關,告訴 Tesseract 使用其針對連筆筆劃訓練的 LSTM 模型。 + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **如果跳過這一步會怎樣?** OCR 會把筆劃當成噪音,導致輸出雜亂。啟用此旗標是 **如何辨識手寫文字** 的核心。 + +## 步驟 3:載入您的手寫照片 + +現在把引擎指向圖像檔案。路徑可以是絕對或相對的,只要確保檔案真的存在即可。 + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +快速檢查:在作業系統的檢視器中開啟圖像。若文字模糊,請考慮在送入引擎前先做前處理(提升對比、旋轉校正),這些調整常能提升 **從手寫圖像提取文字** 的成功率。 + +## 步驟 4:執行辨識程序 + +所有設定就緒後,我們終於請引擎執行工作。`recognize()` 方法會回傳一個結果物件,裡面包含提取的字串與信心分數。 + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +在幕後,引擎會將位圖轉換成一系列特徵向量,送入 LSTM 網路,最後把字元串接起來。這就是 **python ocr handwritten recognition** 的魔法。 + +## 步驟 5:顯示提取的文字 + +結果物件會公開一個 `.text` 屬性,內含純 Unicode 字串。你可以印出、寫入檔案,或傳入其他流程——由你決定。 + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### 預期輸出 + +如果來源圖像的筆記內容是「Buy milk, eggs, and bread」,你會看到類似以下的輸出: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +注意輸出會保留標點與換行(若有)。若得到亂碼,請再次檢查圖像品質以及 `enable_handwritten_recognition` 旗標是否正確設定。 + +## 處理常見問題 + +| 問題 | 症狀 | 解決方法 | +|------|------|----------| +| 低信心分數 | 出現大量「?」或無意義字元 | 將圖像 DPI 提升至 ≥300,使用二值化(`opencv`),或裁切至感興趣區域。 | +| 混合語言 | 輸出同時混雜英文與其他文字 | 在 `recognize()` 前設定 `engine.ocr_settings.language = "eng"`(或其他 ISO 代碼)。 | +| 大檔案 | 處理時間過長或記憶體錯誤 | 在載入前將圖像縮放至合理尺寸(例如最大寬度 1200 px)。 | +| 缺少 Tesseract | `ImportError` 或 `FileNotFoundError` | 獨立安裝 Tesseract,並確保其路徑已加入系統 PATH。 | + +這些調整可讓你的 **convert handwritten photo to text** 工作流程在各種資料集上保持穩定。 + +## 完整腳本,今天即可執行 + +以下是完整、獨立的程式,將所有步驟串起來。將它複製到名為 `handwritten_ocr.py` 的檔案,然後執行 `python handwritten_ocr.py`。 + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example 
that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +執行後,你會在主控台看到印出的文字——正是當你想 **convert handwritten photo to text** 時所需要的結果。 + +## 更進一步 + +現在你已掌握 **從手寫圖像提取文字** 的基礎,以下是可進一步探索的方向: + +- **批次處理:** 迴圈遍歷資料夾中的多張圖像,將每個結果存入 CSV 檔案。 +- **後處理:** 使用正規表達式清理常見的 OCR 錯誤(例如「1」與「l」的混淆)。 +- **整合:** 將提取的字串送入自然語言處理管線,進行情感分析或關鍵字抽取。 +- **替代函式庫:** 若需要更高準確度,可探索 `easyocr` 或 `pytesseract` 搭配自訂 LSTM 模型——兩者同樣支援 **python ocr handwritten recognition**。 + +請記得,來源圖像的品質往往決定成功與否,花幾分鐘做前處理能為日後省下大量除錯時間。 + +## 結論 + +我們已完整示範一個端對端的範例,說明 **如何辨識手寫文字**,更重要的是 **如何使用 Python 從手寫圖像提取文字**。只要安裝 `ocr` 套件、開啟手寫旗標、載入圖片,並呼叫 `recognize()`,就能在幾行程式碼內 **convert handwritten photo to text**。 + +試著用自己的筆記執行,調整前處理步驟,讓 OCR 承擔繁重的工作。若遇到問題,請回顧「處理常見問題」表格或嘗試其他 OCR 後端。祝開發順利,讓你的手寫資料即時可搜尋! + +## 接下來該學什麼? 
+ +以下教學與本指南示範的技術密切相關,提供完整可執行的程式碼範例與逐步說明,協助你掌握更多 API 功能並在專案中探索替代實作方式。 + +- [使用 Aspose OCR 從圖像提取文字 – 步驟指南](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [如何在 Aspose.OCR 中為 OCR 文字辨識準備頁面矩形](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [如何使用 OCR - 在未偵測文字區域的情況下辨識圖像](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/hongkong/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/hongkong/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..c83ca009a --- /dev/null +++ b/ocr/hongkong/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,228 @@ +--- +category: general +date: 2026-06-06 +description: OCR 字元集指南:學習如何使用 OCR 引擎列出拉丁文與西里爾文腳本支援的字元,並附完整的 Python 範例。 +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: zh-hant +og_description: OCR 字元集指南:探索如何在 Python 中使用 OCR 引擎列出各種文字系統支援的字元,並附上程式碼與技巧。 +og_title: OCR 字元集 – 取得並使用 OCR 引擎 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR字元集 – 於 Python 中檢索與使用 OCR 引擎 +url: /zh-hant/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR 字元集 – 在 Python 中取得與使用 OCR 引擎 + +想了解您的函式庫支援的 **ocr character set** 嗎?在本教學中,我們將示範如何 **use OCR engine** 來查詢不同文字系統支援的字元,並說明這在實務專案中的重要性。 + +如果您曾經盯著空白畫面,疑惑為何某些字母在 OCR 處理後永遠不會出現,您並不孤單。答案往往隱藏在引擎所認識的字元集裡。閱讀完本指南後,您將能夠: + +* 列出 OCR 引擎對特定文字系統能辨識的所有字元。 +* 處理文字系統未被支援的情況。 +* 將字元集資訊套用到後續驗證或 UI 邏輯中。 + +沒有神祕的黑盒技巧——只要純 Python、幾行程式碼,加上一點說明即可。 + +--- + +## 前置條件 + +在開始之前,請確保您已具備: + +* 已安裝 Python 3.8+(程式碼使用 f‑strings)。 +* 提供 `ocr.OcrEngine` 的 OCR 函式庫——本範例假設已透過 `pip install ocr-lib` 安裝名為 `ocr` 的套件。 +* 具備 Python 匯入機制與列印的基本概念。 + +如果缺少該函式庫,請執行: + +```bash +pip install ocr-lib +``` + +就這樣——不需要額外的二進位檔或原生相依性,即可執行我們將要查詢的基本字元集。 + +--- + +## 步驟 1:初始化 OCR 引擎實例 + +建立引擎物件是使用 **use OCR engine** 功能的第一步。把它想成開啟掃描器;沒有它,就無法提出任何問題。 + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +`OcrEngine` 類別是底層辨識引擎的輕量包裝。除非您真的請求處理,否則不會啟動大量運算,因此啟動成本可以忽略不計。 + +> **小技巧:** 若您的應用程式會建立多個引擎實例,請重複使用同一個實例。這樣可節省記憶體並降低初始化延遲。 + +--- + +## 步驟 2:查詢特定文字系統的 OCR 字元集 + +現在有了引擎,我們可以詢問它對某個書寫系統認識哪些字元。`get_supported_characters(script_name)` 會回傳一個 Python `list`,內容是 Unicode 字元。 + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +執行上面的程式碼會印出類似以下內容: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +實際輸出會依 OCR 函式庫版本而異,但一定會得到一個平面字元列表。若您同時需要大寫與小寫,只要請求 `"Latin"` 文字系統後對每個元素使用 
`.lower()`,或是若函式庫有區分,直接查詢 `"Latin‑lower"`。 + +### 為什麼這很重要? + +當您輸入含有不常見變音符號的影像(例如 “ñ” 或 “ø”)時,OCR 引擎可能會悄悄將它們換成佔位符或直接省略。事先了解 **ocr character set** 能讓您在輸入前先行驗證、提醒使用者,或改用其他引擎。 + +--- + +## 步驟 3:取得其他文字系統的字元 – 以西里爾文為例 + +相同的方法可用於引擎聲稱支援的任何文字系統。讓我們看看西里爾文的情況。 + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +典型輸出: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +如果引擎 **不** 支援所請求的文字系統,通常會拋出 `ValueError`(或依實作回傳空列表)。我們來加上防護。 + +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## 步驟 4:視覺化 OCR 字元集(可選) + +有時候快速的視覺化能幫助說服利害關係人哪些字元已被涵蓋。以下示範使用 `matplotlib` 繪製字元格狀圖。 + +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **圖片說明文字:** ![OCR 字元集圖示](ocr_character_set.png){alt="OCR 字元集概覽"} + +此圖示並非核心解決方案所必需,但它展示了如何 **use OCR engine** 的中繼資料超越純文字列印的應用。 + 
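若執行環境沒有安裝 `matplotlib`(例如無圖形介面的伺服器),同一份字元清單也可以用純文字格狀排列來檢視。以下只是一個最小示意:`format_character_grid` 是本文為說明而假設的輔助函式名稱,並非 OCR 函式庫提供的 API;`sample_chars` 則是手動建立的示範資料,用來代替 `engine.get_supported_characters("Latin")` 的回傳值。

```python
def format_character_grid(chars, cols=10):
    """Return the characters laid out as space-separated rows of `cols` items."""
    if cols < 1:
        raise ValueError("cols must be at least 1")
    # Split the flat list into consecutive chunks of at most `cols` characters.
    rows = [chars[i:i + cols] for i in range(0, len(chars), cols)]
    return "\n".join(" ".join(row) for row in rows)


# Hypothetical sample data standing in for the engine's Latin character list.
sample_chars = [chr(code) for code in range(ord("A"), ord("Z") + 1)]
print(format_character_grid(sample_chars))
```

實務上可將 `sample_chars` 換成先前取得的 `latin_chars` 或 `cyrillic_chars`,並把輸出寫入日誌或報告,方便在沒有繪圖工具的情況下快速核對涵蓋範圍。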
+--- + +## 步驟 5:將字元集整合至實務工作流程 + +既然我們已能取得 **ocr character set**,接下來示範一個實務情境:在將 OCR 結果寫入資料庫前先行驗證。 + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +此模式可防止雜訊資料流入下游管線,這是處理多語言文件時常見的痛點。 + +--- + +## 常見陷阱與邊緣案例 + +| 陷阱 | 為何會發生 | 如何避免 | +|---------|----------------|--------------| +| **文字系統名稱拼寫錯誤**(例如 `"Cyrillic "` 含尾端空白) | 引擎會把字串照字面解讀,找不到對應的文字系統。 | 在呼叫 `get_supported_characters` 前使用 `script.strip()` 去除空白。 | +| **返回空字元列表** | 某些引擎會公開文字系統,但尚未載入語言模型。 | 若函式庫提供 `engine.load_language_model(script)`,請先呼叫;或確保模型檔案已放置。 | +| **Unicode 正規化問題** | “é” 可能以組合形式 (`\u00E9`) 或分解形式 (`e\u0301`) 出現。 | 在驗證前使用 `unicodedata.normalize('NFC', text)` 正規化字串。 | +| **大型文字系統的效能問題**(例如中文) | 取得上千個字元可能較慢。 | 在首次呼叫後快取結果;大多數應用只需要一次列表即可。 | + +--- + +## 完整範例 + +以下將所有步驟整合成一個可直接複製貼上執行的腳本: + + + +## 接下來該學什麼? 
+ +以下教學與本指南所示技術緊密相關,能幫助您進一步掌握 API 功能,並在自己的專案中探索其他實作方式。 + +- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/) +- [OCR Post Processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [How to Set Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/hongkong/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/hongkong/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..d205dcdda --- /dev/null +++ b/ocr/hongkong/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,260 @@ +--- +category: general +date: 2026-06-06 +description: 使用 Aspose OCR 與 Hugging Face 模型對圖像執行 OCR。了解如何下載 Hugging Face 模型、從發票中提取文字,並釋放 + GPU 資源。 +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: zh-hant +og_description: 使用 Aspose OCR 和 Hugging Face 模型對圖像執行 OCR。本教程展示如何下載模型、從發票中提取文字,以及釋放 + GPU 資源。 +og_title: 使用 Aspose OCR 與 LLM 進行圖像文字辨識 – 完整指南 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: 使用 Aspose OCR 與 LLM 進行圖像文字辨識 – 完整指南 +url: /zh-hant/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 使用 Aspose OCR 與 LLM 執行影像 OCR – 完整指南 + +是否曾想 **對影像檔案執行 OCR**,卻卡在「從哪裡開始?」的問題上?你並不孤單——許多開發者在首次接觸文件自動化時都會遇到這道牆。好消息是,結合 Aspose OCR 與 Hugging Face 的輕量級 LLM,你只需幾行 Python 程式碼,就能把發票的掃描圖轉換成乾淨、可搜尋的文字。 + +在本教學中,我們將一步步說明所有必備內容:從 **載入影像以進行 OCR**、**下載 Hugging Face 模型**、**從發票資料中抽取文字**,最後 **釋放 GPU 資源**,確保你的應用保持輕量。完成後,你將擁有一個可直接嵌入任何專案的完整腳本。 + +--- + +## 你將學會 + +- 如何使用 Aspose 的 `OcrEngine` **對影像執行 OCR**。 +- 如何自動 **下載 Hugging Face 模型** 檔案。 +- 使用 AI 增強的後處理,**從發票 PDF 或 PNG 中抽取文字** 的技巧。 +- 在推論完成後 **釋放 GPU 資源** 的最佳實踐。 +- 高效 **載入影像以進行 OCR**,並避免常見陷阱。 + +不需要額外文件——所有內容、完整程式碼、說明與預期輸出皆在此。 + +--- + +## 前置條件 + +在開始之前,請確保你已具備以下項目: + +| 前置條件 | 原因 | +|----------|------| +| Python 3.9+ | 支援現代語法與型別提示 | +| `asposeocr` 套件(`pip install asposeocr`) | 核心 OCR 引擎 | +| 可使用的 GPU(可選,但建議) | 加速 LLM 後處理 | +| 發票影像(`sample_invoice.png`) | 真實測試案例 | + +若缺少任何項目,請立即安裝;腳本也會 **自動下載 Hugging Face 模型**,免去自行搜尋檔案的麻煩。 + +--- + +## 步驟 1:執行影像 OCR – 建立引擎 + +首先需要啟動 Aspose 的 OCR 引擎。把它想像成一張空白畫布,之後會把影像「畫」成文字。 + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **為什麼重要:** `OcrEngine` 抽象化所有低階影像前處理,讓你專注於高階工作流程。它同時提供 `set_post_processor` 方法,稍後可掛接 LLM 以產生更智慧的輸出。 + +--- + +## 步驟 2:載入影像以進行 OCR – 選擇正確檔案 + +引擎建立後,我們需要 **載入影像以進行 OCR**。Aspose 支援 PNG、JPG、TIFF 等多種格式。請確保路徑為絕對路徑或相對於腳本所在位置。 + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **小技巧:** 若影像檔案過大,建議先行縮小尺寸以減少記憶體壓力。OCR 引擎能處理高解析度掃描,但 300 DPI 的影像通常是發票的最佳取捨。 + +--- + +## 步驟 3:執行原始 OCR 
並檢視抽取文字 + +影像載入完成後,我們終於可以 **對影像執行 OCR**,觀察原始引擎輸出的結果。此步驟提供基線,之後再加入 AI 魔法。 + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**預期輸出(截斷示例):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +原始輸出常會出現換行、錯誤辨識的字元或遺漏欄位——這正是我們接下來要引入語言模型的原因。 + +--- + +## 步驟 4:下載 Hugging Face 模型 – 設定 LLM 後處理器 + +這裡就是 **下載 Hugging Face 模型** 發揮威力的地方。Aspose AI 能自動從 Hugging Face Hub 拉取模型(若本機尚未存在)。我們將使用 Qwen2.5‑3B‑Instruct‑GGUF 模型,兼顧精準度與記憶體占用。 + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **為什麼可行:** `allow_auto_download` 讓你免去手動下載 `.gguf` 檔案的步驟。量化為 `int8` 後模型大小約 3 GB,對大多數消費級 GPU 皆可負荷。依硬體條件調整 `gpu_layers`——GPU 上的層數越多,推論速度越快。 + +--- + +## 步驟 5:使用 AI 增強的後處理抽取發票文字 + +現在把 LLM 接到 OCR 引擎,執行 **後處理器**,清理原始輸出、校正 OCR 錯誤,並將發票欄位格式化為易讀結構。 + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**增強後範例輸出:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. 
+Amount Due: $1,250.00 +Status: Paid +``` + +> **發生了什麼?** LLM 辨識出 “Invoice #12345” 應為 “Invoice Number: 12345”,修正了日期格式,甚至推斷出原始引擎遺漏的 “Bill To” 欄位。這正是 **從發票抽取文字** 自動化的核心。 + +--- + +## 步驟 6:釋放 GPU 資源 – 處理完畢後清理 + +若你在長時間執行的服務(例如 Flask API)中使用此流程,必須在每次推論後 **釋放 GPU 資源**,避免記憶體耗盡。Aspose AI 提供簡潔的清理方法。 + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **專業提示:** 若將 OCR 呼叫包在 `try/except` 中,請在 `finally:` 區塊內呼叫 `free_resources()`,即使發生例外也能保證資源釋放。 + +--- + +## 步驟 7:完整腳本 – 整合全部步驟 + +以下為可直接執行的完整腳本。複製貼上、調整路徑後即可運行。 + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +執行腳本,即可見證從雜訊 OCR 到乾淨結構化發票資料的轉變。 🎉 + +--- + +## 常見問題與邊緣案例 + +| 問題 | 解答 | +|------|------| +| **模型下載失敗該怎麼辦?** | 確認機器具備網路連線,且 `hugging_face_repo_id` 正確。你也可以手動下載模型檔案,放置於指定目錄。 | +| **影像解析度過高導致記憶體不足** | 先使用 Pillow 或 OpenCV 重新取樣至 300 DPI,或將圖檔壓縮後再送入 OCR。 | +| **GPU 不可用,是否只能使用 CPU?** | 可以,將 `gpu_layers` 設為 0,模型將在 CPU 上執行,雖然速度較慢,但仍能完成後處理。 | +| **如何處理多頁 PDF 發票?** | 先使用 PDF 轉影像工具(如 
`pdf2image`)將每頁轉為 PNG,然後逐頁套用本流程,最後合併結果。 | +| **想要自訂後處理規則** | 可自行實作 `PostProcessor` 介面,或在 LLM 提示詞中加入特定格式要求。 | + +--- + +## 接下來該學什麼? + +以下教學與本指南緊密相關,能進一步深化你對 API 功能的掌握,並探索其他實作方式。 + +- [使用 Aspose.OCR 以語言選擇抽取影像文字(C#)](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [使用 Aspose.OCR for .NET 進行影像 OCR 最佳化](/ocr/english/net/ocr-optimization/) +- [從 URL 執行影像 OCR – 轉換影像為文字](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/hongkong/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/hongkong/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..de42e887b --- /dev/null +++ b/ocr/hongkong/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,281 @@ +--- +category: general +date: 2026-06-06 +description: 使用 Aspose OCR 辨識圖片文字 – 學習如何載入圖片進行 OCR,並在 Python 中使用 AI 後處理執行圖片 OCR。 +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: zh-hant +og_description: 快速辨識圖像文字。本指南說明如何載入圖像進行 OCR、執行圖像 OCR,並透過 AI 後處理提升結果。 +og_title: 使用 Aspose OCR 與 AI 從圖像辨識文字 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? 
+ - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: 使用 Aspose OCR 與 AI 從圖像中辨識文字 +url: /zh-hant/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 使用 Aspose OCR 與 AI 識別圖像文字 + +有沒有曾經需要從圖像中識別文字,但不確定哪個函式庫能同時提供速度與準確度?你並不孤單。在本指南中,我們將逐步示範一個完整的端到端範例,說明 **如何載入圖像以進行 OCR**、**如何在圖像上執行 OCR**,以及如何使用 Aspose 的 AI 後處理器來潤飾輸出。完成後,你將擁有一個可直接執行的腳本,將 PNG 轉換為乾淨、可搜尋的文字。 + +## 你將學到什麼 + +我們將涵蓋從安裝 Aspose OCR 套件到執行結束時釋放資源的全部步驟。你會了解為何啟用手寫文字識別很重要、如何為後處理設定 Qwen 2.5 LLM,以及最終輸出會是什麼樣子。無需外部參考——只要複製、貼上並執行即可。 + +### 前置條件 + +- Python 3.8 或更新版本 +- `asposeocr` 套件(`pip install asposeocr`) +- 一個圖像檔案(例如 `doc.png`),內含印刷或手寫文字 +- 可選:用於加速 LLM 推論的 GPU(腳本亦可在 CPU 上執行) + +--- + +## 識別圖像文字 – 步驟說明 + +在每個程式碼區塊下方,你會看到對 **為何** 執行該動作的簡短說明,而不僅是 **程式碼做了什麼**。 + +### 步驟 1:安裝並匯入所需模組 + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*為何?* 匯入 `asposeocr` 可取得 `OcrEngine` 類別,而 `ai` 子模組則提供基於 LLM 的後處理器,能顯著提升原始 OCR 輸出。 + +### 步驟 2:建立 OCR 引擎並啟用手寫文字識別 + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. 
+ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +啟用手寫識別會擴充引擎的字元集,讓你在 **對圖像執行 OCR** 時不會遺失混合印刷與手寫文字的塗鴉。 + +### 步驟 3:載入圖像以進行 OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +`load_image` 呼叫即是 **載入圖像以進行 OCR** 的時刻;若路徑錯誤,引擎會拋出具說明性的例外,避免你在後續遭遇難以理解的錯誤。 + +### 步驟 4:執行原始 OCR + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +此時會取得一個 `RecognitionResult` 物件,內含未過濾的文字、信心分數與版面資訊。結果通常雜訊較多——因此需要 AI 驅動的清理。 + +### 步驟 5:設定 Aspose AI 模型以進行 LLM 後處理 + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +為何要設定這些參數? + +- **auto‑download** 確保模型在首次執行時即已下載可用。 +- **int8 quantization** 大幅降低記憶體需求,且對準確度影響不大。 +- **gpu_layers** 讓你利用相容的 GPU 加速推論。 + +### 步驟 6:初始化 AI 處理器並將其作為後處理器附加 + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +將處理器附加後,每次呼叫 `run_postprocessor` 時,LLM 都會校正拼寫、合併斷詞,甚至推斷缺失的標點符號。 + +### 步驟 7:執行後處理器以提升 OCR 輸出 + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` 通常比原始字串更易讀——可視為同時具備語境理解的拼寫檢查器。 + +### 步驟 8:釋放資源 + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. 
+ocr_engine.dispose() +``` + +在長時間運行的服務中,清理資源至關重要;否則會造成 GPU 記憶體泄漏,最終導致應用程式崩潰。 + +--- + +## 完整腳本(即刻執行) + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**預期輸出**(簡易發票圖像範例): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! 
+``` + +請留意 AI 層將 “Inv0ice” 修正為 “Invoice”,並補上缺失的標點符號。 + +--- + +## 常見問題(快速解答) + +- **我需要 GPU 嗎?** 不需要。腳本會回退至 CPU,但推論速度會較慢。 +- **我可以使用其他 LLM 嗎?** 當然可以——只需更改 `hugging_face_repo_id` 並相應調整 `gpu_layers`。 +- **如果我的圖像很大怎麼辦?** 請先將其縮放(例如使用 Pillow)以維持合理的記憶體使用量。 +- **手寫識別是否永遠開啟?** 你可以依工作負載切換 `enable_handwritten_recognition`。 + +--- + +## 結論 + +現在你已掌握如何使用 Aspose OCR **識別圖像文字**、如何 **載入圖像以進行 OCR**,以及如何透過 AI 強化的後處理 **在圖像上執行 OCR**。上述完整且可執行的範例為你提供了堅實的基礎,能將 OCR 整合至任何 Python 專案——無論是掃描收據、數位化合約,或從手寫表單中擷取資料。 + +準備好進一步嗎?嘗試將 Qwen 模型換成更大的模型、實驗不同的量化方案,或將多張圖像串接以進行批次處理。可能性無限,而你剛建立的程式碼將優雅地應對各種需求。 + +祝程式開發順利,願你的 OCR 結果永遠清晰如水晶! + +![顯示如何使用 Aspose OCR 識別圖像文字的螢幕截圖](/images/ocr_output.png){alt="顯示如何使用 Aspose OCR 識別圖像文字的螢幕截圖"} + +## 接下來該學什麼? + +以下教學涵蓋與本指南緊密相關的主題,建立在所示技術之上。每個資源皆提供完整可運作的程式碼範例與步驟說明,協助你精通更多 API 功能,並在自己的專案中探索替代實作方式。 + +- [使用 Aspose OCR 從圖像提取文字 – 步驟說明指南](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [將圖像轉換為文字 – 從 URL 執行 OCR](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [使用 Aspose.OCR 以語言選擇提取圖像文字(C#)](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/hungarian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/hungarian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..9577aa027 --- /dev/null +++ b/ocr/hungarian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,215 @@ +--- +category: general +date: 2026-06-06 +description: Szöveg kinyerése kézírásos képből Python OCR segítségével. Tanulja meg, + hogyan alakíthatja át a kézírásos fotót szöveggé gyorsan és megbízhatóan. 
+draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: hu +og_description: Szöveg kinyerése kézírásos képből Python segítségével. Ez az útmutató + bemutatja, hogyan lehet a kézírásos fényképet szöveggé alakítani, és választ ad + arra, hogyan ismerhető fel a kézírásos szöveg. +og_title: Szöveg kinyerése kézírásos képből – Python OCR útmutató +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. + headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Szöveg kinyerése kézírásos képből Python OCR-rel – Lépésről lépésre útmutató +url: /hu/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Kézzel írott kép szövegének kinyerése Python OCR‑val – Lépésről‑lépésre útmutató + +Gondolkodtál már azon, **hogyan ismerjünk fel kézzel írott szöveget** egy telefonoddal készített fényképen? Nem vagy egyedül. Sok projektben – legyen szó előadások jegyzeteinek digitalizálásáról vagy aláírt űrlapok adatainak kinyeréséről – szükség van arra, hogy **kézzel írott képből szöveget nyerjünk ki** gyorsan és problémamentesen. + +Ebben a tutorialban egy teljes, azonnal futtatható példán keresztül mutatjuk be, hogyan **alakítsd át a kézzel írott fotót szöveggé** egy népszerű Python OCR könyvtár segítségével. Nincs homályos hivatkozás, csak konkrét kód, magyarázatok és tippek, amelyeket már ma másolhatsz‑beilleszthetsz. 
+ +![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image") + +## Amire szükséged lesz + +Mielőtt belevágnánk, győződj meg róla, hogy a következőkkel rendelkezel: + +| Követelmény | Miért fontos | +|-------------|----------------| +| Python 3.9 vagy újabb | Modern szintaxis és könyvtár‑támogatás | +| `pip` (Python csomagkezelő) | Az OCR csomag telepítéséhez | +| Egy tiszta kézzel írott jegyzet képe (JPEG/PNG) | Az OCR pontossága romlik a homályos képeken | +| Alapvető ismeretek a Python függvényekről | Segít a példa későbbi testreszabásában | + +Ha valamelyik hiányzik, töltsd le a legújabb Python‑t a hivatalos oldalról és telepítsd – semmi dráma. + +## A Python OCR könyvtár telepítése + +A kódrészlet, amelyet használni fogunk, az `ocr` csomagra támaszkodik (egy vékony wrapper a Tesseract köré, ami kényelmes beállításokat ad). Telepítsd egyetlen paranccsal: + +```bash +pip install ocr +``` + +> **Pro tipp:** Telepítés után futtasd a `tesseract --version` parancsot a terminálban, hogy megbizonyosodj a háttérben lévő motor jelenlétéről. Ha nincs Tesseract, kövesd az operációs rendszeredhez tartozó hivatalos útmutatót – a legtöbb csomagkezelő tartalmazza (`apt-get install tesseract-ocr` Ubuntun, `brew install tesseract` macOS‑en). + +## 1. lépés: OCR motor példány létrehozása + +A motor létrehozása az első tégla a **python ocr handwritten recognition** falában. Gondolj a motorra úgy, mint az agyra, amely később elolvassa a kézírás kacskaringóit. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Miért fontos: motor nélkül nem tudod finomhangolni a felismerési folyamatot. Az alapértelmezett beállítások nyomtatott szövegre vannak optimalizálva, ezért a következő lépésben módosítanunk kell őket. + +## 2. 
lépés: Kézzel írott felismerés engedélyezése + +Alapértelmezés szerint a motor nyomtatott karaktereket vár. A kézzel írott mód bekapcsolása egy kapcsolót állít át, amely azt mondja a Tesseract‑nak, hogy használja az LSTM modellt, amely a folyó írásra van betanítva. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Mi történik, ha kihagyod?** Az OCR a vonalakat zajként kezeli, ami torz kimenetet eredményez. A flag engedélyezése a **how to recognize handwritten text** lényege. + +## 3. lépés: Kézzel írott fotó betöltése + +Most a motort a kép fájlra irányítjuk. Az útvonal lehet abszolút vagy relatív; csak győződj meg róla, hogy a fájl létezik. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Gyors ellenőrzés: nyisd meg a képet a rendszered képnézegetőjében. Ha a szöveg homályos, fontold meg az előfeldolgozást (kontraszt növelése, forgatás), mielőtt a motorhoz adnád – ezek a finomhangolások gyakran növelik a **extract text from handwritten image** sikerességét. + +## 4. lépés: A felismerési folyamat indítása + +Minden beállítva, most végre megkérjük a motort, hogy végezze el a feladatát. A `recognize()` metódus egy eredményobjektust ad vissza, amely a kinyert szöveget és a biztonsági pontszámokat tartalmazza. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +A háttérben a motor a bitmapet jellemző vektorokká alakítja, ezeket átadja az LSTM hálózaton, majd összefűzi a karaktereket. Ez a **python ocr handwritten recognition** varázsa. + +## 5. lépés: A kinyert szöveg megjelenítése + +Az eredményobjektum egy `.text` attribútummal rendelkezik, amely a tiszta Unicode sztringet tartalmazza. Nyomtasd ki, írd fájlba, vagy továbbítsd egy másik folyamatba – a döntés a tiéd. 
+ +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Várható kimenet + +Ha a forráskép a „Buy milk, eggs, and bread” (Vedd meg a tejet, a tojást és a kenyeret) feljegyzést tartalmazza, valami ilyesmit látsz majd: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Vedd észre, hogy a kimenet megőrzi a központozást és a sortöréseket (ha vannak). Ha értelmetlen szöveget kapsz, ellenőrizd a kép minőségét és az `enable_handwritten_recognition` flaget. + +## Gyakori hibák kezelése + +| Probléma | Tünet | Megoldás | +|----------|-------|----------| +| Alacsony biztonsági pontszámok | Sok “?” vagy értelmetlen karakter | Növeld a kép DPI‑ját ≥300-ra, alkalmazz binarizációt (`opencv`), vagy vágd le a érdeklődési területet. | +| Vegyes nyelvek | A kimenet angolt és egy másik írásrendszert kever | Állítsd be `engine.ocr_settings.language = "eng"` (vagy más ISO kód) a `recognize()` előtt. | +| Nagy fájlok | Hosszú feldolgozási idő vagy memóriahiba | Méretezd át a képet egy ésszerű dimenzióra (pl. max szélesség 1200 px) betöltés előtt. | +| Hiányzó Tesseract | `ImportError` vagy `FileNotFoundError` | Telepítsd a Tesseract‑ot külön, és győződj meg róla, hogy a rendszer PATH‑jában szerepel. | + +Ezek a finomhangolások a **convert handwritten photo to text** munkafolyamatodat robusztusabbá teszik különböző adathalmazoknál. + +## Teljes szkript, amit ma futtathatsz + +Az alábbiakban megtalálod a komplett, önálló programot, amely összerakja az összes elemet. Másold egy `handwritten_ocr.py` nevű fájlba, majd futtasd a `python handwritten_ocr.py` parancsot. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. 
+""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Futtasd, és a szöveg a konzolra lesz kiírva – pontosan az eredmény, amire szükséged van, amikor **convert handwritten photo to text** szeretnél. + +## További lépések + +Miután elsajátítottad a **extract text from handwritten image** alapjait, gondolj ezekre a következő lépésekre: + +- **Kötegelt feldolgozás:** Iterálj egy mappán képek felett, és minden eredményt tárolj egy CSV fájlban. +- **Utófeldolgozás:** Használj reguláris kifejezéseket a gyakori OCR hibák (pl. „1” vs „l”) tisztításához. +- **Integráció:** A kinyert szövegeket egy természetes nyelvfeldolgozó (NLP) pipeline‑ba küldheted sentiment‑analízis vagy kulcsszó‑kivonás céljából. +- **Alternatív könyvtárak:** Ha nagyobb pontosságra van szükséged, nézd meg az `easyocr` vagy a `pytesseract` saját LSTM modellekkel – mindkettő támogatja a **python ocr handwritten recognition**‑t. + +Ne feledd, a forráskép minősége gyakran meghatározza a sikerességet, ezért szánj néhány percet az előfeldolgozásra. Egy kis plusz erőfeszítés most rengeteg hibakeresést takarít meg később. + +## Összegzés + +Léptünk egy teljes, vég‑től‑végig példán keresztül, amely megmutatja, **hogyan ismerjünk fel kézzel írott szöveget**, és ami még fontosabb, **hogyan nyerjünk ki szöveget kézzel írott képből** Python segítségével. A `ocr` csomag telepítésével, a kézzel írott flag bekapcsolásával, a kép betöltésével és a `recognize()` meghívásával **convert handwritten photo to text** néhány sor kóddal megvalósítható. 
+ +Próbáld ki a saját jegyzeteiddel, finomítsd az előfeldolgozási lépéseket, és hagyd, hogy az OCR elvégezze a nehéz munkát. Ha elakadsz, nézd meg a „Gyakori hibák kezelése” táblázatot, vagy kísérletezz alternatív OCR backend‑ekkel. Jó kódolást, és legyen a kézzel írott adatod azonnal kereshető! + +## Mit érdemes még tanulni? + +A következő tutorialok szorosan kapcsolódó témákat fednek le, amelyek a jelen útmutatóban bemutatott technikákra épülnek. Minden forrás tartalmaz teljes, működő kódrészleteket lépésről‑lépésre magyarázatokkal, hogy könnyedén elsajátíthasd az API további funkcióit és alternatív megvalósítási megközelítéseket a saját projektjeidben. + +- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [How to Recognize Page Rectangles for OCR Text Recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [How to Use OCR - Recognize Image without Text Area Detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/hungarian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/hungarian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..e71a6ba95 --- /dev/null +++ b/ocr/hungarian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,232 @@ +--- +category: general +date: 2026-06-06 +description: 'OCR karakterkészlet útmutató: tanulja meg, hogyan használja az OCR motorját + a latin és cirill írásrendszerek támogatott karaktereinek listázásához teljes Python + példákkal.' 
+draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: hu +og_description: 'OCR karakterkészlet útmutató: fedezze fel, hogyan használhatja az + OCR motorját Pythonban a különböző írásrendszerek támogatott karaktereinek listázásához, + kóddal és tippekkel együtt.' +og_title: OCR karakterkészlet – OCR motor lekérése és használata +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' + headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR karakterkészlet – OCR motor lekérése és használata Pythonban +url: /hu/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR karakterkészlet – OCR motor lekérdezése és használata Pythonban + +Szeretnéd tudni, pontosan mely karaktereket tartalmazza a könyvtárad **ocr karakterkészlete**? Ebben az útmutatóban megmutatjuk, hogyan **használhatod az OCR motort** a támogatott karakterek lekérdezésére különböző írásrendszerekhez, és miért fontos ez a valós projektekben. + +Ha már valaha üres képernyőre bámultál, és azon tűnődtél, miért nem jelennek meg bizonyos betűk az OCR feldolgozás után, nem vagy egyedül. A válasz gyakran a motor által ismert karakterkészletben rejlik. Az útmutató végére a következőkre leszel képes: + +* Listázni minden karaktert, amelyet egy OCR motor fel tud ismerni egy adott írásrendszerhez. +* Kezelni azokat az eseteket, amikor egy írásrendszer nem támogatott. +* A karakterkészlet‑információt beépíteni a további validációba vagy UI logikába. 
+ +Nincs varázslatos fekete doboz trükk – csak tiszta Python, néhány kódsor, és egy kis magyarázat. + +--- + +## Előfeltételek + +Mielőtt belevágnánk, győződj meg róla, hogy: + +* Python 3.8+ telepítve van (a kód f‑stringeket használ). +* Telepítve van az OCR könyvtár, amely az `ocr.OcrEngine`‑t biztosítja – ebben a példában feltételezzük, hogy az `ocr` nevű csomag már telepítve van a `pip install ocr-lib` paranccsal. +* Alapvető ismereted van a Python import rendszeréről és a kiírásról. + +Ha hiányzik a könyvtár, futtasd: + +```bash +pip install ocr-lib +``` + +Ennyi – nincs szükség extra binárisokra vagy natív függőségekre a karakterkészlet lekérdezéséhez, amelyet végrehajtunk. + +--- + +## 1. lépés: Az OCR motor példányának inicializálása + +A motor objektum létrehozása az első dolog, amit meg kell tenned, amikor **használod az OCR motort**. Gondolj rá úgy, mint egy szkenner bekapcsolására; nélküle nem tehetsz fel kérdéseket. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +Az `OcrEngine` osztály egy könnyűsúlyú burkoló a mögöttes felismerő motor körül. Nem indít nehéz feldolgozást, amíg valamit nem kérsz, így az indítási költség elhanyagolható. + +> **Pro tipp:** Ha az alkalmazásod sok motor példányt hozna létre, használj inkább egyetlen példányt újra. Így memóriát takarítasz meg, és csökkented az inicializációs késleltetést. + +--- + +## 2. lépés: OCR karakterkészlet lekérdezése egy adott írásrendszerhez + +Most, hogy van egy motorunk, megkérdezhetjük, mely karaktereket ismer egy adott írásrendszerhez. A `get_supported_characters(script_name)` metódus egy Python `list`‑et ad vissza Unicode karakterekkel. 
+ +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +A fenti kódrészlet futtatása valami ilyesmit nyomtat: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +A pontos kimenet az OCR könyvtár verziójától függ, de mindig egy lapos karakterlistát kapsz. Ha nagy- és kisbetűkre is szükséged van, egyszerűen kérd le a `"Latin"` írásrendszert, majd alkalmazd a `.lower()`‑t minden elemre, vagy kérdezd le a `"Latin‑lower"` írásrendszert, ha a könyvtár megkülönbözteti őket. + +### Miért fontos ez? + +Ha egy olyan képet adsz az OCR-nek, amely szokatlan diakritikus jeleket tartalmaz (pl. „ñ” vagy „ø”), a motor csendben helyettesítheti őket egy helykitöltővel vagy teljesen eldobhatja őket. A **ocr karakterkészlet** előzetes ismerete lehetővé teszi a bemenet elővalidálását, a felhasználók figyelmeztetését, vagy egy másik motor használatát. + +--- + +## 3. lépés: Karakterek lekérdezése egy másik írásrendszerhez – Cirill példa + +Ugyanaz a metódus minden olyan írásrendszerhez működik, amelyet a motor támogat. Nézzük meg, mi történik a cirill esetén. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +Tipikus kimenet: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +Ha a motor **nem** támogatja a kért írásrendszert, általában `ValueError`‑t dob (vagy egy üres listát ad vissza a megvalósítástól függően). Védd le ezt a helyzetet. 
+ +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## 4. lépés: OCR karakterkészlet vizualizálása (opcionális) + +Néha egy gyors vizuális ábrázolás segít, különösen, ha a stakeholder‑eknek kell bemutatni, mely karakterek vannak lefedve. Az alábbi egyszerű példa a `matplotlib`‑ot használja egy karakterrács megjelenítésére. + +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Kép alternatív szöveg:** ![OCR karakterkészlet diagram](ocr_character_set.png){alt="OCR karakterkészlet áttekintése"} + +A diagram nem kötelező a megoldáshoz, de bemutatja, hogyan használhatod a **OCR motor** metaadatait a tiszta kiírás mellett is. + +--- + +## 5. lépés: A karakterkészlet integrálása valós munkafolyamatokba + +Most, hogy le tudjuk kérni a **ocr karakterkészletet**, nézzünk egy gyakorlati szituációt: az OCR kimenet validálása adatbázisba mentés előtt. 
+ +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +Ez a minta megakadályozza, hogy szemét adatok kerüljenek a downstream csővezetékekbe – gyakori fájdalompont a többnyelvű dokumentumok kezelésekor. + +--- + +## Gyakori hibák és széljegyek + +| Hiba | Miért fordul elő | Hogyan kerüld el | +|------|------------------|------------------| +| **Írásrendszer‑név elütés** (pl. `"Cyrillic "` szóközzel a végén) | A motor a karakterláncot szó szerint kezeli, és nem találja az írásrendszert. | Távolítsd el a felesleges szóközöket: `script.strip()` a `get_supported_characters` hívása előtt. | +| **Üres karakterlista** | Egyes motorok kinyilvánítanak egy írásrendszert, de még nem töltötték be a nyelvi modellt. | Hívd meg az `engine.load_language_model(script)`‑t, ha a könyvtár ilyen metódust biztosít, vagy ellenőrizd, hogy a modellfájlok jelen vannak. | +| **Unicode normalizációs problémák** | Olyan karakterek, mint a “é”, megjelenhetnek összeállított (`\u00E9`) vagy szétbontott (`e\u0301`) formában. | Normalizáld a stringeket a `unicodedata.normalize('NFC', text)`‑vel a validálás előtt. | +| **Teljesítmény nagy írásrendszerek esetén** (pl. kínai) | Több ezer karakter lekérdezése lassú lehet. | Cache-eld az eredményt az első hívás után; a legtöbb alkalmazásnak csak egyszer kell a listára. 
| + +--- + +## Teljes működő példa + +Az alábbi egyetlen szkript mindent egyben tartalmaz, amit egyszerűen másolj és futtass: + + + +## Mi legyen a következő tanulnivalód? + +A következő útmutatók szorosan kapcsolódó témákat fednek le, amelyek a jelen útmutatóban bemutatott technikákra épülnek. Minden forrás teljesen működő kódpéldákat tartalmaz lépésről‑lépésre magyarázatokkal, hogy segítsenek elsajátítani további API funkciókat és alternatív megvalósítási megközelítéseket a saját projektjeidben. + +- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/) +- [OCR Post Processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [How to Set Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/hungarian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/hungarian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..bb8c844e6 --- /dev/null +++ b/ocr/hungarian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,256 @@ +--- +category: general +date: 2026-06-06 +description: Képen OCR végrehajtása az Aspose OCR és egy Hugging Face modell használatával. + Tanulja meg, hogyan töltheti le a Hugging Face modellt, hogyan vonjon ki szöveget + egy számlából, és hogyan szabadítsa fel a GPU erőforrásokat. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: hu +og_description: Képen OCR végrehajtása az Aspose OCR és egy Hugging Face modell segítségével. 
+ Ez az útmutató bemutatja, hogyan töltsd le a modellt, hogyan nyerd ki a szöveget + egy számláról, és hogyan szabadítsd fel a GPU erőforrásokat. +og_title: OCR végrehajtása képen az Aspose OCR és LLM segítségével – Teljes útmutató +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. + headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: OCR végrehajtása képen az Aspose OCR és LLM segítségével – Teljes útmutató +url: /hu/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR végrehajtása képen Aspose OCR & LLM segítségével – Teljes útmutató + +Szerettél volna már **OCR-t végrehajtani képfájlokon**, de elakadtál a „hol is kezdjem?” kérdésnél? Nem vagy egyedül – sok fejlesztő ütközik ebbe a falba, amikor először foglalkozik dokumentumautomatizálással. A jó hír, hogy az Aspose OCR és egy könnyű, a Hugging Face‑ről származó LLM néhány Python‑sorral tiszta, kereshető szöveggé alakíthatja egy számla nyers beolvasott képét. + +Ebben a tutorialban mindenen végigvezetünk: a **kép betöltését OCR-hez**, a **Hugging Face modell letöltését**, a **szöveg kinyerését számlából**, és végül a **GPU erőforrások felszabadítását**, hogy az alkalmazásod karcsú maradjon. A végére egy önálló szkriptet kapsz, amit bármelyik projektedbe beilleszthetsz. + +--- + +## Mit fogsz megtanulni + +- Hogyan **végezz OCR-t képen** az Aspose `OcrEngine` segítségével. +- A pontos lépéseket a **Hugging Face modell** fájljainak automatikus letöltéséhez. +- Technikákat a **szöveg kinyeréséhez számlákból** PDF‑ek vagy PNG‑k esetén, AI‑támogatott utófeldolgozással. 
+- Legjobb gyakorlatok a **GPU erőforrások felszabadításához** az inferencia után. +- Tippek ahhoz, hogyan végezd hatékonyan a **kép betöltését OCR-hez**, és hogyan kerüld el a gyakori buktatókat. + +Nem szükséges külső dokumentáció – minden, amire szükséged van, itt van, teljes kóddal, magyarázatokkal és várt kimenettel. + +--- + +## Előfeltételek + +Mielőtt belevágnánk, győződj meg róla, hogy rendelkezel a következőkkel: + +| Követelmény | Indok | +|-------------|--------| +| Python 3.9+ | Modern szintaxis és típusjelölések | +| `asposeocr` csomag (`pip install asposeocr`) | Alap OCR motor | +| GPU‑hozzáférés (opcionális, de ajánlott) | Gyorsítja az LLM utófeldolgozót | +| Számlakép (`sample_invoice.png`) | Valós teszteset | + +Ha valamelyik hiányzik, telepítsd most; a szkript **automatikusan letölti a Hugging Face modellt**, így nem kell magad keresgélned a fájlok után. + +--- + +## 1. lépés: OCR végrehajtása képen – Motor létrehozása + +Az első dolog, amit meg kell tenned, az, hogy elindítod az Aspose OCR motort. Gondolj rá úgy, mint egy üres vászonra, ahová később a kép szöveggé lesz festve. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Miért fontos:** Az `OcrEngine` elrejti az alacsony szintű képelőkészítést, így a magasabb szintű munkafolyamatra koncentrálhatsz. Emellett egy `set_post_processor` metódust is biztosít, amely később egy LLM‑et csatlakoztat az okosabb kimenethez. + +--- + +## 2. lépés: Kép betöltése OCR‑hez – A megfelelő fájl kiválasztása + +Most, hogy a motor létezik, **be kell tölteni a képet OCR‑hez**. Az Aspose támogatja a PNG, JPG, TIFF és néhány egyéb formátumot. Győződj meg róla, hogy az útvonal abszolút vagy a szkript helyéhez relatív. 
+ +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Tipp:** Ha a képed nagy, fontold meg előzetes átméretezését a memóriaigény csökkentése érdekében. Az OCR motor képes kezelni a nagy felbontású beolvasásokat, de egy 300 DPI‑os kép általában ideális a számlákhoz. + +--- + +## 3. lépés: Nyers OCR végrehajtása és a kinyert szöveg megtekintése + +A kép betöltése után végre **OCR‑t hajthatunk végre képen**, és megnézhetjük, mit ad ki a nyers motor. Ez a lépés egy kiindulási alapot biztosít, mielőtt az AI‑varázslatot hozzáadnánk. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Várt kimenet (rövidítve):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +A nyers kimenet gyakran tartalmaz sortöréseket, félreolvasott karaktereket vagy hiányzó mezőket – ezért hozunk be egy nyelvi modellt a következő lépésben. + +--- + +## 4. lépés: Hugging Face modell letöltése – LLM utófeldolgozó beállítása + +Itt jön a **Hugging Face modell letöltése** lépés a főszerepbe. Az Aspose AI automatikusan letölthet egy modellt a Hugging Face hub‑ról, ha az még nincs a lemezen. A Qwen2.5‑3B‑Instruct‑GGUF modellt fogjuk használni, amely jó egyensúlyt teremt a pontosság és a memóriaigény között. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Miért működik:** Az `allow_auto_download` elkerüli a `.gguf` fájl kézi letöltését. 
A kvantálás (`int8`) a modell méretét körülbelül 3 GB‑ra csökkenti, így a legtöbb fogyasztói GPU‑n is futtatható. Állítsd be a `gpu_layers` értékét a hardverednek megfelelően – több GPU‑réteg = gyorsabb inferencia. + +--- + +## 5. lépés: Szöveg kinyerése számlából AI‑támogatott utófeldolgozással + +Most csatlakoztatjuk az LLM‑et az OCR motorhoz, és egy **utófeldolgozót** futtatunk, amely megtisztítja a nyers kimenetet, javítja az OCR hibákat, és szépen formázza a számla mezőket. + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Minta javított kimenet:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **Mi történt?** Az LLM felismerte, hogy a “Invoice #12345” helyett “Invoice Number: 12345” kell legyen, javította a dátumformátumot, és még a “Bill To” mezőt is kitalálta, amelyet a nyers motor kihagyott. Ez a **szöveg kinyerése számlából** automatizálásának magja. + +--- + +## 6. lépés: GPU erőforrások felszabadítása – Tisztítás a feldolgozás után + +Ha ezt egy hosszú életű szolgáltatásban (pl. Flask API) futtatod, minden inferencia után **felszabadítani kell a GPU erőforrásokat**, hogy elkerüld a memóriahiányos összeomlásokat. Az Aspose AI egy egyszerű módszert biztosít erre. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro tipp:** Hívd meg a `free_resources()`‑t egy `finally:` blokkban, ha az OCR hívást try/except‑ben csomagolod. Így a tisztítás garantáltan megtörténik, még kivétel esetén is. 
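A fenti tipp egy lehetséges vázlata: az alábbi példa a `try`/`finally` mintát mutatja be. A `FakeOcrEngine` és a `FakeAIEngine` osztály kitalált helyettesítő, nem az Aspose API része; csak arra szolgálnak, hogy a példa a könyvtár telepítése nélkül is futtatható legyen. A `run_ocr_safely` segédfüggvény neve szintén feltételezés. Valódi kódban a saját `engine` és `ai_engine` objektumaidat adnád át, feltéve, hogy azok a fent bemutatott `free_resources()` és `dispose()` metódusokat kínálják.

```python
class FakeAIEngine:
    """Hypothetical stand-in for the AI post-processor (not the Aspose API)."""

    def __init__(self):
        self.released = False

    def free_resources(self):
        self.released = True


class FakeOcrEngine:
    """Hypothetical stand-in for the OCR engine (not the Aspose API)."""

    def __init__(self):
        self.disposed = False

    def recognize(self):
        # Simulate a failure in the middle of processing.
        raise RuntimeError("simulated OCR failure")

    def dispose(self):
        self.disposed = True


def run_ocr_safely(engine, ai_engine):
    """Run recognition; release resources whether or not it succeeds."""
    try:
        return engine.recognize()
    except RuntimeError as exc:
        print(f"OCR failed: {exc}")
        return None
    finally:
        # Runs on success, on a handled error, and on an unhandled one.
        ai_engine.free_resources()
        engine.dispose()


engine = FakeOcrEngine()
ai_engine = FakeAIEngine()
result = run_ocr_safely(engine, ai_engine)
print(result, ai_engine.released, engine.disposed)
```

Mivel a felszabadítás a `finally` ágban történik, a hívónak nem kell külön gondoskodnia róla, és a hibaág sem hagy lefoglalt GPU‑memóriát maga után.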
+ +--- + +## 7. lépés: Teljes szkript – Összeállítás egyben + +Az alábbiakban a teljes, azonnal futtatható szkript található. Másold be, állítsd be az útvonalakat, és már indulhat is. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Futtasd a szkriptet, és figyeld, ahogy a zajos OCR kimenet tiszta, strukturált számlaadatokká alakul. 🎉 + +--- + +## Gyakori kérdések & széljegyek + +| Question | Answer | +|----------|--------| +| **Mi történik, ha a modell letöltése sikertelen?** | Győződj meg róla, hogy a géped internetkapcsolattal rendelkezik, és a `hugging_face_repo_id` helyes. Manuálisan is letöltheted a | + +## Mit érdemes még megtanulni? + +A következő tutorialok szorosan kapcsolódó témákat fednek le, amelyek tovább építik a jelen útmutatóban bemutatott technikákat. Minden forrás komplett, működő kódpéldákat tartalmaz lépésről‑lépésre magyarázatokkal, hogy elsajátíthasd a további API‑funkciókat és alternatív megvalósítási megközelítéseket a saját projektjeidben. 
+ +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/hungarian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/hungarian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..ce27bb2d6 --- /dev/null +++ b/ocr/hungarian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,283 @@ +--- +category: general +date: 2026-06-06 +description: szöveg felismerése képről az Aspose OCR-rel – tanulja meg, hogyan töltsön + be képet OCR-hez, és hogyan hajtson végre OCR-t a képen AI utófeldolgozással Pythonban. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: hu +og_description: Ismerje fel a szöveget a képről gyorsan. Ez az útmutató bemutatja, + hogyan töltsön be képet OCR-hez, hogyan végezzen OCR-t a képen, és hogyan javítsa + az eredményeket AI utófeldolgozással. +og_title: szöveg felismerése képről az Aspose OCR és AI segítségével +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU 
memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Szöveg felismerése képről az Aspose OCR és AI segítségével +url: /hu/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Szöveg felismerése képről az Aspose OCR & AI segítségével + +Valaha szükséged volt már arra, hogy szöveget ismerj fel egy képen, de nem tudtad, melyik könyvtár biztosítja a sebességet és a pontosságot egyaránt? Nem vagy egyedül. Ebben az útmutatóban egy teljes, vég‑től‑végig példát mutatunk be, amely bemutatja, hogyan **töltsünk be képet az OCR-hez**, **végezzük el az OCR-t a képen**, majd finomítsuk a kimenetet az Aspose AI utófeldolgozójával. A végére egy kész‑futásra kész szkriptet kapsz, amely egy PNG‑t tiszta, kereshető szöveggé alakít. + +## Mit fogsz megtanulni + +Mindent lefedünk az Aspose OCR csomag telepítésétől a futás végén a források felszabadításáig. Megmutatjuk, miért fontos a kézírás‑felismerés engedélyezése, hogyan konfiguráljunk egy Qwen 2.5 LLM‑et az utófeldolgozáshoz, és milyen lesz a végső kimenet. Nem szükséges külső hivatkozás – csak másold, illeszd be és futtasd. + +### Előfeltételek + +- Python 3.8 vagy újabb +- `asposeocr` csomag (`pip install asposeocr`) +- Képfájl (pl. 
`doc.png`), amely nyomtatott vagy kézírásos szöveget tartalmaz +- Opcionális: GPU a gyorsabb LLM inferenciához (a szkript CPU‑n is működik) + +--- + +## Szöveg felismerése képről – Lépésről‑lépésre + +Az egyes kódtömbök alatt rövid magyarázatot találsz arra, **miért** hajtjuk végre az adott műveletet, nem csak arra, **mit** csinál a sor. + +### 1. lépés: A szükséges modulok telepítése és importálása + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Miért?* Az `asposeocr` importálása biztosítja számunkra az `OcrEngine` osztályt, míg az `ai` almodul az LLM‑alapú utófeldolgozót nyújtja, amely jelentősen javítja a nyers OCR kimenetet. + +### 2. lépés: Az OCR motor létrehozása és a kézírás‑felismerés engedélyezése + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +A kézírás‑felismerés engedélyezése kibővíti a motor karakterkészletét, így nem vesznek el a kézzel írt részek, amikor **OCR‑t végzel a képen** olyan fájlok esetén, amelyek nyomtatott és folyó írású szöveget is tartalmaznak. + +### 3. lépés: Kép betöltése az OCR‑hez + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +A `load_image` hívás az a pillanat, amikor **betöltöd a képet az OCR‑hez**; ha az útvonal hibás, a motor informatív kivételt dob, így elkerülheted a későbbi rejtélyes hibákat. + +### 4. lépés: Nyers OCR futtatása + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +Ebben a szakaszban kapsz egy `RecognitionResult` objektumot, amely a szűretlen szöveget, a megbízhatósági pontszámokat és az elrendezési metaadatokat tartalmazza. A nyers eredmény gyakran zajos – ezért szükséges az AI‑alapú tisztítás. 
+ +### 5. lépés: Az Aspose AI modell konfigurálása LLM utófeldolgozáshoz + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Miért gondoskodunk ezekről a beállításokról? +- **auto‑download** garantálja, hogy a modell az első futtatáskor elérhető legyen. +- **int8 quantization** jelentősen csökkenti a memóriaigényt anélkül, hogy nagy pontosságcsökkenést okozna. +- **gpu_layers** lehetővé teszi, hogy kompatibilis GPU‑t használj a gyorsabb inferenciához. + +### 6. lépés: Az AI processzor inicializálása és csatolása utófeldolgozóként + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +A processzor csatolása azt jelenti, hogy minden `run_postprocessor` híváskor az LLM javítja a helyesírást, egyesíti a széttört szavakat, és még a hiányzó írásjeleket is kitalálja. + +### 7. lépés: Az utófeldolgozó futtatása az OCR kimenet javításához + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +Az `enhanced_result.text` általában sokkal olvashatóbb, mint a nyers karakterlánc – tekintsd úgy, mint egy helyesírás-ellenőrzőt, amely a kontextust is érti. + +### 8. lépés: Erőforrások felszabadítása + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. 
+ocr_engine.dispose() +``` + +A takarítás elengedhetetlen a hosszú távú szolgáltatásokban; ellenkező esetben GPU‑memóriát szivárogtatsz, és végül összeomlik az alkalmazásod. + +--- + +## Teljes szkript, amelyet ma futtathatsz + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Várt kimenet** (példa egy egyszerű számla képre): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Vedd észre, hogyan javította az AI réteg a „Inv0ice” → „Invoice” szót, és adta hozzá a hiányzó írásjeleket. + +--- + +## Gyakran feltett kérdések (és gyors válaszok) + +- **Szükségem van GPU‑ra?** Nem. A szkript visszaesik CPU‑ra, de az inferencia lassabb lesz. +- **Használhatok másik LLM‑et?** Természetesen – csak cseréld le a `hugging_face_repo_id`‑t, és ennek megfelelően állítsd be a `gpu_layers`‑t. 
+- **Mi van, ha a kép hatalmas?** Először méretezd át (pl. a Pillow használatával), hogy a memóriahasználat ésszerű maradjon. +- **A kézírás‑felismerés mindig be van kapcsolva?** A `enable_handwritten_recognition` beállítást a terhelésednek megfelelően ki- vagy bekapcsolhatod. + +--- + +## Következtetés + +Most már tudod, hogyan **ismerj fel szöveget képről** az Aspose OCR használatával, hogyan **tölts be képet az OCR‑hez**, és hogyan **végezz OCR‑t a képen** AI‑javított utófeldolgozással. A fenti teljes, futtatható példa szilárd alapot nyújt az OCR bármely Python projektbe való integrálásához – legyen szó nyugták beolvasásáról, szerződések digitalizálásáról vagy kézírásos űrlapok adatainak kinyeréséről. + +Készen állsz a következő lépésre? Próbáld ki a Qwen modellt egy nagyobbra cserélni, kísérletezz különböző kvantálási sémákkal, vagy láncolj több képet egymás után kötegelt feldolgozáshoz. A lehetőségek végtelenek, és a most épített kód elegánsan kezeli őket. + +Boldog kódolást, és legyenek az OCR eredményeid mindig kristálytiszta! + +![Python konzol képernyőképe, amely a javított OCR kimenetet mutatja](/images/ocr_output.png){alt="Képernyőkép, amely bemutatja, hogyan lehet szöveget felismerni képről az Aspose OCR használatával"} + +## Mit érdemes következőként megtanulni? + +Az alábbi oktatóanyagok szorosan kapcsolódó témákat fednek le, amelyek a jelen útmutatóban bemutatott technikákra épülnek. Minden forrás tartalmaz teljes, működő kódpéldákat lépésről‑lépésre magyarázatokkal, hogy elsajátíthasd a további API funkciókat, és alternatív megvalósítási megközelítéseket fedezhess fel saját projektjeidben. 
+ +- [Szöveg kinyerése képről Aspose OCR‑rel – Lépésről‑lépésre útmutató](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Kép konvertálása szöveggé – OCR végrehajtása képen URL‑ről](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Képszöveg kinyerése C#‑ban nyelvválasztással az Aspose.OCR használatával](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/indonesian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/indonesian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..c6af1e4d4 --- /dev/null +++ b/ocr/indonesian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,216 @@ +--- +category: general +date: 2026-06-06 +description: Ekstrak teks dari gambar tulisan tangan menggunakan Python OCR. Pelajari + cara mengubah foto tulisan tangan menjadi teks dengan cepat dan andal. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: id +og_description: Ekstrak teks dari gambar tulisan tangan dengan Python. Panduan ini + menunjukkan cara mengubah foto tulisan tangan menjadi teks dan menjawab cara mengenali + teks tulisan tangan. +og_title: Ekstrak Teks dari Gambar Tulisan Tangan – Tutorial OCR Python +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Ekstrak Teks dari Gambar Tulisan Tangan dengan Python OCR – Panduan Langkah + demi Langkah +url: /id/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Ekstrak Teks dari Gambar Tulisan Tangan dengan Python OCR – Panduan Langkah‑per‑Langkah + +Pernah bertanya-tanya **bagaimana cara mengenali teks tulisan tangan** dalam foto yang Anda ambil dengan ponsel? Anda tidak sendirian. Dalam banyak proyek—baik itu mendigitalkan catatan kuliah atau mengambil data dari formulir yang ditandatangani—Anda perlu **mengekstrak teks dari gambar tulisan tangan** dengan cepat dan tanpa masalah. + +Dalam tutorial ini kami akan membimbing Anda melalui contoh lengkap yang siap dijalankan yang menunjukkan secara tepat cara **mengubah foto tulisan tangan menjadi teks** menggunakan pustaka Python OCR yang populer. Tanpa referensi yang samar, hanya kode konkret, penjelasan, dan tip yang dapat Anda salin‑tempel hari ini. + +![ekstrak teks dari gambar tulisan tangan](https://example.com/placeholder-handwritten.jpg "ekstrak teks dari gambar tulisan tangan") + +## Apa yang Anda Butuhkan + +Sebelum kita mulai, pastikan Anda memiliki hal‑hal berikut: + +| Persyaratan | Mengapa penting | +|-------------|-----------------| +| Python 3.9 atau lebih baru | Sintaks modern & dukungan pustaka | +| `pip` (manajer paket Python) | Untuk menginstal paket OCR | +| Gambar yang jelas dari catatan tulisan tangan (JPEG/PNG) | Akurasi OCR menurun pada gambar yang buram | +| Pemahaman dasar tentang fungsi Python | Membantu Anda menyesuaikan contoh nanti | + +Jika Anda belum memiliki salah satu dari ini, unduh Python terbaru dari dan instal—tanpa ribet. 
+ +## Instal Pustaka Python OCR + +Potongan kode yang akan kami gunakan bergantung pada paket `ocr` (pembungkus tipis di atas Tesseract yang menambahkan pengaturan berguna). Instal dengan satu perintah: + +```bash +pip install ocr +``` + +> **Tip pro:** Setelah instalasi, jalankan `tesseract --version` di terminal Anda untuk memastikan mesin dasar tersedia. Jika Anda belum memiliki Tesseract, ikuti panduan resmi untuk OS Anda—kebanyakan manajer paket sudah menyediakannya (`apt-get install tesseract-ocr` di Ubuntu, `brew install tesseract` di macOS). + +## Langkah 1: Buat Instance Mesin OCR + +Membuat mesin adalah batu pertama dalam dinding **python ocr handwritten recognition**. Anggap mesin sebagai otak yang nanti akan membaca coretan. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Mengapa ini penting: tanpa mesin Anda tidak dapat menyesuaikan alur pengenalan. Pengaturan default dioptimalkan untuk teks cetak, jadi kita perlu menyesuaikannya pada langkah berikutnya. + +## Langkah 2: Aktifkan Pengenalan Tulisan Tangan + +Secara default mesin mengasumsikan karakter cetak. Mengaktifkan mode tulisan tangan mengubah saklar yang memberi tahu Tesseract untuk menggunakan model LSTM yang dilatih pada goresan kursif. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Bagaimana jika Anda melewatkannya?** OCR akan memperlakukan goresan sebagai noise, menghasilkan output yang kacau. Mengaktifkan flag tersebut adalah inti dari **bagaimana cara mengenali teks tulisan tangan**. + +## Langkah 3: Muat Foto Tulisan Tangan Anda + +Sekarang kami mengarahkan mesin ke file gambar. Path dapat berupa absolut atau relatif; pastikan file tersebut ada. 
+ +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Pemeriksaan cepat: buka gambar di penampil OS Anda. Jika teksnya buram, pertimbangkan pra‑pemrosesan (peningkatan kontras, rotasi) sebelum memberikannya ke mesin—penyesuaian tersebut sering meningkatkan tingkat keberhasilan **ekstrak teks dari gambar tulisan tangan**. + +## Langkah 4: Jalankan Proses Pengenalan + +Setelah semuanya siap, kami akhirnya meminta mesin untuk melakukan tugasnya. Metode `recognize()` mengembalikan objek hasil yang berisi string yang diekstrak dan skor kepercayaan. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Di balik layar, mesin mengubah bitmap menjadi serangkaian vektor fitur, memprosesnya melalui jaringan LSTM, dan menyatukan karakter‑karakter. Itulah keajaiban **python ocr handwritten recognition**. + +## Langkah 5: Tampilkan Teks yang Diekstrak + +Objek hasil menyediakan atribut `.text` yang berisi string Unicode biasa. Cetak, tulis ke file, atau masukkan ke pipeline lain—pilihan Anda. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Output yang Diharapkan + +Jika gambar sumber berisi catatan “Buy milk, eggs, and bread”, Anda akan melihat sesuatu seperti: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Perhatikan bagaimana output mempertahankan tanda baca dan pemisah baris (jika ada). Jika Anda mendapatkan teks tak terbaca, periksa kembali kualitas gambar dan flag `enable_handwritten_recognition`. + +## Menangani Kendala Umum + +| Masalah | Gejala | Solusi | +|---------|--------|--------| +| Skor kepercayaan rendah | Banyak “?” atau karakter tak masuk akal | Tingkatkan DPI gambar menjadi ≥300, terapkan binarisasi (`opencv`), atau potong ke wilayah yang diinginkan. 
|
+| Bahasa campuran | Output mencampur bahasa Inggris dengan skrip lain | Set `engine.ocr_settings.language = "eng"` (atau kode ISO lain) sebelum `recognize()`. |
+| File besar | Waktu pemrosesan lama atau error memori | Ubah ukuran gambar ke dimensi yang wajar (mis., lebar maksimum 1200 px) sebelum memuat. |
+| Tesseract tidak ada | `ImportError` atau `FileNotFoundError` | Instal Tesseract secara terpisah dan pastikan berada di PATH sistem Anda. |
+
+Penyesuaian ini membuat alur kerja **convert handwritten photo to text** Anda tetap kuat di berbagai dataset.
+
+## Skrip Lengkap yang Dapat Anda Jalankan Hari Ini
+
+Berikut adalah program lengkap yang berdiri sendiri yang menyatukan semua bagian. Salin ke file bernama `handwritten_ocr.py` dan jalankan `python handwritten_ocr.py`.
+
+```python
+#!/usr/bin/env python3
+"""
+handwritten_ocr.py
+
+A minimal example that demonstrates how to extract text from handwritten image
+using Python OCR (handwritten recognition mode). Adjust `image_path` to point
+to your own file before running.
+"""
+
+import ocr  # pip install ocr
+
+def main():
+    # 1️⃣ Create the OCR engine
+    engine = ocr.OcrEngine()
+
+    # 2️⃣ Enable the handwritten recognition flag
+    engine.ocr_settings.enable_handwritten_recognition = True
+
+    # 3️⃣ Load the target image
+    image_path = "YOUR_DIRECTORY/handwritten_note.jpg"
+    engine.load_image(image_path)
+
+    # 4️⃣ Run the OCR process
+    result = engine.recognize()
+
+    # 5️⃣ Output the extracted text
+    print("=== Extracted Text ===")
+    print(result.text)
+
+if __name__ == "__main__":
+    main()
+```
+
+Jalankan, dan Anda akan melihat teks tercetak di konsol, tepat hasil yang Anda butuhkan ketika ingin **convert handwritten photo to text**.
+
+## Melangkah Lebih Jauh
+
+Sekarang Anda telah menguasai dasar **ekstrak teks dari gambar tulisan tangan**, pertimbangkan langkah selanjutnya berikut:
+
+- **Pemrosesan batch:** Loop melalui folder gambar dan simpan setiap hasil ke file CSV.
+- **Pasca‑pemrosesan:** Gunakan regular expression untuk membersihkan kesalahan OCR umum (mis., “1” vs “l”). +- **Integrasi:** Masukkan string yang diekstrak ke pipeline Natural Language Processing untuk analisis sentimen atau ekstraksi kata kunci. +- **Pustaka alternatif:** Jika Anda membutuhkan akurasi lebih tinggi, jelajahi `easyocr` atau `pytesseract` dengan model LSTM khusus—keduanya juga mendukung **python ocr handwritten recognition**. + +Ingat, kualitas gambar sumber sering menentukan keberhasilan, jadi luangkan beberapa menit untuk pra‑pemrosesan. Sedikit usaha ekstra sekarang menghemat banyak debugging nanti. + +## Kesimpulan + +Kami telah membahas contoh lengkap end‑to‑end yang menunjukkan **bagaimana cara mengenali teks tulisan tangan** dan, yang lebih penting, bagaimana **mengekstrak teks dari gambar tulisan tangan** menggunakan Python. Dengan menginstal paket `ocr`, mengaktifkan flag tulisan tangan, memuat gambar Anda, dan memanggil `recognize()`, Anda dapat **convert handwritten photo to text** dalam beberapa baris kode. + +Cobalah dengan catatan Anda sendiri, sesuaikan langkah pra‑pemrosesan, dan biarkan OCR melakukan pekerjaan berat. Jika Anda menemui kendala, tinjau kembali tabel “Menangani Kendala Umum” atau coba alternatif back‑end OCR. Selamat coding, semoga data tulisan tangan Anda menjadi dapat dicari secara instan! + +## Apa yang Harus Anda Pelajari Selanjutnya? + +Tutorial berikut mencakup topik terkait erat yang membangun teknik yang ditunjukkan dalam panduan ini. Setiap sumber menyertakan contoh kode lengkap yang berfungsi dengan penjelasan langkah demi langkah untuk membantu Anda menguasai fitur API tambahan dan menjelajahi pendekatan implementasi alternatif dalam proyek Anda. 
+ +- [Ekstrak Teks dari Gambar dengan Aspose OCR – Panduan Langkah‑per‑Langkah](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Cara Mengenali Persegi Panjang Halaman untuk Pengenalan Teks OCR di Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [Cara Menggunakan OCR - Mengenali Gambar tanpa Deteksi Area Teks](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/indonesian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/indonesian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..c99d4d8bb --- /dev/null +++ b/ocr/indonesian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,218 @@ +--- +category: general +date: 2026-06-06 +description: 'Panduan set karakter OCR: pelajari cara menggunakan mesin OCR untuk + menampilkan karakter yang didukung untuk skrip Latin dan Cyrillic dengan contoh + Python lengkap.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: id +og_description: 'Panduan set karakter OCR: temukan cara menggunakan mesin OCR di Python + untuk menampilkan karakter yang didukung untuk berbagai skrip, lengkap dengan kode + dan tips.' +og_title: Set Karakter OCR – Dapatkan dan Gunakan Mesin OCR +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: Set Karakter OCR – Mengambil dan Menggunakan Mesin OCR di Python +url: /id/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Set Karakter OCR – Mengambil dan Menggunakan Mesin OCR di Python + +Ingin mengetahui **ocr character set** yang didukung oleh perpustakaan Anda? Dalam tutorial ini kami akan menunjukkan cara **use OCR engine** untuk menanyakan karakter yang didukung untuk berbagai skrip, dan mengapa hal itu penting untuk proyek dunia nyata. + +Jika Anda pernah menatap layar kosong bertanya-tanya mengapa beberapa huruf tidak pernah muncul setelah pemrosesan OCR, Anda tidak sendirian. Jawabannya sering tersembunyi dalam set karakter yang diketahui mesin. Pada akhir panduan ini Anda akan dapat: + +* Daftar semua karakter yang dapat dikenali mesin OCR untuk skrip tertentu. +* Menangani kasus di mana skrip tidak didukung. +* Menyisipkan informasi set karakter ke dalam validasi hilir atau logika UI. + +Tidak ada trik kotak hitam yang ajaib—hanya Python biasa, beberapa baris kode, dan sedikit penjelasan. + +--- + +## Prasyarat + +Sebelum kita mulai, pastikan Anda memiliki: + +* Python 3.8+ terpasang (kode menggunakan f‑strings). +* Perpustakaan OCR yang menyediakan `ocr.OcrEngine`—untuk contoh ini kami mengasumsikan paket bernama `ocr` sudah terpasang via `pip install ocr-lib`. +* Familiaritas dasar dengan sistem impor Python dan pencetakan. + +Jika Anda belum memiliki perpustakaan tersebut, jalankan: + +```bash +pip install ocr-lib +``` + +Itu saja—tidak ada binari tambahan atau dependensi native yang diperlukan untuk kueri set karakter dasar yang akan kami lakukan. 
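Before moving on, it can help to confirm that the module is actually importable. The sketch below uses only the Python standard library. It assumes, as the prerequisites above do, that the installed package exposes a top-level module named `ocr`; `is_importable` is a hypothetical helper name and not part of any OCR library.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "ocr" is the module name this tutorial assumes; adjust if yours differs.
    if is_importable("ocr"):
        print("The ocr module is available.")
    else:
        print("The ocr module was not found; run the pip command above first.")
```

Checking with `find_spec` avoids executing the package's import-time code, so the check stays cheap even if the engine itself is heavy to load.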
+
+## Langkah 1: Inisialisasi Instance Mesin OCR
+
+Membuat objek mesin adalah hal pertama yang Anda lakukan ketika ingin **use OCR engine**. Anggaplah itu seperti menyalakan pemindai; tanpa itu, Anda tidak dapat mengajukan pertanyaan apa pun.
+
+```python
+import ocr
+
+# Step 1: Create an OCR engine instance
+engine = ocr.OcrEngine()
+```
+
+Kelas `OcrEngine` adalah pembungkus ringan di atas mesin pengenalan yang mendasarinya. Ia tidak memulai pemrosesan berat sampai Anda meminta sesuatu, sehingga biaya startupnya dapat diabaikan.
+
+> **Tip pro:** Jika aplikasi Anda membuat banyak instance mesin, gunakan kembali satu instance saja. Ini menghemat memori dan mengurangi latensi inisialisasi.
+
+## Langkah 2: Menanyakan Set Karakter OCR untuk Skrip Tertentu
+
+Setelah memiliki mesin, kita dapat menanyakan karakter apa saja yang dikenalinya untuk sistem penulisan tertentu. Metode `get_supported_characters(script_name)` mengembalikan `list` Python berisi karakter Unicode.
+
+```python
+# Step 2: Get the list of characters the engine can recognize for the Latin script
+latin_chars = engine.get_supported_characters("Latin")
+print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}")
+```
+
+Menjalankan potongan kode di atas mencetak sesuatu seperti:
+
+```
+Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+```
+
+Output tepatnya tergantung pada versi pustaka OCR, tetapi Anda akan selalu mendapatkan daftar datar karakter. Jika Anda membutuhkan huruf besar dan kecil, cukup minta skrip `"Latin"` lalu terapkan `.lower()` pada setiap elemen, atau tanyakan skrip terpisah `"Latin‑lower"` jika pustaka membedakannya.
+
+### Mengapa ini penting?
+
+Ketika Anda memberi gambar yang berisi diakritik tidak biasa (mis., “ñ” atau “ø”), mesin OCR mungkin secara diam-diam menggantinya dengan placeholder atau menghapusnya sepenuhnya.
Mengetahui **ocr character set** sebelumnya memungkinkan Anda memvalidasi input terlebih dahulu, memberi peringatan kepada pengguna, atau beralih ke mesin lain.
+
+## Langkah 3: Mengambil Karakter untuk Skrip Lain – Contoh Cyrillic
+
+Metode yang sama bekerja untuk skrip apa pun yang diklaim didukung oleh mesin. Mari lihat apa yang terjadi dengan Cyrillic.
+
+```python
+# Step 3: Get the list of characters the engine can recognize for the Cyrillic script
+cyrillic_chars = engine.get_supported_characters("Cyrillic")
+print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}")
+```
+
+Output tipikal:
+
+```
+Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я']
+```
+
+Jika mesin **tidak** mendukung skrip yang diminta, biasanya ia mengeluarkan `ValueError` (atau mengembalikan daftar kosong tergantung implementasinya). Mari lindungi dari hal itu.
+
+```python
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+# Example usage:
+safe_get_characters(engine, "Arabic")  # Might trigger the error branch
+```
+
+## Langkah 4: Visualisasikan Set Karakter OCR (Opsional)
+
+Terkadang visual cepat membantu, terutama ketika Anda perlu menunjukkan kepada pemangku kepentingan karakter mana yang tercakup. Di bawah ini contoh minimal menggunakan `matplotlib` untuk menampilkan grid karakter.
+
+```python
+import matplotlib.pyplot as plt
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols  # ceiling division
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Visualize Latin characters
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+> **Teks alt gambar:** ![diagram set karakter OCR](ocr_character_set.png){alt="ikhtisar set karakter OCR"}
+
+Diagram tidak diperlukan untuk solusi inti, tetapi ini menunjukkan bagaimana Anda dapat memanfaatkan metadata **use OCR engine** di luar pencetakan biasa.
+
+## Langkah 5: Mengintegrasikan Set Karakter ke dalam Alur Kerja Dunia Nyata
+
+Sekarang kita dapat mengambil **ocr character set**, mari lihat skenario praktis: memvalidasi output OCR sebelum menyimpannya ke dalam basis data.
+
+```python
+def validate_ocr_output(text, allowed_chars):
+    """Return True if every character in `text` exists in `allowed_chars`."""
+    invalid = [c for c in text if c not in allowed_chars]
+    if invalid:
+        print(f"Invalid characters found: {invalid}")
+        return False
+    return True
+
+# Suppose we ran OCR on an image and got this result:
+ocr_result = "HELLO WORLD!"
+
+# Use the previously fetched Latin set for validation.
+# With the 26 uppercase letters from the sample output above, the space and "!"
+# count as unsupported, so this example takes the else branch.
+if validate_ocr_output(ocr_result, set(latin_chars)):
+    print("All characters are supported – safe to store.")
+else:
+    print("Result contains unsupported symbols – consider manual review.")
+```
+
+Pola ini mencegah data sampah masuk ke pipeline hilir, titik masalah umum saat menangani dokumen multibahasa.
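One caveat for the validation pattern above: the same visible character can be encoded in composed or decomposed form, and a plain membership test treats the two as different. The following sketch normalizes both sides to NFC with the standard `unicodedata` module before checking. `validate_normalized` and the sample allowed set are illustrative assumptions, not part of the OCR library.

```python
import unicodedata


def validate_normalized(text, allowed_chars):
    """Return the characters of `text` that are not in `allowed_chars`.

    Both the text and the allowed set are normalized to NFC first, so a
    decomposed "E + combining acute" matches a composed "É".
    """
    allowed = {unicodedata.normalize("NFC", c) for c in allowed_chars}
    normalized = unicodedata.normalize("NFC", text)
    return [c for c in normalized if c not in allowed]


if __name__ == "__main__":
    allowed = set("CAF\u00c9")      # composed É (U+00C9)
    decomposed = "CAFE\u0301"       # E followed by a combining acute accent
    print(validate_normalized(decomposed, allowed))  # prints []
```

Returning the list of offending characters instead of a boolean makes it easier to show the user exactly what was rejected; an empty list means the text passed.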
+ +## Kesalahan Umum dan Kasus Pinggir + +| Kesalahan | Mengapa Terjadi | Cara Menghindari | +|-----------|-----------------|------------------| +| **Typo nama skrip** (mis., `"Cyrillic "` dengan spasi di akhir) | Mesin memperlakukan string secara literal dan tidak dapat menemukan skrip. | Hilangkan spasi: `script.strip()` sebelum memanggil `get_supported_characters`. | +| **Daftar karakter kosong** | Beberapa mesin menampilkan skrip tetapi belum memuat model bahasa. | Panggil `engine.load_language_model(script)` jika perpustakaan menyediakan metode tersebut, atau pastikan file model tersedia. | +| **Masalah normalisasi Unicode** | Karakter seperti “é” dapat muncul sebagai terkomposisi (`\u00E9`) atau terdekomposisi (`e\u0301`). | Normalisasi string dengan `unicodedata.normalize('NFC', text)` sebelum validasi. | +| **Kinerja pada skrip besar** (mis., Chinese) | Mengambil ribuan karakter dapat menjadi lambat. | Cache hasil setelah panggilan pertama; kebanyakan aplikasi hanya membutuhkan daftar sekali. | + +## Contoh Lengkap yang Berfungsi + +Menggabungkan semuanya, berikut satu skrip yang dapat Anda salin‑tempel dan jalankan: + + + +## Apa yang Harus Anda Pelajari Selanjutnya? + +Tutorial berikut mencakup topik terkait erat yang membangun teknik yang ditunjukkan dalam panduan ini. Setiap sumber mencakup contoh kode lengkap yang berfungsi dengan penjelasan langkah demi langkah untuk membantu Anda menguasai fitur API tambahan dan mengeksplorasi pendekatan implementasi alternatif dalam proyek Anda. 
+ +- [Tutorial Aspose OCR – Pengenalan Karakter Optik](/ocr/english/) +- [Pemrosesan Pasca OCR – Dapatkan Pilihan Karakter](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [Cara Mengatur Nilai Ambang pada Pengenalan Gambar OCR](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/indonesian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/indonesian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..483fc1d82 --- /dev/null +++ b/ocr/indonesian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,256 @@ +--- +category: general +date: 2026-06-06 +description: Lakukan OCR pada gambar menggunakan Aspose OCR dan model Hugging Face. + Pelajari cara mengunduh model Hugging Face, mengekstrak teks dari faktur, dan membebaskan + sumber daya GPU. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: id +og_description: Lakukan OCR pada gambar menggunakan Aspose OCR dan model Hugging Face. + Tutorial ini menunjukkan cara mengunduh model, mengekstrak teks dari faktur, dan + membebaskan sumber daya GPU. +og_title: Lakukan OCR pada Gambar dengan Aspose OCR & LLM – Panduan Lengkap +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Lakukan OCR pada Gambar dengan Aspose OCR & LLM – Panduan Lengkap +url: /id/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Lakukan OCR pada Gambar dengan Aspose OCR & LLM – Panduan Lengkap + +Pernah ingin **perform OCR on image** file tetapi merasa terjebak pada pertanyaan “dari mana saya harus memulai?”? Anda tidak sendirian—banyak pengembang mengalami hal yang sama ketika pertama kali menangani otomasi dokumen. Kabar baiknya, dengan Aspose OCR dan LLM ringan dari Hugging Face, Anda dapat mengubah pemindaian mentah faktur menjadi teks bersih yang dapat dicari hanya dalam beberapa baris Python. + +Dalam tutorial ini kami akan membahas semua yang Anda perlukan: mulai dari **loading image for OCR**, hingga **downloading the Hugging Face model**, hingga **extracting text from invoice** data, dan akhirnya **free GPU resources** agar aplikasi Anda tetap ringan. Pada akhir tutorial Anda akan memiliki skrip mandiri yang dapat Anda masukkan ke proyek mana pun. + +--- + +## Apa yang Akan Anda Pelajari + +- Cara **perform OCR on image** menggunakan `OcrEngine` milik Aspose. +- Langkah tepat untuk **download Hugging Face model** secara otomatis. +- Teknik **extract text from invoice** dari PDF atau PNG dengan pemrosesan lanjutan berbasis AI. +- Praktik terbaik untuk **free GPU resources** setelah inferensi. +- Tips **load image for OCR** secara efisien dan menghindari jebakan umum. + +Tidak diperlukan dokumentasi eksternal—semua yang Anda butuhkan ada di sini, lengkap dengan kode, penjelasan, dan output yang diharapkan. 
+ +--- + +## Prasyarat + +Sebelum kita mulai, pastikan Anda memiliki: + +| Requirement | Reason | +|-------------|--------| +| Python 3.9+ | Sintaks modern dan type hints | +| `asposeocr` package (`pip install asposeocr`) | Mesin OCR inti | +| Akses ke GPU (opsional tetapi direkomendasikan) | Mempercepat post‑processor LLM | +| Gambar faktur (`sample_invoice.png`) | Kasus uji dunia nyata | + +Jika ada yang belum terpasang, instal sekarang; skrip juga akan **download Hugging Face model** secara otomatis, jadi Anda tidak perlu mencari file secara manual. + +--- + +## Langkah 1: Perform OCR on Image – Buat Engine + +Hal pertama yang harus Anda lakukan adalah memulai engine OCR Aspose. Anggap saja ini seperti membuka kanvas kosong di mana gambar nanti akan diubah menjadi teks. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Mengapa ini penting:** `OcrEngine` mengabstraksi semua pra‑pemrosesan gambar tingkat rendah, sehingga Anda dapat fokus pada alur kerja tingkat tinggi. Engine ini juga menyediakan metode `set_post_processor` yang nantinya akan menghubungkan LLM untuk output yang lebih cerdas. + +--- + +## Langkah 2: Load Image for OCR – Pilih File yang Tepat + +Setelah engine ada, kita perlu **load image for OCR**. Aspose mendukung PNG, JPG, TIFF, dan beberapa format lainnya. Pastikan path bersifat absolut atau relatif terhadap lokasi skrip Anda. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Tip:** Jika gambar Anda berukuran besar, pertimbangkan untuk mengubah ukurannya terlebih dahulu guna mengurangi tekanan memori. Engine OCR dapat menangani pemindaian resolusi tinggi, tetapi gambar 300 DPI biasanya merupakan titik optimal untuk faktur. 
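The resize tip above comes down to simple arithmetic: scale both sides by the same factor so that the longer side fits a limit. A minimal sketch follows. `fit_within` and the 2000 px limit are illustrative assumptions (the tutorial names no specific limit), and the actual pixel resampling would be done by an imaging library such as Pillow before calling `load_image`.

```python
def fit_within(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most `max_side`.

    The aspect ratio is preserved; images that already fit are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to the nearest pixel, but never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(4000, 3000, 2000))  # prints (2000, 1500)
```

Computing the target size separately keeps the decision testable without loading any image data.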
+ +--- + +## Langkah 3: Perform Raw OCR dan Lihat Teks yang Diekstrak + +Dengan gambar sudah dimuat, kita akhirnya dapat **perform OCR on image** dan melihat apa yang dihasilkan oleh engine mentah. Langkah ini memberi kita baseline sebelum menambahkan sentuhan AI. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Output yang diharapkan (dipotong):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +Output mentah sering kali berisi pemisah baris, karakter yang salah dikenali, atau bidang yang hilang—itulah mengapa kita akan menambahkan model bahasa pada langkah berikutnya. + +--- + +## Langkah 4: Download Hugging Face Model – Konfigurasikan Post‑Processor LLM + +Inilah saat **download Hugging Face model** menjadi sorotan. Aspose AI dapat secara otomatis mengambil model dari hub Hugging Face jika belum ada di disk. Kita akan menggunakan model Qwen2.5‑3B‑Instruct‑GGUF, yang menyeimbangkan akurasi dan jejak memori. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Mengapa ini berhasil:** `allow_auto_download` menghilangkan kebutuhan mengunduh file `.gguf` secara manual. Kuantisasi (`int8`) mengurangi ukuran model menjadi sekitar 3 GB, membuatnya dapat dijalankan pada kebanyakan GPU konsumen. Sesuaikan `gpu_layers` berdasarkan perangkat keras Anda—lebih banyak lapisan di GPU = inferensi lebih cepat. 
+ +--- + +## Langkah 5: Extract Text from Invoice Menggunakan Post‑Processing AI‑Enhanced + +Sekarang kita mengaitkan LLM ke engine OCR dan menjalankan **post‑processor** yang membersihkan output mentah, memperbaiki kesalahan OCR, serta memformat bidang faktur dengan rapi. + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Contoh output yang ditingkatkan:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **Apa yang terjadi?** LLM mengenali bahwa “Invoice #12345” seharusnya “Invoice Number: 12345”, memperbaiki format tanggal, dan bahkan menebak bidang “Bill To” yang terlewat oleh engine mentah. Inilah inti dari otomatisasi **extract text from invoice**. + +--- + +## Langkah 6: Free GPU Resources – Bersihkan Setelah Pemrosesan + +Jika Anda menjalankan ini dalam layanan yang berjalan lama (misalnya API Flask), Anda harus **free GPU resources** setelah setiap inferensi untuk menghindari crash karena kehabisan memori. Aspose AI menyediakan metode sederhana untuk itu. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro tip:** Panggil `free_resources()` di dalam blok `finally:` jika Anda membungkus panggilan OCR dalam `try/except`. Hal ini menjamin pembersihan bahkan ketika terjadi pengecualian. + +--- + +## Langkah 7: Skrip Lengkap – Gabungkan Semua + +Berikut adalah skrip lengkap yang siap dijalankan. Salin‑tempel, sesuaikan path, dan Anda siap. 
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Create OCR engine
+engine = ocr.OcrEngine()
+
+# 2️⃣ Load image for OCR
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+
+# 3️⃣ Raw OCR
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+
+# 4️⃣ Download Hugging Face model (auto‑download enabled)
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+
+# 5️⃣ Attach AI post‑processor and enhance text
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)  # optional in newer versions
+engine.set_post_processor(ai_engine)
+
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+
+# 6️⃣ Free GPU resources
+ai_engine.free_resources()
+engine.dispose()
+```
+
+Jalankan skrip dan saksikan transformasi dari OCR berisik menjadi data faktur yang bersih dan terstruktur. 🎉
+
+---
+
+## Pertanyaan Umum & Kasus Edge
+
+| Question | Answer |
+|----------|--------|
+| **What if the model fails to download?** | Ensure your machine has internet access and that the `hugging_face_repo_id` is correct. You can also manually download the
+
+## Apa yang Harus Anda Pelajari Selanjutnya?
+
+Tutorial berikut mencakup topik terkait yang membangun teknik yang ditunjukkan dalam panduan ini. Setiap sumber menyertakan contoh kode lengkap dengan penjelasan langkah demi langkah untuk membantu Anda menguasai fitur API tambahan dan mengeksplorasi pendekatan implementasi alternatif dalam proyek Anda sendiri.
+ +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/indonesian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/indonesian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..2d1c47ddb --- /dev/null +++ b/ocr/indonesian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,277 @@ +--- +category: general +date: 2026-06-06 +description: Mengenali teks dari gambar dengan Aspose OCR – pelajari cara memuat gambar + untuk OCR dan melakukan OCR pada gambar menggunakan pemrosesan pasca‑AI di Python. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: id +og_description: Mengenali teks dari gambar dengan cepat. Panduan ini menunjukkan cara + memuat gambar untuk OCR, melakukan OCR pada gambar, dan meningkatkan hasil dengan + pemrosesan pasca‑AI. +og_title: Mengenali teks dari gambar menggunakan Aspose OCR & AI +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? 
+ - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Mengenali teks dari gambar menggunakan Aspose OCR & AI +url: /id/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# mengenali teks dari gambar menggunakan Aspose OCR & AI + +Pernah membutuhkan untuk mengenali teks dari gambar tetapi tidak yakin perpustakaan mana yang memberikan kecepatan dan akurasi? Anda tidak sendirian. Dalam panduan ini kami akan membahas contoh lengkap end‑to‑end yang menunjukkan **cara memuat gambar untuk OCR**, **melakukan OCR pada gambar**, dan kemudian memoles output dengan post‑processor AI Aspose. Pada akhir Anda akan memiliki skrip siap‑jalankan yang mengubah PNG menjadi teks bersih yang dapat dicari. + +## Apa yang akan Anda pelajari + +Kami akan membahas semuanya mulai dari menginstal paket Aspose OCR hingga melepaskan sumber daya di akhir proses. Anda akan melihat mengapa mengaktifkan pengenalan teks tulisan tangan penting, cara mengonfigurasi Qwen 2.5 LLM untuk post‑processing, dan seperti apa output akhir. Tidak diperlukan referensi eksternal—cukup salin, tempel, dan jalankan. 
+ +### Prasyarat + +- Python 3.8 atau lebih baru +- paket `asposeocr` (`pip install asposeocr`) +- File gambar (misalnya `doc.png`) yang berisi teks cetak atau tulisan tangan +- Opsional: GPU untuk inferensi LLM yang lebih cepat (skrip juga berjalan di CPU) + +--- + +## Mengenali teks dari gambar – Langkah‑per‑Langkah + +Di bawah setiap blok kode Anda akan menemukan penjelasan singkat tentang **mengapa** kami melakukan tindakan tertentu itu, bukan sekadar **apa** yang dilakukan baris tersebut. + +### Langkah 1: Instal dan impor modul yang diperlukan + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Mengapa?* Mengimpor `asposeocr` memberi kita kelas `OcrEngine`, sementara submodul `ai` menyediakan post‑processor berbasis LLM yang secara dramatis meningkatkan output OCR mentah. + +### Langkah 2: Buat mesin OCR dan aktifkan pengenalan teks tulisan tangan + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Mengaktifkan pengenalan tulisan tangan memperluas set karakter mesin, sehingga Anda tidak akan kehilangan coretan ketika Anda **melakukan OCR pada gambar** yang berisi teks cetak dan kursif campuran. + +### Langkah 3: Muat gambar untuk OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +Pemanggilan `load_image` adalah momen di mana Anda **memuat gambar untuk OCR**; jika jalurnya salah, mesin akan mengeluarkan pengecualian informatif, menyelamatkan Anda dari kesalahan downstream yang membingungkan. 
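The path check mentioned in Step 3 can also be done up front with the standard library, so that a wrong path fails before any OCR call is made. The following is a minimal sketch and not part of the Aspose API: `resolve_image_path` and `ALLOWED_SUFFIXES` are hypothetical names chosen for illustration, and the list of extensions is an assumption to adjust to your own inputs.

```python
from pathlib import Path

# Assumption: these extensions are listed for illustration only.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def resolve_image_path(raw_path: str) -> Path:
    """Return a validated Path, or raise before the OCR engine is involved."""
    path = Path(raw_path).expanduser()
    # Fail early with a clear message if the file is missing.
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    # Reject files whose extension is not in the allowed set (case-insensitive).
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unexpected image extension: {path.suffix!r}")
    return path
```

In the tutorial's script, the validated value could then be handed to the loader as `str(resolve_image_path("YOUR_DIRECTORY/doc.png"))`.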
+ +### Langkah 4: Jalankan proses OCR mentah + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +Pada tahap ini Anda mendapatkan objek `RecognitionResult` yang berisi teks yang belum difilter, skor kepercayaan, dan metadata tata letak. Biasanya hasilnya berisik—oleh karena itu diperlukan pembersihan berbasis AI. + +### Langkah 5: Konfigurasikan model AI Aspose untuk post‑processing LLM + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Mengapa repot dengan pengaturan ini? +- **auto‑download** menjamin model tersedia pada run pertama. +- **int8 quantization** mengurangi kebutuhan memori tanpa mengorbankan akurasi secara signifikan. +- **gpu_layers** memungkinkan Anda memanfaatkan GPU yang kompatibel untuk inferensi yang lebih cepat. + +### Langkah 6: Inisialisasi prosesor AI dan lampirkan sebagai post‑processor + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +Melampirkan prosesor berarti setiap kali Anda memanggil `run_postprocessor`, LLM akan membersihkan ejaan, menggabungkan kata yang terputus, dan bahkan menebak tanda baca yang hilang. + +### Langkah 7: Jalankan post‑processor untuk meningkatkan output OCR + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` biasanya jauh lebih mudah dibaca dibandingkan string mentah—anggaplah sebagai pemeriksa ejaan yang juga memahami konteks. 
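To see what the post-processor in Step 7 actually changed, the raw and enhanced strings can be compared line by line. This is a small sketch using only the standard library `difflib` module; `summarize_changes` is a hypothetical helper, and the sample strings are stand-ins for `raw_result.text` and `enhanced_result.text`.

```python
import difflib
from typing import List


def summarize_changes(raw_text: str, enhanced_text: str) -> List[str]:
    """Return unified-diff lines showing what post-processing changed."""
    return list(
        difflib.unified_diff(
            raw_text.splitlines(),
            enhanced_text.splitlines(),
            fromfile="raw",
            tofile="enhanced",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Stand-in strings; replace them with the two result texts from the tutorial.
    raw = "Inv0ice #12345\nTotal 1250"
    enhanced = "Invoice #12345\nTotal 1250"
    for line in summarize_changes(raw, enhanced):
        print(line)
```

An empty list means the post-processor left the text untouched, which is a cheap signal for deciding whether the LLM pass is worth its cost on a given batch.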
+ +### Langkah 8: Lepaskan sumber daya + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Membersihkan sangat penting dalam layanan yang berjalan lama; jika tidak, Anda akan mengalami kebocoran memori GPU dan akhirnya aplikasi Anda akan crash. + +## Skrip lengkap yang dapat Anda jalankan hari ini + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Output yang diharapkan** (contoh untuk gambar faktur sederhana): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Perhatikan bagaimana lapisan AI memperbaiki “Inv0ice” → “Invoice” dan menambahkan tanda baca yang hilang. 
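The cleanup in Step 8 only runs if nothing before it raises, so a failed recognition would skip both release calls. One way to guarantee they run is to register them on a `contextlib.ExitStack`. The sketch below is generic: `run_with_cleanup` is a hypothetical helper, and it takes plain callables, so nothing here assumes a particular Aspose signature.

```python
from contextlib import ExitStack


def run_with_cleanup(work, *cleanups):
    """Run work(), then call every cleanup callable, even if work() raises.

    Cleanups run in reverse registration order (last registered, first called).
    """
    with ExitStack() as stack:
        for cleanup in cleanups:
            stack.callback(cleanup)
        return work()


if __name__ == "__main__":
    log = []
    result = run_with_cleanup(
        lambda: "recognized text",          # stand-in for the OCR pipeline
        lambda: log.append("dispose"),      # stand-in for the engine cleanup
        lambda: log.append("free"),         # stand-in for the AI model cleanup
    )
    print(result, log)
```

With the tutorial's objects this would read `run_with_cleanup(pipeline, ocr_engine.dispose, ai_processor.free_resources)`, where `pipeline` is a function wrapping Steps 3 to 7. Because callbacks run last-registered first, `free_resources` is called before `dispose`, matching the order used in Step 8.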
+ +## Pertanyaan yang sering diajukan (dan jawaban singkat) + +- **Apakah saya membutuhkan GPU?** Tidak. Skrip akan kembali ke CPU, tetapi inferensi akan lebih lambat. +- **Bisakah saya menggunakan LLM yang berbeda?** Tentu—cukup ubah `hugging_face_repo_id` dan sesuaikan `gpu_layers` sesuai kebutuhan. +- **Bagaimana jika gambar saya sangat besar?** Ubah ukurannya terlebih dahulu (misalnya dengan Pillow) agar penggunaan memori tetap wajar. +- **Apakah pengenalan tulisan tangan selalu aktif?** Anda dapat mengaktifkan atau menonaktifkan `enable_handwritten_recognition` tergantung pada beban kerja Anda. + +## Kesimpulan + +Anda sekarang tahu cara **mengenali teks dari gambar** menggunakan Aspose OCR, cara **memuat gambar untuk OCR**, dan cara **melakukan OCR pada gambar** dengan post‑processing yang ditingkatkan AI. Contoh lengkap yang dapat dijalankan di atas memberi Anda fondasi yang kuat untuk mengintegrasikan OCR ke dalam proyek Python apa pun—baik itu memindai struk, mendigitalkan kontrak, atau mengekstrak data dari formulir tulisan tangan. + +Siap untuk langkah selanjutnya? Coba ganti model Qwen dengan yang lebih besar, bereksperimen dengan skema kuantisasi yang berbeda, atau rangkaikan beberapa gambar bersama untuk pemrosesan batch. Kemungkinannya tak terbatas, dan kode yang baru saja Anda buat akan menangani semuanya dengan elegan. + +Selamat coding, dan semoga hasil OCR Anda selalu jernih seperti kristal! + +![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="Tangkapan layar yang menunjukkan cara mengenali teks dari gambar menggunakan Aspose OCR"} + +## Apa yang Harus Anda Pelajari Selanjutnya? + +Tutorial berikut mencakup topik yang sangat terkait yang membangun teknik yang ditunjukkan dalam panduan ini. 
Setiap sumber menyertakan contoh kode lengkap yang berfungsi dengan penjelasan langkah‑demi‑langkah untuk membantu Anda menguasai fitur API tambahan dan mengeksplorasi pendekatan implementasi alternatif dalam proyek Anda sendiri. + +- [Ekstrak Teks dari Gambar dengan Aspose OCR – Panduan Langkah‑per‑Langkah](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Konversi Gambar ke Teks – Lakukan OCR pada Gambar dari URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Ekstrak teks gambar C# dengan pemilihan bahasa menggunakan Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/italian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/italian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..5e4579a12 --- /dev/null +++ b/ocr/italian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,214 @@ +--- +category: general +date: 2026-06-06 +description: Estrai il testo da un'immagine scritta a mano usando OCR in Python. Scopri + come convertire rapidamente e in modo affidabile una foto di scrittura a mano in + testo. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: it +og_description: Estrai il testo da un'immagine scritta a mano con Python. Questa guida + mostra come convertire una foto scritta a mano in testo e risponde a come riconoscere + il testo scritto a mano. 
+og_title: Estrai testo da immagine scritta a mano – Tutorial OCR Python +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. + headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Estrai il testo da un'immagine manoscritta con OCR Python – Guida passo passo +url: /it/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Estrai testo da immagine scritta a mano con Python OCR – Guida passo‑passo + +Ti sei mai chiesto **come riconoscere il testo scritto a mano** in una foto scattata con il tuo telefono? Non sei il solo. In molti progetti—che si tratti di digitalizzare appunti di lezione o estrarre dati da moduli firmati—hai bisogno di **estrarre testo da immagine scritta a mano** rapidamente e senza problemi. + +In questo tutorial ti guideremo attraverso un esempio completo, pronto‑da‑eseguire, che mostra esattamente come **convertire foto scritta a mano in testo** usando una popolare libreria Python OCR. Nessun riferimento vago, solo codice concreto, spiegazioni e consigli che puoi copiare‑incollare subito. 
+ +![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image") + +## Cosa ti servirà + +| Requisito | Perché è importante | +|-------------|----------------| +| Python 3.9 o più recente | Sintassi moderna e supporto alle librerie | +| `pip` (gestore pacchetti Python) | Per installare il pacchetto OCR | +| Un'immagine chiara di appunti scritti a mano (JPEG/PNG) | L'accuratezza dell'OCR diminuisce con immagini sfocate | +| Familiarità di base con le funzioni Python | Ti aiuta ad adattare l'esempio in seguito | + +Se ti manca qualcuno di questi requisiti, scarica l'ultima versione di Python dal sito ufficiale e installala. + +## Installa la libreria Python OCR + +Il frammento di codice che utilizzeremo si basa sul pacchetto `ocr` (un leggero wrapper attorno a Tesseract che aggiunge impostazioni pratiche). Installalo con un solo comando: + +```bash +pip install ocr +``` + +> **Pro tip:** Dopo l'installazione, esegui `tesseract --version` nel terminale per confermare che il motore sottostante sia presente. Se non hai Tesseract, segui la guida ufficiale per il tuo OS; la maggior parte dei gestori di pacchetti lo include (`apt-get install tesseract-ocr` su Ubuntu, `brew install tesseract` su macOS). + +## Passo 1: Crea un'istanza del motore OCR + +Creare il motore è il primo mattone nel muro del **python ocr handwritten recognition**. Pensa al motore come al cervello che più tardi leggerà gli scarabocchi. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Perché è importante: senza un motore non puoi modificare la pipeline di riconoscimento. Le impostazioni predefinite sono ottimizzate per testo stampato, quindi dovremo regolarle nel passo successivo.
+ +## Passo 2: Abilita il riconoscimento della scrittura a mano + +Per impostazione predefinita il motore assume caratteri stampati. Abilitare la modalità handwritten attiva un interruttore che indica a Tesseract di usare il suo modello LSTM addestrato su tratti corsivi. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Cosa succede se lo salti?** L'OCR tratterà i tratti come rumore, producendo output incomprensibili. Abilitare il flag è il fulcro di **come riconoscere il testo scritto a mano**. + +## Passo 3: Carica la tua foto scritta a mano + +Ora puntiamo il motore al file immagine. Il percorso può essere assoluto o relativo; assicurati solo che il file esista. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Un rapido controllo di coerenza: apri l'immagine nel visualizzatore del tuo OS. Se il testo è sfocato, considera un pre‑processing (aumento del contrasto, rotazione) prima di passarla al motore—questi aggiustamenti spesso aumentano il tasso di successo di **extract text from handwritten image**. + +## Passo 4: Esegui il processo di riconoscimento + +Con tutto pronto, chiediamo finalmente al motore di fare il suo lavoro. Il metodo `recognize()` restituisce un oggetto risultato che contiene la stringa estratta e i punteggi di confidenza. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Dietro le quinte, il motore converte il bitmap in una serie di vettori di caratteristiche, li elabora attraverso la rete LSTM e unisce i caratteri. Questa è la magia del **python ocr handwritten recognition**. + +## Passo 5: Visualizza il testo estratto + +L'oggetto risultato espone un attributo `.text` che contiene la stringa Unicode semplice. Stampalo, scrivilo su un file o passalo a un'altra pipeline—a te la scelta. 
+ +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Output previsto + +Se l'immagine di origine contiene la nota “Buy milk, eggs, and bread”, vedrai qualcosa di simile: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Nota come l'output conserva la punteggiatura e le interruzioni di riga (se presenti). Se ottieni nonsense, ricontrolla la qualità dell'immagine e il flag `enable_handwritten_recognition`. + +## Gestione dei problemi comuni + +| Problema | Sintomo | Soluzione | +|----------|---------|-----------| +| Punteggi di bassa confidenza | Molti “?” o caratteri senza senso | Aumenta la DPI dell'immagine a ≥300, applica binarizzazione (`opencv`) o ritaglia l'area di interesse. | +| Lingue miste | L'output mescola inglese e un altro script | Imposta `engine.ocr_settings.language = "eng"` (o un altro codice ISO) prima di `recognize()`. | +| File di grandi dimensioni | Lungo tempo di elaborazione o errore di memoria | Ridimensiona l'immagine a una dimensione ragionevole (es. larghezza massima 1200 px) prima del caricamento. | +| Tesseract mancante | `ImportError` o `FileNotFoundError` | Installa Tesseract separatamente e assicurati che sia nel PATH di sistema. | + +Questi aggiustamenti mantengono il tuo workflow **convert handwritten photo to text** robusto su dataset eterogenei. + +## Script completo che puoi eseguire oggi + +Di seguito trovi il programma completo e autonomo che mette insieme tutti i pezzi. Copialo in un file chiamato `handwritten_ocr.py` ed esegui `python handwritten_ocr.py`. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. 
+""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Eseguilo e vedrai il testo stampato sulla console—esattamente il risultato di cui hai bisogno quando vuoi **convert handwritten photo to text**. + +## Approfondimenti + +Ora che hai padroneggiato le basi di **extract text from handwritten image**, considera i prossimi passi: + +- **Elaborazione batch:** Scorri una cartella di immagini e salva ogni risultato in un file CSV. +- **Post‑processing:** Usa espressioni regolari per pulire gli errori OCR comuni (es. “1” vs “l”). +- **Integrazione:** Inserisci le stringhe estratte in una pipeline di Natural Language Processing per analisi del sentiment o estrazione di parole chiave. +- **Librerie alternative:** Se ti serve maggiore precisione, esplora `easyocr` o `pytesseract` con modelli LSTM personalizzati—entrambi supportano **python ocr handwritten recognition**. + +Ricorda, la qualità dell'immagine di origine spesso determina il successo, quindi dedica qualche minuto al pre‑processing. Un piccolo sforzo ora ti salva da molte ore di debug in seguito. + +## Conclusione + +Abbiamo attraversato un esempio completo, end‑to‑end, che mostra **come riconoscere il testo scritto a mano** e, soprattutto, **come estrarre testo da immagine scritta a mano** usando Python. Installando il pacchetto `ocr`, attivando il flag handwritten, caricando la tua foto e chiamando `recognize()`, puoi **convert handwritten photo to text** in poche righe di codice. 
+ +Provalo con i tuoi appunti, modifica i passaggi di pre‑processing e lascia che l'OCR faccia il lavoro pesante. Se incontri difficoltà, consulta di nuovo la tabella “Gestione dei problemi comuni” o sperimenta con altri back‑end OCR. Buon coding, e che i tuoi dati scritti a mano diventino subito ricercabili! + +## Cosa dovresti imparare dopo? + +I seguenti tutorial coprono argomenti strettamente correlati che si basano sulle tecniche dimostrate in questa guida. Ogni risorsa include esempi di codice completi e funzionanti con spiegazioni passo‑passo per aiutarti a padroneggiare funzionalità API aggiuntive ed esplorare approcci di implementazione alternativi nei tuoi progetti. + +- [Estrai testo da immagine con Aspose OCR – Guida passo‑passo](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Come riconoscere i rettangoli di pagina per il riconoscimento del testo OCR in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [Come usare OCR - Riconoscere l'immagine senza rilevamento dell'area di testo](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/italian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/italian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..a3173bff7 --- /dev/null +++ b/ocr/italian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,257 @@ +--- +category: general +date: 2026-06-06 +description: 'Guida al set di caratteri OCR: impara come utilizzare il motore OCR + per elencare i caratteri supportati per gli script latino e cirillico con esempi + Python
completi.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: it +og_description: 'Guida al set di caratteri OCR: scopri come utilizzare il motore OCR + in Python per elencare i caratteri supportati per vari script, completa di codice + e consigli.' +og_title: Set di caratteri OCR – Recupera e utilizza il motore OCR +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' + headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: Set di caratteri OCR – Recupera e utilizza il motore OCR in Python +url: /it/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Set di Caratteri OCR – Recupera e Usa il Motore OCR in Python + +Vuoi conoscere il **ocr character set** supportato dalla tua libreria? In questo tutorial ti mostreremo come **use OCR engine** per interrogare i caratteri supportati per diversi script e perché è importante per progetti reali. + +Se ti sei mai trovato davanti a uno schermo vuoto chiedendoti perché alcune lettere non compaiono mai dopo l'elaborazione OCR, non sei solo. La risposta è spesso nascosta nel set di caratteri che il motore conosce. Alla fine di questa guida sarai in grado di: + +* Elencare ogni carattere che un motore OCR può riconoscere per uno script dato. +* Gestire i casi in cui uno script non è supportato. +* Inserire le informazioni del character‑set nella validazione a valle o nella logica UI. + +Nessun trucco magico di tipo black‑box—solo Python puro, un paio di righe di codice e una breve spiegazione. 
+ +--- + +## Prerequisiti + +Prima di iniziare, assicurati di avere: + +* Python 3.8+ installato (il codice utilizza le f‑string). +* La libreria OCR che fornisce `ocr.OcrEngine`; per questo esempio supporremo che un pacchetto chiamato `ocr` sia già installato tramite `pip install ocr-lib`. +* Familiarità di base con il sistema di import di Python e la stampa. + +Se ti manca la libreria, esegui: + +```bash +pip install ocr-lib +``` + +È tutto: non servono binari aggiuntivi né dipendenze native per le query di base sul character‑set che eseguiremo. + +--- + +## Passo 1: Inizializza l'Istanza del Motore OCR + +Creare un oggetto engine è la prima cosa da fare per usare le funzionalità del motore OCR (**use OCR engine**). Pensalo come accendere uno scanner; senza di esso, non puoi porre domande. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +La classe `OcrEngine` è un wrapper leggero attorno al motore di riconoscimento sottostante. Non avvia elaborazioni pesanti finché non richiedi qualcosa, quindi il costo di avvio è trascurabile. + +> **Suggerimento:** Se la tua applicazione crea molte istanze del motore, riutilizzane una sola: risparmi memoria e riduci la latenza di inizializzazione. + +--- + +## Passo 2: Interroga il Set di Caratteri OCR per uno Script Specifico + +Ora che abbiamo un engine, possiamo chiedergli quali caratteri conosce per un particolare sistema di scrittura. Il metodo `get_supported_characters(script_name)` restituisce una `list` Python di caratteri Unicode.
+ +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +Eseguendo lo snippet sopra stampa qualcosa del genere: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +L'output esatto dipende dalla versione della libreria OCR, ma otterrai sempre una lista piatta di caratteri. Se ti servono sia maiuscole che minuscole, richiedi semplicemente lo script `"Latin"` e poi applica `.lower()` a ciascun elemento, oppure interroga uno script separato `"Latin‑lower"` se la libreria li distingue. + +### Perché è importante? + +Quando fornisci un'immagine contenente diacritici insoliti (ad esempio “ñ” o “ø”), il motore OCR può sostituirli silenziosamente con un segnaposto o eliminarli del tutto. Conoscere in anticipo il **ocr character set** ti consente di pre‑validare l'input, avvisare gli utenti o ricorrere a un motore diverso. + +--- + +## Passo 3: Recupera i Caratteri per un Altro Script – Esempio Cirillico + +Lo stesso metodo funziona per qualsiasi script che il motore dichiara di supportare. Vediamo cosa succede con il cirillico. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +Output tipico: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +Se il motore **non** supporta lo script richiesto, solitamente solleva un `ValueError` (o restituisce una lista vuota a seconda dell'implementazione). Proteggiamoci da questo. 
+ +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## Passo 4: Visualizza il Set di Caratteri OCR (Opzionale) + +A volte una visualizzazione rapida è utile, soprattutto quando devi mostrare agli stakeholder quali caratteri sono coperti. Di seguito un esempio minimale che utilizza `matplotlib` per renderizzare una griglia di caratteri. + +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Testo alternativo dell'immagine:** ![OCR character set diagram](ocr_character_set.png){alt="Panoramica del set di caratteri OCR"} + +Il diagramma non è necessario per la soluzione di base, ma dimostra come puoi **use OCR engine** i metadati oltre la semplice stampa. + +--- + +## Passo 5: Integra il Set di Caratteri nei Flussi di Lavoro Reali + +Ora che possiamo recuperare il **ocr character set**, vediamo uno scenario pratico: convalidare l'output OCR prima di archiviarlo in un database. 
+ +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +Questo schema impedisce che dati spazzatura infiltrino le pipeline a valle, un problema comune quando si gestiscono documenti multilingue. + +--- + +## Problemi Comuni e Casi Limite + +| Problema | Perché accade | Come evitarlo | +|----------|----------------|---------------| +| **Errore di digitazione del nome dello script** (ad es., `"Cyrillic "` con spazio finale) | Il motore interpreta la stringa letteralmente e non riesce a trovare lo script. | Rimuovi gli spazi: `script.strip()` prima di chiamare `get_supported_characters`. | +| **Lista di caratteri vuota** | Alcuni motori espongono uno script ma non hanno ancora caricato il modello linguistico. | Chiama `engine.load_language_model(script)` se la libreria fornisce tale metodo, oppure assicurati che i file del modello siano presenti. | +| **Problemi di normalizzazione Unicode** | Caratteri come “é” possono apparire composti (`\u00E9`) o decomposti (`e\u0301`). | Normalizza le stringhe con `unicodedata.normalize('NFC', text)` prima della convalida. | +| **Prestazioni su script di grandi dimensioni** (ad es., Cinese) | Recuperare migliaia di caratteri può essere lento. | Cache il risultato dopo la prima chiamata; la maggior parte delle applicazioni ha bisogno della lista una sola volta. 
| + +--- + +## Esempio Completo Funzionante + +Mettendo tutto insieme, ecco uno script unico che puoi copiare‑incollare ed eseguire: + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() +``` + + +## Cosa Dovresti Imparare Dopo? + +I seguenti tutorial coprono argomenti strettamente correlati che si basano sulle tecniche dimostrate in questa guida. Ogni risorsa include esempi di codice completi e funzionanti con spiegazioni passo‑passo per aiutarti a padroneggiare funzionalità API aggiuntive ed esplorare approcci di implementazione alternativi nei tuoi progetti.
+ +- [Tutorial Aspose OCR – Riconoscimento Ottico dei Caratteri](/ocr/english/) +- [Post-elaborazione OCR – Ottenere le Scelte dei Caratteri](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [Come Impostare il Valore di Soglia nel Riconoscimento Immagini OCR](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/italian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/italian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..5ada96681 --- /dev/null +++ b/ocr/italian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,256 @@ +--- +category: general +date: 2026-06-06 +description: Esegui OCR su un'immagine usando Aspose OCR e un modello Hugging Face. + Scopri come scaricare il modello Hugging Face, estrarre il testo dalla fattura e + liberare le risorse GPU. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: it +og_description: Esegui OCR su un'immagine usando Aspose OCR e un modello Hugging Face. + Questo tutorial mostra come scaricare il modello, estrarre il testo dalla fattura + e liberare le risorse GPU. +og_title: Esegui OCR su un'immagine con Aspose OCR e LLM – Guida completa +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Esegui OCR su immagine con Aspose OCR e LLM – Guida completa +url: /it/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Esegui OCR su Immagine con Aspose OCR & LLM – Guida Completa + +Hai mai voluto **eseguire OCR su file immagine** ma ti sei bloccato alla domanda “da dove comincio?”? Non sei solo—molti sviluppatori incontrano questo ostacolo quando affrontano per la prima volta l’automazione dei documenti. La buona notizia è che, con Aspose OCR e un LLM leggero di Hugging Face, puoi trasformare una scansione grezza di una fattura in testo pulito e ricercabile in poche righe di Python. + +In questo tutorial percorreremo tutto ciò di cui hai bisogno: dal **caricamento dell’immagine per OCR**, al **download del modello Hugging Face**, all’**estrazione del testo da fattura**, fino a **liberare le risorse GPU** così la tua app rimane leggera. Alla fine avrai uno script autonomo da inserire in qualsiasi progetto. + +--- + +## Cosa Imparerai + +- Come **eseguire OCR su immagine** usando `OcrEngine` di Aspose. +- I passaggi esatti per **scaricare automaticamente i file del modello Hugging Face**. +- Tecniche per **estrarre testo da fattura** in PDF o PNG con post‑processing potenziato dall’AI. +- Best practice per **liberare le risorse GPU** dopo l’inferenza. +- Consigli per **caricare immagine per OCR** in modo efficiente ed evitare errori comuni. + +Nessuna documentazione esterna è necessaria—tutto ciò che ti serve è qui, con codice completo, spiegazioni e output atteso. 
+ +--- + +## Prerequisiti + +Prima di iniziare, assicurati di avere: + +| Requisito | Motivo | +|-----------|--------| +| Python 3.9+ | Sintassi moderna e type hints | +| pacchetto `asposeocr` (`pip install asposeocr`) | Motore OCR principale | +| Accesso a una GPU (opzionale ma consigliato) | Accelera il post‑processor LLM | +| Un’immagine di fattura (`sample_invoice.png`) | Caso di test reale | + +Se ti manca qualcosa, installalo ora; lo script **scaricherà automaticamente il modello Hugging Face**, così non dovrai cercare i file da solo. + +--- + +## Passo 1: Esegui OCR su Immagine – Crea il Motore + +La prima cosa da fare è avviare il motore OCR di Aspose. Pensalo come una tela vuota dove l’immagine verrà poi trasformata in testo. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Perché è importante:** `OcrEngine` astrae tutta la pre‑elaborazione a basso livello, così puoi concentrarti sul flusso di lavoro ad alto livello. Espone anche un metodo `set_post_processor` che più tardi ci permetterà di collegare un LLM per un output più intelligente. + +--- + +## Passo 2: Carica Immagine per OCR – Scegli il File Giusto + +Ora che il motore esiste, dobbiamo **caricare immagine per OCR**. Aspose supporta PNG, JPG, TIFF e altri formati. Assicurati che il percorso sia assoluto o relativo alla posizione dello script. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Suggerimento:** Se la tua immagine è grande, considera di ridimensionarla in anticipo per ridurre la pressione sulla memoria. Il motore OCR può gestire scansioni ad alta risoluzione, ma un’immagine a 300 DPI è solitamente il punto ottimale per le fatture. 
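
The resizing tip above comes down to simple arithmetic, which the following minimal sketch illustrates. It only computes the target pixel dimensions for a scan. The helper name `scaled_size` and the sample page size are hypothetical and not part of the Aspose API. The actual resampling would be done with an image library of your choice, for example Pillow's `Image.resize`.

```python
def scaled_size(width, height, current_dpi, target_dpi=300):
    """Return (width, height) in pixels after rescaling a scan to target_dpi."""
    if current_dpi <= 0 or target_dpi <= 0:
        raise ValueError("DPI values must be positive")
    factor = target_dpi / current_dpi
    # Never upscale: extra pixels cost memory without adding real detail.
    factor = min(factor, 1.0)
    return max(1, round(width * factor)), max(1, round(height * factor))


# Roughly an A4 page scanned at 600 DPI, reduced to the 300 DPI sweet spot.
print(scaled_size(4960, 7016, current_dpi=600))  # prints (2480, 3508)
```

Halving the DPI cuts the pixel count to a quarter, which is usually where the memory savings come from.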
+ +--- + +## Passo 3: Esegui OCR Grezzo e Visualizza il Testo Estratto + +Con l’immagine caricata, possiamo finalmente **eseguire OCR su immagine** e vedere cosa restituisce il motore grezzo. Questo passaggio ci fornisce una baseline prima di aggiungere la magia dell’AI. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Output atteso (troncato):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +L’output grezzo contiene spesso interruzioni di riga, caratteri riconosciuti erroneamente o campi mancanti—esattamente il motivo per cui introdurremo un modello linguistico nel passo successivo. + +--- + +## Passo 4: Scarica Modello Hugging Face – Configura il Post‑Processor LLM + +Qui entra in gioco il passo **scarica modello Hugging Face**. Aspose AI può prelevare automaticamente un modello dal hub di Hugging Face se non è già presente su disco. Useremo il modello Qwen2.5‑3B‑Instruct‑GGUF, che bilancia precisione e ingombro di memoria. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Perché funziona:** `allow_auto_download` ti salva dal dover scaricare manualmente il file `.gguf`. La quantizzazione (`int8`) riduce la dimensione del modello a circa 3 GB, rendendolo fattibile sulla maggior parte delle GPU consumer. Regola `gpu_layers` in base all’hardware—più layer sulla GPU = inferenza più veloce. 
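
The "about 3 GB" figure for int8 quantization can be sanity-checked with a back-of-the-envelope calculation. This is a rough sketch, assuming the model has about 3 billion weights and that each quantization level stores a fixed number of bytes per weight. The table and function names are illustrative only. Real GGUF files carry extra metadata and mixed-precision tensors, so actual sizes will differ.

```python
# Approximate storage cost per weight for a few common precisions (assumption).
BYTES_PER_WEIGHT = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_weight_bytes(num_params, quantization):
    """Rough size of the model weights alone, ignoring runtime overhead."""
    return num_params * BYTES_PER_WEIGHT[quantization]


for quant in ("fp16", "int8", "int4"):
    size_gb = estimate_weight_bytes(3_000_000_000, quant) / 1e9
    print(f"{quant}: ~{size_gb:.1f} GB")
```

Activations and the context cache need memory on top of the weights, so treat the result as a lower bound when choosing `gpu_layers`.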
+ +--- + +## Passo 5: Estrarre Testo da Fattura con Post‑Processing Potenziato dall’AI + +Ora colleghiamo l’LLM al motore OCR e avviamo un **post‑processor** che pulisce l’output grezzo, corregge gli errori di OCR e formatta i campi della fattura in modo ordinato. + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Esempio di output migliorato:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **Cosa è successo?** L’LLM ha riconosciuto che “Invoice #12345” doveva essere “Invoice Number: 12345”, ha corretto il formato della data e ha persino inferito il campo “Bill To” che il motore grezzo aveva tralasciato. Questo è il cuore dell’automazione **estrarre testo da fattura**. + +--- + +## Passo 6: Libera le Risorse GPU – Pulizia Dopo l’Elaborazione + +Se esegui questo codice in un servizio a lunga vita (ad es. un’API Flask), devi **liberare le risorse GPU** dopo ogni inferenza per evitare crash per out‑of‑memory. Aspose AI fornisce un metodo semplice per farlo. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Consiglio da esperto:** Chiama `free_resources()` all’interno di un blocco `finally:` se avvolgi la chiamata OCR in un `try/except`. In questo modo la pulizia avviene anche in caso di eccezione. + +--- + +## Passo 7: Script Completo – Metti Tutto Insieme + +Di seguito trovi lo script completo, pronto per l’esecuzione. Copialo, adatta i percorsi e sei a posto. 
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Esegui lo script e osserva la trasformazione dall’OCR rumoroso a dati di fattura puliti e strutturati. 🎉 + +--- + +## Domande Frequenti & Casi Limite + +| Domanda | Risposta | +|----------|----------| +| **E se il modello non riesce a scaricare?** | Assicurati che la macchina abbia accesso a Internet e che `hugging_face_repo_id` sia corretto. Puoi anche scaricare manualmente il | + +## Cosa Dovresti Imparare Dopo? + +I tutorial seguenti trattano argomenti strettamente correlati che approfondiscono le tecniche mostrate in questa guida. Ogni risorsa include esempi di codice completi con spiegazioni passo‑passo per aiutarti a padroneggiare ulteriori funzionalità API ed esplorare approcci alternativi nei tuoi progetti. 
+ +- [Estrai testo immagine C# con selezione della lingua usando Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Estrai Testo da Immagine – Ottimizzazione OCR con Aspose.OCR per .NET](/ocr/english/net/ocr-optimization/) +- [Converti Immagine in Testo – Esegui OCR su Immagine da URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/italian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/italian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..b086ba0b8 --- /dev/null +++ b/ocr/italian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,284 @@ +--- +category: general +date: 2026-06-06 +description: Riconoscere il testo da un'immagine con Aspose OCR – impara come caricare + l'immagine per l'OCR ed eseguire l'OCR sull'immagine usando il post‑processing AI + in Python. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: it +og_description: Riconosci rapidamente il testo da un'immagine. Questa guida mostra + come caricare un'immagine per OCR, eseguire l'OCR sull'immagine e migliorare i risultati + con il post‑processing AI. +og_title: Riconosci il testo da un'immagine con Aspose OCR e AI +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU 
memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Riconosci il testo da un'immagine usando Aspose OCR e AI +url: /it/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# riconoscere testo da immagine usando Aspose OCR & AI + +Ti è mai capitato di dover riconoscere testo da un'immagine ma non eri sicuro quale libreria ti offrisse sia velocità che precisione? Non sei solo. In questa guida percorreremo un esempio completo, end‑to‑end, che mostra **come caricare l'immagine per OCR**, **eseguire OCR sull'immagine**, e poi rifinire l'output con il post‑processore AI di Aspose. Alla fine avrai uno script pronto all'uso che trasforma un PNG in testo pulito e ricercabile. + +## Cosa imparerai + +Copriamo tutto, dall'installazione del pacchetto Aspose OCR al rilascio delle risorse alla fine dell'esecuzione. Vedrai perché abilitare il riconoscimento del testo scritto a mano è importante, come configurare un LLM Qwen 2.5 per il post‑processing e come appare l'output finale. Nessun riferimento esterno necessario—basta copiare, incollare e eseguire. 
+ +### Prerequisiti + +- Python 3.8 o versioni successive +- pacchetto `asposeocr` (`pip install asposeocr`) +- Un file immagine (ad es., `doc.png`) che contiene testo stampato o scritto a mano +- Opzionale: una GPU per un'inferenza LLM più veloce (lo script funziona anche su CPU) + +--- + +## Riconoscere testo da immagine – Passo‑per‑Passo + +Sotto ogni blocco di codice troverai una breve spiegazione del **perché** facciamo quella specifica azione, non solo del **cosa** fa la riga. + +### Passo 1: Installa e importa i moduli richiesti + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Perché?* Importare `asposeocr` ci fornisce la classe `OcrEngine`, mentre il sottomodulo `ai` fornisce il post‑processore basato su LLM che migliora notevolmente l'output OCR grezzo. + +### Passo 2: Crea il motore OCR e abilita il riconoscimento del testo scritto a mano + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Abilitare il riconoscimento della scrittura a mano espande il set di caratteri del motore, così non perderai le scarabocchiature quando **esegui OCR sull'immagine** su file che contengono testo stampato e corsivo misto. + +### Passo 3: Carica l'immagine per OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +La chiamata `load_image` è il momento in cui **carichi l'immagine per OCR**; se il percorso è errato, il motore solleverà un'eccezione informativa, risparmiandoti errori criptici a valle. 
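
Rather than relying only on the engine's exception, you may prefer to validate the path yourself before calling `load_image`. The sketch below uses only the standard library. The helper `resolve_image_path` and the suffix list are assumptions made for illustration, so adjust the list to the formats your installation actually accepts.

```python
from pathlib import Path

# Assumed set of extensions; not taken from the Aspose documentation.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def resolve_image_path(path_str):
    """Return an absolute Path to an existing image file, or raise early."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported image type: {path.suffix or '(none)'}")
    return path


try:
    resolve_image_path("missing_scan.png")
except FileNotFoundError as exc:
    # Fails fast with the fully resolved path, before any OCR work starts.
    print(exc)
```

The returned path can then be passed to the engine as `str(path)`, which keeps relative-path surprises out of the OCR step.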
+ +### Passo 4: Esegui la fase OCR grezza + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +In questa fase ottieni un oggetto `RecognitionResult` che contiene il testo non filtrato, i punteggi di confidenza e i metadati di layout. È spesso rumoroso—da qui la necessità di una pulizia guidata dall'AI. + +### Passo 5: Configura il modello AI di Aspose per il post‑processing LLM + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Perché preoccuparsi di queste impostazioni? +- **auto‑download** garantisce che il modello sia disponibile al primo avvio. +- **int8 quantization** riduce drasticamente la memoria richiesta senza una grande perdita di precisione. +- **gpu_layers** ti permette di sfruttare una GPU compatibile per un'inferenza più veloce. + +### Passo 6: Inizializza il processore AI e collegalo come post‑processore + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +Collegare il processore significa che ogni volta che chiami `run_postprocessor`, il LLM pulirà l'ortografia, unirà parole spezzate e persino inferirà la punteggiatura mancante. 
+ +### Passo 7: Esegui il post‑processore per migliorare l'output OCR + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` è solitamente molto più leggibile della stringa grezza—pensalo come un correttore ortografico che comprende anche il contesto. + +### Passo 8: Rilascia le risorse + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Pulire è fondamentale nei servizi a lungo termine; altrimenti perderai memoria GPU e alla fine l'applicazione si bloccherà. + +--- + +## Script completo che puoi eseguire oggi + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Output atteso** (esempio per 
un'immagine di fattura semplice): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Nota come lo strato AI ha corretto “Inv0ice” → “Invoice” e ha aggiunto la punteggiatura mancante. + +--- + +## Domande frequenti (e risposte rapide) + +- **Ho bisogno di una GPU?** No. Lo script ricade su CPU, ma l'inferenza sarà più lenta. +- **Posso usare un LLM diverso?** Assolutamente—basta cambiare `hugging_face_repo_id` e regolare `gpu_layers` di conseguenza. +- **E se la mia immagine è enorme?** Ridimensionala prima (ad es., usando Pillow) per mantenere un uso della memoria ragionevole. +- **Il riconoscimento della scrittura a mano è sempre attivo?** Puoi attivare/disattivare `enable_handwritten_recognition` a seconda del tuo carico di lavoro. + +--- + +## Conclusione + +Ora sai come **riconoscere testo da immagine** usando Aspose OCR, come **caricare l'immagine per OCR** e come **eseguire OCR sull'immagine** con post‑processing potenziato dall'AI. L'esempio completo e eseguibile sopra ti fornisce una solida base per integrare OCR in qualsiasi progetto Python—che si tratti di scansionare ricevute, digitalizzare contratti o estrarre dati da moduli scritti a mano. + +Pronto per il passo successivo? Prova a sostituire il modello Qwen con uno più grande, sperimenta diversi schemi di quantizzazione o concatena più immagini per l'elaborazione batch. Le possibilità sono infinite, e il codice che hai appena creato le gestirà con eleganza. + +Buona programmazione, e che i risultati del tuo OCR siano sempre cristallini! + +![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="Screenshot che mostra come riconoscere testo da immagine usando Aspose OCR"} + +## Cosa dovresti imparare dopo? + +I seguenti tutorial coprono argomenti strettamente correlati che si basano sulle tecniche dimostrate in questa guida. 
Ogni risorsa include esempi di codice completi e funzionanti con spiegazioni passo‑per‑passo per aiutarti a padroneggiare funzionalità API aggiuntive ed esplorare approcci di implementazione alternativi nei tuoi progetti. + +- [Estrai testo da immagine con Aspose OCR – Guida passo‑per‑passo](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Converti immagine in testo – Esegui OCR su immagine da URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Estrai testo immagine C# con selezione della lingua usando Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/japanese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/japanese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..4ac6a3c09 --- /dev/null +++ b/ocr/japanese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,210 @@ +--- +category: general +date: 2026-06-06 +description: Python OCR を使用して手書き画像からテキストを抽出します。手書きの写真を迅速かつ確実にテキストに変換する方法を学びましょう。 +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: ja +og_description: Pythonで手書き画像からテキストを抽出する。このガイドでは、手書きの写真をテキストに変換する方法と、手書きテキストを認識する方法を解説します。 +og_title: 手書き画像からテキストを抽出 – Python OCRチュートリアル +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Python OCRで手書き画像からテキストを抽出する – ステップバイステップガイド +url: /ja/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 手書き画像からテキストを抽出する Python OCR – ステップバイステップガイド + +スマートフォンで撮影した写真の中の**手書きテキストを認識する方法**を考えたことはありませんか? あなただけではありません。講義ノートのデジタル化や署名済みフォームからのデータ抽出など、さまざまなプロジェクトで**手書き画像からテキストを抽出**する必要があり、迅速かつ手間なく行いたいものです。 + +このチュートリアルでは、人気の Python OCR ライブラリを使って**手書き写真をテキストに変換**する方法を、実際に動作する完全な例を通して解説します。曖昧な説明はなく、具体的なコード、解説、すぐにコピー&ペーストできるヒントだけをご提供します。 + +![手書き画像からテキストを抽出](https://example.com/placeholder-handwritten.jpg "手書き画像からテキストを抽出") + +## 必要なもの + +| 要件 | 重要な理由 | +|------|------------| +| Python 3.9 or newer | 最新の構文とライブラリサポート | +| `pip` (Python package manager) | OCR パッケージをインストールするため | +| A clear image of handwritten notes (JPEG/PNG) | 画像がぼやけていると OCR の精度が低下します | +| Basic familiarity with Python functions | 後で例をカスタマイズするのに役立ちます | + +これらが揃っていない場合は、 から最新の Python を取得してインストールしてください—特別な手間は不要です。 + +## Python OCR ライブラリのインストール + +使用するコードスニペットは `ocr` パッケージ(Tesseract の薄いラッパーで、便利な設定が追加されています)に依存しています。以下のコマンドで一括インストールできます。 + +```bash +pip install ocr +``` + +> **Pro tip:** インストール後、端末で `tesseract --version` を実行して基盤エンジンが存在することを確認してください。Tesseract が未インストールの場合は、公式ガイドに従って OS に合わせてインストールしましょう(Ubuntu なら `apt-get install tesseract-ocr`、macOS なら `brew install tesseract`)。 + +## ステップ 1: OCR エンジンインスタンスの作成 + +Creating the engine is the first brick in the wall of **python ocr handwritten recognition**. Think of the engine as the brain that will later read the scribbles. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. 
+engine = ocr.OcrEngine() +``` + +Why this matters: without an engine you can’t tweak the recognition pipeline. The default settings are tuned for printed text, so we’ll need to adjust them in the next step. + +## ステップ 2: 手書き認識の有効化 + +By default the engine assumes printed characters. Enabling handwritten mode flips a switch that tells Tesseract to use its LSTM model trained on cursive strokes. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **What if you skip this?** The OCR will treat the strokes as noise, resulting in garbled output. Enabling the flag is the core of **how to recognize handwritten text**. + +## ステップ 3: 手書き写真の読み込み + +Now we point the engine at the image file. The path can be absolute or relative; just be sure the file exists. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +A quick sanity check: open the image in your OS’s viewer. If the text is blurry, consider pre‑processing (contrast boost, rotation) before feeding it to the engine—those tweaks often raise the success rate of **extract text from handwritten image**. + +## ステップ 4: 認識プロセスの実行 + +With everything set, we finally ask the engine to do its job. The `recognize()` method returns a result object that holds the extracted string and confidence scores. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Behind the scenes, the engine converts the bitmap into a series of feature vectors, runs them through the LSTM network, and stitches the characters together. That’s the magic of **python ocr handwritten recognition**. + +## ステップ 5: 抽出されたテキストの表示 + +The result object exposes a `.text` attribute that contains the plain Unicode string. Print it, write it to a file, or feed it into another pipeline—your call. 
+ +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### 期待される出力 + +If the source image contains the note “Buy milk, eggs, and bread”, you’d see something like: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Notice how the output preserves punctuation and line breaks (if any). If you get gibberish, double‑check the image quality and the `enable_handwritten_recognition` flag. + +## よくある落とし穴の対処法 + +| Issue | Symptom | Fix | +|-------|---------|-----| +| Low confidence scores | Many “?” or nonsensical characters | Increase image DPI to ≥300, apply binarization (`opencv`), or crop to the region of interest. | +| Mixed languages | Output mixes English and another script | Set `engine.ocr_settings.language = "eng"` (or another ISO code) before `recognize()`. | +| Large files | Long processing time or memory error | Resize the image to a reasonable dimension (e.g., max width 1200 px) before loading. | +| Missing Tesseract | `ImportError` or `FileNotFoundError` | Install Tesseract separately and ensure it’s on your system PATH. | + +These adjustments keep your **convert handwritten photo to text** workflow robust across diverse datasets. + +## 今日すぐに実行できる完全スクリプト + +Below is the complete, self‑contained program that puts all the pieces together. Copy it into a file named `handwritten_ocr.py` and execute `python handwritten_ocr.py`. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. 
+""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Run it, and you’ll see the text printed to the console—exactly the result you need when you want to **convert handwritten photo to text**. + +## さらに踏み込む + +Now that you’ve mastered the basics of **extract text from handwritten image**, consider these next steps: + +- **Batch processing:** Loop over a folder of images and store each result in a CSV file. +- **Post‑processing:** Use regular expressions to clean up common OCR errors (e.g., “1” vs “l”). +- **Integration:** Feed the extracted strings into a Natural Language Processing pipeline for sentiment analysis or keyword extraction. +- **Alternative libraries:** If you need higher accuracy, explore `easyocr` or `pytesseract` with custom LSTM models—both support **python ocr handwritten recognition** as well. + +Remember, the quality of the source image often dictates success, so spend a few minutes on preprocessing. A little extra effort now saves you a lot of debugging later. + +## 結論 + +We’ve walked through a complete, end‑to‑end example that shows **how to recognize handwritten text** and, more importantly, how to **extract text from handwritten image** using Python. By installing the `ocr` package, toggling the handwritten flag, loading your picture, and calling `recognize()`, you can **convert handwritten photo to text** in just a handful of lines. + +Give it a try with your own notes, tweak the preprocessing steps, and let the OCR do the heavy lifting. 
If you hit any snags, revisit the “Handling Common Pitfalls” table or experiment with alternative OCR back‑ends. Happy coding, and may your handwritten data become instantly searchable! + +## 次に学ぶべきことは? + +The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects. + +- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [How to Recognize Page Rectangles for OCR Text Recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [How to Use OCR - Recognize Image without Text Area Detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/japanese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/japanese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..5d74c0e18 --- /dev/null +++ b/ocr/japanese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,254 @@ +--- +category: general +date: 2026-06-06 +description: OCR文字セットガイド:OCRエンジンの使い方を学び、ラテン文字とキリル文字のサポート文字を一覧表示する方法を、完全なPython例とともに紹介します。 +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: ja +og_description: OCR文字セットガイド:PythonでOCRエンジンを使用し、さまざまなスクリプトでサポートされている文字を一覧表示する方法を、コードとヒントとともに紹介します。 +og_title: OCR文字セット – 
OCRエンジンを取得して使用する +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' + headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR文字セット – PythonでOCRエンジンを取得して使用する +url: /ja/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR Character Set – Retrieve and Use OCR Engine in Python + +ライブラリがサポートしている **ocr character set** を知りたいですか?このチュートリアルでは、**use OCR engine** を使ってさまざまなスクリプトでサポートされている文字を照会する方法と、実際のプロジェクトでそれがなぜ重要かを解説します。 + +OCR 処理後に一部の文字が表示されないことに戸惑ったことがあるなら、あなたは一人ではありません。答えはエンジンが認識できる文字セットに隠れています。このガイドを読み終えると、以下ができるようになります。 + +* 指定したスクリプトに対して OCR エンジンが認識できるすべての文字を一覧表示する。 +* スクリプトがサポートされていない場合の処理方法。 +* 文字セット情報を下流のバリデーションや UI ロジックに組み込む。 + +魔法のブラックボックスはありません—ただの Python、数行のコード、そして少しの説明です。 + +--- + +## Prerequisites + +始める前に、以下が揃っていることを確認してください。 + +* Python 3.8+ がインストールされていること(コードは f‑strings を使用)。 +* `ocr.OcrEngine` を提供する OCR ライブラリ—この例では `pip install ocr-lib` でインストールされた `ocr` パッケージを想定しています。 +* Python のインポートシステムと `print` の基本的な使い方に慣れていること。 + +ライブラリが未インストールの場合は、以下を実行してください。 + +```bash +pip install ocr-lib +``` + +以上です—基本的な文字セット照会に必要な追加バイナリやネイティブ依存関係は不要です。 + +--- + +## Step 1: Initialize the OCR Engine Instance + +**use OCR engine** 機能を利用する際に最初に行うのがエンジンオブジェクトの作成です。スキャナーの電源を入れるイメージです。電源が入っていなければ、何も質問できません。 + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +`OcrEngine` クラスは基盤となる認識エンジンを軽量にラップしたものです。何かを要求するまで重い処理は開始されないので、起動コストはほぼ無視できます。 + +> **Pro tip:** アプリケーションで多数のエンジンインスタンスを作成する場合は、1つだけ再利用してください。メモリ節約と初期化遅延の削減につながります。 + +--- + +## Step 2: Query the OCR Character Set for a Specific Script + 
+エンジンが用意できたので、特定の書記体系でどの文字が認識できるか問い合わせます。`get_supported_characters(script_name)` メソッドは Unicode 文字の Python `list` を返します。 + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +上記スニペットを実行すると、次のような出力が得られます。 + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +正確な出力は OCR ライブラリのバージョンに依存しますが、常にフラットな文字リストが返ります。大文字と小文字の両方が必要な場合は、`"Latin"` スクリプトを取得して各要素に `.lower()` を適用するか、ライブラリが区別している場合は `"Latin‑lower"` スクリプトを別途問い合わせてください。 + +### Why does this matter? + +画像に珍しいアクセント記号(例: “ñ” や “ø”)が含まれていると、OCR エンジンはそれらをプレースホルダーに置き換えたり、完全に削除したりすることがあります。**ocr character set** を事前に把握しておくことで、入力を事前検証したり、ユーザーに警告したり、別のエンジンにフォールバックしたりできます。 + +--- + +## Step 3: Retrieve Characters for Another Script – Cyrillic Example + +同じメソッドは、エンジンがサポートすると主張する任意のスクリプトで機能します。ここではキリル文字で確認してみましょう。 + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +典型的な出力例: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +要求したスクリプトがサポートされて**いない**場合、通常は `ValueError` が発生するか(実装によっては)空リストが返ります。これに備えて例外処理を入れましょう。 + +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +#
Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## Step 4: Visualize the OCR Character Set (Optional) + +ステークホルダーにカバーされている文字を示す必要があるときなど、簡単なビジュアルが役立ちます。以下は `matplotlib` を使って文字のグリッドを描画する最小例です。 + +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **画像の代替テキスト:** ![OCR character set diagram](ocr_character_set.png){alt="OCR character set overview"} + +この図は必須ではありませんが、**use OCR engine** のメタデータを単なるテキスト出力以外で活用できることを示しています。 + +--- + +## Step 5: Integrate the Character Set into Real‑World Workflows + +**ocr character set** を取得できたので、実務シナリオを見てみましょう。データベースに保存する前に OCR 出力を検証する例です。 + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" 
+ +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +このパターンは、マルチリンガル文書を扱う際に下流パイプラインへゴミデータが流れ込むのを防ぐ一般的な対策です。 + +--- + +## Common Pitfalls and Edge Cases + +| Pitfall | Why it Happens | How to Avoid | +|---------|----------------|--------------| +| **Script name typo** (e.g., `"Cyrillic "` with trailing space) | The engine treats the string literally and can’t find the script. | Strip whitespace: `script.strip()` before calling `get_supported_characters`. | +| **Empty character list** | Some engines expose a script but haven’t loaded the language model yet. | Call `engine.load_language_model(script)` if the library provides such a method, or ensure the model files are present. | +| **Unicode normalization issues** | Characters like “é” may appear as composed (`\u00E9`) or decomposed (`e\u0301`). | Normalise strings with `unicodedata.normalize('NFC', text)` before validation. | +| **Performance on huge scripts** (e.g., Chinese) | Retrieving thousands of characters can be slow. | Cache the result after the first call; most applications only need the list once. 
| + +--- + +## Full Working Example + +すべてをまとめた、コピー&ペーストで実行できる単一スクリプトを示します。 + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +if __name__ == "__main__": + engine = ocr.OcrEngine() + latin_chars = safe_get_characters(engine, "Latin") + if latin_chars: + plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +## What Should You Learn Next? + + +以下のチュートリアルは、本ガイドで示した手法を基にした関連トピックを扱っています。各リソースには、ステップバイステップの解説と完全なコード例が含まれており、追加の API 機能を習得したり、独自プロジェクトで代替実装アプローチを探求したりするのに役立ちます。 + +- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/) +- [OCR Post Processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [How to Set Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/japanese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/japanese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..e33abb482 --- /dev/null +++ b/ocr/japanese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,254 @@ +--- +category: general +date: 2026-06-06 +description: Aspose OCR と Hugging Face モデルを使用して画像の OCR を実行します。Hugging
Face モデルのダウンロード方法、請求書からテキストを抽出する方法、GPU + リソースを解放する方法を学びます。 +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: ja +og_description: Aspose OCR と Hugging Face モデルを使用して画像の OCR を実行します。このチュートリアルでは、モデルのダウンロード方法、請求書からテキストを抽出する方法、GPU + リソースを解放する方法を示します。 +og_title: Aspose OCR と LLM を使用した画像の OCR 実行 – 完全ガイド +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. + headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Aspose OCR と LLM を使用した画像の OCR 実行 – 完全ガイド +url: /ja/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 画像で OCR を実行する – Aspose OCR と LLM 完全ガイド + +画像ファイルで **perform OCR on image** を実行したいと思ったことはありますか、しかし「どこから始めればいいのか?」という疑問で行き詰まっていませんか? 
あなたは一人ではありません—多くの開発者がドキュメント自動化に初めて取り組むときに同じ壁にぶつかります。良いニュースは、Aspose OCR と Hugging Face の軽量 LLM を組み合わせることで、請求書のスキャン画像を数行の Python コードだけでクリーンで検索可能なテキストに変換できることです。 + +このチュートリアルでは、必要なすべてを順に解説します:**loading image for OCR**、**download the Hugging Face model**、**extract text from invoice** データ、そして最後に **free GPU resources** を解放してアプリを軽量に保つ方法です。最後まで読むと、任意のプロジェクトに組み込める自己完結型スクリプトが手に入ります。 + +--- + +## 学べること + +- Aspose の `OcrEngine` を使用して **perform OCR on image** を実行する方法。 +- **download Hugging Face model** ファイルを自動的に取得する正確な手順。 +- AI 強化されたポストプロセッシングで **extract text from invoice** の PDF または PNG を処理するテクニック。 +- 推論後に **free GPU resources** を解放するベストプラクティス。 +- **load image for OCR** を効率的に行い、一般的な落とし穴を回避するヒント。 + +外部ドキュメントは不要です—必要なものはすべてここに揃っており、完全なコード、解説、期待される出力が含まれています。 + +--- + +## 前提条件 + +本格的に始める前に、以下が揃っていることを確認してください: + +| Requirement | Reason | +|-------------|--------| +| Python 3.9+ | モダンな構文と型ヒント | +| `asposeocr` package (`pip install asposeocr`) | コア OCR エンジン | +| Access to a GPU (optional but recommended) | LLM ポストプロセッサを高速化 | +| An invoice image (`sample_invoice.png`) | 実際のテストケース | + +これらが揃っていない場合は、今すぐインストールしてください。スクリプトは **download Hugging Face model** を自動的にダウンロードするので、ファイルを自分で探す必要はありません。 + +--- + +## Step 1: Perform OCR on Image – エンジンの作成 + +最初に行うべきことは、Aspose の OCR エンジンを起動することです。これは、画像が後でテキストに変換される空白のキャンバスを開くようなものです。 + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Why this matters:** `OcrEngine` は低レベルの画像前処理をすべて抽象化するため、上位レベルのワークフローに集中できます。また、後で LLM をフックしてより賢い出力を得るための `set_post_processor` メソッドも提供します。 + +--- + +## Step 2: Load Image for OCR – 正しいファイルを選択 + +エンジンが用意できたので、**load image for OCR** が必要です。Aspose は PNG、JPG、TIFF など多数のフォーマットをサポートしています。パスは絶対パスまたはスクリプトの場所からの相対パスであることを確認してください。 + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Tip:** 画像が大きい場合は、事前にリサイズしてメモリ負荷を減らすことを検討してください。OCR 
エンジンは高解像度のスキャンを処理できますが、請求書の場合は通常 300 DPI の画像が最適です。 + +--- + +## Step 3: Perform Raw OCR and View the Extracted Text + +画像をロードしたら、いよいよ **perform OCR on image** を実行し、エンジンが出力する生のテキストを確認できます。このステップは、AI マジックを加える前のベースラインとなります。 + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**期待される出力(抜粋):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +生の出力には改行や誤認識文字、欠落フィールドが含まれることが多く、これが次に言語モデルを導入する理由です。 + +--- + +## Step 4: Download Hugging Face Model – LLM ポストプロセッサの設定 + +ここが **download Hugging Face model** 手順の見どころです。Aspose AI は、ディスクに存在しない場合に Hugging Face ハブから自動的にモデルを取得できます。精度とメモリ使用量のバランスが取れた Qwen2.5‑3B‑Instruct‑GGUF モデルを使用します。 + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Why this works:** `allow_auto_download` により `.gguf` ファイルを手動でダウンロードする手間が省けます。量子化(`int8`)によりモデルサイズは約 3 GB に削減され、ほとんどの一般的な GPU でも実行可能です。ハードウェアに応じて `gpu_layers` を調整してください—GPU 上のレイヤーが多いほど推論が高速になります。 + +--- + +## Step 5: Extract Text from Invoice Using AI‑Enhanced Post‑Processing + +ここで LLM を OCR エンジンに接続し、**post‑processor** を実行して生の出力をクリーンアップし、OCR の誤りを修正し、請求書の項目を整然とフォーマットします。 + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**サンプルの強化出力:** + +``` +Enhanced text: +Invoice 
Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **What happened?** LLM は “Invoice #12345” を “Invoice Number: 12345” にすべきことを認識し、日付形式を修正し、さらに生エンジンが見逃した “Bill To” フィールドを推測しました。これが **extract text from invoice** 自動化の核心です。 + +--- + +## Step 6: Free GPU Resources – 処理後のクリーンアップ + +長時間稼働するサービス(例: Flask API)で実行する場合、各推論後に **free GPU resources** を行い、メモリ不足によるクラッシュを防ぐ必要があります。Aspose AI はそのためのシンプルなメソッドを提供しています。 + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro tip:** OCR 呼び出しを try/except でラップしている場合は、`finally:` ブロック内で `free_resources()` を呼び出してください。例外が発生した場合でも確実にクリーンアップが行われます。 + +--- + +## Step 7: Full Script – 全体をまとめる + +以下に、完全に実行可能なスクリプトを示します。コピーして貼り付け、パスを調整すればすぐに使用できます。 + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +スクリプトを実行すると、ノイズの多い OCR 結果がクリーンで構造化された請求書データに変換される様子が確認できます。 🎉 + +--- + +## よくある質問とエッジケース + +| Question | Answer | +|----------|--------| +| 
**モデルのダウンロードに失敗した場合はどうすればよいですか?** | マシンがインターネットに接続されていること、`hugging_face_repo_id` が正しいことを確認してください。手動でダウンロードすることもできます。 | + +## 次に学ぶべきことは? + +以下のチュートリアルは、本ガイドで示した手法を応用した、密接に関連するトピックを扱っています。各リソースには、完全な動作コード例とステップバイステップの解説が含まれており、追加の API 機能を習得し、独自プロジェクトで代替実装アプローチを探求するのに役立ちます。 + +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/japanese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/japanese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..5c6333f8e --- /dev/null +++ b/ocr/japanese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,281 @@ +--- +category: general +date: 2026-06-06 +description: Aspose OCRで画像からテキストを認識する – OCR用に画像を読み込む方法と、PythonでAIポストプロセッシングを使用して画像に対してOCRを実行する方法を学びましょう。 +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: ja +og_description: 画像からテキストを素早く認識します。このガイドでは、OCR用に画像を読み込む方法、画像でOCRを実行する方法、そしてAIによる後処理で結果を向上させる方法を示します。 +og_title: Aspose OCR と AI を使用して画像からテキストを認識する +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU 
memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Aspose OCR と AI を使用して画像からテキストを認識する +url: /ja/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Aspose OCR と AI を使用した画像からのテキスト認識 + +画像からテキストを認識したいが、速度と精度の両方を提供してくれるライブラリがどれか分からないことはありませんか? あなたは一人ではありません。このガイドでは、**画像を OCR 用にロードする方法**、**画像に対して OCR を実行する方法**、そして Aspose の AI ポストプロセッサで出力を磨く完全なエンドツーエンドの例を順を追って解説します。最後まで実行すれば、PNG をクリーンで検索可能なテキストに変換する実行可能なスクリプトが手に入ります。 + +## 学べること + +Aspose OCR パッケージのインストールから実行終了時のリソース解放までを網羅します。手書きテキスト認識を有効にする重要性、ポストプロセッシング用の Qwen 2.5 LLM の設定方法、最終的な出力イメージを確認できます。外部参照は不要です—コピーして貼り付け、実行するだけです。 + +### 前提条件 + +- Python 3.8 以上 +- `asposeocr` パッケージ(`pip install asposeocr`) +- 印刷テキストまたは手書きテキストを含む画像ファイル(例: `doc.png`) +- オプション: LLM 推論を高速化する GPU(スクリプトは CPU でも動作します) + +--- + +## 画像からテキストを認識する – 手順別 + +各コードブロックの下には、**何を**するかだけでなく、**なぜ**その操作を行うのかという短い説明があります。 + +### ステップ 1: 必要なモジュールのインストールとインポート + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Why?* `asposeocr` をインポートすると `OcrEngine` クラスが利用でき、`ai` サブモジュールは生の OCR 出力を劇的に改善する LLM ベースのポストプロセッサを提供します。 + +### ステップ 2: OCR エンジンの作成と手書きテキスト認識の有効化 + +```python +# Initialise the OCR engine 
+ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +手書き認識を有効にするとエンジンの文字セットが拡張され、印刷テキストと手書きテキストが混在した **画像に対して OCR を実行** する際に文字が失われません。 + +### ステップ 3: OCR 用に画像をロード + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +`load_image` 呼び出しは **画像を OCR 用にロード** する瞬間です。パスが間違っているとエンジンは有益な例外を投げ、下流での暗号的なエラーを防ぎます。 + +### ステップ 4: 生の OCR パスを実行 + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +この段階で `RecognitionResult` オブジェクトが取得でき、フィルタリングされていないテキスト、信頼度スコア、レイアウトメタデータが含まれます。ノイズが多くなることがあるため、AI 主導のクリーンアップが必要です。 + +### ステップ 5: LLM ポストプロセッシング用に Aspose AI モデルを設定 + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +なぜこれらの設定が必要か? 
+ +- **auto‑download** は初回実行時にモデルが利用可能であることを保証します。 +- **int8 quantization** は精度への大きな影響なしにメモリ使用量を大幅に削減します。 +- **gpu_layers** は対応 GPU を利用して推論を高速化できます。 + +### ステップ 6: AI プロセッサを初期化し、ポストプロセッサとして添付 + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +プロセッサを添付すると、`run_postprocessor` を呼び出すたびに LLM がスペルを修正し、分割された単語を結合し、欠落した句読点まで推測してくれます。 + +### ステップ 7: ポストプロセッサを実行して OCR 出力を強化 + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` は通常、生の文字列よりはるかに読みやすくなります—文脈も理解するスペルチェッカーと考えてください。 + +### ステップ 8: リソースを解放 + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +長時間稼働するサービスではクリーンアップが重要です。そうしないと GPU メモリがリークし、最終的にアプリケーションがクラッシュします。 + +--- + +## 今すぐ実行できる完全スクリプト + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ 
Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**期待される出力**(シンプルな請求書画像の例): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +AI 層が “Inv0ice” → “Invoice” を修正し、欠落していた句読点を追加していることに注目してください。 + +--- + +## よくある質問(と簡単な回答) + +- **GPU が必要ですか?** いいえ。スクリプトは CPU にフォールバックしますが、推論は遅くなります。 +- **別の LLM を使用できますか?** もちろんです。`hugging_face_repo_id` を変更し、`gpu_layers` を適宜調整してください。 +- **画像が非常に大きい場合は?** まずリサイズしてください(例: Pillow を使用)でメモリ使用量を抑えます。 +- **手書き認識は常にオンですか?** ワークロードに応じて `enable_handwritten_recognition` を切り替えられます。 + +--- + +## 結論 + +これで Aspose OCR を使用した **画像からテキストを認識** する方法、**画像を OCR 用にロード** する方法、そして AI 強化ポストプロセッシングで **画像に対して OCR を実行** する方法が分かりました。上記の完全な実行例は、レシートのスキャン、契約書のデジタル化、手書きフォームからのデータ抽出など、あらゆる Python プロジェクトに OCR を統合するための堅実な基盤を提供します。 + +次のステップに進む準備はできましたか? Qwen モデルをより大きなものに差し替えたり、異なる量子化方式を試したり、バッチ処理のために複数画像を連結したりしてみてください。可能性は無限大で、今回構築したコードはそれらを優雅に処理します。 + +Happy coding, and may your OCR results always be crystal‑clear! + +![OCR出力が強化されたPythonコンソールのスクリーンショット](/images/ocr_output.png){alt="Aspose OCR を使用して画像からテキストを認識する方法を示すスクリーンショット"} + +## 次に学ぶべきことは? 
+ +以下のチュートリアルは、本ガイドで示した手法を基にした密接に関連するトピックを取り上げています。各リソースには、ステップバイステップの解説付きで完全に動作するコード例が含まれており、追加の API 機能を習得したり、独自プロジェクトで代替実装アプローチを探求したりするのに役立ちます。 + +- [Aspose OCR を使用した画像からのテキスト抽出 – 手順別ガイド](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [画像をテキストに変換 – URL から画像に対して OCR を実行](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Aspose.OCR を使用した C# での画像テキスト抽出(言語選択)](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/korean/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/korean/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..2d9a85f8f --- /dev/null +++ b/ocr/korean/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,214 @@ +--- +category: general +date: 2026-06-06 +description: Python OCR을 사용하여 손글씨 이미지에서 텍스트를 추출합니다. 손글씨 사진을 빠르고 신뢰성 있게 텍스트로 변환하는 방법을 + 배워보세요. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: ko +og_description: Python으로 손글씨 이미지를 텍스트로 추출합니다. 이 가이드는 손글씨 사진을 텍스트로 변환하는 방법을 보여주고, 손글씨 + 텍스트를 인식하는 방법에 대해 답변합니다. +og_title: 손글씨 이미지에서 텍스트 추출 – 파이썬 OCR 튜토리얼 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Python OCR로 손글씨 이미지에서 텍스트 추출 – 단계별 가이드 +url: /ko/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 손글씨 이미지에서 텍스트 추출하기 (Python OCR) – 단계별 가이드 + +휴대폰으로 찍은 사진에서 **손글씨 텍스트를 인식하는 방법**이 궁금했나요? 당신만 그런 것이 아닙니다. 강의 노트를 디지털화하거나 서명된 양식에서 데이터를 추출하는 등 많은 프로젝트에서 **손글씨 이미지에서 텍스트를 추출**해야 할 때가 있습니다. + +이 튜토리얼에서는 인기 있는 Python OCR 라이브러리를 사용해 **손글씨 사진을 텍스트로 변환**하는 완전한 실행 예제를 단계별로 보여드립니다. 모호한 설명이 아니라 바로 복사·붙여넣기 할 수 있는 구체적인 코드와 설명, 팁을 제공합니다. + +![손글씨 이미지에서 텍스트 추출](https://example.com/placeholder-handwritten.jpg "손글씨 이미지에서 텍스트 추출") + +## 준비 사항 + +시작하기 전에 아래 항목들을 확인하세요: + +| 요구 사항 | 이유 | +|-------------|----------------| +| Python 3.9 이상 | 최신 문법 및 라이브러리 지원 | +| `pip` (Python 패키지 관리자) | OCR 패키지를 설치하기 위해 | +| 손글씨 노트가 선명한 이미지 (JPEG/PNG) | 흐린 이미지에서는 OCR 정확도가 떨어짐 | +| Python 함수에 대한 기본 이해 | 이후 예제를 자유롭게 변형할 수 있음 | + +이 중 하나라도 부족하다면 에서 최신 Python을 다운로드하고 설치하세요—별다른 어려움 없이 진행할 수 있습니다. + +## Python OCR 라이브러리 설치 + +우리가 사용할 코드는 `ocr` 패키지(`tesseract` 위에 편리한 설정을 추가한 얇은 래퍼)를 기반으로 합니다. 한 줄 명령으로 설치하세요: + +```bash +pip install ocr +``` + +> **Pro tip:** 설치 후 터미널에서 `tesseract --version`을 실행해 기본 엔진이 존재하는지 확인하세요. Tesseract가 없으면 공식 가이드를 따라 OS에 맞게 설치합니다(`apt-get install tesseract-ocr` on Ubuntu, `brew install tesseract` on macOS). + +## Step 1: OCR 엔진 인스턴스 생성 + +엔진을 만드는 것은 **python ocr handwritten recognition**의 첫 번째 벽돌입니다. 엔진은 나중에 낙서를 읽어낼 두뇌와 같습니다. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +왜 중요한가: 엔진 없이는 인식 파이프라인을 조정할 수 없습니다. 기본 설정은 인쇄된 텍스트에 맞춰져 있으므로 다음 단계에서 조정이 필요합니다. + +## Step 2: 손글씨 인식 활성화 + +기본적으로 엔진은 인쇄된 문자만을 가정합니다. 
손글씨 모드를 활성화하면 Tesseract가 필기체 스트로크에 대해 학습된 LSTM 모델을 사용하도록 전환됩니다. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **What if you skip this?** OCR이 스트로크를 잡음으로 처리해 엉망진창 결과가 나옵니다. 이 플래그를 켜는 것이 **손글씨 텍스트를 인식하는 방법**의 핵심입니다. + +## Step 3: 손글씨 사진 로드 + +이제 엔진에 이미지 파일 경로를 지정합니다. 절대 경로나 상대 경로 모두 가능하지만 파일이 실제로 존재해야 합니다. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +간단히 확인해 보세요: OS 뷰어에서 이미지를 열어 텍스트가 흐릿하지 않은지 확인합니다. 흐릿하면 전처리(대비 강화, 회전 등)를 수행한 뒤 엔진에 전달하면 **손글씨 이미지에서 텍스트 추출** 성공률이 크게 올라갑니다. + +## Step 4: 인식 프로세스 실행 + +모든 준비가 끝났으니 엔진에 작업을 요청합니다. `recognize()` 메서드는 추출된 문자열과 신뢰도 점수를 담은 결과 객체를 반환합니다. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +엔진은 비트맵을 특징 벡터 시퀀스로 변환하고, 이를 LSTM 네트워크에 통과시켜 문자들을 이어 붙입니다. 이것이 **python ocr handwritten recognition**의 마법입니다. + +## Step 5: 추출된 텍스트 표시 + +결과 객체는 `.text` 속성을 통해 순수 Unicode 문자열을 제공합니다. 콘솔에 출력하거나 파일에 저장하거나 다른 파이프라인에 전달—선택은 자유입니다. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### 예상 출력 + +이미지에 “Buy milk, eggs, and bread” 라는 메모가 들어 있다면 다음과 비슷한 결과가 나옵니다: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +출력에 구두점과 줄바꿈(있는 경우)이 그대로 유지되는 것을 확인하세요. 의미 없는 문자열이 나오면 이미지 품질과 `enable_handwritten_recognition` 플래그를 다시 점검하세요. + +## 일반적인 함정 처리 + +| 문제 | 증상 | 해결 방법 | +|-------|---------|-----| +| 낮은 신뢰도 점수 | 많은 “?” 또는 의미 없는 문자 | 이미지 DPI를 ≥300으로 높이고, 이진화(`opencv`)를 적용하거나 관심 영역만 잘라내세요. | +| 혼합 언어 | 출력에 영어와 다른 스크립트가 섞임 | `engine.ocr_settings.language = "eng"`(또는 다른 ISO 코드) 를 `recognize()` 전에 설정하세요. | +| 대용량 파일 | 처리 시간이 오래 걸리거나 메모리 오류 | 로드하기 전에 이미지를 적절한 크기(예: 최대 너비 1200 px)로 리사이즈하세요. | +| Tesseract 미설치 | `ImportError` 또는 `FileNotFoundError` | Tesseract를 별도로 설치하고 시스템 PATH에 포함시킵니다. 
| + +이러한 조정으로 **손글씨 사진을 텍스트로 변환**하는 워크플로우를 다양한 데이터셋에 걸쳐 견고하게 유지할 수 있습니다. + +## 오늘 바로 실행할 수 있는 전체 스크립트 + +아래는 모든 요소를 하나로 합친 완전한 독립 프로그램입니다. `handwritten_ocr.py`라는 파일에 복사하고 `python handwritten_ocr.py`를 실행하세요. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +실행하면 콘솔에 텍스트가 출력됩니다—즉 **손글씨 사진을 텍스트로 변환**하고자 할 때 정확히 필요한 결과입니다. + +## 더 나아가기 + +이제 **손글씨 이미지에서 텍스트 추출** 기본을 마스터했으니 다음 단계들을 고려해 보세요: + +- **배치 처리:** 이미지 폴더를 순회하며 각 결과를 CSV 파일에 저장합니다. +- **후처리:** 정규 표현식을 사용해 흔한 OCR 오류(예: “1” vs “l”)를 정리합니다. +- **통합:** 추출된 문자열을 자연어 처리 파이프라인에 넣어 감성 분석이나 키워드 추출을 수행합니다. +- **대체 라이브러리:** 더 높은 정확도가 필요하면 `easyocr` 또는 `pytesseract`와 커스텀 LSTM 모델을 탐색하세요—두 라이브러리 모두 **python ocr handwritten recognition**을 지원합니다. + +이미지 원본의 품질이 성공을 좌우한다는 점을 기억하고, 전처리에 몇 분만 투자하세요. 지금 조금만 더 노력하면 나중에 디버깅에 드는 시간을 크게 절감할 수 있습니다. + +## 결론 + +우리는 **손글씨 텍스트를 인식하는 방법**과, 더 중요한 **손글씨 이미지에서 텍스트를 추출**하는 전체 흐름을 Python으로 구현하는 예제를 끝까지 따라왔습니다. `ocr` 패키지를 설치하고, 손글씨 플래그를 켜고, 사진을 로드한 뒤 `recognize()`를 호출하면 몇 줄의 코드만으로 **손글씨 사진을 텍스트로 변환**할 수 있습니다. + +자신의 메모로 직접 시도해 보고, 전처리 단계를 조정해 보며 OCR이 무거운 작업을 대신하도록 하세요. 문제가 발생하면 “일반적인 함정 처리” 표를 다시 살펴보거나 다른 OCR 백엔드를 실험해 보세요. 즐거운 코딩 되시고, 손글씨 데이터가 즉시 검색 가능해지길 바랍니다! + +## 다음에 배워야 할 내용은? + +다음 튜토리얼들은 이 가이드에서 시연한 기술을 기반으로 하는 밀접한 주제들을 다룹니다. 
각 리소스는 완전한 코드 예제와 단계별 설명을 제공해 추가 API 기능을 마스터하고 프로젝트에 적용할 수 있도록 돕습니다. + +- [Aspose OCR을 사용한 이미지 텍스트 추출 – 단계별 가이드](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Aspose.OCR에서 OCR 텍스트 인식을 위한 페이지 사각형 준비](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [OCR 사용 방법 - 텍스트 영역 감지 없이 이미지 인식](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/korean/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/korean/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..82d332b9b --- /dev/null +++ b/ocr/korean/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,256 @@ +--- +category: general +date: 2026-06-06 +description: 'OCR 문자 집합 가이드: OCR 엔진을 사용하여 라틴어와 키릴 문자 스크립트에 대한 지원 문자를 전체 Python 예제로 + 확인하는 방법을 배웁니다.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: ko +og_description: 'OCR 문자 집합 가이드: 파이썬에서 OCR 엔진을 사용해 다양한 스크립트의 지원 문자 목록을 확인하고, 코드와 팁까지 + 완벽히 제공합니다.' +og_title: OCR 문자 집합 – OCR 엔진 가져오기 및 사용하기 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR 문자 집합 – 파이썬에서 OCR 엔진을 가져와 사용하기 +url: /ko/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR 문자 집합 – Python에서 OCR 엔진 가져오기 및 사용하기 + +라이브러리가 지원하는 **ocr 문자 집합**을 알고 싶으신가요? 이 튜토리얼에서는 **OCR 엔진**을 사용하여 다양한 스크립트에 대해 지원되는 문자를 조회하는 방법과 실제 프로젝트에서 왜 중요한지에 대해 보여드립니다. + +OCR 처리 후 일부 글자가 화면에 나타나지 않아 고민했던 적이 있다면, 혼자가 아닙니다. 답은 엔진이 알고 있는 문자 집합에 숨겨져 있습니다. 이 가이드를 끝까지 읽으면 다음을 할 수 있게 됩니다: + +* 특정 스크립트에 대해 OCR 엔진이 인식할 수 있는 모든 문자 목록을 출력합니다. +* 스크립트가 지원되지 않을 경우를 처리합니다. +* 문자 집합 정보를 하위 검증이나 UI 로직에 연결합니다. + +특별한 블랙박스 트릭이 아니라, 순수 Python과 몇 줄의 코드, 그리고 약간의 설명만으로 가능합니다. + +--- + +## 사전 요구 사항 + +시작하기 전에 다음이 준비되어 있는지 확인하세요: + +* Python 3.8+ 설치 (코드에서 f‑strings 사용). +* `ocr.OcrEngine`을 제공하는 OCR 라이브러리—예시에서는 `pip install ocr-lib` 로 설치된 `ocr` 패키지를 가정합니다. +* Python의 import 시스템과 출력에 대한 기본적인 이해. + +라이브러리가 없으면 다음을 실행하세요: + +```bash +pip install ocr-lib +``` + +그게 전부입니다—기본 문자 집합 조회를 위해 추가 바이너리나 네이티브 의존성을 설치할 필요가 없습니다. + +--- + +## 1단계: OCR 엔진 인스턴스 초기화 + +엔진 객체를 만드는 것은 **OCR 엔진** 기능을 사용할 때 가장 먼저 하는 일입니다. 스캐너를 켜는 것과 같으며, 엔진이 없으면 어떤 질문도 할 수 없습니다. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +`OcrEngine` 클래스는 기본 인식 엔진을 감싸는 가벼운 래퍼입니다. 실제 처리를 요청하기 전까지는 무거운 작업을 시작하지 않으므로 시작 비용이 거의 없습니다. + +> **Pro tip:** 애플리케이션에서 엔진 인스턴스를 많이 만든다면, 하나만 재사용하세요. 메모리를 절약하고 초기화 지연 시간을 줄일 수 있습니다. + +--- + +## 2단계: 특정 스크립트에 대한 OCR 문자 집합 조회 + +엔진을 확보했으니 이제 특정 문자 체계에 대해 어떤 문자를 알고 있는지 물어볼 수 있습니다. `get_supported_characters(script_name)` 메서드는 Unicode 문자들의 Python `list`를 반환합니다. 
+ +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +위 스니펫을 실행하면 다음과 같은 출력이 나타납니다: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +정확한 출력은 OCR 라이브러리 버전에 따라 다르지만, 항상 평탄한 문자 리스트가 반환됩니다. 대문자와 소문자를 모두 원한다면 `"Latin"` 스크립트를 요청한 뒤 각 요소에 `.lower()`를 적용하거나, 라이브러리가 구분한다면 별도의 `"Latin‑lower"` 스크립트를 조회하면 됩니다. + +### 왜 중요한가요? + +특수한 발음 부호가 포함된 이미지(예: “ñ” 또는 “ø”)를 입력하면 OCR 엔진이 이를 자리 표시자 문자로 교체하거나 완전히 삭제할 수 있습니다. **ocr 문자 집합**을 미리 알면 입력을 사전 검증하고, 사용자에게 경고하거나 다른 엔진으로 대체할 수 있습니다. + +--- + +## 3단계: 다른 스크립트 문자 집합 조회 – Cyrillic 예시 + +같은 메서드를 사용해 엔진이 지원하는 모든 스크립트를 조회할 수 있습니다. 이번에는 Cyrillic 스크립트를 살펴보겠습니다. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +예상 출력: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +엔진이 요청된 스크립트를 **지원하지 않을** 경우, 일반적으로 `ValueError`를 발생시키거나(구현에 따라) 빈 리스트를 반환합니다. 이를 대비해 예외 처리를 추가합니다. 
+ +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## 4단계: OCR 문자 집합 시각화 (선택 사항) + +때때로 빠른 시각적 자료가 이해를 돕습니다. 특히 이해관계자에게 어떤 문자가 포함됐는지 보여줄 때 유용합니다. 아래는 `matplotlib`을 사용해 문자 그리드를 그리는 최소 예시입니다. + +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Image alt text:** ![OCR character set diagram](ocr_character_set.png){alt="OCR character set overview"} + +이 다이어그램은 핵심 솔루션에 필수는 아니지만, **OCR 엔진** 메타데이터를 단순 출력 이상으로 활용할 수 있음을 보여줍니다. + +--- + +## 5단계: 문자 집합을 실제 워크플로에 통합하기 + +이제 **ocr 문자 집합**을 가져올 수 있으니, 실전 시나리오를 살펴보겠습니다: OCR 결과를 데이터베이스에 저장하기 전에 검증하는 방법입니다. + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" 
+ +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +이 패턴은 다국어 문서를 다룰 때 흔히 발생하는 잡음 데이터가 파이프라인에 유입되는 것을 방지합니다. + +--- + +## 흔히 발생하는 함정 및 엣지 케이스 + +| 함정 | 발생 원인 | 회피 방법 | +|------|----------|-----------| +| **스크립트 이름 오타** (예: 뒤에 공백이 있는 `"Cyrillic "`) | 엔진이 문자열을 그대로 인식해 스크립트를 찾지 못함 | 호출 전에 `script.strip()`으로 공백 제거 | +| **빈 문자 리스트** | 일부 엔진은 스크립트를 노출하지만 아직 언어 모델을 로드하지 않음 | 라이브러리가 제공한다면 `engine.load_language_model(script)`를 호출하거나 모델 파일 존재 여부 확인 | +| **Unicode 정규화 문제** | “é” 같은 문자가 결합형(`e\u0301`) 또는 완성형(`\u00E9`)으로 나타날 수 있음 | 검증 전 `unicodedata.normalize('NFC', text)`로 정규화 | +| **대규모 스크립트 성능 저하** (예: Chinese) | 수천 개 문자를 조회하면 느려질 수 있음 | 첫 호출 결과를 캐시합니다. 대부분의 애플리케이션은 리스트를 한 번만 조회하면 됩니다 | + +--- + +## 전체 작동 예시 + +모든 내용을 하나로 합친 스크립트입니다. 복사‑붙여넣기 후 바로 실행할 수 있습니다: + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Example usage, mirroring the steps above +engine = ocr.OcrEngine() +latin_chars = safe_get_characters(engine, "Latin") +if latin_chars: + plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +## What Should You Learn Next? + +다음 튜토리얼들은 이 가이드에서 시연한 기술을 기반으로 하는 밀접한 주제를 다룹니다. 각 리소스는 단계별 설명과 완전한 코드 예제를 포함하고 있어, 추가 API 기능을 마스터하고 프로젝트에 다양한 구현 방식을 적용하는 데 도움이 됩니다.
+ +- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/) +- [OCR Post Processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [How to Set Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/korean/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/korean/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..c7dce9b9a --- /dev/null +++ b/ocr/korean/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,254 @@ +--- +category: general +date: 2026-06-06 +description: Aspose OCR와 Hugging Face 모델을 사용하여 이미지에서 OCR을 수행합니다. Hugging Face 모델을 + 다운로드하고, 청구서에서 텍스트를 추출하며, GPU 자원을 해제하는 방법을 배웁니다. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: ko +og_description: Aspose OCR와 Hugging Face 모델을 사용하여 이미지에서 OCR을 수행합니다. 이 튜토리얼에서는 모델을 + 다운로드하고, 청구서에서 텍스트를 추출하며, GPU 리소스를 해제하는 방법을 보여줍니다. +og_title: Aspose OCR 및 LLM으로 이미지 OCR 수행 – 완전 가이드 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Aspose OCR 및 LLM으로 이미지 OCR 수행 – 완전 가이드 +url: /ko/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# 이미지에 대한 OCR 수행 – Aspose OCR & LLM 완전 가이드 + +이미지 파일에 **OCR을 수행**하고 싶지만 “어디서 시작해야 할까?” 라는 질문에 막혔던 적 있나요? 당신만 그런 것이 아닙니다—많은 개발자들이 문서 자동화를 처음 다룰 때 이 장벽에 부딪힙니다. 좋은 소식은 Aspose OCR과 Hugging Face의 가벼운 LLM을 사용하면 청구서 스캔을 몇 줄의 Python 코드만으로 깨끗하고 검색 가능한 텍스트로 변환할 수 있다는 것입니다. + +이 튜토리얼에서는 **OCR을 위한 이미지 로드**, **Hugging Face 모델 다운로드**, **청구서에서 텍스트 추출** 그리고 마지막으로 **GPU 리소스 해제**까지 필요한 모든 과정을 단계별로 안내합니다. 끝까지 따라오면 어떤 프로젝트에도 바로 넣어 사용할 수 있는 독립 실행형 스크립트를 얻게 됩니다. + +--- + +## 배울 내용 + +- Aspose의 `OcrEngine`을 사용해 **이미지에 OCR 수행**하는 방법 +- **Hugging Face 모델** 파일을 자동으로 **다운로드**하는 정확한 단계 +- AI‑강화 후처리를 통해 **청구서 PDF 또는 PNG에서 텍스트 추출**하는 기술 +- 추론 후 **GPU 리소스 해제** 모범 사례 +- **OCR을 위한 이미지 로드**를 효율적으로 수행하고 흔히 발생하는 함정을 피하는 팁 + +외부 문서는 필요 없습니다—전체 코드, 설명, 예상 출력이 모두 여기 있습니다. + +--- + +## 사전 요구 사항 + +| Requirement | Reason | +|-------------|--------| +| Python 3.9+ | 최신 문법 및 타입 힌트 지원 | +| `asposeocr` package (`pip install asposeocr`) | 핵심 OCR 엔진 | +| Access to a GPU (optional but recommended) | LLM 후처리 속도 향상 | +| An invoice image (`sample_invoice.png`) | 실제 테스트 케이스 | + +위 항목 중 누락된 것이 있다면 지금 설치하세요; 스크립트가 **Hugging Face 모델을 자동으로 다운로드**하므로 파일을 직접 찾아볼 필요가 없습니다. + +--- + +## Step 1: Perform OCR on Image – Create the Engine + +먼저 Aspose의 OCR 엔진을 시작해야 합니다. 이는 이미지가 텍스트로 변환될 빈 캔버스를 여는 것과 같습니다. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Why this matters:** `OcrEngine`은 모든 저수준 이미지 전처리를 추상화하므로 고수준 워크플로에 집중할 수 있습니다. 
또한 `set_post_processor` 메서드를 제공해 나중에 LLM을 연결해 보다 스마트한 출력이 가능하도록 합니다. + +--- + +## Step 2: Load Image for OCR – Choose the Right File + +엔진이 준비되었으니 이제 **OCR을 위한 이미지 로드**가 필요합니다. Aspose는 PNG, JPG, TIFF 등 여러 포맷을 지원합니다. 경로는 절대 경로나 스크립트 위치를 기준으로 한 상대 경로여야 합니다. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Tip:** 이미지가 크다면 미리 리사이즈하여 메모리 부담을 줄이세요. OCR 엔진은 고해상도 스캔을 처리할 수 있지만, 청구서의 경우 300 DPI 정도가 보통 최적입니다. + +--- + +## Step 3: Perform Raw OCR and View the Extracted Text + +이미지를 로드했으니 이제 **이미지에 OCR 수행**을 실행하고 엔진이 반환하는 원시 텍스트를 확인합니다. 이 단계는 AI 마법을 적용하기 전 기본선을 제공합니다. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Expected output (truncated):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +원시 출력에는 줄 바꿈, 인식 오류 문자, 누락된 필드 등이 자주 포함됩니다—바로 다음 단계에서 언어 모델을 도입하는 이유입니다. + +--- + +## Step 4: Download Hugging Face Model – Configure the LLM Post‑Processor + +여기서 **Hugging Face 모델 다운로드** 단계가 빛을 발합니다. Aspose AI는 모델이 로컬에 없을 경우 Hugging Face 허브에서 자동으로 가져올 수 있습니다. 우리는 정확도와 메모리 사용량의 균형이 좋은 Qwen2.5‑3B‑Instruct‑GGUF 모델을 사용할 것입니다. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Why this works:** `allow_auto_download` 덕분에 `.gguf` 파일을 수동으로 다운로드할 필요가 없습니다. 양자화(`int8`)를 통해 모델 크기가 약 3 GB로 줄어 대부분의 소비자용 GPU에서도 실행 가능해집니다. 하드웨어에 맞게 `gpu_layers`를 조정하면 GPU에서 더 많은 레이어를 사용해 추론 속도를 높일 수 있습니다. 
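
위에서 언급한 “약 3 GB”라는 수치는 간단한 계산으로 가늠해 볼 수 있습니다. 아래는 파라미터 수와 가중치당 비트 수만으로 가중치 크기를 대략 추정하는 스케치입니다. 모델 이름의 “3B”를 근거로 파라미터 수를 약 30억 개로 가정했으며, `estimate_weight_size_gb`는 설명을 위해 만든 가상의 헬퍼로 Aspose API의 일부가 아닙니다.

```python
def estimate_weight_size_gb(num_params, bits_per_weight):
    """Rough size of the raw weights in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: roughly 3 billion parameters, inferred from the "3B" in the model name.
params = 3e9

# Compare a few common weight precisions.
for label, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{label}: ~{estimate_weight_size_gb(params, bits):.1f} GB")
```

이 가정에서는 int8 기준으로 약 3 GB가 나옵니다. 실제 GGUF 파일 크기와 실행 시 메모리 사용량은 양자화 방식, 컨텍스트 길이, 런타임 오버헤드에 따라 달라질 수 있으므로 대략적인 기준으로만 참고하세요.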
+ +--- + +## Step 5: Extract Text from Invoice Using AI‑Enhanced Post‑Processing + +이제 LLM을 OCR 엔진에 연결하고 **후처리기**를 실행해 원시 출력의 오류를 정정하고 청구서 필드를 깔끔하게 포맷합니다. + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Sample enhanced output:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **What happened?** LLM은 “Invoice #12345”를 “Invoice Number: 12345”로 바꾸고 날짜 형식을 정정했으며, 원시 엔진이 놓친 “Bill To” 필드까지 추론했습니다. 이것이 **청구서에서 텍스트 추출** 자동화의 핵심입니다. + +--- + +## Step 6: Free GPU Resources – Clean Up After Processing + +Flask API와 같이 장시간 실행되는 서비스에서 작업 후 **GPU 리소스 해제**를 하지 않으면 메모리 부족 오류가 발생할 수 있습니다. Aspose AI는 이를 위한 간단한 메서드를 제공합니다. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro tip:** OCR 호출을 `try/except` 블록으로 감쌀 경우 `finally:` 안에서 `free_resources()`를 호출하면 예외 발생 여부와 관계없이 정리 작업이 보장됩니다. + +--- + +## Step 7: Full Script – Put It All Together + +아래는 완전한 실행 가능한 스크립트입니다. 복사·붙여넣기하고 경로만 조정하면 바로 사용할 수 있습니다. 
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +스크립트를 실행하면 잡음이 많은 OCR 결과가 깔끔하고 구조화된 청구서 데이터로 변환되는 모습을 확인할 수 있습니다. 🎉 + +--- + +## Common Questions & Edge Cases + +| Question | Answer | +|----------|--------| +| **What if the model fails to download?** | Ensure your machine has internet access and that the `hugging_face_repo_id` is correct. You can also manually download the | + +--- + +## What Should You Learn Next? + +다음 튜토리얼들은 이 가이드에서 다룬 기술을 기반으로 하며, 단계별 설명과 완전한 코드 예제를 포함하고 있어 추가 API 기능을 마스터하고 프로젝트에 다양한 구현 방식을 적용하는 데 도움이 됩니다. 
+ +- [Aspose.OCR을 사용한 언어 선택 C# 이미지 텍스트 추출](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Aspose.OCR for .NET을 활용한 이미지 OCR 최적화](/ocr/english/net/ocr-optimization/) +- [URL에서 이미지 OCR 수행 – 이미지에서 텍스트 변환](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/korean/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/korean/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..c1fff42b5 --- /dev/null +++ b/ocr/korean/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,282 @@ +--- +category: general +date: 2026-06-06 +description: Aspose OCR로 이미지에서 텍스트를 인식 – OCR을 위해 이미지를 로드하는 방법과 Python에서 AI 후처리를 사용해 + 이미지에 OCR을 수행하는 방법을 배워보세요. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: ko +og_description: 이미지에서 텍스트를 빠르게 인식합니다. 이 가이드는 OCR을 위해 이미지를 로드하고, 이미지에 OCR을 수행하며, AI + 후처리로 결과를 향상시키는 방법을 보여줍니다. +og_title: Aspose OCR 및 AI를 사용하여 이미지에서 텍스트 인식 +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? 
+ - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Aspose OCR 및 AI를 사용하여 이미지에서 텍스트 인식 +url: /ko/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Aspose OCR & AI를 사용하여 이미지에서 텍스트 인식 + +이미지에서 텍스트를 인식해야 했지만 속도와 정확성을 모두 제공하는 라이브러리를 찾지 못한 적이 있나요? 당신만 그런 것이 아닙니다. 이 가이드에서는 **how to load image for OCR**, **perform OCR on image**를 보여주고 Aspose의 AI 포스트‑프로세서로 출력을 다듬는 완전한 엔드‑투‑엔드 예제를 단계별로 살펴봅니다. 마지막까지 따라 하면 PNG 파일을 깨끗하고 검색 가능한 텍스트로 변환하는 실행 가능한 스크립트를 얻을 수 있습니다. + +## 배울 내용 + +Aspose OCR 패키지 설치부터 실행 종료 시 리소스 해제까지 모든 과정을 다룹니다. 손글씨 인식 활성화가 왜 중요한지, 포스트‑프로세싱을 위한 Qwen 2.5 LLM 설정 방법, 최종 출력 형태 등을 확인할 수 있습니다. 외부 참고 자료는 필요 없으며, 복사‑붙여넣기만 하면 바로 실행됩니다. + +### 사전 요구 사항 + +- Python 3.8 이상 +- `asposeocr` 패키지 (`pip install asposeocr`) +- 인쇄되었거나 손글씨 텍스트가 포함된 이미지 파일 (예: `doc.png`) +- 옵션: 더 빠른 LLM 추론을 위한 GPU (스크립트는 CPU에서도 동작합니다) + +--- + +## 이미지에서 텍스트 인식 – 단계별 + +아래 각 코드 블록 아래에는 **왜** 해당 작업을 수행하는지에 대한 간단한 설명이 있습니다. **무엇을** 하는지뿐만 아니라 **왜** 하는지에 초점을 맞춥니다. + +### 단계 1: 필요한 모듈 설치 및 가져오기 + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Why?* `asposeocr`를 가져오면 `OcrEngine` 클래스를 사용할 수 있고, `ai` 서브모듈은 원시 OCR 결과를 크게 향상시키는 LLM 기반 포스트 프로세서를 제공합니다. + +### 단계 2: OCR 엔진 생성 및 손글씨 인식 활성화 + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. 
+ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +손글씨 인식을 활성화하면 엔진의 문자 집합이 확장되어, 인쇄된 텍스트와 손글씨가 혼합된 **perform OCR on image** 파일에서도 필기를 놓치지 않게 됩니다. + +### 단계 3: OCR을 위한 이미지 로드 + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +`load_image` 호출은 **load image for OCR**을 수행하는 순간이며, 경로가 잘못되면 엔진이 상세한 예외를 발생시켜 이후의 모호한 오류를 방지합니다. + +### 단계 4: 원시 OCR 실행 + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +이 단계에서 필터링되지 않은 텍스트, 신뢰도 점수, 레이아웃 메타데이터를 포함하는 `RecognitionResult` 객체를 얻습니다. 결과가 종종 잡음이 많아 AI 기반 정리가 필요합니다. + +### 단계 5: LLM 포스트 프로세싱을 위한 Aspose AI 모델 구성 + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +왜 이러한 설정을 할까요? +- **auto‑download**는 첫 실행 시 모델이 사용 가능하도록 보장합니다. +- **int8 quantization**은 큰 정확도 손실 없이 메모리 사용량을 크게 줄입니다. +- **gpu_layers**는 호환 가능한 GPU를 활용해 추론 속도를 높입니다. + +### 단계 6: AI 프로세서를 초기화하고 포스트 프로세서로 연결 + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +프로세서를 연결하면 `run_postprocessor`를 호출할 때마다 LLM이 맞춤법을 정리하고, 분리된 단어를 합치며, 누락된 구두점까지 추론합니다. + +### 단계 7: 포스트 프로세서를 실행해 OCR 출력 향상 + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text`는 일반적으로 원시 문자열보다 훨씬 읽기 쉽습니다—문맥을 이해하는 맞춤법 검사기라고 생각하면 됩니다. 
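
후처리가 실제로 무엇을 바꿨는지 확인하고 싶다면 원시 텍스트와 개선된 텍스트를 비교해 볼 수 있습니다. 아래는 표준 라이브러리 `difflib`만 사용하는 최소한의 스케치입니다. `show_changes`는 설명용으로 만든 가상의 헬퍼이고 샘플 문자열도 임의로 만든 것이며, 실제로는 `raw_result.text`와 `enhanced_result.text`를 전달한다고 가정합니다.

```python
import difflib


def show_changes(raw_text, enhanced_text):
    """Return unified-diff lines describing what post-processing changed."""
    return list(
        difflib.unified_diff(
            raw_text.splitlines(),
            enhanced_text.splitlines(),
            fromfile="raw",
            tofile="enhanced",
            lineterm="",
        )
    )


# Hypothetical sample strings; in the tutorial these would come from
# raw_result.text and enhanced_result.text.
raw_sample = "Inv0ice #12345\nTotal Amount $1,250.00"
enhanced_sample = "Invoice #12345\nTotal Amount: $1,250.00"

for line in show_changes(raw_sample, enhanced_sample):
    print(line)
```

출력에서 `-`로 시작하는 줄은 원시 OCR 결과에만 있는 줄, `+`로 시작하는 줄은 후처리 후에만 있는 줄을 뜻합니다. 후처리가 의도하지 않은 내용을 덧붙이지 않았는지 검토할 때 유용합니다.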
+ +### 단계 8: 리소스 해제 + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +정리 작업은 장기 실행 서비스에서 필수이며, 그렇지 않으면 GPU 메모리가 누수되어 결국 애플리케이션이 충돌합니다. + +--- + +## 오늘 바로 실행할 수 있는 전체 스크립트 + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**예상 출력** (간단한 청구서 이미지 예시): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +AI 레이어가 “Inv0ice”를 “Invoice”로 교정하고 누락된 구두점을 추가한 것을 확인하세요. + +--- + +## 자주 묻는 질문 (및 간단한 답변) + +- **GPU가 필요합니까?** 아니요. GPU가 없으면 스크립트가 CPU로 대체 실행되지만 추론 속도는 느려집니다. +- **다른 LLM을 사용할 수 있나요?** 물론입니다. `hugging_face_repo_id`를 변경하고 `gpu_layers`를 그에 맞게 조정하면 됩니다. +- **이미지가 너무 크면 어떻게 하나요?** 메모리 사용량을 적절히 유지하려면 먼저 이미지 크기를 줄이세요(예: Pillow 사용). +- **손글씨 인식이 항상 켜져 있나요?** 작업 내용에 따라 `enable_handwritten_recognition`을 켜거나 끌 수 있습니다.
 + +--- + +## 결론 + +이제 Aspose OCR로 **이미지에서 텍스트를 인식**하는 방법, **OCR을 위해 이미지를 로드**하는 방법, 그리고 AI 후처리를 더해 **이미지에 OCR을 수행**하는 방법을 알게 되었습니다. 위의 실행 예제는 영수증 스캔, 계약서 디지털화, 손글씨 양식 데이터 추출 등 다양한 Python 프로젝트에 OCR을 통합할 때 출발점으로 삼을 수 있습니다. + +다음 단계로 나아갈 준비가 되셨나요? Qwen 모델을 더 큰 모델로 교체해 보거나, 다양한 양자화 방식을 실험하거나, 여러 이미지를 배치로 처리하도록 스크립트를 확장해 보세요. + +즐거운 코딩 되시고, OCR 결과가 언제나 선명하기를 바랍니다! + +![향상된 OCR 출력을 보여주는 Python 콘솔 스크린샷](/images/ocr_output.png){alt="Aspose OCR을 사용하여 이미지에서 텍스트를 인식하는 방법을 보여주는 Python 콘솔 스크린샷"} + +## 다음에 배울 내용은? + +다음 튜토리얼들은 이 가이드에서 시연한 기술을 기반으로 하여 관련 주제를 깊이 있게 다룹니다. 각 리소스는 단계별 설명과 완전한 코드 예제를 제공하므로 추가 API 기능을 마스터하고 프로젝트에 다양한 구현 방식을 적용하는 데 도움이 됩니다.
+ +- [Aspose OCR을 사용해 이미지에서 텍스트 추출 – 단계별 가이드](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [이미지를 텍스트로 변환 – URL에서 이미지에 OCR 수행](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Aspose.OCR을 사용한 언어 선택이 가능한 C# 이미지 텍스트 추출](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/polish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/polish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..d17ab1579 --- /dev/null +++ b/ocr/polish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,216 @@ +--- +category: general +date: 2026-06-06 +description: Wyodrębnij tekst z obrazu odręcznego przy użyciu OCR w Pythonie. Dowiedz + się, jak szybko i niezawodnie przekształcić zdjęcie odręczne w tekst. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: pl +og_description: Wyodrębnij tekst z odręcznego obrazu przy użyciu Pythona. Ten przewodnik + pokazuje, jak przekształcić zdjęcie odręczne w tekst i odpowiada na pytanie, jak + rozpoznać odręczny tekst. +og_title: Wyodrębnij tekst z obrazu odręcznego – samouczek OCR w Pythonie +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Wyodrębnij tekst z obrazu odręcznego za pomocą Python OCR – przewodnik krok + po kroku +url: /pl/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Wyodrębnianie tekstu z obrazu odręcznego przy użyciu Python OCR – przewodnik krok po kroku + +Zastanawiałeś się kiedyś **jak rozpoznać odręczny tekst** na zdjęciu zrobionym telefonem? Nie jesteś sam. W wielu projektach, czy to przy digitalizacji notatek z wykładów, czy przy wyciąganiu danych z podpisanych formularzy, trzeba **wyodrębnić tekst z obrazu odręcznego** szybko i bez problemów. + +W tym tutorialu przeprowadzimy Cię przez kompletny, gotowy do uruchomienia przykład, który pokaże dokładnie, jak **przekształcić zdjęcie odręczne w tekst** przy użyciu popularnej biblioteki Python OCR. Bez niejasnych odniesień, tylko konkretny kod, wyjaśnienia i wskazówki, które możesz skopiować i wkleić już dziś. + +![wyodrębnij tekst z obrazu odręcznego](https://example.com/placeholder-handwritten.jpg "wyodrębnij tekst z obrazu odręcznego") + +## Co będzie potrzebne + +Zanim zaczniemy, upewnij się, że masz następujące elementy: + +| Wymaganie | Dlaczego jest ważne | +|-------------|----------------| +| Python 3.9 or newer | Nowoczesna składnia i wsparcie bibliotek | +| `pip` (Python package manager) | Do zainstalowania pakietu OCR | +| Wyraźny obraz odręcznych notatek (JPEG/PNG) | Dokładność OCR spada przy rozmytych obrazach | +| Podstawowa znajomość funkcji w Pythonie | Ułatwia późniejsze dostosowanie przykładu | + +Jeśli czegoś brakuje, pobierz najnowszą wersję Pythona i zainstaluj ją przed przejściem dalej. 
+ +## Zainstaluj bibliotekę Python OCR + +Fragment kodu, którego użyjemy, opiera się na pakiecie `ocr` (cienka nakładka na Tesseract z przydatnymi ustawieniami). Zainstaluj go jednym poleceniem: + +```bash +pip install ocr +``` + +> **Porada:** Po instalacji uruchom `tesseract --version` w terminalu, aby potwierdzić, że silnik jest dostępny. Jeśli nie masz Tesseract, postępuj zgodnie z oficjalnym przewodnikiem dla swojego systemu — większość menedżerów pakietów ma go (`apt-get install tesseract-ocr` na Ubuntu, `brew install tesseract` na macOS). + +## Krok 1: Utwórz instancję silnika OCR + +Utworzenie silnika to pierwszy klocek w murze **python ocr handwritten recognition**. Traktuj silnik jako mózg, który później odczyta odręczne zapiski. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Dlaczego to ważne: bez silnika nie możesz dostosować potoku rozpoznawania. Domyślne ustawienia są zoptymalizowane pod tekst drukowany, więc w następnym kroku będziemy je modyfikować. + +## Krok 2: Włącz rozpoznawanie odręczne + +Domyślnie silnik zakłada, że znaki są drukowane. Włączenie trybu odręcznego przełącza przełącznik, który mówi Tesseractowi, aby użył swojego modelu LSTM wytrenowanego na pismach kursywnych. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Co się stanie, jeśli to pominiesz?** OCR potraktuje kreski jako szum, co skutkuje zniekształconym wynikiem. Włączenie flagi jest kluczowe dla **jak rozpoznać odręczny tekst**. + +## Krok 3: Załaduj swoje zdjęcie odręczne + +Teraz wskazujemy silnikowi plik obrazu. Ścieżka może być bezwzględna lub względna; po prostu upewnij się, że plik istnieje. 
+ +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Szybka kontrola: otwórz obraz w przeglądarce systemowej. Jeśli tekst jest rozmyty, rozważ wstępne przetworzenie (zwiększenie kontrastu, obrót) przed przekazaniem go silnikowi — takie poprawki często podnoszą skuteczność **wyodrębnić tekst z obrazu odręcznego**. + +## Krok 4: Uruchom proces rozpoznawania + +Mając wszystko gotowe, w końcu prosimy silnik o wykonanie zadania. Metoda `recognize()` zwraca obiekt wyniku, który zawiera wyodrębniony ciąg znaków oraz oceny pewności. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Za kulisami silnik konwertuje bitmapę na serię wektorów cech, przetwarza je przez sieć LSTM i skleja znaki w spójną całość. To magia **python ocr handwritten recognition**. + +## Krok 5: Wyświetl wyodrębniony tekst + +Obiekt wyniku udostępnia atrybut `.text`, który zawiera zwykły ciąg Unicode. Wydrukuj go, zapisz do pliku lub przekaż dalej w innym potoku — decyzja należy do Ciebie. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Oczekiwany wynik + +Jeśli źródłowy obraz zawiera notatkę „Buy milk, eggs, and bread”, zobaczysz coś w rodzaju: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Zauważ, że wynik zachowuje interpunkcję i podziały linii (jeśli występują). Jeśli otrzymujesz bełkot, sprawdź jakość obrazu i flagę `enable_handwritten_recognition`. + +## Radzenie sobie z typowymi problemami + +| Problem | Objaw | Rozwiązanie | +|-------|---------|-----| +| Niskie oceny pewności | Wiele „?” lub bezsensownych znaków | Zwiększ DPI obrazu do ≥300, zastosuj binaryzację (`opencv`), lub przytnij do obszaru zainteresowania. 
| +| Mieszane języki | Wynik miesza angielski z innym skryptem | Ustaw `engine.ocr_settings.language = "eng"` (lub inny kod ISO) przed `recognize()`. | +| Duże pliki | Długi czas przetwarzania lub błąd pamięci | Zmniejsz rozmiar obrazu do rozsądnych wymiarów (np. maksymalna szerokość 1200 px) przed wczytaniem. | +| Brak Tesseract | `ImportError` lub `FileNotFoundError` | Zainstaluj Tesseract osobno i upewnij się, że znajduje się w zmiennej systemowej PATH. | + +Te korekty utrzymują Twój **przekształcić zdjęcie odręczne w tekst** stabilnym w różnych zestawach danych. + +## Pełny skrypt, który możesz uruchomić już dziś + +Poniżej znajduje się kompletny, samodzielny program, który łączy wszystkie elementy. Skopiuj go do pliku o nazwie `handwritten_ocr.py` i uruchom `python handwritten_ocr.py`. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Uruchom go, a zobaczysz tekst wypisany w konsoli — dokładnie taki rezultat, jakiego potrzebujesz, gdy chcesz **przekształcić zdjęcie odręczne w tekst**. + +## Co dalej + +Teraz, gdy opanowałeś podstawy **wyodrębnić tekst z obrazu odręcznego**, rozważ następujące kroki: + +- **Przetwarzanie wsadowe:** Przejdź przez folder ze zdjęciami i zapisz każdy wynik w pliku CSV. 
+- **Post‑przetwarzanie:** Użyj wyrażeń regularnych do czyszczenia typowych błędów OCR (np. „1” vs „l”). +- **Integracja:** Przekaż wyodrębnione ciągi do potoku przetwarzania języka naturalnego w celu analizy sentymentu lub ekstrakcji słów kluczowych. +- **Alternatywne biblioteki:** Jeśli potrzebujesz wyższej dokładności, zbadaj `easyocr` lub `pytesseract` z własnymi modelami LSTM — oba wspierają **python ocr handwritten recognition**. + +Pamiętaj, że jakość obrazu źródłowego często decyduje o sukcesie, więc poświęć kilka minut na wstępne przetworzenie. Trochę dodatkowego wysiłku teraz zaoszczędzi Ci wiele debugowania później. + +## Podsumowanie + +Przeszliśmy przez kompletny, end‑to‑end przykład, który pokazuje **jak rozpoznać odręczny tekst** i, co ważniejsze, **wyodrębnić tekst z obrazu odręcznego** przy użyciu Pythona. Instalując pakiet `ocr`, włączając flagę odręczną, ładując zdjęcie i wywołując `recognize()`, możesz **przekształcić zdjęcie odręczne w tekst** w zaledwie kilku linijkach kodu. + +Wypróbuj to na własnych notatkach, dostosuj kroki wstępnego przetwarzania i pozwól OCR wykonać ciężką pracę. Jeśli napotkasz problemy, wróć do tabeli „Radzenie sobie z typowymi problemami” lub wypróbuj alternatywne silniki OCR. Powodzenia w kodowaniu i niech Twoje odręczne dane staną się natychmiast przeszukiwalne! + +## Co powinieneś nauczyć się dalej? + +Poniższe tutoriale obejmują tematy ściśle powiązane, które rozwijają techniki przedstawione w tym przewodniku. Każdy zasób zawiera kompletne, działające przykłady kodu z wyjaśnieniami krok po kroku, aby pomóc Ci opanować dodatkowe funkcje API i odkrywać alternatywne podejścia implementacyjne w własnych projektach. 
+ +- [Wyodrębnianie tekstu z obrazu przy użyciu Aspose OCR – przewodnik krok po kroku](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Jak rozpoznać prostokąty stron dla rozpoznawania tekstu OCR w Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [Jak używać OCR – rozpoznawanie obrazu bez wykrywania obszaru tekstowego](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/polish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/polish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..9668e0dd8 --- /dev/null +++ b/ocr/polish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,257 @@ +--- +category: general +date: 2026-06-06 +description: 'Przewodnik po zestawie znaków OCR: dowiedz się, jak używać silnika OCR, + aby wyświetlić obsługiwane znaki dla skryptów łacińskiego i cyrylicy, z pełnymi + przykładami w Pythonie.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: pl +og_description: 'Przewodnik po zestawie znaków OCR: odkryj, jak używać silnika OCR + w Pythonie, aby wyświetlić obsługiwane znaki dla różnych skryptów, wraz z kodem + i wskazówkami.' +og_title: Zestaw znaków OCR – Pobierz i użyj silnika OCR +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: Zestaw znaków OCR – Pobieranie i użycie silnika OCR w Pythonie +url: /pl/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Zestaw znaków OCR – Pobieranie i używanie silnika OCR w Pythonie + +Chcesz poznać **ocr character set**, który obsługuje Twoja biblioteka? W tym samouczku pokażemy, jak **use OCR engine**, aby zapytać o obsługiwane znaki dla różnych skryptów i dlaczego ma to znaczenie w rzeczywistych projektach. + +Jeśli kiedykolwiek patrzyłeś na pusty ekran, zastanawiając się, dlaczego niektóre litery nie pojawiają się po przetworzeniu OCR, nie jesteś sam. Odpowiedź jest często ukryta w zestawie znaków, które silnik zna. Po przeczytaniu tego przewodnika będziesz w stanie: + +* Wymienić każdy znak, który OCR engine może rozpoznać dla danego skryptu. +* Obsłużyć przypadki, w których skrypt nie jest obsługiwany. +* Wprowadzić informacje o character‑set do downstream validation lub logiki UI. + +Bez magicznych sztuczek czarnej skrzynki — tylko czysty Python, kilka linijek kodu i odrobina wyjaśnienia. + +--- + +## Wymagania wstępne + +Zanim zaczniemy, upewnij się, że masz: + +* Python 3.8+ zainstalowany (kod używa f‑strings). +* Bibliotekę OCR, która udostępnia `ocr.OcrEngine` — w tym przykładzie zakładamy, że pakiet o nazwie `ocr` jest już zainstalowany za pomocą `pip install ocr-lib`. +* Podstawową znajomość systemu importu w Pythonie oraz funkcji drukowania. + +Jeśli brakuje biblioteki, uruchom: + +```bash +pip install ocr-lib +``` + +To wszystko — nie są potrzebne dodatkowe pliki binarne ani natywne zależności do podstawowych zapytań o character‑set, które wykonamy. 
+ +--- + +## Krok 1: Zainicjalizuj instancję silnika OCR + +Utworzenie obiektu silnika to pierwsza rzecz, którą robisz, gdy **use OCR engine** funkcjonalność. Pomyśl o tym jak o włączeniu skanera; bez niego nie możesz zadawać pytań. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +Klasa `OcrEngine` jest lekką nakładką na bazowy silnik rozpoznawania. Nie rozpoczyna ciężkiego przetwarzania, dopóki nie o coś poprosisz, więc koszt uruchomienia jest znikomy. + +> **Pro tip:** Jeśli Twoja aplikacja tworzy wiele instancji silnika, użyj jednej wielokrotnie. Oszczędza to pamięć i zmniejsza opóźnienie inicjalizacji. + +--- + +## Krok 2: Zapytaj o zestaw znaków OCR dla konkretnego skryptu + +Teraz, gdy mamy silnik, możemy zapytać go, które znaki zna dla konkretnego systemu pisma. Metoda `get_supported_characters(script_name)` zwraca Python `list` znaków Unicode. + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +Uruchomienie powyższego fragmentu wypisuje coś w stylu: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +Dokładny wynik zależy od wersji biblioteki OCR, ale zawsze otrzymasz płaską listę znaków. Jeśli potrzebujesz zarówno wielkich, jak i małych liter, po prostu poproś o skrypt `"Latin"` i zastosuj `.lower()` do każdego elementu, lub zapytaj o oddzielny skrypt `"Latin‑lower"`, jeśli biblioteka je rozróżnia. + +### Dlaczego to ma znaczenie? + +Gdy podasz obraz zawierający nietypowe diakrytyki (np. „ñ” lub „ø”), silnik OCR może cicho zamienić je na placeholder lub całkowicie je pominąć. 
Znajomość **ocr character set** z wyprzedzeniem pozwala wstępnie zwalidować dane, ostrzec użytkowników lub przełączyć się na inny silnik. + +--- + +## Krok 3: Pobierz znaki dla innego skryptu — przykład cyrylicy + +Ta sama metoda działa dla każdego skryptu, który silnik deklaruje jako obsługiwany. Zobaczmy, co się stanie z cyrylicą. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +Typowy wynik: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +Jeśli silnik **nie** obsługuje żądanego skryptu, zazwyczaj podnosi `ValueError` (lub zwraca pustą listę w zależności od implementacji). Zabezpieczmy się przed tym. + +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## Krok 4: Zwizualizuj zestaw znaków OCR (opcjonalnie) + +Czasami szybka wizualizacja pomaga, szczególnie gdy musisz pokazać interesariuszom, które znaki są objęte. Poniżej minimalny przykład używający `matplotlib` do renderowania siatki znaków. 
+ +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Tekst alternatywny obrazu:** ![Diagram zestawu znaków OCR](ocr_character_set.png){alt="Przegląd zestawu znaków OCR"} + +Diagram nie jest wymagany dla podstawowego rozwiązania, ale pokazuje, jak można **use OCR engine** metadane poza zwykłym wypisywaniem. + +--- + +## Krok 5: Zintegruj zestaw znaków w rzeczywistych przepływach pracy + +Teraz, gdy możemy pobrać **ocr character set**, zobaczmy praktyczny scenariusz: walidacja wyników OCR przed ich zapisaniem w bazie danych. + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +Ten wzorzec zapobiega przedostawaniu się nieprawidłowych danych do downstream pipelines, co jest częstym problemem przy pracy z dokumentami wielojęzycznymi. + +--- + +## Typowe pułapki i przypadki brzegowe + +| Pułapka | Dlaczego się dzieje | Jak uniknąć | +|---------|---------------------|-------------| +| **Błąd w nazwie skryptu** (np. 
`"Cyrillic "` z końcową spacją) | Silnik traktuje ciąg dosłownie i nie może znaleźć skryptu. | Usuń białe znaki: `script.strip()` przed wywołaniem `get_supported_characters`. | +| **Pusta lista znaków** | Niektóre silniki udostępniają skrypt, ale nie załadowały jeszcze modelu językowego. | Wywołaj `engine.load_language_model(script)`, jeśli biblioteka udostępnia taką metodę, lub upewnij się, że pliki modelu są dostępne. | +| **Problemy z normalizacją Unicode** | Znaki takie jak „é” mogą występować jako złożone (`\u00E9`) lub rozłożone (`e\u0301`). | Normalizuj ciągi za pomocą `unicodedata.normalize('NFC', text)` przed walidacją. | +| **Wydajność przy dużych skryptach** (np. chiński) | Pobieranie tysięcy znaków może być wolne. | Zbuforuj wynik po pierwszym wywołaniu; większość aplikacji potrzebuje listy tylko raz. | + +--- + +## Pełny działający przykład + +Łącząc wszystko razem, oto pojedynczy skrypt, który możesz skopiować i uruchomić: + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Example usage, mirroring Steps 1-4 above +if __name__ == "__main__": + engine = ocr.OcrEngine() + latin_chars = safe_get_characters(engine, "Latin") + plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +## Co warto nauczyć się dalej? + +Poniższe samouczki obejmują ściśle powiązane tematy, które rozwijają techniki przedstawione w tym przewodniku. 
Każde źródło zawiera kompletne działające przykłady kodu z krok po kroku wyjaśnieniami, aby pomóc Ci opanować dodatkowe funkcje API i zbadać alternatywne podejścia implementacyjne w własnych projektach. + +- [Samouczek Aspose OCR – Rozpoznawanie znaków optycznych](/ocr/english/) +- [Postprocessing OCR – Pobieranie wyborów znaków](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [Jak ustawić wartość progową w rozpoznawaniu obrazu OCR](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/polish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/polish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..46f5a3db8 --- /dev/null +++ b/ocr/polish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,256 @@ +--- +category: general +date: 2026-06-06 +description: Wykonaj OCR na obrazie przy użyciu Aspose OCR i modelu Hugging Face. + Dowiedz się, jak pobrać model Hugging Face, wyodrębnić tekst z faktury i zwolnić + zasoby GPU. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: pl +og_description: Wykonaj OCR na obrazie przy użyciu Aspose OCR i modelu Hugging Face. + Ten samouczek pokazuje, jak pobrać model, wyodrębnić tekst z faktury i zwolnić zasoby + GPU. +og_title: Wykonaj OCR na obrazie przy użyciu Aspose OCR i LLM – Kompletny przewodnik +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. 
Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. + headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Wykonaj OCR na obrazie przy użyciu Aspose OCR i LLM – Kompletny przewodnik +url: /pl/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Wykonaj OCR na obrazie przy użyciu Aspose OCR & LLM – Kompletny przewodnik + +Czy kiedykolwiek chciałeś **wykonać OCR na plikach obrazu**, ale nie wiedziałeś, od czego zacząć? Nie jesteś sam — wielu programistów napotyka tę barierę, gdy po raz pierwszy zajmuje się automatyzacją dokumentów. Dobra wiadomość: dzięki Aspose OCR i lekkim modelom LLM z Hugging Face możesz zamienić surowe skany faktury w czysty, przeszukiwalny tekst w zaledwie kilku linijkach Pythona. + +W tym tutorialu przejdziemy krok po kroku przez wszystko, co potrzebne: od **ładowania obrazu do OCR**, po **pobranie modelu Hugging Face**, po **wyodrębnienie tekstu z faktury**, a na końcu **zwolnienie zasobów GPU**, aby Twoja aplikacja była lekka. Po zakończeniu będziesz mieć samodzielny skrypt, który możesz wkleić do dowolnego projektu. + +--- + +## Czego się nauczysz + +- Jak **wykonać OCR na obrazie** przy użyciu `OcrEngine` Aspose. +- Dokładne kroki **pobierania plików modelu Hugging Face** automatycznie. +- Techniki **wyodrębniania tekstu z faktur** w formacie PDF lub PNG z AI‑wzbogaconym przetwarzaniem post‑procesowym. +- Najlepsze praktyki **zwalniania zasobów GPU** po inferencji. +- Wskazówki, jak **ładować obraz do OCR** efektywnie i unikać typowych pułapek. + +Nie potrzebujesz żadnej zewnętrznej dokumentacji — wszystko, co potrzebne, znajduje się tutaj, wraz z pełnym kodem, wyjaśnieniami i oczekiwanym wynikiem. 
+ +--- + +## Wymagania wstępne + +Zanim zaczniemy, upewnij się, że masz: + +| Wymaganie | Powód | +|-----------|-------| +| Python 3.9+ | Nowoczesna składnia i podpowiedzi typów | +| pakiet `asposeocr` (`pip install asposeocr`) | Główny silnik OCR | +| Dostęp do GPU (opcjonalnie, ale zalecane) | Przyspiesza post‑procesor LLM | +| Obraz faktury (`sample_invoice.png`) | Przykład z rzeczywistego świata | + +Jeśli czegoś brakuje, zainstaluj to teraz; skrypt **automatycznie pobierze model Hugging Face**, więc nie musisz samodzielnie szukać plików. + +--- + +## Krok 1: Wykonaj OCR na obrazie – Utwórz silnik + +Pierwszą rzeczą, którą musisz zrobić, jest uruchomienie silnika OCR Aspose. Pomyśl o tym jak o otwarciu pustego płótna, na którym obraz zostanie później przetworzony na tekst. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Dlaczego to ważne:** `OcrEngine` abstrahuje wszystkie niskopoziomowe przetwarzanie obrazu, więc możesz skupić się na wyższym poziomie przepływu pracy. Udostępnia także metodę `set_post_processor`, która później pozwoli podłączyć LLM do inteligentniejszego wyniku. + +--- + +## Krok 2: Ładuj obraz do OCR – Wybierz właściwy plik + +Teraz, gdy silnik istnieje, musimy **załadować obraz do OCR**. Aspose obsługuje PNG, JPG, TIFF i kilka innych formatów. Upewnij się, że ścieżka jest absolutna lub względna względem lokalizacji Twojego skryptu. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Wskazówka:** Jeśli Twój obraz jest duży, rozważ jego zmniejszenie przed przetworzeniem, aby zmniejszyć obciążenie pamięci. Silnik OCR radzi sobie ze skanami wysokiej rozdzielczości, ale obraz 300 DPI zazwyczaj jest optymalny dla faktur. 
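The tip above about shrinking oversized scans before recognition can be sketched in a few lines. This is a minimal sketch and not part of Aspose OCR: it assumes the third-party Pillow package is installed, and the helper names `scaled_size` and `downscale_image`, as well as the `max_side` default of 2500 pixels, are illustrative choices rather than anything the OCR library prescribes.

```python
def scaled_size(width, height, max_side=2500):
    """Return (width, height) shrunk so the longer side is at most max_side.

    The aspect ratio is preserved; images that are already small enough
    are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def downscale_image(src_path, dst_path, max_side=2500):
    """Save a downscaled copy of src_path at dst_path and return its size."""
    # Pillow is a third-party dependency (pip install Pillow). It is imported
    # lazily so the size arithmetic above can be reused without it.
    from PIL import Image

    with Image.open(src_path) as img:
        new_size = scaled_size(img.width, img.height, max_side)
        if new_size != img.size:
            # LANCZOS is a high-quality resampling filter, which helps keep
            # small glyphs legible after shrinking.
            img = img.resize(new_size, Image.LANCZOS)
        img.save(dst_path)
    return new_size
```

Under these assumptions, a call such as `downscale_image("YOUR_DIRECTORY/sample_invoice.png", "YOUR_DIRECTORY/sample_invoice_small.png")` would write the smaller file, and that path could then be passed to `engine.load_image` in place of the original. The right `max_side` depends on the scan; shrinking too aggressively may hurt recognition of fine print.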
+ +--- + +## Krok 3: Wykonaj surowe OCR i zobacz wyodrębniony tekst + +Po załadowaniu obrazu możemy w końcu **wykonać OCR na obrazie** i zobaczyć, co surowy silnik zwróci. Ten krok daje nam punkt odniesienia przed dodaniem magii AI. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Oczekiwany wynik (skrócony):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +Surowy wynik często zawiera podziały linii, błędnie rozpoznane znaki lub brakujące pola — dokładnie dlatego wprowadzimy model językowy w następnym kroku. + +--- + +## Krok 4: Pobierz model Hugging Face – Skonfiguruj post‑procesor LLM + +Tutaj wchodzi w grę krok **pobierania modelu Hugging Face**. Aspose AI może automatycznie pobrać model z hubu Hugging Face, jeśli nie znajduje się jeszcze na dysku. Użyjemy modelu Qwen2.5‑3B‑Instruct‑GGUF, który łączy dokładność z niewielkim zużyciem pamięci. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Dlaczego to działa:** `allow_auto_download` oszczędza Ci ręcznego pobierania pliku `.gguf`. Kwantyzacja (`int8`) zmniejsza rozmiar modelu do około 3 GB, co czyni go wykonalnym na większości konsumenckich GPU. Dostosuj `gpu_layers` w zależności od sprzętu — więcej warstw na GPU = szybsza inferencja. + +--- + +## Krok 5: Wyodrębnij tekst z faktury przy użyciu AI‑wzbogaconego post‑procesingu + +Teraz podłączamy LLM do silnika OCR i uruchamiamy **post‑processor**, który oczyszcza surowy wynik, koryguje błędy OCR i ładnie formatuje pola faktury. 
+ +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Przykładowy ulepszony wynik:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **Co się stało?** LLM rozpoznał, że „Invoice #12345” powinno być „Invoice Number: 12345”, poprawił format daty i nawet wywnioskował pole „Bill To”, którego surowy silnik nie wykrył. To sedno automatyzacji **wyodrębniania tekstu z faktury**. + +--- + +## Krok 6: Zwalnianie zasobów GPU – Sprzątanie po przetwarzaniu + +Jeśli uruchamiasz to w długotrwałej usłudze (np. API Flask), musisz **zwolnić zasoby GPU** po każdej inferencji, aby uniknąć awarii z powodu braku pamięci. Aspose AI udostępnia prostą metodę do tego celu. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro tip:** Wywołaj `free_resources()` wewnątrz bloku `finally:` jeśli otaczasz wywołanie OCR w `try/except`. To zapewni sprzątanie nawet w przypadku wystąpienia wyjątku. + +--- + +## Krok 7: Pełny skrypt – Połącz wszystko razem + +Poniżej znajduje się kompletny, gotowy do uruchomienia skrypt. Skopiuj‑wklej, dostosuj ścieżki i gotowe. 
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Uruchom skrypt i obserwuj transformację od szumu OCR do czystych, ustrukturyzowanych danych faktury. 🎉 + +--- + +## Często zadawane pytania i przypadki brzegowe + +| Pytanie | Odpowiedź | +|----------|-----------| +| **Co zrobić, gdy model nie uda się pobrać?** | Upewnij się, że Twój komputer ma dostęp do internetu i że `hugging_face_repo_id` jest poprawny. Możesz także ręcznie pobrać the | + +## Co powinieneś nauczyć się dalej? + +Poniższe tutoriale obejmują tematy ściśle powiązane, które rozwijają techniki przedstawione w tym przewodniku. Każdy zasób zawiera kompletny, działający kod oraz wyczerpujące wyjaśnienia krok po kroku, aby pomóc Ci opanować dodatkowe funkcje API i eksplorować alternatywne podejścia w własnych projektach. 
+ +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/polish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/polish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..decd833ce --- /dev/null +++ b/ocr/polish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,284 @@ +--- +category: general +date: 2026-06-06 +description: rozpoznawanie tekstu z obrazu przy użyciu Aspose OCR – dowiedz się, jak + wczytać obraz do OCR i wykonać OCR na obrazie z wykorzystaniem post‑procesingu AI + w Pythonie. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: pl +og_description: Szybko rozpoznawaj tekst z obrazu. Ten przewodnik pokazuje, jak wczytać + obraz do OCR, wykonać OCR na obrazie i zwiększyć wyniki dzięki post‑procesowaniu + AI. +og_title: Rozpoznaj tekst z obrazu przy użyciu Aspose OCR i AI +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? 
+ - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Rozpoznaj tekst z obrazu przy użyciu Aspose OCR i AI +url: /pl/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# rozpoznawanie tekstu z obrazu przy użyciu Aspose OCR i AI + +Czy kiedykolwiek potrzebowałeś rozpoznać tekst z obrazu, ale nie byłeś pewien, która biblioteka zapewni zarówno szybkość, jak i dokładność? Nie jesteś sam. W tym przewodniku przeprowadzimy Cię przez kompletny, end‑to‑end przykład, który pokazuje **jak załadować obraz do OCR**, **wykonać OCR na obrazie**, a następnie wypolerować wynik przy użyciu AI post‑processora Aspose. Po zakończeniu będziesz mieć gotowy do uruchomienia skrypt, który zamienia PNG w czysty, przeszukiwalny tekst. + +## Czego się nauczysz + +Omówimy wszystko, od instalacji pakietu Aspose OCR po zwalnianie zasobów na końcu działania. Zobaczysz, dlaczego włączenie rozpoznawania ręcznego tekstu ma znaczenie, jak skonfigurować model Qwen 2.5 LLM do post‑processingu oraz jak wygląda ostateczny wynik. Nie są potrzebne żadne zewnętrzne odwołania — po prostu skopiuj, wklej i uruchom. + +### Wymagania wstępne + +- Python 3.8 lub nowszy +- pakiet `asposeocr` (`pip install asposeocr`) +- Plik obrazu (np. 
`doc.png`) zawierający drukowany lub odręczny tekst +- Opcjonalnie: GPU dla szybszej inferencji LLM (skrypt działa również na CPU) + +--- + +## Rozpoznawanie tekstu z obrazu – Krok po kroku + +Pod każdym blokiem kodu znajdziesz krótkie wyjaśnienie **dlaczego** wykonujemy daną akcję, a nie tylko **co** robi dana linia. + +### Krok 1: Zainstaluj i zaimportuj wymagane moduły + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Dlaczego?* Importowanie `asposeocr` daje nam klasę `OcrEngine`, natomiast podmoduł `ai` zapewnia post‑processor oparty na LLM, który może znacząco poprawić surowy wynik OCR. + +### Krok 2: Utwórz silnik OCR i włącz rozpoznawanie tekstu odręcznego + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Włączenie rozpoznawania odręcznego rozszerza zestaw znaków silnika, więc nie utracisz odręcznych notatek, gdy **wykonujesz OCR na obrazie** zawierającym mieszany tekst drukowany i odręczny. + +### Krok 3: Załaduj obraz do OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +Wywołanie `load_image` to moment, w którym **ładujesz obraz do OCR**; jeśli ścieżka jest nieprawidłowa, silnik zgłosi informacyjny wyjątek, chroniąc Cię przed niejasnymi błędami w dalszej części. + +### Krok 4: Uruchom surowe przetwarzanie OCR + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +Na tym etapie otrzymujesz obiekt `RecognitionResult`, który zawiera nieprzefiltrowany tekst, wyniki pewności i metadane układu. Często jest on zaszumiony, stąd potrzeba czyszczenia napędzanego AI. 
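
The note in Step 3 about invalid paths can be made explicit by checking the file before it ever reaches the engine. The following is a minimal sketch that uses only the standard library. The helper name `check_image_path` and the set of accepted extensions are assumptions made for illustration; they are not part of the `asposeocr` package.

```python
from pathlib import Path

# Assumption: extensions chosen for illustration, not taken from the library docs.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}

def check_image_path(path_str):
    """Return the resolved Path if the file exists and looks like an image."""
    path = Path(path_str)
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported image extension: {path.suffix!r}")
    return path.resolve()
```

In the tutorial's flow, the returned path could be converted with `str(...)` and passed to `ocr_engine.load_image`, so a typo in the path fails early with a clear message.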
+ +### Krok 5: Skonfiguruj model Aspose AI do post‑processingu LLM + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Po co się martwić o te ustawienia? +- **auto‑download** zapewnia, że model jest dostępny przy pierwszym uruchomieniu. +- **int8 quantization** znacznie zmniejsza zużycie pamięci bez dużego spadku dokładności. +- **gpu_layers** pozwala wykorzystać kompatybilne GPU do szybszej inferencji. + +### Krok 6: Zainicjalizuj procesor AI i podłącz go jako post‑processor + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +Dołączenie procesora oznacza, że za każdym razem, gdy wywołasz `run_postprocessor`, LLM oczyści pisownię, połączy przerwane słowa i nawet odgadnie brakujące znaki interpunkcyjne. + +### Krok 7: Uruchom post‑processor, aby ulepszyć wynik OCR + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` jest zazwyczaj znacznie czytelniejszy niż surowy ciąg znaków — myśl o nim jak o korektorze, który dodatkowo rozumie kontekst. + +### Krok 8: Zwolnij zasoby + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Czyszczenie jest kluczowe w długotrwale działających usługach; w przeciwnym razie wyciekasz pamięć GPU i ostatecznie spowodujesz awarię aplikacji. 
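
The cleanup in Step 8 only runs if every earlier step succeeds. One way to guarantee it, sketched here under stated assumptions, is to register the cleanup calls up front with `contextlib.ExitStack`. `FakeEngine` and `FakeAI` are stand-ins so the snippet runs without the library; only the method names `dispose` and `free_resources` are taken from the tutorial code.

```python
from contextlib import ExitStack

class FakeEngine:
    """Stand-in for the tutorial's OCR engine (hypothetical, for illustration)."""
    def __init__(self, log):
        self.log = log
    def recognize(self):
        raise RuntimeError("simulated OCR failure")
    def dispose(self):
        self.log.append("engine disposed")

class FakeAI:
    """Stand-in for the tutorial's AI post-processor (hypothetical)."""
    def __init__(self, log):
        self.log = log
    def free_resources(self):
        self.log.append("ai resources freed")

def run_pipeline(engine, ai):
    """Run recognition; the cleanup callbacks fire even if recognition raises."""
    with ExitStack() as stack:
        # Callbacks run in reverse registration order when the block exits.
        stack.callback(engine.dispose)
        stack.callback(ai.free_resources)
        return engine.recognize()

log = []
try:
    run_pipeline(FakeEngine(log), FakeAI(log))
except RuntimeError as exc:
    log.append(f"error: {exc}")
print(log)
```

Because the callbacks unwind in reverse order, the AI resources are released before the engine is disposed, which matches the order used in Step 8.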
+ +--- + +## Pełny skrypt, który możesz uruchomić już dziś + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Oczekiwany wynik** (przykład dla prostego obrazu faktury): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Zauważ, jak warstwa AI poprawiła „Inv0ice” → „Invoice” i dodała brakującą interpunkcję. + +--- + +## Najczęściej zadawane pytania (i szybkie odpowiedzi) + +- **Czy potrzebuję GPU?** Nie. Skrypt przełącza się na CPU, ale inferencja będzie wolniejsza. +- **Czy mogę użyć innego LLM?** Oczywiście — wystarczy zmienić `hugging_face_repo_id` i odpowiednio dostosować `gpu_layers`. +- **Co jeśli mój obraz jest ogromny?** Najpierw zmień jego rozmiar (np. przy użyciu Pillow), aby utrzymać zużycie pamięci w rozsądnych granicach. 
+- **Czy rozpoznawanie odręczne jest zawsze włączone?** Możesz przełączać `enable_handwritten_recognition` w zależności od obciążenia. + +--- + +## Zakończenie + +Teraz wiesz, jak **rozpoznawać tekst z obrazu** przy użyciu Aspose OCR, jak **ładować obraz do OCR**, oraz jak **wykonywać OCR na obrazie** z AI‑wzbogaconym post‑processingiem. Pełny, uruchamialny przykład powyżej daje solidną podstawę do integracji OCR w dowolnym projekcie Python — niezależnie od tego, czy skanujesz paragony, digitalizujesz umowy, czy wyciągasz dane z formularzy odręcznych. + +Gotowy na kolejny krok? Spróbuj zamienić model Qwen na większy, eksperymentuj z różnymi schematami kwantyzacji lub połącz wiele obrazów w przetwarzanie wsadowe. Możliwości są nieograniczone, a kod, który właśnie stworzyłeś, poradzi sobie z nimi płynnie. + +Szczęśliwego kodowania i niech wyniki OCR zawsze będą krystalicznie czyste! + +![Zrzut ekranu konsoli Pythona pokazujący ulepszony wynik OCR](/images/ocr_output.png){alt="Zrzut ekranu pokazujący, jak rozpoznawać tekst z obrazu przy użyciu Aspose OCR"} + +## Co warto nauczyć się dalej? + +Poniższe samouczki obejmują ściśle powiązane tematy, które rozwijają techniki przedstawione w tym przewodniku. Każde źródło zawiera kompletne działające przykłady kodu z wyjaśnieniami krok po kroku, aby pomóc Ci opanować dodatkowe funkcje API i odkrywać alternatywne podejścia implementacyjne w własnych projektach. 
+ +- [Wyodrębnij tekst z obrazu przy użyciu Aspose OCR – przewodnik krok po kroku](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Konwertuj obraz na tekst – wykonaj OCR na obrazie z URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Wyodrębnij tekst z obrazu w C# z wyborem języka przy użyciu Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/portuguese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/portuguese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..e5b1115ec --- /dev/null +++ b/ocr/portuguese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,214 @@ +--- +category: general +date: 2026-06-06 +description: Extraia texto de imagem manuscrita usando OCR em Python. Aprenda como + converter foto manuscrita em texto de forma rápida e confiável. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: pt +og_description: Extraia texto de imagem manuscrita com Python. Este guia mostra como + converter foto manuscrita em texto e responde como reconhecer texto manuscrito. +og_title: Extrair Texto de Imagem Manuscrita – Tutorial de OCR em Python +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Extrair Texto de Imagem Escrita à Mão com OCR em Python – Guia Passo a Passo +url: /pt/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Extrair Texto de Imagem Escrita à Mão com Python OCR – Guia Passo a Passo + +Já se perguntou **como reconhecer texto manuscrito** em uma foto que você tirou com o celular? Você não está sozinho. Em muitos projetos, seja digitalizando anotações de aula ou extraindo dados de formulários assinados, você precisa **extrair texto de imagem escrita à mão** de forma rápida e sem dores de cabeça. + +Neste tutorial vamos percorrer um exemplo completo, pronto‑para‑executar, que mostra exatamente como **converter foto manuscrita em texto** usando uma biblioteca Python OCR popular. Sem referências vagas, apenas código concreto, explicações e dicas que você pode copiar‑colar hoje. + +![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image") + +## O Que Você Precisa + +Antes de mergulharmos, certifique‑se de que tem o seguinte: + +| Requisito | Por que importa | +|-----------|-----------------| +| Python 3.9 ou mais recente | Sintaxe moderna e suporte a bibliotecas | +| `pip` (gerenciador de pacotes Python) | Para instalar o pacote OCR | +| Uma imagem nítida de notas manuscritas (JPEG/PNG) | A precisão do OCR cai em imagens borradas | +| Familiaridade básica com funções Python | Ajuda a adaptar o exemplo depois | + +Se o Python estiver faltando, baixe a versão mais recente e instale‑a, sem complicações. 
+ +## Instale a Biblioteca Python OCR + +O trecho de código que usaremos depende do pacote `ocr` (um wrapper leve em torno do Tesseract que adiciona configurações úteis). Instale‑o com um único comando: + +```bash +pip install ocr +``` + +> **Dica profissional:** Após a instalação, execute `tesseract --version` no seu terminal para confirmar que o motor subjacente está presente. Se não tiver o Tesseract, siga o guia oficial para o seu SO—a maioria dos gerenciadores de pacotes o possui (`apt-get install tesseract-ocr` no Ubuntu, `brew install tesseract` no macOS). + +## Etapa 1: Crie uma Instância do Motor OCR + +Criar o motor é o primeiro tijolo na parede de **python ocr handwritten recognition**. Pense no motor como o cérebro que mais tarde lerá os rabiscos. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Por que isso importa: sem um motor você não pode ajustar o pipeline de reconhecimento. As configurações padrão são afinadas para texto impresso, então precisaremos ajustá‑las na próxima etapa. + +## Etapa 2: Habilite o Reconhecimento de Manuscrito + +Por padrão o motor assume caracteres impressos. Habilitar o modo manuscrito aciona uma chave que indica ao Tesseract para usar seu modelo LSTM treinado em traços cursivos. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **E se você pular isso?** O OCR tratará os traços como ruído, resultando em saída confusa. Ativar a flag é o núcleo de **como reconhecer texto manuscrito**. + +## Etapa 3: Carregue Sua Foto Manuscrita + +Agora apontamos o motor para o arquivo de imagem. O caminho pode ser absoluto ou relativo; apenas certifique‑se de que o arquivo exista. 
+ +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Uma verificação rápida: abra a imagem no visualizador do seu SO. Se o texto estiver borrado, considere pré‑processamento (aumento de contraste, rotação) antes de enviá‑la ao motor—esses ajustes costumam elevar a taxa de sucesso de **extrair texto de imagem escrita à mão**. + +## Etapa 4: Execute o Processo de Reconhecimento + +Com tudo configurado, finalmente pedimos ao motor que faça seu trabalho. O método `recognize()` retorna um objeto de resultado que contém a string extraída e as pontuações de confiança. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Nos bastidores, o motor converte o bitmap em uma série de vetores de características, os passa pela rede LSTM e junta os caracteres. Essa é a mágica de **python ocr handwritten recognition**. + +## Etapa 5: Exiba o Texto Extraído + +O objeto de resultado expõe um atributo `.text` que contém a string Unicode simples. Imprima‑o, grave‑o em um arquivo ou alimente‑o em outro pipeline—você decide. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Saída Esperada + +Se a imagem fonte contiver a anotação “Buy milk, eggs, and bread”, você verá algo como: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Observe como a saída preserva pontuação e quebras de linha (se houver). Se aparecer lixo, verifique a qualidade da imagem e a flag `enable_handwritten_recognition`. + +## Lidando com Problemas Comuns + +| Problema | Sintoma | Solução | +|----------|---------|---------| +| Pontuações de confiança baixas | Muitos “?” ou caracteres sem sentido | Aumente o DPI da imagem para ≥300, aplique binarização (`opencv`) ou recorte a região de interesse. 
| +| Idiomas mistos | Saída mistura inglês com outro script | Defina `engine.ocr_settings.language = "eng"` (ou outro código ISO) antes de `recognize()`. | +| Arquivos grandes | Tempo de processamento longo ou erro de memória | Redimensione a imagem para uma dimensão razoável (ex.: largura máxima 1200 px) antes de carregar. | +| Tesseract ausente | `ImportError` ou `FileNotFoundError` | Instale o Tesseract separadamente e garanta que esteja no PATH do sistema. | + +Esses ajustes mantêm seu fluxo **convert handwritten photo to text** robusto em diferentes conjuntos de dados. + +## Script Completo que Você Pode Executar Hoje + +A seguir está o programa completo e autocontido que reúne todas as peças. Copie‑o para um arquivo chamado `handwritten_ocr.py` e execute `python handwritten_ocr.py`. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Execute‑o, e você verá o texto impresso no console—exatamente o resultado que você precisa quando quiser **converter foto manuscrita em texto**. + +## Avançando + +Agora que você dominou o básico de **extrair texto de imagem escrita à mão**, considere os próximos passos: + +- **Processamento em lote:** Percorra uma pasta de imagens e armazene cada resultado em um arquivo CSV. 
+- **Pós‑processamento:** Use expressões regulares para limpar erros comuns de OCR (ex.: “1” vs “l”). +- **Integração:** Alimente as strings extraídas em um pipeline de Processamento de Linguagem Natural para análise de sentimento ou extração de palavras‑chave. +- **Bibliotecas alternativas:** Se precisar de maior precisão, explore `easyocr` ou `pytesseract` com modelos LSTM customizados—ambas suportam **python ocr handwritten recognition** também. + +Lembre‑se, a qualidade da imagem fonte costuma determinar o sucesso, então dedique alguns minutos ao pré‑processamento. Um pequeno esforço extra agora economiza muito debugging depois. + +## Conclusão + +Percorremos um exemplo completo, de ponta a ponta, que demonstra **como reconhecer texto manuscrito** e, mais importante, **como extrair texto de imagem escrita à mão** usando Python. Instalando o pacote `ocr`, ativando a flag de manuscrito, carregando sua foto e chamando `recognize()`, você pode **converter foto manuscrita em texto** em apenas algumas linhas. + +Teste com suas próprias anotações, ajuste as etapas de pré‑processamento e deixe o OCR fazer o trabalho pesado. Se encontrar algum obstáculo, revise a tabela “Lidando com Problemas Comuns” ou experimente back‑ends OCR alternativos. Boa codificação, e que seus dados manuscritos se tornem instantaneamente pesquisáveis! + +## O Que Você Deve Aprender a Seguir? + +Os tutoriais a seguir abordam tópicos intimamente relacionados que ampliam as técnicas demonstradas neste guia. Cada recurso inclui exemplos de código completos e funcionais com explicações passo a passo para ajudá‑lo a dominar recursos adicionais da API e explorar abordagens de implementação alternativas em seus próprios projetos. 
+ +- [Extrair Texto de Imagem com Aspose OCR – Guia Passo a Passo](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Como Reconhecer Retângulos de Página para Reconhecimento de Texto OCR no Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [Como Usar OCR - Reconhecer Imagem sem Detecção de Área de Texto](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/portuguese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/portuguese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..afd676e35 --- /dev/null +++ b/ocr/portuguese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,232 @@ +--- +category: general +date: 2026-06-06 +description: 'Guia de conjunto de caracteres OCR: aprenda como usar o motor OCR para + listar os caracteres suportados para scripts latinos e cirílicos com exemplos completos + em Python.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: pt +og_description: 'Guia de conjunto de caracteres OCR: descubra como usar o motor OCR + em Python para listar os caracteres suportados para vários scripts, completo com + código e dicas.' +og_title: Conjunto de Caracteres OCR – Recuperar e Utilizar o Motor OCR +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: Conjunto de Caracteres OCR – Recuperar e Usar o Motor OCR em Python +url: /pt/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Conjunto de Caracteres OCR – Recuperar e Usar o Motor OCR em Python + +Quer saber o **conjunto de caracteres OCR** que sua biblioteca suporta? Neste tutorial vamos mostrar como **usar o motor OCR** para consultar os caracteres suportados para diferentes scripts e por que isso importa em projetos reais. + +Se você já ficou encarando uma tela em branco se perguntando por que algumas letras nunca aparecem após o processamento OCR, não está sozinho. A resposta costuma estar escondida no conjunto de caracteres que o motor conhece. Ao final deste guia você será capaz de: + +* Listar cada caractere que um motor OCR pode reconhecer para um script específico. +* Tratar casos em que um script não é suportado. +* Inserir as informações do conjunto de caracteres em validações posteriores ou lógica de UI. + +Sem truques mágicos de caixa‑preta — apenas Python puro, algumas linhas de código e um pouco de explicação. + +--- + +## Pré‑requisitos + +Antes de começarmos, verifique se você tem: + +* Python 3.8+ instalado (o código usa f‑strings). +* A biblioteca OCR que fornece `ocr.OcrEngine` — para este exemplo vamos assumir que um pacote chamado `ocr` já está instalado via `pip install ocr-lib`. +* Familiaridade básica com o sistema de importação do Python e impressão. + +Se estiver faltando a biblioteca, execute: + +```bash +pip install ocr-lib +``` + +É só isso — nenhum binário extra ou dependência nativa é necessário para as consultas básicas ao conjunto de caracteres que faremos. 
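
Before moving on, the prerequisites above can be verified from Python itself. This is a small sketch using only the standard library; the helper names are illustrative, and the module name `ocr` is simply the one this tutorial imports.

```python
import sys
import importlib.util

MIN_VERSION = (3, 8)  # minimum version stated in the prerequisites

def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the interpreter's major.minor is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum

def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None

if not python_is_supported():
    print("Python 3.8 or newer is required.")
if not module_available("ocr"):
    print("Module 'ocr' not found; install the OCR library first.")
```

Running this once at the top of a script gives a readable message instead of an `ImportError` deep inside later steps.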
+ +--- + +## Etapa 1: Inicializar a Instância do Motor OCR + +Criar um objeto de motor é a primeira coisa que você faz ao **usar o motor OCR**. Pense nisso como ligar um scanner; sem ele, você não pode fazer perguntas. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +A classe `OcrEngine` é um wrapper leve ao redor do motor de reconhecimento subjacente. Ela não inicia processamento pesado até que você solicite algo, portanto o custo de inicialização é insignificante. + +> **Dica profissional:** Se sua aplicação cria muitas instâncias do motor, reutilize uma única. Isso economiza memória e reduz a latência de inicialização. + +--- + +## Etapa 2: Consultar o Conjunto de Caracteres OCR para um Script Específico + +Agora que temos um motor, podemos perguntar a ele quais caracteres conhece para um determinado sistema de escrita. O método `get_supported_characters(script_name)` devolve uma `list` Python de caracteres Unicode. + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +Executar o trecho acima imprime algo como: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +A saída exata depende da versão da biblioteca OCR, mas você sempre receberá uma lista plana de caracteres. Se precisar tanto de maiúsculas quanto de minúsculas, basta solicitar o script `"Latin"` e então aplicar `.lower()` a cada elemento, ou consultar um script separado `"Latin‑lower"` se a biblioteca os distinguir. + +### Por que isso importa? + +Quando você fornece uma imagem contendo diacríticos incomuns (por exemplo, “ñ” ou “ø”), o motor OCR pode substituir silenciosamente esses caracteres por um placeholder ou descartá‑los totalmente. 
Conhecer o **conjunto de caracteres OCR** com antecedência permite pré‑validar a entrada, avisar os usuários ou recorrer a outro motor. + +--- + +## Etapa 3: Recuperar Caracteres para Outro Script – Exemplo Cirílico + +O mesmo método funciona para qualquer script que o motor afirma suportar. Vamos ver o que acontece com o cirílico. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +Saída típica: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +Se o motor **não** suportar o script solicitado, normalmente ele lança um `ValueError` (ou devolve uma lista vazia, dependendo da implementação). Vamos proteger contra isso. + +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## Etapa 4: Visualizar o Conjunto de Caracteres OCR (Opcional) + +Às vezes uma visualização rápida ajuda, especialmente quando você precisa mostrar aos interessados quais caracteres estão cobertos. Abaixo está um exemplo mínimo usando `matplotlib` para renderizar uma grade de caracteres. 
+ +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Texto alternativo da imagem:** ![OCR character set diagram](ocr_character_set.png){alt="OCR character set overview"} + +O diagrama não é obrigatório para a solução principal, mas demonstra como você pode **usar o motor OCR** para obter metadados além da simples impressão. + +--- + +## Etapa 5: Integrar o Conjunto de Caracteres em Fluxos de Trabalho Reais + +Agora que podemos recuperar o **conjunto de caracteres OCR**, vejamos um cenário prático: validar a saída OCR antes de armazená‑la em um banco de dados. + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +Esse padrão impede que dados lixo escapem para pipelines posteriores, um ponto de dor comum ao lidar com documentos multilíngues. 
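
One subtlety the validation above does not cover is Unicode normalization: a character such as "é" can arrive precomposed (`\u00e9`) or as "e" followed by a combining accent (`e\u0301`), and a plain membership test treats the two differently. The sketch below normalizes both sides to NFC before comparing. The function name `validate_normalized` and the sample allowed set are assumptions for illustration only.

```python
import unicodedata

def validate_normalized(text, allowed_chars):
    """Normalize to NFC, then return the characters not found in allowed_chars."""
    normalized = unicodedata.normalize("NFC", text)
    allowed = {unicodedata.normalize("NFC", c) for c in allowed_chars}
    return [c for c in normalized if c not in allowed]

# Sample allowed set: lowercase ASCII letters, a space, and precomposed 'é'.
allowed = set("abcdefghijklmnopqrstuvwxyz \u00e9")

decomposed = "cafe\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
print(validate_normalized(decomposed, allowed))    # prints []
print(validate_normalized("caf\u00e9!", allowed))  # prints ['!']
```

Returning the list of offending characters, rather than a bare boolean, makes it easier to log exactly what the OCR engine produced that falls outside the supported set.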
+ +--- + +## Armadilhas Comuns e Casos de Borda + +| Armadilha | Por que acontece | Como evitar | +|-----------|------------------|--------------| +| **Erro de digitação no nome do script** (ex.: `"Cyrillic "` com espaço ao final) | O motor trata a string literalmente e não encontra o script. | Remova espaços em branco: `script.strip()` antes de chamar `get_supported_characters`. | +| **Lista de caracteres vazia** | Alguns motores expõem um script, mas ainda não carregaram o modelo de idioma. | Chame `engine.load_language_model(script)` se a biblioteca oferecer esse método, ou garanta que os arquivos de modelo estejam presentes. | +| **Problemas de normalização Unicode** | Caracteres como “é” podem aparecer como compostos (`\u00E9`) ou decompostos (`e\u0301`). | Normalize strings com `unicodedata.normalize('NFC', text)` antes da validação. | +| **Desempenho em scripts extensos** (ex.: Chinês) | Recuperar milhares de caracteres pode ser lento. | Cache o resultado após a primeira chamada; a maioria das aplicações só precisa da lista uma vez. | + +--- + +## Exemplo Completo Funcional + +Juntando tudo, aqui está um script único que você pode copiar‑colar e executar: + + + +## O que Você Deve Aprender a Seguir? + +Os tutoriais a seguir abordam tópicos intimamente relacionados que ampliam as técnicas demonstradas neste guia. Cada recurso inclui exemplos de código completos com explicações passo a passo para ajudá‑lo a dominar recursos adicionais da API e explorar abordagens alternativas de implementação em seus próprios projetos. 
+ +- [Tutorial Aspose OCR – Reconhecimento Óptico de Caracteres](/ocr/english/) +- [Pós‑processamento OCR – Obter Opções de Caracteres](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [Como Definir Valor de Limiar no Reconhecimento de Imagem OCR](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/portuguese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/portuguese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..149cf72b0 --- /dev/null +++ b/ocr/portuguese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,256 @@ +--- +category: general +date: 2026-06-06 +description: Realize OCR em imagem usando Aspose OCR e um modelo Hugging Face. Aprenda + como baixar o modelo Hugging Face, extrair texto de fatura e liberar recursos da + GPU. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: pt +og_description: Realize OCR em imagem usando Aspose OCR e um modelo do Hugging Face. + Este tutorial mostra como baixar o modelo, extrair texto de uma fatura e liberar + recursos da GPU. +og_title: Realize OCR em Imagem com Aspose OCR e LLM – Guia Completo +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Realize OCR em Imagem com Aspose OCR e LLM – Guia Completo +url: /pt/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Realizar OCR em Imagem com Aspose OCR & LLM – Guia Completo + +Já quis **realizar OCR em imagem** arquivos mas se sentiu preso na pergunta “por onde começar?”? Você não está sozinho—muitos desenvolvedores encontram essa barreira ao primeiro lidar com automação de documentos. A boa notícia é que com Aspose OCR e um LLM leve da Hugging Face, você pode transformar uma digitalização bruta de uma fatura em texto limpo e pesquisável em apenas algumas linhas de Python. + +Neste tutorial, percorreremos tudo o que você precisa: desde **carregar imagem para OCR**, até **baixar o modelo Hugging Face**, **extrair texto de fatura** e, finalmente, **liberar recursos da GPU** para que seu aplicativo permaneça leve. Ao final, você terá um script autônomo que pode ser inserido em qualquer projeto. + +--- + +## O que você aprenderá + +- Como **realizar OCR em imagem** usando o `OcrEngine` da Aspose. +- Os passos exatos para **baixar o modelo Hugging Face** automaticamente. +- Técnicas para **extrair texto de fatura** de PDFs ou PNGs com pós‑processamento aprimorado por IA. +- Melhores práticas para **liberar recursos da GPU** após a inferência. +- Dicas para **carregar imagem para OCR** de forma eficiente e evitar armadilhas comuns. + +Nenhuma documentação externa é necessária—tudo o que você precisa está aqui, com código completo, explicações e saída esperada. 
+
+---
+
+## Prerequisites
+
+Before we dive in, make sure you have:
+
+| Requirement | Reason |
+|-------------|--------|
+| Python 3.9+ | Modern syntax and type hints |
+| `asposeocr` package (`pip install asposeocr`) | Core OCR engine |
+| Access to a GPU (optional but recommended) | Speeds up the LLM post-processor |
+| An invoice image (`sample_invoice.png`) | Real-world test case |
+
+If any of these are missing, install them now; the script will also **download the Hugging Face model** automatically, so you won't need to hunt for the files manually.
+
+---
+
+## Step 1: Perform OCR on Image – Create the Engine
+
+The first thing to do is start the Aspose OCR engine. Think of it as opening a blank canvas on which the image will later be turned into text.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# Step 1: Initialize the OCR engine
+engine = ocr.OcrEngine()
+```
+
+> **Why this matters:** `OcrEngine` abstracts away the low-level image preprocessing so you can focus on the high-level workflow. It also exposes the `set_post_processor` method, which will later let us plug in an LLM for smarter output.
+
+---
+
+## Step 2: Load Image for OCR – Choose the Right File
+
+Now that the engine exists, we need to **load the image for OCR**. Aspose supports PNG, JPG, TIFF, and a few other formats. Make sure the path is either absolute or relative to the working directory the script runs from.
+
+```python
+# Step 2: Load the image you want to process
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+```
+
+> **Tip:** If your image is large, consider resizing it first to reduce memory pressure. The OCR engine can handle high-resolution scans, but a 300 DPI image is usually the sweet spot for invoices.
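The resizing advice in the tip can be sketched as a small helper that computes target dimensions while preserving the aspect ratio. This is a minimal illustration: `fit_within` and the 2000-pixel limit are hypothetical choices, not values taken from the Aspose documentation.

```python
def fit_within(width, height, max_side=2000):
    """Return (new_width, new_height) so the longer side is at most max_side.

    Images already within the limit are returned unchanged; larger ones are
    scaled down proportionally. max_side=2000 is an arbitrary example value.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Never let a dimension collapse to zero on extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))

print(fit_within(4000, 3000))  # a large scan gets scaled down
print(fit_within(800, 600))    # a small image is left as-is

# With Pillow installed (an assumption), the result could feed Image.resize:
#   from PIL import Image
#   img = Image.open("sample_invoice.png")
#   img.resize(fit_within(*img.size)).save("sample_invoice_small.png")
```

Keeping the size calculation separate from the imaging library makes it easy to test and to reuse with whichever image toolkit the project already depends on.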
+
+---
+
+## Step 3: Run Raw OCR and View the Extracted Text
+
+With the image loaded, we can finally **perform OCR on the image** and see what the raw engine produces. This step gives us a baseline before we add the AI layer.
+
+```python
+# Step 3: Run the built-in OCR and capture raw results
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+```
+
+**Expected output (truncated):**
+
+```
+Raw text:
+Invoice #12345
+Date: 2024-04-01
+Total: $1,250.00
+...
+```
+
+Raw output often contains stray line breaks, misrecognized characters, or missing fields, which is exactly why we bring in a language model next.
+
+---
+
+## Step 4: Download Hugging Face Model – Configure the LLM Post-Processor
+
+This is where the **download Hugging Face model** step comes in. Aspose AI can automatically pull a model from the Hugging Face hub if it is not already on disk. We'll use the Qwen2.5-3B-Instruct-GGUF model, which balances accuracy and memory usage.
+
+```python
+# Step 4: Set up the AI model configuration
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true", # automatically fetch model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically
+    gpu_layers=20, # first 20 layers run on GPU, rest on CPU
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+```
+
+> **Why this works:** `allow_auto_download` saves you from manually downloading the `.gguf` file. Quantization (`int8`) shrinks the model to roughly 3 GB, making it workable on most consumer GPUs. Adjust `gpu_layers` to match your hardware: more layers on the GPU means faster inference.
+
+---
+
+## Step 5: Extract Text from Invoice Using AI-Enhanced Post-Processing
+
+Now we connect the LLM to the OCR engine and run a **post-processor** that cleans up the raw output, fixes OCR errors, and formats the invoice fields neatly.
+
+```python
+# Step 5: Initialize the AI engine and bind it to the OCR engine
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions
+engine.set_post_processor(ai_engine)
+
+# Run the AI-enhanced post-processor
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+```
+
+**Sample enhanced output:**
+
+```
+Enhanced text:
+Invoice Number: 12345
+Date: 2024-04-01
+Bill To: Acme Corp.
+Amount Due: $1,250.00
+Status: Paid
+```
+
+> **What happened?** The LLM recognized that “Invoice #12345” should be “Invoice Number: 12345”, fixed the date format, and even inferred the “Bill To” field that the raw engine missed. This is the core of **extract text from invoice** automation.
+
+---
+
+## Step 6: Free GPU Resources – Clean Up After Processing
+
+If you run this in a long-lived service (for example, a Flask API), you should **free GPU resources** after each inference to avoid out-of-memory crashes. Aspose AI provides a simple method for this.
+
+```python
+# Step 6: Release GPU memory and dispose of the engine
+ai_engine.free_resources() # frees GPU layers and model tensors
+engine.dispose() # cleans up internal buffers
+```
+
+> **Pro tip:** Call `free_resources()` inside a `finally:` block if you wrap the OCR call in a try/except. This guarantees cleanup even when an exception is raised.
+
+---
+
+## Step 7: Full Script – Put It All Together
+
+Below is the complete, ready-to-run script. Copy and paste it, adjust the paths, and you are ready to go.
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Execute o script e observe a transformação de OCR ruidoso para dados de fatura limpos e estruturados. 🎉 + +--- + +## Perguntas Frequentes e Casos de Borda + +| Question | Answer | +|----------|--------| +| **E se o modelo falhar ao baixar?** | Certifique‑se de que sua máquina tem acesso à internet e que o `hugging_face_repo_id` está correto. Você também pode baixar manualmente o | + +## O que Você Deve Aprender a Seguir? + +Os tutoriais a seguir cobrem tópicos intimamente relacionados que se baseiam nas técnicas demonstradas neste guia. Cada recurso inclui exemplos de código completos e funcionais com explicações passo a passo para ajudá‑lo a dominar recursos adicionais da API e explorar abordagens de implementação alternativas em seus próprios projetos. 
+ +- [Extrair texto de imagem C# com seleção de idioma usando Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extrair Texto de Imagem – Otimização de OCR com Aspose.OCR para .NET](/ocr/english/net/ocr-optimization/) +- [Converter Imagem em Texto – Realizar OCR em Imagem a partir de URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/portuguese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/portuguese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..e88f22155 --- /dev/null +++ b/ocr/portuguese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,283 @@ +--- +category: general +date: 2026-06-06 +description: reconhecer texto a partir de imagem com Aspose OCR – aprenda como carregar + a imagem para OCR e realizar OCR na imagem usando pós‑processamento de IA em Python. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: pt +og_description: reconheça texto de imagem rapidamente. Este guia mostra como carregar + a imagem para OCR, realizar OCR na imagem e melhorar os resultados com pós‑processamento + de IA. +og_title: reconhecer texto de imagem usando Aspose OCR e IA +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU 
memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: reconhecer texto de imagem usando Aspose OCR e IA +url: /pt/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# reconhecer texto a partir de imagem usando Aspose OCR & IA + +Já precisou reconhecer texto a partir de uma imagem, mas não sabia qual biblioteca ofereceria velocidade e precisão? Você não está sozinho. Neste guia, percorreremos um exemplo completo, de ponta a ponta, que mostra **como carregar a imagem para OCR**, **executar OCR na imagem** e, em seguida, aprimorar o resultado com o pós‑processador de IA da Aspose. Ao final, você terá um script pronto para rodar que transforma um PNG em texto limpo e pesquisável. + +## O que você vai aprender + +Cobriremos tudo, desde a instalação do pacote Aspose OCR até a liberação de recursos ao final da execução. Você verá por que habilitar o reconhecimento de texto manuscrito é importante, como configurar um LLM Qwen 2.5 para pós‑processamento e como fica a saída final. Nenhuma referência externa é necessária — basta copiar, colar e executar. 
+
+### Prerequisites
+
+- Python 3.8 or newer
+- The `asposeocr` package (`pip install asposeocr`)
+- An image file (for example, `doc.png`) that contains printed or handwritten text
+- Optional: a GPU for faster LLM inference (the script also works on CPU)
+
+---
+
+## Recognize Text from Image – Step by Step
+
+Below each code block you'll find a short explanation of **why** the step is there, not just **what** the line does.
+
+### Step 1: Install and import the required modules
+
+```python
+# Install the library (run once in your environment)
+# pip install asposeocr
+
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+```
+
+*Why?* Importing `asposeocr` gives us the `OcrEngine` class, while the `ai` submodule provides the LLM-based post-processor that substantially improves the raw OCR output.
+
+### Step 2: Create the OCR engine and enable handwritten text recognition
+
+```python
+# Initialise the OCR engine
+ocr_engine = ocr.OcrEngine()
+
+# Turn on handwritten-text support – useful for notes, signatures, etc.
+ocr_engine.ocr_settings.enable_handwritten_recognition = True
+```
+
+Enabling handwritten recognition expands the engine's character set, so you won't lose scribbles when you **perform OCR on an image** that mixes printed and cursive text.
+
+### Step 3: Load the image for OCR
+
+```python
+# Point the engine at the file you want to process
+ocr_engine.load_image("YOUR_DIRECTORY/doc.png")
+```
+
+The `load_image` call is where you **load the image for OCR**; if the path is wrong, the engine raises an informative exception, sparing you cryptic errors later on.
+
+### Step 4: Run the raw OCR pass
+
+```python
+# Execute the recognition pipeline and capture the raw result
+raw_result = ocr_engine.recognize()
+```
+
+At this stage you get a `RecognitionResult` object containing the unfiltered text, confidence scores, and layout metadata. It is usually noisy, hence the need for AI-guided cleanup.
+
+### Step 5: Configure the Aspose AI model for LLM post-processing
+
+```python
+ai_config = AsposeAIModelConfig(
+    allow_auto_download="true", # Pull the model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8", # Reduce RAM usage
+    gpu_layers=20, # Use 20 GPU layers if a GPU is present
+    directory_model_path="YOUR_DIRECTORY/models/ocr_ai"
+)
+```
+
+Why bother with these settings?
+- **auto-download** ensures the model is available on the first run.
+- **int8 quantization** sharply reduces memory demand without a large loss of accuracy.
+- **gpu_layers** lets you take advantage of a compatible GPU for faster inference.
+
+### Step 6: Initialise the AI processor and attach it as a post-processor
+
+```python
+# Instantiate the AI processor with the configuration above
+ai_processor = AsposeAI()
+ai_processor.initialize(ai_config)
+
+# Hook the AI processor into the OCR engine
+ocr_engine.set_post_processor(ai_processor, None)
+```
+
+Attaching the processor means that every time you call `run_postprocessor`, the LLM cleans up spelling, merges broken words, and even infers missing punctuation.
+
+### Step 7: Run the post-processor to enhance the OCR output
+
+```python
+# Apply AI-driven post-processing
+enhanced_result = ocr_engine.run_postprocessor(raw_result)
+
+# Print the polished text
+print(enhanced_result.text)
+```
+
+The `enhanced_result.text` is usually far more readable than the raw string; think of it as a spell checker that also understands context.
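Because the post-processor rewrites text, it can help to see exactly what changed between the raw and enhanced strings. The sketch below uses the standard-library `difflib` module on plain strings; `changed_lines` is a hypothetical helper, and the sample texts stand in for `raw_result.text` and `enhanced_result.text`.

```python
import difflib

def changed_lines(raw_text, enhanced_text):
    """Return unified-diff lines describing how post-processing altered the text."""
    diff = difflib.unified_diff(
        raw_text.splitlines(),
        enhanced_text.splitlines(),
        fromfile="raw",
        tofile="enhanced",
        lineterm="",  # inputs have no trailing newlines, so add none to headers
    )
    return list(diff)

# Stand-in strings; real values would come from the OCR result objects.
raw = "Inv0ice #12345\nDate: 2024-04-01"
enhanced = "Invoice #12345\nDate: 2024-04-01"
for line in changed_lines(raw, enhanced):
    print(line)
```

Reviewing the diff is a cheap way to spot cases where the language model altered a value, such as an amount or an invoice number, rather than merely fixing a recognition error.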
+ +### Passo 8: Liberar recursos + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Limpar recursos é crucial em serviços de longa duração; caso contrário, você vazará memória da GPU e eventualmente travará sua aplicação. + +--- + +## Script completo que você pode executar hoje + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Saída esperada** (exemplo para uma imagem de nota fiscal simples): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Observe como a camada de IA corrigiu “Inv0ice” → “Invoice” e adicionou pontuação que faltava. + +--- + +## Perguntas frequentes (e respostas rápidas) + +- **Preciso de uma GPU?** Não. 
O script recai para CPU, mas a inferência será mais lenta. +- **Posso usar outro LLM?** Absolutamente — basta mudar `hugging_face_repo_id` e ajustar `gpu_layers` conforme necessário. +- **E se minha imagem for enorme?** Redimensione‑a primeiro (por exemplo, usando Pillow) para manter o uso de memória razoável. +- **O reconhecimento de manuscrito está sempre ativo?** Você pode alternar `enable_handwritten_recognition` conforme sua carga de trabalho. + +--- + +## Conclusão + +Agora você sabe como **reconhecer texto a partir de imagem** usando Aspose OCR, como **carregar a imagem para OCR** e como **executar OCR na imagem** com pós‑processamento aprimorado por IA. O exemplo completo e executável acima fornece uma base sólida para integrar OCR em qualquer projeto Python — seja escaneando recibos, digitalizando contratos ou extraindo dados de formulários manuscritos. + +Pronto para o próximo passo? Experimente trocar o modelo Qwen por um maior, teste diferentes esquemas de quantização ou encadeie várias imagens para processamento em lote. As possibilidades são infinitas, e o código que você acabou de criar lidará com elas de forma elegante. + +Boa codificação, e que seus resultados de OCR sejam sempre cristalinos! + +![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="Screenshot showing how to recognize text from image using Aspose OCR"} + +## O que você deve aprender a seguir? + +Os tutoriais a seguir abordam tópicos intimamente relacionados que ampliam as técnicas demonstradas neste guia. Cada recurso inclui exemplos de código completos e funcionais com explicações passo a passo para ajudá‑lo a dominar recursos adicionais da API e explorar abordagens alternativas de implementação em seus próprios projetos. 
+ +- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/russian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/russian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..a7c882044 --- /dev/null +++ b/ocr/russian/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,217 @@ +--- +category: general +date: 2026-06-06 +description: Извлеките текст из изображения с рукописным текстом с помощью Python + OCR. Узнайте, как быстро и надёжно преобразовать фотографию рукописного текста в + обычный текст. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: ru +og_description: Извлеките текст из изображения с рукописным текстом с помощью Python. + Это руководство показывает, как преобразовать фотографию рукописного текста в обычный + текст, и отвечает на вопрос, как распознавать рукописный текст. +og_title: Извлечение текста из рукописного изображения – учебник по OCR на Python +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Извлечение текста из рукописного изображения с помощью Python OCR – пошаговое + руководство +url: /ru/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Извлечение текста из рукописного изображения с помощью Python OCR – Пошаговое руководство + +Когда‑нибудь задавались вопросом **как распознавать рукописный текст** на фотографии, сделанной на вашем телефоне? Вы не одиноки. Во многих проектах — будь то оцифровка конспектов лекций или извлечение данных из подписанных форм — вам нужно **извлекать текст из рукописного изображения** быстро и без проблем. + +В этом руководстве мы пройдём через полностью готовый к запуску пример, который покажет, как **преобразовать рукописную фотографию в текст** с помощью популярной библиотеки Python OCR. Никаких расплывчатых ссылок, только конкретный код, объяснения и советы, которые вы можете скопировать‑вставить уже сегодня. + +![извлечение текста из рукописного изображения](https://example.com/placeholder-handwritten.jpg "извлечение текста из рукописного изображения") + +## Что вам понадобится + +Прежде чем погрузиться в детали, убедитесь, что у вас есть следующее: + +| Требование | Почему это важно | +|-------------|-------------------| +| Python 3.9 или новее | Современный синтаксис и поддержка библиотек | +| `pip` (менеджер пакетов Python) | Для установки OCR‑пакета | +| Чёткое изображение рукописных заметок (JPEG/PNG) | Точность OCR падает на размытых изображениях | +| Базовое знакомство с функциями Python | Поможет адаптировать пример позже | + +Если чего‑то не хватает, скачайте последнюю версию Python с и установите её — без проблем. 
+
+## Install the Python OCR Library
+
+The snippet we'll use relies on the `ocr` package (a thin wrapper around Tesseract that adds convenient settings). Install it with a single command:
+
+```bash
+pip install ocr
+```
+
+> **Pro tip:** After installing, run `tesseract --version` in a terminal to confirm that the engine is installed. If you don't have Tesseract, follow the official guide for your OS; most package managers carry it (`apt-get install tesseract-ocr` on Ubuntu, `brew install tesseract` on macOS).
+
+## Step 1: Create an OCR Engine Instance
+
+Creating the engine is the first building block of **python ocr handwritten recognition**. Think of the engine as the brain that will later read your scribbles.
+
+```python
+# Step 1: Import the library and instantiate the OCR engine
+import ocr
+
+# The OcrEngine class gives us access to all the low-level settings.
+engine = ocr.OcrEngine()
+```
+
+Why this matters: without an engine you cannot configure the recognition pipeline. The default parameters are tuned for printed text, so we'll change them in the next step.
+
+## Step 2: Enable Handwritten Recognition Mode
+
+By default the engine expects printed characters. Enabling handwritten mode flips a flag that makes Tesseract use its LSTM model trained on cursive strokes.
+
+```python
+# Step 2: Turn on handwritten mode
+engine.ocr_settings.enable_handwritten_recognition = True
+```
+
+> **What happens if you skip this step?** The OCR treats the strokes as noise and produces meaningless output. Turning the flag on is the key to **how to recognize handwritten text**.
+
+## Step 3: Load Your Handwritten Photo
+
+Now we point the engine at the image file. The path can be absolute or relative; just make sure the file exists.
+ +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Быстрая проверка: откройте изображение в просмотрщике ОС. Если текст размытый, подумайте о предобработке (повышение контрастности, вращение) перед передачей в движок — такие правки часто повышают успешность **extract text from handwritten image**. + +## Шаг 4: Запустите процесс распознавания + +Когда всё настроено, мы наконец просим движок выполнить свою работу. Метод `recognize()` возвращает объект результата, содержащий извлечённую строку и оценки уверенности. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +За кулисами движок преобразует растровое изображение в набор векторных признаков, пропускает их через LSTM‑сеть и склеивает символы вместе. Это и есть магия **python ocr handwritten recognition**. + +## Шаг 5: Выведите извлечённый текст + +Объект результата имеет атрибут `.text`, содержащий обычную Unicode‑строку. Выведите её, запишите в файл или передайте в другой конвейер — как вам удобно. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Ожидаемый вывод + +Если на исходном изображении записано «Buy milk, eggs, and bread», вы увидите примерно следующее: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Обратите внимание, что вывод сохраняет пунктуацию и переносы строк (если они есть). Если получаете набор бессмыслицы, проверьте качество изображения и флаг `enable_handwritten_recognition`. + +## Работа с распространёнными проблемами + +| Проблема | Симптом | Решение | +|----------|----------|----------| +| Низкие оценки уверенности | Много «?», бессмысленные символы | Увеличьте DPI изображения до ≥300, примените бинаризацию (`opencv`) или обрежьте область интереса. 
| +| Смешанные языки | Вывод сочетает английский с другим скриптом | Установите `engine.ocr_settings.language = "eng"` (или другой ISO‑код) перед `recognize()`. | +| Большие файлы | Длительное время обработки или ошибка памяти | Измените размер изображения до разумных параметров (например, максимальная ширина 1200 px) перед загрузкой. | +| Отсутствует Tesseract | `ImportError` или `FileNotFoundError` | Установите Tesseract отдельно и убедитесь, что он находится в системном PATH. | + +Эти корректировки делают ваш рабочий процесс **convert handwritten photo to text** надёжным для самых разных наборов данных. + +## Полный скрипт, который можно запустить сегодня + +Ниже представлен полностью автономный скрипт, объединяющий все части. Скопируйте его в файл `handwritten_ocr.py` и выполните `python handwritten_ocr.py`. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Запустите, и вы увидите текст, выведенный в консоль — именно тот результат, который нужен, когда вы хотите **convert handwritten photo to text**. 
+
+## Next steps
+
+Now that you have the basics of **extract text from handwritten image** down, you can move on to the following tasks:
+
+- **Batch processing:** loop over a folder of images and save each result to a CSV file.
+- **Post-processing:** use regular expressions to fix typical OCR errors (for example, "1" vs "l").
+- **Integration:** feed the extracted strings into a natural language processing pipeline for sentiment analysis or keyword extraction.
+- **Alternative libraries:** if you need higher accuracy, look into `easyocr` or `pytesseract` with custom LSTM models; both support **python ocr handwritten recognition**.
+
+Remember that the quality of the source image often determines success, so spend a few minutes on preprocessing. A little effort now will save you a lot of debugging later.
+
+## Conclusion
+
+We walked through a complete end-to-end example showing **how to recognize handwritten text** and, more importantly, how to **extract text from handwritten image** with Python. By installing the `ocr` package, enabling the handwritten recognition flag, loading an image, and calling `recognize()`, you can **convert handwritten photo to text** in just a few lines of code.
+
+Try it on your own notes, experiment with preprocessing, and let OCR do the heavy lifting. If you run into trouble, refer to the "Handling common problems" table or experiment with other OCR backends. Happy coding, and may your handwritten data become instantly searchable!
+
+## What to learn next?
+
+The following tutorials cover closely related topics that extend the techniques shown in this guide. Each resource contains fully working code examples with step-by-step explanations to help you master additional API features and explore alternative approaches in your projects.
+
+- [Extract Text from Image with Aspose OCR – Step-by-Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [How to Recognize Page Rectangles for OCR Text Recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/)
+- [How to Use OCR: Recognize an Image Without Text Area Detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/russian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/russian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
new file mode 100644
index 000000000..5a5488996
--- /dev/null
+++ b/ocr/russian/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
@@ -0,0 +1,258 @@
+---
+category: general
+date: 2026-06-06
+description: 'OCR character set guide: learn how to use an OCR engine to list the
+  supported characters of the Latin and Cyrillic alphabets, with full Python
+  examples.'
+draft: false
+keywords:
+- ocr character set
+- use OCR engine
+- supported OCR characters
+- script detection OCR
+- OCR library Python
+language: ru
+og_description: 'OCR character set guide: learn how to use an OCR engine in Python
+  to list the supported characters of different scripts, with code examples and
+  tips.'
+og_title: OCR Character Set – Retrieve and Use OCR Engine
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: 'OCR character set guide: learn how to use OCR engine to list supported
+    characters for Latin and Cyrillic scripts with full Python examples.'
+  headline: OCR Character Set – Retrieve and Use OCR Engine in Python
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Character Sets
+title: OCR Character Set – Retrieve and Use OCR Engine in Python
+url: /ru/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# OCR Character Set – Retrieve and Use OCR Engine in Python
+
+Want to know the **ocr character set** your library supports? In this guide we show how to **use OCR engine** calls to query the supported characters of different scripts, and why that matters in real projects.
+
+If you have ever sat in front of an empty screen wondering why certain letters never show up after OCR processing, you are not alone. The answer is often hidden in the character set the engine knows. By the end of this guide you will be able to:
+
+* List every character the OCR engine can recognize for a given script.
+* Handle the case where a script is not supported.
+* Plug character-set information into validation or user-interface logic.
+
+No magic: just plain Python, a few lines of code, and a little explanation.
+
+---
+
+## Requirements
+
+Before you start, make sure you have:
+
+* Python 3.8+ (the code uses f-strings).
+* An OCR library that exposes `ocr.OcrEngine`. In this example we assume the `ocr` package is already installed via `pip install ocr-lib`.
+* Basic familiarity with Python's import system and printing to the screen.
+
+If the library is missing, run:
+
+```bash
+pip install ocr-lib
+```
+
+That is all: no extra binaries or native dependencies are needed for the basic character-set queries we will run.
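Before moving on, the two requirements above (interpreter version and an importable `ocr` module) can be checked from Python itself. This is only a sketch using the standard library; `check_environment` is a hypothetical helper name, and the module name `ocr` and package name `ocr-lib` are simply the ones this tutorial assumes.

```python
import importlib.util
import sys


def check_environment(module_name="ocr", min_version=(3, 8)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_version:
        wanted = ".".join(str(part) for part in min_version)
        problems.append(f"Python {wanted}+ is required, found {sys.version.split()[0]}.")
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        problems.append(
            f"Module '{module_name}' is not importable; "
            "install the package named in the requirements (here assumed to be `ocr-lib`)."
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Returning a list rather than raising lets a caller decide whether a missing dependency is fatal or just worth a warning.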
+
+---
+
+## Step 1: Initialize an OCR Engine instance
+
+Creating the engine object is the first thing you need to do to **use OCR engine** features. Think of it as switching on a scanner; without it you cannot ask any questions.
+
+```python
+import ocr
+
+# Step 1: Create an OCR engine instance
+engine = ocr.OcrEngine()
+```
+
+The `OcrEngine` class is a thin wrapper around the actual recognition engine. It does not start any heavy processing until you request something, so startup cost is minimal.
+
+> **Pro tip:** If your application creates many engine instances, reuse a single one. That saves memory and reduces initialization latency.
+
+---
+
+## Step 2: Query the OCR character set for a specific script
+
+Now that we have an engine, we can ask it which characters it knows for a particular writing system. The `get_supported_characters(script_name)` method returns a Python `list` of Unicode characters.
+
+```python
+# Step 2: Get the list of characters the engine can recognize for the Latin script
+latin_chars = engine.get_supported_characters("Latin")
+print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}")
+```
+
+Running the snippet above prints something like:
+
+```
+Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+```
+
+The exact output depends on the version of the OCR library, but it is always a flat list of characters. If you need both uppercase and lowercase letters, request the `"Latin"` script and apply `.lower()` to each element, or request a separate `"Latin‑lower"` script if the library distinguishes one.
+
+### Why does this matter?
+
+When you feed in an image with unusual diacritics (for example, "ñ" or "ø"), the OCR engine may silently replace them with a placeholder or drop them entirely.
+Knowing the **ocr character set** in advance lets you pre-validate input, warn users, or switch to another engine.
+
+---
+
+## Step 3: Get the characters for another script (Cyrillic example)
+
+The same method works for any script the engine reports as supported. Let's see what happens with Cyrillic.
+
+```python
+# Step 3: Get the list of characters the engine can recognize for the Cyrillic script
+cyrillic_chars = engine.get_supported_characters("Cyrillic")
+print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}")
+```
+
+Typical output:
+
+```
+Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я']
+```
+
+If the engine does **not** support the requested script, it usually raises a `ValueError` (or returns an empty list, depending on the implementation). Let's guard the code against both.
+
+```python
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+# Example usage:
+safe_get_characters(engine, "Arabic")   # Might trigger the error branch
+```
+
+---
+
+## Step 4: Visualize the OCR character set (optional)
+
+Sometimes a quick visualization helps, especially when you need to show stakeholders which characters are covered. Below is a minimal example that uses `matplotlib` to draw a grid of characters.
+
+```python
+import matplotlib.pyplot as plt
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Visualize Latin characters
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+> **Image alt text:** ![OCR character set diagram](ocr_character_set.png){alt="OCR character set overview"}
+
+The diagram is not required for the main task, but it shows how you can **use OCR engine** metadata for more than plain console output.
+
+---
+
+## Step 5: Integrate the character set into real workflows
+
+Now that we can retrieve the **ocr character set**, let's look at a practical scenario: validating OCR output before storing it in a database.
+
+```python
+def validate_ocr_output(text, allowed_chars):
+    """Return True if every character in `text` exists in `allowed_chars`."""
+    invalid = [c for c in text if c not in allowed_chars]
+    if invalid:
+        print(f"Invalid characters found: {invalid}")
+        return False
+    return True
+
+# Suppose we ran OCR on an image and got this result:
+ocr_result = "HELLO WORLD!"
+
+# Use the previously fetched Latin set for validation:
+if validate_ocr_output(ocr_result, set(latin_chars)):
+    print("All characters are supported – safe to store.")
+else:
+    print("Result contains unsupported symbols – consider manual review.")
+```
+
+This pattern keeps junk data out of downstream pipelines, a common problem when working with multilingual documents.
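One caveat with a strict membership check: if the engine's Latin list really contains only the 26 letters shown in the sample output, the space and "!" in "HELLO WORLD!" would both be flagged. The sketch below is one possible refinement, not the library's own behaviour: it skips whitespace and normalizes both sides to NFC so that decomposed accents compare equal to precomposed ones. `find_unsupported` is a hypothetical helper, and `latin_upper` is a stand-in for the list returned by the engine.

```python
import unicodedata


def find_unsupported(text, allowed_chars, ignore_whitespace=True):
    """Return the distinct characters of `text` missing from `allowed_chars`.

    Both inputs are normalized to NFC first; the result keeps the order in
    which the unsupported characters first appear.
    """
    allowed = {unicodedata.normalize("NFC", ch) for ch in allowed_chars}
    unsupported = []
    for ch in unicodedata.normalize("NFC", text):
        if ignore_whitespace and ch.isspace():
            continue
        if ch not in allowed and ch not in unsupported:
            unsupported.append(ch)
    return unsupported


# Stand-in for the engine's character list (assumption: uppercase A-Z only).
latin_upper = [chr(code) for code in range(ord("A"), ord("Z") + 1)]

print(find_unsupported("HELLO WORLD!", latin_upper))        # ['!']
print(find_unsupported("CAFE\u0301", latin_upper + ["É"]))  # []
```

Returning the offending characters, rather than a bare boolean, makes it easier to decide whether to extend the allowed set (for punctuation, say) or to send the result for manual review.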
+
+---
+
+## Common pitfalls and edge cases
+
+| Pitfall | Why it Happens | How to Avoid |
+|---------|----------------|--------------|
+| **Typo in the script name** (for example, `"Cyrillic "` with a trailing space) | The engine takes the string literally and cannot find the script. | Strip whitespace with `script.strip()` before calling `get_supported_characters`. |
+| **Empty character list** | Some engines declare a script but have not loaded its language model yet. | Call `engine.load_language_model(script)` if the library provides such a method, or make sure the model files are present. |
+| **Unicode normalization issues** | Characters such as "é" can be precomposed (`\u00E9`) or decomposed (`e\u0301`). | Normalize strings with `unicodedata.normalize('NFC', text)` before checking. |
+| **Performance with large scripts** (for example, Chinese) | Fetching thousands of characters can be slow. | Cache the result after the first call; most applications only need to fetch the list once. |
+
+---
+
+## Full working example
+
+Putting it all together gives a script you can copy and run:
+
+```python
+import ocr
+import matplotlib.pyplot as plt
+
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Driver assembled from Steps 1-4 above
+engine = ocr.OcrEngine()
+latin_chars = safe_get_characters(engine, "Latin")
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+
+## What to learn next?
+
+
+The following tutorials cover closely related topics that build on the techniques shown in this guide. Each resource includes fully working code examples with step-by-step explanations to help you master additional API features and explore alternative approaches in your own projects.
+
+- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/)
+- [OCR Post Processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/)
+- [How to Set Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/russian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/russian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
new file mode 100644
index 000000000..3d9b43314
--- /dev/null
+++ b/ocr/russian/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
@@ -0,0 +1,254 @@
+---
+category: general
+date: 2026-06-06
+description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn
+  how to download Hugging Face model, extract text from invoice, and free GPU
+  resources.
+draft: false
+keywords:
+- perform OCR on image
+- download Hugging Face model
+- extract text from invoice
+- free GPU resources
+- load image for OCR
+language: ru
+og_description: Perform OCR on image using Aspose OCR and a Hugging Face model. This
+  tutorial shows how to download the model, extract text from an invoice, and free
+  GPU resources.
+og_title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn
+    how to download Hugging Face model, extract text from invoice, and free GPU resources.
+  headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+  type: TechArticle
+tags:
+- OCR
+- Aspose
+- Python
+- AI
+- HuggingFace
+title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+url: /ru/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+
+Ever wanted to **perform OCR on image** files but got stuck on where to start? You are not alone; many developers hit this wall when they first take on document automation. The good news is that with Aspose OCR and a lightweight LLM from Hugging Face you can turn a raw invoice scan into clean, searchable text in just a few lines of Python.
+
+In this guide we walk through everything you need: **loading image for OCR**, the **download Hugging Face model** step, how to **extract text from invoice** data, and finally how to **free GPU resources** so your application stays lean. By the end you will have a self-contained script you can drop into any project.
+
+---
+
+## What you will learn
+
+- How to **perform OCR on image** using Aspose's `OcrEngine`.
+- The exact steps to automatically **download Hugging Face model** files.
+- Techniques to **extract text from invoice** PDFs or PNGs with AI-enhanced post-processing.
+- Best practices to **free GPU resources** after inference.
+- Tips to **load image for OCR** efficiently and avoid common mistakes.
+
+No external documentation is required; everything you need is here, with full code, explanations, and expected output.
+
+---
+
+## Requirements
+
+| Requirement | Reason |
+|-------------|--------|
+| Python 3.9+ | Modern syntax and type hints |
+| `asposeocr` package (`pip install asposeocr`) | Core OCR engine |
+| Access to a GPU (optional but recommended) | Speeds up the LLM post-processor |
+| An invoice image (`sample_invoice.png`) | A real-world sample to test with |
+
+If anything is missing, install it now; the script also handles the **download Hugging Face model** step automatically, so you do not have to hunt for files by hand.
+
+---
+
+## Step 1: Perform OCR on Image – Create the Engine
+
+The first thing to do is start the Aspose OCR engine. Think of it as opening a blank canvas on which the image will later be turned into text.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# Step 1: Initialize the OCR engine
+engine = ocr.OcrEngine()
+```
+
+> **Why this matters:** `OcrEngine` abstracts away all the low-level image preprocessing so you can focus on the higher-level workflow. It also exposes the `set_post_processor` method, which later lets you plug in an LLM for smarter output.
+
+---
+
+## Step 2: Load Image for OCR – Choose the Right File
+
+Now that the engine exists, we need to **load image for OCR**. Aspose supports PNG, JPG, TIFF, and several other formats. Make sure the path is absolute or relative to the location of your script.
+
+```python
+# Step 2: Load the image you want to process
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+```
+
+> **Tip:** If the image is large, consider resizing it beforehand to reduce memory pressure. The OCR engine can handle high-resolution scans, but a 300 DPI image is usually the sweet spot for invoices.
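The resizing mentioned in the tip can be sketched as follows. This is an illustration under assumptions, not part of the Aspose API: `fit_within` and `downscale_image` are hypothetical helpers, the 2000-pixel limit is an arbitrary placeholder, and the second function relies on Pillow (`pip install Pillow`), which this tutorial does not otherwise use.

```python
def fit_within(width, height, max_side=2000):
    """Return (new_width, new_height) so the longer side is at most `max_side`.

    The aspect ratio is preserved and images that already fit are left as-is.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def downscale_image(src_path, dst_path, max_side=2000):
    """Save a downscaled copy of `src_path` to `dst_path` and return its size."""
    from PIL import Image  # imported lazily: Pillow is an extra dependency

    with Image.open(src_path) as img:
        new_size = fit_within(img.width, img.height, max_side)
        if new_size != img.size:
            img = img.resize(new_size, Image.LANCZOS)
        img.save(dst_path)
    return new_size
```

You would then pass the path of the downscaled copy to `engine.load_image(...)`. Keeping the size calculation separate from the file I/O makes the limit easy to tune without touching any image files.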
+
+---
+
+## Step 3: Perform Raw OCR and View the Extracted Text
+
+With the image loaded, we can finally **perform OCR on image** and see what the raw engine produces. This step gives us a baseline before adding any AI magic.
+
+```python
+# Step 3: Run the built‑in OCR and capture raw results
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+```
+
+**Expected output (truncated):**
+
+```
+Raw text:
+Invoice #12345
+Date: 2024‑04‑01
+Total: $1,250.00
+...
+```
+
+Raw output often contains stray line breaks, misrecognized characters, or missing fields, which is exactly why we plug in a language model next.
+
+---
+
+## Step 4: Download Hugging Face Model – Configure the LLM Post‑Processor
+
+This is where the **download Hugging Face model** step comes into play. Aspose AI can automatically fetch the model from the Hugging Face hub if it is not on disk yet. We will use the Qwen2.5-3B-Instruct-GGUF model, which balances accuracy against memory footprint.
+
+```python
+# Step 4: Set up the AI model configuration
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",                 # automatically fetch model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",           # int8 quantization cuts RAM usage dramatically
+    gpu_layers=20,                              # first 20 layers run on GPU, rest on CPU
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+```
+
+> **Why this works:** `allow_auto_download` removes the need to download the `.gguf` file by hand. Quantization (`int8`) shrinks the model to roughly 3 GB, which makes it usable on most consumer GPUs. Tune `gpu_layers` to your hardware: more layers on the GPU means faster inference.
+
+---
+
+## Step 5: Extract Text from Invoice Using AI‑Enhanced Post‑Processing
+
+Now we attach the LLM to the OCR engine and run the **post‑processor**, which cleans the raw output, fixes OCR errors, and formats the invoice fields neatly.
+
+```python
+# Step 5: Initialize the AI engine and bind it to the OCR engine
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)          # implicit when using set_post_processor in newer versions
+engine.set_post_processor(ai_engine)
+
+# Run the AI‑enhanced post‑processor
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+```
+
+**Sample enhanced output:**
+
+```
+Enhanced text:
+Invoice Number: 12345
+Date: 2024-04-01
+Bill To: Acme Corp.
+Amount Due: $1,250.00
+Status: Paid
+```
+
+> **What happened?** The LLM recognized that "Invoice #12345" should be "Invoice Number: 12345", fixed the date format, and even surfaced the "Bill To" field that the raw engine missed. This is the essence of **extract text from invoice** automation.
+
+---
+
+## Step 6: Free GPU Resources – Clean Up After Processing
+
+If you run this in a long-lived service (for example, a Flask API), you need to **free GPU resources** after each inference to avoid out-of-memory failures. Aspose AI provides a simple method for this.
+
+```python
+# Step 6: Release GPU memory and dispose of the engine
+ai_engine.free_resources()   # frees GPU layers and model tensors
+engine.dispose()             # cleans up internal buffers
+```
+
+> **Pro tip:** Call `free_resources()` inside a `finally:` block if you wrap the OCR call in try/except. That guarantees cleanup even when an exception is raised.
+
+---
+
+## Step 7: Full Script – Put It All Together
+
+Below is the complete, ready-to-run script. Copy and paste it, adjust the paths, and you are good to go.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Create OCR engine
+engine = ocr.OcrEngine()
+
+# 2️⃣ Load image for OCR
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+
+# 3️⃣ Raw OCR
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+
+# 4️⃣ Download Hugging Face model (auto‑download enabled)
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+
+# 5️⃣ Attach AI post‑processor and enhance text
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)          # optional in newer versions
+engine.set_post_processor(ai_engine)
+
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+
+# 6️⃣ Free GPU resources
+ai_engine.free_resources()
+engine.dispose()
+```
+
+Run the script and watch the transformation from noisy OCR to clean, structured invoice data. 🎉
+
+---
+
+## Frequently asked questions and edge cases
+
+| Question | Answer |
+|----------|--------|
+| **What if the model fails to download?** | Make sure your machine has internet access and that `hugging_face_repo_id` is correct. You can also manually download the … |
+
+## What should you learn next?
+
+The following tutorials cover closely related topics that build on the techniques shown in this guide. Each resource includes fully working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your projects.
+
+- [Extract Text from Image in C# with Language Selection Using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/)
+- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/)
+- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/russian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/russian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
new file mode 100644
index 000000000..b116652d2
--- /dev/null
+++ b/ocr/russian/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
@@ -0,0 +1,284 @@
+---
+category: general
+date: 2026-06-06
+description: recognize text from image with Aspose OCR – learn how to load image
+  for OCR and perform OCR on image using AI post-processing in Python.
+draft: false
+keywords:
+- recognize text from image
+- perform ocr on image
+- load image for ocr
+language: ru
+og_description: recognize text from image quickly. This guide shows how to load an
+  image for OCR, perform OCR on the image, and improve the results with AI
+  post-processing.
+og_title: Recognize Text from Image Using Aspose OCR & AI
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: recognize text from image with Aspose OCR – learn how to load image
+    for OCR and perform OCR on image using AI post‑processing in Python.
+  headline: recognize text from image using Aspose OCR & AI
+  type: TechArticle
+- description: recognize text from image with Aspose OCR – learn how to load image
+    for OCR and perform OCR on image using AI post‑processing in Python.
+  name: recognize text from image using Aspose OCR & AI
+  steps:
+  - name: Prerequisites
+    text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) -
+      An image file (e.g., `doc.png`) that contains printed or handwritten text -
+      Optional: a GPU for faster LLM inference (the script works on CPU too)'
+  - name: Install and import the required modules
+    text: '```python # Install the library (run once in your environment) # pip install
+      asposeocr'
+  - name: Create the OCR engine and enable handwritten text recognition
+    text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()'
+  - name: Load the image for OCR
+    text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png")
+      ```'
+  - name: Run the raw OCR pass
+    text: '```python # Execute the recognition pipeline and capture the raw result
+      raw_result = ocr_engine.recognize() ```'
+  - name: Configure the Aspose AI model for LLM post‑processing
+    text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true",
+      # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+      hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20
+      GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo'
+  - name: Initialise the AI processor and attach it as a post‑processor
+    text: '```python # Instantiate the AI processor with the configuration above ai_processor
+      = AsposeAI() ai_processor.initialize(ai_config)'
+  - name: Run the post‑processor to enhance the OCR output
+    text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)'
+  - name: Release resources
+    text: '```python # Free GPU/CPU
+      memory held by the AI model ai_processor.free_resources()'
+  type: HowTo
+- questions:
+  - answer: No. The script falls back to CPU, but inference will be slower.
+    question: Do I need a GPU?
+  - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers`
+      accordingly.
+    question: Can I use a different LLM?
+  - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable.
+    question: What if my image is huge?
+  - answer: You can toggle `enable_handwritten_recognition` depending on your workload.
+    question: Is handwritten recognition always on?
+  type: FAQPage
+tags:
+- OCR
+- Aspose
+- Python
+title: recognize text from image using Aspose OCR & AI
+url: /ru/python/general/recognize-text-from-image-using-aspose-ocr-ai/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Recognize Text from Image Using Aspose OCR & AI
+
+Ever needed to recognize text from image files but were not sure which library would give you both speed and accuracy? You are not alone. In this guide we walk through a complete end-to-end example showing **how to load image for OCR**, **perform OCR on image**, and then improve the result with Aspose's AI post-processor. By the end you will have a ready-to-run script that turns a PNG into clean, indexable text.
+
+## What you will learn
+
+We cover everything from installing the Aspose OCR package to releasing resources at the end of the run. You will see why enabling handwritten-text recognition matters, how to configure the Qwen 2.5 LLM for post-processing, and what the final result looks like. No external references are needed; just copy, paste, and run.
+
+### Requirements
+
+- Python 3.8 or newer
+- `asposeocr` package (`pip install asposeocr`)
+- An image file (e.g., `doc.png`) that contains printed or handwritten text
+- Optional: a GPU for faster LLM inference (the script works on CPU too)
+
+---
+
+## Recognize text from image, step by step
+
+Under each code block you will find a short explanation of **why** we take a particular action, not just **what** the line does.
+
+### Step 1: Install and import the required modules
+
+```python
+# Install the library (run once in your environment)
+# pip install asposeocr
+
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+```
+
+*Why?* Importing `asposeocr` gives us the `OcrEngine` class, and the `ai` submodule provides the LLM-based post-processor that substantially improves the raw OCR result.
+
+### Step 2: Create the OCR engine and enable handwritten text recognition
+
+```python
+# Initialise the OCR engine
+ocr_engine = ocr.OcrEngine()
+
+# Turn on handwritten‑text support – useful for notes, signatures, etc.
+ocr_engine.ocr_settings.enable_handwritten_recognition = True
+```
+
+Enabling handwritten recognition widens the engine's character set, so you do not lose annotations when you **perform OCR on image** files that mix printed and cursive text.
+
+### Step 3: Load the image for OCR
+
+```python
+# Point the engine at the file you want to process
+ocr_engine.load_image("YOUR_DIRECTORY/doc.png")
+```
+
+The `load_image` call is the moment you **load image for OCR**; if the path is wrong, the engine raises an informative exception, sparing you from obscure errors further down the line.
+
+### Step 4: Run the raw OCR pass
+
+```python
+# Execute the recognition pipeline and capture the raw result
+raw_result = ocr_engine.recognize()
+```
+
+At this point you get a `RecognitionResult` object containing the unfiltered text, confidence scores, and layout metadata.
+It is often noisy, which is why an AI cleanup pass is needed.
+
+### Step 5: Configure the Aspose AI model for LLM post-processing
+
+```python
+ai_config = AsposeAIModelConfig(
+    allow_auto_download="true",                     # Pull the model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",               # Reduce RAM usage
+    gpu_layers=20,                                 # Use 20 GPU layers if a GPU is present
+    directory_model_path="YOUR_DIRECTORY/models/ocr_ai"
+)
+```
+
+Why these settings?
+- **auto‑download** ensures the model is available on the first run.
+- **int8 quantization** reduces memory consumption without a significant loss of accuracy.
+- **gpu_layers** lets you use a compatible GPU for faster inference.
+
+### Step 6: Initialise the AI processor and attach it as a post-processor
+
+```python
+# Instantiate the AI processor with the configuration above
+ai_processor = AsposeAI()
+ai_processor.initialize(ai_config)
+
+# Hook the AI processor into the OCR engine
+ocr_engine.set_post_processor(ai_processor, None)
+```
+
+Attaching the processor means that every call to `run_postprocessor` lets the LLM fix spelling, join broken words, and even restore missing punctuation.
+
+### Step 7: Run the post-processor to enhance the OCR output
+
+```python
+# Apply AI‑driven post‑processing
+enhanced_result = ocr_engine.run_postprocessor(raw_result)
+
+# Print the polished text
+print(enhanced_result.text)
+```
+
+`enhanced_result.text` is usually far more readable than the raw string; think of it as a context-aware spell check.
+
+### Step 8: Release resources
+
+```python
+# Free GPU/CPU memory held by the AI model
+ai_processor.free_resources()
+
+# Dispose of the OCR engine to close file handles, etc.
+ocr_engine.dispose()
+```
+
+Cleanup is critical in long-running services; otherwise you leak GPU memory and your application will eventually crash.
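In a long-running service the cleanup calls should run even when recognition raises. One way to guarantee that, sketched here with the standard library only, is a small context manager built on `try`/`finally`. `released` is a hypothetical helper, and the two stand-in classes merely mimic the `free_resources()` and `dispose()` method names used in this tutorial so the block runs without the OCR package installed.

```python
from contextlib import contextmanager


@contextmanager
def released(*resources):
    """Yield the resources, then call their cleanup methods even on error.

    For each object, any of `free_resources()` or `dispose()` that exists is
    called, in the order the objects were passed in.
    """
    try:
        yield resources
    finally:
        for res in resources:
            for method_name in ("free_resources", "dispose"):
                cleanup = getattr(res, method_name, None)
                if callable(cleanup):
                    cleanup()


# Stand-ins for the AI processor and OCR engine (assumptions for the demo).
class FakeProcessor:
    def free_resources(self):
        print("AI model memory released")


class FakeEngine:
    def dispose(self):
        print("OCR engine disposed")


try:
    with released(FakeProcessor(), FakeEngine()) as (processor, engine):
        raise RuntimeError("recognition failed")
except RuntimeError as exc:
    print(f"Handled: {exc}")
```

With the real objects you would write `with released(ai_processor, ocr_engine):` around the recognition and post-processing calls. Note that this sketch does not shield later cleanups from an exception raised by an earlier one.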
+
+---
+
+## Complete script you can run today
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Initialise OCR engine with handwritten support
+ocr_engine = ocr.OcrEngine()
+ocr_engine.ocr_settings.enable_handwritten_recognition = True
+
+# 2️⃣ Load the image you want to analyse
+ocr_engine.load_image("YOUR_DIRECTORY/doc.png")   # <-- replace with your path
+
+# 3️⃣ Perform the basic OCR pass
+raw_result = ocr_engine.recognize()
+
+# 4️⃣ Set up the AI model for smarter output
+ai_config = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/models/ocr_ai"
+)
+
+# 5️⃣ Initialise and attach the AI post‑processor
+ai_processor = AsposeAI()
+ai_processor.initialize(ai_config)
+ocr_engine.set_post_processor(ai_processor, None)
+
+# 6️⃣ Enhance the OCR result with the LLM
+enhanced_result = ocr_engine.run_postprocessor(raw_result)
+
+# 7️⃣ Show the final, cleaned‑up text
+print("=== Enhanced OCR Output ===")
+print(enhanced_result.text)
+
+# 8️⃣ Clean up
+ai_processor.free_resources()
+ocr_engine.dispose()
+```
+
+**Expected output** (example for a simple invoice image):
+
+```
+=== Enhanced OCR Output ===
+Invoice #12345
+Date: 2024‑04‑01
+Total Amount: $1,250.00
+Thank you for your business!
+```
+
+Note how the AI layer corrected "Inv0ice" to "Invoice" and added the missing punctuation.
+
+---
+
+## Frequently asked questions (and quick answers)
+
+- **Do I need a GPU?** No. The script falls back to CPU, but inference will be slower.
+- **Can I use a different LLM?** Yes. Change `hugging_face_repo_id` and adjust `gpu_layers` to match.
+- **What if my image is huge?** Resize it first (for example, with Pillow) to keep memory consumption reasonable.
+- **Is handwriting recognition always on?** No. You can toggle `enable_handwritten_recognition` depending on the workload.
+
+---
+
+## Conclusion
+
+You now know how to **recognize text from image** with Aspose OCR, how to **load image for OCR**, and how to **perform OCR on image** with AI-enhanced post-processing. The complete, runnable example above gives you a solid base for integrating OCR into any Python project, whether you are scanning receipts, digitizing contracts, or extracting data from handwritten forms.
+
+Ready for the next step? Try swapping the Qwen model for a larger one, experimenting with different quantization schemes, or looping over several images for batch processing. The code you just built is a reasonable starting point for each of these.
+
+Happy coding, and may your OCR results always be crystal clear!
+
+![Screenshot of a Python console showing the enhanced OCR output](/images/ocr_output.png){alt="Screenshot showing how to recognize text from image with Aspose OCR"}
+
+## What should you learn next?
+
+The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Extract text from image with Aspose OCR: step-by-step guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [Convert image to text: perform OCR on an image from a URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/)
+- [Extract image text in C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/spanish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/spanish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md
new file mode 100644
index 000000000..ad4706377
--- /dev/null
+++ b/ocr/spanish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md
@@ -0,0 +1,215 @@
+---
+category: general
+date: 2026-06-06
+description: Extract text from a handwritten image using Python OCR. Learn how to
+  convert a handwritten photo to text quickly and reliably.
+draft: false
+keywords:
+- extract text from handwritten image
+- convert handwritten photo to text
+- how to recognize handwritten text
+- python ocr handwritten recognition
+language: es
+og_description: Extract text from a handwritten image with Python. This guide shows
+  how to convert a handwritten photo into text and answers how to recognize handwritten
+  text.
+og_title: Extract Text from Handwritten Image – Python OCR Tutorial
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Extract text from handwritten image using Python OCR. Learn how to
+    convert handwritten photo to text quickly and reliably.
+  headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Handwriting
+title: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide
+url: /es/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide
+
+Have you ever wondered **how to recognize handwritten text** in a photo you took with your phone? You are not alone. In many projects, whether you are digitizing lecture notes or pulling data from signed forms, you need to **extract text from handwritten image** files quickly and without fuss.
+
+In this tutorial we walk through a complete, ready-to-run example that shows exactly how to **convert handwritten photo to text** using a Python OCR library. No vague references, just concrete code, explanations, and tips you can copy and paste today.
+
+![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image")
+
+## What you will need
+
+Before we dive in, make sure you have the following:
+
+| Requirement | Why it matters |
+|-------------|----------------|
+| Python 3.9 or newer | Modern syntax and library support |
+| `pip` (Python package manager) | To install the OCR package |
+| A clear image of handwritten notes (JPEG/PNG) | OCR accuracy drops on blurry images |
+| Basic familiarity with Python functions | Helps you adapt the example later |
+
+If any of these are missing, download the latest version of Python from the official Python website and install it.
+
+## Install the Python OCR library
+
+The code we will use depends on a package imported as `ocr` (described here as a lightweight wrapper around Tesseract that adds convenient settings). Install it with a single command:
+
+```bash
+pip install ocr
+```
+
+> **Pro tip:** After installation, run `tesseract --version` in your terminal to confirm that the underlying engine is present. If you do not have Tesseract, follow the official guide for your OS; most package managers provide it (`apt-get install tesseract-ocr` on Ubuntu, `brew install tesseract` on macOS).
+
+## Step 1: Create an OCR engine instance
+
+Creating the engine is the first building block of **python ocr handwritten recognition**. Think of the engine as the brain that will later read the scribbles.
+
+```python
+# Step 1: Import the library and instantiate the OCR engine
+import ocr
+
+# The OcrEngine class gives us access to all the low‑level settings.
+engine = ocr.OcrEngine()
+```
+
+Why this matters: without an engine you cannot adjust the recognition pipeline. The default configuration is tuned for printed text, so we need to change it in the next step.
+
+## Step 2: Enable handwriting recognition
+
+By default, the engine assumes printed characters. Enabling handwriting mode flips a switch that tells Tesseract to use its LSTM model trained on cursive strokes.
+
+```python
+# Step 2: Turn on handwritten mode
+engine.ocr_settings.enable_handwritten_recognition = True
+```
+
+> **What if you skip it?** The OCR will treat the strokes as noise, which produces garbled output. Enabling the flag is the core of **how to recognize handwritten text**.
+
+## Step 3: Load your handwritten photo
+
+Now we point the engine at the image file. The path can be absolute or relative; just make sure the file exists.
+
+```python
+# Step 3: Load the image containing the handwritten notes
+image_path = "YOUR_DIRECTORY/handwritten_note.jpg"
+engine.load_image(image_path)
+```
+
+A quick sanity check: open the image in your OS viewer. If the text is blurry, consider pre-processing it (increase contrast, rotate) before sending it to the engine; those adjustments often raise the success rate when you **extract text from handwritten image** files.
+
+## Step 4: Run the recognition process
+
+With everything configured, we finally ask the engine to do its job. The `recognize()` method returns a result object that contains the extracted string and the confidence scores.
+
+```python
+# Step 4: Execute the OCR process
+handwritten_result = engine.recognize()
+```
+
+Behind the scenes, the engine converts the bitmap into a series of feature vectors, runs them through the LSTM network, and stitches the characters together. That is the core of **python ocr handwritten recognition**.
+
+## Step 5: Display the extracted text
+
+The result object exposes a `.text` attribute that holds the plain Unicode string. Print it, write it to a file, or feed it into another pipeline; the choice is yours.
+
+```python
+# Step 5: Print the extracted text to the console
+print("=== Extracted Text ===")
+print(handwritten_result.text)
+```
+
+### Expected output
+
+If the source image contains the note “Buy milk, eggs, and bread”, you would see something like:
+
+```
+=== Extracted Text ===
+Buy milk, eggs, and bread
+```
+
+Note how the output keeps the punctuation and line breaks (if any). If you get garbage, check the image quality and the `enable_handwritten_recognition` flag again.
+
+## Handling common problems
+
+| Problem | Symptom | Fix |
+|-------|---------|-----|
+| Low confidence scores | Many “?” or nonsense characters | Raise the image DPI to ≥300, apply binarization (`opencv`), or crop to the region of interest.
|
+| Mixed languages | The output mixes English with another alphabet | Set `engine.ocr_settings.language = "eng"` (or another ISO code) before calling `recognize()`. |
+| Large files | Long processing time or a memory error | Resize the image to a reasonable dimension (for example, a maximum width of 1200 px) before loading it. |
+| Missing Tesseract | `ImportError` or `FileNotFoundError` | Install Tesseract separately and make sure it is on the system PATH. |
+
+These adjustments keep your **convert handwritten photo to text** workflow robust across varied datasets.
+
+## Complete script you can run today
+
+Below is the complete, self-contained program that brings all the pieces together. Copy it into a file named `handwritten_ocr.py` and run `python handwritten_ocr.py`.
+
+```python
+#!/usr/bin/env python3
+"""
+handwritten_ocr.py
+
+A minimal example that demonstrates how to extract text from handwritten image
+using Python OCR (handwritten recognition mode). Adjust `image_path` to point
+to your own file before running.
+"""
+
+import ocr  # pip install ocr
+
+def main():
+    # 1️⃣ Create the OCR engine
+    engine = ocr.OcrEngine()
+
+    # 2️⃣ Enable the handwritten recognition flag
+    engine.ocr_settings.enable_handwritten_recognition = True
+
+    # 3️⃣ Load the target image
+    image_path = "YOUR_DIRECTORY/handwritten_note.jpg"
+    engine.load_image(image_path)
+
+    # 4️⃣ Run the OCR process
+    result = engine.recognize()
+
+    # 5️⃣ Output the extracted text
+    print("=== Extracted Text ===")
+    print(result.text)
+
+if __name__ == "__main__":
+    main()
+```
+
+Run the script and you will see the text printed to the console, which is exactly the result you need when you want to **convert handwritten photo to text**.
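
Step 3 only asks you to make sure the file exists. A small pre-flight check, sketched here with the standard library, turns that advice into an explicit error before the engine is involved. The function name `check_image_path` and the `SUPPORTED_SUFFIXES` set are assumptions made for illustration; the suffixes simply mirror the JPEG/PNG formats named in the requirements table.

```python
from pathlib import Path

# Assumption: only the formats named in the requirements table (JPEG/PNG).
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_image_path(path):
    """Return a Path if the file exists and has a supported suffix.

    Hypothetical helper; raises FileNotFoundError or ValueError otherwise.
    """
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Image not found: {p}")
    if p.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported image type: {p.suffix!r}")
    return p
```

Calling `check_image_path(image_path)` immediately before `engine.load_image(image_path)` keeps a path mistake separate from genuine recognition errors.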
+
+## Going further
+
+Now that you know the basics of how to **extract text from handwritten image** files, consider these next steps:
+
+- **Batch processing:** Loop over a folder of images and store each result in a CSV file.
+- **Post-processing:** Use regular expressions to clean up common OCR errors (for example, “1” vs “l”).
+- **Integration:** Feed the extracted strings into a Natural Language Processing pipeline for sentiment analysis or keyword extraction.
+- **Alternative libraries:** If you need higher accuracy, explore `easyocr` or `pytesseract` with custom LSTM models; both also support **python ocr handwritten recognition**.
+
+Remember that the quality of the source image usually determines success, so spend a few minutes on pre-processing. A little extra effort now saves a lot of debugging later.
+
+## Conclusion
+
+We have walked through a complete, end-to-end example that shows **how to recognize handwritten text** and, more importantly, how to **extract text from handwritten image** files using Python. By installing the `ocr` package, enabling the handwriting flag, loading your image, and calling `recognize()`, you can **convert handwritten photo to text** in just a few lines.
+
+Try it with your own notes, adjust the pre-processing steps, and let the OCR do the heavy lifting. If you run into a problem, revisit the "Handling common problems" table or experiment with alternative OCR back ends. Happy coding, and may your handwritten data become instantly searchable!
+
+## What should you learn next?
+
+The following tutorials cover closely related topics that extend the techniques demonstrated in this guide.
Each resource includes complete, working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Extract text from image with Aspose OCR – Step-by-step guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [How to recognize page rectangles for OCR text recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/)
+- [How to use OCR - Recognize an image without text area detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/spanish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/spanish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
new file mode 100644
index 000000000..abb3a3800
--- /dev/null
+++ b/ocr/spanish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
@@ -0,0 +1,258 @@
+---
+category: general
+date: 2026-06-06
+description: 'OCR character set guide: learn how to use the OCR engine to list the
+  supported characters for the Latin and Cyrillic scripts with complete Python
+  examples.'
+draft: false
+keywords:
+- ocr character set
+- use OCR engine
+- supported OCR characters
+- script detection OCR
+- OCR library Python
+language: es
+og_description: 'OCR character set guide: discover how to use the OCR engine
+  in Python to list the supported characters for several scripts, with code
+  and tips.'
+og_title: OCR Character Set – Retrieve and Use OCR Engine
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: 'OCR character set guide: learn how to use OCR engine to list supported
+    characters for Latin and Cyrillic scripts with full Python examples.'
+  headline: OCR Character Set – Retrieve and Use OCR Engine in Python
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Character Sets
+title: OCR Character Set – Retrieve and Use OCR Engine in Python
+url: /es/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# OCR Character Set – Retrieve and Use OCR Engine in Python
+
+Do you want to know which **OCR character set** your library supports? In this tutorial we show how to **use the OCR engine** to query the supported characters for different scripts, and why that matters in real-world projects.
+
+If you have ever wondered why some letters never appear after OCR processing, you are not alone. The answer is often hidden in the character set the engine knows. By the end of this guide you will be able to:
+
+* List every character an OCR engine can recognize for a given script.
+* Handle cases where a script is not supported.
+* Feed the character-set information into downstream validation or UI logic.
+
+No black-box tricks, just plain Python, a few lines of code, and a short explanation.
+
+---
+
+## Prerequisites
+
+Before you start, make sure you have:
+
+* Python 3.8+ installed (the code uses f-strings).
+* The OCR library that provides `ocr.OcrEngine`; for this example we assume the package named `ocr` is already installed via `pip install ocr-lib`.
+* Basic familiarity with Python's import system and printing.
+
+If you are missing the library, run:
+
+```bash
+pip install ocr-lib
+```
+
+That is all; no extra binaries or native dependencies are required for the basic character-set queries we will run.
+
+---
+
+## Step 1: Initialize the OCR Engine Instance
+
+Creating an engine object is the first thing you do when you **use the OCR engine**. Think of it as switching on a scanner; without it, you cannot ask any questions.
+
+```python
+import ocr
+
+# Step 1: Create an OCR engine instance
+engine = ocr.OcrEngine()
+```
+
+The `OcrEngine` class is a lightweight wrapper around the underlying recognition engine. It does not start heavy processing until you request something, so the startup cost is negligible.
+
+> **Pro tip:** If your application creates many engine instances, reuse a single one. You save memory and reduce initialization latency.
+
+---
+
+## Step 2: Query the OCR Character Set for a Specific Script
+
+Now that we have an engine, we can ask which characters it knows for a particular writing system. The `get_supported_characters(script_name)` method returns a Python `list` of Unicode characters.
+
+```python
+# Step 2: Get the list of characters the engine can recognize for the Latin script
+latin_chars = engine.get_supported_characters("Latin")
+print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}")
+```
+
+Running the snippet above prints something like:
+
+```
+Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+```
+
+The exact output depends on the version of the OCR library, but you always get a flat list of characters.
If you need both uppercase and lowercase letters, request the `"Latin"` script and apply `.lower()` to each element, or query a separate `"Latin‑lower"` script if the library distinguishes them.
+
+### Why does this matter?
+
+When you feed in an image that contains unusual diacritics (for example, “ñ” or “ø”), the OCR engine may silently replace them with a placeholder or drop them entirely. Knowing the **OCR character set** in advance lets you pre-validate the input, warn users, or fall back to another engine.
+
+---
+
+## Step 3: Retrieve Characters for Another Script – Cyrillic Example
+
+The same method works for any script the engine claims to support. Let's see what happens with Cyrillic.
+
+```python
+# Step 3: Get the list of characters the engine can recognize for the Cyrillic script
+cyrillic_chars = engine.get_supported_characters("Cyrillic")
+print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}")
+```
+
+Typical output:
+
+```
+Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я']
+```
+
+If the engine does **not** support the requested script, it normally raises a `ValueError` (or returns an empty list, depending on the implementation). Let's guard against that.
+
+```python
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+# Example usage:
+safe_get_characters(engine, "Arabic")   # Might trigger the error branch
+```
+
+---
+
+## Step 4: Visualize the OCR Character Set (Optional)
+
+A quick visualization sometimes helps, especially when you need to show stakeholders which characters are covered. Below is a minimal example that uses `matplotlib` to render a grid of characters.
+
+```python
+import matplotlib.pyplot as plt
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Visualize Latin characters
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+> **Image alt text:** ![OCR character set diagram](ocr_character_set.png){alt="Overview of the OCR character set"}
+
+The diagram is not required for the main solution, but it shows how you can **use the OCR engine** for more than simple printing.
+
+---
+
+## Step 5: Integrate the Character Set into Real-World Workflows
+
+Now that we can obtain the **OCR character set**, let's look at a practical scenario: validating OCR output before storing it in a database.
+
+```python
+def validate_ocr_output(text, allowed_chars):
+    """Return True if every character in `text` exists in `allowed_chars`."""
+    invalid = [c for c in text if c not in allowed_chars]
+    if invalid:
+        print(f"Invalid characters found: {invalid}")
+        return False
+    return True
+
+# Suppose we ran OCR on an image and got this result:
+ocr_result = "HELLO WORLD!"
+
+# Use the previously fetched Latin set for validation.
+# With the 26-letter list shown earlier, the space and '!' are reported as invalid.
+if validate_ocr_output(ocr_result, set(latin_chars)):
+    print("All characters are supported – safe to store.")
+else:
+    print("Result contains unsupported symbols – consider manual review.")
+```
+
+This pattern keeps junk data from leaking into downstream pipelines, a frequent pain point when working with multilingual documents.
+
+---
+
+## Common Pitfalls and Edge Cases
+
+| Pitfall | Why it happens | How to avoid it |
+|--------|----------------|--------------|
+| **Typo in the script name** (for example, `"Cyrillic "` with a trailing space) | The engine treats the string literally and cannot find the script. | Strip whitespace with `script.strip()` before calling `get_supported_characters`. |
+| **Empty character list** | Some engines expose a script but have not yet loaded the language model. | Call `engine.load_language_model(script)` if the library offers that method, or make sure the model files are present. |
+| **Unicode normalization issues** | Characters such as “é” may appear composed (`\u00E9`) or decomposed (`e\u0301`). | Normalize strings with `unicodedata.normalize('NFC', text)` before validation. |
+| **Performance on huge scripts** (for example, Chinese) | Retrieving thousands of characters can be slow. | Cache the result after the first call; most applications only need the list once.
|
+
+---
+
+## Complete Working Example
+
+Putting it all together, here is a single script you can copy, paste, and run:
+
+```python
+import ocr
+import matplotlib.pyplot as plt
+
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Driver assembled from Steps 1-4 above
+engine = ocr.OcrEngine()
+latin_chars = safe_get_characters(engine, "Latin")
+cyrillic_chars = safe_get_characters(engine, "Cyrillic")
+if latin_chars:
+    plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+
+## What Should You Learn Next?
+
+
+The following tutorials cover closely related topics that extend the techniques demonstrated in this guide. Each resource includes complete code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects.
+
+- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/)
+- [OCR Post-processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/)
+- [How to Set the Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/spanish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/spanish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
new file mode 100644
index 000000000..3b0dea098
--- /dev/null
+++ b/ocr/spanish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
@@ -0,0 +1,261 @@
+---
+category: general
+date: 2026-06-06
+description: Perform OCR on an image using Aspose OCR and a Hugging Face model.
+  Learn how to download the Hugging Face model, extract text from an invoice, and
+  free the GPU resources.
+draft: false
+keywords:
+- perform OCR on image
+- download Hugging Face model
+- extract text from invoice
+- free GPU resources
+- load image for OCR
+language: es
+og_description: Perform OCR on an image using Aspose OCR and a Hugging Face
+  model. This tutorial shows how to download the model, extract text from an invoice,
+  and free the GPU resources.
+og_title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn
+    how to download Hugging Face model, extract text from invoice, and free GPU resources.
+  headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+  type: TechArticle
+tags:
+- OCR
+- Aspose
+- Python
+- AI
+- HuggingFace
+title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+url: /es/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+
+Have you ever wanted to **perform OCR on image** files but got stuck on the question "where do I start?" You are not alone; many developers hit that wall when they begin automating documents. The good news is that with Aspose OCR and a lightweight Hugging Face LLM, you can turn a raw invoice scan into clean, searchable text with just a few lines of Python.
+
+In this tutorial we cover everything you need: how to **load image for OCR**, how to **download Hugging Face model** files, how to **extract text from invoice** scans, and finally how to **free GPU resources** so your application stays lean. By the end you will have a self-contained script you can drop into any project.
+
+---
+
+## What You Will Learn
+
+- How to **perform OCR on image** files using Aspose's `OcrEngine`.
+- The exact steps to **download Hugging Face model** files automatically.
+- Techniques to **extract text from invoice** PDFs or PNGs with AI-powered post-processing.
+- Good practices to **free GPU resources** after inference.
+- Tips to **load image for OCR** efficiently and avoid common mistakes.
+
+You do not need external documentation; everything you need is here, with complete code, explanations, and expected output.
+
+---
+
+## Prerequisites
+
+Before you start, make sure you have:
+
+| Requirement | Reason |
+|-----------|-------|
+| Python 3.9+ | Modern syntax and type annotations |
+| `asposeocr` package (`pip install asposeocr`) | Core OCR engine |
+| Access to a GPU (optional but recommended) | Speeds up the LLM post-processor |
+| Invoice image (`sample_invoice.png`) | Real-world test case |
+
+If any of these are missing, install them now; the script will also **download Hugging Face model** files automatically, so you do not have to look for them by hand.
+
+---
+
+## Step 1: Perform OCR on Image – Create the Engine
+
+The first thing to do is start the Aspose OCR engine. Think of it as opening a blank canvas on which the image will later be turned into text.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# Step 1: Initialize the OCR engine
+engine = ocr.OcrEngine()
+```
+
+> **Why it matters:** `OcrEngine` abstracts all the low-level image pre-processing, so you can focus on the high-level workflow. It also exposes a `set_post_processor` method that later lets us plug in an LLM for smarter output.
+
+---
+
+## Step 2: Load Image for OCR – Choose the Right File
+
+Now that the engine exists, we need to **load image for OCR**. Aspose supports PNG, JPG, TIFF, and several other formats. Make sure the path is absolute or relative to the location of your script.
+
+```python
+# Step 2: Load the image you want to process
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+```
+
+> **Tip:** If your image is large, consider resizing it first to reduce memory pressure. The OCR engine can handle high-resolution scans, but a 300 DPI image is usually the sweet spot for invoices.
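
The resizing tip above comes down to picking target dimensions that keep the aspect ratio. A minimal sketch of that arithmetic follows; the function name `scaled_size` and the default `max_side=2000` are assumptions chosen for illustration, not values recommended by Aspose. The actual pixel resampling would be done by an imaging library of your choice.

```python
def scaled_size(width, height, max_side=2000):
    """Return (width, height) scaled so the longer side is at most `max_side`.

    Hypothetical helper: the aspect ratio is preserved and images that
    already fit are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Guard against a dimension rounding down to zero for extreme ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For a 4000 x 3000 scan and the default limit, the scale factor is 0.5, so the target size is 2000 x 1500.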
+ +--- + +## Paso 3: Realiza OCR Bruto y Visualiza el Texto Extraído + +Con la imagen cargada, finalmente podemos **realizar OCR en imagen** y ver qué produce el motor sin procesar. Este paso nos da una línea base antes de añadir la magia de IA. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Salida esperada (truncada):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +La salida cruda suele contener saltos de línea, caracteres mal reconocidos o campos ausentes—exactamente por eso incorporaremos un modelo de lenguaje en el siguiente paso. + +--- + +## Paso 4: Descargar Modelo de Hugging Face – Configura el Post‑Procesador LLM + +Aquí es donde brilla el paso **descargar modelo de Hugging Face**. Aspose AI puede obtener automáticamente un modelo del hub de Hugging Face si aún no está en disco. Usaremos el modelo Qwen2.5‑3B‑Instruct‑GGUF, que equilibra precisión y huella de memoria. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Por qué funciona:** `allow_auto_download` te evita la descarga manual del archivo `.gguf`. La cuantización (`int8`) reduce el tamaño del modelo a aproximadamente 3 GB, haciéndolo viable en la mayoría de GPUs de consumo. Ajusta `gpu_layers` según tu hardware—más capas en GPU = inferencia más rápida. 
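
Como `gpu_layers` depende del hardware, una opción es no fijarlo en el código. El siguiente boceto lee el valor de una variable de entorno y usa un valor por defecto si no está definida. El nombre `OCR_GPU_LAYERS` y la función `gpu_layers_from_env` son hipotéticos; no forman parte de Aspose OCR.

```python
import os


def gpu_layers_from_env(default=20, var_name="OCR_GPU_LAYERS"):
    """Read the number of GPU layers from an environment variable.

    OCR_GPU_LAYERS is a hypothetical variable name chosen for this sketch.
    """
    raw = os.environ.get(var_name)
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError(f"{var_name} must be >= 0, got {value}")
    return value


# The result would be passed as gpu_layers=... when building the model config.
print(gpu_layers_from_env())
```

Con `OCR_GPU_LAYERS=0` el mismo script se ejecutaría solo en CPU, sin tocar el código.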
+ +--- + +## Paso 5: Extraer Texto de la Factura Usando Post‑Procesamiento Mejorado por IA + +Ahora conectamos el LLM al motor OCR y ejecutamos un **post‑procesador** que limpia la salida cruda, corrige errores de OCR y formatea los campos de la factura de manera ordenada. + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Ejemplo de salida mejorada:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **¿Qué ocurrió?** El LLM reconoció que “Invoice #12345” debería ser “Invoice Number: 12345”, corrigió el formato de fecha e incluso inferió el campo “Bill To” que el motor crudo omitió. Este es el núcleo de la automatización **extraer texto de factura**. + +--- + +## Paso 6: Liberar Recursos de GPU – Limpieza Después del Procesamiento + +Si ejecutas esto en un servicio de larga duración (p. ej., una API Flask), debes **liberar los recursos de GPU** después de cada inferencia para evitar fallos por falta de memoria. Aspose AI ofrece un método sencillo para ello. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Consejo profesional:** Llama a `free_resources()` dentro de un bloque `finally:` si envuelves la llamada OCR en un `try/except`. Así garantizas la limpieza incluso cuando ocurre una excepción. + +--- + +## Paso 7: Script Completo – Junta Todo + +A continuación tienes el script completo, listo para ejecutarse. Copia‑pega, ajusta las rutas y listo. 
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Ejecuta el script y observa la transformación de OCR ruidoso a datos de factura limpios y estructurados. 🎉 + +--- + +## Preguntas Frecuentes & Casos Límite + +| Pregunta | Respuesta | +|----------|-----------| +| **¿Qué pasa si el modelo no se descarga?** | Asegúrate de que tu máquina tenga acceso a internet y que `hugging_face_repo_id` sea correcto. También puedes descargar manualmente el archivo. | +| **¿Puedo usar CPU en lugar de GPU?** | Sí, pero el post‑procesador será mucho más lento. Ajusta `gpu_layers` a 0 para forzar ejecución en CPU. | +| **¿Cómo manejo facturas con varios idiomas?** | Puedes combinar Aspose OCR con modelos multilingües de Hugging Face; simplemente cambia el `repo_id` al modelo deseado. | +| **¿Hay un límite de tamaño de imagen?** | No hay un límite estricto, pero imágenes muy grandes pueden agotar la memoria. Redúcelas a 300 DPI cuando sea posible. | + +--- + +## ¿Qué Deberías Aprender a Continuación? 
+ +Los tutoriales siguientes cubren temas estrechamente relacionados que amplían las técnicas demostradas en esta guía. Cada recurso incluye ejemplos de código completos con explicaciones paso a paso para ayudarte a dominar funciones adicionales de la API y explorar enfoques alternativos en tus propios proyectos. + +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/spanish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/spanish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..fe0fc3f18 --- /dev/null +++ b/ocr/spanish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,283 @@ +--- +category: general +date: 2026-06-06 +description: reconocer texto de una imagen con Aspose OCR – aprende cómo cargar la + imagen para OCR y realizar OCR en la imagen usando post‑procesamiento de IA en Python. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: es +og_description: Reconoce texto de una imagen rápidamente. Esta guía muestra cómo cargar + una imagen para OCR, realizar OCR en la imagen y mejorar los resultados con post‑procesamiento + de IA. 
+og_title: reconocer texto de la imagen usando Aspose OCR y IA +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = 
= AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely, just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: reconocer texto de una imagen usando Aspose OCR e IA +url: /es/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Reconocer texto de una imagen usando Aspose OCR e IA + +¿Alguna vez necesitaste reconocer texto de una imagen pero no estabas seguro de qué biblioteca te ofrecería tanto velocidad como precisión? No estás solo. En esta guía recorreremos un ejemplo completo, de extremo a extremo, que muestra **cómo cargar una imagen para OCR**, **realizar OCR en una imagen** y luego pulir la salida con el post‑procesador de IA de Aspose. Al final tendrás un script listo para ejecutar que convierte un PNG en texto limpio y buscable. + +## Lo que aprenderás + +Cubriremos todo, desde la instalación del paquete Aspose OCR hasta la liberación de recursos al final de la ejecución.
Verás por qué habilitar el reconocimiento de texto manuscrito es importante, cómo configurar un LLM Qwen 2.5 para el post‑procesamiento, y cómo se ve la salida final. + +### Requisitos previos + +- Python 3.8 o superior +- paquete `asposeocr` (`pip install asposeocr`) +- Un archivo de imagen (p. ej., `doc.png`) que contenga texto impreso o manuscrito +- Opcional: una GPU para inferencia de LLM más rápida (el script también funciona en CPU) + +--- + +## Reconocer texto de imagen – Paso a paso + +Debajo de cada bloque de código encontrarás una breve explicación de **por qué** hacemos esa acción en particular, no solo **qué** hace la línea. + +### Paso 1: Instalar e importar los módulos requeridos + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*¿Por qué?* Importar `asposeocr` nos brinda la clase `OcrEngine`, mientras que el submódulo `ai` proporciona el post‑procesador basado en LLM que mejora drásticamente la salida cruda de OCR. + +### Paso 2: Crear el motor OCR y habilitar el reconocimiento de texto manuscrito + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Habilitar el reconocimiento manuscrito amplía el conjunto de caracteres del motor, por lo que no perderás garabatos al **realizar OCR en una imagen** que contenga texto impreso y cursivo mezclado. + +### Paso 3: Cargar la imagen para OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +La llamada `load_image` es el momento en que **cargas la imagen para OCR**; si la ruta es incorrecta, el motor lanzará una excepción informativa, ahorrándote errores crípticos posteriores. 
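
Si prefieres detectar una ruta incorrecta antes de que llegue al motor, puedes validarla con la biblioteca estándar. Lo que sigue es un boceto mínimo: `resolve_image_path` es un nombre hipotético y no pertenece a la API de Aspose OCR.

```python
from pathlib import Path


def resolve_image_path(raw_path):
    """Return an absolute path string, or raise if the file does not exist.

    Hypothetical helper for this sketch; not part of the Aspose OCR API.
    """
    path = Path(raw_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    return str(path)


# Intended use with the engine created in the previous steps:
# ocr_engine.load_image(resolve_image_path("YOUR_DIRECTORY/doc.png"))
```

Al resolver la ruta a una forma absoluta, el mensaje de error muestra exactamente dónde se buscó el archivo, algo útil cuando el script se ejecuta desde otro directorio de trabajo.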
+ +### Paso 4: Ejecutar la pasada OCR cruda + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +En esta etapa obtienes un objeto `RecognitionResult` que contiene el texto sin filtrar, puntuaciones de confianza y metadatos de diseño. A menudo es ruidoso, de ahí la necesidad de una limpieza impulsada por IA. + +### Paso 5: Configurar el modelo AI de Aspose para el post‑procesamiento LLM + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +¿Por qué molestarse con estas configuraciones? +- **auto‑download** garantiza que el modelo esté disponible en la primera ejecución. +- **int8 quantization** reduce drásticamente la demanda de memoria sin una gran pérdida de precisión. +- **gpu_layers** te permite aprovechar una GPU compatible para una inferencia más rápida. + +### Paso 6: Inicializar el procesador AI y adjuntarlo como post‑procesador + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +Adjuntar el procesador significa que cada vez que llames a `run_postprocessor`, el LLM limpiará la ortografía, combinará palabras rotas e incluso inferirá la puntuación faltante. 
+ +### Paso 7: Ejecutar el post‑procesador para mejorar la salida OCR + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` suele ser mucho más legible que la cadena cruda; piénsalo como un corrector ortográfico que también entiende el contexto. + +### Paso 8: Liberar recursos + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Limpiar es crucial en servicios de larga duración; de lo contrario, filtrarás memoria de GPU y eventualmente bloquearás tu aplicación. + +--- + +## Script completo que puedes ejecutar hoy + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Salida esperada** (ejemplo para una 
imagen de factura simple): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +En una ejecución típica, la capa de IA corrige errores de reconocimiento como “Inv0ice” → “Invoice” y añade la puntuación faltante. + +--- + +## Preguntas frecuentes (y respuestas rápidas) + +- **¿Necesito una GPU?** No. El script recurre a la CPU, pero la inferencia será más lenta. +- **¿Puedo usar un LLM diferente?** Por supuesto: solo cambia `hugging_face_repo_id` y ajusta `gpu_layers` en consecuencia. +- **¿Qué pasa si mi imagen es enorme?** Redimensiónala primero (p. ej., usando Pillow) para mantener un uso de memoria razonable. +- **¿El reconocimiento manuscrito está siempre activado?** Puedes alternar `enable_handwritten_recognition` según tu carga de trabajo. + +--- + +## Conclusión + +Ahora sabes cómo **reconocer texto de una imagen** usando Aspose OCR, cómo **cargar la imagen para OCR** y cómo **realizar OCR en una imagen** con post‑procesamiento mejorado por IA. El ejemplo completo y ejecutable anterior te brinda una base sólida para integrar OCR en cualquier proyecto Python, ya sea escaneando recibos, digitalizando contratos o extrayendo datos de formularios manuscritos. + +¿Listo para el siguiente paso? Prueba cambiar el modelo Qwen por uno más grande, experimenta con diferentes esquemas de cuantización o recorre varias imágenes para procesarlas por lotes. El código que acabas de crear sirve de punto de partida para todas estas variantes. + +¡Feliz codificación, y que tus resultados de OCR siempre sean cristalinos! + +![Captura de pantalla de la consola Python mostrando salida OCR mejorada](/images/ocr_output.png){alt="Captura de pantalla que muestra cómo reconocer texto de una imagen usando Aspose OCR"} + +## ¿Qué deberías aprender a continuación? + +Los siguientes tutoriales cubren temas estrechamente relacionados que se basan en las técnicas demostradas en esta guía.
Cada recurso incluye ejemplos de código completos y funcionales con explicaciones paso a paso para ayudarte a dominar características adicionales de la API y explorar enfoques de implementación alternativos en tus propios proyectos. + +- [Extraer texto de una imagen con Aspose OCR – Guía paso a paso](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Convertir imagen a texto – Realizar OCR en una imagen desde URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extraer texto de imagen en C# con selección de idioma usando Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/swedish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/swedish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..6a2c07463 --- /dev/null +++ b/ocr/swedish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,215 @@ +--- +category: general +date: 2026-06-06 +description: Extrahera text från en handskriven bild med Python OCR. Lär dig hur du + snabbt och pålitligt konverterar en handskriven bild till text. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: sv +og_description: Extrahera text från en handskriven bild med Python. Den här guiden + visar hur man konverterar en handskriven bild till text och svarar på hur man känner + igen handskriven text. 
+og_title: Extrahera text från handskriven bild – Python OCR-handledning +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. + headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Extrahera text från handskriven bild med Python OCR – Steg‑för‑steg guide +url: /sv/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Extrahera text från handskriven bild med Python OCR – steg‑för‑steg‑guide + +Har du någonsin undrat **hur man känner igen handskriven text** i ett foto du tog med din telefon? Du är inte ensam. I många projekt—oavsett om det handlar om att digitalisera föreläsningsanteckningar eller hämta data från undertecknade formulär—behöver du **extrahera text från handskriven bild** snabbt och utan huvudvärk. + +I den här handledningen går vi igenom ett komplett, färdigt‑att‑köra exempel som visar exakt hur du **konverterar handskriven foto till text** med ett populärt Python‑OCR‑bibliotek. Inga vaga referenser, bara konkret kod, förklaringar och tips du kan kopiera‑klistra idag. 
+ +![extract text from handwritten image](https://example.com/placeholder-handwritten.jpg "extract text from handwritten image") + +## Vad du behöver + +Innan vi dyker ner, se till att du har följande: + +| Krav | Varför det är viktigt | +|------|-----------------------| +| Python 3.9 eller nyare | Modern syntax och biblioteksstöd | +| `pip` (Pythons pakethanterare) | För att installera OCR‑paketet | +| En skarp bild av handskrivna anteckningar (JPEG/PNG) | OCR‑noggrannheten sjunker på suddiga bilder | +| Grundläggande kunskap om Python‑funktioner | Hjälper dig att anpassa exemplet senare | + +Om du saknar Python, hämta den senaste versionen och installera den. + +## Installera Python OCR‑biblioteket + +Kodsnutten vi använder bygger på paketet `ocr` (ett lätt omslag runt Tesseract som lägger till praktiska inställningar). Installera det med ett enda kommando: + +```bash +pip install ocr +``` + +> **Pro tip:** Efter installationen, kör `tesseract --version` i din terminal för att bekräfta att den underliggande motorn finns. Om du inte har Tesseract, följ den officiella guiden för ditt OS. De flesta pakethanterare har den (`apt-get install tesseract-ocr` på Ubuntu, `brew install tesseract` på macOS). + +## Steg 1: Skapa en OCR‑motorinstans + +Att skapa motorn är den första byggstenen i **python ocr handwritten recognition**. Tänk på motorn som hjärnan som senare kommer att läsa klotterna. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Varför detta är viktigt: utan en motor kan du inte finjustera igenkännings‑pipelinen. Standardinställningarna är optimerade för tryckt text, så vi måste justera dem i nästa steg. + +## Steg 2: Aktivera handskriftsigenkänning + +Som standard antar motorn tryckt text.
Att aktivera handskriftsläge slår på en switch som talar om för Tesseract att använda sin LSTM‑modell tränad på kursiva streck. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Vad händer om du hoppar över detta?** OCR‑motorn kommer att behandla strecken som brus, vilket ger en förvrängd utskrift. Att sätta flaggan är kärnan i **how to recognize handwritten text**. + +## Steg 3: Ladda ditt handskrivna foto + +Nu pekar vi motorn på bildfilen. Sökvägen kan vara absolut eller relativ; se bara till att filen finns. + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +En snabb kontroll: öppna bilden i ditt operativsystems bildvisare. Om texten är suddig, överväg förbehandling (kontrastökning, rotation) innan du matar in den i motorn—sådana justeringar höjer ofta framgångsfrekvensen för **extract text from handwritten image**. + +## Steg 4: Kör igenkänningsprocessen + +När allt är på plats ber vi motorn utföra sitt jobb. Metoden `recognize()` returnerar ett resultatobjekt som innehåller den extraherade strängen och förtroendesiffror. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Bakom kulisserna konverterar motorn bitmapen till en serie funktionsvektorer, kör dem genom LSTM‑nätverket och sätter ihop tecknen. Det är magin i **python ocr handwritten recognition**. + +## Steg 5: Visa den extraherade texten + +Resultatobjektet har en `.text`‑attribut som innehåller den rena Unicode‑strängen. Skriv ut den, spara den i en fil eller skicka den vidare i en annan pipeline—du bestämmer. 
+ +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Förväntad utskrift + +Om källbilden innehåller noteringen “Buy milk, eggs, and bread”, skulle du se något i stil med: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Lägg märke till hur utskriften bevarar skiljetecken och radbrytningar (om några). Om du får nonsens, dubbelkolla bildkvaliteten och flaggan `enable_handwritten_recognition`. + +## Hantera vanliga fallgropar + +| Problem | Symtom | Lösning | +|---------|--------|---------| +| Låga förtroendesiffror | Många “?” eller meningslösa tecken | Öka bildens DPI till ≥300, applicera binarisering (`opencv`), eller beskära till intresseområdet. | +| Blandade språk | Utskrift blandar engelska och ett annat skriftsystem | Sätt `engine.ocr_settings.language = "eng"` (eller annan ISO‑kod) före `recognize()`. | +| Stora filer | Lång bearbetningstid eller minnesfel | Ändra storlek på bilden till rimliga dimensioner (t.ex. maxbredd 1200 px) innan du laddar den. | +| Saknad Tesseract | `ImportError` eller `FileNotFoundError` | Installera Tesseract separat och se till att den finns i system‑PATH. | + +Dessa justeringar gör ditt **convert handwritten photo to text**‑flöde robust över olika dataset. + +## Fullt skript du kan köra idag + +Nedan är det kompletta, självständiga programmet som sätter ihop alla bitar. Kopiera det till en fil som heter `handwritten_ocr.py` och kör `python handwritten_ocr.py`. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. 
+""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Kör det, så ser du texten skriven i konsolen—precis det resultat du behöver när du vill **convert handwritten photo to text**. + +## Gå längre + +Nu när du behärskar grunderna i **extract text from handwritten image**, fundera på följande nästa steg: + +- **Batch‑behandling:** Loopa igenom en mapp med bilder och lagra varje resultat i en CSV‑fil. +- **Post‑behandling:** Använd reguljära uttryck för att rensa vanliga OCR‑fel (t.ex. “1” vs “l”). +- **Integration:** Skicka de extraherade strängarna till en Natural Language Processing‑pipeline för sentiment‑analys eller nyckelordsutvinning. +- **Alternativa bibliotek:** Om du behöver högre precision, utforska `easyocr` eller `pytesseract` med anpassade LSTM‑modeller—båda stödjer **python ocr handwritten recognition** också. + +Kom ihåg att kvaliteten på källbilden ofta avgör framgång, så spendera några minuter på förbehandling. Lite extra arbete nu sparar dig mycket felsökning senare. + +## Slutsats + +Vi har gått igenom ett komplett, end‑to‑end‑exempel som visar **how to recognize handwritten text** och, ännu viktigare, hur du **extract text from handwritten image** med Python. Genom att installera `ocr`‑paketet, slå på handskriftsflaggan, ladda din bild och anropa `recognize()`, kan du **convert handwritten photo to text** med bara ett fåtal rader kod. + +Prova med dina egna anteckningar, justera förbehandlingsstegen, och låt OCR‑motorn göra det tunga arbetet. 
Om du stöter på problem, återvänd till tabellen “Hantera vanliga fallgropar” eller experimentera med alternativa OCR‑back‑ends. Lycka till med kodandet, och må din handskrivna data bli omedelbart sökbar! + +## Vad bör du lära dig härnäst? + +De följande handledningarna täcker närbesläktade ämnen som bygger på teknikerna i den här guiden. Varje resurs innehåller kompletta fungerande kodexempel med steg‑för‑steg‑förklaringar för att hjälpa dig bemästra ytterligare API‑funktioner och utforska alternativa implementeringsmetoder i dina egna projekt. + +- [Extrahera text från bild med Aspose OCR – steg‑för‑steg‑guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Hur man förbereder rektanglar för OCR‑textigenkänning i Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [Hur man använder OCR – känna igen bild utan textområdesdetektering](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/swedish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/swedish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..ebe5537ca --- /dev/null +++ b/ocr/swedish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,241 @@ +--- +category: general +date: 2026-06-06 +description: 'OCR-teckenuppsättningsguide: lär dig hur du använder OCR-motorn för + att lista stödda tecken för latin- och kyrilliska skript med fullständiga Python‑exempel.' 
+draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: sv +og_description: 'OCR-teckenuppsättningsguide: upptäck hur du använder OCR-motorn i + Python för att lista stödda tecken för olika skript, komplett med kod och tips.' +og_title: OCR-teckenuppsättning – Hämta och använd OCR-motorn +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' + headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR-teckenuppsättning – Hämta och använd OCR-motor i Python +url: /sv/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR-teckenuppsättning – Hämta och Använd OCR-motor i Python + +Vill du veta **ocr character set** som ditt bibliotek stöder? I den här handledningen visar vi hur du **use OCR engine** för att fråga efter stödda tecken för olika skript, och varför det är viktigt för verkliga projekt. + +Om du någonsin har stirrat på en tom skärm och undrat varför vissa bokstäver aldrig visas efter OCR‑behandling, är du inte ensam. Svaret är ofta gömt i teckenuppsättningen som motorn känner till. I slutet av den här guiden kommer du att kunna: + +* Lista varje tecken som en OCR-motor kan känna igen för ett givet skript. +* Hantera fall där ett skript inte stöds. +* Anslut teckenuppsättningsinformationen till efterföljande validering eller UI‑logik. + +Inga magiska black‑box‑trick—bara ren Python, ett par kodrader och en liten förklaring. + +--- + +## Förutsättningar + +Innan vi hoppar in, se till att du har: + +* Python 3.8+ installerat (koden använder f‑strings). 
+* OCR‑biblioteket som tillhandahåller `ocr.OcrEngine`—för detta exempel antar vi att ett paket som heter `ocr` redan är installerat via `pip install ocr-lib`. +* Grundläggande kunskap om Pythons import‑system och utskrift. + +Om du saknar biblioteket, kör: + +```bash +pip install ocr-lib +``` + +Det är allt—inga extra binärer eller inhemska beroenden behövs för de grundläggande teckenuppsättningsfrågorna vi kommer att utföra. + +## Steg 1: Initiera OCR‑motorinstansen + +Att skapa ett motorobjekt är det första du gör när du **use OCR engine**‑funktionalitet. Tänk på det som att slå på en skanner; utan den kan du inte ställa några frågor. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +`OcrEngine`‑klassen är ett lättviktigt omslag runt den underliggande igenkänningsmotorn. Den påbörjar inte tung bearbetning förrän du begär något, så startkostnaden är försumbar. + +> **Pro tip:** Om din applikation skapar många motorinstanser, återanvänd en enda. Det sparar minne och minskar initieringslatens. + +## Steg 2: Fråga OCR‑teckenuppsättningen för ett specifikt skript + +Nu när vi har en motor kan vi fråga den vilka tecken den känner till för ett särskilt skriftsystem. Metoden `get_supported_characters(script_name)` returnerar en Python `list` med Unicode‑tecken. + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +Att köra kodsnutten ovan skriver ut något liknande: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +Den exakta utskriften beror på OCR‑bibliotekets version, men du får alltid en platt lista med tecken. 
Om du behöver både versaler och gemener, begär helt enkelt skriptet `"Latin"` och tillämpa sedan `.lower()` på varje element, eller fråga ett separat `"Latin‑lower"`‑skript om biblioteket skiljer dem åt. + +### Varför är detta viktigt? + +När du matar in en bild som innehåller ovanliga diakritiska tecken (t.ex. “ñ” eller “ø”), kan OCR‑motorn tyst ersätta dem med en platshållare eller helt enkelt släppa dem. Att känna till **ocr character set** i förväg låter dig förvalidera indata, varna användare eller falla tillbaka på en annan motor. + +## Steg 3: Hämta tecken för ett annat skript – Cyrillic‑exempel + +Samma metod fungerar för alla skript som motorn påstår sig stödja. Låt oss se vad som händer med Cyrillic. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +Typisk utskrift: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +Om motorn **inte** stödjer det begärda skriptet, kastar den vanligtvis ett `ValueError` (eller returnerar en tom lista beroende på implementationen). Låt oss skydda mot det. 
+ +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +## Steg 4: Visualisera OCR‑teckenuppsättningen (valfritt) + +Ibland hjälper en snabb visualisering, särskilt när du behöver visa intressenter vilka tecken som täcks. Nedan är ett minimalt exempel som använder `matplotlib` för att rendera ett rutnät av tecken. + +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Bildens alt‑text:** ![OCR character set diagram](ocr_character_set.png){alt="OCR character set overview"} + +Diagrammet krävs inte för kärnlösningen, men det visar hur du kan **use OCR engine** metadata utöver enkel utskrift. + +## Steg 5: Integrera teckenuppsättningen i verkliga arbetsflöden + +Nu när vi kan hämta **ocr character set**, låt oss se ett praktiskt scenario: validera OCR‑utdata innan den lagras i en databas. 
+ +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +Detta mönster förhindrar att skräpdata smiter in i efterföljande pipelines, ett vanligt problem när man hanterar flerspråkiga dokument. + +## Vanliga fallgropar och kantfall + +| Fallgrop | Varför det händer | Hur du undviker det | +|---------|----------------|--------------| +| **Felstavat skriptnamn** (t.ex. `"Cyrillic "` med efterföljande mellanslag) | Motorn behandlar strängen bokstavligt och kan inte hitta skriptet. | Ta bort blanksteg: `script.strip()` innan du anropar `get_supported_characters`. | +| **Tom teckenlista** | Vissa motorer exponerar ett skript men har ännu inte laddat språkmodellen. | Anropa `engine.load_language_model(script)` om biblioteket tillhandahåller en sådan metod, eller säkerställ att modellfilerna finns. | +| **Unicode‑normaliseringsproblem** | Tecken som “é” kan visas som sammansatta (`\u00E9`) eller dekomponerade (`e\u0301`). | Normalisera strängar med `unicodedata.normalize('NFC', text)` innan validering. | +| **Prestanda på stora skript** (t.ex. kinesiska) | Att hämta tusentals tecken kan vara långsamt. | Cacha resultatet efter första anropet; de flesta applikationer behöver bara listan en gång.
| + +## Fullt fungerande exempel + +När vi sätter ihop allt får vi ett enda skript som du kan kopiera och köra: + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') +``` + +## Vad bör du lära dig härnäst? + +Följande handledningar täcker närliggande ämnen som bygger på teknikerna som demonstrerats i den här guiden. Varje resurs innehåller kompletta fungerande kodexempel med steg‑för‑steg‑förklaringar för att hjälpa dig bemästra ytterligare API‑funktioner och utforska alternativa implementeringsmetoder i dina egna projekt.
+ +- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/) +- [OCR Post Processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [How to Set Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/swedish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/swedish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..81fdeb5ba --- /dev/null +++ b/ocr/swedish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,261 @@ +--- +category: general +date: 2026-06-06 +description: Utför OCR på bild med Aspose OCR och en Hugging Face‑modell. Lär dig + hur du laddar ner Hugging Face‑modellen, extraherar text från en faktura och frigör + GPU‑resurser. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: sv +og_description: Utför OCR på bild med Aspose OCR och en Hugging Face-modell. Denna + handledning visar hur du laddar ner modellen, extraherar text från en faktura och + frigör GPU-resurser. +og_title: Utför OCR på bild med Aspose OCR & LLM – Komplett guide +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Utför OCR på bild med Aspose OCR & LLM – Komplett guide +url: /sv/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Utför OCR på bild med Aspose OCR & LLM – Komplett guide + +Har du någonsin velat **utföra OCR på bild**‑filer men känt dig fast vid frågan “var börjar jag?”? Du är inte ensam—många utvecklare stöter på samma hinder när de först ger sig på dokumentautomatisering. Den goda nyheten är att med Aspose OCR och en lättviktig LLM från Hugging Face kan du förvandla en rå skanning av en faktura till ren, sökbar text på bara några rader Python‑kod. + +I den här tutorialen går vi igenom allt du behöver: från **laddning av bild för OCR**, till **nedladdning av Hugging Face‑modellen**, till **extrahering av text från fakturadatan**, och slutligen **frigör GPU‑resurser** så att din app förblir slank. När du är klar har du ett självständigt skript som du kan släppa in i vilket projekt som helst. + +--- + +## Vad du kommer att lära dig + +- Hur du **utför OCR på bild** med Asposes `OcrEngine`. +- De exakta stegen för att **ladda ner Hugging Face‑modell** automatiskt. +- Tekniker för att **extrahera text från faktura**‑PDF:er eller PNG‑filer med AI‑förbättrad efterbehandling. +- Bästa praxis för att **frigöra GPU‑resurser** efter inferens. +- Tips för att **ladda bild för OCR** effektivt och undvika vanliga fallgropar. + +Ingen extern dokumentation behövs—allt du behöver finns här, med fullständig kod, förklaringar och förväntad output. 
+ +--- + +## Förutsättningar + +Innan vi dyker in, se till att du har: + +| Krav | Orsak | +|------|-------| +| Python 3.9+ | Modern syntax och typ‑hints | +| `asposeocr`‑paketet (`pip install asposeocr`) | Kärn‑OCR‑motor | +| Tillgång till ett GPU (valfritt men rekommenderat) | Snabbar upp LLM‑efterbehandlingen | +| En fakturabild (`sample_invoice.png`) | Verkligt testfall | + +Om du saknar något av detta, installera det nu; skriptet kommer också **ladda ner Hugging Face‑modell** automatiskt, så du behöver inte leta efter filer själv. + +--- + +## Steg 1: Utför OCR på bild – Skapa motorn + +Det första du måste göra är att starta Asposes OCR‑motor. Tänk på den som en tom duk där bilden senare målas om till text. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Varför detta är viktigt:** `OcrEngine` abstraherar all låg‑nivå bild‑förbehandling, så att du kan fokusera på arbetsflödet på högre nivå. Den exponerar också en `set_post_processor`‑metod som senare låter oss koppla in en LLM för smartare output. + +--- + +## Steg 2: Ladda bild för OCR – Välj rätt fil + +Nu när motorn finns, måste vi **ladda bild för OCR**. Aspose stödjer PNG, JPG, TIFF och ett fåtal andra format. Se till att sökvägen är absolut eller relativ till ditt skripts plats. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **Tips:** Om din bild är stor, överväg att skala ner den i förväg för att minska minnesbelastningen. OCR‑motorn kan hantera högupplösta skanningar, men en 300 DPI‑bild är vanligtvis en bra kompromiss för fakturor. + +--- + +## Steg 3: Utför rå OCR och visa den extraherade texten + +Med bilden laddad kan vi äntligen **utföra OCR på bild** och se vad den råa motorn spottar ut. Detta steg ger oss en baslinje innan vi lägger till AI‑magi. 
+ +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Förväntad output (avkortad):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +Den råa outputen innehåller ofta radbrytningar, felaktigt igenkända tecken eller saknade fält—precis därför vi kommer att introducera en språkmodell i nästa steg. + +--- + +## Steg 4: Ladda ner Hugging Face‑modell – Konfigurera LLM‑efterbehandlingen + +Här kommer **ladda ner Hugging Face‑modell**‑steget till sin rätt. Aspose AI kan automatiskt hämta en modell från Hugging Face‑hubben om den inte redan finns på disk. Vi använder modellen Qwen2.5‑3B‑Instruct‑GGUF, som balanserar noggrannhet och minnesfotavtryck. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Varför detta fungerar:** `allow_auto_download` sparar dig från att manuellt ladda ner `.gguf`‑filen. Kvantisering (`int8`) minskar modellens storlek till ungefär 3 GB, vilket gör den genomförbar på de flesta konsument‑GPU:er. Justera `gpu_layers` efter din hårdvara—fler lager på GPU = snabbare inferens. + +--- + +## Steg 5: Extrahera text från faktura med AI‑förbättrad efterbehandling + +Nu fäster vi LLM:n på OCR‑motorn och kör en **efterbehandlare** som rensar den råa outputen, korrigerar OCR‑fel och formaterar fakturafälten snyggt. 
+ +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Exempel på förbättrad output:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **Vad hände?** LLM:n identifierade att “Invoice #12345” borde vara “Invoice Number: 12345”, fixade datumformatet och härledde även “Bill To”‑fältet som den råa motorn missade. Detta är kärnan i **extrahera text från faktura**‑automatisering. + +--- + +## Steg 6: Frigör GPU‑resurser – Rensa upp efter bearbetning + +Om du kör detta i en långlivad tjänst (t.ex. ett Flask‑API) måste du **frigöra GPU‑resurser** efter varje inferens för att undvika minnesläckor. Aspose AI erbjuder en enkel metod för detta. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro‑tips:** Anropa `free_resources()` i ett `finally:`‑block om du omsluter OCR‑anropet med try/except. Det garanterar städning även när ett undantag inträffar. + +--- + +## Steg 7: Fullt skript – Sätt ihop allt + +Nedan är det kompletta, körklara skriptet. Kopiera‑klistra in det, justera sökvägarna, så är du igång.
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Kör skriptet och se förvandlingen från brusig OCR till ren, strukturerad fakturadata. 🎉 + +--- + +## Vanliga frågor & kantfall + +| Fråga | Svar | +|-------|------| +| **Vad händer om modellen misslyckas med att laddas ner?** | Säkerställ att maskinen har internetåtkomst och att `hugging_face_repo_id` är korrekt. Du kan också manuellt ladda ner filen. | +| **Kan jag köra utan GPU?** | Ja, men efterbehandlingen blir betydligt långsammare. Du kan minska `gpu_layers` till 0 för att köra helt på CPU. | +| **Fungerar detta med PDF‑filer?** | Ja, konvertera PDF‑sidor till PNG först, eller använd Asposes PDF‑till‑bild‑funktion innan OCR. | +| **Hur hanterar jag flera fakturor i en mapp?** | Lägg in en loop som itererar över alla bildfiler och anropar samma pipeline för varje fil. | + +--- + +## Vad du bör lära dig härnäst + +De följande tutorialerna behandlar närliggande ämnen som bygger vidare på teknikerna i den här guiden. 
Varje resurs innehåller kompletta kodexempel med steg‑för‑steg‑förklaringar för att hjälpa dig bemästra ytterligare API‑funktioner och utforska alternativa implementationssätt i egna projekt. + +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/swedish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/swedish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..10181315a --- /dev/null +++ b/ocr/swedish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,282 @@ +--- +category: general +date: 2026-06-06 +description: igenkänna text från bild med Aspose OCR – lär dig hur du laddar en bild + för OCR och utför OCR på bilden med AI‑efterbehandling i Python. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: sv +og_description: Känn igen text från bild snabbt. Den här guiden visar hur du laddar + bild för OCR, utför OCR på bilden och förbättrar resultaten med AI‑efterbehandling. +og_title: Känn igen text från bild med Aspose OCR & AI +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU 
memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Känn igen text från bild med Aspose OCR & AI +url: /sv/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# igenkänna text från bild med Aspose OCR & AI + +Har du någonsin behövt känna igen text från en bild men varit osäker på vilket bibliotek som ger både hastighet och noggrannhet? Du är inte ensam. I den här guiden går vi igenom ett komplett, end‑to‑end‑exempel som visar **hur man laddar bild för OCR**, **utför OCR på bild**, och sedan polerar resultatet med Asposes AI‑postprocessor. I slutet har du ett färdigt skript som omvandlar en PNG till ren, sökbar text. + +## Vad du kommer att lära dig + +Vi kommer att gå igenom allt från att installera Aspose OCR‑paketet till att frigöra resurser i slutet av körningen. Du får se varför aktivering av handskriven‑textigenkänning är viktigt, hur man konfigurerar en Qwen 2.5 LLM för efterbehandling, och hur det slutgiltiga resultatet ser ut. Ingen extern referens krävs—bara kopiera, klistra in och kör. + +### Förutsättningar + +- Python 3.8 eller nyare +- `asposeocr`‑paketet (`pip install asposeocr`) +- En bildfil (t.ex. 
`doc.png`) som innehåller tryckt eller handskriven text +- Valfritt: en GPU för snabbare LLM‑inferens (skriptet fungerar även på CPU) + +--- + +## Känna igen text från bild – Steg‑för‑steg + +Under varje kodblock hittar du en kort förklaring av **varför** vi gör just den handlingen, inte bara **vad** raden gör. + +### Steg 1: Installera och importera de nödvändiga modulerna + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Varför?* Att importera `asposeocr` ger oss `OcrEngine`‑klassen, medan `ai`‑undermodulen tillhandahåller den LLM‑baserade post‑processorn som städar upp den råa OCR‑utdatan. + +### Steg 2: Skapa OCR‑motorn och aktivera handskriven textigenkänning + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +Att aktivera igenkänning av handskrift gör att handskrivna anteckningar inte går förlorade när du **utför OCR på bild**‑filer som innehåller både tryckt och handskriven text. + +### Steg 3: Ladda bilden för OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +`load_image`‑anropet är ögonblicket då du **laddar bild för OCR**; om sökvägen är fel kommer motorn att kasta ett informativt undantag, vilket sparar dig från kryptiska fel senare i kedjan. + +### Steg 4: Kör det råa OCR‑passet + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +I detta skede får du ett `RecognitionResult`‑objekt som innehåller den ofiltrerade texten, konfidensvärden och layout‑metadata. Resultatet är ofta brusigt, och därför behövs AI‑driven efterbehandling.
+ +### Steg 5: Konfigurera Aspose AI‑modellen för LLM‑post‑processing + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Varför bry sig om dessa inställningar? +- **auto‑download** garanterar att modellen är tillgänglig vid första körning. +- **int8 quantization** minskar minneskravet utan stor förlust i noggrannhet. +- **gpu_layers** låter dig utnyttja en kompatibel GPU för snabbare inferens. + +### Steg 6: Initiera AI‑processorn och fäst den som en post‑processor + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +Att fästa processorn betyder att varje gång du anropar `run_postprocessor` kommer LLM:en att rätta stavning, slå ihop brutna ord och till och med gissa saknad interpunktion. + +### Steg 7: Kör post‑processorn för att förbättra OCR‑utdata + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` är vanligtvis mycket mer läsbar än den råa strängen. Tänk på det som en stavningskontroll som också förstår sammanhang. + +### Steg 8: Frigör resurser + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Uppstädning är avgörande i långvariga tjänster; annars läcker du GPU‑minne och applikationen kraschar till slut.
+ +--- + +## Fullt skript du kan köra idag + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Förväntad output** (exempel för en enkel fakturabild): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +Lägg märke till hur AI‑lagret korrigerade “Inv0ice” → “Invoice” och lade till saknad interpunktion. + +--- + +## Vanliga frågor (och snabba svar) + +- **Behöver jag ett GPU?** Nej. Skriptet faller tillbaka till CPU, men inferensen blir långsammare. +- **Kan jag använda en annan LLM?** Absolut—byt bara `hugging_face_repo_id` och justera `gpu_layers` därefter. +- **Vad händer om min bild är enorm?** Ändra storlek först (t.ex. med Pillow) för att hålla minnesanvändningen rimlig. 
+- **Är handskriven igenkänning alltid på?** Du kan växla `enable_handwritten_recognition` beroende på din arbetsbelastning. + +--- + +## Slutsats + +Du vet nu hur man **igenkänner text från bild** med Aspose OCR, hur man **laddar bild för OCR**, och hur man **utför OCR på bild** med AI‑förbättrad post‑processing. Det kompletta, körbara exemplet ovan ger dig en solid grund för att integrera OCR i vilket Python‑projekt som helst—oavsett om det handlar om att skanna kvitton, digitalisera kontrakt eller extrahera data från handskrivna formulär. + +Redo för nästa steg? Prova att byta ut Qwen‑modellen mot en större, experimentera med olika kvantiseringsscheman, eller kedja ihop flera bilder för batch‑behandling. Möjligheterna är oändliga, och koden du just byggt kommer att hantera dem smidigt. + +Lycka till med kodningen, och må dina OCR‑resultat alltid vara kristallklara! + +![Skärmbild av Python-konsol som visar förbättrad OCR‑output](/images/ocr_output.png){alt="Skärmbild som visar hur man känner igen text från bild med Aspose OCR"} + +## Vad bör du lära dig härnäst? + +Följande handledningar täcker närbesläktade ämnen som bygger på teknikerna som demonstreras i den här guiden. Varje resurs innehåller kompletta fungerande kodexempel med steg‑för‑steg‑förklaringar för att hjälpa dig bemästra ytterligare API‑funktioner och utforska alternativa implementationsmetoder i dina egna projekt. 
+ +- [Extrahera text från bild med Aspose OCR – Steg‑för‑steg‑guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Konvertera bild till text – Utför OCR på bild från URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extrahera bildtext C# med språkval med Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/thai/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/thai/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..ffb1075f2 --- /dev/null +++ b/ocr/thai/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,212 @@ +--- +category: general +date: 2026-06-06 +description: ดึงข้อความจากภาพลายมือด้วย Python OCR. เรียนรู้วิธีแปลงภาพลายมือเป็นข้อความอย่างรวดเร็วและเชื่อถือได้. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: th +og_description: ดึงข้อความจากภาพลายมือด้วย Python คู่มือนี้แสดงวิธีแปลงภาพลายมือเป็นข้อความและตอบคำถามเกี่ยวกับการจดจำข้อความลายมือ +og_title: ดึงข้อความจากภาพลายมือ – บทเรียน Python OCR +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: สกัดข้อความจากภาพลายมือด้วย Python OCR – คู่มือขั้นตอนโดยละเอียด +url: /th/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# ดึงข้อความจากภาพลายมือด้วย Python OCR – คู่มือขั้นตอนโดยละเอียด + +เคยสงสัยไหมว่า **วิธีการจดจำข้อความลายมือ** ในรูปที่คุณถ่ายด้วยโทรศัพท์ของคุณ? คุณไม่ได้เป็นคนเดียว ในหลายโครงการ—ไม่ว่าจะเป็นการแปลงบันทึกการบรรยายเป็นดิจิทัลหรือดึงข้อมูลจากแบบฟอร์มที่ลงลายเซ็น—คุณต้อง **ดึงข้อความจากภาพลายมือ** อย่างรวดเร็วและไม่มีปัญหา + +ในบทเรียนนี้เราจะเดินผ่านตัวอย่างที่สมบูรณ์และพร้อมรันที่แสดงให้คุณเห็นอย่างชัดเจนว่า **วิธีแปลงภาพลายมือเป็นข้อความ** ด้วยไลบรารี Python OCR ที่เป็นที่นิยม ไม่ได้อ้างอิงแบบคลุมเครือ เพียงโค้ดที่เป็นรูปธรรม คำอธิบาย และเคล็ดลับที่คุณสามารถคัดลอก‑วางได้ทันที + +![ดึงข้อความจากภาพลายมือ](https://example.com/placeholder-handwritten.jpg "ดึงข้อความจากภาพลายมือ") + +## สิ่งที่คุณต้องการ + +ก่อนที่เราจะดำเนินการต่อ โปรดตรวจสอบว่าคุณมีสิ่งต่อไปนี้: + +| ความต้องการ | เหตุผลที่สำคัญ | +|-------------|----------------| +| Python 3.9 หรือใหม่กว่า | ไวยากรณ์สมัยใหม่และการสนับสนุนไลบรารี | +| `pip` (Python package manager) | เพื่อติดตั้งแพคเกจ OCR | +| ภาพลายมือที่ชัดเจน (JPEG/PNG) | ความแม่นยำของ OCR ลดลงเมื่อภาพเบลอ | +| ความคุ้นเคยพื้นฐานกับฟังก์ชัน Python | ช่วยให้คุณปรับตัวอย่างได้ในภายหลัง | + +หากคุณขาดสิ่งใดสิ่งหนึ่ง ให้ดาวน์โหลด Python เวอร์ชันล่าสุดจาก และติดตั้ง—ไม่มีปัญหา + +## ติดตั้งไลบรารี Python OCR + +โค้ดสแนปเปตที่เราจะใช้พึ่งพาแพคเกจ `ocr` (wrapper เบา ๆ รอบ Tesseract ที่เพิ่มการตั้งค่าที่สะดวก) ติดตั้งด้วยคำสั่งเดียว: + +```bash +pip install ocr +``` + +> **เคล็ดลับ:** หลังการติดตั้ง ให้รัน `tesseract --version` ในเทอร์มินัลของคุณเพื่อยืนยันว่าเอนจิ้นพื้นฐานมีอยู่ หากคุณไม่มี Tesseract 
ให้ทำตามคู่มืออย่างเป็นทางการสำหรับระบบปฏิบัติการของคุณ—ส่วนใหญ่ของผู้จัดการแพคเกจมีอยู่ (`apt-get install tesseract-ocr` บน Ubuntu, `brew install tesseract` บน macOS). + +## ขั้นตอนที่ 1: สร้างอินสแตนซ์ของ OCR Engine + +การสร้าง engine คืออิฐก้อนแรกในกำแพงของ **python ocr handwritten recognition** คิดว่า engine คือสมองที่จะอ่านลายมือในภายหลัง + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +ทำไมเรื่องนี้สำคัญ: หากไม่มี engine คุณไม่สามารถปรับแต่ง pipeline การจดจำได้ การตั้งค่าเริ่มต้นถูกปรับสำหรับข้อความพิมพ์ ดังนั้นเราจำเป็นต้องปรับในขั้นตอนต่อไป + +## ขั้นตอนที่ 2: เปิดใช้งานการจดจำลายมือ + +โดยค่าเริ่มต้น engine จะสมมติว่าเป็นอักขระพิมพ์ การเปิดใช้งานโหมดลายมือจะสลับสวิตช์ที่บอก Tesseract ให้ใช้โมเดล LSTM ที่ฝึกด้วยลายมือ + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **ถ้าคุณข้ามขั้นตอนนี้?** OCR จะถือเส้นลายมือเป็นสัญญาณรบกวน ทำให้ผลลัพธ์เป็นข้อความสับสน การเปิดใช้งานแฟล็กนี้คือหัวใจของ **วิธีการจดจำข้อความลายมือ**. + +## ขั้นตอนที่ 3: โหลดภาพลายมือของคุณ + +ตอนนี้เราชี้ engine ไปที่ไฟล์ภาพ พาธสามารถเป็นแบบเต็มหรือแบบสัมพันธ์ได้ เพียงตรวจสอบให้ไฟล์มีอยู่จริง + +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +การตรวจสอบอย่างรวดเร็ว: เปิดภาพในโปรแกรมดูของระบบปฏิบัติการ หากข้อความเบลอ ให้พิจารณาการทำพรี‑โปรเซส (เพิ่มคอนทราสต์, หมุน) ก่อนส่งให้ engine—การปรับเหล่านี้มักเพิ่มอัตราความสำเร็จของ **ดึงข้อความจากภาพลายมือ**. 
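The "make sure the file exists" check in Step 3 can be done in code before the engine is involved. The following is a minimal sketch using only the standard library; the helper name `resolve_image_path` and the `SUPPORTED_SUFFIXES` set are illustrative assumptions, not part of the `ocr` package.

```python
from pathlib import Path

# Extensions the tutorial mentions (JPEG/PNG); this set is an assumption, extend as needed.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def resolve_image_path(raw_path: str) -> Path:
    """Return an absolute path to an existing image file, or raise early.

    Hypothetical helper: it only inspects the filesystem and never calls the OCR engine.
    """
    path = Path(raw_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported image type: {path.suffix or '(none)'}")
    return path
```

Passing `str(resolve_image_path(image_path))` to `engine.load_image(...)` would surface a wrong path as a clear error at load time rather than as a confusing failure later in recognition.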
+ +## ขั้นตอนที่ 4: เรียกกระบวนการจดจำ + +เมื่อทุกอย่างพร้อม เราก็ขอให้ engine ทำงานของมัน เมธอด `recognize()` จะคืนอ็อบเจกต์ผลลัพธ์ที่บรรจุสตริงที่ดึงออกและคะแนนความมั่นใจ + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +เบื้องหลัง engine จะเปลี่ยนบิตแมพเป็นเวกเตอร์ลักษณะหลายชุด รันผ่านเครือข่าย LSTM แล้วต่ออักขระเข้าด้วยกัน นั่นคือความมหัศจรรย์ของ **python ocr handwritten recognition**. + +## ขั้นตอนที่ 5: แสดงข้อความที่ดึงออก + +อ็อบเจกต์ผลลัพธ์เปิดเผยแอตทริบิวต์ `.text` ที่มีสตริง Unicode ธรรมดา พิมพ์ออก, เขียนลงไฟล์, หรือส่งต่อไปยัง pipeline อื่น—คุณเลือก + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### ผลลัพธ์ที่คาดหวัง + +หากภาพต้นฉบับมีโน้ตว่า “Buy milk, eggs, and bread” คุณจะเห็นประมาณนี้: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +สังเกตว่าผลลัพธ์คงรักษาเครื่องหมายวรรคตอนและการขึ้นบรรทัด (ถ้ามี) หากได้ข้อความไร้สาระ ให้ตรวจสอบคุณภาพภาพและแฟล็ก `enable_handwritten_recognition` อีกครั้ง + +## การจัดการกับปัญหาที่พบบ่อย + +| ปัญหา | อาการ | วิธีแก้ | +|-------|---------|-----| +| คะแนนความมั่นใจต่ำ | มี “?” หรืออักขระที่ไม่มีความหมาย | เพิ่ม DPI ของภาพเป็น ≥300, ใช้การไบนารี (`opencv`), หรือครอบตัดเป็นส่วนที่ต้องการ | +| หลายภาษา | ผลลัพธ์ผสมภาษาอังกฤษกับสคริปต์อื่น | ตั้งค่า `engine.ocr_settings.language = "eng"` (หรือรหัส ISO อื่น) ก่อน `recognize()` | +| ไฟล์ขนาดใหญ่ | เวลาประมวลผลนานหรือเกิดข้อผิดพลาดหน่วยความจำ | ปรับขนาดภาพให้เหมาะสม (เช่น ความกว้างสูงสุด 1200 px) ก่อนโหลด | +| ไม่มี Tesseract | `ImportError` หรือ `FileNotFoundError` | ติดตั้ง Tesseract แยกต่างหากและตรวจสอบว่าอยู่ใน PATH ของระบบ | + +การปรับเหล่านี้ทำให้ workflow **แปลงภาพลายมือเป็นข้อความ** ของคุณแข็งแรงแม้กับชุดข้อมูลที่หลากหลาย + +## สคริปต์เต็มที่คุณสามารถรันได้วันนี้ + +ด้านล่างเป็นโปรแกรมสมบูรณ์แบบที่รวมทุกส่วนเข้าด้วยกัน คัดลอกไปยังไฟล์ชื่อ `handwritten_ocr.py` แล้วรัน `python handwritten_ocr.py` + +```python 
+#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +รันมันแล้วคุณจะเห็นข้อความพิมพ์ออกที่คอนโซล—ผลลัพธ์ที่คุณต้องการเมื่ออยาก **แปลงภาพลายมือเป็นข้อความ**. + +## ไปต่อ + +ตอนนี้คุณได้เชี่ยวชาญพื้นฐานของ **ดึงข้อความจากภาพลายมือ** แล้ว ลองทำตามขั้นตอนต่อไปนี้: + +- **การประมวลผลเป็นชุด:** วนลูปผ่านโฟลเดอร์ของรูปภาพและบันทึกผลลัพธ์แต่ละรายการในไฟล์ CSV. +- **การประมวลผลหลัง:** ใช้ regular expressions เพื่อลบข้อผิดพลาด OCR ที่พบบ่อย (เช่น “1” กับ “l”). +- **การบูรณาการ:** ส่งสตริงที่ดึงออกไปยัง pipeline การประมวลผลภาษาธรรมชาติเพื่อวิเคราะห์ความรู้สึกหรือสกัดคีย์เวิร์ด. +- **ไลบรารีทางเลือก:** หากต้องการความแม่นยำสูงขึ้น ให้สำรวจ `easyocr` หรือ `pytesseract` พร้อมโมเดล LSTM ที่กำหนดเอง—ทั้งสองสนับสนุน **python ocr handwritten recognition** ด้วย. 
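The batch-processing idea from the list above can be sketched as follows. This is one possible shape under stated assumptions: `find_images` and `batch_to_csv` are hypothetical helpers, and the `recognize` argument is any callable you supply that turns an image path into text (for example, a small wrapper around the `load_image` and `recognize()` calls shown earlier). Nothing here depends on a specific OCR library.

```python
import csv
from pathlib import Path
from typing import Callable

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: the formats used in this guide


def find_images(folder: Path) -> list[Path]:
    """Return the image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def batch_to_csv(folder: str, csv_path: str, recognize: Callable[[Path], str]) -> int:
    """Run `recognize` on every image in `folder` and write one CSV row per file.

    `recognize` is supplied by the caller; this function never talks to an OCR
    engine itself. Returns the number of images processed.
    """
    images = find_images(Path(folder))
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "text"])
        for image in images:
            try:
                text = recognize(image)
            except Exception as exc:  # one bad image should not stop the batch
                text = f"ERROR: {exc}"
            writer.writerow([image.name, text])
    return len(images)
```

Keeping the recognizer as a parameter means the same loop works unchanged if you later swap in a different OCR back end, and failures are recorded per file instead of aborting the run.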
+ +จำไว้ว่า คุณภาพของภาพต้นฉบับมักกำหนดความสำเร็จ จึงควรใช้เวลาสักสองสามนาทีในการพรี‑โปรเซส การใส่ความพยายามเล็กน้อยตอนนี้จะช่วยลดการดีบักในภายหลังอย่างมาก + +## สรุป + +เราได้เดินผ่านตัวอย่างครบวงจรที่แสดง **วิธีการจดจำข้อความลายมือ** และสำคัญยิ่งกว่า **ดึงข้อความจากภาพลายมือ** ด้วย Python โดยการติดตั้งแพคเกจ `ocr` เปิดแฟล็กลายมือ โหลดรูปภาพของคุณ แล้วเรียก `recognize()` คุณสามารถ **แปลงภาพลายมือเป็นข้อความ** ได้ในไม่กี่บรรทัด + +ลองใช้กับโน้ตของคุณเอง ปรับขั้นตอนพรี‑โปรเซส แล้วให้ OCR ทำงานหนัก หากเจออุปสรรคใด ๆ ให้กลับไปดูตาราง “การจัดการกับปัญหาที่พบบ่อย” หรือทดลองใช้ OCR back‑ends ทางเลือกอื่น ๆ โค้ดดิ้งให้สนุกและขอให้ข้อมูลลายมือของคุณกลายเป็นข้อมูลที่ค้นหาได้ทันที! + +## คุณควรเรียนรู้อะไรต่อไป? + +บทเรียนต่อไปนี้ครอบคลุมหัวข้อที่เกี่ยวข้องอย่างใกล้ชิดและต่อยอดจากเทคนิคที่แสดงในคู่มือนี้ แต่ละแหล่งรวมตัวอย่างโค้ดทำงานเต็มรูปแบบพร้อมคำอธิบายขั้นตอน‑โดย‑ขั้นตอน เพื่อช่วยคุณเชี่ยวชาญฟีเจอร์ API เพิ่มเติมและสำรวจแนวทางการนำไปใช้แบบต่าง ๆ ในโครงการของคุณ + +- [ดึงข้อความจากภาพด้วย Aspose OCR – คู่มือขั้นตอนโดยละเอียด](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [วิธีการจดจำสี่เหลี่ยมหน้าสำหรับการจดจำข้อความ OCR ใน Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [วิธีใช้ OCR - จดจำภาพโดยไม่มีการตรวจจับพื้นที่ข้อความ](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/thai/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/thai/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..8a2b38b93 --- /dev/null +++ b/ocr/thai/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 
+1,256 @@ +--- +category: general +date: 2026-06-06 +description: 'คู่มือชุดอักขระ OCR: เรียนรู้วิธีใช้เครื่องมือ OCR เพื่อแสดงรายการอักขระที่รองรับสำหรับสคริปต์ละตินและซีริลลิกพร้อมตัวอย่าง + Python เต็มรูปแบบ.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: th +og_description: 'คู่มือชุดอักขระ OCR: ค้นหาวิธีใช้เครื่องมือ OCR ใน Python เพื่อแสดงรายการอักขระที่รองรับสำหรับสคริปต์ต่าง + ๆ พร้อมโค้ดและเคล็ดลับ' +og_title: ชุดอักขระ OCR – ดึงและใช้เอนจิน OCR +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' + headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: ชุดอักขระ OCR – ดึงและใช้งานเครื่องมือ OCR ใน Python +url: /th/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# ชุดอักขระ OCR – ดึงและใช้ OCR Engine ใน Python + +ต้องการทราบ **ocr character set** ที่ไลบรารีของคุณรองรับหรือไม่? 
ในบทแนะนำนี้เราจะแสดงวิธี **use OCR engine** เพื่อสอบถามอักขระที่รองรับสำหรับสคริปต์ต่าง ๆ และทำไมสิ่งนี้ถึงสำคัญสำหรับโครงการในโลกจริง + +หากคุณเคยมองหน้าจอสีขาวเปล่าและสงสัยว่าทำไมบางตัวอักษรไม่ปรากฏหลังการประมวลผล OCR คุณไม่ได้อยู่คนเดียว คำตอบมักซ่อนอยู่ในชุดอักขระที่เอนจินรู้จัก ด้วยคำแนะนำนี้คุณจะสามารถ: + +* รายการอักขระทั้งหมดที่ OCR engine สามารถจดจำได้สำหรับสคริปต์ที่กำหนด +* จัดการกรณีที่สคริปต์ไม่ได้รับการสนับสนุน +* นำข้อมูลชุดอักขระไปใช้ในกระบวนการตรวจสอบหรือโลจิก UI ด้านล่าง + +ไม่มีเทคนิคกล่องดำมหัศจรรย์—เพียง Python ธรรมดา, บรรทัดโค้ดไม่กี่บรรทัด, และคำอธิบายสั้น ๆ + +--- + +## ข้อกำหนดเบื้องต้น + +ก่อนที่เราจะเริ่ม, ตรวจสอบให้แน่ใจว่าคุณมี: + +* Python 3.8+ ติดตั้งแล้ว (โค้ดใช้ f‑strings) +* ไลบรารี OCR ที่ให้ `ocr.OcrEngine`—สำหรับตัวอย่างนี้เราจะสมมติว่ามีแพ็กเกจชื่อ `ocr` ติดตั้งแล้วผ่าน `pip install ocr-lib` +* ความคุ้นเคยพื้นฐานกับระบบ import ของ Python และการพิมพ์ผล + +หากคุณยังไม่มีไลบรารี, ให้รัน: + +```bash +pip install ocr-lib +``` + +แค่นั้น—ไม่ต้องมีไบนารีเพิ่มเติมหรือการพึ่งพาเนทีฟสำหรับการสืบค้นชุดอักขระพื้นฐานที่เราจะทำ + +--- + +## ขั้นตอนที่ 1: เริ่มต้นอินสแตนซ์ OCR Engine + +การสร้างอ็อบเจกต์เอนจินเป็นสิ่งแรกที่ทำเมื่อคุณ **use OCR engine** คิดว่าเป็นการเปิดสแกนเนอร์; หากไม่มีคุณก็ไม่สามารถถามคำถามใด ๆ ได้ + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +คลาส `OcrEngine` เป็น wrapper ที่เบา ๆ รอบเอนจินการจดจำพื้นฐาน มันจะไม่เริ่มการประมวลผลหนักจนกว่าคุณจะร้องขออะไรบางอย่าง, ดังนั้นค่าใช้จ่ายในการเริ่มต้นจึงแทบไม่มี + +> **Pro tip:** หากแอปพลิเคชันของคุณสร้างอินสแตนซ์เอนจินหลายตัว, ให้ใช้ตัวเดียวซ้ำ – จะช่วยประหยัดหน่วยความจำและลดความหน่วงของการเริ่มต้น + +--- + +## ขั้นตอนที่ 2: สืบค้นชุดอักขระ OCR สำหรับสคริปต์เฉพาะ + +ตอนนี้เรามีเอนจินแล้ว, เราสามารถถามว่าเอนจินรู้จักอักขระใดบ้างสำหรับระบบเขียนที่กำหนด วิธี `get_supported_characters(script_name)` จะคืนค่า `list` ของอักขระ Unicode + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script 
+latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +การรันสคริปต์ด้านบนจะพิมพ์ผลลัพธ์ประมาณ: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +ผลลัพธ์ที่แน่นอนขึ้นอยู่กับเวอร์ชันของไลบรารี OCR, แต่คุณจะได้รับรายการอักขระแบบแบนเสมอ หากต้องการทั้งตัวพิมพ์ใหญ่และพิมพ์เล็ก, เพียงขอสคริปต์ `"Latin"` แล้วใช้ `.lower()` กับแต่ละองค์ประกอบ, หรือสอบถามสคริปต์ `"Latin‑lower"` แยกต่างหากหากไลบรารีแยกแยะไว้ + +### ทำไมเรื่องนี้ถึงสำคัญ? + +เมื่อคุณป้อนภาพที่มีไดอะคริติกแปลก ๆ (เช่น “ñ” หรือ “ø”), OCR engine อาจแทนที่โดยเงียบ ๆ ด้วยตัวแทนหรือทิ้งออกไปทั้งหมด การรู้ **ocr character set** ล่วงหน้าช่วยให้คุณตรวจสอบข้อมูลล่วงหน้า, เตือนผู้ใช้, หรือสลับไปใช้เอนจินอื่น + +--- + +## ขั้นตอนที่ 3: ดึงอักขระสำหรับสคริปต์อื่น – ตัวอย่าง Cyrillic + +วิธีเดียวกันทำงานกับสคริปต์ใดก็ได้ที่เอนจินอ้างว่ารองรับ ลองดูว่าเกิดอะไรขึ้นกับ Cyrillic + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +ผลลัพธ์ทั่วไป: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +หากเอนจิน **not** รองรับสคริปต์ที่ร้องขอ, มักจะโยน `ValueError` (หรือคืนรายการว่างขึ้นอยู่กับการทำงาน) ให้ป้องกันกรณีนั้นด้วย + +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + 
return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +--- + +## ขั้นตอนที่ 4: แสดงภาพชุดอักขระ OCR (ทางเลือก) + +บางครั้งภาพสั้น ๆ ช่วยได้, โดยเฉพาะเมื่อคุณต้องแสดงให้ผู้มีส่วนได้ส่วนเสียเห็นว่าครอบคลุมอักขระใดบ้าง ด้านล่างเป็นตัวอย่างขั้นต่ำโดยใช้ `matplotlib` เพื่อวาดตารางอักขระ + +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Image alt text:** ![แผนภาพชุดอักขระ OCR](ocr_character_set.png){alt="ภาพรวมชุดอักขระ OCR"} + +แผนภาพนี้ไม่จำเป็นสำหรับวิธีแก้หลัก, แต่แสดงให้เห็นว่าคุณสามารถ **use OCR engine** เมตาดาต้าได้เกินกว่าการพิมพ์ผลธรรมดา + +--- + +## ขั้นตอนที่ 5: ผสานชุดอักขระเข้ากระบวนการทำงานจริง + +ตอนนี้เราสามารถดึง **ocr character set** มาได้, มาดูสถานการณ์จริง: ตรวจสอบผลลัพธ์ OCR ก่อนบันทึกลงฐานข้อมูล + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" 
+ +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +รูปแบบนี้ป้องกันข้อมูลขยะไม่ให้ไหลเข้าสู่สายงานต่อเนื่อง, ซึ่งเป็นจุดเจ็บปวดทั่วไปเมื่อทำงานกับเอกสารหลายภาษา + +--- + +## ข้อผิดพลาดทั่วไปและกรณีขอบ + +| ข้อผิดพลาด | สาเหตุ | วิธีหลีกเลี่ยง | +|------------|--------|----------------| +| **Script name typo** (เช่น `"Cyrillic "` มีช่องว่างท้าย) | เอนจินตีความสตริงตามตัวอักษรและไม่พบสคริปต์ | ตัดช่องว่าง: `script.strip()` ก่อนเรียก `get_supported_characters` | +| **Empty character list** | บางเอนจินเปิดเผยสคริปต์แต่ยังไม่ได้โหลดโมเดลภาษา | เรียก `engine.load_language_model(script)` หากไลบรารีมีเมธอดนี้, หรือให้แน่ใจว่าไฟล์โมเดลมีอยู่ | +| **Unicode normalization issues** | อักขระเช่น “é” อาจอยู่ในรูปแบบประกอบ (`\u00E9`) หรือแยก (`e\u0301`) | ทำ Normalise สตริงด้วย `unicodedata.normalize('NFC', text)` ก่อนตรวจสอบ | +| **Performance on huge scripts** (เช่น Chinese) | การดึงอักขระหลายพันตัวอาจช้า | แคชผลลัพธ์หลังการเรียกครั้งแรก; ส่วนใหญ่แอปต้องการรายการเพียงครั้งเดียว | + +--- + +## ตัวอย่างทำงานเต็มรูปแบบ + +รวมทุกอย่างเข้าด้วยกัน, นี่คือสคริปต์เดียวที่คุณสามารถคัดลอก‑วางและรันได้: + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / 
cols, 1 - row / rows, ch, fontsize=14 + + +## What Should You Learn Next? + + +The following tutorials cover closely related topics that build on the techniques demonstrated in this guide. Each resource includes complete working code examples with step-by-step explanations to help you master additional API features and explore alternative implementation approaches in your own projects. + +- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/) +- [OCR Post Processing – Get Character Choices](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [How to Set Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/thai/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/thai/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..55abc2bfd --- /dev/null +++ b/ocr/thai/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,252 @@ +--- +category: general +date: 2026-06-06 +description: ทำการ OCR บนภาพโดยใช้ Aspose OCR และโมเดลจาก Hugging Face. เรียนรู้วิธีดาวน์โหลดโมเดล + Hugging Face, ดึงข้อความจากใบแจ้งหนี้, และปล่อยทรัพยากร GPU. 
+draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: th +og_description: ทำ OCR บนภาพโดยใช้ Aspose OCR และโมเดลของ Hugging Face บทเรียนนี้แสดงวิธีดาวน์โหลดโมเดล, + แยกข้อความจากใบแจ้งหนี้, และปลดปล่อยทรัพยากร GPU +og_title: ทำ OCR บนรูปภาพด้วย Aspose OCR & LLM – คู่มือฉบับสมบูรณ์ +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. + headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: ทำ OCR บนภาพด้วย Aspose OCR & LLM – คู่มือฉบับสมบูรณ์ +url: /th/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# ทำ OCR บนรูปภาพด้วย Aspose OCR & LLM – คู่มือเต็ม + +เคยอยาก **perform OCR on image** ไฟล์แต่รู้สึกติดที่คำถาม “จะเริ่มจากตรงไหน?” หรือไม่? คุณไม่ได้อยู่คนเดียว—นักพัฒนาหลายคนเจออุปสรรคนี้เมื่อพยายามทำอัตโนมัติเอกสารครั้งแรก ข่าวดีคือด้วย Aspose OCR และ LLM ขนาดเล็กจาก Hugging Face คุณสามารถแปลงสแกนใบแจ้งหนี้ดิบให้เป็นข้อความที่สะอาดและค้นหาได้ในไม่กี่บรรทัดของ Python. + +ในบทแนะนำนี้เราจะพาคุณผ่านทุกอย่างที่ต้องการ: ตั้งแต่ **loading image for OCR**, ไปจนถึง **download Hugging Face model**, ไปจนถึง **extract text from invoice** data, และสุดท้าย **free GPU resources** เพื่อให้แอปของคุณเบาแรง. เมื่อเสร็จคุณจะมีสคริปต์ที่ทำงานได้เองซึ่งสามารถใส่ลงในโปรเจกต์ใดก็ได้. 
+ +--- + +## สิ่งที่คุณจะได้เรียนรู้ + +- วิธี **perform OCR on image** ด้วย `OcrEngine` ของ Aspose +- ขั้นตอนที่แม่นยำในการ **download Hugging Face model** โดยอัตโนมัติ +- เทคนิคการ **extract text from invoice** จาก PDF หรือ PNG ด้วยการประมวลผลหลัง AI +- แนวปฏิบัติที่ดีที่สุดในการ **free GPU resources** หลังการ inference +- เคล็ดลับการ **load image for OCR** อย่างมีประสิทธิภาพและหลีกเลี่ยงข้อผิดพลาดทั่วไป + +ไม่มีเอกสารภายนอกที่จำเป็น—ทุกอย่างที่คุณต้องการอยู่ที่นี่ พร้อมโค้ดเต็ม คำอธิบาย และผลลัพธ์ที่คาดหวัง. + +--- + +## ข้อกำหนดเบื้องต้น + +| ข้อกำหนด | เหตุผล | +|-------------|--------| +| Python 3.9+ | ไวยากรณ์สมัยใหม่และ type hints | +| `asposeocr` package (`pip install asposeocr`) | เครื่องยนต์ OCR หลัก | +| Access to a GPU (optional but recommended) | เร่งความเร็วให้กับ LLM post‑processor | +| An invoice image (`sample_invoice.png`) | กรณีทดสอบในโลกจริง | + +หากคุณขาดสิ่งใดสิ่งหนึ่ง ให้ติดตั้งตอนนี้; สคริปต์จะ **download Hugging Face model** โดยอัตโนมัติ ดังนั้นคุณไม่ต้องค้นหาไฟล์ด้วยตนเอง. + +--- + +## ขั้นตอนที่ 1: Perform OCR on Image – สร้าง Engine + +สิ่งแรกที่คุณต้องทำคือเปิดใช้งาน OCR engine ของ Aspose. คิดว่ามันเหมือนกับการเปิดผืนผ้าใบเปล่าที่ภาพจะถูกวาดเป็นข้อความในภายหลัง. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **ทำไมเรื่องนี้สำคัญ:** `OcrEngine` แยกการประมวลผลภาพระดับต่ำทั้งหมดออกไป ทำให้คุณโฟกัสที่ขั้นตอนระดับสูงได้. มันยังเปิดเผยเมธอด `set_post_processor` ที่จะให้เราต่อเชื่อม LLM เพื่อผลลัพธ์ที่ฉลาดขึ้นในภายหลัง. + +--- + +## ขั้นตอนที่ 2: Load Image for OCR – เลือกไฟล์ที่เหมาะสม + +ตอนนี้ engine มีแล้ว เราต้อง **load image for OCR**. Aspose รองรับ PNG, JPG, TIFF, และรูปแบบอื่น ๆ อีกหลายประเภท. ตรวจสอบให้แน่ใจว่าเส้นทางเป็นแบบ absolute หรือ relative ต่อที่ตั้งของสคริปต์ของคุณ. 
+ +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **เคล็ดลับ:** หากภาพของคุณมีขนาดใหญ่ ควรปรับขนาดก่อนเพื่อลดความกดดันของหน่วยความจำ. OCR engine สามารถจัดการสแกนความละเอียดสูงได้ แต่ภาพ 300 DPI มักเป็นจุดที่เหมาะสมสำหรับใบแจ้งหนี้. + +--- + +## ขั้นตอนที่ 3: Perform Raw OCR and View the Extracted Text + +เมื่อโหลดภาพแล้ว เราสามารถ **perform OCR on image** และดูว่าตัว engine ดิบให้ผลลัพธ์อะไรออกมา. ขั้นตอนนี้ให้ฐานข้อมูลก่อนที่เราจะเพิ่มเวทมนตร์ AI. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**ผลลัพธ์ที่คาดหวัง (ตัดทอน):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +ผลลัพธ์ดิบมักมีการขึ้นบรรทัดใหม่ ตัวอักษรที่อ่านผิด หรือฟิลด์ที่หายไป—นี่คือเหตุผลที่เราจะนำ language model เข้ามาในขั้นตอนต่อไป. + +--- + +## ขั้นตอนที่ 4: Download Hugging Face Model – กำหนดค่า LLM Post‑Processor + +นี่คือจุดที่ขั้นตอน **download Hugging Face model** ส่องแสง. Aspose AI สามารถดึงโมเดลจาก Hugging Face hub โดยอัตโนมัติหากยังไม่มีบนดิสก์. เราจะใช้โมเดล Qwen2.5‑3B‑Instruct‑GGUF ซึ่งสมดุลระหว่างความแม่นยำและขนาดหน่วยความจำ. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **ทำไมวิธีนี้ได้ผล:** `allow_auto_download` ช่วยคุณไม่ต้องดาวน์โหลดไฟล์ `.gguf` ด้วยตนเอง. การควอนไทซ์ (`int8`) ลดขนาดโมเดลเหลือประมาณ 3 GB ทำให้ใช้ได้บน GPU ของผู้บริโภคส่วนใหญ่. ปรับ `gpu_layers` ตามฮาร์ดแวร์ของคุณ—เพิ่มเลเยอร์บน GPU จะทำให้ inference เร็วขึ้น. 
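The "about 3 GB" figure quoted for `int8` can be sanity-checked with back-of-the-envelope arithmetic: weight size is roughly parameter count times bytes per weight. The sketch below assumes a round 3 billion parameters and 36 transformer layers for the model; both numbers are assumptions for illustration, and the estimate ignores embeddings, the KV cache, and runtime overhead, so treat it as a rough lower bound rather than a measurement.

```python
def estimate_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def estimate_gpu_share_gb(weights_gb: float, gpu_layers: int, total_layers: int) -> float:
    """Rough share of the weights that lands on the GPU when only some layers are offloaded.

    Assumes every layer is the same size, which is a simplification.
    """
    offloaded = min(gpu_layers, total_layers)
    return weights_gb * offloaded / total_layers


N_PARAMS = 3e9      # assumption: roughly 3 billion parameters
TOTAL_LAYERS = 36   # assumption: check the model card for the real layer count

for label, bits in (("fp16", 16), ("int8", 8)):
    size = estimate_weight_gb(N_PARAMS, bits)
    on_gpu = estimate_gpu_share_gb(size, 20, TOTAL_LAYERS)
    print(f"{label}: ~{size:.1f} GB of weights, ~{on_gpu:.1f} GB on the GPU with 20 layers")
```

If the estimated GPU share is larger than your free VRAM, lowering `gpu_layers` is the first setting to try.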
+ +--- + +## ขั้นตอนที่ 5: Extract Text from Invoice Using AI‑Enhanced Post‑Processing + +ตอนนี้เราต่อ LLM เข้ากับ OCR engine และรัน **post‑processor** ที่ทำความสะอาดผลลัพธ์ดิบ, แก้ไขข้อผิดพลาด OCR, และจัดรูปแบบฟิลด์ใบแจ้งหนี้ให้สวยงาม. + +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**ตัวอย่างผลลัพธ์ที่ปรับปรุงแล้ว:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **เกิดอะไรขึ้น?** LLM ตระหนักว่า “Invoice #12345” ควรเป็น “Invoice Number: 12345”, แก้ไขรูปแบบวันที่, และแม้กระทั่งสรุปฟิลด์ “Bill To” ที่ engine ดิบพลาด. นี่คือหัวใจของการ **extract text from invoice** อัตโนมัติ. + +--- + +## ขั้นตอนที่ 6: Free GPU Resources – ทำความสะอาดหลังการประมวลผล + +หากคุณรันโค้ดนี้ในบริการที่ทำงานต่อเนื่อง (เช่น Flask API) คุณต้อง **free GPU resources** หลังแต่ละครั้งของ inference เพื่อหลีกเลี่ยงการพังจาก out‑of‑memory. Aspose AI มีเมธอดง่าย ๆ สำหรับทำเช่นนั้น. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **เคล็ดลับระดับมืออาชีพ:** เรียก `free_resources()` ภายในบล็อก `finally:` หากคุณห่อการเรียก OCR ด้วย try/except. จะทำให้การทำความสะอาดเกิดขึ้นแม้เกิดข้อยกเว้น. + +--- + +## ขั้นตอนที่ 7: Full Script – รวมทุกอย่างเข้าด้วยกัน + +ด้านล่างเป็นสคริปต์เต็มที่พร้อมรัน. คัดลอก‑วาง, ปรับเส้นทางตามต้องการ, แล้วคุณก็พร้อมใช้งาน. 
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +รันสคริปต์และชมการเปลี่ยนแปลงจาก OCR ที่มีเสียงรบกวนไปสู่ข้อมูลใบแจ้งหนี้ที่สะอาดและเป็นโครงสร้าง 🎉 + +--- + +## คำถามทั่วไปและกรณีขอบ + +| คำถาม | คำตอบ | +|----------|--------| +| **ถ้าโมเดลไม่สามารถดาวน์โหลดได้จะทำอย่างไร?** | Ensure your machine has internet access and that the `hugging_face_repo_id` is correct. You can also manually download the | + +## คุณควรเรียนรู้อะไรต่อไป? + +บทแนะนำต่อไปนี้ครอบคลุมหัวข้อที่เกี่ยวข้องอย่างใกล้ชิดและต่อยอดจากเทคนิคที่แสดงในคู่มือนี้. แต่ละแหล่งรวมโค้ดทำงานเต็มรูปแบบพร้อมคำอธิบายทีละขั้นตอน เพื่อช่วยคุณเชี่ยวชาญฟีเจอร์ API เพิ่มเติมและสำรวจวิธีการทำงานแบบอื่นในโปรเจกต์ของคุณเอง. 
+ +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/thai/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/thai/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..0c0df952e --- /dev/null +++ b/ocr/thai/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,282 @@ +--- +category: general +date: 2026-06-06 +description: แยกข้อความจากภาพด้วย Aspose OCR – เรียนรู้วิธีโหลดภาพสำหรับ OCR และทำ + OCR บนภาพโดยใช้การประมวลผลหลังจาก AI ใน Python. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: th +og_description: จดจำข้อความจากภาพได้อย่างรวดเร็ว คู่มือนี้แสดงวิธีโหลดภาพเพื่อทำ OCR, + ทำ OCR บนภาพ, และเพิ่มผลลัพธ์ด้วยการประมวลผลหลังจาก AI +og_title: จดจำข้อความจากภาพโดยใช้ Aspose OCR & AI +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? 
+ - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: แยกข้อความจากภาพโดยใช้ Aspose OCR & AI +url: /th/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# จดจำข้อความจากรูปภาพด้วย Aspose OCR & AI + +เคยต้องการจดจำข้อความจากรูปภาพแต่ไม่แน่ใจว่าห้องสมุดใดจะให้ความเร็วและความแม่นยำทั้งสอง? คุณไม่ได้อยู่คนเดียว ในคู่มือนี้เราจะเดินผ่านตัวอย่างครบวงจรที่แสดง **วิธีโหลดรูปภาพสำหรับ OCR**, **ทำ OCR บนรูปภาพ**, และจากนั้นปรับผลลัพธ์ด้วย AI post‑processor ของ Aspose. เมื่อเสร็จคุณจะมีสคริปต์พร้อมรันที่แปลง PNG เป็นข้อความที่สะอาดและค้นหาได้. + +## สิ่งที่คุณจะได้เรียนรู้ + +เราจะครอบคลุมทุกอย่างตั้งแต่การติดตั้งแพ็กเกจ Aspose OCR จนถึงการปล่อยทรัพยากรเมื่อจบการทำงาน คุณจะเห็นว่าการเปิดใช้งานการจดจำข้อความเขียนมือมีความสำคัญอย่างไร, วิธีกำหนดค่า Qwen 2.5 LLM สำหรับการ post‑processing, และผลลัพธ์สุดท้ายเป็นอย่างไร. ไม่ต้องอ้างอิงภายนอก—เพียงคัดลอก, วาง, และรัน. + +### ข้อกำหนดเบื้องต้น + +- Python 3.8 หรือใหม่กว่า +- แพ็กเกจ `asposeocr` (`pip install asposeocr`) +- ไฟล์รูปภาพ (เช่น `doc.png`) ที่มีข้อความพิมพ์หรือเขียนมือ +- ตัวเลือก: GPU เพื่อเร่งการสรุปผล LLM (สคริปต์ทำงานบน CPU ได้เช่นกัน) + +--- + +## จดจำข้อความจากรูปภาพ – ขั้นตอนทีละขั้นตอน + +ด้านล่างแต่ละบล็อกโค้ดคุณจะพบคำอธิบายสั้น ๆ เกี่ยวกับ **ทำไม** เราถึงทำการกระทำนั้น, ไม่ใช่แค่ **อะไร** ที่บรรทัดทำ. 
+ +### ขั้นตอน 1: ติดตั้งและนำเข้าโมดูลที่จำเป็น + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*ทำไม?* การนำเข้า `asposeocr` ให้เราได้คลาส `OcrEngine`, ส่วนโมดูลย่อย `ai` ให้ post‑processor ที่ใช้ LLM ซึ่งปรับปรุงผลลัพธ์ OCR ดิบอย่างมาก. + +### ขั้นตอน 2: สร้างเครื่อง OCR และเปิดใช้งานการจดจำข้อความเขียนมือ + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +การเปิดใช้งานการจดจำข้อความเขียนมือจะขยายชุดอักขระของเครื่อง, ดังนั้นคุณจะไม่สูญเสียการเขียนลายมือเมื่อคุณ **ทำ OCR บนรูปภาพ** ที่มีข้อความพิมพ์และลายมือผสมกัน. + +### ขั้นตอน 3: โหลดรูปภาพสำหรับ OCR + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +การเรียก `load_image` คือช่วงที่คุณ **โหลดรูปภาพสำหรับ OCR**; หากพาธผิด, เครื่องจะโยนข้อยกเว้นที่ให้ข้อมูล, ช่วยคุณหลีกเลี่ยงข้อผิดพลาดต่อมาที่ซับซ้อน. + +### ขั้นตอน 4: รันการประมวลผล OCR ดิบ + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +ในขั้นตอนนี้คุณจะได้อ็อบเจ็กต์ `RecognitionResult` ที่มีข้อความที่ไม่ได้กรอง, คะแนนความมั่นใจ, และเมตาดาต้าเลย์เอาต์. มักจะมีเสียงรบกวน—จึงต้องการการทำความสะอาดด้วย AI. + +### ขั้นตอน 5: กำหนดค่าโมเดล AI ของ Aspose สำหรับการ post‑processing LLM + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +ทำไมต้องยุ่งกับการตั้งค่าเหล่านี้? +- **auto‑download** รับประกันว่าโมเดลพร้อมใช้งานในการรันครั้งแรก. 
+- **int8 quantization** ลดความต้องการหน่วยความจำโดยไม่สูญเสียความแม่นยำมาก. +- **gpu_layers** ให้คุณใช้ GPU ที่เข้ากันได้เพื่อเร่งการสรุปผล. + +### ขั้นตอน 6: เริ่มต้น AI processor และแนบเป็น post‑processor + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +การแนบ processor หมายความว่า ทุกครั้งที่คุณเรียก `run_postprocessor`, LLM จะทำความสะอาดการสะกด, รวมคำที่แยก, และแม้กระทั่งเติมเครื่องหมายวรรคตอนที่หายไป. + +### ขั้นตอน 7: รัน post‑processor เพื่อปรับปรุงผลลัพธ์ OCR + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` มักจะอ่านง่ายกว่าข้อความดิบ—คิดว่าเป็นตัวตรวจสอบการสะกดที่ยังเข้าใจบริบทด้วย. + +### ขั้นตอน 8: ปล่อยทรัพยากร + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +การทำความสะอาดเป็นสิ่งสำคัญในบริการที่ทำงานต่อเนื่อง; หากไม่ทำคุณจะรั่วหน่วยความจำ GPU และในที่สุดแอปพลิเคชันจะล่ม. 
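
The cleanup step above only runs if nothing fails earlier in the script. One way to guarantee it is sketched below, assuming the engine and the AI processor expose the `dispose()` and `free_resources()` methods used in this tutorial. `FakeEngine` and `FakeAIProcessor` are hypothetical stand-ins so the pattern can be run without the library installed.

```python
from contextlib import ExitStack


class FakeAIProcessor:
    """Hypothetical stand-in for the AI processor used in this tutorial."""

    def __init__(self):
        self.released = False

    def free_resources(self):
        self.released = True


class FakeEngine:
    """Hypothetical stand-in for the OCR engine used in this tutorial."""

    def __init__(self):
        self.disposed = False

    def recognize(self):
        # Simulate a failure in the middle of the pipeline.
        raise RuntimeError("simulated recognition failure")

    def dispose(self):
        self.disposed = True


def run_pipeline(engine, ai_processor):
    """Run recognition and always release both objects, even on error."""
    with ExitStack() as stack:
        # Callbacks run in reverse order of registration when the block exits,
        # so the AI model is freed first and the engine is disposed last.
        stack.callback(engine.dispose)
        stack.callback(ai_processor.free_resources)
        return engine.recognize()


engine = FakeEngine()
ai_processor = FakeAIProcessor()
try:
    run_pipeline(engine, ai_processor)
except RuntimeError as exc:
    print(f"Recognition failed: {exc}")

print(engine.disposed, ai_processor.released)
```

In a long-running service the same `run_pipeline` shape could wrap each request, so a single bad image does not leave model memory allocated.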
+ +--- + +## สคริปต์เต็มที่คุณสามารถรันได้วันนี้ + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**ผลลัพธ์ที่คาดหวัง** (ตัวอย่างสำหรับรูปใบแจ้งหนี้ง่าย): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +สังเกตว่าเลเยอร์ AI แก้ไข “Inv0ice” → “Invoice” และเพิ่มเครื่องหมายวรรคตอนที่หายไป. + +--- + +## คำถามที่พบบ่อย (และคำตอบสั้น) + +- **ฉันต้องการ GPU หรือไม่?** ไม่. สคริปต์จะใช้ CPU แทน, แต่การสรุปผลจะช้ากว่า. +- **ฉันสามารถใช้ LLM อื่นได้หรือไม่?** แน่นอน—เพียงเปลี่ยน `hugging_face_repo_id` และปรับ `gpu_layers` ตามความเหมาะสม. +- **ถ้าภาพของฉันใหญ่เกินไปจะทำอย่างไร?** ปรับขนาดภาพก่อน (เช่น ใช้ Pillow) เพื่อให้การใช้หน่วยความจำอยู่ในระดับที่สมเหตุสมผล. +- **การจดจำข้อความเขียนมือเปิดอยู่เสมอหรือไม่?** คุณสามารถสลับ `enable_handwritten_recognition` ตามภาระงานของคุณ. 
+ +--- + +## สรุป + +ตอนนี้คุณรู้วิธี **จดจำข้อความจากรูปภาพ** ด้วย Aspose OCR, วิธี **โหลดรูปภาพสำหรับ OCR**, และวิธี **ทำ OCR บนรูปภาพ** ด้วยการ post‑processing ที่เสริมด้วย AI. ตัวอย่างที่สมบูรณ์และรันได้ข้างต้นให้พื้นฐานที่แข็งแกร่งในการผสาน OCR เข้ากับโปรเจกต์ Python ใด ๆ—ไม่ว่าจะเป็นการสแกนใบเสร็จ, ดิจิไทซ์สัญญา, หรือสกัดข้อมูลจากแบบฟอร์มเขียนมือ. + +พร้อมสำหรับขั้นตอนต่อไปหรือยัง? ลองสลับโมเดล Qwen เป็นโมเดลที่ใหญ่กว่า, ทดลองใช้สคีมการ quantization ต่าง ๆ, หรือเชื่อมต่อหลายภาพเข้าด้วยกันสำหรับการประมวลผลเป็นชุด. ความเป็นไปได้ไม่มีที่สิ้นสุด, และโค้ดที่คุณสร้างจะจัดการได้อย่างราบรื่น. + +ขอให้เขียนโค้ดอย่างสนุกสนาน, และขอให้ผลลัพธ์ OCR ของคุณใสเหมือนคริสตัล! + +![ภาพหน้าจอของคอนโซล Python แสดงผลลัพธ์ OCR ที่ปรับปรุง](/images/ocr_output.png){alt="ภาพหน้าจอของคอนโซล Python แสดงผลลัพธ์ OCR ที่ปรับปรุง"} + +## คุณควรเรียนรู้อะไรต่อไป? + +บทแนะนำต่อไปนี้ครอบคลุมหัวข้อที่เกี่ยวข้องอย่างใกล้ชิดซึ่งต่อยอดจากเทคนิคที่แสดงในคู่มือนี้. แต่ละแหล่งรวมตัวอย่างโค้ดทำงานเต็มรูปแบบพร้อมคำอธิบายขั้นตอนเพื่อช่วยคุณเชี่ยวชาญฟีเจอร์ API เพิ่มเติมและสำรวจวิธีการดำเนินการทางเลือกในโครงการของคุณ. 
+ +- [ดึงข้อความจากรูปภาพด้วย Aspose OCR – คู่มือขั้นตอนทีละขั้นตอน](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [แปลงรูปภาพเป็นข้อความ – ทำ OCR บนรูปภาพจาก URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [ดึงข้อความจากรูปภาพด้วย C# พร้อมการเลือกภาษาโดยใช้ Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/turkish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/turkish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..caeca98f0 --- /dev/null +++ b/ocr/turkish/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,215 @@ +--- +category: general +date: 2026-06-06 +description: Python OCR kullanarak el yazısı görüntüsünden metin çıkarın. El yazısı + fotoğrafını hızlı ve güvenilir bir şekilde metne dönüştürmeyi öğrenin. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: tr +og_description: Python ile el yazısı görüntüsünden metin çıkarın. Bu rehber, el yazısı + fotoğrafını metne nasıl dönüştüreceğinizi gösterir ve el yazısı metni nasıl tanıyacağınızı + açıklar. +og_title: El Yazısı Görüntüsünden Metin Çıkar – Python OCR Eğitimi +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Python OCR ile El Yazısı Görüntüsünden Metin Çıkarma – Adım Adım Rehber +url: /tr/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# El Yazısı Görüntüsünden Metin Çıkarma Python OCR ile – Adım Adım Kılavuz + +Telefonunuzdan çektiğiniz bir fotoğrafta **el yazısı metni nasıl tanıyacağınızı** hiç merak ettiniz mi? Yalnız değilsiniz. Birçok projede—ders notlarını dijitalleştiriyor ya da imzalı formlardan veri çekiyor olun—**el yazısı görüntüsünden metin çıkarmak** için hızlı ve sorunsuz bir çözüme ihtiyacınız var. + +Bu öğreticide, popüler bir Python OCR kütüphanesini kullanarak **el yazısı fotoğrafını metne dönüştürmenin** tam olarak nasıl yapılacağını gösteren eksiksiz, çalıştırmaya hazır bir örnek üzerinden adım adım ilerleyeceğiz. Belirsiz referanslar yok, sadece somut kod, açıklamalar ve bugün kopyalayıp yapıştırabileceğiniz ipuçları. + +![el yazısı görüntüsünden metin çıkarma](https://example.com/placeholder-handwritten.jpg "el yazısı görüntüsünden metin çıkarma") + +## Gereksinimler + +Before we dive in, make sure you have the following: + +| Gereksinim | Neden Önemli | +|-------------|----------------| +| Python 3.9 ve üzeri | Modern sözdizimi ve kütüphane desteği | +| `pip` (Python package manager) | OCR paketini kurmak için | +| El yazısı notların net bir görüntüsü (JPEG/PNG) | Bulanık görüntülerde OCR doğruluğu düşer | +| Python fonksiyonlarına temel aşinalık | Örneği daha sonra uyarlamanıza yardımcı olur | + +Eğer bunlardan herhangi birine sahip değilseniz, en son Python sürümünü adresinden alın ve kurun—hiç sorun yok. 
+ +## Python OCR Kütüphanesini Kurun + +Kullanacağımız kod parçacığı, `ocr` paketine (Tesseract etrafında kullanışlı ayarlar ekleyen ince bir sarmalayıcı) dayanıyor. Tek bir komutla kurun: + +```bash +pip install ocr +``` + +> **Pro ipucu:** Kurulumdan sonra, terminalinizde `tesseract --version` komutunu çalıştırarak temel motorun mevcut olduğunu doğrulayın. Tesseract yüklü değilse, işletim sisteminiz için resmi kılavuzu izleyin—çoğu paket yöneticisinde bulunur (`apt-get install tesseract-ocr` Ubuntu’da, `brew install tesseract` macOS’da). + +## Adım 1: Bir OCR Motoru Örneği Oluşturun + +Motoru oluşturmak, **python ocr handwritten recognition** duvarının ilk tuğlasıdır. Motoru, daha sonra karalamaları okuyacak beyin olarak düşünün. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Neden önemli: bir motor olmadan tanıma işlem hattını ayarlayamazsınız. Varsayılan ayarlar basılı metin için optimize edilmiştir, bu yüzden bir sonraki adımda bunları ayarlamamız gerekecek. + +## Adım 2: El Yazısı Tanımayı Etkinleştirin + +Varsayılan olarak motor, basılı karakterleri varsayar. El yazısı modunu etkinleştirmek, Tesseract'ın el yazısı darbeleri üzerine eğitilmiş LSTM modelini kullanmasını sağlayan bir anahtarı çevirir. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Bunu atlamanız ne olur?** OCR, darbeleri gürültü olarak değerlendirir ve bozuk bir çıktı üretir. Bayrağı etkinleştirmek, **how to recognize handwritten text** konusunun özüdür. + +## Adım 3: El Yazısı Fotoğrafınızı Yükleyin + +Şimdi motoru görüntü dosyasına yönlendiriyoruz. Yol mutlak ya da göreli olabilir; dosyanın mevcut olduğundan emin olun. 
+ +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Hızlı bir kontrol: görüntüyü işletim sisteminizin görüntüleyicisinde açın. Metin bulanıksa, motorun önüne göndermeden önce ön işleme (kontrast artırma, döndürme) yapmayı düşünün—bu ayarlamalar genellikle **el yazısı görüntüsünden metin çıkarma** başarısını artırır. + +## Adım 4: Tanıma İşlemini Çalıştırın + +Her şey ayarlandığında, motoru işini yapması için sonunda çağırıyoruz. `recognize()` metodu, çıkarılan dizeyi ve güven skorlarını içeren bir sonuç nesnesi döndürür. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Arka planda, motor bitmap'i bir dizi özellik vektörüne dönüştürür, bunları LSTM ağına geçirir ve karakterleri birleştirir. İşte **python ocr handwritten recognition**'ın büyüsü. + +## Adım 5: Çıkarılan Metni Görüntüleyin + +Sonuç nesnesi, düz Unicode dizesini içeren bir `.text` özelliği sunar. Yazdırın, bir dosyaya kaydedin veya başka bir işlem hattına besleyin—size kalmış. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Beklenen Çıktı + +Kaynak görüntü “Süt, yumurta ve ekmek al” notunu içeriyorsa, aşağıdakine benzer bir çıktı görürsünüz: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Çıktının noktalama işaretlerini ve satır sonlarını (varsa) koruduğuna dikkat edin. Eğer anlamsız karakterler alırsanız, görüntü kalitesini ve `enable_handwritten_recognition` bayrağını tekrar kontrol edin. + +## Yaygın Sorunlarla Baş Etme + +| Sorun | Belirti | Çözüm | +|-------|---------|-----| +| Düşük güven skorları | Birçok “?” veya anlamsız karakter | Görüntü DPI'sını ≥300'e yükseltin, ikilileştirme (`opencv`) uygulayın veya ilgi bölgesine kırpın. 
| +| Karışık diller | Çıktı İngilizce ve başka bir betiği karıştırıyor | `engine.ocr_settings.language = "eng"` (veya başka bir ISO kodu) `recognize()` öncesinde ayarlayın. | +| Büyük dosyalar | Uzun işleme süresi veya bellek hatası | Görüntüyü yüklemeden önce makul bir boyuta (ör. maksimum genişlik 1200 px) yeniden boyutlandırın. | +| Tesseract eksik | `ImportError` or `FileNotFoundError` | Tesseract'ı ayrı olarak kurun ve sistem PATH'inde olduğundan emin olun. | + +Bu ayarlamalar, **convert handwritten photo to text** iş akışınızı çeşitli veri setlerinde sağlam tutar. + +## Bugün Çalıştırabileceğiniz Tam Betik + +Aşağıda, tüm parçaları bir araya getiren eksiksiz, bağımsız program yer alıyor. `handwritten_ocr.py` adlı bir dosyaya kopyalayın ve `python handwritten_ocr.py` komutunu çalıştırın. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Çalıştırın, ve metnin konsola yazdırıldığını göreceksiniz—tam da **convert handwritten photo to text** istediğinizde ihtiyaç duyduğunuz sonuç. + +## Daha İleri + +Artık **el yazısı görüntüsünden metin çıkarma** temellerine hâkim olduğunuz için, aşağıdaki adımları düşünün: + +- **Toplu işleme:** Görüntü klasörünü döngüye alıp her sonucu bir CSV dosyasına kaydedin. 
+- **Son işlem:** Yaygın OCR hatalarını (ör. “1” vs “l”) temizlemek için düzenli ifadeler kullanın. +- **Entegrasyon:** Çıkarılan dizeleri duygu analizi veya anahtar kelime çıkarımı için bir Doğal Dil İşleme işlem hattına besleyin. +- **Alternatif kütüphaneler:** Daha yüksek doğruluk gerekiyorsa, `easyocr` veya `pytesseract` ile özelleştirilmiş LSTM modellerini keşfedin—her ikisi de **python ocr handwritten recognition**'ı destekler. + +Unutmayın, kaynak görüntünün kalitesi genellikle başarıyı belirler, bu yüzden ön işleme birkaç dakikanızı ayırın. Şimdi biraz ekstra çaba, ileride çokça hata ayıklamayı önler. + +## Sonuç + +Tam bir uçtan uca örnek üzerinden **el yazısı metni nasıl tanıyacağınızı** ve daha da önemlisi **el yazısı görüntüsünden metin nasıl çıkarılır** gösterdik. `ocr` paketini kurarak, el yazısı bayrağını açarak, resminizi yükleyerek ve `recognize()` çağırarak, sadece birkaç satırda **convert handwritten photo to text** yapabilirsiniz. + +Kendi notlarınızla deneyin, ön işleme adımlarını ayarlayın ve OCR'un ağır işi yapmasına izin verin. Herhangi bir sorunla karşılaşırsanız, “Yaygın Sorunlarla Baş Etme” tablosuna geri dönün veya alternatif OCR arka uçlarıyla deney yapın. Kodlamanız keyifli olsun ve el yazısı verileriniz anında aranabilir hale gelsin! + +## Sonra Ne Öğrenmelisiniz? + +Aşağıdaki öğreticiler, bu rehberde gösterilen tekniklere dayanan yakından ilgili konuları kapsar. Her kaynak, ek API özelliklerini öğrenmenize ve kendi projelerinizde alternatif uygulama yaklaşımlarını keşfetmenize yardımcı olmak için adım adım açıklamalarla tam çalışan kod örnekleri içerir. 
+ +- [Aspose OCR ile Görüntüden Metin Çıkarma – Adım Adım Kılavuz](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Aspose.OCR'da OCR Metin Tanıma için Sayfa Dikdörtgenlerini Tanıma](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/) +- [OCR Kullanımı - Metin Alanı Algılamadan Görüntü Tanıma](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/turkish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/turkish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md new file mode 100644 index 000000000..6c957fd09 --- /dev/null +++ b/ocr/turkish/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md @@ -0,0 +1,242 @@ +--- +category: general +date: 2026-06-06 +description: 'OCR karakter seti rehberi: OCR motorunu kullanarak Latin ve Kiril alfabeleri + için desteklenen karakterleri tam Python örnekleriyle nasıl listeleyeceğinizi öğrenin.' +draft: false +keywords: +- ocr character set +- use OCR engine +- supported OCR characters +- script detection OCR +- OCR library Python +language: tr +og_description: 'OCR karakter seti rehberi: Python’da OCR motorunu nasıl kullanarak + çeşitli yazı tipleri için desteklenen karakterleri listeleyeceğinizi keşfedin, kod + ve ipuçlarıyla birlikte.' +og_title: OCR Karakter Seti – OCR Motorunu Al ve Kullan +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: 'OCR character set guide: learn how to use OCR engine to list supported + characters for Latin and Cyrillic scripts with full Python examples.' 
+ headline: OCR Character Set – Retrieve and Use OCR Engine in Python + type: TechArticle +tags: +- OCR +- Python +- Character Sets +title: OCR Karakter Seti – Python’da OCR Motorunu Al ve Kullan +url: /tr/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# OCR Karakter Seti – Python'da OCR Motorunu Al ve Kullan + +Kütüphanenizin desteklediği **ocr character set**'i öğrenmek ister misiniz? Bu öğreticide, farklı betikler için desteklenen karakterleri sorgulamak ve bunun gerçek dünya projelerinde neden önemli olduğunu göstermek için **use OCR engine**'i nasıl kullanacağınızı göstereceğiz. + +Eğer OCR işleme sonrasında bazı harflerin neden hiç görünmediğini merak ederek boş bir ekrana bakıyorsanız, yalnız değilsiniz. Cevap genellikle motorun bildiği karakter setinde gizlidir. Bu rehberin sonunda şunları yapabilecek: + +* Belirli bir betik için bir OCR motorunun tanıyabileceği tüm karakterleri listeleyin. +* Bir betik desteklenmediğinde durumları yönetin. +* Karakter‑seti bilgisini sonraki doğrulama veya UI mantığına entegre edin. + +Sihirli kara kutu hileleri yok—sadece saf Python, birkaç satır kod ve biraz açıklama. + +--- + +## Önkoşullar + +İlerlemeye başlamadan önce, şunların yüklü olduğundan emin olun: + +* Python 3.8+ yüklü (kod f‑string'leri kullanıyor). +* `ocr.OcrEngine` sağlayan OCR kütüphanesi—bu örnek için `ocr` adlı paketin `pip install ocr-lib` ile zaten yüklü olduğunu varsayacağız. +* Python'un import sistemi ve çıktı verme konusunda temel bilgi. + +Eğer kütüphane eksikse, şu komutu çalıştırın: + +```bash +pip install ocr-lib +``` + +Bu kadar—temel karakter‑seti sorguları için ekstra ikili dosyalar veya yerel bağımlılıklar gerekmez. + +## Adım 1: OCR Motoru Örneğini Başlatma + +Bir motor nesnesi oluşturmak, **use OCR engine** işlevselliğini kullandığınızda yaptığınız ilk şeydir. 
Bunu bir tarayıcıyı açmak gibi düşünün; olmadan hiçbir soru soramazsınız. + +```python +import ocr + +# Step 1: Create an OCR engine instance +engine = ocr.OcrEngine() +``` + +`OcrEngine` sınıfı, temel tanıma motorunun hafif bir sarmalayıcısıdır. Bir şey talep edene kadar ağır işlemeye başlamaz, bu yüzden başlangıç maliyeti ihmal edilebilir. + +> **Pro ipucu:** Uygulamanız birçok motor örneği oluşturuyorsa, tek bir örneği yeniden kullanın. Belleği tasarruf eder ve başlatma gecikmesini azaltır. + +## Adım 2: Belirli Bir Betik İçin OCR Karakter Setini Sorgulama + +Artık bir motorumuz olduğuna göre, ona belirli bir yazı sistemi için hangi karakterleri bildiğini sorabiliriz. `get_supported_characters(script_name)` yöntemi, Unicode karakterlerinin bir Python `list`'ini döndürür. + +```python +# Step 2: Get the list of characters the engine can recognize for the Latin script +latin_chars = engine.get_supported_characters("Latin") +print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}") +``` + +Yukarıdaki kod parçasını çalıştırmak şöyle bir çıktı verir: + +``` +Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'] +``` + +Tam çıktı OCR kütüphanesinin sürümüne bağlıdır, ancak her zaman düz bir karakter listesi alırsınız. Hem büyük hem küçük harflere ihtiyacınız varsa, sadece `"Latin"` betiğini isteyin ve ardından her öğeye `.lower()` uygulayın, ya da kütüphane ayırıyorsa ayrı bir `"Latin‑lower"` betiğini sorgulayın. + +### Neden Önemli? + +Alışılmadık diakritik işaretler içeren bir görüntü (ör. “ñ” veya “ø”) beslediğinizde, OCR motoru bunları sessizce bir yer tutucu ile değiştirebilir veya tamamen atabilir. **ocr character set**'i önceden bilmek, girdiyi ön‑doğrulamanıza, kullanıcıları uyarmanıza veya farklı bir motora geçmenize olanak tanır. 
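
The pre-validation idea above can be sketched as follows. This is a minimal illustration, assuming the list returned by `get_supported_characters` holds single Unicode characters. The `supported` list and the `find_unsupported` helper are hypothetical, not part of any library.

```python
import unicodedata


def find_unsupported(text, supported_chars):
    """Return the characters in `text` that are missing from the supported set.

    Both sides are normalized to NFC so that a precomposed character and its
    decomposed form (base letter + combining mark) compare as equal.
    """
    allowed = {unicodedata.normalize("NFC", ch) for ch in supported_chars}
    normalized = unicodedata.normalize("NFC", text)
    missing = []
    for ch in normalized:
        # Whitespace is layout, not something the engine has to recognize.
        if ch.isspace() or ch in allowed or ch in missing:
            continue
        missing.append(ch)
    return missing


# Hypothetical sample set: uppercase A-Z only, not real engine output.
supported = [chr(code) for code in range(ord("A"), ord("Z") + 1)]

# "N" followed by a combining tilde is reported as the single character U+00D1.
print(find_unsupported("MAN\u0303ANA", supported))
print(find_unsupported("HELLO WORLD", supported))
```

Running a check like this on text you expect to see in your documents can show early whether a given script setting is likely to drop characters.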
+ +## Adım 3: Başka Bir Betik İçin Karakterleri Al – Kiril Örneği + +Aynı yöntem, motorun desteklediğini iddia ettiği herhangi bir betik için çalışır. Kiril alfabesiyle ne olacağını görelim. + +```python +# Step 3: Get the list of characters the engine can recognize for the Cyrillic script +cyrillic_chars = engine.get_supported_characters("Cyrillic") +print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}") +``` + +Tipik çıktı: + +``` +Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я'] +``` + +Motor istenen betiği **desteklemiyorsa**, genellikle bir `ValueError` yükseltir (veya uygulamaya bağlı olarak boş bir liste döner). Buna karşı önlem alalım. + +```python +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +# Example usage: +safe_get_characters(engine, "Arabic") # Might trigger the error branch +``` + +## Adım 4: OCR Karakter Setini Görselleştirme (İsteğe Bağlı) + +Bazen hızlı bir görsel yardımcı olur, özellikle paydaşlara hangi karakterlerin kapsandığını göstermeniz gerektiğinde. Aşağıda `matplotlib` kullanarak bir karakter ızgarasını çizen minimal bir örnek var. 
+ +```python +import matplotlib.pyplot as plt + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +# Visualize Latin characters +plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +> **Görsel alt metni:** ![OCR character set diagram](ocr_character_set.png){alt="OCR character set overview"} + +Diagram temel çözüm için gerekli değildir, ancak **use OCR engine** meta verilerini düz yazdırmanın ötesinde nasıl kullanabileceğinizi gösterir. + +## Adım 5: Karakter Setini Gerçek‑Dünya İş Akışlarına Entegre Etme + +Artık **ocr character set**'i alabileceğimize göre, pratik bir senaryoya bakalım: OCR çıktısını bir veritabanına kaydetmeden önce doğrulama. + +```python +def validate_ocr_output(text, allowed_chars): + """Return True if every character in `text` exists in `allowed_chars`.""" + invalid = [c for c in text if c not in allowed_chars] + if invalid: + print(f"Invalid characters found: {invalid}") + return False + return True + +# Suppose we ran OCR on an image and got this result: +ocr_result = "HELLO WORLD!" + +# Use the previously fetched Latin set for validation: +if validate_ocr_output(ocr_result, set(latin_chars)): + print("All characters are supported – safe to store.") +else: + print("Result contains unsupported symbols – consider manual review.") +``` + +Bu desen, çok dilli belgelerle çalışırken sık karşılaşılan bir sorun olan hatalı verilerin sonraki işlem hatlarına sızmasını önler. + +## Yaygın Tuzaklar ve Kenar Durumları + +| Sorun | Neden Olur | Nasıl Önlenir | +|-------|------------|---------------| +| **Betik adı yazım hatası** (ör. `"Cyrillic "` sonundaki boşluk) | Motor dizeyi harfi harfine alır ve betiği bulamaz. 
| `get_supported_characters` çağırmadan önce `script.strip()` ile boşlukları temizleyin. | +| **Boş karakter listesi** | Bazı motorlar bir betiği gösterir ancak dil modelini henüz yüklememiştir. | Kütüphane böyle bir yöntem sağlıyorsa `engine.load_language_model(script)` çağırın veya model dosyalarının mevcut olduğundan emin olun. | +| **Unicode normalizasyon sorunları** | “é” gibi karakterler birleşik (`\u00E9`) veya ayrık (`e\u0301`) biçimde görünebilir. | Doğrulamadan önce `unicodedata.normalize('NFC', text)` ile dizeleri normalleştirin. | +| **Büyük betiklerde performans** (ör. Çince) | Binlerce karakteri almak yavaş olabilir. | İlk çağrıdan sonra sonucu önbelleğe alın; çoğu uygulama listeye sadece bir kez ihtiyaç duyar. | + +## Tam Çalışan Örnek + +Tüm parçaları birleştirerek, kopyalayıp yapıştırıp çalıştırabileceğiniz tek bir betik: + +```python +import ocr +import matplotlib.pyplot as plt + +def safe_get_characters(engine, script): + try: + chars = engine.get_supported_characters(script) + if not chars: + print(f"No characters returned for script '{script}'.") + else: + print(f"Supported {script} chars ({len(chars)}): {chars}") + return chars + except Exception as e: + print(f"Error retrieving characters for script '{script}': {e}") + return [] + +def plot_character_grid(chars, title="OCR Character Set"): + cols = 10 + rows = (len(chars) + cols - 1) // cols + fig, ax = plt.subplots(figsize=(cols, rows)) + ax.set_axis_off() + for idx, ch in enumerate(chars): + row = idx // cols + col = idx % cols + ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center') + plt.title(title) + plt.show() + +if __name__ == "__main__": + engine = ocr.OcrEngine() + latin_chars = safe_get_characters(engine, "Latin") + safe_get_characters(engine, "Cyrillic") + if latin_chars: + plot_character_grid(latin_chars, title="Latin OCR Character Set") +``` + +## Sonra Ne Öğrenmelisiniz? + +Bu öğreticiler, bu rehberde gösterilen tekniklere dayanan yakından ilgili konuları kapsar. Her kaynak, ek API özelliklerini öğrenmenize ve kendi projelerinizde alternatif uygulama yaklaşımlarını keşfetmenize yardımcı olmak için adım adım açıklamalar içeren tam çalışan kod örnekleri sunar.
+ +- [Aspose OCR Öğreticisi – Optik Karakter Tanıma](/ocr/english/) +- [OCR Son İşleme – Karakter Seçeneklerini Al](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/) +- [OCR Görüntü Tanıma'da Eşik Değerini Nasıl Ayarlarsınız](/ocr/english/net/ocr-settings/set-threshold-value/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/turkish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/turkish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md new file mode 100644 index 000000000..2cd202908 --- /dev/null +++ b/ocr/turkish/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md @@ -0,0 +1,256 @@ +--- +category: general +date: 2026-06-06 +description: Aspose OCR ve bir Hugging Face modeli kullanarak görüntüde OCR gerçekleştirin. + Hugging Face modelini nasıl indireceğinizi, faturadan metin çıkartmayı ve GPU kaynaklarını + nasıl serbest bırakacağınızı öğrenin. +draft: false +keywords: +- perform OCR on image +- download Hugging Face model +- extract text from invoice +- free GPU resources +- load image for OCR +language: tr +og_description: Aspose OCR ve bir Hugging Face modeli kullanarak görüntüde OCR gerçekleştirin. + Bu öğreticide modelin nasıl indirileceği, faturadan metin çıkarma ve GPU kaynaklarını + serbest bırakma gösterilmektedir. +og_title: Aspose OCR ve LLM ile Görüntüde OCR Yapma – Tam Rehber +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn + how to download Hugging Face model, extract text from invoice, and free GPU resources. 
+ headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide + type: TechArticle +tags: +- OCR +- Aspose +- Python +- AI +- HuggingFace +title: Aspose OCR ve LLM ile Görüntüde OCR Yapma – Tam Kılavuz +url: /tr/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Aspose OCR & LLM ile Görüntü Üzerinde OCR Gerçekleştirme – Tam Kılavuz + +Hiç **perform OCR on image** dosyalarıyla çalışmak isteyip “nereden başlamalıyım?” sorusunda takıldıysanız? Yalnız değilsiniz—birçok geliştirici belge otomasyonuna ilk kez başladığında bu duvara çarpar. İyi haber, Aspose OCR ve Hugging Face'ten hafif bir LLM ile bir faturanın ham taramasını sadece birkaç Python satırıyla temiz, aranabilir metne dönüştürebilirsiniz. + +Bu öğreticide ihtiyacınız olan her şeyi adım adım göstereceğiz: **loading image for OCR**'den, **download the Hugging Face model**'a, **extract text from invoice** verilerine ve sonunda **free GPU resources**'a kadar, böylece uygulamanız hafif kalır. Sonunda, herhangi bir projeye ekleyebileceğiniz bağımsız bir betiğiniz olacak. + +--- + +## Neler Öğreneceksiniz + +- Aspose'un `OcrEngine`'ini kullanarak **perform OCR on image** nasıl yapılır. +- **download Hugging Face model** dosyalarını otomatik olarak indirme adımları. +- AI destekli post‑processing ile **extract text from invoice** PDF'leri veya PNG'leri işleme teknikleri. +- Çıkarımdan sonra **free GPU resources** için en iyi uygulamalar. +- **load image for OCR**'ı verimli bir şekilde yapma ve yaygın tuzaklardan kaçınma ipuçları. + +Harici bir dokümantasyona gerek yok—ihtiyacınız olan her şey burada, tam kod, açıklamalar ve beklenen çıktı ile. 
+ +--- + +## Önkoşullar + +İçeriğe başlamadan önce şunların olduğundan emin olun: + +| Requirement | Reason | +|-------------|--------| +| Python 3.9+ | Modern sözdizimi ve tip ipuçları | +| `asposeocr` package (`pip install asposeocr`) | Temel OCR motoru | +| Access to a GPU (optional but recommended) | LLM post‑processor'ı hızlandırır | +| An invoice image (`sample_invoice.png`) | Gerçek dünya test durumu | + +Eğer bunlardan herhangi birine sahip değilseniz, şimdi kurun; betik ayrıca **download Hugging Face model**'i otomatik olarak indirecek, böylece dosyaları kendiniz aramak zorunda kalmayacaksınız. + +--- + +## Adım 1: Görüntü Üzerinde OCR Gerçekleştirme – Motoru Oluşturma + +İlk yapmanız gereken Aspose'un OCR motorunu başlatmaktır. Bunu, görüntünün daha sonra metne dönüştürüleceği boş bir tuval açmak gibi düşünün. + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# Step 1: Initialize the OCR engine +engine = ocr.OcrEngine() +``` + +> **Neden önemli?** `OcrEngine` tüm düşük seviyeli görüntü ön işleme adımlarını soyutlar, böylece daha yüksek seviyeli iş akışına odaklanabilirsiniz. Ayrıca daha sonra daha akıllı çıktı için bir LLM bağlamamıza izin verecek bir `set_post_processor` metodunu da sunar. + +--- + +## Adım 2: OCR İçin Görüntü Yükleme – Doğru Dosyayı Seçme + +Motor artık mevcut olduğuna göre **load image for OCR** yapmamız gerekiyor. Aspose PNG, JPG, TIFF ve birkaç diğer formatı destekler. Yolun mutlak ya da betiğinizin konumuna göre göreli olduğundan emin olun. + +```python +# Step 2: Load the image you want to process +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") +``` + +> **İpucu:** Görüntünüz büyükse, belleği azaltmak için önceden yeniden boyutlandırmayı düşünün. OCR motoru yüksek çözünürlüklü taramaları işleyebilir, ancak 300 DPI bir görüntü genellikle faturalar için ideal bir noktadır. 
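The resizing tip above can be sketched as a small helper that computes the target dimensions before any pixels are touched. This is a minimal sketch under stated assumptions, not part of the Aspose API: the function name `fit_within` and the default cap of 3508 px (roughly the long side of an A4 page scanned at 300 DPI) are illustrative choices.

```python
def fit_within(width: int, height: int, max_side: int = 3508) -> tuple[int, int]:
    """Return (width, height) scaled down so the longer side is at most max_side.

    The aspect ratio is preserved, and images that already fit are returned unchanged.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to whole pixels and never let a side collapse below 1 px.
    return max(1, round(width * scale)), max(1, round(height * scale))


# Example: a 600 DPI A4 scan (4960 x 7016 px) is halved to about 300 DPI.
print(fit_within(4960, 7016))  # -> (2480, 3508)
```

If Pillow is installed, the returned size could be passed to `Image.resize` and the result saved to the path that `load_image` reads. Only downscaling is handled here, since enlarging a scan adds memory cost without adding detail.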
+ +--- + +## Adım 3: Ham OCR Gerçekleştir ve Çıkarılan Metni Görüntüle + +Görüntü yüklendikten sonra nihayet **perform OCR on image** yapabilir ve ham motorun ne ürettiğini görebiliriz. Bu adım, AI sihrini eklemeden önce bir temel sağlar. + +```python +# Step 3: Run the built‑in OCR and capture raw results +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) +``` + +**Beklenen çıktı (kısaltılmış):** + +``` +Raw text: +Invoice #12345 +Date: 2024‑04‑01 +Total: $1,250.00 +... +``` + +Ham çıktı genellikle satır sonları, hatalı tanınan karakterler veya eksik alanlar içerir—tam da bir dil modeli ekleyeceğimiz neden budur. + +--- + +## Adım 4: Hugging Face Modeli İndir – LLM Post‑Processor'ı Yapılandırma + +İşte **download Hugging Face model** adımının parladığı yer. Aspose AI, model diskte yoksa Hugging Face hub'ından otomatik olarak çekebilir. Doğruluk ve bellek ayak izini dengeleyen Qwen2.5‑3B‑Instruct‑GGUF modelini kullanacağız. + +```python +# Step 4: Set up the AI model configuration +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", # automatically fetch model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # int8 quantization cuts RAM usage dramatically + gpu_layers=20, # first 20 layers run on GPU, rest on CPU + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) +``` + +> **Neden işe yarıyor:** `allow_auto_download` `.gguf` dosyasını manuel olarak indirmenizi engeller. Kuantizasyon (`int8`) model boyutunu yaklaşık 3 GB'ye düşürür, bu da çoğu tüketici GPU'sunda uygulanabilir kılar. Donanımınıza göre `gpu_layers` değerini ayarlayın—GPU'da daha fazla katman = daha hızlı çıkarım. + +--- + +## Adım 5: AI‑Destekli Post‑Processing Kullanarak Faturadan Metin Çıkarma + +Şimdi LLM'yi OCR motoruna ekliyoruz ve ham çıktıyı temizleyen, OCR hatalarını düzelten ve fatura alanlarını güzel bir şekilde biçimlendiren bir **post‑processor** çalıştırıyoruz. 
+ +```python +# Step 5: Initialize the AI engine and bind it to the OCR engine +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # implicit when using set_post_processor in newer versions +engine.set_post_processor(ai_engine) + +# Run the AI‑enhanced post‑processor +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) +``` + +**Örnek geliştirilmiş çıktı:** + +``` +Enhanced text: +Invoice Number: 12345 +Date: 2024-04-01 +Bill To: Acme Corp. +Amount Due: $1,250.00 +Status: Paid +``` + +> **Ne oldu?** LLM, “Invoice #12345” ifadesinin “Invoice Number: 12345” olması gerektiğini fark etti, tarih formatını düzeltti ve ham motorun kaçırdığı “Bill To” alanını bile tahmin etti. Bu, **extract text from invoice** otomasyonunun özüdür. + +--- + +## Adım 6: GPU Kaynaklarını Serbest Bırak – İşlem Sonrası Temizleme + +Bunu uzun ömürlü bir hizmette (ör. Flask API) çalıştırıyorsanız, her çıkarımdan sonra **free GPU resources** yapmanız gerekir; aksi takdirde bellek dışı çöküşler yaşanır. Aspose AI bunun için basit bir yöntem sunar. + +```python +# Step 6: Release GPU memory and dispose of the engine +ai_engine.free_resources() # frees GPU layers and model tensors +engine.dispose() # cleans up internal buffers +``` + +> **Pro ipucu:** OCR çağrısını try/except içinde sarıyorsanız, `free_resources()`'ı bir `finally:` bloğu içinde çağırın. Bu, bir istisna oluşsa bile temizlik garantiler. + +--- + +## Adım 7: Tam Betik – Hepsini Bir Araya Getirme + +Aşağıda tam, çalıştırmaya hazır betik yer alıyor. Kopyalayıp yapıştırın, yolları ayarlayın ve hazırsınız. 
+ +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Create OCR engine +engine = ocr.OcrEngine() + +# 2️⃣ Load image for OCR +engine.load_image("YOUR_DIRECTORY/sample_invoice.png") + +# 3️⃣ Raw OCR +raw_result = engine.recognize() +print("Raw text:") +print(raw_result.text) + +# 4️⃣ Download Hugging Face model (auto‑download enabled) +ai_cfg = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/asp_ocr_models" +) + +# 5️⃣ Attach AI post‑processor and enhance text +ai_engine = AsposeAI() +ai_engine.initialize(ai_cfg) # optional in newer versions +engine.set_post_processor(ai_engine) + +enhanced_result = engine.run_postprocessor(raw_result) +print("\nEnhanced text:") +print(enhanced_result.text) + +# 6️⃣ Free GPU resources +ai_engine.free_resources() +engine.dispose() +``` + +Betik çalıştırın ve gürültülü OCR'dan temiz, yapılandırılmış fatura verisine dönüşümü izleyin. 🎉 + +--- + +## Sık Sorulan Sorular & Kenar Durumları + +| Soru | Cevap | +|------|-------| +| **Model indirme başarısız olursa ne olur?** | Makinenizin internet erişimi olduğundan ve `hugging_face_repo_id`'nin doğru olduğundan emin olun. Ayrıca dosyayı manuel olarak indirebilirsiniz | + +## Sonraki Öğrenmeniz Gerekenler? + +Aşağıdaki öğreticiler, bu kılavuzda gösterilen tekniklere dayanan yakından ilgili konuları kapsar. Her kaynak, ek API özelliklerini öğrenmenize ve kendi projelerinizde alternatif uygulama yaklaşımlarını keşfetmenize yardımcı olacak adım adım açıklamalar içeren tam çalışan kod örnekleri sunar. 
+ +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) +- [Extract Text from Image – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/turkish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/turkish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md new file mode 100644 index 000000000..7946b4345 --- /dev/null +++ b/ocr/turkish/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md @@ -0,0 +1,283 @@ +--- +category: general +date: 2026-06-06 +description: Aspose OCR ile görüntüden metin tanıma – OCR için görüntüyü nasıl yükleyeceğinizi + ve Python’da AI sonrası işleme kullanarak görüntüde OCR nasıl yapacağınızı öğrenin. +draft: false +keywords: +- recognize text from image +- perform ocr on image +- load image for ocr +language: tr +og_description: Görüntüden metni hızlıca tanıyın. Bu rehber, OCR için görüntünün nasıl + yükleneceğini, görüntü üzerinde OCR nasıl yapılacağını ve AI sonrası işleme ile + sonuçların nasıl artırılacağını gösterir. +og_title: Aspose OCR ve AI kullanarak görüntüden metin tanıma +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. 
+ headline: recognize text from image using Aspose OCR & AI + type: TechArticle +- description: recognize text from image with Aspose OCR – learn how to load image + for OCR and perform OCR on image using AI post‑processing in Python. + name: recognize text from image using Aspose OCR & AI + steps: + - name: Prerequisites + text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) - + An image file (e.g., `doc.png`) that contains printed or handwritten text - + Optional: a GPU for faster LLM inference (the script works on CPU too)' + - name: Install and import the required modules + text: '```python # Install the library (run once in your environment) # pip install + asposeocr' + - name: Create the OCR engine and enable handwritten text recognition + text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()' + - name: Load the image for OCR + text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png") + ```' + - name: Run the raw OCR pass + text: '```python # Execute the recognition pipeline and capture the raw result + raw_result = ocr_engine.recognize() ```' + - name: Configure the Aspose AI model for LLM post‑processing + text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true", + # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20 + GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo' + - name: Initialise the AI processor and attach it as a post‑processor + text: '```python # Instantiate the AI processor with the configuration above ai_processor + = AsposeAI() ai_processor.initialize(ai_config)' + - name: Run the post‑processor to enhance the OCR output + text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)' + - name: Release resources + text: '```python # Free GPU/CPU 
memory held by the AI model ai_processor.free_resources()' + type: HowTo +- questions: + - answer: No. The script falls back to CPU, but inference will be slower. + question: Do I need a GPU? + - answer: Absolutely—just change `hugging_face_repo_id` and adjust `gpu_layers` + accordingly. + question: Can I use a different LLM? + - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable. + question: What if my image is huge? + - answer: You can toggle `enable_handwritten_recognition` depending on your workload. + question: Is handwritten recognition always on? + type: FAQPage +tags: +- OCR +- Aspose +- Python +title: Aspose OCR ve AI kullanarak görüntüden metin tanıma +url: /tr/python/general/recognize-text-from-image-using-aspose-ocr-ai/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Aspose OCR & AI kullanarak görüntüden metin tanıma + +Görüntüden metin tanımanız gerektiğinde, hem hız hem de doğruluk sağlayacak kütüphaneyi bulamadınız mı? Yalnız değilsiniz. Bu rehberde **OCR için görüntüyü nasıl yükleyeceğinizi**, **görüntü üzerinde OCR nasıl yapılacağını** ve ardından Aspose’un AI sonrası işleyicisiyle çıktıyı nasıl parlatacağınızı gösteren eksiksiz, uçtan uca bir örnek üzerinden ilerleyeceğiz. Sonunda, PNG dosyasını temiz, aranabilir metne dönüştüren çalıştırmaya hazır bir betiğiniz olacak. + +## Öğrenecekleriniz + +Aspose OCR paketinin kurulumundan çalışmanın sonunda kaynakların serbest bırakılmasına kadar her şeyi ele alacağız. El yazısı tanımanın neden önemli olduğunu, post‑processing için Qwen 2.5 LLM’nin nasıl yapılandırılacağını ve son çıktının nasıl göründüğünü göreceksiniz. Harici referanslara gerek yok—kopyalayıp yapıştırın, çalıştırın. + +### Önkoşullar + +- Python 3.8 ve üzeri +- `asposeocr` paketi (`pip install asposeocr`) +- Basılı ya da el yazısı metin içeren bir görüntü dosyası (ör. 
`doc.png`) +- İsteğe bağlı: daha hızlı LLM çıkarımı için bir GPU (betik CPU’da da çalışır) + +--- + +## Görüntüden metin tanıma – Adım‑adım + +Her kod bloğunun altında, **ne yaptığımızı** değil, **neden** o işlemi yaptığımızı açıklayan kısa bir metin bulacaksınız. + +### Adım 1: Gerekli modülleri kurun ve içe aktarın + +```python +# Install the library (run once in your environment) +# pip install asposeocr + +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig +``` + +*Niçin?* `asposeocr`’yi içe aktarmak `OcrEngine` sınıfını sağlar, `ai` alt modülü ise ham OCR çıktısını büyük ölçüde iyileştiren LLM‑tabanlı post‑processörü sunar. + +### Adım 2: OCR motorunu oluşturun ve el‑yazısı tanımayı etkinleştirin + +```python +# Initialise the OCR engine +ocr_engine = ocr.OcrEngine() + +# Turn on handwritten‑text support – useful for notes, signatures, etc. +ocr_engine.ocr_settings.enable_handwritten_recognition = True +``` + +El‑yazısı tanımayı etkinleştirmek motorun karakter setini genişletir, böylece **görüntü üzerinde OCR yaparken** karışık basılı ve el yazısı metin içeren dosyalarda karalamaları kaybetmezsiniz. + +### Adım 3: OCR için görüntüyü yükleyin + +```python +# Point the engine at the file you want to process +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") +``` + +`load_image` çağrısı **OCR için görüntüyü yüklediğiniz** anıdır; yol yanlışsa motor bilgilendirici bir istisna fırlatır ve sizi sonraki gizemli hatalardan korur. + +### Adım 4: Ham OCR geçişini çalıştırın + +```python +# Execute the recognition pipeline and capture the raw result +raw_result = ocr_engine.recognize() +``` + +Bu aşamada, filtrelenmemiş metin, güven skorları ve düzen meta verilerini içeren bir `RecognitionResult` nesnesi elde edersiniz. Çıktı genellikle gürültülüdür—bu yüzden AI‑destekli temizlik gerekir. 
+ +### Adım 5: LLM post‑processi için Aspose AI modelini yapılandırın + +```python +ai_config = AsposeAIModelConfig( + allow_auto_download="true", # Pull the model if missing + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", # Reduce RAM usage + gpu_layers=20, # Use 20 GPU layers if a GPU is present + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) +``` + +Bu ayarlarla neden uğraşıyoruz? +- **auto‑download** modeli ilk çalıştırmada kullanılabilir kılar. +- **int8 quantization** hafıza ihtiyacını büyük bir doğruluk kaybı olmadan azaltır. +- **gpu_layers** uyumlu bir GPU’yu daha hızlı çıkarım için kullanmanıza olanak tanır. + +### Adım 6: AI işlemcisini başlatın ve post‑processör olarak ekleyin + +```python +# Instantiate the AI processor with the configuration above +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) + +# Hook the AI processor into the OCR engine +ocr_engine.set_post_processor(ai_processor, None) +``` + +İşlemciyi eklemek, `run_postprocessor` her çağrıldığında LLM’nin yazım hatalarını düzeltmesini, kırık kelimeleri birleştirmesini ve eksik noktalama işaretlerini tahmin etmesini sağlar. + +### Adım 7: OCR çıktısını iyileştirmek için post‑processörü çalıştırın + +```python +# Apply AI‑driven post‑processing +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# Print the polished text +print(enhanced_result.text) +``` + +`enhanced_result.text` genellikle ham dizeden çok daha okunaklıdır—bağlamı anlayan bir yazım denetleyicisi gibi düşünün. + +### Adım 8: Kaynakları serbest bırakın + +```python +# Free GPU/CPU memory held by the AI model +ai_processor.free_resources() + +# Dispose of the OCR engine to close file handles, etc. +ocr_engine.dispose() +``` + +Uzun süren hizmetlerde temizlik hayati önemdedir; aksi takdirde GPU belleği sızar ve uygulamanız sonunda çökebilir. 
+ +--- + +## Bugün çalıştırabileceğiniz tam betik + +```python +import asposeocr as ocr +from asposeocr.ai import AsposeAI, AsposeAIModelConfig + +# 1️⃣ Initialise OCR engine with handwritten support +ocr_engine = ocr.OcrEngine() +ocr_engine.ocr_settings.enable_handwritten_recognition = True + +# 2️⃣ Load the image you want to analyse +ocr_engine.load_image("YOUR_DIRECTORY/doc.png") # <-- replace with your path + +# 3️⃣ Perform the basic OCR pass +raw_result = ocr_engine.recognize() + +# 4️⃣ Set up the AI model for smarter output +ai_config = AsposeAIModelConfig( + allow_auto_download="true", + hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", + hugging_face_quantization="int8", + gpu_layers=20, + directory_model_path="YOUR_DIRECTORY/models/ocr_ai" +) + +# 5️⃣ Initialise and attach the AI post‑processor +ai_processor = AsposeAI() +ai_processor.initialize(ai_config) +ocr_engine.set_post_processor(ai_processor, None) + +# 6️⃣ Enhance the OCR result with the LLM +enhanced_result = ocr_engine.run_postprocessor(raw_result) + +# 7️⃣ Show the final, cleaned‑up text +print("=== Enhanced OCR Output ===") +print(enhanced_result.text) + +# 8️⃣ Clean up +ai_processor.free_resources() +ocr_engine.dispose() +``` + +**Beklenen çıktı** (basit bir fatura görüntüsü örneği): + +``` +=== Enhanced OCR Output === +Invoice #12345 +Date: 2024‑04‑01 +Total Amount: $1,250.00 +Thank you for your business! +``` + +AI katmanının “Inv0ice” → “Invoice” hatasını düzelttiğine ve eksik noktalama işaretlerini eklediğine dikkat edin. + +--- + +## Sıkça sorulan sorular (ve hızlı cevaplar) + +- **GPU’ya ihtiyacım var mı?** Hayır. Betik CPU’ya geri döner, ancak çıkarım daha yavaş olur. +- **Farklı bir LLM kullanabilir miyim?** Kesinlikle—sadece `hugging_face_repo_id` değerini değiştirin ve `gpu_layers` ayarını buna göre düzenleyin. +- **Görüntüm çok büyükse ne yapmalıyım?** Bellek kullanımını makul tutmak için önce yeniden boyutlandırın (ör. Pillow kullanarak). 
+- **El‑yazısı tanıma her zaman açık mı?** İş yükünüze bağlı olarak `enable_handwritten_recognition` değerini açıp kapatabilirsiniz. + +--- + +## Sonuç + +Artık Aspose OCR kullanarak **görüntüden metin tanıma**, **OCR için görüntüyü yükleme** ve AI‑destekli post‑processing ile **görüntü üzerinde OCR yapma** konularını biliyorsunuz. Yukarıdaki eksiksiz, çalıştırılabilir örnek, OCR’u herhangi bir Python projesine entegre etmeniz için sağlam bir temel sunar—makbuz taramadan sözleşme dijitalleştirmeye, el yazısı formlardan veri çıkarımına kadar. + +Bir sonraki adıma hazır mısınız? Qwen modelini daha büyük bir modele değiştirin, farklı kuantizasyon şemalarıyla deney yapın ya da toplu işleme için birden fazla görüntüyü zincirleyin. Olasılıklar sınırsızdır ve az önce oluşturduğunuz kod bunları sorunsuz bir şekilde yönetecektir. + +İyi kodlamalar, OCR sonuçlarınız her zaman kristal‑net olsun! + +![Screenshot of Python console showing enhanced OCR output](/images/ocr_output.png){alt="Aspose OCR kullanarak görüntüden metin tanımanın nasıl yapılacağını gösteren ekran görüntüsü"} + +## Sonraki Öğrenmeniz Gerekenler + +Aşağıdaki öğreticiler, bu rehberde gösterilen tekniklere dayanarak yakından ilgili konuları kapsar. Her kaynak, ek API özelliklerini öğrenmenize ve projelerinizde alternatif uygulama yaklaşımlarını keşfetmenize yardımcı olacak tam çalışan kod örnekleri ve adım‑adım açıklamalar içerir. 
+ +- [Extract Text from Image with Aspose OCR – Step‑by‑Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/) +- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/) +- [Extract image text C# with language selection using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/) + +{{< /blocks/products/pf/tutorial-page-section >}} +{{< /blocks/products/pf/main-container >}} +{{< /blocks/products/pf/main-wrap-class >}} +{{< blocks/products/products-backtop-button >}} \ No newline at end of file diff --git a/ocr/vietnamese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md b/ocr/vietnamese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md new file mode 100644 index 000000000..adcc0ce2f --- /dev/null +++ b/ocr/vietnamese/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/_index.md @@ -0,0 +1,215 @@ +--- +category: general +date: 2026-06-06 +description: Trích xuất văn bản từ hình ảnh viết tay bằng Python OCR. Tìm hiểu cách + chuyển đổi ảnh viết tay thành văn bản một cách nhanh chóng và đáng tin cậy. +draft: false +keywords: +- extract text from handwritten image +- convert handwritten photo to text +- how to recognize handwritten text +- python ocr handwritten recognition +language: vi +og_description: Trích xuất văn bản từ hình ảnh viết tay bằng Python. Hướng dẫn này + chỉ cách chuyển đổi ảnh viết tay thành văn bản và trả lời cách nhận dạng văn bản + viết tay. +og_title: Trích xuất văn bản từ hình ảnh viết tay – Hướng dẫn OCR Python +schemas: +- author: Aspose + dateModified: '2026-06-06' + description: Extract text from handwritten image using Python OCR. Learn how to + convert handwritten photo to text quickly and reliably. 
+ headline: Extract Text from Handwritten Image with Python OCR – Step‑by‑Step Guide + type: TechArticle +tags: +- OCR +- Python +- Handwriting +title: Trích xuất văn bản từ hình ảnh viết tay bằng Python OCR – Hướng dẫn từng bước +url: /vi/python/general/extract-text-from-handwritten-image-with-python-ocr-step-by/ +--- + +{{< blocks/products/pf/main-wrap-class >}} +{{< blocks/products/pf/main-container >}} +{{< blocks/products/pf/tutorial-page-section >}} + +# Trích xuất Văn bản từ Hình ảnh Viết tay bằng Python OCR – Hướng dẫn Từng bước + +Bạn đã bao giờ tự hỏi **cách nhận dạng văn bản viết tay** trong một bức ảnh bạn chụp bằng điện thoại chưa? Bạn không phải là người duy nhất. Trong nhiều dự án—cho dù là số hoá ghi chú bài giảng hay trích xuất dữ liệu từ các mẫu ký—bạn cần **trích xuất văn bản từ hình ảnh viết tay** nhanh chóng và không gặp rắc rối. + +Trong hướng dẫn này, chúng tôi sẽ đi qua một ví dụ hoàn chỉnh, sẵn sàng chạy, cho bạn thấy chính xác cách **chuyển đổi ảnh viết tay thành văn bản** bằng một thư viện Python OCR phổ biến. Không có những tham chiếu mơ hồ, chỉ có mã cụ thể, giải thích và các mẹo bạn có thể sao chép‑dán ngay hôm nay. + +![trích xuất văn bản từ hình ảnh viết tay](https://example.com/placeholder-handwritten.jpg "trích xuất văn bản từ hình ảnh viết tay") + +## Những gì bạn cần + +Trước khi chúng ta bắt đầu, hãy chắc chắn rằng bạn có những thứ sau: + +| Requirement | Why it matters | +|-------------|----------------| +| Python 3.9 or newer | Modern syntax & library support | +| `pip` (Python package manager) | To install the OCR package | +| A clear image of handwritten notes (JPEG/PNG) | OCR accuracy drops on blurry images | +| Basic familiarity with Python functions | Helps you adapt the example later | + +Nếu bạn thiếu bất kỳ mục nào trong số này, hãy tải Python mới nhất từ và cài đặt—không có phiền toái. 
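The first two rows of the prerequisites table can be checked from Python itself before anything is installed. The sketch below is an illustration only: `check_prerequisites` and `MIN_PYTHON` are names introduced here, and the check covers the interpreter version and the presence of `pip`, not the OCR package or the image file.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version listed in the prerequisites table


def check_prerequisites(version_info=sys.version_info):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {found}"
        )
    # find_spec returns None when the module cannot be imported.
    if importlib.util.find_spec("pip") is None:
        problems.append("pip is not available in this interpreter")
    return problems


for problem in check_prerequisites():
    print("Missing prerequisite:", problem)
```

Running the block prints nothing when both checks pass, so it can sit at the top of a setup script without adding noise.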
+ +## Cài đặt Thư viện Python OCR + +Đoạn mã mẫu chúng ta sẽ sử dụng dựa trên gói `ocr` (một lớp bọc nhẹ quanh Tesseract cung cấp các cài đặt tiện lợi). Cài đặt nó bằng một lệnh duy nhất: + +```bash +pip install ocr +``` + +> **Mẹo chuyên nghiệp:** Sau khi cài đặt, chạy `tesseract --version` trong terminal để xác nhận engine nền đã có. Nếu bạn chưa có Tesseract, hãy làm theo hướng dẫn chính thức cho hệ điều hành của bạn—hầu hết các trình quản lý gói đều có sẵn (`apt-get install tesseract-ocr` trên Ubuntu, `brew install tesseract` trên macOS). + +## Bước 1: Tạo một Instance của Engine OCR + +Việc tạo engine là viên gạch đầu tiên trong bức tường của **python ocr handwritten recognition**. Hãy nghĩ engine như bộ não sẽ sau này đọc các nét viết. + +```python +# Step 1: Import the library and instantiate the OCR engine +import ocr + +# The OcrEngine class gives us access to all the low‑level settings. +engine = ocr.OcrEngine() +``` + +Tại sao điều này quan trọng: không có engine, bạn không thể tinh chỉnh quy trình nhận dạng. Các cài đặt mặc định được tối ưu cho văn bản in, vì vậy chúng ta sẽ cần điều chỉnh chúng ở bước tiếp theo. + +## Bước 2: Bật chế độ Nhận dạng Viết tay + +Mặc định engine giả định các ký tự in. Bật chế độ viết tay sẽ chuyển công tắc cho Tesseract sử dụng mô hình LSTM đã được huấn luyện trên các nét chữ viết liền. + +```python +# Step 2: Turn on handwritten mode +engine.ocr_settings.enable_handwritten_recognition = True +``` + +> **Nếu bạn bỏ qua bước này thì sao?** OCR sẽ coi các nét viết là nhiễu, dẫn đến kết quả rối rắm. Bật cờ này là cốt lõi của **cách nhận dạng văn bản viết tay**. + +## Bước 3: Tải ảnh Viết tay của bạn + +Bây giờ chúng ta chỉ định engine tới tệp ảnh. Đường dẫn có thể là tuyệt đối hoặc tương đối; chỉ cần chắc chắn tệp tồn tại. 
+ +```python +# Step 3: Load the image containing the handwritten notes +image_path = "YOUR_DIRECTORY/handwritten_note.jpg" +engine.load_image(image_path) +``` + +Kiểm tra nhanh: mở ảnh trong trình xem của hệ điều hành. Nếu văn bản mờ, hãy cân nhắc tiền xử lý (tăng độ tương phản, xoay) trước khi đưa vào engine—những điều chỉnh này thường nâng cao tỷ lệ thành công của **trích xuất văn bản từ hình ảnh viết tay**. + +## Bước 4: Chạy quy trình Nhận dạng + +Khi mọi thứ đã sẵn sàng, chúng ta cuối cùng yêu cầu engine thực hiện công việc. Phương thức `recognize()` trả về một đối tượng kết quả chứa chuỗi đã trích xuất và điểm tin cậy. + +```python +# Step 4: Execute the OCR process +handwritten_result = engine.recognize() +``` + +Trong hậu trường, engine chuyển bitmap thành một loạt vector đặc trưng, chạy chúng qua mạng LSTM và ghép các ký tự lại với nhau. Đó là phép màu của **python ocr handwritten recognition**. + +## Bước 5: Hiển thị Văn bản Đã Trích xuất + +Đối tượng kết quả cung cấp thuộc tính `.text` chứa chuỗi Unicode thuần. In ra, ghi vào tệp, hoặc đưa vào pipeline khác—tùy bạn. + +```python +# Step 5: Print the extracted text to the console +print("=== Extracted Text ===") +print(handwritten_result.text) +``` + +### Kết quả Dự kiến + +Nếu ảnh nguồn chứa ghi chú “Buy milk, eggs, and bread”, bạn sẽ thấy kết quả tương tự: + +``` +=== Extracted Text === +Buy milk, eggs, and bread +``` + +Lưu ý cách kết quả giữ lại dấu câu và ngắt dòng (nếu có). Nếu bạn nhận được kết quả vô nghĩa, hãy kiểm tra lại chất lượng ảnh và cờ `enable_handwritten_recognition`. + +## Xử lý Các Trường hợp Gặp lỗi Thông thường + +| Vấn đề | Triệu chứng | Cách khắc phục | +|-------|-------------|----------------| +| Điểm tin cậy thấp | Nhiều “?” hoặc ký tự vô nghĩa | Tăng DPI ảnh lên ≥300, áp dụng nhị phân hoá (`opencv`), hoặc cắt vùng quan tâm. 
| +| Ngôn ngữ hỗn hợp | Kết quả trộn tiếng Anh và ký tự của ngôn ngữ khác | Đặt `engine.ocr_settings.language = "eng"` (hoặc mã ISO khác) trước khi gọi `recognize()`. | +| Tệp lớn | Thời gian xử lý lâu hoặc lỗi bộ nhớ | Thay đổi kích thước ảnh về kích thước hợp lý (ví dụ, chiều rộng tối đa 1200 px) trước khi tải. | +| Thiếu Tesseract | `ImportError` hoặc `FileNotFoundError` | Cài đặt Tesseract riêng và đảm bảo nó có trong PATH hệ thống. | + +Những điều chỉnh này giữ cho quy trình **chuyển đổi ảnh viết tay thành văn bản** của bạn ổn định trên các bộ dữ liệu đa dạng. + +## Toàn bộ Script Bạn Có Thể Chạy Ngay Hôm Nay + +Dưới đây là chương trình hoàn chỉnh, tự chứa, kết hợp tất cả các phần lại. Sao chép vào tệp có tên `handwritten_ocr.py` và chạy `python handwritten_ocr.py`. + +```python +#!/usr/bin/env python3 +""" +handwritten_ocr.py + +A minimal example that demonstrates how to extract text from handwritten image +using Python OCR (handwritten recognition mode). Adjust `image_path` to point +to your own file before running. +""" + +import ocr # pip install ocr + +def main(): + # 1️⃣ Create the OCR engine + engine = ocr.OcrEngine() + + # 2️⃣ Enable the handwritten recognition flag + engine.ocr_settings.enable_handwritten_recognition = True + + # 3️⃣ Load the target image + image_path = "YOUR_DIRECTORY/handwritten_note.jpg" + engine.load_image(image_path) + + # 4️⃣ Run the OCR process + result = engine.recognize() + + # 5️⃣ Output the extracted text + print("=== Extracted Text ===") + print(result.text) + +if __name__ == "__main__": + main() +``` + +Chạy nó, và bạn sẽ thấy văn bản được in ra console—đúng là kết quả bạn cần khi muốn **chuyển đổi ảnh viết tay thành văn bản**. + +## Tiếp tục Khám phá + +Bây giờ bạn đã nắm vững các kiến thức cơ bản của **trích xuất văn bản từ hình ảnh viết tay**, hãy xem xét các bước tiếp theo: + +- **Batch processing:** Xử lý hàng loạt: Duyệt qua một thư mục các ảnh và lưu mỗi kết quả vào tệp CSV. 
+- **Post‑processing:** Hậu xử lý: Sử dụng biểu thức chính quy để làm sạch các lỗi OCR thường gặp (ví dụ, “1” vs “l”). +- **Integration:** Tích hợp: Đưa các chuỗi đã trích xuất vào pipeline Xử lý Ngôn ngữ Tự nhiên để phân tích cảm xúc hoặc trích xuất từ khóa. +- **Alternative libraries:** Thư viện thay thế: Nếu bạn cần độ chính xác cao hơn, khám phá `easyocr` hoặc `pytesseract` với các mô hình LSTM tùy chỉnh—cả hai đều hỗ trợ **python ocr handwritten recognition**. + +Hãy nhớ, chất lượng của ảnh nguồn thường quyết định thành công, vì vậy hãy dành vài phút để tiền xử lý. Một chút nỗ lực thêm ngay bây giờ sẽ tiết kiệm cho bạn rất nhiều thời gian gỡ lỗi sau này. + +## Kết luận + +Chúng tôi đã đi qua một ví dụ hoàn chỉnh, từ đầu đến cuối, cho thấy **cách nhận dạng văn bản viết tay** và, quan trọng hơn, cách **trích xuất văn bản từ hình ảnh viết tay** bằng Python. Bằng cách cài đặt gói `ocr`, bật cờ viết tay, tải ảnh của bạn và gọi `recognize()`, bạn có thể **chuyển đổi ảnh viết tay thành văn bản** chỉ trong vài dòng mã. + +Hãy thử với các ghi chú của bạn, điều chỉnh các bước tiền xử lý, và để OCR thực hiện phần công việc nặng. Nếu gặp khó khăn, hãy xem lại bảng “Xử lý Các Trường hợp Gặp lỗi Thông thường” hoặc thử nghiệm các back‑end OCR thay thế. Chúc lập trình vui vẻ, và hy vọng dữ liệu viết tay của bạn sẽ ngay lập tức có thể tìm kiếm được! + +## Bạn Nên Học Gì Tiếp Theo? + +Các hướng dẫn sau đây bao gồm các chủ đề liên quan chặt chẽ, xây dựng trên các kỹ thuật được trình bày trong hướng dẫn này. Mỗi tài nguyên bao gồm các ví dụ mã hoạt động đầy đủ với giải thích từng bước để giúp bạn nắm vững các tính năng API bổ sung và khám phá các cách triển khai thay thế trong dự án của mình. 
+- [Extract Text from Image with Aspose OCR – Step-by-Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [How to Identify Page Rectangles for OCR Text Recognition in Aspose.OCR](/ocr/english/java/advanced-ocr-techniques/prepare-rectangles-for-ocr/)
+- [How to Use OCR - Recognize an Image without Text Area Detection](/ocr/english/net/image-and-drawing-recognition/recognize-image-without-text-area-detection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/vietnamese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md b/ocr/vietnamese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
new file mode 100644
index 000000000..5fe821f63
--- /dev/null
+++ b/ocr/vietnamese/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/_index.md
@@ -0,0 +1,256 @@
+---
+category: general
+date: 2026-06-06
+description: 'OCR character set guide: learn how to use the OCR engine to list supported
+  characters for the Latin and Cyrillic scripts with full Python examples.'
+draft: false
+keywords:
+- ocr character set
+- use OCR engine
+- supported OCR characters
+- script detection OCR
+- OCR library Python
+language: vi
+og_description: 'OCR character set guide: discover how to use the OCR engine in Python
+  to list supported characters for different scripts, with code and tips.'
+og_title: OCR Character Set – Retrieve and Use OCR Engine
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: 'OCR character set guide: learn how to use OCR engine to list supported
+    characters for Latin and Cyrillic scripts with full Python examples.'
+  headline: OCR Character Set – Retrieve and Use OCR Engine in Python
+  type: TechArticle
+tags:
+- OCR
+- Python
+- Character Sets
+title: OCR Character Set – Retrieve and Use OCR Engine in Python
+url: /vi/python/general/ocr-character-set-retrieve-and-use-ocr-engine-in-python/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# OCR Character Set – Retrieve and Use OCR Engine in Python
+
+Want to know which **OCR character set** your library supports? In this guide we show you how to **use the OCR engine** to query the supported characters for different scripts, and why that matters for real-world projects.
+
+If you have ever stared at a blank screen wondering why certain characters never show up after OCR processing, you are not alone. The answer usually lies in the character set the engine knows. By the end of this guide you will be able to:
+
+* List every character an OCR engine can recognize for a given script.
+* Handle scripts that are not supported.
+* Feed character set information into downstream validation or UI logic.
+
+No black-box tricks here: just plain Python, a few lines of code, and a little explanation.
+
+---
+
+## Prerequisites
+
+Before you begin, make sure you have:
+
+* Python 3.8+ installed (the code uses f‑strings).
+* An OCR library that provides `ocr.OcrEngine`. In this example we assume a package named `ocr` that was installed with `pip install ocr-lib`.
+* Basic familiarity with Python's import system and printing to the console.
+
+If you don't have the library yet, run:
+
+```bash
+pip install ocr-lib
+```
+
+That's it: no binaries or native dependencies are needed for the basic character set queries we will run.
+
+---
+
+## Step 1: Create an OCR Engine Instance
+
+Creating an engine object is the first thing you do when you **use the OCR engine**.
Think of it as switching on a scanner; without it, you cannot ask any questions.
+
+```python
+import ocr
+
+# Step 1: Create an OCR engine instance
+engine = ocr.OcrEngine()
+```
+
+The `OcrEngine` class is a lightweight wrapper around the underlying recognition engine. It does not start any heavy processing until you ask for it, so the startup cost is negligible.
+
+> **Pro tip:** If your application would otherwise create many engine instances, reuse a single one. This saves memory and reduces initialization latency.
+
+---
+
+## Step 2: Query the OCR Character Set for a Specific Script
+
+Now that we have an engine, we can ask which characters it knows for a given writing system. The `get_supported_characters(script_name)` method returns a Python `list` of Unicode characters.
+
+```python
+# Step 2: Get the list of characters the engine can recognize for the Latin script
+latin_chars = engine.get_supported_characters("Latin")
+print(f"Supported Latin chars ({len(latin_chars)}): {latin_chars}")
+```
+
+Running the code above prints something like:
+
+```
+Supported Latin chars (26): ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+```
+
+The exact output depends on the version of the OCR library, but you will always get a flat list of characters. If you need both upper and lower case, request the `"Latin"` script and apply `.lower()` to each element, or query a separate `"Latin‑lower"` script if the library distinguishes them.
+
+### Why does this matter?
+
+When you feed in an image containing unusual diacritics (for example, “ñ” or “ø”), the OCR engine may silently replace them with a placeholder character or drop them entirely. Knowing the **OCR character set** in advance lets you validate input, warn users, or switch to a different engine.
+
+---
+
+## Step 3: Get Characters for Another Script – Cyrillic Example
+
+This method works with any script the engine claims to support.
Let's see how it behaves with Cyrillic.
+
+```python
+# Step 3: Get the list of characters the engine can recognize for the Cyrillic script
+cyrillic_chars = engine.get_supported_characters("Cyrillic")
+print(f"Supported Cyrillic chars ({len(cyrillic_chars)}): {cyrillic_chars}")
+```
+
+Typical output:
+
+```
+Supported Cyrillic chars (33): ['А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ё', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я']
+```
+
+If the engine does **not** support the requested script, it typically raises a `ValueError` (or returns an empty list, depending on the implementation). Guard your code against that case.
+
+```python
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+# Example usage:
+safe_get_characters(engine, "Arabic")   # Might trigger the error branch
+```
+
+---
+
+## Step 4: Visualize the OCR Character Set (Optional)
+
+Sometimes a quick picture helps, especially when you need to show stakeholders which characters are covered. Below is a minimal example that uses `matplotlib` to draw a character grid.
+
+```python
+import matplotlib.pyplot as plt
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Visualize Latin characters
+plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+> **Image alt text:** ![OCR character set](ocr_character_set.png){alt="OCR character set overview"}
+
+The plot is not required for the core solution, but it illustrates how you can **use the OCR engine** to surface metadata beyond printing to the console.
+
+---
+
+## Step 5: Integrate the Character Set into a Real Workflow
+
+Now that we can retrieve the **OCR character set**, let's look at a practical scenario: validating OCR output before saving it to a database.
+
+```python
+def validate_ocr_output(text, allowed_chars):
+    """Return True if every character in `text` exists in `allowed_chars`."""
+    invalid = [c for c in text if c not in allowed_chars]
+    if invalid:
+        print(f"Invalid characters found: {invalid}")
+        return False
+    return True
+
+# Suppose we ran OCR on an image and got this result:
+ocr_result = "HELLO WORLD!"
+# With the 26-letter sample list shown earlier, the space and "!" are reported as invalid.
+
+# Use the previously fetched Latin set for validation:
+if validate_ocr_output(ocr_result, set(latin_chars)):
+    print("All characters are supported – safe to store.")
+else:
+    print("Result contains unsupported symbols – consider manual review.")
+```
+
+This pattern keeps junk data out of downstream pipelines, a common problem when working with multilingual documents.
+
+---
+
+## Common Pitfalls and Edge Cases
+
+| Pitfall | Cause | How to avoid it |
+|---------|-------|-----------------|
+| **Wrong script name** (for example, `"Cyrillic "` with a trailing space) | The engine treats the string literally and cannot find the script.
| Strip whitespace with `script.strip()` before calling `get_supported_characters`. |
+| **Empty character list** | Some engines declare a script but have not loaded its language model. | Call `engine.load_language_model(script)` if the library offers such a method, or make sure the model files are present. |
+| **Unicode normalization issues** | Characters such as “é” can appear in composed (`\u00E9`) or decomposed (`e\u0301`) form. | Normalize strings with `unicodedata.normalize('NFC', text)` before validating. |
+| **Performance with large scripts** (for example, Chinese) | Retrieving thousands of characters can be slow. | Cache the result after the first call; most applications only need the list once. |
+
+---
+
+## Complete Example
+
+Putting it all together, here is a single script you can copy, paste, and run:
+
+```python
+import ocr
+import matplotlib.pyplot as plt
+
+def safe_get_characters(engine, script):
+    try:
+        chars = engine.get_supported_characters(script)
+        if not chars:
+            print(f"No characters returned for script '{script}'.")
+        else:
+            print(f"Supported {script} chars ({len(chars)}): {chars}")
+        return chars
+    except Exception as e:
+        print(f"Error retrieving characters for script '{script}': {e}")
+        return []
+
+def plot_character_grid(chars, title="OCR Character Set"):
+    cols = 10
+    rows = (len(chars) + cols - 1) // cols
+    fig, ax = plt.subplots(figsize=(cols, rows))
+    ax.set_axis_off()
+    for idx, ch in enumerate(chars):
+        row = idx // cols
+        col = idx % cols
+        ax.text(col / cols, 1 - row / rows, ch, fontsize=14, ha='center', va='center')
+    plt.title(title)
+    plt.show()
+
+# Minimal driver assembled from Steps 1 to 4 above
+engine = ocr.OcrEngine()
+latin_chars = safe_get_characters(engine, "Latin")
+if latin_chars:  # skip plotting when nothing was returned
+    plot_character_grid(latin_chars, title="Latin OCR Character Set")
+```
+
+
+## What Should You Learn Next?
+
+
+The following guides cover closely related topics that build on the techniques presented in this guide. Each resource includes complete code examples with step-by-step explanations to help you master additional API features and explore alternative implementations in your own projects.
+- [Aspose OCR Tutorial – Optical Character Recognition](/ocr/english/)
+- [OCR Post-Processing – Get Choices for Recognized Characters](/ocr/english/net/text-recognition/get-choices-for-recognized-characters/)
+- [How to Set the Threshold Value in OCR Image Recognition](/ocr/english/net/ocr-settings/set-threshold-value/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/vietnamese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md b/ocr/vietnamese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
new file mode 100644
index 000000000..e472eef7e
--- /dev/null
+++ b/ocr/vietnamese/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/_index.md
@@ -0,0 +1,257 @@
+---
+category: general
+date: 2026-06-06
+description: Perform OCR on an image using Aspose OCR and a Hugging Face model.
+  Learn how to download a Hugging Face model, extract text from an invoice, and free
+  GPU resources.
+draft: false
+keywords:
+- perform OCR on image
+- download Hugging Face model
+- extract text from invoice
+- free GPU resources
+- load image for OCR
+language: vi
+og_description: Perform OCR on an image using Aspose OCR and a Hugging Face model.
+  This guide shows how to download the model, extract text from an invoice, and free
+  GPU resources.
+og_title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: Perform OCR on image using Aspose OCR and a Hugging Face model. Learn
+    how to download Hugging Face model, extract text from invoice, and free GPU resources.
+  headline: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+  type: TechArticle
+tags:
+- OCR
+- Aspose
+- Python
+- AI
+- HuggingFace
+title: Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+url: /vi/python/general/perform-ocr-on-image-with-aspose-ocr-llm-complete-guide/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Perform OCR on Image with Aspose OCR & LLM – Complete Guide
+
+Have you ever wanted to **perform OCR on image files** but weren't sure where to start? You are not alone; many developers hit this wall when they first approach document automation. The good news is that with Aspose OCR and a lightweight LLM from Hugging Face, you can turn a raw invoice scan into clean, searchable text in just a few lines of Python.
+
+In this tutorial we walk through everything you need: from **loading an image for OCR**, to **downloading a Hugging Face model**, to **extracting text from invoice data**, and finally **freeing GPU resources** so your application stays lean. By the end you will have a self-contained script you can drop into any project.
+
+---
+
+## What You Will Learn
+
+- How to **perform OCR on an image** with Aspose's `OcrEngine`.
+- The exact steps to **download a Hugging Face model** automatically.
+- Techniques for **extracting text from invoice** PDFs or PNGs with AI post-processing.
+- Best practices for **freeing GPU resources** after inference.
+- Tips for **loading an image for OCR** efficiently and avoiding common mistakes.
+
+No external documentation is required: everything you need is here, with full code, explanations, and expected output.
+
+---
+
+## Prerequisites
+
+Before you begin, make sure you have:
+
+| Requirement | Reason |
+|-------------|--------|
+| Python 3.9+ | Modern syntax and type hint support |
+| `asposeocr` package (`pip install asposeocr`) | The core OCR engine |
+| GPU access (optional but recommended) | Speeds up the LLM post-processor |
+| Invoice image (`sample_invoice.png`) | A realistic test case |
+
+If any item is missing, install it now; the script will also **download the Hugging Face model automatically**, so you don't have to hunt for files yourself.
+
+---
+
+## Step 1: Perform OCR on Image – Create the Engine
+
+The first thing to do is instantiate the Aspose OCR engine. Think of it as a blank canvas on which the image will be “drawn” into text.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# Step 1: Initialize the OCR engine
+engine = ocr.OcrEngine()
+```
+
+> **Why this matters:** `OcrEngine` hides all the low-level image preprocessing steps so you can focus on the high-level workflow. It also exposes a `set_post_processor` method that lets you attach an LLM for smarter output.
+
+---
+
+## Step 2: Load Image for OCR – Pick the Right File
+
+With the engine ready, we need to **load the image for OCR**. Aspose supports PNG, JPG, TIFF, and a few other formats. Make sure the path is absolute or relative to the directory you run the script from.
+
+```python
+# Step 2: Load the image you want to process
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+```
+
+> **Tip:** If your image is very large, consider downscaling it first to reduce memory pressure. The OCR engine can handle high-resolution scans, but 300 DPI images are usually the sweet spot for invoices.
+
+---
+
+## Step 3: Run Raw OCR and Inspect the Extracted Text
+
+With the image loaded, we can finally **perform OCR on the image** and see what the engine returns. This step provides a baseline before we add the AI “magic”.
+
+```python
+# Step 3: Run the built‑in OCR and capture raw results
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+```
+
+**Expected output (truncated):**
+
+```
+Raw text:
+Invoice #12345
+Date: 2024‑04‑01
+Total: $1,250.00
+...
+```
+
+Raw output often contains stray line breaks, misrecognized characters, or missing fields, which is exactly why we bring in a language model next.
+
+---
+
+## Step 4: Download Hugging Face Model – Configure the AI Post-Processor
+
+This is where **downloading the Hugging Face model** shines. Aspose AI can automatically pull a model from the Hugging Face hub if it is not already on disk. We will use the Qwen2.5‑3B‑Instruct‑GGUF model, which balances accuracy and memory footprint.
+
+```python
+# Step 4: Set up the AI model configuration
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",          # automatically fetch model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",    # int8 quantization cuts RAM usage dramatically
+    gpu_layers=20,                       # first 20 layers run on GPU, rest on CPU
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+```
+
+> **Why this works:** `allow_auto_download` saves you from manually downloading the `.gguf` file. Quantization (`int8`) shrinks the model to roughly 3 GB, which lets it run on most consumer GPUs. Adjust `gpu_layers` to match your hardware: the more layers on the GPU, the faster the inference.
+
+---
+
+## Step 5: Extract Text from Invoice with AI Post-Processing
+
+Now we attach the LLM to the OCR engine and run the **post-processor** to clean up the raw output, fix OCR errors, and format the invoice fields neatly.
+
+```python
+# Step 5: Initialize the AI engine and bind it to the OCR engine
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)          # implicit when using set_post_processor in newer versions
+engine.set_post_processor(ai_engine)
+
+# Run the AI‑enhanced post‑processor
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+```
+
+**Example of improved output:**
+
+```
+Enhanced text:
+Invoice Number: 12345
+Date: 2024-04-01
+Bill To: Acme Corp.
+Amount Due: $1,250.00
+Status: Paid
+```
+
+> **What happened?** The LLM recognized that “Invoice #12345” should become “Invoice Number: 12345”, fixed the date format, and even inferred the “Bill To” field that the raw engine missed. This is the core of automated **text extraction from invoices**.
+
+---
+
+## Step 6: Free GPU Resources – Clean Up After Processing
+
+If you run this script inside a long-lived service (for example, a Flask API), you must **free GPU resources** after each inference to avoid out-of-memory errors. Aspose AI provides a simple method for this.
+
+```python
+# Step 6: Release GPU memory and dispose of the engine
+ai_engine.free_resources()   # frees GPU layers and model tensors
+engine.dispose()             # cleans up internal buffers
+```
+
+> **Pro tip:** Call `free_resources()` in a `finally:` block if you wrap the OCR calls in `try/except`. This guarantees cleanup even when an exception occurs.
+
+---
+
+## Step 7: Full Script – Putting It All Together
+
+Below is the complete, ready-to-run script. Copy and paste it, adjust the paths, and you are good to go.
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Create OCR engine
+engine = ocr.OcrEngine()
+
+# 2️⃣ Load image for OCR
+engine.load_image("YOUR_DIRECTORY/sample_invoice.png")
+
+# 3️⃣ Raw OCR
+raw_result = engine.recognize()
+print("Raw text:")
+print(raw_result.text)
+
+# 4️⃣ Download Hugging Face model (auto‑download enabled)
+ai_cfg = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/asp_ocr_models"
+)
+
+# 5️⃣ Attach AI post‑processor and enhance text
+ai_engine = AsposeAI()
+ai_engine.initialize(ai_cfg)          # optional in newer versions
+engine.set_post_processor(ai_engine)
+
+enhanced_result = engine.run_postprocessor(raw_result)
+print("\nEnhanced text:")
+print(enhanced_result.text)
+
+# 6️⃣ Free GPU resources
+ai_engine.free_resources()
+engine.dispose()
+```
+
+Run the script and watch the transformation from noisy OCR to clean, structured invoice data. 🎉
+
+---
+
+## Frequently Asked Questions & Edge Cases
+
+| Question | Answer |
+|----------|--------|
+| **What if the model fails to download?** | Make sure your machine has an internet connection and that `hugging_face_repo_id` is correct. You can also download the model file manually. |
+
+## What Should You Learn Next?
+
+
+The following tutorials cover closely related topics that build on the techniques presented in this guide. Each resource includes complete sample code and detailed step-by-step explanations to help you master additional API features and explore alternative implementations in your own projects.
+- [Extract Image Text in C# with Language Selection Using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/)
+- [Extract Text from Images – OCR Optimization with Aspose.OCR for .NET](/ocr/english/net/ocr-optimization/)
+- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file
diff --git a/ocr/vietnamese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md b/ocr/vietnamese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
new file mode 100644
index 000000000..eb046d495
--- /dev/null
+++ b/ocr/vietnamese/python/general/recognize-text-from-image-using-aspose-ocr-ai/_index.md
@@ -0,0 +1,283 @@
+---
+category: general
+date: 2026-06-06
+description: Recognize text from image with Aspose OCR – learn how to load an image
+  for OCR and perform OCR on the image using AI post-processing in Python.
+draft: false
+keywords:
+- recognize text from image
+- perform ocr on image
+- load image for ocr
+language: vi
+og_description: Recognize text from an image quickly. This guide shows how to load
+  an image for OCR, perform OCR on the image, and improve the results with AI
+  post-processing.
+og_title: Recognize Text from Image Using Aspose OCR & AI
+schemas:
+- author: Aspose
+  dateModified: '2026-06-06'
+  description: recognize text from image with Aspose OCR – learn how to load image
+    for OCR and perform OCR on image using AI post‑processing in Python.
+  headline: recognize text from image using Aspose OCR & AI
+  type: TechArticle
+- description: recognize text from image with Aspose OCR – learn how to load image
+    for OCR and perform OCR on image using AI post‑processing in Python.
+  name: recognize text from image using Aspose OCR & AI
+  steps:
+  - name: Prerequisites
+    text: '- Python 3.8 or newer - `asposeocr` package (`pip install asposeocr`) -
+      An image file (e.g., `doc.png`) that contains printed or handwritten text -
+      Optional: a GPU for faster LLM inference (the script works on CPU too)'
+  - name: Install and import the required modules
+    text: '```python # Install the library (run once in your environment) # pip install
+      asposeocr'
+  - name: Create the OCR engine and enable handwritten text recognition
+    text: '```python # Initialise the OCR engine ocr_engine = ocr.OcrEngine()'
+  - name: Load the image for OCR
+    text: '```python # Point the engine at the file you want to process ocr_engine.load_image("YOUR_DIRECTORY/doc.png")
+      ```'
+  - name: Run the raw OCR pass
+    text: '```python # Execute the recognition pipeline and capture the raw result
+      raw_result = ocr_engine.recognize() ```'
+  - name: Configure the Aspose AI model for LLM post‑processing
+    text: '```python ai_config = AsposeAIModelConfig( allow_auto_download="true",
+      # Pull the model if missing hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+      hugging_face_quantization="int8", # Reduce RAM usage gpu_layers=20, # Use 20
+      GPU layers if a GPU is present directory_model_path="YOUR_DIRECTORY/mo'
+  - name: Initialise the AI processor and attach it as a post‑processor
+    text: '```python # Instantiate the AI processor with the configuration above ai_processor
+      = AsposeAI() ai_processor.initialize(ai_config)'
+  - name: Run the post‑processor to enhance the OCR output
+    text: '```python # Apply AI‑driven post‑processing enhanced_result = ocr_engine.run_postprocessor(raw_result)'
+  - name: Release resources
+    text: '```python # Free GPU/CPU memory held by the AI model ai_processor.free_resources()'
+  type: HowTo
+- questions:
+  - answer: No. The script falls back to CPU, but inference will be slower.
+    question: Do I need a GPU?
+  - answer: Absolutely. Just change `hugging_face_repo_id` and adjust `gpu_layers`
+      accordingly.
+    question: Can I use a different LLM?
+  - answer: Resize it first (e.g., using Pillow) to keep memory usage reasonable.
+    question: What if my image is huge?
+  - answer: You can toggle `enable_handwritten_recognition` depending on your workload.
+    question: Is handwritten recognition always on?
+  type: FAQPage
+tags:
+- OCR
+- Aspose
+- Python
+title: Recognize Text from Image Using Aspose OCR & AI
+url: /vi/python/general/recognize-text-from-image-using-aspose-ocr-ai/
+---
+
+{{< blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/pf/main-container >}}
+{{< blocks/products/pf/tutorial-page-section >}}
+
+# Recognize Text from Image Using Aspose OCR & AI
+
+Have you ever needed to recognize text from an image but weren't sure which library would give you both speed and accuracy? You are not alone. In this guide we walk through a complete, end-to-end example that shows **how to load an image for OCR**, **perform OCR on the image**, and then refine the result with Aspose's AI post-processor. By the end you will have a ready-to-run script that turns a PNG into clean, searchable text.
+
+## What You Will Learn
+
+We cover everything from installing the Aspose OCR package to freeing resources at the end. You will see why enabling handwritten text recognition matters, how to configure the Qwen 2.5 LLM for post-processing, and what the final output looks like. No external references are needed: just copy, paste, and run.
+
+### Prerequisites
+
+- Python 3.8 or newer
+- The `asposeocr` package (`pip install asposeocr`)
+- An image file (for example, `doc.png`) that contains printed or handwritten text
+- Optional: a GPU for faster LLM inference (the script also runs on CPU)
+
+---
+
+## Recognize Text from Image – The Steps
+
+Below each code block you'll find a short explanation of **why** we're doing that particular action, not merely **what** the line does.
+
+### Step 1: Install and Import the Required Modules
+
+```python
+# Install the library (run once in your environment)
+# pip install asposeocr
+
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+```
+
+*Why?* Importing `asposeocr` gives us the `OcrEngine` class, while the `ai` submodule provides the LLM-based post-processor that significantly improves the raw OCR output.
+
+### Step 2: Create the OCR Engine and Enable Handwritten Text Recognition
+
+```python
+# Initialise the OCR engine
+ocr_engine = ocr.OcrEngine()
+
+# Turn on handwritten‑text support – useful for notes, signatures, etc.
+ocr_engine.ocr_settings.enable_handwritten_recognition = True
+```
+
+Enabling handwriting recognition broadens the engine's character set, so you won't lose pen strokes when you **perform OCR on an image** that mixes printed and handwritten text.
+
+### Step 3: Load the Image for OCR
+
+```python
+# Point the engine at the file you want to process
+ocr_engine.load_image("YOUR_DIRECTORY/doc.png")
+```
+
+The `load_image` call is where you **load the image for OCR**; if the path is wrong, the engine raises an informative exception, which saves you from cryptic errors in later steps.
+
+### Step 4: Run the Raw OCR Pass
+
+```python
+# Execute the recognition pipeline and capture the raw result
+raw_result = ocr_engine.recognize()
+```
+
+At this stage you get a `RecognitionResult` object containing the unfiltered text, confidence scores, and layout metadata. It is often noisy, hence the AI cleanup that follows.
+
+### Step 5: Configure the Aspose AI Model for LLM Post-Processing
+
+```python
+ai_config = AsposeAIModelConfig(
+    allow_auto_download="true",               # Pull the model if missing
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",         # Reduce RAM usage
+    gpu_layers=20,                            # Use 20 GPU layers if a GPU is present
+    directory_model_path="YOUR_DIRECTORY/models/ocr_ai"
+)
+```
+
+Why care about these settings?
+- **auto-download** ensures the model is available on the first run.
+- **int8 quantization** significantly reduces memory needs without sacrificing much accuracy.
+- **gpu_layers** lets you use a compatible GPU to speed up inference.
+
+### Step 6: Initialise the AI Processor and Attach It as a Post-Processor
+
+```python
+# Instantiate the AI processor with the configuration above
+ai_processor = AsposeAI()
+ai_processor.initialize(ai_config)
+
+# Hook the AI processor into the OCR engine
+ocr_engine.set_post_processor(ai_processor, None)
+```
+
+Attaching the processor means that every time you call `run_postprocessor`, the LLM fixes spelling mistakes, rejoins split words, and even guesses missing punctuation.
+
+### Step 7: Run the Post-Processor to Enhance the OCR Output
+
+```python
+# Apply AI‑driven post‑processing
+enhanced_result = ocr_engine.run_postprocessor(raw_result)
+
+# Print the polished text
+print(enhanced_result.text)
+```
+
+`enhanced_result.text` is usually far more readable than the raw string; think of it as a spell checker that also understands context.
+
+### Step 8: Release Resources
+
+```python
+# Free GPU/CPU memory held by the AI model
+ai_processor.free_resources()
+
+# Dispose of the OCR engine to close file handles, etc.
+ocr_engine.dispose()
+```
+
+Cleanup is crucial in long-running services; without it, you will leak GPU memory and eventually crash the application.
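
The cleanup advice in Step 8 can be sketched as a small context manager that runs the release calls even when recognition fails. This is a hedged illustration, not part of the Aspose API: `released` and `FakeAiProcessor` are hypothetical names introduced here, and the fake class stands in for the real AI processor so the snippet runs without `asposeocr` installed. The only assumption about the real objects is that they expose zero-argument cleanup methods such as the `free_resources()` and `dispose()` calls shown in Step 8.

```python
from contextlib import contextmanager


@contextmanager
def released(resource, *cleanup_methods):
    """Yield `resource`, then call each named cleanup method, even on error."""
    try:
        yield resource
    finally:
        for name in cleanup_methods:
            # Each name is assumed to be a zero-argument method on the resource.
            getattr(resource, name)()


class FakeAiProcessor:
    """Hypothetical stand-in for the AI processor; it only tracks cleanup."""

    def __init__(self):
        self.freed = False

    def free_resources(self):
        self.freed = True


processor = FakeAiProcessor()
try:
    with released(processor, "free_resources"):
        # Simulate a failure in the middle of the OCR pipeline.
        raise RuntimeError("simulated OCR failure")
except RuntimeError as exc:
    print(f"Handled: {exc}")

# Cleanup ran even though the block raised.
print(processor.freed)
```

With the real objects, the same helper could wrap the recognition calls, for example `with released(ai_processor, "free_resources"), released(ocr_engine, "dispose"):`, assuming both methods behave as shown in Step 8.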
+
+---
+
+## The Full Script You Can Run Today
+
+```python
+import asposeocr as ocr
+from asposeocr.ai import AsposeAI, AsposeAIModelConfig
+
+# 1️⃣ Initialise OCR engine with handwritten support
+ocr_engine = ocr.OcrEngine()
+ocr_engine.ocr_settings.enable_handwritten_recognition = True
+
+# 2️⃣ Load the image you want to analyse
+ocr_engine.load_image("YOUR_DIRECTORY/doc.png")   # <-- replace with your path
+
+# 3️⃣ Perform the basic OCR pass
+raw_result = ocr_engine.recognize()
+
+# 4️⃣ Set up the AI model for smarter output
+ai_config = AsposeAIModelConfig(
+    allow_auto_download="true",
+    hugging_face_repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF",
+    hugging_face_quantization="int8",
+    gpu_layers=20,
+    directory_model_path="YOUR_DIRECTORY/models/ocr_ai"
+)
+
+# 5️⃣ Initialise and attach the AI post‑processor
+ai_processor = AsposeAI()
+ai_processor.initialize(ai_config)
+ocr_engine.set_post_processor(ai_processor, None)
+
+# 6️⃣ Enhance the OCR result with the LLM
+enhanced_result = ocr_engine.run_postprocessor(raw_result)
+
+# 7️⃣ Show the final, cleaned‑up text
+print("=== Enhanced OCR Output ===")
+print(enhanced_result.text)
+
+# 8️⃣ Clean up
+ai_processor.free_resources()
+ocr_engine.dispose()
+```
+
+**Expected output** (example for a simple invoice image):
+
+```
+=== Enhanced OCR Output ===
+Invoice #12345
+Date: 2024‑04‑01
+Total Amount: $1,250.00
+Thank you for your business!
+```
+
+Notice how the AI layer corrected “Inv0ice” → “Invoice” and added the missing punctuation.
+
+---
+
+## Frequently Asked Questions (and Quick Answers)
+
+- **Do I need a GPU?** No. The script falls back to CPU, but inference will be slower.
+- **Can I use a different LLM?** Absolutely. Just change `hugging_face_repo_id` and adjust `gpu_layers` accordingly.
+- **What if my image is huge?** Resize it first (for example, with Pillow) to keep memory usage reasonable.
+- **Is handwriting recognition always on?** No. Toggle `enable_handwritten_recognition` to suit your workload.
+
+---
+
+## Conclusion
+
+You now know how to **recognize text from an image** with Aspose OCR, how to **load an image for OCR**, and how to **perform OCR on an image** with AI-enhanced post-processing. The complete, runnable example above gives you a solid base for integrating OCR into any Python project, whether you are scanning receipts, digitizing contracts, or extracting data from handwritten forms.
+
+Ready for the next step? Try swapping the Qwen model for a larger one, experiment with different quantization schemes, or chain several images together for batch processing. The script above is a reasonable starting point for each of these.
+
+Happy coding, and may your OCR results always be crystal clear!
+
+![Screenshot of a Python console showing the enhanced OCR output](/images/ocr_output.png){alt="Screenshot showing how to recognize text from an image using Aspose OCR"}
+
+## What should you learn next?
+
+The following tutorials cover closely related topics that build on the techniques shown in this guide. Each includes complete code examples with step-by-step explanations to help you master additional API features and explore alternative implementations in your own projects.
+
+- [Extract Text from Image with Aspose OCR – Step-by-Step Guide](/ocr/english/python/general/extract-text-from-image-with-aspose-ocr-step-by-step-guide/)
+- [Convert Image to Text – Perform OCR on Image from URL](/ocr/english/net/ocr-optimization/perform-ocr-on-image-from-url/)
+- [Extract Image Text in C# with Language Selection Using Aspose.OCR](/ocr/english/net/ocr-configuration/ocr-operation-with-language-selection/)
+
+{{< /blocks/products/pf/tutorial-page-section >}}
+{{< /blocks/products/pf/main-container >}}
+{{< /blocks/products/pf/main-wrap-class >}}
+{{< blocks/products/products-backtop-button >}}
\ No newline at end of file