GLM-OCR by AkshitMaheshwari · Pull Request #19 · kaws26/TaxAI

AkshitMaheshwari · 2026-04-03T18:42:58Z

No description provided.

vercel · 2026-04-03T18:43:05Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
tax-ai	Ready	Preview, Comment	Apr 3, 2026 7:19pm

Copilot

Pull request overview

This PR switches the backend image OCR implementation from Tesseract/pytesseract to the HuggingFace zai-org/GLM-OCR model, and updates frontend environment configuration to point at the backend running on port 8000.

Changes:

Replace Tesseract-based OCR with GLM-OCR (Transformers) inference and add model-loading caching.
Update backend dependencies to include transformers, torch, and accelerate.
Adjust frontend .env.example / .env API URL values and add myenv/ to backend .gitignore.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.

File	Description
Frontend/.env.example	Updates example API base URL to `http://127.0.0.1:8000`.
Frontend/.env	Changes committed Vite API base URL to localhost.
Backend/services/image_ocr.py	Replaces pytesseract OCR pipeline with GLM-OCR model inference.
Backend/requirements.txt	Adds Transformers + Torch + Accelerate dependencies for GLM-OCR.
Backend/.gitignore	Ignores `myenv/` directory.

Comments suppressed due to low confidence (1)

Backend/services/image_ocr.py:10

ImageOps is imported but no longer used after switching away from the pre-processing path. Please remove the unused import to keep the module clean (and avoid lint/test failures if the repo enforces them).


from PIL import Image, ImageOps

Copilot · 2026-04-03T18:46:15Z

 import os
+import tempfile
+import uuid


tempfile is imported but never used. Either remove the import or use tempfile.NamedTemporaryFile/mkstemp for the temporary PNG (which would also avoid writing into the app’s working directory).

Copilot · 2026-04-03T18:46:16Z


 from runtime import get_runtime_config
 from services.document_ingestion import DocumentValidationError, parse_document
 from services.groq_ai import extract_csv_from_ocr_text, groq_status


get_runtime_config is now unused in this module (the prior Tesseract config path was removed). Please drop the unused import to avoid confusion and potential linting failures.

Copilot · 2026-04-03T18:46:16Z

+_processor = None
+_model = None
+
+def _load_glm_ocr():
+    global _processor, _model
+    if _processor is None or _model is None:
+        from transformers import AutoProcessor, AutoModelForImageTextToText


Model initialization is guarded only by _processor is None or _model is None with module-level globals. Under concurrent requests, two threads can enter _load_glm_ocr() simultaneously and race while downloading/loading the model, causing excessive memory use or intermittent failures. Consider protecting initialization with a threading.Lock (or initializing once during app startup/lifespan).

Copilot · 2026-04-03T18:46:16Z

 def ocr_status() -> dict[str, Any]:
-    config = get_runtime_config()
-    tesseract_cmd = str(getattr(config, "TESSERACT_CMD", "")).strip()
     return {
-        "pytesseract_installed": pytesseract is not None,
-        "tesseract_cmd_configured": bool(tesseract_cmd),
+        "glm_ocr_configured": True,
         "groq": groq_status(),
     }


ocr_status() reports "glm_ocr_configured": True unconditionally, even if transformers/torch aren’t installed or the model can’t be loaded/downloaded. This can mislead the /options API and the frontend. Consider making this flag reflect reality (e.g., attempt a lightweight import and/or _load_glm_ocr() in a try/except and return configured: False plus an error detail when it fails).
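A lightweight version of this check could use `importlib.util.find_spec`, which reports whether a package is importable without actually importing it. This is a sketch under assumptions: the helper name `glm_ocr_available` and the `glm_ocr_error` key are hypothetical, and the check does not prove the model weights can be downloaded or loaded.

```python
import importlib.util
from typing import Any, Optional


def glm_ocr_available(required=("transformers", "torch")) -> tuple[bool, Optional[str]]:
    """Return (ok, error) depending on whether the OCR packages are importable."""
    missing = [name for name in required if importlib.util.find_spec(name) is None]
    if missing:
        return False, "missing packages: " + ", ".join(missing)
    return True, None


def ocr_status() -> dict[str, Any]:
    configured, error = glm_ocr_available()
    status: dict[str, Any] = {"glm_ocr_configured": configured}
    if error:
        # Hypothetical extra field so the frontend can show why OCR is off.
        status["glm_ocr_error"] = error
    # The PR's version also returns "groq": groq_status(); it is omitted here
    # only to keep the sketch self-contained.
    return status
```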

Copilot · 2026-04-03T18:46:16Z

+    temp_filename = f"{uuid.uuid4().hex}.png"
     try:
         image = Image.open(BytesIO(image_bytes))
+        image.save(temp_filename, format="PNG")
     except Exception as exc:
+        if os.path.exists(temp_filename):
+            os.remove(temp_filename)
         raise OcrConversionError(f"Uploaded file is not a readable image: {exc}") from exc

-    # Grayscale + auto-contrast usually improves OCR quality on scans.
-    processed = ImageOps.autocontrast(ImageOps.grayscale(image))
-
     try:
-        text = pytesseract.image_to_string(processed)
+        messages = [
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "image",
+                        "url": temp_filename
+                    },


The OCR path writes a UUID-named PNG into the current working directory and then passes a relative path (url: temp_filename) into the processor. In deployed environments the CWD may be read-only or different than expected, which can cause OCR to fail. Prefer creating the file in the OS temp directory (e.g., via tempfile.NamedTemporaryFile(dir=tempfile.gettempdir(), ...)) and pass an absolute path to the model/processor.
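The temp-directory approach might look like the sketch below. It uses `tempfile.mkstemp`, which creates the file in the OS temp directory and returns an absolute path. The context manager name `temp_image_path` is hypothetical, and the sketch writes the uploaded bytes as-is; in the PR, the `image.save(path, format="PNG")` call would take the place of the raw write.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_image_path(image_bytes: bytes, suffix: str = ".png"):
    """Write bytes to a file in the OS temp dir and yield its absolute path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(image_bytes)
        yield path
    finally:
        # Always clean up, whether the OCR step succeeded or raised.
        if os.path.exists(path):
            os.remove(path)
```

The yielded path would then be passed as the `url` value in the message content, and the file is removed when the `with` block exits.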

Copilot · 2026-04-03T18:46:16Z

+        inputs.pop("token_type_ids", None)
+        generated_ids = _model.generate(**inputs, max_new_tokens=8192)
+        output_text = _processor.decode(generated_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False)
+
+        normalized = "\n".join(line.rstrip() for line in output_text.splitlines())
+        if len(normalized.strip()) < 20:
+            raise OcrConversionError(


_processor.decode(..., skip_special_tokens=False) will include the model’s special/control tokens in the OCR output. Since the downstream Groq extraction expects plain OCR text, these tokens can pollute the prompt and reduce extraction quality. Consider decoding with skip_special_tokens=True (or explicitly stripping known template tokens) before normalization.
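Passing `skip_special_tokens=True` to `decode` is the simpler fix. If explicit stripping is preferred, a post-processing step could look like the sketch below. It assumes control tokens have the form `<|name|>`; that pattern is an assumption, not something confirmed for GLM-OCR, and the actual token set should be read from the tokenizer. The function name `clean_ocr_text` is hypothetical.

```python
import re

# Assumption: control tokens look like <|name|>. Check the tokenizer's
# special-token list before relying on this pattern.
_SPECIAL_TOKEN = re.compile(r"<\|[^|<>]*\|>")


def clean_ocr_text(decoded: str) -> str:
    """Drop template tokens, then apply the PR's trailing-whitespace normalization."""
    without_tokens = _SPECIAL_TOKEN.sub("", decoded)
    return "\n".join(line.rstrip() for line in without_tokens.splitlines())
```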

Copilot · 2026-04-03T18:46:17Z

@@ -1,2 +1,2 @@
 # Backend API URL
-VITE_API_URL=https://taxai-77xc.onrender.com
+VITE_API_URL=http://127.0.0.1:8000


Committing Frontend/.env with VITE_API_URL=http://127.0.0.1:8000 will bake a localhost API URL into any production build that uses repository .env defaults, breaking deployments (and overriding the fallback onrender URL in src/utils/api.js). Typically .env should be untracked and .env.example used for documentation; consider removing Frontend/.env from version control (and adding it to Frontend/.gitignore) or restoring the deployed API URL here.

Suggested change

-VITE_API_URL=http://127.0.0.1:8000
+# Leave VITE_API_URL unset here so the app can use its built-in fallback URL.
+# For local development, define VITE_API_URL in an untracked local env file instead.

Copilot · 2026-04-03T18:46:17Z

+transformers
+torch
+accelerate


Adding transformers, torch, and accelerate significantly increases build size and may fail to install in some deployment targets without extra system dependencies / correct wheels. Consider documenting the required runtime (CPU vs GPU) and deployment implications, or making these optional extras so environments that don’t use OCR don’t have to install them.

Suggested change

-transformers
-torch
-accelerate
+# Optional OCR/ML dependencies:
+# Install these only in deployments that need OCR features, as they
+# significantly increase build size and may require CPU/GPU-specific wheels.
+# Example:
+#   pip install transformers torch accelerate

GLM-OCR

a9838ae

Copilot AI review requested due to automatic review settings April 3, 2026 18:42

kaws26 deployed to akshit - TaxAI PR #19 April 3, 2026 18:43 with Render

Copilot started reviewing on behalf of AkshitMaheshwari April 3, 2026 18:43

Copilot AI reviewed Apr 3, 2026

feat: Update API integration to use centralized API_BASE_URL and enha…

75c0f36

…nce OCR configuration

kaws26 deployed to akshit - TaxAI PR #19 April 3, 2026 19:19 with Render

vercel Bot deployed to Preview April 3, 2026 19:19

AkshitMaheshwari wants to merge 2 commits into main from akshit

3 participants
