
coderganesh/OCRmyTamilPDF

📄🔍 OCRmyTamilPDF

GUI Screenshot

OCRmyTamilPDF is a GUI tool that makes scanned Tamil PDFs searchable and copy-pasteable by adding a text layer.

✨ Features

  • Simple UI and strong performance
  • Cross-platform

📋 Requirements

OCRmyTamilPDF requires Python (developed with 3.13.7) and three external programs: Tk, Ghostscript, and Tesseract OCR. It should run on any desktop platform that supports Python and these programs, such as Linux, Windows, macOS, and FreeBSD.

Note: If you only run the prebuilt executable, you do not need to install the tk package.
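Before launching the script, it can help to confirm that the external programs are actually reachable. The following is a minimal sketch, not part of the project: the helper names `missing_programs` and `tk_available` are hypothetical, and the executable names (especially the Ghostscript ones) are assumptions that may differ on your system.

```python
import importlib.util
import shutil

# Executable names are assumptions: Ghostscript is commonly `gs` on Unix-like
# systems and `gswin64c` or `gswin32c` on Windows.
REQUIRED = {
    "Tesseract OCR": ["tesseract"],
    "Ghostscript": ["gs", "gswin64c", "gswin32c"],
}


def missing_programs(candidates):
    """Return the labels for which none of the candidate executables is on PATH."""
    return [
        label
        for label, names in candidates.items()
        if not any(shutil.which(name) for name in names)
    ]


def tk_available():
    """Report whether Python's tkinter module can be located."""
    return importlib.util.find_spec("tkinter") is not None


if __name__ == "__main__":
    missing = missing_programs(REQUIRED)
    if not tk_available():
        missing.append("Tk (tkinter)")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All requirements found.")
```

If anything is reported missing, the installation steps for your platform should cover it.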

🛠️ Installing Requirements

🐧 Linux

  1. Install the required external packages with the following commands based on your Linux distro:

Debian-based:

    sudo apt update && sudo apt install tesseract-ocr tesseract-ocr-tam tesseract-ocr-eng python3-tk ghostscript

Arch-based:

    sudo pacman -Syu tesseract tesseract-data-tam tesseract-data-eng tk ghostscript

  2. From the project’s root directory, install the required Python packages with the following command: pip install -r requirements.txt

🪟 Windows

  1. Download and install Ghostscript and Tesseract OCR, and add both to the system PATH. In the Tesseract installer, make sure to select the Tamil language pack.
  2. From the project’s root directory, install the required Python packages with the following command: pip install -r requirements.txt

Note: Installing the tk package is not required on Windows because it is already bundled with Python.
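On any platform, one way to confirm that the Tamil language data was installed is to ask Tesseract which languages it knows about. The sketch below relies on Tesseract's `--list-langs` option; the helper names `parse_langs` and `installed_langs` are hypothetical, and the assumed output format is a "List of available languages" header followed by one language code per line.

```python
import shutil
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines()]
    # Skip blank lines and the header line that precedes the codes.
    return {line for line in lines if line and not line.startswith("List of")}


def installed_langs():
    """Return the language codes Tesseract reports, or an empty set if it is not on PATH."""
    if shutil.which("tesseract") is None:
        return set()
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some Tesseract versions write the list to stderr, so both streams are read.
    return parse_langs(result.stdout + result.stderr)
```

Checking `"tam" in installed_langs()` then tells you whether the Tamil pack is available; if it is not, rerun the installer (Windows) or install the Tamil language package (Linux).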

📘 Run Instructions

If using the script: run python main.py
If using the executable (download the .exe for Windows or the .bin for Linux from the Releases page): run it by double-clicking it or by launching it from the terminal
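For batch or scripted use without the GUI, the underlying OCRmyPDF library can also be called directly from Python. This is only a sketch under assumptions: how main.py invokes OCRmyPDF is not shown here, the helpers `searchable_name` and `make_searchable` and the `_ocr` output suffix are hypothetical, and it presumes the `ocrmypdf` package plus the `tam` and `eng` Tesseract language data are installed.

```python
from pathlib import Path


def searchable_name(pdf):
    """Build the output path next to the input, e.g. scan.pdf -> scan_ocr.pdf."""
    return pdf.with_name(pdf.stem + "_ocr.pdf")


def make_searchable(pdf):
    """Add a Tamil + English text layer to a scanned PDF using OCRmyPDF."""
    # Imported here so the path helper above works without OCRmyPDF installed.
    import ocrmypdf

    output = searchable_name(pdf)
    # `language` takes Tesseract language codes; "tam" is Tamil, "eng" is English.
    ocrmypdf.ocr(pdf, output, language=["tam", "eng"])
    return output
```

A call such as `make_searchable(Path("scan.pdf"))` would write `scan_ocr.pdf` beside the input. On Windows, place the call under an `if __name__ == "__main__":` guard, since OCRmyPDF may start worker processes.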

🙋‍♂️ Having Issues?

If you face any problems or have suggestions, feel free to open an issue.

⚖️ License

This project is licensed under the GPLv3 License - see the LICENSE file for details.

🏷️ Attribution

This project uses OCRmyPDF, which is licensed under the Mozilla Public License 2.0.
No modifications were made to OCRmyPDF itself.
See the OCRmyPDF repository for license details.
