slim8916/extract-text

Extract Text

Extract Text is a GNOME Shell extension that runs OCR on a selected screen region. It captures the region, sends the image to a local OCR helper, then copies the extracted text to the clipboard, saves it to a text file, or does both.

The extension is designed to keep GNOME Shell light. OCR models are not loaded when the extension is enabled. They are loaded by a separate helper process only while OCR is running, then the helper exits.
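
The pattern described above, a short-lived helper that does the heavy work and exits, with a timeout as a safety net, can be sketched in Python as follows. This is only an illustration of the idea: the extension itself is written in JavaScript, and `run_helper` and `fake_helper` are hypothetical names, not part of this project.

```python
import subprocess
import sys


def run_helper(command, timeout_seconds):
    """Run a short-lived helper process; return its stdout, or None on timeout."""
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            check=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the child before raising,
        # so no helper process is left behind.
        return None
    return completed.stdout


# Hypothetical stand-in for the real helper: a child process that prints one line.
fake_helper = [sys.executable, "-c", "print('hello from helper')"]
print(run_helper(fake_helper, timeout_seconds=10))
```

Because the models live only in the child process, their memory is released as soon as that process ends, whether it finishes normally or is stopped by the timeout.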

Status

This project is intended for local/manual installation from GitHub.

It is not currently packaged for extensions.gnome.org because the OCR backend uses Python packages and native libraries such as ONNX Runtime and OpenCV. Those dependencies must be installed explicitly by the user.

Features

  • Select any screen region and extract text from it.
  • Use PaddleOCR models through RapidOCR and ONNX Runtime.
  • Choose OCR backend: RapidOCR, PaddleOCR, or Auto.
  • Choose a language/script profile.
  • Show or hide the top bar icon.
  • Configure the shortcut used to start region selection.
  • Copy text to the clipboard, save it to a text file, or do both.
  • Choose the save folder. If left empty, Pictures/Screenshots is used.
  • Upscale small screenshots before OCR.
  • Filter low-confidence OCR lines.
  • Stop long OCR jobs with a timeout.
  • Show or hide notifications.
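
As an illustration of the low-confidence filter listed above, here is a minimal sketch. It assumes each OCR line arrives as a `(text, confidence)` pair with confidence between 0 and 1; the function name, the data shape, and the threshold value are assumptions for this example, not the helper's actual interface.

```python
def filter_lines(lines, min_confidence):
    """Keep the text of lines whose confidence is at least min_confidence."""
    return [text for text, confidence in lines if confidence >= min_confidence]


# Hypothetical OCR output: the middle line is noise with a low score.
ocr_lines = [("Invoice 2024", 0.97), ("l|l1|", 0.31), ("Total: 42.00", 0.88)]
print("\n".join(filter_lines(ocr_lines, min_confidence=0.5)))
```

A higher threshold drops more noise but may also discard faint or stylized text, so the right value depends on the kind of screenshots being read.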

Requirements

  • GNOME Shell 45 or newer.
  • python3 with venv support.
  • glib-compile-schemas.
  • Internet access for the first OCR run, because RapidOCR may download model files into the local virtual environment.

Installation

Clone the repository into your GNOME extensions directory:

mkdir -p ~/.local/share/gnome-shell/extensions
git clone https://github.com/<your-user>/<your-repo>.git \
  ~/.local/share/gnome-shell/extensions/extract-text@slim8916.github.io
cd ~/.local/share/gnome-shell/extensions/extract-text@slim8916.github.io

Install the recommended lightweight OCR backend:

python3 -m venv .venv
.venv/bin/python3 -m pip install -r requirements.txt

Compile the settings schema:

glib-compile-schemas schemas

Reload GNOME Shell, then enable the extension:

gnome-extensions enable extract-text@slim8916.github.io

To reload GNOME Shell on Wayland, log out and log back in. On X11, pressing Alt + F2, typing r, and pressing Enter is usually enough.

Usage

Click the top bar icon or use the configured shortcut. Select a screen region, then release the mouse button. The extension runs OCR and sends the result to the configured output.

The default output is the clipboard. You can change this in the extension preferences.
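
When the output includes a text file, the save folder falls back to Pictures/Screenshots if none is configured (see Features). A minimal sketch of that fallback follows; it assumes the default folder sits under the user's home directory, and `resolve_save_dir` is a hypothetical name rather than a function from this project.

```python
from pathlib import Path


def resolve_save_dir(configured, home=None):
    """Return the configured folder, or <home>/Pictures/Screenshots if it is empty."""
    home_dir = Path(home) if home is not None else Path.home()
    if configured.strip():
        # A non-empty setting wins; "~" is expanded for convenience.
        return Path(configured).expanduser()
    # Assumption: the default lives under the home directory.
    return home_dir / "Pictures" / "Screenshots"


print(resolve_save_dir("", home="/home/alice"))
print(resolve_save_dir("/tmp/ocr-output"))
```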

OCR Backend

The default backend is RapidOCR, which uses PaddleOCR models through ONNX Runtime. This keeps the extension lighter than the official PaddleOCR Python stack.

The extension automatically uses .venv/bin/python3 when that file exists. If there is no local virtual environment, it falls back to python3 from PATH.
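
The interpreter lookup described above can be sketched like this. The extension performs the check in its own JavaScript code; `pick_python` is a hypothetical Python rendering of the same rule, shown only to make the fallback order explicit.

```python
import shutil
import tempfile
from pathlib import Path


def pick_python(extension_dir):
    """Return the venv interpreter if present, else python3 from PATH (or None)."""
    venv_python = Path(extension_dir) / ".venv" / "bin" / "python3"
    if venv_python.is_file():
        return str(venv_python)
    # No local virtual environment: fall back to whatever PATH provides.
    return shutil.which("python3")


with tempfile.TemporaryDirectory() as ext_dir:
    # This directory has no .venv, so the PATH lookup is used.
    print(pick_python(ext_dir))
```

One consequence of this rule is that packages installed into `.venv` are only picked up if the virtual environment lives in the extension directory itself.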

Optional official PaddleOCR backend:

.venv/bin/python3 -m pip install paddleocr pillow

The first OCR run can be slower because the backend may download or initialize model files. Later runs should be faster because model files are cached by the backend and by the operating system.

Privacy

OCR runs locally. The selected screenshot is written to a temporary file, passed to the local OCR helper, then deleted.
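
The write, process, delete lifecycle above can be sketched as follows. This is an illustrative pattern under stated assumptions, not the extension's actual code: `with_temp_image` and the `process` callback are hypothetical names, and the `.png` suffix is assumed.

```python
import os
import tempfile


def with_temp_image(image_bytes, process):
    """Write image_bytes to a temp file, call process(path), then delete the file."""
    fd, path = tempfile.mkstemp(suffix=".png")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(image_bytes)
        return process(path)
    finally:
        # Runs even if process() raises, so the screenshot is removed either way.
        os.remove(path)


# Hypothetical stand-in for the OCR step: report the size of the file it was given.
print(with_temp_image(b"fake image data", lambda path: os.path.getsize(path)))
```

Putting the deletion in a `finally` block means a failed or interrupted OCR call still cleans up the temporary screenshot.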

The extension can write extracted text to the clipboard and/or a local text file depending on your settings.

Development

Useful checks before committing:

glib-compile-schemas schemas
python3 -m py_compile ocr_helper.py
node --check --experimental-default-type=module extension.js
node --check --experimental-default-type=module prefs.js
gnome-extensions pack --force --out-dir /tmp .

Do not commit .venv/, downloaded OCR model files, generated schema binaries, or packaged .zip files.

License

MIT. See LICENSE.
