Extract Text is a GNOME Shell extension that runs OCR on a selected screen region. It captures the region, sends the image to a local OCR helper, then copies the extracted text to the clipboard, saves it to a text file, or does both.
The extension is designed to keep GNOME Shell light. OCR models are not loaded when the extension is enabled. They are loaded by a separate helper process only while OCR is running, then the helper exits.
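The short-lived helper pattern described above can be sketched as follows. This is a minimal illustration in Python, not the extension's own code (the extension itself is JavaScript): the function name `run_helper` and the stand-in child command are hypothetical, and the timeout mirrors the "stop long OCR jobs" feature only in spirit.

```python
import subprocess
import sys


def run_helper(command, timeout_seconds):
    """Run a short-lived helper process and return its stdout.

    Returns None if the helper does not finish within timeout_seconds.
    A non-zero exit status raises subprocess.CalledProcessError.
    """
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            check=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising on timeout.
        return None
    return completed.stdout


if __name__ == "__main__":
    # Hypothetical stand-in for the real helper: a child process that prints text.
    fake_helper = [sys.executable, "-c", "print('recognized text')"]
    print(run_helper(fake_helper, timeout_seconds=10))
```

Because the heavy work lives in the child process, any memory it uses is released when the process exits, and the caller only holds the returned text.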
This project is intended for local/manual installation from GitHub.
It is not currently packaged for extensions.gnome.org because the OCR backend
uses Python packages and native libraries such as ONNX Runtime and OpenCV.
Those dependencies must be installed explicitly by the user.
- Select any screen region and extract text from it.
- Use PaddleOCR models through RapidOCR and ONNX Runtime.
- Choose OCR backend: RapidOCR, PaddleOCR, or Auto.
- Choose a language/script profile.
- Show or hide the top bar icon.
- Configure the shortcut used to start region selection.
- Copy text to the clipboard, save it to a text file, or do both.
- Choose the save folder. An empty value uses Pictures/Screenshots.
- Upscale small screenshots before OCR.
- Filter low-confidence OCR lines.
- Stop long OCR jobs with a timeout.
- Show or hide notifications.
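As an illustration of the low-confidence filter listed above, here is a minimal Python sketch. It assumes each OCR line arrives as a `(text, confidence)` pair with confidence between 0 and 1; the function name `filter_lines` and that data shape are assumptions for this example, not the helper's actual interface.

```python
from typing import List, Tuple


def filter_lines(lines: List[Tuple[str, float]], min_confidence: float) -> str:
    """Join the text of every line whose confidence is at least min_confidence."""
    kept = [text for text, confidence in lines if confidence >= min_confidence]
    return "\n".join(kept)


if __name__ == "__main__":
    # Hypothetical OCR output: the middle line is likely noise.
    sample = [("Hello world", 0.98), ("l|l1", 0.31), ("Second line", 0.87)]
    print(filter_lines(sample, min_confidence=0.5))
```

A higher threshold removes more noise but can also drop faint or stylized text, so the right value depends on the screenshots being processed.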
- GNOME Shell 45 or newer.
- python3 with venv support.
- glib-compile-schemas.
- Internet access for the first OCR run, because RapidOCR may download model files into the local virtual environment.
Clone the repository into your GNOME extensions directory:
mkdir -p ~/.local/share/gnome-shell/extensions
git clone https://github.com/<your-user>/<your-repo>.git \
~/.local/share/gnome-shell/extensions/extract-text@slim8916.github.io
cd ~/.local/share/gnome-shell/extensions/extract-text@slim8916.github.io
Install the recommended lightweight OCR backend:
python3 -m venv .venv
.venv/bin/python3 -m pip install -r requirements.txt
Compile the settings schema:
glib-compile-schemas schemas
Reload GNOME Shell, then enable the extension:
gnome-extensions enable extract-text@slim8916.github.io
On Wayland, log out and log back in. On X11, pressing Alt + F2, typing r, and pressing Enter is usually enough.
Click the top bar icon or use the configured shortcut. Select a screen region, then release the mouse button. The extension runs OCR and sends the result to the configured output.
The default output is the clipboard. You can change this in the extension preferences.
The default backend is RapidOCR, which uses PaddleOCR models through ONNX Runtime. This keeps the extension lighter than the official PaddleOCR Python stack.
The extension automatically uses .venv/bin/python3 when that file exists. If
there is no local virtual environment, it falls back to python3 from PATH.
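The interpreter lookup described above can be sketched in Python as follows. The extension performs this check in its own code; the function name `resolve_python` and the `extension_dir` parameter here are hypothetical and only illustrate the fallback order.

```python
import shutil
from pathlib import Path
from typing import Optional


def resolve_python(extension_dir: Path) -> Optional[str]:
    """Prefer the local virtual environment, else fall back to python3 on PATH."""
    venv_python = extension_dir / ".venv" / "bin" / "python3"
    if venv_python.is_file():
        return str(venv_python)
    # shutil.which returns None when python3 is not found on PATH.
    return shutil.which("python3")


if __name__ == "__main__":
    print(resolve_python(Path.cwd()))
```

Preferring the virtual environment keeps the OCR dependencies isolated from system Python packages, while the PATH fallback still works for users who installed the dependencies another way.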
Optional official PaddleOCR backend:
.venv/bin/python3 -m pip install paddleocr pillow
The first OCR run can be slower because the backend may download or initialize model files. Later runs should be faster because model files are cached by the backend and by the operating system.
OCR runs locally. The selected screenshot is written to a temporary file, passed to the local OCR helper, then deleted.
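The write, process, delete lifecycle of the temporary screenshot could look like the following Python sketch. The names `with_temp_image` and `process` are hypothetical, and the `.png` suffix is an assumption; the point is only that the file is removed even if processing fails.

```python
import os
import tempfile


def with_temp_image(image_bytes, process):
    """Write image_bytes to a temporary file, call process(path), then delete the file."""
    fd, path = tempfile.mkstemp(suffix=".png")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(image_bytes)
        return process(path)
    finally:
        # Runs on success and on error, so the screenshot never lingers on disk.
        os.remove(path)


if __name__ == "__main__":
    # Hypothetical stand-in for the OCR step: report the size of the file.
    print(with_temp_image(b"not a real image", os.path.getsize))
```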
The extension can write extracted text to the clipboard and/or a local text file depending on your settings.
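For the file output, a minimal Python sketch of the save step is shown below. It assumes the empty-folder default resolves to Pictures/Screenshots under the user's home directory and uses a timestamped file name; both the `save_text` function and the `extract-text_<timestamp>.txt` naming scheme are assumptions for illustration, not the extension's actual behavior.

```python
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def save_text(text: str, folder: str = "", now: Optional[datetime] = None) -> Path:
    """Write text to a timestamped .txt file and return the file path."""
    if folder:
        target_dir = Path(folder).expanduser()
    else:
        # Assumed default location when no folder is configured.
        target_dir = Path.home() / "Pictures" / "Screenshots"
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H-%M-%S")
    path = target_dir / f"extract-text_{stamp}.txt"
    path.write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Demo writes into a throwaway directory rather than the real save folder.
    with tempfile.TemporaryDirectory() as tmp:
        print(save_text("recognized text", folder=tmp))
```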
Useful checks before committing:
glib-compile-schemas schemas
python3 -m py_compile ocr_helper.py
node --check --experimental-default-type=module extension.js
node --check --experimental-default-type=module prefs.js
gnome-extensions pack --force --out-dir /tmp .
Do not commit .venv/, downloaded OCR model files, generated schema binaries, or packaged .zip files.
MIT. See LICENSE.