Tokenizers is an experimental plugin which enables developers to use tokenizers inside Unreal Engine's environment.
With this plugin you can:
- Initialize tokenizers from a JSON blob or a configuration file
- Encode and decode text
- Use every feature in both C++ and Blueprints
To use this plugin, you'll need the C static library from Tokenizers-cpp. You can either download it directly from the Releases page of this repository or compile it yourself from the Tokenizers-cpp source.
- OS: Windows - 64 bit
- UE: version 5.0 - 5.3
- In your Unreal Engine project, create a `Plugins` folder if it doesn't already exist.
- Navigate to the Releases page.
- Download the source code for the release you want to use.
- Extract the downloaded source code into the `Plugins` directory.
- Navigate to `Plugins/Tokenizers-UE5/Source/ThirdParty/TokenizersLibrary/Win64`.
- From the same release page, download `tokenizers_c.lib` and place it inside the `Win64` folder.
- Delete the placeholder file named `PLACE STATIC LIB HERE` from the `Win64` folder.
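If you want to double-check the result of the steps above before opening the project, a small script can inspect the folder layout. The following is a minimal sketch, not part of the plugin: the function name `check_plugin_install` and its return format are hypothetical, and it assumes the extracted plugin folder is named `Tokenizers-UE5`, as in the path given above (a release archive may extract to a differently named folder, in which case adjust the path).

```python
from pathlib import Path

# Paths and file names are taken from the installation steps.
# Assumption: the plugin folder inside Plugins/ is named "Tokenizers-UE5".
WIN64_SUBDIR = Path("Plugins/Tokenizers-UE5/Source/ThirdParty/TokenizersLibrary/Win64")
STATIC_LIB = "tokenizers_c.lib"
PLACEHOLDER = "PLACE STATIC LIB HERE"


def check_plugin_install(project_root):
    """Return a list of problems found; an empty list means the layout looks right.

    Hypothetical helper for illustration only; it checks file locations,
    not whether the library itself is valid.
    """
    win64 = Path(project_root) / WIN64_SUBDIR
    if not win64.is_dir():
        # Nothing else can be checked if the folder itself is missing.
        return [f"missing folder: {win64}"]

    problems = []
    if not (win64 / STATIC_LIB).is_file():
        problems.append(f"missing static library: {win64 / STATIC_LIB}")
    if (win64 / PLACEHOLDER).exists():
        problems.append(f"placeholder file still present: {win64 / PLACEHOLDER}")
    return problems
```

Calling `check_plugin_install("path/to/YourProject")` returns an empty list when `tokenizers_c.lib` is in place and the placeholder has been removed; otherwise each entry describes one step that still needs attention.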
Want to contribute? Awesome! Check out the contributing guidelines to get involved. Contributors are encouraged to join the community Discord server.
This project is licensed under the MIT License, except for specific files noted below. See the LICENSE file for more information.
- Tokenizers-cpp:
  - Source: Tokenizers-cpp GitHub
  - License for `tokenizers_c.h`: Apache License 2.0
This project is based on MLC-AI's C/C++ implementation of HuggingFace's Tokenizers library.