
Add support for wav2vec2bert Models #2

Open
Korla-tech wants to merge 5 commits into engineerchuan:main from Korla-tech:main

Conversation

@Korla-tech

This pull request introduces support for wav2vec2-bert models, adding new conversion scripts, quantization tools, and C++ library targets. The changes include new build targets and scripts for handling wav2vec2-bert models, updates to documentation, and improvements to quantization logic for better compatibility and precision. Additionally, there are enhancements to token handling and rendering in the C++ inference code.

Support for wav2vec2-bert models:

  • Added a new Python script models/convert-wav2vec2bert-to-ggml.py to convert HuggingFace wav2vec2-bert models.
  • Introduced new C++ library target wav2vec2-bert and associated build logic in src/CMakeLists.txt for handling wav2vec2-bert architectures.

Quantization improvements:

  • Added quantize-wav2vec2bert.cpp for quantizing wav2vec2-bert models.
  • Improved quantization logic in common-ggml.cpp to only quantize 2D tensors with row sizes compatible with the quantizer block size.
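The 2D and block-size check described above can be sketched roughly as follows. This is a minimal illustration, not the code from common-ggml.cpp: the helper name `should_quantize` and its parameters are hypothetical, and it assumes the ggml convention that the first dimension (`ne[0]`) is the row length and that each quantized type packs a fixed number of elements per block (for example 32 for Q4_0).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not taken from the PR): returns true only when a
// tensor may be quantized under the rule described in the PR text.
// `shape` lists dimensions with the row length first (ggml's ne[0]).
// `block_size` is the number of elements per quantization block.
bool should_quantize(const std::vector<int64_t>& shape, int64_t block_size) {
    if (block_size <= 0) {
        return false;  // guard against a nonsensical block size
    }
    if (shape.size() != 2) {
        return false;  // only 2D weight matrices are quantized
    }
    // A row must split into whole blocks; otherwise keep the tensor as-is.
    return shape[0] % block_size == 0;
}
```

With a block size of 32, a `{1024, 512}` tensor would be quantized, while a `{1000, 512}` tensor or a 1D bias would be left in its original precision.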

Documentation updates:

  • Updated README.md with instructions for converting, running, and quantizing wav2vec2-bert models, including new script and binary names.

Token handling and rendering enhancements:

  • Added logic in wav2vec2.cpp for identifying and rendering special tokens (such as space, pad, blank, unk), and improved token rendering for output.
  • Added a compatibility wrapper for 1D convolution in wav2vec2.cpp; without it, the code crashed on some systems.
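One way the special-token identification and rendering could look is sketched below. The names `SpecialTokens`, `find_special_tokens`, and `render_token` are hypothetical and do not come from wav2vec2.cpp. The sketch assumes the usual HuggingFace CTC vocabulary conventions: `<pad>` doubles as the CTC blank, `|` is the word delimiter, and `<unk>` marks unknown tokens.

```cpp
#include <string>
#include <vector>

// Hypothetical container for special-token ids; -1 means "not found".
struct SpecialTokens {
    int pad = -1;
    int blank = -1;
    int unk = -1;
    int space = -1;
};

// Scan the vocabulary once and record the ids of the special tokens.
// The token spellings are assumptions based on common HuggingFace vocabularies.
SpecialTokens find_special_tokens(const std::vector<std::string>& vocab) {
    SpecialTokens st;
    for (int i = 0; i < static_cast<int>(vocab.size()); ++i) {
        const std::string& t = vocab[i];
        if (t == "<pad>" || t == "[PAD]") {
            st.pad = i;
        } else if (t == "<unk>" || t == "[UNK]") {
            st.unk = i;
        } else if (t == "|") {
            st.space = i;
        }
    }
    st.blank = st.pad;  // assumption: the pad token serves as the CTC blank
    return st;
}

// Map a token id to the text that should appear in the transcript.
std::string render_token(int id, const std::vector<std::string>& vocab,
                         const SpecialTokens& st) {
    if (id < 0 || id >= static_cast<int>(vocab.size())) {
        return "";  // out-of-range ids render as nothing
    }
    if (id == st.pad || id == st.blank || id == st.unk) {
        return "";  // pad, blank, and unk are dropped from the output
    }
    if (id == st.space) {
        return " ";  // the word delimiter becomes a plain space
    }
    return vocab[id];
}
```

Dropping `<unk>` from the output is a design choice made for this sketch; an implementation could equally render it as a placeholder character.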
