# Contributing to wav2vec2.cpp

Thank you for your interest in contributing! This document provides guidelines for contributing to wav2vec2.cpp.
## Code of Conduct

Be respectful and constructive in all interactions.
## How to Contribute

- Fork the repository
- Clone your fork:

  ```sh
  git clone https://github.com/YOUR_USERNAME/wav2vec2.cpp
  ```

- Create a branch:

  ```sh
  git checkout -b feature/your-feature
  ```

- Make your changes
- Test thoroughly
- Submit a pull request
## Building and Testing

```sh
# Build
mkdir build && cd build
cmake -DGGML_METAL=ON ..
make -j

# Test
./bin/wav2vec2-cli -m models/wav2vec2-phoneme/ggml-model-f16.bin -f samples/jfk.wav
```

## Pull Request Checklist

- Code compiles without warnings
- Existing tests pass
- New functionality includes tests
- Accuracy metrics unchanged (run `scripts/eval_pytorch.py`)
- Code follows project style (see docs/CONVENTIONS.md)
## Pull Request Format

Title: `<scope>: <description>`

Examples:

```
wav2vec2: fix attention mask handling
examples: add streaming inference
docs: update build instructions
```
Description:
- What does this PR do?
- Why is this change needed?
- How was it tested?
Submit separate PRs for unrelated changes. Don't mix bug fixes with new features.
## Code Style

- 4-space indentation
- 120-character line limit
- `snake_case` for functions and variables
- `wav2vec2_` prefix for public API functions
- See docs/CONVENTIONS.md for details
## AI-Assisted Contributions

If you use AI tools (GitHub Copilot, ChatGPT, Claude, etc.) to generate code, please disclose this in your PR description. This project was itself "vibe coded" with AI assistance, so AI-assisted contributions are welcome with transparency.
## Testing

Run the test suite:

```sh
cd build && ctest
```

Compare against the PyTorch reference:

```sh
python scripts/eval_pytorch.py \
    --audio samples/jfk.wav \
    --model models/wav2vec2-phoneme/ggml-model-f16.bin
```

Expected: PER < 5% vs PyTorch, or an improvement over the current baseline.
## Bug Reports

When filing a bug report, include:
- Platform (macOS/Linux/Windows, CPU/GPU)
- Model used
- Steps to reproduce
- Expected vs actual behavior
- Relevant logs
## Feature Requests

When proposing a feature, describe:
- The problem you're trying to solve
- Your proposed solution
- Alternatives considered
## License

By contributing, you agree that your contributions will be licensed under the MIT License.