# AutoWave

The simplest way to classify audio in Python.

Powered by pretrained transformer models (AST, Wav2Vec2, HuBERT) via HuggingFace — fine-tune a state-of-the-art audio classifier on your own dataset in a few lines of code.
## Quick Start

```python
from autowave import AudioClassifier

# 1. Load and train
model = AudioClassifier()
model.fit("data/train/")

# 2. Predict
result = model.predict("test.wav")
print(result)  # {"label": "dog_bark", "confidence": 0.94}

# 3. Evaluate
metrics = model.evaluate("data/test/")
print(f"Accuracy: {metrics['accuracy']:.2%}")

# 4. Save & reload
model.save("my_model/")
loaded = AudioClassifier.load("my_model/")
```

## Installation

```bash
pip install AutoWave
```

Requirements: Python ≥ 3.10, PyTorch ≥ 2.0
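Since `predict()` above returns a plain dict of the form `{"label": ..., "confidence": ...}`, downstream filtering needs no autowave-specific code. A minimal sketch of confidence thresholding on that dict (the `accept` helper and its 0.8 cutoff are illustrative, not part of the library):

```python
def accept(prediction, threshold=0.8):
    """Keep the label only when confidence clears the threshold; else return None
    so the caller can route the clip to manual review."""
    if prediction["confidence"] >= threshold:
        return prediction["label"]
    return None

# Shapes match the quickstart output above.
print(accept({"label": "dog_bark", "confidence": 0.94}))  # dog_bark
print(accept({"label": "dog_bark", "confidence": 0.42}))  # None
```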
## Training on Your Own Data

Organize audio files into class subfolders:

```text
data/
    train/
        dog/   bark1.wav  bark2.wav  ...
        cat/   meow1.wav  meow2.wav  ...
        bird/  chirp1.wav chirp2.wav ...
    test/
        dog/   ...
        cat/   ...
```
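With this layout, `fit()` presumably derives the class labels from the subfolder names. A stdlib-only sketch of that convention (the `infer_labels` helper is illustrative, not autowave's API):

```python
import tempfile
from pathlib import Path

def infer_labels(train_dir):
    """Return sorted class labels: one per subfolder, mirroring the layout above."""
    return sorted(p.name for p in Path(train_dir).iterdir() if p.is_dir())

# Build the example layout in a temporary directory and read labels back from it.
with tempfile.TemporaryDirectory() as root:
    for cls in ("dog", "cat", "bird"):
        (Path(root) / "train" / cls).mkdir(parents=True)
    labels = infer_labels(Path(root) / "train")
    print(labels)  # ['bird', 'cat', 'dog']
```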
```python
from autowave import AudioClassifier

model = AudioClassifier()
model.fit("data/train/")

model.predict("data/test/dog/bark_test.wav")
# → {"label": "dog", "confidence": 0.97}

results = model.evaluate("data/test/")
print(f"Accuracy: {results['accuracy']:.2%}")
print(results["report"])

model.save("my_model/")
loaded = AudioClassifier.load("my_model/")
loaded.predict("new_audio.wav")
```

## Zero-Shot Classification

Classify audio against any text labels — no dataset or fine-tuning required:
```python
from autowave import ZeroShotClassifier

clf = ZeroShotClassifier()
clf.predict("audio.wav", labels=["dog barking", "cat meowing", "rain", "music"])
# → [{"label": "dog barking", "confidence": 0.91}, ...]
```

## Configuration

```python
model = AudioClassifier(
    model_name="ast",           # "ast" | "wav2vec2" | "hubert" | "wavlm" | any HF model ID
    epochs=10,
    batch_size=8,
    learning_rate=1e-4,
    augment=True,               # noise, pitch shift, time stretch, shift
    device="auto",              # "auto" | "cuda" | "mps" | "cpu"
    output_dir="checkpoints/",
    max_duration_s=10.0,
)
model.fit("data/train/", val_folder="data/val/")
```

### Supported models

| Short name | HuggingFace model | Best for |
|---|---|---|
| `ast` (default) | `MIT/ast-finetuned-audioset-10-10-0.4593` | All audio types |
| `wav2vec2` | `facebook/wav2vec2-base` | Speech tasks |
| `hubert` | `facebook/hubert-base-ls960` | Speech tasks |
| `wavlm` | `microsoft/wavlm-base` | Speech benchmarks |
Any HuggingFace AutoModelForAudioClassification-compatible model ID also works.
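The `augment=True` option in the configuration above lists waveform-level augmentations (noise, pitch shift, time stretch, shift). To illustrate the simplest of these, here is a pure-Python sketch of additive noise on a list of samples; the `add_noise` helper and its parameters are illustrative only and do not reflect autowave's internals:

```python
import random

def add_noise(samples, noise_level=0.005, seed=0):
    """Perturb each sample with Gaussian noise scaled to the signal's peak.
    A toy illustration of one augmentation, not autowave's implementation."""
    rng = random.Random(seed)
    peak = max(abs(s) for s in samples) or 1.0
    return [s + rng.gauss(0.0, noise_level * peak) for s in samples]

clean = [0.0, 0.5, -0.5, 1.0, -1.0]
noisy = add_noise(clean)
print(len(noisy) == len(clean))  # True
```

Training on lightly perturbed copies like this is what makes the classifier more robust to recording conditions it has not seen.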
## ONNX Export

```python
model.export_onnx("model.onnx")
```

## Visualization

```python
from autowave.visualization import plots

plots.waveform("audio.wav")
plots.spectrogram("audio.wav")
plots.mfcc("audio.wav")
plots.spectral_centroid("audio.wav")
plots.time_freq_overview("audio.wav")
```

## Audio Utilities

```python
from autowave.utils.audio import read_properties, resample, convert_format

# Metadata
props = read_properties("audio.wav")
print(props.sample_rate, props.duration_s, props.channels)

# Resample to 16 kHz
resample("audio.mp3", target_sr=16000, output_path="audio_16k.wav")

# Convert format
convert_format("audio.wav", output_format="mp3")
```

Supported formats: `.wav` · `.mp3` · `.flac` · `.ogg` · `.m4a` · `.aiff`
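For plain WAV files, the same metadata that `read_properties` exposes can also be read with Python's built-in `wave` module. A self-contained sketch (the `wav_properties` helper is illustrative, not part of autowave; it writes a one-second 16 kHz sine tone and reads its properties back):

```python
import math
import struct
import tempfile
import wave

def wav_properties(path):
    """Return (sample_rate, duration_s, channels) for a WAV file, stdlib only."""
    with wave.open(path, "rb") as f:
        sr = f.getframerate()
        return sr, f.getnframes() / sr, f.getnchannels()

# Write a 1-second, 16 kHz, mono, 16-bit 440 Hz sine wave to a temp file.
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
    path = tmp.name
with wave.open(path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)        # 16-bit samples
    f.setframerate(16000)
    samples = (int(32767 * 0.5 * math.sin(2 * math.pi * 440 * t / 16000))
               for t in range(16000))
    f.writeframes(b"".join(struct.pack("<h", s) for s in samples))

print(wav_properties(path))  # (16000, 1.0, 1)
```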
## Contributors

Nilesh Verma · Satyajit Pattnaik · Kalash Jindal
## Citation

If you use AutoWave in your research or project, please cite:

```bibtex
@software{autowave2024,
  author  = {Verma, Nilesh and Pattnaik, Satyajit and Jindal, Kalash},
  title   = {{AutoWave}: Automatic Audio Classification with Pretrained Transformers},
  year    = {2024},
  version = {2.0.0},
  url     = {https://github.com/TechyNilesh/Autowave},
  note    = {Python library for audio classification using AST, Wav2Vec2, HuBERT, and WavLM}
}
```

Developed for ML researchers, data scientists, Python developers, speech engineers, and the open-source audio community.



