init whisper by Stanwang1210 · Pull Request #6 · EverydayRespect/feature_pipeline

Stanwang1210 · 2026-04-03T01:17:00Z

Add whisper feature

Copilot

Pull request overview

This PR introduces Whisper-based audio feature extraction (log-mel spectrograms) and wires the new extractors into the model loader, along with example YAML configs to run the pipeline with these features.

Changes:

Add Whisper80Extractor and Whisper128Extractor to compute 80/128-bin Whisper log-mel spectrogram features.
Register Whisper extractors in models/utils.py for dynamic model loading.
Add example pipeline configs for running Whisper80/Whisper128 audio extraction.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 12 comments.

File	Description
models/whisper_80.py	New Whisper 80-mel extractor implementation.
models/whisper_128.py	New Whisper 128-mel extractor implementation.
models/utils.py	Registers new model(s) in the global `model_map` and imports extractors.
config/whisper80_audio.yaml	Adds an example config to run Whisper80 mel extraction.
config/whisper128_audio.yaml	Adds an example config to run Whisper128 mel extraction.

Copilot · 2026-04-03T01:21:44Z

+from models.open_smile import OpenSmileExtractor
+from models.whisper_80 import Whisper80Extractor


models.open_smile is imported and registered, but there is no corresponding models/open_smile.py in the repo. This will raise ModuleNotFoundError at import time and prevent the pipeline from starting. Either add/commit the missing extractor module or remove the import and model_map entry.

Copilot · 2026-04-03T01:21:44Z

+from models.whisper_80 import Whisper80Extractor
+from models.whisper_128 import Whisper128Extractor


Importing Whisper extractors at module import time makes openai-whisper a hard dependency for any run (since worker.py imports models.utils), even if Whisper isn’t selected. Consider lazy-loading these optional backends (e.g., wrap the imports/model_map registration in try/except ImportError or import inside load_model only when model_name matches) so missing optional deps don’t break unrelated models.
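One way to realise the lazy-loading suggestion is sketched below. It is a minimal sketch, not the repository's actual `models/utils.py`: the `LAZY_MODEL_MAP` table and the `resolve_extractor` helper are hypothetical names, while the module paths and class names are taken from the imports in this diff. The backend module is imported only when its model name is requested, so a missing `openai-whisper` install would affect only the Whisper models.

```python
import importlib

# Hypothetical registry: model name -> (module path, class name).
# Module and class names follow the imports shown in this PR.
LAZY_MODEL_MAP = {
    "Whisper80": ("models.whisper_80", "Whisper80Extractor"),
    "Whisper128": ("models.whisper_128", "Whisper128Extractor"),
}


def resolve_extractor(model_name, registry=LAZY_MODEL_MAP):
    """Import and return the extractor class for model_name on demand."""
    if model_name not in registry:
        raise KeyError(f"Unknown model: {model_name!r}")
    module_path, class_name = registry[model_name]
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        # Only the requested backend fails; other models stay usable.
        raise ImportError(
            f"Model {model_name!r} needs optional module {module_path!r}"
        ) from exc
    return getattr(module, class_name)
```

With this layout, `load_model` would call `resolve_extractor(model_name)` instead of reading a class out of an eagerly populated `model_map`.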

Copilot · 2026-04-03T01:21:44Z

+from models.base import BaseModel
+try:
+    import whisper
+except :


Bare except: will also catch unrelated errors (e.g., runtime errors inside the dependency) and can mask the real failure reason. Catch ImportError explicitly here so only missing-dependency cases trigger this message.

Suggested change

-except :
+except ImportError:
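To illustrate why the narrower clause matters, here is a small sketch of an optional-import helper; `try_import` is a hypothetical name, not something from this repository. Because it catches only `ImportError` (of which `ModuleNotFoundError` is a subclass), any other exception raised while the dependency initialises still propagates with its original traceback.

```python
import importlib


def try_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Missing dependency only; other errors are deliberately not caught.
        return None


# Usage: fall back gracefully when the optional backend is absent.
whisper = try_import("whisper")
if whisper is None:
    print("openai-whisper is not installed; Whisper extractors are disabled.")
```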

Copilot · 2026-04-03T01:21:45Z

+        if output == "mel_features":
+
+            return {
+                "mels": mel.cpu().numpy() ,  # shape [num_patches, hidden_dim]


The inline shape comment is incorrect for Whisper mel spectrograms: whisper.log_mel_spectrogram returns something shaped like [n_mels, n_frames] (not [num_patches, hidden_dim]). This can confuse downstream users when inspecting saved Arrow outputs; update the comment to match the actual tensor shape.

Suggested change

-                "mels": mel.cpu().numpy() ,  # shape [num_patches, hidden_dim]
+                "mels": mel.cpu().numpy() ,  # shape [n_mels, n_frames]
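As a quick sanity check on the corrected comment, the expected shape can be estimated from the input length. This sketch assumes the constants in `whisper.audio` (16 kHz sample rate, hop length of 160 samples) and that `log_mel_spectrogram` drops the final STFT frame, so it yields about one frame per 160 samples. `expected_mel_shape` is a hypothetical helper, and the result should be verified against the installed Whisper version.

```python
# Assumed values, mirroring whisper.audio.SAMPLE_RATE and HOP_LENGTH.
SAMPLE_RATE = 16_000
HOP_LENGTH = 160


def expected_mel_shape(n_samples, n_mels):
    """Estimate the [n_mels, n_frames] shape for a mono waveform."""
    n_frames = n_samples // HOP_LENGTH
    return (n_mels, n_frames)


# A 30-second clip at 16 kHz with the 80-bin filterbank.
print(expected_mel_shape(30 * SAMPLE_RATE, 80))
```

Under these assumptions the first dimension is the number of mel bins (80 or 128) and the second grows with audio duration, which is the layout the corrected comment describes.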

Copilot · 2026-04-03T01:21:45Z

+            audiopath (str, optional): Path to the input audio file.
+
+        Returns:
+            torch.Tensor: A tuple containing the audio tensor and the sample rate.


This docstring claims the method returns a (tensor, sample_rate) tuple, but the implementation returns only the audio array from whisper.load_audio. Please correct the return type/description (and the audiopath arg name) to avoid misleading callers.

Suggested change

-            audiopath (str, optional): Path to the input audio file.
-
-        Returns:
-            torch.Tensor: A tuple containing the audio tensor and the sample rate.
+            audio_path (str, optional): Path to the input audio file.
+
+        Returns:
+            numpy.ndarray: Audio samples returned by ``whisper.load_audio``,
+                resampled to ``self.sampling_rate``.

Copilot · 2026-04-03T01:21:46Z

+        if output == "mel_features":
+
+            return {
+                "mels": mel.cpu().numpy() ,  # shape [num_patches, hidden_dim]


The inline shape comment is incorrect for Whisper mel spectrograms: whisper.log_mel_spectrogram returns something shaped like [n_mels, n_frames] (not [num_patches, hidden_dim]). This can confuse downstream users when inspecting saved Arrow outputs; update the comment to match the actual tensor shape.

Suggested change

-                "mels": mel.cpu().numpy() ,  # shape [num_patches, hidden_dim]
+                "mels": mel.cpu().numpy() ,  # shape [n_mels, n_frames]

Copilot · 2026-04-03T01:21:46Z

+        Load an audio file and resample if necessary.
+
+        Args:
+            audiopath (str, optional): Path to the input audio file.
+
+        Returns:
+            torch.Tensor: A tuple containing the audio tensor and the sample rate.


This docstring claims the method returns a (tensor, sample_rate) tuple, but the implementation returns only the audio array from whisper.load_audio. Please correct the return type/description (and the audiopath arg name) to avoid misleading callers.

Suggested change

-        Load an audio file and resample if necessary.
-
-        Args:
-            audiopath (str, optional): Path to the input audio file.
-
-        Returns:
-            torch.Tensor: A tuple containing the audio tensor and the sample rate.
+        Load an audio file and resample it to the configured sampling rate.
+
+        Args:
+            audio_path (str, optional): Path to the input audio file.
+
+        Returns:
+            np.ndarray: Audio samples returned by ``whisper.load_audio``,
+                resampled to ``self.sampling_rate``.

Copilot · 2026-04-03T01:21:46Z

+        if "speech_mels_features" in self.feature_list:
+            mel_features = self.extract_mels(audio_data)
+            return [("speech_mels_features", mel_features)]
+


If speech_mels_features is not in feature_list, this method returns None, which will raise a TypeError in worker.py where it blindly iterates for feature_name, feature_value in extractor.extract_features(...). Return an empty list (or yield nothing) instead so the extractor is safe under misconfiguration.

Suggested change

+        return []
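The failure mode and the fix can be reproduced in isolation. The class below is a stripped-down stand-in for the extractor, not the PR's implementation: `SketchExtractor` and its placeholder `extract_mels` are hypothetical, and only the `feature_list` check and the `(name, value)` return convention are taken from the diff.

```python
class SketchExtractor:
    """Minimal stand-in that mirrors the feature_list check in the diff."""

    def __init__(self, feature_list):
        self.feature_list = feature_list

    def extract_mels(self, audio_data):
        # Placeholder for the real mel extraction.
        return {"mels": list(audio_data)}

    def extract_features(self, audio_data):
        if "speech_mels_features" in self.feature_list:
            mel_features = self.extract_mels(audio_data)
            return [("speech_mels_features", mel_features)]
        # Always return an iterable so callers can loop unconditionally.
        return []


# Caller-side loop, as described for worker.py: safe even when nothing matches.
extractor = SketchExtractor(feature_list=[])
for feature_name, feature_value in extractor.extract_features([0.0, 0.1]):
    print(feature_name, feature_value)
```

Without the final `return []`, the same loop would fail with `TypeError` because `None` is not iterable.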

Copilot · 2026-04-03T01:21:46Z

+
+phases:
+  - model:
+      name: "Whisper80"


main.launch_workers_for_phase() expects phase["model"]["base_dir"] to exist when building buffer_root. This config omits base_dir, so running with it will raise KeyError. Add a base_dir (consistent with other configs like first_batch_pyarrow.yaml).

Suggested change

-      name: "Whisper80"
+      name: "Whisper80"
+      base_dir: "Whisper80"
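A fail-fast check at config load time would surface this kind of omission before any worker starts. The sketch below is hypothetical: `validate_phase` is not part of the repository, and the required-key set is an assumption based only on the keys mentioned in this review (`name` and `base_dir`).

```python
# Assumed required keys; extend to match what main.py actually reads.
REQUIRED_MODEL_KEYS = ("name", "base_dir")


def validate_phase(phase):
    """Raise ValueError listing any required model keys a phase lacks."""
    model_cfg = phase.get("model", {})
    missing = [key for key in REQUIRED_MODEL_KEYS if key not in model_cfg]
    if missing:
        raise ValueError(f"phase['model'] is missing keys: {missing}")
    return model_cfg
```

Calling such a check on every entry under `phases` right after parsing the YAML would replace a bare `KeyError` deep in the launcher with a message that names the missing key.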

Copilot · 2026-04-03T01:21:46Z

+phases:
+  - model:
+      name: "Whisper128"
+      path: "models/Whisper128"


main.launch_workers_for_phase() expects phase["model"]["base_dir"] to exist when building buffer_root. This config omits base_dir, so running with it will raise KeyError. Add a base_dir (consistent with other configs like first_batch_pyarrow.yaml).

Suggested change

-      path: "models/Whisper128"
+      path: "models/Whisper128"
+      base_dir: "models/Whisper128"

init whisper

1e3155c

Copilot AI review requested due to automatic review settings April 3, 2026 01:17

Copilot started reviewing on behalf of Stanwang1210 April 3, 2026 01:17

Copilot AI reviewed Apr 3, 2026

Stanwang1210 wants to merge 1 commit into EverydayRespect:mp_storage from Stanwang1210:whisper_mp_storage

2 participants
