Hi everyone,
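While looking at the code in `inference.py`, I noticed that the last audio chunk appears to be sent to the inference engine twice: once inside the loop and then again in the following `if` block. The loop already covers the final (possibly shorter) chunk, and once it finishes, `i` still holds that chunk's start index. As a result, `i < len(data)` is true for any non-empty input and `data[i:]` is sent a second time.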
```python
def process_audio(
        message_processor: MessageProcessor,
        sample_rate: int,
        data: np.ndarray):
    """
    Stream audio data in fixed-size chunks over a WebSocket connection.

    Args:
        message_processor (MessageProcessor): class that processes the audio chunks.
        sample_rate (int): Audio sample rate (Hz).
        data (np.ndarray): Audio samples as int16 array.
    """
    # speech_chunk_size is expressed in seconds, so the number of samples corresponding to
    # one speech chunk is the following
    samples_per_chunk = int(
        sample_rate * message_processor.speech_processor.speech_chunk_size)
    i = 0
    for i in range(0, len(data), samples_per_chunk):
        output = message_processor.process_speech(data[i:i + samples_per_chunk].tobytes())
        LOGGER.debug(f"response: {output}")
    # send last part of the audio
    if i < len(data):
        output = message_processor.process_speech(data[i:].tobytes())
        LOGGER.debug(f"response: {output}")
```
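To make the duplication concrete, here is a minimal reproduction sketch. It copies only the loop structure of `process_audio`. `RecordingProcessor` is a hypothetical stub standing in for `MessageProcessor`, `samples_per_chunk` is passed in directly instead of being derived from the sample rate, and the standard-library `array` type stands in for `np.ndarray` (it slices the same way and also provides `tobytes()`).

```python
from array import array


class RecordingProcessor:
    """Hypothetical stand-in for MessageProcessor: records every chunk it receives."""

    def __init__(self):
        self.chunks = []

    def process_speech(self, chunk: bytes) -> None:
        self.chunks.append(chunk)


def process_audio_current(processor, samples_per_chunk, data):
    """Same loop structure as the current process_audio (logging omitted)."""
    i = 0
    for i in range(0, len(data), samples_per_chunk):
        processor.process_speech(data[i:i + samples_per_chunk].tobytes())
    # After the loop, i is still the start index of the final chunk,
    # so this condition holds for any non-empty input.
    if i < len(data):
        processor.process_speech(data[i:].tobytes())


data = array("h", range(10))  # 10 int16 samples: 0..9
processor = RecordingProcessor()
process_audio_current(processor, 4, data)

# Decode the recorded bytes back into int16 samples for inspection.
received = [list(array("h", chunk)) for chunk in processor.chunks]
print(received)
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9], [8, 9]]
```

The final chunk `[8, 9]` shows up twice. The same thing happens when the length is an exact multiple of the chunk size: the last full chunk is then the one that gets repeated.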
It seems that simplifying the loop slightly might avoid this behavior. For example:
```python
def process_audio(
    message_processor: MessageProcessor,
    sample_rate: int,
    data: np.ndarray,
):
    """
    Stream audio data in fixed-size chunks over a WebSocket connection.
    """
    samples_per_chunk = int(
        sample_rate * message_processor.speech_processor.speech_chunk_size
    )
    for start in range(0, len(data), samples_per_chunk):
        chunk = data[start:start + samples_per_chunk]
        if len(chunk) == 0:
            continue
        output = message_processor.process_speech(chunk.tobytes())
        LOGGER.debug(f"response: {output}")
```
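As a quick sanity check on the single-loop approach, the sketch below verifies that stepping through `range(0, len(data), samples_per_chunk)` covers every sample exactly once, including a shorter final chunk. `iter_chunks` is a hypothetical helper written for this check, not something in the repository, and `array` from the standard library stands in for `np.ndarray`.

```python
from array import array


def iter_chunks(data, samples_per_chunk):
    """Yield consecutive slices of data; only the last one may be shorter."""
    if samples_per_chunk <= 0:
        # range() raises ValueError for a zero step; fail with a clearer message.
        raise ValueError("samples_per_chunk must be positive")
    for start in range(0, len(data), samples_per_chunk):
        yield data[start:start + samples_per_chunk]


data = array("h", range(10))  # 10 int16 samples: 0..9
chunks = list(iter_chunks(data, 4))

# Concatenating the chunk bytes should reproduce the input exactly.
rejoined = b"".join(chunk.tobytes() for chunk in chunks)
assert rejoined == data.tobytes()
assert [len(chunk) for chunk in chunks] == [4, 4, 2]

# Empty input produces no chunks at all.
assert list(iter_chunks(array("h"), 4)) == []
```

Since `range` never yields a start index at or beyond `len(data)`, no slice is ever empty, so the `len(chunk) == 0` guard is purely defensive and could be dropped.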
Could you please take a look and, if possible, fix this?
Code reference: simulstream/simulstream/inference.py, lines 40 to 63 in 8a14aa1