feat: Add speech-to-text implementation with Whisper by sakshammaurya · Pull Request #1 · ChannelBlend/briskk-speech-assignment

sakshammaurya · 2025-03-09T12:26:18Z

I added a speech-to-text implementation with Whisper. The sample data was not working, so I added newly recorded audio that works for me.

sakshammaurya · 2025-03-12T05:47:18Z

I’m excited to share that I’ve finished all the tasks for the assignment! Please let me know if there’s anything that could use some extra polishing. I'm eager to make it even better!

bhushan-nitish · 2025-03-12T06:30:48Z

Hey @sakshammaurya, thanks for the submission. We will get back to you soon.

bhushan-nitish · 2025-03-12T07:12:18Z

Hey @sakshammaurya

Great work on your submission! Here are a few refinements that should be achievable within a 2–3 hour effort and can further polish your solution:

Global Denoiser Initialization:
Instead of initializing DeepFilterNet for every request, consider initializing it once at startup (assuming thread safety) and reusing the instance across requests. This can reduce overhead and improve performance during noise reduction.

Redis Updates for Autocomplete:
Currently, your autocomplete endpoint reads from Redis. Enhancing this by updating search counts after each successful transcription would provide more dynamic and accurate ranking based on actual user interactions.
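One way this could look, assuming search counts live in a Redis sorted set. The key name `search_counts` and the functions `record_search` and `suggest` are hypothetical. The functions accept any client exposing `zincrby` and `zrevrange` with redis-py style arguments; `InMemoryClient` is a small stand-in so the sketch runs without a Redis server. With a real client, members come back as `bytes` unless it was created with `decode_responses=True`.

```python
from typing import Dict, List

SEARCH_COUNTS_KEY = "search_counts"  # hypothetical key name


def record_search(client, transcript: str) -> None:
    """Increment the count for a successfully transcribed query."""
    query = transcript.strip().lower()
    if query:
        client.zincrby(SEARCH_COUNTS_KEY, 1, query)


def suggest(client, prefix: str, limit: int = 5) -> List[str]:
    """Return the most searched queries that start with `prefix`."""
    prefix = prefix.strip().lower()
    # Highest score first; filtering by prefix happens in Python.
    ranked = client.zrevrange(SEARCH_COUNTS_KEY, 0, -1)
    return [q for q in ranked if q.startswith(prefix)][:limit]


class InMemoryClient:
    """Stand-in for a Redis client, for illustration and tests only."""

    def __init__(self) -> None:
        self._sets: Dict[str, Dict[str, float]] = {}

    def zincrby(self, name: str, amount: float, value: str) -> float:
        scores = self._sets.setdefault(name, {})
        scores[value] = scores.get(value, 0) + amount
        return scores[value]

    def zrevrange(self, name: str, start: int, end: int) -> List[str]:
        scores = self._sets.get(name, {})
        ranked = sorted(scores, key=lambda m: (scores[m], m), reverse=True)
        stop = None if end == -1 else end + 1
        return ranked[start:stop]
```

Calling `record_search` right after each successful transcription keeps the ranking tied to real usage. Fetching the whole set and filtering in Python is fine for a small vocabulary; a larger one would likely need a per-prefix index.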

Enhanced Error Handling:
While your error handling is solid, you could refine it further by adding more granular error messages and structured logging. This will help in debugging and make the service more robust in production.
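A rough sketch of what more granular errors and structured logging could mean here, using only the standard library. `JsonFormatter`, `TranscriptionError`, `validate_audio`, the error codes, and the 10 MiB limit are all illustrative assumptions, not part of the submitted code.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional request context passed via `extra={"context": {...}}`.
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


class TranscriptionError(Exception):
    """Error with a machine-readable code and a human-readable detail."""

    def __init__(self, code: str, detail: str) -> None:
        super().__init__(detail)
        self.code = code
        self.detail = detail


def validate_audio(data: bytes, max_bytes: int = 10 * 1024 * 1024) -> None:
    """Reject obviously bad uploads with a specific error code."""
    if not data:
        raise TranscriptionError("empty_audio", "The upload contains no audio data.")
    if len(data) > max_bytes:
        raise TranscriptionError(
            "audio_too_large", f"The upload exceeds {max_bytes} bytes."
        )


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("stt")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
```

An endpoint could catch `TranscriptionError`, log it with `logger.warning(..., extra={"context": {"code": err.code}})`, and return `err.code` and `err.detail` to the client, so both logs and responses distinguish failure modes.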

WebSocket Message Structure:
Instead of sending transcription text and autocomplete suggestions as separate messages, consider combining them into a single, structured payload (e.g., a JSON object). This approach simplifies client-side processing and improves the clarity of your real-time communication.
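As an illustration, the combined payload could be a small dataclass serialized to JSON. The field names `type`, `transcript`, and `suggestions` are assumptions; the real schema is up to the service and its clients.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TranscriptionMessage:
    """One message carrying both the transcript and its suggestions."""

    transcript: str
    suggestions: List[str] = field(default_factory=list)
    type: str = "transcription"  # lets clients dispatch on message kind

    def to_json(self) -> str:
        return json.dumps(asdict(self))


message = TranscriptionMessage("red shoes", ["red shoes", "red dress"])
payload = message.to_json()
```

In a FastAPI WebSocket handler this string could be sent with `await websocket.send_text(payload)`, so the client parses one message per update instead of correlating two.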

Concurrency and Performance Considerations:
Evaluate how your endpoints handle multiple simultaneous requests. You might explore options like async task queues or rate limiting to ensure the service scales gracefully under load.
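A lightweight option short of a full task queue is to cap how many transcriptions run at once and push the blocking work onto worker threads. The sketch below assumes Python 3.9+ for `asyncio.to_thread`; `BoundedRunner` and `fake_transcribe` are hypothetical names, and `fake_transcribe` only simulates blocking model inference.

```python
import asyncio
import time
from typing import Any, Callable, Optional


class BoundedRunner:
    """Run blocking functions in threads, at most `max_concurrent` at a time."""

    def __init__(self, max_concurrent: int) -> None:
        self._max = max_concurrent
        self._sem: Optional[asyncio.Semaphore] = None
        self.active = 0
        self.peak = 0  # highest number of simultaneous jobs observed

    async def run(self, func: Callable[..., Any], *args: Any) -> Any:
        # Create the semaphore lazily so it belongs to the running event loop.
        if self._sem is None:
            self._sem = asyncio.Semaphore(self._max)
        async with self._sem:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                # Keep the event loop free while the blocking call runs.
                return await asyncio.to_thread(func, *args)
            finally:
                self.active -= 1


def fake_transcribe(audio: bytes) -> str:
    """Stand-in for blocking model inference."""
    time.sleep(0.01)
    return f"{len(audio)} bytes"


async def transcribe_many(runner: BoundedRunner, chunks: list) -> list:
    return await asyncio.gather(*(runner.run(fake_transcribe, c) for c in chunks))
```

Requests beyond the cap wait for a free slot rather than failing; rejecting them instead (rate limiting) or moving work to an external queue are alternative designs with different trade-offs.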

These enhancements are relatively minor but will significantly boost the overall quality, maintainability, and performance of your service. Keep up the excellent work, and we look forward to seeing your updated implementation!

Feel free to choose whichever refinements are quickest to develop, and consider timeboxing this effort to 2–3 hours.

CC: @Vishal-CB @yatender-oktalk

sakshammaurya · 2025-04-12T08:56:16Z

Hi,
Apologies for the late response; I was expecting communication via email, so I didn't check your message here. Thank you for your in-depth analysis and suggestions on my code. I have updated my code per your suggestion and committed it to the same branch.

I also wanted to inquire if the position is still open, as I remain very interested in the opportunity.

sakshammaurya added 2 commits March 9, 2025 17:48

feat: Add speech-to-text implementation with Whisper

48a51f4

feat:Added Noise reduction and search suggestion

c5dd10b

Optimize FastAPI speech-to-text service

35f05b5

sakshammaurya wants to merge 3 commits into ChannelBlend:main from sakshammaurya:candidate-saksham_maurya


2 participants
