feat: Add speech-to-text implementation with Whisper #1

Open
sakshammaurya wants to merge 3 commits into ChannelBlend:main from sakshammaurya:candidate-saksham_maurya

@sakshammaurya

I added a speech-to-text implementation with Whisper. The sample data was not working, so I added newly recorded audio that works for me.

@sakshammaurya
Author

I’m excited to share that I’ve finished all the tasks for the assignment! Please let me know if there’s anything that could use some extra polishing. I'm eager to make it even better!

@bhushan-nitish
Collaborator

Hey @sakshammaurya, thanks for the submission. We will get back to you soon.

@bhushan-nitish
Collaborator

Hey @sakshammaurya

Great work on your submission! Here are a few refinements that should be achievable within a 2–3 hour effort and can further polish your solution:

Global Denoiser Initialization:
Instead of initializing DeepFilterNet for every request, consider initializing it once at startup (assuming thread safety) and reusing the instance across requests. This can reduce overhead and improve performance during noise reduction.
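A minimal sketch of the initialize-once idea. `FakeDenoiser` is a hypothetical placeholder for the real model object, and `get_denoiser` and `handle_request` are illustrative names, not taken from the submission. The lock only guards construction; whether one shared instance can safely serve concurrent calls still has to be checked against the denoiser itself.

```python
import threading


class FakeDenoiser:
    """Hypothetical stand-in for an expensive-to-load denoising model."""

    instances_created = 0

    def __init__(self):
        # In the real service this is where the slow model load would happen.
        FakeDenoiser.instances_created += 1

    def enhance(self, samples):
        # Placeholder: returns the samples unchanged.
        return list(samples)


_denoiser = None
_denoiser_lock = threading.Lock()


def get_denoiser():
    """Return the shared denoiser, creating it on first use only."""
    global _denoiser
    if _denoiser is None:
        with _denoiser_lock:
            # Re-check inside the lock so two threads cannot both construct it.
            if _denoiser is None:
                _denoiser = FakeDenoiser()
    return _denoiser


def handle_request(samples):
    """Illustrative request handler that reuses the shared instance."""
    return get_denoiser().enhance(samples)
```

Calling `get_denoiser()` once from the application's startup hook would move the load cost out of the first request.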

Redis Updates for Autocomplete:
Currently, your autocomplete endpoint reads from Redis. Enhancing this by updating search counts after each successful transcription would provide more dynamic and accurate ranking based on actual user interactions.
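One possible shape for this, sketched with an in-memory stand-in so it runs without a Redis server. The stand-in exposes a `zincrby(name, amount, value)` method with the same argument order as redis-py's sorted-set call. The key name, `record_transcription`, and the `top_with_prefix` helper are assumptions for illustration only.

```python
from collections import defaultdict

SEARCH_COUNTS_KEY = "autocomplete:search_counts"  # hypothetical key name


class InMemorySortedSets:
    """Test stand-in mimicking the zincrby call of a Redis client."""

    def __init__(self):
        self._sets = defaultdict(dict)

    def zincrby(self, name, amount, value):
        scores = self._sets[name]
        scores[value] = scores.get(value, 0.0) + amount
        return scores[value]

    def top_with_prefix(self, name, prefix, limit=5):
        # Illustrative helper, not a Redis command: highest count first,
        # ties broken alphabetically.
        matches = [(v, s) for v, s in self._sets[name].items() if v.startswith(prefix)]
        matches.sort(key=lambda pair: (-pair[1], pair[0]))
        return [value for value, _ in matches[:limit]]


def record_transcription(client, text):
    """Bump the search count for a successfully transcribed query."""
    query = text.strip().lower()
    if query:
        client.zincrby(SEARCH_COUNTS_KEY, 1, query)
    return query
```

Because `record_transcription` only depends on a `zincrby` method, a real Redis client could be passed in place of the stand-in.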

Enhanced Error Handling:
While your error handling is solid, you could refine it further by adding more granular error messages and structured logging. This will help in debugging and make the service more robust in production.
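As a rough illustration using only the standard library: an error type that carries a machine-readable code, and a formatter that emits one JSON object per log line. `TranscriptionError`, `validate_audio`, the error codes, and the size limit are all hypothetical names and values, not part of the submitted code.

```python
import json
import logging


class TranscriptionError(Exception):
    """Error with a stable code that clients and logs can key on."""

    def __init__(self, code, detail):
        super().__init__(detail)
        self.code = code
        self.detail = detail


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # "context" arrives through logging's extra= mechanism, if provided.
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def validate_audio(data, max_bytes=10_000_000):
    """Raise a specific error for each way the upload can be invalid."""
    if not data:
        raise TranscriptionError("empty_audio", "The uploaded audio is empty.")
    if len(data) > max_bytes:
        raise TranscriptionError("audio_too_large", f"Audio exceeds {max_bytes} bytes.")
```

With the formatter attached to a handler, a call such as `logger.error("transcription failed", extra={"context": {"request_id": "abc"}})` would carry the request ID into the JSON line.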

WebSocket Message Structure:
Instead of sending transcription text and autocomplete suggestions as separate messages, consider combining them into a single, structured payload (e.g., a JSON object). This approach simplifies client-side processing and improves the clarity of your real-time communication.
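For example, a combined payload could be built like this. The field names (`type`, `request_id`, `transcript`, `suggestions`) and the function name are assumptions; the point is only that the client receives one message it can parse in a single step.

```python
import json


def build_result_message(transcript, suggestions, request_id=None):
    """Serialize transcription text and suggestions into one JSON message."""
    return json.dumps(
        {
            "type": "transcription_result",  # hypothetical message type
            "request_id": request_id,
            "transcript": transcript,
            "suggestions": list(suggestions),
        }
    )
```

The resulting string can be sent as a single WebSocket text frame, and the `type` field leaves room for other message kinds, such as errors, on the same connection.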

Concurrency and Performance Considerations:
Evaluate how your endpoints handle multiple simultaneous requests. You might explore options like async task queues or rate limiting to ensure the service scales gracefully under load.
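One lightweight option, short of a full task queue, is to cap how many transcriptions run at once and to keep the blocking work off the event loop. The sketch below assumes Python 3.9+ for `asyncio.to_thread`; `blocking_transcribe` is a hypothetical stand-in for the model call, and the limit of 2 is arbitrary.

```python
import asyncio
import time

MAX_CONCURRENT_TRANSCRIPTIONS = 2  # arbitrary limit for illustration


def blocking_transcribe(audio):
    """Hypothetical stand-in for a CPU- or GPU-bound transcription call."""
    time.sleep(0.05)
    return f"transcript of {len(audio)} bytes"


async def transcribe_limited(semaphore, audio):
    # At most MAX_CONCURRENT_TRANSCRIPTIONS callers pass this point at once;
    # the rest wait here without blocking the event loop.
    async with semaphore:
        return await asyncio.to_thread(blocking_transcribe, audio)


async def main():
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_TRANSCRIPTIONS)
    jobs = [transcribe_limited(semaphore, b"x" * n) for n in (1, 2, 3, 4)]
    # gather returns results in the same order as the submitted jobs.
    return await asyncio.gather(*jobs)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

A semaphore bounds load inside one process only; rate limiting per client or an external queue would be separate measures.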

These enhancements are relatively minor but will significantly boost the overall quality, maintainability, and performance of your service. Keep up the excellent work, and we look forward to seeing your updated implementation!

Feel free to choose the refinements that are quickest to develop, and consider timeboxing this effort to 2–3 hours.

CC: @Vishal-CB @yatender-oktalk

@sakshammaurya
Author

Hi,
Apologies for the late response; I was expecting communication via email, so I didn't check your message here. Thank you for your in-depth analysis and suggestions on my code. I have updated my code per your suggestions and committed it to the same branch.

I also wanted to inquire if the position is still open, as I remain very interested in the opportunity.
