Voice Input App in Python – Code Release and Overview [May 2017]

I built the following application primarily for convenience…

There are times when I don’t wanna type on my laptop. And I didn’t know of any general purpose, simple, minimal application that could do voice recognition and text input in the most basic form, as in: listen to my voice, paste what I just said, so I don’t have to type it.

I use this app in social media replies as well as when I post updates on different channels.

And, to be completely honest, there are times when I eat chocolate while watching scientific lectures (positive reinforcement). Some of these lectures spark spontaneous thoughts that I want to share, and the only ‘clean’ way to do it is by voice: one hand is used for chocolate manipulation, while the other handles the mouse.

So, this gave me a solid reason to build this application.

I posted a showcase video of the app a few weeks ago…

In that video I promised a code release and review sometime in the future. And that ‘sometime’ is now.

Voice Input App – The Code

I’ll jump right into it, but first, a few important remarks.

This app uses the powerful Speech Recognition API of Google, which is pretty damn accurate (even when you speak with a mouthful of dark chocolate).

When I click the ‘Speak’ button, it beeps for voice input, then it does its recognition ‘magic’ and delivers my message on the clipboard. Then I can right-click and paste the message from the clipboard – in any field that takes text.

The following Python libraries are needed to build the app:

speech_recognition
pygame
pyperclip

Additionally, you will need to register for API keys with the Google Cloud Speech API. The current free tier’s monthly quota is sufficient for personal use.

  1. Making the necessary imports.
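
A minimal sketch of this step, assuming the usual aliases for each library (the aliases and the helper name missing_modules are my assumptions for illustration, not necessarily what the released code uses). The import lines appear as comments, and the helper checks that everything is installed before the app starts. Note that speech_recognition is published on PyPI as SpeechRecognition, and that its microphone support also needs PyAudio.

```python
import importlib.util

# Typical import lines for this app (the aliases are assumptions):
#   import threading
#   import tkinter as tk
#   from tkinter import ttk
#   import speech_recognition as sr   # PyPI package: SpeechRecognition
#   import pyperclip
#   from pygame import mixer
REQUIRED_MODULES = ("threading", "tkinter", "speech_recognition", "pyperclip", "pygame")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report anything that still has to be installed.
print("Missing modules:", ", ".join(missing_modules()) or "none")
```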

  1. Instantiating a tkinter object, creating a custom title, using a custom icon (instead of the default), and applying a custom tkinter style to the app.
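
Roughly, this step could look like the sketch below. The window title, the icon file name and the "clam" theme are placeholders I picked for illustration; the helper names configure_window and apply_theme are hypothetical as well.

```python
def configure_window(root, title="Voice Input", icon_path=None):
    """Set the window title and, optionally, a custom icon on a Tk root."""
    root.title(title)
    if icon_path:
        # On Windows this expects an .ico file; other platforms differ.
        root.iconbitmap(icon_path)
    return root


def apply_theme(root, theme="clam"):
    """Switch the ttk widgets of this window to one of the built-in themes."""
    # Imported lazily so the sketch loads even where Tk is not installed.
    from tkinter import ttk

    style = ttk.Style(root)
    style.theme_use(theme)
    return style
```

With a real window this would be used as root = tk.Tk(), then configure_window(root, icon_path="mic.ico"), then apply_theme(root).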

  1. We’ll be using a microphone image for the speak button, so we have to load it via tkinter’s PhotoImage class. We’re not gonna use it at full size, but as a scaled-down subsample.
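
A small sketch of the idea, assuming a hypothetical image file and a shrink factor of 4 (both are placeholders, as is the helper name load_mic_icon). PhotoImage.subsample keeps every n-th pixel, so a factor of 4 gives an image about a quarter of the original width and height.

```python
def load_mic_icon(path, factor=4):
    """Load an image file and return a smaller copy for use on a button."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")

    # Imported lazily so the sketch loads even where Tk is not installed.
    import tkinter as tk

    # A Tk root window must already exist when PhotoImage is created.
    full_size = tk.PhotoImage(file=path)
    return full_size.subsample(factor, factor)
```

Keep a reference to the returned image for as long as the button is shown (a module-level variable or an attribute will do); otherwise Python may garbage-collect it and the button ends up blank.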

  1. Creating a guiding ‘label’ widget.
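
Something along these lines would do, with the label wording and the padding values being my own placeholders:

```python
# Hypothetical wording for the guiding label.
GUIDE_TEXT = "Click the microphone, wait for the beep, then speak."


def add_guide_label(parent, text=GUIDE_TEXT):
    """Create a ttk label with short instructions and place it in the window."""
    # Imported lazily so the sketch loads even where Tk is not installed.
    from tkinter import ttk

    label = ttk.Label(parent, text=text, wraplength=220, justify="center")
    label.pack(padx=10, pady=(10, 5))
    return label
```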

  1. This is the main part of the application in which we do the following:

– create a function to handle the click of the speak button
– play sound effect to prompt the user to speak
– initialize the voice recognizer, with some custom parameters
– listen to input
– play sound effect to let the user know that listening has stopped
– input processing (recognition)
– place the recognized message into clipboard
– handling of exceptions (via try – except)
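
The steps above can be sketched as follows. This is an approximation under stated assumptions, not the released code: the sound file names, the parameter values and the status messages are placeholders, and I use recognize_google with its default key here. If you registered your own Google credentials, you would pass them to whichever recognizer method you call (for example the key argument of recognize_google).

```python
def buttonClick(start_sound="start.wav", stop_sound="stop.wav"):
    """Listen once, recognize the speech, and copy the text to the clipboard.

    Returns a short status string. The sound file names are hypothetical.
    """
    # Imported lazily so the sketch loads even where these packages are absent.
    import speech_recognition as sr
    import pyperclip
    from pygame import mixer

    mixer.init()
    recognizer = sr.Recognizer()
    recognizer.pause_threshold = 1.0             # seconds of silence that end a phrase
    recognizer.dynamic_energy_threshold = True   # adapt to background noise

    try:
        with sr.Microphone() as source:
            mixer.Sound(start_sound).play()      # beep: start speaking
            audio = recognizer.listen(source, timeout=5, phrase_time_limit=15)
        mixer.Sound(stop_sound).play()           # beep: listening has stopped
        text = recognizer.recognize_google(audio)  # sends the audio to Google
        pyperclip.copy(text)                     # ready to paste anywhere
        return "Copied: " + text
    except sr.WaitTimeoutError:
        return "No speech detected."
    except sr.UnknownValueError:
        return "Could not understand the audio."
    except sr.RequestError as error:
        return "Recognition service error: {}".format(error)
```

Returning a status string instead of printing makes it easy to show the result in the label later on, if you want that kind of feedback.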

  1. We’ll be using threading to prevent the application from freezing or becoming unresponsive.
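
A minimal sketch of the threading helper. Here thr() takes the function to run as a parameter so that the example is self-contained; that parameter is my assumption, and the original may simply hard-code the call.

```python
import threading


def thr(target):
    """Run target on a background daemon thread and return the Thread object."""
    # daemon=True lets the app exit even if a recognition call is still running.
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    return worker
```

Listening and recognition can block for several seconds, so running them off the main thread keeps the tkinter window responsive in the meantime.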

  1. Then, we’re creating the ‘Speak’ button. When clicked, it calls the thr() threading function, which in turn invokes buttonClick() in a separate thread.
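
The wiring could look roughly like this. The helper names make_command and add_speak_button are hypothetical, and the sketch assumes a thr-style function that accepts the handler to run as its argument.

```python
def make_command(run_in_thread, handler):
    """Return a zero-argument callback suitable for a button's command option."""
    def command():
        return run_in_thread(handler)
    return command


def add_speak_button(parent, image, command):
    """Create the image-only 'Speak' button and place it in the window."""
    # Imported lazily so the sketch loads even where Tk is not installed.
    from tkinter import ttk

    button = ttk.Button(parent, image=image, command=command)
    button.image = image  # keep a reference so the image is not garbage-collected
    button.pack(pady=10)
    return button
```

In the app this would be called as add_speak_button(root, icon, make_command(thr, buttonClick)), so each click starts a fresh worker thread.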

  1. Since the application is minimalist in design, we can ‘force’ it to stay on top of all windows. Finally, we run the mainloop(), which is specific to tkinter.
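
In tkinter terms this comes down to two calls on the root window. A minimal sketch, with run_on_top being a hypothetical helper name:

```python
def run_on_top(root):
    """Keep the window above all others, then hand control to tkinter."""
    root.attributes("-topmost", True)  # stay on top of other windows
    root.mainloop()                    # blocks until the window is closed
```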

Conclusion

You can download the single-piece code and the additional files (icons, sound effects) in the GitHub repository I created for it. I hope it serves you at least as well as it serves me!

