This is the first workable version of my voice-triggered search bot. I coded it in Python 3.4. If I keep working on it, it can become much more than a search bot. I could keep adding functions and features and make it respond to numerous commands. Some may include:
– doing backups
– running searches on Google Scholar and PubMed and speaking the results
– doing video search
– sending reminders to my phone
– checking my email and speaking notifications
– sending notes to cloud accounts
– searching for files on my computer
– reading from Wikipedia
– and so on.
It currently responds to only two commands:
– “search for”, which returns a Google search for the desired search term.
– “stop listening”, which is a shutdown signal.
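The two commands above could be told apart with a small parsing step like the one below. This is only a sketch, not the bot's actual code: the names `parse_command` and `build_search_url` are made up for illustration, and it assumes the speech recognizer has already returned the spoken phrase as plain text.

```python
from urllib.parse import quote_plus

SEARCH_PREFIX = "search for"
STOP_PHRASE = "stop listening"


def parse_command(heard):
    """Turn recognized speech into a (command, argument) pair.

    Hypothetical helper: returns ("search", term), ("stop", None),
    or (None, None) when the phrase is not a known command.
    """
    text = heard.strip().lower()
    if text == STOP_PHRASE:
        return ("stop", None)
    # Require a space after the prefix so "search fortune" is not a match.
    if text.startswith(SEARCH_PREFIX + " "):
        term = text[len(SEARCH_PREFIX):].strip()
        if term:
            return ("search", term)
    return (None, None)


def build_search_url(term):
    """Build a Google search URL for the given term (URL-encoded)."""
    return "https://www.google.com/search?q=" + quote_plus(term)
```

With this in place, the "search" branch would only need to pass `build_search_url(term)` to `webbrowser.open`, and any new command becomes one more branch in `parse_command`.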
The only limitation to what it can be programmed to do lies within the imagination of the coder. I can make it respond to numerous simple and sophisticated commands. It only takes time and heavy testing to expand and improve it.
But what is the utility of this banal code in light of existing programs such as Windows Speech Recognition, Dragon, and the like?
For me, Windows’ speech recognition system (which I tried) is not an option because:
– It is not optimized for search.
– It has trouble recognizing some words. Mine uses Google’s Speech API, which I think recognizes words better than Windows’ does.
– It has limited functionality. I can add as many personalized functions and commands to mine as I like, the only limitation being my coding skills.
Because this bot uses Google’s Speech API and currently focuses on web search, it needs an internet connection to work. However, if I want, I could make it work offline by using other Python speech recognition libraries.
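One way the offline option could work is to try the online recognizer first and fall back to an offline one when the network is unreachable. The sketch below shows the idea with plain callables, so it runs without a microphone. `recognize_with_fallback` is a hypothetical helper, and the generic `ConnectionError` is an assumption. With the speech_recognition library, the two callables could plausibly be `Recognizer.recognize_google` and `Recognizer.recognize_sphinx` (the latter needs PocketSphinx installed), and the error to catch would be that library's `RequestError`.

```python
def recognize_with_fallback(audio, recognizers, network_errors=(ConnectionError,)):
    """Try each recognizer in order and return the first transcript obtained.

    recognizers: list of callables that take the audio and return text.
    network_errors: exception types meaning "engine unreachable"
    (an assumption; pass the speech library's own error type in practice).
    Returns None if every engine was unreachable.
    """
    for recognize in recognizers:
        try:
            return recognize(audio)
        except network_errors:
            continue  # this engine is unreachable, try the next one
    return None
```

Putting the online engine first keeps the better recognition when a connection is available and only pays the accuracy cost of the offline engine when it is not.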
Before developing it further, I need to make it faster.
Another potential drawback is that it has no GUI. For me this is not a problem since I don’t need to interact with it. Its purpose is to run in the background and listen to my commands. I don’t need a graphical front end for the code. A GUI would be unnecessary and distracting at the moment. However, I remain open to the thought of adding a graphical interface once other, more important aspects of the code are dealt with.
Long story short, if you want to try this out, assuming you have some knowledge of programming, you need to have Python 3.4 installed on your system.
My development environment is 64-bit Windows 8. I have not tested this on other systems, so I cannot say whether it works there.
After installing Python 3.4, make sure you get these libraries (modules, packages): speech_recognition, pyttsx, and pygame. The code also uses webbrowser, which is part of Python’s standard library, so there is nothing extra to install for it. Here’s the code:
print("Listening to the magic words: 'search for' or 'stop listening'")
eng.say("I'm sorry, I couldn't reach google")
print("I'm sorry, I couldn't reach google")
Please send me suggestions on how this can be improved in terms of code economy and efficiency, and in terms of functionality, usability, and practicality. Whatever comes to your mind, leave me a message below. I am short on inspiration…