Controlling a computer with speech
Vedics
The Vedics [12] program, also written in Python, integrates with the Gnome 2 and Gnome 3 desktop environments as well as Unity. Like Palaver, it comes without any user interface, but it understands only a selection of predefined English speech commands. To start Firefox, for example, you say "Run application," wait a few seconds, and then say "Firefox."
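The two-step dialog described above can be pictured with a small Python sketch. This is a hypothetical illustration, not code from Vedics: the class name, the application table, and the shell commands it maps to are all assumptions made for the example.

```python
# Hypothetical illustration only: none of this is taken from the Vedics source.
import shlex

# Assumed mapping from spoken application names to shell commands.
APPLICATIONS = {
    "firefox": "firefox",
    "terminal": "gnome-terminal",
}


class CommandSession:
    """Tracks whether the next phrase is a command or an application name."""

    def __init__(self):
        self.awaiting_app = False

    def handle(self, phrase):
        """Return the argument list to launch, or None if nothing should start."""
        text = phrase.strip().lower()
        if self.awaiting_app:
            # The previous phrase was "Run application", so this one names the app.
            self.awaiting_app = False
            command = APPLICATIONS.get(text)
            return shlex.split(command) if command else None
        if text == "run application":
            self.awaiting_app = True
        return None


session = CommandSession()
first = session.handle("Run application")  # arms the session, launches nothing
second = session.handle("Firefox")         # resolves to the assumed command
print(first, second)
```

A real front end would pass the returned argument list to something like subprocess.Popen; the sketch stops short of that so it can run without the applications installed.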
The program uses PocketSphinx as its engine, and its recognition accuracy is poor. In our test, for example, it interpreted "Move down" as "Minimize window." It also used a lot of processing power on our test system, reacted to faint background noise, and crashed repeatedly.
To try Vedics, you can install the ready-made DEB package [13] from SourceForge, where you can also download a PDF describing all the commands. Alternatively, you can download the tar archive, unpack it, and install the program using ./configure && make && sudo make install. In either case, you should start the program by typing vedics in a terminal. Only then can you see what text the program recognizes and whether it has crashed (Figure 7).
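If you would rather not keep an eye on the terminal yourself, a short wrapper script can echo the program's output and report an abnormal exit. The following Python sketch shows one way to do this; the function name run_and_watch is made up for the example, and the demo starts a stand-in child process rather than Vedics itself so that it runs anywhere.

```python
# A minimal sketch; run_and_watch is a hypothetical helper, not part of Vedics.
import subprocess
import sys


def run_and_watch(argv):
    """Run a command, echo each output line, and return (lines, exit_status)."""
    lines = []
    with subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge error messages into the same stream
        text=True,
    ) as proc:
        for line in proc.stdout:
            line = line.rstrip("\n")
            print(line)
            lines.append(line)
    # Leaving the with block waits for the process, so returncode is set here.
    return lines, proc.returncode


# Stand-in child process that prints one line and exits with status 1.
# To watch Vedics instead, replace demo with ["vedics"], assuming the
# vedics command is on your PATH.
demo = [sys.executable, "-c", "print('stand-in output line'); raise SystemExit(1)"]
output, status = run_and_watch(demo)
if status != 0:
    # On Linux, a negative status means the process was killed by a signal.
    print(f"process ended abnormally (exit status {status})")
```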
Conclusion
None of the candidates could compete with Siri or the commercial Windows programs. Speech recognition turned out to be a matter of luck, and the PocketSphinx engine [14] used by most of the programs lagged far behind the commercial Dragon NaturallySpeaking.
Installation proved rocky because of all the required dependencies, and the operating concept was often cumbersome. Disabled users in particular would have a hard time getting the programs to work without assistance.
Palaver proved to have the best speech recognition capabilities, but it is inextricably linked with Google. The huge range of functions Simon provides can only be tapped with a massive amount of configuration, if you can get the program to work at all. Blather and FreeSpeech seem incomplete, and Vedics proved altogether useless in its current state. FreeSpeech at least lets you dictate English text, provided you don't mind extensively reworking the results.
Given the slow pace of work on the programs and engines, controlling the PC via speech may remain wishful thinking for Linux users for some time.
Infos
[1] Blather: http://www.jezra.net/projects/blather
[2] Blather repository: https://gitorious.org/blather
[3] Sphinx Knowledge Base Tool: http://www.speech.cs.cmu.edu/tools/lmtool-new.html
[4] CMU-Cam Toolkit: http://www.speech.cs.cmu.edu/SLM/CMU-Cam_Toolkit_v2.tar.gz
[5] FreeSpeech: http://code.google.com/p/freespeech-vr/downloads/list
[6] Palaver: http://sourceforge.net/projects/palaver/
[7] Palaver repository: https://github.com/JamezQ/Palaver
[8] Simon: http://simon-listens.org/index.php?id=396&L=1
[9] Julius: http://julius.sourceforge.jp/en_index.php
[10] Simon Listens: http://www.simon-listens.org/index.php?id=122&L=1
[11] Simon handbook: http://userbase.kde.org/Simon/Handbook
[12] Vedics: http://vedics.sourceforge.net/
[13] Vedics repository: http://sourceforge.net/projects/vedics/files/vedics/
[14] PocketSphinx: http://cmusphinx.sourceforge.net/wiki/download