The speech synthesis activities at the (Experimental Phonetics) concentrate on various linguistic and application-oriented aspects of speech synthesis. Our goal is to achieve natural-sounding and linguistically motivated speech synthesis.
The speech synthesis system developed at the IMS is the system (). It is based on the original speech synthesis framework developed at CSTR, University of Edinburgh. The current voice of our system uses diphones taken from the project.
Cepstral is a commercial text-to-speech engine that is installed on the Pi and does not require an Internet connection. The voices are higher quality than open source solutions, and pricing depends on the use case. More information is available on their website:
Espeak is a more modern speech synthesis package than Festival. It sounds clearer but does wail a little. If you are making an alien or an RPi witch then it’s the one for you! Seriously, it is a good all-rounder with great customisation options.
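To give a feel for those customisation options, here is a minimal Python sketch that builds an espeak command line with a chosen voice, speed, pitch and volume. It assumes the `espeak` command-line tool is installed and uses its `-v`, `-s`, `-p` and `-a` options. The function names `build_espeak_command` and `speak` are made up for this illustration, and the default values are only starting points to experiment with.

```python
import shutil
import subprocess


def build_espeak_command(text, voice="en", speed=175, pitch=50, amplitude=100):
    """Return the argument list for one espeak call (hypothetical helper).

    voice     -> -v  voice name, e.g. "en" or a variant such as "en+f3"
    speed     -> -s  speaking rate in words per minute
    pitch     -> -p  pitch, 0 to 99
    amplitude -> -a  volume, 0 to 200
    """
    if not 0 <= pitch <= 99:
        raise ValueError("pitch must be between 0 and 99")
    if not 0 <= amplitude <= 200:
        raise ValueError("amplitude must be between 0 and 200")
    return [
        "espeak",
        "-v", voice,
        "-s", str(speed),
        "-p", str(pitch),
        "-a", str(amplitude),
        text,
    ]


def speak(text, **options):
    """Speak the text if espeak is on the PATH; return True when it ran."""
    if shutil.which("espeak") is None:
        return False  # espeak is not installed, so there is nothing to run
    subprocess.run(build_espeak_command(text, **options), check=True)
    return True


if __name__ == "__main__":
    # A slow, high-pitched voice: try different numbers to find your alien.
    speak("Take me to your leader", voice="en+f3", speed=120, pitch=80)
```

Passing the arguments as a list rather than one shell string means the text does not need any quoting, which helps when the sentence contains apostrophes.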
Google’s Text to Speech engine is a little different to Festival and Espeak. Your text is sent to Google’s servers to generate the speech file which is then returned to your Pi and played using mplayer. This means you will need an internet connection for it to work, but the speech quality is superb.
We provide an online demo where you can try the synthesis. The demo provides diphone synthesis with synthesis modules developed at IMS. Not all of these are contained in our open source version; some modules in the open source version are simpler than these, so the synthesis results from the open source version may sound different from what you get from the demo.
EXTRA: Dan Fountain improved on the above script to speak any length of text (Google limits you to 100 bytes normally). His excellent easy-to-read webpage describes this at
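
One general way to get around a per-request size limit like this is to cut the text into pieces that each fit, breaking at spaces so words stay whole, and then send the pieces one after another. The Python sketch below only illustrates that idea and is not Dan Fountain's script. The helper names are hypothetical, and the 100-byte default is taken from the limit mentioned above, measured here as UTF-8 bytes (an assumption).

```python
def byte_len(s):
    """Length of a string in UTF-8 bytes."""
    return len(s.encode("utf-8"))


def split_long_word(word, limit):
    """Cut a single over-long word into pieces of at most `limit` bytes."""
    pieces, piece = [], ""
    for ch in word:
        if piece and byte_len(piece + ch) > limit:
            pieces.append(piece)
            piece = ch
        else:
            piece += ch
    if piece:
        pieces.append(piece)
    return pieces


def split_for_tts(text, limit=100):
    """Split text into chunks of at most `limit` UTF-8 bytes, breaking at spaces."""
    if limit < 4:
        # A single UTF-8 character can take up to 4 bytes.
        raise ValueError("limit must be at least 4 bytes")
    chunks, current = [], ""
    for word in text.split():
        for part in split_long_word(word, limit):
            candidate = part if not current else current + " " + part
            if byte_len(candidate) <= limit:
                current = candidate      # the word still fits in this chunk
            else:
                chunks.append(current)   # chunk is full, start a new one
                current = part
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    long_text = "The quick brown fox jumps over the lazy dog. " * 5
    for chunk in split_for_tts(long_text):
        print(byte_len(chunk), chunk)
```

Each chunk could then be passed to the speech engine in turn. Splitting at sentence ends instead of plain spaces would probably sound more natural, at the cost of slightly more code.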