What is the best option for publishing the presentation that I am working on?
Building presentations is a core activity in many businesses, including eLearning. Many presentations are built to be used in a live situation, be it a lecture, a meeting, a conference or a webinar, but we may occasionally want to publish and share presentations on the web.
When publishing such a presentation, we all face a decision. Should we publish only the pictures by themselves (the “silence” alternative), pick up a microphone and record our own voice comments, write a script and hire a professional voiceover, or use text-to-speech?
When this choice is discussed in various forums on the web, the discussion quickly turns into a debate about the very idea of replacing the human voice with a computer-generated voice, and whether TTS is good enough to replace human speakers. It is not surprising that people react strongly when it comes to something we are as emotionally attached to as our voice. What I find striking is that we tend to see text-to-speech as a substitute, and often fail to see it as yet another option at our disposal when publishing a presentation, alongside silence, our own voice and a professional voice talent.
The question of whether we should replace the human voice with text-to-speech is the wrong question. The real question is: what is the best option for publishing the presentation that I am working on?
Let’s go back to the choice that we, as authors of the presentation, are facing. A silent presentation may be a valid and straightforward option if the presentation is completely self-explanatory and does not need any additional comment. Recording the voiceover ourselves may be a valid option if we have the time, a voice of sufficient quality, and the technical know-how and tools needed for the task. A voice talent may be a splendid alternative if we have the budget, the time and the logistics to select a professional and have them record a voiceover for our presentation.
But what are the conditions in which text-to-speech may be a proper choice, if not even the best one?
Well, the obvious answer is that we may need text-to-speech when none of the other three options is available, for one reason or another. But there are other, more interesting situations where text-to-speech is not only acceptable but could even be our best option.
- When working with presentations that need to be updated often, human voiceovers may be difficult or even impossible to use. By using text-to-speech, updating the voiceover for a presentation is as easy as editing text.
- When working on multilingual material, we might not have the budget or the logistics to get good speakers in all languages. With text-to-speech we only need to get our text translated into the target languages, which is a much easier task. We might even adopt a mixed solution (human voiceover for some languages, TTS for others).
- When we need to be able to publish quickly and 24/7, text-to-speech is always available.
- When we need to publish a large library of presentations, text-to-speech can work faster than real time, meaning that we can produce several hours of audio in just a few minutes.
- When we want to use several voice characters in our presentation, the complexity and the budget needed for a voiceover project might reach Hollywood proportions. With text-to-speech, using several voices is as easy as using only one.
- Text-to-speech can also be used to build a prototype of a presentation, testing the script and the way pictures and words go together, before calling in a professional voiceover for the final take.
State-of-the-art text-to-speech has become more expressive and is available in many languages, with several voices for each language, as you can hear in this sample presentation of English voices.
SlideTalk: introducing 13 of our English voices, for your talking presentations
The availability of many voices opens up the possibility of alternating different voices in the same presentation, which may help increase audience engagement. There are many other ways to improve how TTS is used. As with any tool, we need to master its strengths and weaknesses to get the most out of it.
The SlideTalk web service has been built to make it easy to add a text-to-speech voiceover to presentations, by hiding all technical aspects and allowing us to focus on choosing pictures and typing descriptions, while everything else is taken care of automatically. The result is a YouTube video that is easy to share. This is called the show, describe, share method. SlideTalk integrates high-quality text-to-speech in more than 20 languages and with more than 70 voices.
In conclusion, as eLearning professionals we need to build an arsenal of tools and the competence to choose which ones are best suited to each project. When it comes to publishing presentations online, voice talents, home-made recordings, text-to-speech voiceovers and silent, self-explanatory pictures are all tools at our disposal. We must learn how to use each one of them.
Paolo Leva. 15 years of experience with speech technology, first as a developer and then, since 2004, as a Product Manager. Currently co-founder of a startup called slidetalk.net, a cloud service that converts presentations into talking presentations using TTS and publishes them on the web. Always interested in finding new ways to use existing things, and in finding new perspectives on how technology is perceived, marketed, purchased and consumed. Amateur musician in his spare time, and proud father 24/7. Website: www.slidetalk.net