Using text-to-speech (TTS) software to generate subliminal or other types of audio messages requires a voice engine and an application program that controls it. Many TTS programs are available, so it is important to choose one that suits your needs. This article describes the characteristics these programs can have and provides guidelines to aid in the selection process.

A text-to-speech application program allows the input of text and directs a voice engine in how the text is to be converted into audio. Although much simpler and smaller than a voice engine, it is just as important. The programs available can vary greatly with respect to the features they provide.

These features can be grouped into five categories: Cost, System Platform, User Interface, Voice Engine Controls and Audio Output Format.

Cost Considerations

One of the most important considerations is the cost of purchasing a TTS application. A few programs are free of charge, but most cost about $30 USD on average. Price is not a reliable guide to quality, because the features provided can differ substantially at any price.

A TTS program that is free of charge can actually have better features than one that must be purchased. The reverse is also true: a program that you pay for may be less appealing or may lack some desired characteristics.

Because text-to-speech technology requires both a voice engine and a front-end application, some voice suppliers will bundle their voices with a TTS application. Generally, this means that if you purchase their voice (or voices), the program may be included at no additional cost.

This is a benefit only if you are also looking for a suitable premium voice engine. If you have already decided on a particular voice engine and it includes a TTS program, then you may be able to save some money. But you should be certain that the voice offered is suitable for your needs.

One of the most popular TTS applications is TextAloud (Windows only). It can be purchased separately or may be offered as a bonus when purchasing certain voices. If you find a voice that you prefer and TextAloud is included with it, then it is suggested that you consider purchasing it, provided your operating system is a recent version of Microsoft Windows.

If you already have a suitable voice engine and only require a TTS program, then you should consider DSpeech. It is completely free of charge, does not include annoying nag messages, has no spyware or licensing fees, and has many useful features. Regrettably, DSpeech is presently available only for the Windows platform. If you find it helpful, you should consider making a donation to the author, Dimitrios Coutsoumbas.

System Platform Concerns

Depending on which operating system your computer uses, you may have more or fewer options available to you. There are more TTS programs provided for the Windows platform than for any other. This does not mean that there are no options available for Mac OS or Linux, only that the choices can be limited.

Ideally, the voice engine chosen and the TTS program selected should both be available for all popular operating systems. Unfortunately, there are few voice suppliers that provide different voices and TTS software that work across all computer platforms.

One vendor that provides both is Cepstral, whose voices are compatible with most Windows versions, Mac OS, Linux, and other operating systems. As a bonus, Cepstral provides a TTS command-line interface program that works almost identically on each platform, and also a Windows version with a graphical user interface, known as SwiftTalker. Both are more than capable of converting text into audio in an efficient manner.

On Windows, it is also important that the TTS program supports a recent version of SAPI, the Speech Application Programming Interface. The most commonly used versions are SAPI 4 and SAPI 5, and SAPI 5 itself has later releases such as 5.1 and 5.3. It may be necessary to update your operating system to benefit from the more recent versions.

User Interface Issues

The TTS program chosen to generate subliminal or other types of audio messages will most likely have a graphical user interface (GUI), but it may also have a command-line interface (CLI). Either can achieve the desired results, but most people prefer a window-based graphical interface.

If you choose a TTS program that has a graphical interface, then these are the basic features that are required.

  • Permits data entry of the text to be converted
  • Provides controls for starting and stopping speech
  • Generates the converted speech as a WAV file
  • Provides options for choosing output sample rate and size
  • Provides controls for adjusting voice speed and pitch
  • Allows Speech Synthesis Markup Language (SSML) meta-tags

Other features as listed below are desirable and can make it easier to use the TTS program, but they are not absolutely necessary.

  • Permits opening different input file types (.doc, .rtf, .txt, .html, .xml)
  • Allows direct recording from microphone
  • Permits pasting text from clipboard
  • Provides controls for pausing, speaking from cursor, next sentence, etc.
  • Allows control over voice engine selection
  • Provides additional audio output formats (WAV, MP3 or OGG)
  • Provides options for audio format (channels, sample rate and size)
  • Provides controls for voice engine volume, speed and pitch
  • Allows selection of different SAPI versions

Voice Engine Control Concerns

The TTS application program directs how the voice engine should convert text into speech. If the voice engine allows control over volume, pitch, and speed, then these should ideally be controllable by the TTS program.

However, not all voice engines permit control over the volume, speed or pitch, and those that do can differ in how these settings are adjusted. This is important because even if you choose a voice engine that you consider appropriate, you will likely have to modify its output somewhat so that it sounds suitable for your purpose.

It may be necessary to adjust the volume, pitch, or speed of the generated voice in order for it to sound calm, smooth and relaxing. For the purpose of building subliminal or other types of audio message segments, this is an important consideration.
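As an illustration of how such adjustments can be expressed, the sketch below builds a small SSML document that asks for a slower, lower and softer voice through the standard prosody element. The function name build_ssml and its default values are hypothetical choices made for this example, and whether a given voice engine or TTS program honors each attribute is an assumption that has to be checked against its documentation.

```python
import xml.etree.ElementTree as ET


def build_ssml(text, rate="slow", pitch="low", volume="soft", lang="en-US"):
    """Wrap text in an SSML <prosody> element (hypothetical helper).

    rate, pitch and volume use the named levels defined by SSML 1.0,
    for example "x-slow", "slow", "medium", "fast" for rate.
    """
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": "http://www.w3.org/2001/10/synthesis",
        "xml:lang": lang,
    })
    prosody = ET.SubElement(
        speak, "prosody", {"rate": rate, "pitch": pitch, "volume": volume}
    )
    # ElementTree escapes characters such as & and < when serializing.
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("I am calm & relaxed."))
```

The resulting markup can be pasted into any TTS program that accepts SSML input. Engines that ignore a prosody attribute will simply speak the text with their default setting.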

Audio Output Format Options

All TTS application programs must be able to convert the text into speech and save it as an audio clip or segment. An uncompressed output format, such as WAV, is the better choice because it allows you to use the generated audio clip in any way you consider appropriate.

The audio message segments will still have to be integrated with other tracks, and keeping them uncompressed avoids the quality loss that lossy formats such as MP3 can introduce each time audio is decoded and re-encoded during mixing.

The TTS program should provide at least one and preferably more output audio format options. More importantly, it should permit options with respect to sound characteristics.

It should be possible to choose between mono and stereo, among different sample frequencies, and between an 8-bit and a 16-bit sample size. The preferred settings for generating an audio message clip are mono, a sample size of 16 bits and a sample rate of 44,100 Hz (44.1 kHz).
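At these settings an uncompressed clip uses 44,100 samples x 2 bytes x 1 channel = 88,200 bytes per second, or roughly 5.3 MB per minute. The sketch below, which uses only Python's standard wave module, shows one way to check that a clip produced by a TTS program really has the preferred format. The helper names write_silent_wav and describe_wav are hypothetical, and the silent clip merely stands in for real TTS output.

```python
import io
import wave


def write_silent_wav(target, seconds=1.0, sample_rate=44100,
                     channels=1, sample_width=2):
    """Write a silent PCM clip; a stand-in for a clip from a TTS program."""
    n_frames = int(seconds * sample_rate)
    with wave.open(target, "wb") as wav:
        wav.setnchannels(channels)        # 1 = mono, 2 = stereo
        wav.setsampwidth(sample_width)    # bytes per sample, 2 = 16-bit
        wav.setframerate(sample_rate)     # samples per second (Hz)
        # Zero bytes are silence for 16-bit signed PCM samples.
        wav.writeframes(b"\x00" * (n_frames * channels * sample_width))


def describe_wav(source):
    """Return the format of a WAV file or file-like object."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bits": wav.getsampwidth() * 8,
            "sample_rate": wav.getframerate(),
            "seconds": wav.getnframes() / wav.getframerate(),
        }


if __name__ == "__main__":
    clip = io.BytesIO()
    write_silent_wav(clip, seconds=0.5)
    clip.seek(0)
    print(describe_wav(clip))
```

To inspect a real clip, pass its file name to describe_wav instead of the in-memory buffer used here.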

To conclude, if you decide to use text-to-speech software you will require both a human-sounding voice engine and a front-end application that can control it. A TTS program that directs how the text is converted into audible speech can be free of charge, purchased separately or bundled with a voice engine.

You may even decide to acquire and use more than one TTS application, and this is advisable. As long as each meets the basic requirements and perhaps offers some of the desirable features, you are on the right track in generating your subliminal or other types of audio messages.

About the Author

James Kudlak specializes in writing articles and building websites across a variety of topics such as online dating, paternity issues or insurance. Visit one of his newer website creations relating to pet insurance at http://petcarehealthinsurance.com/ which helps people select among the many Pet Insurance Companies available.

(c) Copyright – B. James Kudlak. All Rights Reserved Worldwide.
