18.04.20
Voiceover for the New Generation: Faster, Better, Cheaper
The human voice is an incredible, versatile thing: it can convey a wide range of emotions, and its timbre can make you weep from joy or sadness, or put you straight to sleep from boredom. It all originates in the unique vocal cords and personality of the speaker, who can precisely deliver both information and feeling to us. This is why talented voice actors are so valuable and so hard to find, and why producing a good voiceover can be so expensive and time-consuming.
Think about it.
To get a good voiceover, you need, at a minimum, access to a professional recording studio (which tends to be pricey!) and a time slot on the schedule of your favorite voice actor. Depending on the volume of the future voice product, this time slot may be anywhere from a couple of hours to a couple of weeks or even months, and may require you to pay rent, staff salaries, taxes, insurance and whatnot. Recording a book of about 350 pages can cost you, at a very modest estimate, anywhere between $1,000 and $5,000. Add to that the costs of cutting, post-editing, re-recording to correct errors, etc., and you’re looking at a pretty significant investment.

But what if you were told that there’s an alternative, and that you can now have your audiobook recorded for about $100? Is that even realistic? Yes, it is! If Siri, Alexa, Google Assistant, etc., are already helping us in many aspects of our daily lives, why not take more or less the same idea and extend it, putting robust, budget-friendly tools right at your fingertips?

Several pioneering companies on the market are doing exactly that: they are building voice robots that make things faster, cheaper, and easier. Regardless of the specific technology they employ (neural networks, artificial intelligence, or deep learning) and regardless of whether they create a computer-tinged voice or sample human voices for a more natural, intuitively pleasing sound, their text-to-speech robots can be used in a wide variety of applications, from reading the news and "manning" call centers to creating audiobooks and running automatic translators, and many more.

Among these creators are such giants as Amazon and IBM, whose respective projects Polly and Watson offer moderately priced, high-productivity voice robots. Others, such as Acapela, ResponsiveVoice, and ReadSpeaker, compete in a slightly different market niche, in which pricing is based on a yearly subscription rather than on output volume. Each comes with its own pros and cons, explores a different approach, and delivers a different level of speed, quality and price to fit the needs of its specific target clientele.

In the meantime, here at Amai we are working to cover the whole gamut. While most other companies offer robotic voices at a sampling rate of only 22 kHz, we built a product that operates at 44 kHz. This allows us to deliver crystal-clear sound, without noise or distortions, all with natural human intonations.
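To see why the higher figure matters, here is a minimal sketch. It assumes that "22 kHz" and "44 kHz" refer to the standard audio sampling rates of 22.05 kHz and 44.1 kHz (the article does not spell this out), and it uses the Nyquist limit, under which a digital recording can only represent frequencies up to half its sampling rate. The function name is purely illustrative.

```python
# Assumption: "22 kHz" and "44 kHz" are the standard 22,050 Hz and 44,100 Hz
# sampling rates; the article itself gives only the rounded figures.

def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency a signal sampled at this rate can represent."""
    return sample_rate_hz / 2


for rate in (22_050, 44_100):
    limit = nyquist_limit_hz(rate)
    print(f"{rate} Hz sampling -> content up to {limit:.0f} Hz")
```

Under that assumption, doubling the sampling rate doubles the range of frequencies the audio can carry, which is where the extra clarity in the upper range of a voice would come from.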

To do this, we start with datasets from professional voice actors and broadcasters recorded at very high quality. We then train our models using artificial intelligence and natural-language understanding technologies. The resulting voices respond to punctuation (commas, question marks, exclamation points), replicating the nuances and inflections of organic human speech. We are also constantly increasing the synthesis speed and improving the quality.

So let’s return to the earlier audiobook example. In the old paradigm, recording a book of roughly 350 pages (or 1 million characters) would have taken you and your team about 2 weeks, at a cost of $1,000 to $5,000 or more. With Amai, you can record it in just a day (that's a giant time saving!), sitting at your own computer, at a cost of just $99 for the entire product. What’s not to like?
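For readers who like to check the math, the comparison works out as follows. This is a back-of-the-envelope sketch using only the figures quoted above; the per-1,000-character rate is derived from the $99 example and is not a published price, and the variable names are illustrative.

```python
# Figures quoted in the article.
CHARACTERS = 1_000_000                 # roughly 350 pages
STUDIO_COST_RANGE_USD = (1_000, 5_000)
STUDIO_DAYS = 14                       # "about 2 weeks"
SYNTH_COST_USD = 99
SYNTH_DAYS = 1

# How many times cheaper synthesis is at each end of the studio range.
low, high = (cost / SYNTH_COST_USD for cost in STUDIO_COST_RANGE_USD)

# Derived rate, not a published price: cost per 1,000 characters.
cost_per_1k_chars = SYNTH_COST_USD / (CHARACTERS / 1_000)

# How many times faster one day is than two weeks.
time_factor = STUDIO_DAYS / SYNTH_DAYS

print(f"Cost: {low:.1f}x to {high:.1f}x cheaper")
print(f"Rate: ${cost_per_1k_chars:.3f} per 1,000 characters")
print(f"Time: {time_factor:.0f}x faster")
```

On these numbers, the synthesized version comes out roughly 10 to 50 times cheaper and about 14 times faster than the studio route.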

If all of this sounds too good to be true, it’s because you haven’t yet had a chance to listen to our Voice. You can also play with and customize our demo, or just see if she does better than the voices of our competitors.