What’s so great about artificial intelligence, or AI? A lot, apparently, and it’s making its way into voice over with no apologies. Are you ready? Technology will never rest, and it won’t sit idle when it comes to advancing and improving voice over and AI, or anything else tech-related.
Some artists worry because a computer-generated voice can sound similar to a human voice. Some software can sound almost exactly like specific human voices. But experts in both AI and voice over say AI isn’t poised to overtake voice over in the not-too-distant future. As Voice123 notes, “AI voices can help businesses fill the gaps in their operations and further solidify their market presence.”
You Get What You Pay For with Voice Over and AI
In 2020, the CEO of Voice123, Rolf Veldman, said AI impacted 2 percent of our voice over industry. Odds are that hasn’t changed much now. Producers agree that digitization has positively impacted and improved the quality of AI voices.
Advancements in technology have spurred AI to reproduce even the most subtle nuances of human speech. These voices can even take a breath and pause exactly where they’re supposed to. And yes, they are inexpensive; some say even “cheap.” As Voice Archive shares, the advantages of AI VO are speed, lower starting cost, and more control when creating and editing.
On the negative side, as we all know, an AI voice is devoid of emotion, compassion, and empathy. It’s tough to ensure accuracy with flow, pronunciation, and accent, and in fact, an AI voice can sound monotonous and lack spontaneity. It can miss those subtle contextual clues and struggle with acronyms and abbreviations. Some people call AI voices “soulless.” Ouch.
Research has shown that people respond more acutely to a real person’s voice over than to an AI voice.
Synthetic Voices Take the Stage
AI voices such as Siri and Alexa are termed “synthetic voices.” They are produced by machine learning technology that converts text into audible speech. And yes, some of the AI voices you hear may sound like real voice talent; in fact, those AI voices are created to sound like real humans.
Deep learning has taken the production of AI voice from clunky and robotic, says MIT Technology Review, to something much more palatable and effective. Voice developers can feed audio into an algorithm that learns the pacing, pronunciation, or intonation patterns of human speech.
It cites WellSaid Labs and its two key deep-learning models. The first one can predict from text how a speaker will sound, with “accent, pitch and timbre,” the publication explains. The second one gets really real and fills in with “breaths and the way the voice resonates in its environment.”
Then there’s making a synthetic voice sound human, with qualities such as inconsistency, expressiveness, and the ability to deliver the same lines in entirely different styles, depending on the context. (VO actors relate to this direction: “Now give me three different reads of the last line of that spot, please”…)
They Still Need Real Voices
Looks like AI voices are even being used to, er, “fix” actors’ speech in film and television, as with the company Resemble.ai, which can tidy up garbled speech or mispronounced words.
And yes, there’s work for voice over artists with AI, says the review. “And for every synthetic voice made by these companies, a voice actor must also supply the original training data.” Whew.
The article notes that some companies want to be fair about working with us VO talent and have asked SAG-AFTRA how to do that. “SAG-AFTRA is also pushing for legislation to protect actors from illegitimate replicas of their voice.”
Richness and Complexity of Human Voices
Liz Barber of London’s Royal Academy of Dramatic Art specializes in communication skills training. She talks about how AI voice development can learn from the world of acting. Many people feel that we, as voice over performers, are acting every time we read a script.
“Voice is a result of a dynamic relationship between mind and body. It is a physical process,” Barber says. “Actors understand this. Their profession is grounded in the relationship between physical presence and voice.”
She says voice reveals emotions but doesn’t describe them. “AI voice can trigger some extreme emotions in us. What is in question is how the human voice will evolve with our increasing engagement with AI voice.”
Barber adds, “The one thing that will never change is the richness, complexity, and capability of the human voice.”
You Can Outperform AI
If you’re worried about losing a job to an AI voice, take precautions now to ensure that doesn’t happen. You can do so many things that an AI voice cannot. You can be there when clients need you, whether by email or text, and you can always be on time with your deadlines; being ahead of them is even better.
Triple-check your audio quality and then your files so what you think is on there is actually “on there.” Showing empathy and support for your client goes a long way, as does anticipating their next need. Be that person they can rely on, not just for great sound and production values, but the one who asks, “What else can I do to help you?” An AI voice cannot and will not do those things, and it can’t convince the client of a strong work ethic and professional business acumen.