From virtual assistants to audiobook voiceovers, AI voice generation has become a fast-growing field — and it’s no wonder companies are rushing to tap the technology’s potential.
Among them is Valencia-based Voicemod. The startup has developed an AI voice changer and soundboard software that enables instant speech-to-speech conversion. Unlike most of its competitors, the company claims it transforms voices in real time and with low latency, allowing users to converse as they would in real life.
According to Jaime Bosch, CEO and co-founder of Voicemod, the company trains its AI model using publicly available datasets and professional voice actors, resulting in a wide range of vocal expressions, pitches, tones and emotions. Using machine learning techniques, the model learns to understand, analyze and predict a person’s speech patterns and intricacies.
“When a user speaks into our software or application, their voice input is processed in real time,” Bosch told TNW. “Our AI model then applies the learned patterns and transformations to the input, enabling speech conversion immediately.”
Voicemod mainly targets the entertainment industry, including gamers, streamers, content creators and VTubers, on platforms ranging from Discord and Twitch to Zoom and WhatsApp.
To tap further into the growing user demand for self-expression, pseudonymity and creativity online, the startup is now launching the so-called “AI Humans” collection, which adds to the 100 voice options already in its portfolio. While Voicemod already offers filters for human voices, the new collection will be the company’s most human-realistic to date.
Trained on voice acting recordings, AI Humans consists of 20 sonic avatars that vary in personality, gender and age. The personas include Joe, an 80-year-old male voice with a “raspy, sardonic tone” and Jennifer, a 25-year-old female voice, with an “energetic and friendly” character. Users can also adjust the pitch of each persona, changing the perception of the gender and age of the voice.
The video below can give you an idea of what these characters sound like:
“AI voices present exciting opportunities for industries looking to cultivate creative exploration and self-expression, enhance personalization and promote inclusiveness in digital spaces,” said Bosch.
But despite the positive impact AI speech generation can have, the technology also comes with numerous risks. These include abuse, fraud, impersonation and even voice theft, which especially affects professional voice actors.
According to Bosch, Voicemod is actively working to mitigate these risks. For example, it is developing a watermarking technology to help platforms identify and track AI-generated voices, while implementing measures to protect the intellectual property of the voice actors it works with.
Bosch believes AI will become “a tool” for these professionals. “Something that might be overlooked in these discussions is that behind any use of real-time voice AI, the use case that Voicemod is targeting is a human effectively driving the AI,” he told TNW.
Voicemod already counts more than 40 million desktop downloads. In the future, it plans to launch on mobile as well and reach millions of monthly active users. It is also working on B2B partnerships with gaming companies and VR headset platforms.
The software is available for free, with the option for a paid PRO version that unlocks additional features and content.