Fred, a trans man, clicked his mouse, and his tenor tones suddenly sank deeper. He had activated voice-changing algorithms that provided what sounded like an instant vocal cord transplant. “This one’s ‘Seth’,” he said, of a character he was testing on a Zoom call with a reporter. Then he switched to speaking as “Joe”, whose voice was more nasal and upbeat.
Fred’s friend Jane, a trans woman who was also testing the prototype software, laughed and showed off some artificial voices that she liked for their feminine sound. “This one’s ‘Courtney’” – bright and upbeat. “This is ‘Maya’” – sharper, sometimes too much so. “This is ‘Alicia’, the one I find has the most vocal variance,” she concluded more quietly. The glitches in the artificial voices were slight enough to make one wonder whether the pair had joined the call with their “real” voices to begin with.
Fred and Jane are the first testers of technology from startup Modulate that could add new pleasures, protections, and complications to socializing online. WIRED is not using their real names, to protect their privacy; trans people are often targets of online harassment. The software is the latest example of the delicate potential of artificial intelligence technology that can synthesize real-seeming video or audio, sometimes referred to as deepfakes.
Modulate co-founders Mike Pappas and Carter Huffman initially thought that the technology they call “voice skins” could make gaming more fun by allowing players to take on the voices of their characters. As the pair pitched the technology to studios and recruited the first testers, they also heard a chorus of interest in using voice skins as a privacy shield. More than 100 people asked whether the technology could alleviate dysphoria caused by a mismatch between their voice and gender identity.
“We realized that many people don’t think they can participate in online communities because their voice puts them at greater risk,” said Pappas, CEO of Modulate. The company is now working with game companies to deliver voice skins in a way that offers both fun and privacy options, while also committing to prevent them from becoming a tool of fraud or harassment themselves.
Games like Fortnite and social apps like Discord have made it common to join voice chats with strangers on the internet. As in the early days of text chat on the internet, the boom in voice has unlocked both new delights and new horrors.
The Anti-Defamation League found last year that nearly half of gamers had experienced harassment via voice chat while playing, more than via text chat. A sexist streak in gaming culture sees women and LGBTQ people singled out for particular abuse. When Riot Games launched the team-based shooter Valorant in 2020, executive producer Anna Donlon said she was stunned to see a culture of sexist harassment quickly emerge. “I don’t use voice chat if I’m going alone,” she told WIRED.
Modulate’s technology is not yet widely available, but Pappas says he’s in talks with game companies interested in rolling it out. One possible approach is to create modes within a game or community where everyone is assigned a voice skin that matches their character, whether it’s a gruff troll or a knight in armor; alternatively, the voices could be assigned at random.
In June, two of Modulate’s voices were launched in a preview of an app called Animaze, which turns a user into a digital avatar during live broadcasts or video calls. The developer, Holotech Studios, markets the voices as both a privacy feature and a way to “transform your voice to better suit a character of a different age, gender or body type than yours.” Modulate also offers game companies software that automatically alerts moderators to signs of abuse in voice chats.
Modulate’s voice skins are powered by machine learning algorithms that adjust the audio patterns of a person’s voice to make it sound like someone else’s. To teach its technology to express many different tones and timbres, the company collected and analyzed audio from hundreds of actors reading scripts designed to elicit a wide range of intonations and emotions. Individual voice skins are created by tuning the algorithms to reproduce the sound of a specific voice actor.