This story originally appeared in Open Source, our weekly newsletter on emerging technology.
In September, MIT Technology Review highlighted the emergence of artificial intelligence-generated clones of Chinese influencers that help e-commerce platforms like Taobao promote their products on livestreams. These clones may come across as somewhat robotic to inquisitive viewers, but they are nonetheless capable enough to get the job done, brokering sales in a matter of hours. The best part? The actual influencer doesn’t need to do any of the work, allowing livestreams to extend late into the night and beyond typical hours, with fatigue no longer a consideration.
Deep synthesis technology forms the backbone of this rising phenomenon. In short, it entails using AI to generate or alter images or videos, resulting in fabricated media that appears to be real, but isn’t.
This technology isn’t new, though it has only recently made real headway in mimicking both the appearance and the behavior of humans. Just a few years ago, in 2019, Alibaba Group commissioned AI developer iFlytek to produce an AI-generated version of high-profile Chinese influencer Li Jiaqi, using it to market products such as eye drops and instant noodles. The synthetic Li was unnatural, clunky, and a poor representation of the actual influencer.
It’s worth noting that examples like Taobao’s AI-cloned streamers and synthetic Li are only a fraction of the technology’s use cases that are rapidly unfolding around the world. More importantly, these are benign examples. The same technology has also been used to manipulate images and videos in previously unimaginable ways, influencing multiple spheres spanning politics, society, and beyond.
Used rightly, deep synthesis technology can be a holy grail of efficiency. But place it in the wrong hands, and the effects could be calamitous.
One common concern is the proliferation of deepfakes, the now-notorious term for media synthetically generated using deep synthesis technology. Black Mirror, a British anthology television series about dystopian futures, sought to portray the darker effects of deepfaking in an episode titled Joan Is Awful.
In this episode, protagonist Joan discovers that her life has become an experiment for a show on Streamberry, a parody version of on-demand video service Netflix. That show features the worst parts of Joan’s life, including her infidelity and an unsavory incursion into a church wedding, with a slight twist: it stars actress Salma Hayek in her stead. This was made possible using generative AI. Joan’s life falls apart as a result, and she later realizes that she fell victim to the scheme after carelessly agreeing to Streamberry’s subscription terms and conditions.
While some aspects of the episode remain confined to the realm of sci-fi, others have already started to unfold in reality. In 2019, the Zao app was introduced to the Chinese market. Zao—which means to make, build, or fabricate in Mandarin Chinese—enables users to digitally graft their faces onto the bodies of actors and actresses in movies, television shows, and music videos. This feature requires only a set of selfies or profile photos, and takes about ten seconds to generate a video.
Shortly after its launch, Zao went viral for its novelty, though Chinese superapp WeChat quickly banned users from sharing material generated using the face-swapping app. It cited privacy concerns as the reason for the ban and sought to nip the problem in the bud, bringing Zao’s virality to a premature end. But that is just the start of a deepfake problem that looks set to affect the world.
While deepfaking is a global problem, the world is split on the solutions needed.
China wants to solve it with regulation, enacting new laws in December 2022 to govern the use of deep synthesis technology. Issued by the Cyberspace Administration of China, the rules significantly restrict the usage of AI-generated media, with consent, disclosure, and identity authentication among the key tenets. Intriguingly, one rule requires such media to carry identifiers, like watermarks. One might wonder how well that will hold up: if AI can be used to superimpose faces onto bodies, wouldn’t it be equally capable of erasing watermarks from media?
Meanwhile, the West is seemingly singing a different tune from China. While countries like the US are equally cognizant of the dangers that deepfakes can pose, the solutions proposed there so far tend to take on a more technological spin. For years, the US government has been collaborating with various research institutions to develop tools that can reliably identify and counter deepfakes.
One example is PhotoGuard, a preemptive solution that uses “perturbations” to disrupt the manipulative capabilities of AI models. Like computers, AI models see images as complex sets of data points describing pixel color and position. PhotoGuard immunizes images against manipulation by making minute changes to the way they are mathematically represented. These changes are invisible to the human eye, preserving the images’ visual integrity while protecting them from manipulation if they are fed to an AI model.
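To make the idea of a “perturbation” concrete, here is a toy sketch in Python. It is not PhotoGuard’s code: the linear `encode` function stands in for a real AI model, and the names `immunize`, `eps`, `step`, and `iters` are all assumptions made for illustration. The sketch nudges each pixel by a tiny, bounded amount so that the model’s internal representation of the image drifts toward a chosen target, a general technique known as projected gradient descent.

```python
import random

def encode(weights, pixels):
    """Hypothetical linear 'encoder' standing in for a real AI model."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]

def sq_dist(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def immunize(pixels, weights, target, eps=0.02, step=0.005, iters=50):
    """Push the image's embedding toward `target` while changing
    no pixel by more than `eps` (assumed pixel range: 0.0 to 1.0)."""
    delta = [0.0] * len(pixels)
    for _ in range(iters):
        perturbed = [p + d for p, d in zip(pixels, delta)]
        residual = [e - t for e, t in zip(encode(weights, perturbed), target)]
        for i in range(len(pixels)):
            # Gradient of the squared distance with respect to pixel i.
            grad = 2 * sum(row[i] * r for row, r in zip(weights, residual))
            sign = (grad > 0) - (grad < 0)
            d = delta[i] - step * sign
            d = max(-eps, min(eps, d))                    # stay within eps
            d = max(-pixels[i], min(1.0 - pixels[i], d))  # stay a valid pixel
            delta[i] = d
    return [p + d for p, d in zip(pixels, delta)]

# Made-up data: a 64-pixel "image" and a random 4-dimensional encoder.
rng = random.Random(0)
pixels = [rng.random() for _ in range(64)]
weights = [[rng.uniform(-1, 1) for _ in range(64)] for _ in range(4)]
target = [0.0] * 4

protected = immunize(pixels, weights, target)
max_change = max(abs(a - b) for a, b in zip(pixels, protected))
before = sq_dist(encode(weights, pixels), target)
after = sq_dist(encode(weights, protected), target)
print("largest pixel change:", max_change)
print("distance to target before and after:", before, after)
```

The per-pixel cap `eps` is what keeps the change small enough to go unnoticed, while the shift in the embedding is what would throw off a model trying to edit the image. Real systems work against far larger models and images, so treat this purely as an illustration of the principle.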
Whether such efforts come to fruition will have significant implications. American politics has increasingly been mired in AI-generated misinformation, and this is largely driven by deepfakes.
The reason China’s stance differs from the West’s boils down to more than just prejudice and preference. For one, China already has systems in place to control the transmission of content in online spaces, allowing the nation to institute and enforce new rules seamlessly. The US paints a different picture. For example, the saga over a potential TikTok ban has dragged on for months since CEO Shou Zi Chew testified before the US Congress to address concerns over the short video app’s ties to China. The country remains divided on its next course of action.
Whether through tools or regulation, the proof of the pudding is in the eating. Deep synthesis technology can enhance efficiency and innovation, but those gains should not come at the expense of ethical integrity and social responsibility. Let’s tread carefully.