Today’s machine learning models mostly interpret and classify existing data: for instance, recognizing faces or identifying fraud. Generative AI is a fast-growing new field that focuses instead on building AI that can generate its own novel content. To put it simply, generative AI takes artificial intelligence beyond perceiving to creating.
Two key technologies are at the heart of generative AI: generative adversarial networks (GANs) and variational autoencoders (VAEs).
The more attention-grabbing of the two methods, GANs were invented by Ian Goodfellow in 2014 while he was pursuing his PhD at the University of Montreal under AI pioneer Yoshua Bengio.
Goodfellow’s conceptual breakthrough was to architect GANs with two separate neural networks—and then pit them against one another.
Starting with a given dataset (say, a collection of photos of human faces), the first neural network (called the “generator”) begins generating new images that, in terms of pixels, are mathematically similar to the existing images. Meanwhile, the second neural network (the “discriminator”) is fed photos without being told whether they are from the original dataset or from the generator’s output; its task is to identify which photos have been synthetically generated.
As the two networks iteratively work against one another (the generator trying to fool the discriminator, the discriminator trying to suss out the generator’s creations), they hone one another’s capabilities. In the ideal case, the discriminator’s classification success rate eventually falls to 50%, no better than random guessing, meaning that the synthetically generated photos have become indistinguishable from the originals.
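To see why 50% is the natural endpoint, consider a toy setting. It is an assumption made purely for illustration and is not drawn from any particular GAN implementation: real and generated data are both one-dimensional bell curves with the same spread, differing only in where they are centered. In that setting the best possible discriminator has a known closed form (the real density divided by the sum of the real and generated densities), so its accuracy can be computed directly rather than trained. The function names below are hypothetical.

```python
import math


def normal_pdf(x, mean, std=1.0):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def optimal_discriminator(x, real_mean, fake_mean):
    """Probability that x is real, as judged by the best possible discriminator.

    Assumes real and generated samples are equally likely to be shown.
    """
    p_real = normal_pdf(x, real_mean)
    p_fake = normal_pdf(x, fake_mean)
    return p_real / (p_real + p_fake)


def best_accuracy(real_mean, fake_mean):
    """Accuracy of that optimal discriminator on a 50/50 mix of real and fake.

    For two unit-variance normal distributions this equals the standard
    normal CDF evaluated at half the distance between the two means.
    """
    gap = abs(real_mean - fake_mean)
    return 0.5 * (1.0 + math.erf(gap / (2.0 * math.sqrt(2.0))))


# As the generator's output distribution moves toward the real one,
# even a perfect discriminator is pushed toward coin-flipping.
for fake_mean in (4.0, 2.0, 1.0, 0.0):
    print(fake_mean, round(best_accuracy(0.0, fake_mean), 4))
```

When the generated distribution sits far from the real one, the ideal discriminator is right almost every time. Once the two distributions coincide, it assigns every sample a probability of exactly one half, which is the 50% success rate described above.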
In 2016, AI great Yann LeCun called GANs “the most interesting idea in the last ten years in machine learning.”
VAEs, introduced around the same time as GANs, are a conceptually similar technique that can be used as an alternative to GANs.
Like GANs, VAEs consist of two neural networks that work in tandem to produce an output. The first network (the “encoder”) takes a piece of input data and compresses it into a lower-dimensional representation, expressed as a probability distribution over the data’s underlying attributes. The second network (the “decoder”) takes a random sample drawn from that distribution and generates novel outputs that “riff” on the original input.
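The encode, sample, decode flow can be sketched in a few lines. This is a deliberately simplified illustration, not a working VAE: in a real system the encoder and decoder are neural networks whose weights are learned from data, whereas here they are hand-written stand-ins. All function names, the four-value input, the two-value compressed representation and the fixed variance are assumptions chosen for the example.

```python
import math
import random


def encode(x):
    """Hypothetical encoder: compress 4 input values into a 2-D distribution.

    Returns the mean and log-variance of a normal distribution over the
    compressed representation. A real encoder would learn both from data.
    """
    mean = [(x[0] + x[1]) / 2.0, (x[2] + x[3]) / 2.0]
    log_var = [-2.0, -2.0]  # fixed small variance, chosen for illustration
    return mean, log_var


def sample_latent(mean, log_var, rng):
    """Draw a random point near the mean: z = mean + std * noise."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mean, log_var)
    ]


def decode(z):
    """Hypothetical decoder: expand the 2-D representation back to 4 values."""
    return [z[0], z[0], z[1], z[1]]


def generate_variation(x, rng):
    """Encode x, sample from the resulting distribution, and decode."""
    mean, log_var = encode(x)
    z = sample_latent(mean, log_var, rng)
    return decode(z)


rng = random.Random(0)
original = [1.0, 1.0, 3.0, 3.0]
for _ in range(3):
    # Each call yields a slightly different output that resembles the input.
    print(generate_variation(original, rng))
```

The random draw in `sample_latent` is what makes each output a new variation rather than a copy: shrinking the variance pulls outputs closer to the original, while widening it produces looser riffs.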
In general, GANs generate higher-quality output than do VAEs but are more difficult and more expensive to build.
Like artificial intelligence more broadly, generative AI has inspired both widely beneficial and frighteningly dangerous real-world applications. Only time will tell which will predominate.
On the positive side, one of the most promising use cases for generative AI is synthetic data. Synthetic data is a potentially game-changing technology that enables practitioners to digitally fabricate the exact datasets they need to train AI models.
Getting access to the right data is both the most important and the most challenging part of AI today. Generally, in order to train a deep learning model, researchers must collect thousands or millions of data points from the real world. They must then have labels attached to each data point before the model can learn from the data. This is at best an expensive and time-consuming process; at worst, the data one needs is simply impossible to get one’s hands on.
Synthetic data upends this paradigm by enabling practitioners to artificially create high-fidelity datasets on demand, tailored to their precise needs. For instance, using synthetic data methods, autonomous vehicle companies can generate billions of different driving scenes for their vehicles to learn from without needing to actually encounter each of these scenes on real-world streets.
As synthetic data approaches real-world data in accuracy, it will democratize AI, undercutting the competitive advantage of proprietary data assets. In a world in which data can be inexpensively generated on demand, the competitive dynamics across industries will be upended.
A crop of promising startups has emerged to pursue this opportunity, including Applied Intuition, Parallel Domain, AI.Reverie, Synthesis AI and Unlearn.AI. Large technology companies—among them Nvidia, Google and Amazon—are also investing heavily in synthetic data. The first major commercial use case for synthetic data was autonomous vehicles, but the technology is quickly spreading across industries, from healthcare to retail and beyond.
Counterbalancing the enormous positive potential of synthetic data, a different generative AI application threatens to have a widely destructive impact on society: deepfakes.
We covered deepfakes in detail in this column earlier this year. In essence, deepfake technology enables anyone with a computer and an Internet connection to create realistic-looking photos and videos of people saying and doing things that they did not actually say or do.
The first use case to which deepfake technology has been widely applied is pornography. According to a July 2019 report from startup Sensity, 96% of deepfake videos online are pornographic. Deepfake pornography is almost always non-consensual, involving the artificial synthesis of explicit videos that feature celebrities or personal contacts.
From these dark corners of the Internet, the use of deepfakes has begun to spread to the political sphere, where the potential for harm is even greater. Recent deepfake-related political incidents in Gabon, Malaysia and Brazil may be early examples of what is to come.
In a recent report, The Brookings Institution grimly summed up the range of political and social dangers that deepfakes pose: “distorting democratic discourse; manipulating elections; eroding trust in institutions; weakening journalism; exacerbating social divisions; undermining public safety; and inflicting hard-to-repair damage on the reputation of prominent individuals, including elected officials and candidates for office.”
The core technologies underlying synthetic data and deepfakes are the same. Yet the use cases and potential real-world impacts are diametrically opposed.
It is a great truth in technology that any given innovation can either confer tremendous benefits or inflict grave harm on society, depending on how humans choose to employ it. It is true of nuclear energy; it is true of the Internet. It is no less true of artificial intelligence. Generative AI is a powerful case in point.