As recently as 2014, Goodfellow et al. introduced generative adversarial training with Generative Adversarial Nets (GANs). The idea is to train two neural networks in competition: the first network generates fake samples, and the second decides whether a given sample is real or fake. This is a two-player zero-sum game in which the generative model is pitted against an adversary. A classic analogy for the dynamics between the two models is the constant battle between a forger and a detective: the generative model is the forger trying to produce counterfeit items, and the discriminative model is the detective trying to tell the counterfeits from the real items. The end goal is a generator whose samples are indistinguishable from real data, so that the discriminator can no longer tell real from fake, at which point a Nash equilibrium has been reached.
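To make the game concrete, here is a minimal sketch of the value function from the original GAN formulation, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximize and the generator tries to minimize. The function name and the sample scores are made up for illustration; in a real system the scores would come from a trained discriminator network.

```python
import numpy as np

def value_function(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real samples, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1)
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# Illustrative scores only: a sharp detective rates real items near 1
# and counterfeits near 0, which pushes V up toward 0.
strong_d = value_function([0.99, 0.98], [0.01, 0.02])

# A detective that cannot tell real from fake outputs 0.5 everywhere,
# which gives V = log(0.5) + log(0.5) = -log(4).
fooled_d = value_function([0.5, 0.5], [0.5, 0.5])

print(strong_d, fooled_d)
```

The second case corresponds to the equilibrium described above: when the discriminator can do no better than a coin flip, the value settles at -log(4).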
There are many promising applications for GANs, but currently most are in the field of computer vision. Some use cases include:
At NVIDIA, researchers have built a system that generates fake celebrity faces after training on thousands of real celebrity photos. For example, the photo below is generated but looks incredibly realistic:
GAN-generated image from a model trained on an image bank of thousands of celebrity headshots.
GANs are defined for real-valued data, where the gradient of the discriminator's output with respect to the generated sample tells the generator how to produce more realistic synthetic data. GANs work well in computer vision because the output domain is continuous. That is, to generate a new image the generative model can make smooth changes by adjusting pixel values by a small amount. For example, if you output an image with a pixel value of 1.0, you can change that pixel value to 1.0001 on the next step.
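The following sketch illustrates that idea with a toy example. The discriminator here is a hypothetical logistic regression over a four-pixel "image" (the weights, bias, and pixel values are invented for illustration, not taken from any real model). A tiny step along the gradient of log D(x) changes each pixel by less than 0.0001 and still yields a valid image that the discriminator scores as slightly more realistic.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Hypothetical toy discriminator: logistic regression over 4 pixels.
w = np.array([0.5, -1.0, 2.0, 0.3])
b = -0.2

def discriminator(x):
    """Return the estimated probability that image x is real."""
    return sigmoid(w @ x + b)

# A generated "image" with real-valued pixels.
x = np.array([1.0, 0.4, 0.1, 0.7])

# For this logistic discriminator, d/dx log D(x) = (1 - D(x)) * w.
grad = (1.0 - discriminator(x)) * w

# Nudge every pixel by a small amount in the direction that raises D(x).
step = 1e-4
x_new = x + step * grad

print(discriminator(x), discriminator(x_new))
```

In a full GAN this gradient is propagated further back into the generator's weights, but the key point is the same: because pixels are continuous, an arbitrarily small update is still a meaningful output.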
In Natural Language Processing, applying GANs is not trivial because words are discrete tokens, so a small change made by the generator no longer produces something meaningful. Outputting a new word is not as simple as changing a pixel value: if you output the word “lion”, you can’t obtain a new word (e.g., “lioness”) just by adding a small value, because the output space is not continuous. Small changes in the embedding vector will rarely lead to a new valid word. Generating high-quality sentences and paragraphs remains an open research problem, one traditionally tackled with Recurrent Neural Networks (RNNs) because they can model the sequential structure of sentences, both syntactic and semantic.
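A small sketch makes the contrast with pixels visible. It assumes a toy four-word vocabulary with invented two-dimensional embeddings (real embeddings have hundreds of dimensions and are learned), and decodes a vector by picking the nearest word. A nudge of 0.0001 leaves the decoded word unchanged, while landing on a different word takes a much larger jump.

```python
import numpy as np

# Hypothetical vocabulary and 2-D embeddings, chosen only for illustration.
vocab = ["lion", "lioness", "tiger", "table"]
embeddings = np.array([
    [1.0, 0.0],    # lion
    [0.9, 0.3],    # lioness
    [0.7, -0.4],   # tiger
    [-1.0, 0.5],   # table
])

def nearest_word(vec):
    """Decode a vector to the vocabulary word with the closest embedding."""
    distances = np.linalg.norm(embeddings - vec, axis=1)
    return vocab[int(np.argmin(distances))]

lion = embeddings[vocab.index("lion")]

# A pixel-sized nudge: the decoded token does not change at all.
nudged = lion + np.array([1e-4, 1e-4])

# Only a much larger move lands close enough to another word.
jumped = lion + np.array([-0.08, 0.25])

print(nearest_word(nudged), nearest_word(jumped))
```

Because the decoded word stays fixed under small perturbations, the discriminator's gradient gives the generator no smooth path from one word to the next, which is the difficulty described above.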
At Commerce AI we are working on using GANs to generate new products from existing product data. Shown below are results from training a GAN on our shoe database. The set of images shows the intermediate steps, from initial to final design, as the GAN learns to generate new shoes. In the initial step none of the images resemble shoes, but as the generative and discriminative models converge, the generator is able to produce realistic shoes. Our idea is to use AI to fill the gap that exists between consumer desire and product offering.
Source: Commerce AI
Stay tuned to learn more about how brands are collaborating with us to differentiate themselves with the help of AI. Commerce AI has a deep learning platform that mines the data and provides product-level insights. Product teams are increasingly interested in leveraging our GANs to design new products.
Contact us at firstname.lastname@example.org if you would like to learn more, and click the request demo button above to try a demo.