Machine learning and AI are quickly becoming commonplace in the tools photographers use every day. There are Neural Filters in Photoshop, AI enhancement tools in Luminar Neo and PortraitPro, and even programs that use AI to generate captions for photos.
These enhancements and features may look convincing, but that's to be expected for a consumer-ready product. So, what's happening in the development and research space? How good is the AI that isn't available to the masses? In a word: scary.
In this video by Vox, producer Joss Fong takes us on a journey through AI research into image creation and how it all started with asking software to create a green school bus back in 2015. From there, we learn that in the last two years, OpenAI, which has been a trailblazer in this field, has had monumental success with its two iterations of image AI, called DALL-E and DALL-E 2. You give these programs a text prompt like “a bowl of bananas on a table,” and the craft of wording such prompts has come to be known as prompt engineering. Through deep learning on the millions of images used as training data, the program builds a mathematical latent space, which it then uses to create an image from the prompt.
It's important to understand that these programs do not borrow pixels from the images they were trained on and stitch them together into a scene with multiple items. Rather, they create each image from scratch, and the more specific the prompt, the more interesting the creation. For example, at 5:32 in the video we see this prompt: “1980s analog synthesizer hardware partially made of polygonal flesh with wires made of tentacles and knobs made of suction cups, dark-colored lighting, leaking cosmic ooze, highly detailed, in lomography.”
This technology is exciting and a bit scary at the same time. The photorealism alone is enough to cause concern, let alone the leap in quality from DALL-E to DALL-E 2, which took just one year. And OpenAI is just one of many companies working on this specific kind of research. Imagine what this will look like when it becomes available to everyone as a commercially viable product.