DALL-E mini is the AI that brings to life all the crazy “what if” questions you never asked: What if Voldemort was a member of Green Day? What if there was a McDonald’s in Mordor? What if scientists sent a Roomba to the bottom of the Mariana Trench?
No more wondering what a Roomba cleaning the bottom of the Mariana Trench would look like. DALL-E mini can show you.
DALL-E mini is an online text-to-image generator that has become extremely popular on social media in recent weeks.
The program takes a text phrase — such as “sunset in the mountains,” “Eiffel Tower on the moon,” “Obama is making a sand castle,” or anything else you can think of — and creates an image of it.
The results can be strangely beautiful, like “synthwave buddha,” or “a chicken nugget smoking a cigarette in the rain.” Others, like “Teletubbies in Nursing Home,” are really terrifying.
DALL-E mini rose to prominence on the internet after social media users began using the program to mash recognizable pop culture icons into bizarre, photo-realistic memes.
Boris Dayma, a Texas computer engineer, originally created DALL-E mini as an entry for a coding competition. Dayma’s program gets its name from the AI it’s based on: Inspired by the incredibly powerful DALL-E from the artificial intelligence company OpenAI, DALL-E mini is basically a web app that applies similar technology in an easier-to-access way. (Dayma has since rebranded DALL-E mini as Craiyon at the company’s request.)
While OpenAI restricts most access to its models, Dayma’s model can be used by anyone on the web and was developed in collaboration with the AI research communities on Twitter and GitHub.
“I would get interesting feedback and suggestions from the AI community,” Dayma told NPR over the phone. “And it got better, and better, and better” at image generation, until it reached what Dayma called “a viral threshold.”
While the images DALL-E mini produces may still look distorted or unclear, Dayma says the images are now consistently good enough, and the audience wide enough, that the conditions were right for the project to go viral.
Learning from the past and a complicated future
While DALL-E mini is unique in its widespread accessibility, this isn’t the first time AI-generated art has been in the news.
In 2018, art auction house Christie’s sold an AI-generated portrait for over $400,000.
Ziv Epstein, a researcher with the Human Dynamics Group at the MIT Media Lab, says the advancement of AI image generators is complicating notions of ownership in the art industry.
In the case of machine learning models like DALL-E mini, there are plenty of stakeholders to consider when deciding who should get credit for creating a work of art.
“These tools are these diffuse socio-technical systems,” Epstein told NPR. “[AI art generation is a] complicated arrangement of human actors and computational processes interacting in this kind of crazy way.”
First, there are the programmers who created the model.
For DALL-E mini, this is mainly Dayma, but also members of the open-source AI community who contributed to the project. Then there are the owners of the images that the AI is trained on – Dayma used an existing library of images to train the model, essentially teaching the program how to translate text to images.
Finally, there’s the user who came up with the text prompt – like “CCTV Footage of Darth Vader Stealing a Unicycle” – for DALL-E mini to use. So it’s hard to say who exactly “owns” this image of “Gumby gives an NPR Tiny Desk concert.”
Some developers are also concerned about the ethical implications of AI media generators.
Deepfakes, often convincing uses of machine learning models to depict fake images of politicians or celebrities, are a major concern for software engineer James Betker.
Betker is the creator of Tortoise, a text-to-speech program that implements some of the latest machine learning techniques to generate speech from a reference voice.
Tortoise started as a side project, but Betker said he was reluctant to develop it further because of its potential for abuse.
“That’s what I’m absolutely concerned about — people trying to get politicians to say things they didn’t actually say, or even make affidavits that you’re taking to court… [that are] completely faked,” Betker told NPR.
But the accessibility of open-source AI projects like Dayma’s and Betker’s has also had positive effects. Tortoise has given developers who can’t afford to hire voice actors a way to create realistic voiceovers for their projects. Likewise, Dayma said small businesses used DALL-E mini to create graphics when they couldn’t afford to hire a designer.
The increasing accessibility of AI tools can also help people familiarize themselves with the potential threats posed by AI-generated media. For Dayma and Betker, the accessibility of their projects makes it clear to people that AI is advancing quickly and can spread misinformation.
Epstein of MIT said the same thing: “If humans are able to interact with AI and be sort of creators themselves, in a way it inoculates them against misinformation.”