This AI can produce stunning images with just a few words of description, but is it art?
What does it mean to make art when an algorithm automates so much of the creative process itself?
A picture may be worth a thousand words, but thanks to an artificial intelligence program called DALL-E 2, you can have a professional-looking image with far fewer.
DALL-E 2 is a new neural network algorithm that creates a picture from a short phrase or sentence that you provide. The program, which was announced by the artificial intelligence research laboratory OpenAI in April 2022, hasn't been released to the public. But a small and growing number of people, myself included, have been given access to experiment with it.
As a researcher studying the nexus of technology and art, I was keen to see how well the program worked. After hours of experimentation, it's clear that DALL-E, while not without shortcomings, is leaps and bounds ahead of existing image generation technology. It raises immediate questions about how these technologies will change how art is made and consumed. It also raises questions about what it means to be creative when DALL-E 2 seems to automate so much of the creative process itself.
A STAGGERING RANGE OF STYLE AND SUBJECTS
OpenAI researchers built DALL-E 2 from an enormous collection of images with captions. They gathered some of the images online and licensed others.
Using DALL-E 2 looks a lot like searching for an image on the web: you type a short phrase into a text box, and it gives back six images.
But instead of being culled from the web, the program creates six brand-new images, each of which reflects some version of the entered phrase. (Until recently, the program produced 10 images per prompt.) For example, when some friends and I gave DALL-E 2 the text prompt "cats in devo hats," it produced 10 images that came in different styles.
Nearly all of them could plausibly pass for professional photographs or drawings. While the algorithm did not quite grasp "Devo hat" (the strange helmets worn by the New Wave band Devo), the headgear in the images it produced came close.
Over the past few years, a small community of artists has been using neural network algorithms to produce art. Many of these artworks have distinctive qualities that almost look like real images, but with odd distortions of space: a sort of cyberpunk Cubism. The most recent text-to-image systems often produce dreamy, fantastical imagery that can be delightful but rarely looks real.
DALL-E 2 offers a significant leap in the quality and realism of the images. It can also mimic specific styles with remarkable accuracy. If you want images that look like actual photographs, it'll produce six life-like images. If you want prehistoric cave paintings of Shrek, it'll generate six pictures of Shrek as if they'd been drawn by a prehistoric artist.
It's staggering that an algorithm can do this. Each set of images takes less than a minute to generate. Not all of the images will look pleasing to the eye, nor do they necessarily reflect what you had in mind. But, even with the need to sift through many outputs or try different text prompts, there's no other existing way to pump out so many great results so quickly, not even by hiring an artist. And, sometimes, the unexpected results are the best.
In principle, anyone with enough resources and expertise can make a system like this. Google Research recently announced an impressive, similar text-to-image system, and one independent developer is publicly developing their own version that anyone can try right now on the web, although it's not yet as good as DALL-E or Google's system.
It's easy to imagine these tools transforming the way people make images and communicate, whether via memes, greeting cards, advertising and, yes, art.
WHERE'S THE ART IN THAT?
I had a moment early on while using DALL-E 2 to generate different kinds of paintings, in all different styles, like "Odilon Redon painting of Seattle," when it hit me that this was better than any painting algorithm I've ever developed. Then I realized that it is, in a way, a better painter than I am.
In fact, no human can do what DALL-E 2 does: create such a high-quality, varied range of images in mere seconds. If someone told you that a person made all these images, of course you'd say they were creative.
But this does not make DALL-E 2 an artist. Even though it sometimes feels like magic, under the hood it is still a computer algorithm, rigidly following instructions from the algorithm's authors at OpenAI.
If these images succeed as art, they are products of how the algorithm was designed, the images it was trained on, and, most importantly, how artists use it.
You might be inclined to say there's little artistic merit in an image produced by a few keystrokes. But in my view, this line of thinking echoes the classic take that photography cannot be art because a machine did all the work. Today the human authorship and craft involved in artistic photography are recognized, and critics understand that the best photography involves much more than just pushing a button.
Even so, we often discuss works of art as if they directly came from the artist's intent. The artist intended to show a thing, or express an emotion, and so they made this image. DALL-E 2 does seem to shortcut this process entirely: you have an idea and type it in, and you're done.
But when I paint the old-fashioned way, I've found that my paintings come from the exploratory process, not just from executing my initial goals. And this is true for many artists.
Take Paul McCartney, who came up with the track "Get Back" during a jam session. He didn't start with a plan for the song; he just started fiddling and experimenting, and the band developed it from there.
Picasso described his process similarly: "I don't know in advance what I am going to put on canvas any more than I decide beforehand what colors I am going to use . . . Each time I undertake to paint a picture I have a sensation of leaping into space."
In my own explorations with DALL-E 2, one idea would lead to another, which led to another, and eventually I'd find myself in a completely unexpected, magical new terrain, very far from where I'd started.
PROMPTING AS ART
I would argue that the art, in using a system like DALL-E 2, comes not just from the final text prompt, but from the entire creative process that led to that prompt. Different artists will follow different processes and end up with different results that reflect their own approaches, skills and obsessions.
I began to see my experiments as a set of series, each a consistent dive into a single theme, rather than a set of independent wacky images.
Ideas for these images and series came from all around, often linked by a set of stepping stones. At one point, while making images based on contemporary artists' work, I wanted to generate an image of site-specific installation art in the style of the contemporary Japanese artist Yayoi Kusama. After trying a few unsatisfactory locations, I hit on the idea of placing it in La Mezquita, a former mosque and church in Córdoba, Spain. I sent the picture to an architect colleague, Manuel Ladron de Guevara, who is from Córdoba, and we began riffing on other architectural ideas together.
This became a series on imaginary new buildings in different architects' styles.
So I've started to consider what I do with DALL-E 2 to be both a form of exploration and a form of art, even if it's often amateur art like the drawings I make on my iPad.
Indeed, some artists, like Ryan Murdoch, have advocated for prompt-based image-making to be recognized as art. He points to the experienced AI artist Helena Sarin as an example.
"When I look at most stuff from Midjourney [another popular text-to-image system], a lot of it will be interesting or fun," Murdoch told me in an interview. "But with [Sarin's] work, there's a through line. It's easy to see that she has put a lot of thought into it, and has worked at the craft, because the output is more visually appealing and interesting, and follows her style in a continuous way."
Working with DALL-E 2, or any of the new text-to-image systems, means learning its quirks and developing strategies for avoiding common pitfalls. It's also important to know about its potential harms, such as its reliance on stereotypes, and potential uses for disinformation. Using DALL-E 2, you'll also discover surprising correlations, like the way everything becomes old-timey when you use an old painter, filmmaker or photographer's style.
When I have something very specific I want to make, DALL-E 2 often can't do it. The results would require a lot of difficult manual editing afterward. It's when my goals are vague that the process is most delightful, offering up surprises that lead to new ideas that themselves lead to more ideas and so on.
CRAFTING NEW REALITIES
These text-to-image systems can help users imagine new possibilities as well.
Artist-activist Danielle Baskin told me that she always works "to show alternative realities by 'real' example: either by setting scenarios up in the physical world or doing meticulous work in Photoshop." DALL-E 2, however, "is an amazing shortcut because it's so good at realism. And that's key to helping others bring possible futures to life, whether it's satire, dreams or beauty."
She has used it to imagine an alternative transportation system and plumbing that transports noodles instead of water, both of which reflect her artist-provocateur sensibility.
Similarly, artist Mario Klingemann's architectural renderings with the tents of homeless people could be taken as a rejoinder to my architectural renderings of fancy dream homes.
It's too early to judge the significance of this art form. I keep thinking of a phrase from the excellent book "Art in the After-Culture": "The dominant AI aesthetic is novelty."
Surely this would be true, to some extent, for any new technology used for art. The first films by the Lumière brothers in the 1890s were novelties, not cinematic masterpieces; it amazed people to see images moving at all.
AI art software develops so quickly that there's continual technical and artistic novelty. It seems as if, each year, there's an opportunity to explore an exciting new technology, each more powerful than the last, and each seemingly poised to transform art and society.
This article is republished from The Conversation under a Creative Commons license.