Computer science student creates new tool to make AI-generated art
ASU’s Active Perception Group envisions more sustainable and ethical open-source solutions
Tech experts say that users produce more than 34 million images per day using artificial intelligence, or AI, tools such as Midjourney and DALL-E 2. The results are often inventive and astonishing.
While people might find making AI-generated art a relaxing, creative outlet, these images come at a cost. Server farms, giant data centers full of computers, will consume more energy each year processing AI art than the entire country of Argentina. In 2023, Google used 5.6 billion gallons of water just cooling its servers.
The challenge of how to make these artistic tools available to those who want to use them while keeping an eye on sustainability is a problem that computer science doctoral student Maitreya Patel is keen to solve.
Patel has been working under the supervision of Yezhou “YZ” Yang, an associate professor of computer science and engineering in the School of Computing and Augmented Intelligence, part of the Ira A. Fulton Schools of Engineering at Arizona State University. Yang heads the Active Perception Group, a lab that studies computer vision and image generative AI.
Yang oversees several projects funded by grants from the National Science Foundation dedicated to researching computer visual recognition tools. Some of the novel work being done there seeks to build a system that can create an image, examine what it has produced, compare the result with what it was asked to draw and learn from that comparison. The computer might draw a dog, scan the image, ask itself if the picture looks like a dog and then update its programming based on the results.
As part of his doctoral research, Patel has created Eclipse, a resource-efficient tool that takes in text prompts and then produces images. He made a demonstration website where a user can type in a short description of what they would like to see, and the AI tool will generate a picture.
A model of more sustainable artificial intelligence
The work addresses a central problem in AI: the cost of training a model.
Today, most AI solutions are created by feeding large sets of data into networks of computers and “training” models, that is, tweaking the algorithms, or sets of instructions, that the computers use to do their work. For example, a software engineer might supply a computer with thousands of pictures of dogs and then task it with generating its own dog images.
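As a rough illustration of what that tweaking looks like in practice, the sketch below fits a straight line to four data points by gradient descent, repeatedly nudging two numbers until the predictions match the data. It is not the Eclipse training code. The function name train_line, the data and the settings are hypothetical and chosen only for illustration; image models adjust vastly more parameters than the two used here.

```python
def train_line(points, steps=1000, lr=0.05):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # start from an untrained "model"
    n = len(points)
    for _ in range(steps):
        # Average gradient of the squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        # Tweak each parameter a small step in the direction that lowers the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical training data generated from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_line(data)
print(round(w, 3), round(b, 3))
```

Each pass over the data costs computation, which is why larger models and larger datasets translate directly into more processing time and energy.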
But Patel and Yang believe there are better ways of harnessing the power of AI than simply using more and more computers to process more and more data.
“We have created a new model pipeline,” Patel says. “Our model will use a small number of processing units and it can be trained in one to two days.”
The team’s work is concerned with three basic issues: creating an image-generating model that requires less time and computational resources to train, producing a good open-source system that can be reused and, finally, making software that users can train exclusively with their own images.
To make a more efficient image-generation model, they have a few new ideas. One is using a training strategy called contrastive learning, which teaches the computer to tell matching pairs, such as a caption and the image it describes, apart from mismatched ones, so that it learns what information is not relevant to getting the right result.
Patel and Yang are also using adversarial training, a training technique that deliberately attacks the image model and tries to get it to fail.
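A minimal sketch of one widely used contrastive objective, often called an InfoNCE-style loss, appears below. It is an assumption that this resembles what the team uses; the article does not specify Eclipse's loss function. The function name contrastive_loss, the temperature value and the two-dimensional example vectors are hypothetical. The loss is small when each text vector is most similar to its own image vector and large when it is more similar to someone else's.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def contrastive_loss(text_vecs, image_vecs, temperature=0.1):
    """Average InfoNCE-style loss; text_vecs[i] is paired with image_vecs[i]."""
    texts = [normalize(v) for v in text_vecs]
    images = [normalize(v) for v in image_vecs]
    total = 0.0
    for i, t in enumerate(texts):
        # Similarity of this text to every image in the batch.
        logits = [dot(t, img) / temperature for img in images]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Penalize the model when the true partner does not stand out.
        total += log_denominator - logits[i]
    return total / len(texts)


# Hypothetical embeddings: two captions and two images.
texts = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each image sits near its own caption
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each image sits near the wrong caption

loss_aligned = contrastive_loss(texts, aligned)
loss_swapped = contrastive_loss(texts, swapped)
print(loss_aligned < loss_swapped)
```

Training against a loss like this pushes matched pairs together and mismatched pairs apart, which is the sense in which the model learns what is not relevant.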
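The attack half of that idea can be sketched on a toy classifier. The example below is not the team's method, which the article does not detail. It uses a perturbation in the style of the fast gradient sign method, a common way to construct inputs that make a model fail, applied to a two-feature logistic classifier. The weights, input, label and epsilon are all hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that input x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def log_loss(p, label):
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


def adversarial_example(weights, bias, x, label, epsilon):
    """Nudge each feature by epsilon in the direction that raises the loss."""
    p = predict(weights, bias, x)
    # For logistic regression, d(loss)/d(x_i) = (p - label) * w_i.
    grads = [(p - label) * w for w in weights]
    # Keep only the sign of each gradient component: -1, 0 or 1.
    signs = [(g > 0) - (g < 0) for g in grads]
    return [xi + epsilon * s for xi, s in zip(x, signs)]


# Hypothetical model and a correctly classified input.
weights, bias = [2.0, -1.0], 0.0
x, label = [1.0, 0.5], 1

x_adv = adversarial_example(weights, bias, x, label, epsilon=0.5)
clean_loss = log_loss(predict(weights, bias, x), label)
adv_loss = log_loss(predict(weights, bias, x_adv), label)
print(x_adv, adv_loss > clean_loss)
```

In adversarial training, inputs like x_adv are fed back into the training data so the model learns to handle the cases that previously tripped it up.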
“The advantage to this type of training is that we can discover the shortcomings of the current model, deal with its disadvantages and improve the system based on attacking the results,” Yang says.
Closing the black box
A second concern is ensuring that the best tools are open source. Patel notes that some of the top sites, such as Midjourney, are closed-source or so-called “black box” systems. They are privately developed and information about their model training is off limits to other developers. In a world dominated by closed-source AI, models can’t learn from each other’s training and must consume incredible resources to process the data they take in. This is comparable to all professors writing their own textbooks or every city in the world creating its own language.
“Training only one model can produce 10 tons of carbon emissions,” Patel says. “We’re looking at the overall impact to society if we just create model after model.”
As the New American University, ASU seeks to create impactful sustainability solutions.
“The university has a sustainability focus baked into its charter,” says Ross Maciejewski, director of the School of Computing and Augmented Intelligence. “Yang’s work, which is designed to address and minimize the environmental impact of innovative AI, is an important aspect of honoring our commitment.”
Users can use their own images
Patel and Yang are concerned about matters of privacy and ethics. Many AI image tools are rife with controversy about the provenance of the art that was used to train their models.
The Eclipse team (which also includes computer science doctoral students Sheng Cheng and Sangmin Jung, computer engineering doctoral student Changhoon Kim, and Chitta Baral, a professor of computer science and engineering serving in an advising role) has a plan that could enable businesses to deploy their own versions of the Eclipse model and train it using only images owned by that enterprise. Training only on images the business owns would remove concerns about lawsuits or image sources.
These are ideas that are interesting to artists working in the AI space. Erika Gronek, a Fulton Schools photographer, has used AI art tools and even written a book on AI art called Uncanny: AI Speaks for Itself.
“AI isn’t going away,” Gronek says. “It has its critics, and rightly so, but it can also be viewed as another tool in the toolbox for an artist. At the very least, it should be wielded ethically by using proper datasets and sustainably because it can require such immense computing power.”
In June, Yang and the research team will present their work at the prestigious IEEE/CVF Conference on Computer Vision and Pattern Recognition in Seattle. They are also on the lookout for enterprise partners who might want to back further development of their technology.
“We’re trying to figure out the sweet spot where vision and language meet to make critical improvements to the efficiencies of these models,” Yang says.
He also hopes the project will inspire more doctoral and master’s thesis students like Patel.
“Exciting work in AI is being done here at the School of Computing and Augmented Intelligence,” Yang says. “We want to attract and inspire new doctoral and master’s thesis students, help them develop professionally and showcase their efforts.”