In the world of artificial intelligence, there are so-called text-to-image generators. The name is fairly self-explanatory: based on a phrase typed by the user, the system returns an image corresponding to what was written.
Until now, the leader in this type of program was DALL-E, software created by the OpenAI laboratory. Now Google has decided to enter the game with Imagen, announced last Tuesday (24).
Imagen works in the same way as other generators: based on a text, it generates an image. On the page dedicated to the project, it is described as having “an unprecedented degree of photorealism and a keen sense of language”. The images released by the company give a sense of the new tool's potential.
According to Google, Imagen produces better images than DALL-E. To reach this conclusion, the company created a benchmark called DrawBench. It is nothing complicated: the same text prompts were used to create images in several generators. The results were shown to human judges, who chose their favourites. Imagen's results were chosen more often than those of its competitors.
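To make the comparison concrete, here is a minimal sketch of how judges' choices could be tallied into a preference rate per generator. The function name, the model labels and the votes are all hypothetical, made up for illustration; they are not DrawBench data, and this is not necessarily how Google computed its results.

```python
from collections import Counter


def preference_rates(votes):
    """Return the share of judge decisions won by each generator.

    votes: a list of generator names, one entry per judge decision.
    """
    counts = Counter(votes)
    total = len(votes)
    return {model: counts[model] / total for model in counts}


# Hypothetical judge decisions, invented for this example only.
votes = ["model_a", "model_b", "model_a", "model_a", "model_b"]
rates = preference_rates(votes)

for model, rate in sorted(rates.items()):
    print(f"{model}: preferred in {rate:.0%} of comparisons")
```

The generator with the highest rate is the one judges picked most often, which is the kind of claim Google makes for Imagen.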
Problems with Imagen
Despite Imagen's impressive results, caution is needed. After all, the images released were chosen to show off the software's best capabilities, and may not represent an average result.
Another problem with Imagen: even with its huge artistic and creative potential, the program can be used to generate fake news and misinformation, as happened with deepfakes.
The Google team also draws attention to the problems caused by the project's database. Let's take it step by step: systems like this work through machine learning. The software is exposed to huge amounts of data (in the case of text-to-image generators, texts and the images related to them). The program then studies this data to find patterns (for example, associating the word “ball” with images of different types of balls).
The aim is that, with this learning, the program can reproduce these patterns on demand. If I type “football”, it needs to understand not only that I want an image of a ball, but that it is a brown oval ball with visible seams.
To make images as complex as the ones Google released, Imagen, of course, requires a huge amount of data. And the larger the quantity, the more difficult it is to filter. Therein lies the problem: by absorbing this information from internet databases, machines learn and carry with them the same prejudices and stereotypes that spread on the net.
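A toy sketch can show why skewed data leads to skewed output. The "model" below only counts which attribute appears most often alongside a word, which is far simpler than what Imagen does; the function names and the training pairs are hypothetical and deliberately unbalanced for the sake of the example.

```python
from collections import Counter, defaultdict


def learn_associations(pairs):
    """Count how often each attribute appears together with each word."""
    table = defaultdict(Counter)
    for word, attribute in pairs:
        table[word][attribute] += 1
    return table


def most_likely(table, word):
    """Return the attribute seen most often with the given word."""
    return table[word].most_common(1)[0][0]


# Hypothetical training data, skewed on purpose: 8 of 10 examples
# pair the word "doctor" with the attribute "man".
pairs = [("doctor", "man")] * 8 + [("doctor", "woman")] * 2
table = learn_associations(pairs)

print(most_likely(table, "doctor"))
```

Because the program has no notion of fairness, only of frequency, it repeats whatever imbalance its data contains.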
“There is a risk that Imagen encodes harmful stereotypes and representations, which justifies our decision not to release Imagen for public use,” the project team said on its official page. After an initial evaluation, the company identified “various social prejudices and stereotypes” embodied by Imagen, “including a tendency to generate images of people with lighter skin tones and an inclination to portray different professions in line with Western gender stereotypes”.
It is for these and other reasons that Imagen still does not have a release date for the public. Google has committed to addressing “these challenges and limitations in future work”. It is hoped that, with new updates, the program will become a safe tool for generating amazing images from simple texts.