The AI among us

Very few things in this world take me by surprise, but I must say I was surprised to hear about the latest artificial intelligence tools that have been made available to the public. There are now a number of AI image generators that produce an image from a text prompt. While there are some limits on how these technologies may be used, for the most part the images they produce are nothing short of amazing and fun.

After some time experimenting with these image generators, I realized that better ones are coming in the near future, since the current results are mostly flawed. Even so, some of the results are indistinguishable from something that exists in reality, which leads me to believe that the Internet will be saturated with such images and that many people will be deceived, because the images can be created so quickly. Having said that, I expect regulations to be instituted over the use of such images.

Some generators I have found:

Craiyon: a lower resolution AI image generator. Not much has been done to prevent bias, but bias is an indicator that should be addressed, highlighted, and rectified rather than masked in the data. Developers should make it clear that the bias exists and what to use in the prompt to get the desired results. This one is free to use and has ads. The potential for misuse is there; however, restricting access is detrimental to the development of the AI.

One example of the Craiyon results for the prompt “A tavern in the middle of the desert”:

Craiyon’s image generation of “A tavern in the middle of the desert”

You will notice that the results were not consistent: some are drawings and some are an attempt at realism. For the most part, though, it generated a decent variety of images, and it took less than 2 minutes to create them.

OpenAI’s DALL-E 2: a higher resolution engine that produces interesting results. They have actively addressed bias using some sort of algorithm. They have set up a payment structure in which every $15 buys 115 credits, and one credit generates four images (it used to be six). They also have a very restrictive content policy that could easily get your account banned and cost you your money. Images took less than 30 seconds to generate.
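To put that pricing in per-image terms, here is a rough back-of-the-envelope calculation in Python. It uses only the figures quoted above ($15 for 115 credits, four images per credit, previously six); the function and variable names are just for illustration, and actual pricing may have changed since.

```python
# Figures quoted in this post; treat them as a snapshot, not current pricing.
PRICE_USD = 15.0
CREDITS = 115
IMAGES_PER_CREDIT = 4        # current figure mentioned above
OLD_IMAGES_PER_CREDIT = 6    # earlier figure mentioned above


def cost_per_image(price_usd, credits, images_per_credit):
    """Return the price of a single generated image in dollars."""
    return price_usd / (credits * images_per_credit)


total_images = CREDITS * IMAGES_PER_CREDIT
current_cost = cost_per_image(PRICE_USD, CREDITS, IMAGES_PER_CREDIT)
old_cost = cost_per_image(PRICE_USD, CREDITS, OLD_IMAGES_PER_CREDIT)

print(f"Images per ${PRICE_USD:.0f}: {total_images}")
print(f"Cost per image now: ${current_cost:.4f}")
print(f"Cost per image at six per credit: ${old_cost:.4f}")
```

In other words, $15 works out to 460 images, or a little over three cents each, compared with roughly two cents each when a credit produced six images.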

The results for DALL-E 2’s image generation of “A tavern in the middle of the desert”:

OpenAI’s image generation of “A tavern in the middle of the desert”

You will notice that because I was not specific about the art style in the prompt, it defaulted to a realistic, photo-like representation of what was requested.

Google’s Imagen: a high resolution image generator with restricted access. It looks quite powerful, but again the bias does not appear to have been addressed, so instead they are restricting access for everyone.

creator.nightcafe.studio: an impressive art generator, but it does not offer the ability to resell. They are working on a newer, more powerful one; however, their existing one is limited to specific styles, which are basically prompt generators, and none are photorealistic. Used for NFTs, etc.

Results from Nightcafe for “A tavern in the middle of the desert”:

Nightcafe’s rendering of “A tavern in the middle of the desert”

Only a single image was generated here, and it took less than 1 minute.

DeepAI: a public image generator with an API focus. It does not seem to have sufficient data to generate some images.

DeepAI’s rendering of “A tavern in the middle of the desert”

Several others are starting to come “out of the woodwork” and are beginning to crowd the market with services that connect to the APIs of the services above.

In short, expect to see “prompt engineers” whose job is to create the prompt used in an image generation request, because getting it right can be tricky. It is not so difficult, however, that you need to pay 25 times the actual cost when, for example, $2 can generate 25 images. Learning a simple language, so to speak, for telling the AI what image to generate is easy and will save you lots of money.
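As a small illustration of how simple that “language” can be, here is a minimal Python sketch that assembles a prompt from a subject plus optional style hints such as “photo realism”. The `build_prompt` helper is hypothetical and not part of any generator's API; it only builds the text you would paste into the prompt box, and how well a given hint works will vary from one generator to another.

```python
def build_prompt(subject, modifiers=()):
    """Join a subject and optional style hints into one comma-separated prompt.

    Hypothetical helper for illustration only; it does not call any service.
    """
    parts = [subject.strip()]
    # Skip empty or whitespace-only hints so the prompt has no stray commas.
    parts.extend(m.strip() for m in modifiers if m.strip())
    return ", ".join(parts)


subject = "a lady sitting on a street corner eating ice cream"

plain_prompt = build_prompt(subject)
realistic_prompt = build_prompt(subject, ["photo realism"])

print(plain_prompt)
print(realistic_prompt)
```

Keeping the subject and the style hints separate makes it easy to try the same idea in several styles without retyping the whole prompt each time.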

For example, punching in “a lady sitting on a street corner eating ice cream” produced the following from three different image generators:

DALL-E 2:

OpenAI’s image generation of “a lady sitting on a street corner eating ice cream”

Notice that there is no apparent racial bias in these results, as the image generator produced only one white person.

Craiyon:

Craiyon’s image generation of “a lady sitting on a street corner eating ice cream, photo realism”

Here the race is not clear in some of the images, but the majority of the figures are white, with no faces. Creepy.

Nightcafe:

Nightcafe’s rendering of “a lady sitting on a street corner eating ice cream, photo realism”

Nightcafe did not produce anything worth using.

In conclusion, despite the problems we have today with deep fakes and other AI-generated images, there is a benefit to having such technologies in the playground. Such innovative technologies always lead to something greater than what was done originally. Soon, you will see AI with the ability to generate a movie on demand from a storybook. Once image generation has been mastered, moving images are next.