It’s a predicament only Matthew McConaughey could find himself in. So he made a reservation at a restaurant for an outdoor table, but supposedly because the restaurant doesn’t use Agentforce from Salesforce, no one at the restaurant thought to change the poor millionaire actor’s reservation to an inside table when the rain started pouring.
Or, if it started raining after McConaughey was seated, no one on the restaurant staff thought to check in on him. How is that customer doing who has the power to ruin our restaurant if we provide him anything less than top-notch service? Or did they just decide that McConaughey is weird and he just likes dining in the rain?
A server actually goes out in the rain to drop on McConaughey’s table a dish that doesn’t look all that appetizing, and McConaughey says he doesn’t like that food. But the server has already gone back inside, apparently unconcerned that the customer is talking to himself in the rain.
It happens that way only because the commercial is scripted that way. If it happened in real life, it would probably be because of artificial intelligence, of course combined with a major failure of natural intelligence.
When A.I. gets things wrong, it’s usually called a “hallucination.” But A.I. doesn’t hallucinate, unless it has been specifically programmed to behave in a way that mimics an actual human hallucinating.
An abstract painting generated by SDXL 1.0 that was supposed to be a picture of Matthew McConaughey dining in the rain. Color Painting preset on NightCafé, random number seed 3122291035, sampling method K_DPMPP_2M.
I’ve already seen others push back on the use of that term. I think it would be more precise to say the A.I. is “misunderstanding.”
A.I. image generators sometimes give some very good examples of misunderstandings.
To head up this article, I was thinking of using a still from the Salesforce commercial. But since I’m writing about A.I., I figured I might as well use A.I. to create an image for this article. My prompt was simple: “Matthew McConaughey eating in the rain.”
I burned through twenty of my NightCafé credits (I still have more than a thousand left) and got images of varying alignment with the prompt. Some show Matthew McConaughey in the rain but no food. Some show Matthew McConaughey with food but there’s no rain. And some show neither Matthew McConaughey nor any food. So, compared to those, the one that I eventually chose is pretty damn good.
An A.I. like DALL-E 3 creates something new only in the sense that the exact image it produces almost always differs in a tangible way from any image that has been produced before. In that sense it's original, but that's not the same thing as "good."
The purveyors of A.I. encourage users to think of themselves as “artists.” But typing some words into a box (writing a “prompt”) and then pressing a Create button doesn't make you an artist. If you need an image for an important purpose and it needs to be good, you’ll probably need to hire a real artist.
It is impressive that an A.I. can take the words of a prompt and produce something that recognizably corresponds to them. It does also happen sometimes that an A.I. produces an image that looks very good, is in the style you were expecting and correctly corresponds to the prompt.
Much more frequently, however, the images produced look good but don’t quite correspond to the prompt, or they do correspond very well to the prompt but are not in the style you were hoping for, or are even just plain bland. It also happens quite often that one or more words in your prompt seem to have been simply ignored. And once in a blue moon, the A.I. seems to just completely ignore your prompt.
Let’s say for example that you want a picture of Justin Timberlake playing the violin. The A.I. doesn't look up pictures at that moment, but it was trained beforehand on a great many captioned images, presumably including some of Justin Timberlake and plenty of violins. Studying reference images is probably where a human artist would start too, if he or she is not familiar with Justin Timberlake or violins.
The next part is where human artists and A.I. differ considerably. A diffusion model such as Stable Diffusion is trained by gradually adding random noise to images until they look like static, and learning to undo that noise. To generate a new image, it starts from static. Allan Kouidri explains this part of the process very well on the Ikomia guide to SDXL.
And then, somehow, from the static, the A.I. wrests a recognizable image. Well, Kouidri does also explain that part of the process.
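The noise-and-denoise idea can be sketched in a few lines of Python. This is a toy, not SDXL's code: it treats an image as a short list of numbers, uses the textbook linear noise schedule for diffusion models (the 1000 steps and the beta values are assumptions chosen for illustration), and cheats by remembering the exact noise it added. A real model keeps no such record; a trained neural network has to estimate the noise instead, and SDXL does this on a compressed version of the image rather than on raw pixels.

```python
import math
import random


def alpha_bar_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after each step (linear beta schedule)."""
    alpha_bars = []
    product = 1.0
    for t in range(steps):
        beta = beta_start + (beta_end - beta_start) * t / (steps - 1)
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def add_noise(pixels, alpha_bar, rng):
    """Mix the pixels with Gaussian noise; a small alpha_bar means mostly static."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(alpha_bar) * p + math.sqrt(1.0 - alpha_bar) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise


def remove_known_noise(noisy, noise, alpha_bar):
    """Undo add_noise exactly. Only possible here because the noise was saved."""
    return [
        (x - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
        for x, n in zip(noisy, noise)
    ]


if __name__ == "__main__":
    rng = random.Random(1364152073)  # the seed from the Justin Timberlake image
    pixels = [0.1, 0.5, 0.9, 0.3]    # a made-up four-pixel "image"
    bars = alpha_bar_schedule(1000)
    noisy, noise = add_noise(pixels, bars[-1], rng)
    print("static:  ", noisy)
    print("restored:", remove_known_noise(noisy, noise, bars[-1]))
```

By the last step almost none of the original signal is left in the noisy list, which is why it looks like static. Everything that makes the real thing work lies in how well the network guesses the noise.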
Image of Justin Timberlake playing the violin generated by SDXL 1.0 using the Modern Comic preset, with seed 1364152073, sampling method K_DPMPP_2M, no CLIP guidance.
In this example image, you can recognize Justin Timberlake and you can recognize that he’s playing the violin. It's actually quite good, if you don’t scrutinize where his right hand is relative to the bow. And please don't scrutinize the background!
If a picture is worth a thousand words, then maybe a thousand words are needed to generate an image. The prompts for some of the best A.I.-generated images are very elaborate and intricate, and they use a special syntax that is not all that intuitive even to professional computer programmers.
Also, there are several decisions for human operators to make before the A.I. starts its work, such as:
- Algorithm (or model). Choices include DALL-E 2, Stable Diffusion, SDXL, etc.
- Aspect ratio. Choices include 4:3 and 9:16, etc. Might default to 1:1.
- Runtime, how long to run the algorithm: short (almost instantaneous), medium, long (might take several seconds).
- Start image, to work in conjunction with the prompt. May be omitted.
- Overall prompt weight percentage, in the range from 0% to 100%. Might default to 50%. Lower percentages tend to result in images more like the corresponding start image, when applicable. I’m not sure if overall prompt weight matters when there's no start image.
- Refiner weight percentage, for algorithms that work by gradually refining a rough initial attempt. Might default to 50%; can be turned off altogether with 0%.
- Seed number for a pseudorandom number generator, range might be something like 0 to 4000000000. For example, 2665828219.
- Sampling method. I don't know what's being sampled with this one. Choices include K_EULER, K_HEUN, K_DPMPP_2M, etc.
NightCafé offers a lot of help with these settings, by bundling some of the most popular choices into presets.
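The settings listed above amount to a small configuration record, and the seed is the one that makes a result repeatable. Here is a minimal Python sketch under loud assumptions: `GenerationSettings` and `starting_noise` are hypothetical names invented for illustration, not part of NightCafé or any real image generator's API, and the default values are only the "might default to" guesses from the list.

```python
import random
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical bundle of the choices made before generation starts."""
    prompt: str
    model: str = "SDXL 1.0"
    aspect_ratio: str = "1:1"
    runtime: str = "short"
    start_image: Optional[str] = None   # may be omitted
    prompt_weight: float = 0.5          # 0.0 to 1.0
    refiner_weight: float = 0.5         # 0.0 turns the refiner off
    seed: int = 0
    sampler: str = "K_EULER"


def starting_noise(settings, count=4):
    """Stand-in for the static a generator starts from, derived from the seed."""
    rng = random.Random(settings.seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


if __name__ == "__main__":
    first = GenerationSettings(
        prompt="Justin Timberlake playing the violin",
        seed=1364152073,
        sampler="K_DPMPP_2M",
    )
    second = replace(first, seed=2665828219)  # same settings, different seed
    print(starting_noise(first) == starting_noise(first))   # same seed, same static
    print(starting_noise(first) == starting_noise(second))  # new seed, new static
```

A preset, in these terms, would just be a ready-made `GenerationSettings` with everything but the prompt filled in. Reusing a seed with identical settings should reproduce the same starting static, while changing only the seed gives a different starting point for the same prompt.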
When a human artist makes a mistake, the patron can usually explain to the artist what the intended artwork was supposed to look like and how the produced artwork fell short. When the A.I. fails to produce what you want, how do you explain it in terms of random numbers and sampling algorithms? I don’t know.
Sometimes you can’t even figure out why the A.I. messed up. The first time I noticed the violinists in the background of Justin Timberlake playing the violin, I was completely puzzled as to why they were playing fragments of broken violins.
It wasn’t until much later that it occurred to me that the A.I. misunderstood photos of violinists with other violinists behind them. Let's say there are four violinists playing, seated in a row, and behind them there is a row of another four violinists.
From your vantage point, your view of the back row musicians’ instruments is obscured, but you understand that they’re also playing violins. Or maybe they’re playing violas; many people can’t tell the difference visually between violins and violas. But you understand that the back row musicians are playing instruments that are very similar if not the same in shape and size.
Look at this stock photo of women violinists. Look at the woman in the center. The camera’s view of her instrument was partially obstructed by a music stand with sheet music on it.
Photo by Lucas Oliver of women playing violins in an orchestra. This is a stock photo from Pexels that has been downsampled. A higher resolution image is available for free from Pexels.
An A.I. looking at this image might actually conclude that the woman in the center and the woman next to her are playing different instruments. The woman in the center would be misunderstood to be playing a violin fragment. Quite likely something similar happened with some of the images that SDXL learned from before it produced the image of Justin Timberlake playing the violin.
By the way, the stock photo is free from Pexels, but Pexels did suggest I could donate. I obtained a JPEG sized at 6,000 by 4,000 pixels, which I downsampled to 1080 by 720 only because I want it to load quickly. A similar image from Adobe Stock or Getty Images could cost me $50 easy, and a lot more on the latter.
If one of your photos or paintings is used as an input to an A.I. image generator, are you entitled to any monetary compensation? Some artists say that any use of their images by A.I. is theft, just another way to deny artists their rightful wages. Though the argument could also be made that it’s not all that different from another human artist drawing on your work for inspiration.
The main fear with A.I. image generators is of course that artists are going to be put out of work. Why pay Getty Images $500 for a high quality image made by a human artist when you can maybe get a good enough image for less than $1 from an A.I.? Pricing on NightCafé starts with 100 credits a month for $4.49, and you can also earn credits for various activities on the platform.
The danger of human artists being replaced might be overblown. Nevertheless, it has already caused a lot of nuisance, and a cheapening of real art by real artists.
But there’s also an opportunity here to educate the public about what makes real art: how it comes about by a combination of a human artist’s natural aptitude, lived experiences, training and practice, and an understanding of the work of other artists working in similar media or genres. That can’t be simulated by computers rolling dice.
“There is no art without intention,” Duke Ellington once said. Maybe that explains why art generated by “artificial intelligence” is so often lacking in certain qualities that we find in the best art by humans and even in the mediocre art of people who would much rather delegate the creation to an A.I.
As for restaurant owners, they should probably not be in a rush to sign up for Salesforce with Agentforce.