Unless you’ve been living way out in a desert, you’ve probably heard of the recent proliferation of Artificial Intelligence driven image generators like Dall-E and Midjourney. Folks are wringing their hands in despair, as if this signals the end of life as we know it, if not the world itself. You know, like folks did with the printing press, and the fountain pen, and the typewriter.
Look, the technology can be scary. We don’t understand the far-reaching implications of what A.I. can do or what it means from a sociological standpoint. But you might as well make your peace with it. That particular genie is not going back into the bottle. Calling for the abolition of A.I. at this point is really no different from those folks in 1900 who were calling for the abolition of automobiles because horses and buggies made more sense. Those people are largely forgotten by history, and for good reason.
It is scary what A.I. image generators can do. I’ve been tinkering with Midjourney, Dall-E, Stable Diffusion, and a few others for months now. They can do amazing things. And even while I was rather thrilled by the possibilities, almost from the beginning I kept telling the people I showed generated images to, “don’t believe anything you see now.”
One common refrain I’m hearing is that A.I. is going to put artists out of work. That could be true in some individual instances. But overall I don’t think it’s going to be a widespread problem. Here’s the thing. As I said, I’ve been tinkering with A.I. art generators for months now. I can tell you, there are some significant limitations, none of which will be overcome by advancements in the algorithms any time soon. You can go to Midjourney right now and put in a simple prompt like “Portrait of a beautiful young woman” and the bot will generate a pleasing image of a pretty girl. Probably a white girl (yes, there are inherent biases). What you’ll initially get is a set of 4 images to choose from.
Here are images I literally just generated with that prompt.
These images were created by the A.I., drawn from the A.I. perusing millions of photographs and paintings, and collectively this is what it thinks society in general means when it asks for a portrait of a beautiful young woman. Now, the reason there are 4 images is that you’re supposed to pick the one you like best, which is closest to what you had in mind, and either render it or evolve it. If you’ll notice, none of them are perfect. The A.I. has issues with eyes. And often noses. And it’s best to stay away from fingers if you can. But overall, the results are pretty impressive. For the sake of expediency, I’m going to choose the bottom right image to upscale, which means the A.I. will make a larger and, hopefully, more detailed version. I could have just as easily told the A.I. to use that image as a jumping-off point to make variations of that one image, and then picked my favorite from those. That’s how you start inching toward an image which is, bit by bit, closer to the image you had in your head.
See, here’s the thing. If you’re just looking for a quick image, it’s easy to pop in a description and get a nice image. Does this put an artist out of work? No. You weren’t going to hire an artist to render an image of your random thoughts anyway. And if you want a very specific image, of a person wearing special clothing such as a Starfleet uniform, or with a slight scar above one eyebrow, or with an identifying tattoo, you’re going to have a very hard time getting something that specific. A.I. is just no good at that. You can try. You can change your prompts. You can modify various images, and add different things. You can even take an existing image which you’ve just generated and feed it back into the A.I. to use that image as a jumping-off point for an entirely new prompt, which maybe goes off in a wildly different direction.
You know what? Let’s try that. Here’s the upscaled image I made of the girl.
As you can see, it’s a little better than the smaller image. The quirk on the nose has largely been corrected. Her neck is a little weird, though. It looks a little “vapory” on the left side. I think maybe the A.I. was going for dangling hair but made it just a bit too undefined. Still, it’s a pretty good image. The only thing is, since I used the word “portrait” the A.I. basically composed the image like a glamor shot or a traditional portrait. And since we didn’t put in any special parameters, it made an image that looks a bit like a painting, which is the style Midjourney defaults to.
Let’s try something different.
Midjourney will let you use specific parameters to change aspects of your image. You can change the aspect ratio of the image. You can trigger more realistic renderings using either --testp or --v 4. You can trigger more artistic renderings using --creative. There are a lot of different options available. Part of using the A.I. image generators successfully is learning these parameters and how they’ll apply to different things and make different images.
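If it helps to see the shape of it, here is a minimal sketch in Python of how a prompt is put together: the plain description first, then any parameters tacked on the end, each typed with two hyphens. The build_prompt helper is my own hypothetical name for illustration. It only assembles text and doesn’t talk to Midjourney at all.

```python
def build_prompt(description, *parameters):
    """Join a text description and any parameters into one prompt string.

    Hypothetical helper for illustration only: it just assembles text.
    """
    parts = [description.strip()]
    # Skip empty parameters so we never end up with stray spaces.
    parts.extend(p.strip() for p in parameters if p.strip())
    return " ".join(parts)


description = "Portrait of a beautiful young woman"

prompts = [
    build_prompt(description),                           # default rendering
    build_prompt(description, "--v 4"),                  # v4 engine
    build_prompt(description, "--testp"),                # more realistic
    build_prompt(description, "--testp", "--creative"),  # more artistic
]

for prompt in prompts:
    print(prompt)
```

Each line it prints is something you could paste in as a prompt. The description stays the same and only the parameters on the end change, which makes it easy to compare what each one does.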
Let’s do a quick example before I do anything else with the portrait of the girl. Let’s try the new v4 rendering engine. I’m going to put in the same prompt, but I’m going to add “--v 4” at the end of it, which tells the A.I. to use its relatively new v4 engine to render our image. This is the result.
As you can see, these are still portraits (we haven’t changed that description), but they’re far more refined images. This is because the v4 engine / algorithm (I don’t know the proper terminology here) is Midjourney’s latest, and it’s better at generating images than previous algorithms (like the default we used for the original images). Again, the A.I. gave us 4 images to choose from. I’m going to upscale the first image, because I think it’s closest to the original image. Although, really, any one of these images is great. Again, it’s about giving you choices. You pick the one closest to what you were thinking, and you either upscale it or evolve it further. I’m just going to upscale that first image so we can look at the detail and depth of color. Could an actual living and breathing artist do this image, and possibly (probably) do it better, including specific details that I haven’t mentioned yet but which we could talk about? Yes, of course. Absolutely. Would I have commissioned an artist for this exercise? No. Of course not. It’d be wasteful to do that just to make a point, and I’m not rich, so I can’t afford to do that.
What the A.I. did was create a great image based on my prompts. It was fast. And it was free. And if I needed a quick character image to reference in a story or to help visualize a character, this would be perfect. It fills a niche which would not be filled by an actual artist. Who hires an artist to help them visualize their thoughts? Maybe billionaires do. I don’t know who else could afford it.
Now, I like the result. It’s a great image (posted below). And depending upon what I’m looking for, I could go with that and work toward a more realistic image. Or I could use --testp to get a different flavor. Or --creative. I’m going to make 3 images with the following prompts and see what I get. Here are the prompts I’m going to use.
- Portrait of a beautiful young woman --v 4
- Portrait of a beautiful young woman --testp
- Portrait of a beautiful young woman --testp --creative
Here are the results.
Prompt: Portrait of a beautiful young woman --v 4
Prompt: Portrait of a beautiful young woman --testp
This one used the --testp parameter. Like the default it presents multiple images, but instead of 4 it gives you 2. As you may notice, the images are somewhat different. They’re more realistic than the original. But if you look closer you’ll see that it’s still given to the same issues with eyes being slightly wonky. And if you look at the image on the right, you’ll see that it still has problems with hands. What happened to your pinky finger, love? Depending on what you’re going for, that could be useful. If I was writing a story with a particular character and was fleshing out her look in visualization, I might add that to her character, that she’s missing part of her pinky finger and is self-conscious about it.
This is one of the overlooked aspects of working with the A.I.: its randomness. If you approach using the A.I. as a collaborative process, your perspective on the whole thing changes. You’re still in control of the process. You’re still directing the evolution and generation of the images. Yes, you can put in basic stuff and get nice images. But don’t be fooled. The A.I. gives those same kinds of nice images to everybody. I had some early tinkerings that I was very proud of. And then I found images on other people’s profiles that were almost exact copies of what I had.
Simply put, left to its own devices, the A.I. will always come up with nice images. But they’re going to be very generic. If you want something unique, you’re going to have to work for it. This is in contrast to the general perception a lot of people have, that you just put in some text and out comes Rembrandt. Sure, you can create something that looks kind of like a Rembrandt painting, but it’s not going to fool anyone. That’s the inherent flaw of A.I. It has no inherent creativity of its own, however much a surface scan of A.I. images might suggest otherwise. When you generate a lot of A.I. images, you realize rather quickly that you keep getting the same results.
Here’s the --testp --creative example.
Prompt: Portrait of a beautiful young woman --testp --creative
Hopefully the first thing you’ll notice is that these images are a little different. That’s because of the --creative parameter. Now, --testp is included in this prompt as well because you can’t use --creative without --testp. What the --creative parameter does is ask the A.I. to, well, be more creative, at least as much as an A.I. can be creative. I don’t know the specifics of how this works, but you can see that it uses the same basic idea but does take a slightly different spin on it. There are slight variations in the theme. But please notice, again, that these images have inherent flaws particular to the A.I. On the woman on the left, for example, the eyes are wonky and her hand is weird. For one thing, assuming that’s her left arm, which is a reasonable assumption, her thumb is on the wrong side of her hand. If you don’t look too closely, at first glance it’s just a normal hand. But upon close inspection, there are obvious issues. That goes back to the inherent flaw of A.I. It has limitations. The woman on the right is better overall, because we’ve avoided hands altogether. But her eyes are weird. She has no irises for one thing. That’s an easy enough fix in Photoshop, but it does speak to the limitations of A.I., even in an otherwise perfect image.
But… the algorithms are getting better.
To show how things can evolve, I’m going to experiment a little bit here. I’m going to upscale a couple of the images in the examples above. And then I’m going to use them as image prompts to create new images, to show what the newer v4 algorithm can do using those images as a base with a different prompt. I’m going to modify the prompts slightly to take these images in new directions, to illustrate the nature of evolving images with A.I. Below are those prompts and the results (I’ve removed the image links because they’re rather long and unwieldy). I’ve included the original image and the modified version for comparison.
Prompt: <IMAGE LINK> Portrait of a beautiful viking princess --v 4
Prompt: <IMAGE LINK> Portrait of an aging, weary woman who is reminiscing about her past --v 4
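An image prompt has three parts in a fixed order: the link to the base image, then the new description, then any parameters. Here is a rough sketch of that structure in Python. The build_image_prompt helper is a hypothetical name of my own, and "<IMAGE LINK>" is the same placeholder used in the prompts above, not a real address. Like any prompt, the result is just text you would paste in yourself.

```python
def build_image_prompt(image_link, description, *parameters):
    """Assemble an image prompt: base image link, new description, parameters.

    Hypothetical helper for illustration only: it just assembles text.
    """
    link = image_link.strip()
    if not link:
        # Without a base image this is just an ordinary text prompt.
        raise ValueError("an image prompt needs an image link")
    return " ".join([link, description.strip(), *parameters])


# Placeholder only; the real links were removed from this article.
IMAGE_LINK = "<IMAGE LINK>"

viking = build_image_prompt(
    IMAGE_LINK, "Portrait of a beautiful viking princess", "--v 4"
)
weary = build_image_prompt(
    IMAGE_LINK,
    "Portrait of an aging, weary woman who is reminiscing about her past",
    "--v 4",
)

print(viking)
print(weary)
```

The base image stays fixed while the description changes, which is what lets you push the same starting picture in wildly different directions.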
These last two sets of images are meant to illustrate how images can be modified. And how quickly. All of the images here were generated in the background this morning while I was at work. In short, I put in prompts and then went back to my regular duties while those images rendered in the background. I started writing this shortly after I got to work this morning at 4:30 am. In about 12 minutes I will have been at work for 2 hours. In those 2 hours I generated all the images in this article (but it definitely didn’t take 2 hours to render them – that was just how much time I spent doing them in the background). Now, please explain to me how this article would have been possible if I had needed to commission an artist to create them (assuming I could even afford to do so).
This is the strength and promise of A.I. image generation. The threat of artists being put out of work has been overblown. I am an artist myself, although I’m nowhere near accomplished enough to do it professionally or even to create images of the same quality that the A.I. can generate. But as an artist I can appreciate other artists’ concerns about A.I. training on their work which may be posted on the internet. Hell, I’ve tried creating images in the style of many of the great artists I admire. Caravaggio. Hieronymus Bosch. Boris Vallejo. Frank Frazetta. Luis Royo. And many others. And the A.I. has known who I’m talking about.
What I found rather quickly, though, was that while the A.I. was able to create images that were reminiscent of the styles of those artists, it really didn’t nail any of them. As I pointed out above, it’s not as simple as putting in a prompt of “portrait of a beautiful young woman in the style of boris vallejo” and suddenly having a Boris Vallejo painting you can go to print with. You really have to work hard with prompts and base images to get anything near an image that looks like an actual Boris Vallejo painting, and even then it’s unlikely to fool anyone. So, A.I. isn’t going to put Boris Vallejo out of work any time soon.
Just for the heck of it, since I mentioned Boris Vallejo, let’s generate two sets of images using “portrait of a beautiful young woman in the style of boris vallejo” as the prompt, to illustrate the possibilities here, one with the older algorithm and one using v4. The prompts are “portrait of a beautiful young woman in the style of boris vallejo” and “portrait of a beautiful young woman in the style of boris vallejo --v 4”.
As you can see, the older algorithm made some interesting images for the first set, but they’re not going to make anyone think “Wow, Boris Vallejo!”
The v4 algorithm did much better with the second set, and captured more of the signature nuance and color of Vallejo’s work, but it still wouldn’t fool anyone. I think the first image of the v4 images comes closest, though. You could continue to evolve the image to get closer to something that looks like Vallejo’s work. But that doesn’t just happen solely by putting in a prompt.
For me, the promise of A.I. is with original work, not copying others. Once I realized I could use existing images as base images for new prompts, I started tinkering around with my own art. Sure, I don’t draw much anymore. Honestly, I couldn’t tell you the last time I drew something. But I did have some old art lying around, tucked away in drawers here and there. I decided to use some of those images as base images to see what the A.I. would build on top of them. Below are a few of my experiments, with the original image on the top and the A.I. generated image below (which, again, used the image on the top as a base to make the image below it).
For me, this opens up a world of possibilities as an artist. And this represents the possibility of A.I. that I really don’t see anyone talking about: using A.I. as an artistic tool as part of the creative process. In fact, while I couldn’t tell you the last time I picked up a pencil and drew something, the possibilities and potential of using my drawings as part of A.I. generated art have certainly given me an itch to get back to it.
Of course, there are ethical issues here. For example, while the A.I. generated images that I made based upon my work would not exist without the original art and without my prompts, is the A.I. generated art my art, or is it a collaboration? And if it’s a collaboration, does the co-credit go to the A.I. or to Midjourney?
My gut instinct is that it’s absurd that we have to have that conversation, but it’s inevitable because we’re still defining what this is. However, I would point out that when Eric Clapton created a song, no one expected him to give collaborator credit to Fender guitars, even though his music wouldn’t exist in the same way without his iconic Stratocasters. Just like the guitar in that example, A.I. is a tool. Yes, I know that’s not an exact comparison, but it works at its most basic level. A.I. is not really the issue we’re wrangling with; the issue is how unscrupulous human beings might use A.I. for their own nefarious schemes. But don’t fool yourself. Humans have been creating fake images from the very beginnings of photography. And counterfeit paintings were an issue long before photography came along, when quite accomplished painters would copy the works of more famous painters to peddle on the art markets. That’s nothing new.
Here’s another one of mine that I think turned out well. I like my original drawing, but I certainly like what the A.I. built upon it as well. I’m curious as to what can be done with this approach.
It should be fairly obvious by now that I don’t think A.I. is the problem. You’re not going to regulate or legislate what can or cannot be done with A.I. As I said in the beginning, that genie is out of the bottle, and there’s no going back. I’m sure politicians will parachute in for their photo ops and grandstanding, and they’ll eventually cobble together some ill-advised legislation that’ll make A.I. tools harder for anyone but the wealthy and corporations to use. That is such a sure thing that I’ve been generating as many images as I can while I still can, because I expect that either the politicians will outlaw A.I. or regulate it into the periphery of popular culture, or all of these companies that currently provide A.I. image generating services will be bought up by massive corporations, which will then wall off these capabilities behind paywalls and shove them into expensive software that few can afford. That’s the way it usually works out.
Anyway, for the moment it’s the great frontier where A.I. is concerned. But geez, y’all. The one thing I don’t hear about in the conversation is the possibilities and the promise of A.I. All you hear are the doomsayers predicting the end of all human creativity.
Here’s how it’s affected me. I’ve been working on an album and a collection of short stories in which each song on the album corresponds to a story in the book. For the album, I planned to create a booklet, with an image which represented each song / story, to collectively make a visual narrative people could connect with. My plan was to create the artwork myself, despite my limitations and time constraints (working full time and taking college classes), and just let it be what it would be. Yeah, my art would be colored pencils, but I have Photoshop skills and I thought I could elevate the images I drew.
But somewhere in the middle of all that, the A.I. image generators came into the public consciousness (and mine, as well). I started tinkering around with Dall-E, Stable Diffusion, and Midjourney. I generated thousands of images trying to figure out how to use this stuff. I tried many different approaches. I used straight descriptions of the images I wanted. I got interesting images by plugging in song lyrics and poetry. I used image prompts to generate new images. In the process I generated about a dozen images that I’m going to use for my album cover and book cover, as well as a bunch of images I’m going to use in a couple of videos.
A.I. opened up a whole new world to me, by letting me move beyond my own simplistic talents into hallowed territory usually reserved for professional artists. Now, keep in mind, I was never going to commission an artist to do the artwork for me. I not only couldn’t afford it, but I would have just done my own if that were the case and let the pieces fall where they may. No, the A.I. became a collaborator that took me in directions I wouldn’t have expected. And the possibilities are astounding. I’ve already started compositing multiple images I generated into new images. It really does seem like the sky is the limit.
There are certainly a lot of ethical issues involved here that need to be sorted out. I mean, right now you can’t even register your original A.I. based art, no matter how that was generated, because the American government is perpetually about 10 years behind the times. And, of course, we’ll have to go through the obligatory preening by politicians who won’t understand the first thing about A.I. but will insist on being seen as doing something about it. So they’ll pass legislation that will ultimately damage the advancement and evolution of this technology. They always do.
But there are no more ethical issues involved here than there were for the printing press. It wasn’t so long ago that if you created art on a computer, some people didn’t think you were a real artist. And there was some debate back then about whether computer-based art was real art and could be copyrighted (they ultimately got that right), because most idiots thought, like they do with A.I., that you just pressed a button on your computer and it generated Rembrandt.
We will need to address these many questions. We’ll have to wrestle and scuffle and angrily stake out territory to protect these emerging technologies from the stifling hands of the uninformed and misguided. But we’ll answer those questions. We’ll figure it out. Artists will continue to survive and thrive, no worse for wear. I mean, if your work is prolific enough that A.I. is training on it, you’re probably not going to be hurt financially by some guy in Florida, like me, using A.I. to generate reasonably professional looking images to use on his home recorded CD. And while there will certainly be a rush to put out a lot of A.I. generated content, people will quickly realize (because most human beings don’t have much of an actual imagination) that the images that will proliferate will be very similar, and whatever buzz rises around them will eventually die back down. One day we’ll get to that point where A.I. generated art is so ubiquitous that no one is impressed by it. You know what will be the difference then? The difference will be that undefinable something that creative artists bring to their own work. That “something” that separates the Van Goghs and the Matisses and the Picassos from all the other artists who are, technique-wise, just as good or as technically proficient. A.I. will eventually just be seen as another tool. And eventually geniuses will come along who can use A.I. to go to new areas of human expression that we haven’t even thought of yet.
We’ll have to solve the issue of deep fakes, and the myriad other ethical issues that have arisen because of A.I. But we’ll figure it out. We always have. Folks should calm down. The printing press didn’t destroy human creativity. Newspapers didn’t destroy conversation. The automobile didn’t end the world. Neither did computers. There was a time when the dry town I grew up in was voting on whether to allow the sale of alcohol within city limits, and there were people insisting that there’d be alcoholics lying drunk in the gutters and the very fabric of the family would begin to unravel, none of which happened.
All things will not end this time, either. So relax. Take a breath.
Let’s poke the A.I. and see what it has for us.
And having said that…
Prompt: Let’s poke the A.I. and see what it has for us. --v 4
Okay, so… that’s not helpful…