One of the latest Midjourney feature releases, Character Reference, is the final nail in the stock photography coffin and a potentially severe blow to several other creative fields. It will likely unlock a wave of creativity as it “democratizes” many aspects of what is required to illustrate stories. It feels magical. But I’m struggling with how we got here. Let’s talk about that first.
The fact, and I do believe it is a fact, that Midjourney and others were trained on millions, if not billions, of copyrighted images is troubling. While I can make arguments, and have, that what they have done is no different from me looking at other photographers’ work for inspiration, it is different. But is it unethically different? It’s a nuanced issue that raises complex ethical questions around intellectual property rights and fair use.
On the one hand, training an AI on billions of existing images could be viewed as transformative, creating something new and valuable from that data, similar to how I gain inspiration from other photographers’ work. My experience is this: in 99.9% of instances, the AI generates new images, not replicas of the training data. That is where I think this could potentially fall under fair use principles.
However, the massive scale of copyrighted images ingested without permission from artists and photographers is unprecedented. While I might be able to look at a small number of photographs for inspiration, the mass indexing of billions of images objectively goes far beyond that. There are real economic impacts on artists and photographers when their work is appropriated into these models without compensation or control over its use.
When I can produce an image in seconds by simply typing in “A stunning image of a boxer after a tough fight, shot in the style of Annie Leibovitz,” it’s hard to understand how this is not a problem. I can simply add Tom Cruise to the prompt and, seconds later, have an image of him that he never consented to. Does he need to approve? Midjourney gave me the copyright to the resulting image. What does that even mean in this new world of generative AI?
It also raises more than a few privacy concerns. It seems evident to me that personal images may have been sucked up and used to train these models without consent. And representational biases could be baked into the models based on what images were available for training. Still, we can’t know for sure since there is almost no transparency around the training data.
Ultimately, whether this is unethically different from gaining inspiration comes down to examining issues like the transformative nature of the use, its scale, the economic impact, privacy violations, representation in the training data, and whether copyright holders’ rights were truly respected. Reasonable minds may disagree, and this will likely need to be settled in the courts.
Personally, while I believe the development of AI image generators is incredibly valuable, I do lean towards this crossing an ethical line in its unconsented use of intellectual property at a massive scale. Some form of consent or compensation, such as an opt-out mechanism or a licensing plan, would have been more appropriate.
As you can see, I’m struggling with this and think I should be. If these systems were built unethically, what stand should I take? Should I boycott Midjourney, as I did Facebook, and all other generative AI companies until they become more transparent about how their AI works? Should we ban them, like Napster, only to have huge incumbents swoop into the void? Would it make a difference? The genie will not be put back into the bottle. All of this troubles me.
But as I tweeted a few months ago, a technology’s most dangerous pathways are discovered only through its use. The pioneer’s role is to identify the dangerous trails and warn us, allowing us to close them off, secure them, or seek alternative routes. So, I feel compelled to use these tools to understand their potential for both good and harm.
The Future Has Arrived
The future has not looked bright for stock photography for a couple of years. It was clear from the start that this technology would make a real dent in the creative market. But with the introduction of this simple way to get consistent character control in Midjourney’s popular image generator, stock photography, at a minimum, is on the verge of becoming history.
I wrote this in August of 2022, “The future is not looking good for stock photography and illustration or for the photographers and artists who supplement their income with it. After spending some time with Midjourney and DALL-E 2, I can feel the shift. The day is fast approaching, and the ability to generate a unique high-quality image of just about anything you wish to conceptualize will be possible.”
“The day is fast approaching.” That day likely passed a few months ago, but Midjourney’s “character reference” release has sealed stock photography’s fate. It was released in its test version on March 11, 2024, but I am just now testing it. And frankly, it’s brilliant.
A Character Reference Example
According to Midjourney, “This feature works best when using characters made from Midjourney images. It’s not designed for real people/photos (and will likely distort them as regular image prompts do).” I have tested it with images of myself, and they are right. It’s not very good yet; there are other applications you can turn to for that. Character Reference works by focusing on the character traits of the conjured reference image. However, the accuracy of this new technique is limited. For example, it won’t reproduce exact facial imperfections or features like dimples or freckles; if it does, it’s pure chance.
So, I started by “creating” my character using the prompt “angry old man, full gray hair and scruffy beard.” It produced four different variations of an old man. The image I selected is below.
Character Reference In Different Settings
Using the image above as the character reference allowed me to easily instruct Midjourney to place the character into different situations. This has been possible via other methods, but Character Reference simplifies the process so anyone can do it. As you’ll see below, the results of my first attempt are stunning. And while it’s not perfect, it’s important to remember that this is as “bad” as it will ever be.
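For those curious about the mechanics, the feature is invoked with Midjourney’s --cref parameter, which takes the URL of a reference image, along with an optional --cw (character weight) parameter that ranges from 0 to 100. A hypothetical prompt, with a placeholder URL standing in for my generated character image, looks like this:

angry old man walking through a rain-soaked city street at night --cref https://example.com/angry-old-man.png --cw 100

At the default weight of 100, Midjourney tries to carry over the character’s face, hair, and clothing; values closer to 0 focus on the face alone, which is useful when you want to change the outfit or setting.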
It also works with illustration styles, as the example below shows. From my vantage point, this appears to open up numerous new creative possibilities for artistically challenged storytellers and just as many new threats to those who create illustrations for a living.
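The mechanics are the same: keep the --cref reference and describe the style in the prompt itself. A hypothetical example, again with a placeholder URL:

watercolor children’s book illustration of an angry old man tending a garden --cref https://example.com/angry-old-man.png --cw 100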
A Wake-Up Call
This isn’t a tutorial; there are plenty of those on YouTube. If you have a desire to learn, the information is out there. But I do hope it’s a bit of a wake-up call. Built on the copyrighted creative output of actual humans, generative AI is progressing at a pace we’ve never seen in technology. Annie Leibovitz may not be worried about AI, but there are hordes of creative artists decrying the advances built on their sweat and tears. I think they have reason to be concerned.
I think it’s time to ask more questions that fall into the “should we” category and fewer questions that fall into the “can we” category.