Artwork - How we use a text-to-image generator to create our illustrations

Generative AI has been super helpful in creating our artwork, as neither Vasu nor I are graphic illustrators or artists. But what we lack in design skills we make up for in imagination and curiosity! I hope the following encourages you to experiment with generative AI to create images for yourself!

As we wrote and revised (and revised) the manuscript for our first book, I had visualised specific scenes to accompany the story. I went down a rabbit hole learning about and trying the different text-to-image generators available at the time, and eventually selected Midjourney for its ease of use, straightforward pricing and capabilities that aligned with the images and outcomes I had in my head.

Along the way, I learned a lot about the process and wanted to share some tips and tricks in case you are interested in trying these types of tools for yourself.

But first, a quick refresher on generative AI for image generation and the importance of prompts.

What is the role of generative AI in image generation?

Generative AI is a type of artificial intelligence that can create new content, such as images, music, text, or even videos, based on patterns it learns from existing data.

Training generative AI for image generation involves feeding it large amounts of image data (thousands or even millions of images) so that it learns visual patterns and structures (like shapes, textures and relationships to other objects). Once the AI has learned these patterns, it can start to generate images based on prompts, and its ability to create new images can be refined further through an iterative learning process. Clicking ‘like’ or giving a bot response a ‘thumbs up’ are examples of the feedback loops used within iterative learning to refine the model.

What are prompts and how do you use them effectively?

Prompts essentially guide the AI to generate specific outputs, be they text, images or music. In image generation, a prompt is a text description of what you want the image to look like, and you can string together many different elements within a prompt to get closer to your desired output. In Midjourney, a single prompt will generate a set of 4 images. From there you can either upscale a single image by clicking on Ux, where x is the number of the image, or generate variations of a single image by clicking on Vx for the image you want redone.

There are many different elements of an image you can prompt: the size (or aspect ratio), visual style, emotion, color, objects, location, character design, camera angle, etc. Currently, the best outcome for me comes from structuring a prompt like so:

[style], [image/character description], [action/location description/camera angle], [Midjourney parameters]

For example, typing the following into the /imagine prompt on Midjourney yielded the first image above this post.

anime in the style of hayao miyazaki, a sleeping tuxedo cat wearing a yellow polka dot bowtie, dreaming of many fish and many birds in a sunny warm living room --ar 1:1

Changing some of the elements will generate a new set of images. For example, the second image was generated by swapping the style in this prompt:

black and white photorealistic, a sleeping tuxedo cat wearing a yellow polka dot bowtie, dreaming of many fish and many birds in a sunny warm living room --ar 1:1

It’s so fun to see what changes as you let your prompting imagination go!
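If you like keeping your prompts organised, the structure above can be sketched in a few lines of Python. This is only an illustration of how the pieces fit together: the function name build_prompt and its argument names are made up for this post, and nothing here talks to Midjourney. It just assembles the text you would paste after /imagine.

```python
def build_prompt(style, subject, scene, parameters=""):
    """Join the prompt parts in the order: style, subject, scene, parameters.

    Hypothetical helper for illustration only; it builds a string and
    does not call Midjourney.
    """
    # Drop empty parts so a missing piece does not leave a stray comma.
    parts = [part.strip() for part in (style, subject, scene) if part.strip()]
    text = ", ".join(parts)
    # Parameters such as --ar go at the very end, separated by a space.
    return f"{text} {parameters.strip()}".strip()


subject = "a sleeping tuxedo cat wearing a yellow polka dot bowtie"
scene = "dreaming of many fish and many birds in a sunny warm living room"

# Same subject and scene, two different styles.
anime_prompt = build_prompt("anime in the style of hayao miyazaki", subject, scene, "--ar 1:1")
photo_prompt = build_prompt("black and white photorealistic", subject, scene, "--ar 1:1")

print(anime_prompt)
print(photo_prompt)
```

Keeping the subject and scene in their own variables makes it easy to change only the style and compare the results side by side.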

I definitely relied heavily on YouTube tutorials (shout out to Tokenized AI by Christian Heidorn amongst others) to learn some advanced techniques for getting consistency in character design and learn about camera angles.

So here are my top five tips for getting started on using Midjourney for book artwork:

1) Define your image style upfront to help with character and background design.

I really wanted a Pixar animation + children’s book illustration style and wanted to lean into the fun, creative and innovative feel of Makerville. I used descriptors such as “whimsical”, “innovative” and “fun”, and included color styles such as “a blend of bright and muted pastel colors” to build out background illustrations like the main street in Makerville in the third image. Play around with different illustration styles to see what you like! Links covering the basics are at the end of this post.

2) Develop your character designs separately with a white/blank background to be able to cut and paste characters into settings.

I used the following prompt structure to create a series of standing, running, walking, back views, etc. of my characters:

[height] [age] [physical descriptors] [actions and poses needed] [body shot view] [image style] [output format] [background needed], [specific clothing needed] [Midjourney parameters]

The following prompt generated different poses for Amina, one of our main characters in our Makerville books: “a short 7 year old girl with dark skin, long wavy loose hair, multiple expressions and gestures, full body, pixar style animation, 3d model, character sheet, white background, brown hoodie and jeans --ar 16:9”

3) You can train (some) consistency into the bot.

The biggest challenge in Midjourney (as I assume with all image generators today) was to get a consistent character design, particularly important as we wanted to ensure our characters had the same look and feel throughout the story. Once you decide on the type of clothing and color, always include those descriptors in your prompt.
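One way to make sure those descriptors never get forgotten is to write them down once and re-use them. Here is a minimal sketch in Python, assuming you keep your prompts in a script or notebook. The Character class and its in_scene method are names invented for this example, and the scene descriptions are placeholders; the character descriptors are the ones from Amina's character sheet prompt above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    """Fixed descriptors for one character (hypothetical helper)."""
    name: str         # a label for yourself; it is not put into the prompt
    description: str  # physical descriptors
    clothing: str     # the outfit and colors you decided on

    def in_scene(self, action, style, parameters=""):
        # Always include the same description and clothing, then the
        # scene-specific action and the image style.
        text = ", ".join([self.description, self.clothing, action, style])
        return f"{text} {parameters}".strip()


amina = Character(
    name="Amina",
    description="a short 7 year old girl with dark skin, long wavy loose hair",
    clothing="brown hoodie and jeans",
)

# Two different scenes, same character descriptors every time.
print(amina.in_scene("riding a bicycle", "pixar style animation", "--ar 16:9"))
print(amina.in_scene("walking down a main street", "pixar style animation", "--ar 16:9"))
```

This does not guarantee a consistent character, but it does guarantee that every prompt carries the same description and clothing.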

Another way to train the Midjourney bot is through positive reinforcement. Clicking on the heart under the image you upscaled tells the bot that you liked the image it generated for your prompt. Another tip is to use the seed number. A seed number is generated randomly for each image, and you can re-use the seed number of your preferred image to steer variations or new images toward a similar result. The Midjourney parameter for including a seed in your prompt is --seed [number]. To find an image's seed number, click on the three horizontal dots (“More”) at the upper right corner of the image you want the seed for and then click the envelope icon at the top. The Midjourney bot will then send you a message with the image along with the job ID and seed number. Copy and paste that seed number to re-use it in your prompts.

You can also use reference images as a starting point for the bot. Reference image URLs go at the start of the /imagine prompt, followed by your text prompt. For example, to get some images of Ishaan and Amina riding a bicycle (coming in Book 2!), I first found some static graphic illustrations of a person riding a bicycle and pasted their URLs into a Midjourney /imagine prompt, then added my character design descriptors and parameters.
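To make the ordering concrete, here is a small Python sketch that puts the pieces of a reference image prompt together. The function name with_reference_images is invented for this post, the example.com URLs are placeholders rather than real images, and the image weight value is just an example to experiment with.

```python
def with_reference_images(image_urls, prompt_text, parameters=""):
    """Build a prompt with the image URLs first, then text, then parameters.

    Hypothetical helper for illustration only; it builds a string and
    does not upload anything or call Midjourney.
    """
    for url in image_urls:
        # Catch copy-and-paste mistakes early: each reference must be a web link.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not a web URL: {url}")
    return " ".join([*image_urls, prompt_text, parameters]).strip()


prompt = with_reference_images(
    ["https://example.com/bicycle-1.png", "https://example.com/bicycle-2.png"],  # placeholder URLs
    "a short 7 year old girl with dark skin, long wavy loose hair, "
    "brown hoodie and jeans, riding a bicycle, pixar style animation",
    "--iw 1.5",  # example value only
)
print(prompt)
```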

4) Get to know (and try out!) the Midjourney parameters.

Midjourney parameters help adjust and refine the image you want generated, as well as how Midjourney delivers your prompt. Parameters used in Midjourney are prefaced with a “--” (two hyphens) directly followed by the parameter and the value you want. My favorite parameters when generating the Makerville images were:

  • aspect ratio (--ar) helpful to get an image output in a predefined format

  • image weight (--iw) this was helpful when using reference images from outside Midjourney to get the bot to rely more (or less) on the reference image to generate a new set of images

  • seed (--seed) was critical to generate images using similar character design

  • repeat (--r) was helpful to run the same prompt multiple times in one go

Check out the link on parameters below to get a feel for what can be adjusted.
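If you end up combining several parameters, it can help to see how they line up at the end of a prompt. The Python sketch below turns a handful of settings into a parameter string, with each parameter written as two hyphens followed by its name and value. The function name format_parameters is made up for this example and the values are arbitrary; check the parameter list for what each one accepts.

```python
def format_parameters(**params):
    """Turn keyword arguments into a parameter string such as '--ar 16:9'.

    Hypothetical helper for illustration only; it does not check that a
    parameter name or value is one Midjourney accepts.
    """
    # Keyword arguments keep the order in which they were written.
    return " ".join(f"--{name} {value}" for name, value in params.items())


# Example values only.
parameters = format_parameters(ar="16:9", iw=1.5, seed=1234, r=4)
print(parameters)
# prints: --ar 16:9 --iw 1.5 --seed 1234 --r 4
```

The resulting string goes at the very end of the prompt, after the text description.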

5) Set up your own Discord server with separate channels to organise your Midjourney artwork.

When you first access Midjourney, you are basically on an open channel and your output is mixed in with that of other users. I found myself getting lost in the long scrolls to find previous image sets I had generated. Discord allows you to set up your own server with different channels, so I set up separate channels for each of our characters and locations. This helped me stop the scrolling and let me run separate jobs at the same time from each channel. Win!

You first need to create your own Discord server and then add the Midjourney bot to it. Phil Svitek’s short video shows you how to set that up. Then you can click on the “+” sign next to “Text Channels” to set up separate channels and start generating artwork by using /imagine within any channel.

Bonus Tip!! Sometimes it’s easier to cut, edit and mix images together to get your final output.

As I had very specific images in mind for the stories, I found it easier/faster to generate the backgrounds, objects and characters separately and then just cut, paste and place them in layers together to get the final result. If you have photo editing skills even better! Theoretically one could try to generate the images all within Midjourney but I was fine with editing them together to maintain a consistent look and feel for the story.

If you’re curious about trying Midjourney, here are some links to get you started:

Midjourney home page: https://www.midjourney.com/home (click on “Join the Beta” if you want to access the image generator)

How to get started: https://docs.midjourney.com/docs/quick-start

Know your rights: Section 4 of https://docs.midjourney.com/docs/terms-of-service

Midjourney prompt guide: https://docs.midjourney.com/docs/explore-prompting

Midjourney parameters: https://docs.midjourney.com/docs/parameter-list

And if you are interested in the prompts we used to generate the artwork for our book you can check them out here!


Book 1: Discovering AI: In Our Town