Midjourney takes on Sol LeWitt’s Wall Drawings
Sol LeWitt’s ‘Wall Drawings’ aren’t actually drawings at all but, rather, instructions for drawings. These instructions have been implemented in many ways, by many different people, revealing how they are both prescriptive and ambiguous. Control over the final output lies somewhere between the instruction giver and the instruction follower.
A few weeks ago I used AI chatbot ChatGPT to implement the instructions, first using GPT-3 and then using GPT-4. Continuing my mission to use AI tools for things they really weren’t designed for and aren’t very good at, in this article I’ll be using AI image generation tool Midjourney.
ChatGPT is a language-based AI, so I asked it to write p5js code to draw images, while Midjourney creates images directly. In the previous experiments I didn’t alter the instructions much at all, because the outputs revealed things about ChatGPT’s understanding of language, and the ambiguity of the instructions combined with that understanding led to unexpected results. This time, I’m keen to play with Midjourney prompt engineering, so we’ll start out with the basic instruction for comparison and then experiment a bit more.
n.b. In all my prompts I added --v 5 to use Midjourney version 5 and --ar 16:9 to produce images of that aspect ratio.
On a wall surface, any continuous stretch of wall, using a hard pencil, place fifty points at random. The points should be evenly distributed over the area of the wall. All of the points should be connected by straight lines.
Right away we notice that Midjourney includes a human in 3 of the 4 outputs. It interprets everything as a description of an image, while ChatGPT understands it is being instructed. If you tell ChatGPT, “draw on a wall” it knows that it must draw on the wall, whereas Midjourney thinks “an image of a wall being drawn on.” A strong statement on the presence of the artist’s self in their work.
Let’s also take a brief moment to appreciate that Midjourney is nailing hands a lot of the time now, but in the top right image the person seems to be directly drawing with their finger, which gives me that gacky nails-on-blackboard feeling.
Here are some implementations of this instruction by a human and by GPT-4, for comparison.
Wall Drawing #118 - GPT-4 output
I rephrased the Midjourney prompt to make it clearer what we actually want.
A wall surface with fifty points drawn in hard pencil with random positions. The points are connected by straight lines.
Some liberties have been taken with the number of points and with what exactly the lines are meant to be doing, but aesthetically it’s very cool.
To make it more “Sol LeWitt”, I mentioned this is all happening in a contemporary art gallery.
A wall surface in a contemporary art gallery with fifty points drawn in hard pencil with random positions. The points are connected by straight lines.
I love the different ‘algorithms’ Midjourney has experimented with for placing dots and connecting them, particularly in the second image, where it connected most dots to their close neighbours and then selected a few to connect at a distance.
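To make those two ‘algorithms’ concrete, here is a minimal Python sketch. It is not what Midjourney does internally, and it is not the p5js code from the earlier ChatGPT experiments. It compares one literal reading of the instruction (every point joined to every other point) with a looser nearest-neighbour reading like the one in the second image. The function names, the 16 by 9 wall size and the choice of three neighbours are all assumptions made for illustration.

```python
import itertools
import math
import random


def random_points(n, width, height, seed=0):
    """Place n points uniformly at random on a width x height wall."""
    rng = random.Random(seed)
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]


def connect_all(points):
    """Literal reading: every point is joined to every other point."""
    return list(itertools.combinations(range(len(points)), 2))


def connect_nearest(points, k=3):
    """Looser reading: each point is joined only to its k nearest neighbours."""
    edges = set()
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            # Store each edge once, whichever end it was found from.
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


points = random_points(50, width=16, height=9)
all_lines = connect_all(points)
near_lines = connect_nearest(points, k=3)

print(len(all_lines))   # 1225, i.e. 50 * 49 / 2
print(len(near_lines))  # somewhere between 75 and 150
```

The literal reading needs 1,225 lines for fifty points, which goes some way to explaining why the looser versions look so much airier.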
Wall drawing #11
A wall divided horizontally and vertically into four equal parts. Within each part, three of the four kinds of lines are superimposed.
This instruction doesn’t mention using a “hard pencil” or any other medium or context, so Midjourney made its own choices about colour and styles. Generally it seems to default to flat colours, but sometimes it dreams up something cool. I love the bottom right image particularly.
In LeWitt’s vocabulary, the four kinds of lines are: horizontal, vertical, 45º diagonal right and 45º diagonal left. Let’s add that information to the prompt.
A wall divided horizontally and vertically into four equal parts. Within each part, three of the four kinds of lines are superimposed. The four kinds of lines are horizontal, vertical, 45º diagonal right and 45º diagonal left.
Some valiant attempts which fall short of accurately implementing the instruction. I like how Midjourney apparently looked at how many numbers were in the prompt and thought it had better shove a few numbers into the outputs.
I then gave Midjourney some more context and guidance.
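For comparison, the combinatorics of a literal reading are small enough to enumerate. The sketch below lists every way of choosing three of the four kinds of lines. Assigning a different combination to each quarter of the wall is an assumption of mine, since the instruction itself doesn’t say the parts must differ, and the quadrant names are just labels.

```python
import itertools

LINE_KINDS = ("horizontal", "vertical", "diagonal right", "diagonal left")

# Every way of choosing three of the four kinds: there are exactly four.
triples = list(itertools.combinations(LINE_KINDS, 3))

# Assumption: one distinct triple per quadrant, in reading order.
quadrants = ("top left", "top right", "bottom left", "bottom right")
layout = dict(zip(quadrants, triples))

for quadrant, kinds in layout.items():
    missing = (set(LINE_KINDS) - set(kinds)).pop()
    print(f"{quadrant}: {', '.join(kinds)} (omits {missing})")
```

There are exactly four combinations, one for each line kind left out, which fits neatly onto a wall divided into four parts.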
A wall in a contemporary art gallery divided horizontally and vertically into four equal parts drawn with a hard pencil. Within each part, three of the four kinds of lines (horizontal, vertical, 45º diagonal right and 45º diagonal left) are superimposed. In the style of Sol LeWitt.
The thing is, Midjourney is nowhere near as good as ChatGPT at interpreting instructions accurately. This is to be expected, as it doesn’t have the same language processing training. I think this works out well - Midjourney can be used as an inspiration tool precisely because it takes things in unexpected directions.
I tried another style -
A sheet of watercolor paper divided horizontally and vertically into four equal parts painted in rich colour inks. Within each part, three of the four kinds of lines (horizontal, vertical, 45º diagonal right and 45º diagonal left) are superimposed.
A sheet of watercolor paper divided into four equal parts painted in rich colour inks. Within each part, three lines (horizontal, vertical, or diagonal) are drawn in hard pencil.
MJ was hesitant to draw diagonal lines in ink for some reason but I love the sketchy notes around the sides.
I drilled down the prompt a little more -
A sheet of watercolor paper divided into four equal parts painted in rich colour inks. Within each part, three diagonal lines are drawn in hard pencil.
A wall divided vertically into six equal parts, with two of the four kinds of line directions (horizontal, vertical, 45º diagonal right and 45º diagonal left) superimposed in each part.
Once again Midjourney sidesteps the particulars but comes up with some brilliant aesthetics. For some reason all of these came out kind of skeuomorphic with flat pieces and drop shadows.
Vertical lines, not straight, not touching, covering the wall evenly.
The first issue here is that Midjourney isn’t very good at parsing the phrase “not x”. Instead we can use :: to add negative and positive weighting to different parts of a prompt. Here are a couple of prompts and their outputs to demonstrate that.
a london bus, not red
london bus::2, red::-2
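To spell out how I’m reading that notation: the text before each :: is one segment, and the number straight after the :: is that segment’s weight. The sketch below is a hypothetical helper for splitting a prompt that way, not Midjourney’s actual parser. Treating a missing number as a weight of 1 is an assumption, as is stripping the commas I used between segments.

```python
import re

# Hypothetical pattern: a segment of text, then "::", then an optional number.
SEGMENT = re.compile(r"(.+?)::(-?\d+(?:\.\d+)?)?")


def parse_weights(prompt):
    """Split a '::'-weighted prompt into (text, weight) pairs."""
    parts = []
    end = 0
    for match in SEGMENT.finditer(prompt):
        text = match.group(1).strip(" ,")
        # Assumption: a segment with no number after "::" has weight 1.
        weight = float(match.group(2)) if match.group(2) else 1.0
        parts.append((text, weight))
        end = match.end()
    tail = prompt[end:].strip(" ,")
    if tail:
        parts.append((tail, 1.0))  # trailing text with no "::" after it
    return parts


print(parse_weights("london bus::2, red::-2"))
# [('london bus', 2.0), ('red', -2.0)]
print(parse_weights("a london bus, not red"))
# [('a london bus, not red', 1.0)]
```

Seen this way, the first prompt is a single segment that still contains the word “red”, while the second separates “red” out and pushes against it.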
However, using that method in this Sol LeWitt instruction didn’t seem to help and Midjourney was weirdly reluctant to draw vertical un-straight lines on a wall. I tried a bunch of different things.
Vertical lines::1, straight::-2, not touching::0.5, covering a wall evenly::0.5
Vertical wobbly lines, not touching, covering a wall evenly
Vertical wobbly lines on a canvas
Vertical wobbly lines on a canvas
Vertical lines::1, wavy lines::2, straight::-2, not touching::0.5, covering the wall evenly::0.5
vertical wavy lines
vertical lines that are wavy
vertical lines that are wobbly
Are there no pictures of vertical wavy lines in Midjourney’s training set? Perhaps something about the words “wavy” or “wobbly” implies horizontal, because of water waves, while the word “vertical” implies straight, because of architecture, etc. Honestly, I don’t know what to tell you.
The closest I got was this. I think perhaps starting with “a pencil drawing of…” helped enable Midjourney to break out of the idea that this needs to be a representation of a real thing, but it looks like it’s sort of fighting against itself to form anything vertical and wobbly.
a pencil drawing of vertical wobbly lines
Again, top marks for aesthetic though.
All architectural points connected by straight lines
Nice. Everyone loves a sketchy drawing of an abstract geometric form.
For reference, here’s an example of this instruction implemented by a human at The Massachusetts Museum of Contemporary Art.
Wall Drawing #51 - Human drawn at Mass Moca
I gave the prompt some more context, to try to get an output closer to the human example.
A wall in a contemporary gallery with a window, a door, an exit sign and a fire alarm. All the architectural points are connected by straight lines drawn in pencil.
There’s a lot to unpack here - like why it happily included a door but barely attempted a window (probably because windows are not common on gallery walls - fair enough) and why we got all these thick red lines as well as the pencil lines (I’m not really sure).
However, the Escher-esque, recursive, optical illusion door in the first image is a delight.
A black outlined square with a red horizontal line from the midpoint of the left side toward the middle of the right side
A disappointment both in accuracy and aesthetics.
For comparison, here are the implementations of this same instruction by a human and two versions of ChatGPT (via p5js code). ChatGPT has almost got it, but Midjourney just doesn’t parse sentence structure well enough to get anywhere close.
Wall Drawing #154 - Human drawn
Wall Drawing #154 - GPT-3 output
Wall Drawing #154 - GPT-4 output
I gave Midjourney another shot with different phrasing of a slightly simpler instruction.
A pencil drawing of a black square. Inside the black square is a horizontal line drawn in red pencil
Interesting that Midjourney was keen to interpret “square” as “cube” and overall this is another one for the “inaccurate but aesthetically cool” files. I particularly like the glowing red shadows.
Six white geometric figures (outlines) superimposed on a black wall. Circle, square, triangle, rectangle, trapezoid and parallelogram
Midjourney drawing all sorts of sick shapes here but absolutely refusing to include a parallelogram.
Out of interest I took a brief side quest to investigate whether Midjourney knows what a parallelogram or a trapezoid is.
A parallelogram
A trapezoid
Well that explains that.
That aesthetic is an absolute nightmare, so I also tried -
A parallelogram on a black wall drawn in white chalk
MJ is massively overcompensating for not knowing what a parallelogram is, both by amping up the visuals and by including an actual piece of chalk in the top left of the last one, like a little peace offering to prove it does know what some things are.
Anyway, back to the prompt, here are the human and ChatGPT versions.
Wall Drawing #295 - Human drawn
Wall Drawing #295 - GPT-3 output
Wall Drawing #295 - GPT-4 output
I tried out a bunch of different prompts here and it’s pretty interesting to see how Midjourney interprets each attempt.
A black wall with a circle square triangle and rectangle in the center, in white chalk
a black canvas with a circle square triangle and rectangle in the center, drawn in white paint
a black wall in a contemporary art gallery, white lines, circle::1.2, square::1.2, triangle::1.2, rectangle::1.2
a black wall in a contemporary art gallery, white paint, centered circle, centered square, centered triangle, centered rectangle
black canvas, white painted circle, white painted square, white painted triangle, white painted rectangle
a black canvas with a white painted circle, white painted square, white painted triangle and white painted rectangle in the center
Six-part drawing. The wall is divided horizontally and vertically into six equal parts. 1st part: On red, blue horizontal parallel lines, and in the center, a circle within which are yellow vertical parallel lines; 2nd part: On yellow, red horizontal parallel lines, and in the center, a square within which are blue vertical parallel lines; 3rd part: On blue, yellow horizontal parallel lines, and in the center, a triangle within which are red vertical parallel lines; 4th part: On red, yellow horizontal parallel lines, and in the center, a rectangle within which are blue vertical parallel lines; 5th part: On yellow, blue horizontal parallel lines, and in the center, a trapezoid within which are red vertical parallel lines; 6th part: On blue, red horizontal parallel lines, and in the center, a parallelogram within which are yellow vertical parallel lines. The horizontal lines do not enter the figures.
When I got to this prompt I expected Midjourney to really struggle, and I was like “oh no, I’m going to be first against the wall when the AI revolution comes”, but actually it did surprisingly well at parsing this.
Here are the human and ChatGPT versions for comparison of what we should have been aiming at here (I had to fix GPT’s code a bit, which is why that one is labelled as GPT-4 + human). I guess perhaps my bar for Midjourney accuracy is pretty low at this point because actually the MJ outputs are sort of like a vague gesture towards what was requested.
Wall Drawing #340 - Human drawn
GPT-4 + Human
Wall Drawing #340 - GPT-4 output
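Part of what makes this instruction so demanding is how tightly it is specified. As a small sketch, transcribing the six parts by hand into a list (the tuple layout is my own choice, not anything from LeWitt or from the ChatGPT code) shows that each part uses red, yellow and blue in a different order, and that between them the six parts cover every possible ordering.

```python
import itertools

# (ground, horizontal lines, lines inside the figure, figure), transcribed
# by hand from the text of the instruction.
PARTS = [
    ("red", "blue", "yellow", "circle"),
    ("yellow", "red", "blue", "square"),
    ("blue", "yellow", "red", "triangle"),
    ("red", "yellow", "blue", "rectangle"),
    ("yellow", "blue", "red", "trapezoid"),
    ("blue", "red", "yellow", "parallelogram"),
]

# Keep only the three colours of each part, in the order they are used.
colour_orders = {part[:3] for part in PARTS}
all_orders = set(itertools.permutations(("red", "yellow", "blue")))

print(colour_orders == all_orders)  # True
```

That is a lot of exact structure for an image model to hold on to, so a vague gesture in the right direction is perhaps not a bad result.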
Double Drawing. Right: Isometric Figure (Cube) with progressively darker graduations of gray on each of three planes; Left: Isometric figure with red, yellow, and blue superimposed progressively on each of the three planes. The background is gray.
I’m not sure I’ll ever learn to predict what niche instructions an AI will be good or bad at. I mean I would never have guessed “vertical wavy lines” to be a seemingly impossible task, while “isometric” is completely fine… or that it can handle “double drawing” but it doesn’t really know its left from its right.
I love these deconstructed fragmented cubes. Not really sure why that happened either. Perhaps “superimposed progressively” is doing something funny there, but it worked out great.
Three concentric arches. The outside one is blue; the middle red; and the inside one is yellow.
Sigh. Midjourney, you have got to listen.
Twenty-one isometric cubes of varying sizes, each with color ink washes superimposed
The top right one actually has 21 cubes exactly! But it has aligned them weirdly almost as if it was trying to hide the fact it got the count correct.
They are more or less all the same size when they were supposed to be “varying sizes”, but they are pretty.
The first drafter has a black marker and makes an irregular horizontal line near the top of the wall. Then the second drafter tries to copy it (without touching it) using a red marker. The third drafter does the same, using a yellow marker. The fourth drafter does the same using a blue marker. Then the second drafter followed by the third and fourth copies the last line drawn until the bottom of the wall is reached.
With the mention of a drafter right there in the prompt, I think we all knew what was going to happen here. I am enjoying whatever awkward moment is happening in the top right output.
I tried streamlining the prompt to make it more Midjourney friendly.
A wall with wobbly wavy horizontal lines. The first line is black, the second line is red, the third line is yellow and the fourth is blue. Each line tries to copy the one above.
There is basically no way to get Midjourney to take on board the specificity of instructions like what order the colours come in, or that one line copies the next (with the implication that inaccuracies will compound as we move down the wall).
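That compounding is easy to see in code. The sketch below is a toy model under my own assumptions: a line is a list of heights, each drafter copies the previous line with a small random error at every position, and the spacing between lines is ignored. The names, the made-up first line and the error size are all illustrative.

```python
import random


def copy_line(line, error, rng):
    """Return an imperfect copy: each height is off by up to +/- error."""
    return [y + rng.uniform(-error, error) for y in line]


def draw_wall(first_line, copies, error, seed=0):
    """Each new line copies the previous line, not the original."""
    rng = random.Random(seed)
    lines = [first_line]
    for _ in range(copies):
        lines.append(copy_line(lines[-1], error, rng))
    return lines


def max_drift(line, original):
    """Largest difference in height between a line and the original."""
    return max(abs(a - b) for a, b in zip(line, original))


first = [0.0, 0.3, -0.2, 0.4, 0.1, -0.3, 0.0]  # made-up irregular line
wall = draw_wall(first, copies=30, error=0.05)

for n in (1, 10, 30):
    print(n, round(max_drift(wall[n], first), 3))
```

Because every copy is made from the line directly above rather than from the original, the errors add up: after n copies a line can sit up to n times the single-copy error away from the first one. That slow drift down the wall is exactly the kind of sequential logic a single text-to-image prompt has no way to express.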
I did away with these parts of the instruction and added in the contemporary gallery context.
A wall in a contemporary art gallery covered in wavy lines. Black, red, yellow, blue
Literally whyyyy did this suddenly make one output with vertical wavy lines? I am bewildered.
These colours are so rich though, and I love the way the art is reflected in the floor in a couple of them. Midjourney is an enigma but it is beautiful.
Color bands and black blob. The wall is divided vertically into six equal bands; red; yellow; blue; orange; purple; green. In the center is a black glossy blob.
Midjourney is completely ignoring the “black blob” bit of the instruction, even though it’s mentioned twice. Also bleurgh at the aesthetic here.
Let’s see if we can make the prompt a bit more Midjourney-friendly.
A white wall in a contemporary art gallery divided vertically into six equal bands painted red, yellow, blue, orange, purple, green. Centered black glossy painted blob
Hmm. Still no black blob! For reference, this is what a human-implemented version of this prompt looked like.
Wall Drawing #901 - Human drawn
Let’s experiment with ways to draw a black blob like that one, ignoring the colourful striped background for now.
A black glossy blob painted on a white canvas
Midjourney seems to have interpreted this as a blob of black paint rather than a painted black blob. Let’s try a few different things.
a black irregular circle painted on a white canvas
a black wobbly circle painted on a white canvas
black blobby organic shape painted on canvas
a painting of a very irregular blobby black circle on a white canvas
a painting of a very irregular blobby black circle on a white canvas, curved edges, smooth edges, defined edges, wavy edges, blobby, irregular, organic, circle, flat painting, 2D
2D painting of an irregular black circular polygon on a white canvas, curved edges, smooth edges, defined edges, wavy edges, irregular, organic, circle, painting, 2D::2
a painting of a very irregular black blob with bits spread out all over the canvas, smooth edges, curved, wobbly, painting::2, splatter::-2, drips::-1, 2D, bold, silhouette
a painting of a very irregular black blob with bits spread out all over the canvas, smooth edges, curved, wobbly, painting::1.5, splatter::-0.2, drips::-0.1, 2D, bold, flat, silhouette
Hope you enjoyed that journey. The top left output of the last set is approaching what we want, so let’s try asking for some background stripes as well.
a painting of a very irregular black blob with bits spread out all over the canvas, smooth edges, curved, wobbly, painting::1.5, splatter::-0.2, drips::-0.1, 2D, bold, flat, silhouette, in the background are stripes of red yellow blue orange purple and green
Oh no. Not what we were looking for, although I LOVE that blob in the bottom right munching on lots of squidgy little blobs while sweating colours.
I tried with some simpler versions, just trying to get some sort of black shape as a foreground with a separate background of vertical stripes.
a black circle with wobbly edges painted on a canvas, background of coloured vertical stripes
A black glossy irregular shape painted the centre of a canvas. Background of vertical lines in red yellow blue orange purple and green.
a painting of colorful vertical stripes::3, in the foreground of the painting there is a centered black circle::5, colored circle::-1, black stripes::-1
a painting of colorful stripes::3, in the foreground there is a centered black circle::2, colored circle::-1, black stripes::-1
black circle on top of colorful stripes
a painting of a black circle in front of colorful vertical stripes
a black blob with colorful stripes in the background:: black blob::0.6, colorful stripes::0.5
Midjourney is currently bad at separating out different parts of the prompt. With a photorealistic prompt, such as “a cat in a kitchen”, the cat won’t be melded into the kitchen or the kitchen into the cat, because MJ generally defaults to realism. However, it sees the whole prompt as one thing and when the prompt is abstract and there is no realistic training data to base the output on, things get combined.
To demonstrate this another way, I tried a couple of prompts asking for a cat in a fluffy kitchen. Midjourney applies the word “fluffy” to the cat not the kitchen.
a cat sitting in a fluffy kitchen
a cat, a fluffy kitchen
I expect/hope that the ability to break prompts into separate segments is something that will be introduced in newer versions.
Within a four-meter (160”) circle, draw 10,000 black straight lines and 10,000 black not straight lines. All lines are randomly spaced and equally distributed.
Kind of underwhelming, but there is a lot of stuff in that prompt that is very much not geared to Midjourney. Let’s try some adjustments.
A wall in a contemporary art gallery. Drawn on the wall in hard pencil is a circle containing lots of straight and wavy lines
A wall in a contemporary art gallery. Drawn on the wall in hard pencil is a circle containing lots of straight and wavy lines
A drawing of a circle containing lots of straight and wavy lines
A drawing of thousands of straight lines and thousands of wavy lines
We’re seeing a similar issue here to the last prompt, where Midjourney isn’t able to discern that I’d like both wavy lines and straight lines. I experimented with doing each one individually, where I learnt that “tiny” works better than “short” and “irregular” works better than “wavy”, before combining them again.
an abstract drawing, thousands of tiny straight pencil lines
an abstract drawing, thousands of tiny irregular pencil lines
an abstract drawing on the wall of a contemporary gallery, thousands of tiny straight pencil lines, thousands of tiny irregular pencil lines, not touching, space, individual lines, gaps
an abstract drawing on the wall of a contemporary gallery, thousands of tiny straight pencil lines, thousands of tiny irregular pencil lines, not touching, space, individual lines, gaps, inside a circle
As expected, Midjourney is quite bad at picking up specifics in instructions, but very good at making things look nice. It’s also very good at taking things in an unexpected direction, which can be an excellent tool for inspiration.
For example, with my recent project Tiny Endless Things, I built the background, containing thousands of tiny brush strokes, in p5js first. Then I used Midjourney to generate ideas for further additions, first by getting it to describe my existing outputs, and then by feeding those descriptions back into Midjourney with various additions, until I found things that I felt a connection to in terms of themes and aesthetic.
Midjourney output
My enduring opinion of AI is that it’s at its best when used in collaboration with a human.
At least 80% of the outputs from the Sol LeWitt instructions here could form the basis of a generative art collection. In moments of artist’s block in the last couple of months I have idly tried using prompts like “a generative artwork” to get basic inspiration, but the outputs tend to be repetitive and uninspired.
This approach, of providing detailed instructions with specific geometric shapes and placements (which are then often overlooked), proves to be a more effective way to use Midjourney for inspiration.