This post introduces the Generative AI Art Styles multimedia series, which imparts practical knowledge about creating AI-generated images for content creation. It shows how referencing well-known visual art movements in your image generation prompts can help you fine-tune the aesthetics of your AI-generated images.
How should you use these results? Start by browsing the image gallery and clicking on the images whose aesthetics resemble those you want in your own creation. Then experiment with creating images using these art movement labels, mining the generated descriptions or Wikipedia pages for more prompting clues. If you really enjoy image generation and want to get good at it, there is no substitute for real training from an art expert who knows how to teach.
This content series is part of a larger project that studies the fields of content creation and content creation entrepreneurship. This larger project seeks to (1) train people in the practical operations of mass communications and (2) study this practical knowledge to advance sociological understandings of how mass communications technologies affect culture.
AI is disrupting the field of content creation. See my recent draft paper on AI and how academic programs and information/culture workers might adapt.
Insofar as visual image creation is concerned, the technology has altered the practical tasks and skills required to generate visual content. Two years ago, creating visual images was far more dependent on one’s possession of fine motor skills or comfort with learning complicated computer programs (e.g., Adobe Illustrator).
Now, it is possible to create images using verbal descriptions in image generation prompts. Creating images still requires the artist to conceive of an image mentally and fine-tune its physical instantiation into something acceptable for whatever practical purpose the image serves. However, someone without the once-required knowledge and skills can now create serviceable, if lower-quality, images, which are often all a larger communications project requires.
Increasingly, quality image generation is a matter of users developing a vocabulary to conceive and articulate nuanced aspects of the visual images they create. This project tries to impart that knowledge to content creation students when we engage with visual content.
This series describes and shows standardized generated images of, at present, sixty visual art movements that can be referenced in an image generation prompt to fine-tune the aesthetic of the image one is charged with creating. These descriptions can help guide an image creator toward prompts that produce images more closely resembling the aesthetic toward which one strives.
Look at these images, and see which art movement labels and descriptors generated the style variations among them. Readers can then experiment with these movement labels and descriptors in their own work.
In addition to helping with image generation, this project also represents a data wrangling and communications exercise that integrated Generative AI into a content production process. The project involves web scraping, interacting with the OpenAI API, and deploying information at scale via WordPress. The Markdown files are available upon request.
Keep in mind: I am a sociologist and data analyst, not an art scholar. This is an empirical exercise to build practical techniques in content creation, rather than an attempt to definitively define art. If you know something about art, by all means enlighten me below. I’ve become more interested in art after doing this project, and enjoy opportunities to learn more.
Developing this tool involved both human and generated content, and considerable human-machine interaction. This process is described here.
The project began with a list of known art movements and associated Wikipedia pages. This is an ad hoc list that was generated through a process that blended human and Generative AI work. Generative AI was used to brainstorm lists of well-known art movements and art movements associated with well-known artists. The human user (me) then looked up these movements on Wikipedia, and used references in those pages to network towards more or better art movements to profile. I ultimately arrived at a list of 60 art movements with which to experiment:
Ultimately, this list is the product of a human-led process in which the human had little information in this knowledge domain. Put plainly, I do not know much about art, and I am sure the list is wholly unsatisfying to a serious art person. My goal was not to create an objectively true and exhaustive typology of all the movements in art history. Instead, it was to generate labels that computers would interpret as pertaining to a definable aesthetic, to see what invoking each label does to the generated images, and to let readers develop image generation strategies of their own from the comparisons the project offers.
If you know something about art, your comments are most appreciated. You can add them in comments below, or email me.
Generating Rich Descriptions
On each style page, we include a brief generated description of the art style. Again, this description is not intended as an objectively true definition of the style. It is a computer-generated explanation that draws on the contents of each style’s Wikipedia page, as well as any implicit descriptions or associations of these styles that arise from Chat-GPT’s underlying model. These descriptions should be treated as labels that users can employ practically to build image generation prompts that create images resembling those featured in this project. They should not be taken as expressing real or accepted categories of art among visual art experts.
The strategy used to create these descriptions was as follows: I scraped the text of these art movements’ Wikipedia pages and those of their associated artists, and then asked Chat-GPT 4 to review each entry and provide guidance on how to reproduce art in that style. An example prompt would be:
Please synopsize the visual features that would make a piece of art recognizably pertain to the style of Japonism. In less than 400 characters, explain the major visual or aesthetic features commonly associated with Japonism. Provide detailed instructions on how to produce art that would be immediately recognized as reminiscent of this style. To help you better build that description, please consider the information conveyed in this excerpt from the style’s Wikipedia page: Insert text of Wikipedia Entry here
And for that entry we received:
Japonism art is recognized by elements inspired by Japanese woodblock prints (ukiyo-e), featuring bold colors, dramatic foreshortening, and asymmetrical compositions. Other aesthetic features include abstract color usage and flattened planes. Decorative arts such as ceramics, enamels, metalwork, and lacquerware also influence this style, as well as patterns from landscapes, architecture, and clothing. To produce Japonism-style art, utilize off-centered, diagonal designs with expansive negative space. Micro details are important, so consider patterns found in nature or traditional Japanese motifs. Use a vibrant color palette and consider lacquer-like finishes for a quintessentially Japanese aesthetic.
Is this accurate or not? I am not a good judge. However, I will post these images and descriptions with comments open, and hope to learn and refine the resource.
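The description-generation step above can be made concrete with a small sketch. The author’s pipeline was built in R; the Python below is only an illustration, and the helper names and (commented-out) API call details are assumptions, not the project’s actual code.

```python
# Fill the standardized description prompt for one art movement.
# The template mirrors the example prompt shown above; the function
# name and API usage sketched in comments are illustrative assumptions.

PROMPT_TEMPLATE = (
    "Please synopsize the visual features that would make a piece of art "
    "recognizably pertain to the style of {style}. In less than 400 characters, "
    "explain the major visual or aesthetic features commonly associated with "
    "{style}. Provide detailed instructions on how to produce art that would be "
    "immediately recognized as reminiscent of this style. To help you better "
    "build that description, please consider the information conveyed in this "
    "excerpt from the style's Wikipedia page: {excerpt}"
)

def build_description_prompt(style: str, wiki_excerpt: str) -> str:
    """Fill the standard template with a style name and its Wikipedia excerpt."""
    return PROMPT_TEMPLATE.format(style=style, excerpt=wiki_excerpt)

# Sending the filled prompt to the API might look like this (requires an
# API key, so it is left commented out):
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(
#     model="gpt-4",
#     messages=[{"role": "user",
#                "content": build_description_prompt(style, excerpt)}],
#     temperature=0.2,  # a low temperature keeps descriptions consistent
# )
# description = response.choices[0].message.content
```

Running the same template over all sixty movements is then just a loop over the scraped Wikipedia excerpts.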
Unfortunately, in this phase of the experiment, I was unable to wrest the same image quality from OpenAI models via the API as through the Chat-GPT Pro web client. My sense is that comparability is maximized through the API for several reasons. First, it is easier to turn down the temperature. Second, it is easy to process large batches of images at once. Third, OpenAI appears to use Generative AI to rewrite users’ image prompts before generating images, and my impression is that the Chat-GPT Pro web client rewrites prompts more aggressively.
The problem was image quality given the resources at hand. The web client’s images are of far better quality, and thus more usable to content creators, even when generated by a user with little art knowledge or ability. I clearly lack the prompting skill to match, through the API, the quality of output delivered by the Chat-GPT-rewritten prompts on the web client. Given that I am more comfortable with computer programming than the typical creator encountered in our project, I reasoned that the best strategy under these circumstances was to feed the standard prompts to Chat-GPT and save the images manually. This established comparisons through the image generation product that I believe creators would be most likely to use, given that it is the easiest way to create the best images with no variable API costs. However, the comparability of art styles through comparisons of images depicting the same subject has been critically harmed.
Images in this Series

The series depicts two subjects: an urban setting (New York City’s Times Square) and a natural one (the shores of Lake Kenogamissi). Each subject is presented with a “thin” and a “thick” prompt. The former simply asks the model to generate an image in a style resembling a particular named art movement, without summarizing the character of that style. The latter includes the generated description of each art movement, described above. There is also a “thin” prompt (asking for an image in the style, with no description of the style) for an image of a person.
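The thin/thick prompt design can be sketched as two small template builders. The exact wording of the project’s standardized prompts is not given in the post, so the templates below are hypothetical.

```python
# Hypothetical builders for the "thin" and "thick" prompts described above.
# The subject phrasings come from the post; the prompt wording is an assumption.

SUBJECTS = {
    "urban": "New York City's Times Square",
    "natural": "the shores of Lake Kenogamissi",
}

def thin_prompt(subject: str, movement: str) -> str:
    """Name the art movement only, with no description of its aesthetic."""
    return f"Create an image of {subject} in the style of {movement}."

def thick_prompt(subject: str, movement: str, description: str) -> str:
    """Name the movement and append its generated style description."""
    return (
        f"Create an image of {subject} in the style of {movement}. "
        f"This style has the following characteristics: {description}"
    )
```

Pairing both templates with both subjects, plus the thin person prompt, would yield five standardized images per movement.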
Deploying the Content
To deploy this content, I wrote R code to create a set of RMarkdown documents, each corresponding to a post for an individual art style. My task was to present this information so that users could make comparisons and use these images practically within the WordPress system on which it was to be deployed. I settled on individual art style posts to showcase how each style manifests in different art forms. All of these posts are then showcased in an online gallery that allows users to compare the same image across different styles and prompting strategies.
Generating the Markdown Files
The deployment process used R to generate a set of RMarkdown files, each pertaining to an individual post that profiles an art style.
Manual Web Deployment
I deployed the pages manually, which gave me the opportunity to review each page’s final version and make edits to comment on the generated information. As a matter of practice, I feel it is important to ensure that information we deploy using AI ultimately reflects mindful human communication. I used the AI to create the empirical objects and to pre-process standard elements of page creation. Even if you do not add commentary (and I generally do not in this project, because my understanding of art is limited and I would not presume to teach the topic), the act of reviewing individual pages serves as a quality check on what you put out in an AI-augmented process.
To be clear on what this project does and does not deliver: it does not deliver expert-vetted information on art. My knowledge of art is shallow. Instead, these products show the results of my experiments with AI art prompting. They are exploratory findings from a process that makes no rigorous determination of what kinds of art exist, what is characteristic of any particular art type, or how to produce a particular kind of art. This is simply a catalog offering various looks at empirical experiments with altering practical content creation processes. I hope you find these posts helpful if you have to generate images for a content creation project and want prompt ideas that can produce something resembling the aesthetic in your mind’s eye. Maybe it’s helpful to you, or maybe not.