
After understanding all the basic nodes and operations of ComfyUI, you should be able to generate images using ComfyUI.

However, the generated image may not be what you want. For example, you may want an anime-style image but get a photorealistic one. In that case, you need some way to steer the AI toward a specific style. There are several ways to adjust the image style:

  • Method 1: Use the prompt to influence the result. This is the simplest method, and the default workflow is basically enough for it. This chapter explains how to use the prompt to adjust the image style and also introduces some more efficient methods.
  • Method 2: Use different Flux fine-tuned models to influence the result. This is also simple to use. However, fine-tuning a Flux model is relatively expensive, so not many Flux fine-tuned models are available.
  • Method 3: Use LoRA to influence the result. This method is slightly more complex, but because training a LoRA costs much less than the full fine-tuning in Method 2, more models are available and the files are much smaller. See Flux LoRA for a detailed introduction.

1. Flux is smarter than you think

First, as I mentioned in the previous chapter, Flux uses a t5xxl_fp16 text encoder model, which is much better at semantic understanding than Stable Diffusion. However, this model also makes writing Flux prompts more complex than writing Stable Diffusion prompts.

Based on my experiments and some feedback from Flux users, I summarized the following insights:

Insight 1: A longer prompt is not necessarily better

When you write a prompt for Stable Diffusion, a prompt that is too short produces a relatively simple image that lacks content. So many people tend to describe the image in great detail and write a very long prompt. However, in Flux:

  1. If you give it too many words, it will summarize. Your 5000-word prompt is compressed to about 100 tokens.
  2. If you give it too few words, it will fill in the blanks. For example, if your prompt is “Girl at the beach”, only four words, Flux’s T5 model will automatically fill in other related things, such as sand and blue sky.

Insight 2: It can understand the syntax and the meaning behind the words

Another place where Flux is smarter than Stable Diffusion is that it can understand syntax and the meaning behind the words.

So you no longer need to split the key information in the image into individual keywords and use various symbols to increase the weight of certain words. Instead, you can simply describe your intention in natural language, the way you usually would.
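
If you already have keyword-style prompts written for Stable Diffusion, a small helper can strip the weighting symbols before you rewrite them as sentences. The sketch below is only an illustration: the function name `strip_weights` is my own, and it assumes the common "(text:1.2)" and "((text))" emphasis syntax. It removes the symbols only; turning the keywords into a natural description is still up to you.

```python
import re

# Matches the "(text:1.2)" weighting syntax common in Stable Diffusion prompts.
_WEIGHTED = re.compile(r"\(([^():]+):\s*\d+(?:\.\d+)?\)")


def strip_weights(prompt: str) -> str:
    """Remove "(text:weight)" wrappers and leftover emphasis parentheses."""
    previous = None
    while previous != prompt:  # repeat until nothing changes (handles nesting)
        previous = prompt
        prompt = _WEIGHTED.sub(r"\1", prompt)
    # Assumption: any remaining parentheses are emphasis markers such as
    # "((dog))", so they are dropped. Legitimate parentheses are lost too.
    prompt = re.sub(r"[()]", "", prompt)
    # Collapse repeated whitespace left behind by the removals.
    return re.sub(r"\s+", " ", prompt).strip()


legacy = "(red box:1.3), ((dog)), cat"
print(strip_weights(legacy))
```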

Unlike Stable Diffusion, which cannot understand “left”, “right”, “top”, or “bottom”, Flux understands these words, so you can use them naturally. For example, the prompt for the following image is:

A dog and a cat are both standing on a red box. 
In the middle, there is a blue ball with a parrot standing on top of it. 
The box has "I ❤️ Flux" written on it.

The result is that the position information is basically correct, and it even recognizes emojis:

Insight 3: Specific and definite words have a greater impact on the model

This may just be my personal impression, but I have found that specific and definite words have a greater impact on the model.

What you need to do is describe your intention as specifically and definitely as possible: describe the elements you need in the picture instead of using vague concepts. At the same time, try to use positive wording instead of negative words like “Remove” or “Not Include”, which are basically ineffective in Flux.
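
As a rough way to catch negative wording before you generate, you could scan the prompt for such phrases. This is a minimal sketch, not a complete checker: the function name and the pattern list are my own assumptions. Only “remove” and “not include” come from the text above; “without” and “no” are added for illustration.

```python
import re

# Illustrative list only; extend it with phrases you catch yourself using.
NEGATIVE_PATTERNS = [r"\bremove\b", r"\bnot include\b", r"\bwithout\b", r"\bno\b"]


def find_negative_phrases(prompt: str) -> list[str]:
    """Return the negative phrases found in the prompt, grouped by pattern."""
    found = []
    for pattern in NEGATIVE_PATTERNS:
        found.extend(
            match.group(0)
            for match in re.finditer(pattern, prompt, re.IGNORECASE)
        )
    return found


print(find_negative_phrases("A beach, remove people, no clouds"))
print(find_negative_phrases("An empty beach under a clear sky"))
```

The second prompt says the same thing in positive terms (“empty”, “clear”) and therefore produces no warnings.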

2. Prompt template

Although Flux is very smart at understanding prompts, a prompt template can still help you organize your prompt better. Generally, a prompt needs to include the following elements:

  1. Object: The main element in the picture. Try to describe only the necessary details, such as position, action, color, and size.
  2. Environment: The environment in the picture, for example, the background or the lighting.
  3. Style: The style of the picture, for example, cartoon, realistic, or cyberpunk.

As long as the above three elements are included, you can generate a pretty good picture.

If you want more control, you can also add:

  • Medium: The medium of the picture, for example, selfie, film camera, or studio lighting.
  • Artists: If Style is not enough, you can also specify an artist (but note that in Flux this seems to work only for deceased artists).
  • Time: If Style is not enough, you can also specify the era of the picture.
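
The template above can be sketched as a small helper that assembles the elements into one prompt. This is just one possible arrangement: the function name, parameter names, and the connecting phrases (“In a … style”, “Medium: …”, “In the manner of …”, “Set in …”) are my own choices, not something Flux requires.

```python
from typing import Optional


def _sentence(text: str) -> str:
    """Trim the text and make sure it ends with exactly one period."""
    return text.strip().rstrip(".") + "."


def build_prompt(
    obj: str,
    environment: str,
    style: str,
    medium: Optional[str] = None,
    artist: Optional[str] = None,
    era: Optional[str] = None,
) -> str:
    """Join the three required elements and any optional ones into a prompt."""
    sentences = [
        _sentence(obj),
        _sentence(environment),
        f"In a {style.strip()} style.",
    ]
    if medium:
        sentences.append(f"Medium: {medium.strip()}.")
    if artist:
        sentences.append(f"In the manner of {artist.strip()}.")
    if era:
        sentences.append(f"Set in {era.strip()}.")
    return " ".join(sentences)


print(build_prompt(
    "A dog standing on a red box",
    "Soft morning light in a garden",
    "cartoon",
))
```

Keeping each element as a separate argument makes it easy to change only the style or only the environment while you compare results.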

3. How to write a good prompt

After reading the above, you may think that writing a good Flux prompt is quite simple. However, I believe that most people (including me) cannot produce the image they have in mind on the first try. The reasons are:

  1. Most people lack an aesthetic sense: This is a big problem, because if you don’t know what kind of picture you want, you don’t know how to write the prompt.
  2. Most people lack aesthetic knowledge: Even if you know what is beautiful, you may not know how to express it. For example, if you don’t know that there was a painter named “Van Gogh”, you cannot write “Van Gogh style” in the prompt.

So, I suggest you look at how others write prompts and imitate them. To make it easier for everyone to learn and reference, I will share some excellent prompts in the Flux Prompt Library.

4. Prompt assistant plugin

Finally, I recommend some prompt assistant extensions.

4.1 Bilbox ComfyUI

The first plugin is Bilbox ComfyUI. This plugin can help you better organize your prompt.

You can install this plugin through ComfyUI Manager, or through Git Clone. For detailed installation methods, please refer to Install ComfyUI Extension.

Note that this plugin is more suitable for writing prompts for photo-like images.

The usage is very simple. First, insert a BilboX node (② in the figure) before the CLIP Text Encode node (① in the figure) and connect it to the text input of the CLIP Text Encode node. Then fill in or select the options you need in the plugin's fields. For example, in the figure below, I selected street photography and added some lighting options. The generated image turned out very well:

Also, some readers may not know how to expose the text input of the CLIP Text Encode node. It’s actually very simple: right-click the CLIP Text Encode node (① in the figure), select Convert Widget to Input (② in the figure), and then select Convert text to input (③ in the figure).
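
If you work with workflows exported in ComfyUI's API format, the same wiring can be pictured as replacing the encoder's `text` string with a link to another node's output. The sketch below is illustrative only. The node ids, the `PromptHelperPlaceholder` class name, and the `link_text_input` helper are hypothetical, and the fragment omits the loader node, so it is not a complete workflow. The only things it relies on are the `CLIPTextEncode` inputs `text` and `clip` and the `[node_id, output_index]` form of a link.

```python
import copy


def link_text_input(workflow: dict, encoder_id: str, source_id: str,
                    output_index: int = 0) -> dict:
    """Return a copy of the workflow whose encoder text comes from another node."""
    updated = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    node = updated[encoder_id]
    if node["class_type"] != "CLIPTextEncode":
        raise ValueError(f"node {encoder_id} is not a CLIPTextEncode node")
    # A link is written as [source node id, index of the source output].
    node["inputs"]["text"] = [source_id, output_index]
    return updated


# Incomplete fragment for illustration; node "11" (the CLIP loader) is omitted.
workflow = {
    "6": {
        "class_type": "CLIPTextEncode",
        "inputs": {"text": "Girl at the beach", "clip": ["11", 0]},
    },
    # Hypothetical stand-in for a prompt helper node such as the one above.
    "20": {"class_type": "PromptHelperPlaceholder", "inputs": {}},
}

linked = link_text_input(workflow, encoder_id="6", source_id="20")
print(linked["6"]["inputs"]["text"])
```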

4.2 Flux Prompt Generator

The first plugin is better suited to photo-like prompts, whereas the second plugin, Flux Prompt Generator, can help you generate a more complete Flux prompt.

After installing the plugin, add it to the workflow, and you will see that it has more options than the first plugin:

4.3 ComfyUI Portrait Master

The last plugin is ComfyUI Portrait Master. This plugin has more settings for facial details, which can help you write prompts better suited to portrait scenes.