
Image Generation with Stable Diffusion and Fooocus

Image generation tools are now available to anyone using open source software and consumer grade hardware. In this note I will detail how to install Fooocus, a frontend built on Gradio that can take advantage of Stability AI’s Stable Diffusion (SD) model checkpoints. These checkpoints have already been trained (and in some cases, fine-tuned), making them ideal for quick image generation for a wide array of use cases.

Installing Fooocus

The Fooocus installation procedures are relatively straightforward for Linux, Windows, and macOS, including systems with either AMD or NVIDIA GPUs. In this guide I will focus on the installation of Fooocus on the MPIA Astro GPU Nodes which are equipped with enterprise GPUs that will be more than up to the task of processing our images. However, due to the software available on the nodes, the installation procedure requires a slight deviation from what’s provided directly from Fooocus.

Let’s begin by logging into the node of our choice. In this example we will log in to astro-gpu-node1. Let’s open an SSH connection and print out the GPU information (for NVIDIA GPUs).

ssh user@astro-gpu-node1
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04    Driver Version: 515.43.04    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 6000     On   | 00000000:3B:00.0 Off |                    0 |
| N/A   30C    P0    55W / 250W |   2863MiB / 23040MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Quadro RTX 6000     On   | 00000000:D8:00.0 Off |                    0 |
| N/A   33C    P0    55W / 250W |   2053MiB / 23040MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

We are going to follow the default Fooocus installation instructions with a slight modification to account for the outdated CUDA version available on the GPU nodes. We’ll be loading the native Anaconda module on the astro-nodes to install the software.

module purge # Optional, gets rid of any previously loaded modules
module load anaconda3-py3.10
git clone https://github.com/lllyasviel/Fooocus.git
cd Fooocus
conda env create -f environment.yaml
conda activate fooocus

We can now follow the instructions for installing older versions of PyTorch (our acceleration framework) to be compatible with our version of CUDA. Here I am following the instructions for CUDA 11.7 as this is the version available on the astro-gpu-node at the time of writing.

conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements_versions.txt

And there we are! That should be it for installation.
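As an optional sanity check of my own (not part of the Fooocus instructions), we can confirm that the pinned PyTorch build can actually see CUDA. Run this with the fooocus environment active:

```shell
# Should print the pinned PyTorch version and "True" if the CUDA build
# installed correctly and a GPU is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

If this prints "False", the CPU-only build was likely installed; rerun the conda install command above and check that the pytorch-cuda pin matches the CUDA version reported by nvidia-smi.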

Starting Fooocus

Let’s begin by activating the environment and running the main script. The first thing that will happen is that a Stable Diffusion XL model checkpoint will be downloaded. These can be several gigabytes in size, so this may take some time.

conda activate fooocus
python entry_with_update.py

While the code is initializing, a web browser might pop up on your local machine over an X-server. Feel free to close it; we’re going to use our local web browser rather than the KDE default, which will allow us to properly load the app. We’ll know the model is ready to run once we’ve seen the output:

App started successful. Use the app with http://127.0.0.1:XXXX/ or 127.0.0.1:XXXX

The Fooocus/Gradio interface is a web app built on Svelte. Therefore, in order to interact with the code, we’re going to need to forward the traffic to our local machine. We can select any port on our local machine, which we’ll label YYYY. In a new terminal we can run the following:

ssh -L YYYY:127.0.0.1:XXXX user@astro-gpu-node1

In general, I just use whatever port is used by the remote server. As the default port for Fooocus is 7865, I forward it to port 7865 on my local machine:

ssh -L 7865:127.0.0.1:7865 user@astro-gpu-node1

By leaving these two terminals open, one to run the server, and one to forward the web traffic, we can now open Fooocus on our local machine and start generating images.
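As a quick check that the tunnel is working (a suggestion of mine, not part of the Fooocus docs), we can ask for the HTTP status of the forwarded port from the local machine:

```shell
# A 200 response means the Gradio app is reachable through the tunnel;
# replace 7865 with your chosen local port (YYYY) if it differs.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:7865/
```

Any connection error here points at the SSH tunnel; a non-200 status points at the Fooocus server itself.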

Generating Images

We can now open our web browser of choice (for me, Firefox) and navigate to the Fooocus web application at http://localhost:YYYY/, or in my case http://localhost:7865/. If an empty image appears with a text box beneath it, then the Fooocus application has launched correctly!

Generating an image is now as simple as typing any prompt and hitting generate! We can watch the output logs in our original terminal session. The default settings will produce two images from our prompt. If the GPUs are not being used by other users, we should expect about 30-45s per image. While the Fooocus documentation provides a description of the different options, I intend to provide a short introduction to the most important configurations we can tweak in the UI. If you are familiar with Midjourney, one of the more popular image generation services, I recommend reading the Fooocus documentation section on transitioning between the two.

The Advanced Tab

By clicking the Advanced checkbox at the bottom of the page, we can open a panel which provides fine-grained control over the image generation process. It consists of several sections, each of which provides increasingly detailed control.

Input Images

One of the most powerful features of Stable Diffusion is generation not just from prompts, but from input images. By clicking the Input Image selection box, we can edit original images (or AI generated images) in a few specific ways.

With all of these controls under your belt, you should be well on your way to producing your AI generated images with Stable Diffusion XL and Fooocus!

Extra: Adding a new Model

One of the most exciting parts of image generation from Stable Diffusion XL checkpoints is the ability to start from any SDXL checkpoint, not just the defaults provided here. In fact, Fooocus ships with two additional default models, one designed for photorealism and the other for anime-style images. If you run entry_with_update.py with the --preset flag set to realistic or anime, both additional models will be downloaded. Any of the three default models can then be selected in the Advanced settings tab.
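For example, launching with the realistic preset (the anime preset works the same way) looks like this:

```shell
# First run with a preset downloads the corresponding checkpoint,
# then starts the app as before.
python entry_with_update.py --preset realistic
```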

Additional models can be downloaded from the internet. Fooocus currently supports two main types of model: full checkpoints and LoRAs.

Checkpoints (e.g. SDXL or SD1.5 models) should be placed in the Fooocus/models/checkpoints directory, while LoRAs should be placed in the Fooocus/models/loras directory; they will then appear as options upon restarting the code. There are quite a few other directories for additional configurations and tweaks, but at this point you’re equipped to set off on this adventure on your own!
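To sketch the layout, assuming you’ve downloaded a model file to your home directory (the file names below are placeholders, not real models):

```shell
# Create the target directories if they don't already exist
# (a fresh Fooocus clone should have them)
mkdir -p Fooocus/models/checkpoints Fooocus/models/loras

# Hypothetical file names; substitute the files you actually downloaded
# cp ~/Downloads/my-sdxl-checkpoint.safetensors Fooocus/models/checkpoints/
# cp ~/Downloads/my-style-lora.safetensors Fooocus/models/loras/

# The new files appear in the UI after restarting entry_with_update.py
ls Fooocus/models
```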

Good luck!