Stable Diffusion Tutorial: Create Stable Diffusion API on GCP VM

What is Stable Diffusion?

Stable Diffusion is a state-of-the-art deep-learning text-to-image model released in 2022. It is primarily designed to generate detailed, high-quality images from textual descriptions, but it is versatile enough for other tasks as well, including inpainting (filling in missing parts of an image), outpainting (extending an image beyond its original borders), and text-guided image-to-image translation. For more information about its capabilities, visit Stability.ai.

Creating a Google Cloud Platform (GCP) Account

To utilize the power of Stable Diffusion, you will need to first create a GCP account. Visit Google Cloud Platform to sign up. It's essential to set up a billing account, as GPU resources cannot be utilized under the free tier. Additionally, it is advisable to set a budget with alerts in place since GPU usage can become quite costly.

Requesting GPU Access

Once your GCP account is established, navigate to the APIs & Services page, search for the Compute Engine API, and click Enable. Next, request the ability to create virtual machines with GPUs: go to the Quotas section, filter for GPUs across all regions, and request an increase from 0 to 1, providing a reason such as "Using an ML model that requires GPU". Please be patient, as approval may take a day or two. For more details, visit Google Cloud Compute Quotas.
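If you prefer the gcloud CLI to the console, the first two steps can be sketched as follows. The project ID is a placeholder, and the last command only inspects your current limits; the quota increase itself is requested in the console:

```shell
# Hypothetical project ID; replace with your own.
gcloud config set project my-sd-project

# Enable the Compute Engine API.
gcloud services enable compute.googleapis.com

# Inspect project-wide quotas; look for GPUS_ALL_REGIONS in the output.
gcloud compute project-info describe --format="value(quotas)"
```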

Creating a VM Instance

Next, you will need to create a virtual machine (VM) instance. Head over to Google Cloud Console and click on Create instance. Assign your instance a name; we will use stable-diffusion-instance for this example. Select a region that supports GPU—you might need to try a few options. Under the Machine configuration, select the option for GPU. If you're looking for high performance and can accommodate the cost, the A100 GPU is the best option, while the T4 is a more budget-friendly selection, costing around $300/month if running continuously.

For the machine type, select n1-standard-4, which provides 15GB of memory—adequate for most tasks, though more may be preferable depending on your specific needs. Make sure to adjust the boot disk settings; select Change under Boot Disk, and opt for the Deep Learning VM option based on Debian 10. You might want to adjust the boot disk size (50GB is minimal). Under Firewall, check both Allow HTTP traffic and Allow HTTPS traffic.

In the Advanced options under Networking, add a network tag such as stable-diffusion-tag to identify this instance later.
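For reference, the console settings above roughly correspond to a single gcloud command. The zone and image family here are assumptions you should adjust (for example, pick a zone with T4 availability):

```shell
# Sketch of the console settings as one gcloud command.
# Zone and image family are assumptions; adjust to your region and needs.
gcloud compute instances create stable-diffusion-instance \
  --zone=us-central1-a \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-t4,count=1 \
  --image-family=common-cu113 \
  --image-project=deeplearning-platform-release \
  --boot-disk-size=50GB \
  --maintenance-policy=TERMINATE \
  --tags=stable-diffusion-tag
```

Note that GPU instances must use `--maintenance-policy=TERMINATE`, since they cannot be live-migrated.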

Creating a Firewall Rule

To make your instance reachable from outside, visit the Firewall Rules page and click Create Firewall Rule. Name this rule stable-diffusion-rule, and for Targets, select Tags (enter the tag stable-diffusion-tag). For Source IP ranges, input 0.0.0.0/0, and under Protocols and ports, add tcp:5000 before clicking Create.
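The same rule can be created from the CLI; this is a sketch using the names chosen above:

```shell
# Allow inbound TCP on port 5000 to instances tagged stable-diffusion-tag.
gcloud compute firewall-rules create stable-diffusion-rule \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:5000 \
  --source-ranges=0.0.0.0/0 \
  --target-tags=stable-diffusion-tag
```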

Accessing the Compute Instance

The simplest method to access your instance is via SSH directly from the console. Navigate to the list of compute instances and click on the one you just created to connect.
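Alternatively, gcloud can open the SSH session from your local terminal:

```shell
# Open an SSH session to the instance.
# The zone is a placeholder; use the zone you created the instance in.
gcloud compute ssh stable-diffusion-instance --zone=us-central1-a
```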

Setting Up Stable Diffusion on the Instance

Upon logging into your instance for the first time, you will be prompted to install the Nvidia driver—simply respond with 'Y'. Note that each time you stop and restart the VM, you will need to reinstall these drivers.

The next step involves cloning the required repository and installing Cog, which provides an efficient way to package and deploy machine learning models. Since Docker comes pre-installed on the Deep Learning VM image, no additional installation is needed.
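A minimal sketch of that setup, assuming Replicate's cog-stable-diffusion template repository (the Cog install command follows the Cog README):

```shell
# Install Cog (install method from the Cog README).
sudo curl -o /usr/local/bin/cog -L \
  "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog

# Clone Replicate's Stable Diffusion Cog template and enter it.
git clone https://github.com/replicate/cog-stable-diffusion.git
cd cog-stable-diffusion
```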

Building the Docker Image

To run the model, you first need to download the weights from Hugging Face. If you haven't already, create a Hugging Face account and generate an authentication token. With the token in hand, use the repository's download script to fetch the weights. Once the download completes, you can test that the model is operational.
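Assuming the cog-stable-diffusion repository layout, downloading the weights and smoke-testing the model looks roughly like this (the token placeholder must be replaced with your own):

```shell
# Download the Stable Diffusion weights from Hugging Face.
# <your-hugging-face-auth-token> is a placeholder for your own token.
cog run script/download-weights <your-hugging-face-auth-token>

# Smoke-test the model with a single prompt.
cog predict -i prompt="a photo of an astronaut riding a horse"
```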

Once everything runs smoothly, you should find a generated image, output-1.png, which you can download via the terminal. Use the pwd command to determine your working directory; it typically returns a path like /home//cog-stable-diffusion. You will mount this path as a volume when starting the Stable Diffusion Docker image so the container can access your downloaded Hugging Face weights.
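One way to serve the model over port 5000 is to build the image with Cog and run it with the project directory mounted; the image name below is a placeholder, and mounting to /src assumes the container's working directory follows Cog's convention:

```shell
# Build a Docker image from the Cog project (image name is a placeholder).
cog build -t stable-diffusion-api

# Serve the model on port 5000, mounting the project directory
# (which contains the downloaded weights) into the container at /src.
docker run -d -p 5000:5000 --gpus all \
  -v "$(pwd)":/src \
  stable-diffusion-api
```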

How to Test the API?

While testing the API, be aware that the response payload will be base64 encoded. To decode it, you can use an online tool such as Code Beautify Base64 Decoder.
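For a scriptable alternative to an online decoder, the response can be decoded on the command line. This sketch assumes Cog's standard HTTP interface (POST /predictions returning a base64 data URI); the payload here is simulated so the decoding step is self-contained:

```shell
# In practice the response comes from the running container, e.g.:
#   curl -s http://localhost:5000/predictions -X POST \
#     -H "Content-Type: application/json" \
#     -d '{"input": {"prompt": "an astronaut riding a horse"}}' > response.json
# Here we simulate it so the decoding step can be shown end-to-end.
printf '{"output": ["data:image/png;base64,%s"]}' \
  "$(printf 'fake-png-bytes' | base64)" > response.json

# Strip the data-URI prefix and decode the payload to a file.
sed -n 's/.*base64,\([^"]*\)".*/\1/p' response.json | base64 -d > output.png
```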

Exploring Extra Features

For users interested in creating short videos, explore Deforum, which offers a robust model for video generation using similar methods. Check it out at Deforum on Replicate. To use it, stop the current container and follow the steps in the Deforum documentation.

Conclusion

Thank you for reading this comprehensive tutorial on setting up Stable Diffusion. If you found this guide helpful, you can discover more tutorials and resources on our dedicated tutorial page. Stay tuned for more insights and innovations in the realm of AI!
