Creating GenAI Apps in Java with SD4J (Stable Diffusion for Java) and the ONNX Runtime — Part 1
This blog post provides a quick guide on how to generate images with SD4J (Stable Diffusion for Java), a Stable Diffusion implementation.
Oracle Labs has open-sourced the Stable Diffusion in Java (SD4J) project, a modified port of a Stable Diffusion C# implementation with added support for negative text inputs. SD4J can be used through its GUI or programmatically from Java applications through the class com.oracle.labs.mlrg.sd4j.SD4J, which exposes a well-defined interface for generating images.
SD4J runs on top of the ONNX Runtime, a cross-platform inference engine.
So without further ado, let’s get started!
Prerequisites
- JDK — Java Development Kit 17 or newer
- Oracle Database Free Release 23ai — Container Image
- Oracle JDBC Driver 23ai (23.4.0.24.05) — Maven Central
- Your preferred Java IDE — Eclipse, IntelliJ, VS Code
- Apache Maven
- A Hugging Face user access token
- Hugging Face CLI
- Python 3
- Git
- Git Large File Storage (extension)
Image Generation — Steps
The steps below demonstrate how to use the ONNX Runtime from Java, along with some tips and best practices for getting good performance from it.
Install Python 3 and the Python dependencies
The scripts require a suitable Python 3 installation, along with the dependencies listed below:
pip install diffusers
pip install transformers
pip install onnxruntime
pip install optimum
pip install onnx
pip install torch
pip install accelerate
pip install onnxruntime_extensions
Clone the SD4J project
Create a root directory for your environment, then clone the SD4J project.
cd C:\models
git clone https://github.com/juarezjuniorgithub/sd4j.git
Install Git Large File Storage
LFS is an open-source Git extension for versioning large files.
git lfs install
Prepare the ONNX model checkpoint
Hugging Face’s website has many compatible models. We have tested the Stable Diffusion v1.5 checkpoint, which has pre-built ONNX models. You can download it via the git command below.
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 -b onnx
Set up the ONNX Runtime extensions
You will also need to check out and compile onnxruntime-extensions for your platform. The related repository is available at https://github.com/microsoft/onnxruntime-extensions
git clone https://github.com/microsoft/onnxruntime-extensions
Finally, download the Python script convert_stable_diffusion_checkpoint_to_onnx.py from Hugging Face, and save it to a scripts directory. On Windows, the resulting layout under C:\models includes the sd4j, stable-diffusion-v1-5, and onnxruntime-extensions directories used in the steps below.
Preparation with Hugging Face
Install the Hugging Face CLI.
pip install -U "huggingface_hub[cli]"
huggingface-cli --help
Authenticate against the Hugging Face platform. You will be asked for the user access token listed in the Prerequisites section above.
huggingface-cli.exe login
Compile the ONNX Runtime extensions for your target platform (Windows)
ONNX Runtime Extensions is a library that extends the capabilities of ONNX models and of inference with the ONNX Runtime.
cd C:\models\onnxruntime-extensions
build.bat
The build generates the required library (ortextensions.dll on Windows). The DLL will be located at C:\models\onnxruntime-extensions\out\Windows\bin\RelWithDebInfo\ortextensions.dll
Copy the DLL file to the root of your SD4J project.
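Before launching SD4J, it can help to confirm that the library really is where you copied it. The sketch below is a minimal check, assuming the library keeps the base name ortextensions on every platform (as the DLL name above suggests) and sits in the project root. The class and method names are made up for this example and are not part of SD4J or the ONNX Runtime.

```java
import java.nio.file.Files;
import java.nio.file.Path;

final class OrtExtensionsLocator {
    // Base name taken from the DLL mentioned above; assumed to be the same on other platforms.
    static final String BASE_NAME = "ortextensions";

    /** Returns the platform-specific file name, for example ortextensions.dll on Windows. */
    static String libraryFileName() {
        return System.mapLibraryName(BASE_NAME);
    }

    /** Resolves where the library is expected inside the given project root. */
    static Path expectedLocation(Path projectRoot) {
        return projectRoot.resolve(libraryFileName()).toAbsolutePath().normalize();
    }

    /** True if a regular file with the expected name exists in the project root. */
    static boolean isPresent(Path projectRoot) {
        return Files.isRegularFile(expectedLocation(projectRoot));
    }

    public static void main(String[] args) {
        // Pass the SD4J project root as the first argument, or run from inside it.
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        System.out.println("Expected library: " + expectedLocation(root));
        System.out.println("Present: " + isPresent(root));
    }
}
```

Running it with C:\models\sd4j as the argument prints the expected path and whether the file was found there.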
Compile the Java project with Maven
cd C:\models\sd4j
mvn clean package
Run the SD4J GUI
cd C:\models\sd4j
mvn clean package exec:exec -DmodelPath=../stable-diffusion-v1-5/
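If the GUI fails to start, a wrong -DmodelPath value is a likely culprit. The following sketch checks a model directory before launch. It assumes the checkpoint uses the usual ONNX export layout with one model.onnx file per component (text_encoder, unet, vae_decoder, safety_checker); that layout and the class name are assumptions for illustration, so adjust the list if your checkpoint differs.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ModelLayoutCheck {
    // Assumed layout of the ONNX checkpoint; not read from SD4J itself.
    static final List<String> EXPECTED = List.of(
            "text_encoder/model.onnx",
            "unet/model.onnx",
            "vae_decoder/model.onnx",
            "safety_checker/model.onnx");

    /** Returns the expected model files that are missing under modelDir. */
    static List<String> missingModels(Path modelDir) {
        List<String> missing = new ArrayList<>();
        for (String relative : EXPECTED) {
            if (!Files.isRegularFile(modelDir.resolve(relative))) {
                missing.add(relative);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Defaults to the relative path used in the Maven command above.
        Path modelDir = Path.of(args.length > 0 ? args[0] : "../stable-diffusion-v1-5/");
        List<String> missing = missingModels(modelDir);
        if (missing.isEmpty()) {
            System.out.println("All expected model files found in " + modelDir);
        } else {
            System.out.println("Missing under " + modelDir + ": " + missing);
        }
    }
}
```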
Unless stated otherwise, the images in this post were created with a guidance scale of 10, seed 42, 50 inference steps, and the Euler Ancestral scheduler.
You can specify the parameters of the image you’d like to generate, and each generated image opens in its own window, from which you can save it as a PNG file. Saved PNG files contain a metadata field with the generation parameters.
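To illustrate how generation parameters can travel inside a PNG, here is a minimal sketch that writes and reads a tEXt chunk with the standard javax.imageio API. The keyword generation-parameters and the value format are made up for this example; the field name and format that SD4J actually writes may differ.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.ImageTypeSpecifier;
import javax.imageio.ImageWriter;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.metadata.IIOMetadataNode;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;
import org.w3c.dom.NodeList;

final class PngTextMetadata {
    private static final String PNG_FORMAT = "javax_imageio_png_1.0";

    /** Encodes the image as PNG with one tEXt entry (keyword and value must be Latin-1). */
    static byte[] write(BufferedImage image, String keyword, String value) {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("png").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream out = new MemoryCacheImageOutputStream(bytes)) {
            IIOMetadata metadata = writer.getDefaultImageMetadata(
                    ImageTypeSpecifier.createFromRenderedImage(image),
                    writer.getDefaultWriteParam());
            // Build <javax_imageio_png_1.0><tEXt><tEXtEntry keyword=... value=.../></tEXt></...>
            IIOMetadataNode entry = new IIOMetadataNode("tEXtEntry");
            entry.setAttribute("keyword", keyword);
            entry.setAttribute("value", value);
            IIOMetadataNode text = new IIOMetadataNode("tEXt");
            text.appendChild(entry);
            IIOMetadataNode root = new IIOMetadataNode(PNG_FORMAT);
            root.appendChild(text);
            metadata.mergeTree(PNG_FORMAT, root);

            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, metadata), null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    /** Returns the value stored under keyword, or null if there is no such tEXt entry. */
    static String read(byte[] png, String keyword) {
        ImageReader reader = ImageIO.getImageReadersByFormatName("png").next();
        try (ImageInputStream in = new MemoryCacheImageInputStream(new ByteArrayInputStream(png))) {
            reader.setInput(in);
            IIOMetadataNode root =
                    (IIOMetadataNode) reader.getImageMetadata(0).getAsTree(PNG_FORMAT);
            NodeList entries = root.getElementsByTagName("tEXtEntry");
            for (int i = 0; i < entries.getLength(); i++) {
                IIOMetadataNode entry = (IIOMetadataNode) entries.item(i);
                if (keyword.equals(entry.getAttribute("keyword"))) {
                    return entry.getAttribute("value");
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            reader.dispose();
        }
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
        byte[] png = write(image, "generation-parameters", "seed=42, steps=50, guidance=10");
        System.out.println(read(png, "generation-parameters"));
    }
}
```

The same read method can be pointed at the bytes of a file saved from the GUI to see which text entries it contains, once you know the keyword in use.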
The seed is the number that initializes the random generator used to produce the starting noise. When using the same seed, prompt, and other parameters, the generated images stay the same.
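The effect of the seed is easy to reproduce in plain Java. This sketch fills an array with Gaussian noise from java.util.Random and shows that the same seed yields the same values. It only illustrates the principle; the random generator and noise shape that SD4J uses internally are not shown here, and the class name is hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

final class SeededNoise {
    /** Returns `length` samples of standard Gaussian noise, fully determined by the seed. */
    static float[] gaussianNoise(long seed, int length) {
        Random rng = new Random(seed);
        float[] noise = new float[length];
        for (int i = 0; i < length; i++) {
            noise[i] = (float) rng.nextGaussian();
        }
        return noise;
    }

    public static void main(String[] args) {
        float[] first = gaussianNoise(42L, 8);
        float[] second = gaussianNoise(42L, 8);
        float[] other = gaussianNoise(43L, 8);
        System.out.println("Same seed, same noise: " + Arrays.equals(first, second));
        System.out.println("Different seed, same noise: " + Arrays.equals(first, other));
    }
}
```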
Stable Diffusion starts with an image of random noise. With each inference step, the noise is reduced and the image is steered towards the prompt. A higher number of steps is not always better, as it might introduce unwanted details. The Hugging Face website generally recommends 50 inference steps.
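To make the notion of inference steps more concrete: Stable Diffusion v1.x models are trained with 1000 noise levels (timesteps), and at inference time a scheduler visits only a subset of them, from the noisiest to the cleanest. The sketch below picks evenly spaced timesteps for a given number of inference steps. The even spacing and all names are assumptions for illustration; SD4J's Euler Ancestral scheduler may space or interpolate its steps differently.

```java
final class TimestepSchedule {
    /**
     * Returns evenly spaced timesteps from (trainingSteps - 1) down to 0.
     * Requires at least two inference steps so that both end points are included.
     */
    static double[] timesteps(int inferenceSteps, int trainingSteps) {
        if (inferenceSteps < 2 || trainingSteps < 1) {
            throw new IllegalArgumentException("need inferenceSteps >= 2 and trainingSteps >= 1");
        }
        double last = trainingSteps - 1;
        double[] result = new double[inferenceSteps];
        for (int i = 0; i < inferenceSteps; i++) {
            // i = 0 maps to the noisiest timestep, i = inferenceSteps - 1 maps to 0.
            result[i] = last * (inferenceSteps - 1 - i) / (inferenceSteps - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] schedule = timesteps(50, 1000);
        System.out.println("Steps: " + schedule.length);
        System.out.println("First timestep: " + schedule[0]);
        System.out.println("Last timestep: " + schedule[schedule.length - 1]);
    }
}
```

With fewer inference steps the gaps between visited timesteps grow, so each step has to remove more noise at once; with more steps the gaps shrink and generation takes longer.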
Create an image
Now, you can use the GUI to create an image of a sports car on the road, with the following textual description.
Professional photograph sports car on the road, high resolution, high quality
The inference steps are logged while the image is being generated.
As soon as the process is completed, your image will be available. An example image is below.
SD4J supports negative text inputs for excluding things from an image, in this case the color red. Most sports cars are red, so to create images of sports cars that aren’t red, use the negative text to specify what the image shouldn’t contain.
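In common Stable Diffusion pipelines, the guidance scale and the negative text meet in one formula, known as classifier-free guidance: the model predicts noise once for the prompt and once for the negative text (or an empty prompt), and the final prediction is pushed away from the negative one by the guidance scale. The sketch below applies that formula element by element. It assumes SD4J follows this standard scheme; the class and method names are hypothetical, and the tiny arrays stand in for real noise predictions.

```java
final class GuidanceDemo {
    /**
     * Classifier-free guidance: guided = negative + scale * (conditional - negative).
     * `negative` is the prediction for the negative text (or empty prompt),
     * `conditional` is the prediction for the prompt.
     */
    static float[] applyGuidance(float[] negative, float[] conditional, float scale) {
        if (negative.length != conditional.length) {
            throw new IllegalArgumentException("predictions must have the same length");
        }
        float[] guided = new float[negative.length];
        for (int i = 0; i < guided.length; i++) {
            guided[i] = negative[i] + scale * (conditional[i] - negative[i]);
        }
        return guided;
    }

    public static void main(String[] args) {
        float[] negative = {0.0f, 1.0f};     // stand-in for the "red" prediction
        float[] conditional = {1.0f, 1.0f};  // stand-in for the "sports car" prediction
        float[] guided = applyGuidance(negative, conditional, 10.0f);
        System.out.println(java.util.Arrays.toString(guided));
    }
}
```

A scale of 1 returns the prompt prediction unchanged, while larger values move the result further from whatever the negative text describes, which is why a guidance scale of 10 follows the prompt closely.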
Wrap-up
That’s it! You learned how to create images with Java and the SD4J Library in Part 1 of this series. It introduced the tools, how to install them, and presented the SD4J Library in action with an existing code sample.
Part 2 will show you how to combine SD4J with the Oracle Database 23ai, and we’ll also explore the possibilities with GraalVM to create a fast yet solid image generator, so stay tuned!
References
Oracle Database Free Release 23ai — Container Image
Oracle® Database JDBC Java API Reference, Release 23ai
Oracle JDBC Driver 23ai (23.4.0.24.05) — Maven Central
Stable Diffusion — A latent text-to-image diffusion model — https://github.com/CompVis/stable-diffusion
Stable Diffusion for Java (SD4J) — https://github.com/oracle/sd4j
ONNX Runtime — https://onnxruntime.ai/
JDBC — https://www.oracle.com/database/technologies/appdev/jdbc.html
Introduction to Oracle JDBC Driver Support for Virtual Threads — https://bit.ly/3UlNJWP
Developing an Oracle JDBC app with GraalVM Native Image — https://rb.gy/iy3sgh
Getting Started with Reactive Relational Database Connectivity and the Oracle R2DBC Driver — https://rb.gy/42dnz5
Getting Started with the Java library for Reactive Streams Ingestion (RSI) — https://bit.ly/3rEiRnC
Introduction to JDBC Reactive Extensions with the Oracle Database 23c Free — Developer Release — https://rb.gy/qxlrbx
Pipelined Database Operations — https://rb.gy/iy3sgh