A GenAI Solution with Virtual Threads, JDBC Reactive Extensions, and Pipelined Database Operations
Introduction
This blog post explores two phenomenal technologies. The first is an Oracle JDBC and Database feature called Pipelined Database Operations. Starting with Oracle Database Release 23ai, the JDBC Thin driver supports pipelined database operations.
The second is Generative AI and how to use the Oracle Cloud Infrastructure GenAI Service. Our sample scenario and simple use case will present the code for combining both to create a GenAI application with the Oracle Database and Oracle Cloud Infrastructure.
So, without further ado, let’s get started!
Prerequisites
- JDK — Java Development Kit 21 or newer (the sample relies on virtual threads and the Structured Concurrency preview API)
- Oracle Database Free Release 23ai — Container Image
- Oracle JDBC Driver 23ai — Maven Central
- Your preferred Java IDE — Eclipse, IntelliJ, VS Code
- Apache Maven
- OCI’s Command Line Interface (CLI)
Introduction to Pipelined Database Operations
Java applications can now asynchronously submit several SQL requests to the server without waiting for the return of the preceding calls.
The core principle of Pipelined Database Operations is to keep the server busy while the application keeps working instead of idling on each round trip. The application can keep sending requests while the server queues them and processes them one by one. The server then sends the responses back to the client in the same order in which it received the requests.
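To make the ordering guarantee concrete, here is a minimal sketch that models pipelining in plain Java, without JDBC or a database. A single-threaded executor plays the role of the server. The class name PipelineOrderingSketch and the method runPipelined are made up for this illustration; the code only mimics the behavior described above and is not how the Oracle JDBC driver is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy model of request pipelining; it does not use JDBC or a real database.
final class PipelineOrderingSketch {

    // Submits every request up front, then collects the responses.
    static List<String> runPipelined(List<String> requests) {
        // A single worker thread stands in for the server: it queues the
        // requests and processes them one by one, in arrival order.
        ExecutorService server = Executors.newSingleThreadExecutor();
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (String request : requests) {
                // supplyAsync returns immediately, so the client keeps
                // sending without waiting for earlier responses.
                pending.add(CompletableFuture.supplyAsync(
                        () -> "done: " + request, server));
            }
            List<String> responses = new ArrayList<>();
            for (CompletableFuture<String> future : pending) {
                // Futures are read in submission order, so the responses
                // line up with the requests.
                responses.add(future.join());
            }
            return responses;
        } finally {
            server.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> sql = List.of("INSERT row 1", "INSERT row 2", "SELECT count");
        runPipelined(sql).forEach(System.out::println);
    }
}
```

Running main prints one response per request, in the order the requests were sent, even though the client never waited between submissions.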
The Oracle JDBC Driver 23ai supports Pipelined Database Operations. Although my blog post does not exhaustively cover this subject, we have rich and extensive documentation you can use if needed. Please check Support for Pipelined Database Operations.
This post does not cover every library and API that Pipelined Database Operations can be combined with, but the available options are worth listing:
- The Oracle JDBC Reactive Extensions
- Reactive Streams libraries
- Java Virtual Threads
- Structured Concurrency
- The Standard JDBC Batching API
We’ll explore a practical scenario combined with Structured Concurrency to illustrate how we interact with it all. In a nutshell, Structured Concurrency treats groups of related tasks running in different threads as a single unit of work, thereby streamlining error handling and cancellation, improving reliability, and enhancing observability.
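The "single unit of work" idea can be approximated with stable JDK APIs. The sketch below is an assumption-laden stand-in, not the sample's code: it uses ExecutorService.invokeAll rather than the preview StructuredTaskScope API, so that it compiles without preview flags. TaskGroupSketch and runAsUnit are hypothetical names. One notable difference is that a structured task scope can also cancel the remaining subtasks as soon as one fails, whereas invokeAll waits for all of them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Approximates "a group of related tasks as one unit of work" using only
// stable APIs. This is NOT the StructuredTaskScope preview API.
final class TaskGroupSketch {

    // Runs all tasks concurrently and returns their results in task order.
    // If any task fails, the whole group fails with that task's exception
    // attached as the cause.
    static <T> List<T> runAsUnit(List<Callable<T>> tasks) {
        // On JDK 21+, Executors.newVirtualThreadPerTaskExecutor() could be
        // used here to run each task on its own virtual thread.
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            // invokeAll blocks until every task has completed or failed.
            List<Future<T>> futures = executor.invokeAll(tasks);
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                results.add(future.get()); // throws if this task failed
            }
            return results;
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task in the group failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the group", e);
        } finally {
            // The executor never outlives the method, so no task can leak.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<Callable<Integer>> tasks = List.of(() -> 1, () -> 2, () -> 3);
        System.out.println(runAsUnit(tasks));
    }
}
```

The caller sees a single success or a single failure for the whole group, which is the error-handling simplification the paragraph above refers to.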
The sections below explain the code sample, including its Java implementation and most relevant methods. For the remaining options beyond Structured Concurrency, check the official documentation, Support for Pipelined Database Operations.
Introduction to OCI GenAI SDK
Oracle Cloud Infrastructure (OCI) Generative AI is a fully managed service that provides versatile language models you can integrate into a wide range of GenAI-related use cases, including writing assistance, summarization, analysis, and chat.
Our example and the associated sample scenario will use the OCI GenAI Service and its provided embedding models to create the vector embeddings that will be stored by an instance of Oracle Database 23ai as a full-fledged vector store, which also supports Oracle AI Vector Search.
The configuration involves some security-related requirements, including using OCI’s Command Line Interface (CLI) and OCIDs, which are typical when interacting with the Oracle Cloud Infrastructure. If you need an introduction, please check the OCI documentation.
From an OCI GenAI Service standpoint, it’s interesting to mention that you can get the OCID of an OCI GenAI embedding model related to your compartment and region by running the command below:
oci generative-ai model-collection list-models --region {your-region} --compartment-id {your-compartment-id}
Remember to replace the placeholders {your-compartment-id} and {your-region} with the right values, and save the returned model OCID value. We’ll need it later to configure the interaction with the OCI GenAI Service.
Besides, note the Maven dependencies for the Oracle Cloud Infrastructure GenAI Service.
The code sample — the PipelineVectorDemo class
The code sample uses an instance of Oracle Autonomous Database on OCI. To configure your JDBC connection details, use the /resources/config.properties file as usual.
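As a rough illustration of how connection details could be read from a properties file, here is a minimal sketch using java.util.Properties. The key names DB_URL, DB_USER, and DB_PASSWORD, the example URL, and the class ConfigSketch are assumptions made for this example; the actual keys expected by the sample may differ. The sketch reads from a Reader so it runs without a file on disk, whereas a real application would typically open config.properties from the classpath.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper; the key names used below are illustrative only.
final class ConfigSketch {

    // Parses properties-file content from any Reader.
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Fails fast with a clear message when a required key is absent.
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing configuration key: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Inline content stands in for the real config.properties file.
        String content = "DB_URL=jdbc:oracle:thin:@localhost:1521/FREEPDB1\n"
                + "DB_USER=demo\n";
        Properties props = load(new StringReader(content));
        System.out.println(require(props, "DB_URL"));
        System.out.println(require(props, "DB_USER"));
    }
}
```

Failing fast on a missing key tends to produce a clearer error than a connection failure later on.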
The key takeaway from the provided code sample is that many threads share one JDBC connection and use it to execute SQL operations concurrently. Normally, that would not be possible, as synchronous APIs like PreparedStatement.execute() block the calling thread until all previous SQL operations have completed.
Concurrent SQL is possible because the code sample uses Pipelined Database Operations.
Pipelined Database Operations
The PipelineVectorDemo.java class below contains code examples that use virtual threads, pipelined database calls, and AI vector search.
These methods use the Structured Concurrency API, which has been a preview feature since JDK 21 and continues to be previewed as of JDK 23.
They will cover three different operations:
- The loadTable(Connection) method executes pipelined batch inserts on virtual threads. The inserts load a database table with text data retrieved from an external URL.
- The updateTable(Connection) method executes pipelined batch updates on virtual threads. The updates store vector embeddings for the text data, which are requested from Oracle Cloud's Generative AI service.
- The searchTableText(Connection, List) method executes pipelined SELECT queries on virtual threads. The queries use the VECTOR_DISTANCE function of Oracle Database to perform a similarity search against vector embeddings.
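To give a feel for what a similarity search computes, the sketch below ranks a few in-memory vectors by cosine distance, which is one of the metrics VECTOR_DISTANCE supports. It is plain Java with no database involved. The class CosineSearchSketch, its methods, and the tiny two-dimensional vectors are invented for illustration; real embeddings have many more dimensions and the database performs this work itself.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration of ranking by cosine distance.
final class CosineSearchSketch {

    // Cosine distance = 1 - cosine similarity. A value of 0 means the
    // vectors point in the same direction; 2 means opposite directions.
    // Assumes neither vector is all zeros.
    static double cosineDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have equal length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the row keys ordered from nearest to farthest from the query.
    static List<String> rank(double[] query, Map<String, double[]> rows) {
        List<String> ids = new ArrayList<>(rows.keySet());
        ids.sort(Comparator.comparingDouble(
                id -> cosineDistance(query, rows.get(id))));
        return ids;
    }

    public static void main(String[] args) {
        Map<String, double[]> rows = new LinkedHashMap<>();
        rows.put("fish", new double[] {0.0, 1.0});
        rows.put("cats", new double[] {0.9, 0.1});
        rows.put("bears", new double[] {-1.0, 0.0});
        System.out.println(rank(new double[] {1.0, 0.0}, rows));
    }
}
```

In the database, the equivalent ordering is expressed in SQL by sorting on the distance between a stored vector column and the query vector.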
Before running the sample code, recall that you obtained the embedding model's OCID in a previous step. Store it in an environment variable, because the sample Java class reads it in a static initializer block that holds its configuration.
Note that you must also configure other environment variables for your usual OCID details, such as OCI_PROFILE, OCI_AUTHENTICATION, and COMPARTMENT_OCID. Again, please check the official documentation if you need an introduction to OCI Security.
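A small guard like the one below can catch a missing variable before any call to OCI is made. It is only a sketch: the class OciEnvSketch and the method require are hypothetical, and the lookup takes a Map so it can be exercised without touching the real environment. The variable names are the ones mentioned above.

```java
import java.util.Map;

// Hypothetical fail-fast check for required environment variables.
final class OciEnvSketch {

    // Returns the variable's value, or throws if it is unset or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException(
                    "Missing required environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        String[] names = {"OCI_PROFILE", "OCI_AUTHENTICATION", "COMPARTMENT_OCID"};
        for (String name : names) {
            try {
                require(env, name);
                System.out.println(name + " is set");
            } catch (IllegalStateException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```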
The code sample — database-related dependencies
Note that, as required, a database table will be created automatically for you, as shown below.
The same goes for the Oracle Database 23ai JDBC Driver: its dependencies are already declared in the project.
The code sample — see it in action
You can now run the code sample. The block diagram below shows the sequence of operations that will be executed.
Note that the “--enable-preview” option must be provided when compiling and running the code sample. For macOS users: run with “-Doracle.jdbc.disablePipeline=false” to enable pipelined database calls.
Of course, feel free to debug it and set breakpoints at the key methods so you can follow the different operations.
First, it will connect to the database, fetch the text from the remote URL, and then insert the data, as shown below. A message will be presented when the table loading process has finished, e.g. Table loaded in 4.284 seconds.
Next, it will interact with the Oracle Cloud Infrastructure (OCI) GenAI Service to create the vector embeddings per the selected embedding model and then update the database records accordingly.
Finally, after all the asynchronous pipelined database operations, it will perform vector similarity searches with the following prompts:
- Predatory behavior of cats
- Location of bears
- Best climate for dogs
- Animals in ancient times
- Beautiful birds
- Deadly fish
- Where the wild things are
PipelineVectorDemo.java
Wrapping it up
That’s it! You learned how to combine Pipelined Database Operations with the Oracle Cloud Infrastructure Generative AI service to build a GenAI solution backed by enterprise-grade, mission-critical database features.
We’ll also cover other scenarios with the Oracle Database 23ai and the OCI GenAI Service. I hope you enjoyed this blog post, so stay tuned!
References
Support for Pipelined Database Operations
JEP 480: Structured Concurrency (Third Preview)
Oracle Cloud Infrastructure (OCI) — Generative AI Service
Oracle Database Free Release 23ai — Container Image
Oracle JDBC Driver 23ai — Maven Central
Oracle® Database JDBC Java API Reference, Release 23ai
Develop Java applications with Oracle Database
Quickstart: Connect to Oracle Database 23ai using IntelliJ IDEA