diff --git a/README.md b/README.md index ba8c95a..4a5ed06 100644 --- a/README.md +++ b/README.md @@ -11,6 +11,30 @@ make create-gke-cluster make bootstrap-flux2 ``` +## Building a chat service with Quarkus and OpenAI + +```bash +# use the Quarkus starter to create a service skeleton +# select the desired build system and dependencies +open https://code.quarkus.io + +# for local development use the following commands +cd openai-chat-service +export QUARKUS_LANGCHAIN4J_OPENAI_API_KEY=$OPENAI_API_KEY +./gradlew quarkusDev + +# interact with the service locally +http get localhost:8080/api/ask q=="Was macht QAware?" +http get localhost:8080/api/ask q=="What does QAware do?" +http get localhost:8080/api/ask q=="Was macht Microsoft?" +http get localhost:8080/api/ask q=="What is the sum of 40 and 2?" +http get localhost:8080/api/ask q=="What does QAware do? Send email to mlr@qaware.de with subject Information and response as message." + +# the following is managed by Flux2 +kubectl apply -k infrastructure/services/openai-chat-service/ +kubectl get all +``` + ## Building an OpenAI Proxy using Envoy The access to the OpenAI API is provided using a cluster internal Envoy based proxy. 
@@ -31,6 +55,29 @@ curl http://localhost:10000/v1/chat/completions \ }' ``` +## Building a chat service with Quarkus and Ollama + +```bash +# this is almost identical to the Quarkus and OpenAI instructions above +# the only difference: use +# 'io.quarkiverse.langchain4j:quarkus-langchain4j-ollama:0.22.0' +# instead of +# 'io.quarkiverse.langchain4j:quarkus-langchain4j-openai:0.22.0' + +# for local development use the following commands +ollama serve +ollama run llama3.1 + +cd ollama-chat-service +./gradlew quarkusDev + + + +# the following is managed by Flux2 +kubectl apply -k infrastructure/services/openai-chat-service/ +kubectl get all +``` + ## Deploying custom LLMs using Ollama Operator ```bash diff --git a/ollama-chat-service/.dockerignore b/ollama-chat-service/.dockerignore new file mode 100644 index 0000000..4361d2f --- /dev/null +++ b/ollama-chat-service/.dockerignore @@ -0,0 +1,5 @@ +* +!build/*-runner +!build/*-runner.jar +!build/lib/* +!build/quarkus-app/* \ No newline at end of file diff --git a/ollama-chat-service/.gitignore b/ollama-chat-service/.gitignore new file mode 100644 index 0000000..ba4fbcc --- /dev/null +++ b/ollama-chat-service/.gitignore @@ -0,0 +1,41 @@ +# Gradle +.gradle/ +build/ + +# Eclipse +.project +.classpath +.settings/ +bin/ + +# IntelliJ +.idea +*.ipr +*.iml +*.iws + +# NetBeans +nb-configuration.xml + +# Visual Studio Code +.vscode +.factorypath + +# OSX +.DS_Store + +# Vim +*.swp +*.swo + +# patch +*.orig +*.rej + +# Local environment +.env + +# Plugin directory +/.quarkus/cli/plugins/ +# TLS Certificates +.certs/ diff --git a/ollama-chat-service/build.gradle b/ollama-chat-service/build.gradle new file mode 100644 index 0000000..5c489dc --- /dev/null +++ b/ollama-chat-service/build.gradle @@ -0,0 +1,47 @@ +plugins { + id 'java' + id 'io.quarkus' +} + +repositories { + mavenCentral() + mavenLocal() +} + +dependencies { + implementation enforcedPlatform("${quarkusPlatformGroupId}:${quarkusPlatformArtifactId}:${quarkusPlatformVersion}") 
+ + implementation 'io.quarkus:quarkus-rest' + + implementation 'io.quarkus:quarkus-smallrye-health' + implementation 'io.quarkus:quarkus-smallrye-metrics' + + implementation 'io.quarkiverse.langchain4j:quarkus-langchain4j-core:0.22.0' + implementation 'io.quarkiverse.langchain4j:quarkus-langchain4j-ollama:0.22.0' + + implementation 'io.quarkus:quarkus-arc' + + testImplementation 'io.quarkus:quarkus-junit5' + testImplementation 'io.rest-assured:rest-assured' +} + +group 'de.qaware.demo' +version '1.0.0' + +java { + sourceCompatibility = JavaVersion.VERSION_21 + targetCompatibility = JavaVersion.VERSION_21 +} + +test { + systemProperty "java.util.logging.manager", "org.jboss.logmanager.LogManager" +} + +compileJava { + options.encoding = 'UTF-8' + options.compilerArgs << '-parameters' +} + +compileTestJava { + options.encoding = 'UTF-8' +} diff --git a/ollama-chat-service/gradle.properties b/ollama-chat-service/gradle.properties new file mode 100644 index 0000000..12b848c --- /dev/null +++ b/ollama-chat-service/gradle.properties @@ -0,0 +1,7 @@ +# Gradle properties + +quarkusPluginId=io.quarkus +quarkusPluginVersion=3.17.2 +quarkusPlatformGroupId=io.quarkus.platform +quarkusPlatformArtifactId=quarkus-bom +quarkusPlatformVersion=3.17.2 diff --git a/ollama-chat-service/gradle/wrapper/gradle-wrapper.jar b/ollama-chat-service/gradle/wrapper/gradle-wrapper.jar new file mode 100644 index 0000000..62d4c05 Binary files /dev/null and b/ollama-chat-service/gradle/wrapper/gradle-wrapper.jar differ diff --git a/ollama-chat-service/gradle/wrapper/gradle-wrapper.properties b/ollama-chat-service/gradle/wrapper/gradle-wrapper.properties new file mode 100644 index 0000000..19cfad9 --- /dev/null +++ b/ollama-chat-service/gradle/wrapper/gradle-wrapper.properties @@ -0,0 +1,5 @@ +distributionBase=GRADLE_USER_HOME +distributionPath=wrapper/dists +distributionUrl=https\://services.gradle.org/distributions/gradle-8.9-bin.zip +zipStoreBase=GRADLE_USER_HOME +zipStorePath=wrapper/dists diff 
--git a/ollama-chat-service/gradlew b/ollama-chat-service/gradlew new file mode 100755 index 0000000..fbd7c51 --- /dev/null +++ b/ollama-chat-service/gradlew @@ -0,0 +1,185 @@ +#!/usr/bin/env sh + +# +# Copyright 2015 the original author or authors. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# https://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +############################################################################## +## +## Gradle start up script for UN*X +## +############################################################################## + +# Attempt to set APP_HOME +# Resolve links: $0 may be a link +PRG="$0" +# Need this for relative symlinks. +while [ -h "$PRG" ] ; do + ls=`ls -ld "$PRG"` + link=`expr "$ls" : '.*-> \(.*\)$'` + if expr "$link" : '/.*' > /dev/null; then + PRG="$link" + else + PRG=`dirname "$PRG"`"/$link" + fi +done +SAVED="`pwd`" +cd "`dirname \"$PRG\"`/" >/dev/null +APP_HOME="`pwd -P`" +cd "$SAVED" >/dev/null + +APP_NAME="Gradle" +APP_BASE_NAME=`basename "$0"` + +# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. +DEFAULT_JVM_OPTS='"-Xmx64m" "-Xms64m"' + +# Use the maximum available, or set MAX_FD != -1 to use that value. +MAX_FD="maximum" + +warn () { + echo "$*" +} + +die () { + echo + echo "$*" + echo + exit 1 +} + +# OS specific support (must be 'true' or 'false'). 
+cygwin=false +msys=false +darwin=false +nonstop=false +case "`uname`" in + CYGWIN* ) + cygwin=true + ;; + Darwin* ) + darwin=true + ;; + MINGW* ) + msys=true + ;; + NONSTOP* ) + nonstop=true + ;; +esac + +CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar + + +# Determine the Java command to use to start the JVM. +if [ -n "$JAVA_HOME" ] ; then + if [ -x "$JAVA_HOME/jre/sh/java" ] ; then + # IBM's JDK on AIX uses strange locations for the executables + JAVACMD="$JAVA_HOME/jre/sh/java" + else + JAVACMD="$JAVA_HOME/bin/java" + fi + if [ ! -x "$JAVACMD" ] ; then + die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME + +Please set the JAVA_HOME variable in your environment to match the +location of your Java installation." + fi +else + JAVACMD="java" + which java >/dev/null 2>&1 || die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. + +Please set the JAVA_HOME variable in your environment to match the +location of your Java installation." +fi + +# Increase the maximum file descriptors if we can. +if [ "$cygwin" = "false" -a "$darwin" = "false" -a "$nonstop" = "false" ] ; then + MAX_FD_LIMIT=`ulimit -H -n` + if [ $? -eq 0 ] ; then + if [ "$MAX_FD" = "maximum" -o "$MAX_FD" = "max" ] ; then + MAX_FD="$MAX_FD_LIMIT" + fi + ulimit -n $MAX_FD + if [ $? 
-ne 0 ] ; then + warn "Could not set maximum file descriptor limit: $MAX_FD" + fi + else + warn "Could not query maximum file descriptor limit: $MAX_FD_LIMIT" + fi +fi + +# For Darwin, add options to specify how the application appears in the dock +if $darwin; then + GRADLE_OPTS="$GRADLE_OPTS \"-Xdock:name=$APP_NAME\" \"-Xdock:icon=$APP_HOME/media/gradle.icns\"" +fi + +# For Cygwin or MSYS, switch paths to Windows format before running java +if [ "$cygwin" = "true" -o "$msys" = "true" ] ; then + APP_HOME=`cygpath --path --mixed "$APP_HOME"` + CLASSPATH=`cygpath --path --mixed "$CLASSPATH"` + + JAVACMD=`cygpath --unix "$JAVACMD"` + + # We build the pattern for arguments to be converted via cygpath + ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null` + SEP="" + for dir in $ROOTDIRSRAW ; do + ROOTDIRS="$ROOTDIRS$SEP$dir" + SEP="|" + done + OURCYGPATTERN="(^($ROOTDIRS))" + # Add a user-defined pattern to the cygpath arguments + if [ "$GRADLE_CYGPATTERN" != "" ] ; then + OURCYGPATTERN="$OURCYGPATTERN|($GRADLE_CYGPATTERN)" + fi + # Now convert the arguments - kludge to limit ourselves to /bin/sh + i=0 + for arg in "$@" ; do + CHECK=`echo "$arg"|egrep -c "$OURCYGPATTERN" -` + CHECK2=`echo "$arg"|egrep -c "^-"` ### Determine if an option + + if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then ### Added a condition + eval `echo args$i`=`cygpath --path --ignore --mixed "$arg"` + else + eval `echo args$i`="\"$arg\"" + fi + i=`expr $i + 1` + done + case $i in + 0) set -- ;; + 1) set -- "$args0" ;; + 2) set -- "$args0" "$args1" ;; + 3) set -- "$args0" "$args1" "$args2" ;; + 4) set -- "$args0" "$args1" "$args2" "$args3" ;; + 5) set -- "$args0" "$args1" "$args2" "$args3" "$args4" ;; + 6) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" ;; + 7) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" ;; + 8) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" ;; + 9) set -- "$args0" "$args1" "$args2" "$args3" 
"$args4" "$args5" "$args6" "$args7" "$args8" ;; + esac +fi + +# Escape application args +save () { + for i do printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/" ; done + echo " " +} +APP_ARGS=`save "$@"` + +# Collect all arguments for the java command, following the shell quoting and substitution rules +eval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS "\"-Dorg.gradle.appname=$APP_BASE_NAME\"" -classpath "\"$CLASSPATH\"" org.gradle.wrapper.GradleWrapperMain "$APP_ARGS" + +exec "$JAVACMD" "$@" diff --git a/ollama-chat-service/gradlew.bat b/ollama-chat-service/gradlew.bat new file mode 100755 index 0000000..a9f778a --- /dev/null +++ b/ollama-chat-service/gradlew.bat @@ -0,0 +1,104 @@ +@rem +@rem Copyright 2015 the original author or authors. +@rem +@rem Licensed under the Apache License, Version 2.0 (the "License"); +@rem you may not use this file except in compliance with the License. +@rem You may obtain a copy of the License at +@rem +@rem https://www.apache.org/licenses/LICENSE-2.0 +@rem +@rem Unless required by applicable law or agreed to in writing, software +@rem distributed under the License is distributed on an "AS IS" BASIS, +@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +@rem See the License for the specific language governing permissions and +@rem limitations under the License. +@rem + +@if "%DEBUG%" == "" @echo off +@rem ########################################################################## +@rem +@rem Gradle startup script for Windows +@rem +@rem ########################################################################## + +@rem Set local scope for the variables with windows NT shell +if "%OS%"=="Windows_NT" setlocal + +set DIRNAME=%~dp0 +if "%DIRNAME%" == "" set DIRNAME=. +set APP_BASE_NAME=%~n0 +set APP_HOME=%DIRNAME% + +@rem Resolve any "." and ".." in APP_HOME to make it shorter. +for %%i in ("%APP_HOME%") do set APP_HOME=%%~fi + +@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. +set DEFAULT_JVM_OPTS="-Xmx64m" "-Xms64m" + +@rem Find java.exe +if defined JAVA_HOME goto findJavaFromJavaHome + +set JAVA_EXE=java.exe +%JAVA_EXE% -version >NUL 2>&1 +if "%ERRORLEVEL%" == "0" goto init + +echo. +echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. +echo. +echo Please set the JAVA_HOME variable in your environment to match the +echo location of your Java installation. + +goto fail + +:findJavaFromJavaHome +set JAVA_HOME=%JAVA_HOME:"=% +set JAVA_EXE=%JAVA_HOME%/bin/java.exe + +if exist "%JAVA_EXE%" goto init + +echo. +echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME% +echo. +echo Please set the JAVA_HOME variable in your environment to match the +echo location of your Java installation. + +goto fail + +:init +@rem Get command-line arguments, handling Windows variants + +if not "%OS%" == "Windows_NT" goto win9xME_args + +:win9xME_args +@rem Slurp the command line arguments. +set CMD_LINE_ARGS= +set _SKIP=2 + +:win9xME_args_slurp +if "x%~1" == "x" goto execute + +set CMD_LINE_ARGS=%* + +:execute +@rem Setup the command line + +set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar + + +@rem Execute Gradle +"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %CMD_LINE_ARGS% + +:end +@rem End local scope for the variables with windows NT shell +if "%ERRORLEVEL%"=="0" goto mainEnd + +:fail +rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of +rem the _cmd.exe /c_ return code! 
+if not "" == "%GRADLE_EXIT_CONSOLE%" exit 1 +exit /b 1 + +:mainEnd +if "%OS%"=="Windows_NT" endlocal + +:omega diff --git a/ollama-chat-service/settings.gradle b/ollama-chat-service/settings.gradle new file mode 100644 index 0000000..d02b96a --- /dev/null +++ b/ollama-chat-service/settings.gradle @@ -0,0 +1,12 @@ +pluginManagement { + repositories { + mavenCentral() + gradlePluginPortal() + mavenLocal() + } + plugins { + id "${quarkusPluginId}" version "${quarkusPluginVersion}" + } +} + +rootProject.name='ollama-chat-service' diff --git a/ollama-chat-service/src/main/docker/Dockerfile.jvm b/ollama-chat-service/src/main/docker/Dockerfile.jvm new file mode 100644 index 0000000..e7d17f7 --- /dev/null +++ b/ollama-chat-service/src/main/docker/Dockerfile.jvm @@ -0,0 +1,97 @@ +#### +# This Dockerfile is used in order to build a container that runs the Quarkus application in JVM mode +# +# Before building the container image run: +# +# ./gradlew build +# +# Then, build the image with: +# +# docker build -f src/main/docker/Dockerfile.jvm -t quarkus/java-ai-ollama-service-jvm . +# +# Then run the container using: +# +# docker run -i --rm -p 8080:8080 quarkus/java-ai-ollama-service-jvm +# +# If you want to include the debug port into your docker image +# you will have to expose the debug port (default 5005 being the default) like this : EXPOSE 8080 5005. +# Additionally you will have to set -e JAVA_DEBUG=true and -e JAVA_DEBUG_PORT=*:5005 +# when running the container +# +# Then run the container using : +# +# docker run -i --rm -p 8080:8080 quarkus/java-ai-ollama-service-jvm +# +# This image uses the `run-java.sh` script to run the application. +# This scripts computes the command line to execute your Java application, and +# includes memory/GC tuning. 
+# You can configure the behavior using the following environment properties: +# - JAVA_OPTS: JVM options passed to the `java` command (example: "-verbose:class") +# - JAVA_OPTS_APPEND: User specified Java options to be appended to generated options +# in JAVA_OPTS (example: "-Dsome.property=foo") +# - JAVA_MAX_MEM_RATIO: Is used when no `-Xmx` option is given in JAVA_OPTS. This is +# used to calculate a default maximal heap memory based on a containers restriction. +# If used in a container without any memory constraints for the container then this +# option has no effect. If there is a memory constraint then `-Xmx` is set to a ratio +# of the container available memory as set here. The default is `50` which means 50% +# of the available memory is used as an upper boundary. You can skip this mechanism by +# setting this value to `0` in which case no `-Xmx` option is added. +# - JAVA_INITIAL_MEM_RATIO: Is used when no `-Xms` option is given in JAVA_OPTS. This +# is used to calculate a default initial heap memory based on the maximum heap memory. +# If used in a container without any memory constraints for the container then this +# option has no effect. If there is a memory constraint then `-Xms` is set to a ratio +# of the `-Xmx` memory as set here. The default is `25` which means 25% of the `-Xmx` +# is used as the initial heap size. You can skip this mechanism by setting this value +# to `0` in which case no `-Xms` option is added (example: "25") +# - JAVA_MAX_INITIAL_MEM: Is used when no `-Xms` option is given in JAVA_OPTS. +# This is used to calculate the maximum value of the initial heap memory. If used in +# a container without any memory constraints for the container then this option has +# no effect. If there is a memory constraint then `-Xms` is limited to the value set +# here. The default is 4096MB which means the calculated value of `-Xms` never will +# be greater than 4096MB. 
The value of this variable is expressed in MB (example: "4096") +# - JAVA_DIAGNOSTICS: Set this to get some diagnostics information to standard output +# when things are happening. This option, if set to true, will set +# `-XX:+UnlockDiagnosticVMOptions`. Disabled by default (example: "true"). +# - JAVA_DEBUG: If set remote debugging will be switched on. Disabled by default (example: +# true"). +# - JAVA_DEBUG_PORT: Port used for remote debugging. Defaults to 5005 (example: "8787"). +# - CONTAINER_CORE_LIMIT: A calculated core limit as described in +# https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt. (example: "2") +# - CONTAINER_MAX_MEMORY: Memory limit given to the container (example: "1024"). +# - GC_MIN_HEAP_FREE_RATIO: Minimum percentage of heap free after GC to avoid expansion. +# (example: "20") +# - GC_MAX_HEAP_FREE_RATIO: Maximum percentage of heap free after GC to avoid shrinking. +# (example: "40") +# - GC_TIME_RATIO: Specifies the ratio of the time spent outside the garbage collection. +# (example: "4") +# - GC_ADAPTIVE_SIZE_POLICY_WEIGHT: The weighting given to the current GC time versus +# previous GC times. (example: "90") +# - GC_METASPACE_SIZE: The initial metaspace size. (example: "20") +# - GC_MAX_METASPACE_SIZE: The maximum metaspace size. (example: "100") +# - GC_CONTAINER_OPTIONS: Specify Java GC to use. The value of this variable should +# contain the necessary JRE command-line options to specify the required GC, which +# will override the default of `-XX:+UseParallelGC` (example: -XX:+UseG1GC). +# - HTTPS_PROXY: The location of the https proxy. (example: "myuser@127.0.0.1:8080") +# - HTTP_PROXY: The location of the http proxy. (example: "myuser@127.0.0.1:8080") +# - NO_PROXY: A comma separated lists of hosts, IP addresses or domains that can be +# accessed directly. 
(example: "foo.example.com,bar.example.com") +# +### +FROM registry.access.redhat.com/ubi8/openjdk-21:1.20 + +ENV LANGUAGE='en_US:en' + + +# We make four distinct layers so if there are application changes the library layers can be re-used +COPY --chown=185 build/quarkus-app/lib/ /deployments/lib/ +COPY --chown=185 build/quarkus-app/*.jar /deployments/ +COPY --chown=185 build/quarkus-app/app/ /deployments/app/ +COPY --chown=185 build/quarkus-app/quarkus/ /deployments/quarkus/ + +EXPOSE 8080 +USER 185 +ENV JAVA_OPTS_APPEND="-Dquarkus.http.host=0.0.0.0 -Djava.util.logging.manager=org.jboss.logmanager.LogManager" +ENV JAVA_APP_JAR="/deployments/quarkus-run.jar" + +ENTRYPOINT [ "/opt/jboss/container/java/run/run-java.sh" ] + diff --git a/ollama-chat-service/src/main/docker/Dockerfile.legacy-jar b/ollama-chat-service/src/main/docker/Dockerfile.legacy-jar new file mode 100644 index 0000000..e35cd3e --- /dev/null +++ b/ollama-chat-service/src/main/docker/Dockerfile.legacy-jar @@ -0,0 +1,93 @@ +#### +# This Dockerfile is used in order to build a container that runs the Quarkus application in JVM mode +# +# Before building the container image run: +# +# ./gradlew build -Dquarkus.package.jar.type=legacy-jar +# +# Then, build the image with: +# +# docker build -f src/main/docker/Dockerfile.legacy-jar -t quarkus/java-ai-ollama-service-legacy-jar . +# +# Then run the container using: +# +# docker run -i --rm -p 8080:8080 quarkus/java-ai-ollama-service-legacy-jar +# +# If you want to include the debug port into your docker image +# you will have to expose the debug port (default 5005 being the default) like this : EXPOSE 8080 5005. +# Additionally you will have to set -e JAVA_DEBUG=true and -e JAVA_DEBUG_PORT=*:5005 +# when running the container +# +# Then run the container using : +# +# docker run -i --rm -p 8080:8080 quarkus/java-ai-ollama-service-legacy-jar +# +# This image uses the `run-java.sh` script to run the application. 
+# This scripts computes the command line to execute your Java application, and +# includes memory/GC tuning. +# You can configure the behavior using the following environment properties: +# - JAVA_OPTS: JVM options passed to the `java` command (example: "-verbose:class") +# - JAVA_OPTS_APPEND: User specified Java options to be appended to generated options +# in JAVA_OPTS (example: "-Dsome.property=foo") +# - JAVA_MAX_MEM_RATIO: Is used when no `-Xmx` option is given in JAVA_OPTS. This is +# used to calculate a default maximal heap memory based on a containers restriction. +# If used in a container without any memory constraints for the container then this +# option has no effect. If there is a memory constraint then `-Xmx` is set to a ratio +# of the container available memory as set here. The default is `50` which means 50% +# of the available memory is used as an upper boundary. You can skip this mechanism by +# setting this value to `0` in which case no `-Xmx` option is added. +# - JAVA_INITIAL_MEM_RATIO: Is used when no `-Xms` option is given in JAVA_OPTS. This +# is used to calculate a default initial heap memory based on the maximum heap memory. +# If used in a container without any memory constraints for the container then this +# option has no effect. If there is a memory constraint then `-Xms` is set to a ratio +# of the `-Xmx` memory as set here. The default is `25` which means 25% of the `-Xmx` +# is used as the initial heap size. You can skip this mechanism by setting this value +# to `0` in which case no `-Xms` option is added (example: "25") +# - JAVA_MAX_INITIAL_MEM: Is used when no `-Xms` option is given in JAVA_OPTS. +# This is used to calculate the maximum value of the initial heap memory. If used in +# a container without any memory constraints for the container then this option has +# no effect. If there is a memory constraint then `-Xms` is limited to the value set +# here. 
The default is 4096MB which means the calculated value of `-Xms` never will +# be greater than 4096MB. The value of this variable is expressed in MB (example: "4096") +# - JAVA_DIAGNOSTICS: Set this to get some diagnostics information to standard output +# when things are happening. This option, if set to true, will set +# `-XX:+UnlockDiagnosticVMOptions`. Disabled by default (example: "true"). +# - JAVA_DEBUG: If set remote debugging will be switched on. Disabled by default (example: +# true"). +# - JAVA_DEBUG_PORT: Port used for remote debugging. Defaults to 5005 (example: "8787"). +# - CONTAINER_CORE_LIMIT: A calculated core limit as described in +# https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt. (example: "2") +# - CONTAINER_MAX_MEMORY: Memory limit given to the container (example: "1024"). +# - GC_MIN_HEAP_FREE_RATIO: Minimum percentage of heap free after GC to avoid expansion. +# (example: "20") +# - GC_MAX_HEAP_FREE_RATIO: Maximum percentage of heap free after GC to avoid shrinking. +# (example: "40") +# - GC_TIME_RATIO: Specifies the ratio of the time spent outside the garbage collection. +# (example: "4") +# - GC_ADAPTIVE_SIZE_POLICY_WEIGHT: The weighting given to the current GC time versus +# previous GC times. (example: "90") +# - GC_METASPACE_SIZE: The initial metaspace size. (example: "20") +# - GC_MAX_METASPACE_SIZE: The maximum metaspace size. (example: "100") +# - GC_CONTAINER_OPTIONS: Specify Java GC to use. The value of this variable should +# contain the necessary JRE command-line options to specify the required GC, which +# will override the default of `-XX:+UseParallelGC` (example: -XX:+UseG1GC). +# - HTTPS_PROXY: The location of the https proxy. (example: "myuser@127.0.0.1:8080") +# - HTTP_PROXY: The location of the http proxy. (example: "myuser@127.0.0.1:8080") +# - NO_PROXY: A comma separated lists of hosts, IP addresses or domains that can be +# accessed directly. 
(example: "foo.example.com,bar.example.com") +# +### +FROM registry.access.redhat.com/ubi8/openjdk-21:1.20 + +ENV LANGUAGE='en_US:en' + + +COPY build/lib/* /deployments/lib/ +COPY build/*-runner.jar /deployments/quarkus-run.jar + +EXPOSE 8080 +USER 185 +ENV JAVA_OPTS_APPEND="-Dquarkus.http.host=0.0.0.0 -Djava.util.logging.manager=org.jboss.logmanager.LogManager" +ENV JAVA_APP_JAR="/deployments/quarkus-run.jar" + +ENTRYPOINT [ "/opt/jboss/container/java/run/run-java.sh" ] diff --git a/ollama-chat-service/src/main/docker/Dockerfile.native b/ollama-chat-service/src/main/docker/Dockerfile.native new file mode 100644 index 0000000..9be381a --- /dev/null +++ b/ollama-chat-service/src/main/docker/Dockerfile.native @@ -0,0 +1,27 @@ +#### +# This Dockerfile is used in order to build a container that runs the Quarkus application in native (no JVM) mode. +# +# Before building the container image run: +# +# ./gradlew build -Dquarkus.native.enabled=true +# +# Then, build the image with: +# +# docker build -f src/main/docker/Dockerfile.native -t quarkus/java-ai-ollama-service . +# +# Then run the container using: +# +# docker run -i --rm -p 8080:8080 quarkus/java-ai-ollama-service +# +### +FROM registry.access.redhat.com/ubi8/ubi-minimal:8.10 +WORKDIR /work/ +RUN chown 1001 /work \ + && chmod "g+rwX" /work \ + && chown 1001:root /work +COPY --chown=1001:root build/*-runner /work/application + +EXPOSE 8080 +USER 1001 + +ENTRYPOINT ["./application", "-Dquarkus.http.host=0.0.0.0"] diff --git a/ollama-chat-service/src/main/docker/Dockerfile.native-micro b/ollama-chat-service/src/main/docker/Dockerfile.native-micro new file mode 100644 index 0000000..6594c7f --- /dev/null +++ b/ollama-chat-service/src/main/docker/Dockerfile.native-micro @@ -0,0 +1,30 @@ +#### +# This Dockerfile is used in order to build a container that runs the Quarkus application in native (no JVM) mode. +# It uses a micro base image, tuned for Quarkus native executables. 
+# It reduces the size of the resulting container image. +# Check https://quarkus.io/guides/quarkus-runtime-base-image for further information about this image. +# +# Before building the container image run: +# +# ./gradlew build -Dquarkus.native.enabled=true +# +# Then, build the image with: +# +# docker build -f src/main/docker/Dockerfile.native-micro -t quarkus/java-ai-ollama-service . +# +# Then run the container using: +# +# docker run -i --rm -p 8080:8080 quarkus/java-ai-ollama-service +# +### +FROM quay.io/quarkus/quarkus-micro-image:2.0 +WORKDIR /work/ +RUN chown 1001 /work \ + && chmod "g+rwX" /work \ + && chown 1001:root /work +COPY --chown=1001:root build/*-runner /work/application + +EXPOSE 8080 +USER 1001 + +ENTRYPOINT ["./application", "-Dquarkus.http.host=0.0.0.0"] diff --git a/ollama-chat-service/src/main/java/de/qaware/demo/ChatBot.java b/ollama-chat-service/src/main/java/de/qaware/demo/ChatBot.java new file mode 100644 index 0000000..464d345 --- /dev/null +++ b/ollama-chat-service/src/main/java/de/qaware/demo/ChatBot.java @@ -0,0 +1,24 @@ +package de.qaware.demo; + +import dev.langchain4j.service.MemoryId; +import dev.langchain4j.service.SystemMessage; +import dev.langchain4j.service.UserMessage; +import io.quarkiverse.langchain4j.RegisterAiService; + +@RegisterAiService( + chatMemoryProviderSupplier = ChatBotMemoryProvider.class +) +public interface ChatBot { + + @SystemMessage(""" + You are an AI named Bob. + Your response must be polite, use the same language as the question, and be relevant to the question. + """) + String ask(@UserMessage String question); + + @SystemMessage(""" + You are an AI named Bob. + Your responses must be polite, use the same language as the question, and be relevant to the questions. 
+ """) + String chat(@MemoryId int memoryId, @UserMessage String message); +} \ No newline at end of file diff --git a/ollama-chat-service/src/main/java/de/qaware/demo/ChatBotMemoryProvider.java b/ollama-chat-service/src/main/java/de/qaware/demo/ChatBotMemoryProvider.java new file mode 100644 index 0000000..193e47e --- /dev/null +++ b/ollama-chat-service/src/main/java/de/qaware/demo/ChatBotMemoryProvider.java @@ -0,0 +1,27 @@ +package de.qaware.demo; + +import java.util.function.Supplier; + +import dev.langchain4j.memory.ChatMemory; +import dev.langchain4j.memory.chat.ChatMemoryProvider; +import dev.langchain4j.memory.chat.MessageWindowChatMemory; +import dev.langchain4j.store.memory.chat.InMemoryChatMemoryStore; + +public class ChatBotMemoryProvider implements Supplier<ChatMemoryProvider> { + private final InMemoryChatMemoryStore store = new InMemoryChatMemoryStore(); + + @Override + public ChatMemoryProvider get() { + return new ChatMemoryProvider() { + + @Override + public ChatMemory get(Object memoryId) { + return MessageWindowChatMemory.builder() + .maxMessages(42) + .id(memoryId) + .chatMemoryStore(store) + .build(); + } + }; + } +} diff --git a/ollama-chat-service/src/main/java/de/qaware/demo/ChatBotResource.java b/ollama-chat-service/src/main/java/de/qaware/demo/ChatBotResource.java new file mode 100644 index 0000000..2edd3bf --- /dev/null +++ b/ollama-chat-service/src/main/java/de/qaware/demo/ChatBotResource.java @@ -0,0 +1,30 @@ +package de.qaware.demo; + +import dev.langchain4j.agent.tool.P; +import jakarta.inject.Inject; +import jakarta.ws.rs.GET; +import jakarta.ws.rs.Path; +import jakarta.ws.rs.Produces; +import jakarta.ws.rs.QueryParam; +import jakarta.ws.rs.core.MediaType; + +@Path("/api") +public class ChatBotResource { + + @Inject + ChatBot bot; + + @GET + @Path("/ask") + @Produces(MediaType.TEXT_PLAIN) + public String ask(@QueryParam("q") String question) { + return bot.ask(question); + } + + @GET + @Path("/chat") + @Produces(MediaType.TEXT_PLAIN) + public String 
chat(@QueryParam("id") int id, @QueryParam("msg") String message) { + return bot.chat(id, message); + } +} diff --git a/ollama-chat-service/src/main/java/de/qaware/demo/ChatBotTools.java b/ollama-chat-service/src/main/java/de/qaware/demo/ChatBotTools.java new file mode 100644 index 0000000..5167b3f --- /dev/null +++ b/ollama-chat-service/src/main/java/de/qaware/demo/ChatBotTools.java @@ -0,0 +1,20 @@ +package de.qaware.demo; + +import dev.langchain4j.agent.tool.Tool; +import io.quarkus.logging.Log; +import jakarta.enterprise.context.ApplicationScoped; + +@ApplicationScoped +public class ChatBotTools { + + @Tool("send email to address with subject and message") + public void sendEmail(String address, String subject, String message) { + Log.infof("Sending email to %s about %s", address, subject); + } + + @Tool("add a and b") + int sum(int a, int b) { + Log.infof("Adding %s and %s with tool.", a, b); + return a + b; + } +} diff --git a/ollama-chat-service/src/main/resources/application.properties b/ollama-chat-service/src/main/resources/application.properties new file mode 100644 index 0000000..641b012 --- /dev/null +++ b/ollama-chat-service/src/main/resources/application.properties @@ -0,0 +1,7 @@ +quarkus.langchain4j.ollama.chat-model.model-id=llama3.1 +# quarkus.langchain4j.ollama.base-url=http://localhost:11434 + +quarkus.langchain4j.ollama.log-requests=true +quarkus.langchain4j.ollama.log-responses=true + +quarkus.management.enabled=true \ No newline at end of file diff --git a/ollama-chat-service/src/native-test/java/de/qaware/demo/ChatBotResourceIT.java b/ollama-chat-service/src/native-test/java/de/qaware/demo/ChatBotResourceIT.java new file mode 100644 index 0000000..f3b36c2 --- /dev/null +++ b/ollama-chat-service/src/native-test/java/de/qaware/demo/ChatBotResourceIT.java @@ -0,0 +1,7 @@ +package de.qaware.demo; + +import io.quarkus.test.junit.QuarkusIntegrationTest; + +@QuarkusIntegrationTest +class ChatBotResourceIT extends ChatBotResourceTest { +} diff 
--git a/ollama-chat-service/src/test/java/de/qaware/demo/ChatBotResourceTest.java b/ollama-chat-service/src/test/java/de/qaware/demo/ChatBotResourceTest.java new file mode 100644 index 0000000..f5ddb86 --- /dev/null +++ b/ollama-chat-service/src/test/java/de/qaware/demo/ChatBotResourceTest.java @@ -0,0 +1,21 @@ +package de.qaware.demo; + +import io.quarkus.test.junit.QuarkusTest; +import org.junit.jupiter.api.Test; + +import static io.restassured.RestAssured.given; +import static org.hamcrest.CoreMatchers.is; +import static org.hamcrest.CoreMatchers.notNullValue; + +@QuarkusTest +class ChatBotResourceTest { + @Test + void testAskEndpoint() { + given() + .when().queryParam("q", "What is QAware GmbH?") + .get("/api/ask") + .then() + .statusCode(200) + .body(is(notNullValue())); + } +} \ No newline at end of file diff --git a/openai-chat-service/README.md b/openai-chat-service/README.md deleted file mode 100644 index a131463..0000000 --- a/openai-chat-service/README.md +++ /dev/null @@ -1,95 +0,0 @@ -# java-ai-openai-service - -This project uses Quarkus, the Supersonic Subatomic Java Framework. - -If you want to learn more about Quarkus, please visit its website: . - -## Running the application in dev mode - -You can run your application in dev mode that enables live coding using: - -```shell script -./gradlew quarkusDev -``` - -> **_NOTE:_** Quarkus now ships with a Dev UI, which is available in dev mode only at . - -## Packaging and running the application - -The application can be packaged using: - -```shell script -./gradlew build -``` - -It produces the `quarkus-run.jar` file in the `build/quarkus-app/` directory. -Be aware that it’s not an _über-jar_ as the dependencies are copied into the `build/quarkus-app/lib/` directory. - -The application is now runnable using `java -jar build/quarkus-app/quarkus-run.jar`. 
- -If you want to build an _über-jar_, execute the following command: - -```shell script -./gradlew build -Dquarkus.package.jar.type=uber-jar -``` - -The application, packaged as an _über-jar_, is now runnable using `java -jar build/*-runner.jar`. - -## Creating a native executable - -You can create a native executable using: - -```shell script -./gradlew build -Dquarkus.native.enabled=true -``` - -Or, if you don't have GraalVM installed, you can run the native executable build in a container using: - -```shell script -./gradlew build -Dquarkus.native.enabled=true -Dquarkus.native.container-build=true -``` - -You can then execute your native executable with: `./build/java-ai-openai-service-1.0.0-SNAPSHOT-runner` - -If you want to learn more about building native executables, please consult . - -## Related Guides - -- REST ([guide](https://quarkus.io/guides/rest)): A Jakarta REST implementation utilizing build time processing and Vert.x. This extension is not compatible with the quarkus-resteasy extension, or any of the extensions that depend on it. -- LangChain4j Easy RAG ([guide](https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html)): Provides the Easy RAG functionality with LangChain4j -- Quarkus LangChain4j pgvector embedding store ([guide](https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html)): Provides the pgvector Embedding store for Quarkus LangChain4j -- LangChain4j ([guide](https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html)): Provides the basic integration with LangChain4j -- LangChain4j Ollama ([guide](https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html)): Provides the basic integration of Ollama with LangChain4j -- LangChain4j OpenAI ([guide](https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html)): Provides the basic integration with LangChain4j - -## Provided Code - -### LangChain4j Easy RAG - -This code is a very basic sample service to start developing with Quarkus LangChain4j using Easy RAG. 
- -This code is set up to use OpenAI as the LLM, thus you need to set the `QUARKUS_LANGCHAIN4J_OPENAI_API_KEY` environment variable to your OpenAI API key. - -In `./easy-rag-catalog/` you can find a set of example documents that will be used to create the RAG index which the bot (`src/main/java/org/acme/Bot.java`) will ingest. - -On first run, the bot will create the RAG index and store it in `easy-rag-catalog.json` file and reuse it on subsequent runs. -This can be disabled by setting the `quarkus.langchain4j.easy-rag.reuse-embeddings.enabled` property to `false`. - -Add it to a Rest endpoint: -```java - @Inject - Bot bot; - - @POST - @Produces(MediaType.TEXT_PLAIN) - public String chat(String q) { - return bot.chat(q); - } -``` - -In a more complete example, you would have a web interface and use websockets that would provide more interactive experience, see [ChatBot Easy RAG Sample](https://github.com/quarkiverse/quarkus-langchain4j/tree/main/samples/chatbot-easy-rag) for such an example. -### REST - -Easily start your REST Web Services - -[Related guide section...](https://quarkus.io/guides/getting-started-reactive#reactive-jax-rs-resources)