The simplest way to deploy all the components on OpenShift is to log in using `oc`, e.g.:
$ oc login -u <USER>
Next, create a project for this demo. We will use `ccfd` (Credit Card Fraud Detection).
$ oc new-project ccfd
We start by installing Open Data Hub via its operator. First, clone the operator repository:
$ git clone https://gitlab.com/opendatahub/opendatahub-operator
$ cd opendatahub-operator
Next, we deploy the ODH and Seldon CRDs:¹
$ oc create -f deploy/crds/opendatahub_v1alpha1_opendatahub_crd.yaml
$ oc create -f deploy/crds/seldon-deployment-crd.yaml
Next, create the services and the RBAC policy for the service account the operator will run as. This step requires, at a minimum, namespace admin rights.
$ oc create -f deploy/service_account.yaml
$ oc create -f deploy/role.yaml
$ oc create -f deploy/role_binding.yaml
$ oc adm policy add-role-to-user admin -z opendatahub-operator
Now we can deploy the operator with:
$ oc create -f deploy/operator.yaml
Wait until the pods are ready before continuing. You can verify with:
$ oc get pods
Strimzi is used to provide Apache Kafka on OpenShift.
Start by making a copy of `deploy/crds/opendatahub_v1alpha1_opendatahub_cr.yaml`, e.g.:
$ cp deploy/crds/opendatahub_v1alpha1_opendatahub_cr.yaml frauddetection_cr.yaml
and edit the following values:
# Seldon Deployment
seldon:
  odh_deploy: true
kafka:
  odh_deploy: true
  kafka_cluster_name: odh-message-bus
  kafka_broker_replicas: 3
  kafka_zookeeper_replicas: 3
Kafka installation requires some extra configuration. Add your username to the `kafka_admins` list by editing `deploy/kafka/vars/vars.yaml`:
kafka_admins:
- admin
- system:serviceaccount:<PROJECT>:opendatahub-operator
- <INSERT USERNAME>
You can now deploy Kafka using:
$ cd deploy/kafka/
$ pipenv install
$ pipenv run ansible-playbook deploy_kafka_operator.yaml \
-e kubeconfig=$HOME/.kube/config \
-e NAMESPACE=<PROJECT>
Deploy the ODH custom resource based on the sample template:
$ oc create -f frauddetection_cr.yaml
This installation of Rook-Ceph assumes your OCP 3.11/4.x cluster has at least 3 worker nodes. We will download the Rook Ceph v0.9.3 files and modify them directly: clone the Rook operator repository and check out the v0.9.3 tag. For convenience, the modified files are also included in this repository.
$ git clone https://github.com/rook/rook.git
$ cd rook
$ git checkout -b rook-0.9.3 v0.9.3
$ cd cluster/examples/kubernetes/ceph/
Edit `operator.yaml` and set the environment variables `FLEXVOLUME_DIR_PATH` and `ROOK_HOSTPATH_REQUIRES_PRIVILEGED` to allow the Rook operator to use OpenShift hostpath storage:
- name: FLEXVOLUME_DIR_PATH
  value: "/etc/kubernetes/kubelet-plugins/volume/exec"
- name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
  value: "true"
The following steps require cluster-wide permissions. Configure the necessary security contexts and deploy the Rook operator. This will create a new namespace, `rook-ceph-system`, and deploy the pods in it.
$ oc create -f scc.yaml # configure security context
$ oc create -f operator.yaml # deploy operator
You can verify that the deployment is progressing with:
$ oc get pods -n rook-ceph-system
NAME READY STATUS RESTARTS AGE
rook-ceph-agent-j4zms 1/1 Running 0 33m
rook-ceph-agent-qghgc 1/1 Running 0 33m
rook-ceph-agent-tjzv6 1/1 Running 0 33m
rook-ceph-operator-567f8cbb6-f5rsj 1/1 Running 0 33m
rook-discover-gghsw 1/1 Running 0 33m
rook-discover-jd226 1/1 Running 0 33m
rook-discover-lgfrx 1/1 Running 0 33m
Once the operator is ready, you can create a Ceph cluster and a Ceph object service. The toolbox service is also handy to deploy, for checking the health of the Ceph cluster.
This step takes a couple of minutes, so please be patient.
$ oc create -f cluster.yaml
Check the pods and wait for them to be ready before proceeding.
$ oc get pods -n rook-ceph
rook-ceph-mgr-a-66db78887f-5pt7l 1/1 Running 0 108s
rook-ceph-mon-a-69c8b55966-mtb47 1/1 Running 0 3m19s
rook-ceph-mon-b-59699948-4zszh 1/1 Running 0 2m44s
rook-ceph-mon-c-58f4744f76-r8prn 1/1 Running 0 2m11s
rook-ceph-osd-0-764bbd9694-nxjpz 1/1 Running 0 75s
rook-ceph-osd-1-85c8df76d7-5bdr7 1/1 Running 0 74s
rook-ceph-osd-2-8564b87d6c-lcjx2 1/1 Running 0 74s
rook-ceph-osd-prepare-ip-10-0-136-154-mzf66 0/2 Completed 0 87s
rook-ceph-osd-prepare-ip-10-0-153-32-prf94 0/2 Completed 0 87s
rook-ceph-osd-prepare-ip-10-0-175-183-xt4jm 0/2 Completed 0 87s
Edit `object.yaml` and replace port `80` with `8080`:
gateway:
  # type of the gateway (s3)
  type: s3
  # A reference to the secret in the rook namespace where the ssl certificate is stored
  sslCertificateRef:
  # The port that RGW pods will listen on (http)
  port: 8080
And then run:
$ oc create -f toolbox.yaml
$ oc create -f object.yaml
You can check the deployment progress, as previously, with:
$ oc get pods -n rook-ceph
rook-ceph-mgr-a-5b6fcf7c6-cx676 1/1 Running 0 6m56s
rook-ceph-mon-a-54d9bc6c97-kvfv6 1/1 Running 0 8m38s
rook-ceph-mon-b-74699bf79f-2xlzz 1/1 Running 0 8m22s
rook-ceph-mon-c-5c54856487-769fx 1/1 Running 0 7m47s
rook-ceph-osd-0-7f4c45fbcd-7g8hr 1/1 Running 0 6m16s
rook-ceph-osd-1-55855bf495-dlfpf 1/1 Running 0 6m15s
rook-ceph-osd-2-776c77657c-sgf5n 1/1 Running 0 6m12s
rook-ceph-osd-3-97548cc45-4xm4q 1/1 Running 0 5m58s
rook-ceph-osd-prepare-ip-10-0-138-84-gc26q 0/2 Completed 0 6m29s
rook-ceph-osd-prepare-ip-10-0-141-184-9bmdt 0/2 Completed 0 6m29s
rook-ceph-osd-prepare-ip-10-0-149-16-nh4tm 0/2 Completed 0 6m29s
rook-ceph-osd-prepare-ip-10-0-173-174-mzzhq 0/2 Completed 0 6m28s
rook-ceph-rgw-my-store-d6946dcf-q8k69 1/1 Running 0 5m33s
rook-ceph-tools-cb5655595-4g4b2 1/1 Running 0 8m46s
Next, you will need to create a set of S3 credentials; the resulting credentials will be stored in a secret under the `rook-ceph` namespace.
There isn't currently a way to share secrets across OpenShift namespaces, so you will need to copy the secret to the namespace running the Open Data Hub operator. To do so, run:
$ oc create -f object-user.yaml
Next, retrieve the secret using:
$ oc get secrets -n rook-ceph rook-ceph-object-user-my-store-my-user -o json
Create a secret in your deployment namespace that includes the access key and secret key for the S3 interface. Make sure to copy the `accesskey` and `secretkey` from the command output above into the secret YAML file available in this repository at `deploy/ceph/s3-secretceph.yaml`.
$ oc create -n ccfd -f deploy/ceph/s3-secretceph.yaml
From the OpenShift console, create a route to the Rook service `rook-ceph-rgw-my-store` in the `rook-ceph` namespace to expose the endpoint. This endpoint URL will be used to access the S3 interface from the example notebooks.
$ oc expose -n rook-ceph svc/rook-ceph-rgw-my-store
Deploy the fully trained fraud detection model using `deploy/model/modelfull.json` in this repository:
$ oc create -n ccfd -f deploy/model/modelfull.json
Check that the model has been created; this step will take a couple of minutes.
$ oc get seldondeployments
$ oc get pods
Create a route to the model using `deploy/model/modelfull-route.yaml` in this repo:
$ oc create -n ccfd -f deploy/model/modelfull-route.yaml
Enable Prometheus metric scraping by editing the `modelfull-modelfull` service from the portal and adding these two lines under `annotations`:
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/path: /prometheus
    prometheus.io/scrape: 'true'
Make sure to decode the key and secret copied from the Rook installation, using the following commands:
$ base64 -d
<Paste secret>
[Ctrl-D]
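If you prefer, the decoding can also be scripted. Below is a minimal Python sketch (the helper and the dummy secret are illustrative, not part of this repository) that base64-decodes every field in the `.data` map of the JSON returned by `oc get secrets ... -o json`:

```python
import base64
import json

def decode_secret(secret_json: str) -> dict:
    """Base64-decode every field in a Kubernetes secret's .data map."""
    secret = json.loads(secret_json)
    return {key: base64.b64decode(value).decode() for key, value in secret["data"].items()}

# Dummy secret standing in for the real `oc get secrets ... -o json` output:
dummy = json.dumps({"data": {
    "AccessKey": base64.b64encode(b"AKIAEXAMPLE").decode(),
    "SecretKey": base64.b64encode(b"s3cr3t").decode(),
}})
print(decode_secret(dummy))  # {'AccessKey': 'AKIAEXAMPLE', 'SecretKey': 's3cr3t'}
```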
From a command line, use the `aws` tool to upload the file to the `rook-ceph` data store:
$ aws configure
Enter only the key and secret, leaving all other fields as default. Check that the connection is working using the route created previously (you can find it with `oc get route -n rook-ceph`):
$ aws s3 ls --endpoint-url <ROOK_CEPH_URL>
Create a bucket:
$ aws s3api create-bucket --bucket ccdata --endpoint-url <ROOK_CEPH_URL>
Download the credit card transactions file `creditcard.csv` (available here) and upload it using:
$ wget -O creditcard.csv https://gitlab.com/opendatahub/fraud-detection-tutorial/-/raw/master/data/creditcard.csv
$ aws s3 cp creditcard.csv s3://ccdata/OPEN/uploaded/creditcard.csv --endpoint-url <ROOK_CEPH_URL> --acl public-read-write
You can verify the file is uploaded using:
$ aws s3 ls s3://ccdata/OPEN/uploaded/ --endpoint-url <ROOK_CEPH_URL>
In order to use jBPM's prediction service from User Tasks, a second Seldon model must be deployed using:
$ oc new-app ruivieira/ccfd-seldon-usertask-model
To deploy the KIE server, you can use `deploy/ccd-service.yaml` in this repo and run:
$ oc create -f deploy/ccd-service.yaml -n ccfd
The KIE server can be configured by editing the environment variables in that file, under the `env` key.
Some configurable values are:
- `SELDON_URL`, the location of the Seldon server providing the fraud score prediction
- `CUSTOMER_NOTIFICATION_TOPIC`, the Kafka topic for outgoing customer notifications
- `BROKER_URL`, the Kafka broker location and port
If the Seldon server requires an authentication token, it can be passed to the KIE server by adding the following environment variable to `deploy/ccd-service.yaml`:
- name: SELDON_TOKEN
  value: <SELDON_TOKEN>
By default, the KIE server will request a prediction from the endpoint `<SELDON_URL>/predict`. If, however, your Seldon deployment uses another prediction endpoint, you can specify it by adding the `SELDON_ENDPOINT` environment variable, for instance:
- name: SELDON_ENDPOINT
  value: 'api/v0.1/predictions'
The HTTP connection parameters can also be configured, namely the connection pool size and the connection timeout. The timeout value is treated as milliseconds. For instance:
- name: SELDON_TIMEOUT
  value: '5000' # five-second timeout
- name: SELDON_POOL_SIZE
  value: '5' # allows for 5 simultaneous HTTP connections
The prediction service's confidence threshold, above which a prediction automatically assigns an output and closes the User Task, can also be provided. It is assumed to be a probability value between `0.0` and `1.0`; if not provided, the default value is `1.0`. To specify it, use:
- name: CONFIDENCE_THRESHOLD
  value: '0.5' # as an example
If you want to interact with the KIE server's REST interface from outside OpenShift, you can expose its service with:
$ oc expose svc/ccd-service
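Once exposed, the REST interface can be used, for example, to start process instances. As a sketch, assuming the standard KIE server REST layout (the container and process ids below are hypothetical examples, not names from this repository), the URL to POST to would be built as:

```python
def start_process_url(kie_server_url: str, container_id: str, process_id: str) -> str:
    """Build the standard KIE server REST URL for starting a process instance.
    Container and process ids are hypothetical examples."""
    base = kie_server_url.rstrip("/")
    return f"{base}/services/rest/server/containers/{container_id}/processes/{process_id}/instances"

# POSTing to this URL (with appropriate credentials) starts a new instance:
print(start_process_url("http://ccd-service:8090", "ccd-kjar", "ccd.fraud-dispute"))
```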
The notification service is an event-driven micro-service responsible for relaying notifications to the customer and relaying the customer's responses back.
If a message is sent to a "customer outgoing" Kafka topic, a notification is sent to the customer asking whether the transaction was legitimate or not. For this demo, the micro-service simulates customer interaction, but different communication methods can be built on top of it (email, SMS, etc).
If the customer replies (in both scenarios: they either made the transaction or not), a message is written to a "customer response" topic. The router (described below) subscribes to messages in this topic, and signals the business process with the customer response.
To deploy the notification service, we use the image `ruivieira/ccfd-notification-service` (available here), by running:
$ oc create -f deploy/notification-service.yaml -n ccfd
The Apache Camel router is responsible for consuming messages arriving on specific topics, requesting a prediction from the Seldon model, and then triggering different REST endpoints according to that prediction.
The route is selected by executing configurable Drools rules with the model's prediction as input. Depending on the rules' outcome, a specific business process will be triggered on the KIE server.
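As a rough sketch of that routing decision (a hypothetical Python stand-in for the actual Drools rules, with an assumed threshold value):

```python
def route_transaction(fraud_probability: float, threshold: float = 0.5) -> str:
    """Pick which business process to start on the KIE server from the
    model's fraud probability (stand-in for the Drools rules)."""
    return "fraud" if fraud_probability >= threshold else "standard"

print(route_transaction(0.91))  # fraud
print(route_transaction(0.07))  # standard
```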
To deploy a router which listens to the topic `KAFKA_TOPIC` on Kafka's broker `BROKER_URL` and starts a process instance on the KIE server at `KIE_SERVER_URL`, we can use the pre-built image `ruimvieira/ccd-fuse` (available here):
$ oc create -f deploy/router.yaml -n ccfd
Router configuration can be performed by editing the `deploy/router.yaml` file.
Some configurable values are:
- `BROKER_URL`, the Kafka broker location and port
- `KAFKA_TOPIC`, the Kafka topic for incoming transactions
- `KIE_SERVER_URL`, the KIE server location and port
- `SELDON_URL`, the Seldon server location and port for fraud score prediction
- `CUSTOMER_NOTIFICATION_TOPIC`, the Kafka topic for outgoing customer notifications
- `CUSTOMER_RESPONSE_TOPIC`, the Kafka topic for incoming customer responses
- `SELDON_ENDPOINT`, a custom Seldon REST prediction endpoint
Optionally, a Seldon token can also be provided by editing the file:
- name: SELDON_TOKEN
  value: <SELDON_TOKEN>
By default, the router will request a prediction from the endpoint `<SELDON_URL>/api/v0.1/predictions`. If, however, your Seldon deployment uses another prediction endpoint, you can specify it by adding the `SELDON_ENDPOINT` environment variable above, for instance:
- name: SELDON_ENDPOINT
  value: 'predict'
The Kafka producer needs specific parameters to read from the S3 interface and call the model's REST prediction endpoint. Edit `deploy/kafka/ProducerDeployment.yaml` in this repository to specify the namespace, your `rook-ceph` URL, and your bucket name (these need to point to the location of the `creditcard.csv` file in the `rook-ceph` data store):
- name: NAMESPACE
  description: The OpenShift project in use
  value: <PROJECT> # e.g. ccfd
- name: s3endpoint
  value: "<ROOK_CEPH_URL>:443"
- name: s3bucket
  value: "ccdata"
- name: filename
  value: "OPEN/uploaded/creditcard.csv"
Create the producer pod with:
$ oc process -f deploy/kafka/ProducerDeployment.yaml | oc apply -f -
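Conceptually, the producer iterates over the CSV rows and turns each one into a Kafka message. A simplified Python sketch (the real producer streams the file from the Ceph S3 endpoint; the column names follow the Kaggle Credit Card Fraud dataset, and the payload shape here is only illustrative):

```python
import csv
import io

def rows_to_messages(csv_text: str):
    """Yield one transaction payload per creditcard.csv row.
    The 'Class' column is the ground-truth label, not a model feature."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row.pop("Class", None)
        yield {"features": {k: float(v) for k, v in row.items()}, "label": label}

# A tiny sample with a subset of the dataset's columns:
sample = "Time,V1,V2,Amount,Class\n0,-1.36,0.07,149.62,0\n"
print(next(rows_to_messages(sample)))
```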
From the OpenShift portal, click on the Prometheus route and explore some of the metrics. To launch the Grafana dashboard, click on the Grafana route, then upload the provided Grafana boards to the dashboard.
Additional Prometheus metrics are exposed by the router and the KIE server.
To enable them, edit the pod annotations in `ccd-fuse` and `ccd-service`, respectively, to include:
# ccd-fuse
prometheus.io/scrape: 'true'
prometheus.io/path: '/prometheus'
prometheus.io/port: '8091'
# ccd-service
prometheus.io/scrape: 'true'
prometheus.io/path: '/rest/metrics'
prometheus.io/port: '8090'
If you haven't already done so, you must also expose the `ccd-service` with:
$ oc expose svc/ccd-service
`ccd-fuse:8091/prometheus` provides:
- `transaction.incoming`, the total number of incoming transactions
- `transaction.outgoing (type=standard)`, total outgoing transactions to the "standard" business process
- `transaction.outgoing (type=fraud)`, total outgoing transactions to the "fraud" business process
- `notifications.outgoing`, the number of customers notified (via SMS, email, etc.) about a potentially fraudulent transaction
- `notifications.incoming`:
  - `notifications.incoming(response=approved)`, the number of customers who approved the transaction
  - `notifications.incoming(response=non_approved)`, the number of customers who did not approve the transaction
And `ccd-service:8090/rest/metrics` provides:
- `fraud_investigation_amount`, a histogram of amounts for transactions that will be investigated as fraud
- `fraud_approved_low_amount`, a histogram of transaction amounts not acknowledged by the customer, but approved (mainly due to a low amount)
- `fraud_approved_amount`, a histogram of amounts for transactions approved by the customer
- `fraud_rejected_amount`, a histogram of amounts for transactions rejected by the customer
The Kafka producer sends transaction data (`TX`) to the `odh-demo` topic (1).
The transaction data is based on the Kaggle Credit Card Fraud dataset.
The Camel-based router reads the `odh-demo` topic for incoming transactions. Once it gets a transaction, it extracts the features needed by the model served by Seldon. It then issues a prediction request to the Seldon server via REST (2), using an HTTP POST request with the transaction features as the payload.
The Seldon server will return a prediction probability (`PR`) on whether this is potentially a fraudulent transaction or not (3).
The router will then send the data to the KIE server (4), instantiating a standard or fraudulent-transaction business process depending on the value returned by Seldon.
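The prediction request built by the router can be sketched as follows. This assumes Seldon's standard SeldonMessage JSON format with a `data.ndarray` payload; the feature values below are placeholders, and the feature order must match what the model was trained on:

```python
import json

def build_prediction_request(features) -> str:
    """Wrap a feature vector in Seldon's SeldonMessage JSON payload,
    ready to POST to <SELDON_URL>/api/v0.1/predictions."""
    return json.dumps({"data": {"ndarray": [list(features)]}})

# Placeholder feature values for a single transaction:
print(build_prediction_request([0.0, -1.36, 0.07, 149.62]))
```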
When a transaction is classified as potentially fraudulent, it is sent to the KIE server, creating a "fraud" business process (1).
The business process then sends a message to the Kafka topic `ccd-customer-outgoing` with the customer id, transaction details, and process id (2). The business process will then wait for whichever of the following happens first:
- A response from the client (approving or not approving the transaction)
- A pre-defined timer runs out
Messages in the `ccd-customer-outgoing` topic trigger a notification service (3). This service can be extended to contact the customer in any preferred way (SMS, email, etc.), but for the purposes of this demo, we randomly generate a reply (or no reply) from the customer (4).
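The demo's simulated customer can be sketched as below (a hypothetical stand-in for the notification service's internals; `None` models a customer who never answers, letting the process timer expire):

```python
import random

def simulate_customer_reply(rng: random.Random):
    """Return True (approved), False (not approved), or None (no reply)."""
    return rng.choice([True, False, None])

print(simulate_customer_reply(random.Random()))
```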
In the case where the customer replies (5)/(6), the notification service will publish the customer's response to the `ccd-customer-response` topic.
This response message will be picked up by the router (7), which signals the appropriate business process within the KIE server that the customer replied, along with the response (8).
If the transaction was flagged as potentially fraudulent and the customer did not acknowledge it within a certain pre-defined period, an investigation branch of the business process will be triggered. A User Task will be created, and jBPM's prediction service API will be used to request an outcome prediction from Seldon using REST (2).
The prediction result will then be parsed by the prediction service, and if the prediction confidence is:
- Above the defined threshold, automatically close the User Task, assigning the outcome returned by the ML model
- Below the defined threshold, set the prediction outcome to the most likely result, but do not close the User Task
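This thresholding behaviour can be sketched as follows (a hypothetical stand-in for the prediction service's logic; the default of `1.0` mirrors the `CONFIDENCE_THRESHOLD` default described earlier):

```python
def handle_prediction(outcome: str, confidence: float, threshold: float = 1.0) -> dict:
    """Close the User Task automatically only when the model's confidence
    reaches the threshold; otherwise just suggest the outcome."""
    return {"outcome": outcome, "task_closed": confidence >= threshold}

print(handle_prediction("approve", 0.97, threshold=0.5))  # {'outcome': 'approve', 'task_closed': True}
print(handle_prediction("approve", 0.30, threshold=0.5))  # {'outcome': 'approve', 'task_closed': False}
```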
The Business Process (BP) corresponding to a potential fraudulent transaction consists of the following flow:
- The process is instantiated with the transaction's data
- The `CustomerNotification` node sends a message to the `<CUSTOMER-OUTGOING>` topic with the customer's `id` and the transaction's `id`
- At this point, one of two branches will be active:
  - If no customer response is received after a certain specified time, a timer will trigger this branch.
    - A DMN model is used where the outcome is either to initiate an investigation or to accept the transaction.
    - For demonstration purposes, a simple rule is evaluated where:
      - If the fraud probability is below a certain threshold, and the transaction amount is sufficiently small, it is accepted.
      - If the transaction amount is large or the probability is above a certain threshold, the BP proceeds with the creation of a User Task, assigned to a fraud investigator.
  - If (before the timer expires) a response is sent, the process is notified via a signal containing the customer's response as the payload (either `true`, the customer made the transaction, or `false`, they did not). From here, two additional branches:
    - If the customer acknowledges the transaction, it is automatically approved.
    - If not, the transaction is cancelled.
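The simple rule above can be sketched as a pure function (a hypothetical stand-in for the DMN model; both threshold values are assumptions for illustration):

```python
def investigate_or_accept(fraud_probability: float, amount: float,
                          prob_threshold: float = 0.8,
                          amount_threshold: float = 100.0) -> str:
    """Accept small, low-risk transactions outright; escalate the rest
    to a fraud-investigator User Task (stand-in for the DMN decision)."""
    if fraud_probability < prob_threshold and amount <= amount_threshold:
        return "accept"
    return "investigate"

print(investigate_or_accept(0.2, 25.0))   # accept
print(investigate_or_accept(0.9, 25.0))  # investigate
```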
The customer notification/response works by:
- Sending a message with the customer and transaction id to the `<CUSTOMER-OUTGOING>` topic
- This message is picked up by the notification service, which sends an appropriate notification (email, SMS, etc.)
- The customer response is sent to the `<CUSTOMER-RESPONSE>` topic, which is picked up by the Camel router, which in turn sends it to the appropriate container, using a KIE server REST endpoint, as a signal containing the customer response
Footnotes

1. Note that this step requires `cluster-admin` permissions.