A sane approach to interacting with an RDBMS in Java
sane-dbc addresses the aspect of RDBMS interaction in the Java programming language.
While numerous libraries in the JVM ecosystem handle this in an elegant, well-grounded and efficient manner, to our knowledge the currently widespread Java alternatives fall short in one way or another.
sane-dbc provides an effective, performant and robust way to interact with an RDBMS in Java. It is based on SQL and the DB Monad.
That is, you use
- SQL and your RDBMS's specific facilities to interact with it in the way best suited to your application
- the DB class to compose operations into larger operations and programs, and to separate the description of DB operations from their execution (interpretation), which leads to a large increase in programmer efficiency and code correctness
- the facilities provided by sane-dbc to achieve common programming tasks and to spare yourself boilerplate, such as that inherent in JDBC
sane-dbc is hosted on the Artifactory OSS repository.
With Gradle, you can refer to the sane-dbc snapshot version in the following way:
repositories {
maven {
url 'https://oss.jfrog.org/artifactory/libs-snapshot/'
}
}
dependencies {
// the sane-dbc library
compile('com.novarto:sane-dbc-core:0.9.1-SNAPSHOT') {
changing = true
}
// HikariCP support
compile('com.novarto:sane-dbc-hikari:0.9.1-SNAPSHOT') {
changing = true
}
// The core library provides support for asynchronous interpretation through AsyncDbInterpreter, which returns CompletableFuture
// If in addition you require support for Guava ListenableFuture, use this jar:
compile('com.novarto:sane-dbc-guava:0.9.1-SNAPSHOT') {
changing = true
}
}
You can refer to sane-dbc in Maven and other tools equivalently.
Let's start with a small example. We need a table to work with:
CREATE TABLE FOO (ID INTEGER IDENTITY PRIMARY KEY, DESCRIPTION NVARCHAR(100))
Let's look at some plain old pieces of code which read or mutate it:
void insertSomeFoos(Connection c) throws SQLException
{
try(PreparedStatement insert = c.prepareStatement(
"INSERT INTO FOO(DESCRIPTION) VALUES (?)"
))
{
for(String descr: asList("one", "two", "three"))
{
insert.setString(1, descr);
insert.executeUpdate();
}
}
}
List<Foo> selectTheFoos(Connection c) throws SQLException
{
try(PreparedStatement s = c.prepareStatement("SELECT ID, DESCRIPTION FROM FOO"))
{
List<Foo> result = new ArrayList<>();
ResultSet rs = s.executeQuery();
while(rs.next())
{
result.add(new Foo(rs.getInt(1), rs.getString(2)));
}
return result;
}
}
An RDBMS interaction in Java, then, takes the form
A run(Connection c) throws SQLException;
where A is a type parameter. This is exactly the run method of the DB class. We'll be using that from now on.
For operations returning nothing (DB mutations, where we are only interested in changing the table state), it is customary in Java to use void, as above. However, we want to parametrize on the return type, and java.lang.Void is strange in that its only 'valid' value is null, which can lead to a NullPointerException. We will instead use Unit, another type with only one value, which is null-safe. You obtain a Unit value to return by calling Unit.unit().
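To make the shape concrete, here is a minimal sketch that uses stand-in types rather than the library's own classes. MiniDb and MiniUnit are hypothetical names introduced only for illustration; they mirror the idea of DB and Unit, not their actual source.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical stand-in for the DB class: "given a Connection, produce an A".
@FunctionalInterface
interface MiniDb<A> {
    A run(Connection c) throws SQLException;
}

// Hypothetical stand-in for Unit: a type with exactly one value, which is never null.
enum MiniUnit {
    UNIT
}

final class UnitSketch {
    // A mutation expressed as a description. Nothing is executed until run() is called.
    // It returns the single MiniUnit value instead of null.
    static MiniDb<MiniUnit> deleteAllFoos() {
        return c -> {
            try (Statement s = c.createStatement()) {
                s.executeUpdate("DELETE FROM FOO");
            }
            return MiniUnit.UNIT;
        };
    }
}
```

Calling deleteAllFoos() only builds the description; no connection is needed until somebody calls run().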
A DB instance is just a description of a database interaction.
If we just construct a DB, nothing will happen: no connection is opened, the database is not called, and no other side effects occur. In other words, a method returning a DB is referentially transparent.
(All the example code in this tutorial is published in our repo.)
@Test
public void nothingHappens()
{
@SuppressWarnings("unused")
SelectOp.FjList<String> selectThem = new SelectOp.FjList<>(
"SELECT FOO FROM WHATEVER",
NO_BINDER,
rs -> rs.getString(1)
);
}
In the above example nothing happens, and rightly so. Nobody called the run() method on our DB (in this case, SelectOp).
A piece of code which takes a DB<A>, runs it, and returns some other value, is called an interpreter.
Since running a DB requires a Connection, the interpreter also needs to know how to spawn connections.
The simplest interpreter we can imagine is one that calls run() in the caller thread, and upon success returns the DB result.
Upon error it throws.
We have such an interpreter baked into sane-dbc-core; we just need to supply it with the code which spawns connections, and give it a spin:
@Test
public void syncInterpreter()
{
// create a synchronous DB interpreter. It is a stateless object, and the act of creating one is also
// referentially transparent
SyncDbInterpreter dbi = new SyncDbInterpreter(
// provide a piece of code which knows how to spawn connections
// in this case we are just using the DriverManager
() -> DriverManager.getConnection("jdbc:hsqldb:mem:DescribeVsInterpret", "sa", "")
);
// submit an Update (mutate) operation which creates a table
dbi.submit(new UpdateOp(
"CREATE TABLE FOO (WHATEVER VARCHAR(200))",
NO_BINDER
));
// objects we will insert in the table
List<String> helloSaneDbc = List.arrayList("hello", "sane", "dbc");
//insert some data
Option<Integer> updateCount = dbi.submit(
//this describes an operation which inserts an iterable of objects in a table via addBatch / executeBatch
new BatchUpdateOp<>(
"INSERT INTO FOO(WHATEVER) VALUES(?)",
x -> preparedStatement -> preparedStatement.setString(1, x),
helloSaneDbc
)
);
// the operation returns an optional update count, since the JDBC driver might not return an update count at all
assertThat(updateCount, is(some(3)));
List<String> result = dbi.submit(
// select all of the objects in the table
new SelectOp.FjList<>("SELECT WHATEVER FROM FOO", NO_BINDER, resultSet -> resultSet.getString(1))
);
assertThat(result, is(helloSaneDbc));
}
By interpreting the DB result in our code using a SyncDbInterpreter, we explicitly state that we do not care about error handling, or about the caller thread getting blocked waiting on JDBC. This is mostly useful (and convenient) in unit testing your app's DB layer. Later we will learn about interpreters more suitable for production code.
sane-dbc works with only two abstractions:
- A DB is a description of operations which will be executed against a database
- An interpreter is responsible for executing the operations and returning the results. It is also responsible for all the specifics of execution, such as connection management, threading, error handling, transactions, etc.
The way it usually works is that you only call the interpreter at the edge of your application: the main method, a webservice method, a test case. In this way, the bulk of your program only works with DB descriptions, and is referentially transparent.
Side effects are only performed in specific places, and in a controlled manner.
These are all the concepts you need to know to start working with sane-dbc. Let's now jump into details.
One of the things sane-dbc does is provide ready-to-use DB implementations for common tasks such as Select, Insert / Update, batch Insert / Update, Aggregate (count, etc.), and so forth.
Let's have a look at some of those in an example:
package com.novarto.sanedbc.examples;
import com.novarto.sanedbc.core.interpreter.SyncDbInterpreter;
import com.novarto.sanedbc.core.ops.*;
import fj.Unit;
import fj.control.db.DB;
import fj.data.List;
import fj.data.Option;
import org.junit.Test;
import java.sql.DriverManager;
import static com.novarto.sanedbc.core.ops.DbOps.unique;
import static fj.data.List.nil;
import static java.util.Arrays.asList;
import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;
public class BasicUsage
{
/*
* In a real-world application, DB-related operations are usually encapsulated in a separate module, so that's what we'll do here.
* It's kind of like a DAO (Data Access Object) since it provides operations over a table, or a set of logically related tables.
*
* It's different from a DAO in that it does not access the database itself; rather it returns descriptions of how it should
* be accessed - as already explained.
*
*/
public static final class StuffDb
{
/*
Create the STUFF table.
An EffectOp is an operation which takes nothing, and returns nothing( fj.Unit ). It is useful for performing
side effects which take no parameters, such as executing DDL:
*/
public static DB<Unit> createStuffTable()
{
//this specific table has auto-generated ID
return new EffectOp("CREATE TABLE STUFF (ID INTEGER PRIMARY KEY IDENTITY, DESCRIPTION NVARCHAR(200) NOT NULL)");
}
/*
Insert a single record. Return the updateCount - in this case it will always be 1
*/
public static DB<Integer> insertStuff(String description)
{
// an UpdateOp takes SQL, and a binder. It returns an update count
// a binder is a piece of code which takes a prepared statement, and returns nothing.
// it is so called because it binds the prepared statement parameters
return new UpdateOp(
"INSERT INTO STUFF(DESCRIPTION) VALUES(?)",
// bind the single parameter of the statement
ps -> ps.setString(1, description)
);
}
/*
Insert a single record, and return its generated key
*/
public static DB<Integer> insertStuffGetKey(String description)
{
/*
An InsertGenKeysOp is the same as an UpdateOp, only that it expects an auto-generated key to be present
after executing the update, and returns it as the result.
InsertGenKeysOp.Int is a specialization which expects an Integer key
*/
return new InsertGenKeysOp.Int(
"INSERT INTO STUFF(DESCRIPTION) VALUES(?)",
// you can see that the binder is the same as in the previous operation.
// in that case it is good practice to extract it as a static final field of your module
ps -> ps.setString(1, description)
);
}
/*
Select entries by description
*/
public static DB<List<Stuff>> selectByDescription(String description)
{
/*
A select operation takes SQL, a binder and a mapper, and returns an iterable of results
A mapper is a function which maps a single row from the resultset to a single object
A generic SelectOp also needs to be told what collection to use in the result
The SelectOp.FjList specialization uses fj.data.List, which is an immutable singly-linked list.
(Also known as Cons list)
There is also a specialization for java Lists; or you can supply your own collection builder
*/
return new SelectOp.FjList<>(
"SELECT ID, DESCRIPTION FROM STUFF WHERE DESCRIPTION=?",
ps -> ps.setString(1, description),
//build a Stuff from a resultset row
rs -> new Stuff(rs.getInt(1), rs.getString(2))
);
}
/*
Select an entry by id
The return type is fj.data.Option, since it may be that no entry with this id exists
fj.data.Option is equivalent to Java Optional
*/
public static DB<Option<Stuff>> selectByKey(int id)
{
//given a regular operation which returns an iterable:
DB<List<Stuff>> temp = new SelectOp.FjList<>(
"SELECT ID, DESCRIPTION FROM STUFF WHERE ID=?",
ps -> ps.setInt(1, id),
rs -> new Stuff(rs.getInt(1), rs.getString(2)));
// using the unique() function,
// we can convert it to an operation which expects at most one result, and returns that optional result:
return unique(temp);
}
/*
Insert many entries, using addBatch / executeBatch, and return the update count.
It is possible that a JDBC driver does not return the update count, therefore the optional result.
*/
public static DB<Option<Integer>> insertMany(Iterable<String> descriptions)
{
return new BatchUpdateOp<>(
//SQL
"INSERT INTO STUFF(DESCRIPTION) VALUES(?)",
//given the current element of the iterable, return a binder which sets the parameters for that element
// if you extracted the binder from the previous operations in a field named SET_DESCRIPTION, code becomes
// description -> SET_DESCRIPTION
description -> ps -> ps.setString(1, description),
//the iterables to insert
descriptions
);
}
/*
Count all entries in the table, where the description is LIKE the passed parameter
*/
public static DB<Long> count(String like)
{
String searchQuery = like.trim().toLowerCase();
//an aggregate op expects the resultset to have one element, and that element to be cast to long
//useful for numeric aggregate operations
return new AggregateOp("SELECT COUNT(*) FROM STUFF WHERE DESCRIPTION LIKE LOWER(?)",
ps -> ps.setString(1, searchQuery + "%")
);
}
}
@Test
public void testIt()
{
SyncDbInterpreter dbi = new SyncDbInterpreter(
() -> DriverManager.getConnection("jdbc:hsqldb:mem:test", "sa", ""
));
dbi.submit(StuffDb.createStuffTable());
Integer updateCount = dbi.submit(StuffDb.insertStuff("stuff 1"));
assertThat(updateCount, is(1));
Integer generatedKey = dbi.submit(StuffDb.insertStuffGetKey("stuff 2"));
List<Stuff> stuffs = dbi.submit(StuffDb.selectByDescription("no such"));
assertThat(stuffs.isEmpty(), is(true));
stuffs = dbi.submit(StuffDb.selectByDescription("stuff 1"));
//with head() we select the first element of the immutable list
assertThat(stuffs.head().description, is("stuff 1"));
//with tail() we select the rest of the list. it should be the empty list: nil()
assertThat(stuffs.tail(), is(nil()));
Option<Stuff> stuff2Option = dbi.submit(StuffDb.selectByKey(generatedKey));
assertThat(stuff2Option.isSome(), is(true));
Stuff stuff2 = stuff2Option.some();
assertThat(stuff2, is(new Stuff(generatedKey, "stuff 2")));
Option<Integer> batchCountOpt = dbi.submit(StuffDb.insertMany(asList("a", "b", "c")));
Integer batchCount = batchCountOpt.some();
assertThat(batchCount, is(3));
Long stuCount = dbi.submit(StuffDb.count("StU"));
//stuff 1 and stuff 2 match StU, so count should be 2
assertThat(stuCount, is(2L));
}
public static class Stuff
{
public final int id;
public final String description;
public Stuff(int id, String description)
{
this.id = id;
this.description = description;
}
@Override public boolean equals(Object o); //... noise omitted
@Override public int hashCode(); //... noise omitted
}
}
There are a few more built-in operations you can use, possibly with more coming in the future. Consult the javadoc of the individual operations for more details.
Hopefully we have demonstrated that factoring the JDBC boilerplate out into reusable operations makes Java SQL programming reasonable.
You can further apply this principle to your program by implementing your own DB instances.
A good property of the DB abstraction is that it is practically zero-cost. There is no SQL generation, automatic object mapping or anything of that sort going on. The total runtime overhead of using sane-dbc is the creation of a constant number of objects (usually a couple) per DB operation. This cost is dwarfed by the network call, and even by the creation of your DB's result object.
Another very useful property of the approach is that you are in full control of database interaction. Creating the SQL; supplying the input; choosing the exact result type - these are all done by the library user.
The DB abstraction enables us to compose DB instances with existing functions, and DB instances with other DB instances, to obtain composite DBs as a result.
The DB class defines the following operation:
public final <B> DB<B> map(final F<A, B> f)
So map takes a DB<A> and a function f: A -> B, and returns a DB<B>. The function is applied only if the original DB is successful.
If not, the DB<B> will just contain the original error. By contain, we mean it will throw it upon run().
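The behaviour of map can be sketched with a small stand-in type. MapDb below is a hypothetical interface written for this illustration, not the library's DB class, and runDetached is a helper that only works for descriptions which ignore the connection.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.function.Function;

// Hypothetical stand-in for the DB class, reduced to run() and map().
@FunctionalInterface
interface MapDb<A> {
    A run(Connection c) throws SQLException;

    // Builds a new description: run this one, then apply f to its result.
    // If run() throws, f is never called and the exception propagates unchanged.
    default <B> MapDb<B> map(Function<A, B> f) {
        return c -> f.apply(this.run(c));
    }
}

final class MapSketch {
    // Runs a description without a real connection.
    // Only safe for descriptions that never touch the connection.
    static <A> A runDetached(MapDb<A> db) {
        try {
            return db.run(null);
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        MapDb<String> name = c -> "sane-dbc";
        MapDb<Integer> length = name.map(String::length); // nothing runs yet
        System.out.println(runDetached(length)); // prints 8
    }
}
```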
Let's see map() in action by implementing a simple login service (complete source here):
public static class UserDB
{
public static DB<Boolean> login(String email, String pass)
{
//we map an operation DB<Option<User>> with a function which takes an Option<User> and returns a boolean,
//resulting in a DB<Boolean>
return selectUser(email).map(userOpt -> {
if (userOpt.isNone())
{
//invalid email
return false;
}
User user = userOpt.some();
//check pass hash matches
return loginOk(user, pass);
});
}
private static boolean loginOk(User user, String password)
{
return hash(password).equals(user.hash);
}
private static String hash(String password); // impl omitted
}
dbi.submit(UserDB.insertUser("me@that.com", "abcd"));
boolean success = dbi.submit(UserDB.login("me@that.com", "abcd"));
assertThat(success, is(true));
success = dbi.submit(UserDB.login("me@that.com", "wrong"));
assertThat(success, is(false));
success = dbi.submit(UserDB.login("larry@this.com", "abcd"));
assertThat(success, is(false));
You can imagine many other examples. One might select a list of employees and group them by department, using java.util.stream.Collectors.groupingBy. Or one might select a product catalog stored flat in a database, and turn it into a fj.data.Tree. The important thing is, if you need to (and one always needs to) work with the data from your DB, you don't run it; you just transform the original DB description using map. This way you delay performing side effects until the edge of your app, where you interpret the DB.
What if we wanted to execute an operation, and upon its success, take the result and execute another operation based on that?
bind comes to the rescue:
public final <B> DB<B> bind(final F<A, DB<B>> f);
So bind takes a DB<A> and a function which, given an A, produces a DB<B>. The final result is a DB<B>.
(If we had used map in this scenario, we would end up with a DB<DB<B>>, which is not convenient.)
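The difference between map and bind can be sketched with a stand-in type. BindDb and the two placeholder lookups below are hypothetical and exist only for this illustration; they are not the library's DB class or real queries.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.function.Function;

// Hypothetical stand-in for the DB class, reduced to run(), map(), bind() and unit().
@FunctionalInterface
interface BindDb<A> {
    A run(Connection c) throws SQLException;

    default <B> BindDb<B> map(Function<A, B> f) {
        return c -> f.apply(this.run(c));
    }

    // Run this description, feed its result to f,
    // then run the description f returns on the same connection.
    default <B> BindDb<B> bind(Function<A, BindDb<B>> f) {
        return c -> f.apply(this.run(c)).run(c);
    }

    // A description that yields a value without touching the connection.
    static <A> BindDb<A> unit(A a) {
        return c -> a;
    }
}

final class BindSketch {
    static BindDb<Integer> findUserId(String email) {
        return BindDb.unit(email.length()); // placeholder for a real SELECT
    }

    static BindDb<String> findOrderText(int userId) {
        return BindDb.unit("order of user " + userId); // placeholder for a real SELECT
    }

    // With map, the second description stays nested and unexecuted inside the first.
    static BindDb<BindDb<String>> nested(String email) {
        return findUserId(email).map(id -> findOrderText(id));
    }

    // With bind, the two steps are flattened into a single description.
    static BindDb<String> flat(String email) {
        return findUserId(email).bind(id -> findOrderText(id));
    }

    // Runs a description without a real connection; safe here because nothing uses it.
    static <A> A runDetached(BindDb<A> db) {
        try {
            return db.run(null);
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}
```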
For example, we might want to log a user in, and if successful, return their sales orders. If not, we return an error message. (the full code is available here)
//the operation authenticates the user, and reads their orders
//it returns either an error message (string), in case the login fails; or the list of orders
public static DB<Either<String, List<Order>>> authenticateAndGetOrders(String email, String pass)
{
// with bind (a.k.a. flatMap), we take the result of one operation, and use it to return another operation
return UserDB.login(email, pass).bind(success -> {
if(!success)
{
Either<String, List<Order>> errorMessage = Either.left("auth failure");
// the DB.unit operation returns an immediate result with the passed value, without touching the connection
return DB.unit(errorMessage);
}
return selectOrdersByEmail(email).map(orders -> Either.right(orders));
});
}
public static DB<List<Order>> selectOrdersByEmail(String userEmail)
{
return new SelectOp.FjList<>(
"SELECT ORDER_ID, USER_EMAIL, TEXT FROM ORDERS WHERE USER_EMAIL=?",
ps -> ps.setString(1, userEmail),
rs -> new Order(rs.getInt(1), rs.getString(2), rs.getString(3))
);
}
dbi.submit(UserDB.insertUser("john@doe.com", "abcd"));
dbi.submit(UserDB.insertUser("foo@bar.com", "abcd"));
dbi.submit(
OrderDb.insertOrders(
arrayList(new CreateOrder("john@doe.com", "Hi there"),
new CreateOrder("foo@bar.com", "Bye there")
))
);
Either<String, List<Order>> result = dbi.submit(OrderDb.authenticateAndGetOrders("john@doe.com", "abcd"));
assertThat(result.isRight(), is(true));
List<Order> orders = result.right().value();
assertThat(orders.isSingle(), is(true));
Order johnOrder = orders.head();
assertThat(johnOrder.text, is("Hi there"));
Either<String, List<Order>> shouldFail = dbi.submit(OrderDb.authenticateAndGetOrders("haxx0r", "abcd"));
assertThat(shouldFail, is(Either.left("auth failure")));
With bind and map in our arsenal, we can compose and reuse existing DB operations, as well as existing business logic functions (pure functions working on data and returning data, without being bothered with database specifics), to arrive at composite operations that implement new use cases.
Because we only work with descriptions of DB interaction, and we only ever interpret those at the edge of our app, there is no need to propagate Connection objects or throws SQLException clauses throughout the whole call stack, as often happens when working with plain JDBC.
Folding is the process of taking an iterable of things, and collapsing them to a single result. Some examples of folds are:
java.util.stream.Stream.reduce()
fj.data.List.foldLeft()
fj.data.List.foldRight()
One obvious way to fold a ResultSet into a single thing is to first get a DB<Iterable<A>> (for example, via a SelectOp) and then do:
db.map(iterable -> iterable.reduce(...));
This approach is fine in general; but it could be suboptimal if the ResultSet is very large, since we are constructing an intermediate Iterable that we later discard.
We can do better: we can do the reduction directly while iterating the ResultSet, without using an intermediate collection
(full source here):
// we will select employees and group them by department
// the result type is an immutable map from integer (department id) to an immutable list of employees from that department
FoldLeftSelectOp<HashArrayMappedTrie<Integer, List<Employee>>> selectGroupedByDepartment =
new FoldLeftSelectOp<>(
"SELECT * FROM EMPLOYEES",
NO_BINDER,
(soFar, rs) -> {
// the current row of the resultset, which we will append to the result so far
Employee employee = new Employee(rs.getInt(1), rs.getString(2), rs.getInt(3));
// the employees collected so far for this department
// if this is the first employee of this department, fromThisDepartment will be none()
Option<List<Employee>> fromThisDepartment = soFar.find(employee.departmentId);
return fromThisDepartment
//in case there are already employees collected for this department,
//append this employee to the already collected (cons)
//and update the map so far, associating departmentId with the appended list
.map(employees -> soFar.set(employee.departmentId, employees.cons(employee)))
//otherwise update the map so far, associating departmentId with a single-element list
//containing this employee
.orSome(() -> soFar.set(employee.departmentId, List.single(employee)));
},
initial
);
Perhaps that was a handful to grasp if you're not used to working with immutable collections. The point is, we are doing a reduction while directly iterating the ResultSet, starting from an empty initial value, without resorting to building an intermediate collection from the ResultSet, such as java.util.List or fj.data.List.
The result of the DB operation is just like a hash map from integer to a linked list of employees, the only difference being that neither the hash map nor the lists can be mutated.
In the advanced section of the tutorial we will discuss when and why it is good practice to return an immutable object from your DB.
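For readers who want to see the folding idea without the library, here is a rough sketch over a plain java.sql.ResultSet. The names FoldSketch, Step, foldLeft and fakeRows are hypothetical and made up for this illustration. For brevity it accumulates into mutable JDK collections rather than the immutable ones used above, and the fake ResultSet is only a test fixture built with a dynamic proxy.

```java
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FoldSketch {
    // One step of the fold: combine the accumulator with the current row.
    @FunctionalInterface
    interface Step<B> {
        B apply(B soFar, ResultSet currentRow) throws SQLException;
    }

    // Left fold over the ResultSet cursor; no intermediate list of rows is built.
    static <B> B foldLeft(ResultSet rs, B initial, Step<B> step) throws SQLException {
        B acc = initial;
        while (rs.next()) {
            acc = step.apply(acc, rs);
        }
        return acc;
    }

    // Groups employee names (column 1) by department id (column 2).
    static Map<Integer, List<String>> groupByDepartment(ResultSet rs) throws SQLException {
        Map<Integer, List<String>> initial = new TreeMap<>();
        return foldLeft(rs, initial, (soFar, row) -> {
            soFar.computeIfAbsent(row.getInt(2), k -> new ArrayList<>()).add(row.getString(1));
            return soFar;
        });
    }

    // Test fixture: a fake forward-only ResultSet over in-memory rows.
    static ResultSet fakeRows(Object[][] rows) {
        int[] cursor = {-1};
        return (ResultSet) Proxy.newProxyInstance(
            ResultSet.class.getClassLoader(),
            new Class<?>[] {ResultSet.class},
            (proxy, method, args) -> {
                switch (method.getName()) {
                    case "next":
                        return ++cursor[0] < rows.length;
                    case "getInt":
                    case "getString":
                        return rows[cursor[0]][((Integer) args[0]) - 1];
                    default:
                        throw new UnsupportedOperationException(method.getName());
                }
            });
    }

    static Map<Integer, List<String>> demo() {
        ResultSet rs = fakeRows(new Object[][] {{"ann", 10}, {"bob", 20}, {"cy", 10}});
        try {
            return groupByDepartment(rs);
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}
```

With the three fake rows above, demo() groups "ann" and "cy" under department 10 and "bob" under department 20.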
Sometimes you end up with an iterable of DB<A>. It is convenient to treat that as a single DB<List<A>>, containing the aggregated result. This is achieved via the sequence operator.
If you find yourself needing to use sequence, first investigate whether you can change your SQL query so that it returns all the results in a single DB, in one go. That will usually be more performant.
import static com.novarto.sanedbc.core.ops.DbOps.sequence;
//given an iterable of DB's
List<DB<String>> dbList = arrayList(
DB.unit("foo"),
DB.unit("bar"),
DB.unit("baz")
);
//we can treat it as a DB<List>
DB<List<String>> db = sequence(dbList);
List<String> result = dbi.submit(db);
assertThat(result, is(arrayList("foo", "bar", "baz")));
Since DB only describes database operations, a lot of aspects are left for interpretation time. These include
- JDBC connection management
- Transactional behaviour
- Error handling
- Forking execution in another thread
This is handy, since this way methods returning DB<A> are only concerned with selecting from / updating the database and building / transforming / composing the result, and nothing else. It also means that the same DB<A> instance can be executed with different interpreters, yielding different behaviour.
Interpreters will generally support these two methods:
- submit will submit the operation for execution with autoCommit=true
- transact will submit the operation transactionally, i.e. with autoCommit=false, and the operation will be rolled back upon error
Interpreters will require a piece of code at construction time, which is capable of returning JDBC connections. This is just like a DataSource, except you don't have to implement the DataSource interface.
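The submit / transact contract described above could be sketched roughly as follows. InterpreterSketch, Op and ConnectionSource are hypothetical names chosen for this illustration; the library's own interpreters may be structured differently.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class InterpreterSketch {
    // Hypothetical stand-in for a DB<A> description.
    @FunctionalInterface
    interface Op<A> {
        A run(Connection c) throws SQLException;
    }

    // Hypothetical stand-in for "a piece of code capable of returning JDBC connections".
    @FunctionalInterface
    interface ConnectionSource {
        Connection get() throws SQLException;
    }

    private final ConnectionSource connections;

    InterpreterSketch(ConnectionSource connections) {
        this.connections = connections;
    }

    // Runs the operation with auto-commit on; SQL failures surface as RuntimeException.
    <A> A submit(Op<A> op) {
        try (Connection c = connections.get()) {
            c.setAutoCommit(true);
            return op.run(c);
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    // Runs the operation in one transaction: commit on success, roll back on failure.
    <A> A transact(Op<A> op) {
        try (Connection c = connections.get()) {
            c.setAutoCommit(false);
            try {
                A result = op.run(c);
                c.commit();
                return result;
            } catch (SQLException | RuntimeException e) {
                c.rollback();
                throw e;
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Because the connection source is just a lambda, something like () -> dataSource.getConnection() or () -> DriverManager.getConnection(url) can be passed directly.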
As we already know, the simplest interpreter is SyncDbInterpreter. It blocks the caller thread, and tries to execute the operation.
Upon error it throws RuntimeException. Here it is in action, again:
"CREATE TABLE DUMMY (ID INTEGER PRIMARY KEY, X NVARCHAR(200))"
// a datasource backed by a HikariCP connection pool
HikariDataSource hikariDS = Hikari.createHikari("jdbc:hsqldb:mem:test", "sa", "", new Properties());
SyncDbInterpreter sync = new SyncDbInterpreter(
//transform a datasource to a Try0<Connection, SQLException>
lift(hikariDS)
);
sync.submit(new EffectOp("INSERT INTO DUMMY VALUES(1,'a')"));
Long count = sync.submit(new AggregateOp("SELECT COUNT(*) FROM DUMMY"));
assertThat(count, is(1L));
try
{
sync.transact(new EffectOp(
"INSERT INTO DUMMY VALUES (2, 'b'), (1, 'a')"
));
fail("expected constraint violation");
}
catch (RuntimeException e)
{
assertThat(e.getCause(), is(instanceOf(SQLException.class)));
count = sync.submit(new AggregateOp("SELECT COUNT(*) FROM DUMMY"));
//transactional, no records were updated
assertThat(count, is(1L));
}
Thus far we can see that
- We can supply interpreters with a data source; we can easily construct a production-suitable one, too;
- We can achieve transactional behaviour by calling transact() instead of submit()
Next, we might want to take a more principled approach to error handling. A common technique is to embed the exception that run can throw in the return type of our interpreter, thus treating errors as regular values, instead of relying on try / catch.
You can achieve this behaviour by utilising ValidationDbInterpreter.
Its return type is Validation<Exception, A>. It is a box which contains either the failure, a java.lang.Exception, or the successful result, an A.
//construct an interpreter that turns the result type to Validation<Exception, A>
// we can reuse the same data source across multiple interpreters
ValidationDbInterpreter vdb = new ValidationDbInterpreter(lift(hikariDS));
Validation<Exception, Long> successExpected = vdb.submit(new AggregateOp("SELECT COUNT(*) FROM DUMMY"));
assertThat(successExpected.isSuccess(), is(true));
assertThat(successExpected.success(), is(1L));
RuntimeException rte = new RuntimeException("failed I have");
Validation<Exception, Long> failExpected = vdb.submit(new DB<Long>()
{
@Override public Long run(Connection c) throws SQLException
{
// all subclasses of java.lang.Exception are lifted to Validation, not just SQLException
throw rte;
}
});
assertThat(failExpected.isFail(), is(true));
assertThat(failExpected.fail(), is(rte));
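The idea of treating errors as values can be sketched with nothing but the JDK. Outcome below is a hypothetical, much reduced stand-in for Validation<Exception, A>, written only to show the principle; it is not how the library or functionaljava implement it.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for Validation<Exception, A>: holds either a failure or a success.
final class Outcome<A> {
    private final Exception failure; // null when the action succeeded
    private final A value;           // meaningful only when the action succeeded

    private Outcome(Exception failure, A value) {
        this.failure = failure;
        this.value = value;
    }

    // Runs the action and captures any java.lang.Exception as a value instead of throwing.
    static <A> Outcome<A> of(Callable<A> action) {
        try {
            return new Outcome<>(null, action.call());
        } catch (Exception e) {
            return new Outcome<>(e, null);
        }
    }

    boolean isSuccess() {
        return failure == null;
    }

    A success() {
        if (failure != null) {
            throw new IllegalStateException("no success value", failure);
        }
        return value;
    }

    Exception fail() {
        if (failure == null) {
            throw new IllegalStateException("no failure");
        }
        return failure;
    }
}
```

A caller then branches on isSuccess() instead of wrapping the call in try / catch, and the failure can be passed around, logged or transformed like any other value.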
The examples so far have the property of blocking the caller thread when we call submit / transact. For some use cases this might be unsuitable. For example, if we are in an HTTP worker thread, we don't want to hijack it while blocking on JDBC.
In such cases you can use an AsyncDbInterpreter.
It returns a CompletableFuture<A> which will be completed exceptionally iff the underlying DB throws; otherwise it will be completed successfully.
ExecutorService ex = Executors.newCachedThreadPool();
// submits DB operations using the supplied executor, returns CompletableFuture<A>
AsyncDbInterpreter async = new AsyncDbInterpreter(lift(hikariDS), ex);
CompletableFuture<Long> countFuture = async.submit(new AggregateOp("SELECT COUNT(*) FROM DUMMY"));
//blocking call, don't do this in production
Long theCount = countFuture.get();
assertThat(theCount, is(1L));
CompletableFuture<Long> failedFuture = async.submit(new AggregateOp("BLA BLA BLA"));
//blocking call, don't do in production
TestUtil.waitFor(() -> failedFuture.isCompletedExceptionally(), 5, SECONDS);
Throwable failure = getFailure(failedFuture);
assertThat(failure.getCause(), instanceOf(SQLException.class));
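The test above blocks on the futures for simplicity. In production code one would rather attach callbacks, which might look like the following sketch. It uses only the JDK; AsyncConsumptionSketch and describe are hypothetical names, and completedFuture merely stands in for a future returned by an asynchronous interpreter.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncConsumptionSketch {
    // Derives a message from the eventual count without blocking the calling thread.
    // The callbacks run once the future completes, successfully or exceptionally.
    static CompletableFuture<String> describe(CompletableFuture<Long> countFuture) {
        return countFuture
            .thenApply(count -> "rows: " + count)
            .exceptionally(error -> "query failed");
    }

    public static void main(String[] args) {
        // completedFuture stands in for the result of an asynchronous submit
        CompletableFuture<Long> countFuture = CompletableFuture.completedFuture(1L);
        describe(countFuture).thenAccept(System.out::println); // prints "rows: 1"
    }
}
```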
This chapter outlines best practices and design considerations when using sane-dbc.
Since DB is only involved with describing operations, it never performs any forking itself. This means that in a composite DB operation (pipeline), all operations composing it happen in the same thread, and in sequence, e.g.
someDbOp() // is eventually interpreted in a thread 'X'
.map(function1) //happens in thread 'X'
.bind(result -> anotherDbOp(result)) //lambda called in thread 'X', and anotherDbOp() interpreted in thread 'X'
.map(otherFunction) //thread 'X' again
Therefore the way to execute DB operations in parallel, is to submit them using one of the asynchronous interpreters, e.g.
AsyncDbInterpreter ai = ...;
CompletableFuture<A> aFuture = ai.submit(op1);
CompletableFuture<B> bFuture = ai.transact(op2);
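Once two operations have been submitted this way, their results can be combined without blocking. The sketch below uses only the JDK; ParallelSketch and summary are hypothetical names, and supplyAsync stands in for the two submit calls.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ParallelSketch {
    // Combines two independently running futures once both have completed.
    static CompletableFuture<String> summary(CompletableFuture<Long> userCount,
                                             CompletableFuture<Long> orderCount) {
        return userCount.thenCombine(orderCount,
            (users, orders) -> users + " users, " + orders + " orders");
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // supplyAsync stands in for submitting two DB operations to an async interpreter
            CompletableFuture<Long> users = CompletableFuture.supplyAsync(() -> 2L, pool);
            CompletableFuture<Long> orders = CompletableFuture.supplyAsync(() -> 5L, pool);
            // join() blocks; it is used here only so that the demo prints before exiting
            System.out.println(summary(users, orders).join()); // prints "2 users, 5 orders"
        } finally {
            pool.shutdown();
        }
    }
}
```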
One consequence of separating description and interpretation is that the result of a DB operation may be consumed on a different thread than the one producing the DB result. Therefore the result type A of a DB<A> must be thread safe.
A simple and performant way to ensure thread safety is to make all your DB return types immutable.
You may have noticed that's what we do in all the examples in this repo, and there is a good reason.
This way you are thread-safe by virtue of immutability, and no synchronization / locking / CAS takes place.
Furthermore, we strongly encourage you to use true immutable collection types, as opposed to java.util.Collections.unmodifiableXXX wrappers, so that you do not have to incur the overhead of copying. Such collection types can be found in popular libraries such as functionaljava, vavr, guava, etc.
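The reason an unmodifiable wrapper is not enough on its own can be shown with a few lines of plain JDK code (Java 10 or later is assumed for List.copyOf; the class name is made up for this illustration). The wrapper is only a read-only view, so making it safe to share requires a defensive copy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class UnmodifiableViewSketch {
    // Returns {size seen through the wrapper, size of the copy} after the backing list grows.
    static int[] sizesAfterMutation() {
        List<String> backing = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> view = Collections.unmodifiableList(backing); // read-only view, no copy
        List<String> snapshot = List.copyOf(backing);              // independent immutable copy
        backing.add("c"); // whoever still holds the backing list can mutate it
        return new int[] {view.size(), snapshot.size()};
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterMutation();
        System.out.println(sizes[0]); // prints 3: the view reflects the mutation
        System.out.println(sizes[1]); // prints 2: the copy does not
    }
}
```

Persistent collection types avoid this copy, because "modified" versions share structure with the original instead of duplicating it.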
sane-dbc is already made to work well with functionaljava. If you need to integrate another collection framework, that's fairly straightforward:
- All the operations in the library dealing with collections take an Iterable<A> and therefore do not assume mutability
- You can make a SelectOp return any collection type, by either supplying a custom CanBuildFrom instance, or subclassing AbstractSelectOp, which does not even assume an iterable, but works with any type
A thing worth noting is that while the final return type of your (potentially) composite DB should be immutable, there is nothing preventing you from utilizing mutable data types in the intermediate operations in the pipeline. This is because of the guarantee that all operations in the pipeline happen in a single thread. This is showcased in this example.
If you are writing REST services or something similar, chances are you will need to serialize your DB result to JSON.
For functionaljava collections, we recommend utilizing jackson-module-fj.
Similar functionality exists for other immutable collection libraries.
To serialize / deserialize immutable beans / domain objects, one uses jackson-modules-java8.
If you are using jackson-module-fj, it registers jackson-modules-java8 automatically for you, and everything will work out of the box.
Creating table definitions, altering them, inserting predefined data (such as, say, the default system user in a USERS table), and migrating existing data to reflect table alterations, are a bit different from regular DB code:
- Each instance of such actions should happen at most once;
- They should usually happen at application startup time, or at the first time the database is accessed;
- Their failure often indicates a fatal problem, and the application should shut down forcefully instead of proceeding.
Due to these considerations, we recommend you keep such actions separate from the rest of your code, and that you utilize a library which knows how to handle database migrations. We have found that flyway does the job well. Here is an example which shows how to use flyway in your code.
Implementing your own interpreter is very straightforward. There is no specific interface to implement, but the user will usually expect these two methods:
- submit, which executes the DB non-transactionally
- transact, which executes the DB transactionally, i.e. commits at the end of the pipeline and rolls back in case of failure
If you need to implement your own interpreter, it is best to look at some existing ones:
We recommend you reuse the existing logic around commit / rollback, as is done in the examples above.
An interpreter interface cannot be expressed in Java in a typesafe way. Conceptually, an interpreter is parametric on two things:
- The result type of the interpretation, say Future, CompletableFuture, Validation, etc., which itself is type-parametric
- The result type of the DB<A> being interpreted, namely A
Since the type of the interpretation is itself type-parametric, we need higher-kinded types to express it. Java does not have this feature, therefore the interface does not exist in the library.
If this were Scala, we would write something along the lines of
trait DbInterpreter[F[_]] {
def submit[A](db: DB[A]) : F[A]
def transact[A](db: DB[A]) : F[A]
}
That is, for a fixed thing F which takes a type parameter, and a DB<A> being interpreted, return an F<A>.