Husk was designed to be used directly by Webservice and API based applications without the need for an external storage provider.
- What is HuskDB?
- Setup and Installation
- Creating Table/Structure
- Preparing Seed Data
- Operations
- Create
- Read
- Update
- Delete
- Calculate
- Working with Collections
- Advantages over traditional SQL
- hsk Tags
- Benchmarks
Husk database is an embedded, object-oriented data store which uses Go to interact with records. It is meant to be used by small webservices which fully control the data it hosts.
This engine attempts to force users to keep business logic close to their required objects, and minimizes entry points and chances for loop holes. ie. when a developer accesses or modifies data outside of the intended scope.
This includes many sources;
- Directly modifying data via an external tool. (SQL Server Management Studio, WorkBench, Toad, etc.)
- Able to access core "Database" API in higher-level code, rather than the Logic Layer as intended. (Front-end, Public facing API End-points)
- Editing the data file directly.
Creation timestamps sort records internally, and thereafter their traditional "ID". The 'Key' is created from the combination of timestamp and id "0`0". The Key allows the index to sort the records by creation date, by default. This creates faster access to the most recent records and removes need for sorting after every query.
The database engine works similar to ISAM, as it stores data on a sequential "tape" which lives on disk and held in memory. Husk uses an index with pointers to the actual location on the tape for faster access.
As HuskDB is a library, which can simply be imported into any Go project.
To download the Husk module, run;
$ go get -u github.com/louisevanderlith/husk
After installing the latest module, all that is required is to create a Table/Structure for your data. The following section will provide information on setting up the Table.
More usage examples can be found under the /tests folder.
- Define data object Any type which has the 'Valid()' function, qualifies as a data object, and can be used as a Tabler.
package sample
import "github.com/louisevanderlith/husk"
//Person Data Record
type Person struct {
Name string `hsk:"size(50)"`
Age int
Accounts []Account
}
//Valid - To qualify as a Data Record,
//a struct MUST have a Valid function
func (o Person) Valid() (bool, error) {
return husk.ValidateStruct(&o)
}
In this sample, we create a "Person" object which holds Name, Age and Account information.
The Name property can only have a value which has a maximum length of 50characters. More information in hsk
tags can be found later in this document.
- Create a context for the structure
package sample
import (
"github.com/louisevanderlith/husk"
"github.com/louisevanderlith/husk/serials"
)
//Context holds the Tables we want to access
type Context struct {
//People table
People husk.Tabler
}
// NewContext returns the context object with Table values initialized
func NewContext() Context {
result := Context{}
//Creats a new "Person" Table, with GobSerializer
result.People = husk.NewTable(Person{}, serials.GobSerial{})
return result
}
We now have a context which can be utilised by our application
Husk is able to import seed data via a JSON file.
- Create seed file (people.seed.json) in a folder named 'db'
- Populate seed data.
Seed data should be provided as a JSON array.
Example;
[ { "Name": "Charlie", "Age": 23, "Accounts": [{"Number": 1234500,"Balance": 0.10, "Transactions": []}] }, { "Name": "Mike", "Age": 48, "Accounts": [{"Number": 1234501,"Balance": 99.10, "Transactions": []}] } ]
- Call the Seed function on structures to populate the table
func (ctx Context) Seed() {
//Seed files can be specified, so that we have data to boot.
err := ctx.People.Seed("people.seed.json")
if err != nil {
panic(err)
}
ctx.People.Save()
}
- Create Records can be created by providing an object to the Create function. Husk also supports .CreateMulti which can be used to create many records at once.
p := sample.Person{Name: "Jimmy", Age: 25}
//Send the object to the context for creation
set := ctx.People.Create(p)
if set.Error != nil {
t.Error(set.Error)
}
//Persist the changes
ctx.People.Save()
- Read
Records can be located using various methods:
- By Key
This is the fastest way to locate a record
//parse the string representation of the key to an actual husk.Key
k, err := husk.ParseKey("0`0")
//find the record with the key
rec, err := ctx.People.FindByKey(key)
- By Filter
Filters can be defined as functions which return true/false depending on the conditions. Husk also provides a default filter "husk.Everything" which can be used to return all records.
type personFilter func(obj Person) bool
func (f personFilter) Filter(obj husk.Dataer) bool {
return f(obj.(Person))
}
func ByName(name string) personFilter {
return func(obj Person) bool {
return obj.Name == name
}
}
Operations that find records always have to supply page size and index. FindFirst is a shortcut for Find(1,1, filter)
//Find People 'ByName', but I only want the first 3 matches
result := ctx.People.Find(1, 3, ByName("Jimmy"))
//Find 'All' People, but I only want the first 3 matches
result = ctx.People.Find(1, 3, husk.Everything())
- Update
//First find the desired record, in this case "ByKey"
p, _ := ctx.People.FindByKey(key)
p.Age = 87
ctx.People.Update(p)
//Persist the changes
ctx.People.Save()
- Delete
Records can be deleted directly from the database by providing the key of the record to be removed.
//Provide the key to the delete function
err := ctx.People.Delete(key)
//Persist the changes
ctx.People.Save()
- Calculate
This function can be used in multiple ways to generate datasets. There is no concept of "SELECT" in Husk, but it still provides a way of creating custom datasets. When thinking in terms of traditional SQL, this would be a Stored Procedure.
Calculators can be defined in the same way as Filters, as they function in exactly the same manner. For this reason, we can chain Filters and Calculators to narrow result sets.
type personCalc func(result interface{}, obj Person) error
// Calc can take a pointer to a result, and update it with values found in the data obj
func (f personCalc) Calc(result interface{}, obj husk.Dataer) error {
return f(result, obj.(Person))
}
In the following example we want to get the Total value of a Person's Accounts.
// SumBalance will return the SUM Balance of a Person's Accounts
func SumBalance() personCalc {
return func(result interface{}, obj Person) error {
answ := float32(0)
for _, acc := range obj.Accounts {
answ += acc.Balance
}
totl := result.(*float32)
*totl += answ
return nil
}
}
This example shows how we can get the Lowest Valued Account
func LowestBalance() personCalc {
min := float32(9999999)
return func(result interface{}, obj Person) error {
// We can utilise previously defined functions
balnce := float32(0)
err := SumBalance(&balnce)
if balnce < min {
min = balnce
n := result.(*string)
*n = obj.Name
}
return nil
}
}
After finding records, you may want to iterate over them to perform other operations. Husk provides a way of enumerating these collections.
//Find 'Everything', but I only want the first 3
result := ctx.People.Find(1, 3, husk.Everything())
//Gets the iterable collection
rator := result.GetEnumerator()
//Moves to the next item in the collection, until there isn't anything else
for rator.MoveNext() {
curr := rator.Current()
someone := curr.Data().(sample.Person)
log.Printf("$v\n", someone)
}
Husk doesn't use any Query language, and all data manipulation happens via the code. The immediately avoids SQL injection attacks, and narrows the points of entry.
- No SELECT, Datasets are 'Calculated' and relates more to Stored Procedures, but written in Go.
- No WHERE Filter using Go functions. Functions can be easily tested and compile with the application.
- Index Keys are separate from Data, and should only be used to Identify records and create relationships across micro-services.
- The database stores records in descending order of creation.
- The Primary Key consists of a Timestamp and ID. Quite husk.Key
- Querying collections always require the 'pagesize' (current page and results per page) to be specified. This forces the developer to always limit the amount of data returned, thus reducing query times.
- Database embedded into application, no externally hosted services. Greatly reduces the attack surface of the data.
- Doesn't require a "ConnectionString"
- TDD and Unit Tests can use Husk to create tables that work with the logic. Object-Oriented approach to working with data.
- Tables can be mapped directly to JSON structures. Big JSON files can be easily analysed.
- Seed files are JSON documents, so other systems can easily be migrated.
- Objects are "Serialization" aware, and columns like 'Password' can easily be hidden using
json:"-"
- Everything related to an object will always remain nested within that object.
- Database will only consist of one or two tables, micro-services should only control one part of a system.
Husk supports a limited set of tags which can be added to properties.
hsk:"null"
- This property may be empty, and is not required.hsk:"size(50)"
- Limits the length of the value to 50 characters.
History:
Please note these numbers come from our Sample_ETL test, which inserts the same record(16kb) for 20seconds.
This function has since been deprecated, and a better benchmark is yet to be written.
-
v1.0.1 Write: 138rec/s Every record saved creates a new file. Very slow read, and write.
-
v1.0.2 Write: 509rec/s (x3.6 improvement) One file stores all records. Greatly improves reads.
-
v1.0.3 Write: 1463rec/s (x3 improvement) Index file will only be updated on disk when the context gets saved.
-
v1.0.4 Write: 1221rec/s (x1 improvement)
File operations improved.
-
v1.0.5 Write: 2315rec/s (x2 improvement)
Indexing logic refactored and Keys changed to Pointers. Improved reading.
-
v1.0.6 Write: 4314rec/s (x2 improvement)
Keys are no longer Pointers.
-
v1.0.7 Write: Unknown
General improvements
Average Write Performance:
- MAC 3167rec/s (Unicorn Power)
- WINDOWS 2315/rec/s (Spinning Disk, AMD)
- LINUX 2289rec/s (SSD, Intel i5(2nd gen))