GitHub - yassine/avrodite: A fast, lightweight, POJO driven, Avro serialization/deserialization library

Avrodite

A fast, lightweight, POJO driven, Apache Avro serialization/deserialization library.

Motivation

While designing this library, the following concerns were in mind:

POJO Driven: This is particularly useful for an already existing application that have a set of plain JAVA objects as API. We tried the avro core implementation, but the library adds and additional layer of indirection/de-referencing. Moreover, the mapping between the Record objects/dictionaries and the existing api can quickly turn into a fastidious boilerplate to maintain.
Generics and complex types: Existing JAVA API models often leverage the typing features that the core language offers. We need a library that would support any type, including types with generics.
Fast: Latency is a critical requirement for modern applications and real-time event processors. Having a library that performs at high standards, is definitely a criteria for its adoption.
Low memory footprint: Holding pressure from the heap by minimizing the created objects during the serialization / deserialization process support applications towards meeting their SLA. We need to have lowest possible memory footprint and offer an API for re-using existing objects when applicable.

Current Status

Currently the library fulfills most of its design objectives and is performing better than many widely adopted libraries. It has full support for Java generics and complex types (We've stress-tested it with fields of this kind List<Map<String, Event<Map<String, String>, Model<A,B>>>> and it passes).

From a performance perspective, Avrodite performs 3x-4x better than its closest mate, between 5x and 15x than Avro core Record API (See benchmarks reports).

Regarding usability, the interaction with Avrodite API would consist of something like so:

//API configuration would happen in your dependency injection layer 
Avrodite<AvroStandard, AvroCodec<?>> avrodite = AvroStandardV19.avrodite()
                                                      .build();                                                               
//once you catch an avrodite instance you can get a codec for a given target as follows:
AvroCodec<Model> codec = avrodite.getCodec(Model.class);

//serialization and deserialization is then trivial
byte[] avroData = codec.encode(new Model());
Model decodedModel = codec.decode(avroData);

//When needed, you can get your schema from the codec instance like so:
Schema schema = codec.getSchema();

Modules description

The library is modular and was designed with the idea of implementing other serialization formats in the future (JSON for example). Currently the modules consists of :

avrodite-api: The global API that projects would depends on for serialization/deserialization.
avrodite-avro: An extension and implementation of the public API that provides AVRO custom API (api access to Schema instances for example).
avrodite-tools: Mainly a compile-time/build phase dependency that contains the necessary logic for introspecting your beans and plain objects API.
avrodite-tools-avro: An avrodite-tools plugin that generate custom classes for the AVRO binary format.
avrodite-avro-maven-plugin: A maven plugin that abstracts away from you the avrodite-tools stack to generate your codecs classes during the build phase.
avrodite-avro-benchmarks: A test module that benchmarks the AVRO implementation against other libraries.

Benchmark Results Snapshot

Refer to this document for detailed results.

Throughput

(Higher is better)

Framework	T1 throughput [ ops/ms ]	T1 relative perf.	T2 throughput [ ops/ms ]	T2 relative perf.
avrodite	1858	100.00%	2208	100.00%
protocolBuffers	585	31.47%	589	26.67%
avroCoreNoHydration	132	7.08%	501	22.67%
avroCoreWithHydration	102	5.48%	239	10.82%
jacksonAvro	87	4.68%	158	7.14%
jacksonJSON	88	4.74%	89	4.02%

Heap Usage

(Lower is better)

Framework	T1 Heap Allocation Rate [ Byte/op ]	T1 relative perf.	T2 Heap Allocation Rate [ Byte/op ]	T2 relative perf.
avrodite	1024	100.00%	1024	100.00%
avroCoreNoHydration	2248	219.53%	2248	219.53%
avroCoreWithHydration	4840	472.66%	4840	472.66%
protocolBuffers	6088	594.53%	6088	594.53%
jacksonJSON	9472	925.00%	9472	925.00%
jacksonAvro	16264	1588.28%	16296	1591.41%

Roadmap

The following features are planned for the future:

Enums support (easy).
Avro Union types (can be useful when you have a supertype with various children) (easy/medium).
Map a model API evolution to Schema migration (most likely hard).
Support other serialization formats: A good portion of the library (types introspection, codec compilation) can be re-used to support other formats such as JSON. (average difficulty, depends on format)

Licence

Licensed under the Apache License, Version 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 57 Commits
.mvn		.mvn
avrodite-api		avrodite-api
avrodite-avro-benchmarks		avrodite-avro-benchmarks
avrodite-avro-maven-plugin		avrodite-avro-maven-plugin
avrodite-avro		avrodite-avro
avrodite-pages		avrodite-pages
avrodite-tools-avro		avrodite-tools-avro
avrodite-tools		avrodite-tools
.editorconfig		.editorconfig
.gitignore		.gitignore
.travis.yml		.travis.yml
LICENSE		LICENSE
README.md		README.md
lombok.config		lombok.config
pom.groovy		pom.groovy
sonar-ci.properties		sonar-ci.properties

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Avrodite

Motivation

Current Status

Modules description

Benchmark Results Snapshot

Throughput

Heap Usage

Roadmap

Licence

About

Releases 1

Packages

Languages

License

yassine/avrodite

Folders and files

Latest commit

History

Repository files navigation

Avrodite

Motivation

Current Status

Modules description

Benchmark Results Snapshot

Throughput

Heap Usage

Roadmap

Licence

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages