Erlang NIF CSV parser and writer

Version: 0.2.0

Authors: Hynek Vychodil (hynek.vychodil@altworx.com).

See also: ecsv.

ecsv is fast NIF parser and writer based on libcsv

The main purpose of the module is the fast parsing of CSV data in GB volumes. This requirement leads to the necessity of stream oriented API (see ecsv:parse_stream/4).

Building

Application requires development files for libcsv version 3.0.x (tested with 3.0.3). For example in debian you need run

apt-get install libcsv3 libcsv-dev

For building you need installed rebar3.

rebar3 compile

And run tests using

rebar3 eunit

Running

For interactive Erlang shell use

rebar3 shell

Try as starter

1> ecsv:parse(<<"Hello,World">>).
[{<<"Hello">>,<<"World">>}]

See ecsv for API description.

Known performance caveats

ecsv:write/1 and ecsv:write_lines/1 doesn't perform as expected. For an unknown reason, it is slower than pure Erlang implementation when compiled using HiPE and used in a real application. On the other hand, ecsv:parse_stream/5 met expectations and performs around 100MB/s on commodity HW (i7 2.6GHz).

Current implementation uses enif_make_new_binary() for parsed fields. From our experience, this call allocates small binaries on the process heap in contrast to enif_alloc_binary() always allocates on the binary heap which is slightly slower. Fields from currently parsed line is kept between NIF calls in an own environment which could lead to bad behavior when parse_raw/3 is called with very short binaries and there is a long row with many fields. In an extreme case, one long line with short or in worst case empty fields will lead to quadratic behavior if fed one byte a time. If you would like parse CSV file with more than 20kB rows with thousands of fields you should probably use another parser or fix the issue.

Modules

ecsv

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
c_src		c_src
doc		doc
src		src
test		test
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
rebar.config		rebar.config
rebar.lock		rebar.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Erlang NIF CSV parser and writer

Building

Running

Known performance caveats

Modules

About

Releases

Packages

Contributors 2

Languages

License

altworx/ecsv

Folders and files

Latest commit

History

Repository files navigation

Erlang NIF CSV parser and writer

Building

Running

Known performance caveats

Modules

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages