CSV 2 JSON

Library to convert a CSV into a Python dict. The CSV headers are used as paths to locations in the target dict.

Path expression

A path is split on dots (.) into list indices and dict keys. If all the sub-keys of a collection are made of digits, the collection is assumed to be a list. Otherwise it is a dict.

Note: List indices don't have to be contiguous or ordered.

Sample:

"a.0,a.5,a.3"

will translate into:

{
    "a": [None, None, None, None, None, None]
}

But,

"a.0,a.5,a.3,a.b"

will translate into:

{
    "a": {
        "0": None,
        "5": None,
        "3": None,
        "b": None,
    }
}

Note: Lists may contain sub-lists and sub-dicts at any level.
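
The list-versus-dict rule can be illustrated with a small standalone sketch. This is not the library's implementation: `build_skeleton` is a hypothetical helper, and sizing each list as "highest index + 1" is an assumption made for the illustration.

```python
def build_skeleton(headers):
    """Build a nested dict/list skeleton from dotted header paths (illustrative only)."""
    # First pass: collect every path into a tree of plain dicts.
    tree = {}
    for header in headers:
        node = tree
        for key in header.split("."):
            node = node.setdefault(key, {})

    def convert(node):
        # A node without sub-keys is a leaf: no value yet.
        if not node:
            return None
        # All sub-keys are made of digits: treat the collection as a list.
        if all(key.isdecimal() for key in node):
            size = max(int(key) for key in node) + 1  # assumed sizing rule
            items = [None] * size
            for key, child in node.items():
                items[int(key)] = convert(child)
            return items
        # At least one non-digit sub-key: the collection is a dict.
        return {key: convert(child) for key, child in node.items()}

    # The top level is always kept as a dict in this sketch.
    return {key: convert(child) for key, child in tree.items()}


print(build_skeleton(["a.0", "a.1", "b.x"]))
print(build_skeleton("a.0,a.5,a.3,a.b".split(",")))
```

A single non-digit sub-key such as `b` is enough to turn the whole collection into a dict, in which case the digit keys are kept as strings.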

Options

By default the input type is preserved, meaning no type conversion takes place. Several options may be used to alter the output:

fill_value

Specifies the value used to populate empty indices in a list.

Sample:

options = {
    "abc": {"fill_value": "?"}
}
headers = "abc.3,abc.1"

will produce the minimal output:

{
    "abc": ["?", "?", "?", "?"]
}
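
As a rough illustration of how gaps in a sparse list might be filled, here is a standalone sketch. `fill_sparse` is a hypothetical helper, not part of the library, and the "highest index + 1" length is an assumption consistent with the four-element output above.

```python
def fill_sparse(values_by_index, fill_value=None):
    """Place values at their indices and pad every gap with fill_value."""
    if not values_by_index:
        return []
    size = max(values_by_index) + 1  # assumed: list spans index 0 to the highest index
    return [values_by_index.get(i, fill_value) for i in range(size)]


print(fill_sparse({3: "d", 1: "b"}, fill_value="?"))
```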

infer_type

Tells the transcoder to try to infer the type of the data and cast it accordingly.

Sample:

options = {
    "abc": {"infer_type": True},
    "ghi": {"infer_type": True},
}
headers = "abc,def,ghi"
data = ["1", "2", "g,h,i"]

will produce the output:

{
    "abc": 1,
    "def": "2",
    "ghi": ["g", "h", "i"]
}

Supported types:

  • int
  • bool
  • array
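
A minimal sketch of what such inference could look like for the three supported types. The rules here are assumptions chosen for illustration (the accepted bool spellings, the comma separator, and array items left as strings); `infer` is a hypothetical helper and the library's actual rules may differ.

```python
def infer(value):
    """Guess a Python type for a raw CSV string (illustrative rules only)."""
    text = value.strip()
    # int: anything Python's int() accepts.
    try:
        return int(text)
    except ValueError:
        pass
    # bool: the literals "true" / "false", case-insensitive (assumed spelling).
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    # array: split on commas (assumed separator); items stay strings.
    if "," in text:
        return text.split(",")
    # Anything else is returned unchanged.
    return value


print(infer("1"), infer("true"), infer("g,h,i"), infer("hello"))
```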

render

A callable that returns the value to use in place of the original input value.

Sample:

options = {
    "abc": {"render": lambda _a, _b: "?"}  # parameter names must be distinct
}
headers = "abc,def"
data = ["1", "2"]

will produce the output:

{
    "abc": "?",
    "def": "2"
}

optional

A callable used to decide whether the entry should be removed from the output. In the sample below, len returns 0 for the empty string, so the abc entry is dropped.

Sample:

options = {
    "abc": {"optional": len}
}
headers = "abc,def"
data = ["", ""]

will produce the output:

{
    "def": ""
}

multi-level

The optional option may affect the output structure on several levels.

Sample:

headers = "abc.0,foo.0,foo.1".split(",")
options = {
    "abc.0": {"optional": len},
    "foo": {"optional": len},
    "foo.0": {"optional": len},
    "foo.1": {"optional": len},
}
assert headers2template(headers, options=options).render_as_dict(["", "", ""]) == {
    "abc": []
}
# "foo" does not appear in the output because of 3 factors:
# - all its sub-items are optional
# - all its sub-items are dropped
# - it is also optional
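
The cascading behaviour can be sketched as a bottom-up pruning pass. This is a standalone illustration, not the library's code: `prune` and `_join` are hypothetical helpers, and the rule "drop the entry when the callable returns a falsy value" is an assumption inferred from the use of `len` in the samples.

```python
_DROP = object()  # sentinel marking an entry that must be removed


def _join(path, key):
    """Extend a dotted path with one more key."""
    return f"{path}.{key}" if path else key


def prune(node, options, path=""):
    """Recursively drop entries whose "optional" callable returns a falsy value."""
    # Children are pruned first, so a parent is checked against what is left of it.
    if isinstance(node, dict):
        pruned = ((key, prune(child, options, _join(path, key))) for key, child in node.items())
        node = {key: child for key, child in pruned if child is not _DROP}
    elif isinstance(node, list):
        pruned = (prune(child, options, _join(path, str(i))) for i, child in enumerate(node))
        node = [child for child in pruned if child is not _DROP]
    check = options.get(path, {}).get("optional")
    if check is not None and not check(node):
        return _DROP
    return node


options = {
    "abc.0": {"optional": len},
    "foo": {"optional": len},
    "foo.0": {"optional": len},
    "foo.1": {"optional": len},
}
print(prune({"abc": [""], "foo": ["", ""]}, options))
```

In this sketch "abc" survives as an empty list because only its item is optional, while "foo" disappears because it is itself optional and ends up empty.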

JSON to headers

The function json2csv_headers may be used to work out what the headers of a CSV input could be, based on a JSON document. It takes only a JSON string and returns a list of headers and a list of values extracted from that string.

Sample:

json2csv_headers('{"a": "true","b": null}') == (["a", "b"], ["true", None])
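
The flattening idea can be sketched with the standard library alone. `flatten_json` below is a hypothetical stand-in written for illustration, not the library's json2csv_headers, and its handling of edge cases such as empty collections is an assumption.

```python
import json


def flatten_json(text):
    """Return (headers, values) for every scalar leaf found in a JSON string."""
    headers, values = [], []

    def walk(node, path):
        if isinstance(node, dict):
            for key, child in node.items():
                walk(child, path + [str(key)])
        elif isinstance(node, list):
            for index, child in enumerate(node):
                walk(child, path + [str(index)])
        else:
            # Scalar leaf: the dotted path becomes the header.
            headers.append(".".join(path))
            values.append(node)

    walk(json.loads(text), [])
    return headers, values


print(flatten_json('{"a": 1, "b": ["toto", {"test": "bar"}, 1, 3]}'))
```

Dict keys and list indices are joined with dots, which mirrors the path expression used for CSV headers.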

CLI

csv2json

The package provides a handy CLI command to turn CSV inputs into JSON outputs. Type csv2json -h for usage.

json2csv

Use the '-r' flag of the csv2json command to extrapolate, from a JSON input, what the headers might be.

Sample:

echo '{"a": 1, "b": ["toto", {"test": "bar"}, 1, 3]}' | csv2json -r -
headers: a,b.0,b.1.test,b.2,b.3
values: 1,toto,bar,1,3