Add JSON form of the table #305
Comments
I prefer to use CSV, and think it is simpler for many cases. (I do use the CSV myself.) However, it is also true that in some programming languages JSON will be better, so having both can be helpful. One problem with using JSON for contributions is the lack of a trailing comma, meaning that the records do not all have the same format. Another note about JSON is that it does not properly have 64-bit integers; although JavaScript has an integer type now, JSON predates it.
If JSON is implemented, ensure that the CSV is generated in the same format it is now (including spacing, although a column may be made wider if necessary), possibly adding additional columns at the end if that turns out to be necessary. The CSV should be plain ASCII (no non-ASCII characters) and should not contain any quotation marks; if you convert JSON to CSV, ensure that this is the case, even though the JSON will definitely include quotation marks and possibly non-ASCII characters. (If it includes long descriptions, they will need to be truncated somehow, or maybe it is better to have a separate field for a short description.)
If you do want to switch to JSON like that, you will first need to convert the CSV to JSON, so that it will all be JSON, and then you can convert JSON to CSV. (Then you can check that the output matches the original, to confirm that the converter does not have a bug. Fortunately, you have a version control system, so you can revert it if something goes wrong.)
The program will have to check that the JSON is valid. If the JSON includes the varint code (as suggested in #297), then it will have to check that the varint code matches the numeric code. This should be easy enough to implement. (Some programs might use the numeric codes directly, such as the |
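The varint check mentioned above could look roughly like the following. This is only a sketch: the thread does not say how the `varint` field would be written, so the code assumes it holds the lowercase hex of the unsigned-varint (LEB128) bytes, and `encode_varint` and `check_entry` are hypothetical helper names.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    if value < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits of what is left
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)         # last byte: high bit clear
            return bytes(out)


def check_entry(entry: dict) -> None:
    """Raise ValueError if the entry's varint does not match its code.

    Assumption: "code" is a hex string such as "0x04" and "varint" is the
    lowercase hex of the encoded bytes. The real format is undecided.
    """
    code = int(entry["code"], 16)
    expected = encode_varint(code).hex()
    if entry["varint"] != expected:
        raise ValueError(
            f"{entry['name']}: varint {entry['varint']!r} does not match "
            f"code {entry['code']} (expected {expected!r})"
        )
```

Codes below 0x80 encode to a single byte equal to the code itself, so the check only becomes interesting for larger codes.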
2023-01-03 IPLD triage conversation: |
Here's a rough outline that's in my head from the various evolved conversations about this:
[
{ "name": "identity", "tag": "multihash", "code": "0x00", "varint": "0", "status": "permanent", "description": "raw binary" },
{ "name": "cidv1", "tag": "cid", "code": "0x01", "varint": "1", "status": "permanent", "description": "CIDv1", "ref": [ "https://github.com/multiformats/cid" ] },
{ "name": "cidv2", "tag": "cid", "code": "0x02", "varint": "2", "status": "draft", "description": "CIDv2", "ref": [ "https://github.com/multiformats/cid" ] },
{ "name": "cidv3", "tag": "cid", "code": "0x03", "varint": "3", "status": "draft", "description": "CIDv3", "ref": [ "https://github.com/multiformats/cid" ] },
{ "name": "ip4", "tag": "multiaddr", "code": "0x04", "varint": "4", "status": "permanent" }
] |
That looks to me like it can work; it seems good enough (the lack of a trailing comma is a bit messy because the records then do not all have the same format, although I suppose there is no satisfactory way to avoid that). The reference seems a good idea, too. However, please ensure that the CSV file contains no quotation marks, no commas within the data of a field (commas are only between fields), and no non-ASCII characters. (If the JSON contains strings with commas, quotation marks, and/or non-ASCII characters that would appear in the CSV file, then the conversion would need to change them somehow.) Also, how are those fields encoded? The |
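A minimal sketch of a JSON-to-CSV step that enforces these constraints is below. The column list and the function names are assumptions made for illustration, and the sketch rejects offending values instead of rewriting them, since the thread leaves open how they should be changed. It also does not reproduce the existing column spacing.

```python
import json

# Assumed column order; the real CSV layout may differ.
FIELDS = ("name", "tag", "code", "status", "description")


def to_csv_field(value: str) -> str:
    """Return the value unchanged if it is safe for the quote-free CSV."""
    if not value.isascii():
        raise ValueError(f"non-ASCII character in {value!r}")
    if '"' in value or "," in value or "\n" in value:
        raise ValueError(f"quote, comma, or newline in {value!r}")
    return value


def json_to_csv(text: str) -> str:
    """Convert a JSON array of entries to comma-separated lines."""
    entries = json.loads(text)
    lines = [",".join(FIELDS)]
    for entry in entries:
        # Missing optional fields become empty cells.
        cells = (to_csv_field(str(entry.get(field, ""))) for field in FIELDS)
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"
```

Failing loudly keeps the decision about truncating or substituting characters with the person editing the JSON, which seems safer than a converter that silently alters descriptions.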
One thing to note: it might be useful to use ndjson for streaming parsing. |
Maybe? ndjson is nice, but it makes just loading the whole lot pretty inconvenient. I quite like streaming parsing, and it might be neat if this gets huge, but perhaps consumers would get annoyed that they can't just dump it through a standard JSON parser as it is. One nice thing ndjson would give us is strict enforcement of one-entry-per-line, which is what I'd like to see. |
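To make the trade-off concrete, here is a rough sketch of what reading an ndjson table might involve. `parse_ndjson` is a hypothetical helper, not part of any proposal in this thread; it shows both the extra loop a consumer needs and how one-entry-per-line falls out of the format.

```python
import json


def parse_ndjson(text: str) -> list:
    """Parse newline-delimited JSON, requiring one object per line."""
    entries = []
    # Split on "\n" only; str.splitlines() would also split on characters
    # that may legally appear inside a JSON string.
    for lineno, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # tolerate blank lines, including the trailing one
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            # An entry spread over several lines fails here, which is the
            # one-entry-per-line enforcement.
            raise ValueError(f"line {lineno}: {exc}") from exc
        if not isinstance(entry, dict):
            raise ValueError(f"line {lineno}: expected a JSON object")
        entries.append(entry)
    return entries
```

A consumer with only a standard JSON parser would need this kind of loop, whereas a plain JSON array loads with a single `json.loads` call.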
Initial proposal for some feedback: #311 |
I can't find the issue(s) where we discussed this but there's been a proposal on the table for a while that we represent the multicodec table as JSON for easier consumption and more flexibility. I'd like to switch it so that the CSV is generated from the JSON and people contribute to the JSON. A JSON table would let us add more items, like #297, and longer descriptions, and overall much more flexibility for entries and ease of downstream consumption.
This needs a PR to propose the format. Does it need to be line-delimited JSON? Can it be fully pretty-printed? What does linting look like? What does CSV generation look like?
Anyone want to give this a go?
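As a starting point for the linting question, a check over the parsed entries might look something like this. The rules here (required keys, lowercase hex codes, unique names and codes) are assumptions for illustration, and `lint` is a hypothetical function; the actual rules would be settled in the PR.

```python
import re

# Assumed required keys, based on the fields discussed in this thread.
REQUIRED = ("name", "tag", "code", "status")
CODE_RE = re.compile(r"0x[0-9a-f]+")


def lint(entries: list) -> list:
    """Return a list of human-readable problems; empty means clean."""
    problems = []
    seen_names, seen_codes = set(), set()
    for index, entry in enumerate(entries):
        missing = [key for key in REQUIRED if key not in entry]
        if missing:
            problems.append(f"entry {index}: missing {', '.join(missing)}")
            continue
        if not CODE_RE.fullmatch(entry["code"]):
            problems.append(f"{entry['name']}: code is not lowercase hex")
            continue
        code = int(entry["code"], 16)
        if entry["name"] in seen_names:
            problems.append(f"{entry['name']}: duplicate name")
        if code in seen_codes:
            problems.append(f"{entry['name']}: duplicate code {entry['code']}")
        seen_names.add(entry["name"])
        seen_codes.add(code)
    return problems
```

Returning a list instead of raising on the first problem lets a CI job report every issue in a contribution at once.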