This is likely going to be an open question for a while, but these are my current thoughts. All input is welcome.
I feel like, by and large, data collected from lab instruments can sensibly be converted to primitive data types. The most common types I have in mind are NumPy arrays and pandas data frames. Both of these can be represented easily with primitive data types.
There are, however, cases where data will be collected that cannot be converted to a primitive type.
In the new `cbor` branch, I've added a section to the JSON encoder that will base64-encode Python `bytes` objects. I've correspondingly included a Marshmallow `Bytes` field to handle validating binary data in this format. It populates the documentation with information about the string values being a base64-encoded block of binary data. Everything is fine on that front.
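As a rough illustration of the idea (not the actual code on the branch), a JSON encoder that base64-encodes `bytes` values can be sketched with the standard library alone. The class name `BytesJSONEncoder` and the example payload are hypothetical.

```python
import base64
import json


class BytesJSONEncoder(json.JSONEncoder):
    """Hypothetical encoder: serialises bytes values as base64 text."""

    def default(self, o):
        if isinstance(o, (bytes, bytearray)):
            # JSON has no binary type, so emit an ASCII base64 string instead
            return base64.b64encode(bytes(o)).decode("ascii")
        # Anything else falls through to the default behaviour (TypeError)
        return super().default(o)


# Hypothetical payload containing a small binary blob
payload = {"name": "spectrum", "data": b"\x00\x01\x02\xff"}
encoded = json.dumps(payload, cls=BytesJSONEncoder)

# A client reverses the step by base64-decoding the string it receives
decoded = base64.b64decode(json.loads(encoded)["data"])
```

The client has to know from the documentation that the `data` string is base64, which is the role the description generated by the `Bytes` field plays.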
However, as @rwb27 has mentioned in the past, sometimes the binary data collected will be big enough that the base64 encoding overhead could become problematic. To handle these cases, I've included support for clients to accept `application/cbor` responses instead of `application/json`.
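To put a number on that overhead: base64 emits 4 characters for every 3 input bytes, so the encoded payload is about a third larger than the raw data. A minimal check, where the 3 MB buffer is only a stand-in for a real capture and `base64_size` is a helper invented for this example:

```python
import base64


def base64_size(n_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of n_bytes bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters
    return 4 * ((n_bytes + 2) // 3)


raw = bytes(3_000_000)  # stand-in for a 3 MB binary capture
encoded = base64.b64encode(raw)

# Ratio of encoded size to raw size, roughly 1.33
overhead = len(encoded) / len(raw)
```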
CBOR has built-in support for binary data, so if a client requests a CBOR response, no encoding overhead is introduced. The data is passed directly into the CBOR response, which is otherwise identical to the JSON response except that the binary section is not base64-encoded.
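For a sense of why CBOR adds almost nothing here: a CBOR byte string (major type 2 in RFC 8949) is just a short length header followed by the raw bytes. The hand-rolled encoder below is only a sketch to show the framing. A real server would presumably rely on a CBOR library, and `cbor_byte_string` is a name made up for this example.

```python
import struct


def cbor_byte_string(data: bytes) -> bytes:
    """Encode bytes as a definite-length CBOR byte string (major type 2)."""
    n = len(data)
    major = 2 << 5  # major type lives in the top three bits of the first byte
    if n < 24:
        head = bytes([major | n])  # length fits in the initial byte
    elif n < 1 << 8:
        head = bytes([major | 24, n])  # 1-byte length follows
    elif n < 1 << 16:
        head = bytes([major | 25]) + struct.pack(">H", n)  # 2-byte length
    elif n < 1 << 32:
        head = bytes([major | 26]) + struct.pack(">I", n)  # 4-byte length
    else:
        head = bytes([major | 27]) + struct.pack(">Q", n)  # 8-byte length
    return head + data  # the payload itself is copied through unchanged


small = cbor_byte_string(b"\x01\x02\x03\x04")
large = cbor_byte_string(bytes(3_000_000))
header_cost = len(large) - 3_000_000  # bytes of framing added to a 3 MB blob
```

The framing cost is at most 9 bytes regardless of payload size, compared with roughly a third extra for base64 inside JSON.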
This solution isn't perfect though. The Thing Description is required to be JSON. This is fine in most cases, as it accurately describes the base64-encoded binary blobs. However, it means that the CBOR response will deviate from the Thing Description, returning a `bytes` value where the Description says a `string` will be returned.
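One way a client could cope with that mismatch, assuming it knows from the documentation which fields are binary, is to normalise whatever it receives into `bytes`. The helper below is a hypothetical client-side sketch, not part of LabThings.

```python
import base64


def as_bytes(value):
    """Return raw bytes from either a CBOR byte string or a base64 JSON string."""
    if isinstance(value, (bytes, bytearray)):
        # CBOR response: the value is already binary
        return bytes(value)
    if isinstance(value, str):
        # JSON response: the value is base64 text, as the Thing Description says
        return base64.b64decode(value, validate=True)
    raise TypeError(f"expected bytes or base64 str, got {type(value).__name__}")


from_json = as_bytes("AAEC/w==")
from_cbor = as_bytes(b"\x00\x01\x02\xff")
```

Both calls yield the same four bytes, so the rest of the client code does not need to care which content type was negotiated.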
For now, though, I feel that cases where large, non-primitive data files are collected so frequently that CBOR encoding is required are rare enough that, given proper documentation, this solution could still be fine.
Again, thoughts are welcome.
Note: The CBOR branch is useful even aside from this. It's a much more compact data format than JSON, so in many cases it may be beneficial to communicate over CBOR even without needing to transfer `bytes` objects. It was easy to add support, and it doesn't affect the JSON functionality at all.
I had to look up CBOR but this seems like a good solution.
Are you saying that the only negative (or most significant negative) is the divergence from the W3C Web of Things standard?
If so, have you brought this problem/solution to their forum? Somebody might provide some insight into any thoughts the working group(s?) have had. Also, a quick search says that they're currently rechartering the working group, so now might be a good time to introduce new ideas for their consideration.
Yeah pretty much, though interestingly the Mozilla implementation actually already specifically describes both CBOR representations and WebSocket protocol bindings, so the newest versions of LabThings are based more heavily on the Mozilla implementation of the W3C standard.
I imagine that if the W3C add new information around these, Mozilla will update their implementation correspondingly. Our spec repo is forked from the Mozilla spec so we can easily make sure we’re synchronised with upstream.
Mozilla have made this much simpler than it would otherwise have been. Very happy!