json.cc - embedded types functionality rewrite #26

Open
SpiraMirabilis opened this issue Jan 15, 2018 · 0 comments

I like this functionality, but I don't like the way it is implemented in stunt, namely using a pipe character inside the string to split the JSON value into value|type, like:
"#5|obj"
This isn't compliant JSON and could cause unpredictable behavior for plain strings that happen to end in "|obj", "|int", "|float", "|err", or "|str".
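
To make the ambiguity concrete, here is a minimal sketch of a decoder for the pipe convention. `split_embedded_type` is a hypothetical helper written for this issue, not the actual json.cc code, and it assumes the tag is whatever follows the last `|`:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical decoder for the "value|type" convention: split on the last
// '|' and treat a recognized suffix as a type tag. NOT the real json.cc code.
std::pair<std::string, std::string> split_embedded_type(const std::string& s) {
    static const char* const kTags[] = {"obj", "int", "float", "err", "str"};
    const std::size_t bar = s.rfind('|');
    if (bar != std::string::npos) {
        const std::string suffix = s.substr(bar + 1);
        for (const char* tag : kTags) {
            if (suffix == tag) {
                // Recognized tag: everything before the '|' is the value.
                return std::make_pair(s.substr(0, bar), suffix);
            }
        }
    }
    // No recognized tag: the whole input is a plain string.
    return std::make_pair(s, std::string());
}
```

With a decoder like this, `"#5|obj"` splits as intended, but an ordinary string such as `"cat|int"` is also split into the value `cat` and the tag `int`, even though nobody meant it as a typed value.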

I propose a change where instead of storing the type in the value, all MOO types that need to be detected be stored as a JSON object in this or a similar schema:

{
  "value": 5,
  "x-MOOType": 
  {
    "category": "Primitive",
    "type": "ObjectReference"
  }
}

I made x-MOOType an object so that we could create a standardized serialization/deserialization format for objects, verb code, or even the verb AST. I think the parser would need to switch on category (a hash table makes the string comparisons fast) in addition to type if there will be more complicated serializations, like entire objects or verbs.
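
A rough sketch of the category switch, assuming a hash table that maps the category string to an enum once so the rest of the decoder can use a normal `switch`. Only "Primitive" comes from the schema above; the "Object" and "Verb" categories and the decoder names are placeholders I made up for illustration:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Only Primitive appears in the proposed schema; Object and Verb are
// hypothetical categories for the more complicated serializations.
enum class MooCategory { Unknown, Primitive, Object, Verb };

// One hash lookup turns the "category" string into an enum.
MooCategory category_from_name(const std::string& name) {
    static const std::unordered_map<std::string, MooCategory> kCategories = {
        {"Primitive", MooCategory::Primitive},
        {"Object", MooCategory::Object},
        {"Verb", MooCategory::Verb},
    };
    const auto it = kCategories.find(name);
    return it == kCategories.end() ? MooCategory::Unknown : it->second;
}

// Returns the name of the (hypothetical) decoder that would handle the value.
const char* decoder_for(const std::string& category) {
    switch (category_from_name(category)) {
        case MooCategory::Primitive: return "decode_primitive";
        case MooCategory::Object:    return "decode_object";
        case MooCategory::Verb:      return "decode_verb";
        case MooCategory::Unknown:   break;
    }
    return "reject";  // unknown category: refuse rather than guess
}
```

The same pattern could be repeated one level down for "type" inside each category.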

However, we could just do something like:

{
  "x-MOOType": "ObjectReference",
  "value": 5
}

if we'd rather keep the JSON simple at the expense of potentially making the parsing function more complicated in the future. (I kind of also want to change the Objid to be a UUID internally, in which case the ObjectReference would store both the 5 and the UUID. That way you would know if the object #5 pointed at was different: $recycler would call a builtin to change the UUID if it reused the number, or maybe you'd change the UUID after significant changes to a generic in order to simulate committing and version history between UUIDs/commits.)

We could call this new JSON mode 'OBJECT-EMBEDDED-TYPES' for now and add 'LEGACY-EMBEDDED-TYPES' as an alias for the original mode, so that the original mode responds to both 'EMBEDDED-TYPES' and 'LEGACY-EMBEDDED-TYPES'. That gives users time to change their MOOcode while 'EMBEDDED-TYPES' is marked as deprecated for a suitable period of adjustment.
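
The aliasing could be as small as a lookup table. This is only a sketch: `EmbedMode`, `ModeInfo` and `lookup_mode` are names invented here, and the deprecated flag is just one possible way to drive a warning:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical internal representation of the JSON modes discussed above.
enum class EmbedMode { Unknown, ValueEmbedded, ObjectEmbedded };

struct ModeInfo {
    EmbedMode mode;
    bool deprecated;  // true if callers should be warned to migrate
};

// Maps the mode string passed from MOO code to its behavior.
ModeInfo lookup_mode(const std::string& name) {
    static const std::unordered_map<std::string, ModeInfo> kModes = {
        // Old name keeps working but is flagged as deprecated.
        {"EMBEDDED-TYPES", {EmbedMode::ValueEmbedded, true}},
        // Alias for the original pipe-suffix behavior.
        {"LEGACY-EMBEDDED-TYPES", {EmbedMode::ValueEmbedded, false}},
        // New mode that wraps typed values in a JSON object.
        {"OBJECT-EMBEDDED-TYPES", {EmbedMode::ObjectEmbedded, false}},
    };
    const auto it = kModes.find(name);
    if (it == kModes.end()) {
        return ModeInfo{EmbedMode::Unknown, false};
    }
    return it->second;
}
```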

Or we might want to keep both modes: the value-embedding, while it produces non-compliant JSON, may be more performant (although then again it might not, given the string splitting, copying, and comparison). In the JSON-object mode, checking every object for a magic property can be sped up by either hacking yajl directly or transitioning to a different JSON class (my next proposal). In any case, if we wanted to keep both functionalities, 'EMBEDDED-TYPES' should probably be deprecated and renamed to 'VALUE-EMBEDDED-TYPES' for clarity.

Lastly, if this is a good idea, I propose we transition to a C++ JSON class/framework that implements more functionality: ideally JSON Schema, as well as an easy way to implement asynchronous parsing for large JSON documents (if we serialize entire MOO objects, for example). I propose RapidJSON since it has a lot more functionality and also is on average 5 times as fast as YAJL in all JSON operations.

RapidJSON can also convert between JSON text and a DOM, and it offers a SAX-style API as well.

As for the asynchronous parsing, we can accomplish that easily enough (for now) by starting a new thread for each asynchronous parse. Some options are below:

  1. New thread + copy the to-be-parsed string into the new thread's memory, and have a callback verb fire when done (worst but easiest solution).
  2. New thread + use a mutex on the to-be-parsed string and have bf_parse_json() spin/suspend until the memory is unlocked, then continue (so it is synchronous for the verb but asynchronous for the server). Similar to how exec.cc functions, except we check the mutex periodically instead of waiting for SIGCHLD. Second easiest.
  3. Remain single-threaded but include either libuv, libevent, or Boost.Asio, set up an I/O worker/event queue, and add a non-blocking poll/spin to the main loop for work. bf_parse_json() could take an optional callback verb: without it, it would spin/suspend until the internal callback fired; with it, it could return something like E_FORK to indicate success and then call your specified verb when the parse is done. Second most difficult, but lays the groundwork for converting the main loop entirely into an event queue.... someday...
  4. All of option 3, but adding:
  5. Instead of calling it via a builtin function, create a custom operator (with 3 variants: regular, semi-forked, and forked):
    * Regular would spin/suspend until the internal callback fires, so it is synchronous to the verb.
    * Semi-forked wouldn't really create a true MOO forked task, as the overhead is high. If it was placed on the right side of an assignment operator, it would do perms checks immediately, raising an error if caller_perms() is insufficient to modify whatever's on the left side. The callback would include the current caller_perms() and verb object/name, but nothing else besides the current line from the stack. Then it would increment the refcount if the left side was refcountable and queue the async parse. When the internal callback fired, it would check whether the left side's refcount (if countable) is only 1; if so, it would just destroy everything, since the left side must have been a var and the verb must have ended. If its refcount is more than 1, or it is a property, we'd verify that caller_perms() can still modify the left side (it would be a crappy TB without the proper call chain if we had a perms issue), stick the parsed JSON in, and evaluate the right side. We'd evaluate based on the current value of the left side, so using $ and other index operators/builtins to append would work correctly, or rather as expected. It would be mostly pointless to use on vars unless you knew the verb was long running, as most times the verb would end before the parse completed. But you could use it to set props fast, or modify a var in a long-running verb (like a task scheduler, etc.).
    * Forked would create a fork ... endfork block surrounding the line the operator is on.
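
To give a feel for the single-threaded variant (option 3), here is a hand-rolled sketch of a work queue that the main loop polls. It uses no libuv/libevent/Asio, and the "parse" just counts `{` characters as a stand-in for an incremental JSON parser; `ParseJob`, `ParseQueue`, and the byte budget are all assumptions made for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// One pending job. It is advanced a slice at a time so the main loop
// never blocks on a large document.
struct ParseJob {
    std::string text;                          // private copy of the input
    std::size_t pos;                           // next byte to examine
    std::size_t open_braces;                   // stand-in "parse result"
    std::function<void(std::size_t)> on_done;  // internal completion callback
};

class ParseQueue {
public:
    void submit(std::string text, std::function<void(std::size_t)> on_done) {
        jobs_.push_back(ParseJob{std::move(text), 0, 0, std::move(on_done)});
    }

    // Called once per main-loop iteration: advance the front job by at most
    // `budget` bytes, fire its callback if it finished, then return.
    void poll(std::size_t budget) {
        if (jobs_.empty()) return;
        ParseJob& job = jobs_.front();
        const std::size_t end = std::min(job.text.size(), job.pos + budget);
        for (; job.pos < end; ++job.pos) {
            if (job.text[job.pos] == '{') ++job.open_braces;
        }
        if (job.pos == job.text.size()) {
            ParseJob done = std::move(job);  // take ownership before popping
            jobs_.pop_front();
            done.on_done(done.open_braces);
        }
    }

    bool idle() const { return jobs_.empty(); }

private:
    std::deque<ParseJob> jobs_;
};
```

In the server, `poll()` would sit next to the existing network polling in the main loop, and the callback would either resume the suspended task or queue the user's callback verb. A real event library would replace the fixed byte budget with proper scheduling.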

Damn, long post.
