-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Version 2 discussion #230
Comments
👍 TypeScript will help with that as you define a compilation target and your code won't compile
Interesting, I use the TypeDoc generated webpage directly for API docs For websites created from markdown files have a look at
To avoid the docs being "too new" you can only re-generate the website
SauceLabs has a limited free offering for open source projects. It is not trivial to setup but if you can get your tests bundled for running inside a browesr |
My only comment on typescript improvements is that current createLanguage does not generate types for its members. Standalone functions do have types though |
Yeah, I'm not 100% sure if that's possible or not, but maybe since TS has a lot more features now. Currently the only type definitions are from the DefinitelyTyped project and have no direct link to this library and are very outdated :( |
Surely typescript is pretty well suited these days. Variadic generics & inferrables should adapt well to this library, even if I should dig deeper to check this. Having the parser in TS would be awesome.
I've been playing with docusaurus lately, and it seems cool to me. Multi-version support is available, generate pages by markdown, is used everywhere at fb and configuration effort should be minimal.
This is absolutely an important point. Not sure about how to implement it anyway, but we should think about giving nice stack info, showing the parsing tree (correct indentation, etc.), deducing useful names, etc. I'll think about it, hoping to give back something |
Nice idea. It will be breaking change. Are you thinking to provide Parsimmon2 as different repository? I think it'd be better than developping in this repository because @types is already existing so it may cause confusion for the users. |
Oh, will that cause a conflict? I hadn't thought of that.
…On Mon, Mar 18, 2019, 01:38 minamorl ***@***.***> wrote:
Nice idea. It will be breaking change. Are you thinking to provide
Parsimmon2 as different repository? I think it'd be better than developping
in this repository because @types <https://github.com/types> is already
existing so it may cause confusion for the users.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#230 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAKyr8rqlImX-nfezqt6Ta66VC8RrinJks5vX1BygaJpZM4SlElD>
.
|
No, it won't. The DefinitlyTyped definition is for v1, and if v2 doesn't need it, then there will be no need to add @types. If v1 becomes deprecated, the DefinitlyTyped definition will simply be removed. This is how it is done on other projects either fully converting to TypeScript or adding their own TypeScript definition files. The confusion may be if users are currently using v1 with @types and want to migrate to v2 and may forget to remove @types. But the semver process says that when updating major version, we should check for breaking changes… |
I've spent some time over the last month working on a new library called bread-n-butter (bnb), a spiritual successor to parsimmon, that addresses some of these concerns: 👉 https://wavebeem-bread-n-butter.netlify.appI've written a point-by-point summary of how Parsimmon v2 thoughts influenced the design of bnb below. My question to anyone in this thread with the time to look at stuff is, would something like bnb be of interest to you? Parsimmon has a lot of little functions that don't seem that useful in it. Sort of like how Lodash has tripled in size since it was forked from Underscore... but most people are still using the same few functions from it. A quick note on the life of ParsimmonThis is not me announcing that Parsimmon is dead or on life support, but I have felt a little trapped by the current state of Parsimmon. bnb has allowed me to work on parser combinators with a fresh perspective. I want to test the waters with what people want to see out of v2. Parsimmon seems to be the only major player in the parser combinators space in JS, so I don't want to just release a version 2 that makes everyone mad. Change our official supported environmentsbnb requires ES2015 suppport. Switch to TypeScriptNo more out of date types, no overhead on library users to contribute back to DefinitelyTyped. This was also really helpful while developing the library as well. Rethink the documentationI tried using TypeDoc, but the resulting website was just not as well organized as I would like. So while I'm still providing brief TSDoc comments so IDE users can get descriptions populated in their intellisense, most documentation lives in markdown files and is rendered by 11ty. There's still the problem that the master branch is what's shown on the website, so it's possible to see docs for a feature before it exists in npm. Breaking API changesI shrunk the API drastically. parsimmon supports a lot of niche features (fantasy land, binary parsing, built-in parsers that people normally replace, character based parsers like oneOf or noneOf or test instead of just using regexp). I could probably stand to add a few more convenience methods in, but overall it feels fairly comfortable to work in already, and is a much smaller library. Custom parsers are even easier to write now, in the rare occasion you should need to. bnb has 8 value exports, 9 type exports, and 16 methods or properties on the Parser class. parsimmon has over 50 value exports, 36 parser methods (and several odd aliases for fantasy land compat) DebuggingI still haven't dug too deeply into this. One thing holding parsimmon back was IE7 support meant we couldn't even rely on For now, I think I'll be honest that I still haven't spent the time to look into the Scala parser combinators debugging support to figure out what kind of stuff we could add. |
Some thoughts based on my experience: Non-discarding APIWhen using Parsimmon to implement a tokenizer, we often want to keep all the tokens, even for the trivial ones (i.e. whitespaces and separators), in order to create a high-fidelity AST which can be mapped back to the original source. This renders some of the functions less useful, e.g. Improved indexing and range markingThis is already extensively discussed in #331. API trimmingSome APIs are just auxiliary wrappers of DebuggingA parser combinator is a state machine, so I think the most important part of debugging is to track the state transitions. This can be done by using immutable states and storing all of them (much like the flux/redux way) for further analysis, by reading as log entries or via additional tooling. |
More on debugging: We are working on a complex language with a heavily modified bnb, and explored debugging with ReduxDevTools as suggested above. Here is the outcome: const dateParser = bnb.all(
bnb
.choice(
bnb.text('Jan'),
bnb.text('Feb'),
bnb.text('Mar'),
bnb.text('Apr'),
bnb.text('May'),
bnb.text('Jun'),
bnb.text('Jul'),
bnb.text('Aug'),
bnb.text('Sep'),
bnb.text('Oct'),
bnb.text('Nov'),
bnb.text('Dec')
)
.debug('month'),
bnb.match(/\s+/),
bnb.match(/\d{1,2}/).debug('day'),
bnb.text(','),
bnb.match(/\s*/),
bnb.match(/\d{4}/).debug('year')
);
dateParser.parse('Aug 27, 1983', true); The For us this is more than sufficient to debug our parser. |
More related experimental thoughts on debugging, which I'm not sure if I wrote down last time it came up: I wrote a const parser = p.times(
p.alt([p.string('ba'), p.string('na')]),
0, 3)
const parserWithDebug = p.debug(parser)
const result = parserWithDebug('banana')
console.log(result) Can wrap it around any parser, and it prints everything that happens in that "sub-tree". The Of course this only works in a console, and it's extremely noisy if used at top-level on a deeply nested real-world parser, but this is the best I could come up with for use in non-interactive console context. I think it's clearer than JSON logs, but I'm not sure by how much. 😅 It's effectively a continuous call stack visualisation, but filtered to just parser calls, and with corresponding context on parse progress. The main commit for the implementation was anko/partser@f2cdec4. In short, I added an optional debug handler function to parser calls, which is called before and after parse logic; https://github.com/anko/partser/blob/870216a31c7f0943db8806b79c99b13d89e15c48/index.js#L76-L81. Parsers pass that handler function around when they call each other. All the fancy coloured indented terminal stuff is just the default debug handler function, which users can override with their own function (e.g. https://github.com/anko/partser/blob/870216a31c7f0943db8806b79c99b13d89e15c48/test.js#L856-L873) to handle the debug information in a different format. It's nifty, but I'm not sure if it's actually a good idea.
|
So I caught the parser bug and wrote my own library as well in the past couple weeks I called parsinator.js. It supports state, which was the main thing I was missing from parsimmon. I also built in similar debug trace console output ability to what you did @anko . In parsinator.js, you can set it into debug mode, which causes the parsers to build a debug record which can then be displayed by other code that's not in the core library. Anko, it seems like it should be pretty easy to split out the debug display stuff from your main library so people can decide whether or not to include it. I do like how you have the handle debug method called before and after parsing. That seems like a better approach than I used in my library, which is slightly complicated even if it isn't that much code. I'm curious about your right hand display of the text. What happens when the text is longer than a single word? Like an entire file? I guess you'd just sweep through with a constant sized window? Its a pretty cool way to display that. I also like that you have opening and closing lines for your combinators. It makes it more intuitive to read than my library's output. @halfzebra I used your idea from #143 for the result display. And @rstacruz might be interested in the trace output from my l library and Anko's library. One open question I have is how to handle state. Currently the state is stored with the rest of the context and so propagates in the same way as the current index - ie successful parsers pass any mutated state to all future parsers (while failed parsers don't). But I'm wondering if it should only propagate state to sub parsers and siblings, and not be propagated up the tree. Wondering if you have any thoughts on that @wavebeem or anyone. |
I haven't really spent much of any time thinking about this, to be honest (this issue has been open since 2018 lol). On the few occasions I needed state, I used parameterized parsers instead. |
Yep, that's correct. I wrote slice-with-context for that. The context window's size and left-right bias are configurable. The readme has examples. And then lots of thoughts on state:
I've implemented exactly this in partser: The parse function takes both a string and an optional environment value, which is recursively passed to all sub-parsers, and to user functions for
You've clearly run into the same problems that I have, which is encouraging! 😄 I've found that in both cases (sub-, and sibling parsers) both options (propagate, and don't-propagate) need to be supported, because there are use-cases for each. Sometimes, a sub-parser represents a separate inner scope that needs a sub-environment, but sometimes it's just structurally necessary because you need to parse multiple pieces of a single thing. And sometimes, a sequence of parsers represents multiple things sharing the same environment which changes need to chain through the sequence of them, but sometimes again it's just structurally necessary because you need to parse a list of things that have no relation to each other. In partser, I solved this such that by default the environment (or "state", terminology, same thing) object is passed as-is into sub-parsers, so every change to it appears everywhere, and sibling parsers are independent. But I added mechanisms to change those when necessary:
This has worked well for me. |
That's an interesting point that the behavior might need to really depend on the specific situation. Do you have examples of cases where you've needed to use one or the other pattern? It would be interesting to see some places where you've used partser. Wavebeem is right that anything you might use an environment or state for can be done by passing arguments, tho it would be cumbersome to pass the state to most every parser in your system. But I was thinking that perhaps the return value of parsers can be used in place of using state that can propagate "down" your parser call tree. I'll have to think it over more. Thanks for your thoughts! |
Edited for disambiguation :) Hello everyone. I have an update on the typescript docs, and a few drawbacks with using typescript. I know that typescript has been praised for its type safety, but I (along with other typescript devs) notice that typescript falls apart in two ways: typescript requires devs to perform declaration merging when extending existing APIs; declaration merging can potentially cause type/module declarations to clash, similar to conflicting variables (e.g. which type declaration takes priority?). Another (potential) issue is Now on to the updates. Typescript has new ways to build types; the docs can be found here. Below is a small example: type Parsers = {
[parser: string]: (r: Lang) => Parser<Any>
}
type Lang = {
[ParserName in keyof Parsers]: ReturnType<Parser[ParserName]>
}
export function createLanguage(parsers: Parsers): Lang {
//...
} I hope this helps! Thank you all for making an incredible parser library! |
The only thing with your idea on separating APIs is that it may lead to the extensibility issue with typescript. |
Excuse me for barging in with my last two comments. I have a few suggestions. First, I would suggest switching to ESM or TS because CommonJS is, for the lack of a better word, slowly dying. My second suggestion answers @wavebeem's question:
After running into cycle.js and, more importantly, xstream, I saw how xstream's streams have two different APIs: the first one is just like Parsimmon ( Finally, I would like to thank the team for designing such an amazing library. I've been using Parsimmon on my biggest project, ExtVM. Part of the reason I mentioned phasing out CJS is because I've been thinking about switching ExtVM to ESM before CJS dies, but that's not an option right now because I don't know if Parsimmon v1 is compatible with ESM. |
Closing this as it's obviously not happening 5 years later and without a maintainer. Whoever is in charge of this next can figure out what they want to do with it. |
FWIW, here is a drop in debugging helper.
example output:
|
The issue is for brainstorming what we can do in a 2.x version of Parsimmon to make it easy to port code from 1.x but clean up any rough edges that have evolved since the library's inception.
Ideas
Change our official supported environments
I'll be honest. I haven't tested any releases in IE7, and I don't plan on doing that ever either. Setting up browser testing is expensive and probably not something I'll end up doing for the project.
I think it would be best to state that we only routinely supported Node.js LTS releases, but that it should work in ES5 browsers.
Switch to TypeScript
TypeScript is really good at this point and I'm confident we could statically type probably all of Parsimmon, or at least enough of it that it's really helpful and maybe redesign some things that don't statically type well.
The current static types contributed by the DefinitelyTyped project are a bit out of date, and are going to be even more inaccurate once version 2 comes out.
Also this means we can use some modern features like classes and arrow functions but still compile down to ES5 for maximum support.
Rethink the documentation
The main problem with the current documentation is that it's referenced straight from the master branch in the repo, so people can see documentation for features that have not been released yet.
The other problem is that it's not as well organized as it could be since it's a simple markdown file. If we turned it into a web page we could make it much easier to read.
I think the best way to do it would be to use TypeDoc to generate a JSON file from code comments and write a small website to use that. This is the strategy I used with Ramda (except that I used JSDoc there not TypeDoc).
Breaking API changes
Generally just think about all the breaking change issues here and if they still make sense and if there's any other little tweaks like this.
Debugging
Parsimmon parsers are hard to debug. The callstack is filled with just
Parsimmon._
functions the whole way usually, with no good way to tell where parser construction failed.Tracking the flow of parsing is also difficult because no tools are provided.
See #193 and #143 for debugging ideas.
The text was updated successfully, but these errors were encountered: