Is record creation O(n^2)? #14
Comments
For cases where you statically know the size of the output, like parsing, there's a trick in the library. See the JSON parsing: superrecord/src/SuperRecord.hs, Line 603 in aecd579.
It's pretty fast, as you can see from the benchmarks :)
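For readers following along, here is a minimal sketch of the "statically known size" trick, using Data.Primitive.SmallArray from the primitive package rather than superrecord's internals (the module name, field list, and helper are my own illustration, not the library's code). The point is that when the final number of fields is known up front, you can allocate once and write each slot, giving O(n) construction with no copying:

```haskell
module KnownSizeSketch where

import Control.Monad.ST (runST)
import Data.Primitive.SmallArray

-- Build an n-field "record" in one allocation: O(n) writes, no copying,
-- because the final size is known before construction starts.
buildKnownSize :: [a] -> SmallArray a
buildKnownSize fields = runST $ do
  let n = length fields
  arr <- newSmallArray n (error "uninitialized field")
  mapM_ (\(i, v) -> writeSmallArray arr i v) (zip [0 ..] fields)
  unsafeFreezeSmallArray arr
```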
It would be good to have some way to allocate the correctly sized array when constructing records in the syntax users are expected to use, for example:
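For reference, that surface syntax looks roughly like this, based on my reading of the superrecord README (the exact extensions and the Record alias are assumptions on my part):

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedLabels #-}
{-# LANGUAGE TypeOperators #-}
import SuperRecord

type Person = Record '["name" := String, "age" := Int]

-- Each (&) conses one field onto the record; the wish above is that a
-- record literal like this could allocate a correctly sized array up front.
person :: Person
person = #name := "Alex" & #age := 23 & rnil
```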
A possible way to do this would be to track the currently allocated size in the record itself, and then, when adding a new field, check whether there's enough space: if yes, just add it; if no, reallocate and copy (a sketch of the idea follows). The problem here is that it will break referential transparency, so it will probably need to either be wrapped in a monad and/or marked with …
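A minimal sketch of that "track capacity, grow on demand" idea, again using Data.Primitive.SmallArray rather than the library's real representation (GrowRec and pushField are hypothetical names of mine). Because the array is mutated in place, this only makes sense inside a monad such as ST or IO, which is exactly the referential-transparency concern raised above:

```haskell
module GrowSketch where

import Control.Monad.Primitive (PrimMonad, PrimState)
import Data.Primitive.SmallArray

data GrowRec s a = GrowRec
  { grUsed :: !Int                     -- fields written so far
  , grArr  :: !(SmallMutableArray s a) -- capacity may exceed grUsed
  }

-- Append one field, doubling capacity when full: amortized O(1) per
-- field and O(n) total, instead of an O(k) copy on every field added.
pushField :: PrimMonad m
          => GrowRec (PrimState m) a -> a -> m (GrowRec (PrimState m) a)
pushField (GrowRec used arr) v = do
  let cap = sizeofSmallMutableArray arr
  arr' <-
    if used < cap
      then pure arr
      else do
        bigger <- newSmallArray (max 1 (cap * 2)) (error "uninitialized")
        copySmallMutableArray bigger 0 arr 0 used
        pure bigger
  writeSmallArray arr' used v
  pure (GrowRec (used + 1) arr')
```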
You might be able to use the …
This is a deal-breaker for this library, because it slows down typechecking even for relatively small records, which leads to a bad feedback loop even with haskell-language-server. I see a noticeable slowdown with records of fewer than 10 elements, and it becomes unusable at 15–20.
Is it maybe related to the sorting of records? How would one even start debugging this?
Possible; AFAIK the issue is that the type-level induction produces quadratic Core, so it's probably best to get GHC to dump the Core and take a look at what is being generated.
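A minimal way to do that with stock GHC flags (nothing superrecord-specific; the module name is a placeholder): add these pragmas to a module that builds a large record, recompile, and inspect the resulting .dump-simpl file for quadratically growing chains of array copies.

```haskell
-- Dump the simplified Core for this module to a .dump-simpl file,
-- with names and type applications suppressed so the structure is
-- easier to read.
{-# OPTIONS_GHC -ddump-simpl -ddump-to-file -dsuppress-all #-}
module BigRecordRepro where
```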
Unless I'm misunderstanding something, it appears that record creation is O(n^2), when we have all the information needed at compile time to statically allocate a SmallMutableArray# of known size right away, write all the elements into it, and avoid all the copying. Given that one of the most appealing parts of this library is the O(1) field access even for large records, it seems wasteful to make record creation so (relatively) expensive. I'm imagining cases where I might be parsing a CSV file with millions of rows, and that O(n^2) is going to add up quickly!
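For concreteness, here is a minimal sketch (not superrecord's actual code) of where the quadratic cost comes from: if adding a field copies the k fields already present, then building n fields costs 1 + 2 + … + (n − 1) writes, i.e. O(n^2) overall.

```haskell
module ConsByCopySketch where

import Control.Monad.ST (runST)
import Data.Primitive.SmallArray

-- Naive "cons by copy": allocate a fresh (k+1)-slot array and copy the
-- old k slots, as an immutable array-backed record has to do when each
-- field addition produces a brand-new value.
consByCopy :: a -> SmallArray a -> SmallArray a
consByCopy v old = runST $ do
  let k = sizeofSmallArray old
  new <- newSmallArray (k + 1) v
  copySmallArray new 1 old 0 k -- O(k) copy on every single field added
  unsafeFreezeSmallArray new
```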