Compute TokenList.value dynamically #623
Closed
Partial fix for #621:
To avoid quadratic behavior, make `TokenList.value` a dynamically computed property rather than an attribute, so that it does not need to be recomputed each time `TokenList.group_tokens()` is called with `extend=True`.

The first three commits in this PR are supporting work: I found that making `TokenList.value` a property caused test failures due to line endings not being correctly preserved when stripping comments. Before `TokenList.value` was a property, the `StripCommentsFilter` was changing the underlying tokens without updating the `value` attribute of the parent `TokenList`. After making `TokenList.value` a property, those changes did get reflected in the parent `TokenList`'s `value`, and as a result the desired line endings were being lost.

The most straightforward way I found to address this was to make comment stripping happen before grouping is performed, which required a small amount of hackery to make grouping happen via a filter. I am open to suggestions for better ways to handle this.
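
For readers who want to see the attribute-versus-property difference in isolation, here is a minimal sketch. It uses toy classes, not the project's actual code: `Token`, `AttrTokenList`, `PropTokenList`, and `make_tokens` are hypothetical names invented for illustration, and the in-place mutation only loosely mimics what a filter such as `StripCommentsFilter` might do to child tokens.

```python
class Token:
    """Toy leaf token holding a string value (hypothetical, for illustration)."""

    def __init__(self, value):
        self.value = value


class AttrTokenList:
    """Toy list whose value is an attribute, computed once at construction."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.value = "".join(t.value for t in tokens)


class PropTokenList:
    """Toy list whose value is a property, computed from the children on access."""

    def __init__(self, tokens):
        self.tokens = tokens

    @property
    def value(self):
        return "".join(t.value for t in self.tokens)


def make_tokens():
    # Three child tokens; the middle one stands in for a comment.
    return [Token("select 1"), Token(" -- comment\n"), Token("from t")]


attr_list = AttrTokenList(make_tokens())
prop_list = PropTokenList(make_tokens())

# Mutate a child token in place, the way a filter might,
# without touching the parent list.
for token_list in (attr_list, prop_list):
    token_list.tokens[1].value = " "

stale = attr_list.value  # attribute: still the text captured at construction
fresh = prop_list.value  # property: reflects the mutated child token

print(repr(stale))
print(repr(fresh))
```

In the attribute version the parent keeps the text it had at construction, so in-place changes to children never show up in it. In the property version they always do, which is consistent with the test failures described above surfacing only after the switch.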
I am also a little concerned about the possibility for slowdowns if a particular `TokenList`'s value is accessed, and thus computed, multiple times, but I didn't actually observe any. This might still be an issue via a codepath I didn't look at. I have some ideas on how to address it if a performance problem comes up, but it would clutter up the code somewhat so I didn't want to implement it unless a need could be demonstrated.