Allow non-alphanumeric literal characters in regular expressions #115

Open
caleb531 opened this issue Dec 7, 2022 · 6 comments

@caleb531
Owner

caleb531 commented Dec 7, 2022

@eliotwrobson Per your brief comment from #112:

Currently, only alphanumeric characters are supported in the regex parsing here (which I'm now realizing is kind of a breaking change from v6, whoops). But I think some way of adding escape characters could be useful to people. I think it's just a matter of reconfiguring the lexer to treat any character that follows a backslash as a literal. But I think it should definitely be in a separate PR.

I think it would be helpful to allow non-alphanumeric characters in a regex, such that you could create a regex for an email address, @username, etc. This enhancement would imply that you can also escape symbols to be literal characters.
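To make the intended semantics concrete, here is a small illustration using Python's built-in `re` module, not this library's regex parser. The names `HANDLE`, `DOMAIN`, `is_handle`, and `is_domain` are hypothetical and exist only for this example. It shows a non-alphanumeric literal (`@`) and an escaped metacharacter (`\.`) behaving as plain characters:

```python
import re

# "@" has no special meaning in Python's re syntax, so it matches itself.
HANDLE = re.compile(r"@[a-z0-9]+")

# An unescaped "." would match any character; "\." matches only a literal dot.
DOMAIN = re.compile(r"example\.com")


def is_handle(text: str) -> bool:
    """Return True if the whole string looks like an @username handle."""
    return HANDLE.fullmatch(text) is not None


def is_domain(text: str) -> bool:
    """Return True only for the exact string 'example.com'."""
    return DOMAIN.fullmatch(text) is not None


print(is_handle("@caleb531"))    # True
print(is_domain("exampleXcom"))  # False, because the dot is escaped
```

The request in this issue is for the library's own parser to accept patterns like these.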

@eliotwrobson
Collaborator

eliotwrobson commented Dec 7, 2022

I think this is a reasonable change to make. All it should require is a change to the regular expressions given to the lexer for each token type. You'll have to prevent the operator tokens from matching characters that are preceded by a backslash.

Also, the regular expression used for the LiteralToken needs to be modified to account for this.
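A minimal sketch of that idea, assuming a hand-rolled tokenizer rather than this library's actual lexer: the escape rule is tried first, so a character preceded by a backslash never reaches the operator rule. The names `tokenize`, `ESCAPED`, and `OPERATOR`, the token labels, and the operator set are all hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical rules; the library's real token classes and patterns may differ.
ESCAPED = re.compile(r"\\(.)", re.DOTALL)  # backslash followed by any one character
OPERATOR = re.compile(r"[|*+?()]")         # assumed set of reserved symbols


def tokenize(pattern: str) -> List[Tuple[str, str]]:
    """Split a regex string into (kind, text) pairs, honoring backslash escapes."""
    tokens: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(pattern):
        # Try the escape rule first so "\*" is consumed as one literal "*".
        escaped = ESCAPED.match(pattern, pos)
        if escaped:
            tokens.append(("LITERAL", escaped.group(1)))
            pos = escaped.end()
            continue
        char = pattern[pos]
        if char == "\\":
            # A backslash with nothing after it cannot escape anything.
            raise ValueError("dangling backslash at end of pattern")
        kind = "OPERATOR" if OPERATOR.match(char) else "LITERAL"
        tokens.append((kind, char))
        pos += 1
    return tokens


print(tokenize(r"a\*b*"))
# [('LITERAL', 'a'), ('LITERAL', '*'), ('LITERAL', 'b'), ('OPERATOR', '*')]
```

Ordering the rules this way avoids needing a lookbehind in every operator pattern, and it handles an escaped backslash (`\\`) without extra cases.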

EDIT: @caleb531 if you want to tackle this, it might pair well with the initial refactor I referenced in #109. I have exams coming up and I'm not a huge regex expert 😅

@eliotwrobson
Collaborator

@caleb531 looking over this again (since it is on the v8 milestone), I'm not knowledgeable enough about regex syntax to make this work as expected. This requires changing the regular expressions used for lexing. I certainly think it would be nice to include this with v8, and I can help if anyone wants to take a crack at this, but I can't craft the expressions needed myself 😢

caleb531 removed this from the v8 milestone Jun 17, 2023
@caleb531
Owner Author

@eliotwrobson That's fine! I'm not that committed to having this be part of v8, so I've removed the milestone designation. I wouldn't want this to hold up the Jupyter integration's debut in the coming v8.

@leonbett

leonbett commented Nov 4, 2023

Hi! I'd be interested in this feature.
[works with 6.0.0 though, which is nice!]

@eliotwrobson
Collaborator

@leonbett thanks for indicating interest! I would definitely accept, and potentially even collaborate on, a PR that added this feature, but I'm not knowledgeable enough about regexes to write it myself.

@eliotwrobson
Collaborator

Looks like this has been handled, except for escape characters in a regex: #233


3 participants