Is REGEX case insensitive now? #1314

Open
belett opened this issue Mar 28, 2024 · 3 comments

@belett

belett commented Mar 28, 2024

Hi,

It seems to me that REGEX was case sensitive (as in the SPARQL standard) and is now case insensitive.
Is my impression right?

And if so, as a workaround, how can I make it case-sensitive again? (I know that REGEX has a flag "i" to make it case-insensitive, but I can't find a flag to make it case-sensitive, since that is the default.)

Cheers, Nicolas

@hannahbast
Member

@belett The REGEX function itself is case-insensitive if and only if the "i" flag is provided. For example https://qlever.cs.uni-freiburg.de/wikidata/huiLRw (astronauts with name matching "neil", zero results) and https://qlever.cs.uni-freiburg.de/wikidata/U7OafN (astronauts with name matching "Neil", two results).
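As a rough illustration of these standard semantics (not of QLever's implementation), here is a minimal Python sketch. It uses Python's `re` module as a stand-in for the SPARQL regex engine, the helper name `sparql_regex` is made up for this example, and the sample name is illustrative rather than copied from the query results.

```python
import re

def sparql_regex(text: str, pattern: str, flags: str = "") -> bool:
    """Mimic REGEX(text, pattern, flags) for simple patterns.

    Only the "i" flag is handled; this is an approximation, not a full
    implementation of the SPARQL regex semantics.
    """
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.search(pattern, text, re_flags) is not None

name = "Neil Armstrong"  # illustrative sample value
print(sparql_regex(name, "neil"))       # no flag: matching is case-sensitive
print(sparql_regex(name, "Neil"))       # no flag: exact case matches
print(sparql_regex(name, "neil", "i"))  # "i" flag: case is ignored
```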

However, QLever has an optimized implementation for prefix search, that is, for a REGEX that consists of a ^ followed by a fixed string. For example https://qlever.cs.uni-freiburg.de/wikidata/RU9ISc (astronauts with name matching "^neil", two results). Whether this optimized implementation is case-sensitive or not is configurable when building the index. It is currently configured to be case-insensitive. What's missing (for historical reasons) is the check that, when it is configured that way, the optimized implementation is only used when the "i" flag is provided. We should fix that for the sake of standard conformity.
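The decision described above could be sketched roughly as follows. This is a hypothetical model and not QLever's actual code: the function names, the set of characters treated as metacharacters, and the handling of a case-sensitive index are all assumptions made for illustration.

```python
# Characters treated as regex metacharacters in this sketch (an assumption,
# not QLever's actual list).
_META = set(r"\.^$*+?()[]{}|")

def is_prefix_pattern(pattern: str) -> bool:
    """True if the pattern is '^' followed by a non-empty fixed string."""
    body = pattern[1:]
    return pattern.startswith("^") and bool(body) and not (_META & set(body))

def may_use_prefix_fast_path(pattern: str, flags: str,
                             index_case_insensitive: bool) -> bool:
    """Standard-conforming rule: take the fast path only when its case
    behavior agrees with the flags. The case-sensitive-index branch is an
    extrapolation; the thread only discusses the case-insensitive one."""
    if not is_prefix_pattern(pattern):
        return False
    return ("i" in flags) == index_case_insensitive

print(may_use_prefix_fast_path("^neil", "", True))   # no "i" flag: fall back to the generic REGEX
print(may_use_prefix_fast_path("^neil", "i", True))  # "i" flag: fast path is consistent
```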

Does this answer your question?

@belett
Author

belett commented Apr 2, 2024

Thanks, it does answer most of my question.

Indeed, I was doing a prefix search (with a query like https://qlever.cs.uni-freiburg.de/wikidata/AT4L5i, where I look for French labels starting with "Église" rather than "église").

Do you have an idea when it will be fixed? And is there a workaround until then?

@hannahbast
Member

hannahbast commented Apr 2, 2024

@belett Yes, there is an easy workaround. As I said, the optimization is only triggered for REGEXes of the form ^ + fixed string. So just turn the fixed string into an equivalent REGEX that is not a fixed string, for example: https://qlever.cs.uni-freiburg.de/wikidata/QKyWtw
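The linked query is not reproduced here. One possible rewrite, given as an assumption rather than as the content of that query, is to wrap the first character of the prefix in a one-element character class, which matches exactly the same strings. The helper `non_fixed_prefix_regex` below is hypothetical, and Python's `re` module is used only to check that the two patterns are equivalent. Whether a given rewrite actually bypasses the optimization should be verified against the QLever instance.

```python
import re

# Characters treated as regex metacharacters in this sketch (an assumption).
_META = set(r"\.^$*+?()[]{}|")

def non_fixed_prefix_regex(prefix: str) -> str:
    """Return a regex equivalent to '^' + prefix that is not a fixed string.

    Only plain prefixes without metacharacters are supported here.
    """
    if not prefix or (_META & set(prefix)):
        raise ValueError("expected a non-empty prefix without metacharacters")
    # "[X]" matches exactly the character X, so the meaning is unchanged.
    return "^[" + prefix[0] + "]" + prefix[1:]

pattern = non_fixed_prefix_regex("Église")
print(pattern)
# Illustrative labels: the rewritten pattern stays case-sensitive.
for label in ["Église Saint-Pierre", "église Saint-Pierre"]:
    print(label, bool(re.search(pattern, label)))
```

In a SPARQL query, the resulting pattern would then be used in the filter, for instance `FILTER(REGEX(?label, "^[É]glise"))`, where `?label` is a placeholder variable name.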
