New fnmatch.py #28

rakus · 2019-05-06T20:55:29Z

This pull request is opened as a draft because it is not mergeable yet: it contains a switch for choosing between different implementations of number ranges.
Once a decision is made on the issue editorconfig/editorconfig#371, the code needs to be cleaned up so that it supports only one implementation. (If it is of interest at all.)

This pull request proposes a new implementation for the translation of editorconfig glob expressions to python regular expressions.

IMO the following points are important:

better handling of escaped characters (see Tests for escaped special characters in glob expressions)
finding the matching brace or bracket (handles }xyz{, see Test globs with braces that are back to back)
for numerical ranges, regular expressions are created, so no additional arithmetic comparison in a second step is needed. More details about that below.
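To illustrate the second point, here is a minimal sketch of how a matching closing brace could be located while skipping escaped characters. The function name `find_closing_brace` is hypothetical and is not taken from this pull request; the real code in fnmatch.py may work differently and also has to deal with brackets, which this sketch ignores.

```python
def find_closing_brace(glob, start):
    """Return the index of the '}' matching the '{' at glob[start], or -1.

    Hypothetical helper for illustration only; not the code from this PR.
    """
    depth = 0
    i = start
    while i < len(glob):
        ch = glob[i]
        if ch == "\\":
            # A backslash escapes the next character, so skip both.
            i += 2
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    # No matching brace: the caller can treat the '{' as a literal character.
    return -1


print(find_closing_brace("{a,{b,c}}", 0))
print(find_closing_brace("}xyz{", 4))
```

For a glob such as }xyz{ the opening brace at the end has no partner, so the sketch reports -1 and the brace can be matched literally.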

This was initially implemented in VimScript for my Vim plugin and was then ported to Python.

Numerical Ranges

This implementation translates numerical ranges into regular expressions.

E.g.

{3..10} becomes (?:\+?(?:[3-9]|10))
{10..3} also becomes (?:\+?(?:[3-9]|10)), so the order of numbers is irrelevant
{-3..+3} becomes (?:-(?:[0-3])|\+?(?:[0-3]))
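The regular expressions above can be checked directly with Python's `re` module. The following sketch only copies the two patterns from the examples; the helper `matches` and the constant names are made up for illustration.

```python
import re

# Patterns copied verbatim from the examples above.
RANGE_3_10 = r"(?:\+?(?:[3-9]|10))"
RANGE_M3_P3 = r"(?:-(?:[0-3])|\+?(?:[0-3]))"


def matches(regex, text):
    """Return True if the whole text matches the regex."""
    return re.fullmatch(regex, text) is not None


for text in ["3", "+3", "10", "2", "11", "03"]:
    print(text, matches(RANGE_3_10, text))

for text in ["-3", "0", "+3", "3", "-4", "4"]:
    print(text, matches(RANGE_M3_P3, text))
```

Because the bounds are encoded in the pattern itself, no arithmetic comparison is needed after the match.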

What is special about this implementation of numeric ranges is that it can be
switched between different modes. See the top-level variable NUMBER_MODE in fnmatch.py.

Mode `AS_IS`

This implementation should work like the current implementation.

Mode `ZEROS`

This implementation allows any number of leading zeros, as proposed by @cxw42 in Py core: numeric ranges don't handle zero correctly.

So: {3..10} becomes (?:\+?0*(?:[3-9]|10)) and would match

3
+3
0000003
+0003
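A quick check of the ZEROS pattern against these strings, as a sketch that uses only the regex quoted above (the constant name is made up for illustration):

```python
import re

# Pattern copied from the ZEROS example for {3..10}.
ZEROS_3_10 = r"(?:\+?0*(?:[3-9]|10))"

for text in ["3", "+3", "0000003", "+0003", "010", "0", "11"]:
    print(text, re.fullmatch(ZEROS_3_10, text) is not None)
```

The `0*` part is what admits any number of leading zeros, so 010 is accepted as well, while a bare 0 or 11 is still rejected.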

Mode `JUSTIFIED`

This implementation handles numerical ranges the way bash does. I proposed this in a comment on @cxw42's issue here.

Now

{3..10} becomes (?:[3-9]|10), so leading + is not matched anymore.
{03..10} becomes (?:0[3-9]|10), so all numbers are formatted to equal width. In this case single-digit numbers need one leading zero.
{03..120} becomes (?:00[3-9]|0[1-9][0-9]|1[0-1][0-9]|120). Again the numbers are formatted to equal width, here three digits. So single-digit numbers need two leading zeros, double-digit numbers one.
For negative numbers, the leading minus sign is part of the width calculation. So {-3..03} matches -3 and 03, but not -03.
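The JUSTIFIED behaviour can be cross-checked by brute force. The sketch below compares the regex quoted for {03..120} with the set of zero-padded strings one would expect, assuming Python's sign-aware zero padding (`format(n, "03d")`) is a fair model of the justification rule described here. The helper `justified` is hypothetical and is not part of fnmatch.py.

```python
import re

# Pattern copied from the JUSTIFIED example for {03..120}.
JUSTIFIED_03_120 = r"(?:00[3-9]|0[1-9][0-9]|1[0-1][0-9]|120)"


def justified(n, width):
    """Zero-pad n to the given width; a minus sign counts toward the width."""
    return format(n, "0{}d".format(width))


# Every number in the range, padded to three characters.
expected = {justified(n, 3) for n in range(3, 121)}

# Candidate strings: 0..200 written with no padding and padded to 2 and 3.
candidates = set()
for n in range(0, 201):
    candidates.update({str(n), justified(n, 2), justified(n, 3)})

accepted = {c for c in candidates if re.fullmatch(JUSTIFIED_03_120, c)}
print(accepted == expected)

# Sign handling for {-3..03}: the width is 2, so -3 stays as it is.
print(justified(-3, 2), justified(0, 2), justified(3, 2))
```

Under this model -3 already fills the width of two characters, which is consistent with the statement above that -03 is not matched.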

Status

I didn't change anything outside fnmatch.py. So the tests failing with the master branch still fail with this branch. The only one that is fixed is brackets_slash_inside4.

The implementation passes all current tests related to globbing in modes AS_IS and JUSTIFIED. In mode ZEROS one test fails, because that test requires that leading zeros are not matched.

Locally I added some tests for mode JUSTIFIED, which I could provide as well.

There is one function (unescapeBrackets) where I'm unsure if this is really correct.

- improved handling of escaped characters (e.g. a\[.abc)
- improved finding matching brackets and braces
- creates regular expressions to match numerical ranges
- unit-tests to test glob to regex translation

The regex for numerical ranges can be switched between three modes:

- AS_IS: Equivalent to the previous implementation
- ZEROS: Allow any number of leading zeros
- JUSTIFIED: Implement ranges similar to bash

The final mode depends on the outcome of editorconfig/editorconfig#371
Currently active: JUSTIFIED (the decision I would prefer :-))

On braces and brackets: '{alpha,[a,]beta}' should become '%(alpha\|\[a\|\]beta\)' as described in the bash man page. So if we scan brackets, that are inside braces, the comma should be handled as a separator for the braces. To prevent this escape the comma with a backslash: '{alpha,[a\,]beta}' becomes '%(alpha\|[a,]beta\)'.
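The comma rule for braces can be sketched as a split on unescaped commas. The function `split_alternatives` is a hypothetical helper written for this illustration, not code from the pull request; it assumes a flat brace body without nested braces and simply drops the backslash of an escaped character.

```python
def split_alternatives(body):
    """Split the text between '{' and '}' at every unescaped comma.

    Hypothetical helper for illustration; assumes no nested braces.
    """
    parts = []
    current = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            # Escaped character: keep it literally and never split on it.
            current.append(body[i + 1])
            i += 2
            continue
        if ch == ",":
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


print(split_alternatives("alpha,[a,]beta"))
print(split_alternatives("alpha,[a\\,]beta"))
```

With the unescaped comma the bracket is torn apart into three alternatives; with the escaped comma the bracket expression [a,] stays intact inside the second alternative.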

'?' now translated to '[^/]'
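The effect of this translation can be seen in a small sketch: with `[^/]` the wildcard matches exactly one character that is not a path separator. The glob a?c and the variable names are only examples chosen for illustration.

```python
import re

# '?' translated to '[^/]', as stated above; the glob a?c is a made-up example.
QUESTION_MARK = "[^/]"
regex = "a" + QUESTION_MARK + "c"

for text in ["abc", "a/c", "ac", "abbc"]:
    print(text, re.fullmatch(regex, text) is not None)
```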

cxw42 · 2023-01-15T02:21:19Z

If you're still working on this, would you be willing to check if the new fnmatch handles editorconfig/editorconfig-vim#205 ?

rakus added 4 commits May 22, 2019 20:02

Shortcut for justified num-ranges 00-99*

6e5e10c

Fix handling glob wildcard '?'

782f89f

'?' now translated to '[^/]'

Fix justified leading zeros for num-range max

8d33658
