Nearly all modern regular expression engines support numbered capturing groups and numbered backreferences. Long regular expressions with lots of groups and backreferences may be hard to read. They can be particularly difficult to maintain as adding or removing a capturing group in the middle of the regex upsets the numbers of all the groups that follow the added or removed group.
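To make the renumbering problem concrete, here is a minimal sketch that uses only Foundation. The helper `capture(_:of:in:)` and both patterns are made up for illustration. The sketch uses the Swift 4 spelling `range(at:)`; Swift 3 spells the same method `rangeAt(_:)`.

```swift
import Foundation

// Hypothetical helper: returns the text captured by numbered group `index`
// in the first match, or nil if there is no match or no such group.
func capture(_ index: Int, of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else { return nil }
    let whole = NSRange(location: 0, length: text.utf16.count)
    guard let match = regex.firstMatch(in: text, options: [], range: whole),
          index < match.numberOfRanges else { return nil }
    let range = match.range(at: index)
    guard range.location != NSNotFound else { return nil }
    return (text as NSString).substring(with: range)
}

let input = "+1 202-555-0136"

// Original pattern: group 1 is the area code.
let before = "(\\d{3})-(\\d{3})-(\\d{4})"
// A country-code group is added in front, so every later group shifts by one.
let after = "\\+(\\d+) (\\d{3})-(\\d{3})-(\\d{4})"

let areaBefore = capture(1, of: before, in: input)  // "202"
let areaAfter = capture(1, of: after, in: input)    // "1": group 1 is now the country code

print(areaBefore ?? "nil")
print(areaAfter ?? "nil")
```

Any call site that still asks for group 1 keeps compiling after the edit but silently returns the wrong text.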
Languages and libraries such as Python, PHP's preg engine, and the .NET languages support captures to named locations, which we call Named Capture Groups (NCG). One of the most important benefits of NCG is that giving each capture group a human-readable name makes the code less confusing for a later reader, who might otherwise be left wondering which number corresponds to which capture group.
Named Capture Groups are great. NSRegularExpression does not support them.
Cocoa's NSRegEx implementation, according to Apple's official documentation, is based on ICU's regex implementation:
The pattern syntax currently supported is that specified by ICU. The ICU regular expressions are described at http://userguide.icu-project.org/strings/regexp.
And that page (on http://site.icu-project.org) claims that Named Capture Groups are now supported, using the same syntax as .NET Regular Expressions:
(?<name>...) Named capture group. The <angle brackets> are literal - they appear in the pattern.
For example:
\b(?<Area>\d\d\d)-(?<Exch>\d\d\d)-(?<Num>\d\d\d\d)\b
However, Apple's own documentation for NSRegEx does not list the syntax for Named Capture Groups. It appears only in ICU's own documentation, suggesting that NCG are a recent addition and hence that Cocoa's implementation has not integrated them yet.
That is to say, the only way to retrieve capture group results that NSRegEx currently exposes is the rangeAt(_:) method of the NSTextCheckingResult class, which is number-based. Come on, Cocoa.
The extension library, NSRegExNamedCaptureGroup, aims to give developers who use NSRegEx an intuitive way to deal with Named Capture Groups within their regular expressions.
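One way such an extension could resolve a name to a number is to scan the pattern and record the index of each named group. The sketch below illustrates that idea under simplifying assumptions; `groupIndexes(in:)` is a hypothetical helper, not the library's actual implementation, and it ignores `\Q...\E` quoting, nested set expressions, and free-spacing comments. It relies on named groups being numbered left to right together with unnamed ones.

```swift
import Foundation

// Hypothetical helper: maps each group name in `pattern` to its capture index.
func groupIndexes(in pattern: String) -> [String: Int] {
    let chars = Array(pattern)
    var result: [String: Int] = [:]
    var groupCount = 0
    var inClass = false
    var i = 0
    while i < chars.count {
        let c = chars[i]
        if c == "\\" {
            i += 2                  // skip the escaped character
            continue
        }
        if inClass {
            if c == "]" { inClass = false }
        } else if c == "[" {
            inClass = true          // parentheses inside [...] are literals
        } else if c == "(" {
            if i + 1 < chars.count, chars[i + 1] == "?" {
                // "(?<name>" captures; "(?<=" and "(?<!" are lookbehinds.
                if i + 3 < chars.count, chars[i + 2] == "<",
                   chars[i + 3] != "=", chars[i + 3] != "!",
                   let close = chars[(i + 3)...].firstIndex(of: ">") {
                    groupCount += 1
                    result[String(chars[(i + 3)..<close])] = groupCount
                }
            } else {
                groupCount += 1     // plain numbered group
            }
        }
        i += 1
    }
    return result
}

let pattern = "(?<Area>\\d\\d\\d)-(?:\\d\\d\\d)-(?<Num>\\d\\d\\d\\d)"
let indexes = groupIndexes(in: pattern)
// "Area" maps to 1 and "Num" to 2; the "(?:...)" group takes no number.

let phoneNumber = "202-555-0136"
if let regex = try? NSRegularExpression(pattern: pattern, options: []),
   let match = regex.firstMatch(in: phoneNumber, options: [],
                                range: NSRange(location: 0, length: phoneNumber.utf16.count)),
   let numIndex = indexes["Num"] {
    // Look the group up by name, then fall back to the number-based API.
    print((phoneNumber as NSString).substring(with: match.range(at: numIndex)))
}
```

A hand-rolled scanner like this is easy to get subtly wrong on unusual patterns, which is a good reason to rely on a tested library rather than copying the sketch.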
Carthage:
If you use Carthage to manage your dependencies:
- Simply add NSRegExNamedCaptureGroup to your Cartfile:
github "TorinKwok/NSRegExNamedCaptureGroup" ~> 0.0.3
- Click the File -> Add Files to "$PROJECT_NAME" item in the Xcode menu bar. Choose NSRegExNamedCaptureGroup.xcodeproj
- Embed NSRegExNamedCaptureGroup in the General panel
CocoaPods:
To install using CocoaPods, add the following to your project Podfile:
pod 'NSRegExNamedCaptureGroup', '~>0.0.3'
Git Submodule:
- Clone and incorporate this repo into your project with the git submodule command:
git submodule add https://github.com/TorinKwok/NSRegExNamedCaptureGroup.git "$SRC_ROOT"
git submodule update --init --recursive
- The remaining steps are identical to the last two in the Carthage section
import Foundation
import NSRegExNamedCaptureGroup
let phoneNumber = "202-555-0136"
// Regex with Named Capture Groups.
// Without importing NSRegExNamedCaptureGroup, you'd have to
// deal with the matching results (instances of NSTextCheckingResult)
// by passing the Numbered Capture Group API `rangeAt(_:)`
// a series of magic numbers: 0, 1, 2, 3 ...
// That's rather inconvenient, confusing, and, as a result, error-prone.
let patternString = "(?<Area>\\d\\d\\d)-(?:\\d\\d\\d)-(?<Num>\\d\\d\\d\\d)"
// The compiled regex keeps the name `pattern`, which the examples below use.
let pattern = try! NSRegularExpression( pattern: patternString, options: [] )
let range = NSMakeRange( 0, phoneNumber.utf16.count )
Working with NSRegEx's first-match convenience method:
let firstMatch = pattern.firstMatch( in: phoneNumber, range: range )
// Much better ...
// ... than invoking `rangeAt( 1 )`
print( NSStringFromRange( firstMatch!.rangeWith( "Area" ) ) )
// prints "{0, 3}"
// ... than putting your program at risk of getting an
// unexpected result back from `rangeAt( 2 )` when you
// forget that the middle group (?:\d\d\d) is wrapped
// in grouping-only parentheses, which means
// it does not participate in capturing at all.
//
// Conversely, with NSRegExNamedCaptureGroup's extension
// method `rangeWith(_:)`, we only get the range
// {NSNotFound, 0} back when the specified group name
// does not exist in the original regex.
print( NSStringFromRange( firstMatch!.rangeWith( "Exch" ) ) )
// There is no capture group named "Exch",
// so prints "{9223372036854775807, 0}"
// ... than invoking `rangeAt( 2 )`
print( NSStringFromRange( firstMatch!.rangeWith( "Num" ) ) )
// prints "{8, 4}"
Working with NSRegEx's block-enumeration-based API:
pattern.enumerateMatches( in: phoneNumber, range: range ) {
  match, _, stopToken in
  // Stop enumerating as soon as the block is called without a match.
  guard let match = match else {
    stopToken.pointee = ObjCBool( true )
    return
  }
  print( NSStringFromRange( match.rangeWith( "Area" ) ) )
  // prints "{0, 3}"
  print( NSStringFromRange( match.rangeWith( "Exch" ) ) )
  // There is no capture group named "Exch",
  // so prints "{9223372036854775807, 0}"
  print( NSStringFromRange( match.rangeWith( "Num" ) ) )
  // prints "{8, 4}"
}
Working with NSRegEx's array-based API:
let matches = pattern.matches( in: phoneNumber, range: range )
for match in matches {
  print( NSStringFromRange( match.rangeWith( "Area" ) ) )
  // prints "{0, 3}"
  print( NSStringFromRange( match.rangeWith( "Exch" ) ) )
  // There is no capture group named "Exch",
  // so prints "{9223372036854775807, 0}"
  print( NSStringFromRange( match.rangeWith( "Num" ) ) )
  // prints "{8, 4}"
}
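The examples above print raw ranges. In practice you usually want the captured text, and you need to treat the {NSNotFound, 0} result as "no such group". A minimal sketch of that step follows; `substring(of:in:)` is a hypothetical helper written for this illustration, not part of NSRegExNamedCaptureGroup, and the ranges are hard-coded to the values printed above so that the block runs with Foundation alone.

```swift
import Foundation

// Hypothetical helper: returns the text covered by `range`, or nil when the
// range is {NSNotFound, 0} or falls outside the string.
func substring(of text: String, in range: NSRange) -> String? {
    guard range.location != NSNotFound,
          NSMaxRange(range) <= text.utf16.count else { return nil }
    return (text as NSString).substring(with: range)
}

let phoneNumber = "202-555-0136"

// Ranges as reported for "Area", "Num", and the missing "Exch" group.
let area = substring(of: phoneNumber, in: NSRange(location: 0, length: 3))
let num = substring(of: phoneNumber, in: NSRange(location: 8, length: 4))
let exch = substring(of: phoneNumber, in: NSRange(location: NSNotFound, length: 0))

print(area ?? "nil")  // prints "202"
print(num ?? "nil")   // prints "0136"
print(exch ?? "nil")  // prints "nil"
```

Returning an optional keeps the "group not found" case explicit at the call site instead of crashing inside `substring(with:)`.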
- macOS 10.10+ / iOS 8.0+
- Xcode 8.1, 8.2, 8.3 and 9.0
- Swift 3.0, 3.1, 3.2, and 4.0