Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optionally use sed extended regex syntax #453

Merged
merged 5 commits into from
Sep 11, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 15 additions & 4 deletions src/std/text.ab
Original file line number Diff line number Diff line change
Expand Up @@ -11,11 +11,22 @@ pub fun replace(source, pattern, replacement) {
/// Replaces all occurences of a regex pattern in the content with the provided replacement text.
///
/// Function uses `sed`
pub fun replace_regex(source: Text, pattern: Text, replacement: Text): Text {
return unsafe $echo "{source}" | sed -e "s/{pattern}/{replacement}/g"$
pub fun replace_regex(source: Text, pattern: Text, replacement: Text, extended: Bool = false): Text {
unsafe {
if extended {
// GNU sed versions 4.0 through 4.2 support extended regex syntax,
// but only via the "-r" option; use that if the version information
// contains "GNU sed".
$re='\bCopyright\b.+\bFree Software Foundation\b'; [[ \$(sed --version 2>/dev/null) =~ \$re ]]$
let flag = status == 0 then "-r" else "-E"
return $echo "{source}" | sed {flag} -e "s/{pattern}/{replacement}/g"$
} else {
return $echo "{source}" | sed -e "s/{pattern}/{replacement}/g"$
}
}
}

/// Splits the input `text` into an array of substrings using the specified `delimiter`.
/// Splits the input `text` into an array of substrings using the specified `delimiter`.
pub fun split(text: Text, delimiter: Text): [Text] {
let result = [Text]
unsafe $IFS="{delimiter}" read -rd '' -a {nameof result} < <(printf %s "\${nameof text}")$
Expand All @@ -29,7 +40,7 @@ pub fun lines(text: Text): [Text] {

/// Splits a `text` into an array of substrings based on space character.
pub fun words(text: Text): [Text] {
return split(text, " ")
return split(text, " ")
}

/// Merges text using the delimeter specified.
Expand Down
8 changes: 0 additions & 8 deletions src/tests/stdlib/replace_regex.ab

This file was deleted.

8 changes: 8 additions & 0 deletions src/tests/stdlib/replace_regex_basic.ab
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
import * from "std/text"

// Output
// abc[123]def

main {
echo replace_regex("abc123def", "\([0-9][0-9]*\)", "[\1]")
}
10 changes: 10 additions & 0 deletions src/tests/stdlib/replace_regex_ext.ab
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
import * from "std/text"

// Output
// abc[123]def

main {
// This will fail on any system where sed does not support extended
// regex syntax, via "-r" on GNU sed and "-E" on all other versions.
echo replace_regex("abc123def", "([0-9]+)", "[\1]", true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test would fail on a machine that does not support the new flag, right? Can we at least write a note somewhere or maybe even skip this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to keep the test, because it confirms that on systems which do support -E or -r, the right option is being used. And while Amber may be run on older systems, I doubt that anyone will run cargo test on them. Would you be happy with a comment in the test file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... I think that we should create some syntax for interfacing commands given different versions and os's. This way we could just run the sed through that interface and never worry

Copy link
Contributor Author

@hdwalters hdwalters Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should create some syntax for interfacing commands given different versions and os's

Makes sense, would need to give it some thought, but happy to discuss. In the meantime, I've added this warning comment. Does this resolve your immediate request?

// This will fail on any system where sed does not support extended
// regex syntax, via "-r" on GNU sed and "-E" on all other versions.

}