-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement builtin glob function #488
base: master
Are you sure you want to change the base?
Implement builtin glob function #488
Conversation
build.rs
Outdated
@@ -1,3 +1,4 @@ | |||
fn main() { | |||
build_helper::rerun_if_changed("src/tests/**"); | |||
// See [https://doc.rust-lang.org/cargo/reference/build-scripts.html]. | |||
build_helper::rerun_if_changed("src/tests"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "rerun-if-changed" directive takes a directory name, not a "**" extended glob; this was rebuilding Amber even if nothing had changed in "src/tests".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know this was commented. Good job!
src/main.rs
Outdated
@@ -175,15 +175,13 @@ fn handle_docs(cli: Cli) -> Result<(), Box<dyn Error>> { | |||
} | |||
} | |||
|
|||
/* | |||
#[cfg(target_os = "windows")] | |||
#[cfg(windows)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have a conditional build for Windows (even if not officially supported), surely it shouldn't be commented out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #490.
format!("{quote}{}{quote}", translate_interpolated_region(self.strings.clone(), interps, true)) | ||
let strings = translate_interpolated_region(self.strings.clone(), interps, true); | ||
if self.unquoted { | ||
strings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is necessary so we generate Bash for item in file*
not for item in "file*"
.
src/modules/function/glob.rs
Outdated
token(meta, "glob")?; | ||
token(meta, "(")?; | ||
let mut new_is_unquoted_ctx = true; | ||
swap(&mut new_is_unquoted_ctx, &mut meta.context.is_unquoted_ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern is dangerous, because the context value is not reverted to its previous value if the function does an early return, as it will if syntax(meta, &mut arg)?
fails below. However, the source code is full of instances of this, so we need a comprehensive fix. I will create a separate issue for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #489.
src/modules/function/invocation.rs
Outdated
@@ -58,8 +58,7 @@ impl SyntaxModule<ParserMetadata> for FunctionInvocation { | |||
// Get the function name | |||
let tok = meta.get_current_token(); | |||
if let Some(ref tok) = tok { | |||
self.line = tok.pos.0; | |||
self.col = tok.pos.1; | |||
(self.line, self.col) = tok.pos; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's generally more readable to use tuple destructuring than explicit indices.
import * from "std/text" | ||
|
||
// Output | ||
// === |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These separators are necessary because if the script creates no output (as in this case) the unit test actually behaves as if we had written:
// Output
// Succeded
format!("{quote}{}{quote}", translate_interpolated_region(self.strings.clone(), interps, true)) | ||
let strings = translate_interpolated_region(self.strings.clone(), interps, true); | ||
if self.unquoted { | ||
strings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still allows for injection attacks, but I have pushed another commit to prevent this, by escaping spaces and semicolons:
if self.escaped {
strings.replace(" ", "\\ ").replace(";", "\\;")
} else {
let quote = meta.gen_quote();
format!("{quote}{strings}{quote}")
}
[ | ||
format!("for {name} in {expr}"), | ||
"do".to_string(), | ||
format!("if {conditional}"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note, this will only work with string literals:
#!/usr/bin/env amber
echo "[works]"
loop index, item in glob("*.md") {
echo "{index}: {item}"
}
echo "[fails]"
let wildcard = "*.md"
loop index, item in glob(wildcard) {
echo "{index}: {item}"
}
I do not see any way around this (short of calling find
, which has its own problems) because Bash glob expansion does not work with quoted globs.
src/tests/stdlib/change_owner.ab
Outdated
@@ -13,5 +12,5 @@ main { | |||
} else { | |||
echo "File {tmpdir}/amber-symbolic not exists" | |||
} | |||
unsafe $rm {tmpdir}/amber-symbolic$ | |||
unsafe $rm -fr {tmpdir}$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were several places in the standard library unit tests where we failed to clean up after ourselves, so /tmp/
filled up with temporary directories. Fixed this while I was here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch
} | ||
|
||
echo "===" | ||
loop file in glob("{tmpdir}/*with spaces*") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The glob builtin does not expect the calling code to escape spaces in its own string literals; it has to do this itself, to prevent injection attacks.
@@ -114,6 +114,8 @@ pub struct Context { | |||
pub is_main_ctx: bool, | |||
/// Determines if the context is in an unsafe block | |||
pub is_unsafe_ctx: bool, | |||
/// Determines if text should be escaped e.g. for file globs | |||
pub is_escaped_ctx: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is interesting, I think that can be used for:
A Bash generated:
I like this approach using the bash for. |
Please bear in mind that the
Not:
Or:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the syntax perspective, since glob
is a built-in and echo
is a built-in and glob
requires parentheses but echo
does not - does it mean we introduce a new builtin type?
I'd keep the new built-in just like the others: glob "value"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is there so many changes, unrelated to this PR? mainly the refactoring ones.
can you cherry pick the commits with the refactoring changes to a separate branch and make another PR for it? and leave only the relevant ones in this one
Yeah, fair enough. I tend to make changes as I go (though I do try to limit them to the vicinity of the functional changes). However, I would like to get to the bottom of the "are we ok with builtin functions that behave like standard library functions" question, before I do any more work in this branch. |
I've created a new pull request #495 for this purpose, including all the refactoring from this PR, and some other general tidying I've spotted while working on other branches. If we can get that merged, I'll rebase this one, and hopefully it will become a lot easier to review. |
f8e8de8
to
d23ca60
Compare
Now that I've rebased the branch onto the latest "master", can I get another review please? |
Fixes #463, by creating a builtin
glob
function.