Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check the file type after open() in Dir.openFile() when using WASI #15824

Closed
wants to merge 1 commit into from

Conversation

jedisct1
Copy link
Contributor

When the O_DIRECTORY flag is not present, Wasmtime now opens files and directories the same way, and returns a descriptor in any case.

So, explicitly check the file type after opening it, and return err.IsDir if what we actually have is a directory.

Fixes the lib/std/fs "file operations on directories" test on WASI with Wasmtime 9.0.0

When the `O_DIRECTORY` flag is not present, Wasmtime now opens
files and directories the same way, and returns a descriptor in
any case.

So, explicitly check the file type after opening it, and return
`err.IsDir` if what we actually have is a directory.

Fixes the `lib/std/fs` `"file operations on directories"` test
on WASI with Wasmtime 9.0.0
@jedisct1 jedisct1 changed the title Check the file type after open() in os.openFile() when using WASI Check the file type after open() in Dir.openFile() when using WASI May 22, 2023
@squeek502
Copy link
Collaborator

squeek502 commented May 22, 2023

I would guess that wasmtime behaves differently here on Windows vs non-Windows (i.e. I don't think the stat here is necessary on Windows, as AFAIK there's no way to open a directory as a file on Windows; EDIT: I might be wrong about this). Related: #13157

I'm having a very hard time navigating wasmtime/wasi spec/etc to understand if/where this behavior is specified, why it changed, how it changed, etc. My naive first impression is that it seems like it'd be a wasmtime/wasi spec bug, assuming at least that wasmtime wants each host platform to behave the same; from #13157 (comment):

The extra stat calls on Zig's side seems unfortunate, since, as I understand it, ironing out these sorts of cross-platform differences would be something that a WASI runtime would have an interest in doing (basically for the same reasons as Zig's std library). If so, then in theory both wasmtime and Zig may eventually converge in terms of their cross-platform API (that is, the stuff wasmtime would need to do to have consistent cross-platform behavior would be similar/the same to what Zig would have to do).

@jedisct1
Copy link
Contributor Author

Here is where the change may have been introduced.

I think the intent was to be In line with how O_DIRECTORY works on POSIX.

@squeek502
Copy link
Collaborator

Thanks for the link, that makes sense. I was wrong about Windows needing file or directory specified in its open calls; see d3f87f8.

Perhaps the wasi spec would be interested in a FILE_NON_DIRECTORY_FILE-like open flag (since they could then use the stat they're already doing [for capability stuff] to handle it for us).

@squeek502
Copy link
Collaborator

squeek502 commented May 23, 2023

Also related: #5732

zig/lib/std/fs/test.zig

Lines 458 to 460 in a0652fb

// Note: The `.mode = .read_write` is necessary to ensure the error occurs on all platforms.
// TODO: Add a read-only test as well, see https://github.com/ziglang/zig/issues/5732
try testing.expectError(error.IsDir, tmp_dir.dir.openFile(test_dir_name, .{ .mode = .read_write }));

I wonder if this would be considered a wasmtime bug, in that it seems like it's allowing opening directories for writing.

EDIT: Actually, why isn't the openat syscall within wasmtime returning EISDIR here? Is it not being called with O_RDWR even though the test is passing .read_write? Will test this with strace when I get a chance.

@squeek502
Copy link
Collaborator

I can't reproduce the file operations on directories test failure with wasmtime 9.0.1 with Linux as the host OS.

$ wasmtime --version
wasmtime-cli 9.0.1
$ zig version
0.11.0-dev.3277+a0652fb93
$ zig test --zig-lib-dir lib --main-pkg-path lib/std -target wasm32-wasi --test-cmd wasmtime --test-cmd --dir=. --test-cmd-bin lib/std/std.zig
...
1106/2505 test.file operations on directories... OK
...
2354 passed; 151 skipped; 0 failed.

@jedisct1
Copy link
Contributor Author

jedisct1 commented May 23, 2023

Still failing for me with wasmtime 9.0.1:

$ build/stage3/bin/zig build test-std -Dskip-release -fwasmtime -Dtest-filter=directories

run test std-wasm32-wasi-Debug-libc: error: 'test.file operations on directories' failed: expected error.IsDir, found fs.file.File{ .han
dle = 1098, .capable_io_mode = io.ModeOverride__enum_7224.blocking, .intended_io_mode = io.ModeOverride__enum_7224.blocking }

@jedisct1
Copy link
Contributor Author

Is that the intended behavior, or a wasmtime bug? /cc @jameysharp

@jameysharp
Copy link

I don't understand what this test is trying to do, and I also don't understand WASI very well, either in spec or in implementation. So I don't know how to answer that question.

I do think the idea of a FILE_NON_DIRECTORY_FILE-like flag as the opposite of O_DIRECTORY is an interesting idea that would be worth taking to the WASI folks.

And I'm curious about this as well:

Actually, why isn't the openat syscall within wasmtime returning EISDIR here? Is it not being called with O_RDWR even though the test is passing .read_write?

@jedisct1
Copy link
Contributor Author

@jameysharp The test opens a directory (without O_DIRECTORY) for writes, and a proper handle is returned instead of an error.

I think this is a recent change. On CI which is still running Wasmtime 2.0.2, the test passes.

@squeek502
Copy link
Collaborator

The failure is related to linking libc. I can reproduce it if I add -lc to my zig test command. Will investigate more to understand what's happening.

@squeek502
Copy link
Collaborator

squeek502 commented May 23, 2023

EDIT: This comment can be ignored (only the "Here's how I'm testing" part is relevant), see the next comment for better info


Looks like wasi-libc is translating openat to an openat2 syscall. From strace (my version doesn't recognize the openat2 syscall number EDIT: see my next comment):

// notes: 0x1b5 = 437 which is openat2, 0x3 is the cwd's fd, 0x18 is sizeof(struct open_how)
syscall_0x1b5(0x3, 0x7ffe0db39fc8, 0x7ffe0db39f68, 0x18, 0, 0) = 0x5

According to the openat2 docs, all the possible errors of openat should apply to openat2, so my guess is that this is either a bug in the wasi-libc implementation of openat2 or wasi-libc is effectively dropping the O_RDWR somewhere before calling openat2. Need to somehow figure out what's in the open_how struct being passed to openat2.

Here's how I'm testing:

// open-dir-rw.zig
const std = @import("std");

pub fn main() !void {
    const test_dir_name = "testdir";
    try std.fs.cwd().makePath("testdir");

    try std.testing.expectError(error.IsDir, std.fs.cwd().openFile(test_dir_name, .{ .mode = .read_write }));
}
zig build-exe -target wasm32-wasi open-dir-rw.zig -lc
wasmtime --dir=. open-dir-rw.wasm

or with strace

strace wasmtime --dir=. open-dir-rw.wasm

EDIT: Here's the relevant line from strace if open-dir-rw.zig is built without linking libc:

syscall_0x1b5(0x3, 0x7ffc8b25e198, 0x7ffc8b25e138, 0x18, 0x1, 0x1) = -1 EISDIR (Is a directory)

So most likely openat2 is being called without the appropriate access flags when linking libc.


EDIT#2: I think this is as far as I can go for now. My debugging skills are not up to par for this task, but hopefully this is enough of a starting point that someone with better debugging skills can figure out what's happening here.

@squeek502
Copy link
Collaborator

squeek502 commented May 24, 2023

Updated strace so that it recognizes openat2 + made it print a stack trace with -k.

With libc linked (no EISDIR):

openat2(3, "testdir", {flags=O_RDONLY|O_CLOEXEC, resolve=RESOLVE_NO_MAGICLINKS|RESOLVE_BENEATH}, 24) = 5
 > /home/ryan/.wasmtime/bin/wasmtime(cap_primitives::rustix::linux::fs::open_impl::open_beneath+0x17b) [0x7b4b6b]
 > /home/ryan/.wasmtime/bin/wasmtime(cap_primitives::rustix::linux::fs::open_impl::open_impl+0x27) [0x7b4847]
 > /home/ryan/.wasmtime/bin/wasmtime(wasi_cap_std_sync::dir::Dir::open_file_+0x116) [0x76d8d6]
 > /home/ryan/.wasmtime/bin/wasmtime(<wasi_cap_std_sync::dir::Dir as wasi_common::dir::WasiDir>::open_file::{{closure}}+0x4c) [0x76dbcc]
 > /home/ryan/.wasmtime/bin/wasmtime(wasi_common::snapshots::preview_1::<impl wasi_common::snapshots::preview_1::wasi_snapshot_preview1::WasiSnapshotPreview1 for wasi_common::ctx::WasiCtx>::path_open::{{closure}}+0xa2) [0x79cde2]
 > /home/ryan/.wasmtime/bin/wasmtime(<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll+0x1115) [0x429da5]
 > /home/ryan/.wasmtime/bin/wasmtime(wiggle::run_in_dummy_executor+0x35f) [0x3e8f0f]
 > /home/ryan/.wasmtime/bin/wasmtime(<core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once+0x1bf) [0x44ce3f]
 > /home/ryan/.wasmtime/bin/wasmtime(<F as wasmtime::func::IntoFunc<T,(wasmtime::func::Caller<T>,A1,A2,A3,A4,A5,A6,A7,A8,A9),R>>::into_func::native_call_shim+0x107) [0x4f3677]
 > unexpected_backtracing_error [0x7f35358f70e5]

Without libc linked (gets EISDIR):

openat2(3, "testdir", {flags=O_RDWR|O_APPEND|O_NOFOLLOW|O_CLOEXEC, resolve=RESOLVE_NO_MAGICLINKS|RESOLVE_BENEATH}, 24) = -1 EISDIR (Is a directory)

(stack trace is the same as above)

EDIT: Note also that changing .read_write to .write_only doesn't affect anything, the bug still occurs and the same flags are passed to openat2 when linking libc.


EDIT: The bug doesn't seem to be on the Zig side. If I add

std.debug.print("openatZ RDWR={}\n", .{flags & O.RDWR == O.RDWR});

before this line:

const rc = openat_sym(dir_fd, file_path, flags, mode);

then it prints:

openatZ RDWR=true

EDIT#2: The equivalent C program compiled to wasm exhibits the same (buggy) behavior:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>

int main() {
    int rc = openat(AT_FDCWD, "testdir", O_RDWR);
    return 0;
}
zig build-exe -target wasm32-wasi open-dir-rw-c.c -lc
openat2(3, "testdir", {flags=O_RDONLY|O_CLOEXEC, resolve=RESOLVE_NO_MAGICLINKS|RESOLVE_BENEATH}, 24) = 5

EDIT#3: Can also reproduce this with wasi-sdk-20.0 (https://github.com/WebAssembly/wasi-sdk)

$WASI_SDK/bin/clang --sysroot=$WASI_SDK/share/wasi-sysroot open-dir-rw-c.c -o open-dir-rw-c-sdk.wasm
$ strace -e trace=openat2 wasmtime --dir=. open-dir-rw-c-sdk.wasm
openat2(3, "testdir", {flags=O_RDONLY|O_CLOEXEC, resolve=RESOLVE_NO_MAGICLINKS|RESOLVE_BENEATH}, 24) = 5

EDIT#4: Made a wasi-libc issue: WebAssembly/wasi-libc#415

@jedisct1
Copy link
Contributor Author

The test is failing since this exact commit in wasmtime: bytecodealliance/wasmtime@98501e4

@jedisct1
Copy link
Contributor Author

This is actually a bug in Wasmtime: bytecodealliance/wasmtime#6462

And more importantly, they are planning to released a fixed version soon, so we don't need the extra stat()s.

That was also a good reminder that we should constantly keep wasmtime up to date on the CI, and eventually run the tests with other runtimes to spot inconsistencies.

Thanks a lot @squeek502 for your help and investigation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants