Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some shaders that compiled in 0.8 are erring in 0.9 #1089

Open
schell opened this issue Aug 3, 2023 · 10 comments
Open

Some shaders that compiled in 0.8 are erring in 0.9 #1089

schell opened this issue Aug 3, 2023 · 10 comments
Labels
t: bug Something isn't working

Comments

@schell
Copy link

schell commented Aug 3, 2023

Expected Behaviour

Shaders should compile.

Example & Steps To Reproduce

Shaders that previously compiled in 0.8 are no longer compiling when bumping rust-gpu, spirv-std and spirv-builder to 0.9.

Specifically the shaders in question all give some form of this error:

error: error:0:0 - In Logical addressing, variables may not allocate a pointer type
         %24 = OpVariable %_ptr_Private__ptr_Private_uint Private %25
  |
  = note: spirv-val failed
  = note: module `/Users/schell/code/renderling/shaders/shader-crate/target/spirv-unknown-vulkan1.2/release/deps/shader_crate.spvs/main_fragment_scene.spv`
  1. git clone https://github.com/schell/renderling --branch chore/bump-rust-gpu-0.9
  2. cd renderling/shaders
  3. cargo run --release

System Info

Apple M1
macOS 13.4.1 (22F82)

spirv-val

SPIRV-Tools v2023.3 v2022.4-198-ge7c6084f
Targets:
  SPIR-V 1.0
  SPIR-V 1.1
  SPIR-V 1.2
  SPIR-V 1.3
  SPIR-V 1.4
  SPIR-V 1.5
  SPIR-V 1.6
  SPIR-V 1.2 (under OpenCL 2.2 Full Profile semantics)
  SPIR-V 1.0 (under Vulkan 1.0 semantics)
  SPIR-V 1.3 (under Vulkan 1.1 semantics)
 SPIR-V 1.4 (under Vulkan 1.1 semantics)
 SPIR-V 1.5 (under Vulkan 1.2 semantics)
 SPIR-V 1.6 (under Vulkan 1.3 semantics)
@schell schell added the t: bug Something isn't working label Aug 3, 2023
@repi
Copy link
Contributor

repi commented Aug 14, 2023

any ideas @eddyb ?

@eddyb
Copy link
Contributor

eddyb commented Aug 14, 2023

@repi sorry, this got discussed on the Discord but I forgot to post a reply here as well.

The problem is that 0.8, and perhaps earlier versions too (0.7? unsure how far back it goes), are actually in the wrong here, at least in the short term.

That is, we had allowed code we didn't intend to, to compile.

The reason is MIR inlining: it got enabled by default, and IIRC even made more aggressive, upstream, and it wasn't obvious right away how damaging it could be to us (I think the first hint was debug builds sometimes erroring).

0.9 closed this hole, but the damage had been done in some codebases, if they had used previous versions.


For this specific example I'm not even sure what's triggering it, my plan for a while has been to add our own validation (that can generate SPIR-T diagnostics, and thus get far more context information than spir-val), but I only started on that last week, I should prioritize it, to make debugging in general.

But "helping users tweak their code to bypass a limitation" is only a short-term solution.
Here it looks like the problem is something like &&123, and presumably enough inlining and/or constant-folding can eliminate the extra indirection, but it will take a lot more SPIR-T work to make it almost always work.

@schell
Copy link
Author

schell commented Sep 3, 2023

After compiling my shaders with RUSTGPU_CODEGEN_ARGS='--no-early-report-zombies --no-infer-storage-classes --spirt-passes=qptr' cargo run --release I got these errors which were quite a bit more helpful (thanks @eddyb for the tip).

I then made this one line change:

 -       self == &Id::NONE
 +      *self == Id::NONE

and now my shaders compile and validate.

@pema99
Copy link

pema99 commented Jan 24, 2024

Hello. I am facing a similar issue when updating to 0.9:

error: In Logical addressing, variables may not allocate a pointer type
           %165 = OpVariable %_ptr_Private__ptr_Private_uint Private %164
    |
    = note: module `D:\Projects\rust-gpu-compute-example\target\spirv-builder\spirv-unknown-vulkan1.1\release\deps\kernels.spv`

  warning: an unknown error occurred
    |
    = note: spirv-opt failed, leaving as unoptimized
    = note: module `D:\Projects\rust-gpu-compute-example\target\spirv-builder\spirv-unknown-vulkan1.1\release\deps\kernels.spv`

  error: error:0:0 - In Logical addressing, variables may not allocate a pointer type
           %165 = OpVariable %_ptr_Private__ptr_Private_uint Private %164
    |
    = note: spirv-val failed
    = note: module `D:\Projects\rust-gpu-compute-example\target\spirv-builder\spirv-unknown-vulkan1.1\release\deps\kernels.spv

When I run with the flags suggested in the comment just above this one, I get an ICE

error: failed to run custom build command for `rustic v0.1.0 (D:\Projects\rust-gpu-compute-example)`

Caused by:
  process didn't exit successfully: `D:\Projects\rust-gpu-compute-example\target\debug\build\rustic-abd3588c8bbf489a\build-script-build` (exit code: 101)   
  --- stdout
  cargo:rerun-if-env-changed=RUSTGPU_CODEGEN_ARGS
  cargo:rerun-if-env-changed=RUSTGPU_RUSTFLAGS

  --- stderr
     Compiling core v0.0.0 (C:\Users\Pema Malling\.rustup\toolchains\nightly-2023-05-27-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core)
     Compiling rustc-std-workspace-core v1.99.0 (C:\Users\Pema Malling\.rustup\toolchains\nightly-2023-05-27-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\rustc-std-workspace-core)
     Compiling compiler_builtins v0.1.92
     Compiling libm v0.2.6
     Compiling bytemuck v1.13.1
     Compiling spirv-std-types v0.7.0
     Compiling bitflags v1.3.2
     Compiling num-traits v0.2.15
     Compiling glam v0.22.0
     Compiling spirv-std v0.7.0
     Compiling shared_structs v0.1.0 (D:\Projects\rust-gpu-compute-example\shared_structs)
     Compiling kernels v0.1.0 (D:\Projects\rust-gpu-compute-example\kernels)
  thread 'rustc' panicked at 'unknown `--spirt-passes=qpt`', C:\Users\Pema Malling\.cargo\registry\src\index.crates.io-6f17d22bba15001f\rustc_codegen_spirv-0.9.0\src\linker\spirt_passes\mod.rs:176:18
  stack backtrace:
     0:     0x7ffe3aef60ac - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h70b4919684dd67bc
     1:     0x7ffe3af2f2cb - core::fmt::write::h49ea4d0830ece613
     2:     0x7ffe3aeeb809 - <std::io::IoSliceMut as core::fmt::Debug>::fmt::hb84243a4dbe28bcc
     3:     0x7ffe3aef5e5b - std::sys::common::alloc::realloc_fallback::h7830197214bb8741
     4:     0x7ffe3aef96b9 - std::panicking::default_hook::h0792f154c0945b0f
     5:     0x7ffe3aef936f - std::panicking::default_hook::h0792f154c0945b0f
     6:     0x7ffdfa350406 - rustc_driver_impl[870db4b48175521f]::install_ice_hook
     7:     0x7ffe3aef9dcb - std::panicking::rust_panic_with_hook::hd9c01cbcced319a5
     8:     0x7ffe3aef9b4d - <std::panicking::begin_panic_handler::StrPanicPayload as core::panic::BoxMeUp>::get::h09e6bfb0ba837dd7
     9:     0x7ffe3aef6cd9 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h70b4919684dd67bc
    10:     0x7ffe3aef9850 - rust_begin_unwind
    11:     0x7ffe3af62aa5 - core::panicking::panic_fmt::ha271b31fcd91f4fb
    12:     0x7ffe174ff401 - rustc_codegen_spirv::linker::spirt_passes::run_func_passes::h96f076ae6786ab8a
    13:     0x7ffe1734ef93 - rustc_codegen_spirv::linker::link::h43a3a67e6510621b
    14:     0x7ffe175966a5 - rustc_codegen_spirv::link::link::hc84c0de595892507
    15:     0x7ffe1750412d - <rustc_codegen_spirv::SpirvCodegenBackend as rustc_codegen_ssa::traits::backend::CodegenBackend>::link::h5402102cc366c311      
    16:     0x7ffdf7c31451 - <rustc_interface[66947bf71e0aa21e]::queries::Linker>::link
    17:     0x7ffdf7c0b252 - rustc_driver_impl[870db4b48175521f]::args::arg_expand_all
    18:     0x7ffdf7c0a246 - rustc_driver_impl[870db4b48175521f]::args::arg_expand_all
    19:     0x7ffe3af0d1cc - std::sys::windows::thread::Thread::new::ha73e81f94151504d
    20:     0x7ffe7c187344 - BaseThreadInitThunk
    21:     0x7ffe7cf026b1 - RtlUserThreadStart

  error: the compiler unexpectedly panicked. this is a bug.

  note: we would appreciate a bug report: https://github.com/EmbarkStudios/rust-gpu/issues/new

  note: rustc 1.71.0-nightly (1a5f8bce7 2023-05-26) running on x86_64-pc-windows-msvc

  note: compiler flags: --crate-type dylib --crate-type lib -C opt-level=3 -C embed-bitcode=no -C codegen-units=1 -Z unstable-options -Z codegen-backend=D:\Projects\rust-gpu-compute-example\target\debug\rustc_codegen_spirv.dll -Z binary-dep-depinfo -C symbol-mangling-version=v0 -Z crate-attr=feature(register_tool) -Z crate-attr=register_tool(rust_gpu) -C overflow-checks=off -C debug-assertions=off -Z inline-mir=off -C llvm-args=--no-early-report-zombies --no-infer-storage-classes --spirt-passes=qpt

  note: some of the compiler flags provided by cargo are hidden

  query stack during panic:
  end of query stack
  note: `rust-gpu` version `0.9.0`

  error: could not compile `kernels` (lib)
  thread 'main' panicked at 'Kernel failed to compile: BuildFailed', build.rs:22:10
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Project: https://github.com/pema99/rust-path-tracer/tree/wgpu19 (happens after bumping the spirv-builder dep from 0.8.0 to 0.9.0)

Any ideas on how I can debug this?

@schell
Copy link
Author

schell commented Jan 25, 2024

@pema99 Maybe try bisecting your source for instances where functions or structs are using references. It's tedious but that's all I've got.

@pema99
Copy link

pema99 commented Jan 26, 2024

I'm guessing it's one of these two causing issues. Are you not allowed to store references in structs at all?

pub struct FixedVec<T, const CAPACITY: usize> {
    pub data: [T; CAPACITY],
    pub len: u32,
}
pub struct BVHReference<'a> {
    pub nodes: &'a [BVHNode],
}

@pema99
Copy link

pema99 commented Jan 27, 2024

It was actually this:

    pub fn uses_mis(&self) -> bool {
        self == &NextEventEstimation::MultipleImportanceSampling
    }

    pub fn uses_nee(&self) -> bool {
        self != &NextEventEstimation::None
    }

self is a simple enum type. I had to deref it instead of borrowing the literal

@schell
Copy link
Author

schell commented Jan 28, 2024

That's a very similar situation to mine. I had to deref self in a similar comparison. Glad you figured it out @pema99 :)

@eddyb
Copy link
Contributor

eddyb commented Feb 1, 2024

Figured out why == is relevant: the PartialEq implementation for &T looks like this:

        #[inline]
        fn eq(&self, other: &&B) -> bool {
            PartialEq::eq(*self, *other)
        }
        #[inline]
        fn ne(&self, other: &&B) -> bool {
            PartialEq::ne(*self, *other)
        }

If it was using the operators, it would look like **self == **other and **self != **other.

In Rust-GPU 0.8.0 (and possibly a bit before, too), we accidentally allowed MIR inlining to merge that method with your code, effectively giving you the *self == Foo::Bar / *self != Foo:Baz version that works for you today (and should've always worked).

It's unfortunate, and both versions will eventually work, but the version you were using with self == &... likely creates a &'static &'static T value for no good reason - well, other than * is uglier than &, so I can empathize on an aesthetic level with the more indirect version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
t: bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants
@schell @eddyb @repi @pema99 and others