ELF file not able to run in gem5 SE mode #4

Open
Yicheng22 opened this issue Jun 21, 2022 · 14 comments

@Yicheng22

First of all, I am impressed by this work. I can convert a sliced pinball to an ELF file, which reduces simulation time and also avoids the complex environment setup of gem5 simulation.

However, even though I have followed the instructions, I still cannot run the ELF file with gem5.

@hgpatil
Contributor

hgpatil commented Jun 21, 2022

I assume the ELFie by itself otherwise runs fine, does it? (Test the 'perf' version natively so it exits gracefully.)
While we have done some testing of ELFies with GEM5, it required some changes to GEM5 to make the combination work. We plan to work with the GEM5 developers to get more robust support for ELFies in the future.

@Yicheng22
Author

Yicheng22 commented Jun 23, 2022

Yes. I am using "pinball2elf.sim.sh" to convert the pinball to an ELFie. The ELFie runs perfectly on a bare-metal machine. However, it is very difficult to make it work with GEM5.

I am very interested in running these ELFies in GEM5 SE mode. Could you please provide a simple demo or some hints (for example, how to make hello.sim.elfie runnable in gem5)?

@hgpatil
Contributor

hgpatil commented Jun 23, 2022

I have talked to my NUS colleagues/co-authors of the CGO2021 ELFie paper. They had to make changes to GEM5 to get ELFies to work with GEM5. We are trying to figure out the best way to get those changes to you. Thank you for your patience.

@Yicheng22
Author

Thanks for your reply. Can't wait to apply those changes to GEM5.

@hgpatil
Contributor

hgpatil commented Jul 8, 2022

I have obtained a patch from my NUS collaborators. Please email me (first.last@intel.com) so I can send it to you to try out.

@Yicheng22
Author

Thanks for your help. I have tried the gem5_v21 patch with the hello world ELFie in this repository. However, the error persists. Based on the error message, I believe there is something wrong with the entry address of the generated ELFie.

**** REAL SIMULATION ****
build/X86/sim/simulate.cc:194: info: Entering event queue @ 0.  Starting simulation...
gem5.opt: build/X86/sim/fd_array.cc:321: std::shared_ptr<gem5::FDEntry> gem5::FDArray::getFDEntry(int): Assertion `0 <= tgt_fd && tgt_fd < _fdArray.size()' failed.

@hgpatil
Contributor

hgpatil commented Jul 14, 2022

One thing to note is that ELFies are not like typical compiler-generated ELF binaries. Namely, they may be missing certain pieces that GEM5's ELF parser relies on. While I do not believe we encountered the exact error you are seeing, the general suggestion is to modify GEM5's ELF reader/consumer to work around the issues.
Specific to the entry address: the entry address of the embedded application is different from the entry address of the ELFie binary. As the CGO-2021 ELFie paper discusses, the application pages are not initially loaded in memory when an ELFie runs. Instead, the start-up code explicitly loads them before jumping to the application code.
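
The distinction above can be checked by reading the entry address recorded in the ELFie's own header. The following is a minimal sketch, assuming a 64-bit little-endian ELF file; the helper names `parse_elf64_entry` and `read_elf64_entry` are hypothetical and not part of pinball2elf or GEM5. It reports only the entry address of the ELFie binary itself (presumably where the start-up code begins), not the entry address of the embedded application.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELFCLASS64 = 2    # e_ident[EI_CLASS] value for 64-bit objects
ELFDATA2LSB = 1   # e_ident[EI_DATA] value for little-endian encoding
E_ENTRY_OFFSET = 24  # e_ident (16) + e_type (2) + e_machine (2) + e_version (4)


def parse_elf64_entry(header: bytes) -> int:
    """Return e_entry from the first bytes of a 64-bit little-endian ELF file."""
    if len(header) < E_ENTRY_OFFSET + 8 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    if header[4] != ELFCLASS64:
        raise ValueError("not a 64-bit ELF file")
    if header[5] != ELFDATA2LSB:
        raise ValueError("not a little-endian ELF file")
    (entry,) = struct.unpack_from("<Q", header, E_ENTRY_OFFSET)
    return entry


def read_elf64_entry(path: str) -> int:
    """Read the ELF header from a file (hypothetical helper) and return e_entry."""
    with open(path, "rb") as f:
        return parse_elf64_entry(f.read(64))
```

For example, `hex(read_elf64_entry("hello.sim.elfie"))` should match the entry point address that `readelf -h` reports for the same file.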

@qishao-chalmers

I came across the same bug. Have you solved this problem? Thanks for your time.

@qishao-chalmers

Hi Yicheng, I found that this bug might be due to the "read()" system call. Maybe we need to use the "pinball_state" tool in ./pinball2elf/pintools/ to generate FD files, as mentioned in the "ELFie Execution Challenges" section of the paper.

@powerjg

powerjg commented Nov 18, 2022

Hi all,

We (@BobbyRBruce and I) would like to help out to get ELFies working in gem5. For those that are running into the FDEntry issue, could you provide us with the ELFie that's causing the issue and/or describe in detail how you generated it?

By the way, during my testing I found two other possible issues that need to be "fixed":

  1. Since the ELFie uses lots of mmap calls to allocate each 4KiB page, it's possible to overrun the max number of vm maps. You can "fix" this by adding `vm.max_map_count = 2097152` to `/etc/sysctl.conf`. I can open a separate issue for this, if that's helpful.
  2. The syscall `modify_ldt` is not implemented in gem5. However, I tested ignoring it and the test ELFie in `examples/ST` worked. The other ELFie I had (which is broken for a different reason) was also able to get to the main execution, I think. At least, I have the same error on gem5 and hardware now.

Edit: ignoring modify_ldt requires the following patch:

diff --git a/src/arch/x86/linux/syscall_tbl64.cc b/src/arch/x86/linux/syscall_tbl64.cc
index 1e7274cc42..af9ceb35fd 100644
--- a/src/arch/x86/linux/syscall_tbl64.cc
+++ b/src/arch/x86/linux/syscall_tbl64.cc
@@ -197,7 +197,7 @@ SyscallDescTable<EmuLinux::SyscallABI64> EmuLinux::syscallDescs64 = {
     { 151, "mlockall" },
     { 152, "munlockall" },
     { 153, "vhangup" },
-    { 154, "modify_ldt" },
+    { 154, "modify_ldt", ignoreFunc },
     { 155, "pivot_root" },
     { 156, "_sysctl" },
     { 157, "prctl", ignoreFunc },

My guess is that the error you're seeing in gem5 is that there was a file open by the application when you created the ELFie. So, we need to let gem5 know about that file. But, I'd like to verify this hypothesis before guessing as to how to solve it :).

Let me know how we in the gem5 community can help!

PS: I've been talking to Trevor about this as well.
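
One way to look into the open-file hypothesis above on the host is to list the descriptors a running process holds. This is a minimal sketch, assuming a Linux host where `/proc/<pid>/fd` contains one symlink per open descriptor; `list_open_files` is a hypothetical helper and is not part of gem5 or pinball2elf.

```python
import os


def list_open_files(pid, proc_root="/proc"):
    """Map each open descriptor number of a process to the path it points at."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    open_files = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        open_files[int(name)] = target
    return open_files
```

Calling `list_open_files(os.getpid())` shows the descriptors of the current process; anything beyond 0, 1, and 2 in the application's list would be a candidate for a file that was open when the ELFie was created.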

@qishao-chalmers

Hi Jason,

Many thanks for your help. Yes, I ran into both of those issues (the map count limit and the unimplemented modify_ldt syscall), and I solved them the same way.

I generated a fat pinball for 505.mcf in SPEC2017 and used examples/ST to generate base/perf/sim.elfie. Enclosed is the compressed file. This is the link to the mcf files: https://drive.google.com/drive/folders/1Ihi0qZSXDWcz7HGDNPgruX03Y1-TFHgu. I copied these files to the examples/ST folder, modified testST.sh to replace log with mcf, and then generated *.elf with testST.sh.

By the way, can gem5 work with the examples/ST/log_0.perf.elfie or log_0.basic.elfie file? I tried the following command, but it failed:
./build/X86/gem5.fast --outdir=m5out/elf ./configs/example/se.py -c $path/log_0*.file

Many thanks for your time.

@tanglt1514

Have you solved the FDEntry problem? Thanks for your time.

@hgpatil
Contributor

hgpatil commented Apr 11, 2023

Added examples/ST/pinball.SSE4.2. This pinball targets an older x86 micro-architecture. It does not use any AVX registers or instructions. This may be a good test for GEM5.

@hgpatil
Contributor

hgpatil commented Apr 11, 2023

Since the ELFie uses lots of mmap calls to allocate each 4KiB page, it's possible to overrun the max number of vm maps. You can "fix" this by adding vm.max_map_count = 2097152 to /etc/sysctl.conf.

Added this to README.md under "Useful Tips"
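
A quick way to confirm whether a host needs this tip is to compare its current limit with the suggested value. This is a minimal sketch, assuming a Linux host that exposes the setting at `/proc/sys/vm/max_map_count`; the helper names `parse_max_map_count` and `needs_raise` are hypothetical.

```python
from pathlib import Path

# Linux exposes the vm.max_map_count sysctl at this path.
MAX_MAP_COUNT_PATH = Path("/proc/sys/vm/max_map_count")
# Value suggested in this thread.
SUGGESTED_MAX_MAP_COUNT = 2097152


def parse_max_map_count(text: str) -> int:
    """Parse the contents of the max_map_count file into an integer."""
    return int(text.strip())


def needs_raise(current: int, suggested: int = SUGGESTED_MAX_MAP_COUNT) -> bool:
    """Return True when the current limit is below the suggested value."""
    return current < suggested


if __name__ == "__main__":
    if MAX_MAP_COUNT_PATH.exists():
        current = parse_max_map_count(MAX_MAP_COUNT_PATH.read_text())
        status = "below" if needs_raise(current) else "at or above"
        print(f"vm.max_map_count = {current} ({status} the suggested {SUGGESTED_MAX_MAP_COUNT})")
    else:
        print("vm.max_map_count is not exposed on this system")
```

After editing `/etc/sysctl.conf`, running `sysctl -p` as root reloads the file so the new limit applies without a reboot.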
