server crashes with "qemu: uncaught target signal 6" on Mac M1 silicon #29

Open
jameshowison opened this issue Jul 31, 2023 · 8 comments


jameshowison commented Jul 31, 2023

I'm using the 0.8.0-SNAPSHOT image under Docker on a MacBook, and I think the image runs under x86 emulation. There is a bug (perhaps unfixed) in how qemu handles this emulation.

That resulted in the server dumping core with the message "qemu: uncaught target signal 6".

I built the image for arm64 (aka Mac M1 or M2 silicon) using the edit to the Dockerfile in #28 together with --platform=linux/arm64, and that seems to fix the issue (after specifying --no-cache on the docker build), e.g.,

docker build --platform=linux/arm64 -t grobid/software-mentions:0.8.0-SNAPSHOT-arm64 --build-arg GROBID_VERSION=0.8.0-SNAPSHOT-arm64 --file Dockerfile.software .

Perhaps images could be built as multi platform with --platform=linux/amd64,linux/arm64 as described here:

https://www.docker.com/blog/faster-multi-platform-builds-dockerfile-cross-compilation-guide/

(but see below, that seemed to work briefly but now fails, which is very odd).
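
For anyone trying the multi-platform route, here is a minimal sketch of what such a build could look like with `docker buildx`. The function only prints the command (a dry run) so it can be inspected before use. The helper name, the default version, and the `--push` flag are assumptions: a multi-platform result usually has to be pushed to a registry rather than loaded into the local image store.

```shell
#!/bin/sh
# Dry-run helper: prints a multi-platform buildx command instead of running it.
# The function name and default values are hypothetical, for illustration only.
print_multiarch_build() {
    version="${1:-0.8.0-SNAPSHOT}"
    platforms="${2:-linux/amd64,linux/arm64}"
    echo "docker buildx build --platform=${platforms} -t grobid/software-mentions:${version} --build-arg GROBID_VERSION=${version} --file Dockerfile.software --push ."
}

print_multiarch_build
```

Whether every stage of Dockerfile.software actually builds on both platforms is a separate question; this only shows the shape of the invocation.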

@jameshowison
Author

Well, I'm completely stumped. 10 minutes ago this image (built with --platform=linux/arm64) was avoiding this error that says qemu: uncaught target signal 6. I was seeing calls to the annotate_tei method and the counter for processed files from the client was going up (slowly); that said, no software.json files were written, so it wasn't perfect.

Now the same image, with the same data, is immediately throwing that error.

I've tried everything I can think of and can't get this to run on an M1 Mac. I tried via Docker, and I also tried to run the server directly using the compile instructions, but none of that worked. I guess I'll try via AWS?
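
One way to narrow this down is to confirm which architecture Docker recorded for the image, since a `qemu:` message suggests the container is being emulated. This is a small sketch, assuming the tag from the build command above. `arch_matches` is a hypothetical helper, and the `docker` call is left commented out so the snippet has no side effects.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the image architecture matches the host.
# Docker reports "arm64"/"amd64"; `uname -m` prints "arm64" on macOS,
# "aarch64" on Linux, and "x86_64" on Intel/AMD machines.
arch_matches() {
    host="$1"
    image="$2"
    case "$host" in
        aarch64|arm64) host=arm64 ;;
        x86_64|amd64)  host=amd64 ;;
    esac
    [ "$host" = "$image" ]
}

# Real usage (needs Docker, so it is commented out here):
# image_arch="$(docker image inspect --format '{{.Architecture}}' grobid/software-mentions:0.8.0-SNAPSHOT-arm64)"
# arch_matches "$(uname -m)" "$image_arch" || echo "image will run under emulation"

arch_matches arm64 amd64 || echo "amd64 image on an arm64 host: expect qemu emulation"
```

If the recorded architecture is amd64, the container is probably still starting from an older x86 image or a cached layer rather than the arm64 build.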

@jameshowison changed the title from 'server crashes with "qemu: uncaught target signal 6" on Mac M1 silicon (solved)' to 'server crashes with "qemu: uncaught target signal 6" on Mac M1 silicon' on Jul 31, 2023
@jameshowison
Author

Working on the mac m1 build for Docker is really difficult when you don’t have the hardware. I don’t exactly know how to go about it, perhaps some other grobid users have it working? It’s definitely something to do with qemu, but I’m not sure if the versions are pinned or if they could be updated?

@jameshowison
Author

I played with this a little more. A few notes for others that might come this way:

  1. Following the hint at the end of "TensorFlow binary crashes on Apple M1 in x86_64 Docker container" (tensorflow/tensorflow#52845 (comment)), I backed out the stage_1 base image to FROM python:3.8-slim, then added a RUN pip install tensorflow tensorflow-io. Building with docker build --platform=linux/arm64/v8 -t grobid/software-mentions:0.8.0-SNAPSHOT-aarch64 --build-arg GROBID_VERSION=0.8.0-SNAPSHOT-aarch64 --file Dockerfile.software . completes fine, and the builder stage seems to build as arm64 fine (btw, I have no idea of the difference, if any, between aarch64 and arm64).
  2. Then one runs into a problem with DeLFT 0.3.3: a dependency tries to build tensorflow-gpu (which is deprecated; apparently it is the same as tensorflow, but they've stopped building a package with that name?). The same problem also showed up when installing tensorflow==2.9.3.
  3. I forked DeLFT at https://github.com/jameshowison/delft and added git to the install so I could mess with the dependencies, and eventually figured out that setup.py was the place to change some of the pinned version numbers. I changed tensorflow to >=2.9.3 but had to remove tensorflow-addons entirely (it is now deprecated, so I don't think there are built versions available for aarch64?).
  4. With those dependency changes, the pip install line works.
  5. Unfortunately, I then run into trouble with the "# install jep (and temporarily the matching JDK)" step. That seems to grab a JDK directly, and I'm guessing an x86 one, because the error is:
 > [stage-1 14/34] RUN /tmp/jdk-17/bin/javac -version:
0.456 qemu-x86_64: Could not open '/lib64/ld-linux-x86-64.so.2': No such file or directory

Ah, I see that it's actually in the file name: openjdk-17.0.2_linux-x64_bin.tar.gz

Does the jep install need to be done this way, or could a more standard package install help?
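
If the manual JDK download is kept, the tarball would need to follow the build architecture. This is a minimal sketch. `jdk_tarball` is a hypothetical helper, and the `linux-aarch64` file name is an assumption based on the naming pattern of the x64 archive mentioned above, not something checked against the Dockerfile or the download site.

```shell
#!/bin/sh
# Hypothetical helper: choose an OpenJDK 17.0.2 tarball name for a machine
# type as printed by `uname -m`.
# Assumption: the aarch64 archive follows the same naming pattern as the
# x64 one; verify the name before relying on it.
jdk_tarball() {
    case "$1" in
        x86_64|amd64)  suffix=x64 ;;
        aarch64|arm64) suffix=aarch64 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
    echo "openjdk-17.0.2_linux-${suffix}_bin.tar.gz"
}

# In a Dockerfile RUN step this would typically be: jdk_tarball "$(uname -m)"
jdk_tarball aarch64
```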

@jameshowison
Author

It is possible that the jep install will work with:

RUN JAVA_HOME="$(dirname $(dirname $(readlink -f $(which java))))" pip3 install jep==4.0.2   

based on reading: https://stackoverflow.com/questions/43655291/dynamically-set-java-home-of-docker-container

That does seem to work for me in the Dockerfile.software
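
For readers unfamiliar with the idiom: `readlink -f $(which java)` resolves the `java` launcher through any symlinks to the real binary, and the two `dirname` calls then strip `/bin/java` to leave the JDK root. Here is a sketch with a made-up path; the actual location depends on the base image and on which JDK package is installed.

```shell
#!/bin/sh
# Hypothetical resolved path, standing in for: readlink -f "$(which java)"
java_bin="/usr/lib/jvm/java-17-openjdk-arm64/bin/java"

# The inner dirname drops "java", the outer one drops "bin",
# which leaves the JDK root directory that JAVA_HOME should point to.
java_home="$(dirname "$(dirname "$java_bin")")"
echo "$java_home"
```

Because the path is derived from whatever `java` is installed, this approach does not hard-code an architecture-specific JDK location.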

@jameshowison
Author

So, maybe making progress. I can get the docker build to finish, and it does create an aarch64 image.

Unfortunately, it hits this error:

screenit-softcite-server_software_mentions-1  | INFO  [2023-08-01 19:34:43,591] com.hubspot.dropwizard.guicier.DropwizardModule: Added guice injected health check: org.grobid.service.controller.HealthCheck
screenit-softcite-server_software_mentions-1  | com.google.inject.CreationException: Unable to create injector, see the following errors:
screenit-softcite-server_software_mentions-1  | 
screenit-softcite-server_software_mentions-1  | 1) Error injecting constructor, java.lang.UnsatisfiedLinkError: /opt/grobid/grobid-home/lib/lin-64/libwapiti.so: /opt/grobid/grobid-home/lib/lin-64/libwapiti.so: cannot open shared object file: No such file or directory (Possible cause: can't load AMD 64 .so on a AARCH64 platform)
screenit-softcite-server_software_mentions-1  |   at org.grobid.service.GrobidEngineInitialiser.<init>(GrobidEngineInitialiser.java:29)

So lin-64/libwapiti.so is something built earlier in the chain.
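
A quick way to see which architecture a bundled `.so` was compiled for is to read the machine field of its ELF header (the two bytes at offset 18). This is a minimal sketch using only `od`. `elf_machine` is a hypothetical helper, it assumes a little-endian file, and it only recognises the two values relevant here.

```shell
#!/bin/sh
# Hypothetical helper: report the target architecture of an ELF file by
# reading e_machine (2 bytes at offset 18, little-endian assumed).
# 0x3e is x86-64 and 0xb7 is AArch64.
elf_machine() {
    code="$(od -An -tx1 -j18 -N2 "$1" | tr -d ' \n')"
    case "$code" in
        3e00) echo x86_64 ;;
        b700) echo aarch64 ;;
        *)    echo "unknown ($code)" ;;
    esac
}

# Inside the container one could run, for example:
# elf_machine /opt/grobid/grobid-home/lib/lin-64/libwapiti.so
```

If this prints x86_64 inside an aarch64 image, the library would need to be rebuilt for arm64 or supplied from an arm64 build of Grobid.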

@jameshowison
Author

@kermitt2 Hi Patrice, I'm trying to return to this, any idea if there was progress here? I'll check over on grobid/grobid as well.

@kermitt2
Collaborator

Hi James, I am asking @lfoppiano for help with this issue because he has the most experience with running Grobid on Mac and with the challenge of building an arm64 Docker image.

Afaik aarch64 and arm64 are the same.

@lfoppiano
Contributor

lfoppiano commented Apr 30, 2024

Hi @jameshowison,
this subject is still open, since we don't yet have a CI that builds images for ARM. But this might be an opportunity to have it soon 😄

First, could you try to run this image on your Mac and let me know if it works (try to process a few files)?

https://hub.docker.com/layers/lfoppiano/grobid/0.8.0-arm/images/sha256-79b85da73bae5c2a483e381c1e1231bc73dc0d6b987f16b867a3eb6e8154d7b8?context=explore

Given that software-mentions is based on Grobid, I think the simplest approach is to build its Docker image from the Grobid one, unless I'm overlooking something 😅
