virtualenv is not used when calling subprocess module #63

Open
hguercan opened this issue May 3, 2024 · 0 comments

Hello,

We use the following Dockerfile to build the virtualenv that we later provide to our EMR Serverless 7.1 application.

FROM --platform=linux/amd64 public.ecr.aws/amazonlinux/amazonlinux:2023-minimal AS base

RUN dnf install -y gcc python3 python3-devel

ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

RUN python3 -m pip install --upgrade pip && \
    python3 -m pip install \
    venv-pack==0.2.0 \
    pytz==2022.7.1 \
    boto3==1.33.13 \
    pandas==1.3.5 \
    python-dateutil==2.8.2

RUN mkdir /output && venv-pack -o /output/pyspark_ge.tar.gz

FROM scratch AS export
COPY --from=base /output/pyspark_ge.tar.gz /

Within the Spark application, one part runs ['aws', 's3', 'mv'] via check_call from the subprocess module.
In that case the virtualenv does not seem to be used. Instead, the global Python (3.9) is used, which comes without dateutil.

Of course, one could rewrite the application to make this call from code using the currently running binary, but I expected to be able to set an option that tells the EMR Serverless application to use my virtualenv in general, not just when running my PySpark application. Is that possible, or is this behavior expected?
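
For reference, here is a minimal sketch of the workaround I mean by "the currently running binary". It assumes the child command should resolve from the same bin directory as the interpreter running the job. The helper names `venv_env` and `run_in_venv` are made up for illustration, and this only helps if the command being called (for example `aws`) is actually installed in the virtualenv, which is not the case with the Dockerfile above.

```python
import os
import subprocess
import sys


def venv_env(base_env=None):
    """Return a copy of the environment with the running interpreter's
    bin directory placed first on PATH (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    # abspath does not resolve symlinks, so a venv's bin/python keeps
    # pointing at the venv's bin directory, not the base interpreter.
    bin_dir = os.path.dirname(os.path.abspath(sys.executable))
    env["PATH"] = bin_dir + os.pathsep + env.get("PATH", "")
    return env


def run_in_venv(args):
    """Run a command so that executables in the venv's bin directory
    are found before the system-wide ones (hypothetical helper)."""
    subprocess.check_call(args, env=venv_env())
```

With this, `run_in_venv(['aws', 's3', 'mv', src, dst])` would pick up an `aws` entry point from the virtualenv first, if one exists there, and fall back to the system one otherwise.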
