Cannot create virtualenv for PythonVirtualenvOperator #39953

Closed
1 of 2 tasks
RajasGujarathi opened this issue May 30, 2024 · 12 comments

@RajasGujarathi

Apache Airflow version

2.9.1

If "Other Airflow 2 version" selected, which one?

No response

What happened?

Using the docker image apache/airflow:slim-2.9.1-python3.12

I installed an alternate Python version 3.9.

When using the PythonVirtualenvOperator in a DAG

.
.
python_3_9 = PythonVirtualenvOperator(task_id='python_3_9',
                                      requirements=["colorama==0.4.0"],
                                      system_site_packages=False,
                                      python_callable=func,
                                      python_version='3.9',
                                      dag=dag)
.
.

The DAG fails with an error

{process_utils.py:183} INFO - Executing cmd: /home/***/.local/bin/python -m virtualenv /tmp/venv_4wv1u08 --python=python3.9
{process_utils.py:187} INFO - Output:
{process_utils.py:191} INFO - PermissionError: [Errno 13] Permission denied: '/root/bin'

What you think should happen instead?

The command for the creation of virtualenv should have been successful

How to reproduce

Non-optimised Dockerfile created only for simulating the behaviour

FROM apache/airflow:slim-2.9.1-python3.12

ARG ALTERNATE_PYTHON_VERSIONS="3.9.0"

WORKDIR /tmp

USER root 

RUN apt-get update \
    && apt-get install -y --fix-missing --no-install-recommends \
        zlib1g-dev \
        wget \
        libsasl2-dev \
        libldap2-dev \
        libssl-dev \
        gcc \
        make

USER airflow

WORKDIR $AIRFLOW_USER_HOME_DIR

RUN pip install virtualenv \
    && for ADD_PYTHON_VERSION in $ALTERNATE_PYTHON_VERSIONS; do \
    wget --quiet "https://www.python.org/ftp/python/${ADD_PYTHON_VERSION}/Python-${ADD_PYTHON_VERSION}.tgz" \
    && tar xzf "Python-${ADD_PYTHON_VERSION}.tgz" \
    && cd "Python-${ADD_PYTHON_VERSION}" && ./configure --prefix=$AIRFLOW_USER_HOME_DIR/.local --enable-optimizations && make altinstall \
    && cd .. && rm -rf "Python-${ADD_PYTHON_VERSION}"* ; \
    done
  • Build the docker image
  • Run the image - docker run -it <image_name:tag> bash
  • Create the virtualenv - python -m virtualenv /tmp/venviplvcl3b --python=python3.9
  • Modify the PATH environment variable, remove /root/bin
  • Try re-creating the virtualenv

Operating System

MacOS

Versions of Apache Airflow Providers

Not applicable

Deployment

Other Docker-based deployment

Deployment details

Extending apache/airflow:slim-2.9.1-python3.12 and then running a webserver, scheduler & worker

Anything else?

The user airflow has a path /root/bin on its $PATH but does not have access to this directory, can this be the cause of the error?

Discussion #39905
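
One way to check this hypothesis is to stat every PATH entry as the airflow user and list the ones that fail with PermissionError. This is only a minimal sketch: the function name `unreadable_path_entries` and its injectable `stat` parameter are illustrative assumptions, not part of Airflow or virtualenv.

```python
import os


def unreadable_path_entries(path_value, stat=os.stat):
    """Return the PATH entries whose stat() call fails with PermissionError.

    `stat` is injectable (an assumption of this sketch) so the function can
    be exercised without a directory that is really inaccessible.
    """
    blocked = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue  # empty entries are ignored in this sketch
        try:
            stat(entry)
        except PermissionError:
            blocked.append(entry)
        except OSError:
            pass  # e.g. a missing directory; not what we are looking for
    return blocked


if __name__ == "__main__":
    # Run as the same user that executes the task (here: airflow).
    print(unreadable_path_entries(os.environ.get("PATH", "")))
```

If `/root/bin` shows up in the printed list, that would be consistent with the error message above.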

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

RajasGujarathi added the area:core, kind:bug, and needs-triage labels May 30, 2024

boring-cyborg bot commented May 30, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@Taragolis
Contributor

Taragolis commented May 31, 2024

If I follow the "installing from source" instructions from the Debian wiki, I get the same issue.

I would say this is a bug in virtualenv (or a feature; it is better to ask them about this behaviour).

airflow@37a3cf621ce5:/tmp$ echo $PATH
/root/bin:/home/airflow/.local/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

airflow@37a3cf621ce5:/tmp$ which python3.9
/usr/local/bin/python3.9

airflow@37a3cf621ce5:/tmp$ python -m virtualenv /tmp/venv67bk3b69 --system-site-packages --python=python3.9
PermissionError: [Errno 13] Permission denied: '/root/bin'

However, if I provide a full path, the virtualenv is created without a problem

airflow@37a3cf621ce5:/tmp$ python -m virtualenv /tmp/venv67bk3b69 --system-site-packages --python=/usr/local/bin/python3.9
created virtual environment CPython3.9.0.final.0-64 in 472ms
  creator CPython3Posix(dest=/tmp/venv67bk3b69, clear=False, no_vcs_ignore=False, global=True)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/airflow/.local/share/virtualenv)
    added seed packages: pip==24.0, setuptools==69.5.1, wheel==0.43.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator

So the error happens while virtualenv looks up python3.9. In theory it should skip entries that raise Permission denied, or maybe rely on something like the which command

airflow@37a3cf621ce5:/tmp$ python
Python 3.12.3 (main, Apr 24 2024, 07:13:43) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import shutil
>>> shutil.which("python3.9")
'/usr/local/bin/python3.9'
>>> 

>>> from virtualenv import session_via_cli
>>> 
>>> session = session_via_cli(["/tmp/fooobar", "-p", "python3.9"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/airflow/.local/lib/python3.12/site-packages/virtualenv/run/__init__.py", line 49, in session_via_cli
    parser, elements = build_parser(args, options, setup_logging, env)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.12/site-packages/virtualenv/run/__init__.py", line 77, in build_parser
    parser._interpreter = interpreter = discover.interpreter  # noqa: SLF001
                                        ^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.12/site-packages/virtualenv/discovery/discover.py", line 41, in interpreter
    self._interpreter = self.run()
                        ^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.12/site-packages/virtualenv/discovery/builtin.py", line 58, in run
    result = get_interpreter(python_spec, self.try_first_with, self.app_data, self._env)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.12/site-packages/virtualenv/discovery/builtin.py", line 75, in get_interpreter
    for interpreter, impl_must_match in propose_interpreters(spec, try_first_with, app_data, env):
  File "/home/airflow/.local/lib/python3.12/site-packages/virtualenv/discovery/builtin.py", line 147, in propose_interpreters
    for pos, path in enumerate(get_paths(env)):
  File "/home/airflow/.local/lib/python3.12/site-packages/virtualenv/discovery/builtin.py", line 170, in get_paths
    if p.exists():
       ^^^^^^^^^^
  File "/usr/local/lib/python3.12/pathlib.py", line 860, in exists
    self.stat(follow_symlinks=follow_symlinks)
  File "/usr/local/lib/python3.12/pathlib.py", line 840, in stat
    return os.stat(self, follow_symlinks=follow_symlinks)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: '/root/bin'

>>> session = session_via_cli(["/tmp/fooobar", "-p", shutil.which("python3.9")])
>>> session.interpreter.version_info
VersionInfo(major=3, minor=9, micro=0, releaselevel='final', serial=0)
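
The traceback shows `Path.exists()` propagating the PermissionError from `os.stat` while the PATH entries are being enumerated. The "skip Permission denied" idea could look roughly like the sketch below. The names `safe_exists` and `iter_existing_path_dirs` are hypothetical; this is not virtualenv's actual code, nor a claim about how it was or will be fixed.

```python
import os
from pathlib import Path


def safe_exists(path):
    """Like path.exists(), but treat any OSError (e.g. EACCES) as 'not there'."""
    try:
        return path.exists()
    except OSError:
        return False


def iter_existing_path_dirs(env):
    """Yield each PATH entry that exists and can be inspected, in PATH order."""
    for entry in env.get("PATH", "").split(os.pathsep):
        if entry and safe_exists(Path(entry)):
            yield Path(entry)
```

With this approach an entry such as `/root/bin` would simply be skipped for a user who cannot stat it, instead of aborting the whole lookup.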

Technically, PythonVirtualenvOperator could be changed to avoid this issue:

  • Allow providing a path to the Python interpreter binary
  • Try to determine the path locally before passing it to virtualenv
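
A rough sketch of the second option: resolve the interpreter with `shutil.which` first (which, as the session above shows, copes with the inaccessible `/root/bin` entry) and hand the absolute path to virtualenv. The helper names `resolve_interpreter` and `virtualenv_command` and the `search_path` parameter are assumptions made for illustration; this is not Airflow's implementation.

```python
import shutil
import sys


def resolve_interpreter(python_version, search_path=None):
    """Return the absolute path of pythonX.Y, or raise if it is not on PATH.

    `search_path` is forwarded to shutil.which(path=...); None means "use
    the PATH of the current process".
    """
    name = f"python{python_version}"
    resolved = shutil.which(name, path=search_path)
    if resolved is None:
        raise FileNotFoundError(f"{name} was not found on PATH")
    return resolved


def virtualenv_command(venv_dir, python_version, search_path=None):
    """Build a virtualenv command line that uses a full interpreter path."""
    interpreter = resolve_interpreter(python_version, search_path)
    return [sys.executable, "-m", "virtualenv", venv_dir, f"--python={interpreter}"]
```

For the case in this issue, `virtualenv_command("/tmp/venv", "3.9")` would produce the same kind of command that succeeded above with `--python=/usr/local/bin/python3.9`.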

Taragolis added the upstream-dependency label and removed the needs-triage label May 31, 2024
@RajasGujarathi
Author

... if I provide a path then there is not a problem to create virtualenv

Agreed. From my understanding, we cannot provide a path using class airflow.operators.python.PythonVirtualenvOperator.

I feel class airflow.operators.python.PythonVirtualenvOperator should abstract this behaviour🤞

I am new to this and not sure what happens next. Can you help?

@Taragolis
Contributor

You might want to raise a PR, or wait for someone who might change this behaviour.
You could also open an issue, or at least a discussion, in virtualenv about the behaviour.

@RajasGujarathi
Author

Started a discussion pypa/virtualenv#2731 (comment)

@potiuk
Member

potiuk commented Jun 1, 2024

Yep. It does indeed look like a bug in virtualenv, and as they noted, a PR with a fix there would solve the problem.

For now this is a very quick workaround to bypass virtualenv limitation:

env PATH /home/airflow/.local/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
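
The same idea, dropping `/root/bin` from PATH, can also be expressed in Python, for example when launching virtualenv from your own code. This is a sketch under assumptions: the helper names `path_without` and `env_without_path_entry` are hypothetical, and the thread does not establish whether the operator's own subprocess environment can be overridden this way.

```python
import os


def path_without(path_value, unwanted):
    """Return path_value with every entry equal to `unwanted` removed."""
    target = unwanted.rstrip("/")
    kept = [e for e in path_value.split(os.pathsep) if e.rstrip("/") != target]
    return os.pathsep.join(kept)


def env_without_path_entry(unwanted="/root/bin", base_env=None):
    """Copy an environment mapping and strip `unwanted` from its PATH."""
    env = dict(os.environ if base_env is None else base_env)
    env["PATH"] = path_without(env.get("PATH", ""), unwanted)
    return env
```

The resulting mapping can be passed as `env=` to `subprocess.run` so that only the child process sees the shortened PATH.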

BTW, the reason we have /root/bin on the PATH, with pip in it, is that we want to protect against the root user running pip accidentally. Installing packages as root when you extend the image, after switching to the root user, will break things.

For example, doing this in your image will break things (packages will be installed by the root user but will not be available to the airflow user):

USER root
RUN pip install something
USER airflow

People often made this mistake in the past, so having /root/bin/pip on the PATH is precisely a way to raise an error if someone tries to run pip as root. The /root/bin/pip script is this:

COLOR_RED=$'\e[31m'
COLOR_RESET=$'\e[0m'
COLOR_YELLOW=$'\e[33m'

if [[ $(id -u) == "0" ]]; then
    echo
    echo "${COLOR_RED}You are running pip as root. Please use 'airflow' user to run pip!${COLOR_RESET}"
    echo
    echo "${COLOR_YELLOW}See: https://airflow.apache.org/docs/docker-stack/build.html#adding-a-new-pypi-package${COLOR_RESET}"
    echo
    exit 1
fi
exec "${HOME}"/.local/bin/pip "${@}"

So another option to work around this virtualenv error (which you could also try to turn into a PR and test) is to make this script available on the PATH somewhere outside of /root/bin and to remove /root/bin from the PATH. The script would then be picked up by every user, not only root (the default behaviour of the OS is to skip paths that are not accessible), so every pip command would pay the extra overhead of starting a bash interpreter. Still, if you give all users permissions to the directory where the script lives, this would likely work as a workaround.

@RajasGujarathi
Author

Thanks @potiuk for the detailed information.

@Mahran-xo

is this solved yet?

@potiuk
Member

potiuk commented Jun 26, 2024

is this solved yet?

Look at the linked issue in virtualenv. It's a virtualenv issue. You might comment there.

@RajasGujarathi
Author

Closing this as it is a virtualenv issue, tracked under pypa/virtualenv#2731 (comment)

RajasGujarathi closed this as not planned Jun 27, 2024
@tadeha
Contributor

tadeha commented Nov 4, 2024

@potiuk
Member

potiuk commented Nov 4, 2024

Nice!
