James/airflow #5
base: main
Conversation
The first point I would make is that we won't want Airflow to become a dependency of the ingestion workflow. They should run from separate Python environments. Airflow has the ability to run tasks in their own virtual environment.
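A minimal sketch of what that could look like, assuming Airflow 2.x's `PythonVirtualenvOperator`. The DAG id, package name, and entry point below are hypothetical placeholders, not the repo's actual code:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonVirtualenvOperator


def run_ingestion():
    # This function executes inside a freshly built virtualenv, so only the
    # ingestion dependencies need to be importable here - not Airflow itself.
    # (hypothetical package/entry point; replace with the real one)
    from fair_mast_ingestion import main
    main()


with DAG(
    dag_id="metadata-processing",
    start_date=datetime(2024, 1, 1),
    schedule=None,  # schedule_interval on older 2.x releases
    catchup=False,
) as dag:
    ingest = PythonVirtualenvOperator(
        task_id="ingest",
        python_callable=run_ingestion,
        requirements=["fair-mast-ingestion"],  # hypothetical package name
        system_site_packages=False,  # keep the task env isolated from Airflow's
    )
```

This way the scheduler's environment only needs Airflow, and the task env only needs the ingestion code.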
Could you explain this bit?
Also, what is
This is how it is set up at the moment anyway, following the above instructions. I think you and Sam are giving me conflicting ideas about where Airflow should run?
This is straight from the docs. It explains why there are constraints, and you can change them, e.g. use Python 3.7 instead of 3.8.
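For reference, the constraint-pinned install the docs describe looks roughly like this. The version numbers are examples only, not what this repo actually pins:

```shell
# Example of installing Airflow against the official constraints file,
# as described in the Airflow installation docs. Adjust both versions
# to match your environment.
AIRFLOW_VERSION=2.7.3
PYTHON_VERSION=3.9
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
```

The constraints file pins every transitive dependency to a set that Airflow was tested against, which is why the docs recommend it over a bare `pip install`.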
I was meant to remove that; it should work without it (I don't know what it does, but a tutorial online used it when I was struggling to get Airflow going). The AIRFLOW_HOME is useful to keep, though.
Ah ok, sorry about that. We can all discuss next week to make sure we're all on the same page.
OK, we need to tidy it up then, as in the docs this is just an example and not the actual configuration we want. In fact, also in reference to the above point, installing airflow with
👍
These are the steps I've had to do to get Airflow running; there are probably cleaner ways to do this, but this worked for me. I cannot get it going within the ingestion repo yet, but will look at it in the future:
Create a directory outside the fair-mast-ingestion repo; call it `airflow-dir` for now. Run pretty much the same instructions as for the ingestion to get the environment set up correctly:

```shell
module load python-3.9.6-gcc-5.4.0-sbr552h
python -m venv airflow-venv
source airflow-venv/bin/activate
```
```shell
git clone git@git.ccfe.ac.uk:MAST-U/mastcodes.git
cd mastcodes
```
Edit `uda/python/setup.py` and change the "version" to 1.3.9.

Now we install Airflow, and get the config set up to point to the correct areas.
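A sketch of that step, assuming the venv from above is active and that you want Airflow's state kept inside `airflow-dir` (the version pin is an example, not the repo's actual choice):

```shell
# Point Airflow's state (config, logs, metadata DB) at airflow-dir,
# then install Airflow into the active venv.
export AIRFLOW_HOME="$PWD/airflow"   # Airflow writes airflow.cfg and its DB here
pip install "apache-airflow==2.7.3"  # example pin; prefer a constraints file in practice
```

Exporting `AIRFLOW_HOME` before the first run is what makes the `airflow` dir appear inside `airflow-dir` rather than in `~/airflow`.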
Run Airflow for the first time; this will put an `airflow` dir in your `airflow-dir`, along with the config file. You can now shut down the process. Edit `/airflow/airflow.cfg` and change the following:

```
dags_folder = "PATH"/fair-mast-ingestion/src/dags/
load_examples = False
```
Reset the database and re-run:
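The actual commands aren't shown above; on Airflow 2.x this step is typically the following (an assumption on my part, so check `airflow db --help` for your version):

```shell
airflow db reset --yes   # drops and re-creates the metadata database
airflow standalone       # re-runs Airflow; prints the admin username and password
```

`airflow standalone` runs the scheduler, webserver, and triggerer together, which matches the username/password printed to the terminal in the next step.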
Make note of the username and password in the terminal, and head to http://localhost:8080/ to log in.

The metadata-processing DAG should appear in your DAGs. Before triggering the workflow, please change the paths within `src/dags/ingestion_dag.py` to your own. This is something I need to fix still.