Import canonical_data.countries through Airflow #7

Open
wants to merge 1 commit into
base: master

aquwikimedia

We would like to consolidate the static data from analytics-refinery:
https://github.com/wikimedia/analytics-refinery/blob/master/static_data/mediawiki/geoeditors/blacklist/country_codes.tsv
with the data in country/countries.tsv, so that both are maintained in one place.

In this patch, you will find the HQL script to create the table and the Python script to be triggered by Airflow.

The Airflow job is in this patch:
https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/428

Because analytics-refinery will now use the countries data from this repo, this change alters which countries' data may be released:

  • 6 countries that were previously not allowed to be released will now be released: Burundi, Equatorial Guinea, Libya, Singapore, Somalia, Tajikistan
  • 5 countries that were previously allowed to be released will now be disallowed: Bangladesh, Honduras, Kuwait, Nicaragua, Oman
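
For reviewers who want to reproduce this kind of comparison, here is a minimal sketch of how the two lists could be derived with set differences. It is not the script from this patch: the function name `release_changes` is hypothetical, the inputs are hard-coded from the lists above rather than read from the two files, and "Exampleland" is a made-up entry that only shows that countries blocked on both sides are ignored.

```python
def release_changes(old_blocked, new_blocked):
    """Compare two sets of blocked countries.

    Returns (newly_released, newly_blocked) as sorted lists:
    - newly_released: blocked before, no longer blocked
    - newly_blocked: not blocked before, blocked now
    """
    old_blocked, new_blocked = set(old_blocked), set(new_blocked)
    newly_released = sorted(old_blocked - new_blocked)
    newly_blocked = sorted(new_blocked - old_blocked)
    return newly_released, newly_blocked


# Illustrative inputs only (assumption): real lists would be loaded from the
# analytics-refinery file and from this repo's countries data.
old_blocked = {
    "Burundi", "Equatorial Guinea", "Libya", "Singapore", "Somalia",
    "Tajikistan",
    "Exampleland",  # hypothetical: blocked in both lists, so unchanged
}
new_blocked = {
    "Bangladesh", "Honduras", "Kuwait", "Nicaragua", "Oman",
    "Exampleland",  # hypothetical: blocked in both lists, so unchanged
}

released, blocked = release_changes(old_blocked, new_blocked)
print("Newly released:", ", ".join(released))
print("Newly disallowed:", ", ".join(blocked))
```

Comparing on country codes rather than names would likely be more robust in practice, since spellings can differ between the two sources.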

The corresponding consolidation job in analytics-refinery:
https://gerrit.wikimedia.org/r/c/analytics/refinery/+/929723

Bug: T338033 (https://phabricator.wikimedia.org/T338033)

@nshahquinn
Collaborator

Broader discussion ongoing in T339928.