Use set instead of list for dags' tags #41695
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Overall, I agree with the idea that set is more appropriate in this case, as tags should be unique. I'd be happy to hear more opinions on whether it's worth making this a breaking change.
airflow/models/dag.py (outdated)
@@ -767,7 +767,7 @@ def __init__(
         self.doc_md = self.get_doc_md(doc_md)

-        self.tags = tags or []
+        self.tags: abc.MutableSet[str] = set(tags or [])
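To make the effect of this one-line change concrete, here is a minimal sketch using plain Python values rather than Airflow's DAG class. The tag values are made up for illustration.

```python
# Hypothetical tag input containing a duplicate (not taken from the PR).
tags = ["team-a", "etl", "team-a"]

old_style = tags or []        # list: duplicates are kept
new_style = set(tags or [])   # set: duplicates collapse into one entry

print(len(old_style))  # 3
print(len(new_style))  # 2

# Passing None yields an empty container in both versions.
empty_old = None or []
empty_new = set(None or [])
```

Note that a set does not preserve insertion order, which is one way the change can be visible to code that reads the tags back.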
This should probably just use Collection. We don’t really expect users to add more tags to a DAG after it’s created.
I don't think that's true. In cluster policies, we may want to add custom tags to DAGs (for an organization hierarchy, for example).
And I don't think it would be correct to make it an immutable collection that we would need to recreate every time we want to change something.
If we went with Collection, we would be adopting a different programming paradigm, and I am not sure everyone would like it, so I would keep the door open and let everyone choose what they want.
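The two styles being debated can be sketched side by side. This is only an illustration under assumptions: FakeDag is a hypothetical stand-in rather than Airflow's DAG class, the tag value is invented, and with_org_tag is a made-up helper. The name dag_policy mirrors the cluster policy hook, but the body here is not taken from the PR.

```python
from __future__ import annotations

from collections.abc import Collection, MutableSet
from dataclasses import dataclass, field


@dataclass
class FakeDag:
    """Hypothetical stand-in for a DAG object; not Airflow's class."""

    dag_id: str
    tags: MutableSet[str] = field(default_factory=set)


def dag_policy(dag: FakeDag) -> None:
    # Mutable style: the policy adds an organization tag in place.
    dag.tags.add("org:data-platform")


def with_org_tag(tags: Collection[str]) -> frozenset[str]:
    # Immutable style: nothing is modified; a new collection is returned
    # and the caller has to assign it back.
    return frozenset(tags) | {"org:data-platform"}


dag = FakeDag("example", {"etl"})
dag_policy(dag)
print(sorted(dag.tags))  # ['etl', 'org:data-platform']
```

With a Collection annotation, type checkers would flag the in-place add call, which is the practical difference the two reviewers disagree about.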
def test__tags_mutable():
    expected_tags = {"6", "7"}
    test_dag = DAG("test-dag")
    test_dag.tags.add("6")
    test_dag.tags.add("7")
    test_dag.tags.add("8")
    test_dag.tags.remove("8")
    assert test_dag.tags == expected_tags
Do we need to support this?
Overall, I want to check the behaviour: if we change the behaviour, we should know about it.
For example, changing the implementation of the tags from set to list would break this test, and we would know about it.
That's why I think we should have this test, but let's keep the conversation in the thread above, as we are talking about the same thing.
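A small sketch of why such a test would catch a switch back to list, using plain Python containers instead of a DAG object (an assumption made to keep the example self-contained):

```python
set_tags = set()
set_tags.add("6")  # works: sets support add()

list_tags = []
try:
    list_tags.add("6")  # lists have append(), not add()
    list_has_add = True
except AttributeError:
    list_has_add = False

# A list never compares equal to a set, even with the same elements.
list_equals_set = ["6", "7"] == {"6", "7"}

print(list_has_add)     # False
print(list_equals_set)  # False
```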
I like this cleanup! Especially as it is "non-breaking" for existing DAGs, so there is no effect on DAG authors but we get better data types. Just small proposals to pin the typing to strings.
Some tests fail, otherwise OK for me. Small nits left as comments.
Co-authored-by: Jens Scheffler <95105677+jscheffl@users.noreply.github.com>
I do not like this, but feel free to merge this without me if others feel more strongly that this is beneficial.
…it raises the error: Subscripted generics cannot be used with class and instance checks.
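The error text quoted above is what Python's typing module raises when a subscripted generic is passed to isinstance or issubclass. The thread does not show which check in the PR triggered it, so the following is only a general sketch of the behaviour (assuming Python 3.9 or newer, where collections.abc classes can be subscripted):

```python
import typing
from collections import abc

tags = {"a"}


def check_fails(target) -> bool:
    """Return True if isinstance() rejects the target with a TypeError."""
    try:
        isinstance(tags, target)
    except TypeError:
        return True
    return False


# Subscripted generics are rejected at runtime...
typing_alias_fails = check_fails(typing.MutableSet[str])
abc_alias_fails = check_fails(abc.MutableSet[str])

# ...while the unsubscripted abstract base class works as expected.
plain_check = isinstance(tags, abc.MutableSet)

print(typing_alias_fails, abc_alias_fails, plain_check)  # True True True
```

The usual fix is to keep the subscripted form for annotations only and use the bare class in runtime checks.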
@uranusjr why don't you like this?
I mostly outlined the reasons above, but to summarise, you want to keep this mutable so you can call With that said, 3.0 is the exact time to break everyone, so I am still open for this to be merged if people feel fine about it. But I will not approve nor merge this change myself.
Would it make you happier if we keep the interface as a list (so: non-breaking), but during setting/init it is temporarily converted to a set to ensure no duplicates are in the list?
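One possible reading of this compromise, sketched under assumptions: normalize_tags is a hypothetical helper, not something from the PR, and it uses dict.fromkeys instead of a literal set round-trip so that the original tag order is preserved.

```python
from __future__ import annotations

from collections.abc import Iterable


def normalize_tags(tags: Iterable[str] | None) -> list[str]:
    """Hypothetical helper: return tags as a list without duplicates.

    dict.fromkeys keeps the first occurrence of each tag in its original
    position; list(set(tags)) would also deduplicate but lose the order.
    """
    return list(dict.fromkeys(tags or []))


print(normalize_tags(["b", "a", "b"]))  # ['b', 'a']
print(normalize_tags(None))             # []
```

The public attribute stays a list, so existing code that indexes or appends keeps working, but later appends can still reintroduce duplicates unless the setter runs the same normalization.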
@uranusjr I am sorry to hear you don't like my change. I think the solution of @jscheffl is reasonable to handle your conflict, even though I think we should break the interface to a But can you (@uranusjr) at least resolve your conversations? They are blocking the merge request (which was approved by other people).
Closes issue 41420.
Thank you for viewing this PR :)