Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Migration Guide #13849

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

chore: Migration Guide #13849

wants to merge 3 commits into from

Conversation

comphead
Copy link
Contributor

Which issue does this PR close?

Related to #13764 #13763 #13702.

Rationale for this change

Write a migration guide to document migration steps if the breaking change has been introduced. This is critical to provide for users smoother transition between DF versions

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 19, 2024
@comphead
Copy link
Contributor Author

comphead commented Dec 19, 2024

@alamb @andygrove @Dandandan @jayzhan211 @findepi

Please let me know your thoughts this is a first drop, and lets improve it together

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @comphead -- this is a great start. I think it is very close

I would recommend we don't get caught up in bikeshed some intricate process, but keep the guidelines fairly general and high level and improve them as we iterate.


## Migration Guidelines

To ensure smooth upgrades and maintain application stability, the following guidelines must be followed for changes involving:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistically I would prefer to say this more gently "the following guidelines should be followeed" rather than must. Mostly because there is no way for us to enforce this policy other than pull request reviews

However, I think that may just be a cultural hangup I have as I come from the US and the distinction may not be relevant for other contributors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +78 to +79
- Introducing deprecated methods
- Removal of obsolete methods
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't think the last two items are worth documenting -- deprecated methods should have already been removed so there is no need for a contributor to try and document anything when removing them

I would also like us to get more in the habit of deprecating rather than removing methods which will make the last item irrelevant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deprecated functions are self explanatory if deprecation guidelines followed :)

But one pros for documenting it in migration plan is the user is able to identify amount of migration work needed just by looking into the guide for a specific version.

For example https://spark.apache.org/docs/3.5.3/core-migration-guide.html#upgrading-from-core-34-to-35

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't think the last two items are worth documenting

agreed

But one pros for documenting it in migration plan is the user is able to identify amount of migration work needed just by looking into the guide for a specific version.

otoh it adds burden for development, and sets wrong incentives, discouraging from deprecating methods and removing old code

If this is indeed useful, let's maybe automate pulling of new deprecations, rather than impose a new tax

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automatic deprecation doc generation sounds good. We can think of it later.
I'm not sure about removing it completely as it tracks API evolution for the user.

There would be always a balance between development work and migration support. From recent days we understood user migration is important for them


## Migration Guidelines

To ensure smooth upgrades and maintain application stability, the following guidelines must be followed for changes involving:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can also explicitly say here "it is preferrable not to make breaking API changes and instead add new APis and deprecate existing ones in which case no migration guide is needed"

That would perhaps also subtley incentivize people to avoid breaking changes 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@comphead
Copy link
Contributor Author

I'm thinking we can borrow the format like Apache Spark does https://spark.apache.org/docs/3.5.3/core-migration-guide.html#migration-guide-spark-core

its pretty clear how many work needed to migrate from X to Y

To ensure smooth upgrades and maintain application stability, the following guidelines should be followed for changes involving:

- Public API changes
- Introducing deprecated methods
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not involve any additional migration guide. I.e. this is automatic (unless the deprecated method is no longer correct to call; in such case we should remove it instead of deprecating, or fix it)

Comment on lines +78 to +79
- Introducing deprecated methods
- Removal of obsolete methods
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't think the last two items are worth documenting

agreed

But one pros for documenting it in migration plan is the user is able to identify amount of migration work needed just by looking into the guide for a specific version.

otoh it adds burden for development, and sets wrong incentives, discouraging from deprecating methods and removing old code

If this is indeed useful, let's maybe automate pulling of new deprecations, rather than impose a new tax


To ensure smooth upgrades and maintain application stability, the following guidelines should be followed for changes involving:

- Public API changes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should say what consistutes a public api change

  • removal of a public function
  • signature change of a public function
  • semantic changes if a public funciton
  • new method in a public trait that has no default impl
  • new required traits of a public trait (eg requiring a Foo to be Sync)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is good point.


### Migration Document Requirements:

For each upgrade, append a section in [migration guide](../../../MIGRATION_GUIDE.md) outlining the following:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will we have merge conflicts there often?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of course, like any other file, but should we easily resolvable.

Comment on lines +94 to +100
- List all deprecated methods and expected removal timelines.
- Provide alternative methods or workarounds.

#### Removals:

- List all methods or APIs that are removed in the current release.
- Suggest migration paths for impacted functionalities
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ouch, unless automated, we won't deprecate any code and won't remove and obsolete stuff.
let's not do this

@comphead
Copy link
Contributor Author

comphead commented Dec 20, 2024

Thanks @alamb and @findepi the main key point I got from your messages is to avoid method removal from migration doc and from development cycle. But if so, the obsolete methods will be living forever it makes the deprecated state no sense. As the deprecated it is a flag to method removal in the future.

Another point is the automation which sounds great to me, not sure if everything can be automated but we can do it once the entire process is approved

@alamb
Copy link
Contributor

alamb commented Dec 21, 2024

In my opinion we can't really change people's behavior easily just by defining a process -- it is a more subtle journey to get there.

I would recommend we start with recommendations / basic guidelines rather than try to define the whole process up from and we can refine them as we learn what makes the most sense for DataFusion

I don't think I have the intellectual capacity at the moment to drive such a process, but I do think I will have some after I get the current releases out in a few weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants