Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Robust Lifecycle Scripts #437

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Conversation

fritzy
Copy link
Contributor

@fritzy fritzy commented Aug 23, 2021

Proposal to implement more robust lifecycle scripts with reify context.

See issue npm/cli#3042

```js
{
"lifecycle": {
"install": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having discrete points in the reify process is better than trying to support the pre/post paradigm like this. It was an inadequate model and confuses people a lot (i.e. "pre" means "it runs before the x scripts" not "it runs before x")

I'd prefer not having pre/post be baked in like this. If we have a need for something to run at a time, we put an event there and name it accordingly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Also, I think it would be good to first make the list of the "event moments" we want to expose, and identify where each current scripts entry gets run. It would also give us a way to call out where the "no userland scripts allowed" phase of the process starts and ends (ie, everything between _loadTrees and _reifyPackages in lib/arborist/reify.js).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo "it runs before x" is the semantic every single person wants, and the other semantic is "silly sugar for &&".


Lifecycle scripts such as `preinstall` `install` and `postinstall` are very commonly used, but difficult to reason about due to the way package trees are installed and updated. In the case of `install`, a package could be directly installed as a dependency by a user, installed as a global, indirectly installed as a dependency, or added when dependency versions changed to the point where a dependency could no longer be shared between two packages as a single version. Worst still, `uninstall` could be fired when the package won't be removed at all due to one dependant no longer requiring it, but another still needing to keep it.

An `uninstall` script could assume that the directory will be removed or not used in the future, and put the package into a bad state if it was merely deduped, for example. By providing scripts with more context, like whether a new directory is being made or destroyed, which depenants caused the installation, etc, we could properly cover all use-cases and remove confusion that would lead to destroyed data, bugs, and frustration.
Copy link
Contributor

@isaacs isaacs Aug 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
An `uninstall` script could assume that the directory will be removed or not used in the future, and put the package into a bad state if it was merely deduped, for example. By providing scripts with more context, like whether a new directory is being made or destroyed, which depenants caused the installation, etc, we could properly cover all use-cases and remove confusion that would lead to destroyed data, bugs, and frustration.
An `uninstall` script could assume that the directory will be removed or not used in the future, and put the package into a bad state if it was merely deduped, for example. By providing scripts with more context, like whether a new directory is being made or destroyed, which dependants caused the installation, etc, we could properly cover all use-cases and remove confusion that would lead to destroyed data, bugs, and frustration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally use the American spellings of things, so dependent (for both noun and adjective) rather than dependant. Sorry, total nitpick, I know. (Still reading, will provide more substantive feedback as full review.)

@wraithgar
Copy link
Member

Something that was talked about in previous discussions about this was having some way to at least give people a fighting chance of making this an iterative change. Namely, that these could reference scripts. I don't think that's really the right solution but we should definitely keep that on our radar as we build examples. For instance making the lifecycle script entries call npm run xxx.

This does bring up the point of if these should be exclusive. i.e. if this object is populated at all, then the original lifecycle scripts are not ran.


Lifecycle scripts such as `preinstall` `install` and `postinstall` are very commonly used, but difficult to reason about due to the way package trees are installed and updated. In the case of `install`, a package could be directly installed as a dependency by a user, installed as a global, indirectly installed as a dependency, or added when dependency versions changed to the point where a dependency could no longer be shared between two packages as a single version. Worst still, `uninstall` could be fired when the package won't be removed at all due to one dependant no longer requiring it, but another still needing to keep it.

An `uninstall` script could assume that the directory will be removed or not used in the future, and put the package into a bad state if it was merely deduped, for example. By providing scripts with more context, like whether a new directory is being made or destroyed, which depenants caused the installation, etc, we could properly cover all use-cases and remove confusion that would lead to destroyed data, bugs, and frustration.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally use the American spellings of things, so dependent (for both noun and adjective) rather than dependant. Sorry, total nitpick, I know. (Still reading, will provide more substantive feedback as full review.)


The primary reason for this RFC is the lack of `uninstall` lifecycle scripts in npm v7, and how difficult it would be to re-add them in a clear way.

See issue: #3042
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
See issue: #3042
See: [#3042](https://github.com/npm/cli/issues/3042)

```js
{
"lifecycle": {
"install": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Also, I think it would be good to first make the list of the "event moments" we want to expose, and identify where each current scripts entry gets run. It would also give us a way to call out where the "no userland scripts allowed" phase of the process starts and ends (ie, everything between _loadTrees and _reifyPackages in lib/arborist/reify.js).

Comment on lines +29 to +31
"pre" "bin/life_install.sh --phase=pre ${dependant} ${depenant_path} ${reify-reason}"
"pre" "bin/life_install.sh --phase=current ${dependant} ${reify-reason}"
"pre" "bin/life_install.sh --phase=post ${dependant} ${reify-reason}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"pre" "bin/life_install.sh --phase=pre ${dependant} ${depenant_path} ${reify-reason}"
"pre" "bin/life_install.sh --phase=current ${dependant} ${reify-reason}"
"pre" "bin/life_install.sh --phase=post ${dependant} ${reify-reason}"
"pre": "bin/life_install.sh --phase=pre ${dependent} ${dependent_path} ${reify_reason}",
"current": "bin/life_install.sh --phase=current ${dependent} ${reify_reason}",
"post": "bin/life_install.sh --phase=post ${dependent} ${reify_reason}"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is running during the reserved phase, then we're in trouble.

Before diffing, we don't know what will actually be done. Before unpacking, we don't have the scripts on disk to run. After unpacking, it's arguably not "pre"-install. But a userland script can modify the folder tree (or even the dependency graph, if it writes to package.json files), so it invalidates the work already done. We'd have to keep reloading the trees and recalculating the diff if any changes were made. If we don't do that, then we have security problems. Better to just keep the "pre" to "here's what we were asked to do, we are about to start getting ready to figure out how to do it", and the "post" is "here's what we did, and it's all the way done now".

So something like:

{
  "lifecycle": {
    "beforeTreeChange": "_loadTrees has not yet been called",
    "afterTreeChange": "_reifyPackages has completed",
    "beforeSave": "have not yet written to package.json and package-lock.json",
    "afterSave": "completely wrote lockfile and manifest"
  }
}

If beforeTreeChange and afterTreeChange exposed the various arguments that we currently pass into arborist.reify() (add, rm, update, etc.) then people could use that to decide whether this is an uninstall they care about. If afterTreeChange also had a way to get the diff somehow (though I'm not sure the best way to do this), then they could likewise pick the replacement for scripts.uninstall that most suits their need.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we'd be exposing implementation pretty directly, which is subject to change. For example, a module-map based solution might not use trees at all. Is there a way we could be more future-proof and still be accurate about the lifecycle? Do package maintainers need all of that granularity?

@bnb
Copy link

bnb commented Sep 16, 2021

Would this RFC be a reasonable one to also include enhancements to the lifecycle flags, or should that be a separate concern?

@darcyclarke darcyclarke added Agenda will be discussed at the Open RFC call and removed Agenda will be discussed at the Open RFC call labels Jan 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants