Cache Scalafix instances globally #206

Merged
lolgab merged 4 commits into joan38:main from cache-scalafix-instances
Oct 1, 2024

Conversation

lolgab
Collaborator

@lolgab lolgab commented Sep 30, 2024

Fixes #175

@lolgab lolgab force-pushed the cache-scalafix-instances branch from c037bf0 to 7d57735 Compare September 30, 2024 14:48
@lolgab lolgab marked this pull request as ready for review September 30, 2024 14:56
@lolgab lolgab requested a review from joan38 September 30, 2024 15:22
@lolgab
Collaborator Author

lolgab commented Sep 30, 2024

@bjaglin Do you mind reviewing it?

Contributor

@bjaglin bjaglin left a comment

LGTM! Just added some ideas based on my experience with sbt-scalafix. Next step would be to include the withToolClasspath in the cache key, but the impact won't be as high/broad.

private val cache = new ConcurrentHashMap[(String, Seq[Repository]), SoftReference[Scalafix]]

def getOrElseCreate(scalaVersion: String, repositories: Seq[Repository]) =
cache.get((scalaVersion, repositories)) match {
Contributor

Having a blocking method like computeIfAbsent would avoid getting several concurrent initialisations of the same instance on projects with many modules. This helped with sbt, not sure if the threading model is the same with mill.

Collaborator Author

That's a great idea!
I used compute since I wanted to keep the SoftReference.
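A minimal sketch of the `compute`-based approach mentioned above, not the code merged in this PR. `SoftCache` and its `create` parameter are hypothetical names; `create` stands in for the expensive instantiation. `ConcurrentHashMap.compute` runs the remapping function atomically per key, so concurrent callers asking for the same key do not initialise the value twice, and the `SoftReference` still lets the GC reclaim it under memory pressure.

```scala
import java.lang.ref.SoftReference
import java.util.concurrent.ConcurrentHashMap

// Hypothetical generic cache; `create` stands in for the expensive instantiation.
final class SoftCache[K, V <: AnyRef](create: K => V) {
  private val cache = new ConcurrentHashMap[K, SoftReference[V]]

  def getOrElseCreate(key: K): V = {
    // Strong reference captured inside `compute`, so the value cannot be
    // collected between storing the SoftReference and returning to the caller.
    var result: Option[V] = None
    cache.compute(
      key,
      (_: K, existing: SoftReference[V]) => {
        // `existing` is null on a miss; `ref.get()` is null once the GC cleared it.
        val cached = Option(existing).flatMap(ref => Option(ref.get()))
        val value = cached.getOrElse(create(key))
        result = Some(value)
        if (cached.isDefined) existing else new SoftReference[V](value)
      }
    )
    result.get
  }
}
```

`computeIfAbsent` alone would not be enough here, because an entry whose `SoftReference` has been cleared is still present in the map; `compute` lets the same atomic step replace it.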


private val cache = new ConcurrentHashMap[(String, Seq[Repository]), SoftReference[Scalafix]]

def getOrElseCreate(scalaVersion: String, repositories: Seq[Repository]) =
Contributor

Since repositories is provided, should we have it in the cache key as well?

Collaborator Author

We do, don't we?

Contributor

oops 🙈 my bad!

- val scalafix = Scalafix
-   .fetchAndClassloadInstance(scalaVersion, repositories.map(CoursierUtils.toApiRepository).asJava)
+ val scalafix = ScalafixCache
+   .getOrElseCreate(scalaVersion, repositories)
Contributor

A potential further tweak: as I mentioned in the docs just yesterday, only the major.minor version is required, so FWIW you could strip the patch version to maximize the cache hit ratio (the instance will be the same for 3.3.3 and 3.3.4).

Collaborator Author

That assumption is true now but it might become false in future versions of scalafix. Since it's very rare to use different patch versions in the same build, I wouldn't bother trying to be clever here.

Contributor

Totally fair. FTR, that assumption is based on Scala's current semver semantics, which ensure backward and forward source compatibility for patch releases, but it could indeed change in the future.
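For reference, the patch-stripping idea discussed in this thread (not adopted in the PR) could be sketched as below. `majorMinor` is a hypothetical helper name, and the sketch assumes version strings of the form `major.minor.patch`.

```scala
// Hypothetical helper, not part of the PR: keeps only the first two
// dot-separated segments, so "3.3.3" and "3.3.4" map to the same cache key.
def majorMinor(scalaVersion: String): String =
  scalaVersion.split('.').take(2).mkString(".")
```

Using `majorMinor(scalaVersion)` instead of the full version in the cache key would make builds that mix patch versions share one instance, at the cost of relying on the compatibility assumption debated above.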

Contributor

Alternatively, we could use the full scalafix classpath (as PathRefs, either sorted or as Set) as cache key.

@lolgab lolgab requested a review from bjaglin September 30, 2024 20:20
Contributor

@lefou lefou left a comment

Probably ok, but nevertheless I'd like to point out that the cache is neither bounded in size nor explicitly evicted. This could be an issue in a long-running Mill session with frequently changing Scalafix instances. Unlikely but possible.
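One way to bound such a cache, sketched under assumptions and not something this PR implements: an access-ordered `java.util.LinkedHashMap` that drops its least recently used entry once a size limit is exceeded. `BoundedCache`, `maxEntries`, and `entryCount` are hypothetical names.

```scala
import java.util.{LinkedHashMap, Map => JMap}

// Hypothetical LRU cache: at most `maxEntries` values are retained.
final class BoundedCache[K, V](maxEntries: Int) {
  // accessOrder = true: iteration order goes from least to most recently used.
  private val entries = new LinkedHashMap[K, V](16, 0.75f, true) {
    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean =
      this.size() > maxEntries
  }

  // LinkedHashMap is not thread-safe, so all access goes through one lock.
  def getOrElseCreate(key: K)(create: => V): V = synchronized {
    if (entries.containsKey(key)) entries.get(key)
    else {
      val value = create
      entries.put(key, value)
      value
    }
  }

  def entryCount: Int = synchronized(entries.size())
}
```

Evicted values are simply dropped here; if they hold resources that need explicit cleanup, eviction would also have to close them.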

@lolgab
Collaborator Author

lolgab commented Oct 1, 2024

@lefou Maybe if I make it a Worker it can be cleaned thanks to com-lihaoyi/mill#3579 ?
What do you think?

@lolgab
Collaborator Author

lolgab commented Oct 1, 2024

Oh but fixAction is not a Task. I'm going to merge and release as is. Then for 0.5 we could think about making fixAction a task.

@lolgab lolgab merged commit 8e6c57f into joan38:main Oct 1, 2024
1 check passed
@lolgab lolgab deleted the cache-scalafix-instances branch October 1, 2024 08:27
@lefou
Contributor

lefou commented Oct 1, 2024

@lolgab Yeah, a Worker which returns an AutoCloseable is the go-to solution. Since you share the Scalafix instances, you probably need some reference counting, too, to not close instances still in use.
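The reference-counting idea could look roughly like the following. This is a plain-Scala sketch that does not use the Mill Worker API; `RefCounted`, `acquire`, and `release` are hypothetical names, and it assumes the shared instance is wrapped in something that implements `AutoCloseable`.

```scala
// Hypothetical wrapper: the underlying resource is closed only when the
// last holder releases it, so instances still in use are never closed.
final class RefCounted[A <: AutoCloseable](resource: A) {
  private var holders = 0
  private var closed = false

  // Registers one more user of the shared resource.
  def acquire(): A = synchronized {
    require(!closed, "resource already closed")
    holders += 1
    resource
  }

  // Unregisters one user; closes the resource when nobody holds it anymore.
  def release(): Unit = synchronized {
    require(holders > 0, "release without matching acquire")
    holders -= 1
    if (holders == 0) {
      closed = true
      resource.close()
    }
  }
}
```

Each module would call `acquire()` before using the shared instance and `release()` when done, typically in a `try`/`finally`.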

Development

Successfully merging this pull request may close these issues.

Memoize scalafix instance across modules & invocations
4 participants