WIP : Jupyter notebooks #4253

Quafadas · 2022-08-11T15:05:25Z

This is an (initial, horrible) draft PR which goes after
scalameta/metals-feature-requests#236

I believe it sets out a viable solution sketch. If we were able to get the almond installer working, and include the classpath in it's .json file, my experiments suggest that VSCode would (automagically) do the rest for us. I think I would need (quite some) help, to get ready for merge.

tgodzik · 2022-08-12T08:55:27Z

metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala

@@ -1967,6 +1970,70 @@ class MetalsLanguageServer(
      case ServerCommands.NewScalaProject() =>
        newProjectProvider.createNewProjectFromTemplate().asJavaObject

+      case ServerCommands.SetupNotebookKernelForThisProject() =>


We could probably do this automatically if a notebook path is sent.

Would you have a hint, where to look in metals, for when such a path might be sent?

That's the method annotated with @JsonNotification("textDocument/didOpen") it's send every time a file is open.

https://github.com/scalameta/metals/blob/main/metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala#L1066

tgodzik · 2022-08-12T08:56:04Z

metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala

+        scribe.info("Found classpath ")
+        scribe.info("targetClasspath " + targetClasspath.mkString("\n"))
+        val jvmReprRepo = coursierapi.MavenRepository.of(
+          "https://maven.imagej.net/content/repositories/public/"


Why do you need the additional repo?

Almond requires an artefact jvmrepr. https://github.com/jupyter/jvm-repr

The jvm repr artefact, is only available on jitpack, as far as I can tell.

tgodzik · 2022-08-12T08:56:25Z

metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala

+          "https://maven.imagej.net/content/repositories/public/"
+        )
+        // TODO check scala version is valid. For now use 2.13.7
+        val scalaVersion = "2.13.7"


We can probably use the Scala version from the build target.

Quafadas · 2022-08-12T13:30:22Z

@tgodzik My biggest worry at the moment, is that when I try to actually run this in metals, I get a permission error.

I'm not sure what the restrictions metals has, but I don't know how to deal with this. I'll try to look again, and see if I messed something up the first time... but ...

The other problems I (hope!) can be solved with some work, but if we simply aren't allowed to make changes to the user path, which I think is what almond would need to install itself, then this idea isn't going to work and I'd have to go back to the drawing board. Does the shell runner have the same rights as metals? Is there any way to elevate it's privilege's?

tgodzik · 2022-08-12T13:54:14Z

@tgodzik My biggest worry at the moment, is that when I try to actually run this in metals, I get a permission error.

I'm not sure what the restrictions metals has, but I don't know how to deal with this. I'll try to look again, and see if I messed something up the first time... but ...

The other problems I (hope!) can be solved with some work, but if we simply aren't allowed to make changes to the user path, which I think is what almond would need to install itself, then this idea isn't going to work and I'd have to go back to the drawing board. Does the shell runner have the same rights as metals? Is there any way to elevate it's privilege's?

The shell should have the same permissions as Metals, since it will be an inherited process. There shouldn't be any issues here, we were able to run other process such as ammonite, which are quite similar to this. What is the exception you are getting?

CC @alexarchambault

tanishiking

Hi @Quafadas, thank you very much for kicking the ball rolling around notebook support for Metals!

It looks like what this PR tries to do is something like downloading an almond kernel for the specific Scala version that the project *.ipynb file belongs to (as Metals do that for worksheet feature).

However, I don't think this approach doesn't work

As you can see from the following gif, we can easily switch the kernel by clicking the upper-right button in the vscode notebook. The point is, **the URI of *.ipynb file and Scala project structure **
In my (data science experience, (not sure in the Scala case though), we often create notebook files in separate /notebooks directory, where it doesn't belong to any Scala project.

What the "notebook support" looks like

I was thinking the "notebook support" is like

Developers install almond kernel by themselves (don't need to install kernels from Metals)
- It might be helpful if Metals provide a command to install almond kernel (?) but it's optional (and I'm not sure it's really helpful)
When developers open an *.ipynb file where metadata.kernelspec.language == 'scala', metals magically provide LSP features

What we need to do for "notebook support"

Rough sketch

Construct a valid runnable Scala (Ammonite?) script
- In order to enjoy the LSP feature on notebooks, we have to combine all the Scala cells to a valid Scala script
- It might be nice to rely on LSP 3.17's notebook document synchronization feature
  - I'm not sure how the client concatenates cells or customizes the way to contamination (almond cells need to preprocess before contamination).
Also, we need to deconstruct the LSP response for the concatenated document into the right place (of notebook cells)
- I guess the LSP clients should handle that by something like this Middleware, and Scala should have Metals version of Middleware in somewhere (maybe in metals-vscode?) I'm not sure.
Metals should run for the given (notebook's) Scala version.
- It should be available from language_info.version
- (maybe) start BSP server of ... something (like scala-cli bsp) and connect to the server

VSCode would (automagically) do the rest for us

What do you point to by "the rest"?

…ating a launcher with two seperate classpaths... (not implemented). Updates lsp4k to 0.15.0 in order to get the server notebooks commands.

Quafadas · 2022-08-21T18:25:05Z

@tgodzik I looked into this some more, based on your feedback. What I've done.

I found that didOpenTextDocument command isn't triggered when a notebook is opened, so...
Bumped lsp4j to 0.15.0 to gain access to the didOpenNotebook command, and added that command to metals initialisation options according to the LSP spec.
Added the coursier-launcher dependancy to metals
Moved the bootstrap out into it's own "Notebooks" file, and tidied up some of the cruft, version setting etc.

I think this puts the bootstrap in the "right" place, and deals with all the boilerplate... I tried a very naive strategy for setting the classpath to include the project, and was not surprised when it failed.

A couple days ago, Alex A popped up on the coursier discussions board, and essentially answered the question of how to do what I wanted, "properly".
coursier/coursier#2479

There are a couple of key insights.

Need the courtier-launcher dependency.
Need to somehow split the classpath into two parts. One for launching the kernel itself, and one for ammonite inside, it, to actually play with (if I understood correctly - I may not have).

That part is not done... progress is slow :-(...

Quafadas · 2022-08-21T18:45:58Z

@tanishiking It's cool that someone else is looking this... I'm trying... but I may not be the right person! This is my first foray into tooling.

Here are some guides to my thought process ...

In my (data science experience, (not sure in the Scala case though), we often create notebook files in separate /notebooks directory, where it doesn't belong to any Scala project.

I am not a data scientist, but have done similar. My question: would you keep this workflow, if a .ipynb file, was a first class citizen of the project, with the same classpath and integrations that a .scala file has?

My hypothesis, is no, and that I would (personally) change my workflow. If the answer yes, then I think your solution could be a very good one, and maybe we could bi-furcate a solution along those lines, depending on whether our .ipynb was a source file in the project.

It looks like what this PR tries to do is something like downloading an almond kernel for the specific Scala version

My progress is very slow! I recently pushed some extra commits that take it a (little) further. You are right that this was step 1 - I agree, that this is a trivial piece of information, and hence not that exciting.

However, metals is also in the (unique?) position of having access to a second piece of (non-trivial) information - the project classpath.

As you can see from the following gif, we can easily switch the kernel by clicking the upper-right button in the vscode notebook. The point is, **the URI of *.ipynb file and Scala project structure **

Exactly... what I want to test, is setting up a kernel per project, with the right classpath, for that project. When an .ipynb file is opened, that is in the source directory of a project... metals knows it's classpath! We can easily switch between many kernels with the mechanism you propose.

There would theoretically be a second way of doing this. I believe one could set a CLASSPATH variable in the .env file of the project root, and would require a single "global" kernel per scala version. This can be found in the in the VSCode discussion board if memory serves. I see some downsides with this, I particular if it was part of a project that used sbt .env or mill .env plugins, one could make the user rather angry by trampling their settings. For this reason, I concluded it way be better to totally isolate each project in a kernel per project.

This a hypothesis I wish to test. It may not be a good mechanism.

What the "notebook support" looks like

I think you see here a slightly different vision. I had hoped, that if we can get the classpath problem right inside the LSP mechanism, that

an *.ipynb file could ultimately end up as a first class project citizen. VSCode already has some great diff tools, so version control makes sense etc, would be great to get compilation information "in project", potentially even debugging etc.
that it should be possible to (easily?) benefit from all past (and future) work done on LSP in metals, by wheeling in existing functionality, rather than inventing a new mechanism. Again, a hypothesis I am not yet in a position to test.

At that rate I am progressing, and without help... I guess this may never happen!

As you read, I do not claim, that this is certainly the right way. But I think it is an exciting future if it all hung together... ultimately I think we're pulling the same way.

Quafadas · 2022-08-21T18:46:15Z

@tanishiking Ummm, that was really long... sorry!

tanishiking · 2022-08-22T06:07:39Z

My question: would you keep this workflow, if a .ipynb file, was a first class citizen of the project, with the same classpath and integrations that a .scala file has?

src
`-- main
    `-- scala
        |-- A.scala
        `-- Foo.ipynb

Do you mean, we may want to put Foo.ipynb adjacent to A.scala in src/main/scala directory?
Basically, I don't think so. why do you want to put the *.ipynb there? Maybe I'm seeing a different vision.

We want to use the same Scala version with the A.scala?
- Don't need to put *.ipynb there, because we can switch the kernel Scala version freely.
We want to use the same libraries with the A.scala?
- Don't need to put *.ipynb there because we can download dependency by import $ivy.org.apache.spark::spark-sql:2.4.0, the way of using external dependency in Scala kernel is quite different from python.
We want to call something defined in A.scala from Foo.ipynb?
- Is it possible? Can we do that by feeding the project classpath to the almond?
- It would be really great if we can do that, but it looks pretty advanced feature of almond (?)
  - (while ammonite script can be loaded like this way How to import ammonite script file ? almond-sh/almond#710)

Don't worry, this task is really challenging one, and cutting-edge feature of LSP. Take you time :)

alexarchambault · 2022-08-22T09:25:51Z

Hi all, first thanks @Quafadas for starting up work on this! This feature will probably require quite some work overall to fully work, but earlier groundwork will definitely be useful.

I agree with @tanishiking that Metals shouldn't install Almond itself. If notebook support in VSCode allowed (allows?) LSP servers to supply their own kernels, then Metals should use that, but I don't know if it's supported.

As @tanishiking develops in his comment (around "What we need to do for "notebook support"), Metals will need to have some model of a notebook (basically, the kind of info its gets about projects via BSP, such as Scala version, class path, ...) in order to offer LSP features for it. It'll also need semantic DB files for it.

I think getting such a model can be done more faithfully than building a script for the notebook content. Almond could expose such a model itself, via a BSP connection. For that, it would need to detect when it's run by VSCode. If that's the case, it would add the semanticdb scalac plugin when it (asks Ammonite to) compile(s) things, so that we would get semanticdb files. It would also run a BSP server, expecting clients to connect via a socket file say, at a pre-defined location (like somewhere in ~/.local/share/almond/bsp/{kernel PID}, exact location built with directories-jvm). That BSP server would mainly answer "descriptive" endpoints, such as workspace/buildTargets, buildTarget/dependencySources, and the like. buildTarget/compile and buildTarget/test would be no-op (compilation being handled via the Jupyter protocol).

Then Metals would have to connect to that BSP server to know about the notebook. And internally in Metals, we would have one "project" per notebook, as you already discussed above.

alexarchambault · 2022-08-22T09:34:41Z

One hurdle though, is going to be the mismatch between the sources passed to scalac by Almond (via Ammonite) and the locations in the notebooks that Metals should be passed via LSP.

Basically, Almond / Ammonite creates one source per cell, looking roughly like

package ammonite.$sess
// imports of things from cmd0, cmd1, ... here (that is things defined in previous cells)
object cmd2 {
  // cell code goes here
  def $main() = {
    // generated code that prints variables, function names, etc. defined in cell code
  }
}

So the semantic DBs we get from scalac are about those files.

These semantic DBs will need to be post-processed, even maybe merged, so that they correspond to the document and locations that Metals gets asked about via LSP. As examples, post-processing (of diagnostics, semantic DBs, ...) in Scala CLI is handled around here. In Ammonite (when it runs a BSP server for scripts), it's handled around here.

Also, Metals sometimes needs to pass sources to the interactive compiler (at least for completions, and for some other features too). So it needs to be able to create on-the-fly a source that looks like the one above, so that the user code in it sees the previous cells in particular.

Quafadas · 2022-08-22T11:57:49Z

I want to think more about all the feedback above, but in response to;

If notebook support in VSCode allowed (allows?) LSP servers to supply their own kernels

I don't think that's the model.
microsoft/vscode-jupyter#10813

If this helps at all we do respect global jupyter paths for where we pick up kernelspecs.

https://jupyter-client.readthedocs.io/en/latest/kernels.html#kernel-specs

My understanding of that comment, is that VSCode will pick up any kernel it can find according to the Juypter kernelspec document. Hence I have not had the expectation, that the kernel was built into or owned by the LSP server itself. I was hoping to use metals server, to bootstrap kernels on demand, which are compatible with the current project...

@tanishiking Your examples above are great.

We want to call something defined in A.scala from Foo.ipynb?
Is it possible? Can we do that by feeding the project classpath to the almond?
It would be really great if we can do that, but it looks pretty advanced feature of almond (?)

This was my goal. Is it possible? I hope...

the previous two functionalities are waypoints that are needed to get there, but I would agree, not "goals" in their own right.

I guess I'd like to finish what I started, and see if it is indeed possible to get such a kernel to execute code found in "A.scala" in a Foo.ipynb notebook. Whether that was worthy of review / merge is then up to someone else :-).

Based on Alex's response, It sounds like true LSP support would then be a further, seemingly tougher step... I did read the LSP 3.17 notebook spec... it reads like it's designed to re-use as much prior art as possible, but I do not pretend, to understand how.

Quafadas · 2022-08-22T22:13:14Z

I had my best shot at what I wanted from this... and failed!

This branch does now appear to and setup a kernel which is recognised by VSCode when opening an ipynb file.

However, in a workspace with no dependancies, running 1 + 2 in a notebook, using that kernel, I get this

  dotty.tools.dotc.classpath.ZipAndJarFileLookupFactory.create(ZipAndJarFileLookupFactory.scala:28)
  dotty.tools.dotc.classpath.ZipAndJarFileLookupFactory.create$(ZipAndJarFileLookupFactory.scala:23)
  dotty.tools.dotc.classpath.ZipAndJarClassPathFactory$.create(ZipAndJarFileLookupFactory.scala:41)
  ammonite.compiler.Compiler.$anonfun$5(Compiler.scala:124)
  scala.collection.immutable.Vector2.map(Vector.scala:1872)
  scala.collection.immutable.Vector2.map(Vector.scala:433)
  ammonite.compiler.Compiler.ammonite$compiler$Compiler$$asDottyClassPath(Compiler.scala:124)
  ammonite.compiler.Compiler$$anon$9.classPath(Compiler.scala:76)

If I include dependancies in the project, then the kernel doesn't even start. That's both good and bad - I can see that those dependancies end up in the launcher jar - so it's trying to do the right thing. Just failing!

Point is, I have no idea where to even begin, on trying to sort this out, sadly. Further, based on the commentary above, it's not even clear that this is the "right" direction of travel.

If forced to guess, I would guess that this may not be a good strategy, but I have no evidence for that.
https://github.com/Quafadas/metals/blob/2f539ae3971c24ac3ee1ae33c9bbc0285aecc9ab/metals/src/main/scala/scala/meta/internal/metals/notebooks/Notebooks.scala#L80

Feedback welcomed... but, I'm stuck.

Quafadas · 2022-08-23T09:45:14Z

If anyone wants to take a look and thinks this is a sensible direction of travel / they can help, most welcome and I'll reopen.

tgodzik · 2022-08-23T11:50:12Z

Thanks @Quafadas for all the work! It seems that we might indeed need to rethink the approach and how we handle notebooks 🤔 I don't have a lot of expertise there unfortunately, but I hoped it wouldn't be that complex :sigh:

@alexarchambault @tanishiking Let's keep the discussion running in the feature request.

Quafadas · 2022-08-23T13:01:43Z

@tgodzik It was worth a punt!

It's healthy that there are lots of different views :-)... my own is that I guess I'm not that excited by LSP support (don't get me wrong, very cool)... but being able to just "run" a kernel, with the project context (even without the trimmings), I think could be super useful (for me)... and I was hoping too... a lot simpler :-)!

... so I gave it my best shot and failed! Which I'm fine with... I'm a bit bummed out that I don't have a distance function - do I have a typo... Or is it that trying to construct a classpath like this will never work... But hey ho - I learned a lot...

Most of which was how incredible the metals project is - thanks again for your willingness to answer!

tanishiking · 2022-08-23T15:43:18Z

@Quafadas Thanks again for working on this!
While "Language support for notebook" is something different and pretty complex task, being able to set up kernel from Metals would be still useful ❤️

If you're interested, maybe we can open another ticket for "setting up kernel from Metals" 👍 "running kernel with the project context" will be a stretch goal.

Quafadas added 7 commits July 10, 2022 16:45

Initial download attempt

47759fb

speed up local publish

82cc66e

Allow shell runner to accept extra repositories

edf0fc4

Try to figure out how to trigger a command

6b96e41

Initial, horrible implementation of kernel installation

86f8783

Formatting

8712736

Formatting

f21285b

Quafadas mentioned this pull request Aug 11, 2022

WIP: Creates a command which (one day I hope) will setup a juypter notebook kernel for the current project scalameta/metals-vscode#1101

Closed

tgodzik reviewed Aug 12, 2022

View reviewed changes

Quafadas added 2 commits August 14, 2022 22:40

This installs a kernel which actually runs

e6c3070

Formatting

5a02211

tanishiking reviewed Aug 21, 2022

View reviewed changes

Quafadas added 4 commits August 21, 2022 20:14

Creates a kernel on load of .ipynb file, sets up dependencies for cre…

d82a1ba

…ating a launcher with two seperate classpaths... (not implemented). Updates lsp4k to 0.15.0 in order to get the server notebooks commands.

formatting

10cb1c7

more formatting

0dd4ad0

Not gonna compile like that

273a8bd

Quafadas added 3 commits August 22, 2022 23:21

Stuck

f0865cf

fomattting

268b86a

This sets up a useless kernel, sadly.

2f539ae

tanishiking mentioned this pull request Aug 23, 2022

Add language support for notebook cells scalameta/metals-feature-requests#236

Open

Quafadas closed this Aug 23, 2022

alexarchambault mentioned this pull request Sep 20, 2022

docs: Add a design doc for Language Support for notebook cells #4309

Closed

tanishiking mentioned this pull request Sep 23, 2022

Design doc for Language support for VSCode notebook #4434

Closed

7 tasks

allen-li1231 mentioned this pull request Sep 14, 2024

Scala intellisense in Scala paragraphs + general latency issues allen-li1231/zeppelin-vscode#12

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP : Jupyter notebooks #4253

WIP : Jupyter notebooks #4253

Quafadas commented Aug 11, 2022 •

edited

Loading

tgodzik Aug 12, 2022

Quafadas Aug 14, 2022

tgodzik Aug 15, 2022

tgodzik Aug 12, 2022

Quafadas Aug 12, 2022 •

edited

Loading

tgodzik Aug 12, 2022

Quafadas commented Aug 12, 2022

tgodzik commented Aug 12, 2022

tanishiking left a comment

Quafadas commented Aug 21, 2022

Quafadas commented Aug 21, 2022 •

edited

Loading

Quafadas commented Aug 21, 2022

tanishiking commented Aug 22, 2022 •

edited

Loading

alexarchambault commented Aug 22, 2022

alexarchambault commented Aug 22, 2022 •

edited

Loading

Quafadas commented Aug 22, 2022 •

edited

Loading

Quafadas commented Aug 22, 2022 •

edited

Loading

Quafadas commented Aug 23, 2022

tgodzik commented Aug 23, 2022

Quafadas commented Aug 23, 2022

tanishiking commented Aug 23, 2022

WIP : Jupyter notebooks #4253

WIP : Jupyter notebooks #4253

Conversation

Quafadas commented Aug 11, 2022 • edited Loading

tgodzik Aug 12, 2022

Choose a reason for hiding this comment

Quafadas Aug 14, 2022

Choose a reason for hiding this comment

tgodzik Aug 15, 2022

Choose a reason for hiding this comment

tgodzik Aug 12, 2022

Choose a reason for hiding this comment

Quafadas Aug 12, 2022 • edited Loading

Choose a reason for hiding this comment

tgodzik Aug 12, 2022

Choose a reason for hiding this comment

Quafadas commented Aug 12, 2022

tgodzik commented Aug 12, 2022

tanishiking left a comment

Choose a reason for hiding this comment

What the "notebook support" looks like

What we need to do for "notebook support"

Quafadas commented Aug 21, 2022

Quafadas commented Aug 21, 2022 • edited Loading

What the "notebook support" looks like

Quafadas commented Aug 21, 2022

tanishiking commented Aug 22, 2022 • edited Loading

alexarchambault commented Aug 22, 2022

alexarchambault commented Aug 22, 2022 • edited Loading

Quafadas commented Aug 22, 2022 • edited Loading

Quafadas commented Aug 22, 2022 • edited Loading

Quafadas commented Aug 23, 2022

tgodzik commented Aug 23, 2022

Quafadas commented Aug 23, 2022

tanishiking commented Aug 23, 2022

Quafadas commented Aug 11, 2022 •

edited

Loading

Quafadas Aug 12, 2022 •

edited

Loading

Quafadas commented Aug 21, 2022 •

edited

Loading

tanishiking commented Aug 22, 2022 •

edited

Loading

alexarchambault commented Aug 22, 2022 •

edited

Loading

Quafadas commented Aug 22, 2022 •

edited

Loading

Quafadas commented Aug 22, 2022 •

edited

Loading