Parse markdown outputs in notebooks along with the rest of the markdown on a page in order to programmatically generate MyST content #1026

Open
choldgraf opened this issue Mar 22, 2024 · 11 comments
Labels
enhancement New feature or request

Comments

@choldgraf
Collaborator

choldgraf commented Mar 22, 2024

Jupyter notebook cells can produce text output that is specific to markdown (text/markdown). It'd be useful if MyST could parse this output along with the rest of the markdown in the notebook. This would allow for programmatic generation of notebook content, and along with the {embed} directive could allow you to nicely stitch together MyST content via any Jupyter Kernel.

For example:

[Screenshot: CleanShot 2024-03-22 at 11 05 46@2x]

As an example, the MyST-nb Sphinx extension documents this functionality here.

An example from our docs where this would be useful

In the admonitions docs here:

https://github.com/executablebooks/mystmd/blob/c0f51ee51890158081736722cb01bcf510699786/docs/admonitions.md?plain=1#L34-L98

We have a big list of all the different types of admonitions for demonstration. It's just boilerplate myst repeated over and over. If we parsed Markdown outputs as MyST, we could replace it with something like:

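inner = []
for admonition in ["note", "tip", "warning"]:
    # One tab item per admonition type. The string content is flush left
    # so the generated MyST is not indented.
    s = """
````{tab-item}
```{%s}
This is a %s admonition.
```
````
""" % (admonition, admonition)
    inner.append(s)
# Wrap the tab items in a single tab-set.
template = """
`````{tab-set}
%s
`````
""" % "\n".join(inner)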

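The snippet above only builds a string. For the proposal to apply, the cell would also have to emit that string as a `text/markdown` output. A minimal sketch, assuming an IPython-compatible kernel, which displays any object defining `_repr_markdown_` as Markdown when it is the last expression in a cell. The names `MystMarkdown`, `fence` and `admonition_tabs` are made up for illustration, and each tab item is given a title on the assumption that `{tab-item}` expects one.

```python
class MystMarkdown:
    """Hypothetical wrapper that IPython shows as a text/markdown output."""

    def __init__(self, text):
        self.text = text

    def _repr_markdown_(self):
        # IPython's rich display protocol calls this to get Markdown source.
        return self.text


def fence(n):
    """Return a run of n backticks, so no literal fences are nested here."""
    return "`" * n


def admonition_tabs(kinds):
    """Build a tab-set with one demo admonition per tab."""
    items = []
    for kind in kinds:
        items.append(
            "\n".join([
                f"{fence(4)}{{tab-item}} {kind}",  # assumed: title argument
                f"{fence(3)}{{{kind}}}",
                f"This is a {kind} admonition.",
                fence(3),
                fence(4),
            ])
        )
    return "\n".join([f"{fence(5)}{{tab-set}}", *items, fence(5)])


tabs = MystMarkdown(admonition_tabs(["note", "tip", "warning"]))
tabs  # as the last expression in a cell, this is displayed as Markdown
```

Whether MyST then parses that output as MyST rather than showing it as plain Markdown is exactly what this issue asks for.
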
Related

@choldgraf choldgraf added the enhancement New feature or request label Mar 22, 2024
@choldgraf choldgraf changed the title Parse markdown outputs in notebooks along with the rest of the markdown on a page Parse markdown outputs in notebooks along with the rest of the markdown on a page in order to programmatically generate MyST content Sep 30, 2024
@HelgeGehring

HelgeGehring commented Sep 30, 2024

continuing #1559 here:

An example for this would be the following:

  • Right now I have a nice notebook which analyzes exactly one physical structure.
  • The analysis results in several graphs / tables / images / ..., each generated by a code cell

My problem now is that I have 2-20 different structures. For each I get graphs based on the same code.
Right now the only solution is to put them all below each other, leading to a quite long doc.

Better would be to have

  • tabs in which I can select which structure I want to see

and/or

  • a table which has only the main results (maybe broken down in a number or two), but on click on a row, I see a detailed analysis for that structure.

A simplified example might be statistics about bank accounts (not scientific, but less abstract).

  • It's simple to write a notebook which produces graphs / tables, ... based on an account number
  • Now I have 5 accounts and want this one notebook to do the same analysis with all of them. An overview would be the balance of each one in a table. But I'd like to be able to switch tabs/expand rows of the different accounts to see the detailed graphs.
  • Alternatively, I'd like to have automatically one notebook per account. As the code is the same, I don't wanna copy/paste

For different use-cases different approaches might be best. I'm wondering what might be possible:

  • have tabs at the level of the complete notebook, i.e. execute the notebook like a GitHub Actions matrix with different parameters
  • have tabs within each output (which might be synchronized when switching (?))
  • have outputs combined with hidden content, e.g. a table in which I can expand individual rows to see more details.
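
If Markdown outputs were parsed as MyST, the overview-table and tabs-per-account ideas above could come from a single cell. A rough sketch with made-up account data and hypothetical helper names (`summary_table`, `detail_tabs`). It uses only the `{tab-set}` / `{tab-item}` directives already mentioned in this thread, and the one-line detail text stands in for the real graphs.

```python
# Hypothetical data: account id -> balance. Real code would compute these.
accounts = {"A-001": 1250.0, "A-002": 310.5, "A-003": 9800.25}


def summary_table(balances):
    """Overview: one Markdown table row per account."""
    rows = ["| Account | Balance |", "| --- | ---: |"]
    rows += [f"| {name} | {value:.2f} |" for name, value in balances.items()]
    return "\n".join(rows)


def detail_tabs(balances):
    """Details: one tab per account inside a single tab-set."""
    outer, inner = "`" * 4, "`" * 3  # backtick fences, built without literals
    parts = [f"{outer}{{tab-set}}"]
    for name, value in balances.items():
        parts += [
            f"{inner}{{tab-item}} {name}",
            f"Detailed analysis for {name} (balance {value:.2f}).",
            inner,
        ]
    parts.append(outer)
    return "\n".join(parts)


report = summary_table(accounts) + "\n\n" + detail_tabs(accounts)
print(report)
```

The same loop would scale from 2 to 20 structures without copying any analysis code; only the `accounts` mapping changes.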

@agoose77 agoose77 moved this to Needed for beta in Jupyter Book MVP Oct 2, 2024
@choldgraf
Collaborator Author

Workaround: Write to temporary text files and then use {include}

I discovered this workaround today. It's a bit kludgy but not too hacky. It takes advantage of the fact that code cells are executed before a page is parsed by MyST, so you can do something like the following:

## Generate content with Jupyter

```{code-cell} python
from pathlib import Path
p = Path("../_build/txt/tmp.txt")
p.parent.mkdir(parents=True, exist_ok=True)
_ = p.write_text("- **Testing**\n- Testing two\n- Testing three")
```

And then include it in the page with MyST markdown like so:

```{include} ../_build/txt/tmp.txt
```

This will:

  1. Generate some MyST Markdown in Jupyter
  2. Write it to a .txt file
  3. Reference that .txt file later in the page, in MyST MD, with an {include} directive

So the page is executed first, the .txt file is created, and MyST then includes it.
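
The workaround can be wrapped in a small helper so that each cell writes its generated MyST to a predictable file. This is only a sketch: `write_myst_fragment` and its arguments are assumptions, not part of MyST, and the demo writes to a temporary directory rather than `../_build/txt`.

```python
import tempfile
from pathlib import Path


def write_myst_fragment(text, name, out_dir):
    """Write generated MyST to out_dir/name and return the path to include."""
    path = Path(out_dir) / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


# Demo in a throwaway directory. A real page would use a location that its
# {include} directive can reach, such as ../_build/txt/ in the example above.
out_dir = Path(tempfile.mkdtemp()) / "txt"
items = ["**Testing**", "Testing two", "Testing three"]
fragment = write_myst_fragment(
    "\n".join(f"- {item}" for item in items), "tmp.txt", out_dir
)
print(fragment.read_text(encoding="utf-8"))
```

Because the file only exists after execution, this still depends on code cells running before the page is parsed.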

@agoose77
Contributor

agoose77 commented Nov 2, 2024

@choldgraf we should also make it possible to include files with .myst.json extensions, to support pulling in AST.

@choldgraf
Collaborator Author

Yeah I was thinking that too. It also made me wonder if the notebook cell MyST support could be more like the plugin structure, rather than making it language specific.

For example, a cell tag like output-myst or output-myst-ast that would tell MyST to parse stdout as MyST (similar to what executable plugins do).

Does that make sense? If so I can update the issue body to reflect that suggestion.

@agoose77
Contributor

agoose77 commented Nov 2, 2024

@choldgraf the natural way to do this would be to define a MIME type for MyST AST, and recognise it in our transforms!
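
A rough sketch of the MIME-type idea from the kernel side, assuming an IPython-compatible kernel. `_repr_mimebundle_` is IPython's hook for returning several representations of an object at once. The MIME type string `application/vnd.myst.ast+json` and the class `MystAST` are placeholders invented for this example, since no such type is defined in this thread.

```python
import json

# Placeholder MIME type: hypothetical, not something defined by MyST here.
MYST_AST_MIME = "application/vnd.myst.ast+json"


class MystAST:
    """Hypothetical display object carrying a MyST AST node."""

    def __init__(self, node):
        self.node = node

    def _repr_mimebundle_(self, include=None, exclude=None):
        # Offer the AST under the custom type, plus a JSON text fallback
        # for frontends that do not recognise it.
        return {
            MYST_AST_MIME: self.node,
            "text/plain": json.dumps(self.node, indent=2),
        }


node = {
    "type": "paragraph",
    "children": [{"type": "text", "value": "Hello from a kernel"}],
}
bundle = MystAST(node)._repr_mimebundle_()
print(sorted(bundle))
```

A transform on the MyST side could then look for that key in a cell's outputs and splice the node into the page AST.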

@choldgraf
Collaborator Author

That sounds like a good idea. Though if it were the only way, then each kernel would need to have a package that outputs MyST right? The benefit of tags and stdout is that anybody in any language could use it without developing anything specific.

@agoose77
Contributor

agoose77 commented Nov 2, 2024

Although we don't need a package for this (certainly with IPython, you can just use display), I'm curious to understand your thought process. Are you picturing a user printing stringified JSON to stdout? Or MyST markup, i.e. text/markdown?

@choldgraf
Collaborator Author

choldgraf commented Nov 2, 2024

My idea was inspired by the way that you handled "black box" outputs for the executable plugins infrastructure. So below I'll share a Python and an R cell that would generate either MyST MD or MyST AST. At build time, if the tag was identified on a cell, then any text/plain output (or however stdout gets logged) would be parsed differently. Without those tags, the output would just be parsed like any other stdout.

Below I'll use print, but since Jupyter "returns" the result of the final executed line, it may not be necessary. It'd also be cool if this worked with variables too, so that you could insert generated MyST elsewhere.

Python - MyST MD

```{code-cell} python
:tags: output-myst-md
print("- **Bolded** list item")
```

Python - MyST AST [1]

```{code-cell} python
:tags: output-myst-ast
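import json

# A MyST AST list node with one item: "**Bolded** list item".
ast = {
    "type": "list",
    "ordered": False,
    "spread": False,
    "children": [
        {
            "type": "listItem",
            "spread": True,
            "children": [
                {
                    "type": "strong",
                    "children": [{"type": "text", "value": "Bolded"}],
                },
                {"type": "text", "value": " list item"},
            ],
        }
    ],
}
# Print JSON rather than the Python repr so a parser receives valid JSON.
print(json.dumps(ast))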
```
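
On the build side, the tag-based proposal would amount to something like the following: find the tagged code cells and collect their stdout for special parsing. This is a sketch, not how MyST implements anything. It reads the notebook as plain nbformat v4 JSON, the tag names are the ones proposed above, and `tagged_stdout` is a hypothetical helper.

```python
PROPOSED_TAGS = {"output-myst-md", "output-myst-ast"}  # from the proposal above


def tagged_stdout(notebook):
    """Yield (tag, text) for the stdout of code cells with a proposed tag."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        tags = PROPOSED_TAGS & set(cell.get("metadata", {}).get("tags", []))
        if not tags:
            continue
        chunks = []
        for out in cell.get("outputs", []):
            if out.get("output_type") == "stream" and out.get("name") == "stdout":
                text = out.get("text", "")
                # nbformat allows stream text as a string or a list of lines.
                chunks.append("".join(text) if isinstance(text, list) else text)
        for tag in sorted(tags):
            yield tag, "".join(chunks)


# Tiny in-memory notebook standing in for an executed .ipynb file.
nb = {
    "cells": [
        {
            "cell_type": "code",
            "metadata": {"tags": ["output-myst-md"]},
            "outputs": [
                {
                    "output_type": "stream",
                    "name": "stdout",
                    "text": ["- **Bolded** list item\n"],
                }
            ],
        },
        {"cell_type": "markdown", "metadata": {}, "source": "ignored"},
    ]
}
results = list(tagged_stdout(nb))
print(results)
```

Untagged cells are skipped, which matches the idea that their output would be treated like any other stdout.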

Footnotes

  1. This is also a good demonstration of why I think it's way nicer to be able to parse MyST MD directly and not only MyST AST :-)

@fperez

fperez commented Nov 5, 2024

I love the idea of parsing plain/MyST Markdown proper, as it's very easy to generate from many tools. In my code I very often have something like

from IPython.display import display, Markdown
md = lambda s: display(Markdown(s))

and I use md(x) everywhere as a "print markdown" shorthand to generate legible, pretty reporting with zero fuss.

Having the ability to properly handle that (along perhaps with a new MyST object in IPython.display) would be very useful, I think.

@choldgraf
Collaborator Author

I decided to create a separate issue to track generating MyST AST directly from cell outputs, since that might be an easier short-term solution and get us part of the way there:

@agoose77
Contributor

@rowanc1 @stevejpurves @fwkoch @choldgraf

I can't seem to paste these in Discord right now. Here are some quick discussion videos (~2min each) about the work I've been doing on Markdown output parsing:

https://github.com/user-attachments/assets/e0d61b90-2148-42e5-b433-1fddbad6f301
https://github.com/user-attachments/assets/a2d69933-1d2b-4c2a-8c83-3aba3b8f7c03

The PR I'm most excited about is currently #1671.

Need to dash, no rush here, but it would be good to book in a video meeting at some point.

Projects
Status: In Progress
Development

No branches or pull requests

4 participants