`_outline()` functions feedback #28

olivroy · 2024-05-17T15:02:02Z

Please comment on this issue if you tried the *_outline() functions and would like some functionality or encountered something unexpected

cf rstudio/rstudioapi#153 (comment)

Current plans

Potentially use lightparser for qmd files Use roxygen2 parser for roxy comments + lightparser for figure and tbl captions #29
Use roxygen2 file parsing (once I figure out how to use roxygen2::parse_package() + md (Use roxygen2 parser for roxy comments + lightparser for figure and tbl captions #29)
Make it easier to customize what output you want to see. Would be very useful to set as options in Rprofile since these functions should be used interactively.

Known issues

~~Falsely recognized commented code as headings in commented code.~~ solved by roxygen2.
~~multiline titles or fig-caps in qmd files, not recognized~~ solved by lightparser
~~Only yaml style comments recognized (not in the chunk header)~~ solved by lighparser

The text was updated successfully, but these errors were encountered:

violetcereza · 2024-05-24T00:50:03Z

Thanks again for building this! I have been using this for my project, but I wanted to intersperse different levels of headings with my TODO items. I had a lot of trouble figuring out context for my to-do items in the middle of a large notebook.

To accomplish this, I wrote a printing function that takes proj_outline() output and translates the tibble into something we can display with cli::tree(). Here's an example tree with to-do items interspersed:

test.qmd
└─Test
  ├─Quarto
  ├─Running Code
  │ ├─TODO: fix this in the code
  │ └─A sub header
  │   └─TODO: here's a todo in the text
  └─Back to header 1
    ├─Dont skip me
    │ └─header 5
    │   └─TODO: testing section
    └─Another sub header
      └─TODO: section test

I am still playing with the formatting. I started using cli::format_inline() to incorporate the contextual links you designed, but it doesn't show up here.

Below is the code I'm using. First, I simplify the outline tibble to something that makes sense to me. Then, I make the indent variable which is based off of n_leading_hash. Then, I transform a tibble with "indents" into a tree structure for cli::tree(). I can rewrite this as a pull request eventually, but I want to take a shot at using lightparser and spend more time customizing the appearance with colors and stuff before I integrate into your codebase.

  outline_data <- reuseme::file_outline(path = "test.qmd") %>%

  # Convert the many is_ columns into mutually exclusive "outline row types"
  pivot_longer(
    names_to = "type", names_prefix = "is_", c(
      starts_with("is_"), -is_md, -is_second_level_heading_or_more, -is_roxygen_comment, -is_notebook
    )
  ) %>%
  # # Double check that types are mututally exclusive
  # filter(all(value == F), .by = c(file_short, title, line_id))
  # filter(sum(value) > 1, .by = c(file_short, title, line_id))
  filter(value == T) %>%
  # We drop these because they don't serve to add much context to TODOs (they don't affect heirarchy)
  filter(type != "tab_or_plot_title") %>%

  # Some useful definitions!
  mutate(
    title = coalesce(outline_el, title_el),
    file_short = fs::path_file(file),
    n_leading_hash = type %>% case_match(
      c("todo_fixme", "tab_or_plot_title") ~ NA,
      .default = n_leading_hash
    )
  ) %>%

  # For each file, stick a item at the top of the outline
  group_by(file, file_short) %>%
  group_modify(\(data, group) data %>% add_row(
    .before = 0,
    n_leading_hash = -1,
    title = group$file_short,
    type = "file"
  )) %>%

  mutate(
    # Processing how title displays based on type
    print_title = type %>% case_match(
      "todo_fixme" ~ link,
      .default = link_rs_api
    ) %>% coalesce(title) %>% map_chr(cli::format_inline),

    # Assign TODO items (and other items missing n_leading_hash)
    # to be indented under the last seen header level
    indent = coalesce(n_leading_hash, zoo::na.locf0(n_leading_hash+1)),

    # If there are any headers that skip an intermediate level, pick them up
    skip_level = indent > lag(indent)+1,
    skip_level_should_be = ifelse(skip_level, lag(indent)+1, NA),
    skip_level_adjust = case_when(
      # All the items below on the outline should be adjusted backwards
      skip_level ~ skip_level_should_be-indent,
      # Unless we reach a point on the outline where we're back up in
      # the hierachy, so stop adjusting.
      indent <= zoo::na.locf0(skip_level_should_be) ~ 0
    ) %>%
      # Carry the adjustments to later rows
      zoo::na.locf0() %>% coalesce(0),

    indent = indent + skip_level_adjust
  ) %>%
  ungroup() %>%
  select(title, type, n_leading_hash, indent, print_title)

outline_data %>%
  mutate(
    # Give items IDs so titles do not have to be unique
    item_id = as.character(row_number()),
    indent_wider = indent,
    x = TRUE
  ) %>%

  # We need these wide cumsum `header1` type fields to determine which items belong to which parents
  pivot_wider(names_from = indent_wider, values_from = x, values_fill = F, names_prefix = "header") %>%
  mutate(across(starts_with("header"), cumsum)) %>%

  # For each row, pick the IDs of all direct children from the outline
  pmap(function(...) with(list(..., childdata = .), tibble(
    title,
    print_title,
    indent,
    item_id,
    type,
    parent_level_id = get(stringr::str_c("header", indent)),
    children_ids = childdata %>%
      dplyr::rename(childindent = indent) %>%
      dplyr::filter(
        childindent == indent+1,
        cumsum(childindent == indent) == parent_level_id
      ) %>%
      dplyr::pull(item_id) %>% list()
  ))) %>%
  list_rbind() %>%
  # View()
  select(item_id, children_ids, print_title) %>%
  cli::tree()

And here's the qmd text I used:

---
title: "Test"
format: html
editor: visual
---

## Quarto

Quarto enables you to weave together content and executable code into a finished document. To learn more about Quarto see <https://quarto.org>.

## Running Code

When you click the **Render** button a document will be generated that includes both content and the output of embedded code. You can embed code like this:

```{r}
1 + 1
# TODO: fix this in the code
```

You can add options to executable code like this

### A sub header

```{r}
#| echo: false
2 * 2
```

The `echo: false` option disables the printing of code (only output is displayed).

TODO: here's a todo in the text

# Back to header 1

## Dont skip me

##### header 5

TODO: testing section

## Another sub header

TODO: section test

olivroy · 2024-05-25T14:08:06Z

thank you so much for your insightful comments! I am trying to wrap up parsing roxygen2 comments properly. I will get to this soon!

I really like that you created an example code! I will try to see what I can come up with!

For indentation, it is planned. As you can see, there are already different colors for first level headings vs the rest.

I divided topics between "important" and "not_important", but I can add more categories, colors and more levels of indentation!

violetcereza · 2024-06-06T22:29:29Z

Current iteration of my version! Fork and pull request incoming this week...

Features I plan to look at:

Collapse tree items with one child into Parent -> Child combined nodes for brevity
Somehow adapt to HTML output, so I can embed the TODO list into a notebook (or copy/paste into word processor) to show my boss
If you wanted to use cli_h1() type markup, I could adapt it to use special styles for the top-level nodes in the tree
Use lightparser
Rework how outlines are stored under the hood a bit

After playing with the complete_todo() links a bit, I really like the idea but I don't find it useful enough to keep for me. I update todo annotations relatively rarely, so it isn't a hassle to click the pound sign (to jump to the line) and manually delete the item. Additionally, sometimes TODO items have code on the left side and removing the line breaks the code. I'm not sure supporting this feature is worth all the regex work and potential risk of damaging my scripts.

olivroy · 2024-06-06T22:37:58Z

Good to hear!

For complete_todo(), I understand, I did fix that up though in 14ec8e7.

Let me know if it still breaks for you.

Also instead of showing Done v?, could also just be green done clickable emoji?

I just disabled it generally in 92c7a66 (only keeping it for active file and files named TODO.R

Apologies in advance for potentially breaking your code

I think I will disable roxygen comments parsing

I really like how you style differently the text!

complete_todo() links should probably just appear if you are requesting the active file only. You are right that you likely won't need to complete todos when requesting outline for another project.

olivroy · 2024-06-07T07:15:50Z

@violetcereza I am looking forward to your PR! Feel free to open smaller chunks PRs first to get changes you'd like to see first. https://code-review.tidyverse.org/author/focused.html

Add as much snapshot tests as you can, so it is easy to visualize output. Therefore, if some things end up changing, we will have a visual reference. https://testthat.r-lib.org/articles/snapshotting.html

Let me know if you have any questions

olivroy · 2024-06-07T13:45:03Z

Features I plan to look at:

Collapse tree items with one child into Parent -> Child combined nodes for brevity

👍 looking forward to seeing examples of this!

Somehow adapt to HTML output, so I can embed the TODO list into a notebook (or copy/paste into word processor) to show my boss

That is a good idea!

If you wanted to use cli_h1() type markup, I could adapt it to use special styles for the top-level nodes in the tree

Not strong opinions there. Recently, I fixed titles h3 to escape markup b4803a2#diff-708c3b67fc1b91c821cc6c00c9c8108cabcdd44c7edd9366cf62a373b717ca0cR440 in outline.R

Use lightparser

If possible, file this a separate PR?

Rework how outlines are stored under the hood a bit

I will be happy to see this too! Separate PR if possible too :)

I am looking forward. Here, I am seeing 3 steps.

Add lightparser support for parsing md/ Rmd / qmd files.
Tweak how outline is stored to work better with tree
Add support for tree. (Ideally this would affect the print method) I can let you show me, but my initial thoughts are the following. (I really like how the tree looks, and will probably end up deprecating the current way of doing, but for now I'd like to keep both functional unless there are no drawbacks from using trees)
Add an option that tells that you want tree output options("reuseme.outline_tree") for example (that would strip complete_todo().

At the end of file_outline(), the following code.

if (isTRUE(getOption("reuseme.outline_tree") {
  # modify result to be tree in `outline_tree()` helper.
  result <- outline_tree(result)
  class(result) <- c("cli_tree", "outline_report", "tbl_df", "tbl", "data.frame")
}

This way, this could just use the `cli:::print.cli_tree() method?

Alternatively (maybe better) refactor the print.outline_report() to print differently whether the tree option is TRUE.

Let me know what you think!

If tree output

Any (non-trivial) code related to trees should live in R/outline-tree.R :)

Don't let this remove your creativity, these are just the thoughts I have assembled since yesterday.

Apologies also if some of my recent changes broke your code. (I am still experiencing). Hopefully it is not too bad to repair.

I have removed some variables from output:

I have renamed some functions and columns (line_id to line), added some features.

reuseme/R/outline.R

Lines 715 to 730 in 127fe6c

 rs_version = NULL, 

 outline_el2 = NULL, 

 condition_to_truncate = NULL, 

 condition_to_truncate2 = NULL, 

 style_fun = NULL, 

 is_saved_doc = NULL, 

 is_roxygen_comment = NULL, 

 is_notebook = NULL, 

 complete_todo_link = NULL, 

 is_news = NULL, 

 # I may put it back ... 

 importance = NULL, 

 # may be useful for debugging 

 before_and_after_empty = NULL, 

 # may be useful for debugging 

 has_inline_markup = NULL

Feel free to share any code. I can rework it to account for my recent changes. And make adjustments as needed to revert some changes.

Any other comment / feedback welcome! You inspired me to make changes I didn't think of at first (avoid printing complete_todo() all the time is a big one!)

violetcereza · 2024-06-07T19:01:22Z

Thanks! I will be taking a look at it today...

Here's what I'm thinking big picture for the outline interface:

Users want to see the high level structure of a project/dir/file integrated with various code annotations
- Outlines are intrinsically hierarchical, but users vary about how they split things across files / use annotations
- The RStudio outline panel makes navigation easy, and should be a guide for our outline
Different types of files have different structure, with different ideas of what makes an "outline item" and what affects the hierarchy
- Some people use the WORK tag while I don't. Other people might have custom annotations that come with activated cli links
- The outline structure should incorporate directories. If a directory happens to be the root of a project (.Rproj present), the outline item gets a special type = "project" and special custom annotation that activates reuseme::proj_switch
- In notebooks, chunk names are important for some people. Other elements include tabs, plot titles, subtitles
- In R scripts, function names and code sections can be nested
- Parsing for NEWS.md?
Users want something highly customizable for their project/habits, and want to be able to inspect the outline internals for various reasons
- I think the reuseme::outline API is getting complicated, and a lot of the tasks do two things: filter the items and change how they display
- I propose that filtering is done by users with dplyr and the resulting table is passed to the print method.
- Simplify the data structure to make filtering easy. All outlines will be a tibble with class outline_report and the following columns:
  - file, title, type, line, indent
- I can't think of any drawbacks to using trees (maybe lack of multiline support?) but adding a theme API could provide total back compatibility.
- I think it's easier to work with indent rather than the parent/children IDs required by cli_tree(). While both hierarchical structures have their strengths, it's easier to read scan down indent in table format. However, rearranging items will break the structure of indents. I propose implementing sorting in the sort expression to print.outline_report to make this easy and flexible.

The new simplified API

`proj_outline(proj = NULL)`

If no proj, proxy for dir_outline(rstudioapi::getActiveProject()) otherwise dir_outline(proj_list(proj)).

`dir_outline(recurse = T)`

Run file_outline() on everything and aggregate into tibble under top level folder item.

`file_outline(path = rstudioapi::documentPath())`

Run file_outline() on everything and aggregate into tibble under top level folder item. Uses my algorithm to turn arbitrary indentation into strict hierarchical levels and make indent variable more meaningful.

`print.outline_report(outline, sort = F, collapse_nodes = T, prune_nodes = T, theme = NULL)`

Prints a multicolored tree. Uses my algorithm to turn outline indentation into parent/child structure. Optionally collapse nodes with one child into Parent -> Child nodes for brevity. Optionally prune nodes with no sub-annotations.

Allows sort with a data mask that operates recursively on sub-tibbles of sibling items (with a nod to data.tree sorting).

theme is a function(outline, indent) that allows customization for output (TBD).

Deprecations

Please let me know if my proposed API simplifications destroy any functionality you were using.

proj_outline(pattern = "x")
# NOW
proj_outline() |>
  filter(file |> str_detect("x"))

proj_outline(print_todo = F)
# NOW
proj_outline() |>
  filter(type != "todo")

proj_outline(work_only = F)
# NOW
proj_outline() |>
  print(prune_nodes = F)

proj_outline(alpha = T)
# NOW
proj_outline() |>
  print(sort = title)

proj_outline(recent_only = T)
# NOW
# Only select 5 most recently modified files?
proj_outline() |>
  mutate(modified = file %>% map_dbl(file.mtime)) |>
  filter(modified >= modified |> unique() |> sort() |> nth(-5))

# Extra tricks: print only files that have TODO items, and only up to level 3 headings
# (include todos under omitted headings)
# (print headings with no todo items as well)
proj_outline() |>
  filter(.by = file, any(type == "todo"), indent <= 3 | type != "heading") |>
  print(prune_nodes = F)

data <- proj_outline()
data$file_path # redundant with file?
data$file_short # use fs::path_file()
data$file_ext # use fs::path_ext()

# These should be cleaned and aggregated into `title` with some choices made by `file_outline()`
c(data$content, 
  data$outline_el,
  data$title_el, etc...)

# These should be internal to `print.outline_report`
c(data$link, 
  data$text_in_link,
  data$style_fun,
  data$link_rs_api,
  data$file_hl
)

olivroy · 2024-06-07T20:19:21Z

I like this! Do you want to open a PR, or do you want me to make these changes first?

violetcereza · 2024-06-26T06:55:07Z

Hey I know I am procrastinating writing unit tests on my PR...

But WRT to lightparser: I started trying out VSCode this week and I learned about "Language Server Protocols." Basically, there's too many types of files and too many IDEs and LSP provides a common interface and prevents the headache of re-implementing code analysis. I wonder if we could plug into the foldingRangeProvider service from languageserver to generate outlines.

olivroy · 2024-06-27T01:24:22Z

I understand. On my side, I have been able to successfully move link creation inside the print method. I am thinking about what you are proposing.

Feel free to continue sharing what you find! The end goal is to improve our workflows, not to stick to a suboptimal solution

olivroy added the help wanted Extra attention is needed label May 17, 2024

olivroy mentioned this issue May 17, 2024

Access contents of document outline rstudio/rstudioapi#153

Open

coderabbitai bot mentioned this issue May 24, 2024

Use roxygen2 parser for roxy comments + lightparser for figure and tbl captions #29

Open

violetcereza mentioned this issue Jun 7, 2024

Rewrite print.outline_report #32

Open

olivroy mentioned this issue Jun 13, 2024

Move link creation inside print.outline_report() #36

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`_outline()` functions feedback #28

`_outline()` functions feedback #28

olivroy commented May 17, 2024 •

edited

Loading

violetcereza commented May 24, 2024 •

edited by olivroy

Loading

olivroy commented May 25, 2024

violetcereza commented Jun 6, 2024

olivroy commented Jun 6, 2024 •

edited

Loading

olivroy commented Jun 7, 2024

olivroy commented Jun 7, 2024 •

edited

Loading

violetcereza commented Jun 7, 2024 •

edited

Loading

olivroy commented Jun 7, 2024

violetcereza commented Jun 26, 2024 •

edited

Loading

olivroy commented Jun 27, 2024 •

edited

Loading

_outline() functions feedback #28

_outline() functions feedback #28

Comments

olivroy commented May 17, 2024 • edited Loading

violetcereza commented May 24, 2024 • edited by olivroy Loading

olivroy commented May 25, 2024

violetcereza commented Jun 6, 2024

olivroy commented Jun 6, 2024 • edited Loading

olivroy commented Jun 7, 2024

olivroy commented Jun 7, 2024 • edited Loading

violetcereza commented Jun 7, 2024 • edited Loading

The new simplified API

proj_outline(proj = NULL)

dir_outline(recurse = T)

file_outline(path = rstudioapi::documentPath())

print.outline_report(outline, sort = F, collapse_nodes = T, prune_nodes = T, theme = NULL)

Deprecations

olivroy commented Jun 7, 2024

violetcereza commented Jun 26, 2024 • edited Loading

olivroy commented Jun 27, 2024 • edited Loading

`_outline()` functions feedback #28

`_outline()` functions feedback #28

olivroy commented May 17, 2024 •

edited

Loading

violetcereza commented May 24, 2024 •

edited by olivroy

Loading

olivroy commented Jun 6, 2024 •

edited

Loading

olivroy commented Jun 7, 2024 •

edited

Loading

violetcereza commented Jun 7, 2024 •

edited

Loading

`proj_outline(proj = NULL)`

`dir_outline(recurse = T)`

`file_outline(path = rstudioapi::documentPath())`

`print.outline_report(outline, sort = F, collapse_nodes = T, prune_nodes = T, theme = NULL)`

violetcereza commented Jun 26, 2024 •

edited

Loading

olivroy commented Jun 27, 2024 •

edited

Loading