Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thoughts / observations about arguments / fn name consistency. #681

Closed
17 of 18 tasks
olivroy opened this issue Jul 11, 2023 · 23 comments
Closed
17 of 18 tasks

Thoughts / observations about arguments / fn name consistency. #681

olivroy opened this issue Jul 11, 2023 · 23 comments

Comments

@olivroy
Copy link
Collaborator

olivroy commented Jul 11, 2023

Priorities

  • Rename xlsx_file argument (may open an issue for that for discussion?
    really could just be file, other data import packages like readxl, haven, readr use path or file. I disagree with using x as it's more widely used for a vector or for data, not for a file. file is used in wb_load() (Gh issue 681 pt2 #708)

I dont like either, because xlsx_file can be both. A file or a workbook (hopefully not a url). That's why I was suggesting some neutral alternative. So far the only consensus is that xlsx_file is unacceptable and has to be replaced for the next release (another deprecated argument ...).
Another possibility would be wb.

  • Rename path -> file in wb_save()

What I may do

  • Rewrite examples - long-term TODO. (Review examples #720)
  • After all, this is a function that is probably never going to be used ever. Creating custom table styles is quite a complex task and I would not recommend it at all. Instead I would try to create a function for custom pivot table styles! :)
  • wb_row_heights.Rd, other examples. Separate create_comment() from wb_add/remove() for consistency.

Low priorities

  • wbChartSheet is untested, and I couldn't find much doc.'
    • underlying driver for wb_chartsheet() which can be used with wb_add_mschart() and loaded charts. There is not much to test.
  • Change params in wb_add_conditional_formatting() to snake_case

Done

Not planned

  • The helper download_testfiles() did not work for me. (I had to manually download them from the repo, and place them in the testfiles directory. Not planned, however, I think that maybe downloading the zip with https://github.com/JanMarvin/openxlsx-data/archive/refs/heads/main.zip may be of interest, later. Fixed!

  • na.strings, I think should be just na, (readr and readxl uses this argument) na.numbers could be merged.. (I am not aware of all use cases however.. but anyway, should be renamed to snake case too

I don't want to change this. na.strings is around since the first days. Yes, other functions use other arguments. na.numbers is probably something we could have avoided, but I added it. Just because it was possible.

  • font_name will not change, so font and font_name coexist

  • package options will remain in camelCase as they are not used in code, it doesn't matter.

  • Autocomplete works only for pipe functions wb_*, as it is a known behavior of R6 (custom `.DollarNames` #251 is related)

@olivroy olivroy changed the title Cleanup ideas Thoughts / observations about arguments / fn name consistency. Jul 11, 2023
@JanMarvin
Copy link
Owner

Regarding snapshot tests, I never read the docs, the folder appeared, I set it to be ignored. I don't know what it's good for, testthat is so bloated and I really would prefer something like tinytest

  • I am not a library(tidyverse) user and I don't like tibbles and I think that in our context e.g. rownames are important (and a few internal functions depend on their existence). If you want to work on it and it is possible without new dependencies, and if you depend on it, please create a pull request. I will not put my foot down and I have added hms support in a similar way, and also labelled, but I will not actively support tibbles. Hard pass from me.

The remaining issues have to wait for another time. I'm really tired and it's been a long day/week already. You have a good eye and I'm impressed by your hands on approach, thanks!

@olivroy
Copy link
Collaborator Author

olivroy commented Jul 11, 2023

No problem! I really appreciate your work! I think it can be really useful. Always appreciate an actively maintained package.

No rush, I just wanted to record my thoughts while browsing the source code!

Regarding tibbles, I understand, I may just make my own wrappers when needed, it's pretty simple

@olivroy
Copy link
Collaborator Author

olivroy commented Jul 13, 2023

Just so I am sure of where the package is headed:

  • you want to discourage use of write_* functions, in favour of wb_add_*?
  • When chaining, we have to use functions that don't start with the wb_* prefix?
  • you want to move away from start_col, start_row, I don't remember what are the replacements

It is because I probably want to do another PR to update examples, tweak the reference function index and finalize vignettes review/
Also, I imagine you don't want to teach the old methods in the vignettes?

Also, before removing @export, adding @keywords internal will be good for functions you don't want publicized too much. as they don't show up in reference index.

It would be very good to do an article/blog post of openxlsx2 0.7.1 -> openxlsx2 1.0.0 to summarise thoughts / philosophy

@JanMarvin
Copy link
Owner

  • Yes wb_add_ is what I use. write_ functions are like void(), they do not return something, but modify an object in place. It is not that write_ functions are broken, but I don't use them directly and always via wb_add_ functions.
  • Yes exactly, this should already be written down in one of the vignettes. We have
# pipe
wb <- wb_workbook() |> wb_add_worksheet()
# chain
wb <- wb_workbook()$add_worksheet()

In wb_to_df() it is perfectly fine to use start_col/start_row. Like get me all data starting in row 5. But in other functions I think it is easier for users to use dims a so called A1 worksheet cell reference. I get that it is not native to R users, but it's a) easy to adapt and b) simple to create with rowcol_to_dims(seq_len(nrow(mtcars)), seq_along(mtcars)).

The vignettes should be updated, that's a task for the release, but can be done after 1.0. To me 1.0 is merely a fix point from whereon I want to call the API somewhat stable. Like it is not changing unless really required. I will still develop for and with it, therefore it is not the end of the line or necessarily a full tested entirely stable package. But a huge reason why I'm so annoyed by the tidyverse is the constantly changing API. I get it from a developer point of view, but ... I'm not employed as a developer and in my job I need some continuity. If I have to rerun an analysis from last year, I do not have the time to read into the newest tidyverse packages and relearn everything from scratch (I know about renv, but to me that's a solution to a self inflicted problem).

I'll think about the blog post, but I'm of the opinion that not so much is changing here. It's still not easy to get peoples opinion on your software. Maybe people are using and loving openxlsx2, maybe the don't. Usually you only get response when something isn't working 😄

@JanMarvin
Copy link
Owner

More camelCase from class_workbook_wrappers.R (I reenabled the object_name_linter in .lintr, but there is to much white noise all over the place):

  • wb_page_setup
  • wb_freeze_pane
  • wb_protect
  • wb_set_last_modified_by (bah, who put this in my code)

@olivroy

This comment was marked as resolved.

@olivroy

This comment was marked as resolved.

@JanMarvin
Copy link
Owner

Regarding dims_to_dataframe(), yes, it's the building block for much of openxlsx2.

Regarding the write_ functions, I'm kinda thinking the same. They are remnants from openxlsx where writeData() was needed and I wanted to provide them for a smoother transition (and it was long before Jordan introduced me to R6 and rewrote much of the code). Similar to the different wb_to_df() functions. Initially it began as a quick draft to create something, but it became the foundation for the package. Still there are wb_read(), read_xlsx() and maybe another one I don't remember. But I'm not sure if we should keep them around or ditch them.

@olivroy
Copy link
Collaborator Author

olivroy commented Jul 17, 2023

I think they should be kept around, but I think they shouldn't be advertised

i.e export + document, but not mentioning in reference index, or vignettes.

@JanMarvin
Copy link
Owner

  • The helper download_testfiles() did not work for me. (I had to manually download them from the repo, and place them in the testfiles directory.

Unless you want to rewrite them, I'd say we leave this. As stated, I do not care, they work for me and it's unlikely that others want to use this.

* [ ]  Rewrite examples (I might let you do it, since there may be duplication, and you may want to reorg a bit)

It's a long term to do. Not high on my long To-Do list

* [ ]  `wbChartSheet` is untested, and I couldn't find much doc.'

Yeah, it's the underlying driver for wb_chartsheet() which can be used with wb_add_mschart() and loaded charts. There is not much to test. Low prio for me

* [ ]  `wbWorkbook` doc improvement See https://janmarvin.github.io/openxlsx2/dev/reference/wbWorkbook.html

Unfortunately I do not think there is much for us to do. Jordan and I agreed that we didn't want to duplicate the roxygen code and even the pkgdown page is very much unreadable. The user should always try to look for the wrapper function man page.

* [ ]  create_tablestyle() -> `create_table_style()`

Can live with the current example. After all, this is a function that is probably never going to be used ever. Creating custom table styles is quite a complex task and I would not recommend it at all. Instead I would try to create a function for custom pivot table styles! :)

* [ ]  comment functions (adding `@keywords internal` in [[experimental] standardize function arguments #678](https://github.com/JanMarvin/openxlsx2/pull/678)  made all functions unfindable on the pkg website.

Ongoing task

* [ ]| ~Export `rowcol_to_dim` to be put in `dims` helper, or use `rowcol_to_dims()` everywhere. and use the internal helper `rowcol_to_dim()` if length is 1?~  (wb_dims() [provide wb_dims() #691](https://github.com/JanMarvin/openxlsx2/pull/691) ) Use `wb_dims()` when necessary ([Gh issue 681 pt1 #693](https://github.com/JanMarvin/openxlsx2/pull/693))

Should be fine now. All public facing functions should now use wb_dims()

  * In some exported functions, the default argument is `rowcol_to_dim(...)`, which is undocumented currently.

We might make it internal and ignore the short timespan it was publicly used.

* [ ]  Vignette to document API change between v0.7 and 1.0 (will just need final tweaks after [Update vignettes to snake_case, and other code quality updates #682](https://github.com/JanMarvin/openxlsx2/pull/682))

Would be nice, but ... we'll see if there is time.

* [ ]  `data_table` vs `datatable` (this may be not that useful, as it's only for internal consistency.), the `write_datatable()` exists.

If we make this internal, we do not have to care anymore.

And now for the controversial part...

outstanding argument problems

food for thought

* `na.strings`, I think should be just `na`, (readr and readxl uses this argument) `na.numbers` could be merged.. (I am not aware of all use cases however.. but anyway, should be renamed to snake case too

I don't want to change this. na.strings is around since the first days. Yes, other functions use other arguments. na.numbers is probably something we could have avoided, but I added it. Just because it was possible.

* `xlsx_file` really could just be `file` or `path`, other data import packages like readxl, haven, readr use path or file. I disagree with using `x` as it's more widely used for a vector or for data, not for a file.

I dont like either, because xlsx_file can be both. A file or a workbook (hopefully not a url). That's why I was suggesting some neutral alternative. So far the only consensus is that xlsx_file is unacceptable and has to be replaced for the next release (another deprecated argument ...)

* `font_name` -> `font`

I don't see anything wrong with this. Yes we have name in wb_add_font() and font_name in others, but I think that this is acceptable.

* `params` in `wb_add_conditional_formatting()` are still camelCase

I don't want to become a fanatic now, tackling ever camel case I see, but this should be simple, just match the snake case code in the function.

* package options are still in camelCase

Usually these will go in some file that will never be touched again, maybe even .Rprofile. You can add a under the hood replacement, but these are not part of the API, so no rush.

* ~Autocomplete doesn't seem to work for some functions with chains (works with add_formula(), but not `add_worksheet()`~ I need to do more testing.

It's a known behavior of R6 (see also #251). Unfortunately the answer seems to be: for autocomplete use pipes.

@olivroy
Copy link
Collaborator Author

olivroy commented Jul 18, 2023

Thanks for your reply!

I edited in the original comment to reflect this!

On my part, I may open 1-2 more doc-related PRs if that's okay with you

@JanMarvin
Copy link
Owner

Sure! But please, before you put in a lot of time into something, ask if this is something worth going for. I know I'm a bit annoying with what I want and what I don't. This is just to avoid frustration for everyone involved.

@olivroy
Copy link
Collaborator Author

olivroy commented Jul 18, 2023

I am sorry that I caused frustration, :( Hope my latest PR is not as bad as the last one.

But I was thinking 3 PRs

  1. Finalize doc/pkgdown organization, my main goal would be to unite corresponding wb_add with wb_remove in their own .Rd as well as wb_get with their wb_set counterparts so they all look like https://janmarvin.github.io/openxlsx2/dev/reference/wb_order.html
  2. Take a last look at vignettes
  3. Review examples (minimally)

@JanMarvin
Copy link
Owner

No sorry, you got that wrong. I'm not frustrated, I want to avoid frustration on your side. Sometimes people want to contribute a huge pull request, but it's either not the right time, or not wanted by the author(s). Obviously if one has invested a lot of time on something, this rejection obviously causes frustration. Therefore, I often feel that a mutual agreement on the course of action can avoid this or at least soften the severity.

For a while I've been doing this solely, now at least there's someone actively looking at the code!

@olivroy
Copy link
Collaborator Author

olivroy commented Jul 19, 2023

Finally, I came across the set vs add functions..
Are there notable differences ? Imo, most set_ should be _add

_add are related to worksheet?
_set related to workbooks?

I think that the sheet argument of set functions could be deprecated in favour of either all sheets.

wb_add_cell_style() vs wb_set_cell_style()

Or maybe, some set functions are the exact copy of add, in this case, deprecate and suggest the add function.

@JanMarvin
Copy link
Owner

wb_get_/wb_set_ functions are more like getter/setter functions. wb_add_ adds something new to a workbook (a worksheet, data, images or graphs), wb_get_ returns the state of something (visibility of sheets) in the workbook and wb_set_ modifies the state (change visibility to hidden).

Ideally wb_add_ should provide a wb_remove_ pair, but we lack many of the latter, but mostly because - similar to openxlsx developers earlier - we care a bit more about workbook creation than modification. Nevertheless these functions should be added, but this requires a lot of testing with various files. Also writing functions, just to add them is a unfulfilling task.

Since modifications in these would break much of the API, this would be a task for openxlsx3 or at least release 2.0. While I appreciate the suggestion, I have no desire to go down this rabbit hole.

@olivroy
Copy link
Collaborator Author

olivroy commented Jul 19, 2023

I completely understand! I may add some of these explanations in the function index.

I just wanted to point it out!

And if someone has the need for a new remove function, it can be done, (I don't)

@olivroy
Copy link
Collaborator Author

olivroy commented Aug 6, 2023

wb_save() still has the path argument, could it be changed for file?

@JanMarvin
Copy link
Owner

Yes we can switch it for consistency. I'm a bit under the weather and open a PR soon. Might be last minute change for the pending release 0.8

@JanMarvin JanMarvin mentioned this issue Aug 7, 2023
@olivroy
Copy link
Collaborator Author

olivroy commented Aug 7, 2023

Thanks for the work and your patience! Closing this for now! I may come back when / if you decide to another round of cleanups, but the package looke good now imo! less exported functionw consistent arguments and function names

Bravo!

@olivroy olivroy closed this as completed Aug 7, 2023
@JanMarvin
Copy link
Owner

I have to thank you for the input. Really appreciate it!
Some things were not achieved, but I hope that's fine, at least I hope that I explained my reasons when I didn't want to apply changes and why.

Hopefully you too got something out of this experience. Maybe a better working package or a broader understanding of what's going on inside openxlsx2.

@olivroy
Copy link
Collaborator Author

olivroy commented Aug 7, 2023

I understand completely! This is your package after all. You have more experience than I have also.

I am really glad you listened to my advice. Because for me, this important gain in consistency will allow me to write for production, teach and recommend it around me. Props on the argument changes, this looked long to do, but a worthwhile change for the future!

I like that the syntax of openxlsx2 is now much more intuitive than openxlsx (I always struggled with it) hopefully, many more people start using it and openxlsx2 will be the reference package in a few years.

Edit: I have never done that, but I may write a blog post about this, trying to advertise it a little more on social medias.
Edit 2: Maybe consider creating a logo would be awesome too, I guess it could be very similar to openxlsx logo.

@JanMarvin
Copy link
Owner

Don't sell yourself and your contributions short! Well see what the future brings, I fear a small few fallout will happen. Some of my own old code was already screaming at me, but after all, like you said, it should be a change for the future.
I love openxlsx and the work Alex, Philipp and all the other contributors have taken into it and it's no competition and neither of us gains a dime from the development :)

Edit1: Haha, well go ahead! From time to time I try to add references in stack overflow, but I'm not a sales person and I don't really care if others us my projects. First and foremost I write packages because I need the functionality.

Edit2: I'm not creative enough for a logo and the previous one was thrown at Philip and me, and that was somewhat unpleasant even though I like it. I had experimented with something in text to image functions like "XML going into a spreadsheet", but you can imagine how well that worked. Maybe a tiny turtle or something. Anyways another thing I don't care about, I never choose software because of a fancy logo or something. At least I like to think so! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants