Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[formula] Preserve formula when reading/saving from/to data.frame #1101

Closed
J-Moravec opened this issue Aug 6, 2024 · 5 comments
Closed

[formula] Preserve formula when reading/saving from/to data.frame #1101

J-Moravec opened this issue Aug 6, 2024 · 5 comments

Comments

@J-Moravec
Copy link

J-Moravec commented Aug 6, 2024

Motivation:

Currently, I am reading some excel sheets, filling in some values, and then saving them.
I am trying to corrupt the original data as little as possible, so preservation of the original state is quite important.

Problem:

When reading a sheet into DF, modifying values, and then saving it (using add_data), the formulas are not preserved, instead they are converted into a numerical value (if show_formulas = FALSE) or string (if show_formula = TRUE).

Desired outcome

When reading using read_excel or wb_to_df, the formulas would be of class formula and would be written as such without having to manually add them using add_formula.

From examples, it seems that wb_add_data can already handle class formula, so its really about an option to read them.

Potential complications

Now that I am thinking about it, I see multiple complications:

data.frame assumes that a whole column is of the same type, this cannot be guaranteed.
Perhaps something like "=SUM(A1:A5)" as a text representation would be useful in mixed types?

There are probably more complications that I am not aware of.

Alternative solutions

If only a subset, or particular non-formula columns are modified e.g., through wb_save, formula is preserved.

@JanMarvin
Copy link
Owner

Hi @J-Moravec , thanks for the suggestion. Indeed there are a few issues: shared formulas, array formulas (which cell is the reference, does the formula stretch over multiple cells), cm formulas (basically array formulas with cell modification argument), mixed column types etc.

Thus my preferred method to update would be to write the data into the spreadsheet without the formulas, or copy the cells with formulas.

Nevertheless it should useful to apply the formula class to cells with spreadsheet formulas. Either array or just formula.

@J-Moravec
Copy link
Author

@JanMarvin Thanks for your response.

I won't even pretend to understand all the possible problems or types of formulas you mentioned, so I will defer to you.

Could you please hint how would one detect which cells contain formula? I saw it in the workbook$spreadhseet data, but don't see any accessor for this.

My quick and dirty solution was to get the data.frame from the sheet 2x, once with show_formula = TRUE, and once with show_formula = FALSE, and add formula class where the two data.frames differ. Dunno if it will work in every occasion, but my unittests passed for the use-case.

@JanMarvin JanMarvin reopened this Aug 8, 2024
@JanMarvin
Copy link
Owner

Hi @J-Moravec , that is one way to do it. I have added a PR in #1103 which has limited support for reading and conserving formulas.

@JanMarvin
Copy link
Owner

Reading a column containing only formulas should work now. For now this wont detect array formulas and probably lacks many other features, but Rome wasn't built in a day :)

@J-Moravec
Copy link
Author

Awesome. Thanks for a quick solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants