Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates to default constructor #438

Merged
merged 30 commits into from
Sep 12, 2024
Merged

Updates to default constructor #438

merged 30 commits into from
Sep 12, 2024

Conversation

t-kalinowski
Copy link
Member

This PR works towards resolving #376 and #404.

The key change involves the value of formals(<constructor>). Previously, all constructor argument default values were set to class_missing, and the appropriate property default was determined at instantiation.

With this PR, constructor arguments now default to prop$default, which can be either an expression or a value. This is resolved using standard promise-forcing semantics during object construction.

Additionally, all custom setters for non-dynamic properties are now invoked during object construction. Previously, this only occurred if the property value was provided as an argument. Below is a short demo of the new functionality:

library(S7)

Person <- new_class(
  name = 'Person',
  properties = list(
    # required constructor argument
    name = new_property(class_character, default = quote(expr = )),
    
    # optional constructor argument, read-only after construction. 
    # Defaults to the time of the call.
    birthdate = new_property(
      class = class_Date,
      default = quote(Sys.Date()),
      setter = function(self, value) {
        if (!is.null(self@birthdate))
          stop("Can't set read-only property Person@birthdate")
        self@birthdate <- value
        self
      }
    ), 
    
    # dynamic property, not a constructor argument
    age = new_property(class = class_any, getter = \(self) {
      Sys.Date() - self@birthdate
    })
  )
)

dput(formals(Person))
#> as.pairlist(alist(name = , birthdate = Sys.Date()))

try(Person())
#> Error in Person() : argument "name" is missing, with no default

Person('Mary')@age
#> Time difference of 0 days

Person('Mary', as.Date('1970-01-01'))@age
#> Time difference of 19974 days

p <- Person("Mary")
try(p@birthdate <- as.Date('1970-01-01'))
#> Error in (function (self, value)  : 
#>   Can't set read-only property Person@birthdate

Created on 2024-09-08 with reprex v2.1.1

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great improvement.

Apart from the comments below, does new_property() need some documentation updates?

R/base.R Outdated Show resolved Hide resolved
@@ -35,8 +31,8 @@ base_default <- function(type) {
list = list(),
expression = expression(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this now be quote(expression())? (Not sure, as I haven't yet read the code that evaluates quoted defaults).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe they never get evaluated, they just get inlined into the functions arguments? That is potentially confusing because R won't display a argument with default 1:10 and quote(1:10) differently. Or if it's a more complex object (like a data.frame), deparsing might be confusing? But maybe that just suggests that quoting the defaults should be the rule, not the exception?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.... because R won't display a argument with default 1:10 and quote(1:10) differently.

Huh? not really .. (I assume you meant something else...) :

> ff <- function(x = 1:10, y = quote(1:10)) { x - eval(y) }
> ff()
 [1] 0 0 0 0 0 0 0 0 0 0
> ff
function(x = 1:10, y = quote(1:10)) { x - eval(y) }
> args(ff)
function (x = 1:10, y = quote(1:10)) 
NULL
> 

Copy link
Member

@hadley hadley Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant this problem:

f <- function() {}
formals(f) <- list(x = 1:10, y = quote(1:10))
str(f)
#> function (x = 1:10, y = 1:10)

Created on 2024-09-09 with reprex v2.1.0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be what @hadley is describing (please confirm)

# literal constants get inlined
f <- function(x = 1) x
typeof(formals(f)$x)
#> [1] "double"

# expressions stay expressions in 'normally' defined functions
f <- function(x = 1:10) x
typeof(formals(f)$x)
#> [1] "language"
f
#> function(x = 1:10) x

# but, generated functions with arguments that cannot be generated 
# as a literal by the parser present the same as "normally" defined functions
formals(f)$x <- 1:10
typeof(formals(f)$x)
#> [1] "integer"
f # prints the same as before, even though x is no longer type 'language'
#> function (x = 1:10) 
#> x

Created on 2024-09-09 with reprex v2.1.1

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you don't use formals<- properly -- you must use alist() :

> a1 <- alist(a=1:10); a2 <- alist(a = quote(1:10)); identical(a1, a2)
[1] FALSE
> f1 <- f2 <- function(){}; formals(f1) <- a1; formals(f2) <- a2; identical(f1,f2)
[1] FALSE
> str(f1)
function (a = 1:10)  
> str(f2)
function (a = quote(1:10))  
> 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmaechler this is an issue with creating functions in non-standard ways, so we definitely want to use list() here.

@t-kalinowski while inlining objects into the constructor args directly will generally be safe, I think it's going to potentially cause a lot of confusion.

Copy link
Member Author

@t-kalinowski t-kalinowski Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my first pass at this (unpushed), I did have all branches in base_default() wrapped in quote(...). I reverted it once I realized it was unnecessary.

Also, it's about 3x faster to call a constructor with an inlined value (I'm not sure how much this matters here; in absolute numbers, these are both very fast).

f1 <- f2 <- function(x = integer()) x
formals(f2)$x <- integer()

bench::mark(f1(), f2()) |> print() |> plot() |> print()
#> # A tibble: 2 × 13
#>   expression      min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
#>   <bch:expr> <bch:tm> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
#> 1 f1()          246ns  328ns  1937884.        0B        0 10000     0     5.16ms
#> 2 f2()           41ns  123ns  5988817.        0B        0 10000     0     1.67ms
#> # ℹ 4 more variables: result <list>, memory <list>, time <list>, gc <list>
#> Loading required namespace: tidyr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that this is more of a user facing documentation issue — i.e. I think we should generally recommend that the user wrap their default values in quote, unless they really know what they're doing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use NSE for new_property(default) and automatically quote default. We could then allow using I() as an NSE escape hatch.

That said, as I consider it, I'm coming to the conclusion that using NSE here isn't the best approach. The laziness of a default value should be explicit, clearly communicated to readers, and opt-in for authors.

I agree we'll want to update docs to discuss this and encourage usage of quote().

R/class-spec.R Show resolved Hide resolved
R/class-spec.R Outdated Show resolved Hide resolved
for (prop in missing_props) {
prop(object, prop, check = FALSE) <- prop_default(class@properties[[prop]])
}
# Set properties. This will potentially invoke custom property setters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot simpler 😄

R/constructor.R Outdated Show resolved Hide resolved
tests/testthat/helper.R Outdated Show resolved Hide resolved
R/class-spec.R Outdated Show resolved Hide resolved
@hadley
Copy link
Member

hadley commented Sep 9, 2024

Also worth noting in the news bullet that this is going to affect all existing exported constructors — you'll need to re-document to ensure that your docs match the updated usage.

@t-kalinowski
Copy link
Member Author

t-kalinowski commented Sep 10, 2024

I updated docs and NEWS, and added tests. I believe this is ready to merge.

The failing CI is unrelated; it looks like CRAN servers are currently down.


Last-minute addition worth noting for completeness: I updated new_property() to accept class_missing as a type (potentially as part of a union) and to accept a symbol as an arg default. This allows the following usage:

new_property(class_missing | class_character)

which is practically equivalent to:

new_property(class_character, default = quote(expr =))

I imagine most users will prefer specifying a missing argument with an informatively named helper like this:

new_property(class_character, default = rlang::missing_arg())

Please take a look at the updated tests for more examples.

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me!

R/property.R Outdated Show resolved Hide resolved
R/property.R Outdated Show resolved Hide resolved
R/property.R Show resolved Hide resolved
R/property.R Outdated Show resolved Hide resolved
#'
#' # Properties can also have a 'missing' default value, making them
#' # required arguments to the default constructor.
#' # You can generate a missing arg with `quote(expr =)` or `rlang::missing_arg()`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should S7 provide missing_arg() too? Could call it required()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

required() would be a nice companion to nullable(), proposed in #433.

R/utils.R Show resolved Hide resolved
@@ -103,14 +103,14 @@ validate <- function(object, recursive = TRUE, properties = TRUE) {
validate_properties <- function(object, class) {
errors <- character()

for (prop in class@properties) {
for (prop_obj in class@properties) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how this name change clarifies what's going on here.

t-kalinowski and others added 2 commits September 10, 2024 13:00
Co-authored-by: Hadley Wickham <h.wickham@gmail.com>
Co-authored-by: Hadley Wickham <h.wickham@gmail.com>
@mmaechler
Copy link
Collaborator

mmaechler commented Sep 11, 2024

Last-minute addition worth noting for completeness: I updated new_property() to accept class_missing as a type (potentially as part of a union) and to accept a symbol as an arg default. This allows the following usage:

new_property(class_missing | class_character)

which is practically equivalent to:

new_property(class_character, default = quote(expr =))

I imagine most users will prefer specifying a missing argument with an informatively named helper like this:

new_property(class_character, default = rlang::missing_arg())

Hmm... I'm sorry but I don't like the look of this, I've not checked the new behavior yet, hence only commenting what I see above. There are strong opinions and good reason why your "rlang::missing_arg()" is not explicitly available in R (i.e. base R), even though it has been around forever and "discussed" on R-devel / R-help lists.

As I don't understand why you think a default of "missing arg" would be preferable, I assume most users would also not understand. After all, the R way to require an argument is to not specify a default.
Yes, it is appropriate to say this is a matter of taste... but for something as "basic" as S7 that should become a fundamental package for R, I'm strongly against introducing such code / examples / ...

When I wrote earlier that I like the idea of being allow to pass a "language" object, specifically of class "call", I did not think I'd ever advocate using or even allowing quote(expr=) (which is actually not a "call" but a "symbol") but rather "real" calls such as quote( sin(x) ) as we were talking about something like a deferred / postponed evaluation.

@t-kalinowski
Copy link
Member Author

t-kalinowski commented Sep 11, 2024

Thank you, @mmaechler. I agree that quote(expr=) is not the prettiest user-facing UI.

The original user question was, "Can properties not have a default value?"

I believe we're in agreement that it should be possible to conveniently declare an object with a "required" property, meaning, a property whose value must be provided when the object instance is constructed. The question then becomes: What syntax in new_property() should we support that results in the auto-generated constructor having a missing default for that property? Concretely, given new_property(name = 'foo', default = ???), what can ??? be so that identical(formals(constructor)$foo, quote(expr=)) is TRUE?

Some alternatives to consider:

  1. Change the default default from NULL to be missing. This change would flip the default property default, from being the "empty" instance of the class to being required. I.e., optional property arguments would become opt-in, with all property arguments defaulting to being necessary. However, I don't think we want to pursue this approach.

  2. Accept a sentinel value, for example: new_property(default = class_missing). But this also brings complications, and to me does not seem to offer much beyond readability to the user-facing UI, compared to quote(expr=). Readability might be better addressed through a helper.

I'm eager to hear your thoughts. Any other ideas we should consider?


I would like to get this resolved and merged soon, as this PR is just the first step towards addressing other constructor-related issues, which we're actively discussing in the relevant threads (namely, alternative ways to opt-in or opt-out from having the property included as a constructor argument, ways to make a property read-only, and also, working through the proposal for accepting prototype seed arguments).

@t-kalinowski
Copy link
Member Author

I quickly explored "Alternative 1" described in my previous comment, changing the signature of new_property() so that default defaults to missing. You can view the changes needed to make tests pass here:
new-constructor-defaults...new-missing-prop-default-default

@hadley
Copy link
Member

hadley commented Sep 12, 2024

IMO we should merge this PR since it's a substantial improvement, but we can continue to iterate on the exact form that required properties should take.

@mmaechler
Copy link
Collaborator

I quickly explored "Alternative 1" described in my previous comment, changing the signature of new_property() so that default defaults to missing. You can view the changes needed to make tests pass here: new-constructor-defaults...new-missing-prop-default-default

Thanks a lot, Tomasz, for doing this! Now, after browsing the the required changes to the tests, I'd clearly agree that this alternative is worse than the "main" proposal (wrt which you did show the diffs).
I'm definitely preferring that the user and class / property programmers does not need to specify a default in almost all cases... again following S4's new("<foo>") behavior (for all of its slots, unless there's an explicit initializer for the class), e.g., default for atomic type <foo> is just to provide <foo>(0) which is the same as <foo>().

What I'm still completely confused about is your assumption that the default for default should usually be explicitly specified by the user, and then use the "missing object" for that. I'd assume that very often the default default would make sense, e.g., if all properties of a new class/object are atomic, then no explicit defaults would be needed at all even for the whole class with these properties --- and then the same recursively, when a property is from a class which is simple enough then new_<...>() would work for that class and hence ...

@t-kalinowski
Copy link
Member Author

t-kalinowski commented Sep 12, 2024

Thank you, @mmaechler! I'll proceed with the merge.

Regarding your comment:

I'm still completely confused by your assumption that the default should usually be explicitly specified by the user.

To clarify a bit: I believe class authors should be able to create properties that require user-supplied values at instantiation, without excessive syntax.

I think the current default default is fine; however, the only way to currently make a property whose value is "required" by the constructor is to write a constructor by hand.

@t-kalinowski t-kalinowski merged commit f1cb93c into main Sep 12, 2024
11 checks passed
@hadley hadley deleted the new-constructor-defaults branch September 12, 2024 17:23
@lawremi
Copy link
Collaborator

lawremi commented Sep 14, 2024

Looks like I am a little late to the party. One general comment is that a common convention in S4 is to provide user-facing constructors that abstract the call to new(). Thus, a lot of these concerns were passed on to the developer. S7 is forging a new path when it comes to trying to create a user-friendly constructor automatically. In practice with S7, I've created a convenience constructor anyway, because I'd like to keep the actual constructor as simple as possible, i.e., it should only be concerned with constructing the object from a set of properties.

Another way to think of a "required" property would be that its default would yield an invalid object. The question then is whether the constructor should refuse to construct an invalid object. I don't think it needs to; that is the purpose of the validity function. Like in the example at the top, there should be a validity function to ensure that @name is a single string. Then, the developer could be a bit lazy and just let the default of character() fail the check. It's not quite as good an error message, but it is fairly easy for the user to infer the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants