Skip to content
This repository has been archived by the owner on May 14, 2024. It is now read-only.

Remove interface layer of jaspResults #6

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

vandenman
Copy link
Contributor

So I was playing a bit with this and it turned out to be not too difficult.

I think the main thing to check is if specializing Rcpp's standard_delete_finalizer doesn't introduce a memory leak. I don't think so, but @JorisGoosen probably knows better than me. Perhaps that function should only be specialized in JASP (because we call destroyAllAllocatedObjects elsewhere), and in R we can just let the garbage collector do its thing.

In addition, I changed properties on jaspInterface with getters and setters that did nothing but forward to the jaspObjects to fields on the jaspObjects, e.g.,

// old
.property("width", &jaspPlot_Interface::getWidth, &jaspPlot_Interface::setWidth, <docstring>)
// new
.field("width", &jaspPlot::_width, <docstring>)

Before we'd ever merge this, I think we should (locally) run all unit tests to see if this doesn't break anything.

@vandenman
Copy link
Contributor Author

So right now we have 4 layers:

R6 <- S4 <- Interface <- jaspObject

After this PR, we'd have 3 layers:

R6 <- S4 <- jaspObject

However, we can also cut the S4 layer, and have:

R6 <- jaspObject

here, the R6 would contain a "smart" pointer (i.e., Rcpp::XPtr<jaspObject*>). Right now, the S4 object contains this pointer (and has associated methods generated from jaspModuleRegistration.h).

Removing the S4 layer might be cleaner, but, it does imply that we need to rewrite all the internals of the R-side. For example, we can no longer do S4obj$method but instead would need to do fun(pointer, arguments). Also the methods defined in jaspModuleRegistration.h would become mostly obsolete and we'd need to write the wrappers ourselves which increases the amount of glue code we need. For example:

Rcpp::CharacterVector getColNames(Rcpp::XPtr<jaspTable*> ptr)
{
	ptr->_colNames;
}
void setColNames(Rcpp::XPtr<jaspTable*> ptr, Rcpp::CharacterVector newNames)
{
	ptr->_colNames = newNames;
}

on the R side for the developers, nothing changes of course since they only interact with the R6 layer.

@JorisGoosen any thoughts?

@JorisGoosen
Copy link
Contributor

@JorisGoosen any thoughts?
I'll be responding inline below:

So right now we have 4 layers:

R6 <- S4 <- Interface <- jaspObject

After this PR, we'd have 3 layers:

R6 <- S4 <- jaspObject

This has the problem that the pointer in S4 is copied by the Rcpp::Modules stuff whenever the S4 object is assigned to a variable. Which was the original reason for having that interface layer in between.

However, we can also cut the S4 layer, and have:

R6 <- jaspObject

here, the R6 would contain a "smart" pointer (i.e., Rcpp::XPtr<jaspObject*>). Right now, the S4 object contains this pointer (and has associated methods generated from jaspModuleRegistration.h).

This would definitely solve the problem alluded to above with the S4 contains ref to jaspObject.

Removing the S4 layer might be cleaner, but, it does imply that we need to rewrite all the internals of the R-side. For example, we can no longer do S4obj$method but instead would need to do fun(pointer, arguments). Also the methods defined in jaspModuleRegistration.h would become mostly obsolete and we'd need to write the wrappers ourselves which increases the amount of glue code we need. For example:

Rcpp::CharacterVector getColNames(Rcpp::XPtr<jaspTable*> ptr)
{
	ptr->_colNames;

//Probably best not to directly reference private/internal variables like that though.

}
void setColNames(Rcpp::XPtr<jaspTable*> ptr, Rcpp::CharacterVector newNames)
{
ptr->_colNames = newNames;
//Using a function here would be better
}

How else would we do this?

Well... If we are going to be manually coding these wrappers then we are recreating the just removed Interface + modules right?

So the only way this would be an improvement is if we could generate this, and this pkg does that: https://github.com/richfitz/RcppR6
And is in fact what yamovi uses... So that might be something?
But then we would still need to define the yaml for the R6 class. But I think that also means we can remove those definitions from zzzWrappers.R

on the R side for the developers, nothing changes of course since they only interact with the R6 layer.

Indeed

@vandenman
Copy link
Contributor Author

Well... If we are going to be manually coding these wrappers then we are recreating the just removed Interface + modules right?

Yes, the main reason to do this is that it would make things more lightweight. Note that I'm not really sure whether it makes sense to cut out this layer at all, these are just some ideas about how to cut it out, if we want to do so.

So the only way this would be an improvement is if we could generate this, and this pkg does that: richfitz/RcppR6

That's an option, but there you basically do:

  1. Run some script to generate the wrapper classes.
  2. Compile the package.

So there are two options.

  1. Generate R6 wrappers for the C++ objects directly. The downside of this is that we then need to (somehow) define additional R functions for convenience things (e.g., error handling). We could try to define all convenience functions on the C++ side, but I think it's easier to do this on the R side. For example, on the R side, you could print a more human-friendly error message when someone passes an incorrect type.

  2. Generate functions (not classes) that interact with the C++ objects. For example, on the C++ side we generate for jaspTable

Rcpp::CharacterVector getColNames(Rcpp::XPtr<jaspTable*> ptr)
{
    return ptr->getColNames();
}
void setColNames(Rcpp::XPtr<jaspTable*> ptr, Rcpp::CharacterVector newNames)
{
	ptr->setColNames(newNames);
}

and on the R side, we write manually

jaspTableR <- R6::R6class(
  ... # stuff

  active = list(
    colNames = function(x) { if missing(x) getColNames(private$jaspObjectPtr) else setColnames(jaspObjectPtr, x) }
  ),
  # inherited from jaspObjectR
  private = list(
    jaspObjectPtr = new("externalptr") # default is null pointer
  )
)

For option 1, the layers become

R6 <- R6 (generated wrapper) <- C++

For option 2, the layers become

R6 <- wrapper functions <- C++

Rather than 1, I'd also be happy to stick with the generated S4 layer, as I'm not sure what it would add to replace S4 with R6 (instead of cutting it out). Note that both the generated S4 and R6 also define the wrapper functions for option 2, but they just hide those functions in a class interface.

@JorisGoosen
Copy link
Contributor

Well... If we are going to be manually coding these wrappers then we are recreating the just removed Interface + modules right?

Yes, the main reason to do this is that it would make things more lightweight. Note that I'm not really sure whether it makes sense to cut out this layer at all, these are just some ideas about how to cut it out, if we want to do so.

It would make adding new stuff a lot easier because it would go from adding two manual codes (one in jaspModuleRegistration.h and an jaspObject_Interface-class to just one, so it would still be less work once we've done it.
That would be nice.

So the only way this would be an improvement is if we could generate this, and this pkg does that: richfitz/RcppR6

1. Generate R6 wrappers for the C++ objects directly. The downside of this is that we then need to (somehow) define additional R functions for convenience things (e.g., error handling). We could try to define all convenience functions on the C++ side, but I think it's easier to do this on the R side. For example, on the R side, you could print a more human-friendly error message when someone passes an incorrect type.

Ok, so that doesn't help too much then.

2. Generate functions (not classes) that interact with the C++ objects. For example, on the C++ side we generate for jaspTable
Rcpp::CharacterVector getColNames(Rcpp::XPtr<jaspTable*> ptr)
{
    return ptr->getColNames();
}
void setColNames(Rcpp::XPtr<jaspTable*> ptr, Rcpp::CharacterVector newNames)
{
	ptr->setColNames(newNames);
}

This is basically the same as what the jaspObject_interface-classes do, but then instead of storing the pointer in R we store it in C++.

and on the R side, we write manually

jaspTableR <- R6::R6class(
  ... # stuff

  active = list(
    colNames = function(x) { if missing(x) getColNames(private$jaspObjectPtr) else setColnames(jaspObjectPtr, x) }
  ),
  # inherited from jaspObjectR
  private = list(
    jaspObjectPtr = new("externalptr") # default is null pointer
  )
)

...

Rather than 1, I'd also be happy to stick with the generated S4 layer, as I'm not sure what it would add to replace S4 with R6 (instead of cutting it out). Note that both the generated S4 and R6 also define the wrapper functions for option 2, but they just hide those functions in a class interface.

Maybe we should have a look at https://github.com/r-lib/cpp11 ?
Because I feel like the ways we've thought of now in this PR are not making things any easier.

If we can somehow call functions on a pointer to an object a bit easier from R it would make it a lot more smooth.

@JorisGoosen
Copy link
Contributor

What might be possible is calling the C++ functions in such a way that we supply the normally invisible this pointer as first argument, because I'm fairly sure that is how C++ normally does that. So then we could just call the jaspObject functions directly from R perhaps. And we wouldn't need to rewrite to cp11 (and anyway, I see nothing there about supporting calling functions on classes).

@vandenman
Copy link
Contributor Author

This is basically the same as what the jaspObject_interface-classes do, but then instead of storing the pointer in R we store it in C++.

The difference is that the wrapper function I suggest functions are directly callable from R, whereas the jaspObject_interface-classes store a pointer but still require another C++ wrapper function to be exposed in R. See example 1.2 here where they manually expose a class. I'd intend to generate those wrapper functions (e.g., Uniform__draw) directly, (but without SEXP).

Maybe we should have a look at r-lib/cpp11 ?

In cpp11, the approach where we use functions with pointers to jaspObjects would translate to something like this:

cpp11::strings getColNames(cpp11::external_pointer<jaspTable*> ptr)
{
    return ptr->getColNames();
}
void setColNames(cpp11::external_pointer<jaspTable*> ptr, cpp11::strings newNames)
{
	ptr->setColNames(newNames);
}

cpp11 is a newer C++ to R interface. It's smaller, has faster compilation times, and does more things that we want, for example, automatic UTF-8 conversions for everything passed from R. However, cpp11 does not support modules, so we could only use this if we write the glue code ourselves.

What might be possible is calling the C++ functions in such a way that we supply the normally invisible this pointer as first argument, because I'm fairly sure that is how C++ normally does that. So then we could just call the jaspObject functions directly from R perhaps.

that would be pretty cool, however, doesn't that mean we need to write those functions with SEXP? If that's the case, I think this might introduce more problems than it would solve 😛

@JorisGoosen
Copy link
Contributor

This is basically the same as what the jaspObject_interface-classes do, but then instead of storing the pointer in R we store it in C++.

The difference is that the wrapper function I suggest functions are directly callable from R, whereas the jaspObject_interface-classes store a pointer but still require another C++ wrapper function to be exposed in R. See example 1.2 here where they manually expose a class. I'd intend to generate those wrapper functions (e.g., Uniform__draw) directly, (but without SEXP).

Ah yes, ok that is nicer.

Maybe we should have a look at r-lib/cpp11 ?

In cpp11, the approach where we use functions with pointers to jaspObjects would translate to something like this:

cpp11::strings getColNames(cpp11::external_pointer<jaspTable*> ptr)
{
    return ptr->getColNames();
}
void setColNames(cpp11::external_pointer<jaspTable*> ptr, cpp11::strings newNames)
{
	ptr->setColNames(newNames);
}

cpp11 is a newer C++ to R interface. It's smaller, has faster compilation times, and does more things that we want, for example, automatic UTF-8 conversions for everything passed from R. However, cpp11 does not support modules, so we could only use this if we write the glue code ourselves.

Well, that would just mean a lot of rewriting and the UTF-8 conversion is (a) already covered now and (b) will be obsolete soon anyway once the win-ucrt build of R is finished. So it probably isn't the best idea.

What might be possible is calling the C++ functions in such a way that we supply the normally invisible this pointer as first argument, because I'm fairly sure that is how C++ normally does that. So then we could just call the jaspObject functions directly from R perhaps.

that would be pretty cool, however, doesn't that mean we need to write those functions with SEXP? If that's the case, I think this might introduce more problems than it would solve 😛

Well, not necessarily with Rcpp because then we could at least use those types (Rcpp::String etc) in the glue right?

But this would need to be tested to see whether we can pull that off, if I look at that example you linked it is at least not something they suggest!

@vandenman
Copy link
Contributor Author

Well, that would just mean a lot of rewriting and the UTF-8 conversion is (a) already covered now and (b) will be obsolete soon anyway once the win-ucrt build of R is finished. So it probably isn't the best idea.

The thing is, if we implement the wrappers functions we can mostly use std/ JSON structures for all jaspObject classes and have the conversion between R and C++ structures be done by the wrappers function. At that point, switching between Rcpp and cpp11 (or at least, trying it out) should be relatively easy (updating the types for generated wrappers + the types for the state).

But this would need to be tested to see whether we can pull that off, if I look at that example you linked it is at least not something they suggest!

What they say this:

As it is generally a bad idea to expose external pointers ‘as is’, they usually get wrapped as a slot of an S4 class.

Whether we use S4 or R6 for the wrapping doesn't really matter I think, that's just about exposing it to users.

@JorisGoosen
Copy link
Contributor

Well, that would just mean a lot of rewriting and the UTF-8 conversion is (a) already covered now and (b) will be obsolete soon anyway once the win-ucrt build of R is finished. So it probably isn't the best idea.

The thing is, if we implement the wrappers functions we can mostly use std/ JSON structures for all jaspObject classes and have the conversion between R and C++ structures be done by the wrappers function. At that point, switching between Rcpp and cpp11 (or at least, trying it out) should be relatively easy (updating the types for generated wrappers + the types for the state).

Sure, but that would then also entail rewriting all the code on all jaspObjects that now handle Rcpp classes directly. So while this is good for future support of other languages/cp11 it does add a significant amount of work there. At least, I think.
But maybe it isn't quite so bad, to know this we would need to do an inventory and Im guessing you had a good look at all the functions we call by now so maybe you know?

But this would need to be tested to see whether we can pull that off, if I look at that example you linked it is at least not something they suggest!

What they say this:

As it is generally a bad idea to expose external pointers ‘as is’, they usually get wrapped as a slot of an S4 class.

Whether we use S4 or R6 for the wrapping doesn't really matter I think, that's just about exposing it to users.

Yeah these pointers would obviously be inside the R6 not exposed to the user that would be terrible.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants