Editor comments:
The iris dataset is very well-known, but it is also infamous because of its eugenics links.
Since having a good example dataset is very important, would you consider replacing it with another one, like maybe the palmerpenguins one, even if it comes at the cost of adding a (possibly optional) dependency?
Frankly, @maelle, this is something that is not necessarily going to be fully resolved. I think that this issue has many consequences.
iris is the scientific dataset that comes with base R, and the objection should be communicated to the base R team. I think that at least in the basic README a relevant dataset that is available on all R installations is needed.
The other R built-in datasets all have their problems. mtcars and ToothGrowth are not well defined. USArrests is about rape victims, and I do not like that. PlantGrowth is too simple in structure.
If somebody digs up a bit more about mtcars, I am open to replacing iris with it in the README.
I am open to removing the iris dataset from the vignettes.
I am open to removing the iris dataset from the tests, if there are volunteers to do it. It would require rewriting more than 100 unit tests, and I can commit to doing this gradually, but it would be an unjustified burden on the author.
On a less procedural note, I think that most R users do not associate the iris flowers dataset with eugenics. Before using it extensively in the package, I read about the history of the dataset in detail, and this did not even come up. I would really like to balance the sensitivity of people who may have such connotations against that of people who are sensitive to renaming things and censoring scientific history. The iris dataset has been part of R perhaps since the beginning, and there are about 2 million R users who are familiar with it. For most of these people, the iris dataset is associated with open data, open source programming and statistics.
I am open to this suggestion, but rewriting more than a hundred unit tests makes this a low priority. In the README, I can only imagine using a base R dataset. I will remove iris from the vignettes gradually and replace it with the penguins or something similar.