Merge pull request #1 from flystar233/dev
Dev
flystar233 authored Sep 5, 2024
2 parents 4022944 + f5d6291 commit 60c33b7
Showing 13 changed files with 148 additions and 87 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -10,7 +10,7 @@ Authors@R:
family = "Xu",
role = c("aut", "cre"),
email = "flystar233@gmail.com")
-Description: Provides a method to find the outlier in custom data by quantile random forests("Quantile Regression Forests", Journal of Machine Learning Research, 7(Jun), 983-999, 2006.). It directly calls the ranger function of the ranger package to perform data fitting and prediction. We also implement the evaluation of outlier prediction results.Compared with random forest detection of outliers, this method has higher accuracy and stability.
+Description: Provides a method to find the outlier in custom data by quantile random forests method. Introduced by Meinshausen Nicolai (2006) <doi:10.5555/1248547.1248582>. It directly calls the ranger() function of the 'ranger' package to perform data fitting and prediction. We also implement the evaluation of outlier prediction results. Compared with random forest detection of outliers, this method has higher accuracy and stability on large datasets.
LazyData: false
License: MIT + file LICENSE
Depends:
4 changes: 2 additions & 2 deletions R/generateOutliers.r
@@ -8,13 +8,13 @@
#' @returns data with some outliers.
#' @export
#' @examples
-#' generateOutliers(iris, p = 0.05, sd_factor = 5, seed = 123)
+#' generateOutliers(iris, p = 0.05, sd_factor = 5)
generateOutliers <- function(data, p = 0.05, sd_factor = 5, seed = NULL){
if (p < 0 || p > 1|| sd_factor <= 0) {
stop("p must be between 0 and 1, and sd_factor must be positive")
}
if (is.null(seed)) {
-set.seed(123)
+set.seed(as.numeric(Sys.time()))
} else {
set.seed(seed)
}
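The seed change in this hunk means that calls without an explicit `seed` are no longer reproducible, while calls with a `seed` still are. A minimal sketch of that behaviour, assuming a hypothetical helper `seed_or_clock()` that only mirrors the `if`/`else` shown above (it is not part of the package):

```r
# Hypothetical helper, not part of outqrf: it copies the seed handling
# from the hunk above so the effect can be seen in isolation.
seed_or_clock <- function(seed = NULL) {
  if (is.null(seed)) {
    # No seed supplied: seed from the current clock time (new behaviour).
    set.seed(as.numeric(Sys.time()))
  } else {
    # Seed supplied: results are reproducible.
    set.seed(seed)
  }
  invisible(NULL)
}

seed_or_clock(42)
a <- runif(3)
seed_or_clock(42)
b <- runif(3)
identical(a, b)  # TRUE: the same explicit seed gives the same draws
```

Callers who relied on the old implicit `set.seed(123)` would need to pass `seed = 123` themselves to get repeatable output.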
10 changes: 8 additions & 2 deletions R/outqrf.r
@@ -87,10 +87,11 @@ get_right_rank <- function(response,outMatrix,median_outMatrix,rmse_){
#' This function finds outliers in a dataset using quantile random forests.
#'
#' @param data a data frame
-#' @param quantiles_type '1000':seq(from = 0.001, to = 0.999, by = 0.001), '400':seq(0.0025,0.9975,0.0025)
+#' @param quantiles_type specify the type of quantile generation. Default is 1000.
#' @param threshold a threshold for outlier detection
#' @param verbose a boolean value indicating whether to print verbose output
#' @param impute a boolean value indicating whether to impute missing values
+#' @param weight a boolean value indicating whether to use weight. If TRUE, the actual threshold will be threshold*r2.
#' @param ... additional arguments passed to the ranger function
#' @return
#' An object of class "outqrf" and a list with the following elements.
@@ -119,6 +120,7 @@ outqrf <-function(data,
threshold =0.025,
impute = TRUE,
verbose = 1,
+weight = FALSE,
...){
# Initial check
if (!is.data.frame(data)) {
@@ -191,7 +193,11 @@ outqrf <-function(data,
rmse <- c(rmse,rmse_)
rank_value <- get_right_rank(response,outMatrix,median_outMatrix,rmse_)
outlier <- data.frame(row = as.numeric(row.names(data)),col = v,observed = response, predicted = median_outMatrix,rank = rank_value)
-outlier<- outlier|>dplyr::filter(rank<=threshold_low| rank>=threshold_high)
+if (weight){
+outlier<- outlier|>dplyr::filter(rank<=threshold_low*qrf$r.squared| rank>=1-threshold_low*qrf$r.squared)}
+else{
+outlier<- outlier|>dplyr::filter(rank<=threshold_low| rank>=threshold_high)
+}
outliers <- rbind(outliers,outlier)
}
# names of the variables
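The new `weight` branch scales the rank cutoff by the forest's R-squared, so a poorly fitting model flags fewer points. A minimal sketch of that filter on a plain vector, assuming `threshold_high` equals `1 - threshold_low` (the diff does not show how it is defined) and using a hypothetical function `flag_by_rank()` with a hypothetical `r_squared` argument standing in for `qrf$r.squared`:

```r
# Hypothetical stand-in for the dplyr::filter() condition in the hunk above.
# Assumption: threshold_high == 1 - threshold_low (not shown in the diff).
flag_by_rank <- function(rank, threshold = 0.025, r_squared = 1, weight = FALSE) {
  # With weight = TRUE the cutoff shrinks in proportion to R-squared.
  low <- if (weight) threshold * r_squared else threshold
  rank <= low | rank >= 1 - low
}

ranks <- c(0.01, 0.02, 0.5, 0.98, 0.995)

flag_by_rank(ranks)
# TRUE TRUE FALSE TRUE TRUE

flag_by_rank(ranks, r_squared = 0.5, weight = TRUE)
# TRUE FALSE FALSE FALSE TRUE
```

In this sketch, an `r_squared` below zero would leave no rank in the interval from 0 to 1 satisfying either condition, so nothing would be flagged for that variable.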
31 changes: 24 additions & 7 deletions docs/articles/outqrf.html

4 changes: 2 additions & 2 deletions docs/index.html

2 changes: 1 addition & 1 deletion docs/pkgdown.yml
@@ -3,7 +3,7 @@ pkgdown: 2.1.0
pkgdown_sha: ~
articles:
outqrf: outqrf.html
-last_built: 2024-08-28T05:43Z
+last_built: 2024-09-04T09:48Z
urls:
reference: flystar233.github.io/outqrf/reference
article: flystar233.github.io/outqrf/articles