Under sampling
fernando edited this page Aug 16, 2014 · 4 revisions
UnderSampler is an object that under-samples the majority class(es) at random with replacement.
Parameters:
- ratio : Controls the number of new samples to draw. The number of new samples is given by int(ratio * num_minority_samples)
- random_state : Seed for random number generation.
Methods:
- fit : Find the target statistics to determine the minority class, and the number of samples in each class.
- transform : Returns the resampled version of the original data set (X, y) passed to fit.
- fit_transform : Automatically performs both fit and transform.
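The sampling rule described above can be sketched without the library. This is a minimal illustration using only the Python standard library, not the UnderSampler implementation itself: the function name random_under_sample and its signature are hypothetical, and it assumes every non-minority class is reduced to int(ratio * num_minority_samples) samples drawn with replacement.

```python
import random
from collections import Counter


def random_under_sample(X, y, ratio=1.0, random_state=None):
    """Hypothetical helper: keep every minority sample and draw
    int(ratio * n_minority) samples, with replacement, from each other class."""
    rng = random.Random(random_state)
    counts = Counter(y)
    # The minority class is the label with the fewest samples.
    minority, n_minority = min(counts.items(), key=lambda kv: kv[1])
    n_draw = int(ratio * n_minority)

    keep = [i for i, label in enumerate(y) if label == minority]
    for label in counts:
        if label == minority:
            continue
        idx = [i for i, lab in enumerate(y) if lab == label]
        # random.choices samples with replacement, so an index may repeat.
        keep.extend(rng.choices(idx, k=n_draw))
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: 15 majority samples (label 0) and 5 minority samples (label 1).
X = [[float(i), float(i % 3)] for i in range(20)]
y = [0] * 15 + [1] * 5

X_res, y_res = random_under_sample(X, y, ratio=1.0, random_state=0)
X_res2, y_res2 = random_under_sample(X, y, ratio=2.0, random_state=0)
```

With ratio=1.0 the result holds 5 samples of each class; with ratio=2.0 it holds 10 majority samples against the 5 minority ones. Because the draw is with replacement, some majority rows may appear more than once.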
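TomekLinks is an object that identifies all Tomek links between the majority and minority classes and removes the element of each link that belongs to the majority class. A Tomek link is a pair of samples from different classes that are each other's nearest neighbours.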
Parameters:
Methods:
- fit : Find the target statistics to determine the minority class, and the number of samples in each class.
- transform : Returns the resampled version of the original data set (X, y) passed to fit.
- fit_transform : Automatically performs both fit and transform.
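To make the idea concrete, here is a small brute-force sketch of Tomek link removal, assuming the usual definition (two samples of different classes that are each other's nearest neighbours) and Euclidean distance. The helper names nearest_neighbour and remove_tomek_links are hypothetical and this is not the TomekLinks code; it needs Python 3.8+ for math.dist.

```python
import math


def nearest_neighbour(X, i):
    """Index of the sample closest to X[i] (Euclidean), excluding i itself."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))


def remove_tomek_links(X, y, majority_label):
    """Hypothetical helper: drop the majority-class member of every Tomek link."""
    nn = [nearest_neighbour(X, i) for i in range(len(X))]
    drop = set()
    for i, j in enumerate(nn):
        # i and j form a Tomek link when they are mutual nearest
        # neighbours and carry different labels.
        if nn[j] == i and y[i] != y[j] and y[i] == majority_label:
            drop.add(i)
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: the majority point [4.0, 0.0] sits right next to a minority point.
X = [[0.0, 0.0], [1.0, 0.2], [2.5, 0.0], [4.0, 0.0], [4.3, 0.0], [9.0, 0.0]]
y = [0, 0, 0, 0, 1, 1]

X_res, y_res = remove_tomek_links(X, y, majority_label=0)
```

In this toy set the only Tomek link is the pair [4.0, 0.0] and [4.3, 0.0], so only the majority point [4.0, 0.0] is removed. The brute-force neighbour search is quadratic in the number of samples, which is fine for illustration but not for large data sets.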
ClusterCentroids is an object that under-samples the majority class by replacing clusters of samples with the cluster centroids found by a KMeans algorithm.
(Experimental) A KMeans algorithm is fitted to the data, with the number of clusters N determined by the level of under-sampling. The majority samples are then completely replaced by the set of cluster centroids from KMeans.
Parameters:
- kargs : Dictionary to pass any parameters to the scikit-learn KMeans object.
- ratio : Controls the number of new samples to draw. The number of new samples is given by int(ratio * num_minority_samples)
- random_state : Seed for random number generation.
Methods:
- fit : Find the target statistics to determine the minority class, and the number of samples in each class.
- transform : Returns the resampled version of the original data set (X, y) passed to fit.
- fit_transform : Automatically performs both fit and transform.
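The centroid replacement step can be sketched as follows. This is an illustration only: it swaps in a plain Lloyd's iteration written with the standard library as a stand-in for the scikit-learn KMeans object, it assumes the number of clusters is int(ratio * num_minority_samples), and it assumes that value is between 1 and the number of majority samples. The names kmeans_centroids and cluster_centroid_under_sample are hypothetical.

```python
import math
import random


def kmeans_centroids(points, k, n_iter=25, seed=0):
    """Plain Lloyd's algorithm; a simple stand-in for scikit-learn's KMeans."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def cluster_centroid_under_sample(X, y, majority_label, minority_label,
                                  ratio=1.0, seed=0):
    """Hypothetical helper: replace all majority samples with k centroids,
    where k = int(ratio * n_minority) is an assumption of this sketch."""
    majority = [x for x, lab in zip(X, y) if lab == majority_label]
    others = [(x, lab) for x, lab in zip(X, y) if lab != majority_label]
    n_minority = sum(1 for lab in y if lab == minority_label)
    k = int(ratio * n_minority)
    centroids = kmeans_centroids(majority, k, seed=seed)
    X_res = [list(x) for x, _ in others] + centroids
    y_res = [lab for _, lab in others] + [majority_label] * k
    return X_res, y_res


# Toy data: 12 majority samples (label 0) and 3 minority samples (label 1).
X = [[float(i), float(i % 4)] for i in range(12)]
X += [[20.0, 20.0], [21.0, 20.0], [20.0, 21.0]]
y = [0] * 12 + [1] * 3

X_res, y_res = cluster_centroid_under_sample(X, y, majority_label=0,
                                             minority_label=1, ratio=1.0)
```

Here the 12 majority samples are replaced by 3 centroids, so the result has 3 samples per class. Note that the centroids are synthetic points: unlike the two methods above, the returned majority samples generally do not appear in the original data set.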