Ensembles module
StreamingEnsemble | Abstract, base ensemble streaming class
AUE | Accuracy Updated Ensemble
AWE | Accuracy Weighted Ensemble
DWM |
KMC | Wang, Yi, Yang Zhang, and Yong Wang. "Mining data streams with skewed distribution by static classifier ensemble."
CDS | Ditzler, Gregory, and Robi Polikar. "Incremental learning of concept drift from streaming imbalanced data."
NIE | Ditzler, Gregory, and Robi Polikar. "Incremental learning of concept drift from streaming imbalanced data."
OnlineBagging | Online Bagging.
OOB | Oversampling-Based Online Bagging.
OUSE | Gao, Jing, et al. "Classifying Data Streams with Skewed Class Distributions and Concept Drifts." IEEE Internet Computing 12.6 (2008): 37-49.
REA | Recursive Ensemble Approach.
SEA | Streaming Ensemble Algorithm.
UOB | Undersampling-Based Online Bagging.
WAE | Weighted Aging Ensemble.
KUE | Kappa Updated Ensemble
- class strlearn.ensembles.AUE(base_estimator=None, n_estimators=10, n_splits=5, epsilon=1e-10)
Bases:
StreamingEnsemble
Accuracy Updated Ensemble
- partial_fit(X, y, classes=None)
Partial fitting
- class strlearn.ensembles.AWE(base_estimator=None, n_estimators=10, n_splits=5)
Bases:
StreamingEnsemble
Accuracy Weighted Ensemble
- partial_fit(X, y, classes=None)
Partial fitting
- class strlearn.ensembles.CDS(base_estimator=None, n_estimators=10, a=2, b=2)
Bases:
StreamingEnsemble
Ditzler, Gregory, and Robi Polikar. “Incremental learning of concept drift from streaming imbalanced data.” IEEE Transactions on Knowledge and Data Engineering 25.10 (2013): 2283-2301.
- partial_fit(X, y, classes=None)
Partial fitting
- class strlearn.ensembles.DWM(base_estimator=None, beta=0.5, theta=0.01, p=1, weighted=False)
Bases:
StreamingEnsemble
- partial_fit(X, y, classes=None)
Partial fitting
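The class carries no description, but the parameter names `beta`, `theta`, and `p` match those of the Dynamic Weighted Majority algorithm (Kolter and Maloof), which DWM presumably abbreviates. Under that assumption, the sketch below illustrates the weighting rule only: the weight of a member that errs is multiplied by `beta`, weights are rescaled so the largest equals 1, and members whose weight drops below `theta` are removed. In the published algorithm this update runs every `p` samples. The function name `dwm_update_weights` is hypothetical and is not part of strlearn.

```python
def dwm_update_weights(weights, correct, beta=0.5, theta=0.01):
    """One hypothetical DWM-style weight update.

    weights: list of floats, one per ensemble member.
    correct: list of bools, True where the member classified the sample correctly.
    Returns (new_weights, kept_indices).
    """
    # Multiply the weight of every member that made a mistake by beta.
    updated = [w if ok else w * beta for w, ok in zip(weights, correct)]
    # Rescale so that the largest weight equals 1.
    top = max(updated)
    updated = [w / top for w in updated]
    # Keep only the members whose weight is still at least theta.
    kept = [i for i, w in enumerate(updated) if w >= theta]
    return [updated[i] for i in kept], kept


# Three members: the first is correct, the other two are wrong.
weights, kept = dwm_update_weights([1.0, 1.0, 0.015], [True, False, False])
print(weights, kept)
```

The third member starts close to the threshold, so a single mistake is enough to remove it from the pool.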
- class strlearn.ensembles.KMC(base_estimator=None, n_estimators=10)
Bases:
StreamingEnsemble
Wang, Yi, Yang Zhang, and Yong Wang. “Mining data streams with skewed distribution by static classifier ensemble.” Opportunities and Challenges for Next-Generation Applied Intelligence. Springer, Berlin, Heidelberg, 2009. 65-71.
- partial_fit(X, y, classes=None)
Partial fitting
- class strlearn.ensembles.KUE(base_estimator=None, n_estimators=10, n_candidates=1)
Bases:
StreamingEnsemble
Kappa Updated Ensemble
- ensemble_support_matrix(X)
Ensemble support matrix.
- partial_fit(X, y, classes=None)
Partial fitting
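The ensemble is named after the kappa statistic. As a reminder of what that statistic measures, the sketch below computes Cohen's kappa in plain Python: observed agreement corrected for the agreement expected by chance. It is an illustration of the statistic only; how KUE uses it internally is not described here, and the helper name `cohen_kappa` is an assumption of this sketch.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed - expected) / (1 - expected)."""
    n = len(y_true)
    # Fraction of samples on which predictions and labels agree.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Degenerate case (a single class everywhere); 0.0 is this sketch's convention.
        return 0.0
    return (observed - expected) / (1.0 - expected)


balanced = cohen_kappa([0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 1, 1, 1, 0])
# A classifier that always predicts the majority class of an imbalanced chunk.
majority_only = cohen_kappa([0] * 9 + [1], [0] * 10)
print(balanced, majority_only)
```

The second call shows why kappa suits imbalanced streams: always predicting the majority class reaches 90% accuracy on that chunk, yet its kappa shows no agreement beyond chance.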
- class strlearn.ensembles.NIE(base_estimator=None, n_estimators=5, param_a=1, param_b=1)
Bases:
StreamingEnsemble
Ditzler, Gregory, and Robi Polikar. “Incremental learning of concept drift from streaming imbalanced data.” IEEE Transactions on Knowledge and Data Engineering 25.10 (2013): 2283-2301.
- ensemble_support_matrix(X)
Ensemble support matrix.
- partial_fit(X, y, classes=None)
Partial fitting
- class strlearn.ensembles.OOB(base_estimator=None, n_estimators=5, time_decay_factor=0.9)
Bases:
StreamingEnsemble
Oversampling-Based Online Bagging.
- partial_fit(X, y, classes=None)
Partial fitting
- class strlearn.ensembles.OUSE(base_estimator=None, n_estimators=10, n_chunks=10)
Bases:
ClassifierMixin, BaseEnsemble
Gao, Jing, et al. “Classifying Data Streams with Skewed Class Distributions and Concept Drifts.” IEEE Internet Computing 12.6 (2008): 37-49.
- ensemble_support_matrix(X)
Ensemble support matrix.
- fit(X, y)
Fitting.
- minority_majority_name(y)
Returns the names (labels) of the minority and majority classes
- minority_majority_split(X, y, minority_name, majority_name)
Returns minority and majority data
- Parameters:
X (array-like, shape (n_samples, n_features)) – The training input samples.
y (array-like, shape (n_samples)) – The target values.
- Return type:
tuple (array-like, shape = [n_samples, n_features], array-like, shape = [n_samples, n_features])
- Returns:
Tuple of minority and majority class samples
- partial_fit(X, y, classes=None)
Partial fitting.
- predict(X)
Predict classes for X.
- Parameters:
X (array-like, shape (n_samples, n_features)) – The input samples to classify.
- Return type:
array-like, shape (n_samples, )
- Returns:
The predicted classes.
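The two `minority_majority_*` helpers documented above can be pictured with the following plain-Python sketch. It reuses the documented method names as free functions for readability, but it is a re-implementation on lists under the assumption of a binary problem, not the library's code.

```python
from collections import Counter


def minority_majority_name(y):
    """Return (minority_label, majority_label) for a binary label sequence."""
    # most_common() orders labels from most to least frequent.
    ordered = Counter(y).most_common()
    return ordered[-1][0], ordered[0][0]


def minority_majority_split(X, y, minority_name, majority_name):
    """Return (minority_samples, majority_samples) as two lists."""
    minority = [x for x, label in zip(X, y) if label == minority_name]
    majority = [x for x, label in zip(X, y) if label == majority_name]
    return minority, majority


X = [[0.1], [0.2], [0.3], [0.4], [0.5]]
y = [0, 0, 1, 0, 0]
minority_name, majority_name = minority_majority_name(y)
minority, majority = minority_majority_split(X, y, minority_name, majority_name)
print(minority_name, majority_name, len(minority), len(majority))
```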
- class strlearn.ensembles.OnlineBagging(base_estimator=None, n_estimators=10)
Bases:
StreamingEnsemble
Online Bagging.
- partial_fit(X, y, classes=None)
Partial fitting
- class strlearn.ensembles.REA(base_estimator=None, n_estimators=10, post_balance_ratio=0.5, k_parameter=10, weighted=False, pruning=False)
Bases:
StreamingEnsemble
Recursive Ensemble Approach.
Sheng Chen, and Haibo He. “Towards incremental learning of nonstationary imbalanced data stream: a multiple selectively recursive approach.” Evolving Systems 2.1 (2011): 35-50.
- partial_fit(X, y, classes=None)
Partial fitting
- class strlearn.ensembles.ROSE(base_estimator=None, n_estimators=10, n_candidates=1, subspace_mean=0.7, buffer_limit=1000, min_lambda=4)
Bases:
StreamingEnsemble
Robust Online Self-Adjusting Ensemble
- ensemble_support_matrix(X)
Ensemble support matrix.
- partial_fit(X, y, classes=None)
Partial fitting
- class strlearn.ensembles.SEA(base_estimator=None, n_estimators=10, metric=<function accuracy_score>)
Bases:
StreamingEnsemble
Streaming Ensemble Algorithm.
Ensemble classifier composed of estimators trained on a fixed number of previously seen data chunks, pruning the worst one in the pool.
- Parameters:
n_estimators (integer, optional (default=10)) – The maximum number of estimators trained using consecutive data chunks and maintained in the ensemble.
metric (function, optional (default=accuracy_score)) – The metric used to prune the worst classifier in the pool.
- Variables:
ensemble (list of classifiers) – The collection of fitted sub-estimators.
classes (array-like, shape (n_classes, )) – The class labels.
- Example:
>>> import strlearn as sl
>>> stream = sl.streams.StreamGenerator()
>>> clf = sl.ensembles.SEA()
>>> evaluator = sl.evaluators.TestThenTrainEvaluator()
>>> evaluator.process(clf, stream)
>>> print(evaluator.scores_)
...
[[0.92 0.91879699 0.91848191 0.91879699 0.92523364]
[0.945 0.94648779 0.94624912 0.94648779 0.94240838]
[0.925 0.92364329 0.92360881 0.92364329 0.91017964]
...
[0.925 0.92427885 0.924103 0.92427885 0.92890995]
[0.89 0.89016179 0.89015879 0.89016179 0.88297872]
[0.935 0.93569212 0.93540766 0.93569212 0.93467337]]
- partial_fit(X, y, classes=None)
Partial fitting
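The pruning idea in the SEA description (keep at most `n_estimators` members and drop the one that scores worst under `metric`) can be sketched without any library. In the sketch below, the toy threshold rules stand in for fitted base classifiers, and `prune_worst`, `accuracy`, and `make_threshold_rule` are hypothetical names introduced for illustration. Scoring members on the most recent chunk is an assumption of this sketch.

```python
def accuracy(y_true, y_pred):
    """Fraction of matching labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def prune_worst(ensemble, X, y, n_estimators, metric=accuracy):
    """Drop the lowest-scoring members until at most n_estimators remain."""
    ensemble = list(ensemble)
    while len(ensemble) > n_estimators:
        # Score every member on the given chunk.
        scores = [metric(y, [clf(x) for x in X]) for clf in ensemble]
        # Remove the worst member; ties go to the earliest one.
        ensemble.pop(scores.index(min(scores)))
    return ensemble


def make_threshold_rule(threshold):
    """Toy 1-D classifier: predicts 1 when the input exceeds the threshold."""
    return lambda x: int(x > threshold)


pool = [make_threshold_rule(t) for t in (0.2, 0.5, 0.9)]
X_chunk = [0.1, 0.3, 0.4, 0.6, 0.8]
y_chunk = [0, 0, 0, 1, 1]
kept = prune_worst(pool, X_chunk, y_chunk, n_estimators=2)
print(len(kept))
```

Passing the metric as a function mirrors the `metric` parameter of the constructor, so the same loop works with any score where higher is better.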
- class strlearn.ensembles.StreamingEnsemble(base_estimator, n_estimators, weighted=False)
Bases:
ClassifierMixin, BaseEstimator
Abstract, base ensemble streaming class
- ensemble_support_matrix(X)
Ensemble support matrix.
- fit(X, y)
Fitting.
- minority_majority_name(y)
Returns the names (labels) of the minority and majority classes
- minority_majority_split(X, y, minority_name, majority_name)
Returns minority and majority data
- Parameters:
X (array-like, shape (n_samples, n_features)) – The training input samples.
y (array-like, shape (n_samples)) – The target values.
- Return type:
tuple (array-like, shape = [n_samples, n_features], array-like, shape = [n_samples, n_features])
- Returns:
Tuple of minority and majority class samples
- msei(clf, X, y)
MSEi score from original AWE algorithm.
- mser(y)
MSEr score from original AWE algorithm.
- partial_fit(X, y, classes=None)
Partial fitting
- predict(X)
Predict classes for X.
- Parameters:
X (array-like, shape (n_samples, n_features)) – The input samples to classify.
- Return type:
array-like, shape (n_samples, )
- Returns:
The predicted classes.
- predict_proba(X)
Predict proba.
- prior_proba(y)
Calculate prior probability for given labels
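For orientation, the quantities behind `prior_proba`, `mser`, and `msei` can be written out in a few lines. The sketch follows the definitions in the original AWE paper (Wang et al., 2003): MSEr is the error of a classifier that predicts at random according to the class priors, MSEi is the mean squared error of member i on the true-class probabilities, and the member's weight is MSEr minus MSEi. The signatures below are simplified assumptions: in particular, `msei` here takes the predicted true-class probabilities directly rather than `(clf, X, y)`.

```python
from collections import Counter


def prior_proba(y):
    """Relative frequency of each class label in y."""
    n = len(y)
    return {label: count / n for label, count in Counter(y).items()}


def mser(y):
    """MSE of a random classifier that follows the class priors."""
    return sum(p * (1.0 - p) ** 2 for p in prior_proba(y).values())


def msei(proba_of_true_class):
    """MSE of one member, given its probability for each sample's true class."""
    n = len(proba_of_true_class)
    return sum((1.0 - p) ** 2 for p in proba_of_true_class) / n


y = [0, 0, 1, 1]
# Hypothetical probabilities a member assigned to the correct class of each sample.
member_proba = [0.9, 0.8, 0.7, 1.0]
weight = mser(y) - msei(member_proba)
print(prior_proba(y), mser(y), weight)
```

A member no better than random guessing gets a weight near zero or below, which is why AWE-style ensembles can use this difference directly for weighting.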
- class strlearn.ensembles.UOB(base_estimator=None, n_estimators=5, time_decay_factor=0.9)
Bases:
StreamingEnsemble
Undersampling-Based Online Bagging.
- partial_fit(X, y, classes=None)
Partial fitting
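The `time_decay_factor` shared by UOB and OOB suggests a time-decayed estimate of class sizes, as in the resampling-based online bagging methods of Wang, Minku, and Yao. The sketch below is written under that assumption and for a binary problem. Each new label pulls its class size up and decays the other. The Poisson rate used for resampling is then raised for the minority class (oversampling) or lowered for the majority class (undersampling). Both function names are hypothetical, and the actual draw of a Poisson count is left out.

```python
def update_class_sizes(sizes, label, decay=0.9):
    """Time-decayed class-size update after observing one label."""
    return {
        k: decay * w + (1.0 - decay) * (1.0 if k == label else 0.0)
        for k, w in sizes.items()
    }


def poisson_rate(sizes, label, mode):
    """Resampling rate for a sample of class `label` (binary problem assumed)."""
    own = sizes[label]
    other = max(w for k, w in sizes.items() if k != label)
    if own == 0.0:
        return 1.0  # Guard for this sketch: no estimate yet for this class.
    if mode == "oversampling" and own < other:
        return other / own  # Greater than 1: minority samples are repeated.
    if mode == "undersampling" and own > other:
        return other / own  # Less than 1: majority samples are often skipped.
    return 1.0


sizes = {0: 0.5, 1: 0.5}
for label in [0, 0, 0, 1]:
    sizes = update_class_sizes(sizes, label, decay=0.9)

rate_over = poisson_rate(sizes, 1, "oversampling")
rate_under = poisson_rate(sizes, 0, "undersampling")
print(sizes, rate_over, rate_under)
```

With a decay of 0.9, a label seen about ten samples ago still contributes roughly a third of its original weight, so the estimate tracks slow changes in the class ratio.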
- class strlearn.ensembles.WAE(base_estimator=None, n_estimators=10, theta=0.1, post_pruning=False, pruning_criterion='accuracy', weight_calculation_method='kuncheva', aging_method='weights_proportional', rejuvenation_power=0.0)
Bases:
StreamingEnsemble
Weighted Aging Ensemble.
The method was inspired by the Accuracy Weighted Ensemble (AWE) algorithm, to which it introduces two main modifications: (I) classifier weights depend on the individual classifier accuracies and on the time the classifiers have spent in the ensemble, (II) individual classifiers are chosen on the basis of a non-pairwise diversity measure.
- Parameters:
base_estimator (ClassifierMixin class object) – Classification algorithm used as a base estimator.
n_estimators (integer, optional (default=10)) – The maximum number of estimators trained using consecutive data chunks and maintained in the ensemble.
theta (float, optional (default=0.1)) – Threshold for weight calculation method and aging procedure control.
post_pruning (boolean, optional (default=False)) – Whether the pruning is conducted before or after adding the classifier.
pruning_criterion (string, optional (default='accuracy')) – Selection of pruning criterion.
weight_calculation_method (string, optional (default='kuncheva')) – Method used to calculate classifier weights. One of: same_for_each, proportional_to_accuracy, kuncheva, pta_related_to_whole, bell_curve.
aging_method (string, optional (default='weights_proportional')) – Method used to age classifier weights. One of: weights_proportional, constant, gaussian.
rejuvenation_power (float, optional (default=0.0)) – Rejuvenation dynamics control of classifiers with high prediction accuracy.
- Variables:
ensemble (list of classifiers) – The collection of fitted sub-estimators.
classes (array-like, shape (n_classes, )) – The class labels.
weights (array-like, shape (n_estimators, )) – Classifier weights.
- Examples:
>>> import strlearn as sl
>>> from sklearn.naive_bayes import GaussianNB
>>> stream = sl.streams.StreamGenerator()
>>> clf = sl.ensembles.WAE(GaussianNB())
>>> ttt = sl.evaluators.TestThenTrain(
...     metrics=(sl.metrics.balanced_accuracy_score))
>>> ttt.process(stream, clf)
>>> print(ttt.scores)
[[[0.91386218]
[0.93032581]
[0.90907219]
[0.90544872]
[0.90466186]
[0.91956783]
[0.90776942]
[0.92685422]
[0.92895186]
...
- partial_fit(X, y, classes=None)
Partial fitting
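To make the interplay of accuracy-based weights and aging concrete, here is a minimal sketch. It assumes that the `kuncheva` option refers to the log-odds weight log(acc / (1 - acc)) known from weighted majority voting, and that `constant` aging subtracts `theta` for every chunk a member has spent in the ensemble. Both readings are guesses from the option names, not a description of the library's code, and the function names are hypothetical.

```python
import math


def kuncheva_weight(accuracy, eps=1e-7):
    """Log-odds weight of a member with the given accuracy."""
    # Clip so that an accuracy of exactly 0 or 1 does not divide by zero.
    acc = min(max(accuracy, eps), 1.0 - eps)
    # Members at or below 50% accuracy get weight 0 in this sketch.
    return max(0.0, math.log(acc / (1.0 - acc)))


def constant_aging(weight, age, theta=0.1):
    """Assumed 'constant' aging: lose theta per chunk spent in the ensemble."""
    return max(0.0, weight - theta * age)


accuracies = [0.9, 0.75, 0.5]
ages = [3, 1, 0]  # Number of chunks each member has been in the ensemble.
weights = [constant_aging(kuncheva_weight(a), t) for a, t in zip(accuracies, ages)]
print(weights)
```

Under these assumptions an accurate but old member can still outweigh a newer, weaker one, while a member that performs at chance level contributes nothing to the vote.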