LeveragingBagging
- class capymoa.classifier.LeveragingBagging
- Bases: MOAClassifier
- Leveraging Bagging for evolving data streams using ADWIN.
- Leveraging Bagging [1] is a meta-strategy for evolving data streams that uses ADWIN.
- >>> from capymoa.classifier import LeveragingBagging
  >>> from capymoa.datasets import ElectricityTiny
  >>> from capymoa.evaluation import prequential_evaluation
  >>>
  >>> stream = ElectricityTiny()
  >>> classifier = LeveragingBagging(stream.get_schema())
  >>> results = prequential_evaluation(stream, classifier, max_instances=1000)
  >>> print(f"{results['cumulative'].accuracy():.1f}")
  87.4
- __init__(
- schema=None,
- CLI=None,
- random_seed=1,
- base_learner=None,
- ensemble_size=100,
- minibatch_size=None,
- number_of_jobs=None,
- )
- Construct a Leveraging Bagging classifier.
- Parameters:
- schema – The schema of the stream. If not provided, it will be inferred from the data. 
- CLI – Command Line Interface (CLI) options for configuring the Leveraging Bagging algorithm. If not provided, default options will be used. 
- random_seed – Seed for the random number generator. 
- base_learner – The base learner to use. If not provided, a default Hoeffding Tree is used. 
- ensemble_size – The number of base learners in the ensemble (trees, when the default Hoeffding Tree is used). 
- minibatch_size – The number of instances that a learner must accumulate before training. 
- number_of_jobs – The number of parallel jobs to run during execution. By default, tasks run sequentially (i.e., with number_of_jobs=1). Increasing number_of_jobs can speed up execution on multi-core systems, but high values may consume more system resources and memory. The parallel implementation prioritizes execution speed, so its predictive performance may differ from that of the sequential version. Experiment with different values to find the best setting for the available hardware and the nature of the workload. 
 
 
 - predict(instance)
- Predict the label of an instance.
- The base implementation calls predict_proba() and returns the label with the highest probability.
- Parameters:
- instance – The instance to predict the label for. 
- Returns:
- The predicted label, or None if the classifier is unable to make a prediction.
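The relationship between predict() and predict_proba() described above can be sketched in plain Python. This is only an illustration under stated assumptions: label_from_proba is a hypothetical helper, not a capymoa function, and the library's own code may treat ties or empty outputs differently.

```python
from typing import Optional, Sequence


def label_from_proba(proba: Optional[Sequence[float]]) -> Optional[int]:
    """Hypothetical helper (not part of capymoa): pick the most probable label."""
    # No probabilities available: return None, as the documentation describes.
    if proba is None or len(proba) == 0:
        return None
    # Index of the largest probability; max() keeps the first index on ties.
    return max(range(len(proba)), key=lambda i: proba[i])


print(label_from_proba([0.1, 0.7, 0.2]))  # prints 1
print(label_from_proba(None))             # prints None
```

Returning an index rather than a label string is a design choice of this sketch; mapping indices to label names is left out.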
 
 - predict_proba(instance)
- Return probability estimates for each label.
- Parameters:
- instance – The instance to estimate the probabilities for. 
- Returns:
- An array of probabilities for each label, or None if the classifier is unable to make a prediction.
 
 - train(instance)
- Train the classifier with a labeled instance.
- Parameters:
- instance – The labeled instance to train the classifier with. 
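The predict() and train() methods are typically combined in a test-then-train loop, where each instance is first used for prediction and only then for learning. The following is a minimal sketch of that pattern, not the library's evaluation code. MajorityClassStub and prequential_accuracy are hypothetical names, and instances are modelled as (features, label) tuples, which is an assumption: capymoa instances are objects, so label access would differ.

```python
from collections import Counter
from typing import Any, Iterable, Optional, Tuple

Instance = Tuple[Any, int]  # assumed shape: (features, label)


class MajorityClassStub:
    """Hypothetical stand-in learner exposing predict() and train()."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def predict(self, instance: Instance) -> Optional[int]:
        # No prediction is possible before any training instance was seen.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def train(self, instance: Instance) -> None:
        _, label = instance
        self.counts[label] += 1


def prequential_accuracy(classifier: Any, instances: Iterable[Instance]) -> float:
    """Test-then-train: predict each instance first, then learn from it."""
    correct = total = 0
    for instance in instances:
        _, label = instance
        if classifier.predict(instance) == label:
            correct += 1
        total += 1
        classifier.train(instance)
    return correct / total if total else 0.0


data = [((0.1,), 1), ((0.2,), 1), ((0.3,), 0), ((0.4,), 1)]
print(prequential_accuracy(MajorityClassStub(), data))  # prints 0.5
```

Any object with compatible predict() and train() methods could be passed in place of the stub, provided the label accessor is adapted to the instance type.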
 
 - random_seed: int
- The random seed for reproducibility.
- When implementing a classifier, ensure random number generators are seeded.
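One way to follow the seeding advice is to give each component its own generator created from random_seed, instead of relying on global random state. The sketch below illustrates that idea with the standard library only; SeededSampler is a hypothetical class and does not reflect how capymoa or MOA seed their learners internally.

```python
import random
from typing import List


class SeededSampler:
    """Hypothetical component that owns a seeded random number generator."""

    def __init__(self, random_seed: int = 1) -> None:
        self.random_seed = random_seed
        # A private generator keeps results independent of global random state.
        self._rng = random.Random(random_seed)

    def draw(self, n: int) -> List[float]:
        # Draw n pseudo-random floats in [0, 1) from the private generator.
        return [self._rng.random() for _ in range(n)]


a = SeededSampler(random_seed=1).draw(3)
b = SeededSampler(random_seed=1).draw(3)
c = SeededSampler(random_seed=2).draw(3)
print(a == b)  # same seed, same sequence
print(a == c)  # different seed, different sequence
```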