imputegap.recovery.optimization package¶
Submodules¶
Module contents¶
- class imputegap.recovery.optimization.BaseOptimizer[source]¶
Bases:
object
A base class for optimization of imputation algorithm hyperparameters.
Provides structure and common functionality for different optimization strategies.
Methods¶
- _objective(**kwargs):
Abstract method to evaluate the imputation algorithm with the provided parameters. Must be implemented by subclasses.
- optimize(input_data, incomp_data, metrics, algorithm, **kwargs):
Abstract method for the main optimization process. Must be implemented by subclasses.
- optimize(input_data, incomp_data, metrics, algorithm, **kwargs)[source]¶
Abstract method for optimization. Must be implemented in subclasses.
This method performs the optimization of hyperparameters for a given imputation algorithm. Each subclass implements a different optimization strategy (e.g., Greedy, Bayesian, Particle Swarm) and uses the _objective function to evaluate the parameters.
Parameters¶
- input_data : numpy.ndarray
The ground truth time series dataset.
- incomp_data : numpy.ndarray
The contaminated time series dataset to impute.
- metrics : list of str
List of selected metrics for optimization.
- algorithm : str
The imputation algorithm to optimize.
- **kwargs : dict
Additional parameters specific to the optimization strategy (e.g., number of iterations, number of particles, etc.).
Returns¶
- tuple
A tuple containing the best parameters and their corresponding score.
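Example¶
As an illustration of the contract a subclass fulfils, the hypothetical RandomSearch below overrides optimize and delegates scoring to _objective. The keyword names passed to _objective and the candidate hyperparameter space are assumptions made for this sketch, not part of the documented API.
>>> import numpy as np
>>> from imputegap.recovery.optimization import BaseOptimizer
>>> class RandomSearch(BaseOptimizer):  # hypothetical strategy, for illustration only
...     def optimize(self, input_data, incomp_data, metrics, algorithm, n_calls=25):
...         best_params, best_score = None, float("inf")
...         for _ in range(n_calls):
...             params = {"rank": int(np.random.randint(1, 10))}  # assumed hyperparameter space
...             score = self._objective(input_data=input_data, incomp_data=incomp_data,
...                                     metrics=metrics, algorithm=algorithm, **params)  # assumed keywords
...             if score < best_score:
...                 best_params, best_score = params, score
...         return best_params, best_score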
- class imputegap.recovery.optimization.Optimization[source]¶
Bases:
object
A class for performing optimization of imputation algorithm hyperparameters.
This class contains the optimization strategies Greedy, Bayesian, Particle Swarm, Successive Halving, and Ray Tune, used to find the best parameters for different imputation algorithms.
Methods¶
- Greedy.optimize(input_data, incomp_data, metrics=['RMSE'], algorithm='cdrec', n_calls=250):
Perform greedy optimization for hyperparameters.
- Bayesian.optimize(input_data, incomp_data, metrics=['RMSE'], algorithm='cdrec', n_calls=100, n_random_starts=50, acq_func='gp_hedge'):
Perform Bayesian optimization for hyperparameters.
- ParticleSwarm.optimize(input_data, incomp_data, metrics, algorithm, n_particles, c1, c2, w, iterations, n_processes):
Perform Particle Swarm Optimization (PSO) for hyperparameters.
- SuccessiveHalving.optimize(input_data, incomp_data, metrics, algorithm, num_configs, num_iterations, reduction_factor):
Perform Successive Halving optimization for hyperparameters.
- RayTune.optimize(input_data, incomp_data, metrics=['RMSE'], algorithm='cdrec', n_calls=1, max_concurrent_trials=-1):
Perform Ray Tune optimization for hyperparameters.
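Example¶
A minimal setup sketch for the two inputs shared by every strategy. The use of numpy.nan to mark the missing entries of incomp_data is an assumption, not something stated on this page; the arrays created here are reused in the per-strategy examples below.
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> input_data = rng.normal(size=(10, 200))  # ground truth: 10 series, 200 time steps
>>> incomp_data = input_data.copy()
>>> incomp_data[2, 50:80] = np.nan           # assumption: missing values encoded as NaN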
- class Bayesian[source]¶
Bases:
BaseOptimizer
Bayesian optimization strategy for hyperparameters.
- optimize(input_data, incomp_data, metrics=['RMSE'], algorithm='cdrec', n_calls=100, n_random_starts=50, acq_func='gp_hedge')[source]¶
Perform Bayesian optimization for hyperparameters.
Parameters¶
- input_data : numpy.ndarray
The ground truth time series dataset.
- incomp_data : numpy.ndarray
The contaminated time series dataset to impute.
- metrics : list of str, optional
List of selected metrics for optimization (default is ['RMSE']).
- algorithm : str, optional
The imputation algorithm to optimize (default is 'cdrec').
- n_calls : int, optional
Number of calls to the objective function (default is 100).
- n_random_starts : int, optional
Number of random initial points (default is 50).
- acq_func : str, optional
Acquisition function for the Gaussian prior (default is 'gp_hedge').
Returns¶
- tuple
A tuple containing the best parameters and their corresponding score.
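Example¶
A hedged usage sketch reusing the arrays prepared in the overview above; instantiating Bayesian directly and unpacking the returned (parameters, score) tuple follow the documented signature, while the reduced n_calls and n_random_starts are arbitrary values chosen only to keep the run short.
>>> from imputegap.recovery.optimization import Optimization
>>> optimizer = Optimization.Bayesian()
>>> best_params, best_score = optimizer.optimize(
...     input_data, incomp_data, metrics=["RMSE"], algorithm="cdrec",
...     n_calls=20, n_random_starts=10)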
- class Greedy[source]¶
Bases:
BaseOptimizer
Greedy optimization strategy for hyperparameters.
- optimize(input_data, incomp_data, metrics=['RMSE'], algorithm='cdrec', n_calls=250)[source]¶
Perform greedy optimization for hyperparameters.
Parameters¶
- input_data : numpy.ndarray
The ground truth time series dataset.
- incomp_data : numpy.ndarray
The contaminated time series dataset to impute.
- metrics : list of str, optional
List of selected metrics for optimization (default is ['RMSE']).
- algorithm : str, optional
The imputation algorithm to optimize (default is 'cdrec').
- n_calls : int, optional
Number of calls to the objective function (default is 250).
Returns¶
- tuple
A tuple containing the best parameters and their corresponding score.
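Example¶
The same hedged pattern applied to the greedy strategy; n_calls is lowered from its default of 250 purely to shorten the run.
>>> optimizer = Optimization.Greedy()
>>> best_params, best_score = optimizer.optimize(
...     input_data, incomp_data, metrics=["RMSE"], algorithm="cdrec", n_calls=50)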
- class ParticleSwarm[source]¶
Bases:
BaseOptimizer
Particle Swarm Optimization (PSO) strategy for hyperparameters.
- optimize(input_data, incomp_data, metrics, algorithm, n_particles, c1, c2, w, iterations, n_processes)[source]¶
Perform Particle Swarm Optimization for hyperparameters.
Parameters¶
- input_data : numpy.ndarray
The ground truth time series dataset.
- incomp_data : numpy.ndarray
The contaminated time series dataset to impute.
- metrics : list of str
List of selected metrics for optimization.
- algorithm : str
The imputation algorithm to optimize.
- n_particles : int
Number of particles used in PSO.
- c1 : float
Personal (cognitive) learning coefficient of the PSO update.
- c2 : float
Global (social) learning coefficient of the PSO update.
- w : float
Inertia weight of the PSO update.
- iterations : int
Number of iterations for the optimization.
- n_processes : int
Number of processes used during optimization.
Returns¶
- tuple
A tuple containing the best parameters and their corresponding score.
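Example¶
Because this signature defines no defaults, every PSO argument must be passed explicitly; the coefficients below are illustrative placeholders, not recommended settings.
>>> optimizer = Optimization.ParticleSwarm()
>>> best_params, best_score = optimizer.optimize(
...     input_data, incomp_data, metrics=["RMSE"], algorithm="cdrec",
...     n_particles=20, c1=0.5, c2=0.3, w=0.9, iterations=10, n_processes=1)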
- class RayTune[source]¶
Bases:
BaseOptimizer
RayTune optimization strategy for hyperparameters.
- optimize(input_data, incomp_data, metrics=['RMSE'], algorithm='cdrec', n_calls=1, max_concurrent_trials=-1)[source]¶
Perform Ray Tune optimization for hyperparameters.
Parameters¶
- input_data : numpy.ndarray
The ground truth time series dataset.
- incomp_data : numpy.ndarray
The contaminated time series dataset to impute.
- metrics : list of str, optional
List of selected metrics for optimization (default is ['RMSE']).
- algorithm : str, optional
The imputation algorithm to optimize (default is 'cdrec').
- n_calls : int, optional
Number of calls to the objective function (default is 1).
- max_concurrent_trials : int, optional
Number of trials run in parallel, bounded by the available memory, CPUs, and GPUs (default is -1). Increase this value if more resources are available.
Returns¶
- tuple
A tuple containing the best parameters and their corresponding score.
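Example¶
A hedged sketch assuming Ray is installed and that max_concurrent_trials can be left at its default of -1; n_calls=2 is an arbitrary small budget for a quick trial.
>>> optimizer = Optimization.RayTune()
>>> best_params, best_score = optimizer.optimize(
...     input_data, incomp_data, metrics=["RMSE"], algorithm="cdrec", n_calls=2)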
- class SuccessiveHalving[source]¶
Bases:
BaseOptimizer
Successive Halving optimization strategy for hyperparameters.
- optimize(input_data, incomp_data, metrics, algorithm, num_configs, num_iterations, reduction_factor)[source]¶
Perform Successive Halving optimization for hyperparameters.
Parameters¶
- input_data : numpy.ndarray
The ground truth time series dataset.
- incomp_data : numpy.ndarray
The contaminated time series dataset to impute.
- metrics : list of str
List of selected metrics for optimization.
- algorithm : str
The imputation algorithm to optimize.
- num_configs : int
Number of configurations to try.
- num_iterations : int
Number of iterations for the optimization.
- reduction_factor : int
Reduction factor applied to the number of configurations kept after each iteration.
Returns¶
- tuple
A tuple containing the best parameters and their corresponding score.
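Example¶
A hedged sketch with illustrative values: with num_configs=8 and reduction_factor=2, roughly half of the configurations are expected to survive each of the 3 iterations, though the exact bookkeeping is internal to the implementation.
>>> optimizer = Optimization.SuccessiveHalving()
>>> best_params, best_score = optimizer.optimize(
...     input_data, incomp_data, metrics=["RMSE"], algorithm="cdrec",
...     num_configs=8, num_iterations=3, reduction_factor=2)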