imputegap.tools.utils package

The imputegap.tools.utils package provides various utility functions and tools for handling algorithm parameters, evaluation, and other operations in the imputation process.

Submodule Documentation

imputegap.tools.utils module

imputegap.tools.utils.config_contamination(ts, pattern, dataset_rate=0.4, series_rate=0.4, block_size=10, offset=0.1, seed=True, limit=1, shift=0.05, std_dev=0, explainer=False, probabilities=None)

Configure and apply the selected contamination pattern to a time series.

Parameters

ts : TimeSeries

A TimeSeries object containing the dataset.

pattern : str

Type of contamination pattern (e.g., “mcar”, “mp”, “blackout”, “disjoint”, “overlap”, “gaussian”).

dataset_rate, series_rate : float, optional

Contamination rates that set the percentage of missing values (default is 0.4 for each).

block_size : int, optional

Size of the blocks removed in MCAR (default is 10).

Returns

TimeSeries

TimeSeries object containing contaminated data.
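
To make the roles of dataset_rate, series_rate, block_size, offset and seed more concrete, here is a minimal sketch of block-wise MCAR contamination on plain Python lists. It is not ImputeGAP's implementation: the function name contaminate_mcar_sketch, the fixed seed value 42, the reading of offset as a protected leading fraction, and the block-aligned start positions are all assumptions made for illustration.

```python
import math
import random


def contaminate_mcar_sketch(data, dataset_rate=0.4, series_rate=0.4,
                            block_size=10, offset=0.1, seed=True):
    """Hypothetical block-wise MCAR contamination on a list of series."""
    # Assumption: seed=True means "use a fixed seed for reproducibility".
    rng = random.Random(42) if seed else random.Random()
    out = [list(series) for series in data]  # copy so the input stays intact

    # Choose which series receive missing values.
    n_selected = min(len(out), math.ceil(len(out) * dataset_rate))
    selected = rng.sample(range(len(out)), n_selected)

    for idx in selected:
        series = out[idx]
        n = len(series)
        protected = int(n * offset)       # leading values that are never removed
        target = int(n * series_rate)     # number of values to remove
        n_blocks = target // block_size   # only whole blocks are removed
        # Block starts are aligned on block_size so that blocks never overlap.
        starts = list(range(protected, n - block_size + 1, block_size))
        for start in rng.sample(starts, min(n_blocks, len(starts))):
            for pos in range(start, start + block_size):
                series[pos] = math.nan
    return out
```

With five series of 100 values and the default arguments, this sketch marks 40 values as missing in each of two series and leaves the first 10 values of every series untouched.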

imputegap.tools.utils.config_impute_algorithm(incomp_data, algorithm)

Configure and execute the selected imputation algorithm.

Parameters

incomp_data : TimeSeries

TimeSeries object containing the dataset.

algorithm : str

Name of the algorithm.

Returns

BaseImputer

Configured imputer instance with optimal parameters.

imputegap.tools.utils.display_title(title='Master Thesis', aut='Quentin Nater', lib='ImputeGAP', university='University Fribourg')

Display the title and author information.

Parameters

title : str, optional

The title of the thesis (default is “Master Thesis”).

aut : str, optional

The author’s name (default is “Quentin Nater”).

lib : str, optional

The library or project name (default is “ImputeGAP”).

university : str, optional

The university or institution (default is “University Fribourg”).

Returns

None

imputegap.tools.utils.load_parameters(query: str = 'default', algorithm: str = 'cdrec', dataset: str = 'chlorine', optimizer: str = 'b', path=None)

Load default or optimal parameters for algorithms from a TOML file.

Parameters

query : str, optional

‘default’ or ‘optimal’ to load default or optimal parameters (default is “default”).

algorithm : str, optional

Algorithm to load parameters for (default is “cdrec”).

dataset : str, optional

Name of the dataset (default is “chlorine”).

optimizer : str, optional

Optimizer type for optimal parameters (default is “b”).

path : str, optional

Custom file path for the TOML file (default is None).

Returns

tuple

A tuple containing the loaded parameters for the given algorithm.

imputegap.tools.utils.load_share_lib(name='lib_cdrec', lib=True)

Load the shared library based on the operating system.

Parameters

name : str, optional

The name of the shared library (default is “lib_cdrec”).

lib : bool, optional

If True, the function loads the library from the default ‘imputegap’ path; if False, it loads from a local path (default is True).

Returns

ctypes.CDLL

The loaded shared library object.
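
As an illustration of loading a shared library based on the operating system, the sketch below maps the platform name to a conventional file extension and passes the resulting path to ctypes.CDLL. The helper names shared_lib_filename and load_shared_lib_sketch, the directory argument, and the extension mapping are assumptions for illustration, not the library's actual lookup logic.

```python
import ctypes
import os
import platform


def shared_lib_filename(name="lib_cdrec", system=None):
    """Return a platform-specific file name for a shared library (hypothetical helper)."""
    system = system or platform.system()
    # Assumed convention: .dll on Windows, .dylib on macOS, .so elsewhere.
    extension = {"Windows": ".dll", "Darwin": ".dylib"}.get(system, ".so")
    return name + extension


def load_shared_lib_sketch(directory, name="lib_cdrec"):
    """Load the library found in `directory` and return a ctypes.CDLL handle."""
    path = os.path.join(directory, shared_lib_filename(name))
    return ctypes.CDLL(path)
```

Separating the file-name decision from the actual load keeps the platform logic checkable without a compiled library on disk.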

imputegap.tools.utils.save_optimization(optimal_params, algorithm='cdrec', dataset='', optimizer='b', file_name=None)

Save the optimization parameters to a TOML file for later use without recomputing.

Parameters

optimal_params : dict

Dictionary of the optimal parameters.

algorithm : str, optional

The name of the imputation algorithm (default is ‘cdrec’).

dataset : str, optional

The name of the dataset (default is an empty string).

optimizer : str, optional

The name of the optimizer used (default is ‘b’).

file_name : str, optional

The name of the TOML file to save the results (default is None).

Returns

None
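
The idea of persisting optimal parameters as TOML can be sketched with the standard library alone. This is a rough illustration, not the library's code: the function name save_params_sketch, the default file-name scheme, the one-table-per-algorithm layout, and the example keys are all assumptions. Unlike the documented function, the sketch returns the path it wrote so the result is easy to inspect.

```python
import os
import tempfile


def save_params_sketch(optimal_params, algorithm="cdrec", dataset="",
                       optimizer="b", file_name=None):
    """Write a flat dict of parameters as one TOML table (hypothetical layout)."""
    if file_name is None:
        # Hypothetical naming scheme, chosen only for this example.
        file_name = f"optimization_{dataset}_{optimizer}_{algorithm}.toml"

    lines = [f"[{algorithm}]"]
    for key, value in optimal_params.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"  # TOML booleans are lowercase
        elif isinstance(value, str):
            rendered = f'"{value}"'                  # simple strings only, no escaping
        else:
            rendered = repr(value)                   # integers and floats
        lines.append(f"{key} = {rendered}")

    with open(file_name, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return file_name


# Usage: write illustrative parameters to a temporary directory.
target = os.path.join(tempfile.mkdtemp(), "opt.toml")
save_params_sketch({"rank": 3, "epsilon": 0.01, "iterations": 100}, file_name=target)
```

The sketch handles only numbers, booleans and simple strings; nested values or strings containing quotes would need a real TOML writer.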

imputegap.tools.utils.search_path(set_name='test')

Find the correct path for loading test files.

Parameters

set_name : str, optional

Name of the dataset (default is “test”).

Returns

str

The correct file path for the dataset.

imputegap.tools.utils.verification_limitation(percentage, low_limit=0.01, high_limit=1.0)

Format and verify that the percentage given by the user is within acceptable bounds.

Parameters

percentage : float

The percentage value to be checked and potentially adjusted.

low_limit : float, optional

The lower limit of the acceptable percentage range (default is 0.01).

high_limit : float, optional

The upper limit of the acceptable percentage range (default is 1.0).

Returns

float

Adjusted percentage based on the limits.

Raises

ValueError

If the percentage is outside the accepted limits.

Notes

  • If the percentage is between 1 and 100, it will be divided by 100 to convert it to a decimal format.

  • If the percentage is outside the low and high limits and cannot be converted this way, the function raises a ValueError.
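
The documented behaviour can be sketched as a small standalone function. This is an illustration under stated assumptions rather than the library's source: the name verify_percentage_sketch is hypothetical, out-of-range values are assumed to raise ValueError as the Raises section states, and a converted value is not checked against the limits again.

```python
def verify_percentage_sketch(percentage, low_limit=0.01, high_limit=1.0):
    """Return a percentage as a decimal fraction, or raise if it is unusable."""
    # Already within the accepted range: return it unchanged.
    if low_limit <= percentage <= high_limit:
        return percentage
    # Looks like a value on the 1-100 scale: convert it to a decimal.
    if 1 <= percentage <= 100:
        return percentage / 100
    # Assumption: anything else is rejected, following the Raises section.
    raise ValueError(
        f"Percentage {percentage} is outside the accepted limits "
        f"[{low_limit}, {high_limit}]."
    )
```

With the default limits, the sketch returns 0.4 for both 0.4 and 40, while 150 or 0.001 raise ValueError. A value of exactly 1 is returned unchanged because it already lies within the default limits.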