imputegap.algorithms.cdrec package

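The imputegap.algorithms.cdrec package provides the CDRec imputation algorithm for recovering missing values in time series data. It exposes two functions: cdrec, the Python entry point, and native_cdrec, which runs the algorithm with native C++ support.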

Submodules

Modules

imputegap.algorithms.cdrec.cdrec(incomp_data, truncation_rank, iterations, epsilon, logs=True, lib_path=None)[source]

CDRec algorithm for matrix imputation of missing values using Centroid Decomposition.

Parameters

incomp_data : numpy.ndarray

The input matrix with contamination (missing values represented as NaNs).

truncation_rank : int

The truncation rank for matrix decomposition (must be at least 1 and smaller than the number of series).

iterations : int

The maximum number of iterations allowed for the algorithm.

epsilon : float

The stopping threshold: iteration ends once the difference between successive iterations falls below this value.

logs : bool, optional

Whether to log the execution time (default is True).

lib_path : str, optional

Custom path to the shared library file (default is None).

Returns

numpy.ndarray

The imputed matrix with missing values recovered.

Example

>>> from imputegap.algorithms.cdrec import cdrec
>>> recov_data = cdrec(incomp_data=incomp_data, truncation_rank=1, iterations=100, epsilon=0.000001, logs=True)
>>> print(recov_data)
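
The parameter constraints above can be checked before calling cdrec. The following is a minimal sketch, not part of imputegap: check_cdrec_inputs is a hypothetical helper, and it assumes that each column of incomp_data holds one series and that a rank of 1 is allowed (as in the example above). Adjust the axis if your data is laid out the other way.

```python
import numpy as np


def check_cdrec_inputs(incomp_data, truncation_rank, iterations, epsilon):
    """Hypothetical pre-flight check (not an imputegap function).

    Returns the number of missing (NaN) entries if all arguments look valid.
    """
    if not isinstance(incomp_data, np.ndarray) or incomp_data.ndim != 2:
        raise ValueError("incomp_data must be a 2-D numpy.ndarray")

    # Assumption: one series per column.
    n_series = incomp_data.shape[1]
    if not 1 <= truncation_rank < n_series:
        raise ValueError(
            f"truncation_rank must be in [1, {n_series - 1}], got {truncation_rank}"
        )
    if iterations < 1:
        raise ValueError("iterations must be a positive integer")
    if epsilon <= 0:
        raise ValueError("epsilon must be a positive float")

    return int(np.isnan(incomp_data).sum())


# A small contaminated matrix: missing values are represented as NaN.
incomp_data = np.array(
    [
        [1.0, 2.0, 3.0],
        [np.nan, 4.0, 6.0],
        [3.0, np.nan, 9.0],
        [4.0, 8.0, 12.0],
    ]
)

n_missing = check_cdrec_inputs(
    incomp_data, truncation_rank=1, iterations=100, epsilon=1e-6
)
```

Failing early in Python gives a readable error message instead of an assertion or crash inside the native library.
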
imputegap.algorithms.cdrec.native_cdrec(__py_matrix, __py_rank, __py_epsilon, __py_iterations)[source]

Perform matrix imputation using the CDRec algorithm with native C++ support.

Parameters

__py_matrix : numpy.ndarray

The input matrix with missing values (NaNs).

__py_rank : int

The truncation rank for matrix decomposition (must be greater than 0 and less than the number of columns).

__py_epsilon : float

The stopping threshold: iteration ends once the difference between successive iterations falls below this value.

__py_iterations : int

The maximum number of allowed iterations for the algorithm.

Returns

numpy.ndarray

The recovered matrix after imputation.
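
To illustrate how the rank, epsilon, and iteration parameters interact, here is a pure-NumPy sketch of centroid-decomposition-based recovery. It is an illustration under stated assumptions, not the native C++ implementation: the function names (find_sign_vector, centroid_decomposition, cd_recover) are hypothetical, the sign vector is found with a simple greedy local search, missing values are initialized with column means, and the stopping test uses the Euclidean norm of the change on the missing entries. The library may differ on any of these points.

```python
import numpy as np


def find_sign_vector(x):
    """Greedy local search for z in {-1, +1}^n that locally maximizes ||x.T @ z||."""
    n = x.shape[0]
    z = np.ones(n)
    row_norms_sq = np.einsum("ij,ij->i", x, x)
    tol = 1e-12 * max(1.0, float(row_norms_sq.sum()))
    for _ in range(100 * n):  # safety cap; the search normally stops much earlier
        # v[i] = x_i . (sum over j != i of z_j * x_j)
        v = x @ (x.T @ z) - row_norms_sq * z
        # Flipping z[i] changes ||x.T @ z||^2 by 4 * gain[i].
        gain = -z * v
        i = int(np.argmax(gain))
        if gain[i] <= tol:
            break
        z[i] = -z[i]
    return z


def centroid_decomposition(x, rank):
    """Return (L, R) such that x is approximated by L @ R.T with `rank` components."""
    residual = np.array(x, dtype=float)
    n, m = residual.shape
    L = np.zeros((n, rank))
    R = np.zeros((m, rank))
    for k in range(rank):
        z = find_sign_vector(residual)
        c = residual.T @ z  # centroid vector
        norm = np.linalg.norm(c)
        if norm == 0.0:  # nothing left to extract
            break
        R[:, k] = c / norm
        L[:, k] = residual @ R[:, k]
        residual -= np.outer(L[:, k], R[:, k])
    return L, R


def cd_recover(incomp, rank, iterations=100, epsilon=1e-6):
    """Iteratively refine the missing entries from a truncated decomposition."""
    mask = np.isnan(incomp)
    x = np.array(incomp, dtype=float)
    # Assumption: every column has at least one observed value.
    col_means = np.nanmean(incomp, axis=0)
    x[mask] = np.take(col_means, np.nonzero(mask)[1])
    for _ in range(iterations):
        L, R = centroid_decomposition(x, rank)
        approx = L @ R.T
        change = np.linalg.norm(approx[mask] - x[mask])
        x[mask] = approx[mask]  # observed entries are never overwritten
        if change < epsilon:
            break
    return x


# Toy input: a rank-1 matrix with one missing value.
truth = np.outer(np.linspace(1.0, 2.0, 8), np.array([1.0, 2.0, 3.0]))
incomp = truth.copy()
incomp[3, 1] = np.nan
recovered = cd_recover(incomp, rank=1, iterations=100, epsilon=1e-6)
```

A lower rank keeps only the strongest shared structure across series, while epsilon and iterations together bound how long the refinement runs: the loop ends at whichever limit is reached first.
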

References

Khayati, M., Cudré-Mauroux, P. & Böhlen, M.H. Scalable recovery of missing blocks in time series with high and low cross-correlations. Knowl Inf Syst 62, 2257–2280 (2020). https://doi.org/10.1007/s10115-019-01421-7