Benchmarking ICA algorithms

Alexios Galanos

Code

Sys.setenv(TZ = "UTC")
# load required packages
library(iTensor)
library(tsmarch)
library(flextable)
library(data.table)
library(xts)

Motivation

The Generalized Orthogonal GARCH (GOGARCH) model uses Independent Component Analysis (ICA) to extract independent sources of volatility from multivariate time series, enabling more precise and interpretable modeling of their joint dynamic structure.

In the tsmarch package there are 2 ICA algorithms currently implemented:

The RADICAL algorithm of Learned-Miller and Fisher (2003) which is based on an efficient entropy estimator (due to Vasicek (1976)) which is robust to outliers and requires no strong characterization assumptions about the data generating process. However, despite best efforts to speed up the original implementation using multi-threading, this remains a relatively slow algorithm and also suffers from convergence issues in high dimensions.
The FASTICA algorithm of Hyvarinen (1999) is a fixed point iteration method which has proven to be extremely fast and very suitable for high dimensional systems.

Both of the algorithms are implemented locally in tsmarch, and the FASTICA is a direct translation of the Matlab code of the original authors, retaining a lot of additional options not present in other translations in R. Both of the algorithms provide exactly the same type of interface for the whitening stage such as different options for the type of covariance to use.

This short blog post looks at extending the benchmark provided in the iTensor package vignette in order to include the 2 ICA implementations in the tsmarch package.

Benchmark

The benchmark uses a set of 4 data sets with known independent sources (S) and uses the Correlation Index of (Sobhani et al. 2022) to evaluate the performance of each ICA algorithm in extracting the true signals from the mixed series (X).

The 4 data sets are generated from the following processes:

ICA with time-independent sub-Gaussian data (tisubg)
ICA with time-independent super-Gaussian data (tisupg)
ICA with data mixed with signals having no time dependence and different kurtosis (tidkurt)
ICA with time-dependent data (td)

Code

data <- list(iTensor::toyModel("ICA_Type1"),
             iTensor::toyModel("ICA_Type2"),
             iTensor::toyModel("ICA_Type3"),
             iTensor::toyModel("ICA_Type4"))
allICA <- function(X, J){
  # Classic ICA Methods
  out.FastICA <- ICA(X = X, J = J, algorithm = "FastICA")
  out.InfoMax <- ICA(X = X, J = J, algorithm = "InfoMax")
  out.ExtInfoMax <- ICA(X = X, J = J, algorithm = "ExtInfoMax")
  out.JADE <- ICA2(X = X, J = J, algorithm = "JADE")
  out.tsmarch_fastica <- fastica(X, components = J, fun = "tanh")
  out.tsmarch_radical <- radical(X, components = J, augment = TRUE, replications = 100)
  list(out.FastICA = out.FastICA, out.InfoMax = out.InfoMax,
       out.ExtInfoMax = out.ExtInfoMax, out.JADE = out.JADE,
       out.tsmarch_fastica = out.tsmarch_fastica,
       out.tsmarch_radical = out.tsmarch_radical)
}
data_names <- c("tisubg","tisupg","tidkurt","td")
out <- lapply(1:4, function(i) {
  allICA(X = data[[i]]$X_observed, J = 3)
})
corr_index <- lapply(1:4, function(i) {
  tmp <- lapply(out[[i]], function(x, S) {CorrIndex(cor(S, Re(x$S)))}, S = data[[i]]$S_true)
  Name <- gsub("out.", "", names(tmp))
  Value <- round(unlist(tmp), 3)
  data.table(Algorithm = Name, DGP = data_names[i], Value = Value)
})
tab <- dcast(rbindlist(corr_index), Algorithm~DGP, value.var = "Value")
tab[Algorithm == "tsmarch_fastica", Algorithm := "FASTICA (tsmarch)"]
tab[Algorithm == "tsmarch_radical", Algorithm := "RADICAL (tsmarch)"]
tab <- flextable(tab) |> separate_header() |> flextable::colformat_double(digits = 3)
tab <- tab |> add_footer_row(colwidths = 5, values = as_paragraph(
  "Notes to table: td (time dependent), tidkurt (data mixed with signals having no time dependence and different kurtosis), tisupg (time-independent super-Gaussian data), tisubg (time-independent sub-Gaussian data). The values represent the Correlation Index of Sobhani (2022) (the closer to 0, the better the performance)."))
tab <- tab |> align(i = 1, j = 1:5, align = "justify", part = "footer")
tab <- tab |> set_caption(caption = "ICA Benchmarks", html_escape = TRUE)
tab <- tab |> set_table_properties(layout = "autofit")
tab

Table 1: ICA Benchmark

Algorithm	td	tidkurt	tisubg	tisupg
ExtInfoMax	0.081	0.213	0.263	0.268
FastICA	0.081	0.230	0.263	0.268
InfoMax	8.026	3.799	7.439	0.322
JADE	0.080	0.408	0.221	0.374
FASTICA (tsmarch)	0.080	0.213	0.267	0.259
RADICAL (tsmarch)	0.106	0.293	0.173	0.342
Notes to table: td (time dependent), tidkurt (data mixed with signals having no time dependence and different kurtosis), tisupg (time-independent super-Gaussian data), tisubg (time-independent sub-Gaussian data). The values represent the Correlation Index of Sobhani (2022) (the closer to 0, the better the performance).

The results of the small toy benchmark show that both ICA implementations in the tsmarch package are very competitive in terms of performance providing a certain degree of confidence.

GOGARCH Demo and Discussion

In this next example, we illustrate the use of the 2 algorithms in the GOGARCH model using dimensionality reduction and compare their weighted output.

Code

data(globalindices)
train_set <- 1:1600
series <- 1:10
y <- as.xts(globalindices[, series])
model_fastica <- gogarch_modelspec(y, distribution = "nig", model = "gjrgarch", ica = "fastica", components = 5) |> estimate(seed = 1010, stabilization = TRUE, fun = "skew", fine = "tanh")
model_radical <- gogarch_modelspec(y, distribution = "nig", model = "gjrgarch", ica = "radical", components = 5) |> estimate(seed = 1010, augment = TRUE)
w_fastica <- tsaggregate(model_fastica, weights = rep(1/10, 10))
w_radical <- tsaggregate(model_radical, weights = rep(1/10, 10))
par(mfrow = c(2,1), mar = c(2,2,2,2))
plot(index(y), as.numeric(w_fastica$sigma), type = "l", ylab = "", xlab = "", main = "Weighted Sigma[t]", xaxt = "n", cex.main = 0.8, cex.axis = 0.8)
lines(index(y), as.numeric(w_radical$sigma), col = 2, lty = 2)
grid()
legend("topleft", c("FASTICA","RADICAL"), col = 1:2, lty = 1:2, bty = "n", cex = 0.8)
plot(index(y), as.numeric(w_fastica$skewness), type = "l", ylab = "", xlab = "", main = "Weighted Skewness[t]", cex.main = 0.8, cex.axis = 0.8)
lines(index(y), as.numeric(w_radical$skewness), col = 2, lty = 2)
grid()

Both algorithms give reasonably close results. However, the FASTICA algorithm has a lot more options, particularly with respect to the choice of non-linearity (fun) and fine tuning (fine) functions. The choices, which relate to assumptions about the underlying source densities, are:

pow3 is optimal for sources with power exponential density,
skew is optimal for sources with density satisfying \(f(x) \propto \exp(ax^3 + bx^2 + cx)\),
gauss is optimal for sources with density satisfying \(f(x) \propto \exp(a\exp(-x^2/2)+bx^2+cx)\),
tanh is optimal for sources with symmetric two-group Gaussian location mixtures.

Additionally, both the tanh and gauss functions take additional parameters which determine the degree of separation of the sources, depending on the underlying densities. A more thorough discussion is available in Virta and Nordhausen (2017).

Based on some experimentation, here are some suggestions when choosing between these algorithms:

When components <= 30, choose RADICAL with augment set to TRUE. For speed, use RcppParallel::setThreadOptions as part of the code is written in C++ and benefits from multi-threading.
When components> 30, use FASTICA, but experiment with different non-linearity functions and the other arguments provided by the algorithm.

Back to top

References

Hyvarinen, Aapo. 1999. “Fast and Robust Fixed-Point Algorithms for Independent Component Analysis.” IEEE Transactions on Neural Networks 10 (3): 626–34.

Learned-Miller, Erik G, and John W Fisher. 2003. “ICA Using Spacings Estimates of Entropy.” The Journal of Machine Learning Research 4: 1271–95.

Sobhani, Elaheh, Pierre Comon, Christian Jutten, and Massoud Babaie-Zadeh. 2022. “CorrIndex: A Permutation Invariant Performance Index.” Signal Processing 195: 108457.

Vasicek, Oldrich. 1976. “A Test for Normality Based on Sample Entropy.” Journal of the Royal Statistical Society Series B: Statistical Methodology 38 (1): 54–59.

Virta, Joni, and Klaus Nordhausen. 2017. “On the Optimal Non-Linearities for Gaussian Mixtures in FastICA.” In Latent Variable Analysis and Signal Separation: 13th International Conference, LVA/ICA 2017, Grenoble, France, February 21-23, 2017, Proceedings 13, 427–37. Springer.

Citation

BibTeX citation:

@online{galanos2024,
  author = {Galanos, Alexios},
  title = {Benchmarking {ICA} Algorithms},
  date = {2024-12-09},
  url = {https://www.nopredict.com/blog/posts/2024-12-09-ica-benchmark/},
  langid = {en}
}

For attribution, please cite this work as:

Galanos, Alexios. 2024. “Benchmarking ICA Algorithms.” December 9, 2024. https://www.nopredict.com/blog/posts/2024-12-09-ica-benchmark/.