
### Blending Independent Components and Principal Components Analysis 2.3 The underlying rationale for ICA


2.3 The underlying rationale for ICA

ICA is based on the generic yet often physically realistic assumption that input signals coming from different underlying physical processes will be largely independent of each other. ICA aims to decompose output signals into mixtures (linear combinations) of input signals that are as independent of each other as possible. Several variants exist, which we also describe below, in which ‘independent’ is replaced by an alternative statistical property that we might also expect to differentiate between input signals.

ICA is related to more traditional methods of analysing large data sets believed to involve linear combinations of underlying factors, such as principal components analysis (PCA) and factor analysis (FA). However, it arguably differs from them in important ways. ICA seeks to find a set of independent source signals. In contrast, PCA and FA seek to find a set of signals that are merely uncorrelated with each other. By uncorrelated we mean that the correlation coefficients between the different supposed input signals are zero. Lack of correlation is potentially a much weaker property than independence. Independence implies lack of correlation, but lack of correlation does not imply independence. The correlation coefficient in effect ‘averages’ the correlation across the entire distributional form. For example, two signals might be strongly positively correlated in one tail, strongly negatively correlated in the other tail, and show little correspondence in the middle of the distribution. The correlation between them, as measured by their correlation coefficient, might thus be zero, but it would be wrong to conclude from this that the behaviour of the two signals was independent (particularly, in this instance, in the tails of the distributional form).
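The tail example above can be illustrated numerically. The following is a minimal sketch, assuming one signal is standard normal and the other is simply its square (an illustrative choice, not taken from the text); the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# x is symmetric about zero; y is completely determined by x,
# so the two signals are as far from independent as possible.
x = rng.standard_normal(200_000)
y = x**2

# Correlation coefficient over the whole distribution.
corr_all = np.corrcoef(x, y)[0, 1]

# Correlation coefficients restricted to each tail of x.
upper = x > 1.0
lower = x < -1.0
corr_upper = np.corrcoef(x[upper], y[upper])[0, 1]
corr_lower = np.corrcoef(x[lower], y[lower])[0, 1]

print(f"whole sample: {corr_all:+.3f}")
print(f"upper tail:   {corr_upper:+.3f}")
print(f"lower tail:   {corr_lower:+.3f}")
```

In this construction the whole-sample coefficient should be close to zero, while the two tail coefficients should be strongly positive and strongly negative respectively, so the overall figure averages away a dependence that is plainly present.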

How this works in practice can perhaps best be introduced, as in Stone (2004), by using the example of two people speaking into two different microphones, the aim of the exercise being to differentiate, as far as possible, between the two voices. The microphones give different weights to the different voices (e.g. there might be a muffler between one of the speakers and one of the microphones). To simplify matters the microphones are assumed to be equidistant from each source, so that phase differentials are not relevant to the problem at hand. ICA and related techniques rely on the following observations:

(a) The two input signals, i.e. the two individual voices, are likely to be largely independent of each other when examined at fine time intervals. However, the two output signals, i.e. the signals coming from the microphones, will not be as independent, since they involve mixtures (albeit differently weighted) of the same underlying input signals.

(b) If histograms of the amplitudes of each voice (when examined at these fine time intervals) are plotted, they will most probably differ from the traditional bell-shaped (normal) histogram corresponding to random noise. Conversely, the signal mixtures are likely to be closer to normal in nature.

(c) The temporal complexity of any mixture is typically greater than (or equal to) that of its simplest, i.e. least complex, constituent source signal.
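Observations (a) and (b) can be checked on simulated data. The sketch below is illustrative only: it assumes two independent Laplace-distributed sources as stand-ins for the voices and an arbitrary 2×2 mixing matrix as a stand-in for the microphone weights, and it uses excess kurtosis as a simple measure of non-normality. All names and parameter values are hypothetical, and observation (c) is not addressed.

```python
import numpy as np

def excess_kurtosis(v):
    """Excess kurtosis: zero for a normal distribution."""
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2) ** 2 - 3.0

rng = np.random.default_rng(1)
n = 200_000

# Two independent, unit-variance, heavy-tailed sources (assumed stand-ins for voices).
sources = rng.laplace(0.0, 1.0 / np.sqrt(2.0), size=(2, n))

# Each row gives the weights one hypothetical microphone applies to the two voices.
mixing = np.array([[1.0, 0.6],
                   [0.5, 1.0]])
mixtures = mixing @ sources

# Observation (a): sources are nearly uncorrelated, mixtures are not.
corr_sources = np.corrcoef(sources)[0, 1]
corr_mixtures = np.corrcoef(mixtures)[0, 1]

# Observation (b): mixtures are closer to normal than the sources.
kurt_sources = [excess_kurtosis(s) for s in sources]
kurt_mixtures = [excess_kurtosis(m) for m in mixtures]

print("correlation, sources: ", round(corr_sources, 3))
print("correlation, mixtures:", round(corr_mixtures, 3))
print("excess kurtosis, sources: ", np.round(kurt_sources, 2))
print("excess kurtosis, mixtures:", np.round(kurt_mixtures, 2))
```

Correlation is used here only as a convenient proxy for dependence; as noted earlier, zero correlation alone would not establish independence.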

These observations lead to the following algorithm for source signal extraction:

If source signals have some property X and signal mixtures do not (or have less of it) then given a set of signal mixtures we should attempt to extract signals with as much X as possible, since these extracted signals are then likely to correspond as closely as possible to the original source signals.

Different variants of ICA and its related techniques ‘unmix’ output signals, thus aiming to recover the original input signals, by substituting ‘independence’, ‘non-normality’ and ‘lack of complexity’ for X in the above prescription.
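As a rough illustration of this prescription with X taken to be ‘non-normality’, the sketch below mixes two simulated sources, whitens the mixtures, and then searches by brute force for the rotation that makes the extracted signals as non-normal as possible (measured by absolute excess kurtosis). This is a toy two-signal construction under assumed Laplace sources and an assumed mixing matrix, not the algorithm of any particular ICA package; all names are hypothetical.

```python
import numpy as np

def excess_kurtosis(v):
    """Excess kurtosis: zero for a normal distribution."""
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2) ** 2 - 3.0

rng = np.random.default_rng(2)
n = 50_000

# Assumed sources and mixing weights (illustrative only).
sources = rng.laplace(0.0, 1.0 / np.sqrt(2.0), size=(2, n))
mixing = np.array([[1.0, 0.6],
                   [0.5, 1.0]])
mixtures = mixing @ sources

# Step 1: whiten the mixtures so they are uncorrelated with unit variance.
centred = mixtures - mixtures.mean(axis=1, keepdims=True)
eigvals, eigvecs = np.linalg.eigh(np.cov(centred))
whitened = (eigvecs / np.sqrt(eigvals)).T @ centred

# Step 2: after whitening, only a rotation remains to be found.
def rotate(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]]) @ whitened

# Step 3: choose the rotation whose outputs have the most 'X' (non-normality).
angles = np.linspace(0.0, np.pi / 2.0, 181)
scores = [sum(abs(excess_kurtosis(row)) for row in rotate(t)) for t in angles]
best_angle = angles[int(np.argmax(scores))]
recovered = rotate(best_angle)

# Compare each recovered signal with the true sources (order and sign are arbitrary).
match = np.abs(np.corrcoef(recovered, sources)[:2, 2:])
print(np.round(match, 3))
```

Each row of `match` should contain one entry close to 1, indicating that each extracted signal lines up with one of the original sources up to sign and ordering, which this kind of procedure cannot determine.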