
### Blending Independent Components and Principal Components Analysis 3.1 Characteristics of PCA


3.1 Characteristics of PCA

Earlier, we noted some similarities between ICA and PCA but also some differences. In particular, ICA focuses on ‘independence’, ‘non-Normality’ and ‘lack of complexity’, and assumes that source signals exhibit these features, whereas PCA focuses merely on lack of correlation. Indeed, PCA will even ‘unmix’ pure Gaussian (i.e. normally distributed) signals. Or rather, it will decompose multiple Gaussian output signals into some presumed orthogonal Gaussian input signals, which can be ordered so that those higher up the ordering explain more of the variability in the output signal ensemble than those lower down.

PCA involves calculating the eigenvectors and eigenvalues of the covariance matrix, i.e. the matrix of covariances between the different signals (which coincides with the matrix of correlation coefficients only if each signal has first been standardised to unit variance). Usually, it would be assumed that the covariance matrix has been calculated in a manner that gives equal weight to each data point. However, this is not essential; we could equally use a computational approach in which different weights are given to different data points, e.g. an exponentially decaying weighting that gives greater weight to more recent observations.
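The weighting idea can be illustrated with a minimal NumPy sketch. The function names `weighted_covariance` and `exponential_weights`, the `half_life` parameter and the choice to normalise the weights to sum to one (which gives a ‘population’-style estimate) are all assumptions made for illustration, not conventions taken from the text.

```python
import numpy as np

def weighted_covariance(x, weights=None):
    """Covariance matrix of the rows of x (signals by observations).

    weights: one non-negative weight per observation (hypothetical
    convention); equal weights are used if none are supplied.
    """
    x = np.asarray(x, dtype=float)
    n_obs = x.shape[1]
    if weights is None:
        weights = np.ones(n_obs)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalise so the weights sum to one
    mean = x @ w                         # weighted mean of each signal
    centred = x - mean[:, None]
    return (centred * w) @ centred.T     # weighted average of outer products

def exponential_weights(n_obs, half_life):
    """Weights that halve for every `half_life` steps back in time.

    Assumes the last column of the data is the most recent observation.
    """
    age = np.arange(n_obs - 1, -1, -1)   # age 0 for the latest observation
    return 0.5 ** (age / half_life)
```

With equal weights this reduces to the usual equally weighted (population) covariance matrix; passing `exponential_weights(n_obs, half_life)` instead tilts the estimate towards recent observations, and PCA can then be applied to the resulting matrix in exactly the same way.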

For $n$ different output signals $x_i(t)$, with covariance matrix, $V$, between the output signals, PCA searches for the $n$ (some possibly degenerate) eigenvectors, $q$, satisfying the following matrix equation for some scalar $\lambda$:

$$Vq = \lambda q$$

For a non-negative definite symmetric matrix (as $V$ should be if it actually corresponds to a covariance matrix), the $n$ values of $\lambda$ are all non-negative. We can therefore order the eigenvalues in descending order $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0$. The corresponding eigenvectors $q_i$ are orthogonal, i.e. have $q_i^T q_j = 0$ if $i \ne j$, and are also typically normalised so that $|q_i|^2 = q_i^T q_i = 1$ (i.e. so that they have ‘unit length’). For any $\lambda_i$’s that are equal, we need to choose a corresponding number of orthonormal eigenvectors that span the relevant subspace.
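As a rough sketch of this eigenvalue step, the following uses NumPy's `numpy.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order with unit-length eigenvectors as columns. The helper name `pca_eigen`, the variable names and the small example matrix are illustrative assumptions.

```python
import numpy as np

def pca_eigen(cov):
    """Eigenvalues in descending order and matching eigenvectors as rows."""
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # ascending order, column eigenvectors
    order = np.argsort(eigenvalues)[::-1]            # indices for descending order
    lam = eigenvalues[order]
    Q = eigenvectors[:, order].T                     # row i is the i-th eigenvector
    return lam, Q

# Hypothetical 3x3 covariance matrix (symmetric, non-negative definite)
V = np.array([[4.0, 2.0, 0.0],
              [2.0, 3.0, 0.0],
              [0.0, 0.0, 1.0]])
lam, Q = pca_eigen(V)

# Each row of Q satisfies the eigenvector equation, and the rows are orthonormal
residuals = [np.abs(V @ Q[i] - lam[i] * Q[i]).max() for i in range(3)]
gram = Q @ Q.T                                       # should be close to the identity
```

Where eigenvalues are repeated, `eigh` still returns an orthonormal set, but the particular basis chosen for the shared subspace is arbitrary, in line with the remark above.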

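By $q_i$ we mean the vector $q_i = (q_{i,1}, \dots, q_{i,n})^T$ such that the signal corresponding to eigenvalue $\lambda_i$ is expressible as:

$$g_i(t) = \sum_{j=1}^{n} q_{i,j} x_j(t)$$

If $Q$ is the matrix with coefficients $q_{i,j}$ then the orthonormalisation convention adopted above means that $Q Q^T = I$ where $I$ is the identity matrix. This means that $Q^{-1} = Q^T$ and hence we may also write the output signals as a linear combination of the eigenvector signals as follows (in each case up to a constant value, since the covariances do not depend on means of series):

$$x_i(t) = \sum_{j=1}^{n} q_{j,i} g_j(t)$$

Additionally, we have $\mathrm{cov}(g_i, g_j) = 0$ if $i \ne j$ and $\mathrm{cov}(g_i, g_j) = \lambda_i$ if $i = j$. We also note that if $V_{i,j}$ are the coefficients of the covariance matrix $V$ then each individual $V_{i,i}$ is (here assuming that we have been using ‘sample’ rather than ‘population’ values for variances, with $m$ observations and with $\bar{x}_i$ and $\bar{g}_j$ denoting the means of the corresponding series):

$$V_{i,i} = \frac{1}{m-1}\sum_{t}\left(x_i(t)-\bar{x}_i\right)^2 = \frac{1}{m-1}\sum_{t}\Big(\sum_{j} q_{j,i}\left(g_j(t)-\bar{g}_j\right)\Big)^2 = \sum_{j} q_{j,i}^2 \lambda_j$$

Hence the sum of the variances of each output signal, i.e. the trace of the covariance matrix, satisfies:

$$\mathrm{tr}(V) = \sum_{i} V_{i,i} = \sum_{j} \lambda_j \Big(\sum_{i} q_{j,i}^2\Big) = \sum_{j} \lambda_j$$

We can interpret this as indicating that the aggregate variability of the output signals (i.e. the sum of their individual variabilities) is equal to the sum of the eigenvalues. Hence the larger the eigenvalue, the more the corresponding eigenvector signal ‘contributes’ to the aggregate variability across the ensemble of possible output signals.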
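These relationships can be checked numerically. The sketch below is illustrative only: the mixing matrix, random seed, sample size and variable names are assumptions, and the output signals are built by mixing Gaussian sources. It computes the eigenvector signals, confirms that they are uncorrelated with variances equal to the eigenvalues, rebuilds the (mean-adjusted) output signals, and compares the trace of the covariance matrix with the sum of the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical example: three output signals formed by mixing Gaussian sources
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.2, 1.0, 0.3],
                   [0.0, 0.4, 1.0]])
x = mixing @ rng.normal(size=(3, 500))       # rows are signals, columns are observations

V = np.cov(x)                                # 'sample' covariance (divides by m - 1)
eigenvalues, eigenvectors = np.linalg.eigh(V)
order = np.argsort(eigenvalues)[::-1]
lam = eigenvalues[order]                     # eigenvalues in descending order
Q = eigenvectors[:, order].T                 # row i is the i-th eigenvector

centred = x - x.mean(axis=1, keepdims=True)  # remove means (the 'constant value')
g = Q @ centred                              # eigenvector (principal component) signals

cov_g = np.cov(g)                            # expected to be close to diag(lam)
reconstructed = Q.T @ g                      # expected to recover the centred outputs
variances_from_eigen = (Q ** 2).T @ lam      # variance of each output signal from the eigen-decomposition
explained = lam / lam.sum()                  # share of aggregate variability per component
```

The `explained` vector gives one way of reading the final remark above: each entry is the proportion of the aggregate variability attributable to the corresponding eigenvector signal.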