- given some data, you transform it to a new coordinate system
- such that the greatest variance comes to lie on the 1st coordinate axis
- the second greatest variance comes to lie on the 2nd coordinate axis
- and so on

- if your data points are N-dimensional and you use the new N-dimensional PCA representation, the data points are described in a coordinate system that better fits the distribution of your data in the N-dimensional space
- if you describe the data using only the first M ≪ N dimensions, i.e., its projection onto the first M principal axes, you compress your data!
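A minimal sketch of this compression idea in Python/NumPy, using made-up toy data with N = 3 and M = 2 (the data, names, and numbers here are illustrative assumptions, not from the original notes):

```python
import numpy as np

# Toy data: 200 points in N = 3 dimensions with very different variances
# per axis (hypothetical example data)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.1])

# Center the data and compute the principal axes as the eigenvectors of
# the sample covariance matrix
Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)     # eigh returns ascending order
order = np.argsort(eigvals)[::-1]        # re-sort: largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Compress: keep only the projection onto the first M << N principal axes
M = 2
Z = Xc @ eigvecs[:, :M]                        # (200, 2) compressed data
X_rec = Z @ eigvecs[:, :M].T + X.mean(axis=0)  # approximate reconstruction
print(Z.shape, np.mean((X - X_rec) ** 2))      # small reconstruction error
```

Since the dropped third axis carries almost no variance, the 2-D representation reconstructs the original 3-D points with only a small error.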

- the principal components correspond to the Eigenvectors of the covariance matrix
- so compute the Eigenvectors of the data covariance matrix
- e.g.
- using the QR-algorithm
- using the Eigendecomposition
- using the Singular Value Decomposition (SVD) (see video below to see how to compute the Eigendecomposition using SVD)
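A sketch of the last two routes in Python/NumPy, showing that the eigendecomposition of the covariance matrix and the SVD of the centered data matrix give the same principal components (toy random data with deliberately distinct per-axis variances is an assumption made here for illustration):

```python
import numpy as np

# Hypothetical toy data: 100 points in 4 dimensions, scaled so the
# eigenvalues are well separated
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4)) * np.array([3.0, 2.0, 1.0, 0.5])
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# Route 1: eigendecomposition of the sample covariance matrix
C = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(C)
idx = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]

# Route 2: SVD of the centered data matrix, Xc = U S V^T.
# The right singular vectors V are the eigenvectors of C, and the
# eigenvalues of C are s^2 / (n - 1).
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
print(np.allclose(eigvals, s**2 / (n - 1)))
```

Note that eigenvectors are only defined up to sign, so the columns of `eigvecs` and the rows of `Vt` may differ by a factor of -1.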

Note: for a data matrix X we don't actually use the true covariance matrix C:

C_ij = E[ (X_i - mean_i) (X_j - mean_j) ]

but the sample covariance matrix, which estimates C by averaging over the observed data points.
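As a sketch (with made-up toy data), the sample covariance matrix replaces the expectation by an average over the n observed rows, which is what NumPy's `np.cov` computes:

```python
import numpy as np

# Hypothetical toy data: 500 observations of a 3-dimensional variable
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
n = X.shape[0]
mean = X.mean(axis=0)

# Sample covariance: average of outer products of centered observations,
# divided by n - 1 for the unbiased estimate
C_manual = (X - mean).T @ (X - mean) / (n - 1)

# NumPy uses the same n - 1 convention (rows = observations)
C_numpy = np.cov(X, rowvar=False)
print(np.allclose(C_manual, C_numpy))
```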

Prof. Alexander Ihler nicely explains what PCA is good for, how to compute it, and how to use it, using the example of Eigen-Face representations of arbitrary faces.

BTW: Alexander Ihler also has more machine-learning-related video tutorials on his YouTube channel

public/principal_component_analysis_pca.txt · Last modified: 2014/01/11 13:21 (external edit)