Principal Component
Analysis
Courtesy: University of Louisville, CVIP Lab
-
PCA
PCA is:
-A backbone of modern data analysis.
-A black box that is widely used but poorly understood.
OK, let's dispel the magic behind this black box.
-
PCA - Overview
It is a mathematical tool from applied linear algebra.
It is a simple, non-parametric method of extracting relevant information from confusing data sets.
It provides a roadmap for how to reduce a complex data set to a lower dimension.
-
What do we need under our BELT?!!!
Basics of statistical measures, e.g., variance and covariance.
Basics of linear algebra:
-Matrices
-Vector space
-Basis
-Eigenvectors and eigenvalues
-
Variance
A measure of the spread of the data in a data set with mean $\bar{X}$.
Variance is claimed to be the original statistical measure of spread of data.
-
Covariance
Variance - measure of the deviation from the mean for points in one dimension, e.g., heights.
Covariance - a measure of how much each of the dimensions varies from the mean with respect to each other.
Covariance is measured between 2 dimensions to see if there is a relationship between the 2 dimensions, e.g., number of hours studied and grade obtained.
The covariance between one dimension and itself is the variance:
$$\mathrm{var}(X) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(X_i - \bar{X})}{n-1}$$

$$\mathrm{cov}(X,Y) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n-1}$$
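A minimal NumPy sketch of these two formulas (the sample values are hypothetical, not from the slides):

```python
import numpy as np

def cov(x, y):
    """Sample covariance with the (n - 1) denominator, as in the formula above."""
    n = len(x)
    return np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

hours = np.array([9.0, 15.0, 25.0, 14.0, 10.0])   # hypothetical hours studied
marks = np.array([39.0, 56.0, 93.0, 61.0, 50.0])  # hypothetical marks obtained

print(cov(hours, hours))  # covariance of a dimension with itself = its variance
print(cov(hours, marks))  # positive here: more hours studied, higher marks
```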
-
Covariance
What is the interpretation of covariance calculations?
Say you have a 2-dimensional data set
-X: number of hours studied for a subject
-Y: marks obtained in that subject
And assume the covariance value (between X and Y) is: 104.53
What does this value mean?
-
Covariance
The exact value is not as important as its sign.
A positive value of covariance indicates that both dimensions increase or decrease together, e.g., as the number of hours studied increases, the grades in that subject also increase.
A negative value indicates that while one increases the other decreases, or vice-versa, e.g., active social life vs. performance in ECE Dept.
If covariance is zero: the two dimensions are uncorrelated (though not necessarily independent), e.g., heights of students vs. grades obtained in a subject.
-
Covariance
Why bother with calculating (expensive) covariance when we could just plot the 2 values to see their relationship?
Covariance calculations are used to find relationships between dimensions in high-dimensional data sets (usually greater than 3) where visualization is difficult.
-
Covariance Matrix
Representing covariance among dimensions as a matrix, e.g., for 3 dimensions:

$$C = \begin{pmatrix} \mathrm{cov}(X,X) & \mathrm{cov}(X,Y) & \mathrm{cov}(X,Z) \\ \mathrm{cov}(Y,X) & \mathrm{cov}(Y,Y) & \mathrm{cov}(Y,Z) \\ \mathrm{cov}(Z,X) & \mathrm{cov}(Z,Y) & \mathrm{cov}(Z,Z) \end{pmatrix}$$

Properties:
-Diagonal: variances of the variables
-cov(X,Y) = cov(Y,X), hence the matrix is symmetric about the diagonal (only the upper triangle is needed)
-m-dimensional data will result in an m×m covariance matrix
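A minimal NumPy sketch of these properties (the data below is hypothetical):

```python
import numpy as np

# hypothetical data: m = 3 dimensions (rows), n = 10 samples (columns)
rng = np.random.default_rng(0)
data = rng.normal(size=(3, 10))

C = np.cov(data)            # 3x3 covariance matrix, (n - 1) denominator
print(np.allclose(C, C.T))  # True: symmetric about the diagonal
print(np.diag(C))           # the diagonal holds the variances of the variables
```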
-
Transformation Matrices
Scale the vector (3,2) by a value 2 to get (6,4). Multiply each by the square transformation matrix, and we see that the result is still scaled by 4.
WHY?
A vector consists of both length and direction. Scaling a vector only changes its length, not its direction. This is an important observation about matrix transformations, leading to the formation of eigenvectors and eigenvalues.
Irrespective of how much we scale (3,2) by, the solution (under the given transformation matrix) is always a multiple of 4.
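The slide's transformation matrix is not recoverable from this transcript; as an illustrative stand-in, the matrix [[2, 3], [2, 1]] has (3,2) as an eigenvector with eigenvalue 4, which reproduces the behavior described:

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [2.0, 1.0]])  # assumed example matrix with eigenvalue 4 for (3, 2)
v = np.array([3.0, 2.0])

print(A @ v)        # [12.  8.] = 4 * (3, 2)
print(A @ (2 * v))  # [24. 16.] = 4 * (6, 4): scaling v leaves the factor at 4
```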
-
Eigenvalue Problem
The eigenvalue problem is any problem having the following form:

$$A \cdot v = \lambda \cdot v$$

A: an m×m matrix
v: an m×1 non-zero vector
λ: a scalar

Any value of λ for which this equation has a solution is called an eigenvalue of A, and the vector v which corresponds to this value is called an eigenvector of A.
-
Calculating Eigenvectors & Eigenvalues
Simple matrix algebra shows that:

$$A \cdot v = \lambda \cdot v$$
$$A \cdot v - \lambda \cdot I \cdot v = 0$$
$$(A - \lambda \cdot I) \cdot v = 0$$

Finding the roots of $|A - \lambda \cdot I| = 0$ gives the eigenvalues, and for each of these eigenvalues there is a corresponding eigenvector.
Example (a numerical sketch follows):
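The worked example from the original slide is not recoverable here; a minimal NumPy sketch with an assumed 2×2 matrix:

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [2.0, 1.0]])  # assumed 2x2 example matrix

# the eigenvalues are the roots of |A - lambda*I| = 0
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)   # e.g. [ 4. -1.] (order is not guaranteed)
print(eigenvectors)  # columns are the corresponding eigenvectors

# verify A.v = lambda.v for the first eigenpair
v = eigenvectors[:, 0]
print(np.allclose(A @ v, eigenvalues[0] * v))  # True
```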
-
Example of a problem
We collected m parameters about 100 students:
-Height
-Weight
-Hair color
-Average grade
-...
We want to find the most important parameters that best describe a student.
-
Example of a problem
Each student has a vector of data which describes him, of length m:
-(180, 70, purple, 84, ...)
We have n = 100 such vectors. Let's put them in one matrix, where each column is one student vector.
So we have an m×n matrix. This will be the input of our problem.
-
Example of a problem
Every student is a vector that lies in an m-dimensional vector space spanned by an orthonormal basis.
All data/measurement vectors in this space are a linear combination of this set of unit-length basis vectors.
-
Questions
How do we describe the most important features using math?
-Variance
How do we represent our data so that the most important features can be extracted easily?
-Change of basis
-
Redundancy
Multiple sensors record the same dynamic information.
Consider a range of possible plots between two arbitrary measurement types r1 and r2.
Panel (a) depicts two recordings with no redundancy, i.e., they are uncorrelated, e.g., a person's height and his GPA.
However, in panel (c) both recordings appear to be strongly related, i.e., one can be expressed in terms of the other.
-
Covariance Matrix
$$S_X = \frac{1}{n-1}\, X X^{T}$$

where X is the m×n matrix of mean-centered measurements.

The ij-th element of S_X is the dot product between the vector of the i-th measurement type and the vector of the j-th measurement type.
-S_X is a square symmetric m×m matrix.
-The diagonal terms of S_X are the variances of particular measurement types.
-The off-diagonal terms of S_X are the covariances between measurement types.
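A minimal NumPy sketch of this construction, matching the (n - 1) normalization used above (the data is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(3, 100))  # hypothetical: m = 3 measurement types, n = 100 samples

Xc = X - X.mean(axis=1, keepdims=True)  # subtract each row's mean
n = X.shape[1]
S_X = (Xc @ Xc.T) / (n - 1)             # m x m covariance matrix

print(np.allclose(S_X, np.cov(X)))      # True: matches NumPy's built-in
```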
-
Covariance Matrix
Computing S_X quantifies the correlations between all possible pairs of measurements. Between one pair of measurements, a large covariance corresponds to a situation like panel (c), while zero covariance corresponds to entirely uncorrelated data as in panel (a).
-
PCA Process - STEP 1
Subtract the mean from each of the dimensions.
This produces a data set whose mean is zero.
Subtracting the mean makes the variance and covariance calculations easier by simplifying their equations.
The variance and covariance values are not affected by the mean value.
Suppose we have two measurement types X1 and X2, hence m = 2, and ten samples each, hence n = 10.
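A minimal NumPy sketch of this step; the ten sample values are assumed, chosen to be consistent with the transformed data listed under STEP 5:

```python
import numpy as np

# assumed m = 2, n = 10 data set: rows are X1 and X2
X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1],
              [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]])

X_zero_mean = X - X.mean(axis=1, keepdims=True)

print(X_zero_mean.mean(axis=1))  # ~[0. 0.]: each dimension now has zero mean
print(np.cov(X_zero_mean))       # the covariance is unchanged by mean subtraction
```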
-
PCA Process - STEP 2
Calculate the covariance matrix of the mean-subtracted data.
-
PCA Process - STEP 3
Calculate the eigenvectors and eigenvalues of the covariance matrix.
-
PCA Process - STEP 4
Reduce dimensionality and form the feature vector.
The eigenvector with the highest eigenvalue is the principal component of the data set.
In our example, the eigenvector with the largest eigenvalue is the one that points down the middle of the data.
Once eigenvectors are found from the covariance matrix, the next step is to order them by eigenvalue, highest to lowest. This gives the components in order of significance.
-
PCA Process - STEP 4
Now, if you'd like, you can decide to ignore the components of lesser significance.
You do lose some information, but if the eigenvalues are small, you don't lose much (a sketch of this selection follows the list):
-m dimensions in your data
-calculate m eigenvectors and eigenvalues
-choose only the first r eigenvectors
-the final data set has only r dimensions
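A minimal NumPy sketch of ordering the eigenpairs and keeping the first r (the covariance matrix is hypothetical):

```python
import numpy as np

S = np.cov(np.random.default_rng(2).normal(size=(5, 50)))  # hypothetical 5x5 covariance matrix

eigenvalues, eigenvectors = np.linalg.eigh(S)  # eigh: for symmetric matrices

# order by eigenvalue, highest to lowest
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

r = 2
feature_vector = eigenvectors[:, :r]  # keep only the first r eigenvectors
print(feature_vector.shape)           # (5, 2): m x r
```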
-
PCA Process - STEP 4
When the λi's are sorted in descending order, the proportion of variance explained by the r principal components is:

$$\frac{\sum_{i=1}^{r} \lambda_i}{\sum_{i=1}^{m} \lambda_i} = \frac{\lambda_1 + \lambda_2 + \cdots + \lambda_r}{\lambda_1 + \lambda_2 + \cdots + \lambda_r + \cdots + \lambda_m}$$

If the dimensions are highly correlated, there will be a small number of eigenvectors with large eigenvalues, and r will be much smaller than m.
If the dimensions are not correlated, r will be as large as m and PCA does not help.
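A minimal NumPy sketch of this ratio (the eigenvalues are hypothetical, sorted descending):

```python
import numpy as np

eigenvalues = np.array([4.2, 1.1, 0.4, 0.2, 0.1])  # hypothetical, sorted descending

explained = np.cumsum(eigenvalues) / np.sum(eigenvalues)
print(explained)  # here r = 2 components already explain ~88% of the variance

r = int(np.searchsorted(explained, 0.95) + 1)  # smallest r explaining >= 95%
print(r)  # 3
```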
-
PCA Process - STEP 5
FinalData is the final data set, with data items in columns, and dimensions along rows.
What does this give us? The original data solely in terms of the vectors we chose.
We have changed our data from being in terms of the axes X1 and X2 to now being in terms of our 2 eigenvectors.
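A minimal NumPy sketch of the projection, reusing the assumed samples from the STEP 1 sketch:

```python
import numpy as np

# the assumed m = 2, n = 10 data set from the STEP 1 sketch
X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1],
              [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]])
row_zero_mean_data = X - X.mean(axis=1, keepdims=True)

eigenvalues, eigenvectors = np.linalg.eigh(np.cov(row_zero_mean_data))
order = np.argsort(eigenvalues)[::-1]
row_feature_vector = eigenvectors[:, order].T  # eigenvectors as rows, most significant first

final_data = row_feature_vector @ row_zero_mean_data  # data in the new (eigenvector) basis
print(final_data.shape)  # (2, 10): items in columns, dimensions along rows
```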
-
PCA Process - STEP 5
FinalData (transposed: dimensions along columns)

newX1            newX2
-0.827870186     -0.175115307
 1.77758033       0.142857227
-0.992197494      0.384374989
-0.274210416      0.130417207
-1.67580142      -0.209498461
-0.912949103      0.175282444
 0.0991094375    -0.349824698
 1.14457216       0.0464172582
 0.438046137      0.0177646297
 1.22382956      -0.162675287
-
PCA Process - STEP 5
(Figure: plot of the transformed data in the new eigenvector basis.)
-
Reconstruction of Original Data
Recall that:
FinalData = RowFeatureVector x RowZeroMeanData
Then:
RowZeroMeanData = RowFeatureVector^-1 x FinalData
And thus:
RowOriginalData = (RowFeatureVector^-1 x FinalData) + OriginalMean
If we use unit eigenvectors, the inverse is the same as the transpose (hence, easier).
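A minimal NumPy sketch of the round trip, again with the assumed samples; with unit eigenvectors the transpose plays the role of the inverse:

```python
import numpy as np

X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1],
              [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]])
original_mean = X.mean(axis=1, keepdims=True)
row_zero_mean_data = X - original_mean

eigenvalues, eigenvectors = np.linalg.eigh(np.cov(row_zero_mean_data))
row_feature_vector = eigenvectors[:, np.argsort(eigenvalues)[::-1]].T

final_data = row_feature_vector @ row_zero_mean_data

# unit eigenvectors: inverse == transpose
row_original_data = row_feature_vector.T @ final_data + original_mean
print(np.allclose(row_original_data, X))  # True: keeping all eigenvectors reconstructs exactly
```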
-
Next
We will practice together how to use this powerful mathematical tool in one of the computer vision applications, 2D face recognition. So stay excited!