-ap&lle · 无监督学习中的 选代表和被代表问题-ap&lle 张响亮...

34
督学中的 代表和被代表问题 - AP & LLE 张响亮 Xiangliang Zhang King Abdullah University of Science and Technology CNCC, Oct 25, 2018 Hangzhou, China

Upload: others

Post on 07-Jul-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

无监督学习中的选代表和被代表问题

- AP & LLE张响亮

Xiangliang Zhang

King Abdullah University of Science and Technology

CNCC, Oct 25, 2018

Hangzhou, China

Page 2: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Outline

2

• Affinity Propagation (AP)[Frey and Dueck, Science, 2007]

• Locally Linear Embedding (LLE)[Roweis and Saul, Science, 2000]

Page 3: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Affinity Propagation

3

[Frey and Dueck, Science 2007]

Page 4: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Affinity Propagation

4

[Frey and Dueck, NIPS 2005]

We describe a new method that, for the first time to our knowledge, combines the advantages of model-based clustering and affinity-based clustering.

Component

Mixingcoefficient

Page 5: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Clustering: group the similar points together

21 )( miCx

km x

miµ-SS Î=

iCxm

m xC miÎ

S=||

Minimize

where

21 )( miCx

km x

miµ-SS Î=

}|{ miim Cxx Î=µ

Minimize

where

K-medoidsK-medians

Page 6: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Inspired by greedy k-medoids

Data to cluster:

The likelihood of belong to the cluster with center

is the Bayesian prior probability that is a cluster center

The responsibility of k-thcomponent generating xi

Assign xi with center si

Choose a new center

Page 7: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Understanding the processMessage sent from xi to eachcenter/exemplar: preference tobe with each exemplar

Hard decision for cluster centers/exemplars

Introduce: “availabilities”, which is sent from exemplars to other points, and provides soft evidence of the preference for each exemplar to be available as a center for each point

Page 8: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

The method presented in NIPS‘05

8

• Responsibilities are computed using likelihoods and availabilities• Availabilities are computed using responsibilities, recursively

Affinities

Page 9: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Interpretation by Factor Graph

9

is the index of the exemplar for

Objective function is

Constraints:

Should not be emptywith a single exemplar

An exemplar mustselect itself as exemplar

Page 10: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Input and Output of AP in Science07

10

Preference (prior)

Page 11: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

AP: a message passing algorithm

11

Page 12: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Iterations of Message passing in AP

12

Page 13: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Iterations of Message passing in AP

13

Page 14: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Iterations of Message passing in AP

14

Page 15: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Iterations of Message passing in AP

15

Page 16: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Iterations of Message passing in AP

16

Page 17: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Iterations of Message passing in AP

17

Page 18: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Iterations of Message passing in AP

18

Page 19: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Iterations of Message passing in AP

19

Page 20: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Summary AP

Xiangliang Zhang, KAUST CS340: Data Mining 20

Page 21: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Extensive study of AP

21

Page 22: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Outline

22

• Affinity Propagation (AP)[Frey and Dueck, Science, 2007]

• Locally Linear Embedding (LLE)[Roweis and Saul, Science, 2000]

Page 23: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Locally Linear Embedding (LLE)

23

[Roweis and Saul, Science, 2000]

Saul and Roweis. Think globally, fit locally: unsupervised learning of low dimensional manifolds. JMLR 2003

Page 24: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

LLE - motivations

24

Page 25: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Inspired by MDS

25

Multidimensional Scaling (MDS), find embedding of objects in low-dim space, preserving pairwise distance

Given pairwise similarity Embedding to find

Eliminate the need to estimate pairwise distancesbetween widely separated data points?

Page 26: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

LLE – general idea

� Locally, on a fine enough scale, everything looks linear

� Represent object as linear combination of its neighbors

� Assumption: same linear representation will hold in the low dimensional space

� Find a low dimensional embedding which minimizes reconstruction loss

26

Page 27: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

LLE – matrix representation

1.Select k nearest neighbors

2.Reconstruct xi by its K nearest neighbors

27

2

i jjiji ||xwx|| (W) å å-=e

Find W by minimizing

Page 28: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

LLE – matrix representation

3.Need to solve system Y = W*Y

28

Find the embedding vectors Yi by minimizing:

)econvariancunit with ( IYYN1 and

origin) on the (centered 0Y s.t.

W)(IW)(IM where

)Y(YM||YWY|| (Y)

N

ii

Ti

N

ii

T

N

i

N

jjiij

2

i jjiji

å =

å =

--=

åå •=å å-=e

Page 29: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

LLE – algorithm summary

1. Find k nearest neighbors in X space

2. Solve for reconstruction weights W

3. Compute embedding coordinates Y using weights W:

� Create a sparse matrix M = (I-W)T(I-W)

� Set Y to be the eigenvectors corresponding to the bottom dnon-zero eigenvectors of M

29

neighborsnearest K s x'of one is and ),x()x(C whereC

C w jkT

jjk1-T

-1hhh --==

111

Page 30: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Continuing, SNE

30

Allows “many-to-one” mappings in which a single ambiguous object really belongs in several disparate locations in the low-dimensional space, while LLE makes one-to-one mapping.

pj|i is the asymmetric probability that i would pick j as its neighbor

Gaussian Neighborhood in original space

qj|i is induced probability that point i picks point j as its neighbor

Gaussian Neighborhood in low-dim space

Page 31: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Continuing, t-SNE

� uses a Student-t distribution (heavier tail) rather than a Gaussian to compute the similarity between two points in the low-dimensional space

� symmetrized version of the SNE cost functionwith simpler gradients

31

Page 32: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

LLE - Follow up work

LLE output strongly depends on selection of k

Jing Chen and Yang Liu. Locally linear embedding: a survey.Artificial Intelligence Review (2011)

32

[Chen and Liu, 2011]

Page 33: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

[Ting and Jordan, 2018]

Page 34: -AP&LLE · 无监督学习中的 选代表和被代表问题-AP&LLE 张响亮 XiangliangZhang King Abdullah University of Science and Technology CNCC,Oct25,2018

Thank you for your attention!

Lab of Machine Intelligence and kNowledge Engineering (MINE): http://mine.kaust.edu.sa/