The Square and the Round in Harmony (方圆并济): High-Performance Distributed Machine Learning with Spark on Angel

Post on 28-Jul-2020


Page 1

The Square and the Round in Harmony: High-Performance Distributed Machine Learning with Spark on Angel

Page 2

Page 3

Origins

Page 4

Tencent's product requirements

• Small Model on Big Data: d << n
• Big Model on Sparse Big Data: d ≈ n

(n is the size of the data, d is the dimension of the model.)

Goal: find an industrial-grade distributed machine learning platform that can handle models with billions of dimensions.

Page 5

The bottleneck of machine learning on Spark

[Diagram, shown twice: a single Driver holds the Model and is connected to many Executors.]

Page 6

One Issue

https://issues.apache.org/jira/browse/SPARK-6932

A Prototype of Parameter Server

2015

Page 7

Glint & Yahoo

2016

Page 8

Philosophy

• Angel: PS, PS, PS (mutable)
• Spark: Worker, Worker, Worker, Worker (immutable)

The square and the round, working together (方圆并济).

Page 9

Spark on Angel

Page 10

Core abstractions

Mapper / Reducer, RDD, PSModel

Page 11

RDD vs PSModel

[Diagram: RDD-1 through RDD-5 are lined up against epoch-1 through epoch-5, while a single PSModel spans all epochs.]

Page 12

Core abstractions of an RDD

• Partitions: Partition-1, Partition-2, Partition-3, Partition-4, ... Partition-n
• Compute Func
• Dependencies
• Preferred locations
• Partitioners
• Storage: Node Memory (MemoryBlock-n), Node Disk (DiskBlock-n)
• RDD to RDD via (Transformation or Action)

Page 13

Core abstractions of a PSModel

• PSModel: pull M, push ΔM
• PSClient
• PSServer, each holding a Shard
• PSPartitioner: Partition1, Partition2, Partition3, ...
• MatrixContext
• Sync
• Clock
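The pull / push relationship between a client and sharded servers can be illustrated with a small sketch. The class names `PSServer` and `PSClient` are borrowed from the slide, but the bodies below are hypothetical stand-ins written for illustration and are not Angel's actual API; the range partitioning and the additive `push` are assumptions.

```python
from typing import List


class PSServer:
    """One shard: owns the slice [start, end) of the model vector."""

    def __init__(self, start: int, end: int) -> None:
        self.start = start
        self.end = end
        self.values = [0.0] * (end - start)

    def pull(self) -> List[float]:
        # Return a copy; the shard remains the owner of the data.
        return list(self.values)

    def push(self, delta: List[float]) -> None:
        # Apply an additive update (delta M) in place: the model is mutable.
        for i, d in enumerate(delta):
            self.values[i] += d


class PSClient:
    """Routes pull/push requests to shards using a simple range partitioner."""

    def __init__(self, dim: int, num_shards: int) -> None:
        size = -(-dim // num_shards)  # ceiling division
        self.shards = [
            PSServer(s, min(s + size, dim)) for s in range(0, dim, size)
        ]

    def pull(self) -> List[float]:
        model: List[float] = []
        for shard in self.shards:
            model.extend(shard.pull())
        return model

    def push(self, delta: List[float]) -> None:
        for shard in self.shards:
            shard.push(delta[shard.start:shard.end])


client = PSClient(dim=5, num_shards=2)
client.push([1.0, 2.0, 3.0, 4.0, 5.0])
client.push([0.5, 0.0, 0.0, 0.0, -1.0])
model = client.pull()
print(model)
```

Synchronisation (the Sync and Clock items on the slide) is deliberately left out of this sketch.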

Page 14

Architecture of Spark on Angel

[Diagram. Spark side: the SparkDriver holds an AngelContext; each Executor runs several TASKs over the Spark RDD and holds a PSModel. Angel side: the Parameter Server consists of several PSServers, each owning a Shard. PSAgents sit between the Executors and the Parameter Server.]

Page 15

[Diagram. Parameter Server: several PSServers, each owning a Shard of the Model; M is pulled and ΔM is pushed. PSAgents sit in front of the Parameter Server. Worker: each Task has a DataBlock and a PsClient. Other labels: psFunc, Model Partitioner, syncProtocol.]

Page 16

• A rich library of machine learning algorithms and mathematical routines

• A user-friendly programming interface

• An industrial-grade, production-ready parameter server

Page 17

Angel compared with Glint

• Richer model partitioning (PSPartitioner: Partition1, Partition2, Partition3, ...)
• More flexible asynchronous modes
• More powerful psFunc

Page 18

Angel's positioning

https://github.com/tencent/angel

Page 19

Developing with Spark on Angel

Page 20

Angel's API design

Workflow:

1. Start PS
2. Load Model
3. runTask
4. parse & preProcess
5. train
6. learn
7. push & pull
8. Save Model (HDFS)

Components shown: MLRunner, AngelClient, TrainTask, MLLearner, MLModel, PSModel, PSServer (Model), DataBlock of LabledData, HDFS.

Page 21

Spark on Angel's API design

[Diagram in three columns: Angel, Spark on Angel, Spark.]

• Angel: AngelClient, PSClient, PSServer (Shard), MLModel
• Spark on Angel: SparkPSContext, PSModel, { RDD_PS_Functions }, PSVector, PSMatrix, BreezePSVector, CachedPSVector
• Spark: RDD (RDD1, RDD2, RDD3, ...)

Page 22

Basic usage of Spark on Angel

Page 23

Transparent replacement of Vector

<<trait>> NumericOps[T]
    def round(t: T): T
    def dot(t: T): T
    def max(t: T): T

<<class>> BreezeVector and <<class>> BreezePSVector mix in the same trait and expose the same three methods, so one can be replaced by the other transparently.

[Diagram: inside an Executor, each Task holds BreezePSVector instances, which reach the Angel PS through the PSClient and PSAgent.]
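The idea of two vector types sharing one interface, so that code written against the interface does not care where the data lives, can be sketched as follows. `NumericOps` takes its name from the slide; `LocalVector`, `FakePS` and `PSBackedVector` are hypothetical stand-ins for illustration and do not reproduce the Breeze or Angel classes.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class NumericOps(ABC):
    """Shared interface, standing in for the NumericOps[T] trait."""

    @abstractmethod
    def to_list(self) -> List[float]:
        """Materialise the values locally."""

    def dot(self, other: "NumericOps") -> float:
        return sum(a * b for a, b in zip(self.to_list(), other.to_list()))


class LocalVector(NumericOps):
    """Stand-in for a vector held in executor memory."""

    def __init__(self, values: List[float]) -> None:
        self._values = list(values)

    def to_list(self) -> List[float]:
        return list(self._values)


class FakePS:
    """Hypothetical in-process 'parameter server': named vectors in a dict."""

    def __init__(self) -> None:
        self.store: Dict[str, List[float]] = {}


class PSBackedVector(NumericOps):
    """Stand-in for a PS-backed vector: same interface, data lives on the PS."""

    def __init__(self, ps: FakePS, key: str, values: List[float]) -> None:
        self._ps = ps
        self._key = key
        ps.store[key] = list(values)

    def to_list(self) -> List[float]:
        # Every read goes through the 'server' (a pull in a real system).
        return list(self._ps.store[self._key])


def squared_norm(v: NumericOps) -> float:
    # Written once against the interface; works for either implementation.
    return v.dot(v)


ps = FakePS()
local_result = squared_norm(LocalVector([1.0, 2.0, 3.0]))
ps_result = squared_norm(PSBackedVector(ps, "w", [1.0, 2.0, 3.0]))
print(local_result, ps_result)
```

In this sketch the algorithm (`squared_norm`) is unchanged when the local vector is swapped for the PS-backed one, which is the property the slide calls transparent replacement.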

Page 24

Page 25

Angel's algorithms

Spark on Angel: Available

Page 26

LR on Angel

[Diagram: HDFS feeds three Workers, which communicate with four PS nodes. Steps are numbered 0, 1, 2; the readable step labels are "Pull parameters from PS" and "Push update value to PS".]

Page 27

[Spark on Angel] LR

[spark_on_angel_quick_start.md]

• Angel: wPS, gradientPS, {BreezeOps}
• Spark: sampleRDD, mapPartitions
• DenseVector, Array
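A minimal sketch of the pull, compute, push loop for logistic regression follows. It assumes synchronous full-batch gradient descent and uses plain Python lists in place of wPS and gradientPS; the helper names (`partition_gradient`, `train`), the learning rate and the toy data are all illustrative assumptions, not taken from the slides or from spark_on_angel_quick_start.md.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (features, label in {0, 1})


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def partition_gradient(w: List[float], part: List[Sample]) -> List[float]:
    """Sum of logistic-loss gradients over one data partition."""
    grad = [0.0] * len(w)
    for x, y in part:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj
    return grad


def train(partitions: List[List[Sample]], dim: int,
          lr: float = 0.5, epochs: int = 200) -> List[float]:
    w_ps = [0.0] * dim  # stands in for the weight vector kept on the PS
    n = sum(len(p) for p in partitions)
    for _ in range(epochs):
        grad_ps = [0.0] * dim  # stands in for the gradient vector on the PS
        for part in partitions:  # in Spark this loop would be mapPartitions
            w_local = list(w_ps)  # pull parameters
            g = partition_gradient(w_local, part)
            for j in range(dim):  # push the partial gradient (additive)
                grad_ps[j] += g[j]
        for j in range(dim):  # update the weights where they live
            w_ps[j] -= lr * grad_ps[j] / n
    return w_ps


# Toy data: first feature is a constant bias term, label is 1 when x > 0.
partitions = [
    [([1.0, -2.0], 0), ([1.0, -1.0], 0)],
    [([1.0, 1.0], 1), ([1.0, 2.0], 1)],
]
weights = train(partitions, dim=2)
print(weights)
```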

Page 28

Optimization methods

Page 29

[Spark on Angel] LR with Optimizer

• Angel: wPS, statePS
• Spark: sampleRDD, mapPartitions
• DenseVector
• Optimizers: SGD, OWLQN, LBFGS (Breeze.optimizer)

DiffFunction(BreezePSVector) : (Double, BreezePSVector)

[spark_on_angel_optimizer.md]
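The DiffFunction signature above maps a vector to a (value, gradient) pair, which lets an optimizer stay independent of the model. The sketch below shows that separation with plain gradient descent standing in for SGD, OWLQN or LBFGS; the names `gradient_descent` and `quadratic`, and the toy objective, are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Vector = List[float]
# Same shape as the slide's DiffFunction: vector in, (loss, gradient) out.
DiffFunction = Callable[[Vector], Tuple[float, Vector]]


def gradient_descent(f: DiffFunction, w0: Vector,
                     lr: float, steps: int) -> Vector:
    """A generic optimizer that only sees the (loss, gradient) interface."""
    w = list(w0)
    for _ in range(steps):
        _, grad = f(w)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def quadratic(w: Vector) -> Tuple[float, Vector]:
    """f(w) = sum((w_i - 3)^2), with gradient 2 * (w_i - 3)."""
    loss = sum((wi - 3.0) ** 2 for wi in w)
    grad = [2.0 * (wi - 3.0) for wi in w]
    return loss, grad


w_opt = gradient_descent(quadratic, [0.0, 10.0], lr=0.1, steps=100)
final_loss, _ = quadratic(w_opt)
print(w_opt, final_loss)
```

Because the optimizer touches the model only through `f`, the vector type behind it could be local or PS-backed without changing the optimizer code.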

Page 30

GBDT: tree models + Boosting

[Diagram: two decision trees, tree 1 and tree 2. Split conditions shown: Age<30, Wage<10K, IsMale?, each with Y / N branches. Instances are labelled A, B, C, D, E.]

The prediction for an instance is the sum of the outputs of tree 1 and tree 2:

predict( ) = 5 + 0.5 = 5.5
predict( ) = 10 + 1.5 = 11.5
predict( ) = 1 + 1.5 = 2.5
predict( ) = 1 + 0.5 = 1.5
predict( ) = 1 + 1.5 = 2.5
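The additive prediction on this slide can be reproduced directly. The leaf values below are read off the five sums shown on the slide, in the order they appear; which instance falls into which leaf is not recoverable from the transcript, so the lists are indexed by position only.

```python
# Per-instance leaf outputs, in the order the sums appear on the slide.
tree1_outputs = [5.0, 10.0, 1.0, 1.0, 1.0]
tree2_outputs = [0.5, 1.5, 1.5, 0.5, 1.5]

# A boosted ensemble predicts by summing the outputs of all trees.
predictions = [a + b for a, b in zip(tree1_outputs, tree2_outputs)]
print(predictions)
```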

Page 31

GBDT on Angel: model storage

• PS1, PS2 and PS3 each store: feature ID, feature value, leaf prediction
• Also shown on the PS: grad histogram, hess histogram

Page 32

GBDT on Angel (1): building the forest

PS1 PS2 PS3

Worker1 Worker2 Worker3

Page 33

GBDT on Angel (2): splitting tree nodes

find split feature & value

[gbdt_on_angel.md]
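Histogram-based split finding can be sketched as below: each worker builds gradient and hessian histograms over its own data, the histograms are summed (the role the PS plays in the slides), and the merged histograms are scanned for the best split. The gain formula is the common second-order form used by XGBoost-style GBDT and is an assumption here, as are the function names, the regularisation term `lam` and the toy data; gbdt_on_angel.md may describe a different variant.

```python
from typing import List, Tuple

Hist = List[float]


def build_histograms(bins: List[int], grads: List[float],
                     hess: List[float], num_bins: int) -> Tuple[Hist, Hist]:
    """Accumulate per-bin gradient and hessian sums for one feature."""
    g_hist = [0.0] * num_bins
    h_hist = [0.0] * num_bins
    for b, g, h in zip(bins, grads, hess):
        g_hist[b] += g
        h_hist[b] += h
    return g_hist, h_hist


def merge(a: Hist, b: Hist) -> Hist:
    """Element-wise sum, standing in for additive pushes to the PS."""
    return [x + y for x, y in zip(a, b)]


def best_split(g_hist: Hist, h_hist: Hist,
               lam: float = 1.0) -> Tuple[int, float]:
    """Return (last bin of the left child, gain); bin is -1 if no gain."""
    g_total, h_total = sum(g_hist), sum(h_hist)
    parent = g_total * g_total / (h_total + lam)
    best_bin, best_gain = -1, 0.0
    g_left = h_left = 0.0
    for b in range(len(g_hist) - 1):
        g_left += g_hist[b]
        h_left += h_hist[b]
        g_right, h_right = g_total - g_left, h_total - h_left
        gain = (g_left * g_left / (h_left + lam)
                + g_right * g_right / (h_right + lam) - parent)
        if gain > best_gain:
            best_bin, best_gain = b, gain
    return best_bin, best_gain


# Two hypothetical workers, each holding three samples of one feature.
g1, h1 = build_histograms([0, 1, 2], [-1.0, -1.0, 1.0], [1.0, 1.0, 1.0], 4)
g2, h2 = build_histograms([1, 2, 3], [-1.0, 1.0, 1.0], [1.0, 1.0, 1.0], 4)
g_all, h_all = merge(g1, g2), merge(h1, h2)
split_bin, split_gain = best_split(g_all, h_all)
print(split_bin, split_gain)
```

Only fixed-size histograms cross the network in this scheme, not the raw samples, which is why the merged histograms are a natural thing to keep on a parameter server.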

Page 34

[Spark on Angel] GBDT

• Spark: Instance RDD, Gradient RDD and Prediction RDD, combined with zip; map
• Angel (PS): InstanceLayout, Grad Histogram, SplitFeature, SplitValue, LeafWeight

[spark_on_angel_gbdt.md]

Page 35

Page 36

(Spark on Angel) vs Spark: LR

Page 37

Angel vs XGBoost: GBDT

Page 38

Angel vs Spark: LDA

Page 39

Angel vs Spark: GD-LR

Page 40

Angel vs Spark: ADMM-LR

Page 41

Features of Spark on Angel

Page 42

OpenSource & Perspective

Page 43

Angel goes open source

• [GBDT] The purposes of using parameter server in GBDT #7

(PR 60)

Page 44

Academic innovation

• Papers at top international conferences (CCF Class A)

Page 45

Release roadmap (What is Next)

V1.3 V1.5 V2.0

Page 46

Q & A

Weibo: @明风

If you like it, remember to give us a Star! [email protected]

Machine learning systems & algorithm engineers

We are Hiring