As shown above, Angel's core API classes, ordered roughly by when they are called during model training, include:
- MLRunner
  - MLRunner creates an AngelClient through a factory class based on conf, and calls AngelClient's interfaces in order, following the standard `train` process
- AngelClient
  - Starts PSServer
  - Initializes PSServer and loads an empty model
  - After training, saves the model from multiple PSServers to HDFS
- TrainTask
  - Starts the `train` process when called by AngelClient
- DataBlock
  - TrainTask calls the `parse` and `preProcess` methods to read data from HDFS and assemble it into a DataBlock that contains multiple LabeledData
  - TrainTask calls the `train` method to create the MLLearner object and pass the DataBlock to it
- MLLearner
  - MLLearner calls its own `learn` method, reads the DataBlock, computes the model delta, and pushes to / pulls from PSServer through the PSModel inside MLModel, eventually obtaining a complete MLModel
- MLModel
  - According to the algorithm's needs, creates and holds multiple PSModels
- PSModel
  - Encapsulates all the interfaces in AngelClient that communicate with PSServer, making them easy for MLLearner to call
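
The call order described in the list above can be illustrated with a small in-memory mock. This is only a sketch of the control flow, not Angel's actual API: the classes reuse the names from the list, but every method name and signature other than `parse`, `train`, and `learn` (for example `start_ps_server`, `load_model`, `save_model`, `pre_process`) is a hypothetical stand-in, the "PSServer" is a plain dictionary, a DataBlock is a Python list, and the learner is an assumed one-weight-vector linear regression chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LabeledData:
    x: List[float]
    y: float


class PSServer:
    """Toy parameter server: named vectors kept in memory (assumption)."""
    def __init__(self) -> None:
        self.store: Dict[str, List[float]] = {}

    def init(self, name: str, dim: int) -> None:
        self.store[name] = [0.0] * dim          # "loads empty model"

    def pull(self, name: str) -> List[float]:
        return list(self.store[name])

    def push(self, name: str, delta: List[float]) -> None:
        self.store[name] = [w + d for w, d in zip(self.store[name], delta)]


class PSModel:
    """Wraps all server communication for one named vector."""
    def __init__(self, server: PSServer, name: str) -> None:
        self.server, self.name = server, name

    def pull(self) -> List[float]:
        return self.server.pull(self.name)

    def push(self, delta: List[float]) -> None:
        self.server.push(self.name, delta)


class MLModel:
    """Holds the PSModels an algorithm needs (just one here)."""
    def __init__(self, server: PSServer) -> None:
        self.weight = PSModel(server, "weight")


class MLLearner:
    def __init__(self, model: MLModel, lr: float = 0.1, epochs: int = 50) -> None:
        self.model, self.lr, self.epochs = model, lr, epochs

    def learn(self, block: List[LabeledData]) -> MLModel:
        for _ in range(self.epochs):
            w = self.model.weight.pull()                 # pull current weights
            grad = [0.0] * len(w)
            for s in block:
                err = sum(wi * xi for wi, xi in zip(w, s.x)) - s.y
                for i, xi in enumerate(s.x):
                    grad[i] += err * xi
            delta = [-self.lr * g / len(block) for g in grad]
            self.model.weight.push(delta)                # push the model delta
        return self.model


class TrainTask:
    def __init__(self, server: PSServer) -> None:
        self.server = server

    def parse(self, line: str) -> LabeledData:
        parts = line.split()                             # "label f1 f2 ..."
        return LabeledData(x=[float(v) for v in parts[1:]], y=float(parts[0]))

    def pre_process(self, lines: List[str]) -> List[LabeledData]:
        return [self.parse(line) for line in lines]      # the "DataBlock"

    def train(self, block: List[LabeledData]) -> MLModel:
        return MLLearner(MLModel(self.server)).learn(block)


class AngelClient:
    def start_ps_server(self) -> None:
        self.server = PSServer()

    def load_model(self, dim: int) -> None:
        self.server.init("weight", dim)

    def run_task(self, lines: List[str]) -> None:
        task = TrainTask(self.server)
        task.train(task.pre_process(lines))

    def save_model(self) -> Dict[str, List[float]]:
        return dict(self.server.store)                   # stand-in for HDFS


def ml_runner_train(lines: List[str], dim: int) -> Dict[str, List[float]]:
    """Plays the MLRunner role: drives the client through the train process."""
    client = AngelClient()
    client.start_ps_server()
    client.load_model(dim)
    client.run_task(lines)
    return client.save_model()


if __name__ == "__main__":
    print(ml_runner_train(["2 1", "4 2", "6 3"], dim=1))
```

The point of the sketch is the layering: the learner never touches the server directly, only the PSModel held by its MLModel, which is the separation the list attributes to PSModel.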
Understanding these core classes and processes helps when implementing high-performance machine-learning algorithms that run on Angel.