Description
What happened + What you expected to happen
I've noticed some common problems that occur when network packets are dropped in a real deployment, and I have some proposals for addressing them. I've discussed this with @fanlai0990, and I would like to hear from more contributors to figure out the best plan. @mosharaf @AmberLJC @ewenw @IKACE
- problem: the server->client UPDATE_MODEL packet is dropped, so the following server->client MODEL_TEST runs in error (stale model or no model)
  solution: ignore UPDATE_MODEL and send the model in the MODEL_TEST packet
- problem: the server->client CLIENT_TRAIN packet is dropped, so the server sends the client DUMMY_EVENT forever
  solution: keep the event inside the queue until the client confirms that the event has completed
  pitfall:
  - a multi-threaded executor may ping the same event more than once
  - UPDATE_MODEL has no confirmation, so there is no way to tell whether UPDATE_MODEL has finished
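
To make the second proposal concrete, here is a minimal sketch of a queue that keeps an event until the client confirms it, plus a client-side guard so that a multi-threaded executor runs each event at most once. This is not FedScale's actual API: `AckEventQueue`, `ClientExecutor`, and the `push`/`peek`/`confirm`/`handle` methods are hypothetical names used only for illustration, and the real aggregator may structure its per-client queues differently.

```python
import threading
from collections import deque


class AckEventQueue:
    """Hypothetical per-client queue: an event stays at the head until confirmed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = deque()  # (event_id, event_name) pairs, oldest first
        self._next_id = 0

    def push(self, event_name):
        """Enqueue an event and return the id the client must confirm."""
        with self._lock:
            event_id = self._next_id
            self._next_id += 1
            self._events.append((event_id, event_name))
            return event_id

    def peek(self):
        """Return the oldest unconfirmed event without removing it.

        If the reply is dropped, the next ping simply returns the same event.
        """
        with self._lock:
            if not self._events:
                return (None, "DUMMY_EVENT")
            return self._events[0]

    def confirm(self, event_id):
        """Remove the head event if the id matches; ignore stale or duplicate confirmations."""
        with self._lock:
            if self._events and self._events[0][0] == event_id:
                self._events.popleft()
                return True
            return False


class ClientExecutor:
    """Hypothetical client side: runs each event id at most once across threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._started = set()
        self.runs = []

    def handle(self, event_id, event_name):
        """Return True if this call ran the event, False if it was skipped."""
        with self._lock:
            if event_id is None or event_id in self._started:
                return False
            self._started.add(event_id)
        self.runs.append(event_name)  # placeholder for the real training or testing work
        return True
```

With this shape, a dropped CLIENT_TRAIN reply is recovered on the next ping because `peek` does not remove the event, and the id check in `handle` covers the first pitfall. The second pitfall remains: UPDATE_MODEL would still need either its own confirmation or to be folded into MODEL_TEST as in the first proposal.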
Versions / Dependencies
fedscale-0.5
server: ubuntu 16
client: android 23
Reproduction script
Issue Severity
High: It blocks me from completing my task.