This repository was archived by the owner on Jun 6, 2024. It is now read-only.
Currently, task port numbers are assigned by rest-server.
In some situations, an assigned port number conflicts with one used by another job.
The job then retries until the conflict is resolved. (Because the retried job may be scheduled to the same node, the port number may conflict again.)
Current solution:
The runtime and rest-server use Hash(podUid, portName, portIndex) to generate each port number, with podUid as the seed. For a distributed job, each task can get the other tasks' podUid, portName and portIndex from the framework, so it can calculate the other tasks' port numbers independently.
If a port number conflicts, the job fails. New pods are then created with different UIDs, so the retry calculates new port numbers.
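
A minimal sketch of the idea, not the actual runtime or rest-server code. The function name `derive_port`, the port range constants, and the choice of SHA-256 are all assumptions made for illustration; the issue only states that the port is a hash of podUid, portName and portIndex.

```python
import hashlib

# Assumed port range for illustration only; the real range is not given here.
PORT_RANGE_START = 20000
PORT_RANGE_SIZE = 20000


def derive_port(pod_uid: str, port_name: str, port_index: int) -> int:
    """Map (podUid, portName, portIndex) to a port in the assumed range.

    Hypothetical helper: any task that knows these three values can
    compute the same port without asking a central service.
    """
    key = f"{pod_uid}/{port_name}/{port_index}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    offset = int.from_bytes(digest[:8], "big") % PORT_RANGE_SIZE
    return PORT_RANGE_START + offset


if __name__ == "__main__":
    # Example pod UIDs are made up for the demonstration.
    first_attempt = derive_port("pod-uid-attempt-1", "http", 0)
    retry_attempt = derive_port("pod-uid-attempt-2", "http", 0)
    print(first_attempt, retry_attempt)
```

Because the pod UID changes on every retry, the retried pod will most likely land on a different port, although a hash cannot guarantee that two UIDs never map to the same port.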