Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable to specify driver id by user. #3084

Merged
merged 2 commits into from
Nov 3, 2018

Conversation

jovany-wang
Copy link
Contributor

@jovany-wang jovany-wang commented Oct 18, 2018

What do these changes do?

It's very useful for us to specify driver id to a driver.
In Java, we could specify it by ray.conf, and in Python, we should do it by def init():

ray.init(driver=b"01234567890123456789")

If we don't specify one, it will generate a random for us.

Related issue number

N/A

@@ -1664,6 +1667,7 @@ def init(redis_address=None,
plasma_directory=None,
huge_pages=False,
include_webui=True,
worker_id=None,
Copy link
Collaborator

@robertnishihara robertnishihara Oct 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the driver_id then it needs to be called driver_id.

@@ -2017,7 +2026,10 @@ def connect(info,
assert not worker.connected, error_message
assert worker.cached_functions_to_run is not None, error_message
# Initialize some fields.
worker.worker_id = random_string()
if worker_id is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should do some type checking. I think we want to assert that it is a Ray ObjectID or that it is 20 bytes or something like that. All of that is a bit ugly unfortunately.

@robertnishihara
Copy link
Collaborator

I have mixed feelings about this. I think we want to restrict driver_id's to have type ray.DriverID, but we don't have that, so it would need to be ray.ObjectID, which feels weird because it's a driver, not an object. You're code example seems to use a short string, which doesn't make sense to me. Doesn't it need to be 20 bytes?

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8729/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8732/
Test PASSed.

@jovany-wang
Copy link
Contributor Author

jovany-wang commented Oct 18, 2018

@robertnishihara Thanks a lot for your comments about this.

I have addressed some comments:
adding type checking and renaming worker_id to driver_id.

It's too awkward for me to do type checking for these codes. As @robertnishihara mentioned, which type should driver_id be? Maybe we should add an UniqueID type.

Another question I found is worker.worker_id's type. I guess we assume that it's a string or bytes. One problem that comes with this is :

    # Initialize some fields.
    if mode is WORKER_MODE:
        worker.worker_id = random_string()
    else:
        # This is the code path of driver mode.
        if driver_id is None:
            driver_id = ray.ObjectID(random_string())

        elif isinstance(driver_id, (bytes, str)):
            if len(driver_id) is not ray_constants.ID_SIZE:
                raise Exception("The length of given driver id str must be 20.")
            driver_id = ray.ObjectID(driver_id)

        elif not isinstance(driver_id, ray.ObjectID):
            raise Exception("The type of given driver id must be str or ray.ObjectID.")

        worker.worker_id = driver_id.id()

We should assume driver_id is a ray.ObjectID because we should cast from a str or a ray.ObjectID, And the next we should cast driver_id to str(or bytes) to match worker.worker_id.

I think the better way to solve it is to use worker_id as ray.ObjectID instead of str(or bytes). And we needn't worry about the length of worker_id any more, right?

@jovany-wang
Copy link
Contributor Author

@guoyuhong cc

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8734/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8746/
Test PASSed.

@robertnishihara
Copy link
Collaborator

@guoyuhong @jovan-wong what is the use case for specifying the driver ID?

@guoyuhong
Copy link
Contributor

@robertnishihara We will enable cross-language calling which makes streaming work smoothly. Before Java Driver calling a Python worker, the python code should be deployed to a path or vise versa. Knowing the driver Id before hand will enable use to deploy the python/java code to a directory related to the driver id.

@raulchen
Copy link
Contributor

Another purpose is that we want to have an external 'job manager' that runs all drivers with pre-defined driver ids, so it can manage the drivers by the ids.

@robertnishihara do you have any other questions? This PR looks good to me.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8795/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8797/
Test FAILed.

if driver_id is None:
driver_id = ray.ObjectID(random_string())

elif isinstance(driver_id, (bytes, str)):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check that it is bytes for Python 2 and str for Python 3.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, for now, let's require it to be an ObjectID. I think the more restrictive we make it the better since it makes it easier to change later.

@@ -1731,6 +1736,7 @@ def init(redis_address=None,
Store with hugetlbfs support. Requires plasma_directory.
include_webui: Boolean flag indicating whether to start the web
UI, which is a Jupyter notebook.
driver_id: The id of driver, and it will be ignored in non-driver mode.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id -> ID

remove the stuff after the ,, there is no such thing as "non-driver" mode.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at least from the perspective of the user and ray.init, there is no such thing as "non-driver" mode.

Copy link
Collaborator

@robertnishihara robertnishihara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I left some comments.

Add type checking and address comments.

Fix small thing.

Add ut.

Fix lint

Remove useless code.

Address comments.
@jovany-wang
Copy link
Contributor Author

@robertnishihara Tks 4 ur comments.
I have addressed them, and could u plz c it again?

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9000/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9001/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9002/
Test PASSed.

@robertnishihara
Copy link
Collaborator

@jovan-wong thanks, I fixed some linting errors. It can be merged once tests pass.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9014/
Test PASSed.

@robertnishihara robertnishihara merged commit ca7d4c2 into ray-project:master Nov 3, 2018
@jovany-wang jovany-wang deleted the enable-driver-id branch November 13, 2018 10:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants