@mrocklin
Member

mrocklin commented Aug 2, 2019

This helps to normalize scheduler addresses before comparison

Fixes #2336

I'm working on a test now, but I thought I'd push this up in the meantime.

@mrocklin
Member Author

mrocklin commented Aug 2, 2019

cc @jacobtomlinson

@mrocklin
Member Author

mrocklin commented Aug 2, 2019

So previously we would pass the address along with serialized Futures for two reasons:

  1. To verify that the client we're connecting to is connected to the right scheduler (this comes up in some advanced multi-client/multi-scheduler situations)
  2. To create a new client and connect it to a scheduler if no client is found locally

When we call get_client() from within a worker, we don't need to worry about reason 2 (the worker already has a working connection to the scheduler), but we do still need to verify that this is the right cluster for us to be on. Using the address for this verification was error-prone, because the same scheduler can be known by more than one network name.

Now we also pass along the scheduler ID string, and use that for verification instead. This makes us a bit more robust to bad addresses.
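
As a rough illustration of the difference (a toy sketch, not the code in this PR; SchedulerRef, matches_by_address, matches_by_id, and the example ID strings are all made up for this example):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchedulerRef:
    """Hypothetical record of what travels with a serialized Future."""

    address: str       # network address as the sender happened to see it
    scheduler_id: str  # opaque identifier string reported by the scheduler


def matches_by_address(local: SchedulerRef, incoming: SchedulerRef) -> bool:
    # Address-based check: a plain string comparison, so two spellings
    # of the same endpoint are treated as different schedulers.
    return local.address == incoming.address


def matches_by_id(local: SchedulerRef, incoming: SchedulerRef) -> bool:
    # ID-based check: independent of how the address is spelled.
    return local.scheduler_id == incoming.scheduler_id


# The same (assumed) scheduler seen under two different network names.
local = SchedulerRef("tcp://127.0.0.1:8786", "Scheduler-aaaa")
incoming = SchedulerRef("tcp://localhost:8786", "Scheduler-aaaa")

# A different (assumed) scheduler that happens to share the local address string.
other = SchedulerRef("tcp://127.0.0.1:8786", "Scheduler-bbbb")

print(matches_by_address(local, incoming))  # address spellings differ
print(matches_by_id(local, incoming))       # IDs agree
print(matches_by_id(local, other))          # IDs disagree
```

In this sketch the address comparison rejects a Future that actually belongs to the same scheduler, while the ID comparison accepts it and still rejects a Future from a different scheduler.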

@jacobtomlinson
Member

I've just given this a test with this MRE.

On this branch it hangs for me at client.publish_dataset(df=df).

@mrocklin
Member Author

mrocklin commented Nov 7, 2019

@quasiben in case you're interested.

Base automatically changed from master to main March 8, 2021 19:03
@mrocklin mrocklin requested a review from fjetter as a code owner January 23, 2024 10:57

Development

Successfully merging this pull request may close these issues.

Unable to use published datasets in a different client
