[tfjs-node] Backend overwritten when new environment (such as a worker) is created #3463

Closed
Pierrci opened this issue Jun 16, 2020 · 4 comments

Pierrci commented Jun 16, 2020

TensorFlow.js version

tfjs-node 2.0.1

Browser version

Node.js 12.16.3

Describe the problem or feature request

When tfjs-node is required in a new environment (such as a worker created with worker threads) after it has already been required in another environment (such as the main thread), the TF backend instantiated in the new environment overwrites the previously instantiated one.

As a consequence, if you:

  1. Load a SavedModel in one environment (such as the main thread, or a worker thread)
  2. Load another SavedModel in a different environment (another worker thread)

then the SavedModel loaded in step 1 "disappears" (Tensor not referenced errors when trying to run it), even in the environment it was created in. However, the environment from step 1 is able to access and run the SavedModel loaded in step 2.

It seems to me that the expected behavior should be one of the following:

  • (a) The TF backend is shared between all the environments tfjs-node is required in. If a SavedModel is loaded in env 2, it is also available in env 1, without overwriting any SavedModel loaded in env 1. Likewise, models loaded in env 1 are available in env 2.
  • (b) Each environment gets its own isolated TF backend, so environments are isolated from each other when handling SavedModels and each has to load the models it wants to use.
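
To make the difference between the reported behavior and options (a) and (b) concrete, here is a minimal toy sketch in plain JavaScript. It does not use tfjs-node or worker threads at all: `ToyBackend`, `makeOverwriting`, `makeShared`, `makeIsolated` and `scenario` are hypothetical names invented for illustration, and "environments" are just string ids. It only models who can see which model after the two load steps described above.

```javascript
// Toy model only: none of these names exist in tfjs-node.
class ToyBackend {
  constructor(owner) {
    this.owner = owner;       // id of the environment that created it
    this.models = new Map();  // model name -> placeholder "loaded model"
  }
  load(name) {
    this.models.set(name, `${name} (loaded via backend of ${this.owner})`);
  }
  run(name) {
    if (!this.models.has(name)) {
      throw new Error(`model "${name}" is not known to this backend`);
    }
    return this.models.get(name);
  }
}

// Reported behavior: a single slot that every new environment replaces.
function makeOverwriting() {
  let current = null;
  return {
    attach(envId) { current = new ToyBackend(envId); },
    backendFor(_envId) { return current; },
  };
}

// Option (a): one backend, created once and shared by every environment.
function makeShared() {
  let shared = null;
  return {
    attach(envId) { if (shared === null) shared = new ToyBackend(envId); },
    backendFor(_envId) { return shared; },
  };
}

// Option (b): one isolated backend per environment.
function makeIsolated() {
  const perEnv = new Map();
  return {
    attach(envId) { perEnv.set(envId, new ToyBackend(envId)); },
    backendFor(envId) { return perEnv.get(envId); },
  };
}

// Step 1: env1 loads modelA. Step 2: env2 loads modelB.
// Then report which environment can run which model.
function scenario(registry) {
  registry.attach('env1');
  registry.backendFor('env1').load('modelA');
  registry.attach('env2');
  registry.backendFor('env2').load('modelB');

  const canRun = (envId, model) => {
    try {
      registry.backendFor(envId).run(model);
      return true;
    } catch (err) {
      return false;
    }
  };
  return {
    env1RunsA: canRun('env1', 'modelA'),
    env1RunsB: canRun('env1', 'modelB'),
    env2RunsA: canRun('env2', 'modelA'),
    env2RunsB: canRun('env2', 'modelB'),
  };
}

console.log('overwriting:', scenario(makeOverwriting()));
console.log('(a) shared: ', scenario(makeShared()));
console.log('(b) isolated:', scenario(makeIsolated()));
```

In this toy, the overwriting registry reproduces the symptom described above (env 1 loses modelA but can run modelB), the shared registry lets both environments run both models, and the isolated registry lets each environment run only the model it loaded itself.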

Code to reproduce the bug / link to feature request

I coded fixes corresponding to the two different behaviors:

I've been successfully experimenting with both fixes for my use case, which uses multiple worker threads to interact with SavedModels. I'm willing to work on a PR for (a) or (b), which can of course be amended based on your feedback (particularly the solution for (b)).

rthadur added the comp:node.js, P1 and type:bug labels Jun 17, 2020
rthadur added the P2 label and removed the P1 label Jul 16, 2020
pyu10055 (Collaborator) commented

@Pierrci Hi, it looks like the Agent concept would create good isolation. Are you still willing to contribute your changes for option (b)? Thanks.

google-ml-butler commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you.

google-ml-butler commented

Closing as stale. Please @mention us if this needs more attention.
