-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Reincarnate TensorBoardWSGIApp as new DataProvider-friendly entry point #2576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
wchargin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall:
-
Understanding this flow takes a bit of mental gymnastics, but
perhaps it’s still less convoluted than before. I’ll have to let it
sink in a bit. -
I like the setup of passing a null object around internally (e.g.,
tostart_reloading_multiplexer) to reduce the number of data flow
paths. I like a bit less the setup of using this null object also as
a bag forTBContextattributes, but I can live with it. But
leaking this null object abstraction onto the plugin surface via
context.multiplexerseems worth avoiding, if we can swing it.
tensorboard/backend/application.py
Outdated
| if isinstance(plugin, base_plugin.TBLoader): | ||
| return plugin | ||
| if issubclass(plugin, base_plugin.TBLoader): | ||
| return plugin() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice QOL change; I know that it’s bitten us in the past. Want
to update tensorboard/default.py to use this form for internal
consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but I'll also need to update program.py to support this form since that's actually where we do the expansion in the current code, and I'd rather not do that in this PR.
I can either leave this functionality in this path only, or I can back out the commit adding it and do the change to both places and default.py in a separate PR. Any preference?
Down the road of course there should be just one path, but I was going to do more refactoring to pull out the contents of standard_tensorboard_wsgi() so that it can be invoked from program.py and then passed into TensorBoardWSGIApp() from there in a clean way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, didn’t realize that this was an additional path. Up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reverted this change for this PR and will send a followup for this separately.
tensorboard/backend/application.py
Outdated
| return plugin() | ||
| if issubclass(plugin, base_plugin.TBPlugin): | ||
| return base_plugin.BasicLoader(plugin) | ||
| raise ValueError("Not a TBLoader or TBPlugin subclass: %s" % plugin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be TypeError? Docs for ValueError say, “Inappropriate
argument value (of correct type).”
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in the followup.
tensorboard/backend/application.py
Outdated
| return plugin() | ||
| if issubclass(plugin, base_plugin.TBPlugin): | ||
| return base_plugin.BasicLoader(plugin) | ||
| raise ValueError("Not a TBLoader or TBPlugin subclass: %s" % plugin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider "…" % (plugin,) to prevent explosions if plugin is a tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in the followup.
| flags, plugin_loaders, data_provider, assets_zip_provider, multiplexer) | ||
|
|
||
|
|
||
| def TensorBoardWSGIApp( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we take this opportunity to give this function a better name? Its
casing suggests that it’s a type, but it’s a function; and it’s not
clear from the name how this differs from class TensorBoardWSGI.
Perhaps something like build_tensorboard_wsgi connotes its purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My plan (in #2573, one of the WSGI cleanup items) was to convert this into the actual plugin class and remove TensorBoardWSGI. This is blocked on updating the plugin tests so they don't need it as a shim, which I was going to do as a cleanup task to avoid blocking this PR.
So that's why I left it named in class style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
tensorboard/backend/application.py
Outdated
| data_provider: Instance of `tensorboard.data.provider.DataProvider`. May | ||
| be `None` if `flags.generic_data` is set to `"false"` in which case | ||
| `deprecated_multiplexer` must be passed instead. | ||
| deprecated_multiplexer: Optional instance of EventMultiplexer to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth noting that this is a plugin_event_multiplexer.EventMultiplexer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| """ | ||
| logger.info('_DbModeMultiplexer initializing for %s', db_uri) | ||
| super(_DbModeMultiplexer, self).__init__() | ||
| self.db_uri = db_uri |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this serve as a conduit for DB-related parameters? IIUC nothing
uses this curently—the URI and connection provider are always pulled off
the context, not the multiplexer.
Oof, I see: there’s a getattr, which is why my greps turned up
nothing. What do you think of either explicitly mentioning that these
are read in TensorBoardWSGIApp, or changing that logic above to be
something like
db_uri = None
db_connection_provider = None
if isinstance(deprecated_multiplexer, _DbModeMultiplexer):
db_uri = deprecated_multiplexer.db_uri
db_connection_provider = deprecated_multiplexer.db_connection_provider
deprecated_multiplexer = None # don't pass no-op muxer into `TBContext`to also resolve my gripe about including a non-functional multiplexer in
the context at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above response about the non-functional multiplexer.
I can change the getattr to two different isinstance checks (one per db multiplexer type) if you'd prefer that. I was not trying to be terribly principled since this code is all going away soon, but I'm happy to make that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced getattrs with isinstance.
| reload_interval = -1 | ||
| db_connection_provider = create_sqlite_connection_provider(db_uri) | ||
| db_connection_provider = create_sqlite_connection_provider(flags.db) | ||
| multiplexer = _DbModeMultiplexer(flags.db, db_connection_provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s a little funny for a non-functional multiplexer to exist in the
TBContext: currently plugins case on DB mode by saying, “if I have a
DB connection provider, use DB mode, else use multiplexer mode”, but
they could just as well say, “if I have a multiplexer, use multiplexer
mode, else use DB mode”—but that will no longer work. Concern?
(See also comment on _DbModeMultiplexer.__init__ below.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is basically that I don't trust plugins to all defensively check whether multiplexer is not None. Some of them do this, but inconsistently. For example, the projector plugin checks that self.multiplexer is not None in only 2 out of 3 uses. The profile plugin doesn't check at all.
So rather than trying to go through all the existing plugins to ensure they can handle a non-existent multiplexer, I figured I'd ensure that there is always a dummy placeholder multiplexer that just returns empty data instead.
I realize this isn't that clean, but our near-term goal is to refactor away all the multiplexer-calling code anyway, so I was reluctant to spend much time improving it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it; that this is all slated to be removed soon changes the equation.
This is fine with me, then.
nfelt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! Will update PR to address the remaining comments.
| reload_interval = -1 | ||
| db_connection_provider = create_sqlite_connection_provider(db_uri) | ||
| db_connection_provider = create_sqlite_connection_provider(flags.db) | ||
| multiplexer = _DbModeMultiplexer(flags.db, db_connection_provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is basically that I don't trust plugins to all defensively check whether multiplexer is not None. Some of them do this, but inconsistently. For example, the projector plugin checks that self.multiplexer is not None in only 2 out of 3 uses. The profile plugin doesn't check at all.
So rather than trying to go through all the existing plugins to ensure they can handle a non-existent multiplexer, I figured I'd ensure that there is always a dummy placeholder multiplexer that just returns empty data instead.
I realize this isn't that clean, but our near-term goal is to refactor away all the multiplexer-calling code anyway, so I was reluctant to spend much time improving it now.
| flags, plugin_loaders, data_provider, assets_zip_provider, multiplexer) | ||
|
|
||
|
|
||
| def TensorBoardWSGIApp( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My plan (in #2573, one of the WSGI cleanup items) was to convert this into the actual plugin class and remove TensorBoardWSGI. This is blocked on updating the plugin tests so they don't need it as a shim, which I was going to do as a cleanup task to avoid blocking this PR.
So that's why I left it named in class style.
| """ | ||
| logger.info('_DbModeMultiplexer initializing for %s', db_uri) | ||
| super(_DbModeMultiplexer, self).__init__() | ||
| self.db_uri = db_uri |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above response about the non-functional multiplexer.
I can change the getattr to two different isinstance checks (one per db multiplexer type) if you'd prefer that. I was not trying to be terribly principled since this code is all going away soon, but I'm happy to make that change.
| reload_interval = -1 | ||
| db_connection_provider = create_sqlite_connection_provider(db_uri) | ||
| db_connection_provider = create_sqlite_connection_provider(flags.db) | ||
| multiplexer = _DbModeMultiplexer(flags.db, db_connection_provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it; that this is all slated to be removed soon changes the equation.
This is fine with me, then.
| flags, plugin_loaders, data_provider, assets_zip_provider, multiplexer) | ||
|
|
||
|
|
||
| def TensorBoardWSGIApp( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
This reincarnates
TensorBoardWSGIApp(), which is removed in #2575, as a new API for constructing the TB WSGI app using a specific DataProvider implementation (with a deprecated argument for passing in a multiplexer instead, the current behavior that we intend to migrate away from).To do this, we rearrange a lot of the logic in
standard_tensorboard_wsgi()so that all three possible paths (regular logdir loading, db import, and db read-only) emit a single multiplexer instance. To that end, I've changedDbImportMultiplexerto implement the read surface (by returning empty data) and introduced a shim_DbModeMultiplexerfor the db read-only mode just as a conduit for parameters.This addresses core tasks 2 and 3 in #2573, with the first four commits corresponding to task 2 and the next 2 to task 3.
cc @bileschi