-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve and formalize TensorBoard's Python filesystem abstraction #5286
Comments
Some updates: we've now added fsspec filesystem support via #5248, but it's still optional (we don't depend on fsspec, so users have to install it) and it's also only accessible when Also, to clarify the nature of the abstraction discussed in this issue, it isn't really a filesystem per se, or at least not in the usual sense - we actually want it to represent a one-level-higher abstraction, a "filesystem proxy" that can resolve scheme-prefixed paths to multiple concrete filesystems. (That said, the ambiguity in terminology arises because it is sort of tempting to shoehorn this proxy object itself into a filesystem interface - like TB does with DataProvider and DispatchingDataProvider, more or less - but just like DispatchingDataProvider, this abstraction can get kind of leaky/unwieldy. And in practice with fsspec, it has already led to confusion around how things like globbing interact with schemes and prefixes.) Furthermore, we actually have multiple "filesystem proxy" implementations in TensorBoard today:
Right now we have fallback from 2 to 3, but the choice of 1 vs 2&3 is mutually exclusive. We should fix this so TF users aren't artificially limited to only the TF filesystems, and can benefit from the fsspec support. The end state we want is that individual I/O calls can be dispatched across all possible proxies. This could be thought of at a 4th filesystem proxy in its own right, the "automatic" one, that supports the union of all the various concrete filesystems/schemes supported by TF, fsspec, and our built-in support. If users don't actually want automatic dispatch for some reason, we can provide a flag to specify that only a single proxy should be used (e.g. just TF). For the automatic dispatch we'd basically have a flow roughly like this pseudocode: scheme = ... # any prefix up to "://" - don't just use ":" to avoid confusion w/ Windows paths
options = []
def can_import(module_name, pip_name=None):
pip_name = pip_name or module_name
try:
__import__(package)
return True
except ImportError:
options.append("pip install " + pip_name)
return False
if can_import("tensorflow"):
if scheme in tf.io.gfile.get_registered_schemes():
return tf_proxy
if can_import("tensorflow_io", "tensorflow-io"):
if scheme in tf.io.gfile.get_registered_schemes():
return tf_proxy
if scheme in BUILT_IN_SUPPORTED_SCHEMES:
return builtin_proxy
if can_import("fsspec"):
try:
fsspec.get_filesystem_class(scheme)
return fsspec_proxy
except ImportError as e:
options.append(str(e))
except ValueError as e:
# Unrecognized scheme, fall through
pass
raise UnsupportedScheme(options) |
This is a generic tracking issue for improvements to TensorBoard's Python filesystem abstraction. (This doesn't impact Rustboard, which today should be the read path for regular users, but there are still users who have to use the Python read path because they can't use Rustboard, and the write path for TensorBoard's summary APIs to write without TensorFlow is still entirely in Python.)
Currently, TensorBoard's Python code relies on the TensorFlow filesystem API via
tf.io.gfile
, which has built-in support for a number of protocols besides local disk. When TensorFlow is not available, we fall back to a stub implementation of thetf.io.gfile
API intensorboard/compat/tensorflow_stub/io/gfile.py
, which today only supports local disk and S3. The stub implementation achieves some reuse between filesystem implementations by having an intermediate set of methods that both implementations expose, but this abstraction A) isn't even defined other than by virtue of common methods, i.e. there's no abstract base class or spec, and B) has a number of issues.Here are a few of the issues with that abstraction:
walk()
method for filesystems, just a function, as pointed out in gfile: add support for fsspec filesystems #5248 (comment), so filesystems aren't able to take advantage of fast paths they might supportWe already have a goal (bottom of #3666) to ultimately migrate away from the stub approach and instead properly inject a filesystem abstraction into places that do I/O. As part of that, we should consider revisiting this abstraction and formalizing it properly. One possibility is to just adopt the existing
fsspec
abstraction, since it's a whole project around this idea: https://filesystem-spec.readthedocs.io/en/latest/If we did that, we'd likely still want to be able to use the TF filesystem APIs when available (so that existing TensorBoard users who also have TensorFlow and thus benefit from built-in support for cloud filesystems don't experience a regression), so one open question would be determining whether it's possible to shoehorn the TF filesystem support into this setup.
The text was updated successfully, but these errors were encountered: