A TensorFlow Hub module is imported into a TensorFlow program by
creating a Module object from a string with its URL or filesystem path,
such as:
m = hub.Module("path/to/a/module_dir")
This adds the module's variables to the current TensorFlow graph. Running their initializers will read their pre-trained values from disk. Likewise, tables and other state are added to the graph.
When creating a module from a URL, the module content is downloaded
and cached in the local system temporary directory. The location where
modules are cached can be overridden using the TFHUB_CACHE_DIR
environment variable.
For example, setting TFHUB_CACHE_DIR to /my_module_cache:
$ export TFHUB_CACHE_DIR=/my_module_cache
and then creating a module from a URL:
m = hub.Module("https://storage.googleapis.com/tfhub-test-modules/google/half-plus-two/1.tar.gz")
results in the module being downloaded and unpacked into
/my_module_cache.
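The override rule described above can be sketched in plain Python. This is only an illustration of the lookup order, not the library's actual code: the helper name resolve_cache_dir and the fallback subdirectory name "tfhub_modules" are assumptions made for this example.

```python
import os
import tempfile


def resolve_cache_dir(environ=None):
    """Return the directory a URL-based module would be cached in (sketch)."""
    env = os.environ if environ is None else environ
    override = env.get("TFHUB_CACHE_DIR")
    if override:
        # An explicit setting takes precedence over the temporary directory.
        return override
    # Assumption: the subdirectory name below is illustrative only.
    return os.path.join(tempfile.gettempdir(), "tfhub_modules")


print(resolve_cache_dir({"TFHUB_CACHE_DIR": "/my_module_cache"}))
print(resolve_cache_dir({}))
```

Passing a dict instead of reading os.environ directly keeps the sketch easy to try without changing the real environment.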
Once instantiated, a module m can be called zero or more times like a
Python function from tensor inputs to tensor outputs:
y = m(x)
Each such call adds operations to the current TensorFlow graph to compute
y from x. If this involves variables with trained weights, these are
shared between all applications.
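Putting the steps so far together, a minimal sketch of creating a module, applying it twice, and running the result might look like the following. It assumes a TensorFlow installation that provides tensorflow.compat.v1, the tensorflow_hub package, and a module whose "default" signature accepts a single float32 tensor; the function name apply_module_twice and its parameters are hypothetical.

```python
def apply_module_twice(handle, values):
    """Apply one module to two inputs in a fresh graph and return both results."""
    # Imported lazily so that defining this sketch does not itself need TensorFlow.
    import tensorflow.compat.v1 as tf
    import tensorflow_hub as hub

    with tf.Graph().as_default():
        # Creating the Module adds its variables and other state to the graph.
        m = hub.Module(handle)
        x = tf.constant(values, dtype=tf.float32)
        y1 = m(x)        # first application
        y2 = m(2.0 * x)  # second application, sharing the same variables
        with tf.Session() as sess:
            # Running the initializers reads pre-trained values and sets up tables.
            sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
            return sess.run([y1, y2])
```

Both calls add their own operations to the graph, but only one copy of the module's weights exists.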
Modules can define multiple named signatures so that they can be applied
in more than one way. A module's documentation should describe the available
signatures. The call above applies the signature named "default". Other
signature names can be specified with the optional signature= argument.
If a signature has multiple inputs, they must be passed as a dict,
with the keys defined by the signature. Likewise, if a signature has
multiple outputs, these can be retrieved as a dict by passing as_dict=True,
under the keys defined by the signature. (The key "default" is for the
single output returned if as_dict=False.)
So the most general form of applying a Module looks like:
outputs = m(dict(apples=x1, oranges=x2), signature="my_method", as_dict=True)
y1 = outputs["cats"]
y2 = outputs["dogs"]
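To make the calling convention concrete without requiring TensorFlow, here is a toy stand-in written in plain Python. ToyModule is hypothetical and models only how signature= and as_dict= select inputs and outputs; it is not how the library implements modules, and its error handling is an assumption of this sketch.

```python
class ToyModule:
    """Hypothetical stand-in that mimics only the calling convention."""

    def __init__(self, signatures):
        # Maps signature name -> (tuple of input keys, function returning a dict).
        self._signatures = signatures

    def __call__(self, inputs, signature="default", as_dict=False):
        input_keys, fn = self._signatures[signature]
        if not isinstance(inputs, dict):
            # A bare value is accepted only for single-input signatures.
            if len(input_keys) != 1:
                raise TypeError("signature %r needs a dict with keys %s"
                                % (signature, sorted(input_keys)))
            inputs = {input_keys[0]: inputs}
        if set(inputs) != set(input_keys):
            raise TypeError("signature %r expects exactly the keys %s"
                            % (signature, sorted(input_keys)))
        outputs = fn(**inputs)
        # With as_dict=False, only the output stored under "default" is returned.
        return outputs if as_dict else outputs["default"]


m = ToyModule({
    "default": (("x",), lambda x: {"default": x + 1}),
    "my_method": (("apples", "oranges"),
                  lambda apples, oranges: {"cats": apples + oranges,
                                           "dogs": apples * oranges}),
})

y = m(3)
outputs = m(dict(apples=2, oranges=5), signature="my_method", as_dict=True)
print(y, outputs["cats"], outputs["dogs"])
```

Integers stand in for tensors here; the point is only the mapping from input keys to output keys.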
A caller must supply all inputs defined by a signature, but there is no
requirement to use all of a module's outputs.
TensorFlow will run only those parts of the module that end up
as dependencies of a target in tf.Session.run(). Indeed, module publishers may
choose to provide various outputs for advanced uses (like activations of
intermediate layers) along with the main outputs. Module consumers should
handle additional outputs gracefully.
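One way a consumer might handle additional outputs gracefully is to select only the keys it needs and ignore the rest. The sketch below uses plain Python values in place of tensors; the helper pick_outputs and the key "intermediate_layer" are made up for illustration.

```python
def pick_outputs(outputs, wanted):
    """Return only the wanted entries, tolerating any extra keys in outputs."""
    missing = [key for key in wanted if key not in outputs]
    if missing:
        raise KeyError("module did not provide outputs: %s" % missing)
    return {key: outputs[key] for key in wanted}


# A publisher may add outputs later; the consumer below is unaffected by them.
outputs = {"default": [0.1, 0.9], "intermediate_layer": [1.0, 2.0, 3.0]}
selected = pick_outputs(outputs, ["default"])
print(selected)
```

Because the unused outputs never become targets of tf.Session.run(), ignoring them this way costs nothing at run time.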