Document how to run inference for TF2 models.
Context: tensorflow#479
PiperOrigin-RevId: 290569934
TensorFlow Hub Authors authored and vbardiovskyg committed Jan 20, 2020
1 parent 67a5e78 commit 7e217af
Showing 1 changed file with 27 additions and 3 deletions: docs/common_issues.md
@@ -61,14 +61,38 @@ $ python

## Running inference on a pre-initialized module

If you are applying a module over data multiple times (e.g., to serve user
requests), you should use TensorFlow Session.run to avoid the overhead of
constructing and initializing parts of the graph multiple times.
If you are writing a Python program that applies a module many times on input
data, you can apply the following recipes. (Note: For serving requests in
production services, consider go/servo or other scalable, Python-free
solutions.)

Assuming your use case is **initialization** followed by subsequent
**requests** (for example, with Django, Flask, a custom HTTP server, etc.),
you can set up the serving as follows:

### TF2 SavedModels

*   In the initialization part:
    *   Load the TF2 SavedModel.

```python
import tensorflow_hub as hub

embedding_fn = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
```

*   In the request part:
    *   Use the embedding function to run inference.

```python
embedding_fn(["Hello world"])
```

This call invokes a tf.function that is optimized for performance; see the
[tf.function guide](https://www.tensorflow.org/guide/function).
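
For instance, a minimal Flask sketch could wire the two parts together as
shown below. (This is an illustrative example, not part of the documented
API; the app structure and route name are hypothetical, and Flask is assumed
to be installed.)

```python
import tensorflow_hub as hub
from flask import Flask, jsonify, request

app = Flask(__name__)

# Initialization part: load the SavedModel once, at server startup.
embedding_fn = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

@app.route("/embed", methods=["POST"])
def embed():
    # Request part: reuse the already-loaded module on every request.
    sentences = request.get_json()["sentences"]
    embeddings = embedding_fn(sentences)
    return jsonify(embeddings.numpy().tolist())

if __name__ == "__main__":
    app.run()
```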

### TF1 Hub modules

*   In the initialization part:
    *   Build the graph with a **placeholder**, the entry point into the
        graph.
    *   Initialize the session.
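
A sketch of this pattern, assuming the standard TF1 tensorflow_hub API (the
module handle is illustrative), might look like:

```python
import tensorflow as tf
import tensorflow_hub as hub

# Initialization part: build the graph once and start a session.
graph = tf.Graph()
with graph.as_default():
    # The placeholder is the entry point for feeding request data.
    text_input = tf.placeholder(dtype=tf.string, shape=[None])
    embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/2")
    embeddings = embed(text_input)
    init_op = tf.group(tf.global_variables_initializer(),
                       tf.tables_initializer())
graph.finalize()

session = tf.Session(graph=graph)
session.run(init_op)

# Request part: only Session.run is executed per request.
result = session.run(embeddings, feed_dict={text_input: ["Hello world"]})
```

Keeping the graph and session alive across requests avoids re-running the
expensive graph construction and initializers for every request.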
