Add link to ML Services in docs
seddonm1 committed Jul 18, 2019
1 parent 079a83c commit e6a796c
Showing 4 changed files with 52 additions and 0 deletions.
21 changes: 21 additions & 0 deletions docs-src/content/patterns/index.md
@@ -303,6 +303,27 @@ FROM (
) valid
```

## Machine Learning Model as a Service

To see an example of how to host a simple model as a service (in this case [resnet50](https://www.kaggle.com/keras/resnet50)), see:<br>
https://github.com/tripl-ai/arc/tree/master/src/it/resources/flask_serving
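
The essentials of such a service can be sketched with only the Python standard library (the linked example uses Flask and a real resnet50 model; the `predict` function below is a hypothetical stand-in):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(text: str) -> dict:
    # Hypothetical stand-in model: scores by input length (illustrative only).
    return {"prediction": len(text) % 2, "probability": 0.9}

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw request body and return the model output as JSON.
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length).decode("utf-8")
        body = json.dumps(predict(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet

# To serve on port 5000:
# HTTPServer(("0.0.0.0", 5000), PredictHandler).serve_forever()
```

A client then `POST`s a payload to the service and receives a JSON prediction back.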

To see how to host a model with [TensorFlow Serving](https://www.tensorflow.org/serving/), see:<br>
https://github.com/tripl-ai/arc/tree/master/src/it/resources/tensorflow_serving

To scale these services without managed infrastructure, you can use [Docker Swarm](https://docs.docker.com/engine/swarm/), which includes a basic load balancer to distribute load across many (`--replicas n`) single-threaded services:

```bash
# initialise a swarm and start the service with two replicas
docker swarm init && \
docker service create --replicas 2 --publish 5000:5000 flask_serving/simple:latest
```

```bash
# to stop docker swarm
docker swarm leave --force
```

## Machine Learning Prediction Thresholds

When used for classification, the [MLTransform](../transform/#mltransform) stage will add a `probability` column which exposes the highest probability score from the Spark ML probability vector which led to the predicted value. This can then be used as a boundary to prevent low probability predictions being sent to other systems if, for example, a change in input data resulted in a major change in predictions.
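
As a sketch of how such a boundary might be applied downstream (hypothetical row shape and threshold; in Arc this would typically be a SQL filter rather than Python):

```python
def filter_by_probability(rows, threshold=0.8):
    # Keep only rows whose appended `probability` column meets the threshold;
    # low-confidence predictions are dropped before reaching other systems.
    return [row for row in rows if row["probability"] >= threshold]
```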
6 changes: 6 additions & 0 deletions docs-src/content/transform/index.md
@@ -113,6 +113,9 @@ The `GraphTransform` stage takes either a list of views of graph nodes and views

The `HTTPTransform` stage transforms the incoming dataset by `POST`ing the value of the `value` column (which must be of type `string` or `bytes`) to an external API and appending the response body as a `body` column.

A good use case for the `HTTPTransform` stage is calling an external [RESTful](https://en.wikipedia.org/wiki/Representational_state_transfer) machine learning model service. For an example of how to host a simple model as a service, see:<br>
https://github.com/tripl-ai/arc/tree/master/src/it/resources/flask_serving
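
Conceptually, the stage's per-row behaviour reduces to posting each `value` and appending the response; a plain-Python sketch (not Arc's implementation — `post` stands in for the HTTP call):

```python
def http_transform(rows, post):
    # `post` is any callable taking the row's `value` string and
    # returning the service's response body.
    return [{**row, "body": post(row["value"])} for row in rows]
```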

### Parameters

| Attribute | Type | Required | Description |
@@ -336,6 +339,9 @@ This means this API is likely to change.

The `TensorFlowServingTransform` stage transforms the incoming dataset by calling a [TensorFlow Serving](https://www.tensorflow.org/serving/) service. Because each call is atomic, the TensorFlow Serving instances can be placed behind a load balancer to increase throughput.

To see how to host a simple model in [TensorFlow Serving](https://www.tensorflow.org/serving/), see:<br>
https://github.com/tripl-ai/arc/tree/master/src/it/resources/tensorflow_serving
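
TensorFlow Serving's REST API exposes a predict endpoint of the form `/v1/models/<name>:predict` (8501 is its default REST port); a small sketch that builds such a request, with illustrative host and model names:

```python
import json

def tf_serving_request(instances, model="simple", host="localhost", port=8501):
    # Build the URL and JSON body for TensorFlow Serving's REST predict API.
    # Host, port, and model name here are illustrative defaults.
    url = f"http://{host}:{port}/v1/models/{model}:predict"
    body = json.dumps({"instances": instances})
    return url, body
```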

### Parameters

| Attribute | Type | Required | Description |
19 changes: 19 additions & 0 deletions docs/patterns/index.html
@@ -897,6 +897,25 @@ <h4 id="trailer">trailer</h4>
) valid
</code></pre>

<h2 id="machine-learning-model-as-a-service">Machine Learning Model as a Service</h2>

<p>To see an example of how to host a simple model as a service (in this case <a href="https://www.kaggle.com/keras/resnet50">resnet50</a>), see:<br>
<a href="https://github.com/tripl-ai/arc/tree/master/src/it/resources/flask_serving">https://github.com/tripl-ai/arc/tree/master/src/it/resources/flask_serving</a></p>

<p>To see how to host a model with <a href="https://www.tensorflow.org/serving/">TensorFlow Serving</a>, see:<br>
<a href="https://github.com/tripl-ai/arc/tree/master/src/it/resources/tensorflow_serving">https://github.com/tripl-ai/arc/tree/master/src/it/resources/tensorflow_serving</a></p>

<p>To scale these services without managed infrastructure, you can use <a href="https://docs.docker.com/engine/swarm/">Docker Swarm</a>, which includes a basic load balancer to distribute load across many (<code>--replicas n</code>) single-threaded services:</p>

<pre><code class="language-bash"># initialise a swarm and start the service with two replicas
docker swarm init &amp;&amp; \
docker service create --replicas 2 --publish 5000:5000 flask_serving/simple:latest
</code></pre>

<pre><code class="language-bash"># to stop docker swarm
docker swarm leave --force
</code></pre>

<h2 id="machine-learning-prediction-thresholds">Machine Learning Prediction Thresholds</h2>

<p>When used for classification, the <a href="../transform/#mltransform">MLTransform</a> stage will add a <code>probability</code> column which exposes the highest probability score from the Spark ML probability vector which led to the predicted value. This can then be used as a boundary to prevent low probability predictions being sent to other systems if, for example, a change in input data resulted in a major change in predictions.</p>
6 changes: 6 additions & 0 deletions docs/transform/index.html
@@ -827,6 +827,9 @@ <h5 id="since-1-0-9-supports-streaming-true">Since: 1.0.9 - Supports Streaming:

<p>The <code>HTTPTransform</code> stage transforms the incoming dataset by <code>POST</code>ing the value of the <code>value</code> column (which must be of type <code>string</code> or <code>bytes</code>) to an external API and appending the response body as a <code>body</code> column.</p>

<p>A good use case for the <code>HTTPTransform</code> stage is calling an external <a href="https://en.wikipedia.org/wiki/Representational_state_transfer">RESTful</a> machine learning model service. For an example of how to host a simple model as a service, see:<br>
<a href="https://github.com/tripl-ai/arc/tree/master/src/it/resources/flask_serving">https://github.com/tripl-ai/arc/tree/master/src/it/resources/flask_serving</a></p>

<h3 id="parameters-3">Parameters</h3>

<table>
@@ -1606,6 +1609,9 @@ <h5 id="since-1-0-0-supports-streaming-true-3">Since: 1.0.0 - Supports Streaming

<p>The <code>TensorFlowServingTransform</code> stage transforms the incoming dataset by calling a <a href="https://www.tensorflow.org/serving/">TensorFlow Serving</a> service. Because each call is atomic, the TensorFlow Serving instances can be placed behind a load balancer to increase throughput.</p>

<p>To see how to host a simple model in <a href="https://www.tensorflow.org/serving/">TensorFlow Serving</a>, see:<br>
<a href="https://github.com/tripl-ai/arc/tree/master/src/it/resources/tensorflow_serving">https://github.com/tripl-ai/arc/tree/master/src/it/resources/tensorflow_serving</a></p>

<h3 id="parameters-8">Parameters</h3>

<table>