Bluegreen #7

blublinsky · 2019-08-22T15:36:49Z

Extension adding canary testing and speculative model serving

deanwampler

I think there is too much that needs changing, from minor things (camel-case names) to larger questions about what should be configurable and how best to configure it (percentage splitting logic). So I'm not going to merge this branch until after the Pipelines team reviews the current master branch.

deanwampler · 2019-08-29T20:58:09Z

model-serving/src/main/scala/com/lightbend/modelserving/model/actor/DataSplittingActor.scala

+ * Actor that handles messages to update a splitting policy and to split input according to the percentages.
+ * @param label used for identifying the app, e.g., as part of a file name for persistence of the split policy.
+ */
+class DataSplittingActor(


Why was this done with an actor when FlowOps.groupBy would be sufficient? https://doc.akka.io/api/akka/current/akka/stream/scaladsl/Source.html#groupBy[K](maxSubstreams:Int,f:Out=%3EK):akka.stream.scaladsl.SubFlow[Out,Mat,FlowOps.this.Repr,FlowOps.this.Closed]

I'm following the same paradigm that I used before: I use actors wherever there is stateful execution. In this case the state is the configuration.
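For reference, the percentage-splitting decision itself can be written as a pure function, independent of whether an actor or `groupBy` carries it. This is a minimal sketch and not code from this PR: `PercentageSplit`, `pickOutput`, and the assumption that the percentages are integers summing to 100 are all hypothetical.

```scala
// Hypothetical helper, not part of the PR: choose an output index from percentage weights.
object PercentageSplit {

  /** Returns the index of the bucket that `roll` falls into.
    * Assumes `percentages` are non-negative, sum to 100, and 0 <= roll < 100.
    */
  def pickOutput(percentages: Seq[Int], roll: Int): Int = {
    require(percentages.nonEmpty && percentages.forall(_ >= 0), "percentages must be non-empty and non-negative")
    require(percentages.sum == 100, "percentages must sum to 100")
    require(roll >= 0 && roll < 100, "roll must be in [0, 100)")
    // Running upper bounds, e.g. Seq(80, 20) becomes Seq(80, 100).
    val upperBounds = percentages.scanLeft(0)(_ + _).tail
    // The first bucket whose upper bound exceeds the roll wins.
    upperBounds.indexWhere(roll < _)
  }
}
```

With a random `roll` per record, the returned index could serve as the key function for the `groupBy` suggested above, or as the routing decision inside the actor.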

deanwampler · 2019-08-29T21:02:14Z

...ain/scala/com/lightbend/modelserving/model/actor/SpeculativeModelServingCollectorActor.scala

+  override def receive: PartialFunction[Any, Unit] = {
+
+    // Process a new merger definition
+    case speculativeStreamMerger: SpeculativeStreamMerger ⇒


Why is SpeculativeStreamMerger defined with Avro and all the overhead that goes with managing it as "state"? Isn't all you really need just configuration properties that are loaded at startup?

To support dynamic configuration?

deanwampler · 2019-08-29T21:15:32Z

model-serving/src/main/scala/com/lightbend/modelserving/model/actor/DataSplittingActor.scala

+
+  protected val filePersistence = FilePersistence.apply(null)
+
+  protected var currentTransformer: Option[StreamSplitter] = None


It appears that StreamSplitter is streamed in. That adds a lot of code to support runtime configurability and hence I'm not so sure it's worth it. Isn't reading the values at the beginning from the configuration sufficient?

What happens if I don't update the values for weeks? Does Akka Streams have a timeout that assumes the stream died? If I really wanted the percentages to be configurable, wouldn't it be simpler to have a web service in the streamlet where I can send commands to update the values?

The idea was to make everything configurable at runtime, to avoid restarting.

deanwampler · 2019-08-29T21:17:53Z

model-serving/src/main/scala/com/lightbend/modelserving/model/h2o/H2OModel.scala

    val probs = prediction.classProbabilities
-    val probability = if (probs.length == 2) probs(1) else 0.0
-    (prediction.label, probability)
+    try {


I'm not sure how an exception can be thrown in this case, but if BinomialModelPrediction can throw one here, what does that mean? Should we log an error or something rather than silently return a default value?

If prediction is null, then it barfs. I did not trace it far enough to see how this can happen, but I did see it.
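A minimal sketch of handling the null case explicitly instead of catching an exception and silently returning a default. The `Prediction` case class is a hypothetical stand-in that models only the two fields visible in the diff; it is not the H2O type, and `SafeProbability.extract` is not an existing function in this code base.

```scala
object SafeProbability {

  // Hypothetical stand-in for the H2O prediction; only the fields used in the diff are modeled.
  final case class Prediction(label: String, classProbabilities: Array[Double])

  /** Extracts (label, probability). A null prediction is reported as a Left
    * so the caller can log it, rather than being hidden behind a default value.
    */
  def extract(prediction: Prediction): Either[String, (String, Double)] =
    Option(prediction) match {
      case None =>
        Left("prediction was null")
      case Some(p) =>
        val probs = Option(p.classProbabilities).getOrElse(Array.empty[Double])
        // Same rule as the original code: use the second class probability for binomial models.
        val probability = if (probs.length == 2) probs(1) else 0.0
        Right((p.label, probability))
    }
}
```

The caller can then decide whether a `Left` should be logged, counted, or turned into an error record.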

deanwampler · 2019-08-29T21:21:17Z

model-serving/src/main/scala/com/lightbend/modelserving/model/persistence/FilePersistence.scala


  /**
-   * Save the state to a file system.
+   * Restore the state of a splitter from a file system. Use [[stateExists]] first to determine


FilePersistence is meant to be opaque to model details. Please don't add this logic here.

deanwampler · 2019-08-29T21:55:26Z

...c/main/scala/pipelines/examples/modelserving/speculative/SpeculativeWineModelCollector.scala

+
+    protected def startFlow =
+      FlowWithPipelinesContext[StartSpeculative].mapAsync(1) {
+        descriptor ⇒ splieeterCollector.ask(descriptor).mapTo[Done]


deanwampler · 2019-08-29T21:55:36Z

...c/main/scala/pipelines/examples/modelserving/speculative/SpeculativeWineModelCollector.scala

+
+    def makeSource(frequency: FiniteDuration = 5.millisecond): Source[Long, NotUsed] =
+      Source.repeat(1L)
+        .throttle(1, frequency)


Magic numbers...
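One way to address this, sketched with hypothetical constant names (`CollectorDefaults`, `TickInterval`, and `ElementsPerTick` do not exist in the PR; the values are copied from the literals in the diff):

```scala
import scala.concurrent.duration._

object CollectorDefaults {
  // Hypothetical names, not from the PR; values mirror the literals in the diff.
  val TickInterval: FiniteDuration = 5.milliseconds // was the default argument 5.millisecond
  val ElementsPerTick: Int = 1                      // was the literal in throttle(1, frequency)
}
```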

deanwampler · 2019-08-29T21:56:48Z

...culative/src/main/scala/pipelines/examples/modelserving/speculative/WineModelIngressRR.scala

@@ -0,0 +1,68 @@
+package pipelines.examples.modelserving.speculative


I saw the same code above? Copy-paste error?

They are in different modules, so yes

deanwampler · 2019-08-29T21:58:00Z

...eculative/src/main/scala/pipelines/examples/modelserving/speculative/model/WineDecider.scala

+  override def decideResult(results: List[WineResult]): WineResult = {
+
+    results.size match {
+      case 0 ⇒ // No results, can only happen if we timed out


This is why your Decider trait must return an Either[String,...], with the string holding an error message. I don't care that this is example code here; the Decider is intended for the library.

Sorry, what's wrong with this?

You are returning a bogus record. If I wrote a production application with this logic, would it be acceptable to return a bogus, made-up record and also not log an error? I don't think it would be acceptable. That's why this function needs to return Either and the error case needs to be handled higher in the stack.

Wait, the decider is doing the right thing. If there are no results, the only thing it can do is return an error saying that everything timed out, which it does. If all results are errors, that error is returned; only if we have at least one result does it calculate a result.

There is no higher level thing - the decider is the master
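To make the two positions concrete, here is a minimal sketch of a decider whose signature returns `Either[String, R]`. Everything in it is hypothetical: `Result` stands in for the Avro-generated `WineResult`, and averaging the successful results is an assumed decision rule, not the logic in this PR.

```scala
object DeciderSketch {

  // Hypothetical stand-in for WineResult; the real record is Avro-generated.
  final case class Result(quality: Double, error: Option[String] = None)

  trait Decider[R] {
    /** Left carries an error message; Right carries the chosen result. */
    def decideResult(results: List[R]): Either[String, R]
  }

  // Assumed rule for illustration: average the quality of all successful results.
  object AveragingDecider extends Decider[Result] {
    def decideResult(results: List[Result]): Either[String, Result] = {
      val good = results.filter(_.error.isEmpty)
      if (results.isEmpty) Left("no results: all models timed out")
      else if (good.isEmpty) Left(results.flatMap(_.error).mkString("; "))
      else Right(Result(good.map(_.quality).sum / good.size))
    }
  }
}
```

With this shape the timeout and all-errors cases are visible in the type, so whatever calls the decider has to choose explicitly between logging, dropping, or emitting an error record.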

deanwampler · 2019-08-29T21:59:10Z

.../scala/pipelines/examples/modelserving/speculative/model/WineSpeculativeRecordSplitter.scala

+import pipelines.examples.modelserving.winequality.data.WineResult
+import pipelines.examples.modelserving.winequality.speculative.WineResultRun
+
+class WineSpeculativeRecordSplitter extends SpeculativeRecordSplitter[WineResult] {


Now I really don't understand what this *Splitter abstraction is for. This code seems to be boilerplate that doesn't do anything.

Oh, we do need to split the record into UUID and base

It's a generic implementation that splits a record with a UUID into the UUID and the record.

Why introduce a whole class to do such a trivial operation? It's not even very type safe (the source argument).

This goes back to inheritance in Avro. What is a generic way to split a record before knowing its type? Dealing with Avro is a bitch.

skonto · 2019-10-03T02:09:05Z

@deanwampler @blublinsky I suggest that big PRs (if not broken down into smaller ones) have: a) a design doc b) test coverage c) proper docs for the novice user, like Seldon has (https://docs.seldon.io/projects/seldon-core/en/latest/examples/istio.html) d) a proper description

Going through seldon and other stuff... I am curious why we re-invent the wheel at the application layer while people try to solve this at the K8s layer? kubeflow/kubeflow#667

blublinsky added 12 commits August 9, 2019 11:34

Started bluegreen implementation - WIP

a37add5

build passes - WIP

8b3f940

added speculative - WIP

2a1a839

added speculative - WIP

3777f19

added speculative - WIP

7744d1a

fixed SBT - WIP

9cb5dfe

updated Readme - WIP

2452dd4

code cleanup - WIP

9547ec8

More bug fixes

ccc3949

More bug fixes

ddd7767

Fixed sinks

3104074

Fixed Influx

4db3a7b

deanwampler suggested changes Aug 29, 2019

blublinsky added 9 commits September 6, 2019 16:13

Cleaned up persistence

bc2ed1c

Add disk save

5e3d2c9

AWS Egress

e38e8b0

Add zipping/unzipping of TF Bundled model

973e739

Add zipping/unzipping of TF Bundled model

2e00bfb

Add Flow for S3

e84ea04

Add Flow for S3

130f009

Add Flow for S3

50834c6

Add Flow for S3

3551500

blublinsky added 7 commits October 7, 2019 13:07

Updated to the latest version of pipeline library

4098fb9

Massive update to cloudflow 1.2

51aaece

Massive update to cloudflow 1.2

92594bc

Refactored mains to tests

688e424

Some error fixes

0b7ac7a

Updated to 1.2.1

030cf1b

Updated to 1.2.2

2c814c5

blublinsky added 3 commits November 8, 2019 13:38

Updated to 1.2.2

53a9ede

More cleanup

0edf5d6

More cleanup

37f3efc

