Conversation

@vishalbollu vishalbollu commented Apr 25, 2019

Closes #56

Checklist:

  • Run make test and make lint
  • Test end to end manually (e.g. build/push all images, restart local operator, and run cx refresh in an example folder)
  • Update documentation
  • Update examples and cx init
  • Alert team if dev environment changed
  • Cherry-pick into release branches if it's a bugfix
  • Delete the branch once it's merged

throttle_secs: <int> # do not re-evaluate unless the last evaluation was started at least this many seconds ago (default: 600)

compute:
compute: # Resources for training and evaluations steps
Member

Maybe mention that it's the TensorFlow job? For example, "# Resources for training and evaluations steps (TensorFlow)", and similarly "# Resources for constructing training dataset (Spark)". We should get Omer's stamp of approval on this.

Contributor Author

@vishalbollu vishalbollu Apr 26, 2019

@ospillinger thoughts on the doc comments?

Member

LGTM

UserString string
}

func MustNewQuantity(str string) Quantity {
Member

Is this used?
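
For context on the question above, a "Must" constructor in Go usually wraps an error-returning constructor and panics on failure, so it only earns its place if something (typically package-level defaults or tests) calls it. The sketch below is a minimal illustration of that pattern, not the code in this PR: only the UserString field and the MustNewQuantity signature appear in the diff, while NewQuantity, the MilliValue field, and the parsing rules are assumptions made up for the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Quantity is a hypothetical stand-in for the type under review. Only the
// UserString field is visible in the diff; MilliValue is an assumption.
type Quantity struct {
	MilliValue int64  // parsed value in thousandths (assumed representation)
	UserString string // the string exactly as the user wrote it
}

// NewQuantity (hypothetical) parses strings such as "2" or "500m" and
// returns an error for anything it does not understand.
func NewQuantity(str string) (Quantity, error) {
	s := strings.TrimSpace(str)
	scale := int64(1000) // a plain integer is whole units
	if strings.HasSuffix(s, "m") {
		s = strings.TrimSuffix(s, "m")
		scale = 1 // an "m" suffix already means thousandths
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil || n < 0 {
		return Quantity{}, fmt.Errorf("invalid quantity %q", str)
	}
	return Quantity{MilliValue: n * scale, UserString: str}, nil
}

// MustNewQuantity panics instead of returning an error, which suits
// hard-coded values that are known to be valid at compile time.
func MustNewQuantity(str string) Quantity {
	q, err := NewQuantity(str)
	if err != nil {
		panic(err)
	}
	return q
}

func main() {
	fmt.Println(MustNewQuantity("500m").MilliValue)
	fmt.Println(MustNewQuantity("2").MilliValue)
}
```

If nothing in the codebase calls the Must variant, dropping it and keeping only the error-returning constructor would be the simpler choice.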

}
trainingDatasets = append(trainingDatasets, modelName)
trainingDatasetIDs.Add(dataset.GetID())
dependencyIDs := ctx.AllComputedResourceDependencies(dataset.GetID())
Member

I think we still need to append the transformedColumn.Computes, since transforming the data happens in the same step as preparing the dataset.
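
To make the point above concrete: if column transformation runs inside the same Spark job that builds the training dataset, the job's resource request has to cover the transformed columns' compute settings as well as the dataset's own. The sketch below assumes the requests are combined by taking a field-wise maximum; the conversation does not show how the real code combines them, and SparkCompute, maxCompute, and trainingDatasetCompute are hypothetical names with simplified fields.

```go
package main

import "fmt"

// SparkCompute is a hypothetical, simplified resource request for a Spark
// job. The real type in the PR is not shown in this conversation.
type SparkCompute struct {
	Executors   int   // number of executors requested
	ExecutorMem int64 // memory per executor, in bytes
}

// maxCompute returns the field-wise maximum of the given requests, so the
// combined job is sized for its most demanding part (assumed rule).
func maxCompute(computes ...SparkCompute) SparkCompute {
	var out SparkCompute
	for _, c := range computes {
		if c.Executors > out.Executors {
			out.Executors = c.Executors
		}
		if c.ExecutorMem > out.ExecutorMem {
			out.ExecutorMem = c.ExecutorMem
		}
	}
	return out
}

// trainingDatasetCompute includes the transformed columns' requests because,
// per the review comment, transformation happens in the dataset step.
func trainingDatasetCompute(dataset SparkCompute, transformedColumns []SparkCompute) SparkCompute {
	all := append([]SparkCompute{dataset}, transformedColumns...)
	return maxCompute(all...)
}

func main() {
	dataset := SparkCompute{Executors: 1, ExecutorMem: 512}
	columns := []SparkCompute{
		{Executors: 2, ExecutorMem: 256},
		{Executors: 1, ExecutorMem: 1024},
	}
	fmt.Printf("%+v\n", trainingDatasetCompute(dataset, columns))
}
```

Leaving the column requests out would size the job from the dataset's settings alone, which is the kind of under-allocation the linked issue describes.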

@vishalbollu vishalbollu merged commit b4b1319 into master Apr 26, 2019
@vishalbollu vishalbollu deleted the training-dataset-resource-bug branch April 26, 2019 23:18
Development

Successfully merging this pull request may close these issues.

Resources not allocated to Spark workloads to generate training datasets
