Improve TFX Taxi Sample and Components. #512

qimingj · 2018-12-11T06:47:05Z

Confusion matrix and ROC components support flexible target.
Use single mode instead of 4 different modes for TFDV, TFMA, TFT, Predicton.
TFDV output's schema path directly.
Add Confusion Matrix and ROC components to tf taxi sample.

This change is

hongye-sun · 2018-12-11T17:36:24Z

components/local/confusion_matrix/src/confusion_matrix.py

+  if args.target_lambda:
+    df['target'] = df.apply(eval(args.target_lambda), axis=1)
+
+  # Convert "True" to "True_" and "False" to "False_" for frontend to work.


This looks very hacky. Could you add a bug ID here to explain why it's needed?

We are still investigating. #446. Will add that to comments

hongye-sun · 2018-12-11T17:41:31Z

components/local/roc/src/roc.py

+  parser.add_argument('--target_lambda', type=str,
+                      help='a lambda function as a string to determine positive or negative.' +
+                           'For example, "lambda x: x[\'a\'] and x[\'b\']". If missing, ' +
+                           'trueclass must be set.')


and input must have a target column.

hongye-sun · 2018-12-11T18:09:10Z

samples/tfx/taxi-cab-classification-pipeline.py

+        '--output', '%s/{{workflow.name}}/roc' % output,
+        '--predictions', predictions,
+        '--target_lambda', """lambda x: 1 if (x['target'] > x['fare'] * 0.2) else 0""",
+        '--trueclass', 'true',


Is it still needed?

k8s-ci-robot · 2018-12-11T22:16:38Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…and E2E clusters (kubeflow#512) * cleanup_ci needs to use different expiration times for auto deployed and E2E clusters * The auto deployed clusters are now using unique names rather than being recycled and we rely on cleanup_ci.py to GC old auto-deployments (kubeflow#444) * To support this we need to use variable expiration times. * Deployments created by tests should expire as soon as the test is done (so 1-2 hours) * But auto-deployed clusters need to live longer * There are only refreshed periodically by a cron job (~8 hours) we don't want to delete the cluster before a new one is deployed because we need a cluster for the example tests * We want to leave the clusters up as long as we can to facilitate debugging by people working on example tests. Related to kubeflow#444 * Address reviewer comments.

* Fix typo in resources * Allow number of workers to be specified for Tornando * remove comment

…p CRs for the users (kubeflow#512) * add auto apply custom task CRs before pipelineRun creation * fix lint * store loop resource in ordered dict * Update kfp-admin-guide.md * sorted pipeline params to avoid randomness * sorted pipeline params to avoid randomness * regenerate tests * convert template string from yaml to json * regenerate tests * clean up unnecessary helper function * resolve merge conflicts * resolve merge conflicts * remove unnecessary go mod package * only opt-in to embed the recursion loop annotations * only opt-in to embed the recursion loop annotations * enable opt-in only via package variable * add example for how to inline recursive loops * resolve conflicts * regenerate tests * make auto generate and apply recursive pipelineloop in annotations as default * move parameter sorting to the end * add parameter sorting for recursion loops * sort recursive task parameters * add more details on how to opt-out auto apply custom resource

qimingj requested review from Ark-kun, gaoning777 and hongye-sun December 11, 2018 06:47

k8s-ci-robot added size/L labels Dec 11, 2018

hongye-sun approved these changes Dec 11, 2018

View reviewed changes

qimingj closed this Dec 11, 2018

qimingj force-pushed the master branch from 5bcf10c to fa1886c Compare December 11, 2018 22:16

k8s-ci-robot added approved size/XS and removed size/L labels Dec 11, 2018

magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this pull request Oct 22, 2023

Allow Tornando workers to be specified (kubeflow#512)

5896d33

* Fix typo in resources * Allow number of workers to be specified for Tornando * remove comment

HumairAK pushed a commit to red-hat-data-services/data-science-pipelines that referenced this pull request Mar 11, 2024

Fix merging issue from kubeflow#512 (kubeflow#674)

df63b78

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve TFX Taxi Sample and Components. #512

Improve TFX Taxi Sample and Components. #512

qimingj commented Dec 11, 2018 •

edited by jlewi

Loading

hongye-sun Dec 11, 2018

qimingj Dec 11, 2018

hongye-sun Dec 11, 2018

hongye-sun Dec 11, 2018

k8s-ci-robot commented Dec 11, 2018

Improve TFX Taxi Sample and Components. #512

Improve TFX Taxi Sample and Components. #512

Conversation

qimingj commented Dec 11, 2018 • edited by jlewi Loading

hongye-sun Dec 11, 2018

Choose a reason for hiding this comment

qimingj Dec 11, 2018

Choose a reason for hiding this comment

hongye-sun Dec 11, 2018

Choose a reason for hiding this comment

hongye-sun Dec 11, 2018

Choose a reason for hiding this comment

k8s-ci-robot commented Dec 11, 2018

qimingj commented Dec 11, 2018 •

edited by jlewi

Loading