
Conversation

@lindong28 (Contributor) commented Jan 28, 2019

This patch makes the following improvements. It depends on cl/230280956, which adds leading_indicators_test.py to OSS tensorflow/benchmarks:

Add leading_indicators_test.py to benchmarks/scripts and run it in PerfZero
Read benchmark results from the local protobuf file written by tf.test.Benchmark.report_benchmark() (see the sketch after this list)
Simplify the PerfZero report logic implementation
Rename environment variables from ROGUE_* to PERFZERO_*
Replace print(..) with logging.info(..) and logging.debug(..)
Print the benchmark summary in a human-readable format
Use a datetime string as the execution id and include it in the output path name
Specify the full list of environment variables with documentation in README.md
Configure the BigQuery table name via an environment variable
Upload the GPU driver version to BigQuery
Print messages to both stdout and the log file
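
For context, here is a minimal sketch of the protobuf-reading step. The function name and file layout are assumptions, not the actual PerfZero code; when the TEST_REPORT_FILE_PREFIX environment variable is set, tf.test.Benchmark.report_benchmark() serializes a BenchmarkEntries proto to a file under that prefix, and this is one way to read it back:

# A minimal sketch, not the actual PerfZero implementation. Assumes the
# benchmark ran with TEST_REPORT_FILE_PREFIX set, so that
# tf.test.Benchmark.report_benchmark() serialized a BenchmarkEntries
# proto to a file under that prefix.
from tensorflow.core.util import test_log_pb2


def read_benchmark_result(benchmark_result_file_path):
  with open(benchmark_result_file_path, 'rb') as f:
    benchmark_entries = test_log_pb2.BenchmarkEntries()
    benchmark_entries.ParseFromString(f.read())

  # report_benchmark() writes one entry per benchmark method.
  entry = benchmark_entries.entry[0]
  result = {
      'benchmark_name': entry.name,
      'iters': entry.iters,
      'wall_time_ms': int(entry.wall_time * 1000),  # wall_time is in seconds
  }
  # extras is a map<string, EntryValue>; each value holds either a
  # double_value or a string_value (e.g. loss, top_1_accuracy).
  for name, value in entry.extras.items():
    result[name] = value.string_value or value.double_value
  return result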
The benchmark summary printed by PerfZero looks like this:

{
  "system_info": {
    "platform_name": "workstation-z420",
    "cpu_socket_count": 1,
    "gpu_count": 2,
    "cpu_model": "Intel(R) Xeon(R) CPU @ 2.20GHz",
    "cpu_core_count": 8,
    "gpu_model": "Tesla V100-SXM2-16GB"
  },
  "execution_id": "2019-01-29-04-25-52-683031",
  "benchmark_info": {
    "output_url": "gs://tf-performance/test-results/2019-01-29-04-25-52-683031/",
    "date_time": "2019-01-29-04-26-45",
    "project_name": "perfzero-dev"
  },
  "benchmark_result": {
    "success": false,
    "benchmark_name": "Resnet50Benchmarks.benchmark_fake_1gpu_gpuparams",
    "wall_time_ms": 49606,
    "iters": 100,
    "loss": "5.864147",
    "top_5_accuracy": null,
    "top_1_accuracy": null
  },
  "ml_framework_info": {
    "framework": "tensorflow",
    "version": "1.13.0-dev20190128",
    "git_version": "v1.12.0-6820-g0f59cbd297",
    "build_type": "OTB",
    "channel": "NIGHTLY"
  }
}
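
For illustration, here is a hedged sketch of how the execution id and the human-readable printout above could be produced; the function name is an assumption, not the exact PerfZero code:

import datetime
import json
import logging

# A datetime string such as 2019-01-29-04-25-52-683031 serves as the
# execution id and as part of the output path name.
execution_id = datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S-%f')


def print_benchmark_summary(benchmark_summary):
  # json.dumps with indent yields the human-readable block shown above.
  logging.info('benchmark summary is %s',
               json.dumps(benchmark_summary, indent=2))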

The information above is translated to the old format and uploaded to BigQuery.

The PerfZero configuration is now documented in README.md.
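
As a hedged illustration (these exact variable names are assumptions; README.md holds the authoritative list of PERFZERO_* variables), the configuration might be read like this:

import os

# Hypothetical names for illustration only; see README.md for the real
# list of PERFZERO_* environment variables.
project_name = os.environ['PERFZERO_PROJECT_NAME']
platform_name = os.environ['PERFZERO_PLATFORM_NAME']
# The BigQuery table name is configurable rather than hard-coded.
bigquery_table_name = os.getenv('PERFZERO_BIGQUERY_TABLE_NAME',
                                'benchmark_results.result')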

@lindong28 requested a review from tfboyd on January 28, 2019 09:33
@lindong28 force-pushed the add-leading-indiciators-test branch from 9ecb008 to 183975b on January 28, 2019 10:00
@lindong28 force-pushed the add-leading-indiciators-test branch 5 times, most recently from 675bb9f to 561211d on January 29, 2019 05:15
@lindong28 changed the title from "Add leading_indicators_test.py to benchmarks/scripts and run it in Perfzero" to "Refactor PerfZero for simplicity and ease-of-use" on January 29, 2019
@lindong28 force-pushed the add-leading-indiciators-test branch 6 times, most recently from 3e464b8 to d798e18 on February 1, 2019 07:17
@lindong28 force-pushed the add-leading-indiciators-test branch from d798e18 to d4785fc on February 1, 2019 20:25
conn.close()


#def upload_with_stream_mode(client, dataset, table, row):
@tfboyd (Member) commented:

Stream mode should be the default and needs to exist. I think you are swapping the current reporting back in. It makes sense not to change the reporting structure until we have a strong need: changing it does not get us a lot of value right now and adds a lot of work. I realize the current structure is a bit messy.
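
For reference, here is a hedged sketch of what the stream-mode upload under discussion might look like with the google-cloud-bigquery client; the row schema handling is an assumption, not the PR's actual code:

from google.cloud import bigquery


def upload_with_stream_mode(client, dataset_name, table_name, row):
  # Streaming insert: the row dict becomes visible in the table almost
  # immediately, rather than going through a batch load job.
  table_ref = client.dataset(dataset_name).table(table_name)
  table = client.get_table(table_ref)  # fetch schema for row validation
  errors = client.insert_rows(table, [row])
  if errors:
    raise RuntimeError('BigQuery streaming insert failed: {}'.format(errors))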


ml_framework_info = {}
ml_framework_info['framework'] = 'tensorflow'
ml_framework_info['version'] = tf.__version__
@tfboyd (Member) commented:

I left this off because I was testing PyTorch and MXNet with the same setup and wanted to share the reporting code. I could be talked into keeping this here, but I would prefer that reporting not require TensorFlow.


def build_execution_summary(execution_id, project_name, platform_name,
                            output_url, benchmark_result):
  import tensorflow as tf
@tfboyd (Member) commented:

If you do keep this, I think you need a pylint disable comment.
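
Putting both review comments together, here is a hedged sketch of a framework-agnostic variant with the in-function import and a pylint disable; the function name is hypothetical, not the PR's actual code:

def build_ml_framework_info(framework='tensorflow'):
  # The lazy import keeps reporting framework-agnostic: only the framework
  # actually being benchmarked needs to be installed.
  info = {'framework': framework}
  if framework == 'tensorflow':
    import tensorflow as tf  # pylint: disable=g-import-not-at-top
    info['version'] = tf.__version__
    info['git_version'] = tf.__git_version__
  return info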

@tfboyd merged commit 8a8c7a2 into tensorflow:master on February 1, 2019
@lindong28 deleted the add-leading-indiciators-test branch on February 16, 2019 23:09