
Conversation

@lindong28 (Contributor) commented Jan 28, 2019

This patch makes the following improvements. It depends on cl/230280956, which adds leading_indicators_test.py to OSS tensorflow/benchmarks:

Add leading_indicators_test.py to benchmarks/scripts and run it in PerfZero
Read benchmark results from the local protobuf file written by tf.test.Benchmark.report_benchmark() (see the sketch after this list)
Simplify the PerfZero report logic implementation
Rename environment variables from ROGUE_* to PERFZERO_*
Replace print(..) with logging.info(..) and logging.debug(..)
Print the benchmark summary in a human-readable format
Use a datetime string as the execution id and include it in the output path name
Specify the full list of environment variables with documentation in README.md
Configure the BigQuery table name via an environment variable
Upload the GPU driver version to BigQuery
Print messages to both stdout and the log file
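
For context, here is a minimal sketch of the protobuf-reading step. The function name and file layout are assumptions, not the actual PerfZero code; when the TEST_REPORT_FILE_PREFIX environment variable is set, tf.test.Benchmark.report_benchmark() serializes a BenchmarkEntries proto to a file under that prefix, and this is one way to read it back:

# A minimal sketch, not the actual PerfZero implementation. Assumes the
# benchmark ran with TEST_REPORT_FILE_PREFIX set, so that
# tf.test.Benchmark.report_benchmark() serialized a BenchmarkEntries
# proto to a file under that prefix.
from tensorflow.core.util import test_log_pb2


def read_benchmark_result(benchmark_result_file_path):
  with open(benchmark_result_file_path, 'rb') as f:
    benchmark_entries = test_log_pb2.BenchmarkEntries()
    benchmark_entries.ParseFromString(f.read())

  # report_benchmark() writes one entry per benchmark method.
  entry = benchmark_entries.entry[0]
  result = {
      'benchmark_name': entry.name,
      'iters': entry.iters,
      'wall_time_ms': int(entry.wall_time * 1000),  # wall_time is in seconds
  }
  # extras is a map<string, EntryValue>; each value holds either a
  # double_value or a string_value (e.g. loss, top_1_accuracy).
  for name, value in entry.extras.items():
    result[name] = value.string_value or value.double_value
  return result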
The benchmark summary printed by PerfZero looks like this:

{
  "system_info": {
    "platform_name": "workstation-z420",
    "cpu_socket_count": 1,
    "gpu_count": 2,
    "cpu_model": "Intel(R) Xeon(R) CPU @ 2.20GHz",
    "cpu_core_count": 8,
    "gpu_model": "Tesla V100-SXM2-16GB"
  },
  "execution_id": "2019-01-29-04-25-52-683031",
  "benchmark_info": {
    "output_url": "gs://tf-performance/test-results/2019-01-29-04-25-52-683031/",
    "date_time": "2019-01-29-04-26-45",
    "project_name": "perfzero-dev"
  },
  "benchmark_result": {
    "success": false,
    "benchmark_name": "Resnet50Benchmarks.benchmark_fake_1gpu_gpuparams",
    "wall_time_ms": 49606,
    "iters": 100,
    "loss": "5.864147",
    "top_5_accuracy": null,
    "top_1_accuracy": null
  },
  "ml_framework_info": {
    "framework": "tensorflow",
    "version": "1.13.0-dev20190128",
    "git_version": "v1.12.0-6820-g0f59cbd297",
    "build_type": "OTB",
    "channel": "NIGHTLY"
  }
}
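
For illustration, here is a hedged sketch of how the execution id and the human-readable printout above could be produced; the function name is an assumption, not the exact PerfZero code:

import datetime
import json
import logging

# A datetime string such as 2019-01-29-04-25-52-683031 serves as the
# execution id and as part of the output path name.
execution_id = datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S-%f')


def print_benchmark_summary(benchmark_summary):
  # json.dumps with indent yields the human-readable block shown above.
  logging.info('benchmark summary is %s',
               json.dumps(benchmark_summary, indent=2))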

The information above is translated to the old format and uploaded to BigQuery.

The PerfZero configuration is now documented in README.md.
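
As a hedged illustration (these exact variable names are assumptions; README.md holds the authoritative list of PERFZERO_* variables), the configuration might be read like this:

import os

# Hypothetical names for illustration only; see README.md for the real
# list of PERFZERO_* environment variables.
project_name = os.environ['PERFZERO_PROJECT_NAME']
platform_name = os.environ['PERFZERO_PLATFORM_NAME']
# The BigQuery table name is configurable rather than hard-coded.
bigquery_table_name = os.getenv('PERFZERO_BIGQUERY_TABLE_NAME',
                                'benchmark_results.result')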

@lindong28 requested a review from tfboyd on January 28, 2019 09:33
@lindong28 force-pushed the add-leading-indiciators-test branch from 9ecb008 to 183975b on January 28, 2019 10:00
@lindong28 force-pushed the add-leading-indiciators-test branch 5 times, most recently from 675bb9f to 561211d on January 29, 2019 05:15
@lindong28 changed the title from "Add leading_indicators_test.py to benchmarks/scripts and run it in Perfzero" to "Refactor PerfZero for simplicity and ease-of-use" on January 29, 2019
@lindong28 force-pushed the add-leading-indiciators-test branch 6 times, most recently from 3e464b8 to d798e18 on February 1, 2019 07:17
@lindong28 force-pushed the add-leading-indiciators-test branch from d798e18 to d4785fc on February 1, 2019 20:25
conn.close()


#def upload_with_stream_mode(client, dataset, table, row):
@tfboyd (Member) commented:

Stream mode should be the default and needs to exist. I think you are swapping the current reporting back in. It makes sense not to change the reporting structure until we have a strong need: changing it does not get us a lot of value right now and adds a lot of work. I realize the current structure is a bit messy.
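
For reference, here is a hedged sketch of what the stream-mode upload under discussion might look like with the google-cloud-bigquery client; the row schema handling is an assumption, not the PR's actual code:

from google.cloud import bigquery


def upload_with_stream_mode(client, dataset_name, table_name, row):
  # Streaming insert: the row dict becomes visible in the table almost
  # immediately, rather than going through a batch load job.
  table_ref = client.dataset(dataset_name).table(table_name)
  table = client.get_table(table_ref)  # fetch schema for row validation
  errors = client.insert_rows(table, [row])
  if errors:
    raise RuntimeError('BigQuery streaming insert failed: {}'.format(errors))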


ml_framework_info = {}
ml_framework_info['framework'] = 'tensorflow'
ml_framework_info['version'] = tf.__version__
@tfboyd (Member) commented:

I left this off because I was testing PyTorch and MXNet with the same setup and wanted to share the reporting code. I could be talked into keeping this here, but I would prefer that reporting not require TensorFlow.


def build_execution_summary(execution_id, project_name, platform_name,
                            output_url, benchmark_result):
  import tensorflow as tf
@tfboyd (Member) commented:

If you do keep this, I think you need a pylint disable comment.
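
Putting both review comments together, here is a hedged sketch of a framework-agnostic variant with the in-function import and a pylint disable; the function name is hypothetical, not the PR's actual code:

def build_ml_framework_info(framework='tensorflow'):
  # The lazy import keeps reporting framework-agnostic: only the framework
  # actually being benchmarked needs to be installed.
  info = {'framework': framework}
  if framework == 'tensorflow':
    import tensorflow as tf  # pylint: disable=g-import-not-at-top
    info['version'] = tf.__version__
    info['git_version'] = tf.__git_version__
  return info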

@tfboyd merged commit 8a8c7a2 into tensorflow:master on February 1, 2019
@lindong28 deleted the add-leading-indiciators-test branch on February 16, 2019 23:09