BERT nightly benchmark on Inferentia2 #2283
Codecov Report
@@            Coverage Diff             @@
##           master    #2283      +/-   ##
==========================================
+ Coverage   69.39%   69.82%   +0.42%
==========================================
  Files          77       77
  Lines        3441     3420      -21
  Branches       57       57
==========================================
  Hits         2388     2388
+ Misses       1050     1029      -21
  Partials        3        3

see 2 files with indirect coverage changes
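As a sanity check on the Codecov figures above, the sketch below recomputes the coverage percentages from the hit and line counts. It assumes coverage is hits divided by lines and that the displayed values are truncated (not rounded) to two decimals. The truncation is inferred from the numbers in this report, not taken from Codecov documentation, and the helper names are hypothetical.

```python
# Counts copied from the Codecov report for master and for PR #2283.
BEFORE = {"lines": 3441, "hits": 2388, "misses": 1050, "partials": 3}
AFTER = {"lines": 3420, "hits": 2388, "misses": 1029, "partials": 3}


def is_consistent(report: dict) -> bool:
    """Hits, misses and partials should add up to the total line count."""
    return report["hits"] + report["misses"] + report["partials"] == report["lines"]


def coverage_bp(report: dict) -> int:
    """Coverage in hundredths of a percent, truncated (assumption: hits / lines)."""
    return report["hits"] * 10_000 // report["lines"]


def delta_bp(before: dict, after: dict) -> int:
    """Exact coverage difference (after - before), truncated to hundredths of a percent."""
    numerator = after["hits"] * before["lines"] - before["hits"] * after["lines"]
    return numerator * 10_000 // (before["lines"] * after["lines"])


if __name__ == "__main__":
    print(f"before: {coverage_bp(BEFORE) / 100:.2f}%")          # 69.39%
    print(f"after:  {coverage_bp(AFTER) / 100:.2f}%")           # 69.82%
    print(f"delta:  +{delta_bp(BEFORE, AFTER) / 100:.2f}%")     # +0.42%
```

The number of hits is unchanged, so the coverage increase comes entirely from the 21 uncovered lines that were removed.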
@namannandan If the transformers version expected is 4.19.0, where is this being set?
@agunapal the issue with the transformers version is only observed when tracing the model. Loading the traced model and inference works as expected even with more recent versions of
@namannandan Is the issue with the validate_benchmark.py resolved now?
42ac457 to 4146b29
Successful benchmark run with validation: https://github.com/pytorch/serve/actions/runs/4986426850
Description
Benchmark BERT model on Inferentia2 instance
Model artifacts:
Self-hosted runner (inf2.8xlarge):
Type of change
Feature testing
Checkpoint file generation
Note: The artifacts above were traced using transformers version 4.19.0. With more recent transformers versions, the traced model for Neuron may generate incorrect inference results: the model output is NaN.
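The note above suggests two cheap guards around tracing: confirm the pinned transformers version before tracing, and check the traced model's output for NaN afterwards. The following is a minimal sketch of both checks using only the standard library. The helper names are hypothetical and not part of this PR; in a real script the installed version could come from importlib.metadata.version("transformers"), and the output values from the traced model's logits converted to nested Python lists.

```python
import math

# Version the PR description says the artifacts were traced with.
PINNED_TRANSFORMERS = "4.19.0"


def is_pinned_version(installed: str, pinned: str = PINNED_TRANSFORMERS) -> bool:
    """Return True when the installed version string matches the pinned one exactly."""
    return installed.strip() == pinned


def contains_nan(values) -> bool:
    """Return True if any float in a (possibly nested) list of outputs is NaN."""
    for value in values:
        if isinstance(value, (list, tuple)):
            if contains_nan(value):
                return True
        elif isinstance(value, float) and math.isnan(value):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical inputs standing in for the real environment and model output.
    print(is_pinned_version("4.19.0"))
    print(contains_nan([[0.12, -1.5], [float("nan"), 0.3]]))
```

Per the review discussion, the version constraint applies only when tracing, so a guard like this belongs in the tracing script rather than in the serving handler.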
MAR file generation
Workflow test
Test branch: test-inf2-benchmark
Workflow run and artifacts: https://github.com/pytorch/serve/actions/runs/4834127396
(Artifacts and metrics are published, but validation currently fails.)
Benchmark results:
TorchServe Benchmark on neuronx
Date: 2023-04-28 20:57:15
TorchServe Version: torchserve-nightly==2023.4.27
scripted_mode_bert_neuronx_batch_1
scripted_mode_bert_neuronx_batch_2
scripted_mode_bert_neuronx_batch_4
scripted_mode_bert_neuronx_batch_8
Checklist: