Yolox export & blade optimize support #49
Conversation
wuziheng commented Apr 29, 2022
- YoloX jit export support with UT
- YoloX jit export with blade optimize, enabled by export config (e.g. speeds up yolox-s from 42 fps to 123 fps on a 1080Ti)
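The export-config switch described above can be sketched as a config fragment. This is hypothetical and illustrative only; the key names use_blade and blade_config follow this review thread, and the values shown are the ones discussed below, not necessarily the PR's final defaults.

```python
# Hypothetical easycv export config fragment enabling blade optimization.
# Key names (use_blade, blade_config) follow this review; the values are
# illustrative examples taken from the discussion, not verified defaults.
export = dict(
    use_blade=True,
    blade_config=dict(
        enable_fp16=True,
        fp16_fallback_op_ratio=0.3,
        customize_op_black_list=[
            'aten::select', 'aten::index', 'aten::slice', 'aten::view'
        ],
    ),
)
```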
print(err)


def blade_yolox_optimize(script_model,
blade_yolox_optimize should be a common API, decoupled from yolox
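A minimal sketch of what a decoupled entry point could look like. The name blade_optimize and its signature are assumptions, not the PR's code, and the torch_blade call is stubbed out as a comment so the sketch stays self-contained.

```python
def blade_optimize(script_model, inputs, blade_config=None):
    """Model-agnostic blade entry point (hypothetical sketch).

    Model-specific tuning such as customize_op_black_list and
    fp16_fallback_op_ratio comes in via blade_config, so nothing
    YOLOX-specific is hard-coded in the function itself.
    """
    blade_config = dict(blade_config or {})
    # A real implementation would build a torch_blade config from
    # blade_config and call:
    #   torch_blade.optimize(script_model, model_inputs=tuple(inputs))
    # Stubbed as a pass-through here to keep the sketch runnable.
    return script_model
```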
else:
    print('test_pipeline not found, using default preprocessing!')
    raise ValueError('export model config without test_pipeline')
cfg.model.type = 'YOLOXExport'
remove the YOLOXExport api; export should be a tool, not a model
It may differ between models. We use "yolox blade" as the name because the blade op blacklist is specialized for YOLOX; whether it is useful for other CNNs or transformers needs to be checked later.
or support an export mode in the model
need fix
easycv/apis/export.py
Outdated
input = 255 * torch.rand((batch_size, 3) + img_scale)
yolox_trace = torch.jit.trace(model_export, input.to(device))

if getattr(cfg.export, 'use_blade', False):
should be extracted to api export_blade
done
@MODELS.register_module
class YOLOXExport(YOLOX):
remove YOLOXExport
We use YOLOXExport to fit the jit trace input/output requirements; otherwise we would have to change the original YOLOX and add a config option to enable the post process.
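The wrapper idea can be sketched like this. The class below is hypothetical (not the PR's YOLOXExport); in the real code it would extend torch.nn.Module so that torch.jit.trace accepts it.

```python
class ExportWrapper:
    """Adapts a detector so tracing sees a plain tensor-in/tensor-out
    forward, instead of adding a post-process switch to the original
    YOLOX. Hypothetical sketch; a real version would subclass
    torch.nn.Module and override forward()."""

    def __init__(self, model):
        self.model = model

    def __call__(self, img):
        # Collapse the dict/img_metas test-mode call down to a single
        # tensor input, which is what jit trace requires.
        return self.model(img)
```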
@@ -39,6 +40,25 @@ def test_yolox_detector(self):
        self.assertEqual(len(output['detection_boxes']), 7)
        self.assertEqual(output['ori_img_shape'], [230, 352])

    def test_yolox_jit_detector(self):
add unittest for export jit and blade
done
import torch
import torchvision

os.environ['DISC_ENABLE_STITCH'] = 'true'
os.environ['DISC_ENABLE_STITCH'] = os.environ.get('DISC_ENABLE_STITCH', 'true')
os.environ['DISC_EXPERIMENTAL_SPECULATION_TLP_ENHANCE'] = os.environ.get('DISC_EXPERIMENTAL_SPECULATION_TLP_ENHANCE', 'true')
done
torch_version = 'failed'
torch_cuda = 'failed'
env_flag = False
print(
replace print with logging (the other prints too)
done
torch_config.customize_op_black_list = [
    'aten::select', 'aten::index', 'aten::slice', 'aten::view'
]
torch_config.fp16_fallback_op_ratio = 0.3
could fp16_fallback_op_ratio be made configurable?
This config is specialized for YOLOX; other models should be tested and given their own configs, and then we will know how to refactor.
done; it is now configured from outside via config[export][blade_config]
model_inputs=tuple(inputs),
)
benchmark(script_model, inputs, backend, batch, 'easycv')
benchmark(model, inputs, backend, batch, 'easycv script')
benchmark is not necessary for blade_yolox_optimize
we use benchmark to print the optimization result
done
easycv/predictors/detector.py
Outdated
det_out = self.model(
    img, mode='test', img_metas=[data_dict['img_metas']._data])

if self.trace_able:
Nit: self.traceable
os.environ['DISC_ENABLE_STITCH'] = os.environ.get('DISC_ENABLE_STITCH', 'true')
os.environ['DISC_EXPERIMENTAL_SPECULATION_TLP_ENHANCE'] = os.environ.get('DISC_EXPERIMENTAL_SPECULATION_TLP_ENHANCE', 'true')
Those environment variables are only needed by DISC stitch. It is recommended to set them only when the DISC stitch strategy is preferred.
import ctypes
_cudart = ctypes.CDLL('libcudart.so')
Nit: not needed here.
It will cause an import error even if we don't use it.
@contextmanager
# def opt_trt_config(enable_fp16=True):
Nit: remove this line
@contextmanager
def opt_blade_mixprec():
    try:
        dummy = torch.classes.torch_blade.MixPrecision(True)
        yield
    finally:
        pass


@contextmanager
def opt_disc_config(enable_fp16=True):
    torch_config = torch_blade.config.Config()
    torch_config.enable_fp16 = enable_fp16
    try:
        with torch_config:
            yield
    finally:
        pass
Nit: Add those configurations if needed in the future. I prefer to remove those configurations.
done
results = []


def printStats(backend, timings, batch_size=1, model_name='default'):
It seems the following code is unrelated to Blade and belongs to the benchmark; I would prefer to add a benchmark tutorial somewhere instead.
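For reference, the kind of summary a printStats-style helper computes can be sketched independently of Blade. The function name and report keys below are hypothetical, not the PR's code.

```python
def summarize_timings(timings, batch_size=1):
    """Summarize per-batch wall-clock timings (in seconds) into the
    statistics a benchmark report typically prints: mean latency and
    throughput in images per second."""
    mean_s = sum(timings) / len(timings)
    return {
        'mean_latency_ms': mean_s * 1000.0,
        'images_per_second': batch_size / mean_s,
    }
```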
revert approve
refactor this CR in #66
LGTM