
Yolox export & blade optimize support #49

Closed
wants to merge 14 commits into from

Conversation

wuziheng (Collaborator):

  1. YOLOX JIT export support, with unit tests.
  2. YOLOX JIT export with Blade optimization, enabled via the export config (e.g. YOLOX-S speeds up from 42 fps to 123 fps on a 1080Ti).

[screenshot attachment: benchmark result]
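The export-config switch described above can be pictured as a config fragment like the following. This is a hypothetical sketch assuming the `use_blade` flag and `blade_config` dict discussed in this review; it is not EasyCV's confirmed schema, and the values are copied from the diff.

```python
# Hypothetical export config fragment (illustrative only):
# 'use_blade' and 'blade_config' follow the names used in this review.
export = dict(
    use_blade=True,
    blade_config=dict(
        # Ops excluded from Blade optimization, tuned for YOLOX.
        customize_op_black_list=[
            'aten::select', 'aten::index', 'aten::slice', 'aten::view'
        ],
        # Fraction of ops kept in fp32 when fp16 is enabled.
        fp16_fallback_op_ratio=0.3,
    ),
)
```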

@wenmengzhou wenmengzhou added the enhancement New feature or request label Apr 30, 2022
print(err)


def blade_yolox_optimize(script_model,
Collaborator:

blade_yolox_optimize should be a common API, decoupled from YOLOX.

else:
print('test_pipeline not found, using default preprocessing!')
raise ValueError('export model config without test_pipeline')
cfg.model.type = 'YOLOXExport'
Collaborator:

Remove the YOLOXExport API; this should be a tool, not a model.

Collaborator Author:

It may differ per model. We use "yolox blade" as the name because the Blade op blacklist is specialized for YOLOX; whether it is useful for other CNNs or transformers still needs to be checked.

Collaborator:

Or support an export mode in the model.

Collaborator Author:

Needs a fix.

input = 255 * torch.rand((batch_size, 3) + img_scale)
yolox_trace = torch.jit.trace(model_export, input.to(device))

if getattr(cfg.export, 'use_blade', False):
Collaborator:

This should be extracted into an export_blade API.

Collaborator Author:

done



@MODELS.register_module
class YOLOXExport(YOLOX):
Collaborator:

remove YOLOXExport

Collaborator Author:

We use YOLOXExport to satisfy the JIT trace input/output requirements; otherwise we would have to change the original YOLOX and add a config option to enable the post-processing.
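The trade-off described here can be sketched framework-agnostically: torch.jit.trace requires tensor-only inputs and outputs, so a dict-returning test-mode forward must be adapted. The following is a minimal illustration with hypothetical names (a stub stands in for the model; the real YOLOXExport subclasses YOLOX rather than wrapping it).

```python
class ExportWrapper:
    """Adapt a detector whose test-mode forward returns a dict to the
    plain-tensor inputs/outputs that torch.jit.trace requires.

    Hypothetical sketch: the PR's actual YOLOXExport subclasses YOLOX
    instead of wrapping it, and the key names are illustrative.
    """

    def __init__(self, model):
        self.model = model

    def __call__(self, img):
        # Run the wrapped model in test mode and flatten the dict
        # output into a tuple, which a traced graph can represent.
        out = self.model(img, mode='test')
        return out['detection_boxes'], out['detection_scores']
```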

@@ -39,6 +40,25 @@ def test_yolox_detector(self):
self.assertEqual(len(output['detection_boxes']), 7)
self.assertEqual(output['ori_img_shape'], [230, 352])

def test_yolox_jit_detector(self):
Collaborator:

Add unit tests for the JIT and Blade exports.

Collaborator Author:

done

import torch
import torchvision

os.environ['DISC_ENABLE_STITCH'] = 'true'
Collaborator:

os.environ['DISC_ENABLE_STITCH'] = os.environ.get('DISC_ENABLE_STITCH', 'true')
os.environ['DISC_EXPERIMENTAL_SPECULATION_TLP_ENHANCE'] = os.environ.get('DISC_EXPERIMENTAL_SPECULATION_TLP_ENHANCE', 'true')

Collaborator Author:

done
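The pattern suggested above (default an environment variable without clobbering a user override) can be wrapped in a tiny helper. `setdefault_env` is an illustrative name, not an EasyCV API; the suggested fix inlines the same `os.environ.get` idiom.

```python
import os


def setdefault_env(name, default):
    """Set an environment variable only if the caller has not already
    set it, so library defaults never override user choices.

    Illustrative helper for the os.environ.get pattern suggested above.
    """
    os.environ[name] = os.environ.get(name, default)
    return os.environ[name]
```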

torch_version = 'failed'
torch_cuda = 'failed'
env_flag = False
print(
Collaborator:

Replace print with logging; the other print calls too.

Collaborator Author:

done

torch_config.customize_op_black_list = [
'aten::select', 'aten::index', 'aten::slice', 'aten::view'
]
torch_config.fp16_fallback_op_ratio = 0.3
Collaborator:

Could fp16_fallback_op_ratio be made configurable?

Collaborator Author:

This value is tuned specifically for YOLOX. Other models should be tested first and given their own configs; after that we will know how to refactor.

Collaborator Author:

Done; it is now configurable externally via config['export']['blade_config'].
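A minimal sketch of that externalization: merge a user-supplied `blade_config` over the YOLOX-tuned defaults so each model overrides only what it needs. The defaults are copied from the diff above; the `blade_config` key follows the author's description, and the function name is illustrative rather than EasyCV's actual API.

```python
# Defaults copied from the diff above; tuned for YOLOX.
DEFAULT_BLADE_CONFIG = {
    'customize_op_black_list': [
        'aten::select', 'aten::index', 'aten::slice', 'aten::view'
    ],
    'fp16_fallback_op_ratio': 0.3,
}


def resolve_blade_config(export_cfg):
    """Merge a user-supplied config['export']['blade_config'] over the
    YOLOX-tuned defaults. Illustrative sketch of the pattern, not the
    PR's exact implementation.
    """
    merged = dict(DEFAULT_BLADE_CONFIG)
    merged.update(export_cfg.get('blade_config', {}))
    return merged
```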

model_inputs=tuple(inputs),
)
benchmark(script_model, inputs, backend, batch, 'easycv')
benchmark(model, inputs, backend, batch, 'easycv script')
Collaborator:

The benchmark is not necessary inside blade_yolox_optimize.

Collaborator Author:

The benchmark is used to print the optimization result.

Collaborator Author:

done

det_out = self.model(
img, mode='test', img_metas=[data_dict['img_metas']._data])

if self.trace_able:

Reviewer:

Nit: self.traceable

Comment on lines 14 to 15
os.environ['DISC_ENABLE_STITCH'] = os.environ.get('DISC_ENABLE_STITCH', 'true')
os.environ['DISC_EXPERIMENTAL_SPECULATION_TLP_ENHANCE'] = os.environ.get('DISC_EXPERIMENTAL_SPECULATION_TLP_ENHANCE', 'true')

Reviewer:

These environment variables are only needed by DISC stitch. It is recommended to set them only when the DISC stitch strategy is preferred.

Comment on lines +21 to +22
import ctypes
_cudart = ctypes.CDLL('libcudart.so')

Reviewer:

Nit: not needed here.

Collaborator Author:

It will cause an import error even if we don't use it.
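One way to avoid the import-time failure being discussed is to load the library lazily and tolerate its absence. This is a stdlib-only sketch with a hypothetical helper name; `ctypes.CDLL` raises OSError when the shared library is missing, so importing the module no longer fails on CPU-only machines.

```python
import ctypes


def load_cudart():
    """Load libcudart lazily, returning None when CUDA is absent, so
    merely importing this module never fails on CPU-only machines.

    Illustrative fix for the import-error concern above; the helper
    name is hypothetical.
    """
    try:
        return ctypes.CDLL('libcudart.so')
    except OSError:
        # No CUDA runtime on this machine; callers must handle None.
        return None
```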



@contextmanager
# def opt_trt_config(enable_fp16=True):

Reviewer:

Nit: remove this line

Comment on lines +104 to +122
@contextmanager
def opt_blade_mixprec():
try:
dummy = torch.classes.torch_blade.MixPrecision(True)
yield
finally:
pass


@contextmanager
def opt_disc_config(enable_fp16=True):
torch_config = torch_blade.config.Config()
torch_config.enable_fp16 = enable_fp16
try:
with torch_config:
yield
finally:
pass

Reviewer:

Nit: add these configurations later if they are needed; I prefer to remove them for now.

Collaborator Author:

done

results = []


def printStats(backend, timings, batch_size=1, model_name='default'):

Reviewer:

The following code is unrelated to Blade; it belongs to the benchmark. I would prefer adding a benchmark tutorial somewhere instead.
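For reference, the core of such a benchmark is framework-agnostic. This is a minimal stdlib sketch of what a printStats-style helper measures (median latency and throughput after warm-up); the function name, defaults, and returned keys are illustrative, not the PR's actual helpers.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, iters=20):
    """Time fn(*args) after warm-up runs and report median latency
    and throughput. Illustrative, framework-agnostic sketch.
    """
    # Warm-up runs, excluded from timing (caches, JIT, allocators).
    for _ in range(warmup):
        fn(*args)
    timings = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    median = statistics.median(timings)
    return {'median_s': median, 'fps': 1.0 / median}
```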

@Cathy0908 (Collaborator) left a comment:

Reverting my approval.

@Cathy0908 Cathy0908 self-requested a review May 5, 2022 12:19
@wenmengzhou (Collaborator):

refactor this CR in #66

@wuziheng (Collaborator Author):

LGTM

Labels: enhancement (New feature or request)
Projects: none yet
4 participants