[Speed]speed up python executor in fluid #8729

Closed
@jacquesqiao

Description

Background

problem

In our Python executor, every call to executor.run clones the program and then adds feed and fetch ops to the cloned program. The profile below shows that Program.clone is very time-consuming. I added a simple cache to avoid the program clone, and the total run time drops to roughly 75% of the original.

solution

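Avoid the clone: cache the cloned program, with its feed and fetch ops already added, and reuse it on later executor.run calls.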
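The caching idea can be sketched as follows. This is a minimal illustration and not the actual fluid implementation: `Program` and `CachingExecutor` are hypothetical stand-ins, a program is modelled as a plain list of op names, and the cache key (program identity plus feed and fetch names) is an assumption about what would distinguish one prepared program from another.

```python
import copy


class Program:
    """Hypothetical stand-in for a fluid program: a list of op names."""

    def __init__(self, ops=None):
        self.ops = list(ops or [])

    def clone(self):
        # Deep copy models the expensive Program.clone seen in the profile.
        return Program(copy.deepcopy(self.ops))


class CachingExecutor:
    """Hypothetical executor that clones a program only once per cache key."""

    def __init__(self):
        self._cache = {}
        self.clone_count = 0  # for demonstration only

    def _prepare(self, program, feed_names, fetch_names):
        key = (id(program), tuple(feed_names), tuple(fetch_names))
        entry = self._cache.get(key)
        if entry is None:
            prepared = program.clone()
            self.clone_count += 1
            # Add feed and fetch ops to the clone, not to the user's program.
            prepared.ops = (
                ["feed:" + name for name in feed_names]
                + prepared.ops
                + ["fetch:" + name for name in fetch_names]
            )
            # Keep a reference to the original so its id() is never reused.
            entry = (program, prepared)
            self._cache[key] = entry
        return entry[1]

    def run(self, program, feed, fetch_list):
        prepared = self._prepare(program, sorted(feed), fetch_list)
        # A real executor would execute the ops; here we just return them.
        return prepared.ops


exe = CachingExecutor()
prog = Program(["mul", "add"])
for _ in range(3):
    exe.run(prog, {"x": 1.0}, ["y"])
print(exe.clone_count)
```

One limitation of this sketch: if the original program is modified after the first run, the cached copy becomes stale, so a real cache would need some invalidation rule.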

Experiment

profile script

#8674

condition

  • one card without parallel_do

batch_num   before (s)       after (s)        after/before   before/after
10          9.91367912292    7.47897624969    0.7544         1.3255395915090542
20          19.6153604984    14.6058752537    0.744          1.3429774085898059
30          28.9721696377    21.7348105907    0.7501         1.332984684490269
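For reference, the two ratio columns can be recomputed from the measured times. The snippet below is only a sketch of that arithmetic; the `timings` dictionary and the `ratios` helper are names introduced here for illustration.

```python
# Measured wall-clock times in seconds, copied from the table above:
# batch_num -> (before, after)
timings = {
    10: (9.91367912292, 7.47897624969),
    20: (19.6153604984, 14.6058752537),
    30: (28.9721696377, 21.7348105907),
}


def ratios(before, after):
    """Return (after/before, before/after) for one measurement."""
    return after / before, before / after


for batch_num, (before, after) in timings.items():
    remaining, speedup = ratios(before, after)
    print(batch_num, round(remaining, 4), round(speedup, 4))
```

The after/before column is the fraction of the original run time that remains, and before/after is the corresponding speedup factor.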

timeline

before optimize

image

after optimize

image
