[Speed] sequence_pool op needs to be enhanced #9099

Closed
@dzhwinter

Every call to a sequence-like op such as sequence_pool launches a large number of kernels. This should be optimized.

thread0::array_to_lod_tensor                2           62.4829     30.9197     31.5633     31.2415     3262.35     0           327.645
thread0::mul                                763         61.21       0.038688    0.71024     0.0802228   642.303     0.000488281 25.7815
thread0::sequence_softmax_grad              69          57.409      0.04448     2.6696      0.832014    4320.91     0.000488281 0.0126953
thread0::lod_tensor_to_array                2           56.6311     27.8057     28.8253     28.3155     810.032     5.60864     327.656
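
To illustrate the kind of overhead described above, here is a minimal sketch of sum pooling over variable-length sequences, written in NumPy rather than against the actual operator code. It assumes the sequences are stored back to back in one tensor with LoD-style offsets and that every sequence is non-empty. The function names are hypothetical and this is not the framework's implementation: the first version issues one reduction per sequence (analogous to one kernel launch per sequence), while the second handles all sequences in a single segmented reduction.

```python
import numpy as np


def sequence_sum_pool_loop(x, lod):
    """Sum-pool each sequence with a separate reduction call.

    x   : array of shape (total_steps, feature_dim), sequences concatenated.
    lod : offsets such that sequence i occupies rows lod[i]:lod[i + 1].
    """
    # One reduction per sequence: the number of calls grows with the batch.
    return np.stack([x[s:e].sum(axis=0) for s, e in zip(lod[:-1], lod[1:])])


def sequence_sum_pool_fused(x, lod):
    """Sum-pool all sequences in one segmented reduction.

    Assumes every sequence is non-empty (strictly increasing offsets).
    """
    starts = np.asarray(lod[:-1])
    # A single call covers every segment at once.
    return np.add.reduceat(x, starts, axis=0)


x = np.arange(12, dtype=np.float64).reshape(6, 2)
lod = [0, 2, 3, 6]  # three sequences of lengths 2, 1 and 3

loop_result = sequence_sum_pool_loop(x, lod)
fused_result = sequence_sum_pool_fused(x, lod)
```

Both functions return the same pooled values. On a GPU, the analogous change would be to replace per-sequence launches with one kernel that indexes into the offsets.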
