[GPU] LSTMSequence and LSTMCell optimization #26767
Open
michal-miotk wants to merge 117 commits into openvinotoolkit:master from michal-miotk:lstm_with_onednn
+2,374
−430
Changes from 104 commits
Commits (117, all authored by michal-miotk):
9ce143a compiles lstm_seq
027f991 more kernel args
c191c58 bigger proper run chances
d461e66 19jul
01fa2ac inference works
1f017fd in middle of implementation
5787c7d problems with inputs get element in kernel
837db22 not compile
d4ce531 wipx
19c268e wip
f5273bc solved problem with too much inputs kernel
d50b3be wip
63a8dfd more changes
f54ecc1 wip
3748a11 wip
fae772a wip
c00ff8a proper shape for 2 outputs
1c08b14 Squashed commit of the following:
6968881 Squashed commit of the following:
31fcb79 cleaning
4b16eef Merge branch 'master' into lstm2
dcad182 updated to new primitive_base api, disabled lstm to tensor transforma…
d6aeb54 now it should compile on windows, changed kernel name
9688f63 deleted cell, deleted input_forget
5003d47 generic primitive
5937b14 fix compilation problem, smaller lws
8b31a91 wip
2ff5a7c wip, not resolved fail on dynamic
2d9e5c6 fixed failing dynamic test
702e941 change name cldnn::rnn -> cldnn::lstm_seq
f4d3b71 fix bad order of inputs in lstm_elt constructor
0c7103c changed input order in kernel
f37482a Squashed commit of the following:
0058c57 Merge branch 'master' into lstm2
1ac26d3 fix bad initialization in kernel
31040bf generic kernel
83aa74f deleted unnecessary cancelled buffer fusing for cell
0cce00c Merge branch 'master' into lstm2
0e37c8a bigger local workgroup, turned off buffer fusing for lstm cell
72b48d1 speedup 1.5x after unrolling loop
7a747c5 barrier in better place
9b99f04 direction condition on macro, more macro
5052e26 reducing temp_cell_state
aa5d906 Revert "reducing temp_cell_state"
4b524fd reducing temp cell state
c47c943 minor kernel speedup (1fps)
e486376 deleted unnecessary tab for input and hidden result
fe72cc8 fix windows compilation
d62f223 more clear kernel algorithm
0b1fa3d wip
3e1fe20 wip vectorized
cac921c more vector
a165f30 fix for vec size, deleted MAX_SEQ_LENGTH
8f74962 Revert "fix for vec size, deleted MAX_SEQ_LENGTH"
732eb52 fix vec_size
165dd9b optimizations for bigger gpus
1b9cc98 fix for windows
37ab01b fix conversion error
c99ddc0 Merge branch 'master' into lstm2
60a0675 merge most important from lstm23
1b23648 deleted cout
7c1bf37 Merge branch 'master' into lstm_with_onednn
40abc31 mainly changes from code review
56031d9 merged some_wip
d954fe8 Merge branch 'master' into lstm_with_onednn
78cc4fc correct in registry
81ca2ed Merge branch 'master' into lstm_with_onednn
431d937 deleted level zero, undo changes in visualize_tree
6b6800f fix bad name in OV_GPU_PRIMITIVE_IMPL
db8d75b returning on conversion to tensor iterator
a9cd3cf Squashed commit of the following:
bfb80ba Merge branch 'master' into lstm_with_onednn
57faed2 wip
7f097ba wip
a79eca5 Merge branch 'master' into lstm_with_onednn
8d4e46b should work, turned off forcing immad
00c6237 added lstm_seq and lstm_cell in implementation manager
31b8ef0 Merge branch 'master' into lstm_with_onednn
07c1ac2 little cleaning
a78ef3a turnedoff immad check for onednn
5bcab62 deleted unused var
d564228 redo level_zero_ext to cdb761
b16bdac redo mistake change to ov_subgraph
173b5b2 enabled tests for bfyx kernel
c8eb682 set to turn on onednn
43acd2b turned of impl selection for childs and grandchilds of node, cleaning
0002e54 added cl_cache extension for *.onednn.cl_cache files
7741a46 renamed post_optimize_lstm_weights, deleted unused function select_im…
ac352ea repair cache tests
d0fb8b4 Merge branch 'master' into lstm_with_onednn
a1497c4 initialized memory in infer_request_dynamic tests
f12aebd fix for failing caching tests
6170710 deleted event handling as in case in in_order que it is not used
01dc7dc preventing duplicates
e9bf370 repairs in infer_request set and get tensor
7158776 fused test repair
5e21106 set in order queue as default test config
daa83b5 only bfyx format for lstm_seq
6af1f3f skipping conv fusion tests
02942e5 skipping f16 deconv gpu tests
f8dbec3 conv_fp32_multi_eltquant skip in conv_fusion_test
2abe8f8 Merge branch 'master' into lstm_with_onednn
00826ad update hash as input format of weights is custom after post_optimize_…
36b4853 change format in conv_fp32_multi_eltwise_concat basic test
c358ab3 fix shape calc for onednn, only bfyx supported for lstmocl
19b1d93 Revert "optimizations for bigger gpus"
4da2df6 deleted all get_index safe in lstm bfyx kernel
303bf7d applying review part1
bf9f13f fix check of dimensions
459e1ad fix check of input dim lstm cell
14e53f4 enable onednn for tests ON, LSTMSeq accept bfyx and fbyx format
063ac02 dot op, vec_size=4
892131b Revert "skipping conv fusion tests"
b539a3f Revert "conv_fp32_multi_eltquant skip in conv_fusion_test"
dc8ac73 lstm_weights optimization is part of post_optimize_weights
a5165a8 fix forbiddnen size_t->int conversion
cc6b4b5 Revert "update hash as input format of weights is custom after post_o…
74 changes: 74 additions & 0 deletions
src/plugins/intel_gpu/include/intel_gpu/primitives/lstm_cell.hpp
@@ -0,0 +1,74 @@
// Copyright (C) 2018-2024 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#pragma once
#include "primitive.hpp"
#include "activation.hpp"
#include <vector>
#include <algorithm>
#include "intel_gpu/graph/serialization/activation_serializer.hpp"
#include "rnn.hpp"


namespace cldnn {

struct lstm_cell : public primitive_base<lstm_cell> {
    CLDNN_DECLARE_PRIMITIVE(lstm_cell)

    lstm_cell() : primitive_base("", {}), input_forget(0) {
        params.clip = 0;
        params.offset_order = lstm_weights_order::iofz;
        params.direction = 0;
    }

    using vec_activation = std::vector<activation_func>;
    using vec_activation_param = std::vector<activation_additional_params>;

    /// @brief Constructs lstm layer.
    /// @param RNNParam common params for rnns
    /// @param input_forget Provide 0 if using lstm without coupled input-forget gates.
    lstm_cell(const RNNParams& p,
              bool input_forget)
        : primitive_base(p.id, p.get_inputs(), p.num_outputs, \
        {optional_data_type()}, {p.output_padding}), \
        params(p),
        input_forget(input_forget) {}

    RNNParams params;
    bool input_forget;

    size_t hash() const override {
        size_t seed = primitive::hash();
        seed = hash_combine(seed, params.hash());
        seed = hash_combine(seed, input_forget);
        return seed;
    }

    bool operator==(const primitive& rhs) const override {
        if (!compare_common_params(rhs))
            return false;
        auto rhs_casted = downcast<const lstm_cell>(rhs);
        return params == rhs_casted.params && input_forget == rhs_casted.input_forget;
    }

    void save(BinaryOutputBuffer& ob) const override {
        primitive_base<lstm_cell>::save(ob);
        params.save(ob);
        ob << input_forget;
    }

    void load(BinaryInputBuffer& ib) override {
        primitive_base<lstm_cell>::load(ib);
        params.load(ib);
        ib >> input_forget;
    }

protected:
    std::vector<input_info> get_dependencies() const override {
        return {};
    }
};


}  // namespace cldnn
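The primitive above combines the hash of its params with `input_forget`, compares the same two members in `operator==`, and writes and reads them in the same order in `save`/`load`. The following is a minimal, self-contained sketch of that pattern only; it is not code from the PR. `toy_rnn_params`, `toy_lstm_cell`, and `roundtrip` are hypothetical names, the `hash_combine` here is an assumed boost-style stand-in rather than the plugin's own helper, and a text stream stands in for `BinaryOutputBuffer`/`BinaryInputBuffer`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>

// Assumed boost-style mixing function; a stand-in, not the plugin's helper.
template <typename T>
std::size_t hash_combine(std::size_t seed, const T& v) {
    return seed ^ (std::hash<T>{}(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

// Hypothetical, reduced stand-in for RNNParams: only the three fields that
// the default constructor in the diff initializes.
struct toy_rnn_params {
    float clip = 0.0f;
    int offset_order = 0;  // stands in for lstm_weights_order::iofz
    int direction = 0;

    std::size_t hash() const {
        std::size_t seed = 0;
        seed = hash_combine(seed, clip);
        seed = hash_combine(seed, offset_order);
        seed = hash_combine(seed, direction);
        return seed;
    }
    bool operator==(const toy_rnn_params& rhs) const {
        return clip == rhs.clip && offset_order == rhs.offset_order &&
               direction == rhs.direction;
    }
    void save(std::ostream& os) const {
        os << clip << ' ' << offset_order << ' ' << direction << ' ';
    }
    void load(std::istream& is) { is >> clip >> offset_order >> direction; }
};

// Hypothetical stand-in for the primitive: every member that takes part in
// hash() also takes part in operator==, save() and load(), in the same order.
struct toy_lstm_cell {
    toy_rnn_params params;
    bool input_forget = false;

    std::size_t hash() const {
        std::size_t seed = params.hash();
        seed = hash_combine(seed, input_forget);
        return seed;
    }
    bool operator==(const toy_lstm_cell& rhs) const {
        return params == rhs.params && input_forget == rhs.input_forget;
    }
    void save(std::ostream& os) const {
        params.save(os);
        os << input_forget;  // written as 0 or 1
    }
    void load(std::istream& is) {
        params.load(is);
        is >> input_forget;
    }
};

// Saves a cell into an in-memory text buffer and loads it back.
inline toy_lstm_cell roundtrip(const toy_lstm_cell& cell) {
    std::stringstream buf;
    cell.save(buf);
    toy_lstm_cell restored;
    restored.load(buf);
    assert(!buf.fail());  // every field was parsed back successfully
    return restored;
}
```

In this sketch, a cell passed through `roundtrip` should compare equal to the original and produce the same hash, which is the property a cached or deserialized primitive relies on. Leaving a member out of any one of the four functions would break that property without a compile error.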
nit: alignment
done