design of RNNOp #3727
doc/design/ops/rnn.md
Outdated
- init_memory, the variable to help initialize memory | ||
|
||
### step scopes | ||
Each RNN has more than one step times, and the stepnet will be executed in every step time. |
The RNN might run one or more steps.
doc/design/ops/rnn.md
Outdated
|
||
<p aligh="center"> | ||
<img src="./images/rnn.png"/><br/> | ||
fig 2 the RNN's data flow |
fig 2 ==> Figure 2
doc/design/ops/rnn.md
Outdated
|
||
There are several important concepts: | ||
|
||
- stepnet, the network execute in every time step |
stepnet => step-net
doc/design/ops/rnn.md
Outdated
|
||
There are several important concepts: | ||
|
||
- stepnet, the network execute in every time step |
the network to be executed in each step
doc/design/ops/rnn.md
Outdated
- init-memory, the variable to help initialize state in the first time step. | ||
|
||
### step scopes | ||
|
Step Scope
doc/design/ops/rnn.md
Outdated
- init-memory, the variable to help initialize state in the first time step. | ||
|
||
### step scopes | ||
|
The step-net could have local variables defined. In each step of RNN execution, a scope is created to hold corresponding variables. Such a scope is known as a step scope.
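The step-scope idea in this comment can be illustrated with a small sketch. Everything here is hypothetical and stands in for the real framework: the `Scope` class, `run_rnn`, and the toy `step_net` are illustrative names, not the API proposed in this PR.

```python
class Scope:
    """Hypothetical scope: a name -> value map with lookup through a parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.vars = {}

    def set(self, name, value):
        self.vars[name] = value

    def get(self, name):
        # Look in this scope first, then walk up to the enclosing scopes.
        scope = self
        while scope is not None:
            if name in scope.vars:
                return scope.vars[name]
            scope = scope.parent
        raise KeyError(name)


def run_rnn(step_net, inputs, parent_scope):
    """Run step_net once per time step, each time in a fresh step scope."""
    step_scopes = []
    for x_t in inputs:
        scope = Scope(parent=parent_scope)  # one step scope per time step
        scope.set("x", x_t)
        step_net(scope)
        step_scopes.append(scope)
    return step_scopes


def step_net(scope):
    # "y" is a local variable of the step-net; it is stored in the step scope,
    # while "w" is found in the parent (global) scope.
    scope.set("y", scope.get("w") * scope.get("x"))


global_scope = Scope()
global_scope.set("w", 2.0)
step_scopes = run_rnn(step_net, [1.0, 2.0, 3.0], global_scope)
outputs = [s.get("y") for s in step_scopes]
```

In this sketch each step's local variable `y` lives only in its own step scope, and the shared parameter `w` is resolved through the parent scope.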
doc/design/ops/rnn.md
Outdated
h_t = U h_{t-1} + W x_t | ||
$$ | ||
|
||
Here, $h_t$ is time $t$'s state, $h_t$ is time $t-1$'s state, in implementation, we call the a variable that store a state memory. |
", in implementation, we call the a variable that store a state memory." can be deleted
doc/design/ops/rnn.md
Outdated
$$ | ||
|
||
Here, $h_t$ is time $t$'s state, $h_t$ is time $t-1$'s state, in implementation, we call the a variable that store a state memory. | ||
In step time $t$, $h_t$ is memory, $h_{t-1}$ is pre-memory (short for previous memory). |
"In step time
doc/design/ops/rnn.md
Outdated
Here, $h_t$ is time $t$'s state, $h_t$ is time $t-1$'s state, in implementation, we call the a variable that store a state memory. | ||
In step time $t$, $h_t$ is memory, $h_{t-1}$ is pre-memory (short for previous memory). | ||
|
||
In each step scope |
+In each step scope
+- each memory variable has a corresponding pre-memory variable
+- before a time step executes, copy (or make a reference) the value of previous step scope's memory to the pre-memory variable in current step scope.
=>
In the implementation, we can make an ex-memory variable either "refer to" the memory variable of the previous step, or hold a copy of the previous memory variable's value.
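A minimal sketch of the two options (reference versus copy), under simplifying assumptions: step scopes are plain dicts, the state is a list of floats, and the weights `u` and `w` are scalars applied element-wise instead of matrices. The names `run_steps` and `share_memory` are hypothetical.

```python
import copy


def run_steps(xs, init_memory, u, w, share_memory=True):
    """Compute h_t = u * h_{t-1} + w * x_t element-wise, one scope per step.

    share_memory=True  -> ex-memory refers to the previous step's memory.
    share_memory=False -> ex-memory is a copy of the previous step's memory.
    """
    step_scopes = []
    prev_memory = init_memory
    for x_t in xs:
        scope = {}
        if share_memory:
            scope["ex_memory"] = prev_memory             # same object, no copy
        else:
            scope["ex_memory"] = copy.copy(prev_memory)  # independent copy
        scope["memory"] = [
            u * h + w * x for h, x in zip(scope["ex_memory"], x_t)
        ]
        step_scopes.append(scope)
        prev_memory = scope["memory"]
    return step_scopes


shared = run_steps([[1.0], [2.0]], [0.0], u=0.5, w=1.0, share_memory=True)
copied = run_steps([[1.0], [2.0]], [0.0], u=0.5, w=1.0, share_memory=False)
```

Both variants produce the same values. Referencing avoids a copy per step, while copying keeps each step scope independent of later in-place changes to the previous step's memory.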
doc/design/ops/rnn.md
Outdated
- each memory variable has a corresponding pre-memory variable | ||
- before a time step executes, copy (or make a reference) the value of previous step scope's memory to the pre-memory variable in current step scope. | ||
|
||
### C++ API |
The C++ API
doc/design/ops/rnn.md
Outdated
- void Run(const framework::Scope& scope, const platform::DeviceContext& dev_ctx) const; | ||
- run all the time steps. | ||
|
||
### User interface |
The Python Interface
|
||
rnn = pd.create_rnn_op(output_num=1) | ||
with rnn.stepnet(): | ||
x = rnn.add_input(X) |
This example uses `rnn.add_input`, but the next example uses `rnn.segment_input`. Are they the same?
Yes, I will change them all to `rnn.add_input`.
done
We need to differentiate two types of input: sequence input and static input. Each instance has a different static input, but for a given instance the static input is the same across all time steps.
The static inputs will be treated as global variables, so they do not need to be passed as inputs. The `add_input` statement only marks the sequence inputs that need to be segmented across the RNN's time steps. @emailweixu
Static inputs are different from parameters. They still need to be split according to whether an instance participates in a given time step, whereas parameters do not need to be split.
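The distinction raised in this thread can be sketched as follows. This is only an illustration under assumptions: sequences are variable-length Python lists, each instance has one static input, and `segment_inputs` is a hypothetical helper, not part of the proposed API.

```python
def segment_inputs(sequences, static_inputs):
    """Build per-step inputs for a batch of variable-length sequences.

    Sequence inputs are segmented by time step. Static inputs are constant
    over time for one instance, but are still split per step so that only
    the instances participating in that step are included.
    """
    max_len = max(len(seq) for seq in sequences)
    steps = []
    for t in range(max_len):
        # Instances whose sequence is long enough to take part in step t.
        active = [i for i, seq in enumerate(sequences) if t < len(seq)]
        steps.append({
            "instances": active,
            "x": [sequences[i][t] for i in active],
            "static": [static_inputs[i] for i in active],
        })
    return steps


sequences = [[1, 2, 3], [4, 5]]   # two instances with different lengths
static_inputs = ["a", "b"]        # one static input per instance
steps = segment_inputs(sequences, static_inputs)
```

Parameters such as `W` and `U` would not appear in `steps` at all, since every instance at every step uses the same values.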
|
||
We can define an RNN's step-net using Block: | ||
|
||
```python |
Does this API work with attention models?
This syntax should be compatible with Paddle V1, but without the support of Beam Search.
doc/design/ops/rnn.md
Outdated
# update current memory | ||
h.update(new_state) | ||
# indicate that h variables in all step scopes should be merged | ||
rnn.set_output(0, h) |
What does "0" mean in `set_output`? Every `set_output` in this PR uses "0" as its argument.
0 means the "0-th" argument
doc/design/ops/rnn.md
Outdated
h.update( | ||
pd.matmul(W, sentence) + pd.matmul(U, h.pre_state())) | ||
# get the last state as sentence's info | ||
rnn.set_output(0, h) |
Is the `0` here indicating the first output?
How can we specify that an RNN should return just the output from the last step?
rnn = pd.create_rnn_op()
with rnn.stepnet():
    x = rnn.set_inputs(X)
    # declare a memory (the RNN's state)
    h = rnn.add_memory(init=a)
    # h.pre_state() means the previous memory of the RNN
    new_state = pd.add_two(pd.matmul(W, x), pd.matmul(U, h.pre_state()))
    # update current memory
    h.update(new_state)
    # indicate that h variables in all step scopes should be merged
    rnn.set_outputs(h)
# output last step
out = rnn(output_all_steps=False)
Can we use the argument `output_all_steps` to choose between outputting all steps and just the last step?
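To make the question concrete, here is a minimal sketch of what such a flag could mean. It assumes a scalar state and scalar weights, and the function `run_simple_rnn` is hypothetical; only the flag name `output_all_steps` comes from the suggestion above.

```python
def run_simple_rnn(xs, init, u, w, output_all_steps=True):
    """Scalar recurrence h_t = u * h_{t-1} + w * x_t.

    Returns the list of all step states when output_all_steps is True,
    otherwise only the state after the last step.
    """
    h = init
    states = []
    for x_t in xs:
        h = u * h + w * x_t
        states.append(h)
    return states if output_all_steps else h


all_states = run_simple_rnn([1.0, 2.0, 3.0], init=0.0, u=0.5, w=1.0)
last_state = run_simple_rnn([1.0, 2.0, 3.0], init=0.0, u=0.5, w=1.0,
                            output_all_steps=False)
```

With this reading, merging `h` over all step scopes and returning only the last step differ in what is collected at the end, not in how the step-net runs.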
Fixes #3823