This post covers:

- Why use a bidirectional LSTM
- What is a bidirectional LSTM
- An example

---
#### Why use a bidirectional LSTM?

A unidirectional RNN infers what comes next from the information before it, but sometimes the preceding words alone are not enough. For example:

I don't feel well today, so I plan to ____ for a day.

From 'don't feel well' alone, the blank could plausibly be 'go to the hospital', 'sleep', 'take leave', and so on. But once the trailing 'for a day' is taken into account, the range of choices narrows: 'go to the hospital' no longer fits, while options like 'take leave' or 'rest' become much more likely.

---
#### What is a bidirectional LSTM?

The hidden layer of a bidirectional recurrent network keeps two values: A takes part in the forward computation and A' in the backward computation. The final output y depends on both A and A':

![](http://upload-images.jianshu.io/upload_images/1667471-ad054c3a8b703f28.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
That is, in the forward computation the hidden state s_t depends on s_{t-1}, while in the backward computation the hidden state s'_t depends on s'_{t+1}:

![](http://upload-images.jianshu.io/upload_images/1667471-b6dddc4e9d2b5fd4.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

![](http://upload-images.jianshu.io/upload_images/1667471-d2e41409e1337748.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
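The two figures above give the update rules. As a rough reconstruction of the standard bidirectional RNN equations they depict (following the A / A' split above; $U, W, V$ and their primed counterparts are the assumed per-direction weight matrices, $f$ and $g$ the hidden and output activations):

$$
\begin{aligned}
s_t &= f(U x_t + W s_{t-1}) \\
s'_t &= f(U' x_t + W' s'_{t+1}) \\
y_t &= g(V s_t + V' s'_t)
\end{aligned}
$$

Note that the two directions share no weights; each direction keeps its own parameter set.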
On some tasks, a bidirectional LSTM performs better than a unidirectional one:

![](http://upload-images.jianshu.io/upload_images/1667471-bba99f50ee3d9784.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

---
#### Example

Below is a small Keras example of a bidirectional LSTM. The task is to classify a sequence timestep by timestep. For example, given the following 10 random numbers:

`0.63144003 0.29414551 0.91587952 0.95189228 0.32195638 0.60742236 0.83895793 0.18023048 0.84762691 0.29165514`
a timestep is labeled 1 once the cumulative sum exceeds a preset threshold, and 0 otherwise. With a threshold of 2.5, the labels for the input above are (the short check below confirms this):

`0 0 0 1 1 1 1 1 1 1`
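A quick sanity check of the labeling rule, using only numpy on the numbers above (this snippet is an illustration, not part of the original example):

```
from numpy import array, cumsum

x = array([0.63144003, 0.29414551, 0.91587952, 0.95189228, 0.32195638,
           0.60742236, 0.83895793, 0.18023048, 0.84762691, 0.29165514])
# a timestep is labeled 1 once the running sum exceeds the threshold 2.5
y = (cumsum(x) > 2.5).astype(int)
print(y)  # [0 0 0 1 1 1 1 1 1 1]
```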
The only difference from a unidirectional LSTM is the Bidirectional wrapper (by default Keras concatenates the forward and backward outputs, so each timestep's output here has dimension 2 × 20 = 40):

`model.add(Bidirectional(LSTM(20, return_sequences=True), input_shape=(n_timesteps, 1)))`

```
from random import random
from numpy import array
from numpy import cumsum
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import TimeDistributed
from keras.layers import Bidirectional

# create a sequence classification instance
def get_sequence(n_timesteps):
    # create a sequence of random numbers in [0,1]
    X = array([random() for _ in range(n_timesteps)])
    # calculate cut-off value to change class values
    limit = n_timesteps / 4.0
    # determine the class outcome for each item in the cumulative sequence
    y = array([0 if x < limit else 1 for x in cumsum(X)])
    # reshape input and output data to be suitable for LSTMs: (batch, timesteps, features)
    X = X.reshape(1, n_timesteps, 1)
    y = y.reshape(1, n_timesteps, 1)
    return X, y

# define problem properties
n_timesteps = 10

# define a bidirectional LSTM that emits one prediction per timestep
model = Sequential()
model.add(Bidirectional(LSTM(20, return_sequences=True), input_shape=(n_timesteps, 1)))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])

# train LSTM: each epoch fits one freshly generated random sequence
for epoch in range(1000):
    # generate a new random sequence
    X, y = get_sequence(n_timesteps)
    # fit model for one epoch on this sequence
    model.fit(X, y, epochs=1, batch_size=1, verbose=2)

# evaluate LSTM on one new sequence
X, y = get_sequence(n_timesteps)
yhat = model.predict_classes(X, verbose=0)
for i in range(n_timesteps):
    print('Expected:', y[0, i], 'Predicted', yhat[0, i])
```
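To probe the earlier claim that a bidirectional model can beat a unidirectional one, a simple variation (assumed here, not part of the original post) is to swap the wrapped layer for a plain LSTM and rerun the same training loop:

```
# unidirectional baseline: same task and output head, forward pass only
model = Sequential()
model.add(LSTM(20, return_sequences=True, input_shape=(n_timesteps, 1)))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
```

Training both models for the same number of epochs and comparing per-timestep accuracy gives a rough feel for how much the backward direction helps on this particular task.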

---

References:

https://zybuluo.com/hanbingtao/note/541458
https://maxwell.ict.griffith.edu.au/spl/publications/papers/ieeesp97_schuster.pdf
http://machinelearningmastery.com/develop-bidirectional-lstm-sequence-classification-python-keras/
