git pull the rnn repo and update the keras_rnn_lm folder
This is the first step toward implementing RNN language models on FPGAs. All modules are heavily commented. We will use High-Level Synthesis (HLS) to turn this C code into a hardware description language (HDL).
We prepared everything for you: simply run 'make' on the command line, then './main' to see the result over 1000 samples.
If you don't want to run the program yourself, here are some sample results:
log_double.out: 27.8125%
log_float.out: 26.8750%
These numbers are the next-word inference accuracy. Each word is predicted by feeding the sequence of the previous 50 words into the RNN. We tested with both 'double' and 'float' in C, i.e., float64 and float32. Both are slightly less accurate than Keras, which reaches 28.1250%.
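To see why float32 loses more accuracy than float64, here is a toy program (not part of this repo) that accumulates the same dot-product-style sum in both precisions; the drift it prints is the kind of error that builds up over 50 RNN steps.

    #include <stdio.h>

    /* Toy demo: the same sum accumulated in float vs. double drifts
     * apart as the number of terms grows. */
    int main(void) {
        float acc_f = 0.0f;
        double acc_d = 0.0;
        for (int i = 0; i < 1000000; i++) {
            float x = 0.1f;          /* 0.1 is not exactly representable */
            acc_f += x * x;
            acc_d += (double)x * (double)x;
        }
        printf("float:  %.9f\n", acc_f);
        printf("double: %.9f\n", acc_d);
        printf("diff:   %.9f\n", acc_d - (double)acc_f);
        return 0;
    }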
fc.c: Fully-connected Layers (see the sketch after this file list)
rnn.c: RNN Layers
activation.c: relu & tanh
softmax.c: softmax
main.c: float32 inference
main_double.c: float64 inference
./datasets: sequences of 50 words each; the target we want to predict is the next single word
./h5: convert the original HDF5 file containing all weights into several txt files
./model: all weights in txt form, with high precision (at least enough for float and double)
constant.h: RNN dimensions, word dictionary size, etc.
load_data.c: load weights and datasets from the txt files into C arrays
params_init.c: initialize weights to zeros (usually used with malloc)
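For orientation, here is a minimal sketch of what the layer code does, assuming hypothetical signatures; the real fc.c, softmax.c, activation.c, and params_init.c interfaces may differ. Compile with -lm for the math library.

    #include <math.h>
    #include <stdlib.h>

    /* Hypothetical fully-connected layer sketch: y = tanh(W x + b).
     * The actual fc.c / activation.c code may be organized differently. */
    void fc_tanh(const float *W, const float *b, const float *x,
                 float *y, int in_dim, int out_dim) {
        for (int o = 0; o < out_dim; o++) {
            float acc = b[o];
            for (int i = 0; i < in_dim; i++)
                acc += W[o * in_dim + i] * x[i];
            y[o] = tanhf(acc);   /* standard-library tanh (see notes below) */
        }
    }

    /* Hypothetical softmax sketch with max subtraction for numerical
     * stability; the repo's softmax.c may differ. */
    void softmax(float *v, int n) {
        float m = v[0];
        for (int i = 1; i < n; i++) if (v[i] > m) m = v[i];
        float sum = 0.0f;
        for (int i = 0; i < n; i++) { v[i] = expf(v[i] - m); sum += v[i]; }
        for (int i = 0; i < n; i++) v[i] /= sum;
    }

    /* params_init.c-style zero initialization, paired with allocation. */
    float *zeros(int n) {
        return (float *)calloc((size_t)n, sizeof(float));
    }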
The accuracy of the C implementation is slightly worse than Keras. We proposed several possible causes and tried to fix them. However, after ruling out several of them, there is still something we could not reproduce in C, which may be the cause of the accuracy drop.
C standard library manual: use its tanh instead of building your own.
I wrote some Python to output very precise txt-format float numbers, but the result is still the same.
double -> very close to the Keras result (0.3% accuracy drop), while
float -> 1.3% drop
can influence the accuracy of the output probability distribution, but not the predicted result
not very likely; at least I didn't explicitly write that
possible, but I have no idea how to implement that in C
Most likely reasons:
(1) Keras's optimized tanh implementation
(2) Keras's matrix multiplication optimizations (affecting precision)
Neither is easy to reproduce in our C code; a toy tanh precision comparison follows.
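To get a feel for how much the tanh implementation alone can matter, here is a toy comparison (not from this repo) of the float and double standard-library versions; a framework with its own tanh approximation would diverge from ours in a similarly small way, and those differences compound over 50 timesteps.

    #include <math.h>
    #include <stdio.h>

    /* Toy comparison: per-call precision gap between tanhf and tanh.
     * Compile with -lm. */
    int main(void) {
        for (float x = -2.0f; x <= 2.0f; x += 0.5f)
            printf("x=%5.2f  tanhf=%.9f  tanh=%.17f\n",
                   x, tanhf(x), tanh((double)x));
        return 0;
    }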
list all contents: h5dump -n pre-trained-rnn.h5
open an attribute: h5dump -a "/model_weights/dense_1/weight_names" pre-trained-rnn.h5
view a dataset (show weights): h5dump -d "/model_weights/dense_1/dense_1/bias:0" pre-trained-rnn.h5
h5 to txt: h5dump -o dense_1_bias.txt -y -w 1000000000 dense_1_bias.h5
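Once h5dump has written a txt file, reading it back in C can be as simple as the sketch below, assuming the dump is one stream of comma- or whitespace-separated numbers; the function name is hypothetical, and load_data.c may parse the files differently.

    #include <stdio.h>

    /* Hypothetical loader: read up to n comma/whitespace-separated floats
     * from a txt dump into a preallocated buffer. Returns the count read,
     * or -1 if the file cannot be opened. */
    int load_txt_weights(const char *path, float *buf, int n) {
        FILE *fp = fopen(path, "r");
        if (!fp) return -1;
        int count = 0;
        while (count < n && fscanf(fp, " %f ,", &buf[count]) == 1)
            count++;
        fclose(fp);
        return count;
    }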