yifengzhong-cat/Deep_learning_paper_tutorial

This project presents close, in-depth readings of selected classic papers in deep learning.

It consists of five main parts:

  1. Classic deep learning papers, original texts (Original paper)

  2. Classic deep learning papers with annotations (paper with notes)

  3. Reading notes on the papers (notes)

  4. Additional supplementary papers (Supplementary paper)

  5. Supplementary knowledge points (Supplementary knowledge)

Classic papers covered in detail include (continuously updated):

  1. LeCun Y, Bengio Y, Hinton G. Deep learning[J]. Nature, 2015, 521(7553): 436-444.
  2. Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets[J]. Neural computation, 2006, 18(7): 1527-1554.
  3. Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. 2012.
  4. Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507.
  5. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
  6. He K, et al. Deep residual learning for image recognition[J]. arXiv preprint arXiv:1512.03385, 2015.
  7. Szegedy C, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
  8. Kingma D P, Ba J. Adam: A method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
  9. Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[J]. arXiv preprint arXiv:1502.03167, 2015.
  10. Deng J, Dong W, Socher R, et al. Imagenet: A large-scale hierarchical image database[C]//2009 IEEE conference on computer vision and pattern recognition. IEEE, 2009: 248-255.
  11. Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The journal of machine learning research, 2014, 15(1): 1929-1958.
  12. Ba J L, Kiros J R, Hinton G E. Layer normalization[J]. arXiv preprint arXiv:1607.06450, 2016.
  13. Courbariaux M, Hubara I, Soudry D, et al. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1[J]. arXiv preprint arXiv:1602.02830, 2016.
  14. Jaderberg M, Czarnecki W M, Osindero S, et al. Decoupled neural interfaces using synthetic gradients[C]//International Conference on Machine Learning. PMLR, 2017: 1627-1635.
  15. Hinton G E, Srivastava N, Krizhevsky A, et al. Improving neural networks by preventing co-adaptation of feature detectors[J]. arXiv preprint arXiv:1207.0580, 2012.
  16. Chen T, Goodfellow I, Shlens J. Net2net: Accelerating learning via knowledge transfer[J]. arXiv preprint arXiv:1511.05641, 2015.
  17. Wei T, Wang C, Rui Y, et al. Network morphism[C]//International Conference on Machine Learning. 2016: 564-572.
  18. Han S, Mao H, Dally W J. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding[J]. arXiv preprint arXiv:1510.00149, 2015.
  19. Andrychowicz M, Denil M, Gomez S, et al. Learning to learn by gradient descent by gradient descent[C]//Advances in neural information processing systems. 2016: 3981-3989.
  20. Sutskever I, Martens J, Dahl G, et al. On the importance of initialization and momentum in deep learning[C]//International conference on machine learning. 2013: 1139-1147.
  21. Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size[J]. arXiv preprint arXiv:1602.07360, 2016.
  22. Xiong W, Droppo J, Huang X, et al. Achieving human parity in conversational speech recognition[J]. arXiv preprint arXiv:1610.05256, 2016.
  23. Hinton G, Deng L, Yu D, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups[J]. IEEE Signal processing magazine, 2012, 29(6): 82-97.
  24. Amodei D, Ananthanarayanan S, Anubhai R, et al. Deep speech 2: End-to-end speech recognition in English and Mandarin[C]//International conference on machine learning. 2016: 173-182.
  25. Sak H, Senior A, Rao K, et al. Fast and accurate recurrent neural network acoustic models for speech recognition[J]. arXiv preprint arXiv:1507.06947, 2015.
  26. Graves A, Mohamed A, Hinton G. Speech recognition with deep recurrent neural networks[C]//2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013: 6645-6649.
  27. Graves A, Jaitly N. Towards end-to-end speech recognition with recurrent neural networks[C]//International conference on machine learning. 2014: 1764-1772.
  28. Heaven D. Deep trouble for deep learning[J]. Nature, 2019, 574(7777): 163-166.
  29. Vinyals O, Blundell C, Lillicrap T, et al. Matching networks for one shot learning[C]//Advances in neural information processing systems. 2016: 3630-3638.
  30. Kingma D P, Welling M. Auto-encoding variational bayes[J]. arXiv preprint arXiv:1312.6114, 2013.
  31. Le Q V. Building high-level features using large scale unsupervised learning[C]//2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013: 8595-8598.
  32. Van den Oord A, Kalchbrenner N, Espeholt L, et al. Conditional image generation with pixelcnn decoders[C]//Advances in neural information processing systems. 2016: 4790-4798.
  33. Gregor K, Danihelka I, Graves A, et al. DRAW: A recurrent neural network for image generation[J]. arXiv preprint arXiv:1502.04623, 2015.
  34. Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in neural information processing systems. 2014: 2672-2680.
  35. Oord A, Kalchbrenner N, Kavukcuoglu K. Pixel recurrent neural networks[J]. arXiv preprint arXiv:1601.06759, 2016.
  36. Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks[J]. arXiv preprint arXiv:1511.06434, 2015.
  37. Vinyals O, Le Q. A neural conversational model[J]. arXiv preprint arXiv:1506.05869, 2015.
  38. Graves A. Generating sequences with recurrent neural networks[J]. arXiv preprint arXiv:1308.0850, 2013.
  39. Cho K, Van Merriënboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J]. arXiv preprint arXiv:1406.1078, 2014.
  40. Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014.
  41. Sukhbaatar S, Weston J, Fergus R. End-to-end memory networks[C]//Advances in neural information processing systems. 2015: 2440-2448.
  42. Weston J, Chopra S, Bordes A. Memory networks[J]. arXiv preprint arXiv:1410.3916, 2014.
  43. Graves A, Wayne G, Danihelka I. Neural turing machines[J]. arXiv preprint arXiv:1410.5401, 2014.
  44. Vinyals O, Fortunato M, Jaitly N. Pointer networks[C]//Advances in neural information processing systems. 2015: 2692-2700.
  45. Zaremba W, Sutskever I. Reinforcement learning neural turing machines-revised[J]. arXiv preprint arXiv:1505.00521, 2015.
