forked from lazyprogrammer/machine_learning_examples
extra_reading.txt
Reinforcement Learning: A Tutorial Survey and Recent Advances - Abhijit Gosavi
http://web.mst.edu/~gosavia/joc.pdf
Algorithms for Reinforcement Learning - Csaba Szepesvári
http://old.sztaki.hu/~szcsaba/papers/RLAlgsInMDPs-lecture.pdf
Markov Decision Processes in Artificial Intelligence
https://zodml.org/sites/default/files/Markov_Decision_Processes_and_Artificial_Intelligence.pdf
MDP Preliminaries
http://nanjiang.cs.illinois.edu/files/cs598/note1.pdf
Concentration Inequalities and Multi-Armed Bandits
http://nanjiang.cs.illinois.edu/files/cs598/note_bandit.pdf
Notes on Tabular Methods
http://nanjiang.cs.illinois.edu/files/cs598/note3.pdf
Notes on State Abstractions
http://nanjiang.cs.illinois.edu/files/cs598/note4.pdf
Notes on Fitted Q-iteration
http://nanjiang.cs.illinois.edu/files/cs598/note5.pdf
Convergence of Stochastic Iterative Dynamic Programming Algorithms
https://papers.nips.cc/paper/764-convergence-of-stochastic-iterative-dynamic-programming-algorithms.pdf
Reinforcement Learning: An Introduction (2nd edition) - Sutton & Barto
http://incompleteideas.net/sutton/book/the-book-2nd.html
Finite-Sample Analysis of Proximal Gradient TD Algorithms
https://marek.petrik.us/pub/Liu2015.pdf
Finite Sample Analyses for TD(0) with Function Approximation
https://arxiv.org/pdf/1704.01161.pdf
Mastering the game of Go with deep neural networks and tree search - Silver, D. et al.
https://storage.googleapis.com/deepmind-media/alphago/AlphaGoNaturePaper.pdf
Learning Rates for Q-learning
http://www.jmlr.org/papers/volume5/evendar03a/evendar03a.pdf