devlog.md

[] RND network

[] beta -> UVFA

[] retrace loss

#r2d2 : value based

Summary

Learning targets:

  1. R2D2
  2. Embedding Model
  3. G_function

Key concepts

  1. intrinsic reward
  2. alpha-beta
  3. UVFA
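A minimal sketch of how these three concepts might fit together, assuming the NGU-style formulation: the episodic intrinsic reward is scaled by a life-long modulator `alpha` clipped to `[1, max_scale]`, and the result is mixed into the extrinsic reward with a weight `beta`. Reading "alpha" in the note as that modulator is an assumption. The function names, `max_scale=5.0`, and the evenly spaced `beta_family` are hypothetical and chosen only for illustration; in a UVFA setup a single Q-network would additionally take the chosen `beta` (or its index) as an input, which is not shown here.

```python
def modulated_intrinsic_reward(r_episodic, alpha, max_scale=5.0):
    """Scale the episodic reward by the modulator alpha, clipped to [1, max_scale].

    max_scale=5.0 is an illustrative default (assumption).
    """
    return r_episodic * min(max(alpha, 1.0), max_scale)


def augmented_reward(r_extrinsic, r_intrinsic, beta):
    """Mix extrinsic and intrinsic reward; beta = 0 ignores the intrinsic term."""
    return r_extrinsic + beta * r_intrinsic


def beta_family(n_policies, beta_max=0.3):
    """Hypothetical family of betas, evenly spaced from 0 to beta_max.

    The even spacing is an assumption made for simplicity.
    """
    if n_policies == 1:
        return [0.0]
    return [beta_max * j / (n_policies - 1) for j in range(n_policies)]


if __name__ == "__main__":
    r_i = modulated_intrinsic_reward(r_episodic=0.2, alpha=3.0)
    for beta in beta_family(4):
        print(beta, augmented_reward(r_extrinsic=1.0, r_intrinsic=r_i, beta=beta))
```

Each `beta` in the family corresponds to one policy, ranging from purely exploitative (`beta = 0`) to the most exploratory (`beta = beta_max`).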

Questions

To implement

Agent57

  • Meta Controller (Beta, gamma)
  • Long trace
  • Separate network
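As a starting point for the meta controller item, the sketch below treats each `(beta, gamma)` pair as an arm of a bandit and picks an arm per episode with a sliding-window UCB rule plus epsilon-greedy exploration. This is a simplified illustration, not a faithful Agent57 implementation: the class name `SlidingWindowUCB`, the window size, `ucb_beta`, `epsilon`, and the example `(beta, gamma)` values are all assumptions.

```python
import math
import random
from collections import deque


class SlidingWindowUCB:
    """Bandit over arm indices, using only the most recent `window` episodes."""

    def __init__(self, n_arms, window=90, ucb_beta=1.0, epsilon=0.5, seed=0):
        self.n_arms = n_arms
        self.history = deque(maxlen=window)  # (arm, episode_return) pairs
        self.ucb_beta = ucb_beta             # weight of the exploration bonus
        self.epsilon = epsilon               # probability of a uniform random arm
        self.rng = random.Random(seed)
        self.t = 0                           # number of completed episodes

    def select(self):
        # Play every arm once before using the UCB rule.
        if self.t < self.n_arms:
            return self.t
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        counts = [0] * self.n_arms
        sums = [0.0] * self.n_arms
        for arm, ret in self.history:
            counts[arm] += 1
            sums[arm] += ret
        n_total = len(self.history)

        def score(arm):
            # Arms that dropped out of the window are retried first.
            if counts[arm] == 0:
                return math.inf
            mean = sums[arm] / counts[arm]
            bonus = self.ucb_beta * math.sqrt(math.log(n_total) / counts[arm])
            return mean + bonus

        return max(range(self.n_arms), key=score)

    def update(self, arm, episode_return):
        self.history.append((arm, episode_return))
        self.t += 1


if __name__ == "__main__":
    # Hypothetical (beta, gamma) pairs; the values are placeholders.
    arms = [(0.0, 0.997), (0.15, 0.99), (0.3, 0.97)]
    controller = SlidingWindowUCB(n_arms=len(arms))
    for episode in range(5):
        j = controller.select()
        beta, gamma = arms[j]
        episode_return = 0.0  # placeholder: run one episode with (beta, gamma)
        controller.update(j, episode_return)
```

The sliding window lets the controller follow returns that drift as training progresses, at the cost of periodically revisiting arms whose statistics have aged out.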