Closed
Problem Description
Given the incredible performance of the DDPG + JAX prototype (#187), it's worth prototyping TD3 + JAX as well. @joaogui1 is super experienced with JAX and has expressed interest in working on this. Thanks @joaogui1 for your interest! This issue tracks the development effort.
I suggest extending the DDPG prototype to work with TD3. Here are a couple of additional resources:
- CleanRL's DDPG docs: https://docs.cleanrl.dev/rl-algorithms/ddpg/
- CleanRL's TD3 docs: https://docs.cleanrl.dev/rl-algorithms/td3/
To see exactly how CleanRL's DDPG differs from TD3, a file diff between ddpg_continuous_action.py and td3_continuous_action.py would explicitly show the code differences.
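If a diff tool is not at hand, a few lines of Python can produce the same comparison. The sketch below is only an illustration using the standard library's difflib; the helper name `file_diff` is hypothetical and not part of CleanRL.

```python
import difflib
from pathlib import Path


def file_diff(old_path, new_path):
    """Return a unified diff between two text files as a single string.

    Hypothetical helper for illustration; not part of CleanRL.
    """
    # Keep line endings so the diff output can be joined back verbatim.
    old_lines = Path(old_path).read_text().splitlines(keepends=True)
    new_lines = Path(new_path).read_text().splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=str(old_path),
        tofile=str(new_path),
    )
    # An empty string means the two files are identical.
    return "".join(diff)
```

For example, `print(file_diff("cleanrl/ddpg_continuous_action.py", "cleanrl/td3_continuous_action.py"))` would print the differences, assuming both scripts sit under a `cleanrl/` directory in your checkout; adjust the paths if your layout differs.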
There is a contribution checklist to help when making the PR. See #186 as an example.
Thanks again @joaogui1 and let me know if you run into any issues!