Description
Motivation
I suggest adding an implementation of TQC (Truncated Quantile Critics) to the examples, and listing it as a request in the 'call for contributions'. I would be happy to take on the implementation.
Solution
Add a performant, clear, and minimal implementation of TQC to the examples. I would base it on the existing SAC example, since the two algorithms share the same overall structure. The TQC implementation in Stable-Baselines3 Contrib (sb3-contrib) is likewise based on the Stable-Baselines3 implementation of SAC.
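To illustrate where TQC departs from SAC, here is a minimal sketch of the target computation for a single transition. Instead of taking the minimum over critics, TQC pools the quantile estimates (atoms) of all target critics, sorts them, and drops the largest ones before forming the Bellman target. This is plain Python for readability and is not the proposed implementation, which would operate on batched tensors; the function name and all parameter names are illustrative assumptions.

```python
from typing import List, Sequence


def truncated_target_atoms(
    next_quantiles: Sequence[Sequence[float]],
    reward: float,
    gamma: float,
    alpha: float,
    next_log_prob: float,
    top_quantiles_to_drop_per_net: int,
    done: bool = False,
) -> List[float]:
    """Build TQC target atoms for one transition (illustrative sketch).

    next_quantiles holds one list of quantile estimates per target critic,
    evaluated at the next state and an action sampled from the policy.
    """
    n_critics = len(next_quantiles)

    # Pool the atoms of all critics into a single ascending list.
    pooled = sorted(z for critic in next_quantiles for z in critic)

    # Truncation: discard the largest atoms to counter overestimation.
    n_keep = len(pooled) - top_quantiles_to_drop_per_net * n_critics
    if n_keep <= 0:
        raise ValueError("truncation would drop every atom")
    kept = pooled[:n_keep]

    # Entropy-regularised Bellman backup applied to each remaining atom.
    not_done = 0.0 if done else 1.0
    return [reward + not_done * gamma * (z - alpha * next_log_prob) for z in kept]
```

With two critics of three atoms each and one atom dropped per critic, four target atoms remain. The rest of the SAC example (actor update, temperature update, replay buffer) would carry over largely unchanged, with the critic loss replaced by a quantile Huber loss against these target atoms.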
Checklist
- I have checked that there is no similar issue in the repo (required)
- Add a request to the call for contributions
- Adapt the implementation of SAC from the examples to TQC
- Benchmark?