Deep Mimic example #19

Open · tsampazk opened this issue Jan 22, 2021 · 10 comments
Labels: enhancement (New feature or request), tracker (Used to track process of a project)

Comments

@tsampazk (Member)

Relevant discussion in #18; suggested by @rohit-kumar-j.

Original Deep Mimic implementation

A basic Deep Mimic example could include a "teacher" cartpole robot that uses a PID controller and a "student" cartpole robot identical to the existing cartpole example that uses RobotSupervisor, plus an emitter/receiver scheme through which the student receives information from the "teacher" cartpole robot.
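To make the suggestion concrete, here is a minimal sketch of what the "teacher" side could look like as a plain Webots Python controller. The device names (`emitter`, `poleAngleSensor`, `wheel1`–`wheel4`) and the PID gains are assumptions for illustration, not taken from the existing deepworlds cartpole example:

```python
# Hypothetical "teacher" controller sketch: a PID loop balances the pole and
# broadcasts its state/action over a Webots Emitter for the "student" to use.
from controller import Robot
import struct

class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error, dt):
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

robot = Robot()
timestep = int(robot.getBasicTimeStep())
dt = timestep / 1000.0

pole_sensor = robot.getDevice("poleAngleSensor")  # assumed device name
pole_sensor.enable(timestep)
emitter = robot.getDevice("emitter")              # assumed device name
wheels = [robot.getDevice(f"wheel{i}") for i in range(1, 5)]
for wheel in wheels:
    wheel.setPosition(float("inf"))  # switch motors to velocity control
    wheel.setVelocity(0.0)

pid = PID(kp=30.0, ki=1.0, kd=2.0)  # gains would need tuning
while robot.step(timestep) != -1:
    angle = pid_error = pole_sensor.getValue()  # pole deviation from vertical
    velocity = pid.step(-pid_error, dt)
    for wheel in wheels:
        wheel.setVelocity(velocity)
    # Broadcast state and action so the "student" RobotSupervisor can imitate.
    emitter.send(struct.pack("2f", angle, velocity))
```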

tsampazk added the enhancement (New feature or request) label on Jan 22, 2021
@tsampazk (Member Author)

@all-contributors please add @rohit-kumar-j for ideas

@allcontributors (Contributor)

@tsampazk

I've put up a pull request to add @rohit-kumar-j! 🎉

@rohit-kumar-j

@tsampazk, thank you for adding me as a contributor!

I am currently working on the Deep Mimic example in PyBullet, testing methods that would help parent the stock humanoid so that any other similarly structured robot can be used for training without much initial setup. Here are some results:

In this video, the stock humanoid is driving the custom-designed robot using inverse kinematics (I'm hoping this tracking will serve as the basis for the robot's reward function during training):

full_body_ik-2021-02-17.mp4
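For reference, a minimal sketch of the kind of pose-imitation reward this could become, following the pose-reward term from the Deep Mimic paper, r_p = exp(-2 Σ_j ‖q̂_j - q_j‖²). Scalar joint angles are used here for simplicity, whereas the paper uses quaternion joint rotations; this is an assumption, not the actual implementation:

```python
# Hypothetical DeepMimic-style pose-imitation reward: the policy is rewarded
# for matching the reference joint angles produced by the IK retargeting.
import numpy as np

def pose_imitation_reward(joint_angles, ref_joint_angles, scale=2.0):
    """Return a reward in (0, 1]; 1.0 means the pose matches the reference exactly."""
    diff = np.asarray(ref_joint_angles) - np.asarray(joint_angles)
    return float(np.exp(-scale * np.dot(diff, diff)))

# e.g. pose_imitation_reward([0.10, -0.32], [0.12, -0.30]) ≈ 0.998
```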

Once complete, hopefully we can port this example to Webots.

Warm Regards,
Rohit

@tsampazk (Member Author)

@rohit-kumar-j This looks really promising, Rohit! I'm looking forward to seeing the complete example, so we can start working on porting it to Webots. I think it would make for an impressive example to add to the deepworlds repository.

@rohit-kumar-j

@tsampazk, I agree. Unfortunately, I do not know the Webots codebase and methods, so I can help out with the logic and implementation while learning Webots along the way. I hope it is okay if I post updates on the example in this thread itself.

Warm Regards,
Rohit Kumar J

@tsampazk (Member Author)

> I hope it is okay if I post updates on the example in this thread itself.

@rohit-kumar-j Yeap, sounds fine, go ahead. 😀

tsampazk added the tracker (Used to track process of a project) label on Feb 17, 2021
@rohit-kumar-j

Here is an update on the ghost robot which the robot will need to follow.
This will be used to train the RL algorithm in accordance with Deep Mimic's policies (at least that's the hope for now).

  • Some of the data is a bit choppy, as it is using IK to follow mocap from the humanoid (which is hidden).
  • The speed of motion of each joint, the robot base, and the joint calibration offsets are variable (these need tweaking; some of them are shown).
  • There are issues with joint retargeting as of now (right foot of the robot), but they will be tweaked later.
ghost_robot_slerp_fn-2021-02-19_12.34.54.mp4
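As a side note, here is a minimal sketch of the kind of slerp interpolation (hinted at by the filename above) that can smooth choppy mocap joint targets between keyframes, using SciPy's Slerp. The keyframe values and the 20 ms timestep are made-up illustrations, not the actual mocap data:

```python
# Hypothetical sketch: slerp-interpolate a joint's rotation between two mocap
# keyframes so the ghost robot can be driven smoothly at the simulation rate.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

# Two mocap keyframes for one joint, as quaternions (x, y, z, w), at t=0.0s and t=0.5s.
key_times = [0.0, 0.5]
key_rots = Rotation.from_quat([[0, 0, 0, 1],
                               [0, 0, np.sin(np.pi / 8), np.cos(np.pi / 8)]])

slerp = Slerp(key_times, key_rots)
# Resample at the simulation timestep (e.g. 20 ms) for smooth joint targets.
for t in np.arange(0.0, 0.5, 0.02):
    target = slerp([t])[0]
    euler_target = target.as_euler("xyz")  # feed to position-controlled joints
```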

Warm Regards,
Rohit Kumar J

@rohit-kumar-j commented Apr 21, 2021

We are at preliminary training using IK. The code structure that was initially built upon was... inelegant, but the agents are stand-alone, so we may be able to rebuild the environment files while reusing the agents (or perhaps use the agents in deepworlds :thinking: :thought_balloon:). However, this may take some time.

Initial_training-2021-04-22_00.23.09.mp4

The checkpoint file (agent) here is at 18 million samples. According to the Deep Mimic paper, it takes about 61 million samples for the stock humanoid to achieve a perfect walking gait and 48 million samples for Atlas; they also mention that it takes 2 days to train the humanoid. Getting to the 18 million sample mark seen in this video took me 24 hours of training with 12 (or was it 6? 🤔) cores (actually on my friend's PC), so at that rate the full 61 million samples would take over three days. I think it needs some tuning to optimize the results.

Hopefully, I can begin developing this example in Webots once this is fully trained.

Warm Regards,
Rohit Kumar J

PS: The robot's sudden jump at 00:12 was me dragging it with the mouse :D
