Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finalize the khr-3hv environment. #49

Closed
wants to merge 22 commits into from

Conversation

NikosKokkinis
Copy link
Contributor

Finalize the khr-3hv environment for the GSoC 2021.

  • Create the environment that works with Ray and Stable-Baseline PPO.
  • Logging on Wandb website.
  • Step-by-step documentation.

NikosKokkinis and others added 22 commits May 28, 2021 15:21
Add example with cartpole_discrete using stable-baseline PPO
Add a first working version of the khr-3hv robot
Finilize the Stable baseline with the cartpole env
Finilize the Stable baseline with the cartpole env
Add a first working version of the khr-3hv robot

Finilize the Stable baseline with the cartpole env

Add a first working version of the khr-3hv robot

Finilize the Stable baseline with the cartpole env
Add example with cartpole_discrete using stable-baseline PPO
Better version of the khr-3hv enviroment. Reducing observations and actions
Finalize the khr-3hv environment for the GSoC 2021.

- Create the environment that works with Ray and Stable-Baseline PPO.
- Logging on Wandb website.
- Step-to-step documentation.
Finalize the Nao environment for thr GSoC 2021
Fix path error on the khr-3hv robot readme and add reward plots
@KelvinYang0320 KelvinYang0320 self-requested a review August 26, 2021 08:30
@KelvinYang0320
Copy link
Member

Hello, @NickKok. I am reviewing the khr-3hv part of your work.
Did you get AttributeError: module 'aioredis' has no attribute 'create_redis_pool' when USE_Ray = True?
Check out this link.
I downgraded aioredis to v1.3.1 to run and keep reviewing your work for now.

Copy link
Member

@KelvinYang0320 KelvinYang0320 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this khr-3hv example and the nice tutorial you provided!
I think users will enjoy this example.
I've left a few comments for khr-3hv.
I didn't review cartpole & nao since the description of this PR is about khr-3h.

@@ -0,0 +1,154 @@
import numpy as np
from scipy.ndimage.interpolation import shift
Copy link
Member

@KelvinYang0320 KelvinYang0320 Aug 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this unused shift.

import ray
from ray import tune
from ray.tune.registry import register_env
from ray.tune import grid_search
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this unused import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using Ray. It is needed the tune, register_env. Regarding the ray and grid_serach, I could remove those.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NickKok Hi, sorry for the confusion.
I mean that we can remove grid_search & ray and keep tune. Thanks!

import ray.rllib.agents.ppo as ppo
from ray.tune.logger import DEFAULT_LOGGERS
from ray.tune.integration.wandb import WandbLoggerCallback
from ray.tune.integration.wandb import wandb_mixin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this unused import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need the ppo and the WandbLoggerCallback to log the results on wandb.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion.
I think we can remove wandb_mixin only. Thanks!

from ray.tune.registry import register_env
from ray.tune import grid_search
import ray.rllib.agents.ppo as ppo
from ray.tune.logger import DEFAULT_LOGGERS
Copy link
Member

@KelvinYang0320 KelvinYang0320 Aug 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this unused DEFAULT_LOGGERS.

Comment on lines +9 to +13
Viewpoint {
fieldOfView 0.660595
orientation -0.9627454208252096 0.19381572750322326 0.1885648918873455 0.25970754728137774
position 0.22818627812370207 1.6236991612459881 10.33705776276167
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Viewpoint {
fieldOfView 0.660595
orientation -0.9627454208252096 0.19381572750322326 0.1885648918873455 0.25970754728137774
position 0.22818627812370207 1.6236991612459881 10.33705776276167
}
Viewpoint {
fieldOfView 0.660595
position 0.05 1.4 9
}

The viewpoint is a little bit too inclined.
Please update the gif in the readme file as well, thanks!

Comment on lines +24 to +43
# Scheduler for the learning rate when using the stable baselines
def linear_schedule(initial_value: float) -> Callable[[float], float]:
"""
Linear learning rate schedule.
:param initial_value:
:return: current learning rate depending on remaining progress
"""
if isinstance(initial_value, str):
initial_value = float(initial_value)

# TODO decrease the lr by .10
def func(progress_remaining: float) -> float:
"""
Progress will decrease from 1 (beginning) to 0
:param progress_remaining: (float)
:return: (float)
"""
return progress_remaining * initial_value

return func
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove these lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is something that we could use in the future. It's for the learning rate scheduler when using Stable-Baseline PPO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I think we can keep these lines then.

model.learn(total_timesteps=100000000, callback=checkpoint_callback)
else:
# Load the weights of a trained Agent
model = PPO.load("./results/Deepbots-khr3hv-SB-reward-weight,prev_pos,obs-x,y,z,vel-done-0.5-RNN-5,clip_range-0.2,lr-0.0001_6778500_steps")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can pass a file path in supervisorsManager.py.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it could be possible

Copy link
Member

@KelvinYang0320 KelvinYang0320 Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address this on a following PR or this PR. Thank you!

)
ray.shutdown()
else:
# Crate an agent having the predifined model hyperparameters
Copy link
Member

@KelvinYang0320 KelvinYang0320 Aug 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add ray.init()?
When TRAIN = False and USE_Ray = True, I get System error: Ray has not been started yet. You can start Ray with 'ray.init()'.
Adding ray.init() do fix this error.

# Crate an agent having the predifined model hyperparameters
agent = ppo.PPOTrainer(config=model_config)
# Restore the train weights
agent.restore("./results/Deepbots-khr3hv-Ray-reward-weight,prev_pos,obs-x,y,z,vel-done-0.5-RNN-5,clip_range-0.2,lr-0.0001_6778500_steps")
Copy link
Member

@KelvinYang0320 KelvinYang0320 Aug 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can pass a file path in supervisorsManager.py.
Then it will be visible to for end-user.
Please address this on a following PR or this PR. Thank you!


from RobotUtils import RobotFunc
from deepbots.supervisor.controllers.robot_supervisor import RobotSupervisor
from gym.spaces import Box, Discrete
Copy link
Member

@KelvinYang0320 KelvinYang0320 Aug 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this unused Discrete.

Suggested change
from gym.spaces import Box, Discrete
from gym.spaces import Box

@KelvinYang0320
Copy link
Member

@NickKok Thank you for contributing many examples to deepworlds!
Also, the code quality is great!

This PR includes

I only do the code review on khr-3hv as the description of this PR.
Could you separate these examples into different branches and open PR for each?
And write the title & description for each PR according to changed files.
In this way, I think we can discuss or record these examples more clearly. Thanks! 😄

@NikosKokkinis
Copy link
Contributor Author

Hello @KelvinYang0320.

No I didn't get any error on : AttributeError: module 'aioredis' has no attribute 'create_redis_pool' when USE_Ray = True?

When you downgraded aioredis to v1.3.1, does it work with Ray?

@KelvinYang0320
Copy link
Member

When you downgraded aioredis to v1.3.1, does it work with Ray?

@NickKok Yes, everything goes fine when downgrading aioredis to v1.3.1, as suggested on this website.
I follow your readme to pip install all packages.
Could you start from pip install these packages again to see if you can reproduce this problem?
Thank you

@tsampazk
Copy link
Member

Hey @NickKok! We would like to take over this PR and finalize it if you don't mind, unless you have some free time to do so yourself. You could resolve whatever comments exist right now, merge it and we can take over perfecting it later.

This needs to be closed in combination with other fixes we've been doing for #58, so as to bring all existing examples up to speed with the latest Webots release.

@NikosKokkinis
Copy link
Contributor Author

Hello @tsampazk. It won't be possible to work on it till the end of August. After that It might be possibl during weekends. If you would like to continue the work feel free. I could help you from September.

@tsampazk
Copy link
Member

Alright @NickKok, thanks for the quick response, we will probably continue working on this and finalize it soon then. There's always more work to be done so if you feel like it you can contribute in the future 😃

@KelvinYang0320 KelvinYang0320 marked this pull request as draft July 4, 2022 02:17
@KelvinYang0320 KelvinYang0320 mentioned this pull request Jul 4, 2022
10 tasks
@tsampazk
Copy link
Member

Hey @KelvinYang0320, could we close this PR as i think all the goals were met by #76 ?

@KelvinYang0320 KelvinYang0320 marked this pull request as ready for review November 16, 2022 10:42
@KelvinYang0320 KelvinYang0320 marked this pull request as draft November 16, 2022 14:07
@KelvinYang0320
Copy link
Member

@tsampazk There are three examples, cartpole, khr-3hv, and nao, in this PR.

I think we can close this PR and open an issue for it. What do you think?

@tsampazk
Copy link
Member

Oh yes i didn't notice the other two, sorry. The cartpole stable-baselines stuff are already present in the dev branch 😕, so we need a new PR that adds the nao example? Bottom line i think it's better to close it and open an issue as you suggested!

@tsampazk tsampazk closed this Nov 16, 2022
@KelvinYang0320
Copy link
Member

@tsampazk I have a nao branch on my fork. Do you want me to open a draft PR or push it to a new branch on aidudezzz/deepworlds directly?
As for cartpole, I think he also modified our existing cartpole stable-baselines. Unfortunately, I cannot remember the detail. 😕

@tsampazk
Copy link
Member

tsampazk commented Nov 16, 2022

@tsampazk I have a nao branch on my fork. Do you want me to open a draft PR or push it to a new branch on aidudezzz/deepworlds directly?

I don't mind, whatever you feel is easier for you!

As for cartpole, I think he also modified our existing cartpole stable-baselines. Unfortunately, I cannot remember the detail. confused

It's ok, by taking a quick look at the sb cartpole we need to refactor and update that example anyway!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants