ΕΕΛΛΑΚ - Mailing Lists

Re: Extend deepbots to support RL environments

Hello Manos,
Thanks for being patient with me.
I have shared a draft on the GSoC portal. Looking forward to your comments.
Thanks.

Best,
Sanket


On Fri, Apr 2, 2021 at 12:40 PM Manos Kirtas <manolis [ dot ] kirt [ at ] gmail [ dot ] com> wrote:

> Hello Sanket,
>
> It's fine to stick with the snake example if that is the one you are most
> interested in!
>
> Ray <https://docs.ray.io/en/master/index.html> is a framework that offers
> scalability and hyperparameter tuning options. To this end, we can add
> wrappers to deepbots in order to support Ray. More specifically, RLlib
> <https://docs.ray.io/en/master/rllib.html> provides an API for training
> models efficiently and in a distributed manner. The major concern about
> integrating Ray into deepbots is parallelization. For example, OpenAI Gym
> uses Vectorized Environments
> <https://stable-baselines.readthedocs.io/en/master/guide/vec_envs.html>.
> *Vectorized Environments are a method for stacking multiple independent
> environments into a single environment. Instead of training an RL agent on
> 1 environment per step, it allows us to train it on **n** environments per
> step.* When it comes to Webots, parallelization can be achieved in two
> different ways. The first is to create multiple instances of Webots in
> order to run different environments. The second is to create a grid in the
> Webots world with different runs of the same example (for example, a 3x3
> grid in which we try to solve 9 cartpole instances). We have made some
> progress on the former approach, in which we use external controllers as
> described in the Webots documentation
> <https://cyberbotics.com/doc/guide/running-extern-robot-controllers>.
> Both ways are equally good for integrating RLlib. Of course, this can be
> an optional task, to be done after completing the examples that you
> mention in your proposal. RLlib can help us achieve faster convergence and
> better models in the existing examples.
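
The vectorized-environment idea described above can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: `ToyEnv` and `SimpleVecEnv` are hypothetical names, not deepbots, Gym, Stable Baselines, or RLlib classes, and the toy dynamics stand in for a real Webots task. Each sub-environment could correspond either to one Webots instance or to one cell of a 3x3 grid in a single world.

```python
import random
from typing import List, Tuple


class ToyEnv:
    """Hypothetical stand-in for one simulated task (not a deepbots class)."""

    def __init__(self, horizon: int = 5, seed: int = 0) -> None:
        self.horizon = horizon          # episode length in steps
        self.rng = random.Random(seed)  # per-environment random source
        self.t = 0

    def reset(self) -> float:
        self.t = 0
        return 0.0  # initial observation

    def step(self, action: int) -> Tuple[float, float, bool]:
        self.t += 1
        obs = self.rng.random()
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.horizon
        return obs, reward, done


class SimpleVecEnv:
    """Stacks n independent environments behind a single reset/step interface."""

    def __init__(self, envs: List[ToyEnv]) -> None:
        self.envs = envs

    def reset(self) -> List[float]:
        return [env.reset() for env in self.envs]

    def step(self, actions: List[int]) -> Tuple[List[float], List[float], List[bool]]:
        if len(actions) != len(self.envs):
            raise ValueError("expected one action per environment")
        observations, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            obs, reward, done = env.step(action)
            if done:
                # Restart a finished episode so the batch always stays full.
                obs = env.reset()
            observations.append(obs)
            rewards.append(reward)
            dones.append(done)
        return observations, rewards, dones


# Nine copies of the same task, e.g. one per cell of a 3x3 grid.
vec_env = SimpleVecEnv([ToyEnv(horizon=3, seed=i) for i in range(9)])
first_obs = vec_env.reset()
obs, rewards, dones = vec_env.step([1] * 9)  # one call advances all 9 environments
```

A single `step` call here yields nine transitions, which is what lets an agent collect experience from n environments per step instead of one.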
>
> Thank you,
> Manos.
>
> P.S. As the GSoC platform gives the option to post a draft proposal, you
> can use it and I will provide feedback if need be.
>
> On 1/4/21 1:07 AM, Sanket Thakur wrote:
>
> Hello Manos,
> Thank you for your feedback. I have tried to address your review comments
> and have added a few concerns in the proposal itself for you to comment on.
>
> Best,
> Sanket
>
>
> On Mon, Mar 29, 2021 at 8:00 AM Manos Kirtas <manolis [ dot ] kirt [ at ] gmail [ dot ] com>
> wrote:
>
>> Thank you Sanket,
>>
>> I have added some comments directly on the PDF. Feel free to contact me
>> to discuss them. Sorry for the delay!
>>
>> Best,
>>
>> Manos.
>> On 23/3/21 7:52 PM, Sanket Thakur wrote:
>>
>> Hello Manos,
>> Thanks for your reply.
>> I have modified my proposal accordingly. We can still iterate on it.
>> Let me know your thoughts.
>>
>> Best,
>> Sanket
>>
>> On Tue, Mar 23, 2021 at 12:15 PM Manos Kirtas <manolis [ dot ] kirt [ at ] gmail [ dot ] com>
>> wrote:
>>
>>> Hello,
>>>
>>> Glad to hear that you are interested in contributing to the deepbots
>>> project. I have some comments on your proposal:
>>>
>>>    - It would be helpful if you specified what you will work on first.
>>>    For example, there is an extensive list of examples
>>>    <https://github.com/aidudezzz/deepbots/issues/85> that you can work
>>>    on. Being more specific will help us guide you in composing a
>>>    strong proposal.
>>>    - What types of problems are you interested in? What kind of robots
>>>    can be used to replicate those gym examples?
>>>    - Are you interested in contributing RL algorithms? Is there an
>>>    existing implementation that can be used, or should we develop it from
>>>    scratch? I strongly recommend taking a look at existing implementations
>>>    (such as stable-baselines
>>>    <https://stable-baselines.readthedocs.io/en/master/>).
>>>    - If you are interested in implementing RL algorithms from scratch,
>>>    it would be great to cite the respective papers.
>>>    - Which framework are you going to use to implement those
>>>    algorithms (e.g. PyTorch, TensorFlow)?
>>>    - Elaborate as much as possible on the custom testbeds that you are
>>>    interested in developing. What are your ideas? What type of task do we
>>>    want to solve? What robot can be used? Can the problem be solved with
>>>    both discrete and continuous action spaces?
>>>    - I find the idea of having an infrastructure for hyperparameter
>>>    optimization very interesting! Can we use an existing framework for
>>>    that, for example Ray <https://ray.io/>?
>>>
>>> From my perspective, it would be better to stick to 3 specific
>>> examples/tasks in order to further examine what can be used. Additionally,
>>> I feel that we should first take a look at existing implementations of RL
>>> algorithms and integrate them into those specific examples. Stable-baselines
>>> is already supported by deepbots and can be integrated into any
>>> example with little effort. Regarding hyperparameter optimization,
>>> I would recommend Ray, since I have some experience with it and can guide
>>> you. Of course, any other ideas are more than welcome for discussion!
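
To make the hyperparameter-optimization idea concrete, here is a minimal sketch of an exhaustive grid search over two hyperparameters. It deliberately uses only the Python standard library and does not call Ray; a framework such as Ray would typically run trials like these in parallel. The function `toy_score` and the search space are hypothetical placeholders for "train an agent on a deepbots example and return its mean reward".

```python
import itertools
from typing import Dict, List, Tuple


def toy_score(learning_rate: float, gamma: float) -> float:
    """Hypothetical stand-in for training an agent and returning its mean reward."""
    return -((learning_rate - 0.01) ** 2) - (gamma - 0.99) ** 2


def grid_search(space: Dict[str, List[float]]) -> Tuple[Dict[str, float], float]:
    """Evaluate every combination in the search space and keep the best one."""
    names = list(space)
    best_config: Dict[str, float] = {}
    best_score = float("-inf")
    for values in itertools.product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        score = toy_score(**config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Assumed search space, chosen only for illustration.
search_space = {"learning_rate": [0.001, 0.01, 0.1], "gamma": [0.9, 0.99]}
best_config, best_score = grid_search(search_space)
print(best_config)
```

Because each trial is independent of the others, this loop is the part that a distributed tuning framework can spread across workers.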
>>>
>>> Finally, I think it would be very useful to include a timeline in which
>>> you schedule your different ideas, so that the plan is feasible within
>>> the given time frame.
>>>
>>> These comments can extend our discussion in order to develop a
>>> strong proposal. I would be glad to hear your thoughts on them!
>>>
>>> Best regards,
>>>
>>> Manos.
>>>
>>>
>>>  On 22/3/21 12:35 PM, Sanket Thakur wrote:
>>>
>>> Hello,
>>> I am writing to express my interest in working on '*Extend deepbots
>>> to support stable-baselines and implement gym-style default RL
>>> environments*' as part of GSoC 2021.
>>> I am attaching my proposal for the project and my relevant contributions.
>>> It would be great to hear your feedback on it.
>>>
>>> Thanks.
>>>
>>> Best,
>>> Sanket
>>>
>>> ----
>>> You are receiving this message from the list: A discussion list for student developers and mentors of Google Summer of Code projects, https://lists.ellak.gr/gsoc-developers/listinfo.html
>>> You can unsubscribe from the list by sending an empty email to <gsoc-developers+unsubscribe [ at ] ellak [ dot ] gr>.
>>>
>>>
