[RLlib] Turn doc tests into '.. doctest::' #37492

Merged
kouroshHakha merged 10 commits into ray-project:master on Jul 18, 2023


ArturNiederfahrenhorst
Contributor

Why are these changes needed?

As part of our migration to unify code snippets and doc test style (https://docs.ray.io/en/master/ray-contribute/writing-code-snippets.html#how-to-handle-hard-to-test-examples), this PR migrates our code examples in RLlib.

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
MyCustomTraceCallbacks,
....
]))
The resulting DefaultCallbacks will call all the sub-callbacks' callbacks
Contributor Author

Let's just keep AlgorithmConfig out of the picture.
The example we would have to construct to make this a copy/paste thing would be quite large.

@ArturNiederfahrenhorst ArturNiederfahrenhorst requested a review from a team as a code owner July 18, 2023 16:24
@kouroshHakha kouroshHakha merged commit 1ff3b1d into ray-project:master Jul 18, 2023
"agent_2": np.array(...),
}
)
.. testcode::
Member

We need a newline after this or the example won't render.
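
To illustrate the point about the blank line: reStructuredText expects a blank line between a directive (plus any option lines such as `:skipif: True`) and its body. The sketch below is a hypothetical lint helper, `directives_missing_blank_line`, that is not part of Ray or Sphinx; it only shows the layout rule on two small strings.

```python
import re
from typing import List

# Directives of interest; the list is an assumption for this sketch.
DIRECTIVE = re.compile(r"^\s*\.\. (testcode|testoutput|doctest|code-block)::")
# Directive option lines such as ":skipif: True".
OPTION = re.compile(r"^\s*:[\w-]+:")


def directives_missing_blank_line(text: str) -> List[int]:
    """Return 1-based line numbers of directives whose body starts too early.

    Hypothetical helper for illustration only. Option lines may follow the
    directive directly; after them a blank line is expected before the body.
    """
    lines = text.splitlines()
    offenders = []
    for i, line in enumerate(lines):
        if not DIRECTIVE.match(line):
            continue
        j = i + 1
        # Skip directive options that directly follow the directive line.
        while j < len(lines) and OPTION.match(lines[j]):
            j += 1
        # The next line must be blank (or the text must end here).
        if j < len(lines) and lines[j].strip():
            offenders.append(i + 1)
    return offenders


BROKEN = """\
.. testcode::
    import numpy as np
"""

FIXED = """\
.. testcode::

    import numpy as np
"""

print(directives_missing_blank_line(BROKEN))
print(directives_missing_blank_line(FIXED))
```

The first string mirrors the diff hunk above, where the code follows the directive immediately; the second adds the blank line the review asks for.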

}
)
.. testcode::
import numpy as np
Member

Rather than having one large testcode block, I'd recommend splitting this into multiple testcode blocks for better readability:

        Represent a list of agent data from one env step() call.
 
        .. testcode::

            import numpy as np
            ac = AgentConnectorDataType(
                env_id="env_1",
                agent_id=None,
                data={
                    "agent_1": np.array([1, 2, 3]),
                    "agent_2": np.array([4, 5, 6]),
                }
            )

        Or a single agent data ready to be preprocessed.

        .. testcode::

            ac = AgentConnectorDataType(
                env_id="env_1",
                agent_id="agent_1",
                data=np.array([1, 2, 3]),
            )

        etc.

Comment on lines +286 to +287
.. testcode::
from ray.rllib.connectors.action.lambdas import (
Member

Need newline here

Comment on lines +271 to +273
# We use PPO and torch as an example here because many of the showcased
# components need implementations to come together. However, the same
# pattern is generally applicable.
Member

This might be more readable if you move it out of the testcode as plaintext

# remove a module
learner.remove_module("new_player")
# Take one gradient update on the module and report the results
# results = learner.update(...)
Member

(Here and below) What would happen if we uncommented this and replaced the ellipses with an actual argument?

Contributor Author

We would need to construct a multi-agent training batch here.
This would be a nested dict with quite a lot of KV pairs with all values being torch tensors.
It's just too big of a thing to construct here. We need to be able to generate example batches at some point in the future, but we are not there today.


model = MyModel()
model.forward({"obs": torch.randn(32, 64)}) # No error
model.forward({"obs": torch.randn(32, 32)}) # raises ValueError
Member

This will cause CI to fail. You could put this in a subsequent testcode that's explicitly skipped

    model.forward({"obs": torch.randn(32, 64)}) # No error

... and this example raises a ValueError

.. testcode::
    :skipif: True

    model.forward({"obs": torch.randn(32, 32)}) 

.. testoutput::
    
    Traceback:
        ...
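
As an alternative to skipping the failing call, plain doctest syntax can state the exception as the expected output. The sketch below uses only the standard-library `doctest` module; `check_obs_dim` is a hypothetical stand-in for the model's input check, not the RLlib API, and the error message is invented for the example.

```python
import doctest


def check_obs_dim(obs_dim: int, expected: int = 64) -> None:
    """Hypothetical stand-in for a model's input-spec check.

    >>> check_obs_dim(64)  # matches the expected dimension, no error
    >>> check_obs_dim(32)
    Traceback (most recent call last):
        ...
    ValueError: expected obs dim 64, got 32
    """
    if obs_dim != expected:
        raise ValueError(f"expected obs dim {expected}, got {obs_dim}")


# Parse the docstring examples and run them. The ValueError is the
# expected output of the second example, so it does not count as a failure.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    check_obs_dim.__doc__,
    {"check_obs_dim": check_obs_dim},
    "check_obs_dim",
    None,
    0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

Whether this fits better than a skipped testcode depends on how long the example is; for the two-line case quoted above, either form would keep CI green under these assumptions.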


.. code-block:: python
# Example for creating a sampling loop:
Member

This example is long. For better readability, it might be better to split this into separate testcodes (one for sampling, training, inference, etc)

@@ -13,10 +13,24 @@ class Distribution(abc.ABC):
"""The base class for distribution over a random variable.
Member

This file is explicitly skipped in the BUILD file. Could we remove it from the list?

>>> action_logits = model.forward(obs)
>>> action_dist = Distribution(action_logits)

.. doctest::
Member

This directive is unnecessary

Comment on lines +147 to +182
.. doctest::

    >>> import numpy as np
    >>> from ray.rllib.models.distributions import Distribution

    >>> class Uniform(Distribution):
    ...     def __init__(self, lower, upper):
    ...         self.lower = lower
    ...         self.upper = upper
    ...
    ...     def sample(self):
    ...         return self.lower + (self.upper - self.lower) * np.random.rand()
    ...
    ...     def logp(self, x):
    ...         ...
    ...
    ...     def kl(self, other):
    ...         ...
    ...
    ...     def entropy(self):
    ...         ...
    ...
    ...     @staticmethod
    ...     def required_input_dim(space):
    ...         ...
    ...
    ...     def rsample(self):
    ...         ...
    ...
    ...     @classmethod
    ...     def from_logits(cls, logits, **kwargs):
    ...         return Uniform(logits[:, 0], logits[:, 1])

    >>> logits = np.array([[0.0, 1.0], [2.0, 3.0]])
    >>> my_dist = Uniform.from_logits(logits)
    >>> sample = my_dist.sample()
Member

This example is longer, so it'd be more readable as testcode

Bhav00 pushed a commit to Bhav00/ray that referenced this pull request Jul 28, 2023
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: NripeshN <nn2012@hw.ac.uk>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Victor <vctr.y.m@example.com>