Add stats reporter class and re-enable missing stats #3076
Conversation
# Note: this is needed until we switch to AgentExperiences as the data input type.
# We still need some info from the policy (memories, previous actions)
# that really should be gathered by the env-manager.
self.policy = policy
self.episode_steps: Dict[str, int] = {}
self.episode_steps: Counter = Counter()
self.episode_rewards: Dict[str, float] = defaultdict(lambda: 0.0)
nit: defaultdict(float) is more common I think.
Changed
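For reference, the two defaults behave identically; defaultdict(float) just avoids the throwaway lambda:

```python
from collections import defaultdict

# float() returns 0.0, so both dicts yield 0.0 for missing keys.
rewards_lambda = defaultdict(lambda: 0.0)
rewards_float = defaultdict(float)

assert rewards_lambda["agent_0"] == rewards_float["agent_0"] == 0.0
```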
ml-agents/mlagents/trainers/stats.py (outdated diff)
for writer in self.writers:
    writer.write_text(category, text, step)

def get_mean_stat(self, category: str, key: str) -> float:
What do you think about combining get_mean_stat, get_std_stat, and get_num_stats into something like get_summary_stats() that returns a NamedTuple with count, mean, and stddev? I think that would clean up the usage in write_summary() a bit.
Done
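For context, a minimal sketch of what the combined accessor could look like; the StatsSummary fields and the get_stats_summaries name are assumptions here, not necessarily what the PR landed on:

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple

import numpy as np


class StatsSummary(NamedTuple):
    mean: float
    std: float
    num: int


class StatsReporter:
    def __init__(self) -> None:
        # category -> key -> list of raw values
        self.stats_dict: Dict[str, Dict[str, List[float]]] = defaultdict(
            lambda: defaultdict(list)
        )

    def add_stat(self, category: str, key: str, value: float) -> None:
        self.stats_dict[category][key].append(value)

    def get_stats_summaries(self, category: str, key: str) -> StatsSummary:
        # Replaces get_mean_stat / get_std_stat / get_num_stats with one call,
        # which simplifies write_summary().
        values = self.stats_dict[category][key]
        if not values:
            return StatsSummary(mean=0.0, std=0.0, num=0)
        return StatsSummary(
            mean=float(np.mean(values)),
            std=float(np.std(values)),
            num=len(values),
        )
```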
stats.stats_reporter.add_stat(
    self.summary_path,
    self.policy.reward_signals[name].value_name,
    np.mean(v),
Is v always non-empty? Do you need to guard against NaNs anywhere?
Guess it's same behavior as before...
v shouldn't be NaN unless there are NaNs in the network (dun dun dun)
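If a guard were ever wanted, it could be as small as the helper below; this is purely illustrative (a hypothetical safe_mean, not part of the PR), since the thread settles on keeping the previous behavior:

```python
import numpy as np


def safe_mean(values) -> float:
    # Hypothetical helper: mean of values, or 0.0 when empty or all-NaN.
    arr = np.asarray(values, dtype=np.float64)
    if arr.size == 0 or np.all(np.isnan(arr)):
        return 0.0
    return float(np.nanmean(arr))


assert safe_mean([]) == 0.0
assert safe_mean([1.0, 2.0, 3.0]) == 2.0
```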
self.stats[self.policy.reward_signals[name].stat_name].append(
    rewards.get(agent_id, 0)
stats.stats_reporter.add_stat(
    self.summary_path,
Feels a little weird that the "category" here is a filepath. The "category" seems like something that should just be the filename / behavior name, whereas the base path should be something used to configure the Tensorboard writer.
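One way to read the suggestion, as a sketch: fix the base directory when the Tensorboard writer is constructed, so the category passed around is just the behavior name. This assumes the TF1 tf.summary.FileWriter API and a hypothetical base_dir parameter, not what the PR actually does:

```python
import os

import tensorflow as tf


class TensorboardWriter:
    def __init__(self, base_dir: str):
        # The filesystem root (e.g. "./summaries") is configured once here...
        self.base_dir = base_dir
        self.summary_writers = {}

    def write_stat(self, category: str, key: str, value: float, step: int) -> None:
        # ...so callers can pass a plain behavior name (e.g. "3DBallLearning")
        # as the category instead of a full summary path.
        if category not in self.summary_writers:
            path = os.path.join(self.base_dir, category)
            self.summary_writers[category] = tf.summary.FileWriter(path)
        summary = tf.Summary()
        summary.value.add(tag=key, simple_value=value)
        self.summary_writers[category].add_summary(summary, step)
        self.summary_writers[category].flush()
```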
Couple more comments and a request for tests.
ml-agents/mlagents/trainers/stats.py (outdated diff)
""" | ||
for key in self.stats_dict[category]: | ||
if len(self.stats_dict[category][key]) > 0: | ||
stat_mean = float(np.mean(self.stats_dict[category][key])) |
I don't love that we won't have any method for logging min/max values via this interface. Not sure I have a great solution for this at the moment, though.
Hmm, that's a good idea, I think it might be worth adding.
I'm inclined to get the StatsReporter interface installed into the code and then work on what goes behind it in future PRs - currently it's blocking a larger trainer refactor that's in turn blocking another trainer refactor :P
I've restructured the StatsWriter to share static writers but have different instances per trainer, and added tests.
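Roughly, the resulting shape (a sketch of the pattern described above, not the exact code): the writer list and the backing store are class-level and shared, while each trainer holds its own reporter bound to one category:

```python
from collections import defaultdict
from typing import Dict, List


class StatsWriter:
    # Abstract sink; concrete subclasses write to Tensorboard, CSV, etc.
    def write_stats(self, category: str, key: str, value: float, step: int) -> None:
        raise NotImplementedError


class StatsReporter:
    # Static / class-level: shared by every StatsReporter instance.
    writers: List[StatsWriter] = []
    stats_dict: Dict[str, Dict[str, List[float]]] = defaultdict(
        lambda: defaultdict(list)
    )

    def __init__(self, category: str):
        # Per-instance: each trainer constructs a reporter for its own category.
        self.category = category

    @staticmethod
    def add_writer(writer: StatsWriter) -> None:
        StatsReporter.writers.append(writer)

    def add_stat(self, key: str, value: float) -> None:
        StatsReporter.stats_dict[self.category][key].append(value)
```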
trainer: Trainer,
policy: TFPolicy,
max_trajectory_length: int,
stats_category: str,
If this is coming from the trainer, why not pass a StatsReporter?
Done
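i.e. something along these lines (a sketch; the constructor is abbreviated, Trainer and TFPolicy come from the snippet above, and _record_episode is a hypothetical caller added just to show the usage):

```python
class AgentProcessor:
    def __init__(
        self,
        trainer: "Trainer",
        policy: "TFPolicy",
        max_trajectory_length: int,
        stats_reporter: "StatsReporter",  # instead of stats_category: str
    ):
        self.trainer = trainer
        self.policy = policy
        self.max_trajectory_length = max_trajectory_length
        self.stats_reporter = stats_reporter

    def _record_episode(self, episode_reward: float, episode_steps: int) -> None:
        # The processor no longer needs to know which category (behavior name)
        # it reports under; the trainer's reporter already carries that.
        self.stats_reporter.add_stat("Environment/Cumulative Reward", episode_reward)
        self.stats_reporter.add_stat("Environment/Episode Length", episode_steps)
```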
:param key: The name of the text.
:param input_dict: A dictionary that will be displayed in a table on Tensorboard.
"""
# try:
Remove commented-out code if it won't be used.
Good catch. This shouldn't be here at all.
Two minor comments, otherwise LGTM
This PR adds back certain statistics (entropy, learning rate) that went missing in the AgentProcessor PR because they were only known to the AgentProcessor and not the Trainer.
We do this by creating a global class, StatsReporter, that takes in a category, a key, and a float value. The StatsReporter can then write out the mean of these values on command; currently that write is still triggered by the Trainer.
The StatsReporter also keeps a list of Writer classes. Currently we only have a Tensorboard writer, but we can imagine adding more in the future (e.g. a REST API writer or a CSV writer).
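As a sketch of how an additional writer could plug in, building on the StatsWriter/StatsReporter shape sketched earlier in this thread (the CSVWriter here is hypothetical and not part of this PR):

```python
import csv
import os


class CSVWriter(StatsWriter):
    # Hypothetical extra sink: appends one row per recorded summary value.
    def __init__(self, base_dir: str):
        self.base_dir = base_dir

    def write_stats(self, category: str, key: str, value: float, step: int) -> None:
        os.makedirs(self.base_dir, exist_ok=True)
        path = os.path.join(self.base_dir, f"{category}.csv")
        with open(path, "a", newline="") as f:
            csv.writer(f).writerow([step, key, value])


# Registered once at startup; every StatsReporter instance then fans out to it
# alongside the Tensorboard writer.
StatsReporter.add_writer(CSVWriter(base_dir="./csv_stats"))
```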
Note: Why is this a PR to the AgentProcessor PR and not to Master? The AgentProcessor marks the first time we have multiple sources for stats, and thus requires that certain stats (related to Policy inference, e.g. reward, entropy, episode steps) come from a different place than others (related to Policy training, e.g. loss).