Add stats reporter class and re-enable missing stats #3076


Merged · 33 commits · Dec 13, 2019

Conversation

@ervteng (Contributor) commented Dec 12, 2019:

This PR adds back certain statistics (entropy, learning rate) that went missing in the AgentProcessor PR because they were only known to the AgentProcessor and not the Trainer.

We do this by creating a global class called StatsReporter, which takes a category, a key, and a float value. The StatsReporter can then write out the mean of these values on command; currently this is still triggered by the Trainer.

The StatsReporter also keeps a list of Writer classes. Currently we only have a Tensorboard writer, but we can imagine adding more in the future (e.g. a REST API writer, a CSV writer).

Note: why is this a PR against the AgentProcessor PR and not against master? The AgentProcessor marks the first time we have multiple sources for stats, and thus requires that certain stats (related to Policy inference, e.g. reward, entropy, episode steps) come from a different place than others (related to Policy training, e.g. loss).

# Note: this is needed until we switch to AgentExperiences as the data input type.
# We still need some info from the policy (memories, previous actions)
# that really should be gathered by the env-manager.
self.policy = policy
self.episode_steps: Counter = Counter()
self.episode_rewards: Dict[str, float] = defaultdict(lambda: 0.0)
Contributor:

nit: defaultdict(float) is more common I think.

@ervteng (Author):

Changed
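For reference, the two spellings are equivalent, since `float()` returns `0.0` and so can serve directly as the default factory:

```python
from collections import defaultdict

rewards_a = defaultdict(lambda: 0.0)
rewards_b = defaultdict(float)  # idiomatic equivalent: float() == 0.0

# Both produce 0.0 for a missing key.
assert rewards_a["new_agent"] == rewards_b["new_agent"] == 0.0
```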

for writer in self.writers:
writer.write_text(category, text, step)

def get_mean_stat(self, category: str, key: str) -> float:
Contributor:

What do you think about combining get_mean_stat, get_std_stat, and get_num_stats into something like get_summary_stats() that returns a NamedTuple with count, mean, and stddev. I think that would clean up the usage in write_summary() a bit.

@ervteng (Author):

Done
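A sketch of what that combined summary might look like (hypothetical names; the merged code may differ):

```python
import statistics
from typing import List, NamedTuple


class StatsSummary(NamedTuple):
    mean: float
    std: float
    num: int


def get_stats_summary(values: List[float]) -> StatsSummary:
    """One call instead of get_mean_stat / get_std_stat / get_num_stats."""
    if not values:
        return StatsSummary(mean=0.0, std=0.0, num=0)
    return StatsSummary(
        mean=statistics.fmean(values),
        std=statistics.pstdev(values),
        num=len(values),
    )
```

write_summary() can then unpack one tuple rather than making three lookups per key.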

stats.stats_reporter.add_stat(
self.summary_path,
self.policy.reward_signals[name].value_name,
np.mean(v),
Contributor:

Is v always non-empty? Do you need to guard against NaNs anywhere?

Contributor:

Guess it's same behavior as before...

@ervteng (Author):

v shouldn't be NaN unless there are NaNs in the network (dun dun dun)
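If a guard were ever wanted anyway, a minimal defensive helper (hypothetical, not part of this PR) could filter non-finite values before averaging:

```python
import math
from typing import Iterable, Optional


def safe_mean(values: Iterable[float]) -> Optional[float]:
    """Mean of the finite values; None if nothing usable remains."""
    clean = [v for v in values if math.isfinite(v)]
    if not clean:
        return None
    return sum(clean) / len(clean)
```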

self.stats[self.policy.reward_signals[name].stat_name].append(
rewards.get(agent_id, 0)
stats.stats_reporter.add_stat(
self.summary_path,
Contributor:

Feels a little weird that the "category" here is a filepath. The "category" seems like something that should just be the filename / behavior name, whereas the base path should be something used to configure the Tensorboard writer.
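Hypothetically, the base path could live in the writer's own configuration so that callers only ever pass a behavior name (illustrative sketch, not the PR code):

```python
import os


class TensorboardWriterSketch:
    """Illustrative only: the writer owns the base directory; the category
    passed by callers is just the behavior name, not a full filepath."""

    def __init__(self, base_dir: str) -> None:
        self.base_dir = base_dir

    def summary_path(self, category: str) -> str:
        # Callers stay path-agnostic; only the writer knows about the filesystem.
        return os.path.join(self.base_dir, category)
```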

@harperj (Contributor) left a comment:

A couple more comments and a request for tests.

"""
for key in self.stats_dict[category]:
if len(self.stats_dict[category][key]) > 0:
stat_mean = float(np.mean(self.stats_dict[category][key]))
Contributor:

I don't love that we won't have any method for logging min/max values via this interface. Not sure I have a great solution for this at the moment, though.

@ervteng (Author):

Hmm, that's a good idea, I think it might be worth adding.

I'm inclined to get the StatsReporter interface installed into the code and then work on what goes behind it in future PRs - currently it's blocking a larger trainer refactor that's in turn blocking another trainer refactor :P
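One hedged sketch of how min/max could later fold into the same summary interface in a future PR (names are illustrative):

```python
from typing import List, NamedTuple


class FullStatsSummary(NamedTuple):
    mean: float
    min: float
    max: float
    num: int


def summarize(values: List[float]) -> FullStatsSummary:
    """Extend the mean/std summary with min and max in the same single call."""
    if not values:
        return FullStatsSummary(0.0, 0.0, 0.0, 0)
    return FullStatsSummary(
        mean=sum(values) / len(values),
        min=min(values),
        max=max(values),
        num=len(values),
    )
```

Since writers already receive whatever the summary carries, adding fields here would not change the StatsReporter's public surface.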

@ervteng requested a review from harperj · December 13, 2019 00:22
@ervteng (Author) left a comment:

I've restructured the StatsWriter to share static writers but have different instances per trainer, and added tests.
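The restructuring described here — writers shared statically, one reporter instance per trainer bound to its category — might look roughly like this (illustrative, not the exact merged code):

```python
from typing import List


class SharedStatsReporter:
    """Illustrative: writers live at class level (shared by every instance),
    while each trainer holds its own instance bound to one category."""

    writers: List[object] = []  # static: every reporter sees the same sinks

    @classmethod
    def add_writer(cls, writer) -> None:
        cls.writers.append(writer)

    def __init__(self, category: str) -> None:
        self.category = category  # e.g. the behavior name

    def add_stat(self, key: str, value: float) -> None:
        for writer in self.writers:
            writer.write_stat(self.category, key, value)
```

Registering a Tensorboard writer once at startup then covers every trainer, and each trainer only ever names its keys.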

trainer: Trainer,
policy: TFPolicy,
max_trajectory_length: int,
stats_category: str,
Contributor:

If this is coming from the trainer, why not pass a StatsReporter?

@ervteng (Author):

Done
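After that change, the constructor plausibly takes the reporter directly (a sketch; the exact signature is an assumption):

```python
class AgentProcessorSketch:
    """Illustrative: the trainer hands over its StatsReporter instead of a
    category string, so the processor never deals with paths or categories."""

    def __init__(self, policy, max_trajectory_length, stats_reporter):
        self.policy = policy
        self.max_trajectory_length = max_trajectory_length
        self.stats_reporter = stats_reporter  # owned and configured by the trainer

    def record_episode_reward(self, reward: float) -> None:
        # The reporter already knows its category; only the key is needed here.
        self.stats_reporter.add_stat("Environment/Cumulative Reward", reward)
```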

:param key: The name of the text.
:param input_dict: A dictionary that will be displayed in a table on Tensorboard.
"""
# try:
Contributor:

Remove commented out code if it won't be used.

@ervteng (Author):

Good catch. This shouldn't be here at all.

@harperj (Contributor) left a comment:

Two minor comments, otherwise LGTM

@ervteng merged commit 0f08718 into develop-agentprocessor · Dec 13, 2019
@delete-merged-branch bot deleted the develop-agentprocessor-stats branch · December 13, 2019 23:29
@github-actions bot locked as resolved and limited conversation to collaborators · May 17, 2021