
Allow sample_weight / class_weight to be applied to metrics #7482


Closed
nicolewhite wants to merge 3 commits

Conversation

@nicolewhite
Contributor

nicolewhite commented Jul 31, 2017

I noticed a lot of issues related to sample weights not being applied to metrics.

There is currently no easy way to build a custom metric that achieves this, as a custom metric can only accept y_true and y_pred. It is probably possible with a callback, but that seems like a lot of work for something that is in popular demand and should be supported as a first-class option. I noticed this was attempted in #4335, but that PR was abandoned. I also noticed that the setting was per-metric in that PR, whereas I am proposing the setting at the compile() level so that its application is consistent between fit() and evaluate(). Example usage:

from keras.models import Sequential
from keras.layers import Dense

import numpy as np

X = np.random.normal(size=(100, 10))
y = np.random.randint(2, size=100)
sample_weight = np.random.normal(size=100)
loss = 'binary_crossentropy'

model = Sequential()
model.add(Dense(10, input_shape=(10, )))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss=loss, optimizer='rmsprop', metrics=[loss], weight_metrics=False)

model.fit(X, y, epochs=5, verbose=2, sample_weight=sample_weight)
Epoch 1/5
0s - loss: -2.1792e-02 - binary_crossentropy: 0.9223
Epoch 2/5
0s - loss: -3.2389e-02 - binary_crossentropy: 0.9294
Epoch 3/5
0s - loss: -3.5780e-02 - binary_crossentropy: 0.9285
Epoch 4/5
0s - loss: -3.6447e-02 - binary_crossentropy: 0.9311
Epoch 5/5
0s - loss: -3.8856e-02 - binary_crossentropy: 0.9264

Now with weight_metrics=True:

model.compile(loss=loss, optimizer='rmsprop', metrics=[loss], weight_metrics=True)

model.fit(X, y, epochs=5, verbose=2, sample_weight=sample_weight)
Epoch 1/5
0s - loss: -3.7481e-02 - binary_crossentropy: -3.7481e-02
Epoch 2/5
0s - loss: -4.5657e-02 - binary_crossentropy: -4.5657e-02
Epoch 3/5
0s - loss: -5.1353e-02 - binary_crossentropy: -5.1353e-02
Epoch 4/5
0s - loss: -5.3839e-02 - binary_crossentropy: -5.3839e-02
Epoch 5/5
0s - loss: -5.6900e-02 - binary_crossentropy: -5.6900e-02

I think the consistency here is nice. Additionally, if sample_weight is passed to evaluate() and weight_metrics=True was set in compile(), the metrics will also be weighted.

model.evaluate(X, y, verbose=2, sample_weight=sample_weight)
[-0.084627518355846407, -0.084627518355846407]

Review thread on the diff, at the changed compile() signature:

@@ -604,7 +564,7 @@ class Model(Container):
         """

     def compile(self, optimizer, loss, metrics=None, loss_weights=None,
-                sample_weight_mode=None, **kwargs):
+                sample_weight_mode=None, weight_metrics=False, **kwargs):
Collaborator

Shouldn't it be weigh_metrics?

Contributor Author

Yeah that is a better word choice. Fixed.

@fchollet
Collaborator

fchollet commented Aug 1, 2017

The problem I have with this feature is that sample weights only exist as a way to modulate gradient contributions for different samples (or classes) during training. It's a training modulator, like gradient clipping, for instance. It isn't supported at all in inference mode. The purpose of per-sample gradient weighting during training is to get better unweighted metrics.

I understand that some people want to modulate their metrics as well. That sounds like a different feature altogether though.

@nicolewhite
Contributor Author

> sample weights only exist as a way to modulate gradient contributions for different samples (or classes) during training

I think this is one use case for sample weights (like severe class imbalance), whereas another is to indicate that you care more about some samples than others because of <insert business reason>. In that case, you are sometimes interested in optimizing a weighted metric. For example, the context in which I encountered this was a classification problem where each positive sample has some attached monetary value. It is more important to me that a positive sample with a high monetary value is classified correctly than one with a low monetary value. In this scenario, it seems appropriate to use the sample weight, which is a function of this monetary value, in both gradient contributions and metric contributions. What do you think?

> I understand that some people want to modulate their metrics as well. That sounds like a different feature altogether though.

Can you elaborate on why this is a different feature? I feel like this PR is that feature!
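
To make that use case concrete, here is roughly what it could look like with the weight_metrics option proposed in this PR (the revenue array and the way the weights are derived from it are purely illustrative; X, y, and model are the objects from the example above):

import numpy as np

# Illustrative only: give positive samples a weight proportional to their monetary value.
revenue = np.random.uniform(100, 10000, size=100)
sample_weight = np.where(y == 1, revenue / revenue.mean(), 1.0)

model.compile(loss='binary_crossentropy', optimizer='rmsprop',
              metrics=['binary_crossentropy'], weight_metrics=True)
model.fit(X, y, epochs=5, verbose=2, sample_weight=sample_weight)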

@fchollet
Collaborator

fchollet commented Aug 2, 2017

> Can you elaborate on why this is a different feature? I feel like this PR is that feature!

I meant that if it's a different feature (as opposed to a natural generalization of an existing feature), it would require a new API keyword in order to prevent user confusion (although users are already confused, so I guess it's too late).

I agree that there seems to be enough user demand to warrant this feature. I am okay with merging it. I also agree that reusing the sample_weight argument is the sensible thing to do. Given these assumptions, it follows that the switch between the two behaviors should be a compile argument (like sample_weight_mode).

It remains to ponder whether weigh_metrics is the best possible argument name. This would become a permanent part of the public API, so it's worth thinking about twice. What would be some other options?

@nicolewhite
Contributor Author

I am not sure I can think of a better term that is not overly verbose or confusing. A couple of other options:

  1. Pass in separate weights for the metrics as metric_sample_weight. But then you'll end up with people mostly just passing the same value twice:
model.fit(..., sample_weight=sample_weight, metric_sample_weight=sample_weight)

You'd also have to add metric_class_weight, which is not ideal.

  2. Allow a list of metrics that will be weighted, via weighted_metrics:
model.compile(..., weighted_metrics=['accuracy'])

This is somewhat appealing because you could track both weighted and unweighted metrics easily.

model.compile(..., metrics=['accuracy'], weighted_metrics=['accuracy'])
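
As a fuller sketch of option 2 (weighted_metrics here is the hypothetical compile() argument being discussed, not an existing one; X, y, sample_weight, and model are the objects from the original example):

model.compile(loss='binary_crossentropy', optimizer='rmsprop',
              metrics=['accuracy'],            # reported unweighted
              weighted_metrics=['accuracy'])   # reported with sample_weight applied

model.fit(X, y, epochs=5, verbose=2, sample_weight=sample_weight)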

@nicolewhite
Contributor Author

Do you know why the tests are failing? Seems unrelated. They were passing prior to the weight_metrics -> weigh_metrics change.

@fchollet
Collaborator

fchollet commented Aug 4, 2017

Looks unrelated. Re-running tests.

@fchollet fchollet closed this Aug 4, 2017
@fchollet fchollet reopened this Aug 4, 2017
@fchollet
Collaborator

fchollet commented Aug 4, 2017

The following API

model.compile(..., weighted_metrics=['accuracy'])

could be a good solution. But the initial proposal is likely to be more user-friendly. Still unsure at this point...

@nicolewhite nicolewhite deleted the sample-weights branch August 14, 2017 19:54
@mitkeyastromouse

It's cool that "weighted_metrics" is now supported. However, unlike plain "metrics", it doesn't seem to be saved into (or loaded from) checkpoints, which means I need to re-compile (hopefully that works!).

Apropos of the discussion regarding the use of the feature: I use sample_weights to mark invalid samples in my targets, or more generally, to specify confidence that the given target value is correct.
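
A minimal sketch of that masking/confidence pattern (the sentinel value and names are illustrative, not from this thread):

import numpy as np

y = np.array([0., 1., -1., 1., 0.])           # suppose -1 marks an invalid target
sample_weight = np.where(y == -1, 0.0, 1.0)   # zero weight silences invalid samples

# Values between 0 and 1 could likewise encode confidence in a target.
# model.fit(X, y, sample_weight=sample_weight, ...)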

@trianta2

Out of curiosity, why not pass the weights into the custom metric or loss function?

I'm not too familiar with the keras codebase, but it seems that here (as of writing at least) we should be able to introspect the metric function and see how many required positional arguments it takes.

For backwards compatibility, we could infer proper usage, e.g.:

2 args --> fn(y_true, y_pred)
3 args --> fn(y_true, y_pred, weights)
4 args --> fn(y_true, y_pred, weights, mask)

If a 2 arg function is provided and weighting is specified, we could default to the current logic.

My use case is that I want to normalize my weights and reduce my score array in a different manner.
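
A rough sketch of that introspection-based dispatch (not how Keras actually wires up metrics; wrap_metric and its behavior are illustrative):

import inspect

def wrap_metric(fn, weights=None, mask=None):
    # Count the metric function's positional parameters and call it with as many
    # of (y_true, y_pred, weights, mask) as it declares.
    n_args = len(inspect.signature(fn).parameters)

    def metric(y_true, y_pred):
        if n_args >= 4:
            return fn(y_true, y_pred, weights, mask)
        if n_args == 3:
            return fn(y_true, y_pred, weights)
        return fn(y_true, y_pred)   # 2-arg metric: fall back to the current behavior

    return metric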
