
[Feature] Add ROUGE to mmeval #72

Merged · 4 commits into open-mmlab:main on Jan 30, 2023

Conversation

go-with-me000 (Contributor):

Motivation

This PR implements the ROUGE metric. I have completed the ROUGE implementation and provided test files for it. The BLEU metric is also modified.

Modification

While implementing the ROUGE metric, we also fixed the earlier issue that BLEU could not evaluate Chinese data.

Checklist

  1. Add the ROUGE metric.
  2. Provide test files for ROUGE.
  3. Modify the BLEU metric.
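
For a quick picture of the new metric in use, here is a minimal sketch, assuming the metric follows mmeval's usual `add`/`compute` pattern and is exported at the package level (illustrative, not necessarily the PR's exact API surface):

```python
from mmeval import ROUGE  # assumes a top-level export

rouge = ROUGE()
rouge.add(
    predictions=['the cat sat on the mat'],
    references=[['a cat sat on the mat']],
)
# Returns a dict of ROUGE scores; the exact keys depend on configuration.
print(rouge.compute())
```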

@go-with-me000 marked this pull request as draft on January 3, 2023.
@go-with-me000 reopened this on January 3, 2023.
@go-with-me000 marked this pull request as ready for review on January 4, 2023.
@go-with-me000 changed the title from "Cky/rouge metric" to "Add ROUGE to mmeval" on January 9, 2023.
@go-with-me000 changed the title from "Add ROUGE to mmeval" to "[Feature] Add ROUGE to mmeval" on January 9, 2023.
Review thread on the input handling:

        if isinstance(predictions, str):
            predictions = [predictions]

        if isinstance(references, str):

@zhouzaida (Collaborator), Jan 12, 2023: If `references` is `Sequence[str]`, should it be wrapped with `[]`?

@go-with-me000 (Contributor, Author): Lines 105-113 of the code handle the `Sequence[str]` case.
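For readers following along, a minimal sketch of this kind of input normalization under one plausible convention (the helper name is hypothetical; the PR's actual handling lives in the lines cited above):

```python
from typing import Sequence, Union


def _normalize_inputs(
    predictions: Union[str, Sequence[str]],
    references: Union[str, Sequence[str], Sequence[Sequence[str]]],
):
    """Wrap bare strings so downstream code always sees nested sequences."""
    if isinstance(predictions, str):
        predictions = [predictions]
    if isinstance(references, str):
        references = [[references]]
    elif references and isinstance(references[0], str):
        # A flat Sequence[str] is read as one reference per prediction
        # and wrapped into Sequence[Sequence[str]].
        references = [[ref] for ref in references]
    return predictions, references
```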

Review thread on the `add` signature:

            f'`tokenizer` supports Callable, str or None, but not `{type(tokenizer)}`'  # noqa: E501
        self.accumulate = accumulate

    def add(self, predictions: Union[str, Sequence[str]], references: Union[str, Sequence[str], Sequence[Sequence[str]]]) -> None:  # type: ignore # yapf: disable # noqa: E501

@zhouzaida (Collaborator): Is it necessary to support string input?

@go-with-me000 (Contributor, Author): In the simplest case there is only one prediction and one reference, so accepting two plain strings saves users from adding extra layers of brackets and is more convenient.
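Concretely, the two call styles under discussion would look like this (illustrative, reusing the `rouge` instance from the sketch above):

```python
# Simplest case: plain strings, no extra brackets.
rouge.add('the cat sat on the mat', 'a cat sat on the mat')

# Equivalent nested form: a list of references per prediction.
rouge.add(['the cat sat on the mat'], [['a cat sat on the mat']])
```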

Review thread:

        recall = matches / reference_len
        if precision == recall == 0.0:
            return dict(
                precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))

@zhouzaida (Collaborator) suggested:

-                precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))
+                precision=0., recall=0., fmeasure=0.)
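
For context, the surrounding precision/recall/F-measure computation typically looks like this (a sketch under the same zero-guard conventions, not the PR's exact code):

```python
def _fmeasure(matches: int, pred_len: int, reference_len: int) -> dict:
    """Compute precision, recall and F1 from overlap counts."""
    if pred_len == 0 or reference_len == 0:
        return dict(precision=0., recall=0., fmeasure=0.)
    precision = matches / pred_len
    recall = matches / reference_len
    if precision == recall == 0.0:
        return dict(precision=0., recall=0., fmeasure=0.)
    fmeasure = 2 * precision * recall / (precision + recall)
    return dict(precision=precision, recall=recall, fmeasure=fmeasure)
```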

Review thread on the ROUGE-L scoring function:

            Dict[str, float]: Calculate the score of rougeL.
        """
        pred_len, reference_len = len(pred), len(reference)
        if 0 in (pred_len, reference_len):

@zhouzaida (Collaborator) suggested:

-        if 0 in (pred_len, reference_len):
+        if pred_len == 0 or reference_len == 0:
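
ROUGE-L scores a prediction by the longest common subsequence (LCS) it shares with the reference. A self-contained sketch of that computation (illustrative, with the zero guard written as suggested above):

```python
from typing import Sequence


def _lcs_length(pred: Sequence[str], reference: Sequence[str]) -> int:
    """Length of the longest common subsequence via dynamic programming."""
    m, n = len(pred), len(reference)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if pred[i - 1] == reference[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


def rouge_l_score(pred: Sequence[str], reference: Sequence[str]) -> dict:
    """ROUGE-L precision, recall and F1 over token sequences."""
    pred_len, reference_len = len(pred), len(reference)
    if pred_len == 0 or reference_len == 0:
        return dict(precision=0., recall=0., fmeasure=0.)
    lcs = _lcs_length(pred, reference)
    precision = lcs / pred_len
    recall = lcs / reference_len
    if precision == recall == 0.0:
        return dict(precision=0., recall=0., fmeasure=0.)
    fmeasure = 2 * precision * recall / (precision + recall)
    return dict(precision=precision, recall=recall, fmeasure=fmeasure)
```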

Review thread:

        pred_len, reference_len = len(pred), len(reference)
        if 0 in (pred_len, reference_len):
            return dict(
                precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))

@zhouzaida (Collaborator) suggested:

-                precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))
+                precision=0., recall=0., fmeasure=0.)

Review thread on the ROUGE-N scoring function:

        reference_ngarms = _create_ngrams(reference, n_gram)
        pred_len = sum(pred_ngarms.values())
        reference_len = sum(reference_ngarms.values())
        if 0 in (pred_len, reference_len):

@zhouzaida (Collaborator) suggested:

-        if 0 in (pred_len, reference_len):
+        if pred_len == 0 or reference_len == 0:
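
The snippet sums n-gram counts returned by `_create_ngrams`. A plausible Counter-based sketch of such a helper, matching the docstring further below that describes `n_gram` as the maximum phrase length (the PR's actual implementation may differ):

```python
from collections import Counter
from typing import Sequence


def _create_ngrams(token: Sequence[str], n_gram: int) -> Counter:
    """Count every n-gram of length 1..n_gram in a token sequence."""
    ngrams = Counter()
    for n in range(1, n_gram + 1):
        for i in range(len(token) - n + 1):
            ngrams[tuple(token[i:i + n])] += 1
    return ngrams
```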

Review thread:

        reference_len = sum(reference_ngarms.values())
        if 0 in (pred_len, reference_len):
            return dict(
                precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))

@zhouzaida (Collaborator) suggested:

-                precision=float(0.0), recall=float(0.0), fmeasure=float(0.0))
+                precision=0., recall=0., fmeasure=0.)

Comment on lines +43 to +44:

        tokenizer_fn (Union[Callable, str, None]): A user's own tokenizer function.
            Defaults to None.

@zhouzaida (Collaborator), Jan 14, 2023, suggested:

-        tokenizer_fn (Union[Callable, str, None]): A user's own tokenizer function.
-            Defaults to None.
+        tokenizer_fn (Callable or str, optional): A user's own tokenizer function.
+            Defaults to None.
+            New in version 0.3.0.
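
To make the parameter concrete, a user-supplied tokenizer might be passed like this (illustrative; a character tokenizer is one way to handle Chinese text, which the PR description says BLEU previously could not evaluate):

```python
from mmeval import ROUGE  # assumes a top-level export


def char_tokenizer(text: str):
    """Split a sentence into characters, e.g. for unspaced Chinese text."""
    return list(text)


rouge = ROUGE(tokenizer_fn=char_tokenizer)
rouge.add(['猫坐在垫子上'], [['猫坐在垫子上']])
print(rouge.compute())
```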

Review thread on the n-gram helper's docstring:

        Args:
            token (Sequence[str]): A series of tokens about sentences.
            n_gram (int): The maximum number of words contained in a phrase
                when calculating word fragments. Defaults to 4.

@zhouzaida (Collaborator) suggested:

-                when calculating word fragments. Defaults to 4.
+                when calculating word fragments.

Review thread on the `add` docstring:

        Args:
            predictions (Sequence[str]): An iterable of predicted sentences.
            references (Sequence[Sequence[str]]): An iterable of
                referenced sentences.

@zhouzaida (Collaborator) suggested (a whitespace-only difference, invisible here):

-                referenced sentences.
+                referenced sentences.

@zhouzaida (Collaborator) commented on Jan 14, 2023:

`test_get_n_gram` in tests/test_metrics/test_bleu.py should be moved to tests/test_metrics/test_utils/test_ngram_process.py.

@zhouzaida requested a review from yhcc on January 14, 2023.
@ice-tong (Collaborator) left a review comment:

LGTM!

@ice-tong (Collaborator):

Suggest renaming mmeval/metrics/utils/ngram_process.py -> mmeval/metrics/utils/grammar.py.

@zhouzaida (Collaborator):
This PR can be merged after resolving the above comments.

@zhouzaida merged commit 5bd89db into open-mmlab:main on Jan 30, 2023.