no attribute 'add_bos_token' error and MonolingualDataset AssertionError when evaluating a roberta base checkpoint model on language modelling? #2496

Closed
@jerrygaoLondon

Description

🐛 Bug

The bug happens on the latest release, 0.9.0.

Traceback (most recent call last):
  File "<my_local_path>/bin/fairseq-eval-lm", line 33, in <module>
    sys.exit(load_entry_point('fairseq==0.9.0', 'console_scripts', 'fairseq-eval-lm')())
  File "<my_local_path>/lib/python3.7/site-packages/fairseq_cli/eval_lm.py", line 223, in cli_main
    main(args)
  File "<my_local_path>/python3.7/site-packages/fairseq_cli/eval_lm.py", line 157, in main
    if args.add_bos_token:
AttributeError: 'Namespace' object has no attribute 'add_bos_token'

There is no way to pass 'add_bos_token' from the command line, and it is not mentioned in the documentation.
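As a hypothetical workaround sketch (not an official fix), the missing flag could be defaulted to False before eval_lm.py reads it; `args` here is a stand-in for the namespace that fairseq-eval-lm actually parses:

```python
from argparse import Namespace

# Stand-in for the args object parsed by fairseq-eval-lm.
args = Namespace()

# eval_lm.py crashes on `args.add_bos_token`; patching in a default
# before that attribute access avoids the AttributeError.
if not hasattr(args, "add_bos_token"):
    args.add_bos_token = False

print(args.add_bos_token)
```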

Adding '--context-window' on the command line removes the 'no attribute add_bos_token' error, but it causes an AssertionError in lm_context_window_dataset.py instead:

Traceback (most recent call last):
  File "<my_local_path>/bin/fairseq-eval-lm", line 33, in <module>
    sys.exit(load_entry_point('fairseq==0.9.0', 'console_scripts', 'fairseq-eval-lm')())
  File "<my_local_path>/lib/python3.7/site-packages/fairseq_cli/eval_lm.py", line 223, in cli_main
    main(args)
  File "<my_local_path>/lib/python3.7/site-packages/fairseq_cli/eval_lm.py", line 84, in main
    pad_idx=task.source_dictionary.pad(),
  File "<my_local_path>/lib/python3.7/site-packages/fairseq/data/lm_context_window_dataset.py", line 18, in __init__
    assert isinstance(dataset, MonolingualDataset)
AssertionError
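For context, the check that fails is a plain isinstance assertion: LMContextWindowDataset refuses anything that is not a MonolingualDataset, while the masked_lm task supplies a different dataset class. A minimal sketch of the failing check, where MaskedTokensDataset is a hypothetical stand-in for whatever the masked_lm task actually builds:

```python
# MonolingualDataset mirrors the fairseq class name; MaskedTokensDataset
# is a hypothetical stand-in for the dataset the masked_lm task returns.
class MonolingualDataset:
    pass

class MaskedTokensDataset:
    pass

def wrap_with_context_window(dataset):
    # Mirrors the assert in lm_context_window_dataset.py's __init__.
    assert isinstance(dataset, MonolingualDataset)
    return dataset

try:
    wrap_with_context_window(MaskedTokensDataset())
    raised = False
except AssertionError:
    raised = True

print(raised)
```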

When running the "validate.py" script (as follows) from the master branch, it throws a different "Unable to infer Criterion arguments" error:

python fairseq/fairseq_cli/validate.py data-bin/<mydata> --path <my_roberta_checkpoint_path>/model.pt --task masked_lm --max-tokens 128

The error message:

Traceback (most recent call last):
  File ".../fairseq/fairseq_cli/validate.py", line 132, in <module>
    cli_main()
  File ".../fairseq/fairseq_cli/validate.py", line 128, in cli_main
    distributed_utils.call_main(args, main, override_args=override_args)
  File ".../fairseq/fairseq/distributed_utils.py", line 189, in call_main
    main(args, **kwargs)
  File ".../fairseq/fairseq_cli/validate.py", line 65, in main
    criterion = task.build_criterion(model_args)
  File ".../fairseq/fairseq/tasks/fairseq_task.py", line 267, in build_criterion
    return criterions.build_criterion(args, self)
  File ".../fairseq/fairseq/registry.py", line 44, in build_x
    return builder(args, *extra_args, **extra_kwargs)
  File ".../fairseq/fairseq/criterions/fairseq_criterion.py", line 56, in build_criterion
    '{}.build_criterion'.format(cls.__name__)
NotImplementedError: Unable to infer Criterion arguments, please implement MaskedLmLoss.build_criterion
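The error message suggests implementing a build_criterion classmethod on the criterion. A hedged sketch of what that could look like, assuming the real MaskedLmLoss constructor accepts the task (the actual signature may differ across fairseq versions):

```python
# Simplified stand-ins for fairseq's classes, to illustrate the pattern only.
class FairseqCriterion:
    def __init__(self, task):
        self.task = task

class MaskedLmLoss(FairseqCriterion):
    @classmethod
    def build_criterion(cls, args, task):
        # Construct the criterion directly instead of relying on the base
        # class's argument inference, which raises NotImplementedError
        # when it cannot introspect the constructor.
        return cls(task)

criterion = MaskedLmLoss.build_criterion(args=None, task="masked_lm-task")
print(type(criterion).__name__)
```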

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

The command I ran is as follows:

fairseq-eval-lm data-bin/<mydata> --path /<my_model_checkpoint>/model.pt  --max-sentences 2 --tokens-per-sample 128 --task masked_lm --criterion masked_lm

The parameters are the same as in my training setup. This is a RoBERTa model trained on the 'masked_lm' task with 'fairseq-train'; the data was preprocessed with 'fairseq-preprocess' and encoded with 'fastBPE' (i.e., --bpe=fastbpe). Thus, the test data is pre-encoded with fastBPE and preprocessed the same way.

There is no issue when retraining a model from this checkpoint with 'fairseq-preprocess'.

Code sample

N/A

Expected behavior

It should return the perplexity score on the test data in data-bin/.

Environment

  • fairseq Version: 0.9.0
  • PyTorch Version: 1.5.1
  • OS: Ubuntu 16.04.4 LTS
  • How you installed fairseq (pip, source): pip
  • Python version: 3.7
  • CUDA/cuDNN version: CUDA 9.0.176
  • GPU models and configuration: Tesla P100-PCIE (NVIDIA-SMI 440.64.00)

Additional context

This is related to issue #1324.
