Consistent speech model input names for the Seq2SeqTrainer generate function #13825

# 🚀 Feature request

Could we have a consistent naming convention for the main input of speech models? So far we have:
- [`input_features`](https://huggingface.co/transformers/model_doc/speech_to_text.html#speech2textforconditionalgeneration)
- [`input_values`](https://huggingface.co/transformers/model_doc/wav2vec2.html#wav2vec2forctc)
- [`input_ids`](https://huggingface.co/transformers/model_doc/speech_to_text_2.html#speech2text2forcausallm)

From what I can tell, all three play essentially the same role as far as the `Seq2SeqTrainer` is concerned.

## Motivation

This would remove the need for custom `Seq2SeqTrainer` classes and would make training more modular.

## Your contribution

Renaming the parameters would do the trick but could break a lot of existing code. Alternatively, the `generate` call [here](https://github.com/huggingface/transformers/blob/41436d3dfb98e0d17f018db29790b65663358edf/examples/legacy/seq2seq/seq2seq_trainer.py#L219) could be made to accept different input keys, using a (clunky) mapping such as `INPUT_MAPPING_LABELS = {"input_features": "input_ids", "input_values": "input_ids", "input_ids": "input_ids"}`.
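To make the second option concrete, here is a rough sketch of what the key lookup could look like. It is plain Python and does not touch `transformers` itself; the helper name `find_main_input` and the tuple `SPEECH_INPUT_KEYS` are hypothetical names I made up for illustration, and I am assuming the batch is an ordinary dict keyed by input name.

```python
from typing import Any, Dict, Tuple

# The three input names listed above. Hypothetical constant, not part of transformers.
SPEECH_INPUT_KEYS = ("input_features", "input_values", "input_ids")


def find_main_input(inputs: Dict[str, Any]) -> Tuple[str, Any]:
    """Return (key, value) for the single recognised main-input key in a batch.

    Hypothetical helper: raises instead of guessing when the batch holds
    none, or more than one, of the known input names.
    """
    present = [key for key in SPEECH_INPUT_KEYS if key in inputs]
    if not present:
        raise KeyError(f"Batch has none of the expected keys: {SPEECH_INPUT_KEYS}")
    if len(present) > 1:
        raise ValueError(f"Batch has several candidate main inputs: {present}")
    key = present[0]
    return key, inputs[key]


# Example with a dummy batch (plain lists stand in for tensors).
batch = {"input_values": [0.1, 0.2, 0.3], "attention_mask": [1, 1, 1]}
key, value = find_main_input(batch)
```

The trainer could then forward `value` to `generate` under whichever name the model expects, instead of hard-coding one key. Unlike the `INPUT_MAPPING_LABELS` dict above, this keeps the original key name around, which may or may not be what we want.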

