
Add prompts for LinCE dataset Sentiment Analysis Task #746

Closed · wants to merge 19 commits
Conversation

@RosenZhang commented:

Added 5 prompts for the Sentiment Analysis (SA) task from LinCE. The templates are close to those of the imdb dataset. The LinCE dataset is on Hugging Face Datasets, but it's unavailable from the promptsource interface, so the filter_english_datasets method in utils.py is modified to add LinCE to the list before filtering. If there's a more correct way to do this, or if there are other problems with the prompts, please comment below. Thanks!
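For context, here is a minimal sketch of the kind of change described above, assuming filter_english_datasets receives dataset metadata (e.g. from huggingface_hub's list_datasets()) and keeps only English-tagged entries. The INCLUDED_DATASETS name and the tag format are assumptions; the actual code in promptsource/utils.py may differ:

```python
# Hypothetical sketch, not the exact promptsource/utils.py code.
# LinCE is code-switched (e.g. Spanish-English), so it is not tagged as
# English-only and would otherwise be dropped by the language filter.
INCLUDED_DATASETS = {"lince"}

def filter_english_datasets(datasets):
    """Return ids of English-tagged datasets, plus allow-listed ones.

    `datasets` is assumed to be an iterable of metadata objects with
    `.id` and `.tags` attributes.
    """
    english_ids = []
    for dataset in datasets:
        if dataset.id in INCLUDED_DATASETS:
            english_ids.append(dataset.id)
            continue
        tags = getattr(dataset, "tags", None) or []
        if "language:en" in tags:
            english_ids.append(dataset.id)
    return sorted(english_ids)
```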

@RosenZhang RosenZhang changed the base branch from main to eval-hackathon April 26, 2022 20:51
@awebson awebson self-assigned this Apr 26, 2022
@RosenZhang RosenZhang changed the title Add prompts to LinCE dataset Sentiment Analysis Task Add prompts for LinCE dataset Sentiment Analysis Task Apr 27, 2022
RosenZhang and others added 2 commits April 27, 2022 12:07
* Add GEM/xsum prompts

* uncommit this hack

* Add GEM in INCLUDED_USERS

Co-authored-by: Albert Webson <awebson@cs.brown.edu>
@awebson awebson self-requested a review April 27, 2022 21:46
@awebson (Contributor) left a comment:


Thanks Ruochen!

  1. Does this same prompt apply to all other subsets of LinCE? Or are we only asked to evaluate the sa_spaeng subset?

  2. Some prompts are missing the Answer Choices field: positive ||| negative ||| neutral

  3. This prompt's wording could be more natural:

     The following post expresses what sentiment?

     For example:

     What sentiment does the following post express? Positive, negative, or neutral?

     (In that case, you should also mark the "Choices in template" flag. That is, models are explicitly told the choices "Positive, negative, or neutral?" in the input; see the template sketch after this list.)

  4. We're looking for at least 5 original-task prompts. You're missing one.
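For concreteness, a sketch of what the revised template could look like in promptsource's Jinja format. The field names words (the tokenized post) and sa (the string label) are taken from the sa_spaeng subset; the exact phrasing is only a suggestion:

```
{{ words | join(" ") }}

What sentiment does the post above express? Positive, negative, or neutral?
|||
{{ sa }}
```

The Answer Choices field would then be set to positive ||| negative ||| neutral, with the "Choices in template" flag marked, since the choices also appear verbatim in the input.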

@RosenZhang (Author) replied:

Hi Albert, thanks so much for reviewing this!
Re your comments:

  1. The templates in this PR apply only to the sa_spaeng subset; the other subsets are for different tasks, namely language identification, POS tagging, and NER. I'm about to finish the templates for the NER task and will probably add them to this PR later today. Should we develop prompts for all the tasks in this PR, or go ahead to the eval harness first to test the whole pipeline? Prompts for the language identification and POS tasks will need a bit more time, as there are fewer examples to draw on. (Any pointers to similar tasks are welcome, thanks!)
  2. Apologies for the inconsistency in Answer Choices. I was wondering how the answer choices are passed to the model. In this dataset, the target is given directly as a string, e.g. 'sa': 'positive', so the template references it directly as sa, unlike tasks such as NER where the target is a class index and we write it as choices[label] (see the sketch after this list). Would Answer Choices still be required in this case?
  3. Noted, I'll improve the wording of the prompts!
  4. Also noted, I'll try to rephrase and add an extra one! (If I understand correctly, negation wouldn't count as an original task, right?)
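To illustrate the distinction in point 2, here is a sketch of the two target styles. The first uses the sa_spaeng fields directly; the second uses illustrative text and label fields (as in a dataset like imdb), with answer_choices being the Jinja variable promptsource exposes for the Answer Choices field:

```
{# Target stored directly as a string, as in sa_spaeng: #}
What sentiment does the following post express?
{{ words | join(" ") }}
|||
{{ sa }}
```

```
{# Target stored as a class index, mapped through Answer Choices: #}
What sentiment does the following post express?
{{ text }}
|||
{{ answer_choices[label] }}
```

If I understand the evaluation setup correctly, the Answer Choices field is also what rank-based evaluation uses to enumerate candidate targets, so it may be worth filling in even when the target is already a string.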

@RosenZhang (Author) commented:

Closing this PR due to issues when rebasing onto the newly updated eval-hackathon branch.

@RosenZhang RosenZhang closed this Apr 28, 2022