Add community notebook for T5 sentiment span extraction #4700

enzoampil · 2020-06-01T01:51:25Z

This example notebook aims to increase the coverage of T5 fine-tuning examples, addressing #4426.

It presents a high-level overview of T5, its significance for the future of NLP in practice, and a thoroughly commented tutorial on how to fine-tune T5 for sentiment span extraction with an extractive Q&A format.

I recently presented this in a webinar published on YouTube.

codecov-commenter · 2020-06-01T01:57:04Z

Codecov Report

Merging #4700 into master will decrease coverage by 1.41%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4700      +/-   ##
==========================================
- Coverage   77.14%   75.72%   -1.42%     
==========================================
  Files         128      128              
  Lines       21070    21070              
==========================================
- Hits        16255    15956     -299     
- Misses       4815     5114     +299

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/transformers/modeling_longformer.py | `18.70% <0.00%> (-74.83%)` | ⬇️ |
| src/transformers/configuration_longformer.py | `76.92% <0.00%> (-23.08%)` | ⬇️ |
| src/transformers/modeling_t5.py | `77.18% <0.00%> (-6.35%)` | ⬇️ |
| src/transformers/file_utils.py | `73.44% <0.00%> (-0.42%)` | ⬇️ |
| src/transformers/modeling_bert.py | `86.77% <0.00%> (-0.19%)` | ⬇️ |
| src/transformers/modeling_utils.py | `90.17% <0.00%> (+0.23%)` | ⬆️ |
| src/transformers/modeling_openai.py | `81.78% <0.00%> (+1.37%)` | ⬆️ |
| src/transformers/modeling_gpt2.py | `86.21% <0.00%> (+14.10%)` | ⬆️ |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0866669...34c9a46.

LysandreJik · 2020-06-01T15:00:27Z

Nice webinar, and cool notebook :)

@patrickvonplaten, do you want to take a look?

enzoampil · 2020-06-01T23:19:31Z

Thanks @LysandreJik ! 😄

patrickvonplaten · 2020-06-02T07:59:38Z

Awesome notebook @enzoampil!

LGTM for merge!

Which dataset do you use exactly to fine-tune T5 here?

enzoampil · 2020-06-02T09:44:25Z

Thanks @patrickvonplaten ! 😄

For the dataset, I got it from an ongoing Kaggle competition called Tweet Sentiment Extraction.

The objective is to extract the span of a tweet that indicates its sentiment.

Example input:

sentiment: negative
tweet:  How did we just get paid and still be broke as hell?! No shopping spree for me today

Example output:

broke as hell?!
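
As a rough sketch of how an example like the one above could be turned into a text-to-text pair for an extractive Q&A setup: the `question:` / `context:` prefixes, the whitespace normalization, and the helper name `to_text_pair` are all assumptions made for illustration, not necessarily what the notebook does.

```python
from typing import Tuple


def to_text_pair(sentiment: str, tweet: str, selected_text: str) -> Tuple[str, str]:
    """Build a (source, target) string pair for text-to-text fine-tuning.

    Hypothetical helper: the "question: ... context: ..." layout is an
    assumed prompt format, chosen to mirror extractive Q&A.
    """
    # Collapse repeated whitespace so the span check below is robust
    # to stray leading or double spaces in the raw tweet.
    tweet = " ".join(tweet.split())
    selected_text = " ".join(selected_text.split())

    # For span extraction, the target should appear verbatim in the tweet.
    if selected_text not in tweet:
        raise ValueError("selected span must be a substring of the tweet")

    source = f"question: {sentiment} context: {tweet}"
    return source, selected_text


source, target = to_text_pair(
    "negative",
    " How did we just get paid and still be broke as hell?! No shopping spree for me today",
    "broke as hell?!",
)
print(source)
print(target)
```

The sentiment plays the role of the question and the tweet the role of the context, so the model is trained to generate the span as plain text.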

enzoampil · 2020-06-02T09:45:44Z

I was thinking about contributing this to the nlp library, but I'm not sure if Kaggle has policies regarding uploading their datasets to other public sources ...

patrickvonplaten · 2020-06-02T11:54:57Z

I see! Yeah, no worries - I don't think we currently handle dataset processing from ongoing Kaggle competition links.

Add community notebook for sentiment span extraction

34c9a46

patrickvonplaten merged commit d3ef14f into huggingface:master Jun 2, 2020

enzoampil deleted the add_t5_example branch June 2, 2020 09:39
