[Issue# 1189] Hugging Face - Question Answering Notebook #1188

Merged
merged 8 commits into from
Nov 2, 2022

Conversation

muhtalhakhan
Contributor

// Adding Question Answering notebook using Hugging Face Transformers.

Question Answering models can retrieve the answer to a question from a given text, which is useful for searching for an answer in a document. Some question answering models can generate answers without context!

Loading the pipeline
Import pipeline from transformers after installing the transformers and tensorflow packages.

Input
The input would be the text from any source.

Output
For the output a new window is opened at the link provided above by Gradio, where you can enter your context and get your answer delivered.
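
As a rough illustration of the pipeline step described above, here is a minimal sketch. It is not the notebook's actual code: the helper names `load_qa_pipeline` and `ask` are hypothetical, and the model is whatever default the transformers library selects for the "question-answering" task.

```python
from typing import Callable, Dict


def load_qa_pipeline() -> Callable[..., Dict]:
    """Build a Hugging Face question-answering pipeline.

    The import is done lazily so that `ask` below can be used with any
    callable of the same shape. Assumes `pip install transformers` plus a
    backend such as tensorflow; the default model is chosen by the library.
    """
    from transformers import pipeline

    return pipeline("question-answering")


def ask(qa: Callable[..., Dict], question: str, context: str) -> str:
    """Return the answer span that `qa` extracts from `context`.

    An extractive pipeline returns a dict with the keys
    "answer", "score", "start" and "end".
    """
    result = qa(question=question, context=context)
    return result["answer"]
```

A call would look like `ask(load_qa_pipeline(), "Who created Bitcoin?", some_text)`, where `some_text` is the context pasted in by the user. The first call to `load_qa_pipeline` downloads model weights, so it needs network access.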

Adding Question Answering notebook using Hugging Face Transformers.
@muhtalhakhan muhtalhakhan added the enhancement New feature or request label Sep 29, 2022
@jravenel
Contributor

@muhtalhakhan can you create an issue and attach it to the PR (see message from @Dr0p42) + add it to the Community roadmap: https://github.com/orgs/jupyter-naas/projects/4?fullscreen=true

This is how it's supposed to be done so we can track your progress.
Thanks

@muhtalhakhan muhtalhakhan changed the title Question Answering Notebook [Issue# 1189] Hugging Face - Question Answering Notebook Sep 29, 2022
@muhtalhakhan
Contributor Author

muhtalhakhan commented Sep 29, 2022

Hey @Dr0p42 and @jravenel, I have created an issue (#1189) and added it to the review column in the community roadmap.

@FlorentLvr FlorentLvr linked an issue Sep 30, 2022 that may be closed by this pull request
@FlorentLvr
Contributor

FlorentLvr commented Sep 30, 2022

Hi @muhtalhakhan, thanks for your contribution. I just reviewed your notebook and changed it to pass the control:

  • I assigned an issue to your PR => you can't create a solution (PR) if there is no problem (Issue). The issue was created, you just had to link it to your PR.
  • IMO framework => the section "## Model" was not in your notebook, so I added it and reworked your code a bit to make it easier to understand. Feel free to make changes if you don't find it relevant. The goal is to have the same framework for all notebooks to make it easy for anybody to use your templates catalog.

After passing the control, I tried to run your notebook but it throws several errors.
Could you have a look and fix it? Let me know if I can help :)

[two screenshots attached]

@muhtalhakhan
Contributor Author

Hey @fravenel, I am glad that you took the time and fixed the errors that were popping up. I apologise for the late response; I was busy with phase 1 of my final year project, and it's done. I am in phase 2 now.

I have also checked the committed notebook and the code, I will follow the same procedure for the upcoming templates as well.

I would love to take your guidance alongside.

Thank you.

Contributor Author

@muhtalhakhan muhtalhakhan left a comment

Alright, you guys can review and submit it to be there in the templates library ^_^

@jravenel @fravenel

please check.

@FlorentLvr
Contributor

@muhtalhakhan, I don't see any change, I still have the same error as I mentioned above.
Did you push your changes to your branch?

@muhtalhakhan
Contributor Author

Hey @fravenel, I have checked it again. Every cell is working properly.

Attaching the screenshots.
Yeah, I pushed over to the branch.

If you still find anything missing then do let me know as I would be needing help then.
[five screenshots attached]

Contributor Author

@muhtalhakhan muhtalhakhan left a comment

Reviewed the changes to the commit you did way before @jravenel

Thank you!

@jravenel
Contributor

jravenel commented Oct 5, 2022

@muhtalhakhan can you please check in a Naas cloud environment? I see you are in VS Code, but as explained in the README it's critical we are all in the same env for testing. This is what Naas cloud is for. Thanks a lot 🙏

@jravenel
Contributor

jravenel commented Oct 5, 2022

Reviewed the changes to the commit you did way before @jravenel

Thank you!

Not sure I get that 😅

@muhtalhakhan
Contributor Author

@muhtalhakhan can you please check in a Naas cloud environment? I see you are in vs code but as explained in README it's critical we are all in a same env for testing. This is what Naas cloud is for. Thanks a lot 🙏

Okay, I got your point @jravenel, now I will test it there at Naas Cloud Env.

Sorry for the hassle.

@muhtalhakhan
Contributor Author

Reviewed the changes to the commit you did way before @jravenel
Thank you!

Not sure I get that 😅

😳

Hey @jravenel ,
I have worked on the errors and removed Gradio for now. I will add it back after some time, once I find out about its use in Jupyter Notebooks.

cc: @fravenel

@muhtalhakhan
Contributor Author

fixed issue #1189

@jravenel
Contributor

jravenel commented Oct 6, 2022

@muhtalhakhan I just tried the template. I managed to install the module and run the notebook, but here is my feedback:

1/ the h2 is redundant with the title, can you remove it?
Screenshot 2022-10-06 at 21 20 25

2/ the description is not enough to understand the use case and misleading, you talk about searching in a document but there is no input document provided.
Either we add an input document like the bitcoin white paper PDF (as it can be retrieved online) or we remove this part, don't you think?

3/ the try/except that installs transformers, followed by another install in the cell below, seems redundant here as well
Screenshot 2022-10-06 at 21 24 26

4/ a few tests I made were giving irrelevant responses, like when I ask for the number of countries and the answer is "countries"
Screenshot 2022-10-06 at 21 19 54

5/ you say in the description that you can run without context but the code says "cannot be empty"
Screenshot 2022-10-06 at 21 19 27

Overall, it needs a bit of rework, I think you can increase value by passing a document, otherwise it's not really obvious what this template can bring. I hope this feedback is useful?

@jravenel
Contributor

jravenel commented Oct 6, 2022

removing Community Roadmap from Projects section, it should not be in PR, cf README

@jravenel jravenel removed their assignment Oct 6, 2022
@muhtalhakhan
Contributor Author

Hey @jravenel, I will work over your review and make more sense in the description.

1/ Yeah, I will change this.

2/ Yeah, we can add a document there, but I will need some help embedding it in the template. The template uses a pipeline that needs a context as input; the context can be text from any data or document. A question, starting with a question word, is then asked against that context.

3/ Yeah, I feel that I overlooked it. Will remove it for sure.

4/ Yeah, the countries one will fail because you have not provided an input naming the countries, or some text describing the context for the question you are asking.

5/I will double check my description as with an empty context we cannot run and ask question.
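
Points 4/ and 5/ above (irrelevant answers and the "cannot be empty" check) could be handled together with a small guard around the pipeline call. The following is only a sketch under assumptions: `safe_answer` and `MIN_SCORE` are hypothetical names, the threshold value is arbitrary, and `qa` stands for any callable that returns a dict with "answer" and "score" keys, as a Hugging Face question-answering pipeline does.

```python
from typing import Callable, Dict, Optional

# Hypothetical cut-off: answers scored below this are treated as unreliable.
# The value is arbitrary and would need tuning for the model actually used.
MIN_SCORE = 0.3


def safe_answer(
    qa: Callable[..., Dict],
    question: str,
    context: str,
    min_score: float = MIN_SCORE,
) -> Optional[str]:
    """Ask `qa` a question, refusing empty input and low-confidence answers."""
    if not question.strip():
        raise ValueError("question cannot be empty")
    if not context.strip():
        # An extractive model can only copy a span out of the context,
        # so there is nothing to answer from when the context is blank.
        raise ValueError("context cannot be empty")

    result = qa(question=question, context=context)
    if result["score"] < min_score:
        return None  # the caller can show "no confident answer found"
    return result["answer"]
```

Returning `None` instead of a weak span is one possible design choice; it would let the notebook tell the user that the context probably does not contain the answer, rather than printing something like "countries".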

Yeah, the feedback is valuable and detailed. If you find some time tomorrow (Friday), I would like to spend some time with you on this PR.

Thank you

Spidey

@muhtalhakhan
Contributor Author

removing Community Roadmap from Projects section, it should not be in PR, cf README

could not understand the "cf README".

@muhtalhakhan muhtalhakhan added the hacktoberfest Issues for hacktoberfest label Oct 6, 2022
@jravenel
Contributor

jravenel commented Oct 7, 2022

cf = refer to README.

1/ good, let's be more specific on this template being able to answer questions from a PDF document
2/ I was thinking just adding in the Input section a variable : DOCUMENT_PATH = "https://bitcoin.org/bitcoin.pdf"
3/ good
4/ from a user perspective, the question is really easy to anticipate, so the documentation of the notebook needs to be explicit with an example, and I think focusing on the bitcoin use case will help,
5/ good

All clear?

@muhtalhakhan
Contributor Author

1/ already working on it.

2/Nice idea, I can add a thing to fetch the text from pdf.

3/thank you!

4/Exactly but it requires some input to be answered, and yeah will be adding that bitcoin use case.

5/on it.

Yeah.. Will get back to you.

@jravenel
Contributor

jravenel commented Oct 7, 2022

For fetching the text from pdf @muhtalhakhan check out what @MinuraPunchihewa is working on in the product roadmap he has done a template for that!

@muhtalhakhan
Contributor Author

Yeah, I saw that but I'll be using a library called PyPDF.
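
For reference, turning the proposed DOCUMENT_PATH into a context string might look roughly like the sketch below. It assumes the current `pypdf` package (`pip install pypdf`); the notebook may use a different PyPDF release with a slightly different API. The helper names `clean_text` and `pdf_to_context` are hypothetical.

```python
import io
import re
import urllib.request

# Input variable suggested earlier in this thread.
DOCUMENT_PATH = "https://bitcoin.org/bitcoin.pdf"


def clean_text(raw: str) -> str:
    """Collapse the line breaks and repeated spaces left by PDF extraction."""
    return re.sub(r"\s+", " ", raw).strip()


def pdf_to_context(path_or_url: str) -> str:
    """Read a local or remote PDF and return its text as one string."""
    from pypdf import PdfReader  # lazy import; assumes pypdf is installed

    if path_or_url.startswith(("http://", "https://")):
        # Download the file into memory so PdfReader can read it as a stream.
        with urllib.request.urlopen(path_or_url) as response:
            source = io.BytesIO(response.read())
    else:
        source = path_or_url

    reader = PdfReader(source)
    # extract_text() may yield nothing for image-only pages, hence the `or ""`.
    pages = [page.extract_text() or "" for page in reader.pages]
    return clean_text(" ".join(pages))
```

The resulting string can then be passed as the `context` argument of the question-answering pipeline, e.g. `context = pdf_to_context(DOCUMENT_PATH)`.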

Changed accordingly with the proposed feedback.

Requiring a review from @jravenel on this.
@muhtalhakhan
Contributor Author

Hey @jravenel made the changes accordingly, please check and let me know if anything further needs my attention.

Thank you!

@muhtalhakhan
Contributor Author

Hey @jravenel , your review is required.

Please check kindly.

Do let me know!

@jravenel
Contributor

jravenel commented Nov 1, 2022

@muhtalhakhan did you test the notebook on your end? I'm not able to get anything from the notebook: no answers.
I asked the simple question "What is bitcoin?", for which I would expect the reply to be the first occurrence, "A Peer-to-Peer Electronic Cash System", but nothing comes out. Check the GIF here to see.
If you have a question that actually works, please share it. I cannot validate the notebook as it is, since there is nothing to show.
Thanks for your feedback
Nov-01-2022 01-45-19

@muhtalhakhan
Contributor Author

muhtalhakhan commented Nov 2, 2022 via email

@muhtalhakhan
Contributor Author

muhtalhakhan commented Nov 2, 2022

Hey @jravenel, I have attached a video and a screenshot showing how the notebook works, and I checked it again.

Looking forward.

Naas.mp4

@muhtalhakhan
Contributor Author

ss 1

@jravenel
Contributor

jravenel commented Nov 2, 2022

Hey @muhtalhakhan it worked for me now.
Maybe was doing it too late 😇😅

I asked
What is it bitcoin → An electronic ...
Who created bitcoin → Satoshi Nakamoto
...

I guess that's good enough: I'm happy that it's working.
Other than that, the IMO framework was not optimal so please check the diff for next time. I basically reworked the whole structure of the sections to make it more readable.

Thanks for this contribution, hope to see you do more :)

Screenshot 2022-11-02 at 23 12 13

@jravenel jravenel merged commit 87e592d into master Nov 2, 2022
@jravenel jravenel deleted the hugging-face branch November 2, 2022 22:15
@muhtalhakhan
Contributor Author

It made my day 😎
Thank you @jravenel 🥂

Labels
enhancement New feature or request hacktoberfest Issues for hacktoberfest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hugging Face - Question Answering Notebook
4 participants