Added docs links to supported tasks #257
Conversation
Amazing! Thanks so much 🤗. Could you also update 5_supported-tasks.snippet? Afterwards I'll generate a preview for it.
@xenova Done 👍 Also, on top of the above (Thought: Maybe there's a way to simply use the text and image parts as separate
Hmm, I think I'd like to keep the code snippets to only use the pipeline function (and avoid pre- and post-processing needed by the user). But as you identified, there is technically no sentence-similarity pipeline (even though the functionality does exist). Perhaps we can just add the sentence-similarity (and even
I put those examples here and here, but I do agree that since it's quite a popular use-case, it might be worth creating a tutorial/guide for it. Same for other embeddings.
Can I add a link to the available models too? E.g. something like this: Where the https://huggingface.co/models?pipeline_tag=fill-mask&library=transformers.js&sort=trending (If yes, and you prefer a different place/format/link-text/etc. let me know) Relevant:
That's a great idea! Yes please! 🤗
I'm not too picky/bothered :) I don't think it's too confusing or anything.
The documentation is not available anymore as the PR was closed or merged.
Done!
This seems like a bad idea imo if it's at the cost of the user/dev experience. I know I'd definitely have benefited from a code snippet like this. Is this just a mild preference, or something you're quite sure about? I definitely prefer that docs examples are as useful as possible to newbies. The other end of the spectrum is a very "technical" list of snippets/facts (parameter types, return values, etc.) - things that don't really help the users who are in need of the most help - the newbies who are just trying to get something working as a starting point. As a user I definitely would have benefited from having an example like the one I gave. I've created gists of minimal examples like that that I can refer back to, and I think every user would have to end up repeating that work. Cosine vs dot? pooling? normalization?
Even if the
Worth noting also that sometimes the pre-existing pipelines don't quite fit the use case - e.g. I may have some existing vectors, and some text (instead of just text pairs), or I may want to save the vectors as well as the similarity scores, rather than just getting a similarity score. Or I may want to compare features across modalities like with CLIP. IIUC, these are the sorts of things people will use the
Apologies for the wall of text! 😅
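For readers following along, here is a minimal sketch of the three questions raised above (pooling, normalization, cosine vs dot), written with plain arrays so it runs on its own. The helper names (`meanPool`, `normalize`, `dot`, `cosineSimilarity`) and the token vectors are made up for illustration; they are not the library's API, and this is not the snippet proposed in the PR.

```javascript
// Illustrative sketch only: plain-array helpers, NOT the transformers.js API.
// All function names and input vectors below are hypothetical.

// Mean pooling: average the per-token vectors into one sentence vector.
function meanPool(tokenVectors) {
  const dim = tokenVectors[0].length;
  const pooled = new Array(dim).fill(0);
  for (const vec of tokenVectors) {
    for (let i = 0; i < dim; i++) pooled[i] += vec[i] / tokenVectors.length;
  }
  return pooled;
}

// Dot product of two equal-length vectors.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// L2 normalization: scale a vector to unit length.
function normalize(vec) {
  const norm = Math.sqrt(dot(vec, vec));
  return vec.map((x) => x / norm);
}

// Cosine similarity: dot product divided by the product of the lengths.
function cosineSimilarity(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Made-up per-token vectors standing in for raw model outputs.
const sentenceA = meanPool([[1, 0, 2], [3, 0, 0]]);
const sentenceB = meanPool([[4, 0, 2]]);

const cosine = cosineSimilarity(sentenceA, sentenceB);
const rawDot = dot(sentenceA, sentenceB);
const normalizedDot = dot(normalize(sentenceA), normalize(sentenceB));

console.log({ cosine, rawDot, normalizedDot });
```

The raw dot product grows with vector length, while the dot product of normalized vectors matches the cosine similarity. Because the sentence vectors are ordinary arrays, they can be stored and compared later against new text, which is the "existing vectors" case described above.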
Mild preference :) If something is better for the dev experience, then I'll do that!
Agreed, though I would say that the /api/pipelines section is meant to have those technical details, while /pipelines shouldn't (it should be high-level).
Yes that's definitely something which should be improved. Perhaps adding a table of contents to the top of /api/pipelines which would link them to the relevant code snippets would be a simple addition for now (to replace the ugly auto-generated block which is there right now). For example, it could be similar to the available tasks section, but also linking to (or including) the parameters
Currently, the feature-extraction pipeline is only for text (something I actually found out recently, as I also thought it was for all modalities). The recommended way to get the raw model outputs is by loading models with the
Nothing wrong with having technical details there imo (especially now that we have links that go straight to relevant code snippets - much easier for newbies to navigate), but if there are already example code snippets there, why not make them as useful as possible to the dev that's reading them? If 50% of people hitting the page want to do X, then the code snippet should probably show an example of X - especially if it's just a couple more lines of code. But I agree that stuff that's higher level (than e.g. a dot product or whatever) should probably go on a separate page (same with not-as-common use cases).
Yeah that makes sense 👍 The library also has some other (not-as-well documented) methods for dot product and cosine similarity, so we could always just use those. For now, I'll merge these changes (as I am prepping v2.5.3 now), and we can continue improving the docs in other PRs 😇🤗 Thanks again for these improvements!
Issue: #134 (comment)
I linked to the feature-extraction example for sentence-similarity - relevant issues:
So, for now at least, can I add an example like this to the docs for feature-extraction?