Conversation

@MedlockM
Contributor

  • Description: Updates two notebooks in the how_to documentation to reflect new loader interfaces and functionalities.
  • Issue: Some how_to notebooks still used loader interfaces from previous versions of LangChain and did not demonstrate the latest loader features (e.g., extracting images with ImageBlobParser, extracting tables in specific output formats, parsing documents using Vision-Language Models with ZeroxPDFLoader, and using CloudBlobLoader in the GenericLoader).
  • Dependencies: py-zerox
  • Twitter handle: @MarcMedlock2

@dosubot dosubot bot added the size:XL label May 13, 2025
@vercel

vercel bot commented May 13, 2025

The latest updates on your projects:

Name | Status | Updated (UTC)
langchain | ✅ Ready | May 13, 2025 9:26pm

@dosubot dosubot bot added size:L and removed size:XL labels May 13, 2025
@dosubot dosubot bot added the lgtm label May 13, 2025
@ccurme ccurme merged commit ce0b1a9 into langchain-ai:master May 13, 2025
13 checks passed
@pprados pprados mentioned this pull request May 14, 2025
2 tasks
@MedlockM
Contributor Author

@ccurme
Could you please explain the reasoning behind your changes, especially those to the PDF guide? For example, why not keep the part explaining the new method for parsing with a vision model using Zerox? The approach currently presented (converting the PDF to an image, then sending the image to a chat object) is "old-fashioned", isn't it?

Likewise, being able to set up an ImageBlobParser directly in a loader to extract tables or images with different strategies is a new feature that should be highlighted, don't you think?

2 participants