Feat: Add TCADP parser for PPTX and spreadsheet document types. #11041
Open: aidansu wants to merge 11 commits into infiniflow:main from aidansu:performance/perf_tcadp_parser
Conversation
- Remove custom signature implementation and adopt Tencent Cloud's official SDK
- Update configuration files to support new SDK parameters
- Upgrade dependencies to the latest stable versions
- Optimize streaming response handling mechanism
- Unify environment variable reading logic
- Enhance control over table and image response types
…add_adp_parser
# Conflicts:
#   rag/app/naive.py
#   rag/flow/parser/parser.py
#   web/src/components/layout-recognize-form-field.tsx
…sdk-python from 3.0.1215 to 3.0.1478
- Add spreadsheet parsing field component in data flow and agent forms
- Update spreadsheet parsing constant configurations to support both DeepDOC and TCADP parsing methods
- Implement TCADP parsing logic for spreadsheet files in rag/app/naive.py
- Extend rag/flow/parser/parser.py to support both TCADP and DeepDOC spreadsheet parsing methods
- Add handling of TCADP parsing results for HTML, JSON, and Markdown output formats
- Update frontend utility functions to pass spreadsheet parsing method configurations
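For reviewers, the choice between the two spreadsheet parsing methods described in this commit can be pictured roughly as follows. This is only a sketch under stated assumptions: `make_dispatcher`, the method labels, and the two stand-in parser callables are hypothetical names for illustration and are not taken from rag/app/naive.py or rag/flow/parser/parser.py.

```python
from typing import Callable, Dict

# A parser takes raw file bytes and returns extracted text (assumed shape).
Parser = Callable[[bytes], str]


def make_dispatcher(parsers: Dict[str, Parser], default: str = "deepdoc") -> Callable[..., str]:
    """Build a function that routes a spreadsheet to the selected parser.

    Method names are matched case-insensitively; an empty method falls
    back to `default`. All names here are illustrative assumptions.
    """
    table = {name.strip().lower(): fn for name, fn in parsers.items()}
    if default not in table:
        raise KeyError(f"default parser {default!r} is not registered")

    def dispatch(blob: bytes, method: str = "") -> str:
        key = (method or default).strip().lower()
        if key not in table:
            raise ValueError(f"unsupported spreadsheet parse method: {method!r}")
        return table[key](blob)

    return dispatch


# Stand-in parsers; the real ones would call DeepDOC or the TCADP API.
dispatch = make_dispatcher(
    {
        "DeepDOC": lambda blob: f"deepdoc:{len(blob)}",
        "TCADP": lambda blob: f"tcadp:{len(blob)}",
    }
)
```

Keeping the routing in one table means an unknown method fails loudly instead of silently falling through to the default parser.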
- Add tcadp_parser method for PPT files
- Support both PPT and PPTX file formats
- Add PPT form field component
- Add new output format options: markdown, text, and html
…mance/perf_tcadp_parser
# Conflicts:
#   rag/app/naive.py
#   rag/flow/parser/parser.py
#   uv.lock
#   web/src/components/layout-recognize-form-field.tsx
#   web/src/pages/data-flow/constant.tsx
#   web/src/pages/data-flow/form/parser-form/index.tsx
#   web/src/pages/data-flow/form/parser-form/pdf-form-fields.tsx
#   web/src/pages/data-flow/utils.ts
… for the TCADP Parser
- Added TCADP Parser-related configuration fields to the PDF, PPT, and spreadsheet parsing forms
- Added support for setting table result type (Markdown/HTML) and Markdown image response type (URL/Text)
- Updated the TCADP Parser to support obtaining return format settings from configuration or parameters
- Updated frontend logic to dynamically display TCADP configuration options based on the selected parsing method
- Modified backend logic to pass the corresponding format configuration parameters when calling the TCADP API
- Optimized the form default value setting logic to ensure TCADP configuration items have appropriate initial values
- Updated multilingual resource files to support the UI display of the new configuration items
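One way to read "obtaining return format settings from configuration or parameters" is a simple precedence rule: an explicit argument wins, then the stored configuration, then a default. The sketch below assumes that order. The function name, the configuration keys, the lowercase labels, and the defaults are all hypothetical and are not confirmed by this PR or by the Tencent Cloud SDK.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Allowed labels, mirroring the options named in the commit message (assumed spelling).
TABLE_RESULT_TYPES = {"markdown", "html"}
IMAGE_RESPONSE_TYPES = {"url", "text"}


@dataclass(frozen=True)
class TcadpFormats:
    table_result_type: str
    markdown_image_response_type: str


def _pick(explicit: Optional[str], config: Mapping[str, Any], key: str,
          allowed: set, default: str) -> str:
    """Return the explicit value if given, else the config value, else the default."""
    value = explicit if explicit is not None else config.get(key, default)
    value = str(value).strip().lower()
    if value not in allowed:
        raise ValueError(f"{key} must be one of {sorted(allowed)}, got {value!r}")
    return value


def resolve_tcadp_formats(
    config: Mapping[str, Any],
    table_result_type: Optional[str] = None,
    markdown_image_response_type: Optional[str] = None,
) -> TcadpFormats:
    """Resolve both format settings using parameter > config > default (assumed order)."""
    return TcadpFormats(
        table_result_type=_pick(
            table_result_type, config, "table_result_type",
            TABLE_RESULT_TYPES, "markdown",
        ),
        markdown_image_response_type=_pick(
            markdown_image_response_type, config, "markdown_image_response_type",
            IMAGE_RESPONSE_TYPES, "url",
        ),
    )
```

For example, `resolve_tcadp_formats({"table_result_type": "HTML"}, markdown_image_response_type="text")` takes the table type from the configuration and the image response type from the argument.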
…nce/perf_tcadp_parser
# Conflicts:
#   rag/app/naive.py
Member
Thanks, please fix the CI first.
Contributor, Author
@yingfeng CI issues fixed, checks are passing now. Please review again. Thanks!
What problem does this PR solve?
Type of change