@aidansu (Contributor) commented Nov 5, 2025

What problem does this PR solve?

  • Added TCADP Parser configuration fields to PDF, PPT, and spreadsheet parsing forms
  • Implemented support for setting table result type (Markdown/HTML) and Markdown image response type (URL/Text)
  • Updated TCADP Parser to handle return format settings from configuration or parameters
  • Enhanced frontend to dynamically show TCADP options based on selected parsing method
  • Modified backend to pass format parameters when calling TCADP API
  • Optimized form default value logic for TCADP configuration items
  • Updated multilingual resource files for new configuration options
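
The "configuration or parameters" handling described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function names (`resolve_option`, `resolve_formats`), the keys (`table_result_type`, `markdown_image_response_type`), the string values, and the choice of defaults are all assumptions made for the example. The assumed precedence is an explicit parameter first, then the stored configuration, then a default.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical option sets for illustration; the values the PR or the TCADP API
# actually uses may differ.
TABLE_RESULT_TYPES = ("markdown", "html")
IMAGE_RESPONSE_TYPES = ("url", "text")


def resolve_option(
    name: str,
    allowed: Sequence[str],
    param: Optional[str],
    config: Mapping[str, str],
) -> str:
    """Prefer an explicit parameter, then the configured value, then the first allowed value."""
    value = param if param is not None else config.get(name, allowed[0])
    value = value.strip().lower()
    if value not in allowed:
        raise ValueError(f"{name} must be one of {tuple(allowed)}, got {value!r}")
    return value


def resolve_formats(
    config: Mapping[str, str],
    table_result_type: Optional[str] = None,
    markdown_image_response_type: Optional[str] = None,
) -> dict:
    """Build the format settings that would accompany a parse request."""
    return {
        "table_result_type": resolve_option(
            "table_result_type", TABLE_RESULT_TYPES, table_result_type, config
        ),
        "markdown_image_response_type": resolve_option(
            "markdown_image_response_type",
            IMAGE_RESPONSE_TYPES,
            markdown_image_response_type,
            config,
        ),
    }
```

Validating against a fixed set of allowed values keeps a mistyped form value from reaching the API call; whether the actual implementation validates or silently falls back is not stated in this PR description.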

Type of change

  • New Feature (non-breaking change which adds functionality)

- Remove custom signature implementation and adopt Tencent Cloud's official SDK
- Update configuration files to support new SDK parameters
- Upgrade dependencies to the latest stable versions
- Optimize streaming response handling mechanism
- Unify environment variable reading logic
- Enhance control over table and image response types
…add_adp_parser

# Conflicts:
#	rag/app/naive.py
#	rag/flow/parser/parser.py
#	web/src/components/layout-recognize-form-field.tsx
- Add spreadsheet parsing field component in data flow and agent forms
- Update spreadsheet parsing constant configurations to support both DeepDOC and TCADP parsing methods
- Implement TCADP parsing logic for spreadsheet files in rag/app/naive.py
- Extend rag/flow/parser/parser.py to support both TCADP and DeepDOC spreadsheet parsing methods
- Add handling of TCADP parsing results for HTML, JSON, and Markdown output formats
- Update frontend utility functions to pass spreadsheet parsing method configurations
- Add tcadp_parser method for PPT files
- Support both PPT and PPTX file formats
- Add PPT form field component
- Add new output format options: markdown, text, and html
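
As a rough sketch of the spreadsheet items above, choosing between the DeepDOC and TCADP parsing methods could be a small dispatch table keyed by the configured method. The parser functions below are stand-ins that only return a descriptive string, and `parse_spreadsheet`, `PARSERS`, and `OUTPUT_FORMATS` are hypothetical names; the real logic lives in rag/app/naive.py and rag/flow/parser/parser.py and may be structured differently.

```python
from typing import Callable, Dict

# Output formats mentioned for TCADP spreadsheet results in the commit message.
OUTPUT_FORMATS = ("html", "json", "markdown")


def parse_with_deepdoc(data: bytes, output_format: str) -> str:
    # Stand-in for the DeepDOC spreadsheet parser (hypothetical).
    return f"deepdoc:{output_format}:{len(data)} bytes"


def parse_with_tcadp(data: bytes, output_format: str) -> str:
    # Stand-in for the TCADP spreadsheet parser (hypothetical).
    return f"tcadp:{output_format}:{len(data)} bytes"


# Map the normalized parsing method name to its parser.
PARSERS: Dict[str, Callable[[bytes, str], str]] = {
    "deepdoc": parse_with_deepdoc,
    "tcadp": parse_with_tcadp,
}


def parse_spreadsheet(
    data: bytes, parse_method: str = "DeepDOC", output_format: str = "html"
) -> str:
    """Dispatch to the parser selected by parse_method after validating both options."""
    method = parse_method.strip().lower()
    if method not in PARSERS:
        raise ValueError(f"unsupported parse method: {parse_method!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    return PARSERS[method](data, output_format)
```

A table like this keeps the set of supported methods in one place, so adding another parser would not require touching the call sites.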
…mance/perf_tcadp_parser

# Conflicts:
#	rag/app/naive.py
#	rag/flow/parser/parser.py
#	uv.lock
#	web/src/components/layout-recognize-form-field.tsx
#	web/src/pages/data-flow/constant.tsx
#	web/src/pages/data-flow/form/parser-form/index.tsx
#	web/src/pages/data-flow/form/parser-form/pdf-form-fields.tsx
#	web/src/pages/data-flow/utils.ts
… for the TCADP Parser

- Added TCADP Parser-related configuration fields to the PDF, PPT, and spreadsheet parsing forms
- Added support for setting table result type (Markdown/HTML) and Markdown image response type (URL/Text)
- Updated the TCADP Parser to support obtaining return format settings from configuration or parameters
- Updated frontend logic to dynamically display TCADP configuration options based on the selected parsing method
- Modified backend logic to pass the corresponding format configuration parameters when calling the TCADP API
- Optimized the form default value setting logic to ensure TCADP configuration items have appropriate initial values
- Updated multilingual resource files to support the UI display of the new configuration items
…nce/perf_tcadp_parser

# Conflicts:
#	rag/app/naive.py
@dosubot (bot) added the labels size:XL (This PR changes 500-999 lines, ignoring generated files.) and 💞 feature (Feature request, pull request that fulfills a new feature.) on Nov 5, 2025
@yingfeng added the ci (Continuous Integration) label on Nov 5, 2025
@yingfeng (Member) commented Nov 6, 2025

Thanks, please fix the CI first.

@aidansu (Contributor, Author) commented Nov 6, 2025

@yingfeng CI issues fixed, checks are passing now. Please review again. Thanks!

