[feature-wip](new-scan) Support stream load with csv in new scan framework #13354

Merged
morningman merged 5 commits into apache:master on Oct 17, 2022

@morningman morningman commented Oct 13, 2022

Proposed changes

Issue Number: close #xxx

Problem summary

  1. Refactor file reader creation in FileFactory for simplicity.
    Previously, FileFactory exposed too many create_file_reader interfaces.
    They are now unified into two categories: the interface used by the previous BrokerScanNode
    and the interface used by the new FileScanNode.
    The creation of readers that read from a StreamLoadPipe is also separated from the creation of readers that read files.

  2. Modify StreamLoadPlanner on the FE side to support using ExternalFileScanNode.

  3. For the generic reader, the file reader is now created inside the reader instead of being passed in from outside.

  4. Add test cases for CSV stream load. The behavior is the same as with the old broker scanner.
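
To illustrate items 1 and 3, here is a minimal sketch of the shape of the change. It is not the code in this PR: every name in it (ReaderFactory, create_pipe_reader, CsvReaderSketch, SourceKind, and so on) is a hypothetical stand-in, and the real FileFactory signatures differ. It only shows two ideas: pipe-backed and file-backed readers are created through separate entry points, and the format reader builds its own file reader when it is opened instead of receiving one from the caller.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; not the actual Doris declarations.
enum class SourceKind { kLocalFile, kStreamPipe };

struct FileReader {
    virtual ~FileReader() = default;
    virtual std::string describe() const = 0;
};

// Reads from a path on some file system.
struct PathFileReader : FileReader {
    explicit PathFileReader(std::string path) : path_(std::move(path)) {}
    std::string describe() const override { return "file:" + path_; }
    std::string path_;
};

// Reads from an in-memory pipe fed by a stream load request.
struct PipeFileReader : FileReader {
    explicit PipeFileReader(std::string load_id) : load_id_(std::move(load_id)) {}
    std::string describe() const override { return "pipe:" + load_id_; }
    std::string load_id_;
};

// Assumed factory shape: one entry point per source category,
// rather than many overloads of a single create function.
struct ReaderFactory {
    static std::unique_ptr<FileReader> create_file_reader(const std::string& path) {
        return std::make_unique<PathFileReader>(path);
    }
    static std::unique_ptr<FileReader> create_pipe_reader(const std::string& load_id) {
        return std::make_unique<PipeFileReader>(load_id);
    }
};

// The format reader only receives a description of the source and
// creates the underlying file reader itself in open().
class CsvReaderSketch {
public:
    CsvReaderSketch(SourceKind kind, std::string location)
            : kind_(kind), location_(std::move(location)) {}

    void open() {
        reader_ = (kind_ == SourceKind::kStreamPipe)
                          ? ReaderFactory::create_pipe_reader(location_)
                          : ReaderFactory::create_file_reader(location_);
    }

    std::string source() const { return reader_ ? reader_->describe() : "unopened"; }

private:
    SourceKind kind_;
    std::string location_;
    std::unique_ptr<FileReader> reader_;
};
```

With this shape, a caller such as a scan node only constructs CsvReaderSketch(SourceKind::kStreamPipe, "some-load-id") and calls open(); it never has to build or hand over a file reader itself.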

Checklist(Required)

  1. Does it affect the original behavior:
    • Yes
    • No
    • I don't know
  2. Has unit tests been added:
    • Yes
    • No
    • No Need
  3. Has document been added or modified:
    • Yes
    • No
    • No Need
  4. Does it need to update dependencies:
    • Yes
    • No
  5. Are there any changes that cannot be rolled back:
    • Yes (If Yes, please explain WHY)
    • No

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot added area/load Issues or PRs related to all kinds of load area/planner Issues or PRs related to the query planner area/vectorization labels Oct 13, 2022
@github-actions github-actions bot added the kind/docs Categorizes issue or PR as related to documentation. label Oct 14, 2022

@dataroaring dataroaring left a comment

LGTM

@morningman morningman merged commit dbf71ed into apache:master Oct 17, 2022
Labels
area/load Issues or PRs related to all kinds of load area/multi-catalog area/planner Issues or PRs related to the query planner area/routine load area/vectorization kind/docs Categorizes issue or PR as related to documentation. kind/test