[Optimize](simd json reader) Cached search results for previous row (keyed as index in JSON object) - used as a hint. #17124

Merged
merged 2 commits into from
Feb 27, 2023

Conversation

eldenmoon
Member

@eldenmoon eldenmoon commented Feb 24, 2023

Proposed changes

Optimize load cost: load time drops from 78s to 62s.

Issue Number: close #xxx

Problem summary

Describe your changes.

Checklist(Required)

  • Does it affect the original behavior?
  • Have unit tests been added?
  • Has documentation been added or modified?
  • Does it need to update dependencies?
  • Does this PR support rollback? (If NO, please explain WHY)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot added area/vectorization kind/docs Categorizes issue or PR as related to documentation. kind/test labels Feb 24, 2023
Contributor

@github-actions github-actions bot left a comment


clang-tidy made some suggestions

…keyed as index in JSON object) - used as a hint.

`_simdjson_set_column_value` can become a hot spot when parsing JSON in simdjson mode.
Introduce `_prev_positions` to cache the search results from the previous row (keyed by index in the JSON object) and use them as a hint,
since the order of the JSON field names should be nearly the same from one line to the next.
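
The idea in this commit message can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: `ColumnLocator`, `find`, `hint_hits`, and `slow_lookups` are hypothetical names made up for illustration, and only `_prev_positions` is taken from the description above. The sketch assumes C++17 and models a JSON row as a sequence of keys, each looked up by its position in the object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the reader's "JSON key -> column index" lookup.
class ColumnLocator {
public:
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

    explicit ColumnLocator(const std::vector<std::string>& column_names)
            : _names(column_names) {
        for (std::size_t i = 0; i < column_names.size(); ++i) {
            _name_to_index.emplace(column_names[i], i);
        }
    }

    // key_index: position of the key inside the current JSON object.
    // Returns the column index for `key`, or npos if no column matches.
    std::size_t find(std::size_t key_index, std::string_view key) {
        if (key_index < _prev_positions.size()) {
            // Fast path: try the column that matched at this position
            // in the previous row, and verify it with one comparison.
            const std::size_t hint = _prev_positions[key_index];
            assert(hint == npos || hint < _names.size());
            if (hint != npos && _names[hint] == key) {
                ++hint_hits;
                return hint;
            }
        } else {
            _prev_positions.resize(key_index + 1, npos);
        }
        // Slow path: full hash lookup, then remember the result as the
        // hint for the same position in the next row.
        ++slow_lookups;
        const auto it = _name_to_index.find(std::string(key));
        const std::size_t col = (it == _name_to_index.end()) ? npos : it->second;
        _prev_positions[key_index] = col;
        return col;
    }

    std::size_t hint_hits = 0;     // lookups answered by the cached hint
    std::size_t slow_lookups = 0;  // lookups that fell back to the hash map

private:
    std::vector<std::string> _names;
    std::unordered_map<std::string, std::size_t> _name_to_index;
    std::vector<std::size_t> _prev_positions;  // keyed by index in the JSON object
};
```

Because the cached position is only a hint, a row whose keys arrive in a different order still resolves correctly through the slow path; it just pays for the hash lookup and refreshes the hint.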
Contributor

@github-actions github-actions bot left a comment


clang-tidy made some suggestions

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@eldenmoon
Member Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Contributor

@xiaokang xiaokang left a comment


LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@qidaye qidaye left a comment


LGTM

@yiguolei yiguolei merged commit 29dc08f into apache:master Feb 27, 2023
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 27, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

Yulei-Yang pushed a commit to Yulei-Yang/doris that referenced this pull request Mar 5, 2023
…keyed as index in JSON object) - used as a hint. (apache#17124)

* [Optimize](simd json reader) Cached search results for previous row (keyed as index in JSON object) - used as a hint.

* fix case
yagagagaga pushed a commit to yagagagaga/doris that referenced this pull request Mar 9, 2023
…keyed as index in JSON object) - used as a hint. (apache#17124)

* [Optimize](simd json reader) Cached search results for previous row (keyed as index in JSON object) - used as a hint.

* fix case
luwei16 pushed a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
* (Enhancement)[load-json] support simdjson in new json reader (apache#16903)

be config:
enable_simdjson_reader=true

related PR apache#11665

* [Optimize](simd json reader) Cached search results for previous row (keyed as index in JSON object) - used as a hint. (apache#17124)

* [Optimize](simd json reader) Cached search results for previous row (keyed as index in JSON object) - used as a hint.

* fix case
SWJTU-ZhangLei pushed a commit to SWJTU-ZhangLei/incubator-doris that referenced this pull request Jul 25, 2023
… (apache#1445)

* (Enhancement)[load-json] support simdjson in new json reader (apache#16903)

be config:
enable_simdjson_reader=true

related PR apache#11665

* [Optimize](simd json reader) Cached search results for previous row (keyed as index in JSON object) - used as a hint. (apache#17124)

* [Optimize](simd json reader) Cached search results for previous row (keyed as index in JSON object) - used as a hint.

* fix case
Labels
approved Indicates a PR has been approved by one committer. area/vectorization kind/docs Categorizes issue or PR as related to documentation. kind/test reviewed
4 participants