-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement] Reduce the memory usage during data ingestion #47047
Conversation
Signed-off-by: sevev <qiangzh95@gmail.com>
POlapTableSchemaParam ptable_schema_param; | ||
// `ptable_schema_param` is valid during initialization. | ||
// And it may become invalid after initialization, so please do not access it afterward. | ||
const POlapTableSchemaParam* ptable_schema_param = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this will be invalid after unit, why not make it a function variable in unit rather than store it in member variable.
And it will be better to set it to null after unit then it will cause crash rather than wild pointer issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this will be invalid after unit, why not make it a function variable in unit rather than store it in member variable. And it will be better to set it to null after unit then it will cause crash rather than wild pointer issues
Actually this point is valid during data ingestion right now, but I worry that someone else might prematurely release the corresponding memory when making changes, so I added some comments as a warning. And we only need to access this memory during initialization. I reset it to nullptr after init and modified some comments to make it clearer for others to understand.
Signed-off-by: sevev <qiangzh95@gmail.com>
Signed-off-by: sevev <qiangzh95@gmail.com>
[FE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[BE Incremental Coverage Report]✅ pass : 17 / 17 (100.00%) file detail
|
@Mergifyio backport branch-3.3 |
@Mergifyio backport branch-3.2 |
✅ Backports have been created
|
✅ Backports have been created
|
Signed-off-by: sevev <qiangzh95@gmail.com> (cherry picked from commit 920c820)
Signed-off-by: sevev <qiangzh95@gmail.com> (cherry picked from commit 920c820) # Conflicts: # be/src/storage/delta_writer.cpp
Why I'm doing:
FE will send full schema to BE during data ingestion and each tablet may generate the new tablet schema according to the schema from FE and each
delta_writer
will keep a full schema until ingestion finished. If there are a lot of ingestion task and tablets, it maybe use a lot of memory.What I'm doing:
The full schema for each
delta_writer
is no need to hold during the whole ingestion. We only need it during initialization, so we can release memory in advance.Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: