Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Support insert overwrite #6282

Merged
merged 25 commits into from
Jun 14, 2022

Conversation

ABingHuang
Copy link
Contributor

What type of PR is this:

  • bug
  • feature
  • enhancement
  • others

Which issues of this PR fixes :

Fixes #4798

Problem Summary(Required) :

Support insert overwrite for OlapTable. use as following:
insert overwrite tableA select xxx from tableB;
insert overwrite tableA partitions(p1, p2) select xxx from tableB;
The design idea is create mirror temporary partitions and load new data into the temporary partitions. when all data is loaded, swap temporary partitions and original partitions.
The core points are:

  1. modify OlapTable meta of UNPARTITIONED table to support to add temporary partitions
  2. the atomicity of load. when the insert failed, should remove the temporary partitions created
  3. the editlog and serialization to data persistence
  4. the concurrent control between insert into and insert overwrite. now just mutual exclusion is used.

@ABingHuang ABingHuang force-pushed the issue/#4798_insert_overwrite branch from 112ce6b to 92ed7c9 Compare May 18, 2022 15:47
@Astralidea Astralidea changed the title Support insert overwrite [Feature] Support insert overwrite May 19, 2022
@chaoyli
Copy link
Contributor

chaoyli commented May 19, 2022

Please change the title and obey the rule.

@imay
Copy link
Contributor

imay commented May 19, 2022

Please change the title and obey the rule.

Can you give the rule's Link? I'm not sure where can I find it.

@ABingHuang ABingHuang force-pushed the issue/#4798_insert_overwrite branch from 3089ac9 to 0c8f5b3 Compare May 27, 2022 09:42
@ABingHuang
Copy link
Contributor Author

run starrocks_fe_unittest

1 similar comment
@ABingHuang
Copy link
Contributor Author

run starrocks_fe_unittest

kangkaisen
kangkaisen previously approved these changes May 28, 2022
@ABingHuang
Copy link
Contributor Author

run starrocks_fe_unittest

@ABingHuang ABingHuang force-pushed the issue/#4798_insert_overwrite branch from c1b9b2f to 05483de Compare June 11, 2022 06:11
Comment on lines 298 to 300
insertStmt.setOriginalTargetPartitionIds(insertStmt.getTargetPartitionIds());
insertStmt.setTargetPartitionNames(partitionNames);
insertStmt.setTargetPartitionIds(newPartitionIds);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why need db.lock here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only one insert for all related partition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think targetTable.getPartition should be called in read lock

private InsertOverwriteJobState jobState;

@SerializedName(value = "sourcePartitionNames")
private List<String> sourcePartitionNames;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you already have partitionid, why you need partition names?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modify it to use partition ids

@@ -1670,6 +1670,31 @@ public void replaceTempPartitions(List<String> partitionNames, List<String> temp
}
}

// used for unpartitioned table in insert overwrite
// replace partition with temp partition
public void replacePartition(String sourcePartitionName, String tempPartitionName) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It it strange to give partition names for unpartitioned

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unpartitioned table's partition also has a name, it is default to the table name. Here I have to add a tmp partition for table, so I need to give the tmp partition a name. I think it is ok.

1. rename CreateInsertOverwriteJobInfo
2. use partition id to replace name
3. optimize log
4. extract partition utils
@ABingHuang ABingHuang force-pushed the issue/#4798_insert_overwrite branch from 495618d to d443ebd Compare June 13, 2022 11:55
@Astralidea
Copy link
Contributor

run starrocks_fe_unittest

2 similar comments
@ABingHuang
Copy link
Contributor Author

run starrocks_fe_unittest

@ABingHuang
Copy link
Contributor Author

run starrocks_fe_unittest

@wanpengfei-git
Copy link
Collaborator

[FE PR Coverage check]

😞 fail : 312 / 543 (57.46%)

file detail

path covered line new line coverage
🔵 com/starrocks/load/PartitionUtils.java 0 34 00.00%
🔵 com/starrocks/persist/EditLog.java 2 10 20.00%
🔵 com/starrocks/server/GlobalStateMgr.java 3 13 23.08%
🔵 com/starrocks/common/util/RangeUtils.java 1 4 25.00%
🔵 com/starrocks/load/InsertOverwriteJobRunner.java 58 157 36.94%
🔵 com/starrocks/server/LocalMetastore.java 23 49 46.94%
🔵 com/starrocks/alter/Alter.java 1 2 50.00%
🔵 com/starrocks/alter/SchemaChangeHandler.java 1 2 50.00%
🔵 com/starrocks/alter/MaterializedViewHandler.java 1 2 50.00%
🔵 com/starrocks/analysis/InsertStmt.java 6 11 54.55%
🔵 com/starrocks/qe/StmtExecutor.java 14 24 58.33%
🔵 com/starrocks/sql/analyzer/InsertAnalyzer.java 6 10 60.00%
🔵 com/starrocks/sql/common/MetaUtils.java 8 11 72.73%
🔵 com/starrocks/load/InsertOverwriteJobManager.java 91 111 81.98%
🔵 com/starrocks/planner/OlapTableSink.java 5 6 83.33%
🔵 com/starrocks/catalog/OlapTable.java 12 13 92.31%
🔵 com/starrocks/load/InsertOverwriteJob.java 29 31 93.55%
🔵 com/starrocks/persist/CreateInsertOverwriteJobLog.java 14 15 93.33%
🔵 com/starrocks/persist/InsertOverwriteStateChangeInfo.java 16 17 94.12%
🔵 com/starrocks/sql/parser/AstBuilder.java 2 2 100.00%
🔵 com/starrocks/load/InsertOverwriteJobState.java 5 5 100.00%
🔵 com/starrocks/sql/analyzer/DeleteAnalyzer.java 2 2 100.00%
🔵 com/starrocks/sql/analyzer/UpdateAnalyzer.java 2 2 100.00%
🔵 com/starrocks/sql/analyzer/AnalyzeStmtAnalyzer.java 4 4 100.00%
🔵 com/starrocks/journal/JournalEntity.java 6 6 100.00%

@Astralidea Astralidea merged commit ecdd73b into StarRocks:main Jun 14, 2022
jaogoy pushed a commit to jaogoy/starrocks that referenced this pull request Nov 15, 2023
Signed-off-by: 絵空事スピリット <wanglichen@starrocks.com>
jaogoy pushed a commit to jaogoy/starrocks that referenced this pull request Nov 15, 2023
Signed-off-by: 絵空事スピリット <wanglichen@starrocks.com>
(cherry picked from commit a6629bd)

Co-authored-by: 絵空事スピリット <wanglichen@starrocks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[MaterializedView] add insert overwrite [P1]
9 participants