[Feature] Support Alibaba DLF metastore for Hive external table #6403
@@ -465,10 +465,30 @@ under the License.

<!-- https://mvnrepository.com/artifact/com.facebook.presto.hive/hive-apache -->
<dependency>
<groupId>com.facebook.presto.hive</groupId>
<groupId>io.trino.hive</groupId>
Please run the regression test.
@@ -100,6 +101,7 @@ public class HiveMetaClient {

public HiveMetaClient(String uris) throws DdlException {
HiveConf conf = new HiveConf();
conf.addResource(new Path("file:///" + StarRocksFE.STARROCKS_HOME_DIR + "/conf/hive-site.xml"));
$FE/conf is already used as the classpath of the FE, so I don't think it is necessary to add this?
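To illustrate the reviewer's point, here is a minimal sketch of how a file in a directory that is on the classpath becomes discoverable by name, with no explicit path. It assumes that HiveConf locates hive-site.xml through a class-loader resource lookup; the class `ClasspathLookupSketch`, its methods, and the temporary directory standing in for $FE/conf are hypothetical and not part of this PR.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of StarRocks or Hive.
final class ClasspathLookupSketch {
    /** Creates a temporary stand-in for $FE/conf, optionally containing hive-site.xml. */
    static Path tempConfDir(boolean withHiveSite) {
        try {
            Path dir = Files.createTempDirectory("fe-conf");
            if (withHiveSite) {
                Files.write(dir.resolve("hive-site.xml"),
                        "<configuration/>".getBytes(StandardCharsets.UTF_8));
            }
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if resourceName resolves through a class loader whose only entry is confDir. */
    static boolean visibleOnClasspath(Path confDir, String resourceName) {
        // A null parent limits the lookup to the bootstrap loader plus confDir itself.
        try (URLClassLoader loader =
                new URLClassLoader(new URL[] {confDir.toUri().toURL()}, null)) {
            return loader.getResource(resourceName) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the FE start script really does put $FE/conf on the classpath, a lookup of this kind would already find hive-site.xml, which is why the extra `addResource` call may be redundant.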
@@ -119,8 +121,13 @@ public class AutoCloseClient implements AutoCloseable {
private final IMetaStoreClient hiveClient;

private AutoCloseClient(HiveConf conf) throws MetaException {
hiveClient = RetryingMetaStoreClient.getProxy(conf, dummyHookLoader,
HiveMetaStoreThriftClient.class.getName());
if ("dlf".equalsIgnoreCase(conf.get("hive.metastore"))) {
Where is the 'hive.metastore' configuration defined? Could the name be changed? I think 'hive.metastore' is too broad.
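For reference, the selection logic under discussion can be sketched in isolation as follows. This is only an illustration: it uses the narrower key `hive.metastore.type` that appears in the usage notes of this PR, assumes the non-DLF branch keeps the existing Thrift client, and works on a plain `Map` rather than a real `HiveConf`. The class `MetastoreClientSelector` and its method are hypothetical, and the client class names are the simple names from the diff with packages omitted.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper; the real code calls RetryingMetaStoreClient.getProxy with a class name.
final class MetastoreClientSelector {
    // Key name taken from the usage notes of this PR; the reviewed diff still reads "hive.metastore".
    static final String TYPE_KEY = "hive.metastore.type";
    // Simple class names as they appear in the diff (packages omitted in this sketch).
    static final String DLF_CLIENT = "ProxyMetaStoreClient";
    static final String THRIFT_CLIENT = "HiveMetaStoreThriftClient";

    /** Returns the client class name to use for the given configuration. */
    static String clientClassFor(Map<String, String> conf) {
        // Constant-first equalsIgnoreCase stays null-safe when the key is absent.
        return "dlf".equalsIgnoreCase(conf.get(TYPE_KEY)) ? DLF_CLIENT : THRIFT_CLIENT;
    }

    public static void main(String[] args) {
        System.out.println(clientClassFor(Collections.singletonMap(TYPE_KEY, "DLF")));
        System.out.println(clientClassFor(Collections.<String, String>emptyMap()));
    }
}
```

A dedicated key such as this one leaves unrelated `hive.metastore.*` settings unaffected, which is the concern raised in the comment above.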
fe/pom.xml
Outdated
<dependency>
<groupId>com.aliyun.datalake</groupId>
<artifactId>metastore-client-hive3</artifactId>
<version>0.2.14</version>
Please define ${metastore-client-hive3.version} in $STARROCKS/pom.xml.
@@ -723,6 +723,12 @@ public void createCatalog(Catalog catalog)
throw new TException("method not implemented");
}

@Override
public void alterCatalog(String s, Catalog catalog) throws NoSuchObjectException, InvalidObjectException,
Why was this interface added?
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class ProxyMetaStoreClient implements IMetaStoreClient {
Is this only used by DLF?

public class ProxyMetaStoreClient implements IMetaStoreClient {
private static final Logger logger =
LoggerFactory.getLogger(com.aliyun.datalake.metastore.hive2.ProxyMetaStoreClient.class);
The pom uses metastore-client-hive3, but here we use hive2. Is that OK?
@@ -0,0 +1,2478 @@
// This file is licensed under the Elastic License 2.0. Copyright 2021-present, StarRocks Limited.
// This file is based on code available under the Apache license here:
// https://github.com/aliyun/datalake-catalog-metastore-client/blob/master/metastore-client-hive/metastore-client-hive3/src/main/java/com/aliyun/datalake/metastore/hive2/ProxyMetaStoreClient.java
@imay, is the license OK?
run starrocks_fe_unittest
[FE PR Coverage check] 😞 fail: 3 / 6 (50.00%)
LGTM
…ocks#6403)
Data Lake Formation (DLF) is a key component of the cloud-native data lake framework and is widely used on Alibaba Cloud, in a role similar to AWS Glue. For details, see https://www.alibabacloud.com/en/product/datalake-formation. This PR allows users to use DLF as the metastore for Hive external tables.
Usage: Add a config file hive-site.xml to {FE Home DIR}/conf with the following configs:
<?xml version="1.0"?>
<configuration>
  <!--Set to use dlf client-->
  <property>
    <name>hive.metastore.type</name>
    <value>dlf</value>
  </property>
  <!--DLF endpoint, see https://www.alibabacloud.com/help/en/doc-detail/197608.html-->
  <property>
    <name>dlf.catalog.endpoint</name>
    <value>dlf-vpc.cn-beijing.aliyuncs.com</value>
  </property>
  <!--DLF region, see https://www.alibabacloud.com/help/en/doc-detail/197608.html-->
  <property>
    <name>dlf.catalog.region</name>
    <value>cn-beijing</value>
  </property>
  <!--Proxy mode of DLF-->
  <property>
    <name>dlf.catalog.proxyMode</name>
    <value>DLF_ONLY</value>
  </property>
  <!--Access Key mode of DLF-->
  <property>
    <name>dlf.catalog.akMode</name>
    <value>EMR_AUTO</value>
  </property>
  <!--User ID of the Alibaba Cloud account-->
  <property>
    <name>dlf.catalog.uid</name>
    <value>xxxxxx</value>
  </property>
  <!--Access Key ID of DLF, can be omitted if the cluster is created with the same Alibaba Cloud account as DLF-->
  <property>
    <name>dlf.catalog.accessKeyId</name>
    <value>xxxxxx</value>
  </property>
  <!--Access Key secret of DLF, can be omitted if the cluster is created with the same Alibaba Cloud account as DLF-->
  <property>
    <name>dlf.catalog.accessKeySecret</name>
    <value>xxxxxx</value>
  </property>
</configuration>
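As a quick way to sanity-check a hive-site.xml like the one in this commit message before starting the FE, the name/value pairs can be read with the JDK's built-in XML parser. This is a minimal sketch, not how StarRocks or Hive loads the file (that goes through HiveConf); the class `HiveSiteReader` and its `parse` method are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; it only mirrors the <property><name/><value/></property> layout.
final class HiveSiteReader {
    /** Parses Hadoop-style configuration XML into an ordered name-to-value map. */
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                // Each property is expected to carry exactly one <name> and one <value>.
                String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a valid configuration file", e);
        }
    }
}
```

For example, checking that `parse(...).get("hive.metastore.type")` equals `dlf` confirms that the file would select the DLF client described above.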
Signed-off-by: 絵空事スピリット <wanglichen@starrocks.com>
Signed-off-by: 絵空事スピリット <wanglichen@starrocks.com> (cherry picked from commit 40daf76) Co-authored-by: 絵空事スピリット <wanglichen@starrocks.com>
What type of PR is this:
Which issues does this PR fix:
Fixes #
Problem Summary (Required):
Data Lake Formation (DLF) is a key component of the cloud-native data lake framework and is widely used on Alibaba Cloud, in a role similar to AWS Glue. For details, see https://www.alibabacloud.com/en/product/datalake-formation. This PR allows users to use DLF as the metastore for Hive external tables.
Usage:
Add a config file hive-site.xml to {FE Home DIR}/conf with the following configs: