diff --git a/conf/be.conf b/conf/be.conf
index 4604d2b9ad842a..e25c636bcea546 100644
--- a/conf/be.conf
+++ b/conf/be.conf
@@ -17,11 +17,10 @@
 PPROF_TMPDIR="$DORIS_HOME/log/"
-CUR_DATE=`date +%Y%m%d-%H%M%S`
-JAVA_OPTS="-Xmx1024m -DlogPath=$DORIS_HOME/log/jni.log -Xloggc:$DORIS_HOME/log/be.gc.log.$CUR_DATE -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 -DJDBC_MAX_IDEL_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000"
+JAVA_OPTS="-Xmx1024m -DlogPath=$DORIS_HOME/log/jni.log -Xloggc:$DORIS_HOME/log/be.gc.log.$CUR_DATE -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 -DJDBC_MAX_IDLE_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000"
 # For jdk 9+, this JAVA_OPTS will be used as default JVM options
-JAVA_OPTS_FOR_JDK_9="-Xmx1024m -DlogPath=$DORIS_HOME/log/jni.log -Xlog:gc:$DORIS_HOME/log/be.gc.log.$CUR_DATE -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 -DJDBC_MAX_IDEL_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000"
+JAVA_OPTS_FOR_JDK_9="-Xmx1024m -DlogPath=$DORIS_HOME/log/jni.log -Xlog:gc:$DORIS_HOME/log/be.gc.log.$CUR_DATE -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 -DJDBC_MAX_IDLE_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000"
 # since 1.2, the JAVA_HOME need to be set to run BE process.
 # JAVA_HOME=/path/to/jdk/
diff --git a/conf/fe.conf b/conf/fe.conf
index 4a52b6ee06f11a..896c9b98a5a465 100644
--- a/conf/fe.conf
+++ b/conf/fe.conf
@@ -24,11 +24,10 @@
 # the output dir of stderr and stdout
 LOG_DIR = ${DORIS_HOME}/log
-DATE = `date +%Y%m%d-%H%M%S`
-JAVA_OPTS="-Xmx8192m -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:$DORIS_HOME/log/fe.gc.log.$DATE"
+JAVA_OPTS="-Djavax.security.auth.useSubjectCredsOnly=false -Xss4m -Xmx8192m -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:$DORIS_HOME/log/fe.gc.log.$CUR_DATE"
 # For jdk 9+, this JAVA_OPTS will be used as default JVM options
-JAVA_OPTS_FOR_JDK_9="-Xmx8192m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xlog:gc*:$DORIS_HOME/log/fe.gc.log.$DATE:time"
+JAVA_OPTS_FOR_JDK_9="-Djavax.security.auth.useSubjectCredsOnly=false -Xss4m -Xmx8192m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xlog:gc*:$DORIS_HOME/log/fe.gc.log.$CUR_DATE:time"
 ##
 ## the lowercase properties are read by main program.
diff --git a/docs/en/docs/lakehouse/multi-catalog/faq.md b/docs/en/docs/lakehouse/multi-catalog/faq.md
index 9ec3f3ffbc235f..e73a0a5bf40756 100644
--- a/docs/en/docs/lakehouse/multi-catalog/faq.md
+++ b/docs/en/docs/lakehouse/multi-catalog/faq.md
@@ -109,3 +109,83 @@ under the License.
     ```
     'fs.defaultFS' = 'hdfs://'
     ```
+12. The values of the partition fields in a Hudi table can be queried in Hive, but not in Doris.
+
+    Doris and Hive currently query Hudi differently. Doris requires the partition fields to be added to the avsc file that defines the Hudi table schema. If they are not added, Doris returns an empty `partition_val` (even if `hoodie.datasource.hive_sync.partition_fields=partition_val` is set).
+
+    ```
+    {
+        "type": "record",
+        "name": "record",
+        "fields": [{
+            "name": "partition_val",
+            "type": [
+                "null",
+                "string"
+                ],
+            "doc": "Preset partition field, empty string when not partitioned",
+            "default": null
+            },
+            {
+            "name": "name",
+            "type": "string",
+            "doc": "Name"
+            },
+            {
+            "name": "create_time",
+            "type": "string",
+            "doc": "Creation time"
+            }
+        ]
+    }
+    ```
+
+13. ORC tables written by Hive 1.x may have system column names such as `_col0`, `_col1`, `_col2`... in the underlying ORC file schema. In this case, set `hive.version` to 1.x.x in the catalog configuration so that the column names in the Hive table are used for mapping.
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'hive.version' = '1.x.x'
+    );
+    ```
+
+14. When using JDBC Catalog to synchronize MySQL data to Doris, date values are synchronized incorrectly. Check that the MySQL driver package matches the MySQL version. For example, MySQL 8 and above requires the driver `com.mysql.cj.jdbc.Driver`.
+
+15. If an error is reported while configuring Kerberos in the catalog: `SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]`.
+
+    Put `core-site.xml` in the `"${DORIS_HOME}/be/conf"` directory.
+
+    If an error is reported while accessing HDFS: `No common protection layer between client and server`, check that `hadoop.rpc.protection` is set to the same value on the client and the server.
+
+    ```
+    <?xml version="1.0" encoding="UTF-8"?>
+    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
+
+    <configuration>
+
+      <property>
+          <name>hadoop.security.authentication</name>
+          <value>kerberos</value>
+      </property>
+
+    </configuration>
+    ```
+
+16. Solutions for the error `Unable to obtain password from user` when configuring Kerberos in the catalog:
+    - The principal used must appear in the `klist` output; use `klist -kt your.keytab` to check.
+    - Ensure that the catalog configuration is correct; for example, check that `yarn.resourcemanager.principal` is not missing.
+    - If the preceding checks pass, the JDK installed by yum or another package manager on the current system may not support the encryption algorithm in use. It is recommended to install the JDK manually and set the `JAVA_HOME` environment variable.
+
+17. If an error is reported while querying the catalog with Kerberos: `GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos Ticket)`.
+    - Restarting FE and BE solves the problem in most cases.
+    - Before restarting all the nodes, you can add `-Djavax.security.auth.useSubjectCredsOnly=false` to `JAVA_OPTS` in `"${DORIS_HOME}/be/conf/be.conf"`, so that credentials are obtained through the underlying mechanism rather than through the application.
+    - More solutions to common JAAS errors are available in [JAAS Troubleshooting](https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/tutorials/Troubleshooting.html).
+
+18. If an error related to the Hive Metastore is reported while querying the catalog: `Invalid method name`.
+
+    Set the `hive.version` property.
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'hive.version' = '2.x.x'
+    );
+    ```
diff --git a/docs/zh-CN/docs/lakehouse/multi-catalog/faq.md b/docs/zh-CN/docs/lakehouse/multi-catalog/faq.md
index 2a0b6eeb69fc6c..e45f37f0565961 100644
--- a/docs/zh-CN/docs/lakehouse/multi-catalog/faq.md
+++ b/docs/zh-CN/docs/lakehouse/multi-catalog/faq.md
@@ -110,3 +110,78 @@ under the License.
     ```
     'fs.defaultFS' = 'hdfs://'
     ```
+
+12. 在hive上可以查到hudi表分区字段的值,但是在doris查不到。
+
+    doris和hive目前查询hudi的方式不一样,doris需要在hudi表结构的avsc文件里添加上分区字段,如果没加,就会导致doris查询partition_val为空(即使设置了hoodie.datasource.hive_sync.partition_fields=partition_val也不可以)
+    ```
+    {
+        "type": "record",
+        "name": "record",
+        "fields": [{
+            "name": "partition_val",
+            "type": [
+                "null",
+                "string"
+                ],
+            "doc": "Preset partition field, empty string when not partitioned",
+            "default": null
+            },
+            {
+            "name": "name",
+            "type": "string",
+            "doc": "名称"
+            },
+            {
+            "name": "create_time",
+            "type": "string",
+            "doc": "创建时间"
+            }
+        ]
+    }
+    ```
+
+13. Hive 1.x 的 orc 格式的表可能会遇到底层 orc 文件 schema 中列名为 `_col0`,`_col1`,`_col2`... 这类系统列名,此时需要在 catalog 配置中添加 `hive.version` 为 1.x.x,这样就会使用 hive 表中的列名进行映射。
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'hive.version' = '1.x.x'
+    );
+    ```
+
+14. 使用JDBC Catalog将MySQL数据同步到Doris中,日期数据同步错误。需要校验下MySQL的版本是否与MySQL的驱动包是否对应,比如MySQL8以上需要使用驱动com.mysql.cj.jdbc.Driver。
+
+15. 在Catalog中配置Kerberos时,如果报错`SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]`,那么需要将`core-site.xml`文件放到`"${DORIS_HOME}/be/conf"`目录下。
+
+    如果访问HDFS报错`No common protection layer between client and server`,检查客户端和服务端的`hadoop.rpc.protection`属性,使他们保持一致。
+
+    ```
+    <?xml version="1.0" encoding="UTF-8"?>
+    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
+
+    <configuration>
+
+      <property>
+          <name>hadoop.security.authentication</name>
+          <value>kerberos</value>
+      </property>
+
+    </configuration>
+    ```
+
+16. 在Catalog中配置Kerberos时,报错`Unable to obtain password from user`的解决方法:
+    - 用到的principal必须在klist中存在,使用`klist -kt your.keytab`检查。
+    - 检查catalog配置是否正确,比如漏配`yarn.resourcemanager.principal`。
+    - 若上述检查没问题,则当前系统yum或者其他包管理软件安装的JDK版本存在不支持的加密算法,建议自行安装JDK并设置`JAVA_HOME`环境变量。
+
+17. 查询配置了Kerberos的外表,遇到该报错:`GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos Ticket)`,一般重启FE和BE能够解决该问题。
+    - 重启所有节点前可在`"${DORIS_HOME}/be/conf/be.conf"`中的JAVA_OPTS参数里配置`-Djavax.security.auth.useSubjectCredsOnly=false`,通过底层机制去获取JAAS credentials信息,而不是应用程序。
+    - 在[JAAS Troubleshooting](https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/tutorials/Troubleshooting.html)中可获取更多常见JAAS报错的解决方法。
+
+18. 使用Catalog查询表数据时发现与Hive Metastore相关的报错:`Invalid method name`,需要设置`hive.version`参数。
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'hive.version' = '1.x.x'
+    );
+    ```
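
Note on the Kerberos flag: this patch adds `-Djavax.security.auth.useSubjectCredsOnly=false` to both `JAVA_OPTS` and `JAVA_OPTS_FOR_JDK_9` in `be.conf` and `fe.conf`, and FAQ item 17 suggests adding it before restarting the nodes. The following is a minimal sketch of how one might check a conf file for the flag. The function name `java_opts_missing_flag` and the sample text are hypothetical and not part of Doris, and the parsing assumes that each `JAVA_OPTS*` assignment sits on a single double-quoted line, as it does in the conf hunks of this diff.

```python
import re

# The JVM flag this patch adds to the default JAVA_OPTS.
FLAG = "-Djavax.security.auth.useSubjectCredsOnly=false"

# Matches lines such as: JAVA_OPTS="..." or JAVA_OPTS_FOR_JDK_9="..."
_ASSIGNMENT = re.compile(r'^(JAVA_OPTS\w*)\s*=\s*"(.*)"\s*$')


def java_opts_missing_flag(conf_text, flag=FLAG):
    """Return the names of JAVA_OPTS* variables in conf_text that lack flag.

    Hypothetical helper: commented-out lines are skipped, and only
    single-line, double-quoted assignments are recognized.
    """
    missing = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore commented-out settings
        match = _ASSIGNMENT.match(stripped)
        if match and flag not in match.group(2).split():
            missing.append(match.group(1))
    return missing


# Illustrative input only; a real check would read the text of be.conf or fe.conf.
SAMPLE_CONF = (
    'JAVA_OPTS="-Xmx1024m -Djavax.security.auth.useSubjectCredsOnly=false"\n'
    'JAVA_OPTS_FOR_JDK_9="-Xmx1024m -XX:-CriticalJNINatives"\n'
    '# JAVA_OPTS="-Xmx512m"\n'
)

print(java_opts_missing_flag(SAMPLE_CONF))
```

For the sample text, only `JAVA_OPTS_FOR_JDK_9` is reported, because the first assignment already carries the flag and the third line is a comment.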