
Version 1.1.1: using hive metastore support may cause lead to fail to start #1441

Open
foxgarden opened this issue Sep 5, 2019 · 0 comments

  1. Put all the needed files into conf/
  2. Use the 'root' user to start the cluster
  3. Get the exception below in snappyleader.log, and the lead fails to start:
......
19/09/05 10:13:11.647 CST StoreCatalog Client<tid=0x62> INFO snappystore: Done hive meta-store initialization
19/09/05 10:13:13.018 CST ForkJoinPool-1-worker-1<tid=0x11> INFO SnappyHiveThriftServer2: Starting HiveServer2 using snappy session
19/09/05 10:13:22.645 CST ForkJoinPool-1-worker-1<tid=0x11> WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
19/09/05 10:13:25.296 CST serverConnector<tid=0xd> WARN LeadImpl: Exception while starting lead node
java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:162)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3885)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3868)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:3850)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6826)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4562)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4532)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4505)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:884)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:328)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:641)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)

        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
        at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:253)
        at org.apache.spark.sql.hive.HiveUtils$.newClientForExecution(HiveUtils.scala:257)
        at org.apache.spark.sql.hive.thriftserver.SnappyHiveThriftServer2$.start(SnappyHiveThriftServer2.scala:66)
        at io.snappydata.impl.LeadImpl$$anonfun$1.apply$mcV$sp(LeadImpl.scala:309)
        at io.snappydata.impl.LeadImpl$$anonfun$1.apply(LeadImpl.scala:177)
        at io.snappydata.impl.LeadImpl$$anonfun$1.apply(LeadImpl.scala:177)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
        at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:162)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3885)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3868)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:3850)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6826)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4562)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4532)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4505)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:884)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:328)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:641)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3016)
        at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2984)
        at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1047)
        at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1043)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1043)
        at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1036)
        at org.apache.hadoop.hive.ql.exec.Utilities.createDirsWithPermission(Utilities.java:3679)
        at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:597)
        at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
        ... 14 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:162)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3885)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3868)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:3850)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6826)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4562)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4532)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4505)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:884)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:328)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:641)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)

        at org.apache.hadoop.ipc.Client.call(Client.java:1476)
        at org.apache.hadoop.ipc.Client.call(Client.java:1413)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy56.mkdirs(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:563)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy57.mkdirs(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3014)
        ... 24 more
19/09/05 10:13:25.520 CST Distributed system shutdown hook<tid=0x9> INFO snappystore: VM is exiting - shutting down distributed system
19/09/05 10:13:25.526 CST Thread-20<tid=0x49> INFO SparkContext: Invoking stop() from shutdown hook
19/09/05 10:13:25.719 CST Thread-20<tid=0x49> INFO SparkUI: Stopped Spark web UI at http://10.0.11.39:5050
19/09/05 10:13:25.780 CST Distributed system shutdown hook<tid=0x9> INFO snappystore: GemFireCache[id = 1288144943; isClosing = true; isShutDownAll = false; closingGatewayHubsByShutdownAll = false; created = Thu Sep 05 10:12:27 CST 2019; server = false; copyOnRead = false; lockLease = 120; lockTimeout = 60]: Now closing.
19/09/05 10:13:26.171 CST Thread-20<tid=0x49> INFO SnappyCoarseGrainedSchedulerBackend: Shutting down all executors
19/09/05 10:13:26.277 CST Thread-20<tid=0x49> INFO SparkContext: SparkContext already stopped.
19/09/05 10:13:26.313 CST Thread-20<tid=0x49> INFO snappystore: Stopping GemFireXD Management/Monitoring ... 
19/09/05 10:13:26.317 CST Thread-20<tid=0x49> INFO snappystore: Unregistered GemFireXD MBeans: []
19/09/05 10:13:26.325 CST Thread-20<tid=0x49> INFO snappystore: GfxdHeapThreshold: Stopping Query Cancellation Thread
19/09/05 10:13:26.328 CST gemfirexd.QueryCanceller<tid=0x4b> INFO snappystore: GfxdHeapThreshold: Processing CRITICAL_UP event
19/09/05 10:13:26.328 CST gemfirexd.QueryCanceller<tid=0x4b> INFO snappystore: GfxdHeapThreshold: Query Cancellation Thread Stopped 
19/09/05 10:13:26.356 CST Thread-20<tid=0x49> INFO snappystore: Disconnecting GemFire distributed system and stopping GemFireStore
19/09/05 10:13:26.356 CST Thread-20<tid=0x49> INFO snappystore: Disconnecting GemFire distributed system.
19/09/05 10:13:26.455 CST Distributed system shutdown hook<tid=0x9> INFO snappystore: closing DiskStore[GFXD-DEFAULT-DISKSTORE]
19/09/05 10:13:26.461 CST Distributed system shutdown hook<tid=0x9> INFO snappystore: Unlocked disk store GFXD-DEFAULT-DISKSTORE
19/09/05 10:13:26.462 CST Distributed system shutdown hook<tid=0x9> INFO snappystore: Stopping DiskStore task pools
19/09/05 10:13:26.502 CST Distributed system shutdown hook<tid=0x9> INFO snappystore: Shutting down DistributionManager 10.0.11.39(27263)<v4>:8932. 
19/09/05 10:13:26.663 CST Distributed system shutdown hook<tid=0x9> INFO snappystore: Now closing distribution for 10.0.11.39(27263)<v4>:8932
19/09/05 10:13:27.014 CST Distributed system shutdown hook<tid=0x9> INFO snappystore: DistributionManager stopped in 512ms.
19/09/05 10:13:27.014 CST Distributed system shutdown hook<tid=0x9> INFO snappystore: Marking DistributionManager 10.0.11.39(27263)<v4>:8932 as closed.
19/09/05 10:13:27.025 CST Thread-20<tid=0x49> INFO snappystore: TraceFabricServiceBoot: GemFireStore service stopped successfully, notifying status ... 
19/09/05 10:13:27.146 CST Thread-20<tid=0x49> INFO SnappyCoarseGrainedSchedulerBackend: SchedulerBackend stopped successfully
19/09/05 10:13:27.211 CST Thread-20<tid=0x49> INFO BlockManager: BlockManager stopped
19/09/05 10:13:27.214 CST Thread-20<tid=0x49> INFO BlockManagerMaster: BlockManagerMaster stopped
19/09/05 10:13:27.227 CST dispatcher-event-loop-1<tid=0x2a> INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/09/05 10:13:27.239 CST Thread-20<tid=0x49> INFO SparkContext: Successfully stopped SparkContext
19/09/05 10:13:27.239 CST Thread-20<tid=0x49> INFO ShutdownHookManager: Shutdown hook called
19/09/05 10:13:27.240 CST Thread-20<tid=0x49> INFO ShutdownHookManager: Deleting directory /tmp/spark-cc9c9122-85c3-461b-aa33-6796cc8ec928
19/09/05 10:13:27.243 CST Thread-20<tid=0x49> INFO ShutdownHookManager: Deleting directory /tmp/spark-b2ca9f18-be4c-4e1b-ba81-8731aed89a28

Finally, I fixed it with the below commands and then restarted the cluster:

su - hdfs
hdfs dfs -chmod 777 /
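
A narrower workaround might also work, though I have not tested it: instead of opening up the HDFS root, pre-create the Hive scratch directory so the lead never needs write access on / itself. This sketch assumes the default Hive scratch directory /tmp/hive; adjust the path if hive.exec.scratchdir is configured differently.

su - hdfs
# Pre-create the scratch directory (and its ancestors) as the HDFS superuser,
# so the lead does not have to create it under / at start-up.
hdfs dfs -mkdir -p /tmp/hive
# World-writable with the sticky bit, like a regular /tmp directory.
hdfs dfs -chmod 1777 /tmp/hive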

Maybe starting snappydata as the 'hdfs' user would also fix it. But I think snappydata should not write anything to HDFS, and should not need write permission on / .
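
If the lead actually honors a hive-site.xml placed in conf/ (I have not verified which properties it reads), another option might be to point hive.exec.scratchdir at a path the start-up user can already write to, so no permission change on / is needed at all. A sketch, with the scratch path chosen arbitrarily:

# Hypothetical sketch: if conf/hive-site.xml already exists, merge this
# property into it instead of overwriting the file.
cat > conf/hive-site.xml <<'EOF'
<configuration>
  <!-- Point the Hive scratch dir at a path writable by the user starting the cluster. -->
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/user/root/hive-scratch</value>
  </property>
</configuration>
EOF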
