Summary of Common Kerberos Problems

Posted by 光于前裕于后


1. No ticket obtained

Error message:

WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
ls: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: ""; destination host is: "":8020; 

Solution:

kinit -kt xx.keytab xx   # obtain a ticket non-interactively from a keytab
or
kinit xx                 # obtain a ticket interactively with the principal's password
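Before retrying the failing HDFS command, it is worth confirming the ticket really exists; a minimal check (the keytab path below is a placeholder, and ROBOT.COM is borrowed from problem 3):

kinit -kt /etc/security/keytabs/hdfs.keytab hdfs@ROBOT.COM  # placeholder keytab path and principal
klist                                                       # should now list a krbtgt/ROBOT.COM@ROBOT.COM ticket
hdfs dfs -ls /                                              # retry the command that failed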

2. Wrong principal password

Error message:

kinit: Password incorrect while getting initial credentials

Solution:
Use the correct password, or authenticate with the principal's keytab file instead.
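If the password is genuinely lost, it can be reset from the KDC host; a sketch, assuming admin access there and the ROBOT.COM realm from problem 3:

kadmin.local -q "cpw xx@ROBOT.COM"  # change_password; prompts for the new password
klist -kt xx.keytab                 # or list which principals (and kvno) a keytab actually holds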

3. KDC not running

Error message:

kinit: Cannot contact any KDC for realm 'ROBOT.COM' while getting initial credentials

Solution:
Start the KDC:

systemctl start krb5kdc.service   # start the KDC
systemctl enable krb5kdc.service  # start it automatically at boot
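It also helps to confirm the service actually came up and that the client can reach port 88; a quick check (kadmin.service is the RHEL/CentOS unit name for the admin server, and kdc-host is a placeholder):

systemctl status krb5kdc.service  # should report active (running)
systemctl start kadmin.service    # the admin server is usually wanted as well
systemctl enable kadmin.service
nc -vz kdc-host 88                # Kerberos listens on port 88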

4. Ticket cache

Error message:
klist shows a ticket and everything looks fine, but accessing HDFS still fails:

ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is
klist
Ticket cache: KEYRING:persistent:1001:1001

Solution:
Comment out the default cache setting in /etc/krb5.conf (older versions did not ship it). The KEYRING type stores tickets in the kernel keyring, which the JDK's Kerberos implementation cannot read, so Java-based Hadoop clients fail to find the TGT:

#default_ccache_name = KEYRING:persistent:%uid

A normal cache is a file under /tmp, as shown below:

klist 
Ticket cache: FILE:/tmp/krb5cc_0
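If editing /etc/krb5.conf is not an option, a standard alternative is to point the current shell at a FILE cache before running kinit; the JDK reads that type without trouble:

export KRB5CCNAME=FILE:/tmp/krb5cc_$(id -u)  # override the KEYRING default for this shell only
kinit -kt xx.keytab xx
klist                                        # should now report Ticket cache: FILE:/tmp/krb5cc_...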

5. Peer indicated failure

Error message:
Reported when connecting with beeline:

Connecting to jdbc:hive2://xx:10000/;principal=hive/xx
21/05/13 16:03:41 [main]: WARN jdbc.HiveConnection: Failed to connect to xx:10000
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://xx:10000/;principal=hive/xx: Peer indicated failure: GSS initiate failed (state=08S01,code=0)

Solution:
Run kinit first, then connect with beeline; the host in the principal must be the server actually running HiveServer2:

beeline -u "jdbc:hive2://<hs2-host>:10000/;principal=hive/<hs2-host>@xx.COM"
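For illustration, a full sequence with hypothetical host and realm names in place of the redacted ones (kinit with whatever principal you connect as; the principal= in the URL belongs to HiveServer2, and its host part must be the HS2 machine):

kinit -kt user.keytab user@EXAMPLE.COM
beeline -u "jdbc:hive2://hs2-host.example.com:10000/;principal=hive/hs2-host.example.com@EXAMPLE.COM"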

6. Mixed causes

Error message:

WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=xx
 WARN ipc.Client: Couldn't setup connection for hive/xx@xx to xx:8020
org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed

Solution:
Online reports of similar errors fall into two cases: either the ticket has expired, in which case re-running kinit is enough, or the JCE unlimited-strength policy files are missing, in which case copying the extracted local_policy.jar and US_export_policy.jar into $JAVA_HOME/jre/lib/security fixes it.
Download: https://download.csdn.net/download/Dr_Guo/18910019

scp *.jar xx:/usr/java/jdk1.8.0_181-cloudera/jre/lib/security/
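To check whether the unlimited-strength policy actually took effect after copying the jars, a common probe is to ask the JDK for the maximum AES key length; it should print 2147483647 rather than 128:

jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'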

Neither of those fixed my problem, though. The cluster had run fine for dozens of days after Kerberos was installed, and things only broke after I executed:

ktadd -k /xx/hdfs.keytab hdfs/xx

I then realized I had forgotten the -norandkey flag when exporting the keytab. Without it, ktadd randomizes the principal's keys, which invalidated the keytabs Cloudera Manager had generated earlier.
So I regenerated the credentials in CM: Administration > Security > Kerberos Credentials > select all principals > Regenerate Selected.
Then I exported fresh keytab files; you can also locate the CM-generated keytabs with find /.

# Either command works (xst is a kadmin alias for ktadd), but do not forget the -norandkey flag
ktadd -k /xx/hive.keytab -norandkey hive/xx@xx.COM
xst -k /xx/hdfs.keytab -norandkey hdfs/xx@xx.COM
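After exporting, it is worth verifying that the keytab still works, since a kvno mismatch between keytab and KDC is exactly what produces the GSS errors above; a quick check with standard tools:

klist -kt /xx/hive.keytab                  # list the principals and key version numbers in the keytab
kinit -kt /xx/hive.keytab hive/xx@xx.COM   # confirm the keytab can actually obtain a ticket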

Full log:

21/05/19 19:33:34 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1621424014472
21/05/19 19:33:36 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1621424014472
21/05/19 19:33:41 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1621424014472
21/05/19 19:33:42 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1621424014472
21/05/19 19:33:45 WARN ipc.Client: Couldn't setup connection for hive/xx to xx/xx:8020
org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed
	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:374)
	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:614)
	at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:410)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:799)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:795)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:795)
	at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:410)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1560)
	at org.apache.hadoop.ipc.Client.call(Client.java:1391)
	at org.apache.hadoop.ipc.Client.call(Client.java:1355)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:875)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1630)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1496)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1493)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1508)
	at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:65)
	at org.apache.hadoop.fs.Globber.doGlob(Globber.java:294)
	at org.apache.hadoop.fs.Globber.glob(Globber.java:149)
	at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1950)
	at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:353)
	at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:250)
	at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:233)
	at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:103)
	at org.apache.hadoop.fs.shell.Command.run(Command.java:177)
	at org.apache.hadoop.fs.FsShell.run(FsShell.java:326)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
	at org.apache.hadoop.fs.FsShell.main(FsShell.java:389)
ls: Failed on local exception: java.io.IOException: Couldn't setup connection for hive/xx to xx/xx:8020; Host Details : local host is: "xx/xx"; destination host is: "xx":8020; 
For reference, the kadmin commands to create the principal and export its keytab without randomizing the keys:

addprinc -randkey hdfs
ktadd -k /xx/hdfs.keytab -norandkey hdfs
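To confirm the damage from the randomized keys is repaired, compare the kvno the KDC holds with the one in the keytab; a sketch using kadmin.local on the KDC host:

kadmin.local -q "getprinc hdfs"  # note the "Key: vno N" lines
klist -kt /xx/hdfs.keytab        # the kvno here must match N, or GSS initiate keeps failing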
