2020-10-28 Data encryption for HDFS block data transfer


Operating system: CentOS Linux release 7.4.1708 (Core)

Software: jdk-8u201-linux-x64.tar.gz, hadoop-2.7.7.tar.gz

Installation:

Log in to 192.168.1.17

hostnamectl set-hostname node1;mkdir /opt/namenode;mkdir /opt/datanode

vi /etc/hosts

192.168.1.17  node1

192.168.1.18  node2

tar -zxvf jdk-8u201-linux-x64.tar.gz;mv jdk1.8.0_201/ /opt/jdk

tar -zxvf hadoop-2.7.7.tar.gz;mv hadoop-2.7.7/ /opt/hadoop

vi ~/.bashrc

export JAVA_HOME=/opt/jdk

export HADOOP_HOME=/opt/hadoop

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

source ~/.bashrc

------------------------------------

Log in to 192.168.1.18

hostnamectl set-hostname node2;mkdir /opt/namenode;mkdir /opt/datanode

vi /etc/hosts

192.168.1.17  node1

192.168.1.18  node2

tar -zxvf jdk-8u201-linux-x64.tar.gz;mv jdk1.8.0_201/ /opt/jdk

tar -zxvf hadoop-2.7.7.tar.gz;mv hadoop-2.7.7/ /opt/hadoop

vi ~/.bashrc

export JAVA_HOME=/opt/jdk

export HADOOP_HOME=/opt/hadoop

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

source ~/.bashrc
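Both nodes get the same two /etc/hosts entries, so the manual vi step can be scripted. The following is a minimal sketch, not part of the original setup: add_host_entry is a hypothetical helper name, and it only appends a line when the host name is not already mapped, so it is safe to run more than once.

```shell
#!/usr/bin/env bash
# add_host_entry FILE IP NAME
# Hypothetical helper (not part of CentOS or Hadoop): append "IP  NAME" to FILE
# unless NAME already appears as a host name on a non-comment line.
add_host_entry() {
    local hosts_file="$1" ip="$2" name="$3"
    if awk -v n="$name" '
        /^[ \t]*#/ { next }                                    # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }   # field 1 is the IP
        END { exit !found }
    ' "$hosts_file" 2>/dev/null; then
        return 0    # already present, nothing to do
    fi
    printf '%s  %s\n' "$ip" "$name" >> "$hosts_file"
}
```

Run as root on each node, this would be called as add_host_entry /etc/hosts 192.168.1.17 node1 and add_host_entry /etc/hosts 192.168.1.18 node2.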

-----------------------------------

Edit the core-site.xml configuration

<configuration>

    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://192.168.1.17:9000</value>

    </property>

</configuration>

Edit the hdfs-site.xml configuration

<configuration>

    <property>

        <name>dfs.replication</name>

        <value>1</value>

    </property>

    <property>

        <name>dfs.namenode.name.dir</name>

        <value>file:///opt/namenode</value>

    </property>

    <property>

        <name>dfs.datanode.data.dir</name>

        <value>file:///opt/datanode</value>

    </property>

    <property>

        <name>dfs.encrypt.data.transfer</name>

        <value>true</value>

    </property>

    <property>

        <name>dfs.encrypt.data.transfer.algorithm</name>

        <value>3des</value>

    </property>

    <property>

        <name>dfs.encrypt.data.transfer.cipher.suites</name>

        <value>AES/CTR/NoPadding</value>

    </property>

    <property>

        <name>dfs.block.access.token.enable</name>

        <value>true</value>

    </property>

</configuration>
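After editing, it helps to confirm what a file actually contains before starting the daemons. The sketch below assumes the simple layout used above (one <name> followed by one <value> per property); get_hdfs_prop is a hypothetical helper, not a Hadoop command, and it does no real XML parsing.

```shell
#!/usr/bin/env bash
# get_hdfs_prop FILE NAME
# Hypothetical helper: print the <value> that follows <name>NAME</name> in FILE.
# Returns non-zero if the property is not present.
get_hdfs_prop() {
    local conf_file="$1" prop_name="$2"
    awk -v want="$prop_name" '
        /<name>/ {
            name = $0
            sub(/.*<name>[ \t]*/, "", name)
            sub(/[ \t]*<\/name>.*/, "", name)
        }
        /<value>/ {
            value = $0
            sub(/.*<value>[ \t]*/, "", value)
            sub(/[ \t]*<\/value>.*/, "", value)
            if (name == want) { print value; found = 1; exit }
        }
        END { exit !found }
    ' "$conf_file"
}
```

For example, get_hdfs_prop /opt/hadoop/etc/hadoop/hdfs-site.xml dfs.encrypt.data.transfer.algorithm should print 3des for the file above. On a node with Hadoop on the PATH, hdfs getconf -confKey dfs.encrypt.data.transfer reports the value as Hadoop itself resolves it, which is the more reliable check.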

-------------------------------

Log in to 192.168.1.17

hadoop-daemon.sh start namenode;hadoop-daemon.sh start datanode

Log in to 192.168.1.18

hadoop-daemon.sh start datanode
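hadoop-daemon.sh returns before the daemon has finished starting, so it is worth checking that the processes are really up. A small sketch, assuming the usual "PID ClassName" output of the JDK's jps tool; has_daemon is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# has_daemon NAME
# Hypothetical helper: read `jps` output on stdin and succeed only if a
# process whose class name is exactly NAME is listed.
has_daemon() {
    awk -v n="$1" '$2 == n { found = 1 } END { exit !found }'
}
```

On node1 this would be used as jps | has_daemon NameNode and jps | has_daemon DataNode, and on node2 as jps | has_daemon DataNode. hdfs dfsadmin -report additionally shows whether both DataNodes have registered with the NameNode.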

-------------------------------

Test

hadoop fs -mkdir  -p /user/linzw

hadoop fs -put anaconda-ks.cfg /user/linzw
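To confirm the upload round-trips through the encrypted transfer path, the file can be read back and compared with the local copy. This is a rough sketch: verify_hdfs_copy and the HADOOP_CMD variable are assumptions of this example (HADOOP_CMD only exists so the command can be replaced in a dry run), and a matching checksum shows that writing and reading work, not that the bytes on the wire are encrypted.

```shell
#!/usr/bin/env bash
# verify_hdfs_copy LOCAL_FILE HDFS_PATH
# Hypothetical helper: compare the POSIX cksum of a local file with the cksum
# of the same file streamed back from HDFS via `hadoop fs -cat`.
# HADOOP_CMD (an assumption of this sketch) overrides the hadoop executable.
verify_hdfs_copy() {
    local local_file="$1" hdfs_path="$2"
    local local_sum remote_sum
    local_sum=$(cksum < "$local_file")
    remote_sum=$(${HADOOP_CMD:-hadoop} fs -cat "$hdfs_path" | cksum)
    # Succeeds only when both checksum and byte count match.
    [ "$local_sum" = "$remote_sum" ]
}
```

For the file above: verify_hdfs_copy anaconda-ks.cfg /user/linzw/anaconda-ks.cfg && echo OK. To look at the traffic itself, capturing on the DataNode transfer port (for example tcpdump -i any -A port 50010) during the put should no longer show the file contents in clear text.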

Note:

If dfs.block.access.token.enable = true is not set, the write fails with the error below. Once the setting is added, data can be written normally.

20/10/28 21:29:12 INFO hdfs.DFSClient: Exception in createBlockOutputStream

java.io.IOException: Connection reset by peer

at sun.nio.ch.FileDispatcherImpl.read0(Native Method)

at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)

at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)

at sun.nio.ch.IOUtil.read(IOUtil.java:197)

at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)

at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)

at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)

at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)

at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)

at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)

at java.io.FilterInputStream.read(FilterInputStream.java:83)

at java.io.FilterInputStream.read(FilterInputStream.java:83)

at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2292)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1480)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1400)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:554)

20/10/28 21:29:12 INFO hdfs.DFSClient: Abandoning BP-304062021-127.0.0.1-1603891241969:blk_1073741825_1001

20/10/28 21:29:12 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[192.168.1.17:50010,DS-d431ec75-6a89-42b2-b782-7a2d224873d8,DISK]

20/10/28 21:29:12 INFO hdfs.DFSClient: Exception in createBlockOutputStream

java.io.IOException: Connection reset by peer

at sun.nio.ch.FileDispatcherImpl.read0(Native Method)

at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)

at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)

at sun.nio.ch.IOUtil.read(IOUtil.java:197)

at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)

at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)

at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)

at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)

at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)

at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)

at java.io.FilterInputStream.read(FilterInputStream.java:83)

at java.io.FilterInputStream.read(FilterInputStream.java:83)

at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2292)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1480)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1400)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:554)

20/10/28 21:29:12 INFO hdfs.DFSClient: Abandoning BP-304062021-127.0.0.1-1603891241969:blk_1073741826_1002

20/10/28 21:29:12 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[192.168.1.18:50010,DS-168fefa8-5ff0-4fcb-82be-ecd9f81c5861,DISK]

20/10/28 21:29:12 WARN hdfs.DFSClient: DataStreamer Exception

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/linzw/anaconda-ks.cfg._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 2 datanode(s) running and 2 node(s) are excluded in this operation.

at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1620)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3135)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3059)

at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)

at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:493)

at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)

at org.apache.hadoop.ipc.Client.call(Client.java:1476)

at org.apache.hadoop.ipc.Client.call(Client.java:1413)

at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)

at com.sun.proxy.$Proxy10.addBlock(Unknown Source)

at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)

at com.sun.proxy.$Proxy11.addBlock(Unknown Source)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1603)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1388)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:554)

put: File /user/linzw/anaconda-ks.cfg._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
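Since the failure above only shows up at write time, a quick check of hdfs-site.xml before starting the cluster can catch the missing setting earlier. The sketch below is illustrative only: check_transfer_encryption is a hypothetical helper, and it assumes the same one-<name>-then-one-<value> layout as the configuration shown earlier.

```shell
#!/usr/bin/env bash
# check_transfer_encryption FILE
# Hypothetical helper: fail if FILE sets dfs.encrypt.data.transfer to true
# without also setting dfs.block.access.token.enable to true.
check_transfer_encryption() {
    awk '
        /<name>/ {
            name = $0
            sub(/.*<name>[ \t]*/, "", name)
            sub(/[ \t]*<\/name>.*/, "", name)
        }
        /<value>/ {
            value = $0
            sub(/.*<value>[ \t]*/, "", value)
            sub(/[ \t]*<\/value>.*/, "", value)
            prop[name] = value          # remember each property by name
        }
        END {
            if (prop["dfs.encrypt.data.transfer"] == "true" &&
                prop["dfs.block.access.token.enable"] != "true") {
                print "dfs.encrypt.data.transfer is true but dfs.block.access.token.enable is not" > "/dev/stderr"
                exit 1
            }
        }
    ' "$1"
}
```

Usage: check_transfer_encryption /opt/hadoop/etc/hadoop/hdfs-site.xml || echo "fix hdfs-site.xml before starting HDFS".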
