Quickly Deploying a Highly Available Hadoop Cluster

Introduction

  • Hadoop: an open-source Apache framework written in Java that allows distributed processing of large data sets across clusters of computers using a simple programming model.
  • Applications built on the Hadoop framework run in an environment that provides distributed storage and computation across clusters of machines.
  • Hadoop is designed to scale from a single server to thousands of machines, each offering local computation and storage.

Environment

Links

  • NTP service: (link)
  • JDK: (link)
  • Hadoop: (link)
  • JDK/Hadoop compatibility: (link)
  • Official tutorial: (link)

Versions

  • OS: CentOS 7.7, Ubuntu 16.04
  • JDK: 1.8
  • Hadoop: 3.2.1
  • ZooKeeper: 3.6.1

Architecture

  • The Hadoop framework consists of the following five modules:
    • Hadoop Common: common Java libraries and utilities that support the other Hadoop modules, including the Java files and scripts needed to start Hadoop.
    • Hadoop Distributed File System (HDFS): a distributed file system that provides high-throughput access to application data.
    • Hadoop YARN: a framework for job scheduling and cluster resource management.
    • Hadoop MapReduce: a YARN-based system for parallel processing of large data sets.
    • Hadoop Ozone: an object store for Hadoop.
  • The two core pieces of the Hadoop design:
    • HDFS: provides storage for massive data sets.
    • MapReduce: provides computation over massive data sets.
  • In Hadoop 1.x, the NameNode was the single point of failure of HDFS: each cluster had exactly one NameNode, and if that node went down or the service became unavailable, the whole cluster was unusable.
  • In Hadoop 2.x, a cluster may run two NameNodes as an active/standby pair; when the active node fails, the cluster can quickly fail over to the standby.
  • In Hadoop 3.x, a cluster may run more than two NameNodes in an active/standby arrangement, but because of the communication overhead no more than five are recommended (three is the suggested number).
  • At any given time only one NameNode is in the Active state; all the others remain in Standby.

Goals

  • Hadoop 3.x currently supports only JDK 8; JDK 11 support begins with Hadoop 3.3.
  • This article describes how to deploy and configure a highly available Hadoop cluster, from a handful of nodes up to very large clusters with thousands of nodes.

Nodes

Hostname            IP            Services
cluster-hadoop-01   172.18.20.3   NameNode, DataNode, ZooKeeper
cluster-hadoop-02   172.18.20.4   NameNode, DataNode, ZooKeeper
cluster-hadoop-03   172.18.20.5   DataNode, ZooKeeper

Deployment

Basics

CentOS

  • Disable SELinux:
setenforce 0
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
  • Disable the firewall:
systemctl stop firewalld.service
systemctl disable firewalld.service

Common

  • Set the hostname:
echo "cluster-hadoop-01" > /etc/hostname
  • Internal name resolution:
echo '127.0.0.1 localhost' > /etc/hosts
echo '172.18.20.3 cluster-hadoop-01' >> /etc/hosts
echo '172.18.20.4 cluster-hadoop-02' >> /etc/hosts
echo '172.18.20.5 cluster-hadoop-03' >> /etc/hosts
  • Shell prompt:
echo 'export PS1="\u@\[\e[1;93m\]\h\[\e[m\]:\w\\$\[\e[m\] "' >> /root/.bashrc
  • Passwordless SSH:
ssh-keygen -t rsa -P "" -C "Hadoop" -f '/root/.ssh/id_rsa' >/dev/null 2>&1
echo "" >> /root/.ssh/authorized_keys
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
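  • The commands above only authorize key-based login to the local node. A small sketch for pushing the public key to the other nodes and verifying it (assuming password authentication is still available at this point):
# Run on cluster-hadoop-01 (repeat from any node that needs passwordless access)
ssh-copy-id -i /root/.ssh/id_rsa.pub root@cluster-hadoop-02
ssh-copy-id -i /root/.ssh/id_rsa.pub root@cluster-hadoop-03

# Verify that no password prompt appears
ssh root@cluster-hadoop-02 hostname
ssh root@cluster-hadoop-03 hostname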

NTP

  • To keep time synchronized across the cluster, install an NTP service.

CentOS

yum install -y chrony
  • Edit the configuration file:
vim /etc/chrony.conf

Ubuntu

apt install -y chrony
  • Edit the configuration file:
vim /etc/chrony/chrony.conf

Common

  • File contents:
server 0.cn.pool.ntp.org iburst
server 1.cn.pool.ntp.org iburst
server 2.cn.pool.ntp.org iburst
server 3.cn.pool.ntp.org iburst
  • Set the time zone:
echo "Asia/Shanghai" > /etc/timezone
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
  • Restart the service:
# Ubuntu
service chrony restart

# CentOS
systemctl enable chronyd.service
systemctl start chronyd.service
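  • A quick sanity check that time synchronization is working (chronyc ships with the chrony package):
# List the configured NTP sources and their reachability
chronyc sources

# Show the current offset and overall tracking status
chronyc tracking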

Fuser

  • Cluster high availability: enabling automatic failover requires the fuser command (it is used by the sshfence fencing method).

CentOS

yum install -y psmisc

Ubuntu

apt install -y psmisc
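  • A simple check that the command is now available on every node:
command -v fuser || echo 'fuser not found - install psmisc'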

JDK

  • Following the JDK/Hadoop compatibility matrix, download the latest JDK binary from Oracle.
  • Refer to "Installing the JDK on Linux" to complete the JDK installation.
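  • The installation itself is not repeated here; a minimal sketch, assuming the archive is unpacked under /opt and symlinked to /opt/jdk (the path referenced later in hadoop-env.sh; the archive and directory names below are placeholders):
tar -zxf jdk-8uXXX-linux-x64.tar.gz -C /opt/   # replace with the actual archive name
ln -sf /opt/jdk1.8.0_XXX /opt/jdk              # replace with the actual directory name

echo '' >> /etc/profile
echo '# JDK Home' >> /etc/profile
echo 'export JAVA_HOME="/opt/jdk"' >> /etc/profile
echo 'export PATH="${PATH}:${JAVA_HOME}/bin"' >> /etc/profile
source /etc/profile

java -version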

ZooKeeper
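
  • The ZooKeeper deployment is not covered in detail here. A minimal sketch for ZooKeeper 3.6.1 on the three nodes, assuming it is unpacked to /opt/zookeeper and uses /data/zookeeper for its data; only the hostnames and client port 2181 are taken from the quorum configured below, everything else is illustrative:
tar -zxf apache-zookeeper-3.6.1-bin.tar.gz -C /opt/
ln -sf /opt/apache-zookeeper-3.6.1-bin /opt/zookeeper

cat > /opt/zookeeper/conf/zoo.cfg << 'EOF'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper
clientPort=2181
server.1=cluster-hadoop-01:2888:3888
server.2=cluster-hadoop-02:2888:3888
server.3=cluster-hadoop-03:2888:3888
EOF

# myid must be unique per node: 1 on cluster-hadoop-01, 2 on cluster-hadoop-02, 3 on cluster-hadoop-03
mkdir -p /data/zookeeper
echo 1 > /data/zookeeper/myid

# Start on every node, then check which node became the leader
/opt/zookeeper/bin/zkServer.sh start
/opt/zookeeper/bin/zkServer.sh status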

Hadoop

  • Following the JDK/Hadoop compatibility matrix, download the latest Hadoop binary from Apache.
  • Extract it to the target directory:
tar -zxf hadoop-3.2.1.tar.gz -C /opt/
  • Create a symlink:
ln -sf /opt/hadoop-3.2.1 /opt/hadoop
  • Environment variables:
echo '' >> /etc/profile
echo '# Hadoop Home' >> /etc/profile
echo 'export HADOOP_HOME="/opt/hadoop"' >> /etc/profile
echo 'export PATH="${PATH}:${HADOOP_HOME}/bin"' >> /etc/profile
source /etc/profile
  • Check the version information:
hadoop version
  • Fix the directory ownership:
chown -R root:root /opt/hadoop
chown -R root:root /opt/hadoop-3.2.1

HA Mode

  • Create the working directories:
# PID directory
mkdir -p /var/run/hadoop
echo 'd /var/run/hadoop 0770 root root' > /usr/lib/tmpfiles.d/hadoop.conf

# Log directory
mkdir -p /data/hadoop/log
echo 'd /data/hadoop/log 0770 root root' >> /usr/lib/tmpfiles.d/hadoop.conf

# Data directory
mkdir -p /data/hadoop/data
echo 'd /data/hadoop/data 0770 root root' >> /usr/lib/tmpfiles.d/hadoop.conf

# Temporary directory
mkdir -p /data/hadoop/tmp
echo 'd /data/hadoop/tmp 0770 root root' >> /usr/lib/tmpfiles.d/hadoop.conf
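  • The tmpfiles.d entry is normally applied at boot; to have systemd process it immediately as a sanity check (optional), run:
systemd-tmpfiles --create /usr/lib/tmpfiles.d/hadoop.conf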
  • Edit core-site.xml:
vim /opt/hadoop/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hdfs-cluster</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>40960</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/data/hadoop/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>cluster-hadoop-01:2181,cluster-hadoop-02:2181,cluster-hadoop-03:2181</value>
</property>
<property>
<name>ha.health-monitor.rpc-timeout.ms</name>
<value>180000</value>
</property>
</configuration>
  • Edit hdfs-site.xml:
vim /opt/hadoop/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>dfs.nameservices</name>
<value>hdfs-cluster</value>
</property>
<property>
<name>dfs.ha.namenodes.hdfs-cluster</name>
<value>nn-01,nn-02</value>
</property>

<property>
<name>dfs.namenode.rpc-address.hdfs-cluster.nn-01</name>
<value>cluster-hadoop-01:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hdfs-cluster.nn-01</name>
<value>cluster-hadoop-01:9870</value>
</property>

<property>
<name>dfs.namenode.rpc-address.hdfs-cluster.nn-02</name>
<value>cluster-hadoop-02:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hdfs-cluster.nn-02</name>
<value>cluster-hadoop-02:9870</value>
</property>

<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/data/hadoop/data/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data/hadoop/data/hdfs/data</value>
<final>true</final>
</property>

<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://cluster-hadoop-01:8485;cluster-hadoop-02:8485;cluster-hadoop-03:8485/hdfs-cluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/hadoop/tmp/journalnode</value>
</property>

<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.hdfs-cluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence(root:22)</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
  • Edit mapred-site.xml:
vim /opt/hadoop/etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>${HADOOP_MAPRED_HOME}/share/hadoop/mapreduce/*:${HADOOP_MAPRED_HOME}/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
  • Edit yarn-site.xml:
vim /opt/hadoop/etc/hadoop/yarn-site.xml
<?xml version="1.0"?>

<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>rm-cluster</value>
</property>

<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm-01,rm-02</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.rm-01</name>
<value>cluster-hadoop-01</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm-01</name>
<value>cluster-hadoop-01:8088</value>
</property>

<property>
<name>yarn.resourcemanager.hostname.rm-02</name>
<value>cluster-hadoop-02</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm-02</name>
<value>cluster-hadoop-02:8088</value>
</property>

<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.resourcemanager.zk-address</name>
<value>cluster-hadoop-01:2181,cluster-hadoop-02.12:2181,cluster-hadoop-03:2181</value>
</property>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
  • Edit workers:
vim /opt/hadoop/etc/hadoop/workers
cluster-hadoop-01
cluster-hadoop-02
cluster-hadoop-03
  • Configure hadoop-env.sh:
echo 'export JAVA_HOME="/opt/jdk"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo 'export HDFS_ZKFC_USER="root"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo 'export HDFS_NAMENODE_USER="root"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo 'export HDFS_DATANODE_USER="root"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo 'export HDFS_JOURNALNODE_USER="root"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo 'export HDFS_SECONDARYNAMENODE_USER="root"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo 'export YARN_RESOURCEMANAGER_USER="root"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo 'export YARN_NODEMANAGER_USER="root"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo 'export HADOOP_PID_DIR="/var/run/hadoop"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo 'export HADOOP_LOG_DIR="/data/hadoop/log"' >> /opt/hadoop/etc/hadoop/hadoop-env.sh
  • Sync to the other nodes:
rsync -azPS --delete /opt/hadoop-3.2.1 cluster-hadoop-02:/opt/
rsync -azPS --delete /opt/hadoop-3.2.1 cluster-hadoop-03:/opt/
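  • The working directories and the tmpfiles.d entry created earlier exist only on this node so far; a small sketch for replicating them to the other nodes (assuming the same layout on every node):
for node in cluster-hadoop-02 cluster-hadoop-03; do
    ssh "${node}" "mkdir -p /var/run/hadoop /data/hadoop/log /data/hadoop/data /data/hadoop/tmp"
    rsync -azPS /usr/lib/tmpfiles.d/hadoop.conf "${node}":/usr/lib/tmpfiles.d/
done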
  • Start the JournalNode daemon on all nodes:
# cluster-hadoop-01
hdfs --workers --daemon start journalnode
  • Format the file system:
# cluster-hadoop-01
hdfs namenode -format
  • Initialize the HA state in ZooKeeper:
# cluster-hadoop-01
hdfs zkfc -formatZK -force
  • Start the NameNode daemon:
# cluster-hadoop-01
hdfs --daemon start namenode
  • Sync the NameNode metadata to the standby:
# cluster-hadoop-02
hdfs namenode -bootstrapStandby -force
  • Start the NameNode daemon:
# cluster-hadoop-02
hdfs --daemon start namenode
  • Start the ZKFailoverController daemon on all NameNode hosts:
# cluster-hadoop-01
hdfs --workers --hostnames "cluster-hadoop-01 cluster-hadoop-02" --daemon start zkfc
  • Start the DataNode daemon on all nodes:
# cluster-hadoop-01
hdfs --workers --daemon start datanode
  • Start the NodeManager and ResourceManager daemons on the NameNode hosts:
# cluster-hadoop-01
yarn --workers --hostnames "cluster-hadoop-01 cluster-hadoop-02" --daemon start nodemanager
yarn --workers --hostnames "cluster-hadoop-01 cluster-hadoop-02" --daemon start resourcemanager
  • Check the process status:
root@cluster-hadoop-01:~# jps
6368 NameNode
7458 DataNode
9010 NodeManager
9347 Jps
9175 ResourceManager
7048 DFSZKFailoverController
5101 JournalNode
  • Get the state of the NameNodes and ResourceManagers:
# Namenode
hdfs haadmin -getServiceState nn-01
hdfs haadmin -getServiceState nn-02

# ResourceManager
yarn rmadmin -getServiceState rm-01
yarn rmadmin -getServiceState rm-02
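  • Example output (which node ends up active depends on which ZKFC wins the election first):
root@cluster-hadoop-01:~# hdfs haadmin -getServiceState nn-01
active
root@cluster-hadoop-01:~# hdfs haadmin -getServiceState nn-02
standby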

Maintenance

  • Once Hadoop has started successfully for the first time, there is no need to format again; simply use the scripts to manage all the services in bulk.

Starting Services

  • Start HDFS:
/opt/hadoop/sbin/start-dfs.sh
  • Start YARN:
/opt/hadoop/sbin/start-yarn.sh
  • Start all services:
/opt/hadoop/sbin/start-all.sh

Stopping Services

  • Stop HDFS:
/opt/hadoop/sbin/stop-dfs.sh
  • Stop YARN:
/opt/hadoop/sbin/stop-yarn.sh
  • Stop all services:
/opt/hadoop/sbin/stop-all.sh

Testing

HA

  • Before testing, confirm the state of the NameNodes and ResourceManagers in the cluster.
  • Get the state of the NameNodes and ResourceManagers:
# Namenode
hdfs haadmin -getServiceState nn-01
hdfs haadmin -getServiceState nn-02

# ResourceManager
yarn rmadmin -getServiceState rm-01
yarn rmadmin -getServiceState rm-02
  • In the cluster's initial state, nn-01 is the active node and nn-02 is the standby.
  • Fail over from the active node to the standby:
    • kill the NameNode on the active node to simulate a JVM crash, then check whether the standby NameNode becomes active.
    • kill the ResourceManager on the active node to simulate a JVM crash, then check whether the standby ResourceManager becomes active.
root@cluster-hadoop-01:~# jps
996 JournalNode
3174 Jps
15718 ResourceManager
534 NameNode
698 DataNode
15498 NodeManager
1262 DFSZKFailoverController

kill -9 534
kill -9 15718
# Check whether the state has become active
hdfs haadmin -getServiceState nn-02

# Check whether the state has become active
yarn rmadmin -getServiceState rm-02
  • At this point the cluster has completed one failover: nn-01 is now the standby node and nn-02 is the active node.
  • Fail back from the standby to the original active node:
    • First start the NameNode and ResourceManager daemons that were killed on the former active node.
    • kill the NameNode on the current active node to simulate a JVM crash, then check whether the standby NameNode becomes active.
    • kill the ResourceManager on the current active node to simulate a JVM crash, then check whether the standby ResourceManager becomes active.
hdfs --daemon start namenode
yarn --daemon start resourcemanager
1
2
3
4
5
6
7
8
9
10
11
root@cluster-hadoop-02:~# jps
27057 JournalNode
30358 ResourceManager
27898 NameNode
30843 Jps
27324 DFSZKFailoverController
26796 DataNode
28943 NodeManager

kill -9 27898
kill -9 30358

Web

  • Access the NameNode web UI:
    • URL: http://<External IP>:9870
    • Port: the default is 50070 in Hadoop 2.x and 9870 in Hadoop 3.x.
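  • Without a browser, the NameNode's HA state can also be read over HTTP (a quick sketch, assuming the standard NameNodeStatus JMX bean is exposed on the same port):
curl -s 'http://cluster-hadoop-01:9870/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'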

Troubleshooting

Problem

  • Cluster anomaly: both NameNodes are in the Standby state.

Resolution

  • Get the state of the NameNodes in the cluster:
hdfs haadmin -getServiceState nn-01
hdfs haadmin -getServiceState nn-02
  • Manually force one NameNode to become active:
hdfs haadmin -transitionToActive --forcemanual nn-01
  • Fetching the state again showed the switch had succeeded, but after restarting HDFS both NameNodes were Standby again.
  • Initial assessment: the election file created when ZooKeeper was initialized had been lost, so no NameNode could be elected active.
  • Tried re-initializing the ZK state from one of the NameNodes:
hdfs zkfc -formatZK -force
  • The state was still Standby and no obvious problem was found.
  • Checked the ZKFC log:
vim /data/hadoop/log/hadoop-hadoop-zkfc-cluster-hadoop-01.log
# When starting HDFS, a ZKFC error appears
cluster-hadoop-01: ERROR: Cannot set priority of zkfc process 21355

# Log contents
ERROR org.apache.hadoop.ha.ZKFailoverController: Problem binding to [cluster-hadoop-01:8019] java.net.BindException: Address already in use
  • Stopped HDFS and checked the port:
# cluster-hadoop-01
/opt/hadoop/sbin/stop-dfs.sh

# cluster-hadoop-01, cluster-hadoop-02
netstat -nlput | grep 8019
  • The port turned out to still be occupied; forcibly killed the process:
# cluster-hadoop-01, cluster-hadoop-02
kill -9 13165
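  • As an alternative to netstat plus kill, the fuser command installed earlier can identify or terminate whatever holds the port directly, for example:
# Show which process is holding TCP port 8019 (the ZKFC port from the error above)
fuser -v -n tcp 8019

# Or kill it in one step
fuser -k -n tcp 8019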
  • Restarted the HDFS service:
# cluster-hadoop-01
/opt/hadoop/sbin/start-dfs.sh
  • With that, the NameNode states were back to normal: one active and one standby.

Problem

  • After killing a service, jps shows -- process information unavailable.

Resolution

  • Switch to the user that started the process and run jps again; this resolves the issue.

Problem

  • While testing high availability, after killing the NameNode daemon on the original standby node (now the active one), no automatic failover occurred and the original active node stayed in standby.

Resolution

  • Check the log on the original active node:
vim /data/hadoop/log/hadoop-hadoop-namenode-cluster-hadoop-01.log
# Within a fixed interval, the log shows 10 attempts to connect to <cluster-hadoop-02/172.18.20.4:8020>
2020-04-30 21:09:36,776 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: cluster-hadoop-02/172.18.20.4:8020. Already tried 0 time(s)
2020-04-30 21:09:37,777 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: cluster-hadoop-02/172.18.20.4:8020. Already tried 1 time(s)
...
2020-04-30 21:09:45,786 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: cluster-hadoop-02/172.18.20.4:8020. Already tried 9 time(s)
2020-04-30 21:09:45,788 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Exception from remote name node RemoteNameNodeInfo [nnId=nn-02, ipcAddress=cluster-hadoop-02/172.18.20.4:8020, httpAddress=http://cluster-hadoop-02:9870], try next.
java.net.ConnectException: Call From cluster-hadoop-01/172.18.20.3 to cluster-hadoop-02:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
vim /data/hadoop/log/hadoop-hadoop-zkfc-cluster-hadoop-01.log
2020-04-30 21:09:23,784 WARN org.apache.hadoop.ha.SshFenceByTcpPort: PATH=$PATH:/sbin:/usr/sbin fuser -v -k -n tcp 8020 via ssh: bash: fuser: command not found
  • Root cause found: the fuser command was missing; its installation has been added to the deployment steps above.
