Setting Up Hadoop HA
Views: 6,858
Published: 2019-06-26

This article is about 11,822 characters; expect roughly 39 minutes of reading time.

1. First, add the host entries

vim /etc/hosts
192.168.0.1  MSJTVL-DSJC-H01
192.168.0.2  MSJTVL-DSJC-H03
192.168.0.3  MSJTVL-DSJC-H05
192.168.0.4  MSJTVL-DSJC-H02
192.168.0.5  MSJTVL-DSJC-H04
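
The entries above can also be appended idempotently, so re-running the setup does not duplicate lines. A minimal hedged sketch (hypothetical; it writes to a local ./hosts copy so it can run without root):

```shell
# Append each cluster entry only if the hostname is not already present.
# ./hosts is a local stand-in for /etc/hosts so this runs without root.
touch ./hosts
while read -r ip name; do
  grep -q "[[:space:]]$name\$" ./hosts || printf '%s  %s\n' "$ip" "$name" >> ./hosts
done <<'EOF'
192.168.0.1 MSJTVL-DSJC-H01
192.168.0.4 MSJTVL-DSJC-H02
192.168.0.2 MSJTVL-DSJC-H03
192.168.0.5 MSJTVL-DSJC-H04
192.168.0.3 MSJTVL-DSJC-H05
EOF
```

Running the loop twice leaves the file unchanged, which matters when the same provisioning script is replayed on every node.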

2. Set up passwordless SSH trust between the machines

Setup passphraseless ssh
Now check that you can ssh to the localhost without a passphrase:
  $ ssh localhost
If you cannot ssh to localhost without a passphrase, execute the following commands:
  $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Copy the public key files from the other machines into the authorized_keys file on MSJTVL-DSJC-H01:

[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H02:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub2
[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H03:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub3
[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H04:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub4
[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H05:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub5
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub2 >> ~/.ssh/authorized_keys
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub3 >> ~/.ssh/authorized_keys
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub4 >> ~/.ssh/authorized_keys
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub5 >> ~/.ssh/authorized_keys

The steps above let MSJTVL-DSJC-H02 through H05 log in to MSJTVL-DSJC-H01 without a password.

To make all of MSJTVL-DSJC-H01 through H05 mutually trusted, copy the authorized_keys file from MSJTVL-DSJC-H01 back to the other machines:

[hadoop@MSJTVL-DSJC-H02 ~]$ scp hadoop@MSJTVL-DSJC-H01:/hadoop/.ssh/authorized_keys /hadoop/.ssh/authorized_keys
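
To fan the merged authorized_keys out to every node in one step, a loop like the following can generate the scp commands. This is a hypothetical helper, not part of the original setup; it only writes the commands to fanout.sh so they can be reviewed before being run (once passwordless SSH to H01 works):

```shell
# Generate one scp command per target host into fanout.sh.
# Review the file, then run it with: sh fanout.sh
for h in MSJTVL-DSJC-H02 MSJTVL-DSJC-H03 MSJTVL-DSJC-H04 MSJTVL-DSJC-H05; do
  echo "scp hadoop@MSJTVL-DSJC-H01:/hadoop/.ssh/authorized_keys hadoop@$h:/hadoop/.ssh/authorized_keys"
done > fanout.sh
```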

 

Download the tarball

wget http://apache.fayea.com/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

Extract the tarball and create a symlink to it

[hadoop@MSJTVL-DSJC-H01 ~]$ tar -zxvf hadoop-2.6.4.tar.gz
[hadoop@MSJTVL-DSJC-H01 ~]$ ln -sf hadoop-2.6.4 hadoop

Go to the Hadoop configuration directory and edit hadoop-env.sh

[hadoop@MSJTVL-DSJC-H01 ~]$ cd hadoop/etc/hadoop/
[hadoop@MSJTVL-DSJC-H01 hadoop]$ vim hadoop-env.sh

Set the JAVA_HOME variable in hadoop-env.sh.

Next, edit hdfs-site.xml; the settings follow http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

First, configure a logical nameservice with dfs.nameservices

[hadoop@MSJTVL-DSJC-H01 hadoop]$ vim hdfs-site.xml
<!-- The logical name of the nameservice; change it as needed -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<!-- The NameNode IDs under the nameservice; "mycluster" must match the value above, and nn1/nn2 are arbitrary labels -->
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<!-- RPC address and port of each NameNode; replace the nameservice and host names with your own (MSJTVL-DSJC-H01 and MSJTVL-DSJC-H02 are the two NameNode hosts) -->
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>MSJTVL-DSJC-H01:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>MSJTVL-DSJC-H02:8020</value>
</property>
<!-- HTTP address and port of each NameNode -->
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>MSJTVL-DSJC-H01:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>MSJTVL-DSJC-H02:50070</value>
</property>
<!-- The URI of the JournalNode group that stores the shared edit log -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://MSJTVL-DSJC-H03:8485;MSJTVL-DSJC-H04:8485;MSJTVL-DSJC-H05:8485/mycluster</value>
</property>
<!-- The class HDFS clients use to find the active NameNode (change the nameservice name in the property key) -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- sshfence: SSH to the active NameNode and kill the process; the private key is the one generated in the hadoop user's .ssh directory -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/hadoop/.ssh/id_dsa</value>
</property>
<!-- Working directory for the JournalNodes -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/hadoop/jn/data</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
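
sshfence only works if the private key named in dfs.ha.fencing.ssh.private-key-files exists and is readable only by its owner. A small hedged pre-flight sketch of that check (the ./id_dsa path is a local stand-in for /hadoop/.ssh/id_dsa so the example can run anywhere):

```shell
# Pre-flight check for the sshfence key: it must exist and be mode 600.
# We create a stand-in key file here; on the cluster, point KEY at the real path.
KEY=./id_dsa
touch "$KEY" && chmod 600 "$KEY"
if [ -r "$KEY" ] && [ "$(stat -c %a "$KEY")" = "600" ]; then
  echo "fencing key ok"
else
  echo "fencing key missing or permissions too open" >&2
fi
```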

Next, edit core-site.xml

<!-- The default filesystem entry point; the nameservice name must match hdfs-site.xml -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<!-- The ZooKeeper ensemble -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>MSJTVL-DSJC-H03:2181,MSJTVL-DSJC-H04:2181,MSJTVL-DSJC-H05:2181</value>
</property>
<!-- Hadoop's temporary directory -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/hadoop/tmp</value>
</property>

Configure the slaves file

MSJTVL-DSJC-H03
MSJTVL-DSJC-H04
MSJTVL-DSJC-H05

Install ZooKeeper

Simply extract the tarball.

Edit the configuration file

[zookeeper@MSJTVL-DSJC-H03 conf]$ vim zoo.cfg
# Change dataDir to /opt/zookeeper/data; do not leave it under tmp
dataDir=/opt/zookeeper/data
#autopurge.purgeInterval=1
server.1=MSJTVL-DSJC-H03:2888:3888
server.2=MSJTVL-DSJC-H04:2888:3888
server.3=MSJTVL-DSJC-H05:2888:3888

Under /opt/zookeeper/data, create a myid file containing the same number as that host's server.N entry.
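
The server.N-to-myid mapping is easy to get wrong. This sketch mimics the layout of all three hosts locally under ./zk to show the correspondence (on a real host the file is simply /opt/zookeeper/data/myid):

```shell
# server.1 = MSJTVL-DSJC-H03, server.2 = H04, server.3 = H05:
# each host's myid file holds its own N from zoo.cfg.
for n in 1 2 3; do
  host=MSJTVL-DSJC-H0$((n + 2))
  mkdir -p "./zk/$host/data"
  echo "$n" > "./zk/$host/data/myid"
done
```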

Start ZooKeeper (zkServer.sh start) and check with jps that it is running.

 

Start the HA cluster

1. First start the JournalNodes; go to the sbin directory

 ./hadoop-daemon.sh start journalnode

[hadoop@MSJTVL-DSJC-H03 sbin]$ ./hadoop-daemon.sh start journalnode
starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H03.out
[hadoop@MSJTVL-DSJC-H03 sbin]$ jps
3204 JournalNode
3252 Jps
[hadoop@MSJTVL-DSJC-H03 sbin]$

2. Format HDFS on one of the NameNodes

[hadoop@MSJTVL-DSJC-H01 bin]$ ./hdfs namenode -format

Formatting produces the metadata files under /hadoop/tmp/dfs/name/current:

[hadoop@MSJTVL-DSJC-H01 ~]$ cd tmp/
[hadoop@MSJTVL-DSJC-H01 tmp]$ ll
total 4
drwxr-xr-x. 3 hadoop hadoop 4096 Sep  6 16:54 dfs
[hadoop@MSJTVL-DSJC-H01 tmp]$ cd dfs/
[hadoop@MSJTVL-DSJC-H01 dfs]$ ll
total 4
drwxr-xr-x. 3 hadoop hadoop 4096 Sep  6 16:54 name
[hadoop@MSJTVL-DSJC-H01 dfs]$ cd name/
[hadoop@MSJTVL-DSJC-H01 name]$ ll
total 4
drwxr-xr-x. 2 hadoop hadoop 4096 Sep  6 16:54 current
[hadoop@MSJTVL-DSJC-H01 name]$ cd current/
[hadoop@MSJTVL-DSJC-H01 current]$ ll
total 16
-rw-r--r--. 1 hadoop hadoop 352 Sep  6 16:54 fsimage_0000000000000000000
-rw-r--r--. 1 hadoop hadoop  62 Sep  6 16:54 fsimage_0000000000000000000.md5
-rw-r--r--. 1 hadoop hadoop   2 Sep  6 16:54 seen_txid
-rw-r--r--. 1 hadoop hadoop 201 Sep  6 16:54 VERSION
[hadoop@MSJTVL-DSJC-H01 current]$ pwd
/hadoop/tmp/dfs/name/current
[hadoop@MSJTVL-DSJC-H01 current]$

3. Copy the formatted metadata to the other NameNode. Before copying, start the NameNode that was just formatted:

[hadoop@MSJTVL-DSJC-H01 sbin]$ ./hadoop-daemon.sh start namenode
starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H01.out
[hadoop@MSJTVL-DSJC-H01 sbin]$ jps
3324 NameNode
3396 Jps
[hadoop@MSJTVL-DSJC-H01 sbin]$

Then, on the NameNode that was not formatted, run hdfs namenode -bootstrapStandby; when it finishes, identical metadata files on both nodes indicate success.

[hadoop@MSJTVL-DSJC-H02 bin]$ hdfs namenode -bootstrapStandby
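
One way to confirm that the standby really copied the metadata is to compare checksums of the two current/ directories. A local sketch with stand-in directories (./nn1 and ./nn2 represent /hadoop/tmp/dfs/name/current on each NameNode; the file contents here are placeholders):

```shell
# Checksum every file in each metadata dir and compare the combined digests.
mkdir -p ./nn1 ./nn2
echo "fsimage-bytes" > ./nn1/fsimage_0000000000000000000
cp ./nn1/fsimage_0000000000000000000 ./nn2/
sum1=$(cd ./nn1 && md5sum -- * | md5sum)
sum2=$(cd ./nn2 && md5sum -- * | md5sum)
if [ "$sum1" = "$sum2" ]; then echo "metadata match"; else echo "metadata differ"; fi
```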


4. Initialize ZKFC: run hdfs zkfc -formatZK on any one machine.

5. Restart the whole HDFS cluster

[hadoop@MSJTVL-DSJC-H01 sbin]$ ./start-dfs.sh
16/09/06 17:10:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [MSJTVL-DSJC-H01 MSJTVL-DSJC-H02]
MSJTVL-DSJC-H02: starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H02.out
MSJTVL-DSJC-H01: starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H01.out
MSJTVL-DSJC-H03: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H03.out
MSJTVL-DSJC-H04: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H04.out
MSJTVL-DSJC-H05: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H05.out
Starting journal nodes [MSJTVL-DSJC-H03 MSJTVL-DSJC-H04 MSJTVL-DSJC-H05]
MSJTVL-DSJC-H03: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H03.out
MSJTVL-DSJC-H04: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H04.out
MSJTVL-DSJC-H05: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H05.out
16/09/06 17:10:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [MSJTVL-DSJC-H01 MSJTVL-DSJC-H02]
MSJTVL-DSJC-H02: starting zkfc, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-zkfc-MSJTVL-DSJC-H02.out
MSJTVL-DSJC-H01: starting zkfc, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-zkfc-MSJTVL-DSJC-H01.out
[hadoop@MSJTVL-DSJC-H01 sbin]$ jps
4345 Jps
4279 DFSZKFailoverController
3993 NameNode
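
After the restart, each NameNode host should show NameNode and DFSZKFailoverController in jps. A sketch of automating that check; to keep the example runnable it greps a captured sample (jps.out) rather than live jps output:

```shell
# Verify expected daemons appear in (a capture of) `jps` output.
# On a real node, replace the heredoc with: jps > jps.out
cat > jps.out <<'EOF'
4279 DFSZKFailoverController
3993 NameNode
EOF
: > check.out
for d in NameNode DFSZKFailoverController; do
  grep -q " $d\$" jps.out && echo "$d running" >> check.out
done
```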

6. Create a directory and upload a file

./hdfs dfs -mkdir -p /usr/file
./hdfs dfs -put /hadoop/tian.txt /usr/file

After uploading a file, you can view it in the NameNode web UI.

 

MapReduce high availability

Configure yarn-site.xml

<!-- Enable ResourceManager HA and name the RM cluster and its two members -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>rm-cluster</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<!-- Recover running applications after a failover, with state kept in ZooKeeper -->
<property>
  <name>yarn.resourcemanager.ha.automatic-failover.recover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>MSJTVL-DSJC-H01</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>MSJTVL-DSJC-H02</value>
</property>
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>MSJTVL-DSJC-H03:2181,MSJTVL-DSJC-H04:2181,MSJTVL-DSJC-H05:2181</value>
</property>
<!-- Per-ResourceManager service addresses -->
<property>
  <name>yarn.resourcemanager.scheduler.address.rm1</name>
  <value>MSJTVL-DSJC-H01:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm2</name>
  <value>MSJTVL-DSJC-H02:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
  <value>MSJTVL-DSJC-H01:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
  <value>MSJTVL-DSJC-H02:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.address.rm1</name>
  <value>MSJTVL-DSJC-H01:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.address.rm2</name>
  <value>MSJTVL-DSJC-H02:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm1</name>
  <value>MSJTVL-DSJC-H01:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm2</name>
  <value>MSJTVL-DSJC-H02:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>MSJTVL-DSJC-H01:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>MSJTVL-DSJC-H02:8088</value>
</property>

  

Configure mapred-site.xml

<!-- Run MapReduce on YARN -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>


The standby ResourceManager must be started manually:

[hadoop@MSJTVL-DSJC-H02 sbin]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /hadoop/hadoop-2.6.4/logs/yarn-hadoop-resourcemanager-MSJTVL-DSJC-H02.out
[hadoop@MSJTVL-DSJC-H02 sbin]$ jps
3000 ResourceManager
2812 NameNode
3055 Jps
2922 DFSZKFailoverController
[hadoop@MSJTVL-DSJC-H02 sbin]$


Reposted from: https://www.cnblogs.com/tian880820/p/5845613.html
