# Hadoop Distributed Cluster Installation

## Prerequisites

The following preparation is needed before installing Hadoop.

**Edit the hosts file.** Append each node's IP address and corresponding hostname to the end of `/etc/hosts` on every machine:

```
vim /etc/hosts
```

```
192.168.2.8 master-8
192.168.2.5 slave-5
192.168.2.9 slave-9
```
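Once the entries are in place, a quick loop can confirm that every hostname actually resolves (a sketch using `getent`, which queries the same resolver order as most tools; the hostnames are the ones from the table above):

```shell
# Sanity check after editing /etc/hosts: confirm each cluster hostname
# resolves; any names that do not resolve are reported at the end.
missing=""
for h in master-8 slave-5 slave-9; do
  getent hosts "$h" > /dev/null || missing="$missing $h"
done
echo "unresolved hostnames:${missing:- none}"
```

Run this on every node; an unresolved name here will surface later as daemons failing to contact each other.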
**Passwordless SSH between all cluster machines**

- Install SSH and generate a key pair on each node ([tutorial](//www.greatytc.com/p/c3c87697d93c)).
- Send each slave node's public key to the master node (run the first command on slave-5 and the second on slave-9):

  ```
  scp ~/.ssh/id_rsa.pub root@master-8:~/.ssh/id_rsa.pub.slave-5
  scp ~/.ssh/id_rsa.pub root@master-8:~/.ssh/id_rsa.pub.slave-9
  ```

- On the master, append the received slave keys together with the master's own key to the authentication file `authorized_keys`:

  ```
  cat ~/.ssh/id_rsa.pub* >> ~/.ssh/authorized_keys
  ```

- Distribute the authentication file to each slave node:

  ```
  scp ~/.ssh/authorized_keys root@slave-5:~/.ssh/
  scp ~/.ssh/authorized_keys root@slave-9:~/.ssh/
  ```

- Verify that every machine can now be reached without a password:

  ```
  ssh master-8
  ssh slave-5
  ssh slave-9
  ```
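The key-generation step linked above boils down to something like the following sketch, run on every node before the `scp` steps. `KEY` is a parameter introduced here so the command can be tried against a non-default path; the existing-file guard keeps it idempotent:

```shell
# Generate an RSA key pair non-interactively (-q quiet, -N '' empty
# passphrase, -f output path). Skips generation if a key already exists.
KEY=${KEY:-"$HOME/.ssh/id_rsa"}
mkdir -p "$(dirname "$KEY")"
chmod 700 "$(dirname "$KEY")"
[ -f "$KEY" ] || ssh-keygen -q -t rsa -N '' -f "$KEY"
ls -l "$KEY" "$KEY.pub"
```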
**Install Java.** Install a JDK on every node (see a JDK installation tutorial); this guide assumes `jdk1.8.0_111`.

## Download

- Official download address: the Apache Hadoop releases page
- Version used in this article: `hadoop-2.7.3.tar.gz`

```
wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
```

## Installation

Unpack the archive into `/usr/local`:

```
tar zxvf hadoop-2.7.3.tar.gz -C /usr/local
```
## Configuration

Change into the Hadoop configuration directory:

```
cd /usr/local/hadoop-2.7.3/etc/hadoop
```

- Set `JAVA_HOME` in both `hadoop-env.sh` and `yarn-env.sh`:

  ```
  export JAVA_HOME=/usr/local/jdk1.8.0_111
  ```

- Add the slaves' IPs or hostnames to the `slaves` file:

  ```
  slave-5
  slave-9
  ```
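Although not part of the original steps, many setups also export `HADOOP_HOME` and extend `PATH` so the `hadoop`/`hdfs` commands work from any directory. A sketch, assuming the JDK and Hadoop locations used in this guide:

```shell
# Optional convenience: append these lines to /etc/profile (or ~/.bashrc)
# on every node. Paths assume the install locations used in this guide.
export JAVA_HOME=/usr/local/jdk1.8.0_111
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```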
- Configure `core-site.xml`:

  ```xml
  <configuration>
      <property>
          <name>fs.defaultFS</name>
          <value>hdfs://master-8:9000/</value>
      </property>
      <property>
          <name>hadoop.tmp.dir</name>
          <value>file:/usr/local/hadoop-2.7.3/tmp</value>
      </property>
  </configuration>
  ```
- Configure `hdfs-site.xml` (replication is set to 2 because the cluster has two DataNodes):

  ```xml
  <configuration>
      <property>
          <name>dfs.namenode.secondary.http-address</name>
          <value>master-8:9001</value>
      </property>
      <property>
          <name>dfs.namenode.name.dir</name>
          <value>file:/usr/local/hadoop-2.7.3/dfs/name</value>
      </property>
      <property>
          <name>dfs.datanode.data.dir</name>
          <value>file:/usr/local/hadoop-2.7.3/dfs/data</value>
      </property>
      <property>
          <name>dfs.replication</name>
          <value>2</value>
      </property>
  </configuration>
  ```
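The paths named in `hadoop.tmp.dir`, `dfs.namenode.name.dir`, and `dfs.datanode.data.dir` are not always created automatically, so it is worth making them up front on each node. A sketch, where `BASE` is the install prefix used throughout this guide:

```shell
# Create the directories referenced by core-site.xml and hdfs-site.xml.
# BASE matches the install prefix used throughout this guide.
BASE=${BASE:-/usr/local/hadoop-2.7.3}
mkdir -p "$BASE/tmp" "$BASE/dfs/name" "$BASE/dfs/data"
ls -d "$BASE/tmp" "$BASE/dfs/name" "$BASE/dfs/data"
```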
- Configure `mapred-site.xml`. The file does not exist by default, so create it from the template first:

  ```
  sudo cp mapred-site.xml.template mapred-site.xml
  ```

  ```xml
  <configuration>
      <property>
          <name>mapreduce.framework.name</name>
          <value>yarn</value>
      </property>
  </configuration>
  ```
- Configure `yarn-site.xml`:

  ```xml
  <configuration>
      <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
      </property>
      <property>
          <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
          <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
          <name>yarn.resourcemanager.address</name>
          <value>master-8:8032</value>
      </property>
      <property>
          <name>yarn.resourcemanager.scheduler.address</name>
          <value>master-8:8030</value>
      </property>
      <property>
          <name>yarn.resourcemanager.resource-tracker.address</name>
          <value>master-8:8035</value>
      </property>
      <property>
          <name>yarn.resourcemanager.admin.address</name>
          <value>master-8:8033</value>
      </property>
      <property>
          <name>yarn.resourcemanager.webapp.address</name>
          <value>master-8:8088</value>
      </property>
  </configuration>
  ```
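Before distributing the configuration, an optional sanity check is to confirm the four edited files are well-formed XML (`xmllint` ships with libxml2). `CONF_DIR` is the config path from this guide; absent files are only reported, so the loop can be tried anywhere:

```shell
# Check each edited *-site.xml for XML well-formedness.
CONF_DIR=${CONF_DIR:-/usr/local/hadoop-2.7.3/etc/hadoop}
status=""
for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
  if [ -f "$CONF_DIR/$f" ] && xmllint --noout "$CONF_DIR/$f"; then
    status="$status $f:ok"
  else
    status="$status $f:missing-or-invalid"
  fi
done
echo "$status"
```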
## Distribution

Copy the configured installation from `/usr/local` on the master to every slave:

```
scp -r hadoop-2.7.3/ root@slave-5:/usr/local
scp -r hadoop-2.7.3/ root@slave-9:/usr/local
```
## Startup

Format the NameNode (first start only; reformatting an existing cluster destroys HDFS metadata), then start HDFS and YARN from the master:

```
cd /usr/local/hadoop-2.7.3
bin/hdfs namenode -format    # "bin/hadoop namenode -format" still works but is deprecated in 2.x
sbin/start-dfs.sh
sbin/start-yarn.sh
```
## Verification

Open the YARN ResourceManager web UI at http://master-8:8088 (http://localhost:8088 when browsing from the master itself).
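Besides the web UI, the running daemons can be checked on each node with `jps`, which ships with the JDK (a sketch; the fallback message covers the case where the JDK's `bin` directory is not on `PATH`):

```shell
# List the running Java daemons on this node. Expect NameNode,
# SecondaryNameNode, and ResourceManager on master-8; DataNode and
# NodeManager on each slave.
if command -v jps >/dev/null 2>&1; then
  jps
else
  echo "jps not on PATH; try \$JAVA_HOME/bin/jps"
fi
```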