Environment
Servers (virtual machines):
- vm-master 10.211.55.23
- vm-slave1 10.211.55.25
- vm-slave2 10.211.55.24
Software:
- Hadoop 2.7
- JDK 1.8
- Ubuntu 14.04
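Step 1: Create the account and grant privileges
As root, create the hadoop user and set its password to 111111
adduser hadoop
Enter password: 111111
Confirm password: 111111
Press Enter to accept the defaults for the remaining prompts.
As root, grant the hadoop user sudo privileges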
vim /etc/sudoers
Below the line "root ALL=(ALL:ALL) ALL", add
hadoop ALL=(ALL:ALL) ALL
The result looks like this:
# User privilege specification
root ALL=(ALL:ALL) ALL
hadoop ALL=(ALL:ALL) ALL
Note: /etc/sudoers is read-only, so save and quit with wq!
Step 2: Edit the hosts file
vim /etc/hosts
10.211.55.23 vm-master
10.211.55.25 vm-slave1
10.211.55.24 vm-slave2
Note: run ifconfig on each machine to find its IP address.
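The same three entries have to be present in /etc/hosts on every machine. As a minimal sketch, the helper below appends a mapping only when the hostname is not already listed, so it can be re-run safely. The name add_host_entry is hypothetical, not a standard tool.

```shell
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless a non-comment line already lists NAME.
# (Hypothetical helper for illustration.)
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # Look for NAME in any hostname column (field 2 onward) of active lines.
    if [ -f "$file" ] && awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

Run as root, for example: add_host_entry /etc/hosts 10.211.55.23 vm-master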
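Step 3: Install SSH and set up passwordless login
If the SSH service is not installed, install it with: apt-get install ssh
As the hadoop user, generate a public/private key pair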
su hadoop
cd ~
ssh-keygen -t rsa -P ""
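Press Enter at every prompt.
When it finishes, id_rsa (private key) and id_rsa.pub (public key) are generated in the /home/hadoop/.ssh directory.
Append the public key to authorized_keys (this file holds the public keys of the clients allowed to log in to this machine over SSH as the current user)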
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Restrict the permissions on this file
chmod 600 ~/.ssh/authorized_keys
Edit the configuration file
sudo vim /etc/ssh/sshd_config, and uncomment the following lines:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
Restart the SSH service
sudo service ssh restart
As the hadoop user, test passwordless login to localhost
ssh hadoop@localhost (type yes at the host confirmation prompt)
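If the login still asks for a password, overly permissive modes on the key files are one common cause, because sshd in its default StrictModes setting ignores key files that other users could modify. The sketch below checks for the conventional modes (700 on the directory, 600 on authorized_keys). check_ssh_perms is a hypothetical helper, and it assumes GNU stat as shipped with Ubuntu.

```shell
# check_ssh_perms [DIR]
# Warn when DIR (default: ~/.ssh) or its authorized_keys file does not have
# the conventional modes. Assumes GNU coreutils `stat -c %a`.
check_ssh_perms() {
    local dir="${1:-$HOME/.ssh}" rc=0
    if [ "$(stat -c %a "$dir")" != "700" ]; then
        echo "WARN: $dir should be mode 700"
        rc=1
    fi
    if [ "$(stat -c %a "$dir/authorized_keys")" != "600" ]; then
        echo "WARN: $dir/authorized_keys should be mode 600"
        rc=1
    fi
    return "$rc"
}
```

Calling check_ssh_perms with no argument inspects the current user's ~/.ssh directory.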
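Step 4: Install the JDK
Update the package index
sudo apt-get update
Install the prerequisite package
sudo apt-get install software-properties-common
Add the PPA
sudo add-apt-repository ppa:webupd8team/java
Update the package index again
sudo apt-get update
Install the JDK
sudo apt-get install oracle-java8-installer
Check that the installation succeeded: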
java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
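Note: if this installation is too slow, you can also download the JDK from http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html and install it manually.
Step 5: Set up vm-slave1 and vm-slave2
Repeat Steps 1 through 4 on each of them.
Step 6: Allow vm-master to log in to vm-slave1 and vm-slave2 without a password
On vm-master: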
su hadoop
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@vm-slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@vm-slave2
Verify the setup
On vm-master, switch to the hadoop user
Run ssh hadoop@vm-slave1 and check that it logs in without asking for a password
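The manual check can be repeated for every slave in one pass. As a sketch, the hypothetical helper below uses the ssh option BatchMode=yes, which makes ssh fail instead of prompting for a password, so any host where key-based login is not working shows up as FAIL.

```shell
# check_passwordless HOST...
# Try a non-interactive login as hadoop on each host and report the result.
# (Hypothetical helper; the user name hadoop follows the steps above.)
check_passwordless() {
    local failed=0 host
    for host in "$@"; do
        # BatchMode=yes disables password prompts; `true` is a no-op command.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "hadoop@$host" true; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

On vm-master, as the hadoop user: check_passwordless vm-slave1 vm-slave2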
Step 7: Download and install Hadoop
wget http://statics.charlesjiang.com/hadoop-2.7.3.tar.gz
tar zxvf hadoop-2.7.3.tar.gz
sudo mv hadoop-2.7.3 /usr/local/hadoop
Change the owner of the hadoop directory to hadoop
sudo chown -R hadoop:hadoop /usr/local/hadoop
Step 8: Configure Hadoop
Configuration files involved:
/usr/local/hadoop/etc/hadoop/slaves
/usr/local/hadoop/etc/hadoop/core-site.xml
/usr/local/hadoop/etc/hadoop/hdfs-site.xml
/usr/local/hadoop/etc/hadoop/mapred-site.xml
/usr/local/hadoop/etc/hadoop/yarn-site.xml
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
- Edit core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://vm-master:9000</value>
</property>
</configuration>
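Note: the value here must use the master's hostname, not localhost, because the other nodes read it to locate the NameNode
- Edit mapred-site.xml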
cp mapred-site.xml.template ./mapred-site.xml
vim mapred-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://vm-master:9000</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://vm-master:9001</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
- Edit hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/usr/local/hadoop/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/local/hadoop/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
- Edit yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>vm-master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>vm-master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>vm-master:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>vm-master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>vm-master:8033</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
- Edit slaves
vm-master
vm-slave1
vm-slave2
- Edit hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
Note: use the absolute path of the JDK here
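Since every node needs identical configuration files, generating them from a script can help avoid copy-and-paste mistakes. The sketch below writes the core-site.xml shown above; hadoop_property and write_core_site are hypothetical helper names, and the property name and port are the ones used in this step.

```shell
# hadoop_property NAME VALUE
# Print one <property> element in Hadoop's configuration format.
hadoop_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# write_core_site OUTPUT_FILE MASTER_HOSTNAME
# Write a core-site.xml that points the default filesystem at the master.
write_core_site() {
    local out="$1" master="$2"
    {
        echo '<?xml version="1.0"?>'
        echo '<configuration>'
        hadoop_property fs.default.name "hdfs://$master:9000"
        echo '</configuration>'
    } > "$out"
}
```

For example: write_core_site /usr/local/hadoop/etc/hadoop/core-site.xml vm-master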
Step 9: Configure environment variables for the hadoop user
su hadoop
vim /home/hadoop/.bash_profile
The configuration is as follows:
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export SCALA_HOME=/usr/local/scala
export SPARK_HOME=/usr/local/spark
JAVA_HOME=/usr/lib/jvm/java-8-oracle
JRE_HOME=/usr/lib/jvm/java-8-oracle/jre
CLASSPATH=.:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin:$PATH
export JAVA_HOME CLASSPATH PATH USER LOGNAME MAIL HOSTNAME
Apply the same configuration on vm-slave1 and vm-slave2
Load the profile
source /home/hadoop/.bash_profile
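After sourcing the profile, it is worth confirming that the key variables are set and point at real directories. A minimal sketch follows; check_dir_var and check_hadoop_env are hypothetical helpers, and the three variables checked are a selection from the profile above.

```shell
# check_dir_var NAME VALUE
# Report a problem when VALUE is empty or is not an existing directory.
check_dir_var() {
    if [ -z "$2" ]; then
        echo "unset: $1"
        return 1
    fi
    if [ ! -d "$2" ]; then
        echo "not a directory: $1=$2"
        return 1
    fi
    return 0
}

# check_hadoop_env
# Check the variables that the later steps depend on.
check_hadoop_env() {
    local rc=0
    check_dir_var JAVA_HOME "${JAVA_HOME:-}" || rc=1
    check_dir_var HADOOP_HOME "${HADOOP_HOME:-}" || rc=1
    check_dir_var HADOOP_CONF_DIR "${HADOOP_CONF_DIR:-}" || rc=1
    return "$rc"
}
```

Run check_hadoop_env on each node; it prints nothing and returns 0 when all three variables are usable.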
Step 10: Copy Hadoop from vm-master to vm-slave1 and vm-slave2
scp -r /usr/local/hadoop/ hadoop@vm-slave1:/home/hadoop
scp -r /usr/local/hadoop/ hadoop@vm-slave2:/home/hadoop
On vm-slave1 and vm-slave2, move the hadoop directory to /usr/local/hadoop
sudo mv hadoop /usr/local/hadoop
Then, on vm-slave1 and vm-slave2, change the owner of /usr/local/hadoop to hadoop
sudo chown -R hadoop:hadoop /usr/local/hadoop/
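With more than two slaves, typing one scp line per host gets tedious. As a sketch, the hypothetical helper below reads the slaves file from Step 8 and prints the scp command for every host except the machine it runs on, so the list can be reviewed before anything is copied.

```shell
# print_copy_commands SLAVES_FILE SELF
# Print one scp command per host listed in SLAVES_FILE, skipping blank lines,
# comment lines, and the host named SELF. Nothing is executed.
print_copy_commands() {
    local slaves_file="$1" self="$2" host
    while IFS= read -r host || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*) continue ;;
        esac
        if [ "$host" = "$self" ]; then
            continue
        fi
        echo "scp -r /usr/local/hadoop/ hadoop@$host:/home/hadoop"
    done < "$slaves_file"
}
```

For example: print_copy_commands /usr/local/hadoop/etc/hadoop/slaves vm-master. Once the printed commands look right, they can be piped to sh to run them.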
Step 11: Format HDFS
On vm-master:
cd /usr/local/hadoop
./bin/hdfs namenode -format
Output like the following means the format succeeded:
17/03/29 15:14:36 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/03/29 15:14:36 INFO util.ExitUtil: Exiting with status 0
17/03/29 15:14:36 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at vm-master.localdomain/127.0.1.1
************************************************************/
Step 12: Start Hadoop (HDFS)
On vm-master:
sbin/start-dfs.sh
After it starts, run jps
Output like the following means it succeeded:
2403 DataNode
3188 Jps
3079 SecondaryNameNode
2269 NameNode
Note: if start-dfs.sh asks you to confirm an SSH host key, type yes
Step 13: Start YARN
On vm-master:
sbin/start-yarn.sh
After it starts, run jps
Output like the following means it succeeded:
3667 Jps
2403 DataNode
3237 ResourceManager
3079 SecondaryNameNode
2269 NameNode
3391 NodeManager
Then run jps on vm-slave1 or vm-slave2
It should show something like:
2777 Jps
2505 DataNode
2654 NodeManager
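Reading jps listings by eye is error-prone, especially since NameNode is a substring of SecondaryNameNode. The sketch below takes jps output on standard input and reports which of the expected daemons are absent, matching the process name exactly. missing_daemons is a hypothetical helper.

```shell
# jps | missing_daemons NAME...
# Read jps output on stdin and print "missing: NAME" for every expected
# daemon that is not listed. Returns non-zero if any daemon is missing.
missing_daemons() {
    local jps_out rc=0 daemon
    jps_out=$(cat)
    for daemon in "$@"; do
        # Column 2 of jps output is the process name; compare it exactly.
        if ! printf '%s\n' "$jps_out" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            rc=1
        fi
    done
    return "$rc"
}
```

On vm-master: jps | missing_daemons NameNode SecondaryNameNode DataNode ResourceManager NodeManager. On a slave: jps | missing_daemons DataNode NodeManager.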
Step 14: Verify the installation
- Check the Hadoop status
bin/hdfs dfsadmin -report
It prints output like the following:
Configured Capacity: 198576648192 (184.94 GB)
Present Capacity: 180282531840 (167.90 GB)
DFS Remaining: 180282449920 (167.90 GB)
DFS Used: 81920 (80 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (3):
Name: 10.211.55.24:50010 (vm-slave2)
Hostname: vm-master
Decommission Status : Normal
Configured Capacity: 66192216064 (61.65 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 6213672960 (5.79 GB)
DFS Remaining: 59978518528 (55.86 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.61%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Mar 29 16:38:34 CST 2017
Name: 10.211.55.25:50010 (vm-slave1)
Hostname: vm-master
Decommission Status : Normal
Configured Capacity: 66192216064 (61.65 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 6213672960 (5.79 GB)
DFS Remaining: 59978518528 (55.86 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.61%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Mar 29 16:38:34 CST 2017
Name: 10.211.55.23:50010 (vm-master)
Hostname: vm-master
Decommission Status : Normal
Configured Capacity: 66192216064 (61.65 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 5866770432 (5.46 GB)
DFS Remaining: 60325412864 (56.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 91.14%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Mar 29 16:38:34 CST 2017
- Open the web management pages
- http://vm-master:50070/ (HDFS NameNode web UI)
- http://vm-master:8088/ (YARN ResourceManager web UI)
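The most important number in the dfsadmin report is the count of live DataNodes, which should equal the number of hosts in the slaves file. As a sketch, the hypothetical helper below extracts that count from the report text so it can be used in a script.

```shell
# hdfs dfsadmin -report | live_datanodes
# Print N from the "Live datanodes (N):" line of the report read on stdin.
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*$/\1/p'
}
```

From /usr/local/hadoop: bin/hdfs dfsadmin -report | live_datanodes. For the report shown in this step it prints 3.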
Common problems
- Garbled characters in system output
Fix:
sudo vim /etc/environment
Add:
LANG="en_US.UTF-8"
LANGUAGE="en_US:en"
sudo vim /var/lib/locales/supported.d/local
Add:
en_US.UTF-8 UTF-8
sudo vim /etc/default/locale
Set:
LANG="en_US.UTF-8"
LANGUAGE="en_US:en"
Reboot
sudo reboot
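- Network interface stops working after cloning a VM
Fix:
vim /etc/udev/rules.d/70-persistent-net.rules
Delete the line containing eth0, then rename eth1 to eth0
Reboot for the change to take effect
- JAVA_HOME environment variable not found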
vm-slave2: Error: JAVA_HOME is not set and could not be found.
vm-slave1: Error: JAVA_HOME is not set and could not be found.
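Fix:
On every server, set export JAVA_HOME= in hadoop-env.sh to the absolute path of the JDK
Also check the hosts file and comment out the 127.0.1.1 line, so that vm-master resolves to 10.211.55.23 rather than a loopback address:
127.0.0.1 localhost
#127.0.1.1 vm-master.localdomain vm-master (comment out this line)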
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.211.55.23 vm-master
10.211.55.25 vm-slave1
10.211.55.24 vm-slave2
Then restart the cluster:
Stop YARN: sbin/stop-yarn.sh
Stop HDFS: sbin/stop-dfs.sh
Start HDFS: sbin/start-dfs.sh
Start YARN: sbin/start-yarn.sh
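The hosts file check above can be scripted so it is easy to run on every node. As a sketch, the hypothetical helper below prints any active (non-comment) line that still maps a name to 127.0.1.1; empty output means the file is clean.

```shell
# loopback_alias_lines HOSTS_FILE
# Print every non-comment line whose address column is 127.0.1.1.
loopback_alias_lines() {
    awk '!/^[[:space:]]*#/ && $1 == "127.0.1.1"' "$1"
}
```

For example: loopback_alias_lines /etc/hosts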