Anaconda and Python version correspondence table
https://docs.anaconda.com/anaconda/packages/oldpkglists/
- Download the Anaconda installer
wget https://repo.continuum.io/archive/Anaconda3-4.4.0-Linux-x86_64.sh
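The installer file name follows a visible pattern, so the URL for a different release can be assembled from its version string. This is a minimal sketch, assuming other releases keep the `Anaconda3-<version>-Linux-x86_64.sh` naming seen in the wget command; `installer_url` is a hypothetical helper, not part of any Anaconda tooling.

```python
# Host and path taken from the wget command above.
BASE_URL = "https://repo.continuum.io/archive"


def installer_url(version: str, platform: str = "Linux-x86_64") -> str:
    """Build the installer URL for an Anaconda3 release (hypothetical helper)."""
    return f"{BASE_URL}/Anaconda3-{version}-{platform}.sh"


print(installer_url("4.4.0"))
```

Check the version table linked above before choosing a release, since each Anaconda version bundles a specific Python version.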
- Install Anaconda
bash Anaconda3-4.4.0-Linux-x86_64.sh
Hold down Enter to page through the license agreement.
root@bigdata-dev-43:/home/hd_user# bash Anaconda3-4.4.0-Linux-x86_64.sh
Welcome to Anaconda3 4.4.0 (by Continuum Analytics, Inc.)
In order to continue the installation process, please review the license
agreement.
Please, press ENTER to continue
>>> # (press Enter)
===================================
Anaconda End User License Agreement
===================================
.......
Type yes
Copyright 2017, Continuum Analytics, Inc.
... # omitted
kerberos (krb5, non-Windows platforms)
A network authentication protocol designed to provide strong authentication
for client/server applications by using secret-key cryptography.
cryptography
A Python library which exposes cryptographic recipes and primitives.
Do you approve the license terms? [yes|no]
>>> yes # type yes
Anaconda3 will now be installed into this location:
/root/anaconda3
Enter the install path /opt/cloudera/anaconda3
If you see "tar (child): bzip2: Cannot exec: No such file or directory", install bzip2 first: sudo yum -y install bzip2
- Press ENTER to confirm the location
- Press CTRL-C to abort the installation
- Or specify a different location below
[/root/anaconda3] >>> /opt/cloudera/anaconda3 # enter the install path /opt/cloudera/anaconda3
PREFIX=/opt/cloudera/anaconda3
installing: python-3.6.1-2 ...
installing: _license-1.1-py36_1 ...
Setting the Anaconda PATH:
To make sure PySpark jobs run with Python 3 after submission, answer no at the PATH prompt and set PATH manually afterwards.
installing: alabaster-0.7.10-py36_0 ...
... # omitted
installing: zlib-1.2.8-3 ...
installing: anaconda-4.4.0-np112py36_0 ...
installing: conda-4.3.21-py36_0 ...
installing: conda-env-2.6.0-0 ...
Python 3.6.1 :: Continuum Analytics, Inc.
creating default environment...
installation finished.
Do you wish the installer to prepend the Anaconda3 install location
to PATH in your /root/.bashrc ? [yes|no]
[no] >>> no # type no
You may wish to edit your .bashrc or prepend the Anaconda3 install location:
$ export PATH=/opt/cloudera/anaconda3/bin:$PATH
Thank you for installing Anaconda3!
Share your notebooks and packages on Anaconda Cloud!
Sign up for free: https://anaconda.org
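The bzip2 error mentioned during installation can be caught before running the installer. This is a minimal sketch that looks for the required tools on PATH; `missing_tools` is a hypothetical helper, and the tool list is an assumption based only on the commands and the error message shown in this walkthrough.

```python
import shutil


def missing_tools(tools=("bash", "bzip2")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Install these before running the installer:", ", ".join(missing))
else:
    print("All required tools found.")
```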
- Set the Anaconda3 environment variables
[root@node00 ~]# echo 'export PATH=/opt/cloudera/anaconda3/bin:$PATH' >> /etc/profile
[root@node00 ~]# source /etc/profile
[root@node00 ~]# env |grep PATH
PATH=/opt/cloudera/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
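To confirm that the Anaconda directory takes precedence, the first PATH entry can be checked programmatically. This is a minimal sketch, assuming the install path used above and a colon-separated Linux PATH; `anaconda_is_first` is a hypothetical helper.

```python
import os

# Install location chosen during the installer prompt above.
ANACONDA_BIN = "/opt/cloudera/anaconda3/bin"


def anaconda_is_first(path_value: str, bin_dir: str = ANACONDA_BIN) -> bool:
    """Return True if `bin_dir` is the first entry of a colon-separated PATH."""
    entries = path_value.split(":")  # Linux PATH separator
    return entries[0] == bin_dir


print(anaconda_is_first(os.environ.get("PATH", "")))
```

If this prints False, another python earlier on PATH would be picked up before the Anaconda one.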
- Verify the Python version
root@bigdata-dev-43:/home/hd_user# python
Python 3.6.1 |Anaconda 4.4.0 (64-bit)| (default, May 11 2017, 13:09:58)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
or
root@bigdata-dev-43:/home/hd_user# python -V
Python 3.6.1 :: Anaconda 4.4.0 (64-bit)
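The same check can be done from inside the interpreter, which also shows which binary is actually running. This is a minimal sketch; `is_python3` is a hypothetical helper.

```python
import sys


def is_python3(version_info=sys.version_info) -> bool:
    """Return True if the given version tuple has major version 3."""
    return version_info[0] == 3


print(sys.executable)  # path of the interpreter that is running this script
print(is_python3())
```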
- Configure Spark's Python environment in CM (Cloudera Manager)
export PYSPARK_PYTHON=/opt/cloudera/anaconda3/bin/python
export PYSPARK_DRIVER_PYTHON=/opt/cloudera/anaconda3/bin/python
Restart the affected services.
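After the restart, it can help to confirm on a node that both variables point to a real, executable interpreter. This is a minimal sketch that only inspects the environment of the process running it; `check_interpreter` is a hypothetical helper and not part of Spark or Cloudera Manager.

```python
import os


def check_interpreter(var_name, environ=os.environ):
    """Return (path, ok): the variable's value and whether it is an executable file."""
    path = environ.get(var_name)
    ok = bool(path) and os.path.isfile(path) and os.access(path, os.X_OK)
    return path, ok


for name in ("PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON"):
    path, ok = check_interpreter(name)
    print(name, "=", path, "(ok)" if ok else "(missing or not executable)")
```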
- Test with the pyspark command
x = sc.parallelize([1,2,3])
y = x.flatMap(lambda x: (x, 100*x, x**2))
print(x.collect())
print(y.collect())
root@bigdata-dev-41:/home/charles# pyspark
Python 3.6.1 |Anaconda 4.4.0 (64-bit)| (default, May 11 2017, 13:09:58)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/
Using Python version 3.6.1 (default, May 11 2017 13:09:58)
SparkContext available as sc, HiveContext available as sqlContext.
>>> x = sc.parallelize([1,2,3])
>>> y = x.flatMap(lambda x: (x, 100*x, x**2))
>>> print(x.collect())
[1, 2, 3]
>>> print(y.collect())
[1, 100, 1, 2, 200, 4, 3, 300, 9]
>>>
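The output of `flatMap` may be easier to read next to a plain Python equivalent: each element is mapped to a tuple, and the tuples are then flattened into a single list. This is a minimal sketch that runs without Spark; `flat_map` is a hypothetical stand-in for the RDD method, not a Spark API.

```python
from itertools import chain


def flat_map(func, items):
    """Apply `func` to each item and flatten the resulting iterables into one list."""
    return list(chain.from_iterable(func(item) for item in items))


x = [1, 2, 3]
y = flat_map(lambda v: (v, 100 * v, v ** 2), x)
print(x)  # [1, 2, 3]
print(y)  # [1, 100, 1, 2, 200, 4, 3, 300, 9]
```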