Linux # CentOS # Installing CUDA

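Note: never attempt this inside a virtual machine. It will not succeed, because that setup is currently not supported.

To succeed, work on a physical machine.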

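Preparation

Confirm the versions

The main things to confirm are the versions of the CUDA toolkit and the NVIDIA driver.
In practice, the most reliable approach turned out to be:
first determine the NVIDIA driver version from the graphics card in the machine, then determine the CUDA toolkit version from the driver version.

Check the graphics card model

The card here is a GeForce GTX 1060 3GB.

Number of CUDA cores: 1152

Determine the driver version
https://www.geforce.com/drivers

There you can look up every driver version that supports the card; the topmost entry is the latest release (beta versions aside).

At the time of writing, the latest NVIDIA driver version for this card was 390.87.

Determine the CUDA toolkit version
The CUDA toolkit has requirements on the NVIDIA driver version; see the CUDA Driver section of https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html: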


On Linux, since the latest NVIDIA driver version is 390.87, CUDA 9.2 is ruled out because it requires a driver >= 396.26. CUDA 9.1 requires >= 390.46, which 390.87 satisfies, so CUDA 9.1 is the choice.
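The same comparison can be scripted, which helps when checking several toolkit versions at once. This is only a sketch: the helper name version_ge is made up for illustration, it assumes GNU sort with the -V (version sort) option, and the minimum driver versions are the two quoted in the paragraph above.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Assumes GNU sort, whose -V option orders dotted version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Driver version found for the card, and the minimum driver each
# toolkit needs (both taken from the text above).
installed_driver="390.87"
for entry in "9.2:396.26" "9.1:390.46"; do
    toolkit="${entry%%:*}"     # part before the colon
    required="${entry##*:}"    # part after the colon
    if version_ge "$installed_driver" "$required"; then
        echo "CUDA $toolkit is usable with driver $installed_driver"
    else
        echo "CUDA $toolkit needs driver >= $required"
    fi
done
```

With the numbers above, only the CUDA 9.1 line reports the toolkit as usable.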

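Check the system and kernel requirements
See the System Requirements section of https://docs.nvidia.com/cuda/archive/9.1/cuda-installation-guide-linux/index.html:

It lists what CUDA 9.1 requires of each distribution.
For example, CentOS 7.x requires kernel 3.10, gcc 4.8.5, glibc 2.17, and so on.

Required checks

See chapter 2 of https://docs.nvidia.com/cuda/archive/9.1/cuda-installation-guide-linux/index.html.

(1) Check whether a CUDA-capable GPU is present

lspci | grep -i nvidia

You can check at https://developer.nvidia.com/cuda-gpus whether your card supports CUDA.

(2) Check whether the current Linux version is supported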
The CUDA Development Tools are only supported on some specific distributions of Linux.

$ uname -m && cat /etc/*release

You should see output similar to the following, modified for your particular system:

x86_64
Red Hat Enterprise Linux Workstation release 6.0 (Santiago)

The x86_64 line indicates you are running on a 64-bit system.
The remainder gives information about your distribution.

(3) Check the gcc version:

$ gcc --version

(4) Check the glibc version

ls -l /lib64/libc.so.*

(5) Install the kernel headers for the running kernel
This step is important.

sudo yum install "kernel-devel-uname-r == $(uname -r)"
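To confirm afterwards that the headers really match the running kernel, something like the following may help. It is a sketch, not part of the official procedure: the two helper names are invented here, and it relies on kernel-devel placing its files under /usr/src/kernels/<release> on CentOS.

```shell
# Directory where kernel-devel puts the headers for a given kernel
# release on CentOS/RHEL.
kernel_headers_dir() {
    printf '/usr/src/kernels/%s\n' "$1"
}

# Succeed when headers for the given kernel release are present.
have_kernel_headers() {
    [ -d "$(kernel_headers_dir "$1")" ]
}

running="$(uname -r)"
if have_kernel_headers "$running"; then
    echo "kernel headers found for $running"
else
    echo "kernel headers missing for $running"
fi
```

A mismatch typically shows up when yum has installed headers for a newer kernel than the one currently booted.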

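Install the graphics driver and the CUDA toolkit

The "Handle Conflicting Installation Methods" section of the installation guide says:

In short, before installing the same version of the graphics driver or the CUDA toolkit again, the old installation has to be uninstalled.

If the CUDA toolkit is already installed, it can be removed as follows: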

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.1/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall

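Install the graphics driver

Installing with yum
Most Linux distributions ship the open-source nouveau driver by default; for NVIDIA cards, the official closed-source driver works better.

For installing the official driver, see https://blog.csdn.net/u013378306/article/details/69229919
It describes a simple way to install the NVIDIA driver with yum.
Before starting, the nouveau driver that ships by default has to be disabled.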

lsmod | grep nouveau

If the command above prints nothing, nouveau has been disabled successfully.
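For reference, one common way to disable nouveau is a modprobe blacklist followed by rebuilding the initramfs. The sketch below wraps that and the lsmod check in two small helpers. The function names and the blacklist file name are assumptions made for illustration; the blog post linked above may do it differently.

```shell
# Succeed when the lsmod listing on stdin contains the nouveau module.
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Write a modprobe blacklist for nouveau to the file given as $1.
write_nouveau_blacklist() {
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' > "$1"
}

# Typical use as root (the file name is an assumption; any *.conf
# file under /etc/modprobe.d is read by modprobe):
#   write_nouveau_blacklist /etc/modprobe.d/blacklist-nouveau.conf
#   dracut --force     # rebuild the initramfs, then reboot
#   lsmod | nouveau_loaded && echo "nouveau is still loaded"
```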

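With this method, the last step,

yum -y install kmod-nvidia

may sometimes fail. That does not block you: run

nvidia-detect -v

and use its output to look up the matching driver version, then install that.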

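Installing from the downloaded driver package
A reliable place to find drivers: https://www.geforce.com/drivers
For the installation procedure, see https://blog.csdn.net/itaacy/article/details/72628792?utm_source=itdadao&utm_medium=referral

Once the driver is installed, the following command shows information about the card:

nvidia-smi

If it prints the card information, the driver was installed successfully.

To uninstall the driver, use: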

nvidia-uninstall

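Install the CUDA toolkit

Note: GNOME should be shut down before installing (on CentOS 7, for instance, by switching to the text-mode target with systemctl isolate multi-user.target).

Get the CUDA toolkit download:

CUDA toolkit download page: https://developer.nvidia.com/cuda-toolkit-archive
Download CUDA 9.1.

Install CUDA: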

sh cuda_9.1.85_387.26_linux.run
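The runfile name appears to follow the pattern cuda_<toolkit>_<driver>_linux.run, so it tells you which driver the installer bundles. The sketch below splits the name apart; parse_runfile is a made-up helper, and the naming pattern is inferred from this one file name rather than taken from NVIDIA documentation.

```shell
# Split a runfile name of the form cuda_<toolkit>_<driver>_linux.run
# into "<toolkit> <driver>".
parse_runfile() {
    name="${1##*/}"          # drop any leading directory
    name="${name#cuda_}"     # drop the "cuda_" prefix
    name="${name%_linux*}"   # drop the "_linux.run" suffix
    printf '%s %s\n' "${name%%_*}" "${name#*_}"
}

parse_runfile cuda_9.1.85_387.26_linux.run
```

For this file the bundled driver is 387.26, which is older than the 390.87 driver chosen earlier. If that newer driver is already installed, it is probably safer to answer no when the installer offers to install its own driver.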

Installation process (the log below is from an installation of version 9.2 and is for reference only):

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 396.37?
(y)es/(n)o/(q)uit: yes

Do you want to install the OpenGL libraries?
(y)es/(n)o/(q)uit [ default is yes ]: yes

Do you want to run nvidia-xconfig?
This will update the system X configuration file so that the NVIDIA X driver
is used. The pre-existing X configuration file will be backed up.
This option should not be used on systems that require a custom
X configuration, such as systems with multiple GPU vendors.
(y)es/(n)o/(q)uit [ default is no ]: y

Install the CUDA 9.2 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-9.2 ]: y

Toolkit location must be an absolute path.
Enter Toolkit Location
 [ default is /usr/local/cuda-9.2 ]: /usr/local/cuda-9.2

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 9.2 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
 [ default is /root ]: y

Samples location must be an absolute path
Enter CUDA Samples Location
 [ default is /root ]: y

Samples location must be an absolute path
Enter CUDA Samples Location
 [ default is /root ]: /root

Installing the NVIDIA display driver...

Log of a successful installation:

Installing the NVIDIA display driver...
Installing the CUDA Toolkit in /usr/local/cuda-9.2 ...
Installing the CUDA Samples in /root ...
Copying samples to /root/NVIDIA_CUDA-9.2_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Installed
Toolkit:  Installed in /usr/local/cuda-9.2
Samples:  Installed in /root

Please make sure that
 -   PATH includes /usr/local/cuda-9.2/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-9.2/lib64, or, add /usr/local/cuda-9.2/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.2/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.2/doc/pdf for detailed information on setting up CUDA.

Logfile is /tmp/cuda_install_3101.log

Configure the environment variables

See //www.greatytc.com/p/73399a4c9114 for how to set the environment variables.
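In essence this comes down to the two settings the installer summary asks for. A minimal sketch, assuming the toolkit went into /usr/local/cuda-9.2 as in the log above (adjust the path for 9.1); the variable name CUDA_HOME is just a convenience chosen here:

```shell
# Assumed install prefix; change it to match your toolkit version.
CUDA_HOME=/usr/local/cuda-9.2

# Prepend the toolkit's bin and lib64 directories, keeping whatever
# was already in the variables.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Putting these lines in ~/.bashrc makes them apply to every new shell.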

Verify that CUDA was installed successfully

cd /root/NVIDIA_CUDA-9.2_Samples/1_Utilities/deviceQuery
make
./deviceQuery

On success, the output ends with Result = PASS.

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1060 3GB"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 3013 MBytes (3159293952 bytes)
  ( 9) Multiprocessors, (128) CUDA Cores/MP:     1152 CUDA Cores
  GPU Max Clock rate:                            1747 MHz (1.75 GHz)
  Memory Clock rate:                             4004 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 1572864 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1060 3GB
Result = PASS

The output includes parameters such as CUDA Driver Version / Runtime Version 8.0 / 8.0 and
( 9) Multiprocessors, (128) CUDA Cores/MP: 1152 CUDA Cores.
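When the check has to run from a script, the result line can be tested instead of being read by eye. A small sketch; device_query_passed is a hypothetical helper name:

```shell
# Succeed when the deviceQuery output on stdin contains a passing
# result line.
device_query_passed() {
    grep -q '^Result = PASS$'
}

# Typical use, from the deviceQuery sample directory:
#   ./deviceQuery | device_query_passed && echo "CUDA is working"
```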

How to check the CUDA version

nvcc --version
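If only the release number is needed, for instance to compare it against the driver requirement, it can be cut out of the nvcc output. This sketch assumes the output contains a line like "Cuda compilation tools, release 9.1, V9.1.85"; nvcc_release is a name invented here.

```shell
# Extract the release number (for example "9.1") from `nvcc --version`
# output supplied on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Typical use:
#   nvcc --version | nvcc_release
```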

Problems encountered and their solutions:

The driver installation is unable to locate the kernel source.

The driver installation is unable to locate the kernel source. Please make sure that the kernel source packages are installed and set up correctly.
If you know that the kernel source packages are installed and set up correctly, you may pass the location of the kernel source with the '--kernel-source-path' flag.

Solution:

sudo yum install epel-release
yum install --enablerepo=epel dkms

Missing recommended library

Installing the NVIDIA display driver...
Installing the CUDA Toolkit in /usr/local/cuda-9.2 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so

Installing the CUDA Samples in /root ...
Copying samples to /root/NVIDIA_CUDA-9.2_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Installed
Toolkit:  Installed in /usr/local/cuda-9.2
Samples:  Installed in /root, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-9.2/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-9.2/lib64, or, add /usr/local/cuda-9.2/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.2/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.2/doc/pdf for detailed information on setting up CUDA.

Logfile is /tmp/cuda_install_7498.log

Solution:

yum install mesa-libGLES.x86_64 mesa-libGL-devel.x86_64 \
    mesa-libGLU-devel.x86_64 mesa-libGLw.x86_64 \
    mesa-libGLw-devel.x86_64 libXi-devel.x86_64 \
    freeglut-devel.x86_64 freeglut.x86_64

cudaGetDeviceCount returned 30
While verifying the CUDA installation, the following message appeared:

[root@localhost deviceQuery]# ./deviceQuery 
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
[root@localhost deviceQuery]# pwd
/root/NVIDIA_CUDA-9.2_Samples/1_Utilities/deviceQuery

This is usually a problem with the NVIDIA driver; the fix is to install the latest NVIDIA driver.

http://elrepo.org/tiki/tiki-index.php

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh https://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm

Then install the NVIDIA driver with yum as described in https://blog.csdn.net/u013378306/article/details/69229919.

cudaGetDeviceCount returned 35


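This is usually a CUDA version problem. Work out the correct version and install it.

The message "CUDA driver version is insufficient for CUDA runtime version" means that the CUDA runtime library is newer than the installed driver supports. Either install a newer driver or use an older CUDA runtime library. All the packages can be found at http://developer.download.nvidia.com/compute/cuda/repos/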
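As a quick reminder of what the two codes in this section point to, here is a tiny lookup. It is only a sketch: explain_cuda_error is a made-up helper, and it covers just the codes 30 and 35 discussed here, with the causes given in the text.

```shell
# Map the numeric codes seen above to the cause described in the text.
# Only the two codes discussed in this section are covered.
explain_cuda_error() {
    case "$1" in
        30) echo "unknown error: usually a driver problem, reinstall or update the NVIDIA driver" ;;
        35) echo "driver too old for the CUDA runtime: update the driver or use an older toolkit" ;;
        *)  echo "code $1 is not covered here" ;;
    esac
}

explain_cuda_error 35
```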

Your kernel headers for kernel xxx cannot be found


The solution is likely the one from the Stack Overflow question listed under References; the short version is to run

sudo yum install "kernel-devel-uname-r == $(uname -r)"

That will install the kernel headers for the version of the kernel you are currently running.

References:

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
https://baiweiblog.wordpress.com/2017/07/21/cuda-8-0%E5%9C%A8linux%E4%B8%8A%E7%9A%84%E5%AE%89%E8%A3%85%E6%B5%81%E7%A8%8B/
https://stackoverflow.com/questions/38016466/installing-cuda-7-5-on-centos-7-unable-to-locate-the-kernel-source
https://bitsanddragons.wordpress.com/2016/10/07/cuda-on-centos-7/
https://devtalk.nvidia.com/default/topic/1027413/cuda-setup-and-installation/linux-installation-error-cudagetdevicecount-returned-30-gt-unknown-error/
https://developer.download.nvidia.com/compute/cuda/9.2/Prod2/docs/sidebar/CUDA_Installation_Guide_Linux.pdf
https://blog.csdn.net/10km/article/details/61665578
https://medium.com/@changrongko/nv-how-to-check-cuda-and-cudnn-version-e05aa21daf6c
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
https://www.cnblogs.com/wolflzc/p/9117291.html
http://detail.zol.com.cn/picture_index_1760/index17594460.shtml

Last edited: 2018.09.06 14:43:58
