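Reference: https://www.paddlepaddle.org.cn/documentation/docs/zh/install/compile/linux-compile.html
Environment: CentOS 7, CUDA 11.0, cuDNN 8.0 (this assumes CUDA and cuDNN are already installed).
Following the same pattern as the reference, download three rpm packages from http://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/ :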
wget http://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/libnccl-2.7.8-1+cuda11.0.x86_64.rpm
wget http://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/libnccl-devel-2.7.8-1+cuda11.0.x86_64.rpm
wget http://developer.download.nvidia.com/compute/machine-learning/repos/rhel7/x86_64/libnccl-static-2.7.8-1+cuda11.0.x86_64.rpm
Install the three rpm packages:
rpm -i libnccl-2.7.8-1+cuda11.0.x86_64.rpm
rpm -i libnccl-devel-2.7.8-1+cuda11.0.x86_64.rpm
rpm -i libnccl-static-2.7.8-1+cuda11.0.x86_64.rpm
This reports the following errors:
/sbin/ldconfig: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn.so.8 is not a symbolic link
/sbin/ldconfig: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_train.so.8 is not a symbolic link
/sbin/ldconfig: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8 is not a symbolic link
/sbin/ldconfig: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8 is not a symbolic link
/sbin/ldconfig: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8 is not a symbolic link
/sbin/ldconfig: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_train.so.8 is not a symbolic link
/sbin/ldconfig: /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8 is not a symbolic link
Create symbolic links to the cuDNN libraries:
ln -s /root/cuda/lib64/libcudnn_cnn_infer.so.8 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8
Do the same for the other six libraries named in the errors.
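The seven links can also be created in one loop. This is a minimal sketch, not the exact commands used above: link_cudnn is a hypothetical helper name, and the -f flag is an assumption added so that an existing regular file at the destination (which is what ldconfig complained about) is replaced by the link. Back up the destination directory first if you are unsure.

```shell
#!/bin/sh
# link_cudnn SRC_DIR DST_DIR  (hypothetical helper, not part of CUDA or cuDNN)
# For each cuDNN 8 library named in the ldconfig messages, replace
# DST_DIR/<name>.so.8 with a symlink pointing at SRC_DIR/<name>.so.8.
link_cudnn() {
    src_dir=$1
    dst_dir=$2
    for name in libcudnn \
                libcudnn_ops_train libcudnn_ops_infer \
                libcudnn_cnn_train libcudnn_cnn_infer \
                libcudnn_adv_train libcudnn_adv_infer; do
        # -s: symbolic link; -f (assumption): overwrite an existing file
        ln -sf "$src_dir/$name.so.8" "$dst_dir/$name.so.8"
    done
}

# Usage with the paths from this note (run as root):
#   link_cudnn /root/cuda/lib64 /usr/local/cuda-11.0/targets/x86_64-linux/lib
```

Taking both directories as arguments keeps the loop reusable if cuDNN was unpacked somewhere other than /root/cuda/lib64.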
Install with yum:
yum update -y
yum install -y libnccl-2.7.8-1+cuda11.0 libnccl-devel-2.7.8-1+cuda11.0 libnccl-static-2.7.8-1+cuda11.0
Check the installation.
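The note does not say how the check was done. One possible way, sketched here as an assumption, is to look for libnccl in the dynamic linker cache. check_lib is a hypothetical helper that reads `ldconfig -p` style output on standard input and reports whether the given name appears in it.

```shell
#!/bin/sh
# check_lib NAME  (hypothetical helper)
# Reads a library listing on stdin (for example the output of `ldconfig -p`)
# and prints "NAME: found" or "NAME: missing".
check_lib() {
    # -q: quiet, only the exit status matters; -F: match NAME as a fixed string
    if grep -qF -- "$1"; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Usage:
#   ldconfig -p | check_lib libnccl
# The installed packages can also be listed with:
#   rpm -qa | grep nccl
```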
Done!