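This article covers installing and configuring the GPU version of TensorFlow on Windows 10. The CPU version is simple to install, as shown here:
Installing the CPU version of TensorFlow
Installing the CPU version is straightforward: enter the following command in a terminal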
pip install tensorflow
Or install a specific version
pip install tensorflow==2.1
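Note: the plain tensorflow package used to be CPU-only, but newer releases (2.1 among them) bundle CPU and GPU support in the same package.
If you have a GPU, read on.
Installing and configuring the GPU version of TensorFlow is more involved. The details follow:
Installing and configuring the GPU version of TensorFlow
Required components and installation order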
- Python 3.7.6(Anaconda)
- tensorflow-gpu==2.1
- CUDA 10.1 (update 2) (10.2 is not compatible with TensorFlow 2.1)
- cuDNN 7.6 (for CUDA 10.1)
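The version list above can be written down as a small lookup so that a mismatch is caught before anything is installed. This is only a sketch: `KNOWN_GOOD` and `check_stack` are hypothetical names, not part of TensorFlow, and the table holds only the combination used in this article.

```python
# Version combination used in this article (hypothetical helper, not a TensorFlow API).
KNOWN_GOOD = {
    "2.1": {"cuda": "10.1", "cudnn": "7.6"},
}


def major_minor(version):
    """Reduce a version string such as '7.6.4' to '7.6'."""
    return ".".join(version.split(".")[:2])


def check_stack(tf_version, cuda_version, cudnn_version):
    """Return a list of mismatches against the combination listed above."""
    expected = KNOWN_GOOD.get(major_minor(tf_version))
    if expected is None:
        return ["no entry for TensorFlow " + major_minor(tf_version)]
    problems = []
    if major_minor(cuda_version) != expected["cuda"]:
        problems.append("CUDA " + cuda_version + ", expected " + expected["cuda"])
    if major_minor(cudnn_version) != expected["cudnn"]:
        problems.append("cuDNN " + cudnn_version + ", expected " + expected["cudnn"])
    return problems


print(check_stack("2.1.0", "10.1", "7.6.4"))  # the combination from the list
print(check_stack("2.1.0", "10.2", "7.6.4"))  # CUDA 10.2 gets flagged
```

Extending the table to other TensorFlow releases would need the tested configurations from the official install guide; none are assumed here.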
I already have everything installed, so let's start by running a script to see the result:
Here is the code
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
'''
@File : pt2.py
@Time : 2020/05/21 23:58:57
@Author : 艾强云
@Contact : aqy0716@163.com
@Department : SCAU
@Desc : None
'''
# Machine learning: a small neural network
# Imports
import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
mnist = keras.datasets.fashion_mnist
(X_train, y_train),(X_test,y_test) = mnist.load_data()
print("Training data shape:", X_train.shape)
print("Max data value:", np.max(X_train))
print("Label values:", y_train)
class_names = ['top', 'trouser', 'pullover', 'dress', 'coat', 'sandal', 'shirt', 'sneaker', 'bag', 'ankle boot']  # names of the 10 classes
plt.figure()  # visualize one sample
plt.imshow(X_train[1])  # change the index to display a different image
plt.colorbar()  # add a color bar
plt.show()
# Normalize the dataset: scale pixel values from 0..255 down to 0..1
X_train = X_train / 255.0
X_test = X_test / 255.0
plt.figure()  # visualize the same sample again
plt.imshow(X_train[1])
plt.colorbar()
plt.show()
# The color bar now shows values between 0 and 1
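from tensorflow.keras.models import Sequential  # model container
from tensorflow.keras.layers import Flatten, Dense  # layer types used by the network
model = Sequential()
model.add(Flatten(input_shape=(28, 28)))  # flatten each 28x28 image into a 1-D vector of 784 values
model.add(Dense(128, activation='relu'))  # hidden layer with 128 units and ReLU activation; another function such as sigmoid also works
# If this is unclear, see lesson 3.6 of Andrew Ng's deep learning course
model.add(Dense(10, activation='softmax'))  # output layer: 10 classes, so 10 units, with softmax activation
print("Model parameter summary:", model.summary())  # summary() prints the table itself and returns None, hence the trailing "None" in the output
# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])  # set optimizer, loss, and metric
# The optimizer is Adam and the loss is sparse categorical cross-entropy
model.fit(X_train, y_train, epochs=10)  # train; epochs is the number of passes over the training set
test_loss, test_acc = model.evaluate(X_test, y_test)  # measure the trained model's accuracy on the test set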
print(test_acc)
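# Check the accuracy again with scikit-learn
from sklearn.metrics import accuracy_score
y_pred = np.argmax(model.predict(X_test), axis=1)  # class with the highest probability; predict_classes() is gone from newer TensorFlow releases
print(accuracy_score(y_test, y_pred))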
print(tf.test.is_gpu_available())
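As a sanity check on the network defined in the script, the parameter counts that `model.summary()` reports can be reproduced by hand. A fully connected layer has one weight per input-unit pair plus one bias per unit. `dense_params` is a hypothetical helper written for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


flatten_out = 28 * 28                      # Flatten turns each 28x28 image into 784 values
hidden = dense_params(flatten_out, 128)    # Dense(128)
output = dense_params(128, 10)             # Dense(10)
print(hidden, output, hidden + output)     # 100480 1290 101770
```

These are the same numbers that appear in the Param # column and the Total params line of the run log.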
The script ran successfully on the GPU. The output is as follows:
PS F:\vscode-python-kiton> & D:/ruanjian/anaconda202002/python.exe f:/vscode-python-kiton/数学/TensorFlow/pt2.py
2020-05-29 00:33:44.308803: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Training data shape: (60000, 28, 28)
Max data value: 255
Label values: [9 0 0 ... 3 0 5]
2020-05-29 00:33:50.532487: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-05-29 00:33:50.557648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2020-05-29 00:33:50.561284: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-29 00:33:50.568473: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-29 00:33:50.573147: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-29 00:33:50.575965: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-29 00:33:50.581990: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-29 00:33:50.585757: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-29 00:33:50.593746: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-29 00:33:50.596544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-29 00:33:50.598143: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-05-29 00:33:50.601011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2020-05-29 00:33:50.605562: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-29 00:33:50.608013: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-29 00:33:50.609933: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-29 00:33:50.612253: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-29 00:33:50.614185: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-29 00:33:50.616119: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-29 00:33:50.617990: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-29 00:33:50.619995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-29 00:33:51.071264: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-29 00:33:51.073211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-05-29 00:33:51.074427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-05-29 00:33:51.075876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4604 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten) (None, 784) 0
_________________________________________________________________
dense (Dense) (None, 128) 100480
_________________________________________________________________
dense_1 (Dense) (None, 10) 1290
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
Model parameter summary: None
Train on 60000 samples
Epoch 1/10
2020-05-29 00:33:51.467932: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
60000/60000 [==============================] - 2s 41us/sample - loss: 0.5001 - accuracy: 0.8269
Epoch 2/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.3769 - accuracy: 0.8647
Epoch 3/10
60000/60000 [==============================] - 2s 34us/sample - loss: 0.3376 - accuracy: 0.8768
Epoch 4/10
60000/60000 [==============================] - 2s 34us/sample - loss: 0.3126 - accuracy: 0.8848
Epoch 5/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2953 - accuracy: 0.8902
Epoch 6/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2818 - accuracy: 0.8956
Epoch 7/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2693 - accuracy: 0.9008
Epoch 8/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2591 - accuracy: 0.9031
Epoch 9/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2496 - accuracy: 0.9071
Epoch 10/10
60000/60000 [==============================] - 2s 33us/sample - loss: 0.2408 - accuracy: 0.9107
10000/10000 [==============================] - 0s 33us/sample - loss: 0.3349 - accuracy: 0.8823
0.8823
0.8823
WARNING:tensorflow:From f:/vscode-python-kiton/数学/TensorFlow/pt2.py:70: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-05-29 00:34:12.313519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2020-05-29 00:34:12.318078: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-29 00:34:12.319828: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-29 00:34:12.322225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-29 00:34:12.324215: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-29 00:34:12.325950: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-29 00:34:12.327717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-29 00:34:12.329492: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-29 00:34:12.332276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-29 00:34:12.333684: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-29 00:34:12.335506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-05-29 00:34:12.336623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-05-29 00:34:12.337972: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 4604 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
True
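The deprecation warning in the log recommends `tf.config.list_physical_devices('GPU')` in place of `tf.test.is_gpu_available()`. A minimal sketch of that replacement follows; `gpu_names` is a hypothetical wrapper, and the import guard is only there so the snippet also loads on a machine without TensorFlow.

```python
def gpu_names():
    """Return the names of the GPUs TensorFlow can see ([] if none, or if TensorFlow is missing)."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices('GPU')]


print(gpu_names())
```

An empty list means TensorFlow found no usable GPU, which corresponds to `is_gpu_available()` returning False.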
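The computation ran on the GPU. The installation procedure is described in detail next.
Installation and configuration steps
1. First, download and install Anaconda (an open-source Python distribution; the latest release at the time of writing ships Python 3.7, and the installer is 466 MB)
The installer adds the relevant paths to the PATH environment variable (make sure that option is selected during setup), so you can type python in cmd or PowerShell to check that the installation worked.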
C:\Users\Administrator>python
Python 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
2. Upgrade pip to the latest version (the version should be above 20.0)
Enter the following command in cmd or PowerShell:
python -m pip install --upgrade pip
Then check the pip version by entering:
pip --version
C:\Users\Administrator>pip --version
pip 20.2b1 from D:\ruanjian\anaconda202002\lib\site-packages\pip-20.2b1-py3.7.egg\pip (python 3.7)
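The version requirement from this step can also be checked in code by parsing the line that `pip --version` prints. This is a sketch only; `pip_major_minor` is a hypothetical helper, and the sample line is the one shown above.

```python
import re


def pip_major_minor(version_line):
    """Extract (major, minor) from a line such as 'pip 20.2b1 from ...'."""
    match = re.search(r"pip (\d+)\.(\d+)", version_line)
    if match is None:
        raise ValueError("not a pip version line: " + version_line)
    return int(match.group(1)), int(match.group(2))


line = r"pip 20.2b1 from D:\ruanjian\anaconda202002\lib\site-packages\pip-20.2b1-py3.7.egg\pip (python 3.7)"
print(pip_major_minor(line) > (20, 0))  # tuples compare element by element
```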
3. Install the GPU version of TensorFlow; version 2.1 is used here (the 2.2 GPU build has compatibility problems)
pip install tensorflow-gpu==2.1
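Wait for the download and installation to finish (roughly 300+ MB), then start Python from the terminal and check that tensorflow is installed, along with its version number and install path (the output below is from my machine after everything was configured). Enter the following commands one at a time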
python
import tensorflow as tf
tf.__version__
tf.__path__
The output shows that the installation is complete
>>> import tensorflow as tf
2020-05-29 13:49:55.894087: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
>>> tf.__version__
'2.1.0'
>>> tf.__path__
['C:\\Users\\Administrator\\AppData\\Roaming\\Python\\Python37\\site-packages\\tensorflow']
>>>
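Importing TensorFlow takes a while and loads the CUDA libraries, so for a quick "is it installed at all" check it may be enough to ask Python whether the package can be located. The sketch below uses only the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(module_name):
    """True if Python can locate the top-level module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("tensorflow installed:", is_installed("tensorflow"))
```

This only confirms that the package is present in the active environment; it says nothing about whether CUDA and cuDNN are set up correctly.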
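4. Download and install the matching CUDA version; CUDA 10.1 is used here
On the download page select Windows, x64, version 10, and the local installer, click Download (2.5 GB), then run the installer once the download finishes
To check the installation, enter the following command in the terminal: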
deviceQuery
C:\Users\Administrator>deviceQuery
deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce RTX 2060"
CUDA Driver Version / Runtime Version 10.2 / 10.1
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 6144 MBytes (6442450944 bytes)
(30) Multiprocessors, ( 64) CUDA Cores/MP: 1920 CUDA Cores
GPU Max Clock rate: 1755 MHz (1.75 GHz)
Memory Clock rate: 7001 Mhz
Memory Bus Width: 192-bit
L2 Cache Size: 3145728 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: zu bytes
Total amount of shared memory per block: zu bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1024
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: zu bytes
Texture alignment: zu bytes
Concurrent copy and kernel execution: Yes with 3 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
CUDA Device Driver Mode (TCC or WDDM): WDDM (Windows Display Driver Model)
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.1, NumDevs = 1, Device0 = GeForce RTX 2060
Result = PASS
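One detail worth reading off this output is the version pair: the driver reports 10.2 while the runtime is 10.1. That is fine, because the version supported by the driver has to be at least as new as the CUDA runtime. A small sketch of that comparison on the summary line follows; `driver_supports_runtime` is a hypothetical helper, and the sample text is copied from the output above.

```python
import re


def version_after(label, text):
    """Return (major, minor) for a 'label = X.Y' pair inside text."""
    match = re.search(re.escape(label) + r" = (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("missing field: " + label)
    return int(match.group(1)), int(match.group(2))


def driver_supports_runtime(summary_line):
    """True if the driver version is at least the runtime version."""
    driver = version_after("CUDA Driver Version", summary_line)
    runtime = version_after("CUDA Runtime Version", summary_line)
    return driver >= runtime


summary = ("deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, "
           "CUDA Runtime Version = 10.1, NumDevs = 1, Device0 = GeForce RTX 2060")
print(driver_supports_runtime(summary))
```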
You can also check with the nvcc -V command
nvcc -V
C:\Users\Administrator>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:12:52_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.1, V10.1.243
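If the toolkit version is needed inside a script, the release number can be pulled out of this output. The sketch below is an illustration under simple assumptions: `nvcc_release` and `installed_nvcc_release` are hypothetical helpers, and the second one returns None when nvcc is not on PATH.

```python
import re
import shutil
import subprocess


def nvcc_release(text):
    """Extract the 'release X.Y' number from nvcc -V output, or None."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None


def installed_nvcc_release():
    """Run nvcc -V if it is on PATH and return its release number."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "-V"], stdout=subprocess.PIPE,
                            universal_newlines=True)
    return nvcc_release(result.stdout)


print(installed_nvcc_release())
```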
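5. Download cuDNN; cuDNN 7.6 is used here
cuDNN is an add-on to CUDA and is much simpler to install: unpack the downloaded archive and copy its contents over. The steps are:
Download cuDNN 7.6.4 for CUDA 10.1
Then choose the Windows 10 build
After downloading, unpack the archive and copy the three folders /bin, /include and /lib into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1; the files are merged into the existing folders automatically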
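The copy step can also be scripted. The sketch below merges the three folders into a destination without touching files that already exist under other names. `merge_tree` is a hypothetical helper, and the source folder in the commented usage line is an assumption about where the archive was unpacked.

```python
import os
import shutil


def merge_tree(src_root, dst_root, subdirs=("bin", "include", "lib")):
    """Copy every file under src_root/<subdir> to the same relative path in dst_root.

    Files already in dst_root under other names are left alone;
    files with the same name are overwritten.
    """
    copied = []
    for sub in subdirs:
        for current, _dirs, files in os.walk(os.path.join(src_root, sub)):
            rel = os.path.relpath(current, src_root)
            target_dir = os.path.join(dst_root, rel)
            os.makedirs(target_dir, exist_ok=True)
            for name in files:
                shutil.copy2(os.path.join(current, name),
                             os.path.join(target_dir, name))
                copied.append(os.path.join(rel, name))
    return copied


# Hypothetical source folder; the destination is the CUDA directory named above.
# merge_tree(r"D:\downloads\cudnn\cuda",
#            r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1")
```

Writing into C:\Program Files normally requires an elevated (administrator) prompt.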
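6. Environment variables: add the CUDA paths to PATH
Make sure the CUDA directories are all present in the system PATH environment variable, otherwise problems may occur.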
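A quick way to see whether a directory is already on PATH is to compare it against the current entries. The sketch below is an illustration: `missing_from_path` is a hypothetical helper, and only the `bin` folder (where the CUDA and cuDNN DLLs from the previous steps live) is checked. Any further directories are for you to add to the list.

```python
import os


def missing_from_path(required_dirs, path_value=None, sep=";"):
    """Return the required directories that do not appear in PATH.

    sep defaults to ';', the Windows PATH separator.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Windows paths are case-insensitive; drop trailing separators before comparing.
    entries = {p.strip().rstrip("\\/").lower()
               for p in path_value.split(sep) if p.strip()}
    return [d for d in required_dirs if d.rstrip("\\/").lower() not in entries]


cuda_root = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1"
required = [cuda_root + r"\bin"]  # extend with any other directories your setup needs
print(missing_from_path(required))
```

An empty list means every listed directory is already on PATH.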
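That completes the Windows 10 environment setup for the GPU version of TensorFlow!
Testing
1. Check the GPU status
Use the nvidia-smi (NVSMI) command to view the driver version, CUDA version, and other information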
nvidia-smi
C:\Users\Administrator>nvidia-smi
Fri May 29 15:04:54 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 441.22 Driver Version: 441.22 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 2060 WDDM | 00000000:01:00.0 On | N/A |
| 0% 43C P8 7W / 175W | 880MiB / 6144MiB | 2% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1164 C+G Insufficient Permissions N/A |
| 0 4296 C+G C:\Windows\explorer.exe N/A |
| 0 4504 C+G ...al\Google\Chrome\Application\chrome.exe N/A |
| 0 5052 C+G ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 5204 C+G ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A |
| 0 6440 C+G ...hell.Experiences.TextInput.InputApp.exe N/A |
| 0 12332 C+G ...rogram Files\Microsoft VS Code\Code.exe N/A |
| 0 14236 C+G ...oftEdge_8wekyb3d8bbwe\MicrosoftEdge.exe N/A |
| 0 15860 C+G ...rosoft Office\root\Office16\WINWORD.EXE N/A |
+-----------------------------------------------------------------------------+
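The same information is available in a machine-readable form through the query options of nvidia-smi, which is convenient inside scripts. A minimal sketch follows; `parse_gpu_csv` and `query_gpus` are hypothetical helpers, and `query_gpus` returns an empty list when nvidia-smi is not on PATH or fails.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse 'name, driver_version, memory.total' CSV lines into dicts."""
    gpus = []
    for line in text.strip().splitlines():
        name, driver, memory = [field.strip() for field in line.split(",")]
        gpus.append({"name": name, "driver": driver, "memory": memory})
    return gpus


def query_gpus():
    """Ask nvidia-smi for name, driver version, and total memory of each GPU."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version,memory.total",
         "--format=csv,noheader"],
        stdout=subprocess.PIPE, universal_newlines=True)
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


print(query_gpus())
```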
2. Compare GPU and CPU speed: measuring the speed gap between CPU and GPU in TensorFlow
The code is omitted here; the results are as follows:
******************************************************
Comparison at size 1500
******************************************************
----------------------
GPU
-----------------------
Shape: (1500, 1500) Device: /gpu:0
Time taken: 0:00:00.958767
------------------------------
CPU
---------------------------
Shape: (1500, 1500) Device: /cpu:0
Time taken: 0:00:00.601363
******************************************************
Comparison at size 15000
******************************************************
----------------------
GPU
-----------------------
Shape: (15000, 15000) Device: /gpu:0
Time taken: 0:00:02.584088
------------------------------
CPU
---------------------------
Shape: (15000, 15000) Device: /cpu:0
Time taken: 0:00:13.458996
******************************************************
Comparison at size 20000
******************************************************
------------------------------
GPU
---------------------------
1999980200000.0
Shape: (20000, 20000) Device: /gpu:0
Time taken: 0:00:05.113321
----------------------
CPU
-----------------------
2000095700000.0
Shape: (20000, 20000) Device: /cpu:0
Time taken: 0:00:32.852118
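Since the benchmark script itself is not shown, here is a rough sketch of how such a comparison could be written for TensorFlow 2.x. It is not the author's code: it assumes the test multiplies a random square matrix by its transpose and sums the result, which is one plausible reading of the printed shapes and totals. `time_matmul` is a hypothetical name.

```python
import time

try:
    import tensorflow as tf
except ImportError:  # lets the sketch load on machines without TensorFlow
    tf = None


def time_matmul(size, device_name):
    """Multiply a random size x size matrix by its transpose on one device.

    Returns (sum of the product, elapsed seconds), or None without TensorFlow.
    """
    if tf is None:
        return None
    start = time.perf_counter()
    with tf.device(device_name):
        matrix = tf.random.uniform((size, size), minval=0.0, maxval=1.0)
        total = tf.reduce_sum(tf.matmul(matrix, tf.transpose(matrix)))
    value = float(total.numpy())  # .numpy() waits until the result is available
    return value, time.perf_counter() - start


devices = ["/cpu:0"]
if tf is not None and tf.config.list_physical_devices("GPU"):
    devices.insert(0, "/gpu:0")  # only time the GPU when one is present
for device in devices:
    print(device, time_matmul(1500, device))
```

The first GPU call also pays for loading the CUDA libraries, which may account for part of the GPU's slower time at the smallest size.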
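Judging by the run times, the CPU can actually be faster at small problem sizes, while the GPU has a clear advantage at large sizes. So when the training dataset is small, you can skip the GPU and run on the CPU alone by adding the following Python code before importing TensorFlow: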
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'  # hide all GPUs so TensorFlow runs on the CPU
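TensorFlow 2.1 also offers a way to do this from inside the program, via `tf.config.set_visible_devices`. The sketch below is an alternative to the environment variable, not the method used above. `hide_gpus` is a hypothetical wrapper, and the call has to happen before TensorFlow has initialized its devices.

```python
def hide_gpus():
    """Ask TensorFlow to ignore all GPUs; returns True on success."""
    try:
        import tensorflow as tf
    except ImportError:
        return False
    try:
        tf.config.set_visible_devices([], 'GPU')
    except RuntimeError:
        # Raised when the devices were already initialized by earlier TensorFlow calls.
        return False
    return True


print(hide_gpus())
```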