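1. Purpose of the experiment
Develop an operator in the intelligent programming language (BANGC), use it to extend the high-performance library (CNML), and finally integrate it into the programming framework (TensorFlow). The goal is to learn how to extend both the high-performance library and the programming framework, so that readers can design and optimize new operators on DLP hardware for specific application scenarios and keep pace with rapidly evolving intelligent algorithms.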
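2. Background
The compiler toolchain needed for development in the intelligent programming language includes, but is not limited to, CNCC and CNGDB. See the lecture slides (PPT) of the theory course for details.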
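3. Experiment content
a) Operator implementation: implement the PowerDifference operator in the intelligent programming language BANGC (BCL);
b) Operator test: test the PowerDifference operator on its own to verify that it is functionally correct;
c) Framework integration: wrap the PowerDifference operator through the PluginOp interface of the high-performance library so that it is called in the same way as the library's native operators, then integrate the wrapped operator into the TensorFlow programming framework;
d) Framework operator test: use the framework API to test the operator integrated into TensorFlow in the previous step and verify that it is functionally correct.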
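As a point of reference for items a) and b), the sketch below is a NumPy model of what PowerDifference is generally understood to compute in this course: the element-wise difference of two tensors raised to an integer power, (x - y)^p. These notes do not state the formula, so treat it as an assumption; the function name `power_difference` and its argument names are illustrative and are not taken from the course code.

```python
import numpy as np

def power_difference(x, y, power):
    """Reference model: element-wise (x - y) ** power (assumed definition)."""
    x = np.asarray(x, dtype=np.float32)
    y = np.asarray(y, dtype=np.float32)
    if int(power) != power or power < 0:
        raise ValueError("power is assumed to be a non-negative integer")
    diff = x - y  # shapes are combined by NumPy broadcasting
    # Repeated multiplication mirrors how a kernel without a dedicated
    # pow instruction might build up the result (an assumption).
    out = np.ones_like(diff)
    for _ in range(int(power)):
        out = out * diff
    return out

if __name__ == "__main__":
    a = np.array([3.0, 5.0, 2.0])
    b = np.array([1.0, 2.0, 4.0])
    print(power_difference(a, b, 2))  # differences 2, 3, -2 squared
```

A model like this can serve as the trusted result when checking the device kernel on small random inputs.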
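4. Procedure
a) Log in to the cloud platform: ssh root@xxx.xxx.xxx.xxx -p xxx
(Resources are requested on the Cambricon developer platform after logging in.)
b) Initialize the environment: cd /opt/AICSE-demo-student/env; source env.sh
c) Enter the operator directory: cd /opt/AICSE-demo-student/demo/style_transfer_bcl/src/bangc/PluginPowerDifferenceOp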
d) Implement the PowerDifference BANGC operator:
complete plugin_power_difference_kernel.h and plugin_power_difference_kernel.mlu.
e) Test the PowerDifference BANGC operator:
complete powerDiff.cpp, then run ./make.sh
f) CNPlugin integration:
complete plugin_power_difference_op.cc and cnplugin.h, then build a new Cambricon-CNPlugin.
Before building, copy the modified PluginPowerDifferenceOp folder and the header file:
cp -r /opt/AICSE-demo-student/demo/style_transfer_bcl/src/bangc/PluginPowerDifferenceOp /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/pluginops/
cp /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/pluginops/PluginPowerDifferenceOp/cnplugin.h /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/common/include/
Build:
cd /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/
./build_cnplugin.sh --mlu200
This produces a new libcnplugin.so; copy it into the lib64 directory of the neuware environment:
cp /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/build/libcnplugin.so /opt/AICSE-demo-student/env/neuware/lib64/
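Copy the header file as well. As written, the command below repeats the header copy made before the build (same destination under Cambricon-CNPlugin-MLU270/common/include); the destination was presumably meant to be a header directory that the TensorFlow build reads, which these notes do not specify: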
cp /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/pluginops/PluginPowerDifferenceOp/cnplugin.h /opt/AICSE-demo-student/env/Cambricon-CNPlugin-MLU270/common/include/cnplugin.h
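g) TensorFlow operator integration: add the files from the first folder below to the TensorFlow source tree in the second (because of limited course time, this code is provided as-is):
/opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff
/opt/AICSE-demo-student/env/tensorflow-v1.10
Following the README supplied with the course material, copy the required files from the demo project into the TensorFlow source tree: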
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/BUILD /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/core/kernels/BUILD
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/mlu_stream.h /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/stream_executor/mlu/
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/mlu_lib_ops.* /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/stream_executor/mlu/mlu_api/lib_ops/
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/mlu_ops.h /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/stream_executor/mlu/mlu_api/ops/
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/power_difference.cc /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/stream_executor/mlu/mlu_api/ops/
cp -rf /opt/AICSE-demo-student/demo/style_transfer_bcl/src/tf-implementation/tf-add-power-diff/math_ops.cc /opt/AICSE-demo-student/env/tensorflow-v1.10/tensorflow/core/ops/
Build:
cd /opt/AICSE-demo-student/env/tensorflow-v1.10
./build_tensorflow-v1.10_mlu.sh
h) Framework operator test:
complete .../src/online_mlu/power_difference_test_bcl.py
and .../src/online_cpu/power_difference_test_cpu.py,
then run: python power_difference_test_xxx.py
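Steps e) and h) both come down to comparing the operator's output against a trusted result. A minimal sketch of such a check follows, assuming both outputs are available as NumPy arrays. The relative-error metric, the function names `error_rate` and `outputs_match`, and the 1e-3 threshold are illustrative choices, not values taken from the course scripts.

```python
import numpy as np

def error_rate(actual, expected):
    """Sum of absolute differences, normalised by the expected magnitude."""
    actual = np.asarray(actual, dtype=np.float64)
    expected = np.asarray(expected, dtype=np.float64)
    if actual.shape != expected.shape:
        raise ValueError("shape mismatch: %s vs %s" % (actual.shape, expected.shape))
    abs_err = np.sum(np.abs(actual - expected))
    denom = np.sum(np.abs(expected))
    if denom == 0.0:
        # Fall back to the absolute error when the reference is all zeros.
        return float(abs_err)
    return float(abs_err / denom)

def outputs_match(actual, expected, threshold=1e-3):
    """True when the error rate does not exceed the (hypothetical) threshold."""
    return error_rate(actual, expected) <= threshold

if __name__ == "__main__":
    cpu = np.array([4.0, 9.0, 4.0])
    mlu = np.array([4.001, 8.998, 4.0])  # made-up values standing in for device output
    print(error_rate(mlu, cpu), outputs_match(mlu, cpu))
```

A summed relative error tolerates small per-element deviations from reduced-precision arithmetic while still flagging an operator that is wrong overall; a stricter element-wise check such as `np.allclose` is a reasonable alternative.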
5. Pitfalls
The TensorFlow build fails with "socket closed":
lower the number of parallel build jobs by changing job_nums=32 to job_nums=16 in the build script.
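The change above can also be scripted. The sketch below rewrites a `job_nums=<number>` assignment in the text of a shell script. It assumes the variable sits at the start of a line in exactly that form, which these notes suggest but do not show, and the helper name `set_job_nums` is hypothetical.

```python
import re

def set_job_nums(script_text, jobs):
    """Return script_text with each line-leading job_nums=<int> set to jobs."""
    if jobs < 1:
        raise ValueError("jobs must be a positive integer")
    pattern = re.compile(r"^(\s*job_nums=)\d+", flags=re.MULTILINE)
    # A callable replacement avoids ambiguity between the group
    # reference and the digits that follow it.
    return pattern.sub(lambda m: m.group(1) + str(jobs), script_text)

if __name__ == "__main__":
    # Stand-in script text; the real build script's contents are not shown here.
    sample = "#!/bin/bash\njob_nums=32\necho building\n"
    print(set_job_nums(sample, 16))
```

To apply it, read the build script (presumably build_tensorflow-v1.10_mlu.sh) into a string, pass it through `set_job_nums`, and write the result back after saving a backup copy.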