1. Introduction to TensorFlow Lite
TensorFlow Lite is TensorFlow's lightweight solution for mobile and embedded devices. It enables on-device machine learning inference with low latency and a small binary size. TensorFlow Lite also supports hardware acceleration through the Android Neural Networks API (Android SDK 27 and above).
2. TensorFlow Lite Architecture
The diagram below shows the architectural design of TensorFlow Lite:
First, the TensorFlow model (`.pb`) is converted into the TensorFlow Lite file format (`.tflite`) using the TensorFlow Lite converter. The converted model file can then be deployed in a mobile application.
The deployed TensorFlow Lite model file is used via the following (a minimal invocation sketch follows the list):
- Java API: a convenience wrapper around the C++ API on Android.
- C++ API: loads the TensorFlow Lite model file and invokes the interpreter. The same library is available on both Android and iOS.
- Interpreter: executes the model using a set of kernels and supports selective kernel loading. It is only about 100 KB without any kernels, and about 300 KB with all kernels loaded, a significant reduction from the 1.5 MB required by TensorFlow Mobile.
- On supported Android devices, the interpreter uses the Android Neural Networks API for hardware acceleration; if the API is unavailable, it falls back to CPU execution.
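On device, the model is invoked through the Java or C++ API described above; for quickly sanity-checking a converted model on a desktop, TensorFlow also exposes the interpreter through Python bindings. A minimal sketch, assuming a converted file named `model.tflite` (a placeholder path) with a single float input:

```python
import numpy as np
import tensorflow as tf  # on older 1.x releases the interpreter lives under tf.contrib.lite

# Load the converted FlatBuffer and allocate tensors for the shape baked in at conversion time.
interpreter = tf.lite.Interpreter(model_path="model.tflite")  # placeholder path
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy image matching the converted input shape, run the model, read the result.
input_shape = input_details[0]["shape"]
dummy_input = np.random.random_sample(input_shape).astype(np.float32)
interpreter.set_tensor(input_details[0]["index"], dummy_input)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]["index"])
print(output.shape)
```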
3. Problems Encountered
- When converting the model with toco, the conversion tool requires the model's input size to be fixed, whereas the style transfer model requires the input image to be of arbitrary size, with the output image the same size as the input. TensorFlow Lite cannot satisfy this requirement directly.
- TensorFlow Lite includes only a small subset of the essential TensorFlow operators. Many operations used in the style transfer model have no ready-made implementation, and writing custom operators in C++ by hand is too difficult.
4. References Consulted
4.1 Feature request : Allow variable (None) input_shape for Tflite toco #21440
gargn commented on 17 Aug
@Kayoku It's currently not feasible to get TOCO to support this. A suggested work around is to put an arbitrary size into TOCO and use ResizeInputTensor when calling the TFLite interpreter.
miaout17 commented on 25 Aug
I skimmed over the paper and I think it should handle variable input size as you said.
Assuming the network architecture can really handle arbitrary input size (I skimmed over the paper and I think it does), could you try this and let us know if it works:
- When converting, use an arbitrary input size like (1, 512, 512, 3).
- When using the interpreter, call `interpreter->ResizeInputTensor` to resize the input tensor before calling `interpreter->Invoke`.
Theoretically it should do the trick. Let us know if it works for your case.
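The suggestion above is phrased in terms of the C++ interpreter; the Python bindings expose the same workaround as `resize_tensor_input`. A sketch under those assumptions (placeholder file name, and an input fixed to (1, 512, 512, 3) at conversion time); whether `Invoke` then succeeds still depends on every operator in the graph tolerating the new shape:

```python
import numpy as np
import tensorflow as tf  # older 1.x releases: tf.contrib.lite.Interpreter

interpreter = tf.lite.Interpreter(model_path="style_transfer.tflite")  # placeholder path
input_details = interpreter.get_input_details()

# The model was converted with a fixed (1, 512, 512, 3) input; resize the input
# tensor to the actual image size before allocating tensors and invoking.
image = np.random.random_sample((1, 720, 960, 3)).astype(np.float32)
interpreter.resize_tensor_input(input_details[0]["index"], list(image.shape))
interpreter.allocate_tensors()

interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()
stylized = interpreter.get_tensor(interpreter.get_output_details()[0]["index"])
```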
4.2 TensorFlow Lite Guide
Each output should be an array or multi-dimensional array of the supported primitive types, or a ByteBuffer of the appropriate size. Note that some models have dynamic outputs, where the shape of output tensors can vary depending on the input. There's no straightforward way of handling this with the existing Java inference API, but planned extensions will make this possible.
5. Model Conversion
The model generated (or downloaded) in the previous step is a standard TensorFlow model and you should now have a .pb or .pbtxt `tf.GraphDef` file. Models generated with transfer learning (re-training) or custom models must be converted, but we must first freeze the graph to convert the model to the TensorFlow Lite format. This process uses several model formats:
- `tf.GraphDef` (.pb): a protobuf that represents the TensorFlow training or computation graph. It contains operators, tensors, and variables definitions.
- CheckPoint (.ckpt): serialized variables from a TensorFlow graph. Since this does not contain a graph structure, it cannot be interpreted by itself.
- `FrozenGraphDef`: a subclass of `GraphDef` that does not contain variables. A `GraphDef` can be converted to a `FrozenGraphDef` by taking a CheckPoint and a `GraphDef`, and converting each variable into a constant using the value retrieved from the CheckPoint.
- `SavedModel`: a `GraphDef` and CheckPoint with a signature that labels input and output arguments to a model. A `GraphDef` and CheckPoint can be extracted from a `SavedModel`.
- TensorFlow Lite model (.tflite): a serialized FlatBuffer that contains TensorFlow Lite operators and tensors for the TensorFlow Lite interpreter, similar to a `FrozenGraphDef`.
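To make the `FrozenGraphDef` step concrete, here is a minimal freezing sketch in Python for a TF 1.x graph. The checkpoint path and output node name are hypothetical; the stock freeze_graph.py script performs the same steps:

```python
import tensorflow as tf

# Hypothetical names; substitute the real checkpoint and output node of your model.
CKPT_PATH = "style_transfer.ckpt"
OUTPUT_NODE = "output_image"

with tf.Session(graph=tf.Graph()) as sess:
    # Rebuild the graph structure and restore the trained variables from the checkpoint.
    saver = tf.train.import_meta_graph(CKPT_PATH + ".meta")
    saver.restore(sess, CKPT_PATH)

    # Convert every variable into a constant holding its checkpointed value,
    # yielding a GraphDef with no variables (a "frozen" graph).
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, [OUTPUT_NODE])

with tf.gfile.GFile("frozen_graph.pb", "wb") as f:
    f.write(frozen_graph_def.SerializeToString())
```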
After a TensorFlow model is trained, the TensorFlow Lite converter uses that model to generate a TensorFlow Lite FlatBuffer file (`.tflite`). The converter supports as input: SavedModels, frozen graphs (models generated by `freeze_graph.py`), and `tf.keras` models. The TensorFlow Lite FlatBuffer file is deployed to a client device (generally a mobile or embedded device), and the TensorFlow Lite interpreter uses the compressed model for on-device inference. This conversion process is shown in the diagram below:
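For reference, a minimal conversion sketch in Python under the TF 1.x API, assuming the frozen graph produced above and hypothetical input/output tensor names. Note that a fixed input shape must be supplied here, which is exactly the limitation described in section 3 (on releases around 1.9 to 1.11 the converter lives under `tf.contrib.lite`):

```python
import tensorflow as tf

# Hypothetical tensor names; replace with the model's real input and output names.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="frozen_graph.pb",
    input_arrays=["input_image"],
    output_arrays=["output_image"],
    # TOCO requires a fixed input shape; an arbitrary placeholder size is used here
    # and resized at inference time via ResizeInputTensor / resize_tensor_input.
    input_shapes={"input_image": [1, 512, 512, 3]})

tflite_model = converter.convert()
with open("style_transfer.tflite", "wb") as f:
    f.write(tflite_model)
```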