<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ONNX &#8211; 天地一沙鸥</title>
	<atom:link href="https://haoluobo.com/tag/onnx/feed/" rel="self" type="application/rss+xml" />
	<link>https://haoluobo.com</link>
	<description>to be continued....</description>
	<lastBuildDate>Thu, 16 Dec 2021 02:05:16 +0000</lastBuildDate>
	<language>zh-Hans</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8.2</generator>
	<item>
		<title>Converting OpenVINO Pre-trained Models to ONNX and Optimizing Them with TVM</title>
		<link>https://haoluobo.com/2021/04/openvino-onnx-tvm/</link>
		
		<dc:creator><![CDATA[vicalloy]]></dc:creator>
		<pubDate>Thu, 22 Apr 2021 13:17:21 +0000</pubDate>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[ONNX]]></category>
		<category><![CDATA[OpenVINO]]></category>
		<category><![CDATA[TVM]]></category>
		<guid isPermaLink="false">/?p=11868</guid>

					<description><![CDATA[OpenVINO is a deep learning toolkit from Intel. It ships with a large number of pre-trained mod [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p>OpenVINO is a deep learning toolkit from Intel. It ships with a large collection of pre-trained models that let you build your own AI applications quickly.</p>



<p>Being an Intel product, it is of course closely tied to Intel's platforms. OpenVINO is optimized primarily for Intel CPUs; it does support GPUs, but only Intel's own, and those are nothing to hold out much hope for.</p>



<p>To support a broader range of hardware, you can convert OpenVINO's bundled pre-trained models to the ONNX format and continue processing from there.</p>



<h2 class="wp-block-heading">Exporting OpenVINO Models to ONNX</h2>



<p>Pre-trained models already optimized by OpenVINO cannot be converted to ONNX directly. Fortunately, Intel provides training and export tooling for these models, so the ONNX model can be exported with OpenVINO's training tools.</p>



<p>The OpenVINO library for training and export is <a href="https://github.com/openvinotoolkit/training_extensions" target="_blank" rel="noreferrer noopener">https://github.com/openvinotoolkit/training_extensions</a>.</p>



<p>See the project's documentation for the detailed procedure.</p>



<p>Following the face-detection documentation, export the corresponding ONNX model: <a href="https://github.com/openvinotoolkit/training_extensions/tree/develop/models/object_detection/model_templates/face-detection" target="_blank" rel="noreferrer noopener">https://github.com/openvinotoolkit/training_extensions/tree/develop/models/object_detection/model_templates/face-detection</a></p>



<p>Note: the export directory contains two variants, <code>export/</code> and <code>export/alt_ssd_export/</code>. The <code>export/alt_ssd_export/</code> variant contains OpenVINO-specific implementations and fails when converted for other inference engines, so the rest of this post uses the model under <code>export/</code>.</p>
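<p>Before handing the exported file to another toolchain, it is worth validating its structure and confirming the input names and shapes. Below is a minimal sketch; the <code>summarize_inputs</code> helper is my own, and the one-node demo graph is a stand-in for the real exported file, which you would instead load with <code>onnx.load("face-detection-0200.onnx")</code>.</p>

<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
import onnx
from onnx import helper, TensorProto

def summarize_inputs(model):
    """Validate the model, then return its graph inputs as (name, dims) pairs."""
    onnx.checker.check_model(model)  # raises ValidationError if the graph is malformed
    inputs = []
    for inp in model.graph.input:
        dims = [d.dim_value or d.dim_param for d in inp.type.tensor_type.shape.dim]
        inputs.append((inp.name, dims))
    return inputs

# self-contained demo: a one-node Identity graph with a 1x3x256x256 "image"
# input, mimicking the face-detection model's interface
x = helper.make_tensor_value_info("image", TensorProto.FLOAT, [1, 3, 256, 256])
y = helper.make_tensor_value_info("out", TensorProto.FLOAT, [1, 3, 256, 256])
graph = helper.make_graph([helper.make_node("Identity", ["image"], ["out"])],
                          "demo", [x], [y])
model = helper.make_model(graph)
print(summarize_inputs(model))  # [('image', [1, 3, 256, 256])]
</pre></div>

<p>A dynamic dimension shows up here as a string (<code>dim_param</code>) rather than an integer, which is a quick way to spot the dynamic shapes discussed below.</p>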



<h2 class="wp-block-heading">Optimizing the ONNX Model with TVM</h2>



<h3 class="wp-block-heading">Optimizing for TVM's VM</h3>



<p>TVM cannot compile models that contain dynamic shapes, and unfortunately all of OpenVINO's object-detection models do. When a model cannot be compiled, you can run it with TVM's Virtual Machine (VM) instead.</p>



<ul class="wp-block-list"><li>Notes:<ul><li>For background on the VM, see <a href="https://tvm.apache.org/docs/dev/virtual_machine.html" target="_blank" rel="noreferrer noopener">https://tvm.apache.org/docs/dev/virtual_machine.html</a></li><li>TVM's documentation is thin (especially around the VM), but the project is iterating quickly and issues get answered promptly.</li><li>In my tests, TVM in VM mode on the CPU was noticeably slower than <code>ONNXRuntime</code>; I am not sure whether running inside a virtual machine was a factor.</li></ul></li></ul>
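<p>To put numbers on that comparison, a small timing helper is enough. The sketch below is self-contained; the commented-out ONNXRuntime and TVM calls show how it would be applied, assuming <code>onnxruntime</code> is installed and the model and VM from the surrounding sections are available.</p>

<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
import time
import numpy as np

def bench(fn, warmup=5, runs=50):
    """Return the average wall-clock latency of fn() in milliseconds."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs * 1000.0

data = np.random.uniform(size=(1, 3, 256, 256)).astype("float32")

# ONNXRuntime baseline:
#   import onnxruntime as ort
#   sess = ort.InferenceSession("face-detection-0200.onnx")
#   print("ort ms:", bench(lambda: sess.run(None, {"image": data})))
# TVM VM candidate (vm is the VirtualMachine built below):
#   print("tvm ms:", bench(lambda: vm.run(data)))

print("baseline ms:", round(bench(lambda: data.sum()), 3))
</pre></div>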


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
import onnx
import numpy as np
import tvm
import tvm.relay as relay

target = 'llvm -mcpu=skylake'
model_path = 'face-detection-0200.onnx'

# import the ONNX model into Relay, pinning the input shape
onnx_model = onnx.load(model_path)
shape = [1, 3, 256, 256]
input_name = "image"
shape_dict = {input_name: shape}
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

# replace dynamic ops with static ones where possible, and inspect the result
print(relay.transform.DynamicToStatic()(mod))

# compile for the Relay VM, which works even if dynamic shapes remain
with tvm.transform.PassContext(opt_level=3):
    executable = relay.vm.compile(mod, target=target, params=params)

# persist the VM bytecode and the compiled kernel library
code, lib = executable.save()
with open("code.ro", "wb") as fo:
    fo.write(code)
lib.export_library("lib.so")
</pre></div>


<h3 class="wp-block-heading">Compiling and Optimizing Directly with TVM</h3>



<p>If your model compiles without trouble, there is no need for VM mode; direct compilation should, in theory, optimize much better. The example here uses the image-classification model from TVM's tutorials.</p>



<p>For a complete walkthrough of optimizing and running a model, see the official documentation: <a href="https://tvm.apache.org/docs/tutorials/get_started/auto_tuning_with_python.html#sphx-glr-tutorials-get-started-auto-tuning-with-python-py" data-type="URL" data-id="https://tvm.apache.org/docs/tutorials/get_started/auto_tuning_with_python.html#sphx-glr-tutorials-get-started-auto-tuning-with-python-py" target="_blank" rel="noreferrer noopener">Compiling and Optimizing a Model with the Python AutoScheduler</a></p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
import onnx
import tvm
import tvm.relay as relay

target = 'llvm'
model_name = 'mobilenetv2'
model_path = f'{model_name}.onnx'

# import the ONNX model into Relay
onnx_model = onnx.load(model_path)
mod, params = relay.frontend.from_onnx(onnx_model)

# static compilation: produces the graph JSON, the kernel library and the weights
with relay.build_config(opt_level=3):
    graph, lib, params = relay.build(mod, target, params=params)

lib.export_library(f"./{model_name}.so")
with open(f"./{model_name}.json", "w") as fo:
    fo.write(graph)
with open(f"./{model_name}.params", "wb") as fo:
    fo.write(relay.save_param_dict(params))
</pre></div>


<h3 class="wp-block-heading">Loading and Running the Optimized Model in VM Mode</h3>



<p>Load the model exported above and run it.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
import numpy as np
import tvm

def vmobj_to_array(o, dtype=np.float32):
    """Flatten a VM result (NDArray or nested ADT) into a list of numpy arrays."""
    if isinstance(o, tvm.nd.NDArray):
        return [o.asnumpy()]
    elif isinstance(o, tvm.runtime.container.ADT):
        result = []
        for f in o:
            result.extend(vmobj_to_array(f, dtype))
        return result
    else:
        raise RuntimeError("Unknown object type: %s" % type(o))

# load the code.ro / lib.so files saved by the VM export step above
loaded_lib = tvm.runtime.load_module("lib.so")
loaded_code = bytearray(open("code.ro", "rb").read())
exe = tvm.runtime.vm.Executable.load_exec(loaded_code, loaded_lib)

ctx = tvm.cpu()
vm = tvm.runtime.vm.VirtualMachine(exe, ctx)

# feed a random image of the shape the model was imported with
shape = [1, 3, 256, 256]
data = np.random.uniform(size=shape).astype("float32")
out = vm.run(data)
out = vmobj_to_array(out)
print(out)
</pre></div>]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
