##Install protobuf

Follow the build instructions at https://github.com/protocolbuffers/protobuf/blob/master/src/README.md

##Install OpenBLAS


git clone https://github.com/xianyi/OpenBLAS

cd OpenBLAS

sudo make FC=gfortran

(If gfortran is not installed, run sudo apt-get install gfortran first.)

sudo make install # add the library install path /opt/OpenBLAS/lib

ln /opt/OpenBLAS/lib/libopenblas_haswellp-r0.3.7.dev.so /usr/lib/libopenblas.so # this step helped

cp /opt/OpenBLAS/include/* /usr/local/include/ 

##ARM Tengine install


cd examples

mkdir build 

Replace CMakeLists.txt, then run cmake:


/data/edge/tengine/tengine/examples/build# cmake .. -DTENGINE_DIR=/data/edge/tengine/tengine -DCMAKE_BUILD_TYPE=Debug --trace 

cd ../

make

doc: https://github.com/OAID/Tengine/blob/master/doc/operator_ir.md 

###Tengine is composed of six modules: 

core/operator/serializer/executor/driver/wrapper.

###core: provides the basic components and functionalities of the system.

###operator: defines the schema of operators, such as convolution, relu, and pooling. The currently supported operators are given in the operator list below.

###serializer: loads the saved model. The serializer framework is extensible to support different formats, including custom ones. Caffe/ONNX/Tensorflow/MXNet and Tengine models can be loaded directly by Tengine.

###executor: implements the code that runs the graph and its operators. The current version provides a highly optimized implementation for multiple A72 cores.

###driver: is the adapter for the real hardware and provides services to the device executor through the HAL API. A single driver can create multiple devices.

###wrapper: provides API wrappers for different frameworks. Both the Caffe API wrapper and the Tensorflow API wrapper work now.

##Supported Operator List

BatchNorm Concat ConstOp Convolution Deconvolution Detection_output Dropout Eltwise Flatten Fully_connected Input_op LRN LSTM Normalize Permute Pooling Priorbox PReLu Region Resize Reorg Reshape ReLu RPN Roi_pooling Scale Slice Softmax 

###build example 


export DEBUG_G=1 

/data/edge/tengine/tengine/examples/mtcnn/build# cmake .. -DTENGINE_DIR=/data/edge/tengine/tengine --trace  



add_definitions(-Wall)
add_definitions(-fPIC)
add_definitions(-g)
add_definitions(-O3)
add_definitions(-funroll-loops)
add_definitions(-Wno-deprecated-register)
add_compile_options($<$<COMPILE_LANGUAGE:CXX>:-std=c++11>)


####OpenBLAS version:

./executor/operator/common/blas/conv_2d_blas.cpp: if(!NodeOpsRegistryManager::RegisterOPImplementor("common", "Convolution", ConvolutionImpl::SelectFunc,

####Assembly general convolution version:

./executor/operator/arm64/conv/conv_2d_fast.cpp: NodeOpsRegistryManager::RegisterOPImplementor("arm64", "Convolution", conv_fast::SelectFunc,

####Assembly depthwise separable convolution version:

./executor/operator/arm64/conv/conv_2d_dw.cpp: NodeOpsRegistryManager::RegisterOPImplementor("arm64", "Convolution", conv_2d_dw::SelectFunc,

####CPU C-language version:

./executor/operator/ref/ref_convolution.cpp: NodeOpsRegistryManager::RegisterOPImplementor(REF_REGISTRY_NAME, "Convolution", RefConvolutionOps::SelectFunc,
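The four registrations above all target the same operator name, "Convolution", under different registry names. A minimal sketch of how such a registry could pick one implementation per node follows. It is not Tengine's actual code: `ImplRegistry`, `NodeInfo`, the `priority` argument, the acceptance rules inside the lambdas, and the registry name "reference" (standing in for the unknown value of `REF_REGISTRY_NAME`) are all assumptions made for illustration, since the calls above are truncated after the select function.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node description; not a Tengine type.
struct NodeInfo {
    int group;        // number of convolution groups
    int in_channels;  // number of input channels
};

// A select function returns an implementation name, or "" if it cannot handle the node.
using SelectFunc = std::function<std::string(const NodeInfo&)>;

class ImplRegistry {
    struct Entry {
        int priority;          // assumed ranking: lower values are tried first
        std::string registry;  // e.g. "arm64" or "common"
        SelectFunc fn;
    };
    std::map<std::string, std::vector<Entry>> table_;  // operator name -> candidates

public:
    void Register(const std::string& registry, const std::string& op,
                  SelectFunc fn, int priority) {
        assert(fn && "select function must be callable");
        std::vector<Entry>& entries = table_[op];
        entries.push_back(Entry{priority, registry, std::move(fn)});
        // Keep candidates ordered so Select can simply walk the list.
        std::stable_sort(entries.begin(), entries.end(),
                         [](const Entry& a, const Entry& b) { return a.priority < b.priority; });
    }

    // Returns "registry/implementation" for the first candidate that accepts the node.
    std::string Select(const std::string& op, const NodeInfo& node) const {
        auto it = table_.find(op);
        if (it == table_.end()) return "";
        for (const Entry& e : it->second) {
            std::string impl = e.fn(node);
            if (!impl.empty()) return e.registry + "/" + impl;
        }
        return "";
    }
};

// Mirrors the four registrations listed above, with made-up acceptance rules.
ImplRegistry BuildDemoRegistry() {
    ImplRegistry r;
    r.Register("arm64", "Convolution", [](const NodeInfo& n) {
        bool depthwise = n.group > 1 && n.group == n.in_channels;
        return depthwise ? std::string("conv_2d_dw") : std::string();
    }, 10);
    r.Register("arm64", "Convolution", [](const NodeInfo& n) {
        return n.group == 1 ? std::string("conv_2d_fast") : std::string();
    }, 20);
    r.Register("common", "Convolution", [](const NodeInfo&) {
        return std::string("conv_2d_blas");
    }, 30);
    r.Register("reference", "Convolution", [](const NodeInfo&) {
        return std::string("ref_convolution");
    }, 100);
    return r;
}
```

With this sketch, `BuildDemoRegistry().Select("Convolution", NodeInfo{32, 32})` goes to the depthwise candidate, while a node with `group == 1` goes to the general assembly candidate. Candidates that accept every node, such as the OpenBLAS and reference entries here, act as fallbacks.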

##Front end: importing other model formats

The serializer module loads the whole model file stored on disk (for example a TensorFlow model) and creates Tengine's in-memory IR, the StaticGraph. The serializer module can also store the StaticGraph to disk in a specific format. However, the current version of this document describes only the loading process, which is more important than the storing process.

###Front-end serialization

Load interface:

unsigned int GetFileNum(void); returns how many files the model consists of.

bool LoadModel(const std::vector<std::string>& file_list, StaticGraph * static_graph); converts the model files into a StaticGraph.

###gatherV2 ops

gather fetches the values at selected positions of a tensor. See https://baijiahao.baidu.com/s?id=1602069319915188130&wfr=spider&for=pc
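As a rough illustration of what gather does, the sketch below picks whole rows out of a row-major 2-D tensor by index. It covers only the simplest case (gathering along axis 0 of a 2-D tensor); `GatherRows` is a hypothetical helper written for this note, not a Tengine or TensorFlow function.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Gather along axis 0: output row i is a copy of input row indices[i].
// `data` holds a rows x cols tensor in row-major order.
std::vector<float> GatherRows(const std::vector<float>& data,
                              std::size_t rows, std::size_t cols,
                              const std::vector<int>& indices) {
    assert(data.size() == rows * cols && "shape must match the data size");
    std::vector<float> out;
    out.reserve(indices.size() * cols);
    for (int idx : indices) {
        if (idx < 0 || static_cast<std::size_t>(idx) >= rows) {
            throw std::out_of_range("gather index out of range");
        }
        const std::ptrdiff_t start =
            static_cast<std::ptrdiff_t>(idx) * static_cast<std::ptrdiff_t>(cols);
        const std::ptrdiff_t stop = start + static_cast<std::ptrdiff_t>(cols);
        // Append the selected row to the output.
        out.insert(out.end(), data.begin() + start, data.begin() + stop);
    }
    return out;
}
```

For a 3 x 2 tensor holding 1..6 and indices {2, 0}, the result is the third row followed by the first row. The same index may appear more than once, in which case the row is copied again.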

###Currently supported TensorFlow ops:

p_tf->RegisterOpLoadMethod("AvgPool", op_load_t(LoadPool));
p_tf->RegisterOpLoadMethod("MaxPool", op_load_t(LoadPool));
p_tf->RegisterOpLoadMethod("Conv2D", op_load_t(LoadConv2D));
p_tf->RegisterOpLoadMethod("DepthwiseConv2dNative", op_load_t(LoadConv2D));
p_tf->RegisterOpLoadMethod("FusedBatchNorm", op_load_t(LoadBatchNorm));
p_tf->RegisterOpLoadMethod("Relu6", op_load_t(LoadRelu6));
p_tf->RegisterOpLoadMethod("Relu", op_load_t(LoadRelu));
p_tf->RegisterOpLoadMethod("Softmax", op_load_t(LoadSoftmax));
p_tf->RegisterOpLoadMethod("ConcatV2", op_load_t(LoadConcat));
p_tf->RegisterOpLoadMethod("Add", op_load_t(LoadEltwise));
p_tf->RegisterOpLoadMethod("Sub", op_load_t(LoadEltwise));
p_tf->RegisterOpLoadMethod("Mul", op_load_t(LoadEltwise));
p_tf->RegisterOpLoadMethod("Minimum", op_load_t(LoadEltwise));
p_tf->RegisterOpLoadMethod("Rsqrt", op_load_t(LoadEltwise));
p_tf->RegisterOpLoadMethod("ResizeNearestNeighbor", op_load_t(LoadResize));
p_tf->RegisterOpLoadMethod("ComposedBN", op_load_t(LoadComposedBN));
p_tf->RegisterOpLoadMethod("Reshape", op_load_t(LoadReshape));
p_tf->RegisterOpLoadMethod("MatMul", op_load_t(LoadGemm));
p_tf->RegisterOpLoadMethod("AddN", op_load_t(LoadEltwise));
p_tf->RegisterOpLoadMethod("FIFOQueueV2", op_load_t(LoadFIFOQueue));
p_tf->RegisterOpLoadMethod("Mean", op_load_t(LoadMean));
p_tf->RegisterOpLoadMethod("DecodeWav", op_load_t(LoadGeneric));
p_tf->RegisterOpLoadMethod("AudioSpectrogram", op_load_t(LoadGeneric));
p_tf->RegisterOpLoadMethod("Mfcc", op_load_t(LoadGeneric));
p_tf->RegisterOpLoadMethod("LSTM", op_load_t(LoadLSTM));
p_tf->RegisterOpLoadMethod("RNN", op_load_t(LoadRNN));
p_tf->RegisterOpLoadMethod("GRU", op_load_t(LoadGRU));
p_tf->RegisterOpLoadMethod("StridedSlice", op_load_t(LoadStridedSlice));
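The registrations above map a TensorFlow op name to a loader, and several names share one loader (for example Add, Sub and Mul all go to LoadEltwise). One plausible way to picture the dispatch is sketched below. Only the names `RegisterOpLoadMethod`, `op_load_t`, `StaticGraph`, `LoadPool` and `LoadEltwise` come from these notes; their signatures and bodies here, along with `TfNode`, `TfSerializerSketch`, `LoadNodes` and `MakeSketch`, are assumptions for illustration and may differ from the real Tengine code.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real types.
struct StaticGraph {
    std::vector<std::string> ops;  // converted operators, in load order
};
struct TfNode {
    std::string name;  // node name in the TensorFlow graph
    std::string op;    // TensorFlow op type, e.g. "MaxPool"
};

// Assumed loader signature: convert one node and append it to the graph.
using op_load_t = std::function<bool(StaticGraph*, const TfNode&)>;

class TfSerializerSketch {
public:
    void RegisterOpLoadMethod(const std::string& op_name, op_load_t loader) {
        loaders_[op_name] = std::move(loader);
    }

    // Convert every node; stop at the first op that has no registered loader.
    bool LoadNodes(const std::vector<TfNode>& nodes, StaticGraph* graph) const {
        assert(graph != nullptr);
        for (const TfNode& node : nodes) {
            auto it = loaders_.find(node.op);
            if (it == loaders_.end()) return false;  // unsupported op
            if (!it->second(graph, node)) return false;
        }
        return true;
    }

private:
    std::map<std::string, op_load_t> loaders_;
};

// Toy loaders: each records which internal operator the node became.
bool LoadPool(StaticGraph* g, const TfNode& n) {
    g->ops.push_back("Pooling:" + n.name);
    return true;
}
bool LoadEltwise(StaticGraph* g, const TfNode& n) {
    g->ops.push_back("Eltwise:" + n.name);
    return true;
}

TfSerializerSketch MakeSketch() {
    TfSerializerSketch s;
    s.RegisterOpLoadMethod("AvgPool", op_load_t(LoadPool));
    s.RegisterOpLoadMethod("MaxPool", op_load_t(LoadPool));
    s.RegisterOpLoadMethod("Add", op_load_t(LoadEltwise));
    s.RegisterOpLoadMethod("Sub", op_load_t(LoadEltwise));
    return s;
}
```

In this sketch a model containing an op that is missing from the table makes `LoadNodes` return false, which is one way to read the list above: a TensorFlow graph can only be imported if every op type it uses has a registered loader.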



