[1] A. Araujo, W. Norris, and J. Sim, “Computing Receptive Fields of
Convolutional Neural Networks,” Distill, 4(11), 2019, e21.
[2] F. Chollet, “Xception: Deep Learning with Depthwise Separable Convolutions,” in Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition, 2017, 1251–8.
[3] A. Gural and B. Murmann, “Memory-Optimal Direct Convolutions
for Maximizing Classification Accuracy in Embedded Applications,” in
International Conference on Machine Learning, 2019, 2515–24.
[4] S. Han, H. Mao, and W. J. Dally, “Deep Compression: Compressing
Deep Neural Networks with Pruning, Trained Quantization and Huffman
Coding,” arXiv preprint arXiv:1510.00149, 2015.
[5] S. Han, J. Pool, J. Tran, and W. Dally, “Learning Both Weights and Connections for Efficient Neural Network,” Advances in Neural Information
Processing Systems, 28, 2015, 1135–43.
[6] G. Hinton, O. Vinyals, and J. Dean, “Distilling the Knowledge in a
Neural Network,” arXiv preprint arXiv:1503.02531, 2015.
[7] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand,
M. Andreetto, and H. Adam, “MobileNets: Efficient Convolutional Neural
Networks for Mobile Vision Applications,” arXiv preprint arXiv:1704.04861,
2017.
[8] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally,
and K. Keutzer, “SqueezeNet: AlexNet-Level Accuracy with 50× Fewer
Parameters and <0.5 MB Model Size,” arXiv preprint arXiv:1602.07360,
2016.
[9] J. Jin, A. Dundar, and E. Culurciello, “Flattened Convolutional Neural
Networks for Feedforward Acceleration,” arXiv preprint arXiv:1412.5474,
2014.
[10] R. Krishnamoorthi, “Quantizing Deep Convolutional Networks for Efficient Inference: A Whitepaper,” arXiv preprint arXiv:1806.08342, 2018.
[11] J. Lee and Y. Pisarchyk, “Optimizing TensorFlow Lite Runtime Memory,”
2020, url: https://blog.tensorflow.org/2020/10/optimizing-tensorflow-lite-runtime.html.
[12] N. Ma, X. Zhang, H.-T. Zheng, and J. Sun, “ShuffleNet V2: Practical
Guidelines for Efficient CNN Architecture Design,” in Proceedings of the
European Conference on Computer Vision (ECCV), 2018, 116–31.
[13] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T.
Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., “PyTorch: An Imperative Style, High-Performance Deep Learning Library,” arXiv preprint
arXiv:1912.01703, 2019.
[14] Y. Pisarchyk and J. Lee, “Efficient Memory Management for Deep Neural
Net Inference,” arXiv preprint arXiv:2001.03288, 2020.
[15] C. F. Sabottke and B. M. Spieler, “The Effect of Image Resolution on
Deep Learning in Radiography,” Radiology: Artificial Intelligence, 2(1),
2020, e190015.
[16] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted Residuals and Linear Bottlenecks,” in Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition,
2018, 4510–20.
[17] X. Tao, D. Zhang, W. Ma, X. Liu, and D. Xu, “Automatic Metallic
Surface Defect Detection and Recognition with Convolutional Neural
Networks,” Applied Sciences, 8(9), 2018, 1575.
[18] T. Wang, Y. Chen, M. Qiao, and H. Snoussi, “A Fast and Robust
Convolutional Neural Network-Based Defect Detection Model in Product
Quality Control,” The International Journal of Advanced Manufacturing
Technology, 94(9), 2018, 3465–71.
[19] Y. Wen, A. Anderson, V. Radu, M. F. O’Boyle, and D. Gregg, “TASO:
Time and Space Optimization for Memory-Constrained DNN Inference,”
in 2020 IEEE 32nd International Symposium on Computer Architecture
and High Performance Computing (SBAC-PAD), 2020, 199–208.
[20] S. Wu, M. Zhang, G. Chen, and K. Chen, “A New Approach to Compute
CNNs for Extremely Large Images,” in Proceedings of the 2017 ACM on
Conference on Information and Knowledge Management, 2017, 39–48.
[21] X. Zhang, X. Zhou, M. Lin, and J. Sun, “ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices,” in Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition,
2018, 6848–56.
...