[1] A. Araujo, W. Norris, and J. Sim, “Computing Receptive Fields of
Convolutional Neural Networks,” Distill, 4(11), 2019, e21.
[2] F. Chollet, “Xception: Deep Learning with Depthwise Separable Convolutions,” in Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition, 2017, 1251–8.
[3] A. Gural and B. Murmann, “Memory-Optimal Direct Convolutions
for Maximizing Classification Accuracy in Embedded Applications,” in
International Conference on Machine Learning, 2019, 2515–24.
[4] S. Han, H. Mao, and W. J. Dally, “Deep Compression: Compressing
Deep Neural Networks with Pruning, Trained Quantization and Huffman
Coding,” arXiv preprint arXiv:1510.00149, 2015.
[5] S. Han, J. Pool, J. Tran, and W. Dally, “Learning Both Weights and Connections for Efficient Neural Network,” Advances in Neural Information
Processing Systems, 28, 2015, 1135–43.
[6] G. Hinton, O. Vinyals, and J. Dean, “Distilling the Knowledge in a
Neural Network,” arXiv preprint arXiv:1503.02531, 2015.
[7] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand,
M. Andreetto, and H. Adam, “MobileNets: Efficient Convolutional Neural
Networks for Mobile Vision Applications,” arXiv preprint arXiv:1704.04861,
2017.
[8] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally,
and K. Keutzer, “SqueezeNet: AlexNet-Level Accuracy with 50× Fewer
Parameters and <0.5 MB Model Size,” arXiv preprint arXiv:1602.07360,
2016.
[9] J. Jin, A. Dundar, and E. Culurciello, “Flattened Convolutional Neural
Networks for Feedforward Acceleration,” arXiv preprint arXiv:1412.5474,
2014.
[10] R. Krishnamoorthi, “Quantizing Deep Convolutional Networks for Efficient Inference: A Whitepaper,” arXiv preprint arXiv:1806.08342, 2018.
[11] J. Lee and Y. Pisarchyk, “Optimizing TensorFlow Lite Runtime Memory,”
2020, url: https://blog.tensorflow.org/2020/10/optimizing-tensorflow-lite-runtime.html.
[12] N. Ma, X. Zhang, H.-T. Zheng, and J. Sun, “ShuffleNet V2: Practical
Guidelines for Efficient CNN Architecture Design,” in Proceedings of the
European Conference on Computer Vision (ECCV), 2018, 116–31.
[13] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T.
Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., “PyTorch: An Imperative Style, High-Performance Deep Learning Library,” arXiv preprint
arXiv:1912.01703, 2019.
[14] Y. Pisarchyk and J. Lee, “Efficient Memory Management for Deep Neural
Net Inference,” arXiv preprint arXiv:2001.03288, 2020.
[15] C. F. Sabottke and B. M. Spieler, “The Effect of Image Resolution on
Deep Learning in Radiography,” Radiology: Artificial Intelligence, 2(1),
2020, e190015.
[16] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted Residuals and Linear Bottlenecks,” in Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition,
2018, 4510–20.
[17] X. Tao, D. Zhang, W. Ma, X. Liu, and D. Xu, “Automatic Metallic
Surface Defect Detection and Recognition with Convolutional Neural
Networks,” Applied Sciences, 8(9), 2018, 1575.
[18] T. Wang, Y. Chen, M. Qiao, and H. Snoussi, “A Fast and Robust
Convolutional Neural Network-Based Defect Detection Model in Product
Quality Control,” The International Journal of Advanced Manufacturing
Technology, 94(9), 2018, 3465–71.
[19] Y. Wen, A. Anderson, V. Radu, M. F. O’Boyle, and D. Gregg, “TASO:
Time and Space Optimization for Memory-Constrained DNN Inference,”
in 2020 IEEE 32nd International Symposium on Computer Architecture
and High Performance Computing (SBAC-PAD), 2020, 199–208.
[20] S. Wu, M. Zhang, G. Chen, and K. Chen, “A New Approach to Compute
CNNs for Extremely Large Images,” in Proceedings of the 2017 ACM on
Conference on Information and Knowledge Management, 2017, 39–48.
[21] X. Zhang, X. Zhou, M. Lin, and J. Sun, “ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices,” in Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition,
2018, 6848–56.
...