Journal of Flow Visualization and Image Processing
SJR: 0.161 SNIP: 0.312 CiteScore™: 0.5

ISSN Print: 1065-3090
ISSN Online: 1940-4336


DOI: 10.1615/JFlowVisImageProc.2019031771
pages 371-393

OPTICAL FLOW ESTIMATION USING CHANNEL ATTENTION MECHANISM

Xiang Xuezhi
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, Heilongjiang, P.R. China
Syed Masroor Ali
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, Heilongjiang, P.R. China
Ghulam Farid
College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, Heilongjiang, P.R. China

Abstract

Optical flow computation from images has long been a challenging task. Nowadays, convolutional neural networks (CNNs), a deep learning method, are widely applied to optical flow estimation. One common CNN design is the U-Net architecture, an encoder-decoder framework that can be trained end-to-end. However, its encoder uses the same downsampling scheme found in standard image classification architectures, and its decoder restores the spatial feature maps to the full input resolution through successive deconvolutions. This design yields blurred flow fields because of coarse features and low resolution, whereas optical flow is inherently a pixel-level task. In this article, we attempt to address this problem by introducing two components, dilated convolution and a channel attention mechanism, into the FlowNetCorr network for optical flow estimation and training loss. In our framework, dilated convolution provides spatial precision: it enlarges the receptive field without requiring large computational resources and keeps the spatial resolution of the feature maps unchanged. The channel attention module is based on the squeeze-and-excitation architecture for image classification, which adaptively recalibrates channel-wise features using global channel information. Comprehensive experiments are conducted on the MPI-Sintel (Clean and Final) and KITTI (2012 and 2015) test datasets to assess the effectiveness of our framework.
The experimental results show that, in terms of training loss, accuracy, and visual quality on the MPI-Sintel (Clean and Final) datasets, our framework outperforms many unsupervised methods (e.g., USCNN, UnSupFlownet, DSTFlow) and cutting-edge supervised methods (e.g., SpyNet, SpyNet+ft, CaF-Full-41c), although its performance remains below that of FlowNet2. On the KITTI (2012 and 2015) datasets, our framework also outperforms many methods, with the exception of UnFlow+ft. These results confirm the importance of both dilated convolution and channel attention for optical flow estimation.
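The two components named in the abstract can be illustrated in isolation. The following is a minimal NumPy sketch, not the authors' implementation: `channel_attention` follows the squeeze-and-excitation pattern (global average pooling, a bottleneck fully connected layer, sigmoid gates that reweight each channel), and `dilated_conv2d` shows how a 3×3 kernel with dilation rate d covers an effective field of (k−1)·d+1 pixels while the number of weights stays at k². All array shapes and weight dimensions here are illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, b1, w2, b2):
    """Squeeze-and-excitation channel attention for a (C, H, W) feature map."""
    s = x.mean(axis=(1, 2))               # squeeze: global average pooling -> (C,)
    e = np.maximum(w1 @ s + b1, 0.0)      # excitation: bottleneck FC + ReLU
    g = sigmoid(w2 @ e + b2)              # per-channel gates in (0, 1)
    return x * g[:, None, None]           # recalibrate each feature map

def dilated_conv2d(x, kernel, d=2):
    """Single-channel 'valid' convolution with dilation rate d.
    A k x k kernel covers an effective field of (k - 1) * d + 1 pixels."""
    kh, kw = kernel.shape
    eh, ew = (kh - 1) * d + 1, (kw - 1) * d + 1
    out = np.empty((x.shape[0] - eh + 1, x.shape[1] - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + eh:d, j:j + ew:d] * kernel)
    return out

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 8, 8))     # toy feature map with C = 4 channels
gated = channel_attention(feat,
                          rng.standard_normal((2, 4)), np.zeros(2),
                          rng.standard_normal((4, 2)), np.zeros(4))
resp = dilated_conv2d(feat[0], np.ones((3, 3)) / 9.0, d=2)
```

Because the gates lie in (0, 1), attention can only attenuate channels, never amplify them, and the output keeps the input's shape; the dilated convolution with d=2 shrinks an 8×8 input by its 5×5 effective field (in a real network, padding would keep the resolution unchanged, as the abstract describes).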

References

  1. Ahmadi, A. and Patras, I., Unsupervised Convolutional Neural Networks for Motion Estimation, IEEE Int. Conf. on Image Processing (ICIP), pp. 1629-1633, 2016.

  2. Baker, S., Scharstein, D., Lewis, J.P., Roth, S., Black, M.J., and Szeliski, R., A Database and Evaluation Methodology for Optical Flow, Int. J. Comput. Vision, vol. 92, pp. 1-31, 2011.

  3. Banerjee, B. and Murino, V., Efficient Pooling of Image Based CNN Features for Action Recognition in Videos, 2017 IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 2637-2641, 2017.

  4. Bonneel, N., Tompkin, J., Sunkavalli, K., Sun, D., Paris, S., and Pfister, H., Blind Video Temporal Consistency, ACM Trans. Graph, vol. 34, pp. 1-9, 2015.

  5. Brox, T. and Malik, J., Large Displacement Optical Flow: Descriptor Matching in Variational Motion Estimation, IEEE Trans. Pattern Anal. Machine Intell., vol. 33, pp. 500-513, 2011.

  6. Butler, D.J., Wulff, J., Stanley, G.B., and Black, M.J., A Naturalistic Open Source Movie for Optical Flow Evaluation, in Computer Vision-ECCV 2012, Berlin, Heidelberg, pp. 611-625, 2012.

  7. Chen, L.-C., Yang, Y., Wang, J., Xu, W., and Yuille, A.L., Attention to Scale: Scale-Aware Semantic Image Segmentation, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 3640-3649, 2016.

  8. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L., Deeplab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, IEEE Trans. Pattern Anal. Machine Intell., pp. 434-448, 2018.

  9. Chu, X., Yang, W., Ouyang, W., Ma, C., Yuille, A.L., and Wang, X., Multi-Context Attention for Human Pose Estimation, IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 5669-5678, 2017.

  10. Dai, J., Huang, S., and Nguyen, T., Pyramid Structured Optical Flow Learning with Motion Cues, in 2018 25th IEEE Int. Conf. on Image Processing (ICIP), pp. 3338-3342, 2018.

  11. Dai, J., Li, Y., He, K., and Sun, J., R-FCN: Object Detection via Region-Based Fully Convolutional Networks, Adv. Neural Inf. Process. Syst., pp. 379-387, 2016.

  12. Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., et al., FlowNet: Learning Optical Flow with Convolutional Networks, in 2015 IEEE Int. Conf. on Computer Vision (ICCV), pp. 2758-2766, 2015.

  13. Elad, M. and Feuer, A., Recursive Optical Flow Estimation-Adaptive Filtering Approach, J. Visual Commun. Image Represent., vol. 9, pp. 119-138, 1998.

  14. Geiger, A., Lenz, P., and Urtasun, R., Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite, in 2012 IEEE Conf. on Computer Vision and Pattern Recognition, pp. 3354-3361, 2012.

  15. Girshick, R., Donahue, J., Darrell, T., and Malik, J., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 580-587, 2014.

  16. Heindlmaier, M., Yu, L., and Diepold, K., The Impact of Nonlinear Filtering and Confidence Information on Optical Flow Estimation in a Lucas Kanade Framework, Int. Conf. on Image Processing, ICIP, Cairo, Egypt, pp. 1593-1596, November 7-10, 2009.

  17. Horn, B.K.P. and Schunck, B.G., Determining Optical Flow, Artificial Intell., vol. 17, pp. 185-203, 1981.

  18. Hu, J., Shen, L., and Sun, G., Squeeze-and-Excitation Networks, IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 7132-7141, 2018a.

  19. Hu, P., Wang, G., and Tan, Y.-P., Recurrent Spatial Pyramid CNN for Optical Flow Estimation, IEEE Trans. Multimedia, vol. 20, pp. 2814-2823, 2018b.

  20. Huang, Z., Wang, L., Meng, G., and Pan, C., Image Super-Resolution via Deep Dilated Convolutional Networks, IEEE Int. Conf. on Image Processing (ICIP), pp. 953-957, 2017.

  21. Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., and Brox, T., FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks, pp. 1-16, 2016. arXiv: 1612.01925v1 [cs.CV].

  22. Janai, J. and Guney, F., Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art, pp. 1-67, 2017. arXiv: 1704.05519.

  23. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., et al., Caffe: Convolutional Architecture for Fast Feature Embedding, ACM Multimedia, pp. 1-4, 2014.

  24. Karianakis, N., Liu, Z., Chen, Y., and Soatto, S., Reinforced Temporal Attention and Split-Rate Transfer for Depth-Based Person Re-Identification, Springer International Publishing, pp. 737-756, 2018.

  25. Koga, K. and Miike, H., Optical Flow Analysis Based on Spatio-Temporal Correlation of Dynamic Image, Syst. Comput. Jpn., vol. 21, pp. 97-108, 1990.

  26. Krizhevsky, A., Sutskever, I., and Hinton, G.E., ImageNet Classification with Deep Convolutional Neural Networks, Neural Inform. Process. Syst., vol. 25, pp. 1-9, 2012.

  27. Lai, W.-S., Huang, J.-B., and Yang, M.-H., Semi-Supervised Learning for Optical Flow with Generative Adversarial Networks, Proc. the 31st Int. Conf. on Neural Information Processing Systems, Long Beach, California, pp. 1-11, 2017.

  28. LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., et al., Backpropagation Applied to Handwritten Zip Code Recognition, Neural Comput., vol. 1, pp. 541-551, 1989.

  29. Meister, S., Hur, J., and Roth, S., UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss, pp. 1-9, 2017. arXiv: 1711.07837 [cs.CV].

  30. Memisevic, R. and Hinton, G., Unsupervised Learning of Image Transformations, in IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1-8, 2007.

  31. Randriantsoa, A. and Berthoumieu, Y., Optical Flow Estimation Using Forward-Backward Constraint Equation, in Int. Conf. on Image Processing (ICIP 2000), pp. 578-581, 2000.

  32. Ranjan, A. and Black, M.J., Optical Flow Estimation Using a Spatial Pyramid Network, in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 2720-2729, 2017.

  33. Ren, Z., Yan, J., Ni, B., Liu, B., Yang, X., and Zha, H., Unsupervised Deep Learning for Optical Flow Estimation, Proc. of Thirty-First AAAI Conf. on Artificial Intelligence (AAAI-17), pp. 1495-1501, 2017.

  34. Shelhamer, E., Long, J., and Darrell, T., Fully Convolutional Networks for Semantic Segmentation, IEEE Trans. Pattern Anal. Machine Intell., vol. 39, pp. 640-651, 2017.

  35. Simonyan, K. and Zisserman, A., Two-Stream Convolutional Networks for Action Recognition in Videos, Proc. of 27th Int. Conf. on Neural Information Processing Systems, vol. 1, pp. 568-576, 2014.

  36. Sun, D., Roth, S., and Black, M.J., Secrets of Optical Flow Estimation and Their Principles, in IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, CVPR 2010, San Francisco, CA, pp. 2432-2439, June 13-18, 2010.

  37. Sun, D., Yang, X., Liu, M.-Y., and Kautz, J., PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume, pp. 1-18, 2017. arXiv: 1709.02371 [cs.CV].

  38. Sun, Z. and Wang, H., Deeper Spatial Pyramid Network with Refined Up-Sampling for Optical Flow Estimation, in 19th Pacific-Rim Conf. on Multimedia, PCM 2018, Hefei, China, pp. 492-501, September 21-22, 2018.

  39. Teney, D. and Hebert, M., Learning to Extract Motion from Videos in Convolutional Neural Networks, in Computer Vision-ACCV 2016, S.H. Lai, V. Lepetit, K. Nishino, and Y. Sato, Eds., New York City: Springer International Publishing, pp. 412-428, 2017.

  40. Vaquero, V., Ros, G., Moreno-Noguer, F., Lopez, A.M., and Sanfeliu, A., Joint Coarse-and-Fine Reasoning for Deep Optical Flow, IEEE Int. Conf. on Image Processing (ICIP), pp. 2558-2562, 2017.

  41. Wang, F., Jiang, M., Qian, C., Yang, S., Li, C., Zhang, H., et al., Residual Attention Network for Image Classification, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 6450-6458, 2017a.

  42. Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., and Cottrell, G., Understanding Convolution for Semantic Segmentation, IEEE Winter Conf. on Applications of Computer Vision (WACV), pp. 1451-1460, 2018.

  43. Wang, Y., Yang, Y., Yang, Z., Zhao, L., and Xu, W., Occlusion Aware Unsupervised Learning of Optical Flow, pp. 1-10, 2017b. arXiv: 1711.05890v2 [cs.CV].

  44. Xiang, X., Zhai, M., Zhang, R., Qiao, Y., and Saddik, A.E., Deep Optical Flow Supervised Learning with Prior Assumptions, IEEE Access, vol. 6, pp. 43222-43232, 2018.

  45. Yamashita, T., Furukawa, H., and Fujiyoshi, H., Multiple Skip Connections of Dilated Convolution Network for Semantic Segmentation, IEEE Int. Conf. on Image Processing (ICIP), pp. 1593-1597, 2018.

  46. Yang, F., Yan, K., Lu, S., Jia, H., Xie, X., and Gao, W., Attention Driven Person Re-Identification, Pattern Recog., pp. 143-155, 2019.

  47. Yu, F., Koltun, V., and Funkhouser, T., Dilated Residual Networks, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 636-644, 2017.

  48. Yu, J., Harley, A.W., and Derpanis, K.G., Back to Basics: Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness, Computer Vision-ECCV 2016 Workshops, pp. 3-10, 2016.

  49. Zhang, X., Wang, T., Qi, J., Lu, H., and Wang, G., Progressive Attention Guided Recurrent Network for Salient Object Detection, IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 714-722, 2018a.

  50. Zhang, Y., Li, K., Wang, L., Zhong, B., and Fu, Y., Image Super-Resolution Using Very Deep Residual Channel Attention Networks, Computer Vision-ECCV 2018, New York City: Springer International Publishing, pp. 294-310, 2018b.

  51. Zhu, Y. and Newsam, S., Densenet for Dense Flow, IEEE Int. Conf. on Image Processing (ICIP), pp. 790-794, 2017.

  52. Zhu, Y., Zhao, C., Guo, H., Wang, J., Zhao, X., and Lu, H., Attention Couplenet: Fully Convolutional Attention Coupling Network for Object Detection, IEEE Trans. Image Process., pp. 113-126, 2019.
