Journal of Geo-information Science, 2021, Vol. 23, Issue (9): 1537-1547. doi: 10.12082/dqxxkx.2021.200604
SHI Jinlin1, ZHOU Liangchen1,2,3, LV Guonian1,2,3, LIN Bingxian1,2,3,*
Received:
2020-10-15
Online:
2021-09-25
Published:
2021-11-25
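Corresponding author:
*LIN Bingxian (born 1984), female, from Nantong, Jiangsu; PhD, associate professor. Her research focuses on virtual geographic environments. E-mail: lbx1984@hotmail.com
About the author:
SHI Jinlin (born 1994), male, from Changzhou, Jiangsu; master's degree. His research covers deep learning algorithms and the construction of virtual geographic environments. E-mail: sjl_njnu@163.com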
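Abstract:
To prevent crowd crush incidents, it is important to obtain accurate head counts of dense crowds from surveillance images. Dense crowd counting is difficult because individual targets are small and scene scale varies widely. To address this, this paper proposes a new neural network architecture, VGG-ResNeXt. The network uses the first 10 layers of VGG-16 as a coarse-grained feature extractor and an improved residual neural network as a fine-grained feature extractor. The "multi-channel, co-activation" property of the improved residual network gives a single-column crowd counting network the main advantage of multi-column networks (extracting more crowd features from dense crowd images with small targets and multiple scales) while avoiding their drawbacks, such as difficult training and structural redundancy. Experiments show that the model achieves the highest accuracy on the UCF-CC-50, ShanghaiTech Part B, and UCF-QNRF datasets, with MAE better than that of other contemporaneous models by 7.5%, 18.8%, and 2.4%, respectively, which demonstrates the model's effectiveness in counting accuracy. The results can support urban management, ease the burden of police crowd control, and help protect lives and property.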
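The remark about structural redundancy can be made concrete with a small parameter count. The sketch below is not the authors' implementation. It assumes that "the first 10 layers of VGG-16" means the ten convolutional layers conv1_1 through conv4_3, and that the improved residual branch relies on ResNeXt-style grouped convolutions. The helper names and the cardinality of 32 are hypothetical and chosen only for illustration.

```python
"""Parameter-count sketch: VGG-16 front end and a grouped 3x3 convolution."""

# Output channels of the first ten convolutional layers of VGG-16
# (conv1_1 ... conv4_3). Reading "first 10 layers" as these ten
# convolutions is an assumption; the abstract does not spell it out.
VGG16_FRONT_END = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512]


def conv_params(c_in, c_out, kernel=3, groups=1, bias=True):
    """Number of learnable parameters in a 2-D convolution layer."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    weights = kernel * kernel * (c_in // groups) * c_out
    return weights + (c_out if bias else 0)


def front_end_params(channels, c_in=3):
    """Total parameters of a stack of 3x3 convolutions fed by an RGB image."""
    total = 0
    for c_out in channels:
        total += conv_params(c_in, c_out)
        c_in = c_out
    return total


front_end_total = front_end_params(VGG16_FRONT_END)

# Hypothetical comparison: one dense 3x3 convolution on 512 channels versus
# the same layer split into 32 groups (an illustrative cardinality).
dense = conv_params(512, 512, groups=1)
grouped = conv_params(512, 512, groups=32)

print("front end:", front_end_total)
print("dense 3x3:", dense, "grouped 3x3:", grouped)
```

Under these assumptions, splitting a layer into groups divides its weight count by the number of groups. This is one way a single-column network can host many parallel paths without the cost of separate columns.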
SHI Jinlin, ZHOU Liangchen, LV Guonian, LIN Bingxian. Improved Dense Crowd Counting Method based on Residual Neural Network[J]. Journal of Geo-information Science, 2021, 23(9): 1537-1547. DOI: 10.12082/dqxxkx.2021.200604
Table 2. Experimental results on the UCF-CC-50 dataset
Method | MAE | MSE |
---|---|---|
Zhang et al. | 467.0 | 498.5 |
MCNN | 377.6 | 509.1 |
Cascaded-MTL | 322.8 | 397.9 |
SwitchingCNN | 318.1 | 439.2 |
CP-CNN | 295.8 | 320.9 |
CSRNet | 266.1 | 397.5 |
SANet | 258.4 | 334.9 |
TEDNet | 249.4 | 354.5 |
VGG-ResNet | 229.6 | 319.4 |
LMCNN | 219.2 | 297.1 |
VGG-ResNeXt | 202.6 | 297.9 |
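The relative improvement quoted for UCF-CC-50 can be checked from the MAE column of Table 2. The sketch below assumes the two columns follow the usual crowd counting convention, where MAE is the mean absolute count error and MSE is the root of the mean squared count error. The paper's own definitions are not shown in this excerpt, and the function names and toy counts are hypothetical.

```python
import math


def mae(pred, truth):
    """Mean absolute error between predicted and ground-truth counts."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def rmse(pred, truth):
    """Root of the mean squared count error (often labelled MSE in this field)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def relative_gain(baseline, ours):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - ours) / baseline


# MAE values copied from Table 2 (UCF-CC-50). LMCNN has the lowest MAE
# among the other methods listed there.
lmcnn_mae = 219.2
vgg_resnext_mae = 202.6
gain = relative_gain(lmcnn_mae, vgg_resnext_mae)

# Made-up counts, used only to exercise the two metric functions.
toy_pred = [100.0, 250.0, 40.0]
toy_truth = [110.0, 240.0, 50.0]
toy_mae = mae(toy_pred, toy_truth)
toy_rmse = rmse(toy_pred, toy_truth)

print(f"relative MAE gain over LMCNN: {gain:.2%}")
print("toy MAE:", toy_mae, "toy RMSE:", toy_rmse)
```

The computed gain is a little above 7.5%, which is in line with the figure given in the abstract for this dataset.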
[1] | Dollár P, Wojek C, Schiele B, et al. Pedestrian detection: An evaluation of the state of the art[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(4):743-761. doi: 10.1109/TPAMI.2011.155 pmid: 21808091 |
[2] | Viola P, Jones M J, Snow D. Detecting pedestrians using patterns of motion and appearance[J]. International Journal of Computer Vision, 2005, 63(2):153-161. doi: 10.1007/s11263-005-6644-8 |
[3] | Dalal N, Triggs B. Histograms of oriented gradients for human detection[J]. Proceedings-2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, 2005, I:886-893. |
[4] | Wu B, Nevatia R. Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors[C]. Proceedings of the IEEE International Conference on Computer Vision, 2005, I:90-97. |
[5] | Lin S F, Chen J U, Chao H X. Estimation of number of people in crowded scenes using perspective transformation[J]. IEEE Transactions on Systems Man and Cybernetics - Part A Systems and Humans, 2001, 31(6):645-654. doi: 10.1109/3468.983420 |
[6] | Chan A B, Vasconcelos N. Bayesian Poisson regression for crowd counting[J]. Proceedings of the IEEE International Conference on Computer Vision, 2009:545-551. |
[7] | Ryan D, Denman S, Fookes C, et al. Crowd counting using multiple local features [C]//Digital Image Computing: Techniques and Applications. IEEE, 2009:81-88. |
[8] | Chen K, Loy C C, Gong S G, et al. Feature mining for localised crowd counting[C]// 2012 British Machine Vision Conference, 2012(21):1-11. |
[9] | Wang C, Zhang H, Yang L, et al. Deep people counting in extremely dense crowds [C]//the 23rd ACM international conference. ACM, 2015:1299-1302. |
[10] | Fu M, Xu P, Li X, et al. Fast crowd density estimation with convolutional neural networks[J]. Engineering Applications of Artificial Intelligence, 2015, 43(8):81-88. doi: 10.1016/j.engappai.2015.04.006 |
[11] | Sam D B, Sajjan N N, Babu R V. Divide and grow: capturing huge diversity in crowd images with incrementally growing CNN[EB/OL]. 2018: arXiv: 1807.09993. https://arxiv.org/abs/1807.09993 . |
[12] | Li Y H, Zhang X F, Chen D M. CSRNet: dilated convolutional neural networks for understanding the highly congested scenes[EB/OL]. 2018: arXiv: 1802.10062. https://arxiv.org/abs/1802.10062 |
[13] | Cao X K, Wang Z P, Zhao Y Y, et al. Scale aggregation network for accurate and efficient crowd counting [C]//2018 European Conference on Computer Vision, 2018:757-773. |
[14] | Jiang X, Xiao Z, Zhang B, et al. Crowd counting and density estimation by trellis encoder-decoder networks [C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2019:6126-6135. |
[15] | Shang C, Ai H, Bai B. End-to-end crowd counting via joint learning local and global count [C]//2016 International Conference on Image Processing (ICIP). IEEE, 2016:1215-1219. |
[16] | Liu W Z, Salzmann M, Fua P. Context-aware crowd counting[EB/OL]. arXiv preprint arXiv:1811.10452, 2018. |
[17] | Sindagi V A, Patel V M. A survey of recent advances in CNN-based single image crowd counting and density estimation[EB/OL]. 2017: arXiv: 1707.01202. https://arxiv.org/abs/1707.01202 . |
[18] | Zhang Y, Zhou D, Chen S, et al. Single image crowd counting via multi-column convolutional neural network [C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. 2016: 589-597. |
[19] | Onoro R D, Lopez S R J. Towards perspective-free object counting with deep learning [C]//2016 European Conference on Computer Vision.Springer, 2016: 615-629. |
[20] | Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB/OL]. arXiv preprint arXiv:1409.1556, 2014. |
[21] | Szegedy C, Liu W, Jia Y Q, et al. Going deeper with convolutions[J]. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2015:1-9. |
[22] | Russakovsky O, Deng J, Su H, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3):211-252. doi: 10.1007/s11263-015-0816-y |
[23] | He K M, Zhang X Y, Sun J. Deep residual learning for image recognition [C]//2016 IEEE Conference on Computer Vision and Pattern Recognition.IEEE, 2016:770-778. |
[24] | He T, Zhang Z, Zhang H, et al. Bag of tricks for image classification with convolutional neural networks[EB/OL]. arXiv preprint arXiv:1812.01187, 2019. |
[25] | He K M, Zhang X Y, Ren S Q, et al. Identity mappings in deep residual networks[C]. arXiv preprint arXiv:1603.05027, 2016. |
[26] | Zagoruyko S, Komodakis N. Wide residual networks[EB/OL]. arXiv preprint arXiv:1605.07146, 2016. |
[27] | Xie S N, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[EB/OL]. 2016: arXiv: 1611.05431. https://arxiv.org/abs/1611.05431 . |
[28] | Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision[EB/OL]. 2015: arXiv: 1512.00567. https://arxiv.org/abs/1512.00567 . |
[29] | Wang X, Yu K, Wu S, et al. ESRGAN: Enhanced super-resolution generative adversarial networks[C]//Computer Vision - ECCV 2018 Workshops, 2019. arXiv preprint arXiv:1809.00219, 2018. |
[30] | Kruthiventi S S S, Ayush K, Babu R V. DeepFix: a fully convolutional neural network for predicting human eye fixations[J]. IEEE Transactions on Image Processing, 2017, 26(9):4446-4456. doi: 10.1109/TIP.2017.2710620 pmid: 28692956 |
[31] | Boominathan L, Kruthiventi S S S, Babu R V. CrowdNet: a deep convolutional network for dense crowd counting[EB/OL]. 2016: arXiv: 1608.06197. https://arxiv.org/abs/1608.06197 . |
[32] | Sam D B, Surya S, Babu R V. Switching convolutional neural network for crowd counting[EB/OL]. 2017: arXiv: 1708.00199. https://arxiv.org/abs/1708.00199 . |
[33] | Sindagi V A, Patel V M. Generating high-quality crowd density maps using contextual pyramid CNNs[EB/OL]. 2017: arXiv: 1708.00953. https://arxiv.org/abs/1708.00953 . |
[34] | Idrees H, Saleemi I, Seibert C, et al. Multi-source multi-scale counting in extremely dense crowd images [C]//2013 IEEE International Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2013:2547-2554. |
[35] | Zhang Y, Zhou D, Chen S, et al. Single-image crowd counting via multi-column convolutional neural network [C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016:589-597. |
[36] | Idrees H, Tayyab M, Athrey K, et al. Composition loss for counting, density map estimation and localization in dense crowds [C]//2018 IEEE European Conference on Computer Vision (ECCV). IEEE, 2018:8-14. |
[37] | Smith S L, Le Q V. A Bayesian perspective on generalization and stochastic gradient descent[EB/OL]. 2017: arXiv: 1710.06451. https://arxiv.org/abs/1710.06451 . |
[38] | Zhang C, Li H, Wang X, et al. Cross-scene crowd counting via deep convolutional neural networks [C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015:833-841. |
[39] | Sindagi V A, Patel V M. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting[EB/OL]. 2017: arXiv: 1707.09605. https://arxiv.org/abs/1707.09605 . |
[40] | Sindagi V A, Patel V M. Generating high-quality crowd density maps using contextual pyramid CNNs[EB/OL]. 2017: arXiv: 1708.00953. https://arxiv.org/abs/1708.00953 . |
[41] | Fu Q H, Li Q K, Fu J N, et al. Dense crowd counting model based on spatial dimensional circular perception network[J]. Journal of Computer Applications, 2020(10-12):1-7. (in Chinese) |
[42] | Shen Z, Xu Y, Ni B, et al. Crowd counting via adversarial cross-scale consistency pursuit [C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2018:5245-5254. |
[43] | Zhang L, Shi M J, Chen Q B. Crowd counting via scale-adaptive convolutional neural network[EB/OL]. 2017: arXiv: 1711.04433. https://arxiv.org/abs/1711.04433 . |
[44] | Liu X L, van de Weijer J, Bagdanov A D. Leveraging unlabeled data for crowd counting by learning to rank[EB/OL]. 2018: arXiv: 1803.03095. https://arxiv.org/abs/1803.03095 . |
[45] | Liu L, Wang H, Li G, et al. Crowd counting using deep recurrent spatial-aware network [C]//Twenty-Seventh International Joint Conference on Artificial Intelligence. 2018:849-855. |
[46] | Sindagi V A, Patel V M. HA-CCN: Hierarchical attention-based crowd counting network[J]. IEEE Transactions on Image Processing, 2019(99):1-1. |