R-YOLO: A YOLO-Based Method for Arbitrary-Oriented Target Detection in High-Resolution Remote Sensing Images.

Yongjie Hou, Gang Shi, Yingxiang Zhao, Fan Wang, Xian Jiang, Rujun Zhuang, Yunfei Mei, Xinjiang Ma
Author Information
  1. Yongjie Hou: College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China.
  2. Gang Shi: College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China.
  3. Yingxiang Zhao: College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China.
  4. Fan Wang: College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China.
  5. Xian Jiang: College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China.
  6. Rujun Zhuang: College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China.
  7. Yunfei Mei: College of Information Science and Engineering, Xinjiang University, Urumqi 830017, China.
  8. Xinjiang Ma: Geomatics School of Earth Sciences and Engineering, Hohai University, Nanjing 211100, China.

Abstract

Remote sensing images vary widely in spatial resolution, contain small and densely packed objects, and show objects whose direction of motion cannot be determined; together, these factors make object detection in remote sensing images very challenging. In this paper, we propose a single-stage detection network based on YOLOv5. The method adds an MS Transformer module at the end of the original feature extraction network to strengthen feature extraction, and integrates the Convolutional Block Attention Module (CBAM) to locate attention regions in dense scenes. In addition, a rotation angle is incorporated into both the prior (anchor) box design and the bounding box regression formulation, making the YOLOv5 detector suitable for rotated-box detection. Finally, the focal loss function is improved with a weighted combination of two hard sample mining methods to raise detection accuracy. On the DOTA data set, the improved algorithm achieves an average detection accuracy of 77.01%, which is higher than that of previous detection algorithms and 8.83% higher than that of YOLOv5. The experimental results show that the algorithm has higher detection accuracy than other algorithms in remote sensing scenes.
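
To illustrate what adding a rotation angle to a bounding box means, the following minimal sketch converts a rotated box into its four corner points. The (cx, cy, w, h, theta) parameterization, the function name `rotated_box_corners`, and the angle convention are assumptions made for illustration; the abstract does not state which rotated-box representation the method uses.

```python
import math

def rotated_box_corners(cx, cy, w, h, theta):
    """Return the four corners of a w-by-h box centred at (cx, cy),
    rotated by theta radians (hypothetical helper, not from the paper).

    The rotation is counter-clockwise in a y-up coordinate system; in
    image coordinates, where y points down, it appears clockwise.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the unrotated box, relative to its centre.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2),
               (w / 2, h / 2), (-w / 2, h / 2)]
    # Apply the 2-D rotation matrix to each offset, then translate.
    return [(cx + dx * cos_t - dy * sin_t,
             cy + dx * sin_t + dy * cos_t)
            for dx, dy in offsets]

# Example: a 4 x 2 box centred at (10, 20), rotated by 30 degrees.
corners = rotated_box_corners(10.0, 20.0, 4.0, 2.0, math.radians(30))
```

With theta set to 0 the function returns the corners of an ordinary axis-aligned box, so a horizontal box is the special case of this representation.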

Grants

  1. 62162059/National Natural Science Foundation of China
  2. 12061072/National Natural Science Foundation of China
  3. 2021xjkk1404/Third Xinjiang Comprehensive Scientific Expedition Project

MeSH Term

Algorithms
Attention
Data Collection
Remote Sensing Technology
