A skipping spectrum sensing scheme based on deep reinforcement learning for transform domain communication systems.

Ce Li, Yanhua Wu, Rangang Zhu, Ruochen Wu, Zhengkun Zhang, Zunhui Wang

Author Information

Ce Li: College of Electronic Engineering, National University of Defense Technology, Hefei, 230000, China. lice22@nudt.edu.cn.
Yanhua Wu: College of Electronic Engineering, National University of Defense Technology, Hefei, 230000, China. wuyanhua17@nudt.edu.cn.
Rangang Zhu: College of Electronic Engineering, National University of Defense Technology, Hefei, 230000, China.
Ruochen Wu: School of Automation, Southeast University, Nanjing, 210000, China.
Zhengkun Zhang: College of Electronic Engineering, National University of Defense Technology, Hefei, 230000, China.
Zunhui Wang: College of Electronic Engineering, National University of Defense Technology, Hefei, 230000, China.

PMID: 39732917 DOI: 10.1038/s41598-024-83140-w

Spectrum sensing is a key technology and prerequisite for Transform Domain Communication Systems (TDCS). The traditional approach selects a working sub-band once, keeps it fixed, and performs spectrum sensing periodically. This presents two main issues. First, if the selected working band has few idle channels, TDCS devices cannot flexibly switch sub-bands, which reduces performance. Second, periodic sensing consumes time and energy, limiting TDCS transmission efficiency. Previous studies modeled the problem as a Markov Decision Process (MDP), which is unrealistic because TDCS devices cannot fully observe the entire spectrum state and must rely on historical observations, along with the current state of sub-bands, to make informed decisions. This study therefore models the problem as a Partially Observable Markov Decision Process (POMDP). Moreover, we consider both the number of skipped time slots and the selection of idle sub-bands, and establish distinct termination conditions for each action. Assigning different weights to balance sensing overhead and spectrum utilization while reducing conflicts improves the algorithm's adaptability and performance. To address the Q-value overestimation problem that the traditional Deep Recurrent Q-Network (DRQN) suffers from because it uses a single network, we propose a DDRQN-BandShift strategy that combines Double Deep Q-Network (DDQN) and DRQN. Simulation results show that the proposed scheme significantly improves TDCS transmission efficiency while effectively reducing sensing costs.
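
The overestimation issue mentioned in the abstract can be illustrated with the generic Double DQN target computation. The sketch below is not the paper's DDRQN-BandShift implementation: the function names, the discount factor, and the array shapes are assumptions made for illustration, and the Q-values are passed in as plain arrays, whereas a recurrent variant would compute them from a history of observations. It only shows how letting one network select the next action and a second network evaluate it differs from a single-network max.

```python
import numpy as np

def double_q_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Double Q-learning targets for a batch (hypothetical helper).

    rewards, dones: shape (B,); q_online_next, q_target_next: shape (B, A).
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=bool)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)
    # The online network selects the greedy next action ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... and the target network evaluates that action.
    evaluated = q_target_next[np.arange(len(best_actions)), best_actions]
    # No bootstrapping from terminal transitions.
    return rewards + gamma * (~dones) * evaluated

def single_q_targets(rewards, dones, q_target_next, gamma=0.99):
    """Single-network targets: the same network selects and evaluates."""
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=bool)
    q_target_next = np.asarray(q_target_next, dtype=float)
    return rewards + gamma * (~dones) * q_target_next.max(axis=1)

# Toy batch with two transitions and two actions (values are made up).
rewards = [1.0, 0.0]
dones = [False, True]
q_online_next = [[1.0, 2.0], [0.5, 0.1]]
q_target_next = [[3.0, 0.5], [4.0, 5.0]]

double = double_q_targets(rewards, dones, q_online_next, q_target_next, gamma=0.9)
single = single_q_targets(rewards, dones, q_target_next, gamma=0.9)
print(double, single)
```

In the toy batch, the single-network target for the first transition bootstraps from the largest target-network value, while the double target bootstraps from the value of the action the online network prefers, which is smaller here. The second transition is terminal, so both targets reduce to the reward.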

Keywords: Double Deep Recurrent Q-Network; Dynamic spectrum access; Partially observable Markov decision process; Spectrum sensing; Transform domain communication system

Journal Article
