Large-Scale Automated Sleep Staging.

Haoqi Sun, Jian Jia, Balaji Goparaju, Guang-Bin Huang, Olga Sourina, Matt Travis Bianchi, M Brandon Westover
Author Information
  1. Haoqi Sun: Energy Research Institute @ NTU, Interdisciplinary Graduate School, Nanyang Technological University, 639798, Singapore.
  2. Jian Jia: School of Mathematics, Northwest University, Xi'an, Shaanxi, 710127, China.
  3. Balaji Goparaju: Massachusetts General Hospital, Neurology Department, Boston, MA.
  4. Guang-Bin Huang: School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore.
  5. Olga Sourina: Fraunhofer IDM @ NTU, Nanyang Technological University, 639798, Singapore.
  6. Matt Travis Bianchi: Massachusetts General Hospital, Neurology Department, Boston, MA.
  7. M Brandon Westover: Massachusetts General Hospital, Neurology Department, Boston, MA.

Abstract

Study Objectives: Automated sleep staging has been previously limited by a combination of clinical and physiological heterogeneity. Both factors are in principle addressable with large data sets that enable robust calibration. However, the impact of sample size remains uncertain. The objectives are to investigate the extent to which machine learning methods can approximate the performance of human scorers when supplied with sufficient training cases and to investigate how staging performance depends on the number of training patients, contextual information, model complexity, and imbalance between sleep stage proportions.
Methods: A total of 102 features were extracted from six electroencephalography (EEG) channels in routine polysomnography. Two thousand nights were partitioned into equal (n = 1000) training and testing sets for validation. We used epoch-by-epoch Cohen's kappa statistics to measure the agreement between classifier output and human scorer according to American Academy of Sleep Medicine scoring criteria.
Results: Epoch-by-epoch Cohen's kappa improved as the number of training EEG recordings increased, saturating at approximately 300 recordings. The kappa value was further improved by accounting for contextual (temporal) information, increasing model complexity, and adjusting the model training procedure to account for the imbalance of stage proportions. The final kappa on the testing set was 0.68. Testing on more EEG recordings led to kappa estimates with lower variance.
Conclusion: Training with a large data set enables automated sleep staging that compares favorably with human scorers. Because testing was performed on a large and heterogeneous data set, the performance estimate has low variance and is likely to generalize broadly.
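
The agreement measure named in the Methods, epoch-by-epoch Cohen's kappa, can be illustrated with a minimal sketch. This is not the authors' code: the function name cohens_kappa, the stage labels, and the ten-epoch example are assumptions made purely for illustration. Kappa compares the observed fraction of matching epochs with the agreement expected by chance from each scorer's stage proportions.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two equal-length sequences of epoch labels."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("label sequences must not be empty")

    # Observed agreement: fraction of epochs given the same stage by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each stage, the product of the two scorers'
    # marginal proportions, summed over stages.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[s] * count_b[s] for s in count_a) / (n * n)

    if expected == 1.0:
        # Both scorers used one identical stage throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical ten-epoch example (illustrative labels, not study data).
human = ["W", "W", "N1", "N2", "N2", "N2", "N3", "N3", "R", "R"]
auto = ["W", "N1", "N1", "N2", "N2", "N3", "N3", "N3", "R", "N2"]
kappa = cohens_kappa(human, auto)
```

In this made-up example the two label sequences match on 7 of 10 epochs and chance agreement is 0.21, so kappa is roughly 0.62. Because chance agreement depends on stage proportions, kappa penalizes a classifier that simply favors the most common stage, which is relevant to the stage imbalance discussed in the abstract.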

Grants

  1. K23 NS090900/NINDS NIH HHS

MeSH Term

Adult
Electroencephalography
Electronic Data Processing
Female
Humans
Machine Learning
Male
Middle Aged
Observer Variation
Polysomnography
Reproducibility of Results
Sleep
Sleep Apnea Syndromes
Sleep Stages
