A framework for measuring the training efficiency of a neural architecture.

Eduardo Cueto-Mendoza, John Kelleher
Author Information
  1. Eduardo Cueto-Mendoza: School of Computer Science, TU Dublin, Grangegorman, Dublin 7, D07 H6K8, Co. Dublin, Ireland.
  2. John Kelleher: ADAPT Research Centre, School of Computer Science and Statistics, Trinity College Dublin, Dublin 2, Co. Dublin, Ireland.

Abstract

Measuring efficiency in neural network system development is an open research problem. This paper presents an experimental framework to measure the training efficiency of a neural architecture. To demonstrate our approach, we analyze the training efficiency of Convolutional Neural Networks (CNNs) and their Bayesian equivalents (BCNNs) on the MNIST and CIFAR-10 tasks. Our results show that training efficiency decays as training progresses and varies across different stopping criteria for a given neural model and learning task. We also find a non-linear relationship between training stopping criteria, model size, and training efficiency. Furthermore, we illustrate the potential confounding effects of overtraining on measurements of the training efficiency of a neural architecture. Regarding relative training efficiency across different architectures, our results indicate that CNNs are more efficient than BCNNs on both datasets. More generally, as a learning task becomes more complex, the relative difference in training efficiency between different architectures becomes more pronounced.
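
The abstract does not specify how training efficiency is computed at each point in training. As a rough illustration only, the sketch below assumes efficiency is measured as held-out accuracy per unit of cumulative training cost (here, wall-clock time) and tracks it per epoch for a small CNN on synthetic MNIST-shaped data; the paper's actual metric, models, and stopping criteria may differ.

```python
# Illustrative sketch (assumption): per-epoch training efficiency measured as
# validation accuracy divided by cumulative training time. The paper may use
# a different cost measure (e.g. energy or FLOPs) and different models.
import time
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic MNIST-shaped data stands in for the real dataset.
x_train = torch.randn(2048, 1, 28, 28)
y_train = torch.randint(0, 10, (2048,))
x_val = torch.randn(512, 1, 28, 28)
y_val = torch.randint(0, 10, (512,))

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(16 * 7 * 7, 10),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

cumulative_cost = 0.0  # seconds of training spent so far
for epoch in range(5):
    start = time.perf_counter()
    model.train()
    for i in range(0, len(x_train), 128):
        xb, yb = x_train[i:i + 128], y_train[i:i + 128]
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()
    cumulative_cost += time.perf_counter() - start

    model.eval()
    with torch.no_grad():
        acc = (model(x_val).argmax(1) == y_val).float().mean().item()

    # Efficiency of the run up to this epoch: performance per unit cost.
    print(f"epoch {epoch}: acc={acc:.3f}, cost={cumulative_cost:.2f}s, "
          f"efficiency={acc / cumulative_cost:.4f}")
```

Swapping cumulative wall-clock time for estimated energy or FLOPs, or changing the stopping criterion, changes the efficiency curve such a loop traces, which is the kind of sensitivity the abstract reports.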
