Criminal networks analysis in missing data scenarios through graph distances.

Annamaria Ficara, Lucia Cavallaro, Francesco Curreri, Giacomo Fiumara, Pasquale De Meo, Ovidiu Bagdasar, Wei Song, Antonio Liotta
Author Information
  1. Annamaria Ficara: DMI Department, University of Palermo, Palermo, Italy.
  2. Lucia Cavallaro: School of Computing and Engineering, University of Derby, Derby, United Kingdom.
  3. Francesco Curreri: DMI Department, University of Palermo, Palermo, Italy.
  4. Giacomo Fiumara: MIFT Department, University of Messina, Messina, Italy.
  5. Pasquale De Meo: DICAM Department, University of Messina, Messina, Italy.
  6. Ovidiu Bagdasar: School of Computing and Engineering, University of Derby, Derby, United Kingdom.
  7. Wei Song: College of Information Technology, Shanghai Ocean University, Shanghai, China.
  8. Antonio Liotta: Faculty of Computer Science, Free University of Bozen-Bolzano, Bozen-Bolzano, Italy.

Abstract

Data collected in criminal investigations may suffer from (i) incompleteness, due to the covert nature of criminal organizations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; and (iii) inconsistency, when the same information is entered into law enforcement databases multiple times or in different formats. In this paper we analyze nine real criminal networks of different kinds (Mafia networks, criminal street gangs and terrorist organizations) to quantify the impact of incomplete data and to determine which network type is most affected by it. The networks are first pruned using two methods: (i) random edge removal, simulating the scenario in which Law Enforcement Agencies fail to intercept some calls or to spot sporadic meetings among suspects; and (ii) node removal, modeling the situation in which some suspects cannot be intercepted or investigated. We then compute spectral distances (Adjacency, Laplacian and normalized Laplacian Spectral Distances) and matrix distances (Root Euclidean Distance) between the complete and pruned networks, and compare them using statistical analysis. Our investigation yields two main findings: first, the overall understanding of the criminal networks remains high even with incomplete data on criminal interactions (i.e., when 10% of edges are removed); second, leaving even a small fraction of suspects uninvestigated (i.e., removing 2% of nodes) may lead to significant misinterpretation of the overall network.
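The pruning-and-comparison procedure described in the abstract can be illustrated with a minimal sketch in Python using NumPy. This is not the authors' implementation, and several details are assumptions made for illustration only. The input is a synthetic random graph rather than one of the nine real networks. Removed nodes are kept as isolated vertices so that the complete and pruned matrices have the same size. Spectral distances are taken as the Euclidean distance between sorted eigenvalue vectors. The Root Euclidean Distance is taken as the Frobenius norm of the difference between the square roots of the two Laplacians. All function names are hypothetical.

```python
import numpy as np


def random_graph(n, p, rng):
    """Synthetic undirected graph (stand-in for a real network), as an adjacency matrix."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)


def remove_edges(adj, fraction, rng):
    """Delete a random fraction of edges (missed calls or meetings)."""
    pruned = adj.copy()
    rows, cols = np.nonzero(np.triu(adj, k=1))
    n_remove = int(round(fraction * len(rows)))
    drop = rng.choice(len(rows), size=n_remove, replace=False)
    pruned[rows[drop], cols[drop]] = 0.0
    pruned[cols[drop], rows[drop]] = 0.0
    return pruned


def remove_nodes(adj, fraction, rng):
    """Isolate a random fraction of nodes (suspects never investigated).

    Assumption: removed nodes stay in the matrix as isolated vertices,
    so both graphs keep the same dimension.
    """
    pruned = adj.copy()
    n = adj.shape[0]
    drop = rng.choice(n, size=int(round(fraction * n)), replace=False)
    pruned[drop, :] = 0.0
    pruned[:, drop] = 0.0
    return pruned


def laplacian(adj):
    return np.diag(adj.sum(axis=1)) - adj


def normalized_laplacian(adj):
    """D^{-1/2} (D - A) D^{-1/2}, with zero rows for isolated nodes."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return inv_sqrt[:, None] * laplacian(adj) * inv_sqrt[None, :]


def spectral_distance(m1, m2):
    """Euclidean distance between the sorted spectra of two symmetric matrices."""
    return float(np.linalg.norm(np.linalg.eigvalsh(m1) - np.linalg.eigvalsh(m2)))


def sym_sqrt(m):
    """Square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T


def root_euclidean_distance(adj1, adj2):
    """Assumed definition: || L1^{1/2} - L2^{1/2} ||_F."""
    diff = sym_sqrt(laplacian(adj1)) - sym_sqrt(laplacian(adj2))
    return float(np.linalg.norm(diff, ord="fro"))


rng = np.random.default_rng(0)
full = random_graph(50, 0.1, rng)
for label, pruned in [
    ("10% edges removed", remove_edges(full, 0.10, rng)),
    ("2% nodes removed", remove_nodes(full, 0.02, rng)),
]:
    print(label)
    print("  adjacency spectral distance:", spectral_distance(full, pruned))
    print("  Laplacian spectral distance:",
          spectral_distance(laplacian(full), laplacian(pruned)))
    print("  normalized Laplacian spectral distance:",
          spectral_distance(normalized_laplacian(full), normalized_laplacian(pruned)))
    print("  root Euclidean distance:", root_euclidean_distance(full, pruned))
```

Values printed for a synthetic graph say nothing about the paper's findings. The sketch only shows how a complete network and its pruned counterpart might be compared once both are expressed as matrices of the same dimension.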

MeSH Term

Algorithms
Criminals
Data Analysis
Humans
Social Networking
Terrorism
