research papers on machine learning applications


Journal of Machine Learning Research

The Journal of Machine Learning Research (JMLR), established in 2000, provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online.

  • 2024.02.18: Volume 24 completed; Volume 25 began.
  • 2023.01.20: Volume 23 completed; Volume 24 began.
  • 2022.07.20: New special issue on climate change.
  • 2022.02.18: New blog post: Retrospectives from 20 Years of JMLR.
  • 2022.01.25: Volume 22 completed; Volume 23 began.
  • 2021.12.02: Message from outgoing co-EiC Bernhard Schölkopf.
  • 2021.02.10: Volume 21 completed; Volume 22 began.

Latest papers

Topological Node2vec: Enhanced Graph Embedding via Persistent Homology Yasuaki Hiraoka, Yusuke Imoto, Théo Lacombe, Killian Meehan, Toshiaki Yachimura , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Granger Causal Inference in Multivariate Hawkes Processes by Minimum Message Length Katerina Hlaváčková-Schindler, Anna Melnykova, Irene Tubikanec , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Representation Learning via Manifold Flattening and Reconstruction Michael Psenka, Druv Pai, Vishal Raman, Shankar Sastry, Yi Ma , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Bagging Provides Assumption-free Stability Jake A. Soloff, Rina Foygel Barber, Rebecca Willett , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Fairness guarantees in multi-class classification with demographic parity Christophe Denis, Romuald Elie, Mohamed Hebiri, François Hu , 2024. [ abs ][ pdf ][ bib ]

Regimes of No Gain in Multi-class Active Learning Gan Yuan, Yunfan Zhao, Samory Kpotufe , 2024. [ abs ][ pdf ][ bib ]

Learning Optimal Dynamic Treatment Regimens Subject to Stagewise Risk Controls Mochuan Liu, Yuanjia Wang, Haoda Fu, Donglin Zeng , 2024. [ abs ][ pdf ][ bib ]

Margin-Based Active Learning of Classifiers Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice , 2024. [ abs ][ pdf ][ bib ]

Random Subgraph Detection Using Queries Wasim Huleihel, Arya Mazumdar, Soumyabrata Pal , 2024. [ abs ][ pdf ][ bib ]

Classification with Deep Neural Networks and Logistic Loss Zihan Zhang, Lei Shi, Ding-Xuan Zhou , 2024. [ abs ][ pdf ][ bib ]

Spectral learning of multivariate extremes Marco Avella Medina, Richard A Davis, Gennady Samorodnitsky , 2024. [ abs ][ pdf ][ bib ]

Sum-of-norms clustering does not separate nearby balls Alexander Dunlap, Jean-Christophe Mourrat , 2024. [ abs ][ pdf ][ bib ]      [ code ]

An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization Guy Kornowski, Ohad Shamir , 2024. [ abs ][ pdf ][ bib ]

Linear Distance Metric Learning with Noisy Labels Meysam Alishahi, Anna Little, Jeff M. Phillips , 2024. [ abs ][ pdf ][ bib ]      [ code ]

OpenBox: A Python Toolkit for Generalized Black-box Optimization Huaijun Jiang, Yu Shen, Yang Li, Beicheng Xu, Sixian Du, Wentao Zhang, Ce Zhang, Bin Cui , 2024. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Generative Adversarial Ranking Nets Yinghua Yao, Yuangang Pan, Jing Li, Ivor W. Tsang, Xin Yao , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Predictive Inference with Weak Supervision Maxime Cauchois, Suyash Gupta, Alnur Ali, John C. Duchi , 2024. [ abs ][ pdf ][ bib ]

Functions with average smoothness: structure, algorithms, and learning Yair Ashlagi, Lee-Ad Gottlieb, Aryeh Kontorovich , 2024. [ abs ][ pdf ][ bib ]

Differentially Private Data Release for Mixed-type Data via Latent Factor Models Yanqing Zhang, Qi Xu, Niansheng Tang, Annie Qu , 2024. [ abs ][ pdf ][ bib ]

The Non-Overlapping Statistical Approximation to Overlapping Group Lasso Mingyu Qi, Tianxi Li , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Faster Rates of Differentially Private Stochastic Convex Optimization Jinyan Su, Lijie Hu, Di Wang , 2024. [ abs ][ pdf ][ bib ]

Nonasymptotic analysis of Stochastic Gradient Hamiltonian Monte Carlo under local conditions for nonconvex optimization O. Deniz Akyildiz, Sotirios Sabanis , 2024. [ abs ][ pdf ][ bib ]

Finite-time Analysis of Globally Nonstationary Multi-Armed Bandits Junpei Komiyama, Edouard Fouché, Junya Honda , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Stable Implementation of Probabilistic ODE Solvers Nicholas Krämer, Philipp Hennig , 2024. [ abs ][ pdf ][ bib ]

More PAC-Bayes bounds: From bounded losses, to losses with general tail behaviors, to anytime validity Borja Rodríguez-Gálvez, Ragnar Thobaben, Mikael Skoglund , 2024. [ abs ][ pdf ][ bib ]

Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space Zhengdao Chen , 2024. [ abs ][ pdf ][ bib ]

QDax: A Library for Quality-Diversity and Population-based Algorithms with Hardware Acceleration Felix Chalumeau, Bryan Lim, Raphaël Boige, Maxime Allard, Luca Grillotti, Manon Flageat, Valentin Macé, Guillaume Richard, Arthur Flajolet, Thomas Pierrot, Antoine Cully , 2024. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Random Forest Weighted Local Fréchet Regression with Random Objects Rui Qiu, Zhou Yu, Ruoqing Zhu , 2024. [ abs ][ pdf ][ bib ]      [ code ]

PhAST: Physics-Aware, Scalable, and Task-Specific GNNs for Accelerated Catalyst Design Alexandre Duval, Victor Schmidt, Santiago Miret, Yoshua Bengio, Alex Hernández-García, David Rolnick , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Unsupervised Anomaly Detection Algorithms on Real-world Data: How Many Do We Need? Roel Bouman, Zaharah Bukhsh, Tom Heskes , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Multi-class Probabilistic Bounds for Majority Vote Classifiers with Partially Labeled Data Vasilii Feofanov, Emilie Devijver, Massih-Reza Amini , 2024. [ abs ][ pdf ][ bib ]

Information Processing Equalities and the Information–Risk Bridge Robert C. Williamson, Zac Cranko , 2024. [ abs ][ pdf ][ bib ]

Nonparametric Regression for 3D Point Cloud Learning Xinyi Li, Shan Yu, Yueying Wang, Guannan Wang, Li Wang, Ming-Jun Lai , 2024. [ abs ][ pdf ][ bib ]      [ code ]

AMLB: an AutoML Benchmark Pieter Gijsbers, Marcos L. P. Bueno, Stefan Coors, Erin LeDell, Sébastien Poirier, Janek Thomas, Bernd Bischl, Joaquin Vanschoren , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Materials Discovery using Max K-Armed Bandit Nobuaki Kikkawa, Hiroshi Ohno , 2024. [ abs ][ pdf ][ bib ]

Semi-supervised Inference for Block-wise Missing Data without Imputation Shanshan Song, Yuanyuan Lin, Yong Zhou , 2024. [ abs ][ pdf ][ bib ]

Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization Peng Zhao, Yu-Jie Zhang, Lijun Zhang, Zhi-Hua Zhou , 2024. [ abs ][ pdf ][ bib ]

Scaling Speech Technology to 1,000+ Languages Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli , 2024. [ abs ][ pdf ][ bib ]      [ code ]

MAP- and MLE-Based Teaching Hans Ulrich Simon, Jan Arne Telle , 2024. [ abs ][ pdf ][ bib ]

A General Framework for the Analysis of Kernel-based Tests Tamara Fernández, Nicolás Rivera , 2024. [ abs ][ pdf ][ bib ]

Overparametrized Multi-layer Neural Networks: Uniform Concentration of Neural Tangent Kernel and Convergence of Stochastic Gradient Descent Jiaming Xu, Hanjing Zhu , 2024. [ abs ][ pdf ][ bib ]

Sparse Representer Theorems for Learning in Reproducing Kernel Banach Spaces Rui Wang, Yuesheng Xu, Mingsong Yan , 2024. [ abs ][ pdf ][ bib ]

Exploration of the Search Space of Gaussian Graphical Models for Paired Data Alberto Roverato, Dung Ngoc Nguyen , 2024. [ abs ][ pdf ][ bib ]

The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective Chi-Heng Lin, Chiraag Kaushik, Eva L. Dyer, Vidya Muthukumar , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Stochastic Approximation with Decision-Dependent Distributions: Asymptotic Normality and Optimality Joshua Cutler, Mateo Díaz, Dmitriy Drusvyatskiy , 2024. [ abs ][ pdf ][ bib ]

Minimax Rates for High-Dimensional Random Tessellation Forests Eliza O'Reilly, Ngoc Mai Tran , 2024. [ abs ][ pdf ][ bib ]

Nonparametric Estimation of Non-Crossing Quantile Regression Process with Deep ReQU Neural Networks Guohao Shen, Yuling Jiao, Yuanyuan Lin, Joel L. Horowitz, Jian Huang , 2024. [ abs ][ pdf ][ bib ]

Spatial meshing for general Bayesian multivariate models Michele Peruzzi, David B. Dunson , 2024. [ abs ][ pdf ][ bib ]      [ code ]

A Semi-parametric Estimation of Personalized Dose-response Function Using Instrumental Variables Wei Luo, Yeying Zhu, Xuekui Zhang, Lin Lin , 2024. [ abs ][ pdf ][ bib ]

Learning Non-Gaussian Graphical Models via Hessian Scores and Triangular Transport Ricardo Baptista, Rebecca Morrison, Olivier Zahm, Youssef Marzouk , 2024. [ abs ][ pdf ][ bib ]      [ code ]

On the Learnability of Out-of-distribution Detection Zhen Fang, Yixuan Li, Feng Liu, Bo Han, Jie Lu , 2024. [ abs ][ pdf ][ bib ]

Win: Weight-Decay-Integrated Nesterov Acceleration for Faster Network Training Pan Zhou, Xingyu Xie, Zhouchen Lin, Kim-Chuan Toh, Shuicheng Yan , 2024. [ abs ][ pdf ][ bib ]      [ code ]

On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains Yicheng Li, Zixiong Yu, Guhan Chen, Qian Lin , 2024. [ abs ][ pdf ][ bib ]

Tight Convergence Rate Bounds for Optimization Under Power Law Spectral Conditions Maksim Velikanov, Dmitry Yarotsky , 2024. [ abs ][ pdf ][ bib ]

ptwt - The PyTorch Wavelet Toolbox Moritz Wolter, Felix Blanke, Jochen Garcke, Charles Tapley Hoyt , 2024. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Choosing the Number of Topics in LDA Models – A Monte Carlo Comparison of Selection Criteria Victor Bystrov, Viktoriia Naboka-Krell, Anna Staszewska-Bystrova, Peter Winker , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Functional Directed Acyclic Graphs Kuang-Yao Lee, Lexin Li, Bing Li , 2024. [ abs ][ pdf ][ bib ]

Unlabeled Principal Component Analysis and Matrix Completion Yunzhen Yao, Liangzu Peng, Manolis C. Tsakiris , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Distributed Estimation on Semi-Supervised Generalized Linear Model Jiyuan Tu, Weidong Liu, Xiaojun Mao , 2024. [ abs ][ pdf ][ bib ]

Towards Explainable Evaluation Metrics for Machine Translation Christoph Leiter, Piyawat Lertvittayakumjorn, Marina Fomicheva, Wei Zhao, Yang Gao, Steffen Eger , 2024. [ abs ][ pdf ][ bib ]

Differentially private methods for managing model uncertainty in linear regression Víctor Peña, Andrés F. Barrientos , 2024. [ abs ][ pdf ][ bib ]

Data Summarization via Bilevel Optimization Zalán Borsos, Mojmír Mutný, Marco Tagliasacchi, Andreas Krause , 2024. [ abs ][ pdf ][ bib ]

Pareto Smoothed Importance Sampling Aki Vehtari, Daniel Simpson, Andrew Gelman, Yuling Yao, Jonah Gabry , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Policy Gradient Methods in the Presence of Symmetries and State Abstractions Prakash Panangaden, Sahand Rezaei-Shoshtari, Rosie Zhao, David Meger, Doina Precup , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Scaling Instruction-Finetuned Language Models Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Yunxuan Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Alex Castro-Ros, Marie Pellat, Kevin Robinson, Dasha Valter, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Wei , 2024. [ abs ][ pdf ][ bib ]

Tangential Wasserstein Projections Florian Gunsilius, Meng Hsuan Hsieh, Myung Jin Lee , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Learnability of Linear Port-Hamiltonian Systems Juan-Pablo Ortega, Daiying Yin , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Off-Policy Action Anticipation in Multi-Agent Reinforcement Learning Ariyan Bighashdel, Daan de Geus, Pavol Jancura, Gijs Dubbelman , 2024. [ abs ][ pdf ][ bib ]      [ code ]

On Unbiased Estimation for Partially Observed Diffusions Jeremy Heng, Jeremie Houssineau, Ajay Jasra , 2024. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Improving Lipschitz-Constrained Neural Networks by Learning Activation Functions Stanislas Ducotterd, Alexis Goujon, Pakshal Bohra, Dimitris Perdios, Sebastian Neumayer, Michael Unser , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Mathematical Framework for Online Social Media Auditing Wasim Huleihel, Yehonathan Refael , 2024. [ abs ][ pdf ][ bib ]

An Embedding Framework for the Design and Analysis of Consistent Polyhedral Surrogates Jessie Finocchiaro, Rafael M. Frongillo, Bo Waggoner , 2024. [ abs ][ pdf ][ bib ]

Low-rank Variational Bayes correction to the Laplace method Janet van Niekerk, Haavard Rue , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Scaling the Convex Barrier with Sparse Dual Algorithms Alessandro De Palma, Harkirat Singh Behl, Rudy Bunel, Philip H.S. Torr, M. Pawan Kumar , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Causal-learn: Causal Discovery in Python Yujia Zheng, Biwei Huang, Wei Chen, Joseph Ramsey, Mingming Gong, Ruichu Cai, Shohei Shimizu, Peter Spirtes, Kun Zhang , 2024. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Decomposed Linear Dynamical Systems (dLDS) for learning the latent components of neural dynamics Noga Mudrik, Yenho Chen, Eva Yezerets, Christopher J. Rozell, Adam S. Charles , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Existence and Minimax Theorems for Adversarial Surrogate Risks in Binary Classification Natalie S. Frank, Jonathan Niles-Weed , 2024. [ abs ][ pdf ][ bib ]

Data Thinning for Convolution-Closed Distributions Anna Neufeld, Ameer Dharamshi, Lucy L. Gao, Daniela Witten , 2024. [ abs ][ pdf ][ bib ]      [ code ]

A projected semismooth Newton method for a class of nonconvex composite programs with strong prox-regularity Jiang Hu, Kangkang Deng, Jiayuan Wu, Quanzheng Li , 2024. [ abs ][ pdf ][ bib ]

Revisiting RIP Guarantees for Sketching Operators on Mixture Models Ayoub Belhadji, Rémi Gribonval , 2024. [ abs ][ pdf ][ bib ]

Monotonic Risk Relationships under Distribution Shifts for Regularized Risk Minimization Daniel LeJeune, Jiayu Liu, Reinhard Heckel , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient adaptive algorithms for neural networks Dong-Young Lim, Sotirios Sabanis , 2024. [ abs ][ pdf ][ bib ]

Axiomatic effect propagation in structural causal models Raghav Singal, George Michailidis , 2024. [ abs ][ pdf ][ bib ]

Optimal First-Order Algorithms as a Function of Inequalities Chanwoo Park, Ernest K. Ryu , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Resource-Efficient Neural Networks for Embedded Systems Wolfgang Roth, Günther Schindler, Bernhard Klein, Robert Peharz, Sebastian Tschiatschek, Holger Fröning, Franz Pernkopf, Zoubin Ghahramani , 2024. [ abs ][ pdf ][ bib ]

Trained Transformers Learn Linear Models In-Context Ruiqi Zhang, Spencer Frei, Peter L. Bartlett , 2024. [ abs ][ pdf ][ bib ]

Adam-family Methods for Nonsmooth Optimization with Convergence Guarantees Nachuan Xiao, Xiaoyin Hu, Xin Liu, Kim-Chuan Toh , 2024. [ abs ][ pdf ][ bib ]

Efficient Modality Selection in Multimodal Learning Yifei He, Runxiang Cheng, Gargi Balasubramaniam, Yao-Hung Hubert Tsai, Han Zhao , 2024. [ abs ][ pdf ][ bib ]

A Multilabel Classification Framework for Approximate Nearest Neighbor Search Ville Hyvönen, Elias Jääsaari, Teemu Roos , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Probabilistic Forecasting with Generative Networks via Scoring Rule Minimization Lorenzo Pacchiardi, Rilwan A. Adewoyin, Peter Dueben, Ritabrata Dutta , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Multiple Descent in the Multiple Random Feature Model Xuran Meng, Jianfeng Yao, Yuan Cao , 2024. [ abs ][ pdf ][ bib ]

Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed Sampling Ye He, Tyler Farghly, Krishnakumar Balasubramanian, Murat A. Erdogdu , 2024. [ abs ][ pdf ][ bib ]

Invariant and Equivariant Reynolds Networks Akiyoshi Sannai, Makoto Kawano, Wataru Kumagai , 2024. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Personalized PCA: Decoupling Shared and Unique Features Naichen Shi, Raed Al Kontar , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Survival Kernets: Scalable and Interpretable Deep Kernel Survival Analysis with an Accuracy Guarantee George H. Chen , 2024. [ abs ][ pdf ][ bib ]      [ code ]

On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, Mengdi Wang, Alec Koppel , 2024. [ abs ][ pdf ][ bib ]

Convergence for nonconvex ADMM, with applications to CT imaging Rina Foygel Barber, Emil Y. Sidky , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Distributed Gaussian Mean Estimation under Communication Constraints: Optimal Rates and Communication-Efficient Algorithms T. Tony Cai, Hongji Wei , 2024. [ abs ][ pdf ][ bib ]

Sparse NMF with Archetypal Regularization: Computational and Robustness Properties Kayhan Behdin, Rahul Mazumder , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Deep Network Approximation: Beyond ReLU to Diverse Activation Functions Shijun Zhang, Jianfeng Lu, Hongkai Zhao , 2024. [ abs ][ pdf ][ bib ]

Effect-Invariant Mechanisms for Policy Generalization Sorawit Saengkyongam, Niklas Pfister, Predrag Klasnja, Susan Murphy, Jonas Peters , 2024. [ abs ][ pdf ][ bib ]

Pygmtools: A Python Graph Matching Toolkit Runzhong Wang, Ziao Guo, Wenzheng Pan, Jiale Ma, Yikai Zhang, Nan Yang, Qi Liu, Longxuan Wei, Hanxue Zhang, Chang Liu, Zetian Jiang, Xiaokang Yang, Junchi Yan , 2024. (Machine Learning Open Source Software Paper) [ abs ][ pdf ][ bib ]      [ code ]

Heterogeneous-Agent Reinforcement Learning Yifan Zhong, Jakub Grudzien Kuba, Xidong Feng, Siyi Hu, Jiaming Ji, Yaodong Yang , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Sample-efficient Adversarial Imitation Learning Dahuin Jung, Hyungyu Lee, Sungroh Yoon , 2024. [ abs ][ pdf ][ bib ]

Stochastic Modified Flows, Mean-Field Limits and Dynamics of Stochastic Gradient Descent Benjamin Gess, Sebastian Kassing, Vitalii Konarovskyi , 2024. [ abs ][ pdf ][ bib ]

Rates of convergence for density estimation with generative adversarial networks Nikita Puchkin, Sergey Samsonov, Denis Belomestny, Eric Moulines, Alexey Naumov , 2024. [ abs ][ pdf ][ bib ]

Additive smoothing error in backward variational inference for general state-space models Mathis Chagneux, Elisabeth Gassiat, Pierre Gloaguen, Sylvain Le Corff , 2024. [ abs ][ pdf ][ bib ]

Optimal Bump Functions for Shallow ReLU networks: Weight Decay, Depth Separation, Curse of Dimensionality Stephan Wojtowytsch , 2024. [ abs ][ pdf ][ bib ]

Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees Alexander Terenin, David R. Burt, Artem Artemev, Seth Flaxman, Mark van der Wilk, Carl Edward Rasmussen, Hong Ge , 2024. [ abs ][ pdf ][ bib ]      [ code ]

On Tail Decay Rate Estimation of Loss Function Distributions Etrit Haxholli, Marco Lorenzi , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Deep Nonparametric Estimation of Operators between Infinite Dimensional Spaces Hao Liu, Haizhao Yang, Minshuo Chen, Tuo Zhao, Wenjing Liao , 2024. [ abs ][ pdf ][ bib ]

Post-Regularization Confidence Bands for Ordinary Differential Equations Xiaowu Dai, Lexin Li , 2024. [ abs ][ pdf ][ bib ]

On the Generalization of Stochastic Gradient Descent with Momentum Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher, Ashish Khisti, Ben Liang , 2024. [ abs ][ pdf ][ bib ]

Pursuit of the Cluster Structure of Network Lasso: Recovery Condition and Non-convex Extension Shotaro Yagishita, Jun-ya Gotoh , 2024. [ abs ][ pdf ][ bib ]

Iterate Averaging in the Quest for Best Test Error Diego Granziol, Nicholas P. Baskerville, Xingchen Wan, Samuel Albanie, Stephen Roberts , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Nonparametric Inference under B-bits Quantization Kexuan Li, Ruiqi Liu, Ganggang Xu, Zuofeng Shang , 2024. [ abs ][ pdf ][ bib ]

Black Box Variational Inference with a Deterministic Objective: Faster, More Accurate, and Even More Black Box Ryan Giordano, Martin Ingram, Tamara Broderick , 2024. [ abs ][ pdf ][ bib ]      [ code ]

On Sufficient Graphical Models Bing Li, Kyongwon Kim , 2024. [ abs ][ pdf ][ bib ]

Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond Nathan Kallus, Xiaojie Mao, Masatoshi Uehara , 2024. [ abs ][ pdf ][ bib ]      [ code ]

On the Effect of Initialization: The Scaling Path of 2-Layer Neural Networks Sebastian Neumayer, Lénaïc Chizat, Michael Unser , 2024. [ abs ][ pdf ][ bib ]

Improving physics-informed neural networks with meta-learned optimization Alex Bihlo , 2024. [ abs ][ pdf ][ bib ]

A Comparison of Continuous-Time Approximations to Stochastic Gradient Descent Stefan Ankirchner, Stefan Perko , 2024. [ abs ][ pdf ][ bib ]

Critically Assessing the State of the Art in Neural Network Verification Matthias König, Annelot W. Bosman, Holger H. Hoos, Jan N. van Rijn , 2024. [ abs ][ pdf ][ bib ]

Estimating the Minimizer and the Minimum Value of a Regression Function under Passive Design Arya Akhavan, Davit Gogolashvili, Alexandre B. Tsybakov , 2024. [ abs ][ pdf ][ bib ]

Modeling Random Networks with Heterogeneous Reciprocity Daniel Cirkovic, Tiandong Wang , 2024. [ abs ][ pdf ][ bib ]

Exploration, Exploitation, and Engagement in Multi-Armed Bandits with Abandonment Zixian Yang, Xin Liu, Lei Ying , 2024. [ abs ][ pdf ][ bib ]

On Efficient and Scalable Computation of the Nonparametric Maximum Likelihood Estimator in Mixture Models Yangjing Zhang, Ying Cui, Bodhisattva Sen, Kim-Chuan Toh , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Decorrelated Variable Importance Isabella Verdinelli, Larry Wasserman , 2024. [ abs ][ pdf ][ bib ]

Model-Free Representation Learning and Exploration in Low-Rank MDPs Aditya Modi, Jinglin Chen, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal , 2024. [ abs ][ pdf ][ bib ]

Seeded Graph Matching for the Correlated Gaussian Wigner Model via the Projected Power Method Ernesto Araya, Guillaume Braun, Hemant Tyagi , 2024. [ abs ][ pdf ][ bib ]      [ code ]

Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization Shicong Cen, Yuting Wei, Yuejie Chi , 2024. [ abs ][ pdf ][ bib ]

Power of knockoff: The impact of ranking algorithm, augmented design, and symmetric statistic Zheng Tracy Ke, Jun S. Liu, Yucong Ma , 2024. [ abs ][ pdf ][ bib ]

Lower Complexity Bounds of Finite-Sum Optimization Problems: The Results and Construction Yuze Han, Guangzeng Xie, Zhihua Zhang , 2024. [ abs ][ pdf ][ bib ]

On Truthing Issues in Supervised Classification Jonathan K. Su , 2024. [ abs ][ pdf ][ bib ]

  • Comput Struct Biotechnol J

Applied machine learning in cancer research: A systematic review for patient diagnosis, classification and prognosis

Konstantina Kourou

a Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece

g Foundation for Research and Technology-Hellas, Institute of Molecular Biology and Biotechnology, Dept. of Biomedical Research, Ioannina GR45110, Greece

Konstantinos P. Exarchos

b Dept. of Respiratory Medicine, Faculty of Medicine, University of Ioannina, Ioannina, Greece

Costas Papaloukas

c Dept. of Biological Applications and Technology, University of Ioannina, Ioannina, Greece

Prodromos Sakaloglou

d Dept. of Precision and Molecular Medicine, Unit of Liquid Biopsy in Oncology, Ioannina University Hospital, Ioannina, Greece

e Laboratory of Medical Genetics in Clinical Practice, School of Health Sciences, Faculty of Medicine, University of Ioannina, Ioannina, Greece

Themis Exarchos

f Dept. of Informatics, Ionian University, Corfu, Greece

Dimitrios I. Fotiadis


Artificial Intelligence (AI) has recently altered the landscape of cancer research and medical oncology using traditional Machine Learning (ML) algorithms and cutting-edge Deep Learning (DL) architectures. In this review article we focus on the ML aspect of AI applications in cancer research and present the most indicative studies with respect to the ML algorithms and data used. The PubMed and dblp databases were considered to obtain the most relevant research works of the last five years. Based on a comparison of the identified studies and their clinical research outcomes concerning ML applications in cancer research, three main clinical scenarios were identified. We give an overview of the well-known DL and Reinforcement Learning (RL) methodologies, as well as their application in clinical practice, and we briefly discuss Systems Biology in cancer research. We also provide a thorough examination of the clinical scenarios with respect to disease diagnosis, patient classification and cancer prognosis and survival. The most relevant studies identified in the preceding year are presented along with their primary findings. Furthermore, we examine the effective implementation and the main points that need to be addressed in the direction of robustness, explainability and transparency of predictive models. Finally, we summarize the most recent advances in the field of AI/ML applications in cancer research and medical oncology, as well as some of the challenges and open issues that need to be addressed before data-driven models can be implemented in healthcare systems to assist physicians in their daily practice.

1. Introduction

Artificial Intelligence (AI) has recently made notable progress in many areas, including medicine and biomedical research. AI, a branch of computer science, encompasses mathematical methods that enable decision making and action, rational and autonomous reasoning, and effective adaptation to complex and unseen situations [1]. AI systems bring together several algorithms originating from the subfield of Machine Learning (ML) to advance the automation of human experts’ tasks, leading to real and tangible results in healthcare. Given the digital acquisition of high-dimensional annotated medical data, the improvements in ML methods, open ML data science, and evolving computational power and storage services, we can anticipate tremendous progress of AI in the medical practice landscape [2], [3].

Recently, the medical applications of AI have expanded to clinical practice, translational medicine and the biomedical research of various diseases, such as cancer [1], [4]. Current AI systems, based solely on ML methodologies, have been applied to different aspects of clinical practice including: (i) image-based computer-aided detection and diagnosis within many medical specialties (e.g. pathology, radiology, ophthalmology and dermatology), (ii) the interpretation of genomic data for identifying genetic variants based on high-throughput sequencing technologies, (iii) the prediction of patient prognosis and monitoring, (iv) the discovery of novel biomarkers through the integration of omics and phenotype data, (v) the detection of health status in terms of biological signals collected from wearable devices, and finally (vi) the development and application of autonomous robots in medical interventions [1].

Using AI/ML technologies in precision oncology and integrating them into clinical practice, however, raises technological challenges in model development [5], [6]. Data curation and sanitization reduce bias in data collection and management, which in turn prevents errors during the training and testing phases. These challenges, along with social, economic and legal aspects, should be considered before the deployment of AI/ML systems in medical practice to empower clinicians for better prevention, diagnosis, treatment and care in oncology. In addition, improving the performance, reproducibility and reliability of AI/ML models would augment the work of clinicians by supporting better diagnostic decisions and tailoring medical treatment to the patient's unique phenotype.

AI today is dominated by ML techniques capable of extracting patterns from large amounts of data as well as building reasoning systems for patient risk stratification and better decision making. For more accurate patient-level predictions and for modeling disease prognosis and risk prediction, data mining techniques and adaptive ML algorithms have consistently outperformed traditional statistical approaches [7]. ML-based techniques have the advantage of being able to automate the process of hypothesis formulation and evaluation, while assigning parameter weights to predictors based on their correlation with the predicted outcome [6], [8]. Despite this, the enormous promise of AI in cancer research should be weighed carefully alongside the challenges of transparency and reproducibility [9], [10], [11]. To realize the high potential of AI and ML in medicine and clinical trials, we need to adopt a framework for making scientific research more transparent and reproducible.

In cancer research and oncology, the successful application of Deep Learning (DL) techniques has recently demonstrated fundamental improvements in image-based disease diagnosis and detection [12], [13]. Generally, DL architectures correspond to artificial neural networks with multiple non-linear layers. Over the last decade, a variety of DL designs have been developed based on the input data types and the study aim(s). Concurrently, the evaluation of model performance has shown that DL applied to cancer prognosis outperforms other conventional ML techniques [14]. DL frameworks have also been applied towards cancer diagnosis, classification and treatment by exploiting genomic profiles and phenotype data [1], [7], [15].
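To make the phrase "artificial neural networks with multiple non-linear layers" concrete, the following is a minimal sketch of a forward pass through a small stacked network. It is an illustration only and does not reproduce any model from the reviewed studies: the layer sizes, the hand-picked weights in `LAYERS`, the choice of ReLU and sigmoid, and all function names are assumptions made for this example.

```python
import math

def relu(values):
    """Element-wise non-linearity: negative activations are set to zero."""
    return [max(0.0, v) for v in values]

def sigmoid(x):
    """Squash a real-valued score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(weights, bias, inputs):
    """Affine map: one output per row of `weights`, plus the matching bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def forward(inputs, layers):
    """Apply every hidden layer (affine map + ReLU), then a sigmoid output unit."""
    *hidden, (w_out, b_out) = layers
    h = inputs
    for weights, bias in hidden:
        h = relu(dense(weights, bias, h))
    (logit,) = dense(w_out, b_out, h)  # the output layer has a single unit
    return sigmoid(logit)

# Hypothetical toy parameters, chosen by hand (not learned from data).
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),  # hidden layer 1: 2 inputs -> 2 units
    ([[1.0, 1.0]], [0.0]),                     # hidden layer 2: 2 units -> 1 unit
    ([[2.0]], [-1.0]),                         # output layer: 1 unit -> 1 score
]

score = forward([1.0, 0.0], LAYERS)
```

In practice the weights would be learned from data and the inputs would be images or genomic features rather than two numbers; the sketch only shows how non-linear layers are composed.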

In this review article, we focus on the ML aspect of AI-based applications in cancer research and medical oncology and present relevant studies published in the last five years (2016–2020) concerning the development of robust ML models towards patient diagnosis, classification and prognosis. The selection of the material was based on three clinical scenarios considering both the ML-based techniques and the heterogeneous data sources that were employed. We provide the search criteria of the literature review used to obtain the most relevant studies, summarize the successful clinical scenarios towards applying robust and validated ML methods, and discuss the state-of-the-art DL and Reinforcement Learning (RL) applications. We also present the impact of ML models in terms of robustness and explainability, identify the achievements and new challenges of ML-based systems in healthcare, and discuss future research investigations along with the unsolved problems of reproducibility and transparency and possible solutions in the field. Systems biology and network-centered methods for computationally analyzing various sources of omics data and better comprehending the complex structures of biological processes and cellular components within cancer cells are also explained.

2. Literature review

The PubMed biomedical repository [16] and the dblp computer science bibliography [17] were selected to perform a literature overview on ML-based studies in cancer towards disease diagnosis, disease outcome prediction and patients’ classification. We searched and selected original research journal papers excluding reviews and technical reports between 2016 (January) and 2020 (December). In the PubMed’s advanced search option, we added the query terms “Cancer AND machine learning”, “Cancer AND artificial intelligence” and “Cancer AND deep learning”, separately, in the title field and not in the abstract to obtain the relevant studies. The same strategy and keywords were followed and applied to the dblp query search. According to our search results a total of 921 and 165 studies were found in PubMed and dblp databases, respectively, for the three queries. Duplicate studies and review or technical reports within the search results were excluded. A total of 734 research studies were further compared to provide a comprehensive overview of the application of ML and DL techniques in oncology research. We systematically reviewed the methods and outcomes of these research works and compared them until we identified the main clinical scenarios where ML methods are widely applied to enhance the automated decision support systems, the selection of appropriate treatments and the explanation of clinical reasoning.
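The exclusion of duplicates and of review or technical reports described above can be sketched as follows. This is a minimal illustration and not the exact procedure followed for this review: the record fields (`title`, `type`), the function names and the example records are all hypothetical.

```python
import re

def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def merge_search_results(*result_sets):
    """Merge record lists, keeping only original research and dropping duplicates."""
    seen = set()
    kept = []
    for records in result_sets:
        for record in records:
            # Exclude reviews and technical reports (hypothetical "type" field).
            if record["type"] != "research":
                continue
            key = normalize_title(record["title"])
            if key in seen:
                continue  # the same study was already found by another query or database
            seen.add(key)
            kept.append(record)
    return kept

# Hypothetical records standing in for the PubMed and dblp query results.
pubmed_records = [
    {"title": "Deep learning for breast cancer detection", "type": "research"},
    {"title": "Machine learning in cancer: a review", "type": "review"},
]
dblp_records = [
    {"title": "Deep Learning for Breast Cancer Detection.", "type": "research"},
    {"title": "Cancer subtype classification with machine learning", "type": "research"},
]

merged = merge_search_results(pubmed_records, dblp_records)
print(len(merged))
```

Matching on a normalized title is only one possible deduplication key; identifiers such as a DOI would be more reliable where available.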

The Tables showing the total number of studies identified in the preceding year for each search query in PubMed [16] and dblp [17] are given as supplementary material (Tables I-III). In the current work, we selected the most representative research from each clinical setting and provided a quick review of their key findings. To summarize the most current computational methods and clinical investigations in connection to early disease diagnosis, prognosis, and clinical outcome prediction for patient monitoring, we examined primarily preliminary studies published in the last year in both archives.

In Fig. 1 , we present the results of our literature overview on cancer diagnosis, prognosis and patients’ risk stratification the last five years on both databases. In Fig. 1 (upper part), a group-based barplot illustrates the number of articles that were identified when considering each search query in the databases. Fig. 1 (bottom part) illustrates the timeline results for each database. The total number of articles as regards to the total sum of the articles within the three queries is depicted per year.

Fig. 1. The results of our literature overview on cancer diagnosis, patients’ classification and prognosis. The upper part of the figure presents the literature search results per category when considering each database. The bottom part of the figure depicts the timeline (last five years) results considering the total number of articles for the three search queries.

In the sections that follow we present briefly the popular and rising techniques of DL and RL along with their successful application and impact in oncology research and clinical practice. The clinical scenarios we have identified according to our literature overview are clearly presented along with the relative publications in the next sections. These clinical scenarios are among the most successful domains of biomedical ML-based applications in medical oncology. We provide details on the facets of robustness and explainability when ML-based models are employed in the healthcare systems. A summary and outlook presenting the recent advances and new challenges for the application of ML models for automated decision making in the clinical practice is also given in the last section.

2.1. Deep learning in oncology research

The potential of AI/ML techniques in biomedicine and precision oncology has become apparent with advances in new ML methods for computer-aided diagnosis [7]. These new technologies have also been integrated into clinical practice to improve patient outcomes and accelerate clinical decision making [14]. DL approaches, a branch of ML, have recently proved of great help to physicians in medical oncology through the development of medical-imaging diagnostic systems that aim to improve disease diagnosis and the early detection of tumors [18], [19]. With the availability of huge amounts of data and of parallel and distributed ML frameworks for their analysis, DL architectures have emerged and are categorized into three groups: (i) deep neural networks (DNNs) [20], [21], [22], (ii) convolutional neural networks (CNNs) [23], [24] and (iii) recurrent neural networks (RNNs) [25], [26]. DL architectures are essentially artificial neural networks with numerous non-linear layers. The key distinguishing aspect of DL is that the feature layers are learnt from data using a general-purpose learning method rather than being created by the user.
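To make the notion of stacked non-linear layers concrete, the following minimal sketch computes the forward pass of a small fully connected network. It is an illustration under stated assumptions: the layer sizes, the ReLU and sigmoid activations, and the hand-picked weights are hypothetical, whereas in a real DL model the weights would be learnt from data.

```python
import math

def dense(inputs, weights, biases):
    """Affine layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Element-wise non-linearity applied between layers."""
    return [max(0.0, v) for v in values]

def sigmoid(value):
    """Squash a real-valued score into a probability-like output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

def forward(features, layers):
    """Apply ReLU after every layer except the last, then a sigmoid output."""
    hidden = features
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(hidden, weights, biases)[0])

# Hypothetical fixed weights: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),
    ([[0.3, 0.7]], [-0.1]),
]
probability = forward([1.0, 2.0], layers)
print(probability)
```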

ML can be roughly divided into three paradigms: (i) the supervised task, which includes a label/class, (ii) the unsupervised task, where no label is provided, and (iii) RL techniques, where an agent is trained to perform actions sequentially. Supervised techniques dominate today’s use of ML in automated patient-centered decision making, with decision trees (DTs), support vector machines (SVMs) and linear regression being the most common algorithms [27]. In traditional ML techniques, the main descriptors or variables are used to train a model and extract patterns and reasonable representations of feature vectors relevant to the problem under study. Despite this, the ability of conventional ML approaches to analyze natural data in its raw form is limited. By contrast, DL (i.e. the implementation of multi-layered neural networks) has gained a lot of attention for its potential to include multiple levels of feature representation as part of the learning process, thereby increasing the model’s performance, computational feasibility and scalability [28], [29]. DL approaches can be adapted to new representations of data, allowing the different layers of features to be learned from more informative data using a complex learning procedure. DL excels in perception tasks (such as image analysis and sound recognition), whereas typical ML methods struggle with high-dimensional datasets.

In cancer research and medical oncology, several DL architectures have been developed and applied for the classification and/or detection of cancer types [30]. Evaluation of the performance of DL models has shown that these techniques outperform other conventional ML approaches. DL frameworks have also been developed and further utilized for cancer diagnosis and classification based on gene expression profiles [31], [32], [33]. Concerning cancer prognosis and treatment, DL methods have been proposed to tackle the problem of predicting the drug response in certain cancer types [12].

2.2. Reinforcement learning in oncology research

RL, a distinctive class of ML, has also found applications in cancer research and medical oncology towards finding the optimal treatment policies and computer-aided disease diagnosis [34] , [35] . In an RL model, an agent (i.e. the physician) learns from the interaction with his/her environment to achieve a goal based on the outcome that he/she wants to optimize (reward function). The learning process of an agent in a typical RL cycle is a continuous procedure. The interaction with the environment occurs at discrete time points. Once an environment’s state is received the agent selects a certain action to interact with it. The environment responds then to the action and the reward that the agent will or will not receive is finally determined [36] .
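The agent-environment cycle described above (receive a state, select an action, obtain a reward) can be sketched with tabular Q-learning on a toy problem. This is a minimal illustration only: the three-state environment, the reward of 1.0 on reaching the final state, and all hyper-parameter values are hypothetical and carry no clinical meaning.

```python
import random

N_STATES = 3      # states 0 and 1 are non-terminal, state 2 is terminal
TERMINAL = 2
ACTIONS = (0, 1)  # 0 = wait, 1 = advance

def step(state, action):
    """Environment response: return (next_state, reward) for a state-action pair."""
    next_state = state + 1 if action == 1 else state
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values from interaction at discrete time points."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(20):  # cap the episode length
            # Epsilon-greedy selection: explore occasionally, otherwise exploit.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward = step(state, action)
            # Q-learning update: move the estimate towards reward + discounted future value.
            future = 0.0 if next_state == TERMINAL else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
            if state == TERMINAL:
                break
    return q

q_table = train()
# Greedy policy for the non-terminal states after training.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(TERMINAL)]
print(policy)
```

In this toy setting the learned policy should prefer the action that leads to the rewarded state; real treatment-policy problems involve far larger state spaces and uncertain, delayed outcomes.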

The incorporation of DL and RL systems into clinical practice with reference to available structured and unstructured biomedical and clinical cancer data will improve our understanding of cancer complexity and the role of risk factors and determinants in the development of effective treatment protocols.

2.3. Systems biology in oncology research

Systems biology concerns the integration of different components (i.e. genes, proteins and other cellular components) and how they interact in a dynamic environment. To facilitate our understanding on how cellular components function we need to elucidate in an integrated way how the system is organized with reference to dynamic networks of genes or proteins alongside their interactions with each other [37] . The development of AI models that predict the characteristics of large and interconnected networks found in living organisms would permit the thorough investigation of how signaling molecules produce functional cellular responses. In systems biology, mathematical descriptions of the processes during cancer progression and knowledge from network analysis and ML theories are used to identify the components and their interactions in a network-centered model and integrate them into an interconnected biological pathway. To this end, molecular or cellular associations and causal dependencies can be identified [38] , [39] , [40] .

Over the last decade, different omics platforms have provided large cancer datasets concerning the biological and cellular processes that can be studied at both the gene and protein levels. By applying AI/ML tools to omics data, based on systems biology and network-based theory, we may be able to expose the intricate structure of biological processes, improving our understanding of cancer onset and progression. Network theory and analysis in oncology research could help decipher the organization of biological signals within cells in terms of nodes (e.g. genes or proteins) and edges, which represent the relationships among them, making it possible to quantify the strength, type and direction of these interactions based primarily on omics data. High-throughput technologies, such as DNA microarrays, facilitate the simultaneous assessment of many gene expression levels as they vary over time. The huge amount of available experimental data may be used to obtain a better understanding of how genes interact with one another, forming a network and allowing for integrative analysis of biological systems [37], [38].
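As a minimal sketch of the node-and-edge representation described above, a network can be stored as a directed graph whose edges carry an interaction type and a strength. The gene names, interaction types and weights below are placeholders, not measured values, and the helper names are hypothetical.

```python
from collections import defaultdict

# (source node, target node, interaction type, strength); all values are hypothetical.
edges = [
    ("geneA", "geneB", "activation", 0.8),
    ("geneA", "geneC", "inhibition", 0.5),
    ("geneB", "geneC", "activation", 0.3),
]

def build_network(edge_list):
    """Adjacency list keyed by source node; edge direction runs source -> target."""
    network = defaultdict(list)
    for source, target, kind, strength in edge_list:
        network[source].append((target, kind, strength))
    return network

def out_strength(network):
    """Total outgoing interaction strength per node, a simple measure of influence."""
    return {node: sum(strength for _, _, strength in targets)
            for node, targets in network.items()}

network = build_network(edges)
influence = out_strength(network)
print(influence)
```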

In the following sections, we clearly provide the clinical scenarios we found based on our literature review, along with the relevant papers published in the field of cancer research and clinical oncology.

2.4. Cancer detection and diagnosis

Arguably, automated cancer detection and diagnosis is one of the most important and successful domains of biomedical ML applications. According to our search results in PubMed and dblp, over the last five years 192 research studies proposed ML-based pipelines based on conventional or state-of-the-art techniques to perform diagnostic tasks in common cancer types such as breast, lung, colon and pancreatic cancers, among others. Most of the studies used imaging data acquired from computed tomography (CT), magnetic resonance imaging (MRI), X-ray radiography and positron-emission tomography (PET) to develop automated diagnostic models based mainly on DL architectures.

In this review article, we present the most recent and indicative studies of the last year using either imaging or clinical, genomic and other relevant medical data to develop ML-based models for disease diagnosis and detection. A large proportion of this comprehensive list corresponds to studies that handle the specific clinical scenario by utilizing solely imaging data as input to DL models (Table I in the supplementary material ).

To this end, automatic disease diagnosis was studied using CNN models for the early detection of breast cancer through the analysis of histopathological images [41], [42], [43], [44], [45], [46], [47], [48], [49]. More specifically, Zheng et al. [42] proposed a CNN-based transfer learning method for the early detection of breast cancer by efficiently segmenting the regions of interest (ROIs). In comparison to traditional ML approaches, promising results were obtained, with high accuracy (i.e. 97.2%) and a fair balance between sensitivity and specificity (i.e. 98.3% and 96.5%, respectively). Similar approaches were proposed by Benhammou et al. [43], Sha et al. [44], Kumar et al. [45], Krithiga et al. [46], Hameed et al. [47], and Li et al. [48] to assess the diagnostic capability of deep CNN architectures by analyzing imaging slides. Based on the models' preprocessing, training, and evaluation procedures, favorable results were reported, with an average accuracy of 90.0%, demonstrating the authors' contribution to assisting clinicians in their diagnostic processes. DL frameworks based on the CNN architecture were also designed and developed in [50], [51], [52], [53], [54] for the analysis of CT and dermoscopy images in liver and skin cancer, respectively. In the study of Das et al. [53], the Gaussian mixture model (GMM) algorithm was first used to segment the lesions, and deep neural networks were then employed for the automated diagnostic task. Furthermore, in [54] feature selection and optimization were performed to identify the most important determinants of skin cancer detection, while a deep CNN was employed for melanoma detection. Promising results were obtained by these studies, with high classification accuracy (i.e. ∼90.0%).
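The accuracy, sensitivity and specificity figures quoted above all derive from the confusion matrix of a binary diagnostic test. The sketch below shows the standard definitions; the counts are hypothetical and are not taken from any of the cited studies.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased cases predicted positive/negative.
    tn/fp: healthy cases predicted negative/positive.
    """
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for 100 diseased and 100 healthy cases.
metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters because accuracy alone can look high on imbalanced datasets even when most diseased cases are missed.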

Conventional ML algorithms, such as DTs, Random Forests (RFs), Naïve Bayes (NB), k Nearest Neighbor (kNN), Artificial Neural Networks (ANNs), Gradient Boosting Machines (GBMs) and SVMs were also applied the last year in medical oncology for the automated detection and diagnosis of cancer. Indicative works include the studies of [55] , [56] , [57] , [58] , [59] , [60] , [61] , [62] where positive results were obtained by employing traditional ML techniques for the analysis of clinical, laboratory, genomic and epidemiological data to effectively diagnose prostate, lung, colorectal, breast and gastric cancers. In a separate work [63] , supervised and unsupervised techniques were applied to transcriptomic data for the identification of potential candidate biomarkers (i.e. genes) with reference to pancreatic cancer onset. Preprocessing steps based on certain bioinformatics workflows were applied to detect the novel gene set that contributes to the extension of prostate cancer to adjacent lymph nodes with Area Under the ROC Curve (AUC) higher than 0.90.

The total number of published studies identified in the previous year based on our literature search results for cancer prognosis and survival prediction is shown in Table II in the supplementary material .

2.5. Cancer patient classification

In medical oncology and cancer research, the classification task of disease prediction has been studied thoroughly using well-established ML algorithms for handling binary or multi-class learning problems. Patient classification into predefined groups would enable the development of ML-based predictive models able to assess risk stratification with generalizable performance. In this regard, numerous research papers published in the last year aimed to identify key variables for cancer classification using traditional algorithms and DL methods. Most of the studies utilized DL architectures for the analysis of imaging and genomic data with respect to risk prediction and stratification. Indicatively, in [64], [65], [66], [67], [68], [69] DL models were trained to classify and detect disease subtypes based on images and genetic data. These data-driven approaches demonstrated the superiority of ML-based frameworks in exploiting heterogeneous datasets for improved diagnosis and treatment.

Recently, a very interesting study was proposed by Li et al. [70] regarding the assessment of pan-cancer Ras pathway activation and the identification of hidden key players during disease progression. RNA sequencing, copy number and mutation data were integrated in the DL model to provide insights into the pathway activity. The proposed model achieved superior performance in comparison to relevant studies concerning the classification of abnormal activity of the Ras pathway in tumor samples (i.e. AUC > 0.90). In an alternative study, a colorectal cancer (CRC) cohort [71] was analyzed based on whole-genome sequencing experiments of DNA samples to obtain an ML model with accurate generalization performance for the early detection of the disease. A comprehensive ML-based pipeline was proposed to investigate the genomic profiles and cancer status and further identify the highly ranked covariates that discriminate control cases from early-stage CRC patients. According to the performance results, a mean AUC of 0.92 with a mean sensitivity of 85.0% at 85.0% specificity was achieved.
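The AUC values reported above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The following minimal sketch computes the AUC directly from that pairwise definition; the labels and scores are hypothetical and unrelated to the cited studies.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = tumor sample, 0 = control) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise form is quadratic in the number of samples and is meant for clarity; rank-based implementations are preferable for large datasets.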

Furthermore, well-known adaptive ML algorithms have been used widely in the literature for cancer classification by integrating different types of data [72] , [73] , [74] , [75] , [76] . Song et al. [77] proposed a predictive model for long-term prognosis of bladder cancer based on the learning ability of ML algorithms. The validated classification model was developed by utilizing clinical and molecular features able to distinguish cancerous from non-cancerous samples. Positive results were obtained in terms of classification performance with AUC higher than 0.70. Recently, similar works were published [60] , [78] , [79] , [80] , [81] aiming at applying data-driven methodologies to classify cancer data for prediction purposes. These studies correspond to ML-based models that improve the decision making process of physicians during patient monitoring and follow-up. Due to the availability of large amounts of heterogeneous data types in cancer research these studies utilized cancer data coming from patient registries, electronic health records, demographics, sequencing and imaging technologies.

Two distinct research studies [82], [83] were recently published which use CT data integrated with radiomics features to classify cancer cases for improved prediction in lung cancer and pulmonary lesions, respectively. The combination of radiomic features with clinical information by means of ML algorithms enabled the extraction of potential patient characteristics that need to be considered thoroughly for disease screening. The performance metrics of the proposed ML-based methods were high, with classification accuracy and AUC higher than 77.0% and 0.80, respectively.

The total number of studies identified in the previous year based on our indicative literature search results for cancer prognosis and survival prediction is shown in Table III in the supplementary material .

2.6. Cancer prognosis and survival prediction

This is another important aspect of cancer research where AI is expected to provide significant insight into the management of patients diagnosed with cancer. Specifically, in this category we have gathered studies aiming to assess the prognosis of patients, i.e. to predict approximate survival based on a set of features (clinical, imaging, genomic), evaluate response to treatment and consequently patient prognosis. Due to the volume of data and its complexity, such analyses would be infeasible without the employment of ML algorithms and especially DL techniques. During the last year alone, approximately 200 studies were published aiming to assess cancer prognosis. Among them, the considerable majority utilized DL techniques, whereas only a small percentage used traditional ML algorithms.

As in the previous scenarios, and in accordance with cancer research overall, certain organ cancers are predominantly studied, such as breast, lung, prostate and colorectal cancers. The types of data used vary across the studies; however, we observe a propensity towards specific sources of data for each cancer type. Specifically, pathology data are used in prostate cancer, breast and colorectal cancer research utilizes genomic data, and lung cancer research is largely dependent on imaging data, especially CT scans. Despite those slight propensities per cancer type, we observe that all ML techniques, and especially DL techniques, are primarily used for the analysis of imaging data, irrespective of the imaging modality employed.

An interesting approach was recently proposed in [84], where an automated deep learning system was trained to grade prostate biopsies following the Gleason grading system. Similar approaches have been presented in the literature for assessing prostate cancer prognosis [85], [86]. In the same manner, a DL approach is proposed in [87] to discern between benign and malignant lesions of the skin, resulting in an overall AUC of 0.91. Another common use of ML for cancer prognosis is estimating a patient's approximate survival from a set of baseline features; incorporating information from subsequent follow-up visits achieves higher accuracy. Such studies have been presented in the literature for several types of cancer, e.g. lung cancer [88], breast cancer [89], bladder cancer [90], etc. For a similar end purpose, ML algorithms have also been used for predicting response to treatment and consequently assessing the patient’s overall prognosis and survival [91].

The total number of relevant studies identified in the preceding year based on our literature search results for cancer prognosis and survival prediction is shown in Tables I-III in the supplementary material .

2.7. Robustness and explainability of AI/ML models

The recent advances in AI/ML raised the issue of vulnerabilities that affect the predictive models and strongly impact their robustness. To this direction, a set of principles for trustworthy and secure use of ML techniques in the digital society have been drawn to augment innovation while protecting fundamental human rights [92] . Although ML techniques could extract complex patterns and correlations from large datasets, there is a severe lack of understanding considering the causal relationships and the explicit rules [93] .

To ensure the right deployment of ML models in clinical practice in accordance with a sound regulatory framework, three main topics need to be highlighted and addressed. Firstly, the transparency of the models, which relates to the technical requirements and the data used, should be achieved. To obtain a complete view of an ML model, the levels of implementation (i.e. technical principles), specifications (i.e. details about the training and testing phase) and interpretability (understanding the model’s reasoning) must be fulfilled. The second topic concerns the reliability of the models along with the technical solutions that need to be clarified and adopted to prevent failures of autonomous systems in specific conditions. To assess the reliability of a model, its performance and vulnerabilities need to be evaluated. Poor performance and the existence of malfunctions indicate that a learned ML model is not reliable. Hence, the approaches of (i) data sanitization, (ii) robust learning and (iii) extensive testing have been proposed to increase the reliability of ML models. The protection of sensitive data in ML systems is the third point that needs to be addressed to ensure a good regulatory framework for building and using automated systems. The implementation of data protection principles will guarantee compliance with privacy and data protection laws. Nevertheless, the use of anonymization and pseudonymization techniques on sensitive data [94] in accordance with the General Data Protection Regulation (GDPR) in Europe [95], and the guidelines on how the information may be used or shared in accordance with the Health Insurance Portability and Accountability Act (HIPAA) in the United States [96], may increase a model’s complexity and impact its explainability.
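As a minimal sketch of pseudonymization, a keyed hash can replace a direct patient identifier with a stable token, so that records of the same patient remain linkable without exposing the identifier. The example uses HMAC-SHA256 from the Python standard library; the function name, key and identifier format are hypothetical, and such a step on its own does not establish compliance with GDPR or HIPAA.

```python
import hashlib
import hmac

def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-securely-stored-key"

token_a = pseudonymize("patient-0001", SECRET_KEY)
token_b = pseudonymize("patient-0001", SECRET_KEY)
token_c = pseudonymize("patient-0002", SECRET_KEY)
print(token_a == token_b, token_a == token_c)
```

A keyed hash is used rather than a plain hash so that someone without the key cannot recompute pseudonyms from a list of candidate identifiers.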

Understanding the mechanisms and reasoning of a ML system in the digital society could guarantee its reliability. Introducing standardized approaches to assess the robustness of predictive models with respect to the data used for training, promoting model’s transparency through explainability-by-design principle for ML-based systems and designing methodologies to address vulnerabilities ensuring thereby the reliability will promote an effective and secure use of AI/ML systems. Furthermore, the successful establishment of good practices towards the right development and deployment of automated ML-based systems will ensure a regulatory framework for strengthening the trust in AI/ML systems.

Explainable AI (XAI) provides a framework to facilitate understanding of why an AI system or ML model has produced a given result. Interpreting the output of a model and giving explanations at the local and global levels would benefit the improvement of clinical decision support systems. Model-specific and model-agnostic analysis could be implemented to explain black-box models (such as SVM models), thereby increasing their trustworthiness and transparency in clinical trials [97]. Model-specific explanations are common but do not transfer well between models with different structures: once a new architecture for a predictive model is obtained, a new method for model exploration and diagnostics has to be found. By contrast, model-agnostic techniques could enhance a model’s exploratory analysis with instance-level exploration methods that give a better understanding of how a model yields a prediction for a particular single observation. Apart from instance-level explanations, dataset-level explainers for ML-based predictive models help to understand how the model’s predictions perform for the entire dataset and not for a certain observation [97]. For network-based models and tree-based classifiers (e.g. DTs and RFs), XAI techniques providing local and global explanations could be more beneficial for interpreting the output, given their less complex structure and hyper-parameter tuning. Although DL techniques have proved very efficient and effective in terms of performance, explanations of how a DL model has produced a result should be based on more comprehensive techniques with reference to model-specific and model-agnostic analysis.
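One widely used model-agnostic, dataset-level explainer is permutation importance: each feature is shuffled in turn and the resulting drop in performance is recorded. The sketch below is an illustration of that general technique, not of a method prescribed in [97]; the toy black-box model, the data and the function names are hypothetical.

```python
import random

def accuracy(model, X, y):
    """Fraction of observations for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the dataset with only feature j permuted.
            X_perm = [row[:j] + [value] + row[j + 1:] for row, value in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

def black_box(row):
    """Hypothetical model treated as a black box; it only uses the first feature."""
    return int(row[0] > 0.5)

X = [[0.1, 0.9], [0.2, 0.1], [0.3, 0.8], [0.4, 0.2],
     [0.6, 0.7], [0.7, 0.3], [0.8, 0.6], [0.9, 0.4]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

importances = permutation_importance(black_box, X, y)
print(importances)
```

Because the procedure only needs predictions, it applies unchanged to any model structure, which is the property that distinguishes model-agnostic from model-specific explanations.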

2.8. Summary and outlook

In our previous review article [98], we provided a comprehensive overview of ML applications for cancer prognosis and prediction by explaining the main aspects of ML and their clinical implications. On this basis, we analyzed the most representative studies published between 2010 and 2015 which used traditional ML approaches to predict cancer susceptibility, recurrence, and survivability in cancer patients. In this work, we conduct a thorough and complementary literature search on the application of ML-based models in cancer research and oncology over the last five years (2016–2020). According to our search results in both the PubMed [16] and dblp [17] databases, a comprehensive list of publications was obtained. In comparison to our previous review article, three different clinical scenarios were identified according to their clinical outcomes regarding disease diagnosis, patient classification and prognosis. To highlight the most recent achievements in the subject, we included the most representative studies from the previous year in each category. Furthermore, we investigated the contrasts between recent research works in terms of the data used and the cutting-edge ML approaches for addressing each clinical situation in the data-driven era of precision oncology.

AI and ML approaches may be utilized to explore many aspects of cancer biology and extract new insights given that any disruption to the genetic material causes the beginning and development of carcinogenesis. Apparently, a massive quantity of genetic information on neoplasia has now been gathered, and it is rapidly increasing. This knowledge has significantly aided two key goals in cancer research: (i) a better understanding of the processes and mechanisms of oncogenesis, and (ii) their direct use in clinical practice as markers of diagnosis, prognosis, prevention, and cancer treatment. Fig. 2 depicts the major subject themes in cancer biology that have been extensively explored in terms of ML applications during the last decade. We classified the main topics according to the well-established research domains in oncology research and provide indicative paradigms where ML-based methods can be employed.

Fig. 2. The most widely researched study topics in cancer biology, where ML-based techniques are frequently used. The six major features are illustrated, as well as the major biological issues they may address. CNVs: Copy Number Variations, DNA: Deoxyribonucleic acid, RNA: Ribonucleic acid, miRNAs: microRNAs.

Over the last decade, AI and its technologies have made tremendous progress, helping clinicians to automate tasks and detect disease earlier while obtaining more real and tangible results for tailoring treatments. In comparison to traditional statistical approaches, ML-based techniques have the inherent ability to identify patterns in high-dimensional datasets while automating the decision-making process by developing reasoning systems for early disease diagnosis, prognosis and risk stratification.

In the light of the recent advances in DL and RL methods for building ML-oriented systems we herein discuss their main characteristics alongside their major applications in the field of cancer research. Regarding the robustness and explainability of AI/ML systems we provide a brief overview of the standard points that need to be addressed when building ML models towards establishing a trustworthy regulatory framework while ensuring reliability, data protection and transparency as well as understanding of models.

ML methodologies have raised concerns in automated decision making tools and personal data regarding the lack of reasoning and explicit rules in black-box models. Hence, technical solutions need to be adopted for the design of standardized principles to increase the robustness and explainability of AI/ML systems as well as to face the challenges of transparency and reproducibility of AI-based solutions. Transparency and reproducibility in AI are paramount for prospectively validating and implementing such technologies and models in clinical practice [9], [10]. Several frameworks and reproducible research practices have been implemented to ensure that the methods and code underpinning a research publication are adequately documented. Transparency is handled in terms of common code, software dependencies, and the parameters required to train a model, thereby allowing the research study to be reproducible [99], [100]. Practical and pragmatic recommendations for the effective documentation of research experiments and results have been proposed in the scientific community towards reproducible research and open science [100]. The degrees of reproducibility that are introduced are: (i) experiment reproducible, (ii) data reproducible and (iii) method reproducible. A different set of factors needs to be documented within a publication to validate and reproduce the research results at each degree. Encouraging the research community to follow the best practices and recommendations for (i) data in publications, (ii) source code implementing AI/ML, (iii) AI methods and (iv) experiments described in scientific publications would be a stepping stone to accelerate transparency and reproducibility in the era of AI research. Several research groups cited replicable and clinically validated results in accordance with the oncology context, as well as transparency and validity in AI/ML-based solutions, concerning the clinical scenarios we addressed in this study and the indicative publications of 2020 described in each case [57], [67], [84].
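As a minimal sketch of documenting the parameters and software environment of an experiment, the run below fixes a random seed and writes a small manifest alongside the results. The manifest fields, parameter names and the stand-in experiment are hypothetical; this illustrates the general practice rather than any specific framework from [99], [100].

```python
import json
import platform
import random

def run_experiment(params):
    """Stand-in for a training run; results depend only on the recorded parameters."""
    rng = random.Random(params["seed"])
    return [rng.random() for _ in range(params["n_samples"])]

def build_manifest(params):
    """Collect what is needed to repeat the run: parameters and environment details."""
    return {
        "parameters": params,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
    }

params = {"seed": 42, "n_samples": 5, "learning_rate": 0.01}  # hypothetical values
first_run = run_experiment(params)
second_run = run_experiment(params)
manifest_json = json.dumps(build_manifest(params), indent=2, sort_keys=True)

print(first_run == second_run)
print(manifest_json)
```

A fuller manifest would also pin the versions of third-party dependencies and reference the exact data and code revision used.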

Furthermore, transparent and reliable predictive models can protect sensitive data through anonymization and pseudonymization approaches that have been assessed under the GDPR [94], [95]. At present, ML-based systems are not yet considered reliable enough to avoid malfunctions without human supervision. Identifying the vulnerabilities of ML models would improve the understanding of how a predictive model maps inputs to outputs during learning, thereby enhancing its robustness. According to the guidelines for the ethical development of ML-based systems [101], an ethics-by-design approach has been proposed for trustworthy AI/ML, aiming at GDPR-compliant, ethical and robust systems. Certain ethical and trustworthiness aspects are outlined, along with possible tools to self-assess an automated decision support system based on cutting-edge ML methodologies.
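As a minimal sketch of pseudonymization, a direct identifier can be replaced with a keyed hash (HMAC-SHA256) before records reach a model. This is one common technique and is not claimed to be the approach assessed in [94], [95]; the record fields, the identifiers, and the key below are hypothetical. Whoever holds the key can recompute the mapping, so this is pseudonymization and not anonymization.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a stable keyed hash that stands in for a direct identifier."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and records; in practice the key is stored separately
# from the data and access to it is restricted.
key = b"replace-with-a-securely-stored-key"
records = [
    {"patient_id": "P-001", "age": 61},
    {"patient_id": "P-002", "age": 47},
    {"patient_id": "P-001", "age": 61},
]

pseudonymized = [
    {**record, "patient_id": pseudonymize(record["patient_id"], key)}
    for record in records
]
for row in pseudonymized:
    print(row)
```

Because the hash is deterministic for a given key, repeated records of the same patient still link to each other, which keeps longitudinal analysis possible.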

In addition to the standard technical solutions regarding the trustworthiness of autonomous ML-based systems in clinical practice, we should also take into consideration the FAIR (Findable, Accessible, Interoperable and Reusable) data principles [102]. Given the complex nature of cancer and the multistep process of tumorigenesis, one can easily presume that single centers cannot provide enough data for cancer research. Tailoring treatments to patients according to their status at both the phenotype and genotype levels would accelerate the automated decision process in disease management in the era of precision oncology. Moreover, the rise of omics data and their integration in precision oncology will promote a global and integrative analytical approach. Therefore, adherence to the FAIR principles when developing computational models supports the adoption of data quality guidelines, data integration procedures and GDPR-compliant data sharing and access.

Dealing with multiple data modalities, i.e. multimodal frameworks, when building an ML-based framework for cancer prediction and classification poses a new challenge in the field of cancer research. The development of integrative predictive models that combine the output of different algorithms is an innovation, but also a challenge for the interpretation and reliability of the models' implications in clinical practice.

To achieve our mission towards precision oncology and better understand the complex mechanisms of cancer, intervention actions should be designed by means of evidence-based decision support tools to prevent what is preventable, optimise diagnostics and treatment, and support the quality of life of patients and caregivers. Furthermore, considering the COVID-19 pandemic in the last two years and the situation in the public healthcare systems, we can acknowledge that cancer patients faced a severe and anxious period of follow-up visits while trying to avoid a possible COVID-19 diagnosis, which resulted in reduced hospitalizations and procedures [103], [104]. As a result, we can foresee the influence of the COVID-19 pandemic on early cancer detection, on top of worsening prognosis and patient screening.

CRediT authorship contribution statement

Konstantina Kourou: Conceptualization, Writing – original draft, Visualization, Writing - review & editing. Konstantinos P. Exarchos: Conceptualization, Writing - review & editing. Costas Papaloukas: Conceptualization, Writing - review & editing. Prodromos Sakaloglou: Visualization. Themis Exarchos: Conceptualization. Dimitrios I. Fotiadis: Conceptualization, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This work has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 777167.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.csbj.2021.10.006 .


Training various deep learning architectures to compute multiple abstract features reveals systematic biases in feature representation. These biases depend on extraneous properties like feature complexity, learning order, and feature distribution. Simpler or earlier-learned features are represented more strongly than complex or later-learned ones, even if all are learned equally well. Architectures (such as transformers), optimizers, and training regimes also influence these biases. These findings characterize the inductive biases of gradient-based representation learning and highlight challenges in disentangling extraneous biases from computationally important aspects for interpretability and comparison with brain representations.

In this work, researchers trained deep learning models to compute multiple input features, revealing substantial biases in their representations. These biases depend on feature properties like complexity, learning order, dataset prevalence, and output sequence position. Representational biases may relate to implicit inductive biases in deep learning. Practically, these biases pose challenges for interpreting learned representations and comparing them across different systems in machine learning, cognitive science, and neuroscience.


Mohammad Asjad

Asjad is an intern consultant at Marktechpost. He is pursuing a B.Tech in mechanical engineering at the Indian Institute of Technology, Kharagpur. Asjad is a machine learning and deep learning enthusiast who is always researching the applications of machine learning in healthcare.


  • Open access
  • Published: 27 May 2024

Using machine learning algorithms to enhance IoT system security

  • Hosam El-Sofany 1 ,
  • Samir A. El-Seoud 2 ,
  • Omar H. Karam 2 &
  • Belgacem Bouallegue 1 , 3  

Scientific Reports volume 14, Article number: 12077 (2024)

  • Computer science
  • Information technology

The term “Internet of Things” (IoT) refers to a system of networked computing devices that may work and communicate with one another without direct human intervention. It is one of the most exciting areas of computing nowadays, with applications in multiple sectors such as cities, homes, wearable equipment, critical infrastructure, hospitals, and transportation. The security issues surrounding IoT devices increase as they expand. To address these issues, this study presents a novel model for enhancing the security of IoT systems using machine learning (ML) classifiers. The proposed approach analyzes recent technologies, security, intelligent solutions, and vulnerabilities in ML IoT-based intelligent systems as an essential technology to improve IoT security. The study illustrates the benefits and limitations of applying ML in an IoT environment and provides an ML-based security model that autonomously manages the rising number of security issues related to the IoT domain. This research made a significant contribution by developing a cyberattack detection solution for IoT devices using ML. The study used seven ML algorithms to identify the most accurate classifiers for their AI-based reaction agent’s implementation phase, which can identify attack activities and patterns in networks connected to the IoT. Compared to previous research, the proposed approach achieved a 99.9% accuracy, a 99.8% detection average, a 99.9 F1 score, and a perfect AUC score of 1. The study highlights that the proposed approach outperforms earlier machine learning-based models in terms of both execution speed and accuracy.
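To make the reported metric types concrete, the following minimal sketch computes accuracy, precision, recall (often called detection rate) and the F1 score from confusion-matrix counts. The counts are hypothetical and are not the paper's results; the sketch only shows how the quantities relate to one another.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from raw counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 110 attack flows and 890 benign flows.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy is higher than the F1 score, which is why accuracy alone can look optimistic when attack traffic is a small share of the data.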


Introduction

Technologies such as cloud computing, cloud edge, and software-defined networking (SDN) have significantly increased users’ reliance on their infrastructure. Consequently, the number of threats faced by these users has also risen. As a result, security management during IoT system development has become increasingly difficult and complex. The IoT can be described as an electrical network that connects physical objects, such as sensors, with software that makes it possible for them to exchange, examine, and gather data. Various sectors use IoT applications, including the military, personal healthcare, household appliances, and agriculture production infrastructure 1 . This research attempts to achieve the Sustainable Cities and Communities Goal (SDG 11) included in the UN Sustainable Development Goals (SDG) 2 . Addressing the challenges and finding solutions for the IoT requires considering a wide range of factors. It is crucial for solutions to encompass the entire system to provide comprehensive security. However, most IoT devices operate without human interaction, making them susceptible to unauthorized access. Therefore, it is imperative to enhance the existing security techniques to safeguard the IoT environment 3 . ML techniques can offer potential alternatives for securing IoT systems, including:

Intrusion detection and prevention: ML can be used to build intrusion detection and prevention systems (IDPS) for the IoT. ML algorithms can analyze network traffic, device logs, and other data for signs of known attacks or suspicious activity.

Anomaly detection: ML algorithms can learn the normal behavior and network interactions of IoT devices. ML models can then detect unusual IoT activity in real-time data. This helps detect security breaches such as unauthorized access or malicious acts and prompts appropriate responses.

Threat intelligence and prediction: ML can analyze large security data sets and provide insights. By analyzing data from security feeds, vulnerability databases, and public forums, ML models may discover new risks, anticipate attack pathways, and give actionable insight to IoT security practitioners.

Firmware and software vulnerability analysis: Researchers may use ML to analyze IoT firmware and software for vulnerabilities. By training on known vulnerabilities and coding patterns, ML models may discover security problems in IoT device firmware and software. This helps manufacturers repair vulnerabilities before deployment or deliver security patches quickly.

Behavior-based authentication: ML algorithms can learn the behavior of IoT devices and users. By analyzing device usage patterns, ML models may create profiles of expected behavior. When a device or user deviates considerably from the learned profile, the system can require extra authentication or warn of unauthorized access.

Data privacy and encryption: ML can assist in ensuring data privacy and security in IoT systems. ML algorithms can be combined with homomorphic encryption, which permits calculations on encrypted data. ML can also support data anonymization and de-identification to safeguard sensitive data while still allowing analysis and insights.
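The anomaly detection and behavior-based authentication items above share one idea: learn a profile of normal behavior, then flag large deviations. A minimal statistical sketch of that idea follows. It uses a simple z-score threshold as a stand-in for a learned model; the traffic values and the threshold of 3 standard deviations are assumptions made for illustration.

```python
from statistics import mean, pstdev


def fit_baseline(samples):
    """Summarize normal behavior as a (mean, standard deviation) pair."""
    return mean(samples), pstdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that lies more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical packets-per-minute readings from one device during normal use.
normal_traffic = [98, 102, 101, 99, 100, 97, 103, 100]
baseline = fit_baseline(normal_traffic)

for reading in (101, 450):
    print(reading, "anomalous" if is_anomalous(reading, baseline) else "normal")
```

A deployed system would typically use richer features and a trained model, but the flow is the same: fit on known-good data, score new observations, and respond when the score crosses a threshold.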

In general, ML techniques must be used in conjunction with other security measures to offer complete security for IoT systems. ML algorithms and methods have been applied in various tasks, including machine translation, regression, clustering, transcription, detection, classification, probability mass function, sampling, and estimation of probability density. Numerous applications utilize ML techniques and algorithms, such as spam identification, image and video recognition, customer segmentation, sentiment analysis, demand forecasting, virtual personal assistants, detection of fraudulent transactions, automation of customer service, authentication, malware detection, and speech recognition 4 .

In addition, integrating ML with the IoT can raise the level of security of IoT devices, thereby increasing their reliability and accessibility. ML’s advanced data exploration methods play an important role in elevating IoT security from merely securing communication between devices to intelligent systems with a high level of security 5 .

ML-based models have emerged as a response to cyberattacks within the IoT ecosystem, and the combination of Deep Learning (DL) and ML approaches represents a novel and significant development that requires careful consideration. Numerous uses, including wearable smart gadgets, smart homes, healthcare, and Vehicular Area Networks (VANET), necessitate the implementation of robust security measures to safeguard user privacy and personal information. The successful utilization of IoT is evident across multiple sectors of modern life 6 . By 2025, we expect that the IoT will have an economic effect of $2.70–$6.20 trillion. Research findings indicate that ML and DL techniques are key drivers of automation in knowledge work, thereby contributing to the economic impact. There have been many recent technological advancements that are shaping our world in significant ways. By 2025, we expect an estimated $5.2–$6.7 trillion in annual economic effects from knowledge labor automation 7 .

This research study addresses the vulnerabilities in IoT systems by presenting a novel ML-based security model. The proposed approach aims to address the increasing security concerns associated with the Internet of Things. The study analyzes recent technologies, security, intelligent solutions, and vulnerabilities in IoT-based smart systems that utilize ML as a crucial technology to enhance IoT security. The paper provides a detailed analysis of using ML technologies to improve IoT systems’ security and highlights the benefits and limitations of applying ML in an IoT environment. When compared to current ML-based models, the proposed approach outperforms them in both accuracy and execution time, making it an ideal option for improving the security of IoT systems. The creation of a novel ML-based security model, which can enhance the effectiveness of cybersecurity systems and IoT infrastructure, is the contribution of the study. The proposed model can keep threat knowledge databases up to date, analyze network traffic, and protect IoT systems from newly detected attacks by drawing on prior knowledge of cyber threats.

The study comprises five sections: “ Related works ” section presents a summary of some previous research. “ IoT, security, and ML ” section introduces the Internet of Things’ security and ML aspects. “ The proposed IoT framework architecture ” section presents the proposed IoT framework architecture, providing detailed information and focusing on its performance evaluation. “ Result evaluation and discussion ” section provides an evaluation of the outcomes and compares them with other similar systems. We achieve this by utilizing appropriate datasets, methodologies, and classifiers. “ Conclusions and upcoming work ” section concludes the discussion and outlines future research directions.

Related works

The idea of security in IoT devices has been recently articulated in studies that analyze the security needs at several layers of the architecture, such as the application, cloud, network, data, and physical layers. These studies have examined potential vulnerabilities and attacks against IoT devices, classified IoT attacks, and explained layer-based security requirements 8 . On the other hand, industrial IoT (IIoT) networks are vulnerable to cyberattacks, so developing intrusion detection systems (IDS) is important to secure IIoT networks. The authors presented three DL models, LSTM, CNN, and a hybrid, to identify IIoT network breaches 9 . The researchers used the UNSW-NB15 and X-IIoTID datasets to identify normal and abnormal data, then compared their results with other research using multi-class and binary classification. The hybrid LSTM + CNN model has the greatest intrusion detection accuracy on both datasets. The researchers also assessed the implemented models’ accuracy in detecting attack types in the datasets 9 .

In Ref. 10 , the authors introduced the hybrid synchronous-asynchronous privacy-preserving federated technique. The federated paradigm eliminates FL-enabled NG-IoT setup issues and protects all its pieces with Two-Trapdoor Homomorphic Encryption. The server protocol blocks irregular users. The asynchronous hybrid LEGATO algorithm reduces user dropout. By sharing data, they assist data-poor consumers. In the presented model, security analysis ensures federated correctness, auditing, and PP. Their performance evaluation showed higher functionality, accuracy, and reduced system overheads than peer efforts. For medical devices, the authors of Ref. 11 developed an auditable privacy-preserving federated learning (AP2FL) method. By utilizing Trusted Execution Environments (TEEs), AP2FL reduces issues about data leakage during training and aggregation activities on both servers and clients. The authors of this study aggregated user updates and found data similarities for non-IID data using Active Personalized Federated Learning (ActPerFL) and Batch Normalization (BN).

In Ref. 12 , the authors addressed two major consumer IoT threat detection issues. First, the authors addressed FL’s unfixed issue: stringent client validation. They solved this using quantum-centric registration and authentication, ensuring strict client validation in FL. FL client model weight protection is the second problem. They suggested adding additive homomorphic encryption to their model to protect FL participants’ privacy without sacrificing computational speed. This technique obtained an average accuracy of 94.93% on the N-baIoT dataset and 91.93% on the Edge-IIoTset dataset, demonstrating consistent and resilient performance across varied client settings.
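To illustrate what additive homomorphic encryption offers for protecting model weights, the sketch below implements a toy version of the Paillier cryptosystem, a well-known additively homomorphic scheme. It is not claimed to be the scheme used in Ref. 12. The primes are far too small to be secure, and the integer "client updates" are hypothetical stand-ins for scaled weight values. It assumes Python 3.8 or later for `pow(x, -1, n)`.

```python
import math
import random


def generate_keypair(p, q):
    """Build a toy Paillier key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # mu is the modular inverse of L(g^lam mod n^2), where L(x) = (x - 1) // n.
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(public_key, message):
    n, g = public_key
    while True:  # pick a random r that is coprime to n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, message, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(public_key, private_key, ciphertext):
    n, _ = public_key
    lam, mu = private_key
    return ((pow(ciphertext, lam, n * n) - 1) // n * mu) % n


def add_encrypted(public_key, c1, c2):
    """Multiplying two ciphertexts adds the underlying plaintexts."""
    n, _ = public_key
    return (c1 * c2) % (n * n)


public_key, private_key = generate_keypair(293, 433)
client_updates = [1200, 345, 78]  # hypothetical integer-scaled updates
encrypted = [encrypt(public_key, u) for u in client_updates]

aggregate = encrypted[0]
for ciphertext in encrypted[1:]:
    aggregate = add_encrypted(public_key, aggregate, ciphertext)

total = decrypt(public_key, private_key, aggregate)
print(total)
```

The aggregator only ever handles ciphertexts, yet the decrypted result equals the sum of the individual updates, provided that sum stays below the modulus `n`.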

Utilizing a semi-deep learning approach, SteelEye was created in Ref. 13 to precisely detect and assign responsibility for cyberattacks that occur at the application layer in industrial control systems. The proposed model uses category boosting and a diverse range of variables to provide precise cyber-attack detection and attack attribution. SteelEye demonstrated superior performance in terms of accuracy, precision, recall, and F1-score compared to state-of-the-art cyber-attack detection and attribution systems.

In Ref. 14 , researchers developed a fuzzy DL model, an enhanced adaptive neuro-fuzzy inference system (ANFIS), fuzzy matching (FM), and a fuzzy control system (FCS) to detect network risks. Their fuzzy DL model finds robust nonlinear aggregation using the fuzzy Choquet integral. Metaheuristics were used to optimize the error function of ANFIS attack detection. FM verifies transactions to detect blockchain fraud and boost efficiency. The resulting architecture, described as the first safe, intelligent fuzzy blockchain architecture, evaluates IoT security threats and uncertainties and enables blockchain layer decision-making and transaction approval. Tests show that the blockchain layer’s throughput and latency can reveal threats to blockchain and IoT. Recall, accuracy, precision, and F1-score are important for the intelligent fuzzy layer. In blockchain-based IoT networks, the FCS model for threat detection was also shown to be reliable.

In Ref. 15 , the study examined Federated Learning (FL) privacy measurement to determine its efficacy in securing sensitive data during AI and ML model training. While FL promises to safeguard privacy during model training, its proper implementation is crucial. Evaluation of FL privacy measurement metrics and methodologies can identify gaps in existing systems and suggest novel privacy enhancement strategies. Thus, FL needs thorough research on “privacy measurement and metrics” to thrive. The survey critically assessed FL privacy measurement, found research gaps, and suggested further study. The research also included a case study that assessed privacy methods in an FL situation. The research concluded with a plan to improve FL privacy via quantum computing and trusted execution environments.

IoT, security, and ML

IoT attacks and security vulnerabilities

Security flaws and vulnerabilities are critical obstacles to seeing the IoT fully accepted in society. Existing security mechanisms manage everyday IoT operations successfully, but their centralized structure results in several vulnerable points that may be attacked. For example, unpatched vulnerabilities in IoT devices are a security concern because of outdated software and manual updates. Weak authentication in IoT devices is a significant issue because passwords are often easy to guess. Attackers commonly target vulnerable Application Programming Interfaces (APIs) in IoT devices using code injection, man-in-the-middle (MiTM), and Distributed Denial-of-Service (DDoS) attacks 16 . Unpatched IoT devices pose risks to users, including data theft and physical harm. IoT devices store sensitive data, making them targets for theft. In the medical field, weak security in devices such as heart monitors and pacemakers can impede medical treatment. Figure  1 illustrates the types of IoT attacks (threats) 17 . Unsecured IoT devices can be taken over and used in botnets, leading to cyberattacks such as DDoS, spam, and phishing. The Mirai malware in 2016 encouraged criminals to develop extensive botnets of IoT devices, leading to unprecedented attacks. Malware can easily exploit weak security safeguards in IoT devices 18 . Because there are so many connected devices, it may be difficult to ensure IoT device security. Users must follow fundamental security practices, such as changing default passwords and prohibiting unauthorized remote access 19 . Manufacturers and vendors must invest in securing IoT tool managers by proactively notifying users about outdated software, enforcing strong password management, disabling remote access for unnecessary functions, establishing strict API access control, and protecting command-and-control (C&C) servers from attacks.

Figure 1: Types of IoT attacks.

IoT applications’ support security issues

Security is a major requirement for almost all IoT applications. IoT applications are expanding quickly and have impacted current industries. Even though operators supported some applications with the current technologies of networks, others required greater security support from the IoT-based technologies they use 20 . The IoT has several uses, including home automation and smart buildings and cities. Security measures can enhance home security, but unauthorized users may damage the owner’s property. Smart applications can threaten people’s privacy, even if they are meant to raise their standard of living. Governments are encouraging the creation of intelligent cities, but the safety of citizens’ personal information may be at risk 21 , 22 .

Retail extensively uses the IoT to improve warehouse restocking and create smart shopping applications. Augmented reality applications enable offline retailers to try online shopping. However, security issues have plagued IoT apps implemented by retail businesses, leading to financial losses for both clients and companies. Hackers may access IoT apps to provide false details regarding goods and steal personal information 23 . Smart agriculture techniques include selective irrigation, soil hydration monitoring, and temperature and moisture regulation. Smart technologies can result in larger crops and prevent the growth of mold and other contaminants. IoT apps monitor farm animals’ activity and health, but compromised agriculture applications can lead to the theft of animals and damage to crops. Intelligent grids and automated metering use smart meters to monitor and record storage tanks, improve solar system performance, and track water pressure. However, smart meters are more susceptible to cyber and physical threats than traditional meters. Advanced Metering Infrastructure (AMI) connects all electrical appliances in a house to smart meters, enabling communication and security networks to monitor consumption and costs. Adversary incursions into such systems might change the data obtained, costing consumers or service providers money 24 . IoT apps in security and emergency sectors limit access to restricted areas and identify harmful gas leaks. Security measures protect confidential information and sensitive products. However, compromised security in IoT apps can have disastrous consequences, such as criminals accessing banned areas or erroneous radiation level alerts leading to serious illnesses 25 .

IoT security attacks based on each layer

IoT devices’ architecture includes five layers: perception, network, middleware (information processing), application, and business (system management). Figure  2 illustrates how the development of IoT ecosystems has changed from a three-layer to a five-layer approach. IoT threats can be physical or cyber, with cyberattacks being passive or active. IoT devices can be physically damaged by attacks, and various IoT security attacks based on each tier are described 26 . Perception layer attacks are intrusions on IoT physical components, for example, devices and sensors. Some of the typical perception layer attacks are as follows:

Botnets: Devices are infected by botnet malware such as Mirai. The bot’s main objectives are to infect improperly configured devices and to attack a target server when ordered to by a botmaster 27 .

Sleep deprivation attack: Sleep deprivation attacks target battery-powered sensor nodes and equipment. Their aim is to keep the machines and devices awake for a long time 28 .

Node tampering and jamming: Node tampering attacks are launched by querying the machines to gain access to and change confidential data, such as routing tables and shared cryptographic keys. A node jamming attack, on the other hand, occurs when perpetrators disrupt the radio frequencies of wireless sensor nodes 29 .

Eavesdropping: Eavesdropping is an exploit that endangers the confidentiality of a message by allowing the attacker to listen to information being transferred across a private channel 30 .

Figure 2: IoT ecosystem five-layer architecture.

These attacks can harm most or all IoT system physical components and can be prevented by implementing appropriate security measures.

Network layer attacks aim to interfere with the IoT space’s network components, which include routers, bridges, and others. The following are some examples of network layer attacks:

Man-in-the-middle (MiTM): This threat involves an attacker posing as a part of the communication network and directly connecting to another user device 31 .

Denial of service (DoS): Attackers who use DoS techniques generate numerous pointless requests, making it difficult for users to access and use IoT devices.

Routing attacks: Malicious nodes engage in routing attacks to block routing functionality or to perform DoS activities.

Middleware attacks: An attack on middleware directly targets the IoT system’s middleware components. Cloud-based attacks, breaches of authentication, and signature packaging attacks are the three most common forms of middleware attacks.

These attacks can be prevented by implementing appropriate security measures.
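As one example of such a measure against the DoS pattern described above (many pointless requests from one source), the sketch below counts requests per source inside a sliding time window. It is a rule-based baseline, not a learned model, and the class name `RequestRateMonitor`, the limit of 5 requests per second, and the addresses are hypothetical. The per-source counts it produces could also serve as input features for an ML classifier.

```python
from collections import defaultdict, deque


class RequestRateMonitor:
    """Flag sources that exceed a request limit within a sliding window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # source -> recent timestamps

    def observe(self, source, timestamp):
        """Record one request; return True if the source is over the limit."""
        recent = self._history[source]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while recent and recent[0] <= timestamp - self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_requests


monitor = RequestRateMonitor(max_requests=5, window_seconds=1.0)

# Hypothetical flood: ten requests in half a second from one address.
flood_flags = [monitor.observe("10.0.0.9", i * 0.05) for i in range(10)]

# Hypothetical well-behaved device: one request per second.
normal_flags = [monitor.observe("10.0.0.7", float(t)) for t in range(3)]

print(flood_flags)
print(normal_flags)
```

A fixed threshold like this is easy to reason about but must be tuned per deployment, which is one motivation for learning traffic profiles from data instead.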

Smart cities, smart grids, and smart homes are some examples of apps included in the application layer. An application layer attack relates to the security flaws in IoT apps. Here are a few examples of application layer attacks 32 :

Malware: Malware is executable software that attackers use to interfere with network equipment.

Phishing attack: This is a type of attack that seeks to obtain users’ usernames and passwords by posing as a reliable entity.

Code injection attack: The main goal of a code injection attack on a program or script is to inject executable code into the memory space of the breached process.

Appropriate security measures can help prevent these attacks as well.

Overview of ML within the IoT

IoT systems are susceptible to hackers because they lack clear boundaries and new devices are always being introduced. There is a possibility to create algorithms that can learn about the behavior of objects and other IoT components inside such large networks by utilizing ML and DL approaches. By using these techniques to predict a system’s expected behavior based on past experiences, security protocols can be developed to a significant extent.

ML techniques and their applications in IoT

ML techniques play an essential role in analyzing and extracting insights from the massive amount of data produced by IoT devices. Here are some popular ML techniques and their applications in the IoT:

Supervised learning This type of algorithm learns from labeled training data. Various applications in the IoT can utilize supervised learning, such as:

Anomaly detection By training ML models to recognize abnormal patterns or behaviors in IoT sensor data, we can identify anomalies or potential security breaches.

Predictive maintenance By analyzing past sensor data, supervised learning algorithms can predict equipment failures or maintenance requirements. This enables the implementation of proactive maintenance measures, leading to a decrease in downtime.

Environmental monitoring ML models can learn from sensor data to predict environmental conditions like air quality, water pollution, or weather patterns.

Unsupervised learning Unsupervised learning algorithms extract patterns or structures from unlabeled data without predefined categories. In IoT, unsupervised learning techniques find applications such as:

Clustering ML models can group similar IoT devices or data points, facilitating resource allocation, load balancing, or identifying network segments.

Dimensionality reduction Unsupervised learning techniques like autoencoders or principal component analysis (PCA) make it easier to analyze IoT data.

Behavioral profiling Unsupervised learning can help in understanding the normal behavior of IoT devices or users, enabling the detection of deviations or anomalies.
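As a minimal sketch of behavioral profiling, the example below summarizes a device's normal traffic as a mean and standard deviation and flags readings that deviate strongly from it. The readings, the function names, and the z-score threshold of 3 are illustrative assumptions, not values taken from this study.

```python
from statistics import mean, stdev

def build_profile(readings):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return mean(readings), stdev(readings)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a reading whose z-score against the profile exceeds the threshold."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical packets-per-second readings from a healthy device.
normal_traffic = [98, 102, 101, 99, 100, 103, 97, 100]
profile = build_profile(normal_traffic)

# One ordinary reading and one burst that could indicate flooding.
flags = [is_anomalous(v, profile) for v in (101, 250)]
```

A real deployment would profile many features per device and refresh the profile over time; a single feature is used here only to keep the idea visible.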

Reinforcement learning Reinforcement learning aims to maximize a reward by training an agent how to interact with its environment and use feedback to improve its performance. The following applications use reinforcement learning on the IoT.

Energy management ML models can learn optimal energy allocation strategies for IoT devices to maximize energy efficiency or minimize costs.

Adaptive IoT systems Reinforcement learning can be used to optimize IoT system parameters or configurations based on real-time feedback and changing conditions.

Smart resource allocation ML models can learn to allocate resources dynamically based on demand, user preferences, or changing network conditions.

Deep learning DL algorithms, especially deep neural networks, excel at processing complex data and extracting high-level features. In IoT, DL has various applications, including:

Image and video analysis DL models can analyze images or video streams from IoT devices, enabling applications like object detection, surveillance, or facial recognition.

Natural language processing (NLP) DL techniques can process and understand text or voice data from IoT devices, enabling voice assistants, sentiment analysis, or chatbots.

Time-series analysis DL models, such as recurrent neural networks (RNNs) or long short-term memory (LSTM) networks, can analyze time-series sensor data to predict future values or detect anomalies.

ML for IoT security

ML is a promising approach for defending IoT devices against cyberattacks. It offers a distinct strategy for thwarting attacks and provides several benefits, including designing sensor-dependent systems, providing real-time evaluation, boosting security, reducing the volume of data in transit, and utilizing the large quantity of data on the Internet for individualized user applications. The influence of ML on the IoT’s development is crucial for enhancing practical smart applications. ML has garnered scientific attention recently and is being applied to IoT security as well as to numerous other industries. ML and DL are effective data exploration methods for distinguishing “normal” from “abnormal” behavior of IoT components and devices inside the IoT ecosystem. Consequently, ML/DL techniques are needed to transform the security of IoT systems from merely enabling secure Device-to-Device (D2D) connectivity to delivering intelligent, security-based systems 33 .

Enhancing IoT security using the algorithms of ML

ML approaches, such as ensemble learning, k-means clustering, Random Forest (RF), Association Rule (AR), Decision Tree (DT), AdaBoost, Support Vector Machine (SVM), XGBoost, and K-Nearest Neighbor (KNN), each have benefits, drawbacks, and applications in IoT security. A DT is an intuitive ML technique whose model resembles a tree, with internal nodes that test features, branches that represent test outcomes, and leaves that hold the predicted class. In classification, SVM separates the classes with the hyperplane that maximizes the distance to the closest points of each class 34 . In identifying DDoS attacks, RF performs better than SVM, ANN, and KNN. Ref. 35 suggests combining Principal Component Analysis (PCA) with KNN and a softmax classifier to build a system that is time-efficient and computationally cheap, which allows it to be employed in real-time IoT scenarios.

Limitations of applying ML in networks of IoT

Using ML approaches in IoT networks has limitations because IoT devices have limited processing power and energy. IoT networks generate data with a variety of structures, forms, and meanings, and traditional ML algorithms are ill-equipped to handle these massive, continuous streams of real-time data. The semantic and syntactic variability in this data is evident, particularly at large scale, and heterogeneous datasets with unique features pose problems for effective and uniform generalization. ML typically assumes that the statistical properties of a dataset remain constant, and the data must first go through preprocessing and cleaning before fitting into a particular model. However, in the real world, data comes from multiple nodes and has different representations and formats, which presents challenges for ML algorithms 36 .

The proposed IoT framework architecture

Fundamental concepts and methodologies.

Software defined networking (SDN) SDN is a cutting-edge networking model that separates the data plane from the control plane. This improves network programmability, adaptability, and management, and it also enables external applications to control how the network behaves. The three basic components of SDN are communication interfaces, controllers, and switches. A central authority (i.e., the SDN controller) imposes forwarding decisions on the switches and keeps the state of the system up to date by changing the flow rules of the appropriate switches. IoT systems’ success and viability depend on SDN adoption. To handle IoT networks’ huge data flows and minimize bottlenecks, SDN’s intelligent traffic routing and improved network utilization are essential. SDN can be applied at many layers of the IoT network, including the access, core, and cloud networks (where data is created, processed, and provided), to enable end-to-end IoT traffic control. SDN also enhances IoT security through, for example, tenant traffic isolation, centralized security monitoring based on the network’s global view, and dropping malicious traffic at the edge of the network.

Network function virtualization (NFV) Virtualization in network contexts is called network function virtualization (NFV). NFV separates software from hardware, adding value and reducing capital and operational costs. The European Telecommunications Standards Institute (ETSI) has standardized this approach’s novel design for use in telecommunications systems. The architecture of ETSI NFV has three basic components:

Virtualization infrastructure Virtualization technologies are found in this layer in addition to needed hardware that offers abstractions to resources for Virtualized Network Functions (VNFs). Cloud platforms handle networking, data processing, and storage.

Virtual network functions VNFs replace specific hardware equipment for network functions. They scale and cost-effectively handle network services across numerous settings.

Management and orchestration Block of Management and orchestration (MANO) is a component of ETSI NFV and is responsible for communicating with the VNF layer and the infrastructure layer. It manages monitoring VNFs, configuration, instantiation, and global resource allocation.

Virtualized network resources add value to the IoT ecosystem, accommodating its variability and quick expansion. NFV and SDN can offer advanced virtual monitoring tools like Deep Packet Inspectors (DPIs) and Intrusion Detection Systems (IDSs). They can provide scalable network security equipment, as well as deploy and configure on-demand components, such as authentication systems and firewalls, to defend against attacks that have been identified by monitoring agents. When processing for security is offloaded from resource-constrained IoT devices to virtual instances, the resulting boost in efficiency and drop in energy consumption clear the way for other useful applications to be implemented. Hardware-based IoT security solutions lack NFV’s flexibility and enhanced security. NFV’s value-added features improve IoT security, even if they do not replace current solutions.

Machine learning (ML) ML is a branch of artificial intelligence (AI) that uses algorithms to give intelligence to devices and computers. ML methods include unsupervised, supervised, and reinforcement learning, and they are widely used in network security. ML is used to specify and precisely identify the security rules of the data plane. The difficulty lies in fine-tuning key security protocol parameters when mitigating a given type of attack, for example by tagging network traffic or creating access-control policies. Moreover, several ML approaches may prevent IoT attacks.

Supervised learning In algorithms of supervised learning, the model output is known even though the underlying relationships between the data are unknown. This model is often trained with two datasets: One for “testing” and “evaluating” the driven model and another to “learn” from. Within the context of security, it is common to compare a suspected attack to a database of known threats.

Unsupervised learning Here the data is not pre-labeled and the model output is unknown, which sets it apart from supervised learning. It aims to group the data and find patterns in it.

Reinforcement learning It studies problems in which a model improves through experience. It trains its models through trial and error and reward mechanisms. A metric known as the “value function” is determined by tracking how successful each output is and incorporating the resulting reward into the estimate. This value tells the model how well it is performing, so it can adjust its behavior accordingly.
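The trial-and-error loop and the value update can be illustrated with a small epsilon-greedy sketch. It assumes a simplified setting with a fixed set of actions and binary rewards; the function name, the reward probabilities, and the exploration rate are hypothetical and are not part of the proposed framework.

```python
import random

def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Estimate the value of each action from rewards observed by trial and error."""
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)  # current value estimate per action
    counts = [0] * len(reward_probs)    # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action with the highest estimated value.
            action = max(range(len(values)), key=values.__getitem__)
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental mean: move the estimate toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

# Two hypothetical countermeasures that succeed 20% and 80% of the time.
values, counts = run_bandit([0.2, 0.8])
```

The update line is where the reward is applied to the value estimate; the agent then shifts its behavior toward the action whose estimate is highest.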

The proposed security model

Figure  3 illustrates the proposed ML-based security model to address IoT security issues based on NFV, SDN, and ML technologies. The figure displays the security component framework and interconnections, whereas Fig.  4 demonstrates the closed-loop automation phases, starting with detection and monitoring and ending with preventing threats. To ensure complete security, the system suggested integrating the enablers and countermeasures from the previous subsections. This framework enforces security policies beginning with the design and concluding with the application and maintenance. Two primary framework levels are shown in Fig.  3 (i.e., security orchestration and security enforcement layers). The two layers and their closed-loop automation intercommunications to detect and prevent attacks are discussed below.

Figure 3. The proposed ML-based security model.

Figure 4. Automation with a closed loop, from detection to prevention.

Security enforcement layer Several VNFs implemented on many clouds, Physical Network Functions (PNFs), and edges facilitate interaction between IoT devices and end users. These network functions (PNFs and VNFs), end users, and IoT devices interact with each other over either a conventional or an SDN-based network. The research classifies attacks on the IoT as either internal or external . The internal attack is caused by compromised and malicious IoT devices, while the external attack is initiated from the end-user network and directed at the IoT domain. The external attack creates danger for the external network and/or other authorized IoT devices. Attacks would be primarily addressed at three levels: (1) IoT devices, via IoT controllers; (2) network, via SDN controllers; and (3) cloud, via an NFV orchestrator. By implementing VNF security and setting the interaction through SDN networking, the security framework features may be properly implemented within the IoT territory. The security enforcement plan was developed to match closely with ETSI and Open Networking Foundation (ONF) guidelines for NFV and SDN. As shown in Fig.  1 , the security enforcement mechanisms consist of five separate logical blocks.

Management and control block It analyzes the components required to manage NFV and SDN infrastructures. It uses SDN controllers and ETSI MANO stack modules for this. To implement efficient security functions, the SDN controllers and NFV orchestrator must work closely together as NFV is frequently used alongside SDN to alter programmatically the network based on policies and resources.

VNF block Taking into consideration the VNFs that have been implemented across the virtualization infrastructure to implement various network-based security measures, the threat and protection measures required by the rules of security will be met with a focus on the delivery of sophisticated VNF security (e.g., IDS/IPS, virtual firewalls, etc.).

Infrastructure block It includes every hardware component needed to construct an IaaS layer, including computers, storage devices, networks, and the software used to run them in a virtualized environment. In addition to the elements of the network that are in charge of transmitting traffic while adhering to the regulations that have been specified by the SDN controller, a set of security probes is included in this plane to gather data for use by the monitoring services.

Monitoring agents block Its primary duty is reporting network activity and IoT actions to identify and prevent various types of attacks. In the proposed model, the detection technique may make use of either network patterns or IoT misbehavior. Using SDN-enabled traffic mirroring, every bit of data that is being sent over the network can be seen. The Security Orchestration Plane hosts an AI-based response agent that receives logs from the monitoring agents describing malicious transactions.

The IoT domain block It refers to the interconnected system of cameras, sensors, appliances, and other physical objects that form the SDN. The proposed methodology considers the substantial risk these devices pose to data privacy and integrity, and it tries to enforce the security standards in this domain.

Security orchestration layer This layer has the task of setting up real-time rules of security depending on the current state of monitoring data and adjusting the policies dynamically based on their context. It is a novel part of the proposed framework that communicates with the security enforcement layer to request the necessary actions to be taken to enforce security regulations inside the IoT domain. Virtual security enablers must be created, configured, and monitored to deal with the present attack.

Figure 2 is a diagrammatic representation of the major cooperation that happens among various framework components. This study proposes a feedback automation mechanism control system consisting of an oversight agent, an AI-based reaction agent, and an orchestrator for security. The latter protects against dangers by utilizing an NFV orchestrator, SDN controller, and IoT controller (see Figs. 3 , 4 ).

AI-based reaction agent This part orders the security orchestrator to perform predetermined measures in response to an incident. This block, as shown in Fig.  4 , makes use of the information collected by the monitoring agent from IoT domains and the network. This part employs ML models that have been trained on network topologies and the actions of IoT devices to identify potential dangers. For the security orchestrator, these ML models will be able to prescribe the optimal template for policies of security. Figure  4 also shows how to identify security threats from observations of network patterns and/or IoT activities. The security orchestrator would then be informed of the discovered danger level (where every level from L1 to L5 belongs to a different predefined security policy). As shown in Fig.  4 , we developed an AI-based reaction agent that uses seven ML techniques to recognize IoT-related attack activities and/or patterns in a network. These techniques are Random Forest, Decision Tree, Naive Bayes, Backpropagation NN, XGBoost, AdaBoost, and Ensemble RF-BPNN.

Security orchestrator This part of closed-loop automation enforces the AI reaction agent’s security practices. It enforces IoT security regulations utilizing SDN and NFV with the control and management block. The security orchestrator instantiates, configures, and monitors virtual security devices, manipulates bad traffic through SDN, or directly controls IoT machines, like shutting off a hacked device.
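The hand-off from a reported danger level to concrete enforcement actions could be sketched as follows. The text states only that levels L1 to L5 each map to a predefined security policy and that enforcement goes through the NFV orchestrator, the SDN controller, or the IoT controller; every policy template and action name below is a hypothetical placeholder.

```python
# Hypothetical policy templates, one per danger level (L1 = lowest).
POLICY_TEMPLATES = {
    "L1": ["log_event"],
    "L2": ["log_event", "mirror_traffic_to_ids"],
    "L3": ["deploy_virtual_firewall"],
    "L4": ["deploy_virtual_firewall", "drop_flows_at_edge"],
    "L5": ["drop_flows_at_edge", "shut_down_device"],
}

# Which component would carry out each (assumed) action.
ACTION_TARGET = {
    "log_event": "security_orchestrator",
    "mirror_traffic_to_ids": "sdn_controller",
    "deploy_virtual_firewall": "nfv_orchestrator",
    "drop_flows_at_edge": "sdn_controller",
    "shut_down_device": "iot_controller",
}

def plan_response(danger_level, device_id):
    """Translate a danger level into (component, action, device) steps."""
    if danger_level not in POLICY_TEMPLATES:
        raise ValueError(f"unknown danger level: {danger_level}")
    return [
        (ACTION_TARGET[action], action, device_id)
        for action in POLICY_TEMPLATES[danger_level]
    ]

# Example: the reaction agent reports level L5 for a hypothetical camera.
plan = plan_response("L5", "cam-7")
```

Keeping the templates in a plain table separates what the reaction agent decides (the level) from how the orchestrator enforces it (the actions).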

We have addressed the IoT security threats using RF, NB, DT, NNs, XGBoost, AdaBoost, and Ensemble RF-BPNN, which involve leveraging ML algorithms to detect and mitigate potential risks. To highlight their effectiveness, we can compare some of these approaches to traditional security methods as follows:

RFs are an ensemble learning algorithm that combines multiple DTs to enhance accuracy and robustness. They are applied in the proposed IoT security system as follows:

Ensemble construction RF consists of multiple DTs, each trained on a randomly selected subset of the training dataset. This randomness helps to reduce overfitting and increase generalization.

Classification When classifying new instances, each DT in the RF independently predicts the class. The final prediction is determined by majority vote (for classification) or by averaging the individual tree predictions (for regression).
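The two mechanisms described above, random resampling of the training data and majority voting, can be sketched in a few lines. The "trees" here are hypothetical threshold rules standing in for trained DTs, and the feature names are invented for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as done for each tree in a forest."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]

# Stand-in "trees": hypothetical threshold rules instead of trained DTs.
trees = [
    lambda flow: "malicious" if flow["pkt_rate"] > 500 else "normal",
    lambda flow: "malicious" if flow["dst_ports"] > 20 else "normal",
    lambda flow: "malicious" if flow["bytes_out"] > 10_000 else "normal",
]

flow = {"pkt_rate": 900, "dst_ports": 3, "bytes_out": 50_000}
votes = [tree(flow) for tree in trees]
forest_prediction = majority_vote(votes)

# Each tree would be trained on its own resampled copy of the data.
rng = random.Random(0)
sample = bootstrap_sample(list(range(10)), rng)
```

Two of the three stand-in trees vote "malicious" for this flow, so the forest follows the majority.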

Decision trees (DTs) are a popular ML technique for classification and regression tasks. The proposed IoT security system uses a DT classifier to identify and address unique threats, and it works as follows:

Feature selection The first stage is to select relevant features from the IoT device data. These features can include network traffic patterns, device behavior, communication protocols, and more.

Training Using a labeled dataset, we train a DT classifier that contains instances of both normal and malicious behavior. The model learns to classify instances based on the selected features.

Detection Once trained, the DT can classify new instances as normal or malicious, depending on their feature values. If the DT classifies an instance as malicious, the system takes appropriate security measures, such as blocking network access or raising an alarm.
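The training and detection steps can be sketched with the smallest possible tree, a single split on one feature chosen by Gini impurity. This is an illustrative simplification, not the classifier used in the study; the feature values, labels, and response names are assumptions.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(values, labels):
    """Pick the threshold on one feature that minimizes weighted Gini impurity."""
    best_score, best_stump = float("inf"), None
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue  # a split must put samples on both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_score = score
            best_stump = (t, majority(left), majority(right))
    return best_stump

def classify(stump, value):
    """Apply the learned threshold to a new feature value."""
    threshold, left_label, right_label = stump
    return left_label if value <= threshold else right_label

def respond(label):
    """Hypothetical security measure triggered by the classification."""
    return "block_network_access" if label == "malicious" else "allow"

# Hypothetical labeled training data: packets per second per device.
packet_rates = [10, 12, 15, 300, 320, 400]
labels = ["normal", "normal", "normal", "malicious", "malicious", "malicious"]

stump = fit_stump(packet_rates, labels)
action = respond(classify(stump, 350))
```

A full DT repeats this split search recursively on each side and across all selected features.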

Neural networks NNs, particularly DL architectures, have gained significant popularity in various domains, including IoT security. Here’s how they can be used:

Multiple layers of interconnected nodes (neurons) form the architecture design of a neural network model. Each neuron applies a non-linear activation function to weighted inputs from the previous layer.

We train the neural network using a labeled dataset through a process known as backpropagation. To reduce the discrepancy between the expected and observed labels, we iteratively tweak the network’s biases and weights.

Prediction: Once trained, the neural network can classify new instances into different threat categories based on their input features.

Comparative analysis with traditional approaches Compared to traditional security approaches, such as rule-based systems or signature-based detection, ML techniques offer several advantages. Traditional methods rely on predefined rules or patterns, which might not be able to adapt to rapidly evolving threats. In contrast, ML methods can learn from data and adapt their behavior accordingly. They can detect anomalies, identify new attack patterns, and improve over time as they encounter new threats. However, traditional approaches often provide better interpretability and explainability.

Rule-based systems explicitly define security rules, making it easier for security analysts to understand and verify their behavior. However, ML models, especially complicated ones like neural networks, are black boxes, making their decision-making process difficult to comprehend.

In conclusion, ML techniques like DTs, RFs, XGBoost, AdaBoost, and neural networks provide powerful tools for addressing unique IoT security threats. They offer improved accuracy, adaptability, and the ability to handle complex and evolving attack patterns. However, they may trade off some interpretability compared to traditional security approaches. The approach is selected based on the specific requirements of the IoT security system and the trade-offs between accuracy, interpretability, and computational requirements.

Performance evaluation of the proposed model

The experimental methodology and analysis outcomes of the AI-based response agent are covered in this section. An AI-based response agent can identify potential threats by performing the following steps: (1) Evaluate network patterns. To identify various forms of network infiltration, the research presents a knowledge-based intrusion detection framework. (2) Examine the strange behaviors that have been seen in the IoT system. Here, attacks are uncovered through the investigation of strange actions taken by IoT devices. To appropriately categorize the degree of the attacks and select the right security solutions, the research has applied supervised learning algorithms. The AI-based reaction agent will employ many ML approaches, considering the appropriate inputs from the monitoring agents, to remove a specific attack.

Evaluating network patterns Intrusion system evaluation is the first stage in evaluating the framework’s effectiveness.

Several publicly available datasets, including the UNSW_NB15, IoT-23, DARPA, KDD 99, NSL-KDD, DEFCON, and balanced BoTNeT-IoT-L01 datasets, were used to build the proposed system (see the datasets link ( https://drive.google.com/drive/folders/1gjP-pQzFZsLh2QMsIa5GPhEh5etv9Jvc?usp=sharing )). These datasets contain information on IoT attacks in the form of (.csv) files. Table 1 shows the network traffic information from different IoT devices. Advantages of the NSL-KDD dataset compared with the initial KDD dataset: The train set does not contain duplicated data; therefore, classifiers are not biased toward more frequent records. BoTNeT-IoT-L01 is a recent dataset that consists of two Botnet assaults (Gafgyt and Mirai). Over a 10-s frame with a decay factor of (0.1), the mean, count, variance, radius, magnitude, correlation coefficient, and covariance were the seven statistical measures that were computed. The .csv file was used to extract four features: jitter, packet count, outbound packet size alone, and combined outbound and inbound packet size 37 . By computing three or more statistical measures for each of the four traits, a total of twenty-three features were obtained.
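The source does not give the formula behind the decayed statistics. The sketch below assumes the exponential damped-window form, in which past observations are down-weighted by 2^(-decay × elapsed seconds), and shows how a count (weight), mean, and variance can be maintained incrementally. The class and attribute names are hypothetical.

```python
class DampedStats:
    """Incremental weight, mean, and variance with exponential time decay."""

    def __init__(self, decay):
        self.decay = decay        # decay factor, e.g. 0.1
        self.weight = 0.0         # decayed count of observations
        self.linear_sum = 0.0     # decayed sum of values
        self.squared_sum = 0.0    # decayed sum of squared values
        self.last_time = None

    def update(self, value, timestamp):
        """Decay the running sums for the elapsed time, then add the new value."""
        if self.last_time is not None:
            factor = 2.0 ** (-self.decay * (timestamp - self.last_time))
            self.weight *= factor
            self.linear_sum *= factor
            self.squared_sum *= factor
        self.last_time = timestamp
        self.weight += 1.0
        self.linear_sum += value
        self.squared_sum += value * value

    def mean(self):
        return self.linear_sum / self.weight

    def variance(self):
        return max(0.0, self.squared_sum / self.weight - self.mean() ** 2)

# Hypothetical packet sizes (bytes) observed at t = 0 s and t = 10 s.
stats = DampedStats(decay=0.1)
stats.update(64, timestamp=0.0)
stats.update(64, timestamp=10.0)
```

Under this assumed form, a decay factor of 0.1 halves the weight of an observation after 10 s, which matches the 10-s frame mentioned above.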

Furthermore, this study used the widely recognized NSL-KDD dataset as a benchmark for assessing intrusion detection systems. It is a much-improved version of the KDD 99 dataset (see Table 2 ). The NSL-KDD dataset has over 21 distinct attack types, such as teardrop, satan, rootkit, buffer-overflow, smurf DDoS, pod-dos, and Neptune-dos, which serve as the foundation for the application of our proposed IDS model. The NSL-KDD dataset is primarily composed of preprocessed network traffic data. These data provide a more precise representation of the network traffic that occurs at present. The dataset contains two distinct collections of data: a training set and a testing set. The testing set has around 23,000 records, whereas the training set contains approximately 125,000 records. Each entry in the dataset corresponds to a network connection and contains a set of 41 features, including the IP addresses of the source and destination, protocols, flags, and a label indicating whether the connection is normal or abnormal (anomalous). Each attack sample in the dataset belongs to one of the following categories: DoS attacks, remote-to-local (R2L) attacks, user-to-root (U2R) attacks, and probing attacks 38 . There are many implementation tools available for analyzing IoT attack datasets, such as Wireshark, Snort, Zeek (formerly Bro), Jupyter Notebook, Python, and Weka. In this work, the researchers used Python programming and Weka data mining tools for ML and data analysis processing.

The proposed tools include a large collection of ML algorithms for classification, regression, clustering, and association rule mining, such as RF, NB, DT, NNs, XGBoost, AdaBoost, and Ensemble RF-BPNN, as well as tools for model evaluation and selection, including cross-validation and ROC analysis.

Certain ML algorithms struggle to learn when features span a wide range of values, and the modeling process becomes more challenging when a feature is continuous. Hence, before constructing classification patterns, preprocessing is fundamental to optimize prediction accuracy. Specifically, a discretization technique is used to overcome this restriction. When applied to a continuous variable, discretization reduces the number of possible values by grouping them into intervals. Two different kinds of discretization are discussed in the literature: (1) static variable discretization, in which variables are partitioned separately, and (2) dynamic variable discretization, in which all features are discretized concurrently 39 . The research discretized the attacks and then categorized them so that only the most common types remained (UDP, Junk, Ack, and UDP plain from the balanced BoTNet-IoT-L01 dataset and DDoS, Probe, U2R, and R2L from NSL-KDD).
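A minimal sketch of static discretization is equal-width binning applied to one variable at a time. The text does not say which discretization method was used, so equal-width binning, the function name, and the sample values are assumptions chosen for illustration.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to the index of its equal-width interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # a constant feature falls into a single bin
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical connection durations in seconds, discretized into 4 intervals.
durations = [0.0, 2.5, 5.0, 7.5, 10.0]
bins = equal_width_bins(durations, n_bins=4)
```

Here each of the four intervals spans 2.5 s, and the largest duration is assigned to the last interval.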

Metrics for comparing performance Choosing measures that can indicate the strength of an IDS is a major problem when evaluating an IDS. An IDS’s performance goes well beyond its classification results alone. Cost Per Example (CPE), precision, detection rate, and model accuracy are utilized to evaluate the effectiveness of the proposed system. When evaluating outcomes, the following metrics should be used in conjunction with one another 40 .

Equation (1) defines the Cost Per Example (CPE) used in Cost-Sensitive Classification (CSC), where N is the total number of samples, CM is the Confusion Matrix of the classification, and C is the Cost Matrix (see Table 3 ) 41 .
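For reference, the cost per example is commonly written in the following form, which is consistent with the symbols defined above. The number of classes m and the indices i (actual class) and j (predicted class) are notation assumed here.

```latex
\mathrm{CPE} = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{m} CM(i,j) \, C(i,j)
```

Each entry of the confusion matrix is weighted by the cost of that particular kind of decision, so a lower CPE indicates fewer or cheaper misclassifications.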

Input data cleaning, feature extraction, and classification The research proposes a first method, which involves preparing the entire dataset and then categorizing it using a variety of techniques (Hoeffding Tree, RF, Bayes Net, and J48) as shown in Fig.  6 . Next, the research chooses the best classifier (algorithm) that generates a preferred accuracy (see Table 4 for the BoTNet-IoT-L01 dataset and Table 5 for the NSL-KDD dataset).

Backpropagation approach To investigate the multilayer neural net approach, the research utilized the capabilities of a backpropagation technique for learning. The research employed a multilayer neural network with three layers. The initial layer had 41 inputs, representing the features of the dataset. The final layer encompassed the classification responses, namely, R2L, U2R, Probe, DoS, and Normal. A single hidden layer with 100 neurons was incorporated to facilitate the learning process. Experience has shown that alternative hidden layer and neuron counts did not improve the mean squared error (MSE) (see Table 6 ).

Distributed classification module This module introduces a distributed categorization system in which the various types of attacks (DDoS, U2R, R2L, and Probe; UDP, UDP plain, Ack, and Junk) are all assigned to the Ensembled RF-BPNN algorithm. Finally, the AdaBoost method is used to combine the resulting models (see Table 7 ).
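How AdaBoost combines several models can be sketched with its standard binary weighting rule, in which a model with error rate e receives the weight 0.5 × ln((1 − e) / e). The text does not specify the variant used, so this rule, the error rates, and the predictions below are assumptions for illustration only.

```python
import math
from collections import defaultdict

def learner_weight(error_rate):
    """AdaBoost weight for a model with the given weighted error rate."""
    eps = 1e-10  # keep the logarithm finite for error rates of 0 or 1
    error_rate = min(max(error_rate, eps), 1 - eps)
    return 0.5 * math.log((1 - error_rate) / error_rate)

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across models."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)

# Hypothetical error rates of three per-attack models on validation data.
weights = [learner_weight(e) for e in (0.10, 0.30, 0.45)]

# Their (hypothetical) predictions for one traffic sample.
combined = weighted_vote(["DDoS", "Probe", "Probe"], weights)
```

The most accurate model outweighs the two weaker ones together, so the combined prediction follows it even though it is outvoted by count.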

Result evaluation and discussion

The findings reported in Table 5 demonstrate both the accuracy rate and precision of the RF technique. Unfortunately, the results are not promising for either U2R or R2L attacks. There is a low misclassification rate (or CPE) and high accuracy when using J48 to identify attacks. When it comes to the accuracy required for U2R attacks, however, J48 falls short. Despite its consistent performance, the Hoeffding tree method has a low accuracy for U2R threats. Although it has a strong model accuracy, the Bayes Net method provides the lowest results, failing to identify the vast majority of U2R threats. As can be seen from Table 6 , the backpropagation process is generally as precise as its predecessors, if not somewhat more so. However, misclassification comes with a significant processing time penalty. AdaBoost produced a better model in terms of detection accuracy, CPE, and detection rate, as shown in Table 7 .

The performance of ML algorithms used in the proposed system

A classification algorithm for IoT detection based on ensembles of backpropagation neural networks is trained on the BoTNet-IoT-L01 dataset (see Table 8 ). The novelty of the algorithm stems from the methodology employed for combining outputs of the backpropagation neural network ensembles. The backpropagation neural network Oracle 8i database tool is utilized to combine the ensemble outputs. As Fig.  5 shows, the neural network backpropagation Oracle is constructed with an RF algorithm that produces high classification accuracy and low classification error (see Table 4 ). The thresholds are not learned all at once in the RF model but rather hierarchically. The decrease in impurity will be enforced one directionally from the starting to the finishing index of the symbolic path; however, the research learned them simultaneously. The idea of hierarchical node splits will be represented by this one-directional impurity reduction. To do this, firstly, the research breaks up each node in the symbolic path into some votes for each class. Secondly, the research computes the impurity based on those votes. The third step is to gradually lower it by a certain amount using the Softmax activation method. Our proposed algorithm uses margin ranking loss as its objective function. It is important to maintain a minimum margin disparity between the intended result and the actual one. The margin difference is the ‘reduction in impurity’. The target is output shifted by one index to the right and the impurity at first split is initialized by the impurity of the batch (see Fig.  5 ).
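One possible reading of the objective described above is sketched below: the impurity at each node of a path is computed from class votes, the target for each node is the impurity of the previous node (the output shifted right by one, starting from the batch impurity), and a margin ranking loss penalizes any step that does not reduce impurity by at least the margin. Gini impurity, the margin value, and all names are assumptions, and the Softmax step is omitted.

```python
def gini_from_votes(votes):
    """Gini impurity computed from per-class vote counts at one node."""
    total = sum(votes)
    return 1.0 - sum((v / total) ** 2 for v in votes)

def margin_ranking_loss(higher, lower, margin):
    """Zero once `higher` exceeds `lower` by at least `margin`."""
    return max(0.0, lower - higher + margin)

def path_loss(votes_along_path, batch_impurity, margin=0.05):
    """Mean ranking loss that asks impurity to drop by `margin` at every split."""
    impurities = [gini_from_votes(v) for v in votes_along_path]
    # Target = impurities shifted right by one, seeded with the batch impurity.
    targets = [batch_impurity] + impurities[:-1]
    losses = [
        margin_ranking_loss(target, impurity, margin)
        for target, impurity in zip(targets, impurities)
    ]
    return sum(losses) / len(losses)

# Hypothetical class votes at three successive nodes of one path.
votes_along_path = [[6, 4], [8, 2], [10, 0]]
loss = path_loss(votes_along_path, batch_impurity=0.5)
```

In this example only the first split falls short of the required reduction (0.50 to 0.48 with a margin of 0.05), so it alone contributes to the loss.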

Figure 5. Architectural flow graph of the proposed RF with backpropagation NN (RF-BPNN).

When employing the AdaBoost classifier as a detection model, the research was limited to considering a single window size. Therefore, the research has successfully decreased the number of attributes in the BoTNeT-IoT-L01 dataset from 115 to 23. This significant decrease in the dimensionality of the dataset results in a significant acceleration of the detection process. Speaking of the BotNet-IoT dataset, the research discovered that just a small number of parameters have an important role in our system’s overall performance, and time windows of 10 s performed marginally better than those of shorter duration (see Fig.  6 ). Additionally, the research discovered that traffic heterogeneity greatly impacted RF classifier performance. However, when compared to the other classification algorithms, AdaBoost and RF-BPNN had the greatest and most stable results (see Table 7 ).

Figure 6. RF-BPNN accuracy evaluation for each attack type in the balanced BoTNet-IoT-L01 dataset.

Figure 7 shows the accuracy for detecting DoS, Fuzzers, Generic, Backdoor, and Exploit attacks in the UNSW_NB15 dataset using the RF classifier and SMOTE (where “label” refers to the target variable and “attack_cat” refers to the attack types).

Figure 7. The accuracy for detecting some attacks in the UNSW_NB15 dataset, using the RF classifier.

Several experiments were run to determine the system’s performance. Each stage must be examined and validated with the supplied classifiers to confirm the experimental results, and it is also important to establish whether the classifier can discriminate across feature categories. Accuracy, specificity, precision, recall, F1-score, and AUC measure the model’s performance and indicate the correctness of the system. These measurements are based on TP, FP, TN, and FN, as shown in Eqs. (2) to (6):
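In their standard form, the five count-based metrics are defined as follows. This is assumed to match Eqs. (2) to (6); the ordering below follows the order in which the text lists the metrics and is itself an assumption.

```latex
\begin{align}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{Precision}   &= \frac{TP}{TP + FP} \\
\text{Recall}      &= \frac{TP}{TP + FN} \\
\text{F1-score}    &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```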

We use the following terms to describe the classification outcomes: a true positive (TP) is an attack instance correctly classified as an attack, a true negative (TN) is a normal instance correctly classified as normal, a false positive (FP) is a normal instance incorrectly classified as an attack, and a false negative (FN) is an attack instance incorrectly classified as normal.

Thus, accuracy evaluates the classifier’s capacity to correctly categorize both positive and negative instances; precision denotes the proportion of instances labeled positive that are truly positive; and specificity denotes the classifier’s capacity to avoid incorrectly labeling negative instances as positive. Recall is the rate at which a classifier identifies positive examples, whereas the F1-score is the harmonic mean of precision and recall.
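As a worked example, the short sketch below computes these metrics from the four confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the paper's tables.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute count-based metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "specificity": specificity,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical counts: 100 attack instances and 100 normal instances.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

With these counts the classifier misses 10 of 100 attacks and raises 5 false alarms, so recall is lower than specificity while the F1-score sits between precision and recall.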

Table 9 shows the performance of seven machine learning classifiers using the Synthetic Minority Oversampling Technique (SMOTE) on the UNSW_NB15 dataset. As Fig. 8 shows, the RF, XGBoost, AdaBoost, and ensembled RF-BPNN classifiers performed best overall. They achieved an accuracy of 99.9%, an AUC of 1, and an F1-score of 99.9%. The Naive Bayes classifier, on the other hand, obtained the lowest accuracy and F1-score.

Figure 8. The accuracy of 7 ML algorithms using the UNSW_NB15 dataset and SMOTE.
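For readers unfamiliar with SMOTE, the core idea is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified, standard-library illustration of that idea only; the function name, parameters, and sample points are hypothetical, and it is not the implementation used in the study.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class feature vectors (two features each).
minority_points = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_like_oversample(minority_points, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class.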

Integration with existing IoT security frameworks and standards

The proposed model can integrate with existing IoT security frameworks and standards as follows:

Integration with IoT security frameworks The ML-based model can integrate with IoT security frameworks by aligning its functionalities with their security objectives and guidelines. For example:

The proposed model can integrate with existing authentication mechanisms recommended by IoT security frameworks, such as digital certificates or secure bootstrapping protocols. It can enhance device authentication by analyzing device behavior patterns and detecting anomalies that may indicate unauthorized access or compromised devices.

To align with data privacy requirements, the model can utilize encryption techniques and privacy-preserving algorithms recommended by the IoT security frameworks. It provides a guarantee of secure transmission and storage of data, protecting confidential information against illegal access.

The proposed model can integrate with existing access control mechanisms defined by IoT security frameworks. It can augment access control by providing intelligent decision-making capabilities based on historical data, user behavior analysis, or contextual information. This aids in assessing access requests and preventing unauthorized access to IoT resources.

Integration with IoT security standards The ML-based model can comply with IoT security standards by incorporating the required security controls and practices. For example:

The proposed model can align with ISO/IEC 27000 standards by implementing appropriate security controls for risk assessment, incident management, and data protection. It can follow the standards’ guidelines to ensure that the necessary security measures are in place.

The model can follow the NIST framework to enhance its threat detection and incident response capabilities.

Interoperability in IoT ecosystems By adhering to standard IoT protocols, data formats, and metadata standards, the ML-based model can ensure interoperability. For example:

The ML model can communicate with IoT devices and gateways using standard IoT protocols such as MQTT or CoAP, ensuring compatibility and interoperability across different devices and platforms.

The ML model can use commonly used data formats, such as JSON, or semantic data models, such as the Semantic Sensor Network (SSN) ontology, to facilitate seamless data sharing and interoperability with other components within the IoT ecosystem.
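As an illustration of the interoperability point, a detection result can be serialized as JSON before being published over a protocol such as MQTT. The sketch below uses only the Python standard library; the field names, topic string, and device identifier are hypothetical and are not defined in the paper.

```python
import json
from datetime import datetime, timezone

def build_alert_payload(device_id, attack_type, confidence, timestamp=None):
    """Serialize one detection result as a JSON string."""
    ts = timestamp or datetime.now(timezone.utc)
    record = {
        "deviceId": device_id,          # hypothetical field names
        "attackType": attack_type,
        "confidence": round(confidence, 4),
        "timestamp": ts.isoformat(),
    }
    return json.dumps(record, sort_keys=True)

# Hypothetical topic that an MQTT client could publish the payload to.
topic = "iot/security/alerts"
payload = build_alert_payload(
    "sensor-17", "DoS", 0.9987,
    timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

The payload is a plain string, so it can be handed to whichever MQTT or CoAP client library a deployment already uses; the choice of client is left open here.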

By integrating with existing IoT security frameworks and standards, the proposed model can enhance its adaptability and compatibility within IoT ecosystems. This integration allows the model to complement and enhance the existing security infrastructure, contributing to improved IoT security outcomes.

Comparisons with related systems

Table 10 highlights the proposed model’s performance by comparing it with previous systems. The comparison draws on existing literature and uses standard measures such as the false positive rate (FPR), CPE, accuracy, and detection rate 38, 39, 40, 41, 42, 43, 44, 45, 47. Across several experiments, the proposed system achieved the best accuracy, precision, detection rate, and CPE, and the lowest time complexity, compared with previous solutions, as shown in Tables 10 and 11.

Privacy concerns and data bias

The authors of this work have incorporated essential steps into the development and deployment of the proposed ML-based security model to effectively address privacy concerns and data bias, as well as ensure the technology’s ethical and responsible use within the IoT system.

The authors conducted a privacy impact assessment to determine if the proposed ML-based security model has any privacy issues or concerns.

To mitigate privacy concerns, the study implemented privacy-enhancing techniques. These include data anonymization, encryption, differential privacy, or federated learning, which allows the proposed ML model to be trained without sharing raw data.
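To make one of these techniques concrete, the sketch below shows the Laplace mechanism, a common way to achieve differential privacy for a numeric query such as a count. The paper does not state which mechanism was used, so the mechanism, function names, and parameter values here are assumptions for illustration.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Return a noisy count; a smaller epsilon gives stronger privacy and more noise."""
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)

# Hypothetical query: number of devices that triggered an alert.
rng = random.Random(42)
noisy = private_count(128, epsilon=0.5, rng=rng)
```

The noise scale is sensitivity divided by epsilon, so the privacy budget epsilon directly controls how far the released value may drift from the true count.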

The study minimized the amount of personally identifiable information (PII) gathered and stored to reduce privacy risks. During the requirements engineering phase, we only collected the necessary data for the proposed machine learning-based security model, ensuring its safe storage and disposal when no longer required.

We implemented regular monitoring of the proposed ML model for potential biases in data and outcomes. A bias detection process is critical for identifying discriminatory patterns. Detected biases can then be mitigated by adjusting training data, diversifying datasets, or applying bias correction algorithms.

We also regularly monitor the performance of the proposed ML-based security model, including its privacy aspects, and update it as needed to address emerging privacy concerns, mitigate biases, and ensure ongoing compliance with ethical standards.

Conclusions and upcoming work

This research introduces a new ML-based security model to address the vulnerabilities in IoT systems. We designed the model to autonomously handle the growing number of security problems associated with the IoT domain. The study analyzed state-of-the-art security measures, intelligent solutions, and vulnerabilities in IoT-based smart systems that use ML as a key technology for improving IoT security, and it illustrated the benefits and limitations of applying ML in an IoT environment. The suggested method performs better in terms of accuracy and execution time than existing ML algorithms, which makes it a viable option for improving the security of IoT systems. The research evaluated the intrusion detection system using the BoTNet-IoT-L01 dataset, applying the proposed IDS model to a dataset that included more than 23 types of attacks. The study also used the NSL-KDD dataset to evaluate the intrusion detection mechanism and evaluated the proposed model in a real-world smart building environment. The presented ML-based model achieved an accuracy of 99.9%, which compares well with previous research on improving IoT systems’ security. This paper’s contribution is the development of a novel ML-based security model that can improve the efficiency of cybersecurity systems and IoT infrastructure. The proposed model can keep threat knowledge databases up to date, analyze network traffic, and protect IoT systems from newly detected attacks by drawing on prior knowledge of cyber threats.

This study presents a promising ML-based security approach to enhance IoT system security; however, future work and improvements remain possible. One area of improvement is expanding the dataset used to evaluate the intrusion detection system. While the BoTNet-IoT-L01 and NSL-KDD datasets used in this study are comprehensive, they may not cover all types of attacks that could occur in an IoT environment. Our future research could therefore focus on collecting and analyzing more diverse datasets to improve the performance of the proposed model. Optimizing the proposed model’s execution time is also crucial for real-world applications, and the model could be integrated with other security solutions to create a more comprehensive and robust security system for IoT devices. Overall, the development of this novel ML-based security model is a significant contribution to the literature on ML security models and IoT security, and further work and improvements will continue to advance the field. Finally, security analysts treat an AI-based IDS as a black box because it cannot explain its decision-making process 48. In our future work, we will expand our research by integrating blockchain-based AKA mechanisms with explainable artificial intelligence (XAI) to secure smart city-based consumer applications 49. In addition, we can use the Shapley Additive Explanations (SHAP) mechanism to explain and interpret the features that are most influential in the decision 50.

Data availability

The corresponding author can provide the datasets used and/or analyzed in this work upon reasonable request.

Sharma, A., Singh, P. K. & Kumar, Y. An integrated fire detection system using IoT and image processing technique for smart cities. Sustain. Cities Soc. 61, e4826 (2020).

Sinan, K. SDG-11: Sustainable Cities and Communities. Emerging Technologies, Sustainable Development Goals Series 1st edn. (Springer, 2020).

Hussain, F., Hussain, R., Hassan, S. A. & Hossain, E. Machine learning in IoT security: Current solutions and future challenges. IEEE Commun. Surv. Tutor. 22 (3), 1686–1721 (2020).

Bharati, S., Mondal, M. R. H., Podder, P. & Prasath, V. B. Federated learning: Applications, challenges and future directions. Int. J. Hybrid Intell. Syst. 18 (1–2), 19–35 (2022).

Shafiq, M., Tian, Z., Bashir, A. K., Du, X. & Guizani, M. Corrauc: A malicious BOT-IOT traffic detection method in IoT network using machine learning techniques. IEEE Internet Things J. 8 (5), 3242–3254 (2020).

Omolara, A. E. et al. The Internet of Things security: A survey encompassing unexplored areas and new insights. Comput. Secur. 112, 102494 (2022).

Bharati, S., Podder, P., Mondal, M. R. H. & Paul, P. K. Applications and challenges of cloud integrated IoMT. In Cognitive Internet of Medical Things for Smart Healthcare 1st edn (eds Hassanien, A. E. et al.) 67–85 (Springer, 2021).

Özalp, A. N. et al. Layer-based examination of cyber-attacks in IoT. In 2022 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA) (IEEE, 2022).

Altunay, H. C. & Albayrak, Z. A hybrid CNN+LSTM-based intrusion detection system for industrial IoT networks. Eng. Sci. Technol. Int. J. 38, 101322 (2023).

Abbas, Y., Ali, D., Gautam, S., Hadis, K. & Reza, M. P. Hybrid privacy preserving federated learning against irregular users in next-generation Internet of Things. J. Syst. Archit. 148, 103088 (2024).

Abbas, Y., Ali, D. & Gautam, S. AP2FL: Auditable privacy-preserving federated learning framework for electronics in healthcare. IEEE Trans. Consumer Electron. 99, 1 (2023).

Danyal, N., Abbas, Y., Ali, D. & Gautam, S. Federated quantum-based privacy-preserving threat detection model for consumer Internet of Things. IEEE Trans. Consumer Electron. https://doi.org/10.1109/TCE.2024.3377550 (2024).

Sanaz, N., Behrouz, Z., Abbas, Y. & Ali, D. Steeleye: An application-layer attack detection and attribution model in industrial control systems using semi-deep learning. In 2021 18th International Conference on Privacy, Security and Trust (PST), IEEE Xplore (2021).

Abbas, Y., Ali, D., Reza, M. P., Gautam, S. & Hadis, K. Secure intelligent fuzzy blockchain framework: Effective threat detection in IoT networks. Comput. Ind. 144, 103801 (2023).

Gopi, K. J., Abbas, Y., Reza, M. P. & Seyedamin, P. Exploring privacy measurement in federated learning. J. Supercomput. 1, 43 (2023).

Otoum, Y. & Nayak, A. On securing IoT from deep learning perspective. In Proc. 2020 IEEE Symposium on Computers and Communications (ISCC) 1–7 (2020).

Butun, I., Sterberg, P. O. & Song, H. Security of the Internet of Things: Vulnerabilities, attacks, and countermeasures. IEEE Commun. Surv. Tutor. 22 (1), 616–644 (2020).

Tahsien, S. M., Karimipour, H. & Spachos, P. Machine learning based solutions for security of Internet of Things (IoT): A survey. J. Netw. Comput. Appl. 161, 102630 (2020).

Abiodun, O. I., Abiodun, E. O., Alawida, M., Alkhawaldeh, R. S. & Arshad, H. A review on the security of the Internet of Things: Challenges and solutions. Wirel. Person. Commun. 119 (3), 2603–2637 (2021).

Podder, P., Mondal, M. R. H., Bharati, S. & Paul, P. K. Review on the security threats of Internet of Things. Int. J. Comput. Appl. 176 (41), 37–45 (2020).

Hamad, Z. J. & Askar, S. Machine learning powered IoT for smart applications. Int. J. Sci. Bus. 5 (3), 92–100 (2021).

Xu, H. et al. A combination strategy of feature selection based on an integrated optimization algorithm and weighted K-nearest neighbor to improve the performance of network intrusion detection. Electronics 9 (8), 1206 (2020).

Bharati, S. & Mondal, M. R. H. Computational intelligence for managing pandemics. In 12 Applications and Challenges of AI-Driven IoHT for Combating Pandemics: A Review (eds Bharati, S. & Mondal, M. R. H.) 213–230 (De Gruyter, 2021).

Robel, M. R. A., Bharati, S., Podder, P. & Mondal, M. R. H. IoT driven healthcare monitoring system. In Fog, Edge, and Pervasive Computing in Intelligent IoT Driven Applications (eds Gupta, D. & Khamparia, A.) 161–176 (Wiley, 2020).

Podder, P., Mondal, M. R. H. & Kamruzzaman, J. Iris feature extraction using three-level Haar wavelet transform and modified local binary pattern. In Applications of Computational Intelligence in Multi-Disciplinary Research 1st edn (eds Elngar, A. A. et al.) (Elsevier, 2022).

Chandavarkar, B. R. Hardcoded credentials and insecure data transfer in IoT: National and international status. In Proc. 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT) 1–7 (2020).

Ferrara, P., Mandal, A. K., Cortesi, A. & Spoto, F. Static analysis for discovering IoT vulnerabilities. Int. J. Softw. Tools Technol. Transf. 23 (1), 71–88 (2021).

Yu, Y., Guo, L., Liu, S., Zheng, J. & Wang, H. Privacy protection scheme based on CP-ABE in crowdsourcing-IoT for Smart Ocean. IEEE Internet Things J. 7 (10), 10061–10071 (2020).

Xiong, J. et al. A personalized privacy protection framework for mobile crowdsensing in IIoT. IEEE Trans. Ind. Inform. 16 (6), 4231–4241 (2020).

Jiang, X., Lora, M. & Chattopadhyay, S. An experimental analysis of security vulnerabilities in industrial IoT devices. ACM Trans. Internet Technol. 20 (1), 1–24 (2020).

Visoottiviseth, V., Sakarin, P., Thongwilai, J. & Choobanjong, T. Signature-based and behavior-based attack detection with machine learning for home IoT devices. In Proc. 2020 IEEE Region 10 Conference (TENCON 2020) 829–834 (2020).

Turk, Z., Soto, B. G. D., Mantha, B. R. K., Maciel, A. & Georgescu, A. A systemic framework for addressing cybersecurity in construction. Autom. Construct. 133 (3), 103988 (2022).

Al Hayajneh, A., Bhuiyan, N. Z. A. & McAndrew, I. Improving internet of things (IoT) security with software defined networking (SDN). Computers 9 (1), 8 (2020).

Hussain, F., Hassan, S. A., Hussain, R. & Hossain, E. Machine learning for resource management in cellular and IoT networks: Potentials, current solutions, and open challenges. IEEE Commun. Surv. Tutor. 22 (2), 1251–1275 (2020).

IoT Dataset for Intrusion Detection Systems (IDS). https://www.kaggle.com/azalhowaide/iot-dataset-for-intrusion-detection-systems-ids (2023).

Nawir, M., Amir, A., Yaakob, N. & Lynn, O. B. Internet of Things (IoT): Taxonomy of security attacks. In Proc. 3rd International Conference in Electronic Design (ICED) 321–326 (2016).

Herzberg, B., Bekerman, D. & Zeifman, I. Breaking down mirai: An IoT DDoS botnet analysis. Incapsula Blog, Bots and DDoS, Security (2016).

Ambusaidi, M. A., He, X., Nanda, P. & Tan, Z. Building an intrusion detection system using a filter-based feature selection algorithm. IEEE Trans. Comput. 65 (10), 2986–2998 (2016).

Moustafa, N., Creech, G. & Slay, J. Big data analytics for intrusion detection system: Statistical decision-making using finite Dirichlet mixture models. In Data Analytics and Decision Support for Cybersecurity 1st edn (eds Moustafa, N. et al.) 127–156 (Springer, 2017).

Tsai, C. F. & Lin, C. Y. A triangle area based nearest neighbors approach to intrusion detection. Pattern Recogn. 43 (1), 222–229 (2010).

Alom, M. Z., Bontupalli, V. & Taha, T. M. Intrusion detection using deep belief networks. In Proc. IEEE National Aerospace and Electronics Conference (NAECON) 339–344 (2015).

Yin, C., Zhu, Y., Fei, J. & He, X. A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 5, 21954–21961 (2017).

Tang, T. A., Mhamdi, L., McLernon, D., Zaidi, S. A. R. & Ghogho, M. Deep learning approach for network intrusion detection in software defined networking. In Proc. 2016 International Conference on Wireless Networks and Mobile Communications (WINCOM) 258–263 (2016).

Ludwig, S. A. Intrusion detection of multiple attack classes using a deep neural net ensemble. In Proc. 2017 IEEE Symposium Series on Computational Intelligence (SSCI) 1–7 (2017).

Al-Hawawreh, M., Moustafa, N. & Sitnikova, E. Identification of malicious activities in industrial Internet of Things based on deep learning models. J. Inf. Secur. Appl. 41, 1–11 (2018).

Shone, N., Ngoc, T. N., Phai, V. D. & Shi, Q. Deep learning approach to network intrusion detection. IEEE Trans. Emerg. Top. Comput. Intell. 2 (1), 41–50 (2018).

Subba, B., Biswas, S. & Karmakar, S. Enhancing performance of anomaly-based intrusion detection systems through dimensionality reduction using principal component analysis. In Proc. 2016 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS) 1–6 (2016).

Kumar, R. et al. Blockchain-based authentication and explainable AI for securing consumer IoT applications. IEEE Trans. Consumer Electron. https://doi.org/10.1109/TCE.2023.3320157 (2024).

Javeed, D., Gao, T., Kumar, P. & Jolfaei, A. An explainable and resilient intrusion detection system for industry 5.0. IEEE Trans. Consumer Electron. 70 (1), 1342–1350. https://doi.org/10.1109/TCE.2023.3283704 (2024).

Kumar, R. et al. Digital twins-enabled zero touch network: A smart contract and explainable AI integrated cybersecurity framework. Future Gener. Comput. Syst. https://doi.org/10.1016/j.future.2024.02.015 (2024).

Acknowledgements

The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through small group research under Grant Number (RGP1/129/45).

Author information

Authors and affiliations.

College of Computer Science, King Khalid University, Abha, Kingdom of Saudi Arabia

Hosam El-Sofany & Belgacem Bouallegue

Faculty of Informatics and Computer Science, British University in Egypt-BUE, Cairo, Egypt

Samir A. El-Seoud & Omar H. Karam

Electronics and Micro-Electronics Laboratory (E. μ. E. L), Faculty of Sciences of Monastir, University of Monastir, Monastir, Tunisia

Belgacem Bouallegue

Contributions

Hosam El-Sofany is responsible for developing the original research concept, design, methodology, and implementation. He is also responsible for writing, editing, reviewing, checking against plagiarism using the iThenticate program, and proofreading. Samir A. El-Seoud: methodology, writing, and proofreading. Omar H. Karam: methodology, writing, and proofreading. Belgacem Bouallegue: methodology, writing, reviewing, editing, and proofreading.

Corresponding author

Correspondence to Hosam El-Sofany.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

El-Sofany, H., El-Seoud, S.A., Karam, O.H. et al. Using machine learning algorithms to enhance IoT system security. Sci Rep 14, 12077 (2024). https://doi.org/10.1038/s41598-024-62861-y

Received: 13 November 2023

Accepted: 22 May 2024

Published: 27 May 2024

DOI: https://doi.org/10.1038/s41598-024-62861-y

  • Internet of Things
  • Sustainable development goals
  • Sustainable cities and communities
  • IoT security
  • Machine learning

research papers on machine learning applications

IMAGES

  1. (PDF) An Overview of Machine Learning and its Applications

    research papers on machine learning applications

  2. (PDF) Machine learning and its applications: A Review

    research papers on machine learning applications

  3. Machine Learning Latest Research Papers

    research papers on machine learning applications

  4. Top 3 Artificial Intelligence Research Papers

    research papers on machine learning applications

  5. (PDF) An Overview of Artificial Intelligence and their Applications

    research papers on machine learning applications

  6. (PDF) Artificial Intelligence and Machine Learning Applications in

    research papers on machine learning applications

VIDEO

  1. MLDescent #1: Can Anyone write a Research Paper in the Age of AI?

  2. TOP AI NEWS. Machine Learning Trends. NeurIPS 2020

  3. Why you should read Research Papers in ML & DL? #machinelearning #deeplearning

  4. Overview of Machine learning & Data science Applications on Modern Power System

  5. Foundations of 1 x 1 convolutions

  6. Introduction to How to Work on #AIResearchPapers #VPremiumWebinar

COMMENTS

  1. Machine Learning: Algorithms, Real-World Applications and Research

    To discuss the applicability of machine learning-based solutions in various real-world application domains. To highlight and summarize the potential research directions within the scope of our study for intelligent data analysis and services. The rest of the paper is organized as follows.

  2. Machine learning

    Machine learning articles from across Nature Portfolio. Machine learning is the ability of a machine to improve its performance based on previous results. Machine learning methods enable computers ...

  3. Machine Learning with Applications

    Machine Learning with Applications (MLWA) is a peer reviewed, open access journal focused on research related to machine learning.The journal encompasses all aspects of research and development in ML, including but not limited to data mining, computer vision, natural language processing (NLP), intelligent systems, neural networks, AI-based software engineering, bioinformatics and their ...

  4. Machine Learning: Algorithms, Real-World Applications and Research

    The learning algorithms can be categorized into four major types, such as supervised, unsupervised, semi-supervised, and reinforcement learning in the area [ 75 ], discussed briefly in Sect. " Types of Real-World Data and Machine Learning Techniques ". The popularity of these approaches to learning is increasing day-by-day, which is shown ...

  5. Machine learning-based approach: global trends, research directions

    Artificial intelligence (AI), and in particular, Machine Learning (ML), have progressed remarkably in recent years as key instruments to intelligently analyze such data and to develop the corresponding real-world applications (Koteluk et al., 2021; Sarker, 2021b).For instance, ML has emerged as the method of choice for developing practical software for computer vision, speech recognition, and ...

  6. Journal of Machine Learning Research

    The Journal of Machine Learning Research (JMLR), , provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online. JMLR has a commitment to rigorous yet rapid reviewing. Final versions are (ISSN 1533-7928) immediately ...

  7. PDF Machine Learning: Algorithms, Real-World Applications and Research

    Types of Real‐World Data and Machine Learning Techniques. Machine learning algorithms typically consume and process data to learn the related patterns about individuals, business processes, transactions, events, and so on. In the following, we discuss various types of real-world data as well as cat-egories of machine learning algorithms.

  8. Machine Learning: Algorithms, Models, and Applications

    View a PDF of the paper titled Machine Learning: Algorithms, Models, and Applications, by Jaydip Sen and 14 other authors ... the current volume presents a few innovative research works and their applications in real world, such as stock trading, medical and healthcare systems, and software automation. The chapters in the book illustrate how ...

  9. Eight ways machine learning is assisting medicine

    As Keane says, "There is a huge gap between showing a proof of concept in a research paper and actually deploying a machine learning system in the real world — something that Eric Topol of ...

  10. Recent advances and applications of machine learning in solid-state

    We then review numerous applications of machine learning in solid-state materials science: the discovery of new stable materials and the prediction of their structure, the machine learning ...

  11. Machine Learning

    Machine Learning Algorithms, Models and Applications Edited by Jaydip Sen Edited by Jaydip Sen Recent times are witnessing rapid development in machine learning algorithm systems, especially in reinforcement learning, natural language processing, computer and robot vision, image processing, speech, and emotional processing and understanding.

  12. Machine Learning for industrial applications: A comprehensive

    To answer the first three research questions, all papers were carefully read and classified in terms of their:-Application Domain (AD) - The industrial area or process considered in the paper, ... El-Bendary et al. (2015) proposed the application of machine learning techniques to assess tomato ripeness. Posed as a multi-class classification ...

  13. Machine learning and its applications: A review

    Machine learning is used in various fields such as bioinformatics, intrusion detection, Information retrieval, game playing, marketing, malware detection, image deconvolution and so on. This paper presents the work done by various authors in the field of machine learning in various application areas.

  14. A Very Brief Introduction to Machine Learning With Applications to

    Fig. 2. Machine learning methodology that integrates domain knowl-edge during model selection. Moving beyond the basic formulation described above, machine learning tools can integrate available domain knowledge in the learning process. This is indeed the key to the success of machine learning tools in a number of applications.

  15. Machine Learning and its Applications: A Study

    Machine Learning is one of the highly recognized research areas nowadays. Different algorithms are used widely across several domains to implement the concepts. In this paper, discussion has been done in relation to machine learning along with its types, application areas [1].

  16. Applied machine learning in cancer research: A systematic review for

    2. Literature review. The PubMed biomedical repository and the dblp computer science bibliography were selected to perform a literature overview on ML-based studies in cancer towards disease diagnosis, disease outcome prediction and patients' classification. We searched and selected original research journal papers excluding reviews and technical reports between 2016 (January) and 2020 ...

  17. Machine Learning and Deep Learning: A Review of Methods and Applications

    This article provides a comprehensive overview of the basics of machine learning and deep learning, their differences, applications, and their impact on society. With a focus on current literature and research, this article aims to provide a better understanding of the potential of machine learning and deep learning and their implications for ...

  18. Applications of machine learning in drug discovery and development

    Abstract. Drug discovery and development pipelines are long, complex and depend on numerous factors. Machine learning (ML) approaches provide a set of tools that can improve discovery and decision ...

  19. Machine Learning Aspects and its Applications Towards Different

    The paper started with giving a brief description of machine learning, and the use of different models of machine learning. The various types of machine learning algorithms that are used for various purposes like data mining, predictive analytics, image processing etc. has also presented in the comprehensive review. We have also given a review ...

  20. Machine learning in medical applications: A review of state-of-the-art

    Many papers on the application of ML in the medical field have been published. Fig. 4 shows the number of articles published between 2000 and December 2021. The materials were gathered using the keyword "machine learning in the medical field". First, published articles were collected from well-known publishers, including Springer, Elsevier, IEEE, and some other ...

  21. Machine Learning Algorithms for Predicting Energy Consumption in

    In the past few years, there has been a notable interest in the application of machine learning methods to enhance energy efficiency in the smart building industry. The paper discusses the use of machine learning in smart buildings to improve energy efficiency by analyzing data on energy usage, occupancy patterns, and environmental conditions.

  22. NTRS

    A notable breakthrough in the machine learning community, namely the transformer neural network architecture, forms the backbone of the proposed solution in this paper. The transformer architecture's role in this research represents a paradigm shift in the efficiency of automatic speech recognition models.

  23. Advancing fNIRS Neuroimaging through Synthetic Data ...

    This study presents an integrated approach for advancing functional Near-Infrared Spectroscopy (fNIRS) neuroimaging through the synthesis of data and application of machine learning models. By addressing the scarcity of high-quality neuroimaging datasets, this work harnesses Monte Carlo simulations and parametric head models to generate a comprehensive synthetic dataset, reflecting a wide ...

  24. Machine Learning Applications for Precision Agriculture: A

    The mechanism that drives this cutting-edge technology is machine learning (ML). It gives machines the ability to learn without being explicitly programmed. ML, together with IoT (Internet of Things) enabled farm machinery, is a key component of the next agricultural revolution. ... In this article, the authors present a systematic review of ML ...

  25. Machine learning applications in production lines: A systematic

    This paper systematically reviews the state-of-the-art in this domain and paves the way for further research on the application of machine learning in production lines. The following sections are organized as follows: Section 2 presents the background and related work. Section 3 describes the methodology.

  26. Inductive Biases in Deep Learning: Understanding Feature Representation

    Machine learning research aims to learn representations that enable effective downstream task performance. A growing subfield seeks to interpret these representations' roles in model behaviors or modify them to enhance alignment, interpretability, or generalization. Similarly, neuroscience examines neural representations and their behavioral correlations. Both fields focus on understanding or ...

  27. Using machine learning algorithms to enhance IoT system security

    Machine learning (ML) is an algorithmic artificial intelligence (AI) discipline that uses techniques to give intelligence to devices and computers. ML methods include unsupervised, supervised ...

  28. Free Full-Text

    Today, malware is arguably one of the biggest challenges organisations face from a cybersecurity standpoint, regardless of the types of devices used in the organisation. One of the most malware-attacked mobile operating systems today is Android. In response to this threat, this paper presents research on the functionalities and performance of different malicious Android application package ...