Computer Science: Recently Published Documents
Hiring CS Graduates: What We Learned from Employers
Computer science (CS) majors are in high demand and account for a large share of applicants in the national computer and information technology job market. Employment in this sector is projected to grow 12% between 2018 and 2028, faster than the average across all other occupations. Published data are available on traditional, non-CS-specific hiring processes; however, the hiring process for CS majors may differ. It is critical to have up-to-date information on questions such as “what positions are in high demand for CS majors?,” “what is a typical hiring process?,” and “what do employers say they look for when hiring CS graduates?” This article discusses the analysis of a survey of 218 recruiters hiring CS graduates in the United States. We used Atlas.ti to analyze the qualitative survey data and report results on which positions are in the highest demand, the hiring process, and the resume review process. Our study revealed that software developer was the most common position the recruiters were looking to fill. We found that the hiring process steps for CS graduates generally align with traditional hiring steps, with additional emphasis on technical and coding tests. Recruiters reported that their hiring choices were based on reviewing the experience, GPA, and projects sections of resumes. The results provide insights into the hiring process, decision making, and resume analysis, as well as some discrepancies between current undergraduate CS program outcomes and employers’ expectations.
A Systematic Literature Review of Empiricism and Norms of Reporting in Computing Education Research Literature
Context. Computing Education Research (CER) is critical to help the computing education community and policy makers support the increasing population of students who need to learn computing skills for future careers. For a community to systematically advance knowledge about a topic, its members must be able to understand published work thoroughly enough to perform replications, conduct meta-analyses, and build theories. There is a need to understand whether published research allows the CER community to systematically advance knowledge and build theories. Objectives. The goal of this study is to characterize the reporting of empiricism in Computing Education Research literature by identifying whether publications include content necessary for researchers to perform replications, meta-analyses, and theory building. We answer three research questions related to this goal: (RQ1) What percentage of papers in CER venues have some form of empirical evaluation? (RQ2) Of the papers that have empirical evaluation, what are the characteristics of the empirical evaluation? (RQ3) Of the papers that have empirical evaluation, do they follow norms (both for inclusion and for labeling of information needed for replication, meta-analysis, and, eventually, theory-building) for reporting empirical work? Methods. We conducted a systematic literature review of the 2014 and 2015 proceedings or issues of five CER venues: Technical Symposium on Computer Science Education (SIGCSE TS), International Symposium on Computing Education Research (ICER), Conference on Innovation and Technology in Computer Science Education (ITiCSE), ACM Transactions on Computing Education (TOCE), and Computer Science Education (CSE). We developed and applied the CER Empiricism Assessment Rubric to the 427 papers accepted and published at these venues over 2014 and 2015. Two reviewers evaluated each paper using the Base Rubric to characterize it, and a single reviewer applied the other rubrics to characterize the norms of reporting, as appropriate for the paper type. Any discrepancies or questions were discussed among multiple reviewers until resolved. Results. We found that over 80% of papers accepted across all five venues had some form of empirical evaluation. Quantitative evaluation methods were the most frequently reported. Papers most frequently reported results on interventions around pedagogical techniques, curriculum, community, or tools. Papers were split on whether they included a comparison between an intervention and some other dataset or baseline. Most papers reported related work, following the expectations for doing so in the SIGCSE and CER community. However, many papers were lacking properly reported research objectives, goals, research questions, or hypotheses; description of participants; study design; data collection; and threats to validity. These results align with prior surveys of the CER literature. Conclusions. CER authors are contributing empirical results to the literature; however, not all norms for reporting are met. We encourage authors to provide clear, labeled details about their work so readers can use the study methodologies and results for replications and meta-analyses. As our community grows, our reporting of CER should mature to help establish computing education theory to support the next generation of computing learners.
Light Diacritic Restoration to Disambiguate Homographs in Modern Arabic Texts
Diacritic restoration (also known as diacritization or vowelization) is the process of inserting the correct diacritical markings into a text. Modern Arabic, e.g., in newspapers, is typically written without diacritics. This lack of diacritical markings often causes ambiguity, and though native speakers are adept at resolving it, they sometimes fail. Diacritic restoration is a classical problem in computer science. While most prior work tackles the full (heavy) diacritization of text, we are interested in diacritizing text using as few diacritics as possible. Studies have shown that a fully diacritized text is visually displeasing and slows down reading. This article proposes a system to diacritize homographs using the least number of diacritics, hence the name “light.” There is a large class of words that fall under the homograph category, and we deal with the class of words that share the spelling but not the meaning. With fewer diacritics, we expect no effect on reading speed, while eye strain is reduced. The system combines a morphological analyzer with context similarity. The morphological analyzer is used to generate all candidate diacritizations of a word. Then, through a statistical approach and context similarities, we resolve the homographs. Experimentally, the system shows very promising results, and our best accuracy is 85.6%.
A genre-based analysis of questions and comments in Q&A sessions after conference paper presentations in computer science
Gender diversity in computer science at a large public R1 research university: reporting on a self-study.
With the number of jobs in computer occupations on the rise, there is a greater need for computer science (CS) graduates than ever. At the same time, women make up only 25–30% of the students in classes at most CS departments across the country, meaning that we are failing to draw interest from a large portion of the population. In this work, we explore the gender gap in CS at Rutgers University–New Brunswick, a large public R1 research university, using three data sets that span thousands of students across six academic years. Specifically, we combine these data sets to study the gender gaps in four core CS courses and explore the correlation of several factors with retention and the impact of these factors on changes to the gender gap as students proceed through the CS courses toward completing the CS major. For example, we find that a significant percentage of women students taking the introductory CS1 course for majors do not intend to major in CS, which may be a contributing factor to a large increase in the gender gap immediately after CS1. This finding implies that part of the retention task is attracting these women students to further explore the major. Results from our study include both novel findings and findings that are consistent with known challenges for increasing gender diversity in CS. In both cases, we provide extensive quantitative data in support of the findings.
Designing for Student-Directedness: How K–12 Teachers Utilize Peers to Support Projects
Student-directed projects—projects in which students have individual control over what they create and how to create it—are a promising practice for supporting the development of conceptual understanding and personal interest in K–12 computer science classrooms. In this article, we explore a central (and perhaps counterintuitive) design principle identified by a group of K–12 computer science teachers who support student-directed projects in their classrooms: in order for students to develop their own ideas and determine how to pursue them, students must have opportunities to engage with other students’ work. In this qualitative study, we investigated the instructional practices of 25 K–12 teachers using a series of in-depth, semi-structured interviews to develop understandings of how they used peer work to support student-directed projects in their classrooms. Teachers described supporting their students in navigating three stages of project development: generating ideas, pursuing ideas, and presenting ideas. For each of these three stages, teachers considered multiple factors to encourage engagement with peer work in their classrooms, including the quality and completeness of shared work and the modes of interaction with the work. We discuss how this pedagogical approach offers students new relationships to their own learning, to their peers, and to their teachers and communicates important messages to students about their own competence and agency, potentially contributing to aims within computer science for broadening participation.
Creativity in CS1: A Literature Review
Computer science is a fast-growing field in today’s digitized age, and working in this industry often requires creativity and innovative thought. An issue within computer science education, however, is that large introductory programming courses often involve little opportunity for creative thinking within coursework. The undergraduate introductory programming course (CS1) is notorious for its poor student performance and retention rates across multiple institutions. Integrating opportunities for creative thinking may help combat this issue by adding a personal touch to course content, which could allow beginner CS students to better relate to the abstract world of programming. Research on the role of creativity in computer science education (CSE) is an interesting area with a lot of room for exploration, due to the complexity of the phenomenon of creativity as well as the CSE research field being fairly new compared to some other education fields where this topic has been more closely explored. To contribute to this area of research, this article provides a literature review exploring the concept of creativity as relevant to computer science education and CS1 in particular. Based on the review of the literature, we conclude that creativity is an essential component of computer science, and the type of creativity that computer science requires is, in fact, a teachable skill through the use of various tools and strategies. These strategies include the integration of open-ended assignments, large collaborative projects, learning by teaching, multimedia projects, small creative computational exercises, game development projects, digitally produced art, robotics, digital story-telling, music manipulation, and project-based learning. Research on each of these strategies and their effects on student experiences within CS1 is discussed in this review. Lastly, six main components of creativity-enhancing activities are identified based on the studies about incorporating creativity into CS1. These components are as follows: Collaboration, Relevance, Autonomy, Ownership, Hands-On Learning, and Visual Feedback. The purpose of this article is to contribute to computer science educators’ understanding of how creativity is best understood in the context of computer science education and to explore practical applications of creativity theory in CS1 classrooms. This is an important collection of information for restructuring aspects of future introductory programming courses in creative, innovative ways that benefit student learning.
CATS: Customizable Abstractive Topic-based Summarization
Neural sequence-to-sequence models are the state-of-the-art approach used in abstractive summarization of textual documents, useful for producing condensed versions of source text narratives without being restricted to using only words from the original text. Despite the advances in abstractive summarization, custom generation of summaries (e.g., towards a user’s preference) remains unexplored. In this article, we present CATS, an abstractive neural summarization model that summarizes content in a sequence-to-sequence fashion while also introducing a new mechanism to control the underlying latent topic distribution of the produced summaries. We empirically illustrate the efficacy of our model in producing customized summaries and present findings that facilitate the design of such systems. We use the well-known CNN/DailyMail dataset to evaluate our model. Furthermore, we present a transfer-learning method and demonstrate the effectiveness of our approach in a low-resource setting, i.e., abstractive summarization of meeting minutes, where combining the main available meeting-transcript datasets, AMI and International Computer Science Institute (ICSI), results in merely a few hundred training documents.
Exploring students’ and lecturers’ views on collaboration and cooperation in computer science courses - a qualitative analysis
Factors affecting student educational choices regarding OER material in computer science
Machine Learning: Algorithms, Real-World Applications and Research Directions
- Review Article
- Published: 22 March 2021
- Volume 2, article number 160 (2021)
- Iqbal H. Sarker (ORCID: orcid.org/0000-0003-1740-5517)
In the current age of the Fourth Industrial Revolution (4IR or Industry 4.0), the digital world holds a wealth of data, such as Internet of Things (IoT) data, cybersecurity data, mobile data, business data, social media data, health data, etc. To intelligently analyze these data and develop the corresponding smart and automated applications, knowledge of artificial intelligence (AI), particularly machine learning (ML), is the key. Various types of machine learning algorithms, such as supervised, unsupervised, semi-supervised, and reinforcement learning, exist in the area. In addition, deep learning, which is part of a broader family of machine learning methods, can intelligently analyze data on a large scale. In this paper, we present a comprehensive view of these machine learning algorithms, which can be applied to enhance the intelligence and the capabilities of an application. Thus, this study’s key contribution is explaining the principles of different machine learning techniques and their applicability in various real-world application domains, such as cybersecurity systems, smart cities, healthcare, e-commerce, agriculture, and many more. We also highlight the challenges and potential research directions based on our study. Overall, this paper aims to serve as a reference point for both academia and industry professionals as well as for decision-makers in various real-world situations and application areas, particularly from the technical point of view.
Introduction
We live in the age of data, where everything around us is connected to a data source and everything in our lives is digitally recorded [ 21 , 103 ]. For instance, the current electronic world holds a wealth of various kinds of data, such as Internet of Things (IoT) data, cybersecurity data, smart city data, business data, smartphone data, social media data, health data, COVID-19 data, and many more. These data can be structured, semi-structured, or unstructured, as discussed briefly in Sect. “Types of Real-World Data and Machine Learning Techniques”, and their volume is increasing day by day. Extracting insights from these data can be used to build various intelligent applications in the relevant domains. For instance, to build a data-driven, automated, and intelligent cybersecurity system, the relevant cybersecurity data can be used [ 105 ]; to build personalized context-aware smart mobile applications, the relevant mobile data can be used [ 103 ]; and so on. Thus, data management tools and techniques that can extract insights or useful knowledge from data in a timely and intelligent way are urgently needed, as they form the foundation of real-world applications.
Fig. 1: The worldwide popularity scores of various types of ML algorithms (supervised, unsupervised, semi-supervised, and reinforcement) over time, on a scale from 0 (min) to 100 (max); the x-axis represents the timestamp and the y-axis the corresponding score.
Artificial intelligence (AI), and particularly machine learning (ML), has grown rapidly in recent years in the context of data analysis and computing, typically allowing applications to function in an intelligent manner [ 95 ]. ML usually provides systems with the ability to learn and improve from experience automatically, without being explicitly programmed, and is generally regarded as one of the most popular technologies of the fourth industrial revolution (4IR or Industry 4.0) [ 103 , 105 ]. “Industry 4.0” [ 114 ] is typically the ongoing automation of conventional manufacturing and industrial practices, including exploratory data processing, using new smart technologies such as machine learning automation. Thus, to intelligently analyze these data and to develop the corresponding real-world applications, machine learning algorithms are the key. Learning algorithms can be categorized into four major types: supervised, unsupervised, semi-supervised, and reinforcement learning [ 75 ], discussed briefly in Sect. “Types of Real-World Data and Machine Learning Techniques”. The popularity of these approaches to learning is increasing day by day, as shown in Fig. 1 , based on data collected from Google Trends [ 4 ] over the last five years. The x-axis of the figure indicates the specific dates, and the y-axis shows the corresponding popularity score within the range of 0 (minimum) to 100 (maximum). According to Fig. 1 , the popularity values for these learning types were low in 2015 and have been increasing since. These statistics motivate us to study machine learning in this paper, as it can play an important role in the real world through Industry 4.0 automation.
In general, the effectiveness and efficiency of a machine learning solution depend on the nature and characteristics of the data and the performance of the learning algorithms. In the area of machine learning algorithms, techniques such as classification analysis, regression, data clustering, feature engineering and dimensionality reduction, association rule learning, and reinforcement learning exist to effectively build data-driven systems [ 41 , 125 ]. Besides, deep learning, which originated from the artificial neural network and is part of a wider family of machine learning approaches, can be used to intelligently analyze data [ 96 ]. Thus, selecting a learning algorithm that is suitable for the target application in a particular domain is challenging. The reason is that different learning algorithms serve different purposes, and even the outcomes of different learning algorithms in the same category may vary depending on the data characteristics [ 106 ]. Thus, it is important to understand the principles of the various machine learning algorithms and their applicability in various real-world application areas, such as IoT systems, cybersecurity services, business and recommendation systems, smart cities, healthcare and COVID-19, context-aware systems, sustainable agriculture, and many more, which are explained briefly in Sect. “Applications of Machine Learning”.
Based on the importance and potential of “Machine Learning” to analyze the data mentioned above, in this paper we provide a comprehensive view of the various types of machine learning algorithms that can be applied to enhance the intelligence and the capabilities of an application. Thus, the key contribution of this study is explaining the principles and potential of different machine learning techniques and their applicability in the various real-world application areas mentioned earlier. The purpose of this paper is, therefore, to provide a basic guide for academics and industry professionals who want to study, research, and develop data-driven automated and intelligent systems in the relevant areas based on machine learning techniques.
The key contributions of this paper are listed as follows:
To define the scope of our study by taking into account the nature and characteristics of various types of real-world data and the capabilities of various learning techniques.
To provide a comprehensive view on machine learning algorithms that can be applied to enhance the intelligence and capabilities of a data-driven application.
To discuss the applicability of machine learning-based solutions in various real-world application domains.
To highlight and summarize the potential research directions within the scope of our study for intelligent data analysis and services.
The rest of the paper is organized as follows. The next section presents the types of data and machine learning algorithms in a broader sense and defines the scope of our study. We briefly discuss and explain different machine learning algorithms in the subsequent section followed by which various real-world application areas based on machine learning algorithms are discussed and summarized. In the penultimate section, we highlight several research issues and potential future directions, and the final section concludes this paper.
Types of Real-World Data and Machine Learning Techniques
Machine learning algorithms typically consume and process data to learn the related patterns about individuals, business processes, transactions, events, and so on. In the following, we discuss various types of real-world data as well as categories of machine learning algorithms.
Types of Real-World Data
Usually, the availability of data is considered the key to constructing a machine learning model or a data-driven real-world system [ 103 , 105 ]. Data can be of various forms, such as structured, semi-structured, or unstructured [ 41 , 72 ]. In addition, “metadata” is another type, which typically represents data about the data. In the following, we briefly discuss these types of data.
Structured: Structured data have a well-defined structure and conform to a data model following a standard order; they are highly organized, easily accessed, and readily used by an entity or a computer program. Structured data are typically stored in well-defined schemas such as relational databases, i.e., in a tabular format. Names, dates, addresses, credit card numbers, stock information, geolocation, etc. are examples of structured data.
Unstructured: On the other hand, unstructured data have no pre-defined format or organization, making them much more difficult to capture, process, and analyze; they mostly contain text and multimedia material. For example, sensor data, emails, blog entries, wikis, word processing documents, PDF files, audio files, videos, images, presentations, web pages, and many other types of business documents can be considered unstructured data.
Semi-structured: Semi-structured data are not stored in a relational database like the structured data mentioned above, but they do have certain organizational properties that make them easier to analyze. HTML, XML, JSON documents, NoSQL databases, etc. are some examples of semi-structured data.
Metadata: Metadata are not a normal form of data, but “data about data”. The primary difference between “data” and “metadata” is that data are simply the material that can classify, measure, or document something about an organization’s data properties, whereas metadata describes the relevant data information, giving it more significance for data users. Basic examples of a document’s metadata are the author, the file size, the date the document was generated, and keywords describing the document.
In the area of machine learning and data science, researchers use various widely used datasets for different purposes. These include, for example, cybersecurity datasets such as NSL-KDD [ 119 ], UNSW-NB15 [ 76 ], ISCX’12 [ 1 ], CIC-DDoS2019 [ 2 ], Bot-IoT [ 59 ], etc.; smartphone datasets such as phone call logs [ 84 , 101 ], SMS logs [ 29 ], mobile application usage logs [ 117 , 137 ], and mobile phone notification logs [ 73 ]; IoT data [ 16 , 57 , 62 ]; agriculture and e-commerce data [ 120 , 138 ]; health data such as heart disease [ 92 ], diabetes mellitus [ 83 , 134 ], and COVID-19 [ 43 , 74 ]; and many more in various application domains. These data can be of the different types discussed above, which may vary from application to application in the real world. To analyze such data in a particular problem domain, and to extract insights or useful knowledge from the data for building real-world intelligent applications, different types of machine learning techniques can be used according to their learning capabilities, as discussed in the following.
Types of Machine Learning Techniques
Machine Learning algorithms are mainly divided into four categories: Supervised learning, Unsupervised learning, Semi-supervised learning, and Reinforcement learning [ 75 ], as shown in Fig. 2 . In the following, we briefly discuss each type of learning technique with the scope of their applicability to solve real-world problems.
Fig. 2: Various types of machine learning techniques.
Supervised: Supervised learning is typically the task of machine learning to learn a function that maps an input to an output based on sample input-output pairs [ 41 ]. It uses labeled training data and a collection of training examples to infer a function. Supervised learning is carried out when certain goals are identified to be accomplished from a certain set of inputs [ 105 ], i.e., a task-driven approach . The most common supervised tasks are “classification” that separates the data, and “regression” that fits the data. For instance, predicting the class label or sentiment of a piece of text, like a tweet or a product review, i.e., text classification, is an example of supervised learning.
Unsupervised: Unsupervised learning analyzes unlabeled datasets without the need for human interference, i.e., a data-driven process [ 41 ]. This is widely used for extracting generative features, identifying meaningful trends and structures, groupings in results, and exploratory purposes. The most common unsupervised learning tasks are clustering, density estimation, feature learning, dimensionality reduction, finding association rules, anomaly detection, etc.
Semi-supervised: Semi-supervised learning can be defined as a hybridization of the supervised and unsupervised methods mentioned above, as it operates on both labeled and unlabeled data [ 41 , 105 ]. Thus, it falls between learning “without supervision” and learning “with supervision”. In the real world, there are several contexts where labeled data are scarce and unlabeled data are plentiful, which is where semi-supervised learning is useful [ 75 ]. The ultimate goal of a semi-supervised learning model is to provide a better prediction outcome than the model could produce using the labeled data alone. Some application areas where semi-supervised learning is used include machine translation, fraud detection, data labeling, and text classification.
Reinforcement: Reinforcement learning is a type of machine learning algorithm that enables software agents and machines to automatically evaluate the optimal behavior in a particular context or environment to improve their efficiency [ 52 ], i.e., an environment-driven approach. This type of learning is based on reward or penalty, and its ultimate goal is to use insights obtained from interacting with the environment to take actions that increase the reward or minimize the risk [ 75 ]. It is a powerful tool for training AI models that can help increase automation or optimize the operational efficiency of sophisticated systems such as robotics, autonomous driving, manufacturing, and supply chain logistics; however, it is not preferable for solving basic or straightforward problems.
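To make the reward-driven loop concrete, here is a minimal tabular Q-learning sketch in Python (assuming numpy is available). The toy corridor environment, its reward scheme, and all variable names are illustrative inventions, not taken from the paper.

```python
# Tabular Q-learning on a toy 1-D corridor: the agent starts at state 0
# and is rewarded only for reaching the right end.
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions)) # action-value table
alpha, gamma, epsilon = 0.1, 0.9, 0.2

rng = np.random.default_rng(0)
for episode in range(500):
    s = 0
    while s != n_states - 1:                      # terminal state: right end
        # epsilon-greedy action choice balances exploration and exploitation
        a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s_next == n_states - 1 else 0.0   # reward only at the goal
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))  # learned greedy policy; mostly action 1 (move right)
```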
Thus, to build effective models in various application areas, different types of machine learning techniques can play a significant role according to their learning capabilities, depending on the nature of the data discussed earlier and the target outcome. In Table 1 , we summarize various types of machine learning techniques with examples. In the following, we provide a comprehensive view of machine learning algorithms that can be applied to enhance the intelligence and capabilities of a data-driven application.
Machine Learning Tasks and Algorithms
In this section, we discuss various machine learning algorithms that include classification analysis, regression analysis, data clustering, association rule learning, feature engineering for dimensionality reduction, as well as deep learning methods. A general structure of a machine learning-based predictive model has been shown in Fig. 3 , where the model is trained from historical data in phase 1 and the outcome is generated in phase 2 for the new test data.
Fig. 3: A general structure of a machine learning-based predictive model, considering both the training and testing phases.
Classification Analysis
Classification is regarded as a supervised learning method in machine learning, referring to a problem of predictive modeling in which a class label is predicted for a given example [ 41 ]. Mathematically, it maps a function ( f ) from input variables ( X ) to output variables ( Y ), i.e., the targets, labels, or categories. Classification can be carried out on structured or unstructured data to predict the class of given data points. For example, spam detection, with classes “spam” and “not spam”, in email service providers is a classification problem. In the following, we summarize the common classification problems.
Binary classification: It refers to the classification tasks having two class labels such as “true and false” or “yes and no” [ 41 ]. In such binary classification tasks, one class could be the normal state, while the abnormal state could be another class. For instance, “cancer not detected” is the normal state of a task that involves a medical test, and “cancer detected” could be considered as the abnormal state. Similarly, “spam” and “not spam” in the above example of email service providers are considered as binary classification.
Multiclass classification: Traditionally, this refers to those classification tasks having more than two class labels [ 41 ]. The multiclass classification does not have the principle of normal and abnormal outcomes, unlike binary classification tasks. Instead, within a range of specified classes, examples are classified as belonging to one. For example, it can be a multiclass classification task to classify various types of network attacks in the NSL-KDD [ 119 ] dataset, where the attack categories are classified into four class labels, such as DoS (Denial of Service Attack), U2R (User to Root Attack), R2L (Root to Local Attack), and Probing Attack.
Multi-label classification: In machine learning, multi-label classification is an important consideration in which an example is associated with several classes or labels. Thus, it is a generalization of multiclass classification, where the classes involved in the problem are hierarchically structured and each example may simultaneously belong to more than one class at each hierarchical level, e.g., multi-level text classification. For instance, a Google News article can be presented under several categories at once, such as a “city name”, “technology”, or “latest news”. Multi-label classification relies on advanced machine learning algorithms that support predicting multiple mutually non-exclusive classes or labels, unlike traditional classification tasks where class labels are mutually exclusive [ 82 ].
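As a concrete illustration, the hedged sketch below implements multi-label text classification with a one-vs-rest decomposition, assuming scikit-learn is available; the tiny dataset and tag names are invented for illustration, echoing the news-tagging example above.

```python
# Multi-label classification: each snippet can carry several non-exclusive tags.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

texts = [
    "new smartphone released in the city",
    "city council meets on local budget",
    "chip maker announces faster processor",
    "breaking update on the city election",
]
tags = [["technology", "city"], ["city"], ["technology"], ["city", "latest news"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)                 # binary indicator matrix, one column per label
X = CountVectorizer().fit_transform(texts)

# One independent binary classifier per label (one-vs-rest decomposition)
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
pred = clf.predict(X)
print(mlb.inverse_transform(pred))          # per-document sets of predicted tags
```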
Many classification algorithms have been proposed in the machine learning and data science literature [ 41 , 125 ]. In the following, we summarize the most common and popular methods that are used widely in various application areas.
Naive Bayes (NB): The naive Bayes algorithm is based on Bayes’ theorem with the assumption of independence between each pair of features [ 51 ]. It works well in many real-world situations, such as document or text classification and spam filtering, and can be used for both binary and multi-class categories. The NB classifier can be used to effectively classify noisy instances in the data and to construct a robust prediction model [ 94 ]. The key benefit is that, compared to more sophisticated approaches, it needs only a small amount of training data to estimate the necessary parameters quickly [ 82 ]. However, its performance may suffer due to its strong assumption of feature independence. Gaussian, Multinomial, Complement, Bernoulli, and Categorical are the common variants of the NB classifier [ 82 ].
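For illustration, here is a minimal naive Bayes spam-filtering sketch, assuming scikit-learn; MultinomialNB is the word-count variant named above, and the toy messages are invented.

```python
# Naive Bayes text classification: bag-of-words counts with the multinomial variant.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = ["win a free prize now", "free cash offer",
            "meeting at noon tomorrow", "lunch tomorrow?"]
labels = ["spam", "spam", "not spam", "not spam"]

vec = CountVectorizer()
X = vec.fit_transform(messages)     # word counts; features are treated as independent
clf = MultinomialNB().fit(X, labels)

print(clf.predict(vec.transform(["free prize tomorrow"])))
```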
Linear Discriminant Analysis (LDA): Linear Discriminant Analysis (LDA) is a linear decision boundary classifier created by fitting class-conditional densities to the data and applying Bayes’ rule [ 51 , 82 ]. This method is also known as a generalization of Fisher’s linear discriminant, which projects a given dataset into a lower-dimensional space, i.e., a dimensionality reduction that minimizes the complexity of the model or reduces the resulting model’s computational cost. The standard LDA model fits each class with a Gaussian density, assuming that all classes share the same covariance matrix [ 82 ]. LDA is closely related to ANOVA (analysis of variance) and regression analysis, which seek to express one dependent variable as a linear combination of other features or measurements.
Logistic regression (LR): Another common probabilistic statistical model used to solve classification problems in machine learning is logistic regression (LR) [ 64 ]. Logistic regression typically uses a logistic function to estimate probabilities; this is the mathematically defined sigmoid function of Eq. 1 , \(g(z) = \frac{1}{1 + e^{-z}}\) . It works well when the dataset can be separated linearly, but it may overfit high-dimensional datasets; the regularization (L1 and L2) techniques [ 82 ] can be used to avoid over-fitting in such scenarios. The assumption of linearity between the dependent and independent variables is considered a major drawback of logistic regression. It can be used for both classification and regression problems, but it is more commonly used for classification.
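A brief sketch of the sigmoid of Eq. 1 and an L2-regularized logistic regression fit follows, assuming scikit-learn and numpy; the synthetic data are illustrative.

```python
# The logistic (sigmoid) function squashes a linear score into a probability;
# penalty='l2' with strength 1/C counters the over-fitting risk noted above.
import numpy as np
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # Eq. 1: g(z) = 1 / (1 + e^{-z})

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # linearly separable target

clf = LogisticRegression(penalty="l2", C=1.0).fit(X, y)
# The model's predicted probability is the sigmoid of its linear score:
z = X @ clf.coef_.ravel() + clf.intercept_
assert np.allclose(sigmoid(z), clf.predict_proba(X)[:, 1])
```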
K-nearest neighbors (KNN): K-Nearest Neighbors (KNN) [ 9 ] is an “instance-based learning” or non-generalizing learning, also known as a “lazy learning” algorithm. It does not focus on constructing a general internal model; instead, it stores all instances corresponding to training data in n -dimensional space. KNN uses data and classifies new data points based on similarity measures (e.g., Euclidean distance function) [ 82 ]. Classification is computed from a simple majority vote of the k nearest neighbors of each point. It is quite robust to noisy training data, and accuracy depends on the data quality. The biggest issue with KNN is to choose the optimal number of neighbors to be considered. KNN can be used both for classification as well as regression.
Support vector machine (SVM): In machine learning, another common technique that can be used for classification, regression, or other tasks is the support vector machine (SVM) [ 56 ]. In high- or infinite-dimensional space, a support vector machine constructs a hyper-plane or set of hyper-planes. Intuitively, the hyper-plane that has the greatest distance from the nearest training data points in any class achieves a strong separation, since, in general, the greater the margin, the lower the classifier’s generalization error. SVMs are effective in high-dimensional spaces and can behave differently based on different mathematical functions known as kernels. Linear, polynomial, radial basis function (RBF), sigmoid, etc. are the popular kernel functions used in SVM classifiers [ 82 ]. However, when the data set contains more noise, such as overlapping target classes, SVM does not perform well.
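The kernel choice can be illustrated in a few lines, assuming scikit-learn: a linear kernel fails on concentric, non-linearly separable data, while an RBF kernel separates it. The synthetic circles dataset is illustrative.

```python
# Kernel comparison: linear vs. RBF SVM on non-linearly separable data.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(kernel, clf.score(X, y))   # RBF scores near 1.0, linear near 0.5
```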
Decision tree (DT): The decision tree (DT) [ 88 ] is a well-known non-parametric supervised learning method. DT learning methods are used for both classification and regression tasks [ 82 ]. ID3 [ 87 ], C4.5 [ 88 ], and CART [ 20 ] are well-known DT algorithms. Moreover, the recently proposed BehavDT [ 100 ] and IntrudTree [ 97 ] by Sarker et al. are effective in the relevant application domains, such as user behavior analytics and cybersecurity analytics, respectively. A DT classifies instances by sorting them down the tree from the root to some leaf node, as shown in Fig. 4 . An instance is classified by checking the attribute defined by each node, starting at the root node of the tree, and then moving down the tree branch corresponding to the attribute’s value. For splitting, the most popular criteria are “gini” for the Gini impurity and “entropy” for the information gain, which can be expressed mathematically as [ 82 ] \(\text {Gini}(E) = 1 - \sum _{i=1}^{c} p_i^2\) and \(H(E) = -\sum _{i=1}^{c} p_i \log _2 p_i\) , where \(p_i\) is the proportion of examples in class i .
Fig. 4: An example of a decision tree structure.
Fig. 5: An example of a random forest structure considering multiple decision trees.
Random forest (RF): A random forest classifier [ 19 ] is well known as an ensemble classification technique that is used in the field of machine learning and data science in various application areas. This method uses “parallel ensembling” which fits several decision tree classifiers in parallel, as shown in Fig. 5 , on different data set sub-samples and uses majority voting or averages for the outcome or final result. It thus minimizes the over-fitting problem and increases the prediction accuracy and control [ 82 ]. Therefore, the RF learning model with multiple decision trees is typically more accurate than a single decision tree based model [ 106 ]. To build a series of decision trees with controlled variation, it combines bootstrap aggregation (bagging) [ 18 ] and random feature selection [ 11 ]. It is adaptable to both classification and regression problems and fits well for both categorical and continuous values.
Adaptive Boosting (AdaBoost): Adaptive Boosting (AdaBoost) is an ensemble learning process that employs an iterative approach to improve poor classifiers by learning from their errors. It was developed by Yoav Freund et al. [ 35 ] and is also known as “meta-learning”. Unlike the random forest, which uses parallel ensembling, AdaBoost uses “sequential ensembling”. It creates a powerful classifier by combining many poorly performing classifiers into a single classifier of high accuracy. In that sense, AdaBoost is called an adaptive classifier, as it significantly improves the efficiency of the classifier, but in some instances it can trigger over-fitting. AdaBoost is best used to boost the performance of decision trees, its base estimator [ 82 ], on binary classification problems; however, it is sensitive to noisy data and outliers.
Extreme gradient boosting (XGBoost): Gradient boosting, like random forests [ 19 ] above, is an ensemble learning algorithm that generates a final model based on a series of individual models, typically decision trees. The gradient is used to minimize the loss function, similar to how neural networks [ 41 ] use gradient descent to optimize weights. Extreme Gradient Boosting (XGBoost) is a form of gradient boosting that takes more detailed approximations into account when determining the best model [ 82 ]. It computes second-order gradients of the loss function to minimize loss and uses advanced regularization (L1 and L2) [ 82 ], which reduces over-fitting and improves model generalization and performance. XGBoost is fast to interpret and can handle large-sized datasets well.
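To make the single-tree versus parallel-ensemble versus sequential-ensemble distinction of the preceding paragraphs concrete, the hedged sketch below compares the four approaches on synthetic data, assuming scikit-learn. Note that scikit-learn's GradientBoostingClassifier stands in for XGBoost here to avoid an extra dependency; the separate xgboost package provides a similar XGBClassifier.

```python
# Single decision tree vs. parallel (bagging) and sequential (boosting) ensembles.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

models = {
    "single decision tree": DecisionTreeClassifier(random_state=0),
    "random forest (parallel ensemble)": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost (sequential ensemble)": AdaBoostClassifier(n_estimators=100, random_state=0),
    "gradient boosting (XGBoost-style)": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f}")   # ensembles typically beat the single tree
```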
Stochastic gradient descent (SGD): Stochastic gradient descent (SGD) [ 41 ] is an iterative method for optimizing an objective function with suitable smoothness properties, where the word “stochastic” refers to random probability. This reduces the computational burden, particularly in high-dimensional optimization problems, allowing for faster iterations in exchange for a lower convergence rate. A gradient is the slope of a function, calculating a variable’s degree of change in response to another variable’s changes. Mathematically, gradient descent minimizes a (typically convex) cost function by stepping along its partial derivatives with respect to the input parameters. Let \(\alpha\) be the learning rate and \(J_i\) the cost of the \(i\mathrm{th}\) training example; then Eq. ( 4 ), \(w_{j+1} = w_j - \alpha \frac{\partial J_i}{\partial w_j}\) , represents the stochastic gradient descent weight update at the \(j^\mathrm{th}\) iteration. In large-scale and sparse machine learning, SGD has been successfully applied to problems often encountered in text classification and natural language processing [ 82 ]. However, SGD is sensitive to feature scaling and needs a range of hyperparameters, such as the regularization parameter and the number of iterations.
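A from-scratch numpy sketch of the Eq. 4 update for simple linear regression follows; the synthetic data and learning-rate choice are illustrative.

```python
# SGD for linear regression: at each step, the weights move against the
# gradient of a single randomly chosen example's squared-error cost J_i.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 3.0 * X[:, 0] + 1.0 + rng.normal(scale=0.1, size=200)   # true slope 3, intercept 1

w, b, alpha = 0.0, 0.0, 0.01                                 # alpha = learning rate
for _ in range(5000):
    i = rng.integers(len(X))                                 # one 'stochastic' example
    err = (w * X[i, 0] + b) - y[i]                           # prediction error on example i
    w -= alpha * err * X[i, 0]                               # Eq. 4: w <- w - alpha * dJ_i/dw
    b -= alpha * err
print(round(w, 2), round(b, 2))   # should approach 3.0 and 1.0
```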
Rule-based classification: The term rule-based classification can be used to refer to any classification scheme that makes use of IF-THEN rules for class prediction. Several classification algorithms with the ability to generate rules exist, such as Zero-R [ 125 ], One-R [ 47 ], decision trees [ 87 , 88 ], DTNB [ 110 ], Ripple Down Rule learner (RIDOR) [ 125 ], and Repeated Incremental Pruning to Produce Error Reduction (RIPPER) [ 126 ]. The decision tree is one of the most common rule-based classification algorithms among these techniques because it has several advantages, such as being easier to interpret, the ability to handle high-dimensional data, simplicity and speed, good accuracy, and the capability to produce rules that are clear and understandable to humans [ 127 , 128 ]. Decision tree-based rules also provide significant accuracy in a prediction model for unseen test cases [ 106 ]. Since the rules are easily interpretable, these rule-based classifiers are often used to produce descriptive models that can describe a system, including its entities and their relationships.
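As an illustration of tree-derived IF-THEN rules, the sketch below fits a shallow decision tree and prints its rules, assuming scikit-learn; the iris dataset is used purely for demonstration.

```python
# Rule-based classification via a decision tree: export_text renders the
# fitted tree as readable IF-THEN rules (one rule per root-to-leaf path).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

print(export_text(tree, feature_names=list(iris.feature_names)))
```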
Fig. 6: Classification vs. regression. In classification, the dotted line represents a linear boundary that separates the two classes; in regression, the dotted line models the linear relationship between the two variables.
Regression Analysis
Regression analysis includes several methods of machine learning that allow one to predict a continuous ( y ) result variable based on the value of one or more ( x ) predictor variables [ 41 ]. The most significant distinction between classification and regression is that classification predicts distinct class labels, while regression facilitates the prediction of a continuous quantity. Figure 6 shows an example of how classification differs from regression. Some overlap is often found between the two types of machine learning algorithms. Regression models are now widely used in a variety of fields, including financial forecasting or prediction, cost estimation, trend analysis, marketing, time series estimation, drug response modeling, and many more. Some of the familiar types of regression algorithms are linear, polynomial, LASSO, and ridge regression, which are explained briefly in the following.
Simple and multiple linear regression: This is one of the most popular ML modeling techniques as well as a well-known regression technique. In this technique, the dependent variable is continuous, the independent variable(s) can be continuous or discrete, and the form of the regression line is linear. Linear regression creates a relationship between the dependent variable ( Y ) and one or more independent variables ( X ), also known as the regression line, using the best-fit straight line [ 41 ]. Simple linear regression, with a single independent variable, is defined in Eq. ( 5 ) as
\(y = a + bx + e,\)
where a is the intercept, b is the slope of the line, and e is the error term. This equation can be used to predict the value of the target variable based on the given predictor variable(s). Multiple linear regression is an extension of simple linear regression that allows two or more predictor variables to model a response variable, y , as a linear function [ 41 ], defined in Eq. ( 6 ) as
\(y = a + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n + e.\)
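A short numpy sketch of fitting Eq. 5 by ordinary least squares follows; the synthetic data and true coefficient values are illustrative.

```python
# Recover the intercept a and slope b of Eq. 5 from noisy synthetic data.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=100)   # true a = 2.0, b = 0.5

b, a = np.polyfit(x, y, deg=1)        # degree-1 fit returns [slope, intercept]
print(round(a, 2), round(b, 2))       # estimates of the intercept and slope
```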
Polynomial regression: Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is not linear but is a polynomial of degree n in x [ 82 ]. Its equation, derived from the linear regression equation (polynomial regression of degree 1), is defined as
\(y = b_0 + b_1 x + b_2 x^2 + \cdots + b_n x^n,\)
where y is the predicted/target output, \(b_0, b_1, \ldots , b_n\) are the regression coefficients, and x is the independent input variable. In simple words, if the data are not distributed linearly but instead follow an \(n^\mathrm{th}\) -degree polynomial, then we use polynomial regression to get the desired output.
LASSO and ridge regression: LASSO and ridge regression are well known as powerful techniques typically used for building learning models in the presence of a large number of features, due to their capability to prevent over-fitting and reduce the complexity of the model. The LASSO (least absolute shrinkage and selection operator) regression model uses the L 1 regularization technique [ 82 ], which applies shrinkage by penalizing the “absolute value of the magnitude of coefficients” ( L 1 penalty). As a result, LASSO can shrink coefficients all the way to zero. Thus, LASSO regression aims to find the subset of predictors that minimizes the prediction error for a quantitative response variable. On the other hand, ridge regression uses L 2 regularization [ 82 ], which penalizes the “squared magnitude of coefficients” ( L 2 penalty). Thus, ridge regression forces the weights to be small but never sets a coefficient value to zero, yielding a non-sparse solution. Overall, LASSO regression is useful for obtaining a subset of predictors by eliminating less important features, and ridge regression is useful when a data set has “multicollinearity”, i.e., predictors that are correlated with other predictors.
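The contrast between the two penalties can be shown in a few lines, assuming scikit-learn; the synthetic data, where only two of ten features matter, are illustrative.

```python
# L1 vs. L2 shrinkage: LASSO drives irrelevant coefficients to exactly zero,
# while ridge only shrinks them toward zero.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)  # 2 relevant features

lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

print("LASSO zero coefficients:", int(np.sum(lasso.coef_ == 0)))   # most are exactly 0
print("ridge zero coefficients:", int(np.sum(ridge.coef_ == 0)))   # typically none
```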
Cluster Analysis
Cluster analysis, also known as clustering, is an unsupervised machine learning technique for identifying and grouping related data points in large datasets without concern for a specific outcome. It groups a collection of objects in such a way that objects in the same category, called a cluster, are in some sense more similar to each other than objects in other groups [ 41 ]. It is often used as a data analysis technique to discover interesting trends or patterns in data, e.g., groups of consumers based on their behavior. Clustering can be used in a broad range of application areas, such as cybersecurity, e-commerce, mobile data processing, health analytics, user modeling, and behavioral analytics. In the following, we briefly discuss and summarize various types of clustering methods.
Partitioning methods: Based on the features and similarities in the data, this clustering approach categorizes the data into multiple groups or clusters. Data scientists or analysts typically determine the number of clusters to produce, either dynamically or statically, depending on the nature of the target application. The most common clustering algorithms based on partitioning methods are K-means [ 69 ], K-medoids [ 80 ], CLARA [ 55 ], etc.
Density-based methods: To identify distinct groups or clusters, this approach uses the concept that a cluster in the data space is a contiguous region of high point density, isolated from other such clusters by contiguous regions of low point density. Points that are not part of a cluster are considered noise. The typical density-based clustering algorithms are DBSCAN [ 32 ], OPTICS [ 12 ], etc. Density-based methods typically struggle with clusters of similar density and with high-dimensional data.
Hierarchical-based methods: Hierarchical clustering typically seeks to construct a hierarchy of clusters, i.e., a tree structure. Strategies for hierarchical clustering generally fall into two types: (i) agglomerative, a “bottom-up” approach in which each observation begins in its own cluster and pairs of clusters are merged as one moves up the hierarchy, and (ii) divisive, a “top-down” approach in which all observations begin in one cluster and splits are performed recursively as one moves down the hierarchy, as shown in Fig. 7 . Our earlier proposed BOTS technique, Sarker et al. [ 102 ], is an example of a hierarchical, particularly bottom-up, clustering algorithm.
Grid-based methods: To deal with massive datasets, grid-based clustering is especially suitable. To obtain clusters, the principle is first to summarize the dataset with a grid representation and then to combine grid cells. STING [ 122 ], CLIQUE [ 6 ], etc. are the standard algorithms of grid-based clustering.
Model-based methods: There are mainly two types of model-based clustering algorithms: one that uses statistical learning, and the other based on a method of neural network learning [ 130 ]. For instance, GMM [ 89 ] is an example of a statistical learning method, and SOM [ 22 , 96 ] is an example of a neural network learning method.
Constraint-based methods: Constrained-based clustering is a semi-supervised approach to data clustering that uses constraints to incorporate domain knowledge. Application or user-oriented constraints are incorporated to perform the clustering. The typical algorithms of this kind of clustering are COP K-means [ 121 ], CMWK-Means [ 27 ], etc.
Fig. 7: A graphical interpretation of the widely used hierarchical clustering (bottom-up and top-down) techniques.
Many clustering algorithms with the ability to group data have been proposed in the machine learning and data science literature [ 41 , 125 ]. In the following, we summarize the popular methods that are used widely in various application areas.
K-means clustering: K-means clustering [ 69 ] is a fast, robust, and simple algorithm that provides reliable results when data sets are well separated from each other. Data points are allocated to clusters in such a way that the sum of the squared distances between the data points and the centroid is as small as possible. In other words, the K-means algorithm identifies k centroids and then assigns each data point to the nearest one, keeping the clusters as compact as possible. Since it begins with a random selection of cluster centers, the results can be inconsistent. Since extreme values can easily affect a mean, the K-means clustering algorithm is sensitive to outliers. K-medoids clustering [ 91 ] is a variant of K-means that is more robust to noise and outliers.
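A minimal K-means sketch follows, assuming scikit-learn; the synthetic blobs are illustrative.

```python
# K-means assigns each point to the nearest of k centroids, minimizing the
# within-cluster sum of squared distances (reported as inertia_).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)   # the three learned centroids
print(km.inertia_)           # sum of squared distances to the nearest centroid
```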
Mean-shift clustering: Mean-shift clustering [ 37 ] is a nonparametric clustering technique that does not require prior knowledge of the number of clusters or constraints on cluster shape. Mean-shift clustering aims to discover “blobs” in a smooth distribution or density of samples [ 82 ]. It is a centroid-based algorithm that works by updating centroid candidates to be the mean of the points in a given region. To form the final set of centroids, these candidates are filtered in a post-processing stage to remove near-duplicates. Cluster analysis in computer vision and image processing is an example application domain. Mean shift has the disadvantage of being computationally expensive. Moreover, in cases of high dimension, where the number of clusters shifts abruptly, the mean-shift algorithm does not work well.
DBSCAN: Density-based spatial clustering of applications with noise (DBSCAN) [ 32 ] is the base algorithm for density-based clustering and is widely used in data mining and machine learning. It is a non-parametric, density-based clustering technique for separating high-density clusters from low-density clusters during model building. DBSCAN’s main idea is that a point belongs to a cluster if it is close to many points from that cluster. It can find clusters of various shapes and sizes in vast volumes of data that are noisy and contain outliers. Unlike k-means, DBSCAN does not require a priori specification of the number of clusters in the data and can find arbitrarily shaped clusters. Although k-means is much faster than DBSCAN, DBSCAN is efficient at finding high-density regions and outliers, i.e., it is robust to outliers.
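A brief DBSCAN sketch on two interleaved half-moons follows, assuming scikit-learn; the eps and min_samples settings are illustrative choices, not recommendations.

```python
# DBSCAN finds arbitrarily shaped clusters without being told k, and labels
# low-density points as noise (cluster label -1).
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
print(np.unique(labels))   # e.g., [-1 0 1]: two moon-shaped clusters plus noise
```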
GMM clustering: Gaussian mixture models (GMMs) are often used for data clustering, which is a distribution-based clustering algorithm. A Gaussian mixture model is a probabilistic model in which all the data points are produced by a mixture of a finite number of Gaussian distributions with unknown parameters [ 82 ]. To find the Gaussian parameters for each cluster, an optimization algorithm called expectation-maximization (EM) [ 82 ] can be used. EM is an iterative method that uses a statistical model to estimate the parameters. In contrast to k-means, Gaussian mixture models account for uncertainty and return the likelihood that a data point belongs to one of the k clusters. GMM clustering is more robust than k-means and works well even with non-linear data distributions.
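A short sketch of GMM's soft assignments follows, assuming scikit-learn; the synthetic blobs are illustrative.

```python
# Unlike k-means' hard labels, a GMM returns per-point membership
# probabilities over the k Gaussian components.
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
print(gmm.predict_proba(X[:3]).round(3))   # probabilities over the 3 components
```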
Agglomerative hierarchical clustering: The most common method of hierarchical clustering used to group objects in clusters based on their similarity is agglomerative clustering. This technique uses a bottom-up approach, where each object is first treated as a singleton cluster by the algorithm. Following that, pairs of clusters are merged one by one until all clusters have been merged into a single large cluster containing all objects. The result is a dendrogram, which is a tree-based representation of the elements. Single linkage [ 115 ], Complete linkage [ 116 ], BOTS [ 102 ] etc. are some examples of such techniques. The main advantage of agglomerative hierarchical clustering over k-means is that the tree-structure hierarchy generated by agglomerative clustering is more informative than the unstructured collection of flat clusters returned by k-means, which can help to make better decisions in the relevant application areas.
Dimensionality Reduction and Feature Learning
In machine learning and data science, high-dimensional data processing is a challenging task for both researchers and application developers. Thus, dimensionality reduction, which is an unsupervised learning technique, is important because it leads to better human interpretation, lowers computational costs, and avoids overfitting and redundancy by simplifying models. Both feature selection and feature extraction can be used for dimensionality reduction. The primary distinction between the two is that “feature selection” keeps a subset of the original features [ 97 ], while “feature extraction” creates brand-new ones [ 98 ]. In the following, we briefly discuss these techniques.
Feature selection: The selection of features, also known as the selection of variables or attributes in the data, is the process of choosing a subset of unique features (variables, predictors) to use in building a machine learning and data science model. It decreases a model’s complexity by eliminating irrelevant or less important features and allows for faster training of machine learning algorithms. A right and optimal subset of the selected features in a problem domain is capable of minimizing the overfitting problem by simplifying and generalizing the model, and it increases the model’s accuracy [ 97 ]. Thus, “feature selection” [ 66 , 99 ] is considered one of the primary concepts in machine learning, greatly affecting the effectiveness and efficiency of the target machine learning model. The chi-squared test, analysis of variance (ANOVA) test, Pearson’s correlation coefficient, and recursive feature elimination are some popular techniques that can be used for feature selection.
Feature extraction: In a machine learning-based model or system, feature extraction techniques usually provide a better understanding of the data, a way to improve prediction accuracy, and a reduction in computational cost or training time. The aim of "feature extraction" [66, 99] is to reduce the number of features in a dataset by generating new features from the existing ones and then discarding the originals. The new, reduced set of features can then summarize most of the information contained in the original set. For instance, principal component analysis (PCA) is often used as a dimensionality-reduction technique to extract a lower-dimensional space by creating new components from the existing features in a dataset [98].
Many algorithms have been proposed to reduce data dimensions in the machine learning and data science literature [ 41 , 125 ]. In the following, we summarize the popular methods that are used widely in various application areas.
Variance threshold: A simple, basic approach to feature selection is the variance threshold [82]. This excludes all features of low variance, i.e., all features whose variance does not exceed the threshold. By default, it eliminates all zero-variance features, i.e., features that have the same value in all samples. This feature selection algorithm looks only at the features (X), not the desired outputs (y), and can therefore be used for unsupervised learning.
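A minimal scikit-learn sketch (the toy matrix is ours, purely illustrative) in which a zero-variance column is dropped:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

X = np.array([[0.0, 2.0, 0.1],
              [0.0, 1.0, 0.2],
              [0.0, 3.0, 0.1]])  # the first column never varies

# threshold=0.0 (the default) removes only zero-variance features.
X_reduced = VarianceThreshold(threshold=0.0).fit_transform(X)
print(X_reduced.shape)  # (3, 2): the constant column is gone
```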
Pearson correlation: Pearson's correlation is another method for understanding a feature's relation to the response variable and can be used for feature selection [99]. This method is also used to find associations between features in a dataset. The resulting value lies in \([-1, 1]\), where \(-1\) means perfect negative correlation, \(+1\) means perfect positive correlation, and 0 means that the two variables do not have a linear correlation. If X and Y are two random variables with observations \(x_i\) and \(y_i\) and sample means \(\bar{x}\) and \(\bar{y}\), then the correlation coefficient between X and Y is defined as [41]

$$r(X, Y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}.$$
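As a quick numerical check of the formula, a minimal NumPy sketch (the sample values are ours, purely illustrative):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Direct implementation of the formula above.
r = np.sum((x - x.mean()) * (y - y.mean())) / (
    np.sqrt(np.sum((x - x.mean()) ** 2)) * np.sqrt(np.sum((y - y.mean()) ** 2))
)
# NumPy's built-in equivalent.
r_builtin = np.corrcoef(x, y)[0, 1]
```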
ANOVA: Analysis of variance (ANOVA) is a statistical tool used to test whether the mean values of two or more groups differ significantly from each other. ANOVA assumes a linear relationship between the variables and the target, as well as normally distributed variables. To statistically test the equality of means, the ANOVA method utilizes F tests. For feature selection, the resulting "ANOVA F value" [82] of this test can be used, whereby features that are independent of the goal variable can be omitted.
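A minimal scikit-learn sketch (the Iris dataset and k=2 are illustrative choices) that scores features by their ANOVA F-value:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# f_classif computes the ANOVA F-value between each feature and the target.
selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)
print(selector.scores_)         # F-value per feature
X_top2 = selector.transform(X)  # keep the two highest-scoring features
```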
Chi square: The chi-square \({\chi }^2\) [82] statistic estimates the difference between the observed and expected frequencies of a series of events or variables. The value of \({\chi }^2\) depends on the magnitude of the difference between the observed and expected values, the degrees of freedom, and the sample size. The chi-square \({\chi }^2\) test is commonly used for testing relationships between categorical variables. If \(O_i\) represents the observed value and \(E_i\) represents the expected value, then

$${\chi }^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}.$$
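A minimal SciPy sketch (the frequency counts are ours, purely illustrative) that computes the statistic above:

```python
from scipy.stats import chisquare

observed = [18, 22, 30, 30]  # O_i: observed frequencies
expected = [25, 25, 25, 25]  # E_i: expected frequencies under the null hypothesis

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(stat, p_value)  # a large statistic / small p-value suggests a real association
```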
Recursive feature elimination (RFE): Recursive feature elimination (RFE) is a brute-force approach to feature selection. RFE [82] repeatedly fits the model and removes the weakest feature until the specified number of features is reached. Features are ranked by the model's coefficients or feature importances. By recursively removing a small number of features per iteration, RFE aims to eliminate dependencies and collinearity in the model.
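A minimal scikit-learn sketch (the dataset, the estimator, and the target of 5 features are illustrative choices):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Repeatedly fit the model and drop the weakest feature until 5 remain.
rfe = RFE(estimator=LogisticRegression(max_iter=5000), n_features_to_select=5)
rfe.fit(X, y)
print(rfe.support_)   # boolean mask of the retained features
print(rfe.ranking_)   # 1 = selected; larger values were eliminated earlier
```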
Model-based selection: To reduce the dimensionality of the data, linear models penalized with L1 regularization can be used. Least absolute shrinkage and selection operator (Lasso) regression is a type of linear regression that has the property of shrinking some of the coefficients to zero [82]; the corresponding features can then be removed from the model. Thus, the penalized lasso regression method is often used in machine learning to select a subset of variables. The Extra Trees classifier [82] is an example of a tree-based estimator that can compute impurity-based feature importances, which can then be used to discard irrelevant features.
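A minimal scikit-learn sketch (dataset and the alpha penalty strength are illustrative) using lasso coefficients for selection:

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = load_diabetes(return_X_y=True)

# The L1 penalty shrinks some coefficients exactly to zero.
lasso = Lasso(alpha=0.1).fit(X, y)
selector = SelectFromModel(lasso, prefit=True)
X_selected = selector.transform(X)  # keeps only features with nonzero weights
```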
Principal component analysis (PCA): Principal component analysis (PCA) is a well-known unsupervised learning approach in the field of machine learning and data science. PCA is a mathematical technique that transforms a set of correlated variables into a set of uncorrelated variables known as principal components [48, 81]. Figure 8 shows an example of the effect of PCA in spaces of various dimensions: Fig. 8a shows the original features in 3D space, and Fig. 8b shows the resulting principal components projected onto a 2D plane (PC1 and PC2) and onto a 1D line (PC1 alone), respectively. Thus, PCA can be used as a feature extraction technique that reduces the dimensionality of a dataset in order to build an effective machine learning model [98]. Technically, PCA identifies the eigenvectors of the covariance matrix with the highest eigenvalues and then uses those to project the data into a new subspace of equal or fewer dimensions [82].
An example of a principal component analysis (PCA) and created principal components PC1 and PC2 in different dimension space
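To make the projection concrete, a minimal scikit-learn sketch (the dataset choice is illustrative) that reduces four features to two principal components:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

# Project the 4-dimensional features onto the first two principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print(pca.explained_variance_ratio_)  # variance captured by PC1 and PC2
```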
Association Rule Learning
Association rule learning is a rule-based machine learning approach for discovering interesting relationships, expressed as "IF-THEN" statements, between variables in large datasets [7]. One example is that "if a customer buys a computer or laptop (an item), s/he is likely to also buy anti-virus software (another item) at the same time". Association rules are employed today in many application areas, including IoT services, medical diagnosis, usage behavior analytics, web usage mining, smartphone applications, cybersecurity applications, and bioinformatics. In comparison to sequence mining, association rule learning does not usually take into account the order of items within or across transactions. A common way of measuring the usefulness of an association rule is to use its "support" and "confidence" parameters, introduced in [7].
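For a rule \(X \Rightarrow Y\) over a set of transactions T, the standard definitions of these two parameters, as commonly stated in the association rule mining literature, are:

$$\mathrm{support}(X \Rightarrow Y) = \frac{|\{\,t \in T : X \cup Y \subseteq t\,\}|}{|T|}, \qquad \mathrm{confidence}(X \Rightarrow Y) = \frac{\mathrm{support}(X \cup Y)}{\mathrm{support}(X)}.$$

That is, support measures how frequently the items occur together, while confidence measures how often the rule has been found to be true.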
In the data mining literature, many association rule learning methods have been proposed, such as logic dependent [ 34 ], frequent pattern based [ 8 , 49 , 68 ], and tree-based [ 42 ]. The most popular association rule learning algorithms are summarized below.
AIS and SETM: AIS is the first algorithm proposed by Agrawal et al. [7] for association rule mining. The AIS algorithm's main downside is that it generates too many candidate itemsets, requiring more space and wasting a lot of effort. The algorithm also requires too many passes over the entire dataset to produce the rules. Another approach, SETM [49], exhibits good performance and stable behavior with respect to execution time; however, it suffers from the same flaw as the AIS algorithm.
Apriori: For generating association rules from a given dataset, Agrawal et al. [8] proposed the Apriori, Apriori-TID, and Apriori-Hybrid algorithms. These later algorithms outperform AIS and SETM, mentioned above, due to the Apriori property of frequent itemsets [8]. The term "Apriori" usually refers to having prior knowledge of frequent itemset properties. Apriori uses a "bottom-up" approach to generate the candidate itemsets. To reduce the search space, Apriori uses the property that "all subsets of a frequent itemset must be frequent; and if an itemset is infrequent, then all its supersets must also be infrequent". Another approach, predictive Apriori [108], can also generate rules; however, it may produce unexpected results because it combines both support and confidence. Apriori [8] is among the most widely applied techniques in mining association rules.
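As an illustration of Apriori in practice, a minimal sketch using the third-party mlxtend library (the toy transactions and the support/confidence thresholds are our own illustrative choices):

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

transactions = [["laptop", "antivirus"],
                ["laptop", "mouse"],
                ["laptop", "antivirus", "mouse"],
                ["mouse"]]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)

frequent = apriori(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```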
ECLAT: This technique was proposed by Zaki et al. [ 131 ] and stands for Equivalence Class Clustering and bottom-up Lattice Traversal. ECLAT uses a depth-first search to find frequent itemsets. In contrast to the Apriori [ 8 ] algorithm, which represents data in a horizontal pattern, it represents data vertically. Hence, the ECLAT algorithm is more efficient and scalable in the area of association rule learning. This algorithm is better suited for small and medium datasets whereas the Apriori algorithm is used for large datasets.
FP-Growth: Another common association rule learning technique, based on the frequent-pattern tree (FP-tree) and proposed by Han et al. [42], is frequent pattern growth, known as FP-Growth. The key difference from Apriori is that while generating rules, the Apriori algorithm [8] generates frequent candidate itemsets, whereas the FP-Growth algorithm [42] avoids candidate generation and instead builds a tree using a "divide and conquer" strategy. Due to its sophistication, however, the FP-tree is challenging to use in an interactive mining environment [133]. Moreover, the FP-tree may not fit into memory for massive datasets, making it challenging to process big data as well. Another solution is RARM (Rapid Association Rule Mining), proposed by Das et al. [26], but it faces a related FP-tree issue [133].
ABC-RuleMiner: ABC-RuleMiner is a rule-based machine learning method, proposed in our earlier paper, Sarker et al. [104], that discovers interesting non-redundant rules to provide real-world intelligent services. This algorithm effectively identifies redundancy in associations by taking into account the impact or precedence of the related contextual features and discovers a set of non-redundant association rules. It first constructs an association generation tree (AGT) in a top-down fashion and then extracts the association rules by traversing the tree. Thus, ABC-RuleMiner is more potent than traditional rule-based methods in terms of both non-redundant rule generation and intelligent decision-making, particularly in a context-aware smart computing environment where human or user preferences are involved.
Among the association rule learning techniques discussed above, Apriori [8] is the most widely used algorithm for discovering association rules from a given dataset [133]. The main strength of association rule learning is its comprehensiveness, as it generates all associations that satisfy the user-specified constraints, such as minimum support and confidence values. The ABC-RuleMiner approach [104] discussed earlier can give significant results in terms of non-redundant rule generation and intelligent decision-making for the relevant real-world application areas.
Reinforcement Learning
Reinforcement learning (RL) is a machine learning technique that allows an agent to learn by trial and error in an interactive environment, using feedback from its own actions and experiences. Unlike supervised learning, which is based on given sample data or examples, the RL method learns by interacting with the environment. The problem to be solved in reinforcement learning (RL) is defined as a Markov decision process (MDP) [86], i.e., it is all about making decisions sequentially. An RL problem typically includes four elements: an agent, an environment, rewards, and a policy.
RL can be split roughly into model-based and model-free techniques. Model-based RL infers optimal behavior from a model of the environment, built by performing actions and observing the results, including the next state and the immediate reward [85]. AlphaGo and AlphaZero [113] are examples of model-based approaches. On the other hand, a model-free approach does not use the transition probability distribution and the reward function associated with the MDP. Q-learning, Deep Q Network, Monte Carlo Control, SARSA (State–Action–Reward–State–Action), etc. are some examples of model-free algorithms [52]. The key difference between the two families is the explicit model of the environment's dynamics, which is required for model-based RL but not for model-free RL. In the following, we discuss the popular RL algorithms.
Monte Carlo methods: Monte Carlo techniques, or Monte Carlo experiments, are a wide category of computational algorithms that rely on repeated random sampling to obtain numerical results [52]. The underlying concept is to use randomness to solve problems that are deterministic in principle. Optimization, numerical integration, and generating draws from a probability distribution are the three problem classes where Monte Carlo techniques are most commonly used.
Q-learning: Q-learning is a model-free reinforcement learning algorithm for learning the quality of actions, which tells an agent what action to take under what circumstances [52]. It does not need a model of the environment (hence the term "model-free"), and it can deal with stochastic transitions and rewards without the need for adaptations. The "Q" in Q-learning usually stands for quality, as the algorithm calculates the maximum expected future reward for a given action in a given state.
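The core of the algorithm is the update rule \(Q(s, a) \leftarrow Q(s, a) + \alpha\,[r + \gamma \max_{a'} Q(s', a') - Q(s, a)]\), where \(\alpha\) is the learning rate and \(\gamma\) the discount factor. A minimal tabular sketch follows; the `env` object and its `reset()`/`step()` interface are hypothetical placeholders for a discrete toy environment:

```python
import random
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning; `env` is a hypothetical object exposing
    reset() -> state and step(action) -> (next_state, reward, done)."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy exploration.
            if random.random() < epsilon:
                action = random.randrange(n_actions)
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, done = env.step(action)
            # Move Q(s, a) toward r + gamma * max_a' Q(s', a').
            Q[state, action] += alpha * (
                reward + gamma * np.max(Q[next_state]) - Q[state, action]
            )
            state = next_state
    return Q
```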
Deep Q-learning: Tabular Q-learning works well when the setting is reasonably simple. However, when the number of states and actions becomes large, a deep neural network can be used as a function approximator. The basic working step in deep Q-learning [52] is that the current state is fed into the neural network, which returns the Q-values of all possible actions as output.
Reinforcement learning, along with supervised and unsupervised learning, is one of the basic machine learning paradigms. RL can be used to solve numerous real-world problems in various fields, such as game theory, control theory, operations research, information theory, simulation-based optimization, manufacturing, supply chain logistics, multi-agent systems, swarm intelligence, aircraft control, robot motion control, and many more.
Artificial Neural Network and Deep Learning
Deep learning is part of a wider family of artificial neural network (ANN)-based machine learning approaches with representation learning. Deep learning provides a computational architecture by combining several processing layers, such as input, hidden, and output layers, to learn from data [41]. The main advantage of deep learning over traditional machine learning methods is its better performance in several cases, particularly when learning from large datasets [105, 129]. Figure 9 shows the general performance of deep learning versus machine learning as the amount of data increases. However, performance may vary depending on the data characteristics and experimental setup.
Machine learning and deep learning performance in general with the amount of data
The most common deep learning algorithms are the multi-layer perceptron (MLP), the convolutional neural network (CNN, or ConvNet), and the long short-term memory recurrent neural network (LSTM-RNN) [96]. In the following, we discuss these types of deep learning methods, which can be used to build effective data-driven models for various purposes.
A structure of an artificial neural network modeling with multiple processing layers
MLP: The base architecture of deep learning, also known as the feed-forward artificial neural network, is called a multilayer perceptron (MLP) [82]. A typical MLP is a fully connected network consisting of an input layer, one or more hidden layers, and an output layer, as shown in Fig. 10. Each node in one layer connects, with a certain weight, to every node in the following layer. MLP utilizes the "backpropagation" technique [41], the most fundamental building block of a neural network, to adjust the weight values internally while building the model. MLP is sensitive to feature scaling and allows a variety of hyperparameters to be tuned, such as the number of hidden layers, neurons, and iterations, which can make the model computationally costly.
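A minimal scikit-learn sketch (the digits dataset and the layer sizes are illustrative hyperparameter choices):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Two hidden layers (64 and 32 neurons); weights are adjusted via backpropagation.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300, random_state=42)
mlp.fit(X_train, y_train)
print(mlp.score(X_test, y_test))  # held-out accuracy
```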
CNN or ConvNet: The convolutional neural network (CNN) [65] enhances the design of the standard ANN, consisting of convolutional layers, pooling layers, and fully connected layers, as shown in Fig. 11. Because it takes advantage of the two-dimensional (2D) structure of the input data, it is broadly used in areas such as image and video recognition, image processing and classification, medical image analysis, and natural language processing. Although CNN has a greater computational burden, it has the advantage of automatically detecting the important features without any manual intervention; hence, CNN is considered more powerful than a conventional ANN. A number of advanced deep learning models based on CNN can be used in the field, such as AlexNet [60], Xception [24], Inception [118], Visual Geometry Group (VGG) [44], ResNet [45], etc.
LSTM-RNN: Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the area of deep learning [ 38 ]. LSTM has feedback links, unlike normal feed-forward neural networks. LSTM networks are well-suited for analyzing and learning sequential data, such as classifying, processing, and predicting data based on time series data, which differentiates it from other conventional networks. Thus, LSTM can be used when the data are in a sequential format, such as time, sentence, etc., and commonly applied in the area of time-series analysis, natural language processing, speech recognition, etc.
An example of a convolutional neural network (CNN or ConvNet) including multiple convolution and pooling layers
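As an illustration of such an architecture, a minimal TensorFlow/Keras sketch of a small CNN in the spirit of the figure above (the input shape, layer sizes, and class count are illustrative assumptions):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Stacked convolution + pooling blocks followed by fully connected layers,
# sized here for 28x28 grayscale images and 10 classes.
model = tf.keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```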
In addition to the most common deep learning methods discussed above, several other deep learning approaches [96] exist for various purposes. For instance, the self-organizing map (SOM) [58] uses unsupervised learning to represent high-dimensional data as a 2D grid map, thus achieving dimensionality reduction. The autoencoder (AE) [15] is another learning technique widely used for dimensionality reduction as well as feature extraction in unsupervised learning tasks. Restricted Boltzmann machines (RBM) [46] can be used for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modeling. A deep belief network (DBN) is typically composed of simple, unsupervised networks, such as restricted Boltzmann machines (RBMs) or autoencoders, and a backpropagation neural network (BPNN) [123]. A generative adversarial network (GAN) [39] is a form of deep learning network that can generate data with characteristics close to the actual input data. Transfer learning, typically the re-use of a pre-trained model on a new problem, is currently very common because it can train deep neural networks with comparatively little data [124]. A brief discussion of these artificial neural network (ANN) and deep learning (DL) models is summarized in our earlier paper, Sarker et al. [96].
Overall, based on the learning techniques discussed above, we can conclude that various types of machine learning techniques, such as classification analysis, regression, data clustering, feature selection and extraction, dimensionality reduction, association rule learning, reinforcement learning, and deep learning, can play a significant role for various purposes according to their capabilities. In the following section, we discuss several application areas based on machine learning algorithms.
Applications of Machine Learning
In the current age of the Fourth Industrial Revolution (4IR), machine learning has become popular in various application areas because of its capability to learn from past data and make intelligent decisions. In the following, we summarize and discuss ten popular application areas of machine learning technology.
Predictive analytics and intelligent decision-making: A major application field of machine learning is intelligent decision-making through data-driven predictive analytics [21, 70]. The basis of predictive analytics is capturing and exploiting relationships between explanatory variables and predicted variables from previous events to predict the unknown outcome [41]. Examples include identifying suspects or criminals after a crime has been committed and detecting credit card fraud as it happens. In another application, machine learning algorithms can assist retailers in better understanding consumer preferences and behavior, better managing inventory, avoiding out-of-stock situations, and optimizing logistics and warehousing in e-commerce. Various machine learning algorithms, such as decision trees, support vector machines, and artificial neural networks [106, 125], are commonly used in the area. Since accurate predictions provide insight into the unknown, they can improve the decisions of industries, businesses, and almost any organization, including government agencies, e-commerce, telecommunications, banking and financial services, healthcare, sales and marketing, transportation, social networking, and many others.
Cybersecurity and threat intelligence: Cybersecurity, typically the practice of protecting networks, systems, hardware, and data from digital attacks, is one of the most essential areas of Industry 4.0 [114]. Machine learning has become a crucial cybersecurity technology that constantly learns by analyzing data to identify patterns, better detect malware in encrypted traffic, find insider threats, predict where "bad neighborhoods" are online, keep people safe while browsing, and secure data in the cloud by uncovering suspicious activity. For instance, clustering techniques can be used to identify cyber-anomalies, policy violations, etc. To detect various types of cyber-attacks or intrusions, machine learning classification models that take into account the impact of security features are useful [97]. Various deep learning-based security models can also be applied to large-scale security datasets [96, 129]. Moreover, security policy rules generated by association rule learning techniques can play a significant role in building rule-based security systems [105]. Thus, various learning techniques discussed in Sect. “Machine Learning Tasks and Algorithms” can enable cybersecurity professionals to be more proactive in efficiently preventing threats and cyber-attacks.
Internet of things (IoT) and smart cities: The Internet of Things (IoT) is another essential area of Industry 4.0 [114]; it turns everyday objects into smart objects by allowing them to transmit data and automate tasks without the need for human interaction. IoT is, therefore, considered to be the big frontier that can enhance almost all activities in our lives, such as smart governance, smart home, education, communication, transportation, retail, agriculture, health care, business, and many more [70]. The smart city is one of IoT's core fields of application, using technologies to enhance city services and residents' living experiences [132, 135]. As machine learning utilizes experience to recognize trends and create models that help predict future behavior and events, it has become a crucial technology for IoT applications [103]. For example, predicting traffic in smart cities, predicting parking availability, estimating citizens' total energy usage for a particular period, and making context-aware and timely decisions for people are some tasks that can be solved using machine learning techniques according to the current needs of the people.
Traffic prediction and transportation: Transportation systems have become a crucial component of every country's economic development. Nonetheless, several cities around the world are experiencing an excessive rise in traffic volume, resulting in serious issues such as delays, traffic congestion, higher fuel prices, increased CO\(_2\) pollution, accidents, emergencies, and a decline in modern society's quality of life [40]. Thus, an intelligent transportation system that predicts future traffic is an indispensable part of a smart city. Accurate traffic prediction based on machine and deep learning models can help to minimize these issues [17, 30, 31]. For example, based on travel history and trends across various routes, machine learning can assist transportation companies in predicting possible issues that may occur on specific routes and recommending that their customers take a different route. Ultimately, these learning-based data-driven models help improve traffic flow, increase the usage and efficiency of sustainable modes of transportation, and limit real-world disruption by modeling and visualizing future changes.
Healthcare and COVID-19 pandemic: Machine learning can help to solve diagnostic and prognostic problems in a variety of medical domains, such as disease prediction, medical knowledge extraction, detecting regularities in data, patient management, etc. [33, 77, 112]. Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus, according to the World Health Organization (WHO) [3]. Recently, learning techniques have become popular in the battle against COVID-19 [61, 63]. For the COVID-19 pandemic, learning techniques are used to classify patients at high risk, estimate their mortality rate, and identify other anomalies [61]. They can also be used to better understand the virus's origin, predict COVID-19 outbreaks, and support disease diagnosis and treatment [14, 50]. With the help of machine learning, researchers can forecast where and when COVID-19 is likely to spread and notify those regions to make the required arrangements. Deep learning also provides exciting solutions to the problems of medical image processing and is seen as a crucial technique for potential applications, particularly for the COVID-19 pandemic [10, 78, 111]. Overall, machine and deep learning techniques can help to fight the COVID-19 virus and the pandemic, as well as support intelligent clinical decision-making in the domain of healthcare.
E-commerce and product recommendations: Product recommendation is one of the most well-known and widely used applications of machine learning, and it is one of the most prominent features of almost any e-commerce website today. Machine learning technology can assist businesses in analyzing their consumers' purchasing histories and making customized product suggestions for their next purchase based on their behavior and preferences. E-commerce companies, for example, can easily position product suggestions and offers by analyzing browsing trends and click-through rates of specific items. Using predictive modeling based on machine learning techniques, many online retailers, such as Amazon [71], can better manage inventory, prevent out-of-stock situations, and optimize logistics and warehousing. The future of sales and marketing is the ability to capture, evaluate, and use consumer data to provide a customized shopping experience. Furthermore, machine learning techniques enable companies to create packages and content tailored to the needs of their customers, allowing them to retain existing customers while attracting new ones.
NLP and sentiment analysis: Natural language processing (NLP) involves the reading and understanding of spoken or written language through the medium of a computer [79, 103]. Thus, NLP helps computers, for instance, to read a text, hear speech, interpret it, analyze sentiment, and decide which aspects are significant, where machine learning techniques can be used. Virtual personal assistants, chatbots, speech recognition, document description, and language or machine translation are some examples of NLP-related tasks. Sentiment analysis [90] (also referred to as opinion mining or emotion AI) is an NLP sub-field that seeks to identify and extract public mood and views within a given text through blogs, reviews, social media, forums, news, etc. For instance, businesses and brands use sentiment analysis to understand the social sentiment of their brand, product, or service through social media platforms or the web as a whole. Overall, sentiment analysis is considered a machine learning task that analyzes texts for polarity, such as “positive”, “negative”, or “neutral”, along with more intense emotions, such as very happy, happy, sad, very sad, angry, interested, or not interested.
Image, speech and pattern recognition: Image recognition [36] is a well-known and widespread example of machine learning in the real world that can identify an object in a digital image. For instance, labeling an x-ray as cancerous or not, character recognition, face detection in an image, and tagging suggestions on social media, e.g., Facebook, are common examples of image recognition. Speech recognition [23] is also very popular; it typically uses sound and linguistic models, e.g., Google Assistant, Cortana, Siri, Alexa, etc. [67], where machine learning methods are used. Pattern recognition [13] is defined as the automated recognition of patterns and regularities in data, e.g., image analysis. Several machine learning techniques, such as classification, feature selection, clustering, and sequence labeling methods, are used in the area.
Sustainable agriculture: Agriculture is essential to the survival of all human activities [109]. Sustainable agriculture practices help to improve agricultural productivity while also reducing negative impacts on the environment [5, 25, 109]. Sustainable agriculture supply chains are knowledge-intensive and based on information, skills, technologies, etc., where knowledge transfer encourages farmers to improve their decisions to adopt sustainable agriculture practices, utilizing the increasing amount of data captured by emerging technologies, e.g., the Internet of Things (IoT), mobile technologies and devices, etc. [5, 53, 54]. Machine learning can be applied in various phases of sustainable agriculture: in the pre-production phase, for the prediction of crop yield, soil properties, irrigation requirements, etc.; in the production phase, for weather prediction, disease detection, weed detection, soil nutrient management, livestock management, etc.; in the processing phase, for demand estimation, production planning, etc.; and in the distribution phase, for inventory management, consumer analysis, etc.
User behavior analytics and context-aware smartphone applications: Context-awareness is a system's ability to capture knowledge about its surroundings at any moment and modify its behavior accordingly [28, 93]. Context-aware computing uses software and hardware to automatically collect and interpret data for direct responses. The mobile app development environment has changed greatly with the power of AI, particularly machine learning techniques, through their ability to learn from contextual data [103, 136]. Thus, developers of mobile apps can rely on machine learning to create smart apps that can understand human behavior and support and entertain users [107, 137, 140]. Machine learning techniques are applicable for building various personalized data-driven context-aware systems, such as smart interruption management, smart mobile recommendation, context-aware smart searching, and decision-making that intelligently assists end mobile phone users in a pervasive computing environment. For example, context-aware association rules can be used to build an intelligent phone call application [104]. Clustering approaches are useful for capturing users' diverse behavioral activities by taking into account time-series data [102]. To predict future events in various contexts, classification methods can be used [106, 139]. Thus, various learning techniques discussed in Sect. “Machine Learning Tasks and Algorithms” can help to build context-aware, adaptive, and smart applications according to the preferences of mobile phone users.
In addition to these application areas, machine learning-based models can also apply to several other domains such as bioinformatics, cheminformatics, computer networks, DNA sequence classification, economics and banking, robotics, advanced engineering, and many more.
Challenges and Research Directions
Our study on machine learning algorithms for intelligent data analysis and applications opens several research issues in the area. Thus, in this section, we summarize and discuss the challenges faced and the potential research opportunities and future directions.
In general, the effectiveness and efficiency of a machine learning-based solution depend on the nature and characteristics of the data and the performance of the learning algorithms. Collecting data in relevant domains, such as cybersecurity, IoT, healthcare, and agriculture, discussed in Sect. “Applications of Machine Learning”, is not straightforward, although the current cyberspace enables the production of a huge amount of data with very high frequency. Thus, collecting useful data for target machine learning-based applications, e.g., smart city applications, and managing those data are important for further analysis. Therefore, a more in-depth investigation of data collection methods is needed when working on real-world data. Moreover, historical data may contain many ambiguous values, missing values, outliers, and meaningless data. The machine learning algorithms discussed in Sect. “Machine Learning Tasks and Algorithms” depend highly on the quality and availability of the data for training, and consequently so does the resultant model. Thus, accurately cleaning and pre-processing the diverse data collected from diverse sources is a challenging task. Therefore, effectively modifying or enhancing existing pre-processing methods, or proposing new data preparation techniques, is required to use the learning algorithms effectively in the associated application domain.
To analyze the data and extract insights, there exist many machine learning algorithms, summarized in Sect. “Machine Learning Tasks and Algorithms”. Thus, selecting a proper learning algorithm that is suitable for the target application is challenging. The reason is that the outcome of different learning algorithms may vary depending on the data characteristics [106]. Selecting the wrong learning algorithm would produce unexpected outcomes, leading to a loss of effort as well as a loss of the model's effectiveness and accuracy. In terms of model building, the techniques discussed in Sect. “Machine Learning Tasks and Algorithms” can directly be used to solve many real-world issues in diverse domains, such as cybersecurity, smart cities, and healthcare, summarized in Sect. “Applications of Machine Learning”. However, hybrid learning models, e.g., ensembles of methods, modifications or enhancements of the existing learning techniques, or the design of new learning methods, could be potential future work in the area.
Thus, the ultimate success of a machine learning-based solution and its corresponding applications depends mainly on both the data and the learning algorithms. If the data are inadequate for learning, e.g., non-representative, of poor quality, containing irrelevant features, or insufficient in quantity for training, the machine learning models may become useless or produce lower accuracy. Therefore, effectively processing the data and handling the diverse learning algorithms are important for a machine learning-based solution and, eventually, for building intelligent applications.
Conclusion
In this paper, we have conducted a comprehensive overview of machine learning algorithms for intelligent data analysis and applications. According to our goal, we have briefly discussed how various types of machine learning methods can be used to build solutions to various real-world issues. A successful machine learning model depends on both the data and the performance of the learning algorithms. The sophisticated learning algorithms must be trained with collected real-world data and knowledge related to the target application before the system can assist with intelligent decision-making. We also discussed several popular application areas based on machine learning techniques to highlight their applicability to various real-world issues. Finally, we summarized and discussed the challenges faced and the potential research opportunities and future directions in the area. The challenges identified create promising research opportunities in the field, which must be addressed with effective solutions in various application areas. Overall, we believe that our study on machine learning-based solutions opens a promising direction and can serve as a reference guide for potential research and applications for academia, industry professionals, and decision-makers, from a technical point of view.
Canadian Institute for Cybersecurity, University of New Brunswick, ISCX dataset. http://www.unb.ca/cic/datasets/index.html/ (Accessed 20 Oct 2019).
CIC-DDoS2019 [online]. Available: https://www.unb.ca/cic/datasets/ddos-2019.html/ (Accessed 28 Mar 2020).
World Health Organization (WHO). http://www.who.int/.
Google Trends. https://trends.google.com/trends/, 2019.
Adnan N, Nordin Shahrina Md, Rahman I, Noor A. The effects of knowledge transfer on farmers decision making toward sustainable agriculture practices. World J Sci Technol Sustain Dev. 2018.
Agrawal R, Gehrke J, Gunopulos D, Raghavan P. Automatic subspace clustering of high dimensional data for data mining applications. In: Proceedings of the 1998 ACM SIGMOD international conference on Management of data. 1998; 94–105
Agrawal R, Imieliński T, Swami A. Mining association rules between sets of items in large databases. In: ACM SIGMOD Record. ACM. 1993;22: 207–216
Agrawal R, Gehrke J, Gunopulos D, Raghavan P. Fast algorithms for mining association rules. In: Proceedings of the International Joint Conference on Very Large Data Bases, Santiago Chile. 1994; 1215: 487–499.
Aha DW, Kibler D, Albert M. Instance-based learning algorithms. Mach Learn. 1991;6(1):37–66.
Alakus TB, Turkoglu I. Comparison of deep learning approaches to predict covid-19 infection. Chaos Solit Fract. 2020;140:
Amit Y, Geman D. Shape quantization and recognition with randomized trees. Neural Comput. 1997;9(7):1545–88.
Ankerst M, Breunig MM, Kriegel H-P, Sander J. Optics: ordering points to identify the clustering structure. ACM Sigmod Record. 1999;28(2):49–60.
Anzai Y. Pattern recognition and machine learning. Elsevier; 2012.
Ardabili SF, Mosavi A, Ghamisi P, Ferdinand F, Varkonyi-Koczy AR, Reuter U, Rabczuk T, Atkinson PM. Covid-19 outbreak prediction with machine learning. Algorithms. 2020;13(10):249.
Baldi P. Autoencoders, unsupervised learning, and deep architectures. In: Proceedings of ICML workshop on unsupervised and transfer learning, 2012; 37–49 .
Balducci F, Impedovo D, Pirlo G. Machine learning applications on agricultural datasets for smart farm enhancement. Machines. 2018;6(3):38.
Boukerche A, Wang J. Machine learning-based traffic prediction models for intelligent transportation systems. Comput Netw. 2020;181
Breiman L. Bagging predictors. Mach Learn. 1996;24(2):123–40.
Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
Breiman L, Friedman J, Stone CJ, Olshen RA. Classification and regression trees. CRC Press; 1984.
Cao L. Data science: a comprehensive overview. ACM Comput Surv (CSUR). 2017;50(3):43.
Carpenter GA, Grossberg S. A massively parallel architecture for a self-organizing neural pattern recognition machine. Comput Vis Graph Image Process. 1987;37(1):54–115.
Chiu C-C, Sainath TN, Wu Y, Prabhavalkar R, Nguyen P, Chen Z, Kannan A, Weiss RJ, Rao K, Gonina E, et al. State-of-the-art speech recognition with sequence-to-sequence models. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018 pages 4774–4778. IEEE .
Chollet F. Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1251–1258, 2017.
Cobuloglu H, Büyüktahtakın IE. A stochastic multi-criteria decision analysis for sustainable biomass crop selection. Expert Syst Appl. 2015;42(15–16):6065–74.
Das A, Ng W-K, Woon Y-K. Rapid association rule mining. In: Proceedings of the tenth international conference on Information and knowledge management, pages 474–481. ACM, 2001.
de Amorim RC. Constrained clustering with minkowski weighted k-means. In: 2012 IEEE 13th International Symposium on Computational Intelligence and Informatics (CINTI), pages 13–17. IEEE, 2012.
Dey AK. Understanding and using context. Person Ubiquit Comput. 2001;5(1):4–7.
Eagle N, Pentland AS. Reality mining: sensing complex social systems. Person Ubiquit Comput. 2006;10(4):255–68.
Essien A, Petrounias I, Sampaio P, Sampaio S. Improving urban traffic speed prediction using data source fusion and deep learning. In: 2019 IEEE International Conference on Big Data and Smart Computing (BigComp). IEEE. 2019: 1–8. .
Essien A, Petrounias I, Sampaio P, Sampaio S. A deep-learning model for urban traffic flow prediction with traffic events mined from twitter. In: World Wide Web, 2020: 1–24 .
Ester M, Kriegel H-P, Sander J, Xiaowei X, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. Kdd. 1996;96:226–31.
Fatima M, Pasha M, et al. Survey of machine learning algorithms for disease diagnostic. J Intell Learn Syst Appl. 2017;9(01):1.
Flach PA, Lachiche N. Confirmation-guided discovery of first-order rules with tertius. Mach Learn. 2001;42(1–2):61–95.
Freund Y, Schapire RE, et al. Experiments with a new boosting algorithm. In: Icml, Citeseer. 1996; 96: 148–156
Fujiyoshi H, Hirakawa T, Yamashita T. Deep learning-based image recognition for autonomous driving. IATSS Res. 2019;43(4):244–52.
Fukunaga K, Hostetler L. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans Inform Theory. 1975;21(1):32–40.
Goodfellow I, Bengio Y, Courville A, Bengio Y. Deep learning. Cambridge: MIT Press; 2016.
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. In: Advances in neural information processing systems. 2014: 2672–2680.
Guerrero-Ibáñez J, Zeadally S, Contreras-Castillo J. Sensor technologies for intelligent transportation systems. Sensors. 2018;18(4):1212.
Han J, Pei J, Kamber M. Data mining: concepts and techniques. Amsterdam: Elsevier; 2011.
Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation. In: ACM Sigmod Record, ACM. 2000;29: 1–12.
Harmon SA, Sanford TH, Sheng X, Turkbey EB, Roth H, Ziyue X, Yang D, Myronenko A, Anderson V, Amalou A, et al. Artificial intelligence for the detection of covid-19 pneumonia on chest ct using multinational datasets. Nat Commun. 2020;11(1):1–7.
He K, Zhang X, Ren S, Sun J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell. 2015;37(9):1904–16.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016: 770–778.
Hinton GE. A practical guide to training restricted boltzmann machines. In: Neural networks: Tricks of the trade. Springer. 2012; 599-619
Holte RC. Very simple classification rules perform well on most commonly used datasets. Mach Learn. 1993;11(1):63–90.
Hotelling H. Analysis of a complex of statistical variables into principal components. J Edu Psychol. 1933;24(6):417.
Houtsma M, Swami A. Set-oriented mining for association rules in relational databases. In: Data Engineering, 1995. Proceedings of the Eleventh International Conference on, IEEE.1995:25–33.
Jamshidi M, Lalbakhsh A, Talla J, Peroutka Z, Hadjilooei F, Lalbakhsh P, Jamshidi M, La Spada L, Mirmozafari M, Dehghani M, et al. Artificial intelligence and covid-19: deep learning approaches for diagnosis and treatment. IEEE Access. 2020;8:109581–95.
John GH, Langley P. Estimating continuous distributions in bayesian classifiers. In: Proceedings of the Eleventh conference on Uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc. 1995; 338–345
Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: a survey. J Artif Intell Res. 1996;4:237–85.
Kamble SS, Gunasekaran A, Gawankar SA. Sustainable industry 4.0 framework: a systematic literature review identifying the current trends and future perspectives. Process Saf Environ Protect. 2018;117:408–25.
Kamble SS, Gunasekaran A, Gawankar SA. Achieving sustainable performance in a data-driven agriculture supply chain: a review for research and applications. Int J Prod Econ. 2020;219:179–94.
Kaufman L, Rousseeuw PJ. Finding groups in data: an introduction to cluster analysis, vol. 344. John Wiley & Sons; 2009.
Keerthi SS, Shevade SK, Bhattacharyya C, Radha Krishna MK. Improvements to platt’s smo algorithm for svm classifier design. Neural Comput. 2001;13(3):637–49.
Khadse V, Mahalle PN, Biraris SV. An empirical comparison of supervised machine learning algorithms for internet of things data. In: 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), IEEE. 2018; 1–6
Kohonen T. The self-organizing map. Proc IEEE. 1990;78(9):1464–80.
Koroniotis N, Moustafa N, Sitnikova E, Turnbull B. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: bot-iot dataset. Fut Gen Comput Syst. 2019;100:779–96.
Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, 2012: 1097–1105
Kushwaha S, Bahl S, Bagha AK, Parmar KS, Javaid M, Haleem A, Singh RP. Significant applications of machine learning for covid-19 pandemic. J Ind Integr Manag. 2020;5(4).
Lade P, Ghosh R, Srinivasan S. Manufacturing analytics and industrial internet of things. IEEE Intell Syst. 2017;32(3):74–9.
Lalmuanawma S, Hussain J, Chhakchhuak L. Applications of machine learning and artificial intelligence for covid-19 (sars-cov-2) pandemic: a review. Chaos Sol Fract. 2020:110059 .
LeCessie S, Van Houwelingen JC. Ridge estimators in logistic regression. J R Stat Soc Ser C (Appl Stat). 1992;41(1):191–201.
LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE. 1998;86(11):2278–324.
Liu H, Motoda H. Feature extraction, construction and selection: A data mining perspective, vol. 453. Springer Science & Business Media; 1998.
López G, Quesada L, Guerrero LA. Alexa vs. siri vs. cortana vs. google assistant: a comparison of speech-based natural user interfaces. In: International Conference on Applied Human Factors and Ergonomics, Springer. 2017; 241–250.
Liu B, Hsu W, Ma Y. Integrating classification and association rule mining. In: Proceedings of the fourth international conference on knowledge discovery and data mining, 1998.
MacQueen J, et al. Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, 1967;volume 1, pages 281–297. Oakland, CA, USA.
Mahdavinejad MS, Rezvan M, Barekatain M, Adibi P, Barnaghi P, Sheth AP. Machine learning for internet of things data analysis: a survey. Digit Commun Netw. 2018;4(3):161–75.
Marchand A, Marx P. Automated product recommendations with preference-based explanations. J Retail. 2020;96(3):328–43.
McCallum A. Information extraction: distilling structured data from unstructured text. Queue. 2005;3(9):48–57.
Mehrotra A, Hendley R, Musolesi M. Prefminer: mining user’s preferences for intelligent mobile notification management. In: Proceedings of the International Joint Conference on Pervasive and Ubiquitous Computing, Heidelberg, Germany, 12–16 September, 2016; pp. 1223–1234. ACM, New York, USA. .
Mohamadou Y, Halidou A, Kapen PT. A review of mathematical modeling, artificial intelligence and datasets used in the study, prediction and management of covid-19. Appl Intell. 2020;50(11):3913–25.
Mohammed M, Khan MB, Bashier Mohammed BE. Machine learning: algorithms and applications. CRC Press; 2016.
Moustafa N, Slay J. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In: 2015 military communications and information systems conference (MilCIS), 2015;pages 1–6. IEEE .
Nilashi M, Ibrahim OB, Ahmadi H, Shahmoradi L. An analytical method for diseases prediction using machine learning techniques. Comput Chem Eng. 2017;106:212–23.
Oh Y, Park S, Ye JC. Deep learning covid-19 features on cxr using limited training data sets. IEEE Trans Med Imaging. 2020;39(8):2688–700.
Otter DW, Medina JR , Kalita JK. A survey of the usages of deep learning for natural language processing. IEEE Trans Neural Netw Learn Syst. 2020.
Park H-S, Jun C-H. A simple and fast algorithm for k-medoids clustering. Expert Syst Appl. 2009;36(2):3336–41.
Pearson K. LIII. On lines and planes of closest fit to systems of points in space. Lond Edinb Dublin Philos Mag J Sci. 1901;2(11):559–72.
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, et al. Scikit-learn: machine learning in python. J Mach Learn Res. 2011;12:2825–30.
Perveen S, Shahbaz M, Keshavjee K, Guergachi A. Metabolic syndrome and development of diabetes mellitus: predictive modeling based on machine learning techniques. IEEE Access. 2018;7:1365–75.
Phithakkitnukoon S, Dantu R, Claxton R, Eagle N. Behavior-based adaptive call predictor. ACM Trans Auton Adapt Syst. 2011;6(3):21:1–21:28.
Polydoros AS, Nalpantidis L. Survey of model-based reinforcement learning: applications on robotics. J Intell Robot Syst. 2017;86(2):153–73.
Puterman ML. Markov decision processes: discrete stochastic dynamic programming. John Wiley & Sons; 2014.
Quinlan JR. Induction of decision trees. Mach Learn. 1986;1:81–106.
Quinlan JR. C4.5: programs for machine learning. Mach Learn. 1993.
Rasmussen C. The infinite gaussian mixture model. Adv Neural Inform Process Syst. 1999;12:554–60.
Ravi K, Ravi V. A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowl Syst. 2015;89:14–46.
Rokach L. A survey of clustering algorithms. In: Data mining and knowledge discovery handbook, pages 269–298. Springer, 2010.
Safdar S, Zafar S, Zafar N, Khan NF. Machine learning based decision support systems (dss) for heart disease diagnosis: a review. Artif Intell Rev. 2018;50(4):597–623.
Sarker IH. Context-aware rule learning from smartphone data: survey, challenges and future directions. J Big Data. 2019;6(1):1–25.
Sarker IH. A machine learning based robust prediction model for real-life mobile phone data. Internet Things. 2019;5:180–93.
Sarker IH. Ai-driven cybersecurity: an overview, security intelligence modeling and research directions. SN Comput Sci. 2021.
Sarker IH. Deep cybersecurity: a comprehensive overview from neural network and deep learning perspective. SN Comput Sci. 2021.
Sarker IH, Abushark YB, Alsolami F, Khan A. Intrudtree: a machine learning based cyber security intrusion detection model. Symmetry. 2020;12(5):754.
Sarker IH, Abushark YB, Khan A. Contextpca: predicting context-aware smartphone apps usage based on machine learning techniques. Symmetry. 2020;12(4):499.
Sarker IH, Alqahtani H, Alsolami F, Khan A, Abushark YB, Siddiqui MK. Context pre-modeling: an empirical analysis for classification based user-centric context-aware predictive modeling. J Big Data. 2020;7(1):1–23.
Sarker IH, Alan C, Jun H, Khan AI, Abushark YB, Khaled S. Behavdt: a behavioral decision tree learning to build user-centric context-aware predictive model. Mob Netw Appl. 2019; 1–11.
Sarker IH, Colman A, Kabir MA, Han J. Phone call log as a context source to modeling individual user behavior. In: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (Ubicomp): Adjunct, Germany, pages 630–634. ACM, 2016.
Sarker IH, Colman A, Kabir MA, Han J. Individualized time-series segmentation for mining mobile phone user behavior. Comput J Oxf Univ UK. 2018;61(3):349–68.
Sarker IH, Hoque MM, MdK Uddin, Tawfeeq A. Mobile data science and intelligent apps: concepts, ai-based modeling and research directions. Mob Netw Appl, pages 1–19, 2020.
Sarker IH, Kayes ASM. Abc-ruleminer: user behavioral rule-based machine learning method for context-aware intelligent services. J Netw Comput Appl. 2020; page 102762
Sarker IH, Kayes ASM, Badsha S, Alqahtani H, Watters P, Ng A. Cybersecurity data science: an overview from machine learning perspective. J Big Data. 2020;7(1):1–29.
Sarker IH, Watters P, Kayes ASM. Effectiveness analysis of machine learning classification models for predicting personalized context-aware smartphone usage. J Big Data. 2019;6(1):1–28.
Sarker IH, Salah K. Appspred: predicting context-aware smartphone apps using random forest learning. Internet Things. 2019;8:
Scheffer T. Finding association rules that trade support optimally against confidence. Intell Data Anal. 2005;9(4):381–95.
Sharma R, Kamble SS, Gunasekaran A, Kumar V, Kumar A. A systematic literature review on machine learning applications for sustainable agriculture supply chain performance. Comput Oper Res. 2020;119:
Shengli S, Ling CX. Hybrid cost-sensitive decision tree, knowledge discovery in databases. In: PKDD 2005, Proceedings of 9th European Conference on Principles and Practice of Knowledge Discovery in Databases. Lecture Notes in Computer Science, volume 3721, 2005.
Shorten C, Khoshgoftaar TM, Furht B. Deep learning applications for covid-19. J Big Data. 2021;8(1):1–54.
Gökhan S, Nevin Y. Data analysis in health and big data: a machine learning medical diagnosis model based on patients’ complaints. Commun Stat Theory Methods. 2019;1–10
Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M, et al. Mastering the game of go with deep neural networks and tree search. nature. 2016;529(7587):484–9.
Ślusarczyk B. Industry 4.0: Are we ready? Polish J Manag Stud. 17, 2018.
Sneath PHA. The application of computers to taxonomy. J Gen Microbiol. 1957;17(1).
Sorensen T. Method of establishing groups of equal amplitude in plant sociology based on similarity of species. Biol Skr. 1948; 5.
Srinivasan V, Moghaddam S, Mukherji A. Mobileminer: mining your frequent patterns on your phone. In: Proceedings of the International Joint Conference on Pervasive and Ubiquitous Computing, Seattle, WA, USA, 13-17 September, pp. 389–400. ACM, New York, USA. 2014.
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015; pages 1–9.
Tavallaee M, Bagheri E, Lu W, Ghorbani AA. A detailed analysis of the kdd cup 99 data set. In. IEEE symposium on computational intelligence for security and defense applications. IEEE. 2009;2009:1–6.
Tsagkias M. Tracy HK, Surya K, Vanessa M, de Rijke M. Challenges and research opportunities in ecommerce search and recommendations. In: ACM SIGIR Forum. volume 54. NY, USA: ACM New York; 2021. p. 1–23.
Wagstaff K, Cardie C, Rogers S, Schrödl S, et al. Constrained k-means clustering with background knowledge. Icml. 2001;1:577–84.
Wang W, Yang J, Muntz R, et al. Sting: a statistical information grid approach to spatial data mining. VLDB. 1997;97:186–95.
Wei P, Li Y, Zhang Z, Tao H, Li Z, Liu D. An optimization method for intrusion detection classification model based on deep belief network. IEEE Access. 2019;7:87593–605.
Weiss K, Khoshgoftaar TM, Wang DD. A survey of transfer learning. J Big data. 2016;3(1):9.
Witten IH, Frank E. Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann; 2005.
Witten IH, Frank E, Trigg LE, Hall MA, Holmes G, Cunningham SJ. Weka: practical machine learning tools and techniques with java implementations. 1999.
Wu C-C, Yen-Liang C, Yi-Hung L, Xiang-Yu Y. Decision tree induction with a constrained number of leaf nodes. Appl Intell. 2016;45(3):673–85.
Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Philip SY, et al. Top 10 algorithms in data mining. Knowl Inform Syst. 2008;14(1):1–37.
Xin Y, Kong L, Liu Z, Chen Y, Li Y, Zhu H, Gao M, Hou H, Wang C. Machine learning and deep learning methods for cybersecurity. IEEE Access. 2018;6:35365–81.
Xu D, Yingjie T. A comprehensive survey of clustering algorithms. Ann Data Sci. 2015;2(2):165–93.
Zaki MJ. Scalable algorithms for association mining. IEEE Trans Knowl Data Eng. 2000;12(3):372–90.
Zanella A, Bui N, Castellani A, Vangelista L, Zorzi M. Internet of things for smart cities. IEEE Internet Things J. 2014;1(1):22–32.
Zhao Q, Bhowmick SS. Association rule mining: a survey. Singapore: Nanyang Technological University; 2003.
Zheng T, Xie W, Xu L, He X, Zhang Y, You M, Yang G, Chen Y. A machine learning-based framework to identify type 2 diabetes through electronic health records. Int J Med Inform. 2017;97:120–7.
Zheng Y, Rajasegarar S, Leckie C. Parking availability prediction for sensor-enabled car parks in smart cities. In: Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), 2015 IEEE Tenth International Conference on. IEEE, 2015; pages 1–6.
Zhu H, Cao H, Chen E, Xiong H, Tian J. Exploiting enriched contextual information for mobile app classification. In: Proceedings of the 21st ACM international conference on Information and knowledge management. ACM, 2012; pages 1617–1621
Zhu H, Chen E, Xiong H, Kuifei Y, Cao H, Tian J. Mining mobile user preferences for personalized context-aware recommendation. ACM Trans Intell Syst Technol (TIST). 2014;5(4):58.
Zikang H, Yong Y, Guofeng Y, Xinyu Z. Sentiment analysis of agricultural product ecommerce review data based on deep learning. In: 2020 International Conference on Internet of Things and Intelligent Applications (ITIA), IEEE, 2020; pages 1–7
Zulkernain S, Madiraju P, Ahamed SI. A context aware interruption management system for mobile devices. In: Mobile Wireless Middleware, Operating Systems, and Applications. Springer. 2010; pages 221–234
Zulkernain S, Madiraju P, Ahamed S, Stamm K. A mobile intelligent interruption management system. J UCS. 2010;16(15):2060–80.
Download references
Author information
Authors and Affiliations
Iqbal H. Sarker, Swinburne University of Technology, Melbourne, VIC 3122, Australia; and Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, Chattogram 4349, Bangladesh
Corresponding author
Correspondence to Iqbal H. Sarker.
Ethics declarations
Conflict of interest
The author declares no conflict of interest.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Advances in Computational Approaches for Artificial Intelligence, Image Processing, IoT and Cloud Applications” guest edited by Bhanu Prakash K N and M. Shivakumar.
About this article
Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Computer Science 2, 160 (2021). https://doi.org/10.1007/s42979-021-00592-x
Received: 27 January 2021
Accepted: 12 March 2021
Published: 22 March 2021
DOI: https://doi.org/10.1007/s42979-021-00592-x
- Machine learning
- Deep learning
- Artificial intelligence
- Data science
- Data-driven decision-making
- Predictive analytics
- Intelligent applications
Computer science articles from across Nature Portfolio
Computer science is the study and development of the protocols required for automated processing and manipulation of data. This includes, for example, creating algorithms for efficiently searching large volumes of information or encrypting data so that it can be stored and transmitted securely.
Latest Research and Reviews
Personalized movie recommendation in IoT-enhanced systems using graph convolutional network and multi-layer perceptron
Precision game engineering through reshaping strategic payoffs
- Ali R. Zomorrodi
Deep learning-based improved transformer model on android malware detection and classification in internet of vehicles
- Naif Almakayeel
Deep learning resilience inference for complex networked systems
Estimation of network resilience, the ability to maintain functionality when failures occur, usually requires prior knowledge of network topology and dynamics. The authors propose a deep learning model to predict network resilience based on the observational data of node activities and the network topology.
Grounded situation recognition under data scarcity
- Zhiqiang Liu
Accurate quantification of dislocation loops in complex functional alloys enabled by deep learning image analysis
- Thomas Bilyk
- Alexandra M. Goryaeva
- Estelle Meslin
News and Comment
AI watermarking must be watertight to be effective
Scientists are closing in on a tool that can reliably identify AI-generated text without affecting the user’s experience. But the technology’s robustness remains a challenge.
Build an international AI ‘telescope’ to curb the power of big tech companies
- Pierre Baldi
- Piero Fariselli
- Giorgio Parisi
How I peer into the geometry behind computer vision
Minh Ha Quang’s work at a Japanese AI research centre aims to understand how machines extract image data from the real world.
Fixing AI’s energy crisis
Hardware that consumes less power will reduce artificial intelligence's appetite for energy. But transparency about its carbon footprint is still needed.
- Katherine Bourzac
Scientific papers that mention AI get a citation boost
An analysis of tens of millions of papers shows which fields have embraced AI tools with enthusiasm — and which have been slower.
- Mariana Lenharo
The AI revolution is always just out of reach
Claims that artificial intelligence will usher in a new scientific and social era have been attracting funding for decades, but the changes they’ve achieved have not been as advertised. Historian James Sumner considers the limits of science’s ability to plan a revolution.
- James Sumner
Unveiling the Vision: A Comprehensive Review of Computer Vision in AI and ML
Computer Science (since January 1993)
Categories within Computer Science
- cs.AI - Artificial Intelligence ( new , recent , current month ) Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
- cs.AR - Hardware Architecture ( new , recent , current month ) Covers systems organization and hardware architecture. Roughly includes material in ACM Subject Classes C.0, C.1, and C.5.
- cs.CC - Computational Complexity ( new , recent , current month ) Covers models of computation, complexity classes, structural complexity, complexity tradeoffs, upper and lower bounds. Roughly includes material in ACM Subject Classes F.1 (computation by abstract devices), F.2.3 (tradeoffs among complexity measures), and F.4.3 (formal languages), although some material in formal languages may be more appropriate for Logic in Computer Science. Some material in F.2.1 and F.2.2, may also be appropriate here, but is more likely to have Data Structures and Algorithms as the primary subject area.
- cs.CE - Computational Engineering, Finance, and Science ( new , recent , current month ) Covers applications of computer science to the mathematical modeling of complex systems in the fields of science, engineering, and finance. Papers here are interdisciplinary and applications-oriented, focusing on techniques and tools that enable challenging computational simulations to be performed, for which the use of supercomputers or distributed computing platforms is often required. Includes material in ACM Subject Classes J.2, J.3, and J.4 (economics).
- cs.CG - Computational Geometry ( new , recent , current month ) Roughly includes material in ACM Subject Classes I.3.5 and F.2.2.
- cs.CL - Computation and Language ( new , recent , current month ) Covers natural language processing. Roughly includes material in ACM Subject Class I.2.7. Note that work on artificial languages (programming languages, logics, formal systems) that does not explicitly address natural-language issues broadly construed (natural-language processing, computational linguistics, speech, text retrieval, etc.) is not appropriate for this area.
- cs.CR - Cryptography and Security ( new , recent , current month ) Covers all areas of cryptography and security including authentication, public key cryptosystems, proof-carrying code, etc. Roughly includes material in ACM Subject Classes D.4.6 and E.3.
- cs.CV - Computer Vision and Pattern Recognition ( new , recent , current month ) Covers image processing, computer vision, pattern recognition, and scene understanding. Roughly includes material in ACM Subject Classes I.2.10, I.4, and I.5.
- cs.CY - Computers and Society ( new , recent , current month ) Covers impact of computers on society, computer ethics, information technology and public policy, legal aspects of computing, computers and education. Roughly includes material in ACM Subject Classes K.0, K.2, K.3, K.4, K.5, and K.7.
- cs.DB - Databases ( new , recent , current month ) Covers database management, datamining, and data processing. Roughly includes material in ACM Subject Classes E.2, E.5, H.0, H.2, and J.1.
- cs.DC - Distributed, Parallel, and Cluster Computing ( new , recent , current month ) Covers fault-tolerance, distributed algorithms, stability, parallel computation, and cluster computing. Roughly includes material in ACM Subject Classes C.1.2, C.1.4, C.2.4, D.1.3, D.4.5, D.4.7, E.1.
- cs.DL - Digital Libraries ( new , recent , current month ) Covers all aspects of the digital library design and document and text creation. Note that there will be some overlap with Information Retrieval (which is a separate subject area). Roughly includes material in ACM Subject Classes H.3.5, H.3.6, H.3.7, I.7.
- cs.DM - Discrete Mathematics ( new , recent , current month ) Covers combinatorics, graph theory, applications of probability. Roughly includes material in ACM Subject Classes G.2 and G.3.
- cs.DS - Data Structures and Algorithms ( new , recent , current month ) Covers data structures and analysis of algorithms. Roughly includes material in ACM Subject Classes E.1, E.2, F.2.1, and F.2.2.
- cs.ET - Emerging Technologies ( new , recent , current month ) Covers approaches to information processing (computing, communication, sensing) and bio-chemical analysis based on alternatives to silicon CMOS-based technologies, such as nanoscale electronic, photonic, spin-based, superconducting, mechanical, bio-chemical and quantum technologies (this list is not exclusive). Topics of interest include (1) building blocks for emerging technologies, their scalability and adoption in larger systems, including integration with traditional technologies, (2) modeling, design and optimization of novel devices and systems, (3) models of computation, algorithm design and programming for emerging technologies.
- cs.FL - Formal Languages and Automata Theory ( new , recent , current month ) Covers automata theory, formal language theory, grammars, and combinatorics on words. This roughly corresponds to ACM Subject Classes F.1.1, and F.4.3. Papers dealing with computational complexity should go to cs.CC; papers dealing with logic should go to cs.LO.
- cs.GL - General Literature ( new , recent , current month ) Covers introductory material, survey material, predictions of future trends, biographies, and miscellaneous computer-science related material. Roughly includes all of ACM Subject Class A, except it does not include conference proceedings (which will be listed in the appropriate subject area).
- cs.GR - Graphics ( new , recent , current month ) Covers all aspects of computer graphics. Roughly includes material in all of ACM Subject Class I.3, except that I.3.5 is likely to have Computational Geometry as the primary subject area.
- cs.GT - Computer Science and Game Theory ( new , recent , current month ) Covers all theoretical and applied aspects at the intersection of computer science and game theory, including work in mechanism design, learning in games (which may overlap with Learning), foundations of agent modeling in games (which may overlap with Multiagent systems), coordination, specification and formal methods for non-cooperative computational environments. The area also deals with applications of game theory to areas such as electronic commerce.
- cs.HC - Human-Computer Interaction ( new , recent , current month ) Covers human factors, user interfaces, and collaborative computing. Roughly includes material in ACM Subject Classes H.1.2 and all of H.5, except for H.5.1, which is more likely to have Multimedia as the primary subject area.
- cs.IR - Information Retrieval ( new , recent , current month ) Covers indexing, dictionaries, retrieval, content and analysis. Roughly includes material in ACM Subject Classes H.3.0, H.3.1, H.3.2, H.3.3, and H.3.4.
- cs.IT - Information Theory ( new , recent , current month ) Covers theoretical and experimental aspects of information theory and coding. Includes material in ACM Subject Class E.4 and intersects with H.1.1.
- cs.LG - Machine Learning ( new , recent , current month ) Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
- cs.LO - Logic in Computer Science ( new , recent , current month ) Covers all aspects of logic in computer science, including finite model theory, logics of programs, modal logic, and program verification. Programming language semantics should have Programming Languages as the primary subject area. Roughly includes material in ACM Subject Classes D.2.4, F.3.1, F.4.0, F.4.1, and F.4.2; some material in F.4.3 (formal languages) may also be appropriate here, although Computational Complexity is typically the more appropriate subject area.
- cs.MA - Multiagent Systems ( new , recent , current month ) Covers multiagent systems, distributed artificial intelligence, intelligent agents, coordinated interactions, and practical applications. Roughly covers ACM Subject Class I.2.11.
- cs.MM - Multimedia ( new , recent , current month ) Roughly includes material in ACM Subject Class H.5.1.
- cs.MS - Mathematical Software ( new , recent , current month ) Roughly includes material in ACM Subject Class G.4.
- cs.NA - Numerical Analysis ( new , recent , current month ) cs.NA is an alias for math.NA. Roughly includes material in ACM Subject Class G.1.
- cs.NE - Neural and Evolutionary Computing ( new , recent , current month ) Covers neural networks, connectionism, genetic algorithms, artificial life, adaptive behavior. Roughly includes some material in ACM Subject Class C.1.3, I.2.6, I.5.
- cs.NI - Networking and Internet Architecture ( new , recent , current month ) Covers all aspects of computer communication networks, including network architecture and design, network protocols, and internetwork standards (like TCP/IP). Also includes topics, such as web caching, that are directly relevant to Internet architecture and performance. Roughly includes all of ACM Subject Class C.2 except C.2.4, which is more likely to have Distributed, Parallel, and Cluster Computing as the primary subject area.
- cs.OH - Other Computer Science ( new , recent , current month ) This is the classification to use for documents that do not fit anywhere else.
- cs.OS - Operating Systems ( new , recent , current month ) Roughly includes material in ACM Subject Classes D.4.1, D.4.2, D.4.3, D.4.4, D.4.5, D.4.7, and D.4.9.
- cs.PF - Performance ( new , recent , current month ) Covers performance measurement and evaluation, queueing, and simulation. Roughly includes material in ACM Subject Classes D.4.8 and K.6.2.
- cs.PL - Programming Languages ( new , recent , current month ) Covers programming language semantics, language features, programming approaches (such as object-oriented programming, functional programming, logic programming). Also includes material on compilers oriented towards programming languages; other material on compilers may be more appropriate in Architecture (AR). Roughly includes material in ACM Subject Classes D.1 and D.3.
- cs.RO - Robotics ( new , recent , current month ) Roughly includes material in ACM Subject Class I.2.9.
- cs.SC - Symbolic Computation ( new , recent , current month ) Roughly includes material in ACM Subject Class I.1.
- cs.SD - Sound ( new , recent , current month ) Covers all aspects of computing with sound, and sound as an information channel. Includes models of sound, analysis and synthesis, audio user interfaces, sonification of data, computer music, and sound signal processing. Includes ACM Subject Class H.5.5, and intersects with H.1.2, H.5.1, H.5.2, I.2.7, I.5.4, I.6.3, J.5, K.4.2.
- cs.SE - Software Engineering ( new , recent , current month ) Covers design tools, software metrics, testing and debugging, programming environments, etc. Roughly includes material in all of ACM Subject Classes D.2, except that D.2.4 (program verification) should probably have Logic in Computer Science as the primary subject area.
- cs.SI - Social and Information Networks ( new , recent , current month ) Covers the design, analysis, and modeling of social and information networks, including their applications for on-line information access, communication, and interaction, and their roles as datasets in the exploration of questions in these and other domains, including connections to the social and biological sciences. Analysis and modeling of such networks includes topics in ACM Subject classes F.2, G.2, G.3, H.2, and I.2; applications in computing include topics in H.3, H.4, and H.5; and applications at the interface of computing and other disciplines include topics in J.1--J.7. Papers on computer communication systems and network protocols (e.g. TCP/IP) are generally a closer fit to the Networking and Internet Architecture (cs.NI) category.
- cs.SY - Systems and Control ( new , recent , current month ) cs.SY is an alias for eess.SY. This section includes theoretical and experimental research covering all facets of automatic control systems. The section is focused on methods of control system analysis and design using tools of modeling, simulation and optimization. Specific areas of research include nonlinear, distributed, adaptive, stochastic and robust control in addition to hybrid and discrete event systems. Application areas include automotive and aerospace control systems, network control, biological systems, multiagent and cooperative control, robotics, reinforcement learning, sensor networks, control of cyber-physical and energy-related systems, and control of computing systems.
Foundation of Computer Science™ Inc. is now a Web of Science™ subscriber: FCS subscribes to Web of Science's Reviewer Recognition Service and Reviewer Locator for Publishers programmes.
IJCA is now indexed by EBSCO, Google Scholar, Informatics, the ProQuest CSA Technology Research Database, NASA ADS, CiteSeer, UlrichsWeb, ScientificCommons (University of St. Gallen), the University of Karlsruhe (Germany), and PennState University.
Call for Papers
IJCA solicits high-quality original research papers for the upcoming December edition of the journal.
Last date of paper submission: 20 November 2024
Enhancing Fake News Identification in Social Media through Ensemble Learning Methods
Timothy Moses, Henry Egaga Obi, Christopher Ifeanyi Eke, Jeffrey Agushaka
Privacy and Security Issues: An Assessment of the Awareness Level of Smartphone Users in Nigeria
Omeka Friday Odey, Joshua Abah, Dekera Kenneth Kwaghtyo
Machine Learning-based E-Learners’ Engagement Level Prediction using Benchmark Datasets
God’swill Theophilus, Christopher Ifeanyi Eke
The International Journal of Computer Applications, in association with indexing partners such as Google Scholar and Elsevier CiteULike, publishes original papers in applied, experimental, and theoretical aspects of information systems. The enhanced online publication platform is jointly developed with support from the Foundation of Computer Science, New York, USA.
All research articles published in the IJCA academic research journal have undergone rigorous peer review, based on initial editor screening and anonymous refereeing by independent expert referees. The camera-ready version is ratified by the FCS reviewer network.
Quick Links
International Journal of Computer Applications
Recent Articles
Jainish Shah, Nilanjan Sen, Binto George, Antonio Cardenas-Haro
Owoeye Samuel, Folasade Durodola, Adefolahan Akinsola, Bisiriyu Babatunde, Opeyemi Adewale
Divya Dewangan, Smita Selot, Sreejit Panicker
Hrishikesh Joshi
University affiliates
IJCA journals' bibliographies and citations are available from Harvard University Library under the Creative Commons Attribution 4.0 International License.
The PennState University Libraries comprise 36 libraries at 24 locations throughout the Commonwealth of Pennsylvania. IJCA releases its articles to PennState University via ProQuest.
PUBLICATION ETHICS
Policy on Publication Ethics - Ensuring genuine authorship
Be a research volunteer
IJCA is fuelled by a highly dispersed and geographically separated team of dynamic volunteers. IJCA invites volunteers interested in contributing to scientific development in the field of computer science.
ABOUT IJCA & DISCLAIMER
International Journal of Computer Applications (IJCA) is a peer-reviewed journal published by the Foundation of Computer Science (FCS). The journal publishes papers on topics including, but not limited to, Information Systems, Distributed Systems, Graphics and Imaging, Bio-informatics, Natural Language Processing, Software Testing, Human-Computer Interaction, Embedded Systems, Pattern Recognition, and Signal Processing.
Guest Editorship Program
Prospective authors should note that only original and previously unpublished manuscripts will be considered. Furthermore, simultaneous submissions (including to information systems and electronics journals) are not acceptable. Authors are advised to read the Publication Ethics and Malpractice Statement to learn about compliance. Information regarding paper submission to the journal can be found on the call for papers page.
Application of Computers in Research
P.G. Padma Gowri
Introduction:
Computers are an essential tool in the research process, whether for academic or commercial purposes. The main components of a computer are an input device, a central processing unit, and an output device. Computers play a major role today in every field of scientific research, from genetic engineering to astrophysics.
Computers connected to the internet opened the way to a globalized information portal, the World Wide Web. Using the web, researchers can conduct research on a massive scale, and various programs and applications have eased the research process. In this module, computer software applications and tools are discussed with respect to research activities such as data collection and analysis.
Objectives:
- Understand the features of computers.
- Know the various steps involved in the research process.
- Understand the role of computers in research publication.
- Introduce the analysis tools used in the research process.
Features of a Computer
There are many reasons why computers are so important in scientific research; here are some of them:
SPEED: Computers can process numbers and information in a very short time, so researchers can process and analyze data quickly, freeing time for further research. A calculation that might take a person several hours will take a computer mere minutes, if not seconds.
STORAGE: Computers can store and retrieve huge volumes of data for use whenever needed, with no risk of forgetting or losing data.
ACCURACY: Computers are incredibly accurate, and accuracy is vital in scientific research; a wrong calculation could leave an entire research project filled with incorrect information.
ORGANIZATION: We can store millions of pages of information using simple folders, word processors, and computer programs. This is more productive and safer than a paper filing system, in which material is easily misplaced.
CONSISTENCY: Computers do not make mistakes through tiredness or lack of concentration, as human beings do. This makes them exceptionally valuable in scientific research, where large calculations must be done quickly and accurately.
AUTOMATION: Once programmed, a computer carries out its instructions automatically, without manual intervention.
Computational Tools
Computers began as powerful calculators, and that service remains important to research today. Huge amounts of data can be processed with their help. Statistical programs, modelling programs, and spatial mapping tools are all possible uses of computers. Researchers can use information in new ways, for example by layering different types of maps on one another to discover new patterns in how people use their environment.
Communication
Building knowledge through research requires communication between experts to identify new areas requiring research and to debate results and ideas. Before computers, this was accomplished through papers and workshops; now the world's experts can communicate via web chats or email, and information can be spread in various ways, for example through virtual conferences.
Because researchers can take computers anywhere, it is easier to conduct field research and collect large amounts of data. The mobility of computers enables new research in remote areas and at the community level, and social media sites offer a new medium for interacting with society and collecting information.
The Steps in the Research Process
The research process consists of a series of actions necessary to carry out research work effectively. These steps are sequenced as follows:
- Formulating the research problem;
- Extensive literature survey;
- Developing the hypothesis;
- Preparing the research design;
- Determining sample design;
- Data Collection;
- Project Execution;
- Data Analysis;
- Hypothesis testing;
- Generalizations and interpretation;
- Preparation of the report or presentation of the results, i.e., formal write-up of conclusions of the research.
Computers in Research
Computers are used extensively in scientific research and are an important tool throughout the research process. They are especially useful for processing large numbers of samples, and their storage devices (such as compact discs and auxiliary memories) allow data to be stored and retrieved later for use in the different phases of the research process.
There are five major phases of the research process:
- Conceptual phase
- Design and planning phase
- Data collection phase
- Data Analysis phase and
- Research Publication phase
Conceptual Phase and Computer
The conceptual phase consists of formulating the research problem, an extensive literature survey, building the theoretical framework, and developing the hypothesis.
Computers help in searching the existing literature in the relevant field and in finding relevant research papers, so that the researcher can identify the gap in the existing literature. Literature and bibliographic references stored in electronic databases on the World Wide Web can be searched directly.
Relevant published articles can be stored and retrieved whenever needed. This is an advantage over searching the literature in journals, books, and newsletters at libraries, which consumes considerable time and effort.
Bibliographic references can also be stored online, and modern software can render them in different citation styles. The researcher need not visit libraries, which frees time for the research itself and helps in building the theoretical framework. A small sketch of a programmatic literature query follows.
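As a rough illustration (not part of the original module), the Python sketch below queries the public arXiv API for papers on a topic. The search string is illustrative, and network access is assumed:

```python
import re
import urllib.parse
import urllib.request

# Build a query against the public arXiv API (returns an Atom XML feed).
params = urllib.parse.urlencode({
    "search_query": "all:machine learning",  # illustrative topic
    "start": 0,
    "max_results": 5,
})
url = "http://export.arxiv.org/api/query?" + params

with urllib.request.urlopen(url) as response:
    feed = response.read().decode("utf-8")

# Crude title extraction, just to show the idea; a real tool would use an
# XML parser. The first <title> is the feed's own title, so skip it.
for title in re.findall(r"<title>(.*?)</title>", feed, re.DOTALL)[1:]:
    print(title.strip())
```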
Design and Planning Phase and Computer
Computers can be used for deciding the population sample, designing the questionnaire, and planning data collection. Various internet sites help in designing questionnaires, and software can calculate the required sample size. Computers also make a pilot study practical, since a pilot study requires sample size calculations and standard deviations. A minimal sample size sketch follows.
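For example, a common way to size a survey sample is Cochran's formula, n = z^2 p(1-p) / e^2. A minimal Python sketch, with illustrative default values:

```python
import math

def cochran_sample_size(z: float = 1.96, p: float = 0.5, e: float = 0.05) -> int:
    """Sample size for estimating a proportion.

    z -- z-score for the desired confidence level (1.96 ~ 95%)
    p -- expected proportion (0.5 is the most conservative choice)
    e -- acceptable margin of error
    """
    return math.ceil((z ** 2) * p * (1 - p) / (e ** 2))

print(cochran_sample_size())  # 385 respondents for 95% confidence, +/-5% error
```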
Role of Computers in Data collection phase
The empirical phase consists of collecting and preparing the data for analysis.
In research studies, the preparation and computation of data are the most labor-intensive and time-consuming aspects of the work. Typically, the data are initially recorded on a questionnaire or record form suitable for acceptance by the computer. To do this, the researcher, in consultation with the statistician and the programmer, converts the data into a Microsoft Word file, an Excel spreadsheet, or a statistical software data file. These data can then be used directly with statistical software for analysis.
Data collection and Storage:
The data obtained from research subjects are stored in computers as word-processor files, spreadsheets, or statistical software data files. This makes it easy to correct entries or edit the layout of tables, which is impossible or time-consuming with handwritten records. Computers thus help in data entry, data editing, and data management, including follow-up actions, and they allow greater flexibility in recording and processing the data while they are collected, as well as greater ease during analysis. A small cleaning sketch follows.
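A minimal sketch of this collect-and-store step using pandas (assumed installed, along with openpyxl for Excel output); the file and column names are illustrative:

```python
import pandas as pd

# Load survey responses exported from a questionnaire (file name illustrative).
df = pd.read_csv("responses.csv")

# Edit in place: normalize an inconsistent label and drop incomplete records.
df["gender"] = df["gender"].str.strip().str.lower()
df = df.dropna(subset=["age", "score"])

# Store the cleaned data for the analysis phase.
df.to_excel("responses_clean.xlsx", index=False)
```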
Data exposition:
Researchers are eager to see the data: what they look like, how they are distributed, and so on. They also examine different dimensions of the variables or plot them in various charts using a statistical application, as in the sketch below.
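A minimal exposition sketch with pandas and matplotlib (both assumed installed); the file and column names are illustrative:

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("responses.csv")  # illustrative file from the collection phase

# Look at how one variable is distributed before any formal analysis.
df["score"].hist(bins=20)
plt.xlabel("score")
plt.ylabel("frequency")
plt.title("Distribution of scores")
plt.show()
```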
Data Analysis and Computer:
This phase consists of the statistical analysis of the data, interpretation of the results, and hypothesis testing, all of which can be done with the help of computers. Data analysis software supports techniques such as averages, percentages, correlation, and other mathematical calculations.
Packages used for data analysis include SPSS, Stata, and SYSTAT. Computers are useful not only for statistical analysis but also for monitoring the accuracy and completeness of the data as they are collected, and these packages can display the results as charts or graphs.
Computers are also used in interpretation: they can check the accuracy and authenticity of the data and help in drafting the tables from which a researcher interprets the results. A minimal analysis sketch follows.
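The same basic quantities (averages, percentages, correlation) in a short pandas sketch; the column names, including the 0/1 "passed" flag, are illustrative:

```python
import pandas as pd

df = pd.read_csv("responses.csv")   # illustrative

print(df["score"].mean())           # average
print(df["score"].describe())       # count, mean, std, quartiles
print(df["passed"].mean() * 100)    # percentage, assuming 'passed' is coded 0/1
print(df["age"].corr(df["score"]))  # Pearson correlation between two variables
```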
Role of Computer in Research Publication
After interpretation, the computer helps convert the results into a research article or report for publication. This phase consists of preparing the report or presenting the results, i.e., the formal write-up of the conclusions reached. The research article, paper, thesis, or dissertation is typed in word-processing software, converted to Portable Document Format (PDF), and stored and/or published on the World Wide Web. Online services can convert a word-processor file into formats such as HTML or PDF, and one can even prepare a document with online word-processing software and store, edit, and access it from anywhere over the internet.
References and the Computer
After completing the document, a researcher needs to cite the sources of the literature studied and discussed. Computers help in preparing these references, which can be written in different styles: the details of authors, journals, publication volumes, and books are entered into the reference options of the software, which automatically formats the information in the required style. Dedicated software is used to manage the references.
A researcher need not worry about remembering every article from which literature was taken; this is easily managed with the help of computers, as the sketch below illustrates.
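Reference managers such as Zotero or EndNote do this at scale; the toy Python sketch below shows only the underlying idea of rendering one stored reference in a chosen style. The function and field names are hypothetical:

```python
def format_reference(author, year, title, journal, volume, pages, style="apa"):
    """Render one stored reference in the requested style (toy example)."""
    if style == "apa":
        return f"{author} ({year}). {title}. {journal}, {volume}, {pages}."
    if style == "vancouver":
        return f"{author}. {title}. {journal}. {year};{volume}:{pages}."
    raise ValueError(f"unknown style: {style}")

ref = dict(author="Sarker, I. H.", year=2021,
           title="Machine learning: algorithms, real-world applications and research directions",
           journal="SN Computer Science", volume=2, pages="160")
print(format_reference(**ref, style="apa"))
print(format_reference(**ref, style="vancouver"))
```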
Simulation:
Simulation is the imitation of the operation of a real-world process or system over time. It is used in many contexts, such as the simulation of technology for performance optimization, safety engineering, testing, training, education, and video games. Often, computer experiments are used to study simulation models. Simulation can show the eventual real effects of alternative conditions and courses of action, and it is mainly used when the real system cannot be engaged: because it is not accessible, because engaging it would be dangerous or unacceptable, because it is being designed but not yet built, or because it simply does not exist. Using computers, simulations are carried out in many fields of research; a minimal Monte Carlo sketch follows.
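As a tiny, self-contained example of the idea (not from the original module), a Monte Carlo simulation estimating pi by sampling random points in a unit square:

```python
import random

def estimate_pi(trials: int = 1_000_000) -> float:
    """Estimate pi from the fraction of random points inside a quarter circle."""
    inside = 0
    for _ in range(trials):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4 * inside / trials

print(estimate_pi())  # approaches 3.14159... as the number of trials grows
```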
Role of Computers in Scientific Research:
There are various computer applications used in scientific research. Some of the most important applications used in scientific research are data storage, data analysis, scientific simulations, instrumentation control and knowledge sharing.
Data Storage
Experimentation is the basis of scientific research. Scientific experiment in any of the natural sciences generates a lot of data that needs to be stored and analyzed to derive important conclusions, to validate or disprove hypotheses. Computers attached with experiential apparatuses, directly record data as its generated and subject it to analysis through specially designed software. Data storage is possible in SPSS data file, lotus spreadsheet, excel spreadsheet, DOS text file etc
Data Analysis
Analyzing Huge number of statistical data is made possible using specially designed algorithms that are implemented by computers. This makes the extremely time-consuming job of data analysis to be matter of a few minutes. In genetic engineering, computers have made the sequencing of the entire human genome possible. Data got from different sources can be stored and accessed via computer networks set up in research labs, which makes collaboration simpler.
Scientific Simulations
One of the prime uses of computers in pure science and engineering projects is the running of simulations. A simulation is a mathematical modeling of a problem and a virtual study of its possible solutions.
For example, astrophysicists carry out structure formation simulations aimed at studying how large-scale structures like galaxies form. Space missions to the Moon, satellite launches, and interplanetary missions are first simulated on computers to determine the best path the launch vehicle and spacecraft can take to reach their destination safely, as in the toy integration sketch below.
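A real mission simulation involves far richer physics, but the core idea of stepping a trajectory forward in time can be shown with a toy Euler integration of a projectile; all values are illustrative:

```python
# Toy Euler integration of a projectile under constant gravity.
dt, g = 0.01, 9.81      # time step (s), gravitational acceleration (m/s^2)
x, y = 0.0, 0.0         # position (m)
vx, vy = 50.0, 50.0     # initial velocity components (m/s), illustrative

while y >= 0.0:
    x += vx * dt        # advance position by one small time step
    y += vy * dt
    vy -= g * dt        # gravity slows the climb, then pulls the body down

print(f"range = {x:.1f} m (analytic value 2*vx*vy/g = 509.7 m)")
```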
Instrumentation Control
Most advanced scientific instruments come with their own on-board computer, which can be programmed to execute various functions. For example, the Hubble Space Telescope has its own onboard computer system, which is remotely programmed to probe deep space. Instrumentation control is one of the most important applications of computers.
Knowledge Sharing through Internet
In the form of Internet, computers have provided an entirely new way to share knowledge. Today, anyone can access the latest research papers that are made available for free on websites. Sharing of knowledge and collaboration through the Internet has made international cooperation on scientific projects possible.
Through various kinds of analytical software, computers are contributing to scientific research in every discipline, from biology to astrophysics, discovering new patterns and providing novel insights.
As work on neural-network-based artificial intelligence advances and computers gain the ability to learn and reason for themselves, future advances in technology and research will be even more rapid.
Tools and Applications Used in the Research Process
Statistical Analysis Tool: SPSS
SPSS is one of the most popular tools among statisticians; the name stands for Statistical Package for the Social Sciences.
It provides analysis facilities such as the following, and many more (a short worked example appears after the list):
- Data view & variable view
- Measures of central tendency & dispersion
- Statistical inference
- Correlation & regression analysis
- Analysis of variance
- Non-parametric tests
- Hypothesis tests: t-test, chi-square, z-test, ANOVA, etc.
- Multivariate data analysis
- Frequency distribution
- Data exposition using various graphs: line, scatter, bar, ogive, histogram, etc.
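SPSS itself is point-and-click, but the same kind of statistical inference can be sketched in Python with scipy (assumed installed); the two groups of scores are illustrative:

```python
from scipy import stats

# Two illustrative groups of scores (e.g., control vs. treatment).
group_a = [72, 75, 78, 71, 74, 77, 73]
group_b = [80, 82, 79, 85, 81, 78, 84]

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value (e.g., below 0.05) suggests the group means really differ.
```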
Data Analysis Tool:
Spreadsheet Packages
A spreadsheet is a computer application that simulates a paper worksheet. It displays multiple cells that together make up a grid of rows and columns, each cell containing either alphanumeric text or numeric values. Microsoft Excel is a popular spreadsheet package; others include Lotus 1-2-3, Quattro Pro, Javelin Plus, Multiplan, VisiCalc, SuperCalc, and PlanPerfect.
Other Statistical Tools
SAS, S-Plus, LISREL, EViews, etc.
Word Processor Packages
A word processor (more formally, a document preparation system) is a computer application used for the production of any sort of printable material, including composition, editing, formatting, and possibly printing.
Common word-processing packages include Microsoft Word, WordStar, WordPerfect, and AmiPro.
Presentation Software
A presentation program is a software package used to display information, normally in the form of a slide show. It typically includes three major functions: an editor that allows text to be inserted and formatted, a method for inserting and manipulating graphic images, and a slideshow system to display the content. Presentation packages include Microsoft PowerPoint, Lotus Freelance Graphics, Corel Presentations, and Apple Keynote.
DATABASE MANAGEMENT PACKAGES (DBMS)
A database is an organized collection of information, and a DBMS is software designed to manage a database. Desktop databases include Microsoft Access, Paradox, dBase/dBase III+, FoxBase, FoxPro/Visual FoxPro, and FileMaker Pro. Commercial database servers that support multiple users include Oracle, MS SQL Server, Sybase, Ingres, Informix, DB2 UDB (IBM), Unify, and Integral. Open-source database packages include MySQL, PostgreSQL, and Firebird. A minimal sketch follows.
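A minimal DBMS sketch using SQLite, which ships with Python, so no separate database server is needed; the file, table, and column names are illustrative:

```python
import sqlite3

con = sqlite3.connect("research.db")  # database file name is illustrative
cur = con.cursor()
cur.execute(
    "CREATE TABLE IF NOT EXISTS subjects (id INTEGER PRIMARY KEY, age INTEGER, score REAL)"
)
cur.executemany(
    "INSERT INTO subjects (age, score) VALUES (?, ?)",
    [(21, 3.4), (25, 3.9), (30, 3.1)],
)
con.commit()

# Query the stored data back out.
for row in cur.execute("SELECT age, score FROM subjects WHERE score > 3.2"):
    print(row)
con.close()
```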
BROWSERS
A web browser is a software application that enables a user to display and interact with text, images, videos, music, games, and other information, typically located on a web page at a website on the World Wide Web or a local area network.
Examples are Microsoft Internet Explorer, Mozilla Firefox, Opera, Netscape Navigator, and Chrome.
Computers have helped overcome many difficulties faced by human beings. Over time, they have shrunk from the size of a room to the size of a human palm, and they perform a wide variety of jobs with speed and accuracy.
Today, life without computers is hard to imagine. They are used in schools and colleges and have become an indispensable part of every business and profession; research is another area where computers play a major role.
The use of computers in scientific research is so extensive that it is difficult to conceive of a research project today without them. Many studies, particularly those involving complex computations, data analysis, and modeling, cannot be carried out without computers, which are used at every stage, from the study proposal and budget through to the submission and presentation of findings.
- Artificial Intelligence
- Software Engineering
- Computer Vision
- Human Computer Interaction
- Machine Learning
- Data Mining
- Computer Graphics
- Distributed Computing
- Computer Networks
- Cloud Computing
500+ Computer Science Research Topics
Computer Science is a constantly evolving field that has transformed the world we live in today. With new technologies emerging every day, there are countless research opportunities in this field. Whether you are interested in artificial intelligence, machine learning, cybersecurity, data analytics, or computer networks, there are endless possibilities to explore. In this post, we will delve into some of the most interesting and important research topics in Computer Science. From the latest advancements in programming languages to the development of cutting-edge algorithms, we will explore the latest trends and innovations that are shaping the future of Computer Science. So, whether you are a student or a professional, read on to discover some of the most exciting research topics in this dynamic and rapidly expanding field.
Computer Science Research Topics
Computer Science Research Topics are as follows:
- Using machine learning to detect and prevent cyber attacks
- Developing algorithms for optimized resource allocation in cloud computing
- Investigating the use of blockchain technology for secure and decentralized data storage
- Developing intelligent chatbots for customer service
- Investigating the effectiveness of deep learning for natural language processing
- Developing algorithms for detecting and removing fake news from social media
- Investigating the impact of social media on mental health
- Developing algorithms for efficient image and video compression
- Investigating the use of big data analytics for predictive maintenance in manufacturing
- Developing algorithms for identifying and mitigating bias in machine learning models
- Investigating the ethical implications of autonomous vehicles
- Developing algorithms for detecting and preventing cyberbullying
- Investigating the use of machine learning for personalized medicine
- Developing algorithms for efficient and accurate speech recognition
- Investigating the impact of social media on political polarization
- Developing algorithms for sentiment analysis in social media data
- Investigating the use of virtual reality in education
- Developing algorithms for efficient data encryption and decryption
- Investigating the impact of technology on workplace productivity
- Developing algorithms for detecting and mitigating deepfakes
- Investigating the use of artificial intelligence in financial trading
- Developing algorithms for efficient database management
- Investigating the effectiveness of online learning platforms
- Developing algorithms for efficient and accurate facial recognition
- Investigating the use of machine learning for predicting weather patterns
- Developing algorithms for efficient and secure data transfer
- Investigating the impact of technology on social skills and communication
- Developing algorithms for efficient and accurate object recognition
- Investigating the use of machine learning for fraud detection in finance
- Developing algorithms for efficient and secure authentication systems
- Investigating the impact of technology on privacy and surveillance
- Developing algorithms for efficient and accurate handwriting recognition
- Investigating the use of machine learning for predicting stock prices
- Developing algorithms for efficient and secure biometric identification
- Investigating the impact of technology on mental health and well-being
- Developing algorithms for efficient and accurate language translation
- Investigating the use of machine learning for personalized advertising
- Developing algorithms for efficient and secure payment systems
- Investigating the impact of technology on the job market and automation
- Developing algorithms for efficient and accurate object tracking
- Investigating the use of machine learning for predicting disease outbreaks
- Developing algorithms for efficient and secure access control
- Investigating the impact of technology on human behavior and decision making
- Developing algorithms for efficient and accurate sound recognition
- Investigating the use of machine learning for predicting customer behavior
- Developing algorithms for efficient and secure data backup and recovery
- Investigating the impact of technology on education and learning outcomes
- Developing algorithms for efficient and accurate emotion recognition
- Investigating the use of machine learning for improving healthcare outcomes
- Developing algorithms for efficient and secure supply chain management
- Investigating the impact of technology on cultural and societal norms
- Developing algorithms for efficient and accurate gesture recognition
- Investigating the use of machine learning for predicting consumer demand
- Developing algorithms for efficient and secure cloud storage
- Investigating the impact of technology on environmental sustainability
- Developing algorithms for efficient and accurate voice recognition
- Investigating the use of machine learning for improving transportation systems
- Developing algorithms for efficient and secure mobile device management
- Investigating the impact of technology on social inequality and access to resources
- Machine learning for healthcare diagnosis and treatment
- Machine Learning for Cybersecurity
- Machine learning for personalized medicine
- Cybersecurity threats and defense strategies
- Big data analytics for business intelligence
- Blockchain technology and its applications
- Human-computer interaction in virtual reality environments
- Artificial intelligence for autonomous vehicles
- Natural language processing for chatbots
- Cloud computing and its impact on the IT industry
- Internet of Things (IoT) and smart homes
- Robotics and automation in manufacturing
- Augmented reality and its potential in education
- Data mining techniques for customer relationship management
- Computer vision for object recognition and tracking
- Quantum computing and its applications in cryptography
- Social media analytics and sentiment analysis
- Recommender systems for personalized content delivery
- Mobile computing and its impact on society
- Bioinformatics and genomic data analysis
- Deep learning for image and speech recognition
- Digital signal processing and audio processing algorithms
- Cloud storage and data security in the cloud
- Wearable technology and its impact on healthcare
- Computational linguistics for natural language understanding
- Cognitive computing for decision support systems
- Cyber-physical systems and their applications
- Edge computing and its impact on IoT
- Machine learning for fraud detection
- Cryptography and its role in secure communication
- Cybersecurity risks in the era of the Internet of Things
- Natural language generation for automated report writing
- 3D printing and its impact on manufacturing
- Virtual assistants and their applications in daily life
- Cloud-based gaming and its impact on the gaming industry
- Computer networks and their security issues
- Cyber forensics and its role in criminal investigations
- Machine learning for predictive maintenance in industrial settings
- Augmented reality for cultural heritage preservation
- Human-robot interaction and its applications
- Data visualization and its impact on decision-making
- Cybersecurity in financial systems and blockchain
- Computer graphics and animation techniques
- Biometrics and its role in secure authentication
- Cloud-based e-learning platforms and their impact on education
- Natural language processing for machine translation
- Machine learning for predictive maintenance in healthcare
- Cybersecurity and privacy issues in social media
- Computer vision for medical image analysis
- Natural language generation for content creation
- Cybersecurity challenges in cloud computing
- Human-robot collaboration in manufacturing
- Data mining for predicting customer churn
- Artificial intelligence for autonomous drones
- Cybersecurity risks in the healthcare industry
- Machine learning for speech synthesis
- Edge computing for low-latency applications
- Virtual reality for mental health therapy
- Quantum computing and its applications in finance
- Biomedical engineering and its applications
- Cybersecurity in autonomous systems
- Machine learning for predictive maintenance in transportation
- Computer vision for object detection in autonomous driving
- Augmented reality for industrial training and simulations
- Cloud-based cybersecurity solutions for small businesses
- Natural language processing for knowledge management
- Machine learning for personalized advertising
- Cybersecurity in the supply chain management
- Cybersecurity risks in the energy sector
- Computer vision for facial recognition
- Natural language processing for social media analysis
- Machine learning for sentiment analysis in customer reviews
- Explainable Artificial Intelligence
- Quantum Computing
- Blockchain Technology
- Human-Computer Interaction
- Natural Language Processing
- Cloud Computing
- Robotics and Automation
- Augmented Reality and Virtual Reality
- Cyber-Physical Systems
- Computational Neuroscience
- Big Data Analytics
- Computer Vision
- Cryptography and Network Security
- Internet of Things
- Computer Graphics and Visualization
- Artificial Intelligence for Game Design
- Computational Biology
- Social Network Analysis
- Bioinformatics
- Distributed Systems and Middleware
- Information Retrieval and Data Mining
- Computer Networks
- Mobile Computing and Wireless Networks
- Software Engineering
- Database Systems
- Parallel and Distributed Computing
- Human-Robot Interaction
- Intelligent Transportation Systems
- High-Performance Computing
- Cyber-Physical Security
- Deep Learning
- Sensor Networks
- Multi-Agent Systems
- Human-Centered Computing
- Wearable Computing
- Knowledge Representation and Reasoning
- Adaptive Systems
- Brain-Computer Interface
- Health Informatics
- Cognitive Computing
- Cybersecurity and Privacy
- Internet Security
- Cybercrime and Digital Forensics
- Cloud Security
- Cryptocurrencies and Digital Payments
- Machine Learning for Natural Language Generation
- Cognitive Robotics
- Neural Networks
- Semantic Web
- Image Processing
- Cyber Threat Intelligence
- Secure Mobile Computing
- Cybersecurity Education and Training
- Privacy Preserving Techniques
- Cyber-Physical Systems Security
- Virtualization and Containerization
- Machine Learning for Computer Vision
- Network Function Virtualization
- Cybersecurity Risk Management
- Information Security Governance
- Intrusion Detection and Prevention
- Biometric Authentication
- Machine Learning for Predictive Maintenance
- Security in Cloud-based Environments
- Cybersecurity for Industrial Control Systems
- Smart Grid Security
- Software Defined Networking
- Quantum Cryptography
- Security in the Internet of Things
- Natural language processing for sentiment analysis
- Blockchain technology for secure data sharing
- Developing efficient algorithms for big data analysis
- Cybersecurity for internet of things (IoT) devices
- Human-robot interaction for industrial automation
- Image recognition for autonomous vehicles
- Social media analytics for marketing strategy
- Quantum computing for solving complex problems
- Biometric authentication for secure access control
- Augmented reality for education and training
- Intelligent transportation systems for traffic management
- Predictive modeling for financial markets
- Cloud computing for scalable data storage and processing
- Virtual reality for therapy and mental health treatment
- Data visualization for business intelligence
- Recommender systems for personalized product recommendations
- Speech recognition for voice-controlled devices
- Mobile computing for real-time location-based services
- Neural networks for predicting user behavior
- Genetic algorithms for optimization problems
- Distributed computing for parallel processing
- Internet of things (IoT) for smart cities
- Wireless sensor networks for environmental monitoring
- Cloud-based gaming for high-performance gaming
- Social network analysis for identifying influencers
- Autonomous systems for agriculture
- Robotics for disaster response
- Data mining for customer segmentation
- Computer graphics for visual effects in movies and video games
- Virtual assistants for personalized customer service
- Natural language understanding for chatbots
- 3D printing for manufacturing prototypes
- Artificial intelligence for stock trading
- Machine learning for weather forecasting
- Biomedical engineering for prosthetics and implants
- Cybersecurity for financial institutions
- Machine learning for energy consumption optimization
- Computer vision for object tracking
- Natural language processing for document summarization
- Wearable technology for health and fitness monitoring
- Internet of things (IoT) for home automation
- Reinforcement learning for robotics control
- Big data analytics for customer insights
- Machine learning for supply chain optimization
- Natural language processing for legal document analysis
- Artificial intelligence for drug discovery
- Computer vision for object recognition in robotics
- Data mining for customer churn prediction
- Autonomous systems for space exploration
- Robotics for agriculture automation
- Machine learning for predicting earthquakes
- Natural language processing for sentiment analysis in customer reviews
- Big data analytics for predicting natural disasters
- Internet of things (IoT) for remote patient monitoring
- Blockchain technology for digital identity management
- Machine learning for predicting wildfire spread
- Computer vision for gesture recognition
- Natural language processing for automated translation
- Big data analytics for fraud detection in banking
- Internet of things (IoT) for smart homes
- Robotics for warehouse automation
- Machine learning for predicting air pollution
- Natural language processing for medical record analysis
- Augmented reality for architectural design
- Big data analytics for predicting traffic congestion
- Machine learning for predicting customer lifetime value
- Developing algorithms for efficient and accurate text recognition
- Natural Language Processing for Virtual Assistants
- Natural Language Processing for Sentiment Analysis in Social Media
- Explainable Artificial Intelligence (XAI) for Trust and Transparency
- Deep Learning for Image and Video Retrieval
- Edge Computing for Internet of Things (IoT) Applications
- Data Science for Social Media Analytics
- Cybersecurity for Critical Infrastructure Protection
- Natural Language Processing for Text Classification
- Quantum Computing for Optimization Problems
- Machine Learning for Personalized Health Monitoring
- Computer Vision for Autonomous Driving
- Blockchain Technology for Supply Chain Management
- Augmented Reality for Education and Training
- Natural Language Processing for Sentiment Analysis
- Machine Learning for Personalized Marketing
- Big Data Analytics for Financial Fraud Detection
- Cybersecurity for Cloud Security Assessment
- Artificial Intelligence for Natural Language Understanding
- Blockchain Technology for Decentralized Applications
- Virtual Reality for Cultural Heritage Preservation
- Natural Language Processing for Named Entity Recognition
- Machine Learning for Customer Churn Prediction
- Big Data Analytics for Social Network Analysis
- Cybersecurity for Intrusion Detection and Prevention
- Artificial Intelligence for Robotics and Automation
- Blockchain Technology for Digital Identity Management
- Virtual Reality for Rehabilitation and Therapy
- Natural Language Processing for Text Summarization
- Machine Learning for Credit Risk Assessment
- Big Data Analytics for Fraud Detection in Healthcare
- Cybersecurity for Internet Privacy Protection
- Artificial Intelligence for Game Design and Development
- Blockchain Technology for Decentralized Social Networks
- Virtual Reality for Marketing and Advertising
- Natural Language Processing for Opinion Mining
- Machine Learning for Anomaly Detection
- Big Data Analytics for Predictive Maintenance in Transportation
- Cybersecurity for Network Security Management
- Artificial Intelligence for Personalized News and Content Delivery
- Blockchain Technology for Cryptocurrency Mining
- Virtual Reality for Architectural Design and Visualization
- Natural Language Processing for Machine Translation
- Machine Learning for Automated Image Captioning
- Big Data Analytics for Stock Market Prediction
- Cybersecurity for Biometric Authentication Systems
- Artificial Intelligence for Human-Robot Interaction
- Blockchain Technology for Smart Grids
- Virtual Reality for Sports Training and Simulation
- Natural Language Processing for Question Answering Systems
- Machine Learning for Sentiment Analysis in Customer Feedback
- Big Data Analytics for Predictive Maintenance in Manufacturing
- Cybersecurity for Cloud-Based Systems
- Artificial Intelligence for Automated Journalism
- Blockchain Technology for Intellectual Property Management
- Virtual Reality for Therapy and Rehabilitation
- Natural Language Processing for Language Generation
- Machine Learning for Customer Lifetime Value Prediction
- Big Data Analytics for Predictive Maintenance in Energy Systems
- Cybersecurity for Secure Mobile Communication
- Artificial Intelligence for Emotion Recognition
- Blockchain Technology for Digital Asset Trading
- Virtual Reality for Automotive Design and Visualization
- Natural Language Processing for Semantic Web
- Machine Learning for Fraud Detection in Financial Transactions
- Big Data Analytics for Social Media Monitoring
- Cybersecurity for Cloud Storage and Sharing
- Artificial Intelligence for Personalized Education
- Blockchain Technology for Secure Online Voting Systems
- Virtual Reality for Cultural Tourism
- Natural Language Processing for Chatbot Communication
- Machine Learning for Medical Diagnosis and Treatment
- Big Data Analytics for Environmental Monitoring and Management
- Cybersecurity for Cloud Computing Environments
- Virtual Reality for Training and Simulation
- Big Data Analytics for Sports Performance Analysis
- Cybersecurity for Internet of Things (IoT) Devices
- Artificial Intelligence for Traffic Management and Control
- Blockchain Technology for Smart Contracts
- Natural Language Processing for Document Summarization
- Machine Learning for Image and Video Recognition
- Blockchain Technology for Digital Asset Management
- Virtual Reality for Entertainment and Gaming
- Natural Language Processing for Opinion Mining in Online Reviews
- Machine Learning for Customer Relationship Management
- Big Data Analytics for Environmental Monitoring and Management
- Cybersecurity for Network Traffic Analysis and Monitoring
- Artificial Intelligence for Natural Language Generation
- Blockchain Technology for Supply Chain Transparency and Traceability
- Virtual Reality for Design and Visualization
- Natural Language Processing for Speech Recognition
- Machine Learning for Recommendation Systems
- Big Data Analytics for Customer Segmentation and Targeting
- Cybersecurity for Biometric Authentication
- Artificial Intelligence for Human-Computer Interaction
- Blockchain Technology for Decentralized Finance (DeFi)
- Virtual Reality for Tourism and Cultural Heritage
- Machine Learning for Cybersecurity Threat Detection and Prevention
- Big Data Analytics for Healthcare Cost Reduction
- Cybersecurity for Data Privacy and Protection
- Artificial Intelligence for Autonomous Vehicles
- Blockchain Technology for Cryptocurrency and Blockchain Security
- Virtual Reality for Real Estate Visualization
- Natural Language Processing for Question Answering
- Big Data Analytics for Financial Markets Prediction
- Cybersecurity for Cloud-Based Machine Learning Systems
- Artificial Intelligence for Personalized Advertising
- Blockchain Technology for Digital Identity Verification
- Virtual Reality for Cultural and Language Learning
- Natural Language Processing for Semantic Analysis
- Machine Learning for Business Forecasting
- Big Data Analytics for Social Media Marketing
- Artificial Intelligence for Content Generation
- Blockchain Technology for Smart Cities
- Virtual Reality for Historical Reconstruction
- Natural Language Processing for Knowledge Graph Construction
- Machine Learning for Speech Synthesis
- Big Data Analytics for Traffic Optimization
- Artificial Intelligence for Social Robotics
- Blockchain Technology for Healthcare Data Management
- Virtual Reality for Disaster Preparedness and Response
- Natural Language Processing for Multilingual Communication
- Machine Learning for Emotion Recognition
- Big Data Analytics for Human Resources Management
- Cybersecurity for Mobile App Security
- Artificial Intelligence for Financial Planning and Investment
- Blockchain Technology for Energy Management
- Virtual Reality for Cultural Preservation and Heritage
- Big Data Analytics for Healthcare Management
- Cybersecurity in the Internet of Things (IoT)
- Artificial Intelligence for Predictive Maintenance
- Computational Biology for Drug Discovery
- Virtual Reality for Mental Health Treatment
- Machine Learning for Sentiment Analysis in Social Media
- Human-Computer Interaction for User Experience Design
- Cloud Computing for Disaster Recovery
- Quantum Computing for Cryptography
- Intelligent Transportation Systems for Smart Cities
- Cybersecurity for Autonomous Vehicles
- Artificial Intelligence for Fraud Detection in Financial Systems
- Social Network Analysis for Marketing Campaigns
- Cloud Computing for Video Game Streaming
- Machine Learning for Speech Recognition
- Augmented Reality for Architecture and Design
- Natural Language Processing for Customer Service Chatbots
- Machine Learning for Climate Change Prediction
- Big Data Analytics for Social Sciences
- Artificial Intelligence for Energy Management
- Virtual Reality for Tourism and Travel
- Cybersecurity for Smart Grids
- Machine Learning for Image Recognition
- Augmented Reality for Sports Training
- Natural Language Processing for Content Creation
- Cloud Computing for High-Performance Computing
- Artificial Intelligence for Personalized Medicine
- Virtual Reality for Architecture and Design
- Augmented Reality for Product Visualization
- Natural Language Processing for Language Translation
- Cybersecurity for Cloud Computing
- Artificial Intelligence for Supply Chain Optimization
- Blockchain Technology for Digital Voting Systems
- Virtual Reality for Job Training
- Augmented Reality for Retail Shopping
- Natural Language Processing for Sentiment Analysis in Customer Feedback
- Cloud Computing for Mobile Application Development
- Artificial Intelligence for Cybersecurity Threat Detection
- Blockchain Technology for Intellectual Property Protection
- Virtual Reality for Music Education
- Machine Learning for Financial Forecasting
- Augmented Reality for Medical Education
- Natural Language Processing for News Summarization
- Cybersecurity for Healthcare Data Protection
- Artificial Intelligence for Autonomous Robots
- Virtual Reality for Fitness and Health
- Machine Learning for Natural Language Understanding
- Augmented Reality for Museum Exhibits
- Natural Language Processing for Chatbot Personality Development
- Cloud Computing for Website Performance Optimization
- Artificial Intelligence for E-commerce Recommendation Systems
- Blockchain Technology for Supply Chain Traceability
- Virtual Reality for Military Training
- Augmented Reality for Advertising
- Natural Language Processing for Chatbot Conversation Management
- Cybersecurity for Cloud-Based Services
- Artificial Intelligence for Agricultural Management
- Blockchain Technology for Food Safety Assurance
- Virtual Reality for Historical Reenactments
- Machine Learning for Cybersecurity Incident Response
- Secure Multiparty Computation
- Federated Learning
- Internet of Things Security
- Blockchain Scalability
- Quantum Computing Algorithms
- Explainable AI
- Data Privacy in the Age of Big Data
- Adversarial Machine Learning
- Deep Reinforcement Learning
- Online Learning and Streaming Algorithms
- Graph Neural Networks
- Automated Debugging and Fault Localization
- Mobile Application Development
- Software Engineering for Cloud Computing
- Cryptocurrency Security
- Edge Computing for Real-Time Applications
- Natural Language Generation
- Virtual and Augmented Reality
- Computational Biology and Bioinformatics
- Internet of Things Applications
- Robotics and Autonomous Systems
- Explainable Robotics
- 3D Printing and Additive Manufacturing
- Distributed Systems
- Parallel Computing
- Data Center Networking
- Data Mining and Knowledge Discovery
- Information Retrieval and Search Engines
- Network Security and Privacy
- Cloud Computing Security
- Data Analytics for Business Intelligence
- Neural Networks and Deep Learning
- Reinforcement Learning for Robotics
- Automated Planning and Scheduling
- Evolutionary Computation and Genetic Algorithms
- Formal Methods for Software Engineering
- Computational Complexity Theory
- Bio-inspired Computing
- Computer Vision for Object Recognition
- Automated Reasoning and Theorem Proving
- Natural Language Understanding
- Machine Learning for Healthcare
- Scalable Distributed Systems
- Sensor Networks and Internet of Things
- Smart Grids and Energy Systems
- Software Testing and Verification
- Web Application Security
- Wireless and Mobile Networks
- Computer Architecture and Hardware Design
- Digital Signal Processing
- Game Theory and Mechanism Design
- Multi-agent Systems
- Evolutionary Robotics
- Quantum Machine Learning
- Computational Social Science
- Explainable Recommender Systems
- Artificial Intelligence and its applications
- Cloud computing and its benefits
- Cybersecurity threats and solutions
- Internet of Things and its impact on society
- Virtual and Augmented Reality and its uses
- Blockchain Technology and its potential in various industries
- Web Development and Design
- Digital Marketing and its effectiveness
- Big Data and Analytics
- Software Development Life Cycle
- Game Development and its growth
- Network Administration and Maintenance
- Machine Learning and its uses
- Data Warehousing and Mining
- Computer Architecture and Design
- Computer Graphics and Animation
- Quantum Computing and its potential
- Data Structures and Algorithms
- Computer Vision and Image Processing
- Robotics and its applications
- Operating Systems and their functions
- Information Theory and Coding
- Compiler Design and Optimization
- Computer Forensics and Cyber Crime Investigation
- Distributed Computing and its significance
- Artificial Neural Networks and Deep Learning
- Cloud Storage and Backup
- Programming Languages and their significance
- Computer Simulation and Modeling
- Computer Networks and their types
- Information Security and its types
- Computer-based Training and eLearning
- Medical Imaging and its uses
- Social Media Analysis and its applications
- Human Resource Information Systems
- Computer-Aided Design and Manufacturing
- Multimedia Systems and Applications
- Geographic Information Systems and their uses
- Computer-Assisted Language Learning
- Mobile Device Management and Security
- Data Compression and its types
- Knowledge Management Systems
- Text Mining and its uses
- Cyber Warfare and its consequences
- Wireless Networks and their advantages
- Computer Ethics and its importance
- Computational Linguistics and its applications
- Autonomous Systems and Robotics
- Information Visualization and its importance
- Geographic Information Retrieval and Mapping
- Business Intelligence and its benefits
- Digital Libraries and their significance
- Artificial Life and Evolutionary Computation
- Computer Music and its types
- Virtual Teams and Collaboration
- Computer Games and Learning
- Semantic Web and its applications
- Electronic Commerce and its advantages
- Multimedia Databases and their significance
- Computer Science Education and its importance
- Computer-Assisted Translation and Interpretation
- Ambient Intelligence and Smart Homes
- Autonomous Agents and Multi-Agent Systems
Free software applications for authors for writing a research paper
Himel Mondal, Ayesha Juhi, Anupkumar D Dhanvijay, Mohammed Jaffer Pinjar, Shaikat Mondal
Address for correspondence: Dr. Shaikat Mondal, Department of Physiology, Raiganj Government Medical College and Hospital, Raiganj - 733 134, West Bengal, India. E-mail: [email protected]
Received 2023 Mar 3; Revised 2023 Jun 20; Accepted 2023 Jun 22; Issue date 2023 Sep.
This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.
Basic computer skills are essential for authors writing research papers because they can make the task considerably easier for a researcher. This article provides an overview of the essential software programs for a novice author writing a research paper. These applications help streamline the writing process, improve the quality of work, and ensure that papers are formatted correctly. It covers word processing software, grammar correction software, bibliography management software, paraphrasing tools, writing tools, and statistical software. All of the tools described are free to use; hence, they can help researchers in resource-limited settings or busy physicians who have little time for research writing. We hope this review provides valuable insights and guidance for novice authors looking to write a high-quality research paper.
Keywords: Authors, computer skills, grammar correction, novice author, paraphrasing tool, research papers, software programs, statistical software, word processing, writing
Introduction
An author is one who “writes a book, article, play, etc.” A researcher is “someone whose job is to study a subject carefully, especially in order to discover new information or understand the subject better.” However, in a broad sense, a researcher is an author first. In a research cycle, a researcher needs to become an author from the very beginning of the research (preparation of proposal) to the end of the research (writing a paper for publication).[ 1 ]
Basic computer skills are essential for authors writing research papers because computers and technology have become a fundamental part of the research and writing process. For a new author writing a research paper, several essential software skills can help streamline the writing process, improve the quality of work, and ensure that the proposal or paper is formatted correctly.[ 2 ] However, these skills are rarely taught in formal undergraduate or postgraduate courses of study.
In this context, this article discusses some of the basic software skills that may enhance the quality of a research article. These include word processing software, grammar-checking software, paraphrasing tools, statistical software, writing tools, and a keyword-searching tool.
Software applications
We describe some of the free software applications that may help authors during the preparation of a research paper. All the applications described can either be installed on a computer or used online without paying any fees. The websites where the tools are available are shown in Table 1.
Software application (downloadable and online) with websites and their primary use for research purposes
WPS=Writer, Presentation and Spreadsheets, MeSH=Medical Subject Headings, DOI=Digital Object Identifier, JANE=Journal/Author Name Estimator, GPT=Generative Pre-training Transformer
Computer software applications
Apache OpenOffice is a free and open-source office software suite that includes a word processor (Writer), spreadsheet, presentation software, and other tools. OpenOffice Writer is similar to Microsoft Word and can be used to write and format your research article.[ 3 ] The program saves files in its own format (.odt) and can also save them in Microsoft Word document format (.doc); hence, any text typed in this program can easily be opened with Microsoft Word. Along with typing an article, this program can help in making a flow chart (e.g. a PRISMA flow chart for a systematic review and meta-analysis) for research articles. Figure 1 shows the user interface of OpenOffice Writer.
A portion of a story written on OpenOffice Writer showing the user interface
There is an alternative office suite called WPS (an acronym for Writer, Presentation, and Spreadsheets) Office. Its basic personal version is free to use; however, the full version needs a subscription. Researchers who are not comfortable with OpenOffice can use this software for writing their papers.
JAMOVI is open-source software for statistical analysis, which means that it is free to download and use. This can be particularly useful for researchers on a budget who do not have access to expensive commercial software. JAMOVI has a user-friendly interface that is easy to navigate, even for beginners, and it offers a wide range of statistical analyses, including t-tests, ANOVA, regression, and factor analysis. It is particularly well suited for researchers who need to conduct statistical analyses but are not familiar with the more complex features of traditional statistical software such as the Statistical Package for the Social Sciences (SPSS).[ 4 ] Figure 2 shows a part of the software when we conducted a Wilcoxon signed rank test (the nonparametric equivalent of a paired t-test).
Part of the application JAMOVI when a Wilcoxon signed rank test was conducted
Those who are not interested in learning the basics of JAMOVI can refer to the "Online statistics" section of this article, where we provide some websites that help in conducting basic statistical tests.
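For readers who prefer to script their analyses, the same kind of test shown in Figure 2 can be reproduced in a few lines of Python. The sketch below uses SciPy's wilcoxon function on an invented before/after sample; the numbers are illustrative only and are not taken from any study.

```python
# Wilcoxon signed rank test: the nonparametric equivalent of a paired t-test.
# The paired measurements below are invented, purely for illustration.
from scipy import stats

before = [120, 118, 135, 127, 122, 130, 125, 128]
after = [115, 117, 128, 120, 121, 124, 119, 126]

# Tests whether the paired differences are symmetric about zero.
statistic, p_value = stats.wilcoxon(before, after)
print(f"W = {statistic}, p = {p_value:.4f}")
```

A point-and-click tool such as JAMOVI and a script like this compute the same statistic; the script merely trades the graphical interface for reproducibility.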
Zotero allows researchers to collect and organize references from a variety of sources, including library catalogs, websites, and databases. This can help researchers keep track of their sources and ensure that they have all the necessary information to cite them correctly. Zotero allows users to store full-text articles as PDFs, web pages, or other formats, along with their corresponding bibliographic information. This can make it easier to access articles and ensure that the information is all in one place. Zotero makes it easy to create bibliographies in a variety of formats, including APA, MLA, Chicago, and many others. This can save researchers time and reduce the likelihood of errors.[ 5 ]
However, those who prefer not to manage their references with Zotero can simply use the comment option in the word processing software to keep each reference with the text and, after the final draft, copy those references into the manuscript file.
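To illustrate what a reference manager automates, the toy sketch below stores a single reference as structured data and renders it in two common styles. This is only a simplified illustration (using reference 5 of this article as sample data); Zotero performs the same job across thousands of styles via the Citation Style Language.

```python
# Toy illustration of what a reference manager automates: one structured
# record rendered in two citation styles. Not a replacement for Zotero.
ref = {
    "authors": ["Ahmed KK", "Al Dhubaib BE"],
    "title": "Zotero: A bibliographic assistant to researcher",
    "journal": "J Pharmacol Pharmacother",
    "year": 2011,
    "volume": 2,
    "pages": "303-5",
}

def vancouver(r):
    return (f"{', '.join(r['authors'])}. {r['title']}. "
            f"{r['journal']}. {r['year']};{r['volume']}:{r['pages']}.")

def apa_like(r):
    return (f"{' & '.join(r['authors'])} ({r['year']}). {r['title']}. "
            f"{r['journal']}, {r['volume']}, {r['pages']}.")

print(vancouver(ref))
print(apa_like(ref))
```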
Google Drive
Google Drive is a cloud-based storage and collaboration tool that can be very useful for researchers. It allows researchers to access their work from any device with an internet connection, making it easy to work on the go and collaborate with others from anywhere in the world. Google Drive makes it easy for researchers to collaborate with colleagues by sharing documents, spreadsheets, and presentations in real time: multiple users can work on the same document simultaneously, and changes are saved automatically. It also allows researchers to organize their research materials and data in one place, making them easy to find and access when needed. By storing research materials and data on Google Drive, researchers can ensure that their work is backed up and secure, reducing the risk of data loss due to hardware failure or other issues.[ 6 ] The Drive application can be downloaded and installed [ Table 1 ]; it creates a separate drive on the computer, and any file kept in that folder is synchronized online and can be accessed from any device connected to the Internet. Note that a free account comes with 15 GB of cloud storage.
Online software applications
Grammarly is an online grammar-checking tool that can be very helpful for writers who want to improve the accuracy and clarity of their writing. It uses advanced algorithms and artificial intelligence to analyze text and identify errors in grammar, spelling, and punctuation. In addition to catching grammar and punctuation errors, Grammarly can also suggest vocabulary enhancements and improve the style and tone of your writing. This can help you avoid common writing mistakes and create more engaging content. When Grammarly identifies an error in your writing, it explains the rule that you may have violated and suggests corrections that you can make, helping you learn from your mistakes and avoid similar errors in the future.[ 7 ] A guide on how to use Grammarly is available elsewhere, in the article by Mondal and Mondal.[ 8 ] The premium version provides further enhancement of the article; however, the basic free version corrects many grammar errors that are missed by common word processing software.
QuillBot is a paraphrasing tool that uses advanced algorithms and artificial intelligence to help researchers rephrase and reword their writing. It can be very helpful for researchers who need to paraphrase content for academic or professional purposes, and it can save time by automatically rephrasing and rewording content. This can be particularly useful for researchers who need to paraphrase large amounts of text or who are working under tight deadlines. QuillBot can also help researchers avoid text similarity (i.e. text plagiarism) by providing a way to paraphrase content, which is important for researchers who need to avoid plagiarism in their academic or professional work. QuillBot can be used on a variety of platforms, including web browsers, mobile devices, and desktop applications, making it easy to use on the platform of your choice and to access your writing from multiple devices.[ 8 ] Figure 3 shows an example where a paragraph of text is being paraphrased.
A paragraph of text is paraphrased by QuillBot
MeSH on Demand
MeSH on Demand is a website that provides a user-friendly interface for finding Medical Subject Headings (MeSH) terms, which are widely used in the biomedical literature to facilitate the indexing and retrieval of articles. It can be very useful for researchers who need to identify appropriate MeSH terms for their research articles: it generates keywords and phrases related to the text provided by the user, which is helpful for researchers who are unfamiliar with the MeSH vocabulary and want to ensure that their articles are indexed correctly.[ 9 ] Figure 4 shows MeSH terms identified in a paragraph of text. After obtaining the MeSH terms, authors need to decide which are the most relevant keywords for their manuscript and use those, as most journals limit the number of keywords.
MeSH terms were searched from a paragraph of text on MeSH on Demand web application
In addition to identifying MeSH terms in an article, the search result also includes relevant articles available in PubMed. Authors can check this list to see whether they have missed any relevant literature.
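NLM also publishes MeSH in machine-readable form. The sketch below queries the MeSH RDF API's lookup endpoint at id.nlm.nih.gov for descriptors matching a keyword; it assumes that endpoint and its documented JSON shape (an array of matches, each with a "label" field), so treat it as a starting point rather than a guaranteed interface.

```python
# Query the NLM MeSH lookup service for descriptors matching a term.
# Assumes the documented endpoint and JSON shape of the MeSH RDF API.
import json
import urllib.parse
import urllib.request

def mesh_descriptors(term, limit=5):
    query = urllib.parse.urlencode(
        {"label": term, "match": "contains", "limit": limit})
    url = f"https://id.nlm.nih.gov/mesh/lookup/descriptor?{query}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return [hit["label"] for hit in json.load(resp)]

print(mesh_descriptors("writing"))
```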
DOI stands for Digital Object Identifier, a unique identifier assigned to a digital object such as a research article, data set, or other type of research output. It is widely used in the scholarly publishing industry and can be very useful for researchers, who can always locate and access a digital object through its DOI. It also helps researchers cite their sources accurately by providing a unique identifier that can be included in the reference list, ensuring that the citation is correct and can be easily located by copy editors or readers.[ 10 ] In many journals, the DOI is printed as a quick response code in the printed version so that any reader can scan it and retrieve the article online. While writing an article, authors may save the DOI along with the reference for quick access to the article in the future. However, authors should always check a DOI before placing it with the references because, owing to technical problems, a DOI sometimes does not work; in that case, they can save the URL of the article for accessing the paper later.
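Because the advice above is to check each DOI before adding it to the reference list, that check is easy to script: doi.org redirects a valid DOI to the publisher's site and returns an error for an unknown one. A minimal sketch follows; the sample DOI is taken from reference 5 below, and since some publishers reject HEAD requests, a GET can be substituted if needed.

```python
# Check that a DOI resolves before adding it to a reference list.
# doi.org redirects valid DOIs to the publisher and returns 404 otherwise.
import urllib.error
import urllib.request

def doi_resolves(doi):
    req = urllib.request.Request(f"https://doi.org/{doi}", method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status < 400
    except urllib.error.HTTPError:
        return False

# DOI taken from reference 5 of this article.
print(doi_resolves("10.4103/0976-500X.85940"))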
JANE stands for Journal/Author Name Estimator, a Web-based application designed to help researchers find relevant journals and authors for their research. JANE is a free service provided by the Biosemantics Group and funded by the Netherlands Bioinformatics Centre, which makes it an accessible and cost-effective tool. JANE can help researchers find relevant journals by analyzing the title and abstract of their paper and comparing it with the content of thousands of journals, saving time and effort in identifying appropriate journals for submission. It can also help researchers identify potential collaborators by analyzing the authors of papers in the relevant journals, helping them find other experts working on similar research topics.[ 11 ] From the results, authors can obtain the email addresses of relevant authors and use them to suggest peer reviewers for the article, if the journal asks for suggested reviewers. In Figure 5 , three buttons are shown for finding "journals," "authors," and "articles."
User interface of JANE where text can be pasted or typed and journals, authors, or articles can be searched by pressing buttons below
Online statistics
There are several free websites that provide statistical tests for researchers. These can be helpful for researchers who need to conduct statistical tests but do not have access to specialized software or support. They provide a range of user-friendly statistical tests and tools that can be accessed from any device with an internet connection. Table 2 lists some of these websites. Furthermore, detailed guidelines along with practice materials are available in the articles by Mondal et al .[ 12 - 15 ]
Websites for statistical analysis
This is not a comprehensive list of online calculators
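The kinds of calculations such sites perform are also available offline in Python's standard library. The sketch below computes common descriptive statistics for an invented sample, with no third-party software or internet connection required.

```python
# Descriptive statistics with the Python standard library alone.
import statistics

sample = [4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 4.7, 5.0]  # invented values

print("n      =", len(sample))
print("mean   =", round(statistics.mean(sample), 2))
print("median =", statistics.median(sample))
print("SD     =", round(statistics.stdev(sample), 2))  # sample SD (n-1)
q1, q2, q3 = statistics.quantiles(sample, n=4)          # quartiles
print("IQR    =", round(q3 - q1, 2))
```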
Several artificial intelligence (AI)-based writing assistants are available. Recently, the AI language model ChatGPT has been widely discussed among academicians because of its human-like conversational and writing capability, and it can be a useful tool in the process of writing a research paper. Researchers can use ChatGPT to generate ideas and inspiration by inputting a topic or question related to their research; ChatGPT can then generate relevant sentences or paragraphs that can serve as a starting point for the paper. Researchers can also use ChatGPT to help them write more clearly and effectively, which can be particularly helpful for non-native speakers of English: it can suggest improvements to the wording, grammar, and structure of sentences and provide synonyms or related words to enrich the text. ChatGPT can summarize long passages of text, making it useful for condensing articles and research papers for review and analysis, and it can assist researchers in managing citations by generating citations and reference lists in the appropriate format.[ 16 ] However, ChatGPT at times generates fictitious references that cannot be found on the internet. Google Bard is an alternative to ChatGPT that can help with the same tasks.
An example of a conversation with ChatGPT is shown in Figure 6 , where ChatGPT was asked to explain the importance of family medicine in India with three references.
A conversation with ChatGPT showing the input and output
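Beyond the chat interface shown in Figure 6, the same kind of model can be scripted. The fragment below is a minimal sketch, assuming an OpenAI account, the official openai Python package, an OPENAI_API_KEY environment variable, and a model name that may have changed by the time you read this; as cautioned above, any output still needs human checking.

```python
# Ask a language model to tighten a draft sentence.
# Assumes: `pip install openai`, OPENAI_API_KEY set in the environment,
# and that the chosen model name is still available.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

draft = ("Basic computer skills are essential for authors writing "
         "research papers as it has the potential to make the task easier.")

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; substitute any current one
    messages=[
        {"role": "system", "content": "You are a concise academic editor."},
        {"role": "user", "content": f"Improve the grammar and clarity:\n{draft}"},
    ],
)
print(response.choices[0].message.content)
```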
Overall, having a basic understanding of these software tools can help new authors write more efficiently, effectively, and accurately, and create a professional-looking research paper.
There are several advantages of using technology for writing a research paper. Technology can greatly increase the efficiency of the writing process, enabling researchers to complete tasks faster and more accurately; for example, ChatGPT can draft a portion of a manuscript within seconds, and QuillBot can paraphrase text in a very short time. Technology can facilitate collaboration among researchers by enabling them to work together remotely and share information and feedback in real time. Digital tools can also help researchers organize their research materials and notes more effectively, making it easier to keep track of important information and sources. In both collaboration and organization, Google Drive is of great help.[ 17 , 18 ]
There are some disadvantages to using technology for research. Overreliance on technology can lead to a loss of critical thinking and writing skills, as well as a reduced ability to solve problems independently. The Internet and other digital tools can be a source of distraction and can hinder concentration and focus, potentially leading to lower-quality research and writing. While the Internet provides access to vast amounts of information, not all of it is reliable or accurate, which can lead to lower-quality research and writing. Not all researchers have access to the necessary technology and resources to complete their research effectively, which can create barriers to entry and hinder research progress.[ 19 , 20 ]
While we use software applications to shape our research papers, should we acknowledge them in the paper as we acknowledge humans who help us with similar tasks? Researchers routinely mention software package details and acknowledge third-party editing services or copy editing by a human; however, they usually do not acknowledge the software itself. For word processing software such as Microsoft Word or OpenOffice, no acknowledgment is necessary in a research paper: these tools are commonly used for writing and formatting documents, and their usage is expected. Regarding specific tools such as Grammarly or ChatGPT, if substantial help was obtained, then acknowledging them is appropriate; however, the specific role for which the help was taken should be stated.[ 21 ] For example, refer to the Acknowledgment section of this manuscript for a glimpse of how we acknowledged ChatGPT for its help; similar text can be added when help is taken from other tools.
Overall, technology can greatly benefit the research paper writing process, but researchers need to be aware of its limitations and potential drawbacks. By balancing the advantages and disadvantages of using technology, researchers can use it as a tool to enhance their research and writing while maintaining the integrity and quality of their work. Primary care physicians often engage in research activities but rarely get time for writing; hence, these applications can assist them in organizing research data, writing manuscripts, and formatting citations and references.
This review has discussed essential software programs that are highly recommended for novice authors writing a research paper: OpenOffice for typing a paper, JAMOVI for statistical analysis, Zotero for reference management, Google Drive for data storage and accessibility, Grammarly for checking grammar, QuillBot for paraphrasing, MeSH on Demand for searching keywords and related articles, DOIs for locating the literature, JANE for finding suitable journals and authors, various websites for online statistical analysis, and AI language models for generating content for a research paper. By utilizing these programs and maintaining a balanced approach to technology use, novice authors can produce higher-quality research papers and contribute to the advancement of their respective fields.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
Acknowledgment
We would like to acknowledge the use of ChatGPT (May 24 Version), an AI language model developed by OpenAI ( https://openai.com/chatgpt ), for assisting in the language editing of this research paper. ChatGPT helped improve the clarity and readability of the manuscript.
- 1. Hsieh HF, Shannon SE. Three approaches to qualitative content analysis. Qual Health Res. 2005;15:1277-88. doi: 10.1177/1049732305276687.
- 2. Levac D, Colquhoun H, O'Brien KK. Scoping studies: Advancing the methodology. Implement Sci. 2010;5:69. doi: 10.1186/1748-5908-5-69.
- 3. Taylor DM, Hodkinson PW, Khan AS, Simon EL. Research skills and the data spreadsheet: A research primer for low- and middle-income countries. Afr J Emerg Med. 2020;10(Suppl 2):S140-4. doi: 10.1016/j.afjem.2020.05.003.
- 4. Şahin MD, Aybek EC. Jamovi: An easy to use statistical software for the social scientists. Int J Assess Tool Educ. 2019;6:670-92.
- 5. Ahmed KK, Al Dhubaib BE. Zotero: A bibliographic assistant to researcher. J Pharmacol Pharmacother. 2011;2:303-5. doi: 10.4103/0976-500X.85940.
- 6. Kubaszewski Ł, Kaczmarczyk J, Nowakowski A. Management of scientific information with Google Drive. Pol Orthop Traumatol. 2013;78:213-7.
- 7. Nazari N, Shabbir MS, Setiawan R. Application of Artificial Intelligence powered digital writing assistant in higher education: Randomized controlled trial. Heliyon. 2021;7:e07014. doi: 10.1016/j.heliyon.2021.e07014.
- 8. Fitria TN. QuillBot as an online tool: Students' alternative in paraphrasing and rewriting of English writing. Englisia Journal. 2021;9:183.
- 9. Mondal H, Mondal S, Mondal S. How to choose title and keywords for manuscript according to medical subject headings. Indian J Vasc Endovasc Surg. 2018;5:141-4.
- 10. Neumann J, Brase J. DataCite and DOI names for research data. J Comput Aided Mol Des. 2014;28:1035-41. doi: 10.1007/s10822-014-9776-5.
- 11. Curry CL. Journal/Author Name Estimator (JANE). J Med Libr Assoc. 2019;107:122-4.
- 12. Mondal H, Mondal S, Majumder R, De R. Conduct common statistical tests online. Indian Dermatol Online J. 2022;13:539-42. doi: 10.4103/idoj.idoj_605_21.
- 13. Mondal H, Swain SM, Mondal S. How to conduct descriptive statistics online: A brief hands-on guide for biomedical researchers. Indian J Vasc Endovasc Surg. 2022;9:70-6.
- 14. Mondal S, Saha S, Mondal H, De R, Majumder R, Saha K. How to conduct inferential statistics online: A brief hands-on guide for biomedical researchers. Indian J Vasc Endovasc Surg. 2022;9:54-62.
- 15. Mondal S, Mondal H, Panda R. How to conduct inferential statistics online (Part 2): A brief hands-on guide for biomedical researchers. Indian J Vasc Endovasc Surg. 2022;9:63-9.
- 16. Biswas S. ChatGPT and the future of medical writing. Radiology. 2023;307:e223312. doi: 10.1148/radiol.223312.
- 17. Ramírez-Castañeda V. Disadvantages in preparing and publishing scientific papers caused by the dominance of the English language in science: The case of Colombian researchers in biological sciences. PLoS One. 2020;15:e0238372. doi: 10.1371/journal.pone.0238372.
- 18. Kumar PM, Priya NS, Musalaiah S, Nagasree M. Knowing and avoiding plagiarism during scientific writing. Ann Med Health Sci Res. 2014;4(Suppl 3):S193-8. doi: 10.4103/2141-9248.141957.
- 19. Dontre AJ. The influence of technology on academic distraction: A review. Hum Behav Emerg Tech. 2021;3:379-90.
- 20. Lang TA, White NJ, Tran HT, Farrar JJ, Day NP, Fitzpatrick R, et al. Clinical research in resource-limited settings: Enhancing research capacity and working together to make trials less complicated. PLoS Negl Trop Dis. 2010;4:e619. doi: 10.1371/journal.pntd.0000619.
- 21. Rahimi F, Talebi Bezmin Abadi A. ChatGPT and publication ethics. Arch Med Res. 2023;54:272-4. doi: 10.1016/j.arcmed.2023.03.004.