Research Methodology – Types, Examples and writing Guide

Research Methodology

Definition:

Research methodology refers to the systematic and scientific approach used to conduct research, investigate problems, and gather data and information for a specific purpose. It involves the techniques and procedures used to identify, collect, analyze, and interpret data to answer research questions or solve research problems. It also encompasses the philosophical and theoretical frameworks that guide the research process.

Structure of Research Methodology

Research methodology formats can vary depending on the specific requirements of the research project, but the following is a basic example of a structure for a research methodology section:

I. Introduction

  • Provide an overview of the research problem and the need for a research methodology section
  • Outline the main research questions and objectives

II. Research Design

  • Explain the research design chosen and why it is appropriate for the research question(s) and objectives
  • Discuss any alternative research designs considered and why they were not chosen
  • Describe the research setting and participants (if applicable)

III. Data Collection Methods

  • Describe the methods used to collect data (e.g., surveys, interviews, observations)
  • Explain how the data collection methods were chosen and why they are appropriate for the research question(s) and objectives
  • Detail any procedures or instruments used for data collection

IV. Data Analysis Methods

  • Describe the methods used to analyze the data (e.g., statistical analysis, content analysis)
  • Explain how the data analysis methods were chosen and why they are appropriate for the research question(s) and objectives
  • Detail any procedures or software used for data analysis

V. Ethical Considerations

  • Discuss any ethical issues that may arise from the research and how they were addressed
  • Explain how informed consent was obtained (if applicable)
  • Detail any measures taken to ensure confidentiality and anonymity

VI. Limitations

  • Identify any potential limitations of the research methodology and how they may impact the results and conclusions

VII. Conclusion

  • Summarize the key aspects of the research methodology section
  • Explain how the research methodology addresses the research question(s) and objectives

Research Methodology Types

Types of Research Methodology are as follows:

Quantitative Research Methodology

This is a research methodology that involves the collection and analysis of numerical data using statistical methods. This type of research is often used to study cause-and-effect relationships and to make predictions.

Qualitative Research Methodology

This is a research methodology that involves the collection and analysis of non-numerical data such as words, images, and observations. This type of research is often used to explore complex phenomena, to gain an in-depth understanding of a particular topic, and to generate hypotheses.

Mixed-Methods Research Methodology

This is a research methodology that combines elements of both quantitative and qualitative research. This approach can be particularly useful for studies that aim to explore complex phenomena and to provide a more comprehensive understanding of a particular topic.

Case Study Research Methodology

This is a research methodology that involves in-depth examination of a single case or a small number of cases. Case studies are often used in psychology, sociology, and anthropology to gain a detailed understanding of a particular individual or group.

Action Research Methodology

This is a research methodology that involves a collaborative process between researchers and practitioners to identify and solve real-world problems. Action research is often used in education, healthcare, and social work.

Experimental Research Methodology

This is a research methodology that involves the manipulation of one or more independent variables to observe their effects on a dependent variable. Experimental research is often used to study cause-and-effect relationships and to make predictions.

Survey Research Methodology

This is a research methodology that involves the collection of data from a sample of individuals using questionnaires or interviews. Survey research is often used to study attitudes, opinions, and behaviors.

Grounded Theory Research Methodology

This is a research methodology that involves the development of theories based on the data collected during the research process. Grounded theory is often used in sociology and anthropology to generate theories about social phenomena.

Research Methodology Example

An Example of Research Methodology could be the following:

Research Methodology for Investigating the Effectiveness of Cognitive Behavioral Therapy in Reducing Symptoms of Depression in Adults

Introduction:

The aim of this research is to investigate the effectiveness of cognitive-behavioral therapy (CBT) in reducing symptoms of depression in adults. To achieve this objective, a randomized controlled trial (RCT) will be conducted using a mixed-methods approach.

Research Design:

The study will follow a pre-test and post-test design with two groups: an experimental group receiving CBT and a control group receiving no intervention. The study will also include a qualitative component, in which semi-structured interviews will be conducted with a subset of participants to explore their experiences of receiving CBT.

Participants:

Participants will be recruited from community mental health clinics in the local area. The sample will consist of 100 adults aged 18-65 years old who meet the diagnostic criteria for major depressive disorder. Participants will be randomly assigned to either the experimental group or the control group.
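
The random assignment step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the participant IDs (P001 to P100), the seed value, and the function name `assign_groups` are hypothetical, and a real trial would typically use a pre-registered allocation procedure (often with concealment) rather than an ad hoc script.

```python
import random

def assign_groups(participant_ids, seed):
    """Shuffle the IDs with a seeded generator and split them into two equal arms."""
    rng = random.Random(seed)      # seeded so the allocation can be reproduced and audited
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "experimental": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }

# Hypothetical IDs for the 100 adults planned in the example study.
participants = [f"P{i:03d}" for i in range(1, 101)]
allocation = assign_groups(participants, seed=2024)
print(len(allocation["experimental"]), len(allocation["control"]))
```

Recording the seed alongside the allocation list is one simple way to let a reader verify that group membership was not chosen by hand.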

Intervention:

The experimental group will receive 12 weekly sessions of CBT, each lasting 60 minutes. The intervention will be delivered by licensed mental health professionals who have been trained in CBT. The control group will receive no intervention during the study period.

Data Collection:

Quantitative data will be collected through the use of standardized measures such as the Beck Depression Inventory-II (BDI-II) and the Generalized Anxiety Disorder-7 (GAD-7). Data will be collected at baseline, immediately after the intervention, and at a 3-month follow-up. Qualitative data will be collected through semi-structured interviews with a subset of participants from the experimental group. The interviews will be conducted at the end of the intervention period, and will explore participants’ experiences of receiving CBT.

Data Analysis:

Quantitative data will be analyzed using descriptive statistics, t-tests, and mixed-model analyses of variance (ANOVA) to assess the effectiveness of the intervention. Qualitative data will be analyzed using thematic analysis to identify common themes and patterns in participants’ experiences of receiving CBT.
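
As a rough illustration of the between-group comparison mentioned above, the sketch below computes Welch's t statistic (a t-test variant that does not assume equal variances, swapped in here for simplicity) and Cohen's d using only the Python standard library. The score changes are invented purely to make the code runnable; they are not results from any study. A full analysis, including p-values and the mixed-model ANOVA, would normally be run in a statistics package.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    se_a = statistics.variance(sample_a) / n_a   # squared standard error, group A
    se_b = statistics.variance(sample_b) / n_b   # squared standard error, group B
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

def cohens_d(sample_a, sample_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(sample_a)
        + (n_b - 1) * statistics.variance(sample_b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(pooled_var)

# Hypothetical change in BDI-II score (post minus pre) for a handful of participants.
cbt_change = [-12, -9, -15, -7, -11, -10]
control_change = [-3, -1, -5, 0, -4, -2]

t_stat, df = welch_t(cbt_change, control_change)
print(f"t = {t_stat:.2f}, df = {df:.1f}, d = {cohens_d(cbt_change, control_change):.2f}")
```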

Ethical Considerations:

This study will comply with ethical guidelines for research involving human subjects. Participants will provide informed consent before participating in the study, and their privacy and confidentiality will be protected throughout the study. Any adverse events or reactions will be reported and managed appropriately.

Data Management:

All data collected will be kept confidential and stored securely using password-protected databases. Identifying information will be removed from qualitative data transcripts to ensure participants’ anonymity.
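
One way to remove direct identifiers from transcripts is to replace known participant names with study codes. The sketch below assumes a hypothetical lookup table (`name_to_code`) maintained by the research team and stored separately from the transcripts; it only catches the names listed there, so indirect identifiers such as places or employers would still need manual review.

```python
import re

def pseudonymize(transcript, name_to_code):
    """Replace each listed name with its study code, matching whole words only."""
    # Handle longer names first so "Maria Lopez" is replaced before "Maria".
    for name in sorted(name_to_code, key=len, reverse=True):
        code = name_to_code[name]
        pattern = re.compile(rf"\b{re.escape(name)}\b", flags=re.IGNORECASE)
        transcript = pattern.sub(lambda _match, c=code: c, transcript)
    return transcript

# Hypothetical name, code, and transcript excerpt, for illustration only.
key = {"Maria": "P017", "Maria Lopez": "P017"}
raw = "Maria Lopez said the sessions helped. Later Maria missed one."
print(pseudonymize(raw, key))
```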

Limitations:

One potential limitation of this study is that it only focuses on one type of psychotherapy, CBT, and may not generalize to other types of therapy or interventions. Another limitation is that the study will only include participants from community mental health clinics, which may not be representative of the general population.

Conclusion:

This research aims to investigate the effectiveness of CBT in reducing symptoms of depression in adults. By using a randomized controlled trial and a mixed-methods approach, the study will provide evidence on whether CBT reduces depressive symptoms and, through the interviews, insight into how participants experience the therapy. The results of this study will have important implications for the development of effective treatments for depression in clinical settings.

How to Write Research Methodology

Writing a research methodology involves explaining the methods and techniques you used to conduct research, collect data, and analyze results. It’s an essential section of any research paper or thesis, as it helps readers understand the validity and reliability of your findings. Here are the steps to write a research methodology:

  • Start by explaining your research question: Begin the methodology section by restating your research question and explaining why it’s important. This helps readers understand the purpose of your research and the rationale behind your methods.
  • Describe your research design: Explain the overall approach you used to conduct research. This could be a qualitative or quantitative research design, experimental or non-experimental, case study or survey, etc. Discuss the advantages and limitations of the chosen design.
  • Discuss your sample: Describe the participants or subjects you included in your study. Include details such as their demographics, sampling method, sample size, and any exclusion criteria used.
  • Describe your data collection methods: Explain how you collected data from your participants. This could include surveys, interviews, observations, questionnaires, or experiments. Include details on how you obtained informed consent, how you administered the tools, and how you minimized the risk of bias.
  • Explain your data analysis techniques: Describe the methods you used to analyze the data you collected. This could include statistical analysis, content analysis, thematic analysis, or discourse analysis. Explain how you dealt with missing data, outliers, and any other issues that arose during the analysis.
  • Discuss the validity and reliability of your research: Explain how you ensured the validity and reliability of your study. This could include measures such as triangulation, member checking, peer review, or inter-coder reliability.
  • Acknowledge any limitations of your research: Discuss any limitations of your study, including any potential threats to validity or generalizability. This helps readers understand the scope of your findings and how they might apply to other contexts.
  • Provide a summary: End the methodology section by summarizing the methods and techniques you used to conduct your research. This provides a clear overview of your research methodology and helps readers understand the process you followed to arrive at your findings.
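
Inter-coder reliability, mentioned in the steps above, is often reported with Cohen's kappa, which corrects the raw agreement between two coders for the agreement expected by chance. The following is a minimal sketch for two coders; the theme labels are hypothetical, and other statistics (for example Krippendorff's alpha) may suit designs with more coders or missing codes.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each labelled the same items once."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: probability both coders pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0   # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical theme labels assigned to eight interview excerpts.
coder_a = ["coping", "coping", "stigma", "support", "stigma", "coping", "support", "support"]
coder_b = ["coping", "stigma", "stigma", "support", "stigma", "coping", "support", "coping"]
print(round(cohens_kappa(coder_a, coder_b), 3))
```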

When to Write Research Methodology

Research methodology is typically written after the research proposal has been approved and before the actual research is conducted. It should be written prior to data collection and analysis, as it provides a clear roadmap for the research project.

The research methodology is an important section of any research paper or thesis, as it describes the methods and procedures that will be used to conduct the research. It should include details about the research design, data collection methods, data analysis techniques, and any ethical considerations.

The methodology should be written in a clear and concise manner, and it should be based on established research practices and standards. It is important to provide enough detail so that the reader can understand how the research was conducted and evaluate the validity of the results.

Applications of Research Methodology

Here are some of the applications of research methodology:

  • To identify the research problem: Research methodology is used to identify the research problem, which is the first step in conducting any research.
  • To design the research: Research methodology helps in designing the research by selecting the appropriate research method, research design, and sampling technique.
  • To collect data: Research methodology provides a systematic approach to collect data from primary and secondary sources.
  • To analyze data: Research methodology helps in analyzing the collected data using various statistical and non-statistical techniques.
  • To test hypotheses: Research methodology provides a framework for testing hypotheses and drawing conclusions based on the analysis of data.
  • To generalize findings: Research methodology helps in generalizing the findings of the research to the target population.
  • To develop theories: Research methodology is used to develop new theories and modify existing theories based on the findings of the research.
  • To evaluate programs and policies: Research methodology is used to evaluate the effectiveness of programs and policies by collecting data and analyzing it.
  • To improve decision-making: Research methodology helps in making informed decisions by providing reliable and valid data.

Purpose of Research Methodology

Research methodology serves several important purposes, including:

  • To guide the research process: Research methodology provides a systematic framework for conducting research. It helps researchers to plan their research, define their research questions, and select appropriate methods and techniques for collecting and analyzing data.
  • To ensure research quality: Research methodology helps researchers to ensure that their research is rigorous, reliable, and valid. It provides guidelines for minimizing bias and error in data collection and analysis, and for ensuring that research findings are accurate and trustworthy.
  • To replicate research: Research methodology provides a clear and detailed account of the research process, making it possible for other researchers to replicate the study and verify its findings.
  • To advance knowledge: Research methodology enables researchers to generate new knowledge and to contribute to the body of knowledge in their field. It provides a means for testing hypotheses, exploring new ideas, and discovering new insights.
  • To inform decision-making: Research methodology provides evidence-based information that can inform policy and decision-making in a variety of fields, including medicine, public health, education, and business.

Advantages of Research Methodology

Research methodology has several advantages that make it a valuable tool for conducting research in various fields. Here are some of the key advantages of research methodology:

  • Systematic and structured approach: Research methodology provides a systematic and structured approach to conducting research, which ensures that the research is conducted in a rigorous and comprehensive manner.
  • Objectivity: Research methodology aims to ensure objectivity in the research process, which means that the research findings are based on evidence and not influenced by personal bias or subjective opinions.
  • Replicability: Research methodology ensures that research can be replicated by other researchers, which is essential for validating research findings and ensuring their accuracy.
  • Reliability: Research methodology aims to ensure that the research findings are reliable, which means that they are consistent and can be depended upon.
  • Validity: Research methodology ensures that the research findings are valid, which means that they accurately reflect the research question or hypothesis being tested.
  • Efficiency: Research methodology provides a structured and efficient way of conducting research, which helps to save time and resources.
  • Flexibility: Research methodology allows researchers to choose the most appropriate research methods and techniques based on the research question, data availability, and other relevant factors.
  • Scope for innovation: Research methodology provides scope for innovation and creativity in designing research studies and developing new research techniques.

Research Methodology Vs Research Methods

About the author.

Muhammad Hassan

Researcher, Academic Writer, Web developer

Organizing Your Social Sciences Research Paper

6. The Methodology

The methods section describes the actions taken to investigate a research problem and the rationale for applying specific procedures or techniques to identify, select, process, and analyze information about that problem, thereby allowing the reader to critically evaluate a study’s overall validity and reliability. The methodology section of a research paper answers two main questions: How was the data collected or generated? And how was it analyzed? The writing should be direct and precise and always written in the past tense.

Kallet, Richard H. "How to Write the Methods Section of a Research Paper." Respiratory Care 49 (October 2004): 1229-1232.

Importance of a Good Methodology Section

You must explain how you obtained and analyzed your results for the following reasons:

  • Readers need to know how the data was obtained because the method you chose affects the results and, by extension, how you interpreted their significance in the discussion section of your paper.
  • Methodology is crucial for any branch of scholarship because an unreliable method produces unreliable results and, as a consequence, undermines the value of your analysis of the findings.
  • In most cases, there are a variety of different methods you can choose to investigate a research problem. The methodology section of your paper should clearly articulate the reasons why you have chosen a particular procedure or technique.
  • The reader wants to know that the data was collected or generated in a way that is consistent with accepted practice in the field of study. For example, if you are using a multiple choice questionnaire, readers need to know that it offered your respondents a reasonable range of answers to choose from.
  • The method must be appropriate to fulfilling the overall aims of the study. For example, you need to ensure that you have a large enough sample size to be able to generalize and make recommendations based upon the findings.
  • The methodology should discuss the problems that were anticipated and the steps you took to prevent them from occurring. For any problems that do arise, you must describe the ways in which they were minimized or why these problems do not impact in any meaningful way your interpretation of the findings.
  • In the social and behavioral sciences, it is important to always provide sufficient information to allow other researchers to adopt or replicate your methodology. This information is particularly important when a new method has been developed or an innovative use of an existing method is utilized.
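
The point above about having a large enough sample can be made concrete with a rough power calculation. The sketch below uses the common normal-approximation formula for comparing two group means, n per group = 2(z_(1-α/2) + z_power)² / d², where d is the standardized effect size you expect. The default values for alpha and power and the example effect size are conventional assumptions rather than recommendations for any particular study, and dedicated power-analysis software will give slightly larger numbers because it uses the t distribution.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided, two-sample comparison."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = NormalDist().inv_cdf(power)           # quantile for the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

# Assumed medium standardized effect (d = 0.5) at conventional alpha and power.
print(n_per_group(0.5))
```

Smaller expected effects drive the required sample size up quickly, which is why the expected effect size should be justified from prior literature in the methodology section.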

Bem, Daryl J. Writing the Empirical Journal Article. Psychology Writing Center. University of Washington; Denscombe, Martyn. The Good Research Guide: For Small-Scale Social Research Projects. 5th edition. Buckingham, UK: Open University Press, 2014; Lunenburg, Frederick C. Writing a Successful Thesis or Dissertation: Tips and Strategies for Students in the Social and Behavioral Sciences. Thousand Oaks, CA: Corwin Press, 2008.

Structure and Writing Style

I.  Groups of Research Methods

There are two main groups of research methods in the social sciences:

  • The empirical-analytical group approaches the study of social sciences in a manner similar to how researchers study the natural sciences. This type of research focuses on objective knowledge, research questions that can be answered yes or no, and operational definitions of variables to be measured. The empirical-analytical group employs deductive reasoning that uses existing theory as a foundation for formulating hypotheses that need to be tested. This approach is focused on explanation.
  • The interpretative group of methods is focused on understanding phenomena in a comprehensive, holistic way. Interpretive methods focus on analytically disclosing the meaning-making practices of human subjects [the why, how, or by what means people do what they do], while showing how those practices are arranged so that they can be used to generate observable outcomes. Interpretive methods allow you to recognize your connection to the phenomena under investigation. However, the interpretative group requires careful examination of variables because it focuses more on subjective knowledge.

II.  Content

The introduction to your methodology section should begin by restating the research problem and underlying assumptions underpinning your study. This is followed by situating the methods you used to gather, analyze, and process information within the overall “tradition” of your field of study and within the particular research design you have chosen to study the problem. If the method you choose lies outside of the tradition of your field [i.e., your review of the literature demonstrates that the method is not commonly used], provide a justification for how your choice of methods specifically addresses the research problem in ways that have not been utilized in prior studies.

The remainder of your methodology section should describe the following:

  • Decisions made in selecting the data you have analyzed or, in the case of qualitative research, the subjects and research setting you have examined,
  • Tools and methods used to identify and collect information, and how you identified relevant variables,
  • The ways in which you processed the data and the procedures you used to analyze that data, and
  • The specific research tools or strategies that you utilized to study the underlying hypothesis and research questions.

In addition, an effectively written methodology section should:

  • Introduce the overall methodological approach for investigating your research problem. Is your study qualitative or quantitative or a combination of both (mixed method)? Are you going to take a special approach, such as action research, or a more neutral stance?
  • Indicate how the approach fits the overall research design. Your methods for gathering data should have a clear connection to your research problem. In other words, make sure that your methods will actually address the problem. One of the most common deficiencies found in research papers is that the proposed methodology is not suitable for achieving the stated objective of the paper.
  • Describe the specific methods of data collection you are going to use, such as surveys, interviews, questionnaires, observation, or archival research. If you are analyzing existing data, such as a data set or archival documents, describe how it was originally created or gathered and by whom. Also be sure to explain how older data is still relevant to investigating the current research problem.
  • Explain how you intend to analyze your results. Will you use statistical analysis? Will you use specific theoretical perspectives to help you analyze a text or explain observed behaviors? Describe how you plan to obtain an accurate assessment of relationships, patterns, trends, distributions, and possible contradictions found in the data.
  • Provide background and a rationale for methodologies that are unfamiliar to your readers. Very often in the social sciences, research problems and the methods for investigating them require more explanation/rationale than widely accepted rules governing the natural and physical sciences. Be clear and concise in your explanation.
  • Provide a justification for subject selection and sampling procedure. For instance, if you propose to conduct interviews, how do you intend to select the sample population? If you are analyzing texts, which texts have you chosen, and why? If you are using statistics, why is this set of data being used? If other data sources exist, explain why the data you chose is most appropriate to addressing the research problem.
  • Provide a justification for case study selection. A common method of analyzing research problems in the social sciences is to analyze specific cases. These can be a person, place, event, phenomenon, or other type of subject of analysis that are either examined as a singular topic of in-depth investigation or multiple topics of investigation studied for the purpose of comparing or contrasting findings. In either method, you should explain why a case or cases were chosen and how they specifically relate to the research problem.
  • Describe potential limitations. Are there any practical limitations that could affect your data collection? How will you attempt to control for potential confounding variables and errors? If your methodology may lead to problems you can anticipate, state this openly and show why pursuing this methodology outweighs the risk of these problems cropping up.

NOTE: Once you have written all of the elements of the methods section, subsequent revisions should focus on how to present those elements as clearly and as logically as possible. The description of how you prepared to study the research problem, how you gathered the data, and the protocol for analyzing the data should be organized chronologically. For clarity, when a large amount of detail must be presented, information should be presented in sub-sections according to topic. If necessary, consider using appendices for raw data.

ANOTHER NOTE: If you are conducting a qualitative analysis of a research problem, the methodology section generally requires a more elaborate description of the methods used, as well as an explanation of the processes applied to gathering and analyzing data, than is generally required for studies using quantitative methods. Because you are the primary instrument for generating the data [e.g., through interviews or observations], the process for collecting that data has a significantly greater impact on producing the findings. Therefore, qualitative research requires a more detailed description of the methods used.

YET ANOTHER NOTE: If your study involves interviews, observations, or other qualitative techniques involving human subjects, you may be required to obtain approval from the university's Office for the Protection of Research Subjects before beginning your research. This is not a common procedure for most undergraduate level student research assignments. However, if your professor states you need approval, you must include a statement in your methods section that you received official endorsement and adequate informed consent from the office and that there was a clear assessment and minimization of risks to participants and to the university. This statement informs the reader that your study was conducted in an ethical and responsible manner. In some cases, the approval notice is included as an appendix to your paper.

III.  Problems to Avoid

Irrelevant Detail: The methodology section of your paper should be thorough but concise. Do not provide any background information that does not directly help the reader understand why a particular method was chosen, how the data was gathered or obtained, and how the data was analyzed in relation to the research problem [note: analyzed, not interpreted! Save how you interpreted the findings for the discussion section]. With this in mind, the page length of your methods section will generally be less than any other section of your paper except the conclusion.

Unnecessary Explanation of Basic Procedures: Remember that you are not writing a how-to guide about a particular method. You should make the assumption that readers possess a basic understanding of how to investigate the research problem on their own and, therefore, you do not have to go into great detail about specific methodological procedures. The focus should be on how you applied a method, not on the mechanics of doing a method. An exception to this rule is if you select an unconventional methodological approach; if this is the case, be sure to explain why this approach was chosen and how it enhances the overall process of discovery.

Problem Blindness: It is almost a given that you will encounter problems when collecting or generating your data, or that gaps will exist in existing data or archival materials. Do not ignore these problems or pretend they did not occur. Often, documenting how you overcame obstacles can form an interesting part of the methodology. It demonstrates to the reader that you can provide a cogent rationale for the decisions you made to minimize the impact of any problems that arose.

Literature Review: Just as the literature review section of your paper provides an overview of sources you have examined while researching a particular topic, the methodology section should cite any sources that informed your choice and application of a particular method [i.e., the choice of a survey should include any citations to the works you used to help construct the survey].

It’s More than Sources of Information! A description of a research study's method should not be confused with a description of the sources of information. Such a list of sources is useful in and of itself, especially if it is accompanied by an explanation about the selection and use of the sources. The description of the project's methodology complements a list of sources in that it sets forth the organization and interpretation of information emanating from those sources.

Azevedo, L.F. et al. “How to Write a Scientific Paper: Writing the Methods Section.” Revista Portuguesa de Pneumologia 17 (2011): 232-238; Blair, Lorrie. “Choosing a Methodology.” In Writing a Graduate Thesis or Dissertation, Teaching Writing Series. (Rotterdam: Sense Publishers, 2016), pp. 49-72; Butin, Dan W. The Education Dissertation: A Guide for Practitioner Scholars. Thousand Oaks, CA: Corwin, 2010; Carter, Susan. Structuring Your Research Thesis. New York: Palgrave Macmillan, 2012; Kallet, Richard H. “How to Write the Methods Section of a Research Paper.” Respiratory Care 49 (October 2004): 1229-1232; Lunenburg, Frederick C. Writing a Successful Thesis or Dissertation: Tips and Strategies for Students in the Social and Behavioral Sciences. Thousand Oaks, CA: Corwin Press, 2008; Methods Section. The Writer’s Handbook. Writing Center. University of Wisconsin, Madison; Rudestam, Kjell Erik and Rae R. Newton. “The Method Chapter: Describing Your Research Plan.” In Surviving Your Dissertation: A Comprehensive Guide to Content and Process. (Thousand Oaks, CA: Sage Publications, 2015), pp. 87-115; What is Interpretive Research. Institute of Public and International Affairs, University of Utah; Writing the Experimental Report: Methods, Results, and Discussion. The Writing Lab and The OWL. Purdue University; Methods and Materials. The Structure, Format, Content, and Style of a Journal-Style Scientific Paper. Department of Biology. Bates College.

Writing Tip

Statistical Designs and Tests? Do Not Fear Them!

Don't avoid using a quantitative approach to analyzing your research problem just because you fear the idea of applying statistical designs and tests. A qualitative approach, such as conducting interviews or content analysis of archival texts, can yield exciting new insights about a research problem, but it should not be undertaken simply because you have a disdain for running a simple regression. A well-designed quantitative research study can often be accomplished in very clear and direct ways, whereas a similar study of a qualitative nature usually requires considerable time to analyze large volumes of data and places a tremendous burden on you to create new paths for analysis where previously no path associated with your research problem had existed.

Another Writing Tip

Knowing the Relationship Between Theories and Methods

There can be multiple meanings associated with the terms "theories" and "methods" in social sciences research. A helpful way to delineate between them is to understand "theories" as representing different ways of characterizing the social world when you research it and "methods" as representing different ways of generating and analyzing data about that social world. Framed in this way, all empirical social sciences research involves theories and methods, whether they are stated explicitly or not. However, while theories and methods are often related, it is important that, as a researcher, you deliberately separate them in order to avoid your theories playing a disproportionate role in shaping what outcomes your chosen methods produce.

Engage introspectively in an ongoing dialectic between the application of theories and methods so that you can use the outcomes from your methods to interrogate and develop new theories, or new ways of conceptually framing the research problem. This is how scholarship grows and branches out into new intellectual territory.

Reynolds, R. Larry. Ways of Knowing. Alternative Microeconomics . Part 1, Chapter 3. Boise State University; The Theory-Method Relationship. S-Cool Revision. United Kingdom.

Yet Another Writing Tip

Methods and the Methodology

Do not confuse the terms "methods" and "methodology." As Schneider notes, a method refers to the technical steps taken to do research. Descriptions of methods usually include defining and stating why you have chosen specific techniques to investigate a research problem, followed by an outline of the procedures you used to systematically select, gather, and process the data [remember to always save the interpretation of data for the discussion section of your paper].

The methodology refers to a discussion of the underlying reasoning why particular methods were used. This discussion includes describing the theoretical concepts that inform the choice of methods to be applied, placing the choice of methods within the more general nature of academic work, and reviewing its relevance to examining the research problem. The methodology section also includes a thorough review of the methods other scholars have used to study the topic.

Bryman, Alan. "Of Methods and Methodology." Qualitative Research in Organizations and Management: An International Journal 3 (2008): 159-168; Schneider, Florian. “What's in a Methodology: The Difference between Method, Methodology, and Theory…and How to Get the Balance Right?” PoliticsEastAsia.com. Chinese Department, University of Leiden, Netherlands.

  • Last Updated: Apr 20, 2024 2:57 PM
  • URL: https://libguides.usc.edu/writingguide

Research Methodology for Allied Health Professionals pp 1–6 Cite as

Introduction to Research Methodology

  • Animesh Hazari 2  
  • First Online: 01 March 2024

The term “research methodology” most often echoes among students, research scholars, and faculty members. Though the application of research methodology is diverse, we shall focus on the content specific to academia and industry. This book would be most helpful to health science students and allow them to learn the process of research in a simple and step-by-step process. In my personal experience, I have found that students are very apprehensive when it comes to learning research methodology as a subject. They often encounter problems in understanding the research methodology as the process starts and throughout the course. At times, they may have completed their research but failed to understand the whole process of how scientifically it was conducted.

Author information

Authors and affiliations.

College of Health Sciences, Gulf Medical University, Ajman, Ajman, United Arab Emirates

Animesh Hazari

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter.

Hazari, A. (2023). Introduction to Research Methodology. In: Research Methodology for Allied Health Professionals. Springer, Singapore. https://doi.org/10.1007/978-981-99-8925-6_1

DOI : https://doi.org/10.1007/978-981-99-8925-6_1

Published : 01 March 2024

Publisher Name : Springer, Singapore

Print ISBN : 978-981-99-8924-9

Online ISBN : 978-981-99-8925-6

eBook Packages : Medicine Medicine (R0)

Introduction to Research Methods

Introduction to Research Methods A Hands-on Approach

  • Bora Pajo - Franklin University
Supplements

  • Editable chapter-specific PowerPoint® slides
  • Lecture notes
  • Instructor Manual containing chapter activities
  • All tables and figures from the textbook

“ Introduction to Research Methods is an excellent resource for students and instructors alike. With readable text in an approachable tone, it effectively covers the breadth of topics related to research methods. Its lucidity is stellar.”

“ Introduction to Research Methods: A Hands-on Approach is highly relevant to the work my undergraduate students do and is delivered at a level they can follow. It does a great job of presenting complex information in an easy-to-understand format. At the graduate level, I could see this text serving as a primary read but then supplementing with more complex articles/chapters. Accessibility of the content is a core strength of this text. It makes in-depth content easy to follow for the novice researcher, while at the same time is engaging enough to not be bland for those with research experience. Overall, this text reads as a road map for how to conduct ethical research and provides the necessary step-by-step instructions on how to be successful during the research process.”

“ Introduction to Research Methods: A Hands-on Approach is an excellent text that presents research methods in an easy-to-understand way with examples that foster critical thinking."

“The hands-on, practical approach to Pajo’s Introduction to Research Methods is a huge key strength.”

Very useful for research methods

  • The new edition is available in SAGE Vantage , an intuitive learning platform that integrates quality SAGE textbook content with assignable multimedia activities and auto-graded assessments to drive student engagement and ensure accountability. Unparalleled in its ease of use and built for dynamic teaching and learning, Vantage offers customizable LMS integration and best-in-class support. Learn more.
  • A new chapter on Big Data offers a window into the possibilities of what social scientists are accomplishing in the world of machine learning and the endless possibilities of data visualization. It familiarizes students with current and future opportunities in social and behavioral science.
  • A completely revised chapter on qualitative designs and data collection illustrates how qualitative designs fit within scientific research methods by emphasizing similarities and identifying differences.
  • Updated research examples and visuals reflect current studies in a variety of interdisciplinary fields.
  • APA Style 7th Edition is used throughout the book so students see and practice the latest citation styles.
  • A conversational and jargon-free writing style brings the relevance of research to life.
  • Research in Action features illustrate real, annotated research examples, giving students the skills to read and interpret research they may encounter in their everyday lives.
  • Research Workshop features offer step-by-step help on practical topics that extend beyond the chapter to provide hands-on tips for students conducting their own research.
  • Ethical Considerations within each chapter address different aspects of research to emphasize ethics as an integral part of the research conversation.
  • Chapter pedagogy includes summaries, key terms, photos, and “Taking It a Step Further” questions to extend understanding and application of text concepts.

Sample Materials & Chapters

Chapter 1: The Purpose of Research

Chapter 2: Formulating a Research Question

Grad Coach

How To Write The Methodology Chapter

The what, why & how explained simply (with examples).

By: Jenna Crossley (PhD) | Reviewed By: Dr. Eunice Rautenbach | September 2021 (Updated April 2023)

So, you’ve pinned down your research topic and undertaken a review of the literature – now it’s time to write up the methodology section of your dissertation, thesis or research paper . But what exactly is the methodology chapter all about – and how do you go about writing one? In this post, we’ll unpack the topic, step by step .

Overview: The Methodology Chapter

  • The purpose  of the methodology chapter
  • Why you need to craft this chapter (really) well
  • How to write and structure the chapter
  • Methodology chapter example
  • Essential takeaways

What (exactly) is the methodology chapter?

The methodology chapter is where you set out the philosophical underpinnings of your research and outline the specific methodological choices you’ve made. The point of the methodology chapter is to tell the reader exactly how you designed your study and, just as importantly, why you did it this way.

Importantly, this chapter should comprehensively describe and justify all the methodological choices you made in your study: for example, the approach you took to your research (i.e., qualitative, quantitative or mixed), who you collected data from (i.e., your sampling strategy), how you collected your data and, of course, how you analysed it. If that sounds a little intimidating, don’t worry – we’ll explain all these methodological choices in this post.

Why is the methodology chapter important?

The methodology chapter plays two important roles in your dissertation or thesis:

Firstly, it demonstrates your understanding of research theory, which is what earns you marks. A flawed research design or methodology would mean flawed results. So, this chapter is vital as it allows you to show the marker that you know what you’re doing and that your results are credible .

Secondly, the methodology chapter is what helps to make your study replicable. In other words, it allows other researchers to undertake your study using the same methodological approach, and compare their findings to yours. This is very important within academic research, as each study builds on previous studies.

The methodology chapter is also important in that it allows you to identify and discuss any methodological issues or problems you encountered (i.e., research limitations ), and to explain how you mitigated the impacts of these. Every research project has its limitations , so it’s important to acknowledge these openly and highlight your study’s value despite its limitations . Doing so demonstrates your understanding of research design, which will earn you marks. We’ll discuss limitations in a bit more detail later in this post, so stay tuned!

How to write up the methodology chapter

First off, it’s worth noting that the exact structure and contents of the methodology chapter will vary depending on the field of research (e.g., humanities, chemistry or engineering) as well as the university . So, be sure to always check the guidelines provided by your institution for clarity and, if possible, review past dissertations from your university. Here we’re going to discuss a generic structure for a methodology chapter typically found in the sciences.

Before you start writing, it’s always a good idea to draw up a rough outline to guide your writing. Don’t just start writing without knowing what you’ll discuss where. If you do, you’ll likely end up with a disjointed, ill-flowing narrative . You’ll then waste a lot of time rewriting in an attempt to try to stitch all the pieces together. Do yourself a favour and start with the end in mind .

Section 1 – Introduction

As with all chapters in your dissertation or thesis, the methodology chapter should have a brief introduction. In this section, you should remind your readers what the focus of your study is, especially the research aims . As we’ve discussed many times on the blog, your methodology needs to align with your research aims, objectives and research questions. Therefore, it’s useful to frontload this component to remind the reader (and yourself!) what you’re trying to achieve.

In this section, you can also briefly mention how you’ll structure the chapter. This will help orient the reader and provide a bit of a roadmap so that they know what to expect. You don’t need a lot of detail here – just a brief outline will do.

The intro provides a roadmap to your methodology chapter

Section 2 – The Methodology

The next section of your chapter is where you’ll present the actual methodology. In this section, you need to detail and justify the key methodological choices you’ve made in a logical, intuitive fashion. Importantly, this is the heart of your methodology chapter, so you need to get specific – don’t hold back on the details here. This is not one of those “less is more” situations.

Let’s take a look at the most common components you’ll likely need to cover. 

Methodological Choice #1 – Research Philosophy

Research philosophy refers to the underlying beliefs (i.e., the worldview) regarding how data about a phenomenon should be gathered , analysed and used . The research philosophy will serve as the core of your study and underpin all of the other research design choices, so it’s critically important that you understand which philosophy you’ll adopt and why you made that choice. If you’re not clear on this, take the time to get clarity before you make any further methodological choices.

While several research philosophies exist, two commonly adopted ones are positivism and interpretivism. These two sit roughly on opposite sides of the research philosophy spectrum.

Positivism states that the researcher can observe reality objectively and that there is only one reality, which exists independently of the observer. As a consequence, it is quite commonly the underlying research philosophy in quantitative studies and is oftentimes the assumed philosophy in the physical sciences.

Contrasted with this, interpretivism , which is often the underlying research philosophy in qualitative studies, assumes that the researcher performs a role in observing the world around them and that reality is unique to each observer . In other words, reality is observed subjectively .

These are just two philosophies (there are many more), but they demonstrate significantly different approaches to research and have a significant impact on all the methodological choices. Therefore, it’s vital that you clearly outline and justify your research philosophy at the beginning of your methodology chapter, as it sets the scene for everything that follows.

The research philosophy is at the core of the methodology chapter

Methodological Choice #2 – Research Type

The next thing you would typically discuss in your methodology section is the research type. The starting point for this is to indicate whether the research you conducted is inductive or deductive .

Inductive research takes a bottom-up approach , where the researcher begins with specific observations or data and then draws general conclusions or theories from those observations. Therefore these studies tend to be exploratory in terms of approach.

Conversely, deductive research takes a top-down approach, where the researcher starts with a theory or hypothesis and then tests it using specific observations or data. Therefore, these studies tend to be confirmatory in approach.

Related to this, you’ll need to indicate whether your study adopts a qualitative, quantitative or mixed  approach. As we’ve mentioned, there’s a strong link between this choice and your research philosophy, so make sure that your choices are tightly aligned . When you write this section up, remember to clearly justify your choices, as they form the foundation of your study.

Methodological Choice #3 – Research Strategy

Next, you’ll need to discuss your research strategy (also referred to as a research design ). This methodological choice refers to the broader strategy in terms of how you’ll conduct your research, based on the aims of your study.

Several research strategies exist, including experiments, case studies, ethnography, grounded theory, action research, and phenomenology. Let’s take a look at two of these, experimental and ethnographic, to see how they contrast.

Experimental research makes use of the scientific method , where one group is the control group (in which no variables are manipulated ) and another is the experimental group (in which a specific variable is manipulated). This type of research is undertaken under strict conditions in a controlled, artificial environment (e.g., a laboratory). By having firm control over the environment, experimental research typically allows the researcher to establish causation between variables. Therefore, it can be a good choice if you have research aims that involve identifying causal relationships.

Ethnographic research , on the other hand, involves observing and capturing the experiences and perceptions of participants in their natural environment (for example, at home or in the office). In other words, in an uncontrolled environment.  Naturally, this means that this research strategy would be far less suitable if your research aims involve identifying causation, but it would be very valuable if you’re looking to explore and examine a group culture, for example.

As you can see, the right research strategy will depend largely on your research aims and research questions – in other words, what you’re trying to figure out. Therefore, as with every other methodological choice, it’s essential to justify why you chose the research strategy you did.
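To make the control/experimental split described above for experimental research more concrete, here is a minimal Python sketch. It assumes participants are allocated to the two groups at random, which is a common practice but is not something the text above specifies; the function name `assign_groups` and the participant IDs are hypothetical.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into a control and an experimental group.

    Random allocation is an assumption of this sketch, not a requirement
    stated in the surrounding text.
    """
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return {
        "control": shuffled[:midpoint],       # no variable is manipulated
        "experimental": shuffled[midpoint:],  # the variable of interest is manipulated
    }


if __name__ == "__main__":
    # Hypothetical participant IDs P01..P20
    groups = assign_groups([f"P{i:02d}" for i in range(1, 21)], seed=42)
    print(len(groups["control"]), len(groups["experimental"]))
```

Recording the seed in your methodology chapter is one way to let other researchers reproduce the same allocation.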

Methodological Choice #4 – Time Horizon

The next thing you’ll need to detail in your methodology chapter is the time horizon. There are two options here: cross-sectional and longitudinal. In other words, whether the data for your study were all collected at one point in time (cross-sectional) or at multiple points in time (longitudinal).

The choice you make here depends again on your research aims, objectives and research questions. If, for example, you aim to assess how a specific group of people’s perspectives regarding a topic change over time , you’d likely adopt a longitudinal time horizon.

Another important factor to consider is simply whether you have the time necessary to adopt a longitudinal approach (which could involve collecting data over multiple months or even years). Oftentimes, the time pressures of your degree program will force your hand into adopting a cross-sectional time horizon, so keep this in mind.
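The difference between the two time horizons shows up directly in the shape of the data. The sketch below uses made-up survey scores stored as `(participant_id, wave, score)` tuples; the layout, the names and the values are all illustrative assumptions. It shows that change over time can only be computed when each participant has been measured in more than one wave.

```python
from collections import defaultdict

# Hypothetical records: (participant_id, wave, score)
cross_sectional = [("P01", 1, 3.5), ("P02", 1, 4.0), ("P03", 1, 2.5)]
longitudinal = [
    ("P01", 1, 3.5), ("P01", 2, 4.0),
    ("P02", 1, 4.0), ("P02", 2, 3.0),
]


def is_longitudinal(records):
    """Return True if the data were collected in more than one wave."""
    return len({wave for _, wave, _ in records}) > 1


def change_per_participant(records):
    """Score change between each participant's first and last wave."""
    by_participant = defaultdict(dict)
    for participant, wave, score in records:
        by_participant[participant][wave] = score
    changes = {}
    for participant, scores in by_participant.items():
        first, last = min(scores), max(scores)
        if first != last:  # only one wave means no change can be measured
            changes[participant] = scores[last] - scores[first]
    return changes


if __name__ == "__main__":
    print(is_longitudinal(cross_sectional), is_longitudinal(longitudinal))
    print(change_per_participant(longitudinal))
```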

Methodological Choice #5 – Sampling Strategy

Next, you’ll need to discuss your sampling strategy . There are two main categories of sampling, probability and non-probability sampling.

Probability sampling involves a random selection of participants from a population, which makes the sample more likely to be representative, whereas non-probability sampling entails selecting participants in a non-random manner, so representativeness cannot be assumed. An example of the latter is selecting participants based on ease of access (this is called a convenience sample).

The right sampling approach depends largely on what you’re trying to achieve in your study, specifically, whether you’re trying to develop findings that are generalisable to a population or not. Practicalities and resource constraints also play a large role here, as it can oftentimes be challenging to gain access to a truly random sample.
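The contrast between the two sampling categories can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the population is a complete list you can draw from (a sampling frame), simple random sampling stands in for probability sampling in general, and "the first people you can reach" stands in for convenience sampling. The function names are hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Probability sampling: every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


def convenience_sample(population, n):
    """Non-probability sampling: take whoever is easiest to reach.

    Here 'easiest to reach' is modelled as the first n entries in the list,
    which is an assumption made purely for illustration.
    """
    return list(population)[:n]


if __name__ == "__main__":
    students = [f"student_{i}" for i in range(1, 101)]  # hypothetical sampling frame
    print(simple_random_sample(students, 5, seed=7))
    print(convenience_sample(students, 5))
```

Whichever approach you use, the methodology chapter should state how the sampling frame was obtained and why the approach suits your research aims.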

Methodological Choice #6 – Data Collection Method

Next up, you’ll need to explain how you’ll go about collecting the necessary data for your study. Your data collection method (or methods) will depend on the type of data that you plan to collect – in other words, qualitative or quantitative data.

Typically, quantitative research relies on surveys , data generated by lab equipment, analytics software or existing datasets. Qualitative research, on the other hand, often makes use of collection methods such as interviews , focus groups , participant observations, and ethnography.

So, as you can see, there is a tight link between this section and the design choices you outlined in earlier sections. Strong alignment between these sections, as well as your research aims and questions is therefore very important.

Methodological Choice #7 – Data Analysis Methods/Techniques

The final major methodological choice that you need to address is that of analysis techniques. In other words, how you’ll go about analysing your data once you’ve collected it. Here it’s important to be very specific about your analysis methods and/or techniques – don’t leave any room for interpretation. Also, as with all choices in this chapter, you need to justify each choice you make.

What exactly you discuss here will depend largely on the type of study you’re conducting (i.e., qualitative, quantitative, or mixed methods). For qualitative studies, common analysis methods include content analysis, thematic analysis and discourse analysis.
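As a rough illustration of one small step in content analysis, the sketch below counts how often each code appears across coded interview segments. It assumes the coding itself has already been done by the researcher; the segment data, the code labels and the function name are all made up for the example, and real content analysis involves far more than counting.

```python
from collections import Counter

# Hypothetical coded segments: (source, code assigned by the researcher)
coded_segments = [
    ("interview_1", "workload"),
    ("interview_1", "support"),
    ("interview_2", "workload"),
    ("interview_3", "workload"),
    ("interview_3", "funding"),
]


def code_frequencies(segments):
    """Count how many segments were assigned each code."""
    return Counter(code for _, code in segments)


if __name__ == "__main__":
    for code, count in code_frequencies(coded_segments).most_common():
        print(f"{code}: {count}")
```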

For quantitative studies, you’ll almost always make use of descriptive statistics, and in many cases, you’ll also use inferential statistical techniques (e.g., correlation and regression analysis).
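For readers who have not run these analyses before, here is a minimal sketch using only the Python standard library. It computes descriptive statistics (mean and standard deviation), a Pearson correlation coefficient, and the slope and intercept of a simple least-squares regression line. The variable names and the data (hours studied against test score) are invented for illustration, and in a real study you would more likely use dedicated statistical software.

```python
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def simple_regression(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = cov / ss_x
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    hours = [1, 2, 3, 4, 5]        # hypothetical hours studied
    scores = [52, 55, 61, 64, 70]  # hypothetical test scores
    print("mean score:", statistics.mean(scores))
    print("std dev:", statistics.stdev(scores))
    print("r:", pearson_r(hours, scores))
    print("slope, intercept:", simple_regression(hours, scores))
```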

In this section of your methodology chapter, it’s also important to discuss how you prepared your data for analysis, and what software you used (if any). For example, quantitative data will often require some initial preparation such as removing duplicates or incomplete responses . Similarly, qualitative data will often require transcription and perhaps even translation. As always, remember to state both what you did and why you did it.
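The data preparation step mentioned above (removing duplicates and incomplete responses) might look something like the following sketch. It assumes each survey response is a dictionary, that a duplicate means a repeated respondent ID, and that a response is incomplete when any required field is missing or empty. These definitions, along with the field and function names, are assumptions for illustration; your own criteria should be stated and justified in the chapter.

```python
def clean_responses(responses, required_fields, id_field="respondent_id"):
    """Drop incomplete responses, then keep the first response per respondent."""
    seen_ids = set()
    cleaned = []
    for response in responses:
        # Incomplete: a required field is absent, None, or an empty string
        if any(response.get(field) in (None, "") for field in required_fields):
            continue
        respondent = response.get(id_field)
        # Duplicate: this respondent has already contributed a response
        if respondent in seen_ids:
            continue
        seen_ids.add(respondent)
        cleaned.append(response)
    return cleaned


if __name__ == "__main__":
    raw = [
        {"respondent_id": 1, "age": 25, "score": 4},
        {"respondent_id": 1, "age": 25, "score": 4},    # duplicate
        {"respondent_id": 2, "age": None, "score": 3},  # incomplete
        {"respondent_id": 3, "age": 40, "score": 5},
    ]
    kept = clean_responses(raw, required_fields=["age", "score"])
    print(f"kept {len(kept)} of {len(raw)} responses")
```

Reporting how many responses were removed at each step, and why, is part of stating both what you did and why you did it.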

Section 3 – The Methodological Limitations

With the key methodological choices outlined and justified, the next step is to discuss the limitations of your design. No research methodology is perfect – there will always be trade-offs between the “ideal” methodology and what’s practical and viable, given your constraints. Therefore, this section of your methodology chapter is where you’ll discuss the trade-offs you had to make, and why these were justified given the context.

Methodological limitations can vary greatly from study to study, ranging from common issues such as time and budget constraints to issues of sample or selection bias. For example, you may find that you didn’t manage to draw in enough respondents to achieve the desired sample size (and therefore lacked the statistical power to detect significant effects), or your sample may be skewed heavily towards a certain demographic, thereby negatively impacting representativeness.
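To judge whether a sample fell short of the "desired sample size", you first need a target. One widely used starting point for estimating a proportion from a large population is Cochran's formula, n = z² · p(1 − p) / e². The sketch below applies it. The choice of this formula, the default 95% confidence level and the conservative p = 0.5 are assumptions of the example rather than something the text above prescribes, and other study designs call for other calculations (for example, a power analysis).

```python
import math
from statistics import NormalDist


def required_sample_size(margin_of_error, confidence=0.95, proportion=0.5):
    """Cochran's sample size for estimating a proportion in a large population.

    margin_of_error: acceptable error, e.g. 0.05 for +/- 5 percentage points
    confidence:      confidence level, e.g. 0.95
    proportion:      expected proportion; 0.5 is the most conservative choice
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z-score
    n = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    return math.ceil(n)  # round up, since you cannot survey a fraction of a person


if __name__ == "__main__":
    print(required_sample_size(0.05))
    print(required_sample_size(0.03))
```

If your achieved sample is well below the figure this returns, that gap is worth stating explicitly as a limitation.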

In this section, it’s important to be critical of the shortcomings of your study. There’s no use trying to hide them (your marker will be aware of them regardless). By being critical, you’ll demonstrate to your marker that you have a strong understanding of research theory, so don’t be shy here. At the same time, don’t beat your study to death . State the limitations, why these were justified, how you mitigated their impacts to the best degree possible, and how your study still provides value despite these limitations .

Section 4 – Concluding Summary

Finally, it’s time to wrap up the methodology chapter with a brief concluding summary. In this section, you’ll want to concisely summarise what you’ve presented in the chapter. Here, it can be a good idea to use a figure to summarise the key decisions, especially if your university recommends using a specific model (for example, Saunders’ Research Onion ).

Importantly, this section needs to be brief – a paragraph or two maximum (it’s a summary, after all). Also, make sure that when you write up your concluding summary, you include only what you’ve already discussed in your chapter; don’t add any new information.

Keep it simple

Methodology Chapter Example

In the video below, we walk you through an example of a high-quality research methodology chapter from a dissertation. We also unpack our free methodology chapter template so that you can see how best to structure your chapter.

Wrapping Up

And there you have it – the methodology chapter in a nutshell. As we’ve mentioned, the exact contents and structure of this chapter can vary between universities , so be sure to check in with your institution before you start writing. If possible, try to find dissertations or theses from former students of your specific degree program – this will give you a strong indication of the expectations and norms when it comes to the methodology chapter (and all the other chapters!).

Also, remember the golden rule of the methodology chapter – justify every choice ! Make sure that you clearly explain the “why” for every “what”, and reference credible methodology textbooks or academic sources to back up your justifications.

50 Comments

DAUDI JACKSON GYUNDA

highly appreciated.

florin

This was very helpful!

Nophie

This was helpful

mengistu

Thanks ,it is a very useful idea.

Thanks ,it is very useful idea.

Lucia

Thank you so much, this information is very useful.

Shemeka Hodge-Joyce

Thank you very much. I must say the information presented was succinct, coherent and invaluable. It is well put together and easy to comprehend. I have a great guide to create the research methodology for my dissertation.

james edwin thomson

Highly clear and useful.

Amir

I understand a bit on the explanation above. I want to have some coach but I’m still student and don’t have any budget to hire one. A lot of question I want to ask.

Henrick

Thank you so much. This concluded my day plan. Thank you so much.

Najat

Thanks it was helpful

Karen

Great information. It would be great though if you could show us practical examples.

Patrick O Matthew

Thanks so much for this information. God bless and be with you

Atugonza Zahara

Thank you so so much. Indeed it was helpful

Joy O.

This is EXCELLENT!

I was totally confused by other explanations. Thank you so much!.

keinemukama surprise

justdoing my research now , thanks for the guidance.

Yucong Huang

Thank uuuu! These contents are really valued for me!

Thokozani kanyemba

This is powerful …I really like it

Hend Zahran

Highly useful and clear, thank you so much.

Harry Kaliza

Highly appreciated. Good guide

Fateme Esfahani

That was helpful. Thanks

David Tshigomana

This is very useful.Thank you

Kaunda

Very helpful information. Thank you

Peter

This is exactly what I was looking for. The explanation is so detailed and easy to comprehend. Well done and thank you.

Shazia Malik

Great job. You just summarised everything in the easiest and most comprehensible way possible. Thanks a lot.

Rosenda R. Gabriente

Thank you very much for the ideas you have given this will really help me a lot. Thank you and God Bless.

Eman

Such great effort …….very grateful thank you

Shaji Viswanathan

Please accept my sincere gratitude. I have to say that the information that was delivered was congruent, concise, and quite helpful. It is clear and straightforward, making it simple to understand. I am in possession of an excellent manual that will assist me in developing the research methods for my dissertation.

lalarie

Thank you for your great explanation. It really helped me construct my methodology paper.

Daniel sitieney

Thank you for simplifying the methodology. It was really helpful.

Kayode

Very helpful!

Nathan

Thank you for your great explanation.

Emily Kamende

The explanation I have been looking for. So clear Thank you

Abraham Mafuta

Thank you very much. This was more enlightening.

Jordan

helped me create the in depth and thorough methodology for my dissertation

Nelson D Menduabor

Thank you for the great explanation. Please construct one methodology for me

I appreciate you for the explanation of methodology. Please construct one methodology on the topic: The effects influencing students dropout among schools for my thesis

This helped me complete my methods section of my dissertation with ease. I have managed to write a thorough and concise methodology!

ASHA KIUNGA

It's so good indeed

leslie chihope

wow …what an easy to follow presentation. very invaluable content shared. utmost important.

Ahmed khedr

Peace be upon you, I am Dr. Ahmed Khedr, a former part-time professor at Al-Azhar University in Cairo, Egypt. I am currently teaching research methods, and I have been dealing with your esteemed site for several years, and I found that despite my long experience with research methods sites, it is one of the smoothest sites for evaluating the material for students, For this reason, I relied on it a lot in teaching and translated most of what was written into Arabic and published it on my own page on Facebook. Thank you all… Everything I posted on my page is provided with the names of the writers of Grad coach, the title of the article, and the site. My best regards.

Daniel Edwards

A remarkably simple and useful guide, thank you kindly.

Magnus Mahenge

I really appreciate your short and remarkable chapter summary

Olalekan Adisa

Bravo! Very helpful guide.

Arthur Margraf

Only true experts could provide such helpful, fantastic, and inspiring knowledge about Methodology. Thank you very much! God be with you and us all!

Aruni Nilangi

highly appreciate your effort.

White Label Blog Content

This is a very well thought out post. Very informative and a great read.

FELEKE FACHA

THANKS SO MUCH FOR SHARING YOUR NICE IDEA

How to Write a Research Paper Introduction (with Examples)

The research paper introduction section, along with the Title and Abstract, can be considered the face of any research paper. The following article is intended to guide you in organizing and writing the research paper introduction for a quality academic article or dissertation.

The research paper introduction aims to present the topic to the reader. A study will only be accepted for publishing if you can ascertain that the available literature cannot answer your research question. So it is important to ensure that you have read important studies on that particular topic, especially those within the last five to ten years, and that they are properly referenced in this section. 1 What should be included in the research paper introduction is decided by what you want to tell readers about the reason behind the research and how you plan to fill the knowledge gap. The best research paper introduction provides a systematic review of existing work and demonstrates additional work that needs to be done. It needs to be brief, captivating, and well-referenced; a well-drafted research paper introduction will help the researcher win half the battle.

The introduction for a research paper is where you set up your topic and approach for the reader. It has several key goals:

  • Present your research topic
  • Capture reader interest
  • Summarize existing research
  • Position your own approach
  • Define your specific research problem and problem statement
  • Highlight the novelty and contributions of the study
  • Give an overview of the paper’s structure

The research paper introduction can vary in size and structure depending on whether your paper presents the results of original empirical research or is a review paper. Some research paper introduction examples are only half a page while others are a few pages long. In many cases, the introduction will be shorter than all of the other sections of your paper; its length depends on the size of your paper as a whole.

Table of Contents

  • What is the introduction for a research paper?
  • Why is the introduction important in a research paper?
  • 1. Introduce the research topic
  • 2. Determine a research niche
  • 3. Place your research within the research niche
  • Frequently asked questions on research paper introduction
  • Key points to remember

The introduction in a research paper is placed at the beginning to guide the reader from a broad subject area to the specific topic that your research addresses. It presents the following information to the reader:

  • Scope: The topic covered in the research paper
  • Context: Background of your topic
  • Importance: Why your research matters in that particular area of research and the industry problem that can be targeted

The research paper introduction conveys a lot of information and can be considered an essential roadmap for the rest of your paper. A good introduction for a research paper is important for the following reasons:

  • It stimulates your reader’s interest: A good introduction section can make your readers want to read your paper by capturing their interest. It informs the reader what they are going to learn and helps determine if the topic is of interest to them.
  • It helps the reader understand the research background: Without a clear introduction, your readers may feel confused and even struggle when reading your paper. A good research paper introduction will prepare them for the in-depth research to come. It provides you the opportunity to engage with the readers and demonstrate your knowledge and authority on the specific topic.
  • It explains why your research paper is worth reading: Your introduction can convey a lot of information to your readers. It introduces the topic, why the topic is important, and how you plan to proceed with your research.
  • It helps guide the reader through the rest of the paper: The research paper introduction gives the reader a sense of the nature of the information that will support your arguments and the general organization of the paragraphs that will follow. It offers an overview of what to expect when reading the main body of your paper.

What are the parts of introduction in the research?

A good research paper introduction section should comprise three main elements: 2

  • What is known: This sets the stage for your research. It informs the readers of what is known on the subject.
  • What is lacking: This is aimed at justifying the reason for carrying out your research. This could involve investigating a new concept or method or building upon previous research.
  • What you aim to do: This part briefly states the objectives of your research and its major contributions. Your detailed hypothesis will also form a part of this section.

How to write a research paper introduction?

The first step in writing the research paper introduction is to inform the reader what your topic is and why it’s interesting or important. This is generally accomplished with a strong opening statement. The second step involves establishing the kinds of research that have been done and ending with limitations or gaps in the research that you intend to address. Finally, the research paper introduction clarifies how your own research fits in and what problem it addresses. If your research involved testing hypotheses, these should be stated along with your research question. The hypothesis should be presented in the past tense since it will have been tested by the time you are writing the research paper introduction.

The following key points, with examples, can guide you when writing the research paper introduction section:

  • Highlight the importance of the research field or topic
  • Describe the background of the topic
  • Present an overview of current research on the topic

Example: The inclusion of experiential and competency-based learning has benefitted electronics engineering education. Industry partnerships provide an excellent alternative for students wanting to engage in solving real-world challenges. Industry-academia participation has grown in recent years due to the need for skilled engineers with practical training and specialized expertise. However, from the educational perspective, many activities are needed to incorporate sustainable development goals into the university curricula and consolidate learning innovation in universities.

  • Reveal a gap in existing research or oppose an existing assumption
  • Formulate the research question

Example: There have been plausible efforts to integrate educational activities in higher education electronics engineering programs. However, very few studies have considered using educational research methods for performance evaluation of competency-based higher engineering education, with a focus on technical and/or transversal skills. To remedy the current need for evaluating competencies in STEM fields and providing sustainable development goals in engineering education, in this study, a comparison was drawn between study groups without and with industry partners.

  • State the purpose of your study
  • Highlight the key characteristics of your study
  • Describe important results
  • Highlight the novelty of the study.
  • Offer a brief overview of the structure of the paper.

Example: The study evaluates the main competency needed in the applied electronics course, which is a fundamental core subject for many electronics engineering undergraduate programs. We compared two groups, without and with an industrial partner, that offered real-world projects to solve during the semester. This comparison can help determine significant differences in both groups in terms of developing subject competency and achieving sustainable development goals.

Write a Research Paper Introduction in Minutes with Paperpal

Paperpal Copilot is a generative AI-powered academic writing assistant. It’s trained on millions of published scholarly articles and over 20 years of STM experience. Paperpal Copilot helps authors write better and faster with:

  • Real-time writing suggestions
  • In-depth checks for language and grammar correction
  • Paraphrasing to add variety, ensure academic tone, and trim text to meet journal limits

With Paperpal Copilot, create a research paper introduction effortlessly. In this step-by-step guide, we’ll walk you through how Paperpal transforms your initial ideas into a polished and publication-ready introduction.

How to use Paperpal to write the Introduction section

Step 1: Sign up on Paperpal and click on the Copilot feature; under this, choose Outlines > Research Article > Introduction.

Step 2: Add your unstructured notes or initial draft, whether in English or another language, to Paperpal, which is to be used as the base for your content.

Step 3: Fill in the specifics, such as your field of study, brief description or details you want to include, which will help the AI generate the outline for your Introduction.

Step 4: Use this outline and sentence suggestions to develop your content, adding citations where needed and modifying it to align with your specific research focus.

Step 5: Turn to Paperpal’s granular language checks to refine your content, tailor it to reflect your personal writing style, and ensure it effectively conveys your message.

You can use the same process to develop each section of your article, and finally your research paper in half the time and without any of the stress.

The purpose of the research paper introduction is to introduce the reader to the problem definition, justify the need for the study, and describe the main theme of the study. The aim is to gain the reader’s attention by providing them with necessary background information and establishing the main purpose and direction of the research.

The length of the research paper introduction can vary across journals and disciplines. While there are no strict word limits for writing the research paper introduction, an ideal length would be one page, with a maximum of 400 words over 1-4 paragraphs. Generally, it is one of the shorter sections of the paper as the reader is assumed to have at least a reasonable knowledge about the topic. 2 For example, for a study evaluating the role of building design in ensuring fire safety, there is no need to discuss definitions and nature of fire in the introduction; you could start by commenting upon the existing practices for fire safety and how your study will add to the existing knowledge and practice.

When deciding what to include in the research paper introduction, the rest of the paper should also be considered. The aim is to introduce the reader smoothly to the topic and facilitate an easy read without much dependency on external sources. 3 Below is a list of elements you can include to prepare a research paper introduction outline and follow when writing the research paper introduction:

  • Topic introduction: This can include key definitions and a brief history of the topic.
  • Research context and background: Offer the readers some general information and then narrow it down to specific aspects.
  • Details of the research you conducted: A brief literature review can be included to support your arguments or line of thought.
  • Rationale for the study: This establishes the relevance of your study and establishes its importance.
  • Importance of your research: The main contributions are highlighted to help establish the novelty of your study.
  • Research hypothesis: Introduce your research question and propose an expected outcome.
  • Organization of the paper: Include a short paragraph of 3-4 sentences that highlights your plan for the entire paper.

Cite only works that are most relevant to your topic; as a general rule, you can include one to three. Note that readers want to see evidence of original thinking, so it is better to avoid using too many references, as this does not leave much room for your personal standpoint to shine through. Citations in your research paper introduction support the key points, and the number of citations depends on the subject matter and the point discussed. If the research paper introduction is too long or overflowing with citations, it is better to cite a few review articles rather than the individual articles summarized in the review. A good point to remember when citing research papers in the introduction section is to include at least one-third of the references in the introduction.

The literature review plays a significant role in the research paper introduction section. A good literature review accomplishes the following:

  • Introduces the topic
  • Establishes the study’s significance
  • Provides an overview of the relevant literature
  • Provides context for the study using literature
  • Identifies knowledge gaps

However, remember to avoid making the following mistakes when writing a research paper introduction:

  • Do not use studies from the literature review to aggressively support your research
  • Avoid direct quoting
  • Do not allow literature review to be the focus of this section. Instead, the literature review should only aid in setting a foundation for the manuscript.

Remember the following key points for writing a good research paper introduction: 4

  • Avoid stuffing too much general information: Avoid including what an average reader would know and include only that information related to the problem being addressed in the research paper introduction. For example, when describing a comparative study of non-traditional methods for mechanical design optimization, information related to the traditional methods and differences between traditional and non-traditional methods would not be relevant. In this case, the introduction for the research paper should begin with the state-of-the-art non-traditional methods and methods to evaluate the efficiency of newly developed algorithms.
  • Avoid packing too many references: Cite only the required works in your research paper introduction. The other works can be included in the discussion section to strengthen your findings.
  • Avoid extensive criticism of previous studies: Avoid being overly critical of earlier studies while setting the rationale for your study. A better place for this would be the Discussion section, where you can highlight the advantages of your method.
  • Avoid describing conclusions of the study: When writing a research paper introduction remember not to include the findings of your study. The aim is to let the readers know what question is being answered. The actual answer should only be given in the Results and Discussion section.

To summarize, the research paper introduction section should be brief yet informative. It should convince the reader of the need to conduct the study and motivate them to read further. If you’re feeling stuck or unsure, choose trusted AI academic writing assistants like Paperpal to effortlessly craft your research paper introduction and other sections of your research article.

1. Jawaid, S. A., & Jawaid, M. (2019). How to write introduction and discussion. Saudi Journal of Anaesthesia, 13(Suppl 1), S18.

2. Dewan, P., & Gupta, P. (2016). Writing the title, abstract and introduction: Looks matter! Indian Pediatrics, 53, 235-241.

3. Cetin, S., & Hackam, D. J. (2005). An approach to the writing of a scientific manuscript. Journal of Surgical Research, 128(2), 165-167.

4. Bavdekar, S. B. (2015). Writing introduction: Laying the foundations of a research paper. Journal of the Association of Physicians of India, 63(7), 44-6.

RCV Academy

Educating For Excellence! Making You Job Ready!

Research Methodology Introduction – Characteristics of Research

Research methodology can be defined as a scientific and systematic search for pertinent information or facts on a specific topic.

In fact, it is an art of scientific investigation. Research in common parlance refers to a search for knowledge.

Research can be undertaken within most professions. More than a set of skills, it is a way of thinking: examining more critically the various aspects of your professional work.

When you say you are undertaking a research study to find answers to a question, you are implying that the process is being undertaken within a framework of a set of philosophies and that it uses procedures, methods, and techniques.

The word research is composed of two syllables, re and search.

Re is a prefix meaning again, anew or over again. Search is a verb meaning to examine closely and carefully.

Together they describe a careful, systematic study and investigation in some field of knowledge undertaken to establish facts and principles.

It is an academic activity, and the term can also be used in a technical sense. The meaning of research is “a careful investigation or inquiry, especially through search for new facts in any branch of knowledge”.

Definitions of Research

  • According to Redman and Mory –  A systematized effort to gain new knowledge.
  • According to Clifford Woody –  Research comprises defining and redefining problems, formulating a hypothesis or suggested solutions, collecting and evaluating data, making deductions and reaching conclusions.
  • According to D. Slesinger –  The manipulation of things, concepts or symbols for the purpose of generalizing to extend, correct or verify knowledge, whether that knowledge aids in construction of theory or in the practice of an art.

We can say that research is used to collect information & data for the purpose of making business decisions.

The research methodology may include publication research, interviews, surveys & other research techniques & could include both present & historical information.

In short, research refers to the systematic method of describing the problem clearly and precisely, formulating a hypothesis, collecting and analyzing data, and then reaching a conclusion about the problem.

Characteristics of Research

Research is a process of collecting, analyzing and interpreting information to answer questions. But to qualify as research, the process must have certain characteristics, which are as follows:

Controlled

It means exploring causality in the relationship between two factors. In simpler words, you must set up your study in a way that minimizes the effects of other factors on the relationship.

Rigorous

You must be careful in ensuring the procedures used to find the answers to the questions are relevant, justified & appropriate. The degree of rigor varies between the physical and social sciences.

Systematic

The procedures adopted to undertake an investigation should follow a certain logical sequence. Some procedures must follow others.

Valid & verifiable

Whatever you conclude on the basis of your findings is correct and can be verified by you and others.

Empirical

Any conclusion drawn should be based on data gathered from real-life experiences or observations.

Critical

The process of investigation must be foolproof and free from drawbacks. The process adopted and the procedures used must be able to withstand critical scrutiny.

For a process to be called research, it is imperative that it has the above characteristics.

It is a structured inquiry that utilizes acceptable scientific methodology to solve problems and create new knowledge that is generally applicable.

Scientific methods consist of systematic observation, classification, and interpretation of data.

Introduction to Research Methodology

The term Research may be defined as a systematic gathering or collection of data and information. It is a process of analysis for the advancement of knowledge in any subject. 

Definition of Research

Researchers define Research in various ways, depending on their field of study and the resources available at the time. As a basic definition, Research is the process of gathering information and data to discover new knowledge or concepts, or to advance existing theories, producing an understanding that was not previously known.

 Characteristics of a Good Research

  • The purpose of the research is clear.
  • It follows a systematic process.
  • It is logical and objective.
  • It is easy to understand.
  • Research starts with a problem/question. 
  • It creates a path for generating new questions.
  • It is ethical.
  • It resolves any current problems/issues.
  • The data of the research is appropriate.
  • It is empirical & replicable.

Purpose of Research

There are many purposes of Research but the main three purposes are –

  • Exploratory: Exploratory research is the first unstructured research for solving new problems that haven’t been explored/discovered before. 
  • Descriptive: Descriptive research expands knowledge of an existing research problem in a structured way.   
  • Explanatory: Explanatory research is causal, experimental research. It is conducted to check the result of specific changes in an existing procedure or system.

Types of Research

Research can be divided into two main types : 

  • Basic research (Pure Research): It does not focus on solving specific problems or issues; it focuses on advancing existing knowledge.
  • Applied research: It focuses on finding solutions to specific problems or issues.

Types of Research Methods

Research methods are mainly classified into two types:

  • Qualitative Research Methods: Qualitative research refers to analyzing in-depth information about human behavior and producing “textual data” (non-numerical).
  • Quantitative Research Methods: Quantitative research refers to analyzing something based on some numerical data and mathematical models and producing “numerical data”.

Both methods have distinctive properties and data collection methods. For more detail, please refer to the article Difference between Qualitative Research and Quantitative Research.

  • Open access
  • Published: 20 April 2024

Viral decisions: unmasking the impact of COVID-19 info and behavioral quirks on investment choices

  • Wasim ul Rehman (ORCID: orcid.org/0000-0002-9927-2780) 1,
  • Omur Saltik 2,
  • Faryal Jalil 3 &
  • Suleyman Degirmen 4

Humanities and Social Sciences Communications volume 11, Article number: 524 (2024)

This study aims to investigate the impact of behavioral biases on investment decisions and the moderating role of COVID-19 pandemic information sharing. Furthermore, it highlights the significance of considering cognitive biases and sociodemographic factors in analyzing investor behavior and in designing agent-based models for market simulation. The findings reveal that these behavioral factors significantly positively affect investment decisions, aligning with prior research. The agent-based model’s outcomes indicate that younger, less experienced agents are more prone to herding behavior and perform worse in the simulation compared to their older, higher-income counterparts. In conclusion, the results offer valuable insights into the influence of behavioral biases and the moderating role of COVID-19 pandemic information sharing on investment decisions. Investors can leverage these insights to devise effective strategies that foster rational decision-making during crises, such as the COVID-19 pandemic.

Introduction

Coronavirus (COVID-19) is recognized as a significant health crisis that has adversely affected the well-being of global economies (Baker et al. 2020; Smales 2021; Debata et al. 2021). First identified in December 2019 as a highly fatal and contagious disease, it was declared a public health emergency by the World Health Organization (WHO) (WHO 2020; Baker et al. 2020; Altig et al. 2020; Smales 2021; Li et al. 2020). The outbreak swiftly spread across 31 provinces, municipalities, and autonomous regions in China, eventually evolving into a severe global pandemic that significantly impacted the global economy, particularly equity markets and social development (WHO 2020; Kazmi et al. 2020; Li et al. 2020). Since the early 2020 emergence of COVID-19 symptoms, the pandemic has caused considerable market decline and volatility in stock returns, significantly impacting the prosperity of world economies (Rahman et al. 2022; Soltani et al. 2021; Rubesam and Júnior 2022; Debata et al. 2021; Baker et al. 2020; Altig et al. 2020). This situation has garnered the attention of many policymakers and economists since its classification as a public health emergency.

Pakistan’s National Command and Operation Centre reported its first two confirmed COVID-19 cases on February 26, 2020. Following this, the Pakistan Stock Exchange experienced a significant downturn, losing 2266 points and erasing Rs. 436 billion in market equity. Foreign investment saw a notable decline, with stocks worth $22.5 million contracting sharply. By the end of February 2020, stock investments totaling $56.40 million had been liquidated. This dramatic drop in equity markets is attributed to the global outbreak of the COVID-19 pandemic (Khan et al. 2020). Additionally, for the first time in 75 years, Pakistan’s economy underwent its most substantial contraction in economic growth, recording a GDP growth rate of −0.4% in the first nine months. All three sectors of the economy—agriculture, services, and industry—fell short of their growth targets, culminating in a loss of one-third of their revenue. Exports declined by more than 50% due to the pandemic. Economists have raised concerns about a potential recession as the country grapples with virus containment efforts (Shafi et al. 2020; Naqvi 2020). Consequently, the rapid spread of COVID-19 has heightened volatility in financial markets, inflicted substantial losses on investors, and caused widespread turmoil in financial and liquidity markets globally (Zhang et al. 2020; Goodell 2020; Al-Awadhi et al. 2020; Ritika et al. 2023). This uncertainty has been exacerbated by an increasing number of positive COVID-19 cases.

Since the magnitude of the COVID-19 outbreak became evident, capital markets worldwide have been experiencing significant declines and volatility in stock returns, affected by all new virus variants despite their effective treatments (Hong et al. 2021 ; Rubesam and Júnior 2022 ; Zhang et al. 2020 ). Previous studies have characterized COVID-19 as a particularly devastating and deadly pandemic, severely impacting socio-economic infrastructures globally (Fernandes 2020 ). The pandemic has disrupted trade and investment activities, leading to imbalances in equity market returns (Xu 2021 ; Shehzad et al. 2020 ; Zaremba et al. 2020 ; Baig et al. 2021 ). In response to the COVID-19 outbreak, various governments, including Pakistan’s, have implemented unprecedented and diverse measures. These include restricting the mobility of the general public and commercial operations, and implementing smart or partial lockdowns, all aimed at mitigating the pandemic’s impact on global economic growth (Rubesam and Júnior 2022 ; Zaremba et al. 2020 ).

Investment decisions become notably complex and challenging when influenced by behavioral biases (Pompian 2012 ). In this context, numerous studies have sought to reconcile various behavioral finance theories with the notion of investors as rational decision-makers. One prominent theory is the Efficient Market Hypothesis, which asserts that capital markets are efficient when decisions are informed by symmetrical information among participants (Fama 1991 ). Yet, in reality, individual investors often struggle to make rational investment choices (Kim and Nofsinger 2008 ), as their decisions are significantly swayed by behavioral biases, leading to market inefficiencies. These biases, including investor sentiment, overconfidence, over/underreaction, and herding behavior, are recognized as widespread in human decision-making (Metawa et al. 2018 ). Prior research has identified various behavioral and psychological biases—such as loss aversion, anchoring, heuristic biases, and the disposition effect—that cause investors to stray from rational investment decisions. Moreover, investors’ responses to COVID-19-related news, like infection rates, vaccine developments, lockdowns, or economic forecasts, often reflect behavioral biases such as investor sentiment, overconfidence, over/underreaction, or herding behavior towards short-term events, thereby affecting market volatility (Soltani and Boujelbene 2023 ; Dash and Maitra 2022 ). These biases may have a wide applicability across different markets, regardless of specific cultural or regulatory differences. Consequently, we posit that these four behavioral biases, in the context of COVID-19, are key factors in reducing vulnerability in investment decisions (Dermawan and Trisnawati 2023 ), especially for individual investors who are more susceptible than in a typical investment environment (Botzen et al. 2021 ; Talwar et al. 2021 ). 
Therefore, understanding these behavioral biases—such as investor sentiment, overconfidence, over/underreaction, or herding behavior—during the COVID-19 pandemic is crucial, as no previous epidemic has demonstrated such profound impacts of behavioral biases on investment decisions (Baker et al. 2020 ; Sattar et al. 2020 ).

Numerous studies have explored the impact of behavioral biases, including investor sentiment, overconfidence, over/under-reaction, and herding behavior, on investment decisions (Metawa et al. 2018 ; Menike et al. 2015 ; Nofsinger and Varma 2014 ; Qadri and Shabbir 2014 ; Asaad 2012 ; Kengatharan and Kengatharan 2014 ). Recent literature has also shed light on the effects of the COVID-19 pandemic on financial and precious commodity markets (Gao et al. 2023 ; Zhang et al. 2020 ; Corbet et al. 2020 ; Baker et al. 2020 ; Mumtaz and Ahmad 2020 ; Ahmed et al. 2022 ; Hamidon and Kehelwalatenna 2020 ). However, academic research specifically addressing the moderating role of COVID-19 pandemic information sharing on behavioral biases remains limited. It has been observed that global pandemics, such as the Ebola Virus Disease (EVD) and Severe Acute Respiratory Syndrome (SARS), significantly influence stock market dynamics, sparking widespread fear among investors and leading to market uncertainty (Del Giudice and Paltrinieri 2017 ; He et al. 2020 ). This study contributes to the field by examining how behavioral biases, such as investor sentiment, overconfidence, over/under-reaction, and herding behavior, are influenced by the unique circumstances of the COVID-19 crisis. Furthermore, this research provides novel insights into real-time investor behavior and policymaking, thus advancing the academic debate on the role of COVID-19 pandemic information sharing within behavioral finance.

The primary goal of this study is to explore the impact of the COVID-19 crisis on behavioral biases and their effect on investment decisions. Additionally, it aims to assess how various socio-demographic factors influence investment decision-making. These factors include age, occupation, gender, educational qualifications, type of investor, investment objectives, reasons for investing, preferred investment duration, and considerations prior to investing, such as the safety of the principal, risk level, expected returns, maturity period, and sources of investment advice. We hypothesize that these factors significantly influence investment decisions, and our analysis endeavors to investigate the relationship between these factors and investment behavior. By thoroughly examining these variables, the study aims to shed light on the role socio-demographic factors play in investment behavior and enhance the understanding of the investment decision-making process. Additionally, the study seeks to conduct a cluster analysis to identify hierarchical relationships and causality, alongside an agent-based learning model that illustrates the susceptibility of low-income and younger age groups to herding behavior. The article provides the codes and outcomes of the model.

The study will commence with an introduction that outlines the scope and significance of the research. Following this, a literature review will be provided, along with the development of hypotheses concerning the behavioral biases affecting investment decisions and the role of socio-demographic factors in shaping investment behavior. The methodology section will detail the research approach, data collection process, variables considered for analysis, and the statistical methods applied. Subsequently, the results section will present findings from the regression and moderating analyses, cluster analysis, and the agent-based learning model. This will include a detailed explanation of the model codes and their interpretations. The discussion section will interpret the study’s results, highlighting their relevance to policymakers, financial advisors, and individual investors. The article will conclude by summarizing the main discoveries and offering suggestions for further inquiry in this domain.

Literature review and development of hypotheses

Investor sentiments and investment decisions

Pandemic-driven sentiments play a crucial role in determining market returns, making it imperative to understand pandemic-related sentiments to predict future investor returns. Consequently, we posit that the sharing of COVID-19 pandemic information is a critical factor influencing investor sentiments towards investment decisions (Li et al. 2021 ; Anusakumar et al. 2017 ; Zhu and Niu 2016 ; Jiang et al. 2021 ). Generally, investors’ sentiments refer to their beliefs, anticipations, and outlooks regarding future cash flows, which are significantly influenced by external factors (Baker and Wurgler 2006 ). Ding et al. ( 2021 ) define investor sentiment as the collective attitude of investors towards a particular market or security, reflected in trading activities and price movements of securities. A trend of rising prices signals bullish sentiments, while decreasing prices indicate bearish investor sentiment. These sentiments, including emotions and beliefs about investment risks, notably affect investors’ behavior and yield (Baker and Wurgler 2006 ; Anusakumar et al. 2017 ; Jansen and Nahuis 2003 ). Sentiment reacts to stock price news (Mian and Sankaraguruswamy 2012 ), with stock prices responding more positively to favorable earnings news during periods of high sentiment than in low sentiment periods, and vice versa. This sentiment-driven reaction to share price movements is observed across all types of stocks (Mian and Sankaraguruswamy 2012 ). Furthermore, research indicates that market responses to earnings announcements are asymmetrical, especially in the context of pessimistic investor sentiments (Jiang et al. 2019 ). Such reactions were notably pronounced during COVID-19 pandemic news, where sentiments such as fear, greed, or optimism significantly influenced market dynamics (Jiang et al. 2021 ). 
Thus, information related to the COVID-19 pandemic emerges as a valuable resource for forecasting future returns and market volatility, ultimately affecting investment decision-making (Debata et al. 2021 ).

Overconfidence and investment decision

Standard finance theories suggest that investors aim for rational decision-making (Statman et al. 2006 ). However, their judgments are often swayed by personal sentiments or cognitive errors, leading to overconfidence (Apergis and Apergis 2021 ). Overconfidence in investing can be described as an inflated belief in one’s financial insight and decision-making capabilities (Pikulina et al. 2017 ; Lichtenstein and Fischhoff 1977 ), or a tendency to overvalue one’s skills and knowledge (Dittrich et al. 2005 ). This results in investors perceiving themselves as more knowledgeable than they are (Moore and Healy 2008 ; Pikulina et al. 2017 ).

Overconfidence has been categorized into overestimation, where investors believe their abilities and chances of success are higher than actual, and over-placement, where individuals see themselves as superior to others (Moore and Healy 2008 ). Such overconfidence affects investment choices, leading to potentially inappropriate high-risk investments (Pikulina et al. 2017 ). Overconfident investors often attribute success to personal abilities and failures to external factors (Barber and Odean 2000 ; Tariq and Ullah 2013 ). Overconfidence also leads to suboptimal decision-making, especially under uncertainty (Dittrich et al. 2005 ).

Behavioral finance research shows that individual investors tend to overestimate their chances of success and underestimate risks (Wei et al. 2011 ; Dittrich et al. 2005 ). Excessive overconfidence prompts over-investment, whereas insufficient confidence causes under-investment; moderate confidence, however, leads to more prudent investing (Pikulina et al. 2017 ). The lack of market information often triggers this scenario (Wang 2001 ). Amidst recent market anomalies, COVID-19 information has significantly impacted investors’ overconfidence in their investment decisions. Studies have shown that overconfident investors underestimate their personal risk of COVID-19 compared to the general risk perception (Bottemanne et al. 2020 ; Heimer et al. 2020 ; Boruchowicz and Lopez Boo 2022 ; Druica et al. 2020 ; Raude et al. 2020 ). Overconfidence may lead to adverse selection and undervaluing others’ actions, underestimating the likelihood of loss due to inadequate COVID-19 information (Hossain and Siddiqua 2022 ). Consequently, this study hypothesizes that certain exogenous factors, integral to COVID-19 information sharing, may moderate investment decisions in the context of investor overconfidence.

Over/under reaction and investment decision

The Efficient Market Hypothesis (EMH) suggests that investors’ attempts to act rationally are based on the availability of market information (Fama 1998 ; Fama et al. 1969 ; De Bondt 2000 ). However, psychological biases in investors systematically respond to unwelcome news, leading to overreaction and underreaction, thus challenging the notion of market efficiency (Maher and Parikh 2011 ; De Bondt and Thaler 1985 ). Overreaction and underreaction biases refer to exaggerated responses to recent market news, resulting in the overbuying or overselling of securities in financial markets (Durand et al. 2021 ; Spyrou et al. 2007 ). Barberis et al. ( 1998 ) identified both underreaction and overreaction as pervasive anomalies that drive investors toward irrational investment decisions. Similarly, Hirshleifer ( 2001 ) noted that noisy trading contributes to overreaction, which in turn leads to excessive market volatility.

The impact of the COVID-19 outbreak extends far beyond the loss of millions of lives, disrupting financial markets from every angle (Zhang et al. 2020 ; Iqbal and Bilal 2021 ; Tauni et al. 2020 ; Borgards et al. 2021 ). Market reactions have been significantly shaped by COVID-19 pandemic information sharing, affecting investors’ decisions (Kannadas 2021 ). Recent studies have found that investors’ biases in evaluating the precision and predictive accuracy of COVID-19 information can lead to overreactions and underreactions (Borgards et al. 2021 ; Xu et al. 2022 ; Kannadas 2021 ). Furthermore, research documents the growing influence of COVID-19 information sharing on market reactions worldwide, including in the US, Asian, European, and Australian markets (Xu et al. 2022 ; Nguyen et al. 2020 ; Nguyen and Hoang Dinh 2021 ; Naidu and Ranjeeni 2021 ; Heyden and Heyden 2021 ), indicating that market reactions, characterized by non-linear behavior, are driven by investors’ beliefs.

Previous literature has scarcely explored the role of investors’ overreaction and underreaction in decision-making. Recently, emerging research has begun to enrich the literature by examining the moderating role of COVID-19 pandemic information sharing.

Herding behavior and investment decision

According to the assumptions of Efficient Market Hypothesis (EMH), optimal decision-making is facilitated by the availability of market information and stability of stock returns (Fama 1970 ; Raza et al. 2023 ). However, these conditions are seldom met in reality, as decisions are influenced by human behavior shaped by socio-economic norms (Summers 1986 ; Shiller 1989 ). Behavioral finance research suggests that herding behavior plays a significant role in the decline of asset and stock prices, implying that identifying herding can aid investors in making more rational decisions (Bharti and Kumar 2022 ; Jiang et al. 2022 ; Jiang and Verardo 2018 ; Ali 2022 ). Bikhchandani and Sharma ( 2000 ) define herding as investors’ tendency to mimic others’ trading behaviors, often ignoring their own information. It is essentially a group dynamic where decisions are irrationally based on others’ information, overlooking personal insights, experiences, or beliefs (Bikhchandani and Sharma 2000 ; Huang and Wang 2017 ). Echoing this, Hirshleifer and Hong Teoh ( 2003 ) argue that herding is characterized by investment decisions being influenced by the actions of others.

The sharp market declines prompted by events such as the COVID-19 pandemic raise questions about its influence on investors’ herding behaviors (Rubesam and Júnior 2022 ; Mandaci and Cagli 2022 ; Espinosa-Méndez and Arias 2021 ). Christie and Huang ( 1995 ) observed that investor herding becomes more evident during market uncertainties. Hwang and Salmon ( 2004 ) noted that investors are less likely to exhibit herding during crises compared to stable market periods when confidence in future market prospects is higher. The COVID-19 pandemic, as a major market disruptor, necessitates that investors pay close attention to market fundamentals before making investment decisions. Recent studies suggest that an overload of COVID-19 information could lead to irrational decision-making, potentially challenging the EMH by influencing herding behavior (Jiang et al. 2022 ; Mandaci and Cagli 2022 ). This highlights the importance for investors to be aware of market information asymmetry changes, such as those triggered by the COVID-19 outbreak, which could negatively impact their investment portfolios by altering their herding tendencies. This effect may be more pronounced among individual investors than institutional ones (Metawa et al. 2018 ). A yet unexplored area is the extent to which COVID-19 pandemic information sharing amplifies the herding behavior among investors during investment decision-making processes (Mandaci and Cagli 2022 ).

COVID-19 pandemic information sharing moderating the relationship between behavioral biases and investment decisions

Recent research indicates that the COVID-19 pandemic has notably influenced behavioral biases among investors, affecting their decision-making processes (Betthäuser et al. 2023 ; Vasileiou 2020 ). Since the pandemic’s onset, investors have shown increased sensitivity to pandemic-related news or developments, leading to intensified behavioral biases. This heightened sensitivity poses challenges to investors’ abilities to respond effectively. Specifically, information related to economic uncertainty, infection rates, and vaccination progress has shifted investor sentiment regarding risk perception (Gao et al. 2023 ). Additionally, pandemic news has altered the risk perception of overconfident investors, who previously may have underestimated the risks associated with COVID-19 (Bouteska et al. 2023 ). The increased uncertainty and market volatility triggered by COVID-19 news have also prompted investors to adapt their reactions based on new information, potentially fostering more rational decision-making (Jiang et al. 2022 ). The rapid spread of COVID-19-related news has been shown to diminish mimicry in investment decisions (Nguyen et al. 2023 ). This indicates that viral news about the pandemic makes investors more discerning regarding risk perceptions and investment strategies, moving away from mere herd behavior. Based on this discussion, the study proposes that COVID-19 pandemic information sharing acts as a moderating factor in the relationship between behavioral biases and investment decisions.

Sociodemographic factors and investment decision

The influence of demographic factors like gender, age, income, and marital status on investor behavior is well-documented in financial literature. However, examining these relationships within specific geographical contexts—such as countries, regions, states, and provinces—reveals that cultural values, beliefs, and experiences may blur the distinctions between human and cognitive biases in terms of their nuanced impacts. Evidence shows that certain demographic groups, particularly young male investors with lower portfolio values from regions less developed in terms of education and income, are more prone to overconfidence and familiarity bias in their trading activities. Conversely, investors with higher education levels and female investors are inclined to trade less frequently, resulting in better investment returns (Barber and Odean 2000 ; Gervais and Odean 2001 ; Glaser and Weber 2007 ).

Prior findings further suggest that with increased stock market experience, investors tend to discount emotional factors, leading to more rational investment choices. Nonetheless, experience alone does not appear to markedly influence the decision-making process among investors (Al-Hilu et al. 2017; Metawa et al. 2019).

In summary, demographic variables such as age, gender, and education significantly impact investment decisions, especially when considered alongside behavioral aspects like investor sentiment, overconfidence, and herd behavior. Gaining insight into these dynamics is crucial for investors, financial advisors, and policymakers to devise effective investment strategies and enhance financial literacy.

Research methodology

Data and sampling.

The research methodology outlines the strategy for achieving the study’s objectives. This research adopted a quantitative approach, using a survey (questionnaire) to examine the behavioral biases of individual investors in Pakistan during the COVID-19 pandemic. The target population comprised individual investors from Punjab province, specifically those interested in capital investments. Data were collected through convenience sampling. A total of 750 questionnaires were distributed via an online survey (Google Form) to investors in four major cities of Punjab province: Karachi, Lahore, Islamabad, and Faisalabad. After follow-up reminder emails, 257 respondents completed the survey. Of these, 223 responses were deemed usable, yielding a valid response rate of 29.73% (223 of 750) for further analysis (Saunders et al. 2012).
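The response-rate arithmetic can be checked directly from the counts reported above. This is a small sketch; the variable names are illustrative and only the three counts come from the text.

```python
# Counts reported in the text.
distributed = 750   # questionnaires sent out
completed = 257     # questionnaires returned
usable = 223        # responses retained for analysis

raw_response_rate = completed / distributed   # share of questionnaires returned
valid_response_rate = usable / distributed    # share usable for analysis

print(f"raw response rate:   {raw_response_rate:.2%}")
print(f"valid response rate: {valid_response_rate:.2%}")
```

The valid rate divides usable responses by questionnaires distributed, not by questionnaires returned, which is why it is lower than the raw return rate.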

To mitigate potential biases during the data collection process, we conducted analyses for non-response and common method biases. Non-response bias, which arises when there is a significant difference between early and late respondents in a survey, was addressed by comparing the mean scores of early and late respondents using the independent samples t-test (Armstrong and Overton 1977). Results (see Table 1) indicated no statistically significant (p > 0.05) difference between early and late responses, suggesting that non-response bias was not a significant issue in the dataset.
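The early-versus-late comparison can be sketched as a pooled-variance independent samples t-test. The scores below are hypothetical and are not the study's data. The helper name `pooled_t` is illustrative, and the critical value 2.145 applies only to the two-sided 5% level with 14 degrees of freedom used in this toy example.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Return (t, df) for the classical pooled-variance two-sample t-test."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical mean construct scores on the 1-5 Likert scale (not study data).
early = [3.6, 3.8, 3.4, 4.0, 3.7, 3.5, 3.9, 3.6]
late = [3.5, 3.9, 3.6, 3.8, 3.4, 3.7, 3.8, 3.6]

t_stat, df = pooled_t(early, late)
critical = 2.145  # two-sided 5% critical value of Student's t for df = 14
print(f"t = {t_stat:.3f}, df = {df}")
print("difference significant at 5%:", abs(t_stat) > critical)
```

With real data, the same comparison would be run per construct, and a statistics package would report the exact p-value instead of a table lookup.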

Furthermore, to assess the potential threat of common method variance, we applied Harman’s single-factor test, a widely used method to evaluate common method biases in datasets (Podsakoff et al. 2003 ). This technique is aimed at identifying systematic biases that could compromise the validity of the scale. Through exploratory factor analysis (EFA) conducted without rotation, it was determined that no single factor accounted for a variance greater than the threshold (i.e., 50%). Consequently, common method variance was not considered a problem in the dataset, ensuring the reliability of the findings.
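As a rough illustration of Harman's single-factor check, the sketch below computes the share of total variance carried by the first unrotated factor. It assumes principal-component extraction, so that share is the largest eigenvalue of the item correlation matrix divided by the number of items. The responses are hypothetical, and the helper names are not taken from the study.

```python
from math import sqrt
from statistics import mean, pstdev

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long item columns."""
    n = len(columns[0])
    z = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        z.append([(x - m) / s for x in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def largest_eigenvalue(matrix, iterations=500):
    """Power iteration for a symmetric positive semi-definite matrix."""
    k = len(matrix)
    v = [1.0 / sqrt(k)] * k
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        eigenvalue = sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    return eigenvalue

# Hypothetical answers of six respondents to four Likert items (not study data).
items = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 4],
    [2, 3, 4, 2, 5, 3],
    [3, 3, 3, 4, 2, 4],
]

corr = correlation_matrix(items)
share = largest_eigenvalue(corr) / len(items)  # variance share of first factor
print(f"first factor explains {share:.1%} of total variance")
print("exceeds the 50% threshold:", share > 0.5)
```

The verdict printed for these made-up numbers says nothing about the study's dataset; only the 50% threshold is taken from the text.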

Figure 1 illustrates the framework of the model established for regression and moderating analyses that reveal the interactions between behavioral biases, investment decisions and COVID-19 pandemic information sharing.

Figure 1. COVID-19 pandemic information sharing.

Measures for behavioral biases

A closed-ended questionnaire based on five-point Likert scales (1 = “strongly disagree” to 5 = “strongly agree”) was prepared to operationalize the behavioral biases of investors. The first predictor is investor sentiment, which refers to investors’ beliefs and perspectives about future cash flows or the outlook for specific assets. It is a crucial behavioral factor that often drives market movements, especially during a pandemic. We used a modified 5-item scale from Metawa et al. (2018) and Baker and Wurgler (2006). The second behavioral factor is overconfidence, the tendency of decision-makers to unwittingly give excessive weight to their own knowledge and to the correctness of the information they possess while ignoring public information (Lichtenstein and Fischhoff 1977; Metawa et al. 2018). This construct was measured with the 3-item scale developed by Dittrich et al. (2005). In line with De Bondt and Thaler (1985) and Metawa et al. (2018), we adopted a 4-item scale to measure over/under reaction. This bias implies that investors systematically overreact to unexpected news, which violates market efficiency: investors attach great importance to past performance and ignore the tendency of that performance to revert to its average (Boubaker et al. 2014). Last, herding behavior describes a setting in which investment managers imitate the strategies of others despite holding exclusive information. Such managers prefer to decide in line with their reference group to avoid the risk of reputational damage (Scharfstein and Stein 1990). Accordingly, a modified scale adapted from Bikhchandani and Sharma (2000) and Metawa et al. (2018) was used to examine the herd behavior of investors.

Measures for COVID-19 pandemic information sharing

To assess the moderating effect of COVID-19 pandemic information sharing, it was examined in terms of uncertainty, fear, and perceived risk associated with the virus (Kiruba and Vasantha 2021 ). Previous studies indicate that COVID-19 news and developments have markedly affected the behavioral biases of investors (Jiang et al. 2022 ; Nguyen et al. 2023 ). To this end, an initial scale was developed to measure the moderating effect of COVID-19 pandemic information sharing. The primary reason for creating a new scale was that existing scales lacked clarity and were not specifically designed to assess how anchoring behavioral biases affect investment decisions. Subsequently, a self-developed scale was refined with input from a panel of experts, including two academicians specializing in neuro or behavioral finance and two investors with expertise in the capital market, to ensure the scale’s face and content validity regarding COVID-19 pandemic information sharing. They reviewed the scale in terms of format, content, and wording. Based on their comprehensive review, minor modifications were made, particularly aligning the scale with pandemic news and developments to accurately measure the impact of the COVID-19 health crisis on investors’ behavioral biases. Ultimately, a four-item scale, employing a five-point Likert scale (1= “strongly disagree” to 5= “strongly agree”), focusing on COVID-19 related aspects (e.g., infection rates, lockdowns, vaccine development, and government stimulus packages) was utilized to operationalize the construct of COVID-19 pandemic information sharing (Bin-Nashwan and Muneeza 2023 ; Li and Cao 2021 ).

I believe that increasing information about rate of COVID-19 infections influenced my investment decisions.

I believe that increasing information about COVID-19 lockdowns influenced my investment decisions.

I believe that increasing information about COVID-19 vaccinations development influenced my investment decisions, and

I believe that increasing information about government stimulus packages influenced my investment decisions.
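One common way to turn the four items above into a single construct score is to average them per respondent. The sketch below assumes that mean-based scoring, which the text does not spell out. The respondent data and the helper name `composite_score` are hypothetical.

```python
from statistics import mean

def composite_score(item_scores, low=1, high=5):
    """Average a respondent's item answers after checking the Likert range."""
    if any(not low <= s <= high for s in item_scores):
        raise ValueError("answers must lie on the 1-5 Likert scale")
    return mean(item_scores)

# Hypothetical answers to the four information-sharing items, coded
# 1 = "strongly disagree" ... 5 = "strongly agree" (not study data).
responses = {
    "respondent_1": [4, 5, 4, 3],
    "respondent_2": [2, 2, 3, 1],
    "respondent_3": [5, 4, 4, 5],
}

for name, answers in responses.items():
    print(name, composite_score(answers))
```

Averaging keeps the composite on the same 1 to 5 scale as the items, which makes construct means directly comparable across scales with different item counts.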

Measures for investment decisions

Investment decisions were measured with a modified scale adapted from Metawa et al. (2018), scored on a five-point Likert scale (1 = “strongly disagree” to 5 = “strongly agree”).

Hypotheses of study

The hypotheses of the study for the regression and moderating analyses are listed in Table 2.

The hypotheses outlined above were tested using regression analyses and moderating analyses. To reveal the clustering tendencies of investors exhibiting similar behaviors, cognitive biases, and sociodemographic variables, the feature importance values were investigated using K-means clustering analyses. Furthermore, findings and recommendations were provided to policymakers using agent-based models to develop policy suggestions within the scope of these hypotheses, offering insights for academic purposes.
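A moderating analysis of this kind is commonly specified as a regression with an interaction term: decision = b0 + b1·bias + b2·information sharing + b3·(bias × information sharing). The sketch below assumes that specification with mean-centered predictors, which is an assumption rather than the study's documented procedure. The outcome is synthetic, built from known coefficients so that their recovery can be checked, and all names are illustrative.

```python
from statistics import mean

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def moderated_regression(bias, moderator, outcome):
    """OLS fit of outcome on centered bias, centered moderator, and their product."""
    mb, mm = mean(bias), mean(moderator)
    rows = [[1.0, b - mb, m - mm, (b - mb) * (m - mm)]
            for b, m in zip(bias, moderator)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    return solve(xtx, xty)  # [intercept, bias, moderator, interaction]

# Hypothetical predictor scores; the outcome is synthetic and noise-free.
bias = [1, 2, 3, 4, 5, 2, 4, 3]
info_sharing = [2, 4, 1, 5, 3, 5, 1, 3]
true_b = [3.0, 0.5, 0.3, -0.2]
mb, mm = mean(bias), mean(info_sharing)
decision = [true_b[0] + true_b[1] * (b - mb) + true_b[2] * (m - mm)
            + true_b[3] * (b - mb) * (m - mm)
            for b, m in zip(bias, info_sharing)]

coefficients = moderated_regression(bias, info_sharing, decision)
print([round(c, 4) for c in coefficients])
```

In this specification a non-zero interaction coefficient (the last entry) is what indicates moderation: the effect of the bias on the decision changes with the level of information sharing.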

Demographic profile of respondents

Table 3 provides a brief demographic profile of respondents.

Based on the percentages presented in Table 3, the sample has a specific demographic profile. Most participants were 20–30 years old (61.0%) with a higher educational background, particularly a master’s degree (67.3%). They were mostly salaried individuals (56.5%), male (61.0%), and identified as seasonal investors (63.7%). The most common investment objective was growth and income (37.2%), while wealth creation (41.3%) was the primary purpose for investing. Medium-term investment periods (43.5%) were preferred more often than long-term periods (28.3%), and high returns (38.6%) were the primary factor considered before investing. Investment advice came primarily from family and friends (44.8%) and social media (29.6%). Overall, the sample consisted of younger, male, salaried individuals with higher education levels who rely on personal networks and social media for investment advice. Their investment objectives center on wealth creation through growth and income, with medium-term horizons preferred over long-term ones.

Analysis and results

Descriptive summary.

Table 4 outlines the measures used to evaluate the constructs of the study, detailing the number of items for each construct, mean values, standard deviations, zero-order bivariate correlations among the variables, and Cronbach’s Alpha values. The evaluation encompasses a total of 29 items spread across six constructs: investor sentiments (5 items), overconfidence (3 items), over/under reaction (4 items), herding theory (3 items), investment decision (10 items), and COVID-19 information impact (4 items). The mean scores for these items fall between 3.535 and 3.779, with standard deviations ranging from 0.877 to 0.965.

Parallel coordinates visualization (see Figs. 2–5) depicts high-dimensional data on a two-dimensional plane and is particularly useful for datasets with many features or attributes. Each feature is represented by a vertical axis, and each individual data point is drawn as a line that connects its values across the axes. This makes it easier to identify patterns, detect clusters or outliers, and discover correlations among the features, which supports the analysis of complex datasets.
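The geometry behind a parallel coordinates plot can be sketched without a plotting library: each feature is rescaled to a common range, and each record becomes a sequence of (axis, value) vertices. The records, feature names, and the helper `polylines` below are hypothetical, and min-max scaling is one of several reasonable choices.

```python
def polylines(records, features):
    """Min-max scale each feature to [0, 1] and return one polyline per record.

    Each polyline is a list of (axis_index, scaled_value) vertices.
    """
    lo = {f: min(r[f] for r in records) for f in features}
    hi = {f: max(r[f] for r in records) for f in features}
    lines = []
    for r in records:
        line = []
        for axis, f in enumerate(features):
            span = hi[f] - lo[f]
            # A constant feature has no spread, so place it mid-axis.
            y = (r[f] - lo[f]) / span if span else 0.5
            line.append((axis, y))
        lines.append(line)
    return lines

# Hypothetical respondents (not study data).
records = [
    {"age": 25, "sentiment": 4, "herding": 2},
    {"age": 40, "sentiment": 2, "herding": 5},
    {"age": 31, "sentiment": 3, "herding": 5},
]
features = ["age", "sentiment", "herding"]

lines = polylines(records, features)
for line in lines:
    print(line)
```

A plotting library would then draw each list of vertices as one connected line, with records that share a response pattern tracing similar paths.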

Figure 2. Strongly disagree (CIS1) choice parallel coordinates.

Figure 3. Disagree (CIS2) choice parallel coordinates.

Figure 4. Agree (CIS3) choice parallel coordinates.

Figure 5. Strongly agree (CIS4) choice parallel coordinates.

The analysis of responses to the COVID-19 information sharing questions reveals a significant correlation with the second and fourth-level responses concerning cognitive biases, including investor sentiment, overconfidence, over/under reaction, and herding behavior. This observation leads to two key insights. Firstly, participants demonstrate an ability to perceive, respond to, and comprehend the nuances of their investment decisions as related to investor sentiment, overconfidence, over/under reaction, and herding behavior. Consequently, they show a propensity to make clear decisions, indicating agreement or disagreement in their responses. Secondly, it is noted that individuals who acknowledge being significantly influenced by COVID-19 news tend to adopt more balanced investment strategies concerning these cognitive biases. Additionally, younger individuals, particularly those self-employed or not professionally investing, who show a preference for long-term value investments, are more inclined to exhibit these tendencies.

The Pearson correlation coefficient (r) was calculated to investigate the direction and strength of the relationships between variables. The correlation analysis reveals that all the constructs are positively correlated.
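For reference, the Pearson coefficient divides the co-variation of two variables by the product of their individual variations. A minimal sketch with hypothetical construct scores follows; the function name and the data are illustrative only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical mean scores of five respondents (not study data).
sentiment = [3.2, 4.0, 2.8, 3.6, 4.4]
decision = [3.0, 3.8, 3.1, 3.5, 4.2]

r = pearson_r(sentiment, decision)
print(f"r = {r:.3f}")
```

The coefficient ranges from -1 to 1, so "positively correlated" in the text corresponds to r > 0 for every pair of constructs.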

To investigate the interconnections among variables in the dataset, correlations were computed and illustrated through a network graph. The correlation matrix’s values served as the basis for edge weights in the graph, with more robust correlations depicted by thicker lines (see Fig. 6a ). Each variable received a unique color, and connections showcasing higher correlations utilized a distinct color scheme to enhance visual clarity. This method offers a graphical depiction of the intricate relationships among various variables, facilitating the discovery of patterns and insights that might remain obscured within a conventional correlation matrix.
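The step from a correlation matrix to a weighted network can be sketched as an edge list in which each pair of variables becomes an edge whose weight is the absolute correlation. The matrix values, the variable names, and the 0.3 display threshold below are assumptions for illustration and are not taken from the study.

```python
def correlation_edges(names, corr, threshold=0.3):
    """Return (variable_a, variable_b, weight) edges sorted by weight.

    Only pairs with |r| at or above the threshold are kept; the weight can
    be mapped to line thickness when the graph is drawn.
    """
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = corr[i][j]
            if abs(r) >= threshold:
                edges.append((names[i], names[j], round(abs(r), 3)))
    return sorted(edges, key=lambda e: e[2], reverse=True)

# Hypothetical correlation matrix (not study data).
names = ["sentiment", "overconfidence", "herding", "decision"]
corr = [
    [1.00, 0.45, 0.20, 0.60],
    [0.45, 1.00, 0.35, 0.50],
    [0.20, 0.35, 1.00, 0.25],
    [0.60, 0.50, 0.25, 1.00],
]

edges = correlation_edges(names, corr)
for a, b, w in edges:
    print(f"{a} -- {b}: weight {w}")
```

Only the upper triangle is scanned because the matrix is symmetric, so each pair of variables yields at most one edge.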

Figure 6. a Correlation diagraphs and matrix. b Correlation diagraphs and matrix.

The correlation analysis revealed a pronounced relationship between cognitive biases (such as investor sentiments, overconfidence, herd behavior, and investment decisions), COVID-19 information sharing, and socio-demographic factors (including age group, occupation, gender, educational qualifications, type of investor, investment objectives, investment purposes, preferred investment duration, factors considered prior to investing, and sources of investment advice). A correlation matrix graph was constructed to further elucidate these correlations, assigning different colors to each variable for visual differentiation (see Fig. 6b ). The thickness of the lines in the graph correlates with the strength of the relationships, indicating variables with high correlation more prominently.
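The mapping from a correlation matrix to weighted graph edges can be sketched as follows. This is only an assumed reconstruction of the idea: the variable abbreviations, the matrix values, the 0.4 display threshold, and the `correlation_edges` helper are all hypothetical, and the drawing step itself is omitted.

```python
import numpy as np

# Hypothetical variable names and correlation matrix (illustrative values only).
names = ["IS", "OC", "OUR", "HB", "ID"]
corr = np.array([
    [1.00, 0.62, 0.55, 0.31, 0.70],
    [0.62, 1.00, 0.48, 0.27, 0.66],
    [0.55, 0.48, 1.00, 0.35, 0.59],
    [0.31, 0.27, 0.35, 1.00, 0.42],
    [0.70, 0.66, 0.59, 0.42, 1.00],
])

def correlation_edges(corr, names, threshold=0.4, max_width=6.0):
    """Return (source, target, r, line_width) for each pair with |r| >= threshold."""
    edges = []
    n = len(names)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, so each pair appears once
            strength = abs(corr[i, j])
            if strength >= threshold:
                # Line width grows linearly with the absolute correlation.
                edges.append((names[i], names[j], float(corr[i, j]), max_width * strength))
    # Strongest relationships first.
    return sorted(edges, key=lambda e: abs(e[2]), reverse=True)

edges = correlation_edges(corr, names)
for src, dst, r, width in edges:
    print(f"{src}-{dst}: r={r:.2f}, width={width:.1f}")
```

The resulting edge list can be passed to any graph-drawing library, with the fourth element used as the line thickness.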

These findings underscore the interconnected nature of the study variables, demonstrating that cognitive biases and socio-demographic factors exert a considerable impact on investment decisions. This analytical approach highlights the complexity of investor behavior and underscores the multifaceted influences on investment choices, providing valuable insights for understanding how various factors interact within the investment decision-making process.

Reliability test

For the reliability test, Cronbach’s alpha values were examined to check the internal consistency of the measures. The internal consistency of an instrument indicates whether a metric or an indicator measures what it is intended to measure (Creswell 2009). A Cronbach’s alpha greater than 0.7 indicates that the items or questions for the respective variable are highly correlated and reliable. The calculated Cronbach’s alpha values for investor sentiments (alpha = 0.888), overconfidence (alpha = 0.827), over/under reaction (alpha = 0.858), herding behavior (alpha = 0.741), investment decision (alpha = 0.933), and COVID-19 (alpha = 0.782) indicate that all of the constructs are reliable.
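For readers who wish to reproduce this kind of check, the following sketch computes Cronbach’s alpha from an item-response matrix using the standard formula. The `responses` array is invented for illustration and the function name `cronbach_alpha` is an assumption, not part of the study’s code.

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = respondents, columns = items of one construct."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return float(k / (k - 1) * (1.0 - item_variances.sum() / total_variance))

# Hypothetical responses of six participants to a four-item scale (not study data).
responses = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
    [4, 4, 5, 4],
])
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

An alpha above the 0.7 threshold mentioned in the text would be read as acceptable internal consistency for the construct.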

Validity test

Validity refers to the extent to which an instrument accurately measures or performs what it is designed to measure (Kothari 2004 ). To ensure the validity of the questionnaire and its constructs, the researcher engaged in a comprehensive literature review, sought the advice of consultants, and incorporated feedback from other professionals in the field. Additionally, the concepts of convergent validity and discriminant validity were evaluated to further assess the instrument’s validity.

Convergent validity assesses the extent to which items that are theoretically related to a single construct are, in fact, related in practice (Wang et al. 2017 ). To determine convergent validity, factor loading, Average Variance Extracted (AVE), and Composite Reliability (CR) were calculated. According to Hair et al. ( 1998 ), factor loading values should exceed 0.60, composite reliability should be 0.70 or higher, and AVE should surpass 0.50 to confirm adequate convergent validity.

Table 5 demonstrates that all constructs utilized in this study surpass these threshold values, indicating strong convergent validity. This suggests that the items within each construct are consistently measuring the same underlying structure, reinforcing the validity of the questionnaire’s design and the constructs it aims to measure.
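The thresholds above can be checked with a few lines of code. The sketch below uses the commonly applied formulas for AVE (mean of squared standardized loadings) and CR; the text does not state which formulas were used, so this is an assumption, and the loadings are hypothetical.

```python
import numpy as np

def ave_and_cr(loadings):
    """Return (AVE, CR) for the standardized factor loadings of one construct."""
    lam = np.asarray(loadings, dtype=float)
    ave = float(np.mean(lam ** 2))
    error_variance = 1.0 - lam ** 2  # assumes standardized indicators
    cr = float(lam.sum() ** 2 / (lam.sum() ** 2 + error_variance.sum()))
    return ave, cr

loadings = [0.72, 0.81, 0.78, 0.69]  # hypothetical loadings, not taken from Table 5
ave, cr = ave_and_cr(loadings)

# Thresholds quoted in the text: loadings > 0.60, CR >= 0.70, AVE > 0.50.
meets_thresholds = min(loadings) > 0.60 and cr >= 0.70 and ave > 0.50
print(f"AVE = {ave:.3f}, CR = {cr:.3f}, adequate = {meets_thresholds}")
```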

Discriminant validity measures the degree to which the concepts are distinct from each other (Bagozzi et al. 1991). Discriminant validity exists if the alpha value of a construct is greater than the average correlation of that construct with the other variables in the model (Ghiselli et al. 1981).

Hypotheses testing

To examine the conditional moderating effect of COVID-19 on the influence of behavioral factors (investor sentiments, overconfidence, over/under reaction, and herding behavior) on investment decision-making, moderation analysis was conducted using the Process Macro (Model 1) for SPSS, as developed by Hayes, with bootstrapping samples at 95% confidence intervals. According to Hayes ( 2018 ), the analysis first explores the direct impact of the behavioral factors on investment decisions. Subsequently, it assesses the indirect influence exerted by the moderating variable (COVID-19). This two-step approach allows for a comprehensive understanding of how COVID-19 modifies the relationship between investors’ behavioral biases and their decision-making processes, shedding light on the extent to which the pandemic acts as a moderating factor in these dynamics.
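The logic of a simple moderation analysis with a bootstrapped confidence interval can be sketched as below. This is not the PROCESS macro itself but a NumPy illustration of the same kind of regression, run on simulated data; the variable names, the simulated coefficients, and the number of bootstrap resamples are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=n)  # hypothetical behavioral factor (mean-centered)
w = rng.normal(size=n)  # hypothetical moderator score (mean-centered)
# Simulated outcome with a known negative interaction of -0.3.
y = 0.5 + 0.9 * x + 0.2 * w - 0.3 * x * w + rng.normal(scale=0.5, size=n)

def fit_moderation(x, w, y):
    """OLS for y = b0 + b1*x + b2*w + b3*(x*w); returns [b0, b1, b2, b3]."""
    design = np.column_stack([np.ones_like(x), x, w, x * w])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

coef = fit_moderation(x, w, y)

# Percentile bootstrap for the interaction coefficient b3.
boot = np.empty(2000)
for b in range(boot.size):
    idx = rng.integers(0, n, size=n)  # resample respondents with replacement
    boot[b] = fit_moderation(x[idx], w[idx], y[idx])[3]
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

print(f"b3 = {coef[3]:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

If the 95% interval for the interaction coefficient excludes zero, the moderation effect would be considered statistically significant; a negative coefficient corresponds to the moderator weakening the effect of the predictor.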

For this study, the terms of the mathematical model used to test the moderating role of COVID-19 pandemic information sharing are defined as follows:

Y = Investment decisions (dependent variable)

β₀ = Intercept

X₁ = Investor sentiments (independent variable)

X₂ = Overconfidence (independent variable)

X₃ = Over/under reaction (independent variable)

X₄ = Herding behavior (independent variable)

β₁X₁ = Coefficient term for investor sentiments

β₂X₂ = Coefficient term for overconfidence

β₃X₃ = Coefficient term for over/under reaction

β₄X₄ = Coefficient term for herding behavior

(X₁ × COVID-19) = Investor sentiments and moderation effect of COVID-19 information

(X₂ × COVID-19) = Overconfidence and moderation effect of COVID-19 information

(X₃ × COVID-19) = Over/under reaction and moderation effect of COVID-19 information

(X₄ × COVID-19) = Herding behavior and moderation effect of COVID-19 information

μ = Residual term.
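The terms defined above suggest a regression of the following general form. This is a reconstruction under stated assumptions rather than the authors’ exact equation: the interaction coefficients \gamma_1, ..., \gamma_4 are symbols introduced here for illustration, and the combined single-equation form is inferred from the listed terms.

```latex
Y = \beta_0 + \sum_{i=1}^{4} \beta_i X_i
    + \sum_{i=1}^{4} \gamma_i \,\bigl(X_i \times \text{COVID-19}\bigr) + \mu
```

Under this form, a negative \gamma_i would correspond to COVID-19 information sharing weakening the effect of X_i on Y. A standard moderation regression would normally also include a main-effect term for the moderator, which the definitions above do not list.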

Direct effect

In Table 6 , the direct effect of the independent variables on the dependent variable demonstrates that the behavioral factors (investor sentiments, overconfidence, over/under reaction, and herding behavior) significantly influence investment decision (ID) with beta values of 0.961, 0.867, 0.884, and 0.698, respectively. The confidence interval (CI) values presented in Table 6 confirm these relationships are statistically significant. The positive and significant outcomes underline that behavioral factors critically impact investors’ decision-making attitudes. Consequently, Hypotheses 1, 2, 3, and 4 (H1, H2, H3, and H4) are accepted, affirming the substantial role of investor sentiments, overconfidence, over/under reaction, and herding behavior in shaping investment decisions.

Indirect moderating effect

In the context of the COVID-19 pandemic and its associated risks, the impact of behavioral factors (investor sentiments, overconfidence, over/under reaction, and herding behavior) on investment decisions tends to diminish. The findings presented in Table 6 and illustrated in Fig. 7 indicate that COVID-19 information sharing significantly and negatively moderates the relationship between these factors and investment decisions, leading to the acceptance of Hypotheses 5, 6, 7, and 8 (H5, H6, H7, and H8). The negative beta values underscore that the presence of COVID-19 adversely influences investors’ behavior, steering them away from rational investment decisions. This demonstrates that the pandemic context acts as a moderating factor, altering how behavioral biases impact investment choices, ultimately guiding investors towards more cautious or altered decision-making processes.

figure 7

Moderating effect of Covid-19 pandemic information sharing.

K-means clustering analysis

K-means clustering analysis is utilized to uncover natural groupings within datasets by analyzing similarities between observations. This technique is especially beneficial for managing large and complex datasets as it reveals patterns and relationships among variables that may not be immediately evident. In this study, K-means clustering helps identify natural groupings based on socio-demographic factors, cognitive biases regarding investment decisions, and COVID-19 pandemic information sharing, thereby offering insights into the data’s underlying structure and identifying potential patterns or relationships among key variables.

The cluster analysis aims to ascertain the feature importance value of groups with similar investor behaviors, which is crucial for determining agents’ investment functions in subsequent agent-based modeling. Selecting the appropriate number of clusters in the K-means algorithm is essential, yet challenging, as different numbers of clusters can yield varying results (Li and Wu 2012 ).

Two prevalent methods for determining the optimal number of clusters are:

Elbow Method: This approach involves running the K-means algorithm with varying cluster numbers and calculating the total sum of squared errors (SSE) for each. SSE represents the squared distances of each data point from its cluster’s centroid. Plotting the SSE values against the number of clusters reveals a point known as the “elbow,” where the rate of SSE decrease markedly slows, indicating the optimal cluster number (Syakur et al. 2018 ).

Silhouette Analysis: This method measures how similar an object is to its own cluster compared with other clusters. The silhouette score ranges from −1 to 1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters.

The sklearn library provides tools for implementing the elbow method and silhouette analysis. For example, the elbow method can be applied by varying the number of clusters from 1 to 10 and calculating the SSE for each value. The optimal number of clusters is then identified by selecting a value near the elbow point on the resulting plot.
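A minimal sketch of both procedures is given below. It runs on synthetic data generated with `make_blobs` as a stand-in for the survey features, so the data, the cluster centers, and the random seeds are assumptions; only the range of 1 to 10 clusters follows the description in the text.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the survey features: three well-separated groups.
X, _ = make_blobs(n_samples=300, centers=[[0, 0], [8, 8], [-8, 8]],
                  cluster_std=1.0, random_state=0)

sse = {}         # elbow method: total within-cluster sum of squared errors per k
silhouette = {}  # silhouette analysis: mean silhouette score per k
for k in range(1, 11):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    sse[k] = model.inertia_
    if k >= 2:  # the silhouette score is undefined for a single cluster
        silhouette[k] = silhouette_score(X, model.labels_)

best_k = max(silhouette, key=silhouette.get)
for k in sse:
    print(k, round(sse[k], 1), round(silhouette.get(k, float("nan")), 3))
print("k with the highest silhouette score:", best_k)
```

Plotting `sse` against k gives the elbow curve; in practice the elbow location and the silhouette maximum are compared before settling on a cluster number.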

After the number of clusters is chosen, the analysis progresses by using the fit() method of sklearn’s KMeans class to cluster the data, determine each cluster’s center coordinates, and assign each data point to a cluster. Feature importance values can be calculated using sklearn’s ExtraTreesClassifier class, and these values can be visualized through a line graph.
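The two steps described above can be sketched as follows, assuming the cluster labels produced by K-means are used as the target for the tree ensemble. The synthetic features, their names, and the hyperparameters are hypothetical and chosen only to make the example self-contained.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import ExtraTreesClassifier

rng = np.random.default_rng(0)
feature_names = ["informative_a", "informative_b", "noise"]  # hypothetical names

# Two features separate three groups; the third feature is pure noise.
centers = np.array([[0.0, 0.0], [6.0, 6.0], [-6.0, 6.0]])
informative = np.vstack([c + rng.normal(size=(100, 2)) for c in centers])
noise = rng.normal(size=(300, 1))
X = np.hstack([informative, noise])

# Step 1: cluster the data, then read off the centers and assignments.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_
centroids = kmeans.cluster_centers_

# Step 2: fit a tree ensemble that predicts cluster membership and
# read the importance of each feature for separating the clusters.
forest = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X, labels)
importance = dict(zip(feature_names, forest.feature_importances_))

for name, value in importance.items():
    print(f"{name}: {value:.3f}")
```

Features with higher importance contribute more to distinguishing the clusters, which is the sense in which the text speaks of variables exerting influence on the clustering.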

Finally, to illustrate the clusters’ membership to the CIS1, CIS2, CIS3, and CIS4 inputs as a color scale bar, the seaborn library is used (see Fig. 8 (top) and Fig. 8 (bottom)). This involves calculating the average membership values for each cluster and visualizing these averages, providing a clear depiction of how each cluster associates with the different inputs, enriching the analysis of investor behaviors and their responses to COVID-19 information sharing.

figure 8

Elbow method sum of squared error class determination (top) and clustering analysis results (bottom).

After employing a network diagram constructed from a correlation matrix to elucidate the interrelationships among variables, and utilizing the Elbow method to ascertain the optimal number of clusters, the K-means clustering algorithm was applied (see Fig. 9 ). This approach successfully identified three distinct clusters, highlighting the variables that exerted a significant influence on these clusters. Notably, the COVID-19 pandemic information sharing variable, along with its corresponding CIS1, CIS2, CIS3, and CIS4 values, emerged as significant factors. The analysis indicated that overconfidence and overreaction were the predominant factors in crucial clustering, alongside cognitive biases and investment strategies that lead to similar behaviors among investors and varying levels of impact from COVID-19.

figure 9

Cluster analysis feature importance value results.

Furthermore, sociodemographic factors such as age, occupation, and investor type were also identified as influential determinants. Leveraging these insights, policymakers and researchers can develop an agent-based model that incorporates herd behavior, along with age and income levels categorized by occupation, to effectively simulate market dynamics. This approach facilitates a comprehensive understanding of how different factors, particularly those related to the COVID-19 pandemic, influence investor behavior and market movements, thereby enabling the formulation of more informed strategies and policies.

An ingenious agent-based simulation for herding behavior

The findings of behavioral economics and finance research may be easy for policymakers to interpret yet difficult to implement in practice. For this reason, an agent-based model has been created specifically for policymakers (see Appendix 1 for the pseudocode; the Python code is available on request). In a model consisting of 223 agents who trade a single stock, investor prototypes have been created based on the analysis presented here. Socio-demographic characteristics that are relatively easy to access or predict, such as age group and income status, have been incorporated into the herd behavior function, which governs the decision to follow the group or to decide independently. Younger and lower-income agents were allowed to exhibit a greater tendency to follow the group, while 50 successful transactions were monitored to determine during which trends of stock price increase or decrease the balance of the most successful agent rose or fell (Gervais and Odean 2001).

In addressing the influence of age and income status on herding behavior, it is imperative to underscore the nuanced interplay between various socio-economic and psychological factors within our agent-based model framework. The model’s robustness stems from its capacity to simulate a range of investor behaviors by integrating key determinants such as investor sentiment, overconfidence, reaction to market events, and socio-demographic characteristics. Herein we expound on the contributory elements:

Investor Sentiment (IS1–IS5)

The model encapsulates the variability of investor sentiment, which oscillates with age and income, influencing individuals’ financial perspectives and risk propensities. Younger investors’ sentiment may tilt towards optimism driven by a more extensive investment horizon, while lower-income investors’ sentiment could lean towards caution, primarily driven by the pressing requirement for financial security (Baker and Wurgler 2007).

Overconfidence (OF1–OF5)

The tendency towards overconfidence is dynamically modeled, particularly among younger investors who may overrate their market acumen and predictive capabilities. This overconfidence may also manifest among lower-income investors as a psychological compensatory mechanism for resource inadequacy (Malmendier and Tate 2005 ).

Over/Under Reaction (OUR1–OUR5)

The model accounts for the influence of age and income on the velocity and extent of response to market stimuli. Inexperienced or financially restricted investors may be prone to overreactions due to a lack of market exposure or intensified economic strain (Daniel et al. 1998 ).

Herding Behavior (HB1–HB4)

Within the simulated environment, herding is more pronounced among younger investors, possibly due to peer influence, and among lower-income investors who may seek safety in conformity (Bikhchandani et al. 1992 ).

Investment Decision (ID1–ID10)

The model intricately reflects the complexities of investment decisions influenced by age-specific factors such as projected earnings and lifecycle influences. Investors with limited income may exhibit a predilection for security, swaying their investment choices (Yao and Curl 2011 ).

COVID-19 Information Sharing (CIS1–CIS4)

The pandemic era’s nuances are integrated into the model, acknowledging that younger investors could be more susceptible to digitally disseminated information, which, in turn, impacts their investment decisions. The credibility and source of information are also calibrated based on income levels (Shiller 2020 ).

Socio-demographic factors

Age: The model simulates younger investors’ reliance on the conduct of others, utilizing it as a heuristic substitute for experience (Dobni and Racine 2016 ).

Occupation: It captures how occupational background can broaden or restrict access to information and influence herding tendencies (Hong et al. 2000 ).

Gender: Gender disparities are incorporated, reflecting on investment styles where men may be more disposed to herding due to overconfidence (Barber and Odean 2001 ).

Qualification (Qualif.): The model acknowledges that higher education and financial literacy levels can curtail herding by fostering self-reliant decision-making (Lusardi and Mitchell 2007 ).

Investor Type (InvTyp): It differentiates between retail and institutional investors, noting that limited resources might push retail investors towards herding (Nofsinger and Sias 1999 ).

Investment Objective (InvObj): The model recognizes that short-term objectives might amplify herding as investors chase swift gains (Odean 1998 ).

Purpose: It contemplates the conservative herding behavior that is aligned with goals like retirement savings (Yao and Curl 2011 ).

Investment Horizon (Horizon): A lengthier investment horizon is modeled to potentially dampen herding tendencies (Kaustia and Knüpfer 2008 ).

Factors Considered Before Investing (factors): The model simulates a range of investment considerations, including risk tolerance and expected returns, which influence herding propensities (Shefrin and Statman 2000 ).

Source of Investment Advice (source): The influence of advice sources, such as analysts or financial media, on herding is also captured within the model (Tetlock 2007 ).

In conclusion, the agent-based model we present is meticulously designed to reflect the intricate fabric of financial market behavior. It is particularly attuned to the multi-layered aspects that drive herding, informed by empirical evidence and theoretical underpinnings that rigorously define the interrelations between investor demographics and market behavior. The aforementioned socio-economic and psychological facets provide a comprehensive backdrop against which the validity and consistency of the model are substantiated.

The simulation code was written in the Python programming language with the Mesa, Pandas, SciPy, NumPy, Random and Matplotlib libraries. The code simulates the herd behavior of stock traders in a simple market (Hunt and Thomas 2010; McKinney 2010; Harris et al. 2020; Virtanen et al. 2020; Van Rossum 2020; Hunter 2007). The simulation runs for 50 time steps, with the stock price and balance of each agent printed at each step. The decision-making process of agents in the simulation is stochastic, with agents randomly choosing to buy, sell, or follow the market trend based on their characteristics and decision-making strategy.

The Stock Trader class in the model symbolizes individual agents, each characterized by a unique ID, balance, and a stock price. These agents are equipped with a method to compute the current stock price. The step() function within each agent embodies their decision-making process, which is influenced by their current balance and the prevailing stock price. Agents have the option to buy, sell, or align with the market trend, reflecting various investment strategies.

The Herding Model class encapsulates the entire simulation framework. It generates a population of Stock Trader agents and progresses the simulation over a designated number of time steps. Within this class, the agent_decision() method orchestrates each agent’s decision-making, factoring in individual characteristics and strategies. The step() method, in turn, adjusts the stock price based on the aggregate current stock prices of all agents before executing the step() method for each agent, thereby simulating the dynamic nature of the stock market.

Socio-demographic factors, specifically age and income status, are integrated into the agent-based model simulations, drawing upon insights from Parallel Coordinates and Cluster Analysis as well as relevant literature. The simulation posits that agents of younger age and lower income are predisposed to mimicking the market trend, whereas other agents exhibit a propensity for independent decision-making. Given the stochastic nature of the decision-making process, the behavior of agents varies across different runs of the simulation, introducing an element of unpredictability.

At each time step, the simulation outputs the stock price and balance of each agent, offering a snapshot of the market dynamics at that moment. Figure 10 provides a flow diagram elucidating the operational framework of the model’s code, presenting a visual representation of how the simulation unfolds over time.

figure 10

Flowchart of agent-based model.

This model architecture allows for the exploration of how socio-demographic characteristics influence investment behaviors within a simulated market environment, offering valuable insights into the mechanisms driving market trends and individual investor decisions.
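The architecture described above can be approximated by the following simplified sketch. It is not the authors’ code and does not use Mesa: it relies only on Python’s standard library, and the class names `StockTrader` and `HerdingModel`, the age and income cut-offs, the baseline follow probability, the price-impact rule, and the starting values are all assumptions made for illustration. Only the 223 agents, the 50 time steps, and the buy/sell/follow choice come from the text.

```python
import random

class StockTrader:
    """One agent with an ID, cash, share holdings, age and income (illustrative)."""

    def __init__(self, unique_id, rng):
        self.unique_id = unique_id
        self.age = rng.randint(18, 70)              # assumed age range
        self.income = rng.uniform(10_000, 100_000)  # assumed income range
        self.cash = 100.0
        self.shares = 0
        self.balance = self.cash
        self.balance_history = []

    def step(self, action, price):
        """Execute 'buy' or 'sell' at the current price; return the signed trade."""
        trade = 0
        if action == "buy":          # assumed: agents may buy on credit
            trade = 1
        elif action == "sell" and self.shares > 0:  # assumed: no short selling
            trade = -1
        self.shares += trade
        self.cash -= trade * price
        self.balance = self.cash + self.shares * price
        self.balance_history.append(self.balance)
        return trade


class HerdingModel:
    """Creates the agents and advances the market one time step at a time."""

    def __init__(self, n_agents=223, young_follow_factor=0.8,
                 low_income_follow_factor=0.7, price_impact=0.05, seed=0):
        self.rng = random.Random(seed)
        self.young_follow_factor = young_follow_factor
        self.low_income_follow_factor = low_income_follow_factor
        self.price_impact = price_impact
        self.price = 20.0
        self.price_history = [self.price]
        self.last_trend = "buy"  # arbitrary initial market trend
        self.agents = [StockTrader(i, self.rng) for i in range(n_agents)]

    def follow_probability(self, agent):
        """Younger and lower-income agents are more likely to follow the trend."""
        p = 0.2  # assumed baseline propensity to follow
        if agent.age < 35:
            p = max(p, self.young_follow_factor)
        if agent.income < 30_000:
            p = max(p, self.low_income_follow_factor)
        return p

    def agent_decision(self, agent):
        if self.rng.random() < self.follow_probability(agent):
            return self.last_trend  # 'follow': copy the previous majority action
        return self.rng.choice(["buy", "sell"])  # independent decision

    def step(self):
        net_demand = 0
        for agent in self.agents:
            net_demand += agent.step(self.agent_decision(agent), self.price)
        self.last_trend = "buy" if net_demand >= 0 else "sell"
        # Assumed rule: the price moves in proportion to net demand, floored at 1.
        change = self.price_impact * net_demand / len(self.agents)
        self.price = max(1.0, self.price * (1 + change))
        self.price_history.append(self.price)


model = HerdingModel()
for _ in range(50):
    model.step()

best = max(model.agents, key=lambda a: a.balance)
print(f"final price {model.price:.2f}, best agent {best.unique_id}")
```

Because decisions are stochastic, changing `seed` produces a different run, which mirrors the run-to-run variability noted in the text.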

Within our agent-based model (ABM), “performance” embodies multiple dimensions reflective of the agents’ investment outcomes, influenced by socio-demographic factors and behavioral biases. The provided pseudo-code conceptualizes the implementation of these facets in the model.

Metrics used to quantify agent performance

Balance trajectory

This primary indicator tracks the evolution of each agent’s financial balance over time, reflecting the impact of their buy, sell, or market trend-following decisions (Arthur 1991 ).

Decision strategy efficacy

Evaluates the effectiveness of an agent’s decision-making strategy (‘buy’, ‘sell’, or ‘follow’), influenced by socio-demographic variables such as age and income, as delineated in the agent_decision method (Tesfatsion and Judd 2006 ).

Market trend alignment

Assesses the correlation between an agent’s balance trajectory and overall market trends, indicating successful performance if an agent’s balance increases with market prices (Shiller 2003 ).

Risk management

Infers risk management skill from the volatility of balance changes, with less volatility indicating stable and potentially successful investment strategies (Markowitz 1952 ).

Wealth accumulation

Agents are ranked by their final balance at the simulation’s end to identify the most financially successful outcomes (De Long et al. 1990 ).

Adaptive behavior

The model evaluates agents’ adaptability to market price changes, revealing their capacity to capitalize on market movements (Gode and Sunder 1993 ).

Herding influence

Considers how herding behavior impacts financial outcomes, especially for younger and lower-income agents as programmed in the Herding Model class (Bikhchandani et al. 1992 ).

These performance metrics are quantified through agents’ balance and stock price histories, updated at each simulation step. These histories offer a time series analysis of financial trajectories, enabling pattern identification such as herding tendencies or the effects of overconfidence.
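Several of the metrics listed above can be derived directly from the stored histories. The sketch below shows one possible operationalization: the function `performance_metrics`, the choice of the standard deviation of balance changes as the volatility measure, and the sample histories are assumptions rather than the study’s definitions.

```python
import numpy as np

def performance_metrics(balance_history, price_history):
    """Summarize one agent's balance series against the stock price series."""
    balance = np.asarray(balance_history, dtype=float)
    price = np.asarray(price_history, dtype=float)
    changes = np.diff(balance)
    return {
        "final_balance": float(balance[-1]),                           # wealth accumulation
        "volatility": float(changes.std(ddof=1)),                      # risk management proxy
        "trend_alignment": float(np.corrcoef(balance, price)[0, 1]),   # market trend alignment
    }

# Hypothetical price and balance histories for two agents.
price = [20.0, 21.0, 19.5, 22.0, 23.0]
histories = {
    "agent_a": [100, 104, 99, 108, 112],
    "agent_b": [100, 98, 101, 97, 96],
}

metrics = {name: performance_metrics(h, price) for name, h in histories.items()}
ranking = sorted(metrics, key=lambda name: metrics[name]["final_balance"], reverse=True)

for name in ranking:
    print(name, metrics[name])
```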

The model’s realism is enhanced by parameters like young_follow_factor and low_income_follow_factor, adjusting the propensity for herding among different socio-demographic groups. This inclusion allows the model to reflect real-world dynamics where age and income significantly impact investment performance.

In conclusion, our ABM presents a detailed framework for examining investment performance’s complex nature. It integrates behavioral economics and socio-demographic data, providing insights into investor behavior under simulated market conditions.

Characteristics of agents in the agent-based model

Demographics (age and income): Consistent with the focus of our study on socio-demographic factors, each agent is characterized by age and income parameters, which influence their investment behavior, particularly their propensity towards herding. Age and income are randomly assigned within realistic bounds reflecting the demographic distribution of typical investor populations.

Cognitive biases: Agents are imbued with behavioral attributes such as overconfidence, herding instinct, and over/under-reaction tendencies to market news, reflecting the psychological dimensions of real-world investors.

Investment strategy: Each agent follows a distinct investment strategy categorized broadly as ‘buy’, ‘sell’, or ‘follow’ (herding). The strategy is influenced by the agent’s demographic characteristics and cognitive biases.

Adaptability: Agents are capable of learning and adapting to market changes over time, simulating the dynamic and evolving nature of real-world investor behavior.

Social influence: Agents are influenced by other agents’ behaviors, especially under conditions conducive to herding, modeling the social dynamics of investment communities.

Wealth and portfolio: Agents have a variable representing their wealth, which fluctuates based on investment decisions and market performance. Their portfolio composition and changes therein are also tracked, offering insights into their risk-taking and diversification behaviors.

Significance of agent-based modeling

Agent-based modeling is a powerful tool that allows researchers to simulate and analyze complex systems composed of interacting agents. Its significance and utility in various fields, including economics and finance are profound:

Complexity and emergence: ABM can capture the emergent phenomena that arise from the interactions of many individual agents, providing insights into complex market dynamics that are not apparent at the individual level (Epstein and Axtell 1996 ).

Customizability and scalability: ABMs can be tailored to include various levels of detail and complexity, allowing for the simulation of systems ranging from small groups to entire markets (Tesfatsion and Judd 2006 ).

Experimental flexibility: ABMs facilitate virtual experiments that would be impractical or impossible in the real world, enabling researchers to explore hypothetical scenarios and policy implications (Gilbert and Troitzsch 2005 ).

Realism in behavioral representation: By incorporating cognitive biases and decision-making rules, ABMs can realistically represent human behavior, providing deeper behavioral insights than models assuming perfect rationality (Hommes 2006 ).

Policy analysis and forecasting: In economics and finance, ABMs are particularly useful for policy analysis, risk assessment, and forecasting, as they can incorporate a wide range of real-world factors and individual behaviors (LeBaron and Tesfatsion 2008 ).

By integrating these agent characteristics into our ABM and considering the broader implications of agent-based modeling, our study aims to provide nuanced insights into herding behavior among investors. We believe that our approach not only aligns with best practices in the field but also significantly contributes to the understanding of complex investment behaviors and market dynamics.

Panels a and b of Fig. 11 display the balance changes of agents with respect to stock prices, age, and income status. Coding balance increases and decreases as +1 and −1, respectively, and plotting them as a line graph alongside the changes in stock prices makes it possible to convey information about the agents’ performance. In panels a and b, agents aged above 37.5 fall, on average, into the higher income group. During transitions of the stock price below 12.75 units, between 17 and 20 units, and between 26 and 27.50 units, the agents’ responses to price changes are accompanied by noticeable transitions (increases and decreases) in their portfolio states, depending on age and income status.

figure 11

a Agents’ performance. b Agents’ responses.
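The +1/−1 coding of balance changes used for Fig. 11 can be illustrated in a few lines. The `balance` series is hypothetical, and coding an unchanged balance as 0 is an assumption, since the text does not say how ties are handled.

```python
import numpy as np

# Hypothetical balance series for one agent over five time steps.
balance = np.array([100.0, 104.0, 99.0, 99.0, 108.0])

# +1 for an increase, -1 for a decrease, 0 when the balance is unchanged.
direction = np.sign(np.diff(balance)).astype(int)
print(direction.tolist())
```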

In Fig. 12 , in the agent-based model’s 50 repeated simulations, at the 45th simulation, the stock price is 20.03 units, and the balance of agent number 74 reaches 911 units. The price-income-balance change graph for the agent throughout the 50 transactions is presented below.

figure 12

Balance change according to stock price for agent 74.

Upon examining the descriptive statistics of the balance for agent number 74, who diverges from the herding tendency profile of the model and is in the higher income group aged 40 and above, the highest balance value is 911 units, the lowest balance level is 732 units, the average is 799 units, and the standard deviation is 41 units. When the overall balance of the agents is investigated, it is observed that the average balance of the agents is around 84 units. Considering the existence of an agent with the lowest balance of −670 units, it can be concluded that agent number 74 has demonstrated a significantly superior performance.

Discussion and conclusion

The influence of behavioral biases on investors’ decision-making has yielded mixed findings in the literature. Wan (2018) observed a positive impact of behavioral biases, considered forward-looking factors, on investment decisions. Conversely, Zulfiqar et al. (2018) noted a markedly negative impact of overconfidence on investment decisions. Similarly, Aziz and Khan (2016) explored the role of heuristic factors (representativeness, anchoring, overconfidence, and availability biases) and found that they significantly influence investment decisions and performance. However, they reported that prospect factors (loss aversion, regret aversion, and mental accounting biases) had an insignificant impact on these outcomes.

These varied results may stem from a complex interplay of factors such as cultural differences, pandemic-related information, economic conditions, regulatory environments, historical context, and investors’ financial literacy levels, contributing to differences in how behavioral biases influence investment decisions across regions (Metawa et al. 2018 ).

This study contributes to the field of behavioral finance by revealing the moderating role of COVID-19 pandemic information sharing on the relationship between behavioral quirks and investment choices, specifically in the context of Pakistan. Key contributions include:

Investors’ sentiments

This study shows that COVID-19 pandemic information sharing significantly moderates the relationship between investors’ sentiments and their investment decisions, validating that pandemic-related information, such as infection rates and economic downturns, heavily influences investors’ sentiments and alters their risk perceptions (Anastasiou et al. 2022 ; Hsu and Tang 2022 ; Bin-Nashwan and Muneeza 2023 ; Gao et al. 2023 ; Sohail et al. 2020 ).

Overconfidence

It reveals how COVID-19 information reshapes overconfident investors’ risk perceptions, urging them to reassess their investment portfolios in light of the pandemic’s uncertainties and economic implications (Bouteska et al. 2023 ; Li and Cao 2021 ).

Over/under reaction

The study uncovers that the pandemic information moderates the relationship between over-under reaction and investment decisions, suggesting that investors adjust their reactions based on evolving pandemic information, leading to more informed and rational investment choices (Jiang et al. 2022 ).

Herd behavior

It finds that COVID-19 pandemic information significantly reduces herd behavior among investors, encouraging them to make rational decisions rather than blindly following the majority (Nguyen et al. 2023 ).

In conclusion, this study illustrates that the COVID-19 pandemic has significantly moderated the relationship between behavioral biases and investment decisions. Furthermore, clustering analyses and agent-based outcomes suggest that younger, less experienced agents exhibit a higher propensity for herding behavior and demonstrate lower performance in agent-based models. These findings pave the way for further research into the effects of additional cognitive biases and socio-demographic variables on investment decisions.

Implications

This study contributes to the field of behavioral finance by showing that COVID-19 pandemic information sharing significantly moderates the relationship between behavioral biases (e.g., investors’ sentiments, overconfidence, over/under reaction, and herd behavior) and investment decisions. The policy implications stemming from these findings are therefore substantial: addressing behavioral biases during the COVID-19 pandemic can mitigate market inefficiencies and promote better decision-making. First, this study suggests that investing in comprehensive financial education plans will enhance the financial literacy of investors and enable them to better recognize behavioral biases during times of uncertainty and crisis. Second, the findings imply that accurate and transparent information sharing about the COVID-19 pandemic can better mitigate behavioral biases; in particular, government interventions (e.g., the National Command and Coordination Centre) that ensure reliable information can lead investors to make more rational and informed investment decisions during times of uncertainty and crisis. Last, the findings provide insights to policymakers that pandemic news and developments significantly influenced the behavioral biases affecting investment decisions (Khurshid et al. 2021). For example, news about the number of casualties, infection rates, vaccine progress, government stimulus packages, or stock market downturns had immediate effects on behavioral biases, especially overconfidence, over/under reaction, and herd behavior. In this sense, enhancing the transparency of COVID-19 news in the media can reduce the influence of sensationalized news on investor decisions.

Limitations and call for future research

This study significantly enhances the understanding of behavioral factors’ impact on investors’ decision-making processes, presenting important findings within the context of the COVID-19 pandemic. While these contributions are notable, the research is subject to certain limitations that pave the way for future exploration and deeper investigation into this complex field.

Firstly, the study underscores the necessity for further research to validate its results through larger sample sizes and a more diverse array of respondents. Adopting a longitudinal design could prove particularly insightful, enabling an analysis of behavioral biases across different stages of the pandemic and providing a dynamic perspective on how investor behaviors evolve over time.

In addition, future studies could examine the behavioral factors influencing institutional investor decisions within Pakistan. The complex decision-making processes and investment portfolios of institutional investors, coupled with challenges like data availability and the heterogeneity among institutions, present a fertile ground for investigation. Such research could unravel how various factors, including market conditions and macroeconomic assessments, impact institutional investment strategies.

The study also points out the need to broaden the investigation to include other potential behavioral factors beyond those focused on in the current research, such as loss aversion, personality traits, anchoring, and recency biases. Expanding the scope of behavioral factors examined could significantly enrich the behavioral finance field by offering a more comprehensive view of the influences on investment decisions.

Moreover, while the insights gained from a Pakistani context during the COVID-19 pandemic are invaluable, extending the research to include global (e.g., China, Japan, USA) and other emerging markets (e.g., BRICS) would enhance understanding of the universality or specificity of behavioral biases in investment decisions across various economic, cultural, and regulatory environments.

Lastly, the study’s reliance on quantitative data points to the potential benefits of incorporating qualitative data into future research. Undertaking case studies within specific securities brokerages or investment banks could provide an in-depth investigation of investor behavior, generating new insights that could inspire further research.

To support the development of more sophisticated agent-based models and to foster collaborative research efforts, the study makes its source code available to other researchers. This openness to collaboration promises to stimulate innovative approaches to understanding and modeling investor behavior across diverse contexts, contributing to the advancement of the behavioral finance field.

Author information

Authors and affiliations.

Department of Business Administration, University of the Punjab, Gujranwala Campus, Gujranwala, Pakistan

Wasim ul Rehman

Manager of Economics Research Department, Marbas Securities Co., Istanbul, Turkey

Omur Saltik

Institute of Quality and Technology Management, University of the Punjab, Lahore, Pakistan

Faryal Jalil

Department of Economics, Mersin University, Mersin, Turkey

Suleyman Degirmen

Contributions

All authors contributed equally to this research work.

Corresponding author

Correspondence to Wasim ul Rehman .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Ethical approval

The data were collected through an online survey (questionnaire) during the last variant of COVID-19, and the anonymity of the respondents was meticulously preserved. The respondents were not asked to provide their names, identification, address, or any other identifying elements. The authors strictly observed the ethical guidelines of the Declaration of Helsinki. In addition, we hereby certify that this study was conducted under the ethical approval guidelines of the Office of Research Innovation and Commercialization, University of the Punjab, granted under office order No. D/ 409/ORIC dated 31-12-2021.

Informed consent

The consent of participants was obtained through a consent form during the last variant of COVID-19. The consent form contains the title of the study, the intent of the study, the procedure to participate, confidentiality, voluntary participation of respondents, questions/queries, and the consent of the respondents. The respondents were requested to indicate their willingness to participate in the survey on the consent form via email before filling in the online survey (questionnaire). Further, participants were assured that their anonymity would be maintained and that no personal information or identifying element would be disclosed. The consent form is in the supplementary files.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Consent form, rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Rehman, W.u., Saltik, O., Jalil, F. et al. Viral decisions: unmasking the impact of COVID-19 info and behavioral quirks on investment choices. Humanit Soc Sci Commun 11 , 524 (2024). https://doi.org/10.1057/s41599-024-03011-7

Received : 17 June 2023

Accepted : 28 March 2024

Published : 20 April 2024

DOI : https://doi.org/10.1057/s41599-024-03011-7

introduction in research methodology

  • Open access
  • Published: 20 April 2024

Interpretable machine learning in predicting drug-induced liver injury among tuberculosis patients: model development and validation study

  • Yue Xiao 1 ,
  • Yanfei Chen 1 ,
  • Ruijian Huang 1 ,
  • Feng Jiang 1 ,
  • Jifang Zhou 1   na1 &
  • Tianchi Yang 2   na1  

BMC Medical Research Methodology volume 24, Article number: 92 (2024)

The objective of this research was to create and validate an interpretable prediction model for drug-induced liver injury (DILI) during tuberculosis (TB) treatment.

A dataset of TB patients from Ningbo City was used to develop models employing the eXtreme Gradient Boosting (XGBoost), random forest (RF), and the least absolute shrinkage and selection operator (LASSO) logistic algorithms. The model's performance was evaluated through various metrics, including the area under the receiver operating characteristic curve (AUROC) and the area under the precision recall curve (AUPR) alongside the decision curve. The Shapley Additive exPlanations (SHAP) method was used to interpret the variable contributions of the superior model.

A total of 7,071 TB patients were identified from the regional healthcare dataset. The study cohort consisted of individuals with a median age of 47 years, 68.0% of whom were male, and 16.3% developed DILI. We utilized part of the high dimensional propensity score (HDPS) method to identify relevant variables and obtained a total of 424 variables. From these, 37 variables were selected for inclusion in a logistic model using LASSO. The dataset was then split into training and validation sets according to a 7:3 ratio. In the validation dataset, the XGBoost model displayed improved overall performance, with an AUROC of 0.89, an AUPR of 0.75, an F1 score of 0.57, and a Brier score of 0.07. Both SHAP analysis and XGBoost model highlighted the contribution of baseline liver-related ailments such as DILI, drug-induced hepatitis (DIH), and fatty liver disease (FLD). Age, alanine transaminase (ALT), and total bilirubin (Tbil) were also linked to DILI status.

XGBoost demonstrates improved predictive performance compared to RF and LASSO logistic in this study. Moreover, the introduction of the SHAP method enhances the clinical understanding and potential application of the model. For further research, external validation and more detailed feature integration are necessary.

Drug-induced liver injury (DILI) presents significant challenges in the context of tuberculosis (TB) treatment. Anti-TB drugs exhibit noteworthy involvement in the occurrence of DILI [ 1 , 2 ], and the lack of certain early-detection biomarkers [ 3 ] further poses challenges to the timely diagnosis and management of DILI. This absence of early detection may result in treatment interruptions and failures amongst TB patients [ 4 , 5 ], impeding global TB eradication efforts [ 6 ]. In China, the elevated incidence rates of DILI in comparison to western nations highlight the potential involvement of traditional Chinese medicines (TCM) and herbal medicines in the development of DILI [ 7 ]. This requires addressing various challenges and complexities associated with DILI assessment in a comprehensive and objective manner. Therefore, the primary objective of this study is to develop an optimal predictive model for assessing DILI status, with a specific focus on TB patients within the Chinese context.

The emergence of machine learning (ML) algorithms presents an exciting opportunity to enhance DILI prediction models [ 8 ]. Among these, eXtreme Gradient Boosting (XGBoost) [ 9 ] and random forest (RF) [ 10 ] stand out as two widely-used ensemble learning techniques, each distinguished by its algorithmic approach and features. Selecting the most suitable option between them hinges on the particular characteristics of the data and the prediction objective. Therefore, it is often advisable to conduct experiments with both models to compare their performance.

Nevertheless, one of the primary challenges in implementing ML algorithms in clinical settings is interpreting the outcomes of the models [ 11 , 12 ]. The Shapley Additive exPlanations (SHAP) framework [ 13 ] provides insights into the influence of various features on model predictions and the effect of these features on the DILI status in individuals, thus bridging the interpretability gap.

This study focuses on the development and validation of a prediction model for DILI in the context of TB treatment by using advanced ML algorithms with SHAP interpretability. Through this endeavor, we aim to achieve a balance between accurate prediction and the interpretability of the model, which is crucial for its clinical application.

Data source

The study participants comprised individuals diagnosed with TB at specified hospitals in Ningbo from 1st January 2015 to 2nd January 2020, initially referred by the Chinese Center for Disease Control and Prevention (CDC) [ 14 ]. Thereafter, they were connected to administrative records obtained from the electronic health records (EHR) system employed by the local government [ 15 ]. The merged dataset comprised demographic information, hospitalization records (both inpatient and outpatient), laboratory tests, and medication profiles.

Exclusion criteria

To ensure consistency in the identification of covariates, individuals with only one health care encounter during the study period were excluded. Furthermore, individuals without ethnicity information and those under 18 years old at diagnosis were not included in the study. The exclusion criteria also filtered out misdiagnosed cases of DILI and liver injuries attributed to known factors like alcohol-related liver disease, non-alcoholic fatty liver disease (NAFLD), and viral hepatitis unrelated to drug-induced causes. The detailed flowchart is presented in Fig.  1 .

Fig. 1. Study schema for subject selection. Abbreviations: EHR, Electronic healthcare record; CDC, Center for Disease Control and Prevention

Baseline laboratory result collection

For patients included in the study, we defined the baseline period for collecting laboratory test results as from January 1, 2015, to the day before the index diagnosis of pulmonary tuberculosis, as shown in Supplemental Fig.  1 . Additionally, liver function test indicators such as alanine transaminase (ALT) or alkaline phosphatase (ALP) were simultaneously examined.

To address the issue of varied baseline definitions in laboratory testing, we utilized two main strategies. Firstly, we employed a binary variable approach to categorize laboratory testing indicators as abnormal or normal, by comparing their values with predefined normal ranges. Secondly, we utilized ratio-based representation to quantify indicator abnormalities, such as calculating ALT multiples relative to the upper limit of the normal (ULN) range.
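The two encodings described above can be sketched as follows. This is a minimal illustration, not the authors' code: the reference ranges in `NORMAL_RANGES` are hypothetical placeholders (real limits depend on the laboratory and assay), and the names `LabFeature`, `encode_lab`, and `max_uln_ratio` are invented for this example.

```python
from dataclasses import dataclass

# Hypothetical reference ranges (lower, upper); assumed values for illustration only.
NORMAL_RANGES = {
    "ALT": (0.0, 40.0),    # U/L, assumed
    "ALP": (40.0, 130.0),  # U/L, assumed
}


@dataclass
class LabFeature:
    abnormal: int     # 1 if the value lies outside the normal range, else 0
    uln_ratio: float  # value expressed as a multiple of the upper limit of normal


def encode_lab(test: str, value: float) -> LabFeature:
    """Encode one laboratory result as a binary flag plus a ratio to the ULN."""
    low, high = NORMAL_RANGES[test]
    abnormal = int(value < low or value > high)
    return LabFeature(abnormal=abnormal, uln_ratio=value / high)


def max_uln_ratio(test: str, values: list[float]) -> float:
    """Largest multiple of the ULN observed across a patient's baseline results."""
    return max(encode_lab(test, v).uln_ratio for v in values)
```

Under these assumed ranges, an ALT of 120 U/L would be flagged as abnormal and recorded as three times the ULN. Expressing values as multiples of the ULN makes results from laboratories with different reference ranges comparable.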

Factor identification

In our research, we followed the initial steps outlined in the high dimensional propensity score (HDPS) methodology by Schneeweiss et al. [ 16 ]. First, we identified 24 common factors, such as age and gender, to integrate into our models. We then categorized our data into four dimensions: outpatient records, inpatient records, laboratory test records, and medication records. Following the approach of Chen et al. [ 17 ], we identified the top 500 most prevalent codes within each dimension. Next, we evaluated code recurrence, classifying codes into three binary variables based on their frequency of occurrence over a 12-month baseline period. This yielded a total of 4*500*3 binary factors. Using a multiplicative model considering binary factor and DILI status, we prioritized covariates and selected the top 400 for inclusion in our final model based on an arbitrary cutoff recommendation [ 18 , 19 ]. Finally, considering the previously specified 24 variables, our model training ultimately involved incorporating a total of 424 factors.
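The recurrence step and the factor counts can be sketched as below. The thresholds are an assumption borrowed from the usual HDPS convention (at least once, at least the median count, at least the 75th percentile count among patients who have the code); the text above does not state the exact cut points, and the function name `recurrence_flags` is hypothetical.

```python
from statistics import median, quantiles

# Counts stated in the text: 4 dimensions x 500 codes x 3 recurrence flags.
N_DIMENSIONS, CODES_PER_DIMENSION, FLAGS_PER_CODE = 4, 500, 3
N_CANDIDATES = N_DIMENSIONS * CODES_PER_DIMENSION * FLAGS_PER_CODE  # 6000
N_FINAL = 400 + 24  # top-ranked empirical factors plus predefined factors


def recurrence_flags(counts):
    """Map each patient to three binary flags for one code.

    counts: dict of patient id -> number of times the code appeared in baseline.
    Returns patient id -> (ever, sporadic, frequent). Thresholds are assumed:
    ever = at least once, sporadic = at least the median, frequent = at least
    the 75th percentile, both computed among patients with the code.
    """
    positive = sorted(c for c in counts.values() if c > 0)
    if not positive:
        return {pid: (0, 0, 0) for pid in counts}
    med = median(positive)
    if len(positive) > 1:
        q3 = quantiles(positive, n=4, method="inclusive")[2]
    else:
        q3 = positive[0]
    return {
        pid: (int(c >= 1), int(c >= med), int(c >= q3))
        for pid, c in counts.items()
    }
```

Each of the three flags becomes its own binary column, which is how 2,000 codes expand into 6,000 candidate factors before ranking.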

DILI diagnostic process

The determination of DILI outcomes followed the revised criteria set forth by the Chinese Society of Hepatology (CSH) DILI consensus, as outlined in Supplemental Table  1 [ 20 ].

Extraction of features used in prediction model

The LASSO regression method, aimed at reducing the number of variables and preventing overfitting [ 21 ], was applied to extract significant features for constructing the logistic model. Additionally, both the XGBoost and RF algorithms come equipped with their own feature selection techniques tailored to enhance their respective models.
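For reference, the standard L1-penalized logistic regression objective is written out below. The notation is generic rather than taken from the study: $n$ patients, $d$ candidate factors, binary DILI status $y_i$, predicted probability $\pi_i$, and a penalty weight $\lambda$ whose value is not reported in this section.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Bigl[\, y_i \log \pi_i + (1-y_i)\log(1-\pi_i) \Bigr]
  + \lambda \sum_{j=1}^{d} \lvert \beta_j \rvert,
\qquad
\pi_i = \frac{1}{1+\exp\bigl(-(\beta_0 + x_i^{\top}\beta)\bigr)}
```

Because the penalty is on the absolute values of the coefficients, a sufficiently large $\lambda$ drives some $\beta_j$ exactly to zero, and the factors with nonzero coefficients are the ones retained for the logistic model.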

Statistical analysis

The study reported the features of both the non-DILI and DILI groups as means and standard deviations (SD) or as numbers and percentages, as appropriate. Laboratory variables were summarized as medians and quartiles [ 22 ]. The Kruskal–Wallis rank sum test was used for continuous variables, while the chi-square test was used for categorical variables. These analyses were conducted using the statistical software packages SAS 9.4 and R 4.0.3. A two-sided P -value below 0.05 was considered statistically significant.
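As an illustration of the categorical comparison, the sketch below computes a Pearson chi-square test for a 2x2 table using only the standard library. It is an assumption that a plain Pearson statistic without continuity correction is representative; statistical packages may apply a correction by default for 2x2 tables, and the function name `chi_square_2x2` and any counts passed to it are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

                outcome   no outcome
    group 1        a          b
    group 2        c          d

    Returns (statistic, two-sided p-value) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function reduces to erfc.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value
```

A result would be called significant under the stated rule when the returned p-value is below 0.05.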

Data splitting

In order to create training and validation sets, a stratified random function in R randomly assigned records at a 7:3 ratio, following conventional practices.
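A stratified 7:3 split can be sketched as follows. The study performed this step in R; the Python version below is only an illustration, and the function name `stratified_split` and the fixed seed are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split record indices so each class keeps roughly the same proportion.

    labels: sequence of class labels (for example 1 = DILI, 0 = non-DILI).
    Returns (train_indices, validation_indices), both sorted.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    train, valid = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment within each class
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        valid.extend(indices[cut:])
    return sorted(train), sorted(valid)
```

Stratifying on the outcome matters here because the positive class is a minority; a purely random split could leave the validation set with a noticeably different event rate.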

Parameter optimization

To optimize the parameters of the XGBoost and RF models, a ten-fold cross-validation process combined with grid search [ 23 ] was employed. This approach entailed identifying the hyperparameter set that maximized the area under the receiver operating characteristic (ROC) curve. A detailed breakdown of the grid search particulars and optimal results can be found in Supplemental Table 2.
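The search procedure can be sketched generically as below. This is not the authors' implementation: `fit_and_score` is a hypothetical callback that would train a model on the training indices with the given hyperparameters and return the held-out score, and any grid passed in is illustrative (the grid actually used is in the supplement, which is not reproduced here).

```python
import itertools
import random
from statistics import mean


def k_fold_indices(n, k=10, seed=0):
    """Yield (training_indices, held_out_indices) for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield training, folds[i]


def grid_search(param_grid, n, fit_and_score, k=10):
    """Return the hyperparameter combination with the best mean held-out score.

    param_grid: dict of parameter name -> list of candidate values.
    fit_and_score: hypothetical callback (params, train_idx, held_out_idx) -> float.
    """
    best_params, best_score = None, float("-inf")
    names = sorted(param_grid)
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [fit_and_score(params, tr, ho) for tr, ho in k_fold_indices(n, k)]
        avg = mean(scores)
        if avg > best_score:
            best_params, best_score = params, avg
    return best_params, best_score
```

Every combination in the grid is evaluated on the same ten folds, so differences in the mean score reflect the hyperparameters rather than the partitioning.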

Model evaluation and interpretation

To assess the model's capacity to differentiate between positive and negative cases, we computed both the area under the receiver operating characteristic curve (AUROC) and the area under the precision recall curve (AUPR) [ 24 ]. Calibration was examined through reliability diagrams and Brier scores. Furthermore, the model's clinical utility was evaluated using decision curve analysis. The SHAP technique was utilized to delve deeper into variable contributions. A comprehensive overview of the workflow can be found in Supplemental Fig. 2 .
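The discrimination and calibration metrics named above can be written out in a few lines. This is a plain-Python sketch for illustration; the function names are hypothetical, the AUPR is estimated here by average precision (one of several common estimators, and an assumption about which one applies), and tied scores are not handled specially in that function.

```python
def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Requires at least one positive label.
    """
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    tp, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            total += tp / rank  # precision at each recalled positive
    return total / tp


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)
```

AUROC is insensitive to class balance, whereas the precision-recall area depends on prevalence, which is why both are reported for an outcome that affects a minority of patients. A lower Brier score indicates better calibrated probabilities.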

Participant and factor identification

The preliminary linkage of data yielded 12,087 instances. Following the application of exclusion criteria, a total of 7,071 subjects were identified as suitable for inclusion in the study.

During a one-year baseline period, we identified the 500 most prevalent codes across each data dimension (outpatient, inpatient, medication, and laboratory test) using the International Classification of Diseases-Tenth Revision (ICD-10), Current Procedural Terminology (CPT), and generic drug names. These items were then categorized into three binary variables: "ever occurring", "sporadically occurring", and "frequently occurring", indicating their recurrence. This process resulted in a total of 6,000 variables, from which the top 400 binary empirical variables were chosen based on their highest risk ratios associated with DILI status. Additionally, the final model incorporated 24 predefined baseline variables, such as gender, age, education level, medication, and maximum ratio of ULN for ALT, ALP, and Tbil, etc. Out of an initial pool of 424 features, 37 were selected for logistic model development using LASSO. The factors included in the LASSO logistic model are detailed in Supplemental Table 3 .
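Ranking binary factors by their risk ratio with the outcome can be sketched as below. The text states that risk ratios were used but does not give the exact formula, so the plain 2x2 risk ratio here is an assumption, as are the function names and the handling of undefined ratios.

```python
def risk_ratio(exposed, outcome):
    """Risk of the outcome with the flag set divided by the risk without it."""
    n1 = sum(1 for e in exposed if e)
    n0 = len(exposed) - n1
    a = sum(1 for e, o in zip(exposed, outcome) if e and o)      # flag = 1, outcome = 1
    c = sum(1 for e, o in zip(exposed, outcome) if not e and o)  # flag = 0, outcome = 1
    if n1 == 0 or n0 == 0 or c == 0:
        # Undefined ratio; a real pipeline needs an explicit rule for sparse cells.
        return float("nan")
    return (a / n1) / (c / n0)


def top_k_by_risk_ratio(features, outcome, k):
    """Return the names of the k factors with the highest risk ratios.

    features: dict of factor name -> list of 0/1 values, one per patient.
    """
    ratios = {name: risk_ratio(col, outcome) for name, col in features.items()}
    usable = {name: rr for name, rr in ratios.items() if rr == rr}  # drop NaN
    return sorted(usable, key=usable.get, reverse=True)[:k]
```

In the setting described above, `features` would hold the 6,000 recurrence flags and `k` would be 400.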

Epidemiology of DILI

The incidence of DILI was observed to be 16.3% overall, with a slightly higher incidence observed in female patients (17.3% vs. 15.8%, p = 0.134). Detailed demographics and clinical information are outlined in Table 1. Compared to non-DILI individuals, those with DILI demonstrated lower educational attainment and a higher incidence of abnormal baseline levels in ALT and ALP [ALT: 91 (7.9%) vs. 273 (4.6%), p < 0.001; ALP: 100 (8.7%) vs. 400 (6.8%), p = 0.023]. Individuals of middle age, females, and those with pre-existing chronic liver conditions were found to have a higher susceptibility to DILI. Significant associations with DILI were identified for certain drugs, including pyrazinamide (PZA), isoniazid (INH), traditional Chinese medicines (TCM), and hepatoprotective agents such as silymarin and glycyrrhetinic acid.

Model development and validation

The XGBoost and RF models were constructed using optimal parameters obtained through the previously mentioned GridSearchCV method. The LASSO logistic model was constructed with the aforementioned variables. Internal validation was conducted on the held-out validation set, and the performance of the three models is compared in Table 2. The XGBoost model exhibited slightly superior discriminatory ability compared with the RF and LASSO logistic models, with AUROC values of 0.89 versus 0.88/0.85 and AUPR values of 0.75 versus 0.73/0.67, respectively, as shown in Figs. 2 and 3. The RF model demonstrated higher recall, with a score of 0.78, while the XGBoost model achieved the highest F1-score of 0.57. Calibration was evaluated through ten predictive probability-based bins and verified by the reliability diagram presented in Fig. 4, supported by a Brier score of 0.08, indicating close agreement in calibration between the XGBoost and LASSO logistic models. Decision curve analysis revealed positive net benefits for all models. Notably, the XGBoost model outperformed both the RF and LASSO logistic models within the threshold range of approximately 0.2 to 0.5, as demonstrated in Fig. 5.
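The quantity plotted on a decision curve is the net benefit at each threshold probability. A minimal sketch using the standard definition follows; the function names are hypothetical and this is not the code used to produce the study's figures.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    Standard definition: TP/n - FP/n * threshold / (1 - threshold).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none strategy). Comparing models over a threshold range, such as 0.2 to 0.5, amounts to comparing these values point by point.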

Fig. 2. Comparison of the AUROC of the XGBoost, logistic and random forest in the validation set

Fig. 3. Comparison of the AUPR of the XGBoost, logistic and random forest in the validation set

Fig. 4. Comparison of the calibration curve of the XGBoost, logistic and random forest in the validation set

Fig. 5. The decision curve of the XGBoost, logistic and random forest in the validation set

Model interpretation

To reveal the factors that influenced the best-performing model's predictions, Fig. 6 shows the most important features of XGBoost (with feature importance > 0.01). Of note, historical occurrences of DILI, DIH, and fatty liver disease (FLD) during the baseline phase were consistently highlighted. Moreover, the ratios to the ULN for ALT, ALP and Tbil were also identified as critical factors. The SHAP values calculated for the XGBoost model, as shown in Supplemental Fig. 3, indicate that individuals who had chronic liver disease during baseline were more likely to be in DILI status. Interestingly, we found that those with a lower educational level were more susceptible to DILI status. To gain a deeper understanding of the underlying mechanism and the effects of features in the XGBoost model, we randomly selected two typical patients from the dataset. Furthermore, we created force plots to visualize their decision process, as illustrated in Supplemental Fig. 4 and Supplemental Fig. 5. The average SHAP value was 0.168, where yellow indicates a positive impact and purple represents a negative impact. In Supplemental Fig. 4, the identified patient with a SHAP value of 1.06, surpassing the average, is likely to develop DILI. The significant influencing factor is being diagnosed with DILI or DIH at least once during the baseline period. The same rationale applies to the identified patient as depicted in Supplemental Fig. 5. Additionally, Supplemental Fig. 6 presents a force plot that captures the aggregate effect in the validation set.
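The additive decomposition behind a force plot can be illustrated with a brute-force Shapley computation on a toy model. This is a sketch of the underlying idea only: SHAP uses a faster tree-specific algorithm and a background dataset, whereas the code below enumerates all feature subsets against a single baseline vector. The function name `shapley_values`, the toy model, and the baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one instance.

    predict: function taking a list of feature values and returning a number.
    x: feature values of the instance being explained.
    baseline: reference values used when a feature is treated as absent.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to subset s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical score: one main effect plus one interaction.
    return 0.1 + 0.5 * z[0] + 0.2 * z[1] * z[2]
```

The attributions always sum to the difference between the prediction for the patient and the prediction at the baseline, which is what a force plot displays: a base value pushed up or down by each feature.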

Fig. 6. Top important features selected by XGBoost (> 0.01). Abbreviations: ODILIO, outpatient drug-induced liver injury, once occurring; ODIHO, outpatient drug induced hepatitis, once occurring; ODIHS, outpatient drug induced hepatitis, sporadically occurring; IDIHO, inpatient drug induced hepatitis, once occurring; ODILIS, outpatient drug induced liver injury, sporadically occurring; IDIHF, inpatient drug induced hepatitis, frequently occurring; IDILIO, inpatient drug induced liver injury, once occurring; ODILIF, outpatient drug induced liver injury, frequently occurring; TBIL, total bilirubin; ALP, alkaline phosphatase; IDILIS, inpatient drug induced liver injury, sporadically occurring; ALT, alanine aminotransferase; FLD, fatty liver disease

To our knowledge, this study represents the first attempt to evaluate the prediction of DILI in an Asian population with TB, predominantly of Han ethnicity, using regional electronic health records. We observed slightly enhanced discrimination abilities in ML models compared to the logistic model. While logistic regression offers better clinical generalizability, it struggles with overfitting and handling missing variables, resulting in overall weaker performance than anticipated. In contrast, both XGBoost and RF employ more advanced techniques. XGBoost utilizes gradient boosting, progressively building weak learners and effectively capturing non-linear relationships with built-in regularization. On the other hand, RF, a bagging ensemble method, constructs independent decision trees on random subsets of data, resulting in robust averaging but with less explicit regularization. XGBoost excels in capturing intricate non-linear patterns, making it suitable for tasks involving complex and dynamic interactions like predicting DILI during TB treatment. Its training efficiency is also evident when handling large datasets. RF, with its robust averaging, is well-suited for further application in diverse datasets but may encounter challenges in effectively capturing subtle non-linear patterns among multiple explanatory variables.

Several prior studies have identified risk factors associated with DILI during TB treatment, involving chronic liver disease, specific drug combinations, age, and various demographic characteristics [ 25 , 26 , 27 ]. Lammert et al. [ 28 ] suggested an increased risk of DILI in patients with chronic liver disease indicative of NAFLD. Chang et al. [ 29 ] indicated a significant rise in hepatotoxicity risk associated with adding PZA to INH and RIF. Hosford et al. [ 30 ] established a notable elevation in hepatotoxicity risk among individuals over 60 years of age through a systematic literature review. Abbara et al. [ 2 ] found low patient weight, HIV-1 co-infection, higher baseline ALP levels, and alcohol intake were risk factors. Thus, in our model, we predefined enzyme levels, utilization of anti-TB drugs such as PZA, INH, and RIF, hepatoprotective agents such as silymarin and glycyrrhetinic acid, alcohol intake, and demographic variables such as age, gender, education level, ethnicity, profession as predictors. In the ultimate XGBoost model, the contribution weights for chronic liver disease, ULN of ALT, ALP, Tbil, and age surpass 0.01, consistent with earlier research discoveries.

Currently, a range of predictive models for DILI primarily operates at the molecular level in preclinical settings [ 31 ], utilizing diverse artificial intelligence assisted algorithms [ 32 ]. Minerali et al. [ 33 ] employed the Bayesian ML method, resulting in an AUROC of 0.81, 74% sensitivity, 76% specificity, and 75% accuracy. Xu et al. [ 34 ] proposed a deep learning model, achieving 87% accuracy, 83% sensitivity, 93% specificity, and an AUROC of 0.96. Dominic et al.'s Bayesian prediction model [ 35 ] demonstrated balanced performance with 86% accuracy, 87% sensitivity, 85% specificity, 92% positive predictive value, and 78% negative predictive value. In the clinical stage, only Zhong et al. introduced a single tree XGBoost model with 90% precision, 74% recall, and 76% classification accuracy for DILI prediction, using a clinical sample of 743 TB cases [ 36 ]. In our study, we leveraged regional healthcare data and employed the XGBoost algorithm. The model exhibited 76% recall, 82% specificity, and 81% accuracy in predicting DILI status. Our approach was proven robust, as evidenced by a mean AUROC of 0.89 and AUPR of 0.75 upon tenfold cross validation. During the clinical treatment stage, our model exhibited high levels of accuracy and interpretability.

The choice of a cutoff in a DILI prediction model is crucial and depends on specific study goals and requirements. Various studies have investigated optimal cutoff values in DILI prediction models to enhance understanding and prediction accuracy. For instance, in a study focused on drug-induced liver tumors, the maximum Youden index was utilized to determine the ideal cutoff point [ 37 ]. Another study, aimed at predicting DILI and cardiotoxicity, determined 0.4 as the optimal cutoff value using chemical structure and in vitro assay data [ 38 ]. Similarly, a system named DILIps, designed to predict DILI in drug safety, utilized the ROC curve to select the best cutoff value [ 39 ]. Given the imbalanced dataset in our study, we found the precision recall curve method seemed to be more appropriate. Additionally, considering the severe consequences of DILI, prioritizing the detection of DILI suggests choosing a lower cutoff to maximize sensitivity. Thus, in our study, we opted for the maximum Youden index as the best cutoff.
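Selecting a cutoff by the maximum Youden index can be sketched as follows. The function name `youden_cutoff` is hypothetical, and the tie-breaking rule (keep the lowest threshold, which favors sensitivity) is an assumption rather than something stated in the text.

```python
def youden_cutoff(y_true, probs):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    Candidate thresholds are the observed predicted probabilities; a patient
    is classified positive when the predicted probability is >= threshold.
    On ties the lowest threshold is kept.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    best_t, best_j = None, float("-inf")
    for t in sorted(set(probs)):
        tp = sum(1 for y, p in zip(y_true, probs) if p >= t and y == 1)
        tn = sum(1 for y, p in zip(y_true, probs) if p < t and y == 0)
        j = tp / pos + tn / neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j
```

The Youden index weights sensitivity and specificity equally. If missed cases are considered costlier than false alarms, a threshold below the Youden optimum trades specificity for sensitivity.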

However, the acceptability of ML in the medical community faces a significant hurdle regarding interpretability, particularly in settings where clinical decisions are paramount. Our research employed SHAP strategies to illuminate the complex mechanisms of the XGBoost model.

Strengths and limitations

The study utilized a large dataset of over 7,000 TB patients to develop a robust model and comprehensively included clinical, demographic, and biochemical variables to improve predictive accuracy. Furthermore, the model incorporates SHAP analysis to improve interpretability. However, as we embark on the integration of ML into clinical settings, a vital concern persists regarding the generalizability of models [ 40 ]. While our model demonstrates enhanced predictive accuracy, it's important to recognize the inherent limitations stemming from the lack of external validation. Patient characteristics [ 41 ] and drug interactions [ 42 ] may differ widely across populations. This underscores the importance of validating models on diverse patient cohorts and geographical regions. Moreover, the study's reliance on a data-driven approach and the inherent complexity of integrating ML models into clinical practice present additional limitations [ 43 ]. Additionally, the dependence on clinical diagnosis for DILI and the potential influence of unmeasured variables on model accuracy are acknowledged. While the study's findings offer valuable insights, careful consideration is warranted when interpreting them.

Conclusions

XGBoost shows improved predictive performance compared to RF and LASSO logistic in this study. Moreover, introducing the SHAP method enhances the clinical understanding and potential application of the model. For further research, external validation and more detailed feature integration are necessary.

Code availability statement

To enhance reproducibility and facilitate peer review, we uploaded the code used for model fitting. The source code associated with this research is available on the GitHub repository ( https://github.com/cpu-pharmacoepi/TB-DILI ). For inquiries or assistance related to the code, please contact 1,020,202,[email protected].

Availability of data and materials

The datasets used and analyzed during the current study are available from the corresponding author on reasonable request. Data cannot be shared publicly because of the privacy and confidentiality of the TB patients in Ningbo, Zhejiang, China.

Abbreviations

  • ALP: Alkaline phosphatase
  • ALT: Alanine transaminase
  • AUPR: Area under the precision recall curve
  • AUROC: Area under the receiver operating characteristic curve
  • CDC: Center for Disease Control and Prevention
  • CPT: Current procedural terminology
  • CSH: Chinese Society of Hepatology
  • DIH: Drug-induced hepatitis
  • DILI: Drug-induced liver injury
  • EHR: Electronic healthcare record
  • FLD: Fatty liver disease
  • HDPS: High dimensional propensity score
  • ICD-10: International Classification of Diseases-Tenth Revision
  • INR: International normalized ratio
  • LASSO: Least Absolute Shrinkage and Selection Operator
  • ML: Machine learning
  • NAFLD: Nonalcoholic fatty liver disease
  • PZA: Pyrazinamide
  • RF: Random Forest
  • ROC: Receiver operating characteristic
  • SD: Standard deviation
  • SHAP: Shapley Additive exPlanations
  • SMD: Standardized mean difference
  • TB: Tuberculosis
  • Tbil: Total serum bilirubin
  • TCM: Traditional Chinese medicine
  • ULN: Upper limit of normal
  • XGBoost: EXtreme Gradient Boosting

Jiang F, Yan H, Liang L, et al. Incidence and risk factors of anti-tuberculosis drug induced liver injury (DILI): Large cohort study involving 4,652 Chinese adult tuberculosis patients. Liver Int. 2021;41(7):1565–75.

Abbara A, Chitty S, Roe JK, et al. Drug-induced liver injury from antituberculosis treatment: a retrospective study from a large TB center in the UK. BMC Infect Dis. 2017;17:231.

Council for International Organizations of Medical Sciences. Drug-induced liver injury. Geneva: CIOMS; 2020. Available from: https://cioms.ch/wp-content/uploads/2020/06/CIOMS_DILI_Web_16Jun2020.pdf . Accessed 01 Mar 2021.

Nahid P, Dorman SE, Alipanah N, et al. Official American Thoracic Society/Centers for Disease Control and Prevention/Infectious Diseases Society of America Clinical Practice Guidelines: Treatment of Drug-Susceptible Tuberculosis. Clin Infect Dis. 2016;63(7):e147–95.

Stravitz RT. WM Lee. Acute liver failure The Lancet. 2019;394(10201):869–81.

CAS   Google Scholar  

World Health Organization. Global tuberculosis report. Geneva: WHO; 2020. Available from: https://www.who.int/tb/publications/global_report/en/ .

Shen T, Liu Y, Shang J, et al. Incidence and Etiology of Drug-Induced Liver Injury in Mainland China. Gastroenterology. 2019;156(8):2230-2241.e11.

Article   PubMed   Google Scholar  

Sarker IH. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN COMPUT. 2021;2:160.

Article   Google Scholar  

Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM; 2016;785–795.

Breiman L. Random Forests. Mach Learn. 2001;45:5–32.

Bjerregaard SS. Exploring predictors of welfare dependency 1, 3, and 5 years after mental health-related absence in Danish municipalities between 2010 and 2012 using flexible machine learning modelling. BMC Public Health. 2023;23(1):224.

Alan I, Andrew P, Catherine BH. Visualizing Variable Importance and Variable Interaction Effects in Machine Learning Models. J Comput Graph Stat. 2022;31(3):766–78.

Lu S, Chen R, Wei W, et al. Understanding Heart Failure Patients EHR Clinical Features via SHAP Interpretation of Tree-Based Machine Learning Model Predictions. AMIA Annu Symp Proc. 2022;2021:813–22.

PubMed   PubMed Central   Google Scholar  

Jiang WX, Huang F, Tang SL, et al. Implementing a new tuberculosis surveillance system in Zhejiang, Jilin and Ningxia: improvements, challenges and implications for China’s National Health Information System. Infect Dis Poverty. 2021;10(1):22.

Liu Z, Zhang L, Yang Y, et al. Active Surveillance of Adverse Events Following Human Papillomavirus Vaccination: Feasibility Pilot Study Based on the Regional Health Care Information Platform in the City of Ningbo, China. J Med Internet Res. 2020;22(6): e17446.

Schneeweiss S. Automated data-adaptive analytics for electronic healthcare data to study causal treatment effects. Clin Epidemiol. 2018;10:771–88.

Chen Q, Hu A, Ma A, et al. Effectiveness of Prophylactic Use of Hepatoprotectants for Tuberculosis Drug-Induced Liver Injury: A Population-Based Cohort Analysis Involving 6,743 Chinese Patients. Front Pharmacol. 2022;20(13): 813682.

Polinski JM, Schneeweiss S, Glynn RJ, et al. Confronting “confounding by health system use” in Medicare Part D: comparative effectiveness of propensity score approaches to confounding adjustment. Pharmacoepidemiol Drug Saf. 2012;21(Suppl 2):90–8.

Schneeweiss S, Rassen JA, Glynn RJ, et al. High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology. 2009;20(4):512–22.

Yu YC, Mao YM, Chen CW, et al. CSH guidelines for the diagnosis and treatment of drug-induced liver injury. Hepatol Int. 2017;11(3):221–41.

Sun L, Wang Q, Liu M, et al. Albumin binding function is a novel biomarker for early liver damage and disease progression in non-alcoholic fatty liver disease. Endocrine. 2020;69:294–302.

James G, Witten D, Hastie T, et al. An introduction to statistical learning: with applications in R. New York: Springer; 2013.

Book   Google Scholar  

Sattar N, Scherbakova O, Ford I, et al. Elevated alanine aminotransferase predicts new-onset type 2 diabetes independently of classical risk factors, metabolic syndrome, and C-reactive protein in the west of Scotland coronary prevention study. Diabetes. 2004;53(11):2855–60.

Coyner AS, Chen JS, Singh P, et al. Single-Examination Risk Prediction of Severe Retinopathy of Prematurity. Pediatrics. 2021;148(6): e2021051772.

Cao J, Mi Y, Shi C, et al. First-line anti-tuberculosis drugs induce hepatotoxicity: A novel mechanism based on a urinary metabolomics platform. Biochem Biophys Res Commun. 2018;497(2):485–91.

Tweed CD, Wills GH, Crook AM, et al. Liver toxicity associated with tuberculosis chemotherapy in the REMoxTB study. BMC Med. 2018;16(1):46.

Patterson B, Abbara A, Collin S, et al. Predicting drug-induced liver injury from anti-tuberculous medications by early monitoring of liver tests. J Infect. 2021;82(2):240–4.

Lammert C, Imler T, Teal E, et al. Patients With Chronic Liver Disease Suggestive of Nonalcoholic Fatty Liver Disease May Be at Higher Risk for Drug-Induced Liver Injury. Clin Gastroenterol Hepatol. 2019;17(13):2814–5.

Chang KC, Leung CC, Yew WW, et al. Hepatotoxicity of pyrazinamide: cohort and case-control analyses. Am J Respir Crit Care Med. 2008;177(12):1391–6.

Hosford JD, von Fricken ME, Lauzardo M, et al. Hepatotoxicity from antituberculous therapy in the elderly: a systematic review. Tuberculosis (Edinb). 2015;95(2):112–22.

Chen M, Bisgin H, Tong L, et al. Toward predictive models for drug-induced liver injury in humans: are we there yet? Biomark Med. 2014;8(2):201–13.

Vall A, Sabnis Y, Shi J, et al. The Promise of AI for DILI Prediction. Front Artif Intell. 2021;14(4): 638410.

Minerali E, Foil DH, Zorn KM, et al. Comparing Machine Learning Algorithms for Predicting Drug-Induced Liver Injury (DILI). Mol Pharm. 2020;17(7):2628–37.

Article   CAS   PubMed   PubMed Central   Google Scholar  

Xu Y, Dai Z, Chen F, et al. Deep Learning for Drug-Induced Liver Injury. J Chem Inf Model. 2015;55(10):2085–93.

Williams DP, Lazic SE, Foster AJ, et al. Predicting Drug-Induced Liver Injury with Bayesian Machine Learning. Chem Res Toxicol. 2020;33(1):239–48.

Zhong T, Zhuang Z, Dong X, et al. Predicting Antituberculosis Drug-Induced Liver Injury Using an Interpretable Machine Learning Method: Model Development and Validation Study. JMIR Med Inform. 2021;9(7): e29226.

Linden A. Measuring diagnostic and predictive accuracy in disease management: an introduction to receiver operating characteristic (ROC) analysis. J Eval Clin Pract. 2006;12(2):132–9.

Ye L, Ngan DK, Xu T, et al. Prediction of drug-induced liver injury and cardiotoxicity using chemical structure and in vitro assay data. Toxicol Appl Pharmacol. 2022;1(454): 116250.

Liu Z, Shi Q, Ding D, et al. Translating clinical findings into knowledge in drug safety evaluation–drug induced liver injury prediction system (DILIps). PLoS Comput Biol. 2011;7(12): e1002310.

Fisher S, Rosella LC. Priorities for successful use of artificial intelligence by public health organizations: a literature review. BMC Public Health. 2022;22:2146.

Obermeyer Z, et al. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447–53.

Juurlink David N. Drug-drug interactions among elderly patients hospitalized for drug toxicity. JAMA. 2003;289(13):1652–8.

Luo W, Phung D, Tran T, et al. Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View. J Med Internet Res. 2016;18(12): e323.

Download references

Acknowledgements

The authors thank all staff of the tuberculosis control centers, designated hospitals, community health service centers, and township health centers in ten counties/districts from Ningbo for their hard work and help in collecting clinical data. We also thank our colleagues from Ningbo Health Information Center for providing clinically relevant data for this study.

Disclosure of AI tools

We hereby disclose that generative AI tools were not utilized in the preparation or analysis of data presented in this manuscript. All methodologies and analyses were conducted utilizing established statistical and machine learning techniques as outlined in the Method section.

Funding

This research was supported by the Zhejiang Medical Research Project (2018KY733) and the Natural Science Foundation of Ningbo (2019A610386, 2019A610385). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.

Author information

Jifang Zhou and Tianchi Yang contributed equally to this work and share corresponding authorship.

Authors and Affiliations

School of International Pharmaceutical Business, China Pharmaceutical University, Nanjing, Jiangsu, China

Yue Xiao, Yanfei Chen, Ruijian Huang, Feng Jiang & Jifang Zhou

Institute of Tuberculosis Prevention and Control, Ningbo Municipal Center for Disease Control and Prevention, No.237, Yongfeng Road, Ningbo, Zhejiang, China

Tianchi Yang

Contributions

All authors were involved in the design of the study. FJ and RH cleaned the data and constructed the cohort; YC was involved in conceptualizing the study; YX and JZ were responsible for the analysis of the data and interpretation of the results; YX, JZ and TY contributed to the drafting of the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Jifang Zhou or Tianchi Yang.

Ethics declarations

Ethics approval and consent to participate

All aspects of this study, including the research methods, were conducted in strict accordance with relevant guidelines and regulations and in compliance with the ethical principles outlined in the Declaration of Helsinki. All patient data in the database were de-identified, and this study was determined to be exempt by the Institutional Review Board of the Ningbo Municipal Center for Disease Control and Prevention, which also waived the need for written informed consent.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1

Supplementary material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Xiao, Y., Chen, Y., Huang, R. et al. Interpretable machine learning in predicting drug-induced liver injury among tuberculosis patients: model development and validation study. BMC Med Res Methodol 24, 92 (2024). https://doi.org/10.1186/s12874-024-02214-5

Received: 09 October 2023

Accepted: 10 April 2024

Published: 20 April 2024

DOI: https://doi.org/10.1186/s12874-024-02214-5

  • Logistic regression
  • Retrospective study

BMC Medical Research Methodology

ISSN: 1471-2288

6 Common Leadership Styles — and How to Decide Which to Use When

  • Rebecca Knight

Being a great leader means recognizing that different circumstances call for different approaches.

Research suggests that the most effective leaders adapt their style to different circumstances — be it a change in setting, a shift in organizational dynamics, or a turn in the business cycle. But what if you feel like you’re not equipped to take on a new and different leadership style — let alone more than one? In this article, the author outlines the six leadership styles Daniel Goleman first introduced in his 2000 HBR article, “Leadership That Gets Results,” and explains when to use each one. The good news is that personality is not destiny. Even if you’re naturally introverted or you tend to be driven by data and analysis rather than emotion, you can still learn how to adapt different leadership styles to organize, motivate, and direct your team.

Much has been written about common leadership styles and how to identify the right style for you, whether it’s transactional or transformational, bureaucratic or laissez-faire. But according to Daniel Goleman, a psychologist best known for his work on emotional intelligence, “Being a great leader means recognizing that different circumstances may call for different approaches.”

  • Rebecca Knight is a journalist who writes about all things related to the changing nature of careers and the workplace. Her essays and reported stories have been featured in The Boston Globe, Business Insider, The New York Times, BBC, and The Christian Science Monitor. She was shortlisted as a Reuters Institute Fellow at Oxford University in 2023. Earlier in her career, she spent a decade as an editor and reporter at the Financial Times in New York, London, and Boston.

COMMENTS

  1. What is Research Methodology? Definition, Types, and Examples

    Definition, Types, and Examples. Research methodology 1,2 is a structured and scientific approach used to collect, analyze, and interpret quantitative or qualitative data to answer research questions or test hypotheses. A research methodology is like a plan for carrying out research and helps keep researchers on track by limiting the scope of ...

  2. What Is a Research Methodology?

    Step 1: Explain your methodological approach. Step 2: Describe your data collection methods. Step 3: Describe your analysis method. Step 4: Evaluate and justify the methodological choices you made. Tips for writing a strong methodology chapter. Other interesting articles.

  3. Research Methodology

    Research methodology formats can vary depending on the specific requirements of the research project, but the following is a basic example of a structure for a research methodology section: I. Introduction. Provide an overview of the research problem and the need for a research methodology section; Outline the main research questions and ...

  4. Research Methods

    Research methods are specific procedures for collecting and analyzing data. Developing your research methods is an integral part of your research design. When planning your methods, there are two key decisions you will make. First, decide how you will collect data. Your methods depend on what type of data you need to answer your research question:

  5. Research Methodology: An Introduction

    2.1 Research Methodology. Method can be described as a set of tools and techniques for finding something out, or for reducing levels of uncertainty. According to Saunders (2012) method is the technique and procedures used to obtain and analyse research data, including for example questionnaires, observation, interviews, and statistical and non-statistical techniques [].

  6. Writing a Research Paper Introduction

    Table of contents. Step 1: Introduce your topic. Step 2: Describe the background. Step 3: Establish your research problem. Step 4: Specify your objective(s). Step 5: Map out your paper. Research paper introduction examples. Frequently asked questions about the research paper introduction.

  7. PDF Chapter 1 Introduction to Research Methodology

    The research design is a fundamental aspect of research methodology, outlining the overall strategy and structure of the study. It includes decisions regarding the research type (e.g., descriptive, experimental), the selection of variables, and the determination of the study's scope and timeframe. We must carefully consider the design to ...

  8. What Is Research Methodology? Definition + Examples

    As we mentioned, research methodology refers to the collection of practical decisions regarding what data you'll collect, from who, how you'll collect it and how you'll analyse it. Research design, on the other hand, is more about the overall strategy you'll adopt in your study. For example, whether you'll use an experimental design ...

  9. 6. The Methodology

    The introduction to your methodology section should begin by restating the research problem and underlying assumptions underpinning your study. This is followed by situating the methods you used to gather, analyze, and process information within the overall "tradition" of your field of study and within the particular research design you ...

  10. Introduction to Research Methodology

    The research design is a fundamental aspect of research methodology, outlining the overall strategy and structure of the study. It includes decisions regarding the research type (e.g., descriptive, experimental), the selection of variables, and the determination of the study's scope and timeframe. We must carefully consider the design to ...

  11. Introduction to Research Methods

    Features. Preview. The Second Edition of Introduction to Research Methods: A Hands-On Approach by Bora Pajo continues to make research easy to understand and easy to construct. Covering both quantitative and qualitative methods, this new edition lays out the differences between research approaches so readers can better understand when and how ...

  12. (PDF) Introduction to research: Mastering the basics

    Accepted February 25, 2023. This paper provides an in-depth introduction to research methods and discusses numerous aspects related to the research process. It begins with an overview of ...

  13. CHAPTER 3 METHODOLOGY 1. INTRODUCTION

    2. RESEARCH DESIGN. This research is exploratory in nature as it attempts to explore the experiences of mothers of incest survivors. Their subjective perceptions formed the core data of the study; hence it needed the method that would deal with the topic in an exploratory nature. For the purpose of this study, the research paradigm that was ...

  14. How To Write The Methodology Chapter

    Do yourself a favour and start with the end in mind. Section 1 - Introduction. As with all chapters in your dissertation or thesis, the methodology chapter should have a brief introduction. In this section, you should remind your readers what the focus of your study is, especially the research aims. As we've discussed many times on the blog ...

  15. (PDF) An Introduction to Research Methodology

    1. From the French word "recherche" which means to travel through or survey. Research is an active, diligent and systematic process of inquiry in order to discover, interpret or revise facts ...

  16. How to Write a Research Paper Introduction (with Examples)

    Define your specific research problem and problem statement. Highlight the novelty and contributions of the study. Give an overview of the paper's structure. The research paper introduction can vary in size and structure depending on whether your paper presents the results of original empirical research or is a review paper.

  17. (PDF) Chapter 1: Introduction to Research Methodology

    Chapter 1: Introduction to Research Methodology. September 2023. In book: Research Methodology Lectures for Postgraduate Students (pp.1-11) Authors: Amer Al-ani. (University of Anbar - Iraq)

  18. PDF Research Methodology: An Introduction Meaning Of Research

    Research may be very broadly defined as the systematic gathering of data and information and its analysis for the advancement of knowledge in any subject. Research attempts to find answers to intellectual and practical questions through the application of systematic methods.

  19. What Is a Research Design

    A research design is a strategy for answering your research question using empirical data. Creating a research design means making decisions about: Your overall research objectives and approach. Whether you'll rely on primary research or secondary research. Your sampling methods or criteria for selecting subjects. Your data collection methods.

  20. Research Methodology Introduction

    Research methodology can be defined as a scientific and systematic search for pertinent information or facts on the specific topic. In fact, it is an art of scientific information. Research in common parlance refers to a search for knowledge. You can undertake research within most professions, more than a set of skills, it is a way.

  21. Introduction to Research Methodology

    The purpose of the research is clear. It follows a systematic process. It is logical and objective. It is easy to understand. Research starts with a problem/question. It creates a path for generating new questions. It is ethical. It resolves any current problems/issues. The data of the research is appropriate.

  22. Viral decisions: unmasking the impact of COVID-19 info and ...

    The study will commence with an introduction that outlines the scope and significance of the research. ... (2004) Research methodology: Methods and techniques, 2nd Edition. New Age International ...

  23. Interpretable machine learning in predicting drug-induced liver injury

    Background The objective of this research was to create and validate an interpretable prediction model for drug-induced liver injury (DILI) during tuberculosis (TB) treatment. Methods A dataset of TB patients from Ningbo City was used to develop models employing the eXtreme Gradient Boosting (XGBoost), random forest (RF), and the least absolute shrinkage and selection operator (LASSO) logistic ...

  24. (PDF) Introduction to research methodology

    Dr. Kavitha Chalakkal. Methodology is crucial: an unreliable method produces unreliable results and, as a consequence, undermines the value of your analysis of the findings. The methodology ...

  25. Constraining global transport of perfluoroalkyl acids on sea spray

    INTRODUCTION. Per- and polyfluoroalkyl substances ... transport, and fate is an urgent research need. PFAAs released into the environment are expected to be ... and writing (review and editing). S.M.B.: Investigation and writing (review and editing). I.T.C.: Conceptualization, methodology, resources, funding acquisition, data curation ...

  26. What Is Qualitative Research?

    Qualitative research involves collecting and analyzing non-numerical data (e.g., text, video, or audio) to understand concepts, opinions, or experiences. It can be used to gather in-depth insights into a problem or generate new ideas for research. Qualitative research is the opposite of quantitative research, which involves collecting and ...

  27. Agronomy

    The phytochrome-interacting factor (PIF) proteins are part of a subfamily of basic helix-loop-helix (bHLH) transcription factors that integrate with phytochromes (PHYs) and are known to play important roles in adaptive changes in plant architecture. However, the characterization and function of PIFs in potatoes are currently poorly understood. In this study, we identified seven PIF members ...

  28. 6 Common Leadership Styles

    Summary. Research suggests that the most effective leaders adapt their style to different circumstances — be it a change in setting, a shift in organizational dynamics, or a turn in the business ...