Table of Contents

  • Types of Statistical Analysis
  • Importance of Statistical Analysis
  • Benefits of Statistical Analysis
  • Statistical Analysis Process
  • Statistical Analysis Methods
  • Statistical Analysis Software
  • Statistical Analysis Examples
  • Career in Statistical Analysis

What Is Statistical Analysis?

Statistical analysis is the process of collecting and analyzing data in order to discern patterns and trends. It is a method for removing bias from the evaluation of data by employing numerical analysis. This technique is useful for interpreting research results, developing statistical models, and planning surveys and studies.

Statistical analysis is a scientific tool in AI and ML that helps collect and analyze large amounts of data to identify common patterns and trends and convert them into meaningful information. In simple words, statistical analysis is a data analysis tool that helps draw meaningful conclusions from raw and unstructured data.

The conclusions drawn through statistical analysis facilitate decision-making and help businesses make future predictions on the basis of past trends. It can be defined as the science of collecting and analyzing data to identify trends and patterns and present them clearly. Statistical analysis involves working with numbers and is used by businesses and other institutions to derive meaningful information from data.

Given below are the 6 types of statistical analysis:

Descriptive Analysis

Descriptive statistical analysis involves collecting, interpreting, analyzing, and summarizing data to present it in the form of charts, graphs, and tables. Rather than drawing conclusions, it simply makes complex data easy to read and understand.

Inferential Analysis

Inferential statistical analysis focuses on drawing meaningful conclusions from the data analyzed. It studies the relationship between different variables or makes predictions about the whole population.

Predictive Analysis

Predictive statistical analysis analyzes data to identify past trends and predict future events on the basis of them. It uses machine learning algorithms, data mining, data modelling, and artificial intelligence to conduct the statistical analysis of data.

Prescriptive Analysis

Prescriptive analysis examines the data and prescribes the best course of action based on the results. It is a type of statistical analysis that helps you make informed decisions.

Exploratory Data Analysis

Exploratory analysis is similar to inferential analysis, but it focuses on exploring previously unknown associations and potential relationships within the data.

Causal Analysis

Causal statistical analysis focuses on determining the cause-and-effect relationship between different variables within the raw data. In simple words, it determines why something happens and its effect on other variables. Businesses can use this methodology to determine the reason for a failure.

Statistical analysis eliminates unnecessary information and catalogs important data in an uncomplicated manner, greatly simplifying the otherwise monumental work of organizing inputs. Once the data has been collected, statistical analysis may be utilized for a variety of purposes. Some of them are listed below:

  • Statistical analysis helps summarize enormous amounts of data into clearly digestible chunks.
  • Statistical analysis aids in the effective design of laboratory, field, and survey investigations.
  • Statistical analysis supports sound and efficient planning in any field of study.
  • Statistical analysis aids in establishing broad generalizations and forecasting how much of something will occur under particular conditions.
  • Statistical methods, which are effective tools for interpreting numerical data, are applied in practically every field of study. Statistical approaches have been created and are increasingly applied in the physical and biological sciences, such as genetics.
  • Statistical approaches are used in the work of businesspeople, manufacturers, and researchers. Statistics departments can be found in banks, insurance companies, and government agencies.
  • A modern administrator, whether in the public or private sector, relies on statistical data to make correct decisions.
  • Politicians can use statistics to support and validate their claims while also explaining the issues they address.


Statistical analysis can be called a boon to mankind and has many benefits for both individuals and organizations. Given below are some of the reasons why you should consider investing in statistical analysis:

  • It can help you determine monthly, quarterly, and yearly figures for sales, profits, and costs, making it easier to make decisions.
  • It can help you make informed and correct decisions.
  • It can help you identify the problem or cause of the failure and make corrections. For example, it can identify the reason for an increase in total costs and help you cut the wasteful expenses.
  • It can help you conduct market analysis and make an effective marketing and sales strategy.
  • It helps improve the efficiency of different processes.

Given below are the 5 steps to conduct a statistical analysis that you should follow:

  • Step 1: Identify and describe the nature of the data that you are supposed to analyze.
  • Step 2: The next step is to establish a relation between the data analyzed and the sample population to which the data belongs. 
  • Step 3: The third step is to create a model that clearly presents and summarizes the relationship between the population and the data.
  • Step 4: Validate the model by testing whether it accurately represents the data.
  • Step 5: Use predictive analysis to predict future trends and events likely to happen. 

Although there are various methods used to perform data analysis, given below are the 5 most used and popular methods of statistical analysis:

Mean

Mean, or average, is one of the most popular methods of statistical analysis. The mean determines the overall trend of the data and is very simple to calculate: sum the numbers in the data set and divide by the number of data points. Despite its ease of calculation and its benefits, it is not advisable to rely on the mean as the only statistical indicator, as doing so can result in inaccurate decision-making.

Standard Deviation

Standard deviation is another very widely used statistical tool or method. It analyzes the deviation of different data points from the mean of the entire data set. It determines how the data in a data set is spread around the mean. You can use it to decide whether the research outcomes can be generalized or not.

Regression

Regression is a statistical tool that helps determine the relationship between a dependent variable and one or more independent variables. It is generally used to predict future trends and events.

Hypothesis Testing

Hypothesis testing can be used to test the validity of a conclusion or argument against a data set. The hypothesis is an assumption made at the beginning of the research, and it may be supported or rejected based on the analysis results.

Sample Size Determination

Sample size determination or data sampling is a technique used to derive a sample from the entire population, which is representative of the population. This method is used when the size of the population is very large. You can choose from among the various data sampling techniques such as snowball sampling, convenience sampling, and random sampling. 
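As a quick illustration, a simple random sample can be drawn in Python with the standard library; the population and sample size below are hypothetical:

```python
import random

# Hypothetical population of 10,000 customer IDs
population = list(range(1, 10001))

# Draw a simple random sample of 400 customers without replacement
sample = random.sample(population, k=400)

print(len(sample), sample[:5])
```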

Not everyone can perform very complex statistical calculations accurately, which makes statistical analysis a time-consuming and costly process. Statistical software has therefore become a very important tool for companies performing data analysis. The software uses Artificial Intelligence and Machine Learning to perform complex calculations, identify trends and patterns, and create charts, graphs, and tables accurately within minutes.

Look at the standard deviation sample calculation given below to understand more about statistical analysis.

The weights of 5 pizza bases are as follows: 9, 2, 5, 4, and 12.

Calculation of Mean = (9+2+5+4+12)/5 = 32/5 = 6.4

Calculation of the mean of squared deviations (variance) = (6.76+19.36+1.96+5.76+31.36)/5 = 13.04

Variance = 13.04

Standard deviation = √13.04 ≈ 3.611
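The same calculation can be reproduced in a few lines of Python, assuming the five weights listed above:

```python
# Weights of the 5 pizza bases from the example above
weights = [9, 2, 5, 4, 12]

mean = sum(weights) / len(weights)                       # 6.4
squared_deviations = [(w - mean) ** 2 for w in weights]  # 6.76, 19.36, 1.96, 5.76, 31.36
variance = sum(squared_deviations) / len(weights)        # 13.04
std_dev = variance ** 0.5                                # ~3.611

print(mean, variance, round(std_dev, 3))
```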

A Statistical Analyst's career path is determined by the industry in which they work. Aspiring Data Analysts can often enter the profession and qualify for entry-level Data Analyst positions right out of a certificate program, and many hold a Bachelor's degree in statistics, computer science, or mathematics. Others move into data analysis from a related sector such as business, economics, or the social sciences, usually by updating their skills mid-career with a statistical analytics course.

Working as a Statistical Analyst is also a great way to get started in the generally more complex area of data science. A Data Scientist is usually a more senior role than a Data Analyst, since it is more strategic in nature and requires a more highly developed set of technical abilities, such as knowledge of multiple statistical tools, programming languages, and predictive analytics models.

Aspiring Data Scientists and Statistical Analysts generally begin their careers by learning a programming language such as R or SQL. Following that, they must learn how to create databases, do basic analysis, and make visuals using applications such as Tableau. Not every Statistical Analyst will need to know how to do all of these things, but if you want to advance in your profession, you should be able to do them all.

Based on your industry and the sort of work you do, you may opt to study Python or R, become an expert at data cleaning, or focus on developing complicated statistical models.

You could also learn a little bit of everything, which might help you take on a leadership role and advance to the position of Senior Data Analyst. A Senior Statistical Analyst with vast and deep knowledge might take on a leadership role leading a team of other Statistical Analysts. Statistical Analysts with extra skill training may be able to advance to Data Scientists or other more senior data analytics positions.



We hope this article has helped you understand the importance of statistical analysis in every sphere of life. Artificial Intelligence (AI) can help you perform statistical analysis and data analysis very effectively and efficiently.


Statistical Methods for Data Analysis: A Comprehensive Guide

In today’s data-driven world, understanding statistical methods for data analysis is like having a superpower.

Whether you’re a student, a professional, or just a curious mind, diving into the realm of data can unlock insights and decisions that propel success.

Statistical methods for data analysis are the tools and techniques used to collect, analyze, interpret, and present data in a meaningful way.

From businesses optimizing operations to researchers uncovering new discoveries, these methods are foundational to making informed decisions based on data.

In this blog post, we’ll embark on a journey through the fascinating world of statistical analysis, exploring its key concepts, methodologies, and applications.

Introduction to Statistical Methods

At its core, statistical methods are the backbone of data analysis, helping us make sense of numbers and patterns in the world around us.

Whether you’re looking at sales figures, medical research, or even your fitness tracker’s data, statistical methods are what turn raw data into useful insights.

But before we dive into complex formulas and tests, let’s start with the basics.

Data comes in two main types: qualitative and quantitative data.


Quantitative data is all about numbers and quantities (like your height or the number of steps you walked today), while qualitative data deals with categories and qualities (like your favorite color or the breed of your dog).

And when we talk about measuring these data points, we use different scales like nominal, ordinal, interval, and ratio.

These scales help us understand the nature of our data—whether we’re ranking it (ordinal), simply categorizing it (nominal), or measuring it with a true zero point (ratio).


In a nutshell, statistical methods start with understanding the type and scale of your data.

This foundational knowledge sets the stage for everything from summarizing your data to making complex predictions.

Descriptive Statistics: Simplifying Data


Imagine you’re at a party and you meet a bunch of new people.

When you go home, your roommate asks, “So, what were they like?” You could describe each person in detail, but instead, you give a summary: “Most were college students, around 20-25 years old, pretty fun crowd!”

That’s essentially what descriptive statistics does for data.

It summarizes and describes the main features of a collection of data in an easy-to-understand way. Let’s break this down further.

The Basics: Mean, Median, and Mode

  • Mean is just a fancy term for the average. If you add up everyone’s age at the party and divide by the number of people, you’ve got your mean age.
  • Median is the middle number in a sorted list. If you line up everyone from the youngest to the oldest and pick the person in the middle, their age is your median. This is super handy when someone’s age is way off the chart (like if your grandma crashed the party), as it doesn’t skew the data.
  • Mode is the most common age at the party. If you notice a lot of people are 22, then 22 is your mode. It’s like the age that wins the popularity contest.

Spreading the News: Range, Variance, and Standard Deviation

  • Range gives you an idea of how spread out the ages are. It’s the difference between the oldest and the youngest. A small range means everyone’s around the same age, while a big range means a wider variety.
  • Variance is a bit more complex. It measures how much the ages differ from the average age. A higher variance means ages are more spread out.
  • Standard Deviation is the square root of variance. It’s like variance but back on a scale that makes sense. It tells you, on average, how far each person’s age is from the mean age.
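Here is a minimal sketch of these measures using Python's built-in statistics module and a hypothetical list of guest ages:

```python
import statistics

ages = [22, 22, 23, 25, 21, 30, 22, 24]  # hypothetical guest ages

print("Mean:", statistics.mean(ages))
print("Median:", statistics.median(ages))
print("Mode:", statistics.mode(ages))
print("Range:", max(ages) - min(ages))
print("Variance:", statistics.variance(ages))   # sample variance
print("Std dev:", statistics.stdev(ages))       # sample standard deviation
```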

Picture Perfect: Graphical Representations

  • Histograms are like bar charts showing how many people fall into different age groups. They give you a quick glance at how ages are distributed.
  • Bar Charts are great for comparing different categories, like how many men vs. women were at the party.
  • Box Plots (or box-and-whisker plots) show you the median, the range, and if there are any outliers (like grandma).
  • Scatter Plots are used when you want to see if there’s a relationship between two things, like if bringing more snacks means people stay longer at the party.
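If you want to try a couple of these plots yourself, here is a small matplotlib sketch (the ages are again hypothetical):

```python
import matplotlib.pyplot as plt

ages = [22, 22, 23, 25, 21, 30, 22, 24, 23, 26]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

ax1.hist(ages, bins=5)    # histogram of age groups
ax1.set_title("Age distribution")

ax2.boxplot(ages)         # box plot showing median, spread, and outliers
ax2.set_title("Age box plot")

plt.show()
```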

Why Descriptive Statistics Matter

Descriptive statistics are your first step in data analysis.

They help you understand your data at a glance and prepare you for deeper analysis.

Without them, you’re like someone trying to guess what a party was like without any context.

Whether you’re looking at survey responses, test scores, or party attendees, descriptive statistics give you the tools to summarize and describe your data in a way that’s easy to grasp.


Remember, the goal of descriptive statistics is to simplify the complex.

Inferential Statistics: Beyond the Basics


Let’s keep the party analogy rolling, but this time, imagine you couldn’t attend the party yourself.

You’re curious if the party was as fun as everyone said it would be.

Instead of asking every single attendee, you decide to ask a few friends who went.

Based on their experiences, you try to infer what the entire party was like.

This is essentially what inferential statistics does with data.

It allows you to make predictions or draw conclusions about a larger group (the population) based on a smaller group (a sample). Let’s dive into how this works.

Probability

Inferential statistics is all about playing the odds.

When you make an inference, you’re saying, “Based on my sample, there’s a certain probability that my conclusion about the whole population is correct.”

It’s like betting on whether the party was fun, based on a few friends’ opinions.

The Central Limit Theorem (CLT)

The Central Limit Theorem is the superhero of statistics.

It tells us that if you take enough samples from a population, the sample means (averages) will form a normal distribution (a bell curve), no matter what the population distribution looks like.

This is crucial because it allows us to use sample data to make inferences about the population mean with a known level of uncertainty.
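You can see the Central Limit Theorem in action with a small simulation; the skewed "population" below is made up for illustration:

```python
import random
import statistics

# A deliberately skewed population (exponential-like "fun scores")
population = [random.expovariate(1 / 5) for _ in range(100_000)]

# Take many samples of size 50 and record each sample mean
sample_means = [statistics.mean(random.sample(population, 50)) for _ in range(2_000)]

# The sample means cluster symmetrically around the population mean,
# even though the population itself is skewed.
print("Population mean:", round(statistics.mean(population), 2))
print("Mean of sample means:", round(statistics.mean(sample_means), 2))
```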

Confidence Intervals

Imagine you’re pretty sure the party was fun, but you want to know how fun.

A confidence interval gives you a range of values within which you believe the true mean fun level of the party lies.

It’s like saying, “I’m 95% confident the party’s fun rating was between 7 and 9 out of 10.”
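A rough sketch of a 95% confidence interval for a mean, using hypothetical fun ratings and the normal approximation (a t-critical value would be more precise for small samples):

```python
import math
import statistics

ratings = [8, 7, 9, 8, 7, 9, 8, 6, 9, 8]  # hypothetical fun ratings from friends

mean = statistics.mean(ratings)
sem = statistics.stdev(ratings) / math.sqrt(len(ratings))  # standard error of the mean

low, high = mean - 1.96 * sem, mean + 1.96 * sem
print(f"95% CI for the mean fun rating: ({low:.2f}, {high:.2f})")
```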

Hypothesis Testing

This is where you get to be a bit of a detective. You start with a hypothesis (a guess) about the population.

For example, your null hypothesis might be “the party was average fun.” Then you use your sample data to test this hypothesis.

If the data strongly suggests otherwise, you might reject the null hypothesis and accept the alternative hypothesis, which could be “the party was super fun.”

The p-value tells you how likely it is that your data would have occurred by random chance if the null hypothesis were true.

A low p-value (typically less than 0.05) indicates that your findings are significant—that is, unlikely to have happened by chance.

It’s like saying, “The chance that all my friends are exaggerating about the party being fun is really low, so the party probably was fun.”
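Putting hypothesis testing and the p-value together, here is a minimal sketch with SciPy, using made-up ratings and a null hypothesis that the true mean fun level is 5:

```python
from scipy import stats

ratings = [8, 7, 9, 6, 8, 7, 9, 8]  # hypothetical attendee ratings

# One-sample t-test against the null hypothesis "mean fun level = 5"
t_stat, p_value = stats.ttest_1samp(ratings, popmean=5)
print("t =", round(t_stat, 2), "p =", round(p_value, 4))

if p_value < 0.05:
    print("Reject the null hypothesis: the party was likely more fun than average.")
else:
    print("Not enough evidence to reject the null hypothesis.")
```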

Why Inferential Statistics Matter

Inferential statistics let us go beyond just describing our data.

They allow us to make educated guesses about a larger population based on a sample.

This is incredibly useful in almost every field—science, business, public health, and yes, even planning your next party.

By using probability, the Central Limit Theorem, confidence intervals, hypothesis testing, and p-values, we can make informed decisions without needing to ask every single person in the population.

It saves time, resources, and helps us understand the world more scientifically.

Remember, while inferential statistics gives us powerful tools for making predictions, those predictions come with a level of uncertainty.

Being a good data scientist means understanding and communicating that uncertainty clearly.

So next time you hear about a party you missed, use inferential statistics to figure out just how much FOMO (fear of missing out) you should really feel!

Common Statistical Tests: Choosing Your Data’s Best Friend


Alright, now that we’ve covered the basics of descriptive and inferential statistics, it’s time to talk about how we actually apply these concepts to make sense of data.

It’s like deciding on the best way to find out who was the life of the party.

You have several tools (tests) at your disposal, and choosing the right one depends on what you’re trying to find out and the type of data you have.

Let’s explore some of the most common statistical tests and when to use them.

T-Tests: Comparing Averages

Imagine you want to know if the average fun level was higher at this year’s party compared to last year’s.

A t-test helps you compare the means (averages) of two groups to see if they’re statistically different.

There are a couple of flavors:

  • Independent t-test: Use this when comparing two different groups, like this year’s party vs. last year’s party.
  • Paired t-test: Use this when comparing the same group at two different times or under two different conditions, like if you measured everyone’s fun level before and after the party.
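A quick sketch of both flavors with SciPy (all ratings are hypothetical):

```python
from scipy import stats

last_year = [6, 7, 5, 8, 6, 7]   # fun ratings, last year's party
this_year = [8, 9, 7, 8, 9, 8]   # fun ratings, this year's party

# Independent t-test: two different groups of attendees
t_ind, p_ind = stats.ttest_ind(this_year, last_year)
print("Independent t-test:", round(t_ind, 2), round(p_ind, 4))

# Paired t-test: the same guests measured before and after the party
before = [5, 6, 5, 7, 6]
after = [8, 8, 7, 9, 8]
t_rel, p_rel = stats.ttest_rel(after, before)
print("Paired t-test:", round(t_rel, 2), round(p_rel, 4))
```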

ANOVA: When Three’s Not a Crowd

But what if you had three or more parties to compare? That’s where ANOVA (Analysis of Variance) comes in handy.

It lets you compare the means across multiple groups at once to see if at least one of them is significantly different.

It’s like comparing the fun levels across several years’ parties to see if one year stood out.
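With SciPy, a one-way ANOVA across three hypothetical parties looks like this:

```python
from scipy import stats

party_2022 = [6, 7, 5, 8, 6]
party_2023 = [7, 8, 7, 9, 8]
party_2024 = [9, 8, 9, 9, 10]

# One-way ANOVA: is at least one year's mean fun level different?
f_stat, p_value = stats.f_oneway(party_2022, party_2023, party_2024)
print("F =", round(f_stat, 2), "p =", round(p_value, 4))
```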

Chi-Square Test: Categorically Speaking

Now, let’s say you’re interested in whether the type of music (pop, rock, electronic) affects party attendance.

Since you’re dealing with categories (types of music) and counts (number of attendees), you’ll use the Chi-Square test.

It’s great for seeing if there’s a relationship between two categorical variables.
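A chi-square test of independence on a small, made-up contingency table of music type versus attendance:

```python
from scipy import stats

# Rows: pop, rock, electronic; columns: attended, did not attend
observed = [
    [40, 10],
    [30, 20],
    [20, 30],
]

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print("chi2 =", round(chi2, 2), "p =", round(p_value, 4), "dof =", dof)
```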

Correlation and Regression: Finding Relationships

What if you suspect that the amount of snacks available at the party affects how long guests stay? To explore this, you’d use:

  • Correlation analysis to see if there’s a relationship between two continuous variables (like snacks and party duration). It tells you how closely related two things are.
  • Regression analysis goes a step further by not only showing if there’s a relationship but also how one variable predicts the other. It’s like saying, “For every extra bag of chips, guests stay an average of 10 minutes longer.”
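Both ideas in a short SciPy sketch, with hypothetical snack and duration data:

```python
from scipy import stats

snacks = [2, 3, 4, 5, 6, 8]              # bags of chips
duration = [60, 75, 80, 95, 100, 120]    # minutes guests stayed

# Correlation: how closely the two variables move together
r, p_value = stats.pearsonr(snacks, duration)
print("Pearson r =", round(r, 2), "p =", round(p_value, 4))

# Simple linear regression: predict duration from snacks
fit = stats.linregress(snacks, duration)
print("Each extra bag of chips adds roughly", round(fit.slope, 1), "minutes")
```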

Non-parametric Tests: When Assumptions Don’t Hold

All the tests mentioned above assume your data follows a normal distribution and meets other criteria.

But what if your data doesn’t play by these rules?

Enter non-parametric tests, like the Mann-Whitney U test (for comparing two groups when you can’t use a t-test) or the Kruskal-Wallis test (like ANOVA but for non-normal distributions).

Picking the Right Test

Choosing the right statistical test is crucial and depends on:

  • The type of data you have (categorical vs. continuous).
  • Whether you’re comparing groups or looking for relationships.
  • The distribution of your data (normal vs. non-normal).

Why These Tests Matter

Just like you’d pick the right tool for a job, selecting the appropriate statistical test helps you make valid and reliable conclusions about your data.

Whether you’re trying to prove a point, make a decision, or just understand the world a bit better, these tests are your gateway to insights.

By mastering these tests, you become a detective in the world of data, ready to uncover the truth behind the numbers!

Regression Analysis: Predicting the Future


Ever wondered if you could predict how much fun you’re going to have at a party based on the number of friends going, or how the amount of snacks available might affect the overall party vibe?

That’s where regression analysis comes into play, acting like a crystal ball for your data.

What is Regression Analysis?

Regression analysis is a powerful statistical method that allows you to examine the relationship between two or more variables of interest.

Think of it as detective work, where you’re trying to figure out if, how, and to what extent certain factors (like snacks and music volume) predict an outcome (like the fun level at a party).

The Two Main Characters: Independent and Dependent Variables

  • Independent Variable(s): These are the predictors or factors that you suspect might influence the outcome. For example, the quantity of snacks.
  • Dependent Variable: This is the outcome you’re interested in predicting. In our case, it could be the fun level of the party.

Linear Regression: The Straight Line Relationship

The most basic form of regression analysis is linear regression.

It predicts the outcome based on a linear relationship between the independent and dependent variables.

If you plot this on a graph, you’d ideally see a straight line where, as the amount of snacks increases, so does the fun level (hopefully!).

  • Simple Linear Regression involves just one independent variable. It’s like saying, “Let’s see if just the number of snacks can predict the fun level.”
  • Multiple Linear Regression takes it up a notch by including more than one independent variable. Now, you’re looking at whether the quantity of snacks, type of music, and number of guests together can predict the fun level.
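A sketch of multiple linear regression using scikit-learn, one library option among several; the party data below is invented:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Predictors: [bags of snacks, number of guests]; outcome: fun level (1-10)
X = np.array([[2, 10], [3, 12], [4, 15], [5, 20], [6, 25]])
y = np.array([5, 6, 7, 8, 9])

model = LinearRegression().fit(X, y)
print("Coefficients:", model.coef_)        # change in fun per unit of each predictor
print("Intercept:", model.intercept_)
print("R-squared:", model.score(X, y))     # variance in fun explained by the model
print("Predicted fun for 4 bags, 18 guests:", model.predict([[4, 18]]))
```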

Logistic Regression: When Outcomes are Either/Or

Not all predictions are about numbers.

Sometimes, you just want to know if something will happen or not—will the party be a hit or a flop?

Logistic regression is used for these binary outcomes.

Instead of predicting a precise fun level, it predicts the probability of the party being a hit based on the same predictors (snacks, music, guests).
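And the binary case, again with scikit-learn and purely illustrative data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Predictors: [bags of snacks, number of guests]; outcome: 1 = hit, 0 = flop
X = np.array([[1, 5], [2, 8], [3, 12], [5, 20], [6, 25], [7, 30]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)
prob_hit = model.predict_proba([[4, 15]])[0, 1]
print("Probability the party is a hit:", round(prob_hit, 2))
```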

Making Sense of the Results

  • Coefficients: In regression analysis, each predictor has a coefficient, telling you how much the dependent variable is expected to change when that predictor changes by one unit, all else being equal.
  • R-squared: This value tells you how much of the variation in your dependent variable can be explained by the independent variables. A higher R-squared means a better fit between your model and the data.

Why Regression Analysis Rocks

Regression analysis is like having a superpower. It helps you understand which factors matter most, which can be ignored, and how different factors come together to influence the outcome.

This insight is invaluable whether you’re planning a party, running a business, or conducting scientific research.

Bringing It All Together

Imagine you’ve gathered data on several parties, including the number of guests, type of music, and amount of snacks, along with a fun level rating for each.

By running a regression analysis, you can start to predict future parties’ success, tailoring your planning to maximize fun.

It’s a practical tool for making informed decisions based on past data, helping you throw legendary parties, optimize business strategies, or understand complex relationships in your research.

In essence, regression analysis helps turn your data into actionable insights, guiding you towards smarter decisions and better predictions.

So next time you’re knee-deep in data, remember: regression analysis might just be the key to unlocking its secrets.

Non-parametric Methods: Playing By Different Rules

So far, we’ve talked a lot about statistical methods that rely on certain assumptions about your data, like it being normally distributed (forming that classic bell curve) or having a specific scale of measurement.

But what happens when your data doesn’t fit these molds?

Maybe the scores from your last party’s karaoke contest are all over the place, or you’re trying to compare the popularity of various party games but only have rankings, not scores.

This is where non-parametric methods come to the rescue.

Breaking Free from Assumptions

Non-parametric methods are the rebels of the statistical world.

They don’t assume your data follows a normal distribution or that it meets strict requirements regarding measurement scales.

These methods are perfect for dealing with ordinal data (like rankings), nominal data (like categories), or when your data is skewed or has outliers that would throw off other tests.

When to Use Non-parametric Methods?

  • Your data is not normally distributed, and transformations don’t help.
  • You have ordinal data (like survey responses that range from “Strongly Disagree” to “Strongly Agree”).
  • You’re dealing with ranks or categories rather than precise measurements.
  • Your sample size is small, making it hard to meet the assumptions required for parametric tests.

Some Popular Non-parametric Tests

  • Mann-Whitney U Test: Think of it as the non-parametric counterpart to the independent samples t-test. Use this when you want to compare the differences between two independent groups on a ranking or ordinal scale.
  • Kruskal-Wallis Test: This is your go-to when you have three or more groups to compare, and it’s similar to an ANOVA but for ranked/ordinal data or when your data doesn’t meet ANOVA’s assumptions.
  • Spearman’s Rank Correlation: When you want to see if there’s a relationship between two sets of rankings, Spearman’s got your back. It’s like Pearson’s correlation for continuous data but designed for ranks.
  • Wilcoxon Signed-Rank Test: Use this for comparing two related samples when you can’t use the paired t-test, typically because the differences between pairs are not normally distributed.
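Two of these tests in a short SciPy sketch, using hypothetical ordinal satisfaction scores:

```python
from scipy import stats

# Satisfaction rankings (1-5) for two party themes
theme_a = [3, 4, 2, 5, 4, 3]
theme_b = [5, 4, 5, 5, 4, 5]

u_stat, p_value = stats.mannwhitneyu(theme_a, theme_b)
print("Mann-Whitney U:", u_stat, "p =", round(p_value, 4))

# Spearman's rank correlation between two sets of rankings
snack_rank = [1, 2, 3, 4, 5, 6]
fun_rank = [2, 1, 3, 5, 4, 6]
rho, p_value = stats.spearmanr(snack_rank, fun_rank)
print("Spearman's rho:", round(rho, 2), "p =", round(p_value, 4))
```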

The Beauty of Flexibility

The real charm of non-parametric methods is their flexibility.

They let you work with data that’s not textbook perfect, which is often the case in the real world.

Whether you’re analyzing customer satisfaction surveys, comparing the effectiveness of different marketing strategies, or just trying to figure out if people prefer pizza or tacos at parties, non-parametric tests provide a robust way to get meaningful insights.

Keeping It Real

It’s important to remember that while non-parametric methods are incredibly useful, they also come with their own limitations.

They might be more conservative, meaning you might need a larger effect to detect a significant result compared to parametric tests.

Plus, because they often work with ranks rather than actual values, some information about your data might get lost in translation.

Non-parametric methods are your statistical toolbox’s Swiss Army knife, ready to tackle data that doesn’t fit into the neat categories required by more traditional tests.

They remind us that in the world of data analysis, there’s more than one way to uncover insights and make informed decisions.

So, the next time you’re faced with skewed distributions or rankings instead of scores, remember that non-parametric methods have got you covered, offering a way to navigate the complexities of real-world data.

Data Cleaning and Preparation: The Unsung Heroes of Data Analysis

Before any party can start, there’s always a bit of housecleaning to do—sweeping the floors, arranging the furniture, and maybe even hiding those laundry piles you’ve been ignoring all week.

Similarly, in the world of data analysis, before we can dive into the fun stuff like statistical tests and predictive modeling, we need to roll up our sleeves and get our data nice and tidy.

This process of data cleaning and preparation might not be the most glamorous part of data science, but it’s absolutely critical.

Let’s break down what this involves and why it’s so important.

Why Clean and Prepare Data?

Imagine trying to analyze party RSVPs when half the responses are “yes,” a quarter are “Y,” and the rest are a creative mix of “yup,” “sure,” and “why not?”

Without standardization, it’s hard to get a clear picture of how many guests to expect.

The same goes for any data set. Cleaning ensures that your data is consistent, accurate, and ready for analysis.

Preparation involves transforming this clean data into a format that’s useful for your specific analysis needs.

The Steps to Sparkling Clean Data

  • Dealing with Missing Values: Sometimes, data is incomplete. Maybe a survey respondent skipped a question, or a sensor failed to record a reading. You’ll need to decide whether to fill in these gaps (imputation), ignore them, or drop the observations altogether.
  • Identifying and Handling Outliers: Outliers are data points that are significantly different from the rest. They might be errors, or they might be valuable insights. The challenge is determining which is which and deciding how to handle them—remove, adjust, or analyze separately.
  • Correcting Inconsistencies: This is like making sure all your RSVPs are in the same format. It could involve standardizing text entries, correcting typos, or converting all measurements to the same units.
  • Formatting Data: Your analysis might require data in a specific format. This could mean transforming data types (e.g., converting dates into a uniform format) or restructuring data tables to make them easier to work with.
  • Reducing Dimensionality: Sometimes, your data set might have more information than you actually need. Reducing dimensionality (through methods like Principal Component Analysis) can help simplify your data without losing valuable information.
  • Creating New Variables: You might need to derive new variables from your existing ones to better capture the relationships in your data. For example, turning raw survey responses into a numerical satisfaction score.
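A small pandas sketch of a few of these steps, with a made-up RSVP table:

```python
import pandas as pd

# Hypothetical RSVP data with inconsistent entries and missing values
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cleo", "Dev"],
    "rsvp": ["yes", "Y", "yup", None],
    "age":  [22, 25, None, 30],
})

# Correct inconsistencies: standardize all RSVP text to "yes"/"no"
df["rsvp"] = df["rsvp"].str.lower().map({"yes": "yes", "y": "yes", "yup": "yes", "no": "no"})

# Deal with missing values: fill missing ages with the median, drop rows with no RSVP
df["age"] = df["age"].fillna(df["age"].median())
df = df.dropna(subset=["rsvp"])

print(df)
```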

The Tools of the Trade

There are many tools available to help with data cleaning and preparation, ranging from spreadsheet software like Excel to programming languages like Python and R.

These tools offer functions and libraries specifically designed to make data cleaning as painless as possible.

Why It Matters

Skipping the data cleaning and preparation stage is like trying to cook without prepping your ingredients first.

Sure, you might end up with something edible, but it’s not going to be as good as it could have been.

Clean and well-prepared data leads to more accurate, reliable, and meaningful analysis results.

It’s the foundation upon which all good data analysis is built.

Data cleaning and preparation might not be the flashiest part of data science, but it’s where all successful data analysis projects begin.

By taking the time to thoroughly clean and prepare your data, you’re setting yourself up for clearer insights, better decisions, and, ultimately, more impactful outcomes.

Software Tools for Statistical Analysis: Your Digital Assistants

Diving into the world of data without the right tools can feel like trying to cook a gourmet meal without a kitchen.

Just as you need pots, pans, and a stove to create a culinary masterpiece, you need the right software tools to analyze data and uncover the insights hidden within.

These digital assistants range from user-friendly applications for beginners to powerful suites for the pros.

Let’s take a closer look at some of the most popular software tools for statistical analysis.

R and RStudio: The Dynamic Duo

  • R is like the Swiss Army knife of statistical analysis. It’s a programming language designed specifically for data analysis, graphics, and statistical modeling. Think of R as the kitchen where you’ll be cooking up your data analysis.
  • RStudio is an integrated development environment (IDE) for R. It’s like having the best kitchen setup with organized countertops (your coding space) and all your tools and ingredients within reach (packages and datasets).

Why They Rock:

R is incredibly powerful and can handle almost any data analysis task you throw at it, from the basics to the most advanced statistical models.

Plus, there’s a vast community of users, which means a wealth of tutorials, forums, and free packages to add on.

Python with pandas and scipy: The Versatile Virtuoso

  • Python is not just for programming; with the right libraries, it becomes an excellent tool for data analysis. It’s like a kitchen that’s not only great for baking but also equipped for gourmet cooking.
  • pandas is a library that provides easy-to-use data structures and data analysis tools for Python. Imagine it as your sous-chef, helping you to slice and dice data with ease.
  • scipy is another library used for scientific and technical computing. It’s like having a set of precision knives for the more intricate tasks.

Why They Rock: Python is known for its readability and simplicity, making it accessible for beginners. When combined with pandas and scipy, it becomes a powerhouse for data manipulation, analysis, and visualization.

SPSS: The Point-and-Click Professional

SPSS (Statistical Package for the Social Sciences) is a software package used for interactive, or batched, statistical analysis. Long produced by SPSS Inc., it was acquired by IBM in 2009.

Why It Rocks: SPSS is particularly user-friendly with its point-and-click interface, making it a favorite among non-programmers and researchers in the social sciences. It’s like having a kitchen gadget that does the job with the push of a button—no manual setup required.

SAS: The Corporate Chef

SAS (Statistical Analysis System) is a software suite developed for advanced analytics, multivariate analysis, business intelligence, data management, and predictive analytics.

Why It Rocks: SAS is a powerhouse in the corporate world, known for its stability, deep analytical capabilities, and support for large data sets. It’s like the industrial kitchen used by professional chefs to serve hundreds of guests.

Excel: The Accessible Apprentice

Excel might not be a specialized statistical software, but it’s widely accessible and capable of handling basic statistical analyses. Think of Excel as the microwave in your kitchen—it might not be fancy, but it gets the job done for quick and simple tasks.

Why It Rocks: Almost everyone has access to Excel and knows the basics, making it a great starting point for those new to data analysis. Plus, with add-ons like the Analysis ToolPak, Excel’s capabilities can be extended further into statistical territory.

Choosing Your Tool

Selecting the right software tool for statistical analysis is like choosing the right kitchen for your cooking style—it depends on your needs, expertise, and the complexity of your recipes (data).

Whether you’re a coding chef ready to tackle R or Python, or someone who prefers the straightforwardness of SPSS or Excel, there’s a tool out there that’s perfect for your data analysis kitchen.

Ethical Considerations


Embarking on a data analysis journey is like setting sail on the vast ocean of information.

Just as a captain needs a compass to navigate the seas safely and responsibly, a data analyst requires a strong sense of ethics to guide their exploration of data.

Ethical considerations in data analysis are the moral compass that ensures we respect privacy, consent, and integrity while uncovering the truths hidden within data. Let’s delve into why ethics are so crucial and what principles you should keep in mind.

Respect for Privacy

Imagine you’ve found a diary filled with personal secrets.

Reading it without permission would be a breach of privacy.

Similarly, when you’re handling data, especially personal or sensitive information, it’s essential to ensure that privacy is protected.

This means not only securing data against unauthorized access but also anonymizing data to prevent individuals from being identified.

Informed Consent

Before you can set sail, you need the ship owner’s permission.

In the world of data, this translates to informed consent. Participants should be fully aware of what their data will be used for and voluntarily agree to participate.

This is particularly important in research or when collecting data directly from individuals. It’s like asking for permission before you start the journey.

Data Integrity

Maintaining data integrity is like keeping the ship’s log accurate and unaltered during your voyage.

It involves ensuring the data is not corrupted or modified inappropriately and that any data analysis is conducted accurately and reliably.

Tampering with data or cherry-picking results to fit a narrative is not just unethical—it’s like falsifying the ship’s log, leading to mistrust and potentially dangerous outcomes.

Avoiding Bias

The sea is vast, and your compass must be calibrated correctly to avoid going off course. Similarly, avoiding bias in data analysis ensures your findings are valid and unbiased.

This means being aware of and actively addressing any personal, cultural, or statistical biases that might skew your analysis.

It’s about striving for objectivity and ensuring your journey is guided by truth, not preconceived notions.

Transparency and Accountability

A trustworthy captain is open about their navigational choices and ready to take responsibility for them.

In data analysis, this translates to transparency about your methods and accountability for your conclusions.

Sharing your methodologies, data sources, and any limitations of your analysis helps build trust and allows others to verify or challenge your findings.

Ethical Use of Findings

Finally, just as a captain must consider the impact of their journey on the wider world, you must consider how your data analysis will be used.

This means thinking about the potential consequences of your findings and striving to ensure they are used to benefit, not harm, society.

It’s about being mindful of the broader implications of your work and using data for good.

Navigating with a Moral Compass

In the realm of data analysis, ethical considerations form the moral compass that guides us through complex moral waters.

They ensure that our work respects individuals’ rights, contributes positively to society, and upholds the highest standards of integrity and professionalism.

Just as a captain navigates the seas with respect for the ocean and its dangers, a data analyst must navigate the world of data with a deep commitment to ethical principles.

This commitment ensures that the insights gained from data analysis serve to enlighten and improve, rather than exploit or harm.

Conclusion and Key Takeaways

And there you have it—a whirlwind tour through the fascinating landscape of statistical methods for data analysis.

From the grounding principles of descriptive and inferential statistics to the nuanced details of regression analysis and beyond, we’ve explored the tools and ethical considerations that guide us in turning raw data into meaningful insights.

The Takeaway

Think of data analysis as embarking on a grand adventure, one where numbers and facts are your map and compass.

Just as every explorer needs to understand the terrain, every aspiring data analyst must grasp these foundational concepts.

Whether it’s summarizing data sets with descriptive statistics, making predictions with inferential statistics, choosing the right statistical test, or navigating the ethical considerations that ensure our analyses benefit society, each aspect is a crucial step on your journey.

The Importance of Preparation

Remember, the key to a successful voyage is preparation.

Cleaning and preparing your data sets the stage for a smooth journey, while choosing the right software tools ensures you have the best equipment at your disposal.

And just as every responsible navigator respects the sea, every data analyst must navigate the ethical dimensions of their work with care and integrity.

Charting Your Course

As you embark on your own data analysis adventures, remember that the path you chart is unique to you.

Your questions will guide your journey, your curiosity will fuel your exploration, and the insights you gain will be your treasure.

The world of data is vast and full of mysteries waiting to be uncovered. With the tools and principles we’ve discussed, you’re well-equipped to start uncovering those mysteries, one data set at a time.

The Journey Ahead

The journey of statistical methods for data analysis is ongoing, and the landscape is ever-evolving.

As new methods emerge and our understanding deepens, there will always be new horizons to explore and new insights to discover.

But the fundamentals we’ve covered will remain your steadfast guide, helping you navigate the challenges and opportunities that lie ahead.

So set your sights on the questions that spark your curiosity, arm yourself with the tools of the trade, and embark on your data analysis journey with confidence.


What Is Statistical Analysis?


Statistical analysis is a technique we use to find patterns in data and make inferences about those patterns to describe variability in the results of a data set or an experiment. 

In its simplest form, statistical analysis answers questions about:

  • Quantification — how big/small/tall/wide is it?
  • Variability — growth, increase, decline
  • The confidence level of these variabilities

What Are the 2 Types of Statistical Analysis?

  • Descriptive Statistics:  Descriptive statistical analysis describes the quality of the data by summarizing large data sets into single measures. 
  • Inferential Statistics:  Inferential statistical analysis allows you to draw conclusions from your sample data set and make predictions about a population using statistical tests.

What’s the Purpose of Statistical Analysis?

Using statistical analysis, you can determine trends in the data by calculating your data set’s mean or median. You can also analyze the variation between different data points from the mean to get the standard deviation. Furthermore, to test the validity of your statistical analysis conclusions, you can use hypothesis testing techniques and examine the p-value to determine the likelihood that the observed variability could have occurred by chance.


Statistical Analysis Methods

There are two major types of statistical data analysis: descriptive and inferential. 

Descriptive Statistical Analysis

Descriptive statistical analysis describes the quality of the data by summarizing large data sets into single measures. 

Within the descriptive analysis branch, there are two main types: measures of central tendency (i.e. mean, median and mode) and measures of dispersion or variation (i.e. variance, standard deviation and range).

For example, you can calculate the average exam results in a class using central tendency or, in particular, the mean. In that case, you’d sum all student results and divide by the number of tests. You can also calculate the data set’s spread by calculating the variance. To calculate the variance, subtract each exam result in the data set from the mean, square the answer, add everything together and divide by the number of tests.
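For example, the class-average calculation just described can be written directly in Python (the exam scores are invented):

```python
results = [70, 80, 90, 65, 95]  # hypothetical exam scores

mean = sum(results) / len(results)
variance = sum((r - mean) ** 2 for r in results) / len(results)

print("Mean:", mean)          # central tendency
print("Variance:", variance)  # spread around the mean
```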

Inferential Statistics

On the other hand, inferential statistical analysis allows you to draw conclusions from your sample data set and make predictions about a population using statistical tests. 

There are two main types of inferential statistical analysis: hypothesis testing and regression analysis. We use hypothesis testing to test and validate assumptions in order to draw conclusions about a population from the sample data. Popular tests include the Z-test, F-test, ANOVA test and confidence intervals. On the other hand, regression analysis primarily estimates the relationship between a dependent variable and one or more independent variables. There are numerous types of regression analysis, but the most popular ones include linear and logistic regression.

Statistical Analysis Steps  

In the era of big data and data science, there is a rising demand for a more problem-driven approach. As a result, we must approach statistical analysis holistically. We may divide the entire process into five different and significant stages by using the well-known PPDAC model of statistics: Problem, Plan, Data, Analysis and Conclusion.

[Figure: the statistical cycle shown as a five-step circle corresponding to the PPDAC stages: Problem, Plan, Data, Analysis and Conclusion.]

1. Problem

In the first stage, you define the problem you want to tackle and explore questions about the problem.

2. Plan

Next is the planning phase. You can check whether data is available or if you need to collect data for your problem. You also determine what to measure and how to measure it.

3. Data

The third stage involves data collection, understanding the data and checking its quality.

4. Analysis

Statistical data analysis is the fourth stage. Here you process and explore the data with the help of tables, graphs and other data visualizations.  You also develop and scrutinize your hypothesis in this stage of analysis. 

5. Conclusion

The final step involves interpretations and conclusions from your analysis. It also covers generating new ideas for the next iteration. Thus, statistical analysis is not a one-time event but an iterative process.

Statistical Analysis Uses

Statistical analysis is useful for research and decision making because it allows us to understand the world around us and draw conclusions by testing our assumptions. Statistical analysis is important for various applications, including:

  • Statistical quality control and analysis in product development 
  • Clinical trials
  • Customer satisfaction surveys and customer experience research 
  • Marketing operations management
  • Process improvement and optimization
  • Training needs 


Benefits of Statistical Analysis

Here are some of the reasons why statistical analysis is widespread in many applications and why it’s necessary:

Understand Data

Statistical analysis gives you a better understanding of the data and what they mean. These types of analyses provide information that would otherwise be difficult to obtain by merely looking at the numbers without considering their relationship.

Find Causal Relationships

Statistical analysis can help you investigate causation or interpret the results of an experiment precisely, such as when you’re looking for a relationship between two variables.

Make Data-Informed Decisions

Businesses are constantly looking to find ways to improve their services and products. Statistical analysis allows you to make data-informed decisions about your business or future actions by helping you identify trends in your data, whether positive or negative.

Determine Probability

Statistical analysis is an approach to understanding how the probability of certain events affects the outcome of an experiment. It helps scientists and engineers decide how much confidence they can have in the results of their research, how to interpret their data and what questions they can feasibly answer.


What Are the Risks of Statistical Analysis?

Statistical analysis can be valuable and effective, but it’s an imperfect approach. Even if the analyst or researcher performs a thorough statistical analysis, there may still be known or unknown problems that can affect the results. Therefore, statistical analysis is not a one-size-fits-all process. If you want to get good results, you need to know what you’re doing. It can take a lot of time to figure out which type of statistical analysis will work best for your situation.

Thus, you should remember that our conclusions drawn from statistical analysis don’t always guarantee correct results. This can be dangerous when making business decisions. In marketing , for example, we may come to the wrong conclusion about a product . Therefore, the conclusions we draw from statistical data analysis are often approximated; testing for all factors affecting an observation is impossible.



Introduction to Statistical Analysis: A Beginner’s Guide

Statistical analysis is a crucial component of research work across various disciplines, helping researchers derive meaningful insights from data. Whether you’re conducting scientific studies, social research, or data-driven investigations, having a solid understanding of statistical analysis is essential. In this beginner’s guide, we will explore the fundamental concepts and techniques of statistical analysis specifically tailored for research work, providing you with a strong foundation to enhance the quality and credibility of your research findings.

1. Importance of Statistical Analysis in Research:

Research aims to uncover knowledge and make informed conclusions. Statistical analysis plays a pivotal role in achieving this by providing tools and methods to analyze and interpret data accurately. It helps researchers identify patterns, test hypotheses, draw inferences, and quantify the strength of relationships between variables. Understanding the significance of statistical analysis empowers researchers to make evidence-based decisions.

2. Data Collection and Organization:

Before diving into statistical analysis, researchers must collect and organize their data effectively. We will discuss the importance of proper sampling techniques, data quality assurance, and data preprocessing. Additionally, we will explore methods to handle missing data and outliers, ensuring that your dataset is reliable and suitable for analysis.

3. Exploratory Data Analysis (EDA):

Exploratory Data Analysis is a preliminary step that involves visually exploring and summarizing the main characteristics of the data. We will cover techniques such as data visualization, descriptive statistics, and data transformations to gain insights into the distribution, central tendencies, and variability of the variables in your dataset. EDA helps researchers understand the underlying structure of the data and identify potential relationships for further investigation.

4. Statistical Inference and Hypothesis Testing:

Statistical inference allows researchers to make generalizations about a population based on a sample. We will delve into hypothesis testing, covering concepts such as null and alternative hypotheses, p-values, and significance levels. By understanding these concepts, you will be able to test your research hypotheses and determine if the observed results are statistically significant.
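As a minimal, hypothetical illustration of these ideas (Python with SciPy assumed; the scores are invented), a two-sample t-test pits a null hypothesis of equal group means against the alternative and reports a p-value to compare with a pre-chosen significance level:

```python
# Minimal sketch: hypothesis test comparing two groups (invented scores).
from scipy import stats

control = [72, 75, 71, 69, 74, 73, 70, 76]    # scores under the existing condition
treatment = [78, 74, 77, 79, 75, 80, 76, 78]  # scores under the new condition

# H0: the two group means are equal; H1: they differ.
t_stat, p_value = stats.ttest_ind(treatment, control)

alpha = 0.05  # significance level chosen before inspecting the data
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.2f}, p = {p_value:.4f} -> {decision} at alpha = {alpha}")
```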

5. Parametric and Non-parametric Tests:

Parametric and non-parametric tests are statistical techniques used to analyze data based on different assumptions about the underlying population distribution. We will explore commonly used parametric tests, such as t-tests and analysis of variance (ANOVA), as well as non-parametric tests like the Mann-Whitney U test and Kruskal-Wallis test. Understanding when to use each type of test is crucial for selecting the appropriate analysis method for your research questions.
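The short sketch below contrasts the two families on the same invented data, again assuming Python with SciPy rather than any tool prescribed here: a parametric t-test that compares means under a normality assumption, and the rank-based Mann-Whitney U test that does not require it.

```python
# Minimal sketch: parametric vs. non-parametric comparison of two groups (invented data).
from scipy import stats

group_a = [12.1, 11.8, 12.6, 12.3, 11.9, 12.4]
group_b = [13.0, 12.8, 13.5, 12.9, 13.2, 13.1]

# Parametric: assumes roughly normal data and compares means.
t_stat, p_t = stats.ttest_ind(group_a, group_b)

# Non-parametric: rank-based, no normality assumption.
u_stat, p_u = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test p = {p_t:.4f}; Mann-Whitney U p = {p_u:.4f}")
```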

6. Correlation and Regression Analysis:

Correlation and regression analysis allow researchers to explore relationships between variables and make predictions. We will cover Pearson correlation coefficients, multiple regression analysis, and logistic regression. These techniques enable researchers to quantify the strength and direction of associations and identify predictive factors in their research.
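As a small, hypothetical example (Python with SciPy assumed; the hours and scores are invented), a Pearson correlation quantifies the strength and direction of a linear association, and a simple linear regression turns it into a prediction equation:

```python
# Minimal sketch: Pearson correlation and simple linear regression (invented data).
import numpy as np
from scipy import stats

study_hours = np.array([2, 4, 5, 7, 8, 10, 11, 13])
exam_score = np.array([55, 60, 63, 70, 74, 82, 85, 90])

# Strength and direction of the linear association
r, p_value = stats.pearsonr(study_hours, exam_score)

# Simple linear regression: exam_score ~ study_hours
fit = stats.linregress(study_hours, exam_score)

print(f"r = {r:.2f} (p = {p_value:.4f})")
print(f"predicted score = {fit.intercept:.1f} + {fit.slope:.2f} * hours")
```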

7. Sample Size Determination and Power Analysis:

Sample size determination is a critical aspect of research design, as it affects the validity and reliability of your findings. We will discuss methods for estimating sample size based on statistical power analysis, ensuring that your study has sufficient statistical power to detect meaningful effects. Understanding sample size determination is essential for planning robust research studies.
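One common way to carry out such a power analysis is sketched below, assuming Python with the statsmodels package; the effect size, significance level, and target power are illustrative choices, not recommendations:

```python
# Minimal sketch: sample size per group for a two-sample t-test via power analysis.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,  # assumed standardized effect size (Cohen's d)
    alpha=0.05,       # significance level
    power=0.8,        # desired probability of detecting the effect if it exists
)
print(f"required sample size per group: about {n_per_group:.0f}")
```

A higher target power or a smaller assumed effect both drive the required sample size up.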

Conclusion:

Statistical analysis is an indispensable tool for conducting high-quality research. This beginner’s guide has provided an overview of key concepts and techniques specifically tailored for research work, enabling you to enhance the credibility and reliability of your findings. By understanding the importance of statistical analysis, collecting and organizing data effectively, performing exploratory data analysis, conducting hypothesis testing, utilizing parametric and non-parametric tests, and considering sample size determination, you will be well-equipped to carry out rigorous research and contribute valuable insights to your field. Remember, continuous learning, practice, and seeking guidance from statistical experts will further enhance your skills in statistical analysis for research.


Statistical Analysis in Research: Meaning, Methods and Types


The scientific method is an empirical approach to acquiring new knowledge by making skeptical observations and analyses to develop a meaningful interpretation. It is the basis of research and the primary pillar of modern science. Researchers seek to understand the relationships between factors associated with the phenomena of interest. In some cases, research works with vast chunks of data, making it difficult to observe or manipulate each data point. As a result, statistical analysis in research becomes a means of evaluating relationships and interconnections between variables with tools and analytical techniques for working with large data. Since researchers use statistical power analysis to assess the probability of finding an effect in such an investigation, the method is relatively accurate. Hence, statistical analysis in research simplifies analytical work by focusing on the quantifiable aspects of phenomena.

What is Statistical Analysis in Research? A Simplified Definition

Statistical analysis uses quantitative data to investigate patterns, relationships, and trends to understand real-life and simulated phenomena. The approach is a key analytical tool in various fields, including academia, business, government, and science in general. This statistical analysis in research definition implies that the primary focus of the scientific method is quantitative research. Notably, the investigator targets the constructs developed from general concepts, as researchers can quantify their hypotheses and present their findings in simple statistics.

When a business needs to learn how to improve its product, it collects statistical data about the production line and customer satisfaction. Qualitative data is valuable and often identifies the most common themes in the stakeholders’ responses. On the other hand, quantitative data creates a level of importance, comparing the themes based on their criticality to the affected persons. For instance, descriptive statistics highlight tendency, frequency, variation, and position information. While the mean shows the average number of respondents who value a certain aspect, the variance indicates the accuracy of the data. In any case, statistical analysis creates simplified concepts used to understand the phenomenon under investigation. It is also a key component in academia as the primary approach to data representation, especially in research projects, term papers and dissertations.

Most Useful Statistical Analysis Methods in Research

Using statistical analysis methods in research is inevitable, especially in academic assignments, projects, and term papers. It’s always advisable to seek assistance from your professor, or you can try research paper writing by CustomWritings before you start your academic project or write the statistical analysis section of a research paper. Consulting an expert when developing a topic for your thesis or a short mid-term assignment increases your chances of getting a better grade. Most importantly, it improves your understanding of research methods with insights on how to enhance the originality and quality of personalized essays. Professional writers can also help select the most suitable statistical analysis method for your thesis, influencing the choice of data and type of study.

Descriptive Statistics

Descriptive statistics is a statistical method of summarizing quantitative figures to understand critical details about the sample and population. A descriptive statistic is a figure that quantifies a specific aspect of the data. For instance, instead of analyzing the behavior of a thousand students, research can identify the most common actions among them. By doing this, the person utilizes statistical analysis in research, particularly descriptive statistics.

  • Measures of central tendency. Central tendency measures are the mean, median, and mode, or the averages denoting specific data points. They assess the centrality of the probability distribution, hence the name. These measures describe the data in relation to the center.
  • Measures of frequency. These statistics document the number of times an event happens. They include frequency, count, ratios, rates, and proportions. Measures of frequency can also show how often a score occurs.
  • Measures of dispersion/variation. These descriptive statistics assess the intervals between the data points. The objective is to view the spread or disparity between the specific inputs. Measures of variation include the standard deviation, variance, and range. They indicate how the spread may affect other statistics, such as the mean.
  • Measures of position. Sometimes researchers can investigate relationships between scores. Measures of position, such as percentiles, quartiles, and ranks, demonstrate this association. They are often useful when comparing the data to normalized information. A short sketch of all four families of measures follows this list.
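The compact sketch below computes one example from each family, using Python with an invented set of scores; none of this is prescribed by the article itself.

```python
# Minimal sketch: central tendency, frequency, dispersion, and position (invented scores).
import numpy as np
from statistics import mean, median, mode

scores = [70, 72, 75, 75, 78, 80, 82, 85, 90, 95]

# Central tendency
print("mean:", mean(scores), "median:", median(scores), "mode:", mode(scores))

# Frequency: how often each score occurs
values, counts = np.unique(scores, return_counts=True)
print("frequencies:", dict(zip(values.tolist(), counts.tolist())))

# Dispersion / variation
print("std:", round(np.std(scores, ddof=1), 2),
      "variance:", round(np.var(scores, ddof=1), 2),
      "range:", max(scores) - min(scores))

# Position: quartiles (25th, 50th, 75th percentiles)
print("quartiles:", np.percentile(scores, [25, 50, 75]))
```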

Inferential Statistics

Inferential statistics is critical in the statistical analysis of quantitative research. This approach uses statistical tests to draw conclusions about the population. Examples of inferential statistics include t-tests, F-tests, ANOVA, p-values, the Mann-Whitney U test, and the Wilcoxon W test.
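For instance, a one-way ANOVA (an F-test) checks whether several group means are plausibly equal. The sketch below assumes Python with SciPy and invented measurements:

```python
# Minimal sketch: one-way ANOVA comparing three groups (invented data).
from scipy import stats

method_a = [23, 25, 21, 24, 26]
method_b = [30, 28, 29, 31, 27]
method_c = [22, 24, 23, 25, 21]

# H0: all three group means are equal.
f_stat, p_value = stats.f_oneway(method_a, method_b, method_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```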

Common Statistical Analysis in Research Types

Although inferential and descriptive statistics can be classified as types of statistical analysis in research, they are mostly considered analytical methods. Types of research are distinguishable by the differences in the methodology employed in analyzing, assembling, classifying, manipulating, and interpreting data. The categories may also depend on the type of data used.

Predictive Analysis

Predictive research analyzes past and present data to assess trends and predict future events. An excellent example of predictive analysis is a market survey that seeks to understand customers’ spending habits to weigh the possibility of a repeat or future purchase. Such studies assess the likelihood of an action based on trends.

Prescriptive Analysis

On the other hand, a prescriptive analysis targets likely courses of action. It’s decision-making research designed to identify optimal solutions to a problem. Its primary objective is to test or assess alternative measures.

Causal Analysis

Causal research investigates the explanation behind the events. It explores the relationship between factors for causation. Thus, researchers use causal analyses to analyze root causes, possible problems, and unknown outcomes.

Mechanistic Analysis

This type of research investigates the mechanism of action. Instead of focusing only on the causes or possible outcomes, researchers may seek an understanding of the processes involved. In such cases, they use mechanistic analyses to document, observe, or learn the mechanisms involved.

Exploratory Data Analysis

Similarly, an exploratory study is extensive with a wider scope and minimal limitations. This type of research seeks insight into the topic of interest. An exploratory researcher does not try to generalize or predict relationships. Instead, they look for information about the subject before conducting an in-depth analysis.

The Importance of Statistical Analysis in Research

As a matter of fact, statistical analysis provides critical information for decision-making. Decision-makers require past trends and predictive assumptions to inform their actions. In most cases, the data is too complex or lacks meaningful inferences. Statistical tools for analyzing such details help save time and money, deriving only valuable information for assessment. An excellent statistical analysis in research example is a randomized control trial (RCT) for the Covid-19 vaccine. You can download a sample of such a document online to understand the significance such analyses have to the stakeholders. A vaccine RCT assesses the effectiveness, side effects, duration of protection, and other benefits. Hence, statistical analysis in research is a helpful tool for understanding data.


Effective Use of Statistics in Research – Methods and Tools for Data Analysis


Remember that impending feeling you get when you are asked to analyze your data! Now that you have all the required raw data, you need to statistically prove your hypothesis. Representing your numerical data as part of statistics in research will also help in breaking the stereotype of being a biology student who can’t do math.

Statistical methods are essential for scientific research. In fact, statistical methods dominate scientific research, as they cover planning, designing, collecting data, analyzing, drawing meaningful interpretations, and reporting research findings. Furthermore, the results acquired from a research project are meaningless raw data unless analyzed with statistical tools. Therefore, determining statistics in research is of utmost necessity to justify research findings. In this article, we will discuss how using statistical methods for biology could help draw meaningful conclusions when analyzing biological studies.


Role of Statistics in Biological Research

Statistics is a branch of science that deals with the collection, organization, and analysis of data from a sample to the whole population. Moreover, it aids in designing a study more meticulously and gives logical reasoning for concluding the hypothesis. Furthermore, biology focuses on the study of living organisms and their complex living pathways, which are very dynamic and cannot be explained with logical reasoning alone. Statistics, in turn, defines and explains study patterns based on the sample sizes used; to be precise, statistics provides a trend for the conducted study.

Biological researchers often disregard the use of statistics in their research planning and mainly use statistical tools at the end of their experiment. This gives rise to a complicated set of results that are not easily analyzed with statistical tools. Statistics in research can instead help a researcher approach the study in a stepwise manner, wherein the statistical analysis in research follows –

1. Establishing a Sample Size

Usually, a biological experiment starts with choosing samples and selecting the right number of repetitive experiments. Statistics in research covers basics such as statistical randomness and the use of large samples. Statistics teaches how choosing a sample size from a random, large pool of samples helps extrapolate statistical findings and reduce experimental bias and errors.

2. Testing of Hypothesis

When conducting a statistical study with a large sample pool, biological researchers must make sure that the conclusion is statistically significant. To achieve this, a researcher must create a hypothesis before examining the distribution of data. Furthermore, statistics in research helps interpret data clustered near the mean of the distribution or spread across it. These trends help analyze the sample and assess the significance of the hypothesis.

3. Data Interpretation Through Analysis

When dealing with large data, statistics in research assists in data analysis. This helps researchers draw effective conclusions from their experiments and observations. Concluding the study manually or from visual observation may give erroneous results; a thorough statistical analysis instead takes into consideration all the other statistical measures and the variance in the sample to provide a detailed interpretation of the data. Researchers can therefore produce detailed and reliable data to support their conclusions.

Types of Statistical Research Methods That Aid in Data Analysis


Statistical analysis is the process of analyzing samples of data into patterns or trends that help researchers anticipate situations and make appropriate research conclusions. Based on the type of data, statistical analyses are of the following type:

1. Descriptive Analysis

The descriptive statistical analysis allows researchers to organize and summarize large data sets into graphs and tables. Descriptive analysis involves various processes such as tabulation, measures of central tendency, measures of dispersion or variance, skewness measurements, etc.

2. Inferential Analysis

The inferential statistical analysis allows researchers to extrapolate the data acquired from a small sample size to the complete population. This analysis helps draw conclusions and make decisions about the whole population on the basis of sample data. It is a highly recommended statistical method for research projects that work with a smaller sample size and intend to extrapolate conclusions to a larger population.

3. Predictive Analysis

Predictive analysis is used to make predictions of future events. This type of analysis is used by marketing companies, insurance organizations, online service providers, data-driven marketers, and financial corporations.

4. Prescriptive Analysis

Prescriptive analysis examines data to find out what can be done next. It is widely used in business analysis to find the best possible outcome for a situation. It is closely related to descriptive and predictive analysis; however, prescriptive analysis deals with giving appropriate suggestions among the available options.

5. Exploratory Data Analysis

EDA is generally the first step of the data analysis process and is conducted before performing any other statistical analysis technique. It focuses entirely on analyzing patterns in the data to recognize potential relationships. EDA is used to discover unknown associations within data, inspect missing data, and obtain maximum insight.

6. Causal Analysis

Causal analysis assists in understanding and determining the reasons behind “why” things happen in a certain way, as they appear. This analysis helps identify root cause of failures or simply find the basic reason why something could happen. For example, causal analysis is used to understand what will happen to the provided variable if another variable changes.

7. Mechanistic Analysis

This is the least common type of statistical analysis. The mechanistic analysis is used in big data analytics and the biological sciences. It uses the concept of understanding individual changes in variables that correspondingly cause changes in other variables while excluding external influences.

Important Statistical Tools In Research

Researchers in the biological field often find statistical analysis the most daunting aspect of completing research. However, statistical tools in research can help researchers understand what to do with data and how to interpret the results, making this process as easy as possible.

1. Statistical Package for Social Science (SPSS)

It is a widely used software package for human behavior research. SPSS can compile descriptive statistics as well as graphical depictions of results. Moreover, it includes the option to create scripts that automate analysis or carry out more advanced statistical processing.

2. R Foundation for Statistical Computing

This software package is used in human behavior research and other fields. R is a powerful tool with a steep learning curve that requires a certain level of coding. Furthermore, it comes with an active community that is engaged in building and enhancing the software and the associated plugins.

3. MATLAB (The Mathworks)

It is an analytical platform and a programming language. Researchers and engineers use this software to write their own code and help answer their research questions. While MATLAB can be a difficult tool for novices to use, it offers flexibility in terms of what the researcher needs.

4. Microsoft Excel

Not the best solution for statistical analysis in research, but MS Excel offers a wide variety of tools for data visualization and simple statistics. It is easy to generate summaries and customizable graphs and figures. MS Excel is the most accessible option for those wanting to start with statistics.

5. Statistical Analysis Software (SAS)

It is a statistical platform used in business, healthcare, and human behavior research alike. It can carry out advanced analyses and produce publication-worthy figures, tables, and charts.

6. GraphPad Prism

It is a premium software package that is primarily used among biology researchers, but it can also be used in various other fields. Similar to SPSS, GraphPad offers scripting options to automate analyses and carry out complex statistical calculations.

7. Minitab

This software offers basic as well as advanced statistical tools for data analysis. However, similar to GraphPad and SPSS, Minitab needs command over coding and can offer automated analyses.

Use of Statistical Tools In Research and Data Analysis

Statistical tools manage large data sets. Many biological studies rely on large data sets to analyze trends and patterns. Using statistical tools therefore becomes essential, as they manage large data sets and make data processing more convenient.

Following the steps above will help biological researchers present the statistics in their research in detail, develop accurate hypotheses, and use the correct tools for them.

There is a range of statistical tools in research that can help researchers manage their research data and improve the outcome of their research through better interpretation of data. You can use statistics in research by understanding the research question, building your knowledge of statistics, and drawing on your personal experience in coding.

Have you faced challenges while using statistics in research? How did you manage it? Did you use any of the statistical tools to help you with your research data? Do write to us or comment below!



What Is Statistical Analysis? Definition, Types, and Jobs

Statistical analytics is a high-demand career with great benefits. Learn how you can apply your statistical and data science skills to this growing field.


Statistical analysis is the process of collecting large volumes of data and then using statistics and other data analysis techniques to identify trends, patterns, and insights. If you're a whiz at data and statistics, statistical analysis could be a great career match for you. The rise of big data, machine learning, and technology in our society has created a high demand for statistical analysts, and it's an exciting time to develop these skills and find a job you love. In this article, you'll learn more about statistical analysis, including its definition, its different types, how it's done, and jobs that use it. At the end, you'll also explore suggested cost-effective courses that can help you gain greater knowledge of both statistical and data analytics.

Statistical analysis definition

Statistical analysis is the process of collecting and analyzing large volumes of data in order to identify trends and develop valuable insights.

In the professional world, statistical analysts take raw data and find correlations between variables to reveal patterns and trends to relevant stakeholders. Working in a wide range of different fields, statistical analysts are responsible for new scientific discoveries, improving the health of our communities, and guiding business decisions.

Types of statistical analysis

There are two main types of statistical analysis: descriptive and inferential. As a statistical analyst, you'll likely use both types in your daily work to ensure that data is both clearly communicated to others and that it's used effectively to develop actionable insights. At a glance, here's what you need to know about both types of statistical analysis:

Descriptive statistical analysis

Descriptive statistics summarizes the information within a data set without drawing conclusions about its contents. For example, if a business gave you a book of its expenses and you summarized the percentage of money it spent on different categories of items, then you would be performing a form of descriptive statistics.

When performing descriptive statistics, you will often use data visualization to present information in the form of graphs, tables, and charts to clearly convey it to others in an understandable format. Typically, leaders in a company or organization will then use this data to guide their decision making going forward.
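A minimal sketch of that expense-summary example, assuming Python with pandas and invented figures, could look like this:

```python
# Minimal sketch: descriptive summary of spending by category (invented figures).
import pandas as pd

expenses = pd.DataFrame({
    "category": ["payroll", "rent", "marketing", "payroll", "supplies", "marketing"],
    "amount":   [52000,     12000,  8000,        50000,     3000,       9000],
})

totals = expenses.groupby("category")["amount"].sum()
percent_of_spend = (totals / totals.sum() * 100).round(1)
print(percent_of_spend.sort_values(ascending=False))
```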

Inferential statistical analysis

Inferential statistics takes the results of descriptive statistics one step further by drawing conclusions from the data and then making recommendations. For example, instead of only summarizing the business's expenses, you might go on to recommend in which areas to reduce spending and suggest an alternative budget.

Inferential statistical analysis is often used by businesses to inform company decisions and in scientific research to find new relationships between variables. 

Statistical analyst duties

Statistical analysts focus on making large sets of data understandable to a more general audience. In effect, you'll use your math and data skills to translate big numbers into easily digestible graphs, charts, and summaries for key decision makers within businesses and other organizations. Typical job responsibilities of statistical analysts include:

Extracting and organizing large sets of raw data

Determining which data is relevant and which should be excluded

Developing new data collection strategies

Meeting with clients and professionals to review data analysis plans

Creating data reports and easily understandable representations of the data

Presenting data

Interpreting data results

Creating recommendations for a company or other organizations

Your job responsibilities will differ depending on whether you work for a federal agency, a private company, or another business sector. Many industries need statistical analysts, so exploring your passions and seeing how you can best apply your data skills can be exciting. 

Statistical analysis skills

Because most of your job responsibilities will likely focus on data and statistical analysis, mathematical skills are crucial. High-level math skills can help you fact-check your work and create strategies to analyze the data, even if you use software for many computations. When honing your mathematical skills, focusing on statistics, specifically statistics with large data sets, can help set you apart when searching for job opportunities. Competency with computer software and learning new platforms will also help you excel in more advanced positions and put you in high demand.

Data analytics , problem-solving, and critical thinking are vital skills to help you determine the data set’s true meaning and bigger picture. Often, large data sets may not represent what they appear on the surface. To get to the bottom of things, you'll need to think critically about factors that may influence the data set, create an informed analysis plan, and parse out bias to identify insightful trends. 

To excel in the workplace, you'll need to hone your database management skills, keep up to date on statistical methodology, and continually improve your research skills. These skills take time to build, so starting with introductory courses and having patience while you build skills is important.

Common software used in statistical analytics jobs

Statistical analysis often involves computations using big data that is too large to compute by hand. The good news is that many kinds of statistical software have been developed to help analyze data effectively and efficiently. Gaining mastery over this statistical software can make you look attractive to employers and allow you to work on more complex projects. 

Statistical software is beneficial for both descriptive and inferential statistics. You can use it to generate charts and graphs or perform computations to draw conclusions and inferences from the data. The specific statistical software you use will depend on your employer.


Pathways to a career in statistical analytics

Many paths to becoming a statistical analyst exist, but most jobs in this field require a bachelor’s degree. Employers will typically look for a degree in an area that focuses on math, computer science, statistics, or data science to ensure you have the skills needed for the job. If your bachelor’s degree is in another field, gaining experience through entry-level data entry jobs can help get your foot in the door. Many employers look for work experience in related careers such as being a research assistant, data manager, or intern in the field.

Earning a graduate degree in statistical analytics or a related field can also help you stand out on your resume and demonstrate a deep knowledge of the skills needed to perform the job successfully. Generally, employers focus more on making sure you have the mathematical and data analysis skills required to perform complex statistical analytics on its data. After all, you will be helping them to make decisions, so they want to feel confident in your ability to advise them in the right direction.


How much do statistical analytics professionals earn? 

Statistical analysts earn well above the national average and enjoy many benefits on the job. There are many careers utilizing statistical analytics, so comparing salaries can help determine if the job benefits align with your expectations.

  • Actuary: median annual salary $113,990; job outlook for 2022 to 2032: 23% [ 1 ]
  • Data scientist: median annual salary $103,500; job outlook for 2022 to 2032: 35% [ 2 ]
  • Financial risk specialist: median annual salary $102,120; job outlook for 2022 to 2032: 8% [ 3 ]
  • Investment analyst: median annual salary $95,080
  • Operations research analyst: median annual salary $85,720; job outlook for 2022 to 2032: 23% [ 4 ]
  • Market research analyst: median annual salary $68,230; job outlook for 2022 to 2032: 13% [ 5 ]
  • Statistician: median annual salary $99,960; job outlook for 2022 to 2032: 30% [ 6 ]


Statistical analysis job outlook

Jobs that use statistical analysis have a positive outlook for the foreseeable future.

According to the US Bureau of Labor Statistics (BLS), the number of jobs for mathematicians and statisticians is projected to grow by 30 percent between 2022 and 2032, adding an average of 3,500 new jobs each year throughout the decade [ 6 ].

As we create more ways to collect data worldwide, there will be an increased need for people able to analyze and make sense of the data.

Ready to take the next step in your career?

Statistical analytics could be an excellent career match for those with an affinity for math, data, and problem-solving. Here are some popular courses to consider as you prepare for a career in statistical analysis:

Learn fundamental processes and tools with Google's Data Analytics Professional Certificate. You'll learn how to process and analyze data, use key analysis tools, apply R programming, and create visualizations that can inform key business decisions.

Grow your comfort using R with Duke University's Data Analysis with R Specialization. Statistical analysts commonly use R for testing, modeling, and analysis. Here, you'll learn and practice those processes.

Apply statistical analysis with Rice University's Business Statistics and Analysis Specialization. Contextualize your technical and analytical skills by using them to solve business problems and complete a hands-on Capstone Project to demonstrate your knowledge.

Article sources

US Bureau of Labor Statistics. "Occupational Outlook Handbook: Actuaries, https://www.bls.gov/ooh/math/actuaries.htm." Accessed November 21, 2023.

US Bureau of Labor Statistics. "Occupational Outlook Handbook: Data Scientists, https://www.bls.gov/ooh/math/data-scientists.htm." Accessed November 21, 2023.

US Bureau of Labor Statistics. "Occupational Outlook Handbook: Financial Analysts, https://www.bls.gov/ooh/business-and-financial/financial-analysts.htm." Accessed November 21, 2023.

US Bureau of Labor Statistics. "Occupational Outlook Handbook: Operations Research Analysts, https://www.bls.gov/ooh/math/operations-research-analysts.htm." Accessed November 21, 2023.

US Bureau of Labor Statistics. "Occupational Outlook Handbook: Market Research Analysts, https://www.bls.gov/ooh/business-and-financial/market-research-analysts.htm." Accessed November 21, 2023.

US Bureau of Labor Statistics. "Occupational Outlook Handbook: Mathematicians and Statisticians, https://www.bls.gov/ooh/math/mathematicians-and-statisticians.htm." Accessed November 21, 2023.


Data Analysis in Research: Types & Methods


Content Index

  • What is data analysis in research?
  • Why analyze data in research?
  • Types of data in research
  • Finding patterns in the qualitative data
  • Methods used for data analysis in qualitative research
  • Preparing data for analysis
  • Methods used for data analysis in quantitative research
  • Considerations in research data analysis

Definition of research in data analysis: According to LeCompte and Schensul, research data analysis is a process used by researchers to reduce data to a story and interpret it to derive insights. The data analysis process helps reduce a large chunk of data into smaller fragments, which makes sense. 

Three essential things occur during the data analysis process. The first is data organization. The second is summarization and categorization, which together reduce the data and help find patterns and themes for easy identification and linking. The third and last is data analysis itself, which researchers perform in both top-down and bottom-up fashion.


On the other hand, Marshall and Rossman describe data analysis as a messy, ambiguous, and time-consuming but creative and fascinating process through which a mass of collected data is brought to order, structure and meaning.

We can say that “the data analysis and data interpretation is a process representing the application of deductive and inductive logic to the research and data analysis.”

Researchers rely heavily on data as they have a story to tell or research problems to solve. It starts with a question, and data is nothing but an answer to that question. But, what if there is no question to ask? Well! It is possible to explore data even without a problem – we call it ‘Data Mining’, which often reveals some interesting patterns within the data that are worth exploring.

Irrespective of the type of data researchers explore, their mission and their audience’s vision guide them to find the patterns that shape the story they want to tell. One of the essential things expected from researchers while analyzing data is to stay open and remain unbiased toward unexpected patterns, expressions, and results. Remember, sometimes, data analysis tells the most unforeseen yet exciting stories that were not expected when initiating data analysis. Therefore, rely on the data you have at hand and enjoy the journey of exploratory research.


Every kind of data has the quality of describing things once a specific value is assigned to it. For analysis, you need to organize these values, and process and present them in a given context, to make them useful. Data can come in different forms; here are the primary data types.

  • Qualitative data: When the data presented has words and descriptions, we call it qualitative data. Although you can observe this data, it is subjective and harder to analyze in research, especially for comparison. Example: anything describing taste, experience, texture, or an opinion is considered qualitative data. This type of data is usually collected through focus groups, personal qualitative interviews, qualitative observation or using open-ended questions in surveys.
  • Quantitative data: Any data expressed in numbers or numerical figures is called quantitative data. This type of data can be distinguished into categories, grouped, measured, calculated, or ranked. Example: responses to questions about age, rank, cost, length, weight, scores, etc. all come under this type of data. You can present such data in graphical format or charts, or apply statistical analysis methods to this data. The (Outcomes Measurement Systems) OMS questionnaires in surveys are a significant source of numeric data.
  • Categorical data: This is data presented in groups. However, an item included in the categorical data cannot belong to more than one group. Example: a person responding to a survey by describing their living style, marital status, smoking habit, or drinking habit provides categorical data. A chi-square test is a standard method used to analyze this data (a minimal sketch follows this list).
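A minimal sketch of such a chi-square test of independence, assuming Python with SciPy and invented counts, asks whether two categorical variables (here, smoking habit and marital status) are related:

```python
# Minimal sketch: chi-square test of independence on categorical data (invented counts).
from scipy.stats import chi2_contingency

# Rows: smoking habit (smoker / non-smoker); columns: marital status (single / married)
observed = [
    [30, 20],   # smokers
    [45, 55],   # non-smokers
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, degrees of freedom = {dof}")
```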


Data analysis in qualitative research

Data analysis in qualitative research works a little differently from numerical data, as qualitative data is made up of words, descriptions, images, objects, and sometimes symbols. Getting insight from such complex information is an involved process; hence it is typically used for exploratory research and data analysis.

Although there are several ways to find patterns in textual information, a word-based method is the most relied-upon and widely used technique for research and data analysis. Notably, the data analysis process in qualitative research is largely manual. Here the researchers usually read the available data and identify repetitive or commonly used words.

For example, while studying data collected from African countries to understand the most pressing issues people face, researchers might find  “food”  and  “hunger” are the most commonly used words and will highlight them for further analysis.


The keyword context is another widely used word-based technique. In this method, the researcher tries to understand the concept by analyzing the context in which the participants use a particular keyword.  

For example , researchers conducting research and data analysis for studying the concept of ‘diabetes’ amongst respondents might analyze the context of when and how the respondent has used or referred to the word ‘diabetes.’

The scrutiny-based technique is also one of the highly recommended text analysis methods used to identify a quality data pattern. Compare and contrast is the most widely used method under this technique, and it differentiates how a specific text is similar to or different from another.

For example: To find out the “importance of resident doctor in a company,” the collected data is divided into people who think it is necessary to hire a resident doctor and those who think it is unnecessary. Compare and contrast is the best method to analyze polls with single-answer question types.

Metaphors can be used to reduce the data pile and find patterns in it so that it becomes easier to connect data with theory.

Variable Partitioning is another technique used to split variables so that researchers can find more coherent descriptions and explanations from the enormous data.


There are several techniques to analyze the data in qualitative research, but here are some commonly used methods,

  • Content Analysis:  It is widely accepted and the most frequently employed technique for data analysis in research methodology. It can be used to analyze the documented information from text, images, and sometimes from the physical items. It depends on the research questions to predict when and where to use this method.
  • Narrative Analysis: This method is used to analyze content gathered from various sources such as personal interviews, field observation, and surveys. The majority of the time, the stories or opinions shared by people are focused on finding answers to the research questions.
  • Discourse Analysis:  Similar to narrative analysis, discourse analysis is used to analyze the interactions with people. Nevertheless, this particular method considers the social context under which or within which the communication between the researcher and respondent takes place. In addition to that, discourse analysis also focuses on the lifestyle and day-to-day environment while deriving any conclusion.
  • Grounded Theory:  When you want to explain why a particular phenomenon happened, then using grounded theory for analyzing quality data is the best resort. Grounded theory is applied to study data about the host of similar cases occurring in different settings. When researchers are using this method, they might alter explanations or produce new ones until they arrive at some conclusion.


Data analysis in quantitative research

The first stage in research and data analysis is to prepare the data for analysis so that the nominal data can be converted into something meaningful. Data preparation consists of the phases below.

Phase I: Data Validation

Data validation is done to understand whether the collected data sample meets the pre-set standards or is a biased data sample. It is divided into four different stages:

  • Fraud: To ensure an actual human being records each response to the survey or the questionnaire
  • Screening: To make sure each participant or respondent is selected or chosen in compliance with the research criteria
  • Procedure: To ensure ethical standards were maintained while collecting the data sample
  • Completeness: To ensure that the respondent has answered all the questions in an online survey. Else, the interviewer had asked all the questions devised in the questionnaire.

Phase II: Data Editing

More often than not, an extensive research data sample comes loaded with errors. Respondents sometimes fill in some fields incorrectly or sometimes skip them accidentally. Data editing is a process wherein the researchers confirm that the provided data is free of such errors. They need to conduct necessary checks and outlier checks to edit the raw data and make it ready for analysis.

Phase III: Data Coding

Out of all three, this is the most critical phase of data preparation, associated with grouping and assigning values to the survey responses. If a survey is completed with a sample size of 1,000, the researcher will create age brackets to distinguish the respondents based on their age. It thus becomes easier to analyze small data buckets rather than deal with the massive data pile.
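A small sketch of this coding step, assuming Python with pandas and invented ages, bins respondents into brackets so the analysis can work with a handful of groups instead of every individual value:

```python
# Minimal sketch: coding respondent ages into brackets (invented ages).
import pandas as pd

responses = pd.DataFrame({"age": [19, 23, 31, 37, 42, 48, 55, 63, 67, 72]})

responses["age_bracket"] = pd.cut(
    responses["age"],
    bins=[18, 25, 35, 50, 65, 100],
    labels=["18-25", "26-35", "36-50", "51-65", "65+"],
)
print(responses["age_bracket"].value_counts().sort_index())
```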


After the data is prepared for analysis, researchers are open to using different research and data analysis methods to derive meaningful insights. Statistical analysis is the most favored approach to analyzing numerical data. In statistical analysis, distinguishing between categorical data and numerical data is essential, as categorical data involves distinct categories or labels, while numerical data consists of measurable quantities. The method is again classified into two groups: first, ‘Descriptive Statistics’, used to describe the data; second, ‘Inferential Statistics’, which helps in comparing the data.

Descriptive statistics

This method is used to describe the basic features of versatile types of data in research. It presents the data in such a meaningful way that patterns in the data start making sense. Nevertheless, descriptive analysis does not go as far as drawing conclusions; any conclusions are again based on the hypotheses researchers have formulated so far. Here are a few major types of descriptive analysis methods.

Measures of Frequency

  • Count, Percent, Frequency
  • It is used to denote how often a particular event occurs.
  • Researchers use it when they want to showcase how often a response is given.

Measures of Central Tendency

  • Mean, Median, Mode
  • The method is widely used to demonstrate distribution by various points.
  • Researchers use this method when they want to showcase the most commonly or averagely indicated response.

Measures of Dispersion or Variation

  • Range, Variance, Standard deviation
  • The range is the difference between the highest and lowest scores.
  • The standard deviation (and variance) reflect how far observed scores fall from the mean.
  • It is used to identify the spread of scores by stating intervals.
  • Researchers use this method to show how spread out the data is. It helps them identify how far the data is spread out and how much that spread affects the mean.

Measures of Position

  • Percentile ranks, Quartile ranks
  • It relies on standardized scores helping researchers to identify the relationship between different scores.
  • It is often used when researchers want to compare scores with the average count.

For quantitative research, descriptive analysis often gives absolute numbers, but those numbers alone are never sufficient to demonstrate the rationale behind them. Nevertheless, it is necessary to think of the best method for research and data analysis suiting your survey questionnaire and the story researchers want to tell. For example, the mean is the best way to demonstrate the students’ average scores in schools. It is better to rely on descriptive statistics when the researchers intend to keep the research or outcome limited to the provided sample without generalizing it. For example, when you want to compare the average voting done in two different cities, descriptive statistics are enough.

Descriptive analysis is also called a ‘univariate analysis’ since it is commonly used to analyze a single variable.

Inferential statistics

Inferential statistics are used to make predictions about a larger population after research and data analysis of a sample representing that population. For example, you can ask some 100-odd audience members at a movie theater if they like the movie they are watching. Researchers then use inferential statistics on the collected sample to reason that about 80-90% of people like the movie.
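A minimal sketch of that inference, assuming Python with statsmodels and an invented count of positive responses, estimates the population share together with a confidence interval rather than a single number:

```python
# Minimal sketch: inferring the population share that liked the movie (invented count).
from statsmodels.stats.proportion import proportion_confint

liked = 85     # respondents in the sample who said they liked the movie
sampled = 100  # sample size

low, high = proportion_confint(liked, sampled, alpha=0.05, method="wilson")
print(f"estimated share: {liked / sampled:.0%}, 95% CI: ({low:.0%}, {high:.0%})")
```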

Here are two significant areas of inferential statistics.

  • Estimating parameters: It takes statistics from the sample research data and demonstrates something about the population parameter.
  • Hypothesis test: It’s about sampling research data to answer the survey research questions. For example, researchers might be interested to understand if the new shade of lipstick recently launched is good or not, or if the multivitamin capsules help children to perform better at games.

These are sophisticated analysis methods used to showcase the relationship between different variables instead of describing a single variable. It is often used when researchers want something beyond absolute numbers to understand the relationship between variables.

Here are some of the commonly used methods for data analysis in research.

  • Correlation: When researchers are not conducting experimental research or quasi-experimental research but are interested in understanding the relationship between two or more variables, they opt for correlational research methods.
  • Cross-tabulation: Also called contingency tables,  cross-tabulation  is used to analyze the relationship between multiple variables.  Suppose provided data has age and gender categories presented in rows and columns. A two-dimensional cross-tabulation helps for seamless data analysis and research by showing the number of males and females in each age category.
  • Regression analysis: To understand the strong relationship between two variables, researchers do not look beyond the primary and commonly used regression analysis method, which is also a type of predictive analysis. In this method, you have an essential factor called the dependent variable, along with multiple independent variables. You undertake efforts to find out the impact of the independent variables on the dependent variable. The values of both independent and dependent variables are assumed to have been ascertained in an error-free random manner.
  • Frequency tables: Frequency tables summarize how often each value or category occurs in the data, which makes it easy to spot the most common responses and to check the data for gaps (a short sketch of a frequency table and a cross-tabulation follows this list).
  • Analysis of variance: The statistical procedure is used for testing the degree to which two or more vary or differ in an experiment. A considerable degree of variation means research findings were significant. In many contexts, ANOVA testing and variance analysis are similar.
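A short sketch of a frequency table and a cross-tabulation, assuming Python with pandas and an invented survey, is shown below:

```python
# Minimal sketch: frequency table and cross-tabulation (invented survey data).
import pandas as pd

survey = pd.DataFrame({
    "gender":    ["F", "M", "F", "F", "M", "M", "F", "M"],
    "age_group": ["18-25", "26-35", "26-35", "18-25", "18-25", "36-50", "36-50", "26-35"],
})

# Frequency table: how often each age group appears
print(survey["age_group"].value_counts())

# Cross-tabulation: respondents by gender within each age group
print(pd.crosstab(survey["gender"], survey["age_group"]))
```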
Considerations in research data analysis

  • Researchers must have the necessary research skills to analyze and manipulate the data and be trained to demonstrate a high standard of research practice. Ideally, researchers should possess more than a basic understanding of the rationale for selecting one statistical method over another to obtain better data insights.
  • Usually, research and data analytics projects differ by scientific discipline; therefore, getting statistical advice at the beginning of analysis helps design a survey questionnaire, select data collection  methods, and choose samples.


  • The primary aim of data research and analysis is to derive insights that are unbiased. Any mistake in collecting the data, selecting an analysis method, or choosing an audience sample, or approaching any of these with a biased mind, is likely to lead to a biased inference.
  • No level of sophistication in research data and analysis can rectify poorly defined objectives or outcome measurements. It does not matter whether the design is at fault or the intentions are not clear; a lack of clarity might mislead readers, so avoid the practice.
  • The motive behind data analysis in research is to present accurate and reliable data. As far as possible, avoid statistical errors, and find a way to deal with everyday challenges like outliers, missing data, data altering, data mining, or developing graphical representation.

The sheer amount of data generated daily is staggering, especially now that data analysis has taken center stage. In 2018, the total data supply amounted to 2.8 trillion gigabytes. Hence, it is clear that the enterprises willing to survive in the hypercompetitive world must possess an excellent capability to analyze complex research data, derive actionable insights, and adapt to the new market needs.


QuestionPro is an online survey platform that empowers organizations in data analysis and research and provides them a medium to collect data by creating appealing surveys.


JAMA Guide to Statistics and Methods

Explore this JAMA essay series that explains the basics of statistical techniques used in clinical research, to help clinicians interpret and critically appraise the medical literature.


This JAMA Guide to Statistics and Methods article explains effect score analyses, an approach for evaluating the heterogeneity of treatment effects, and examines its use in a study of oxygen-saturation targets in critically ill patients.

This JAMA Guide to Statistics and Methods explains the use of historical controls—persons who had received a specific control treatment in a previous study—when randomizing participants to that control treatment in a subsequent trial may not be practical or ethical.

This JAMA Guide to Statistics and Methods discusses the early stopping of clinical trials for futility due to lack of evidence supporting the desired benefit, evidence of harm, or practical issues that make successful completion unlikely.

This JAMA Guide to Statistics and Methods explains sequential, multiple assignment, randomized trial (SMART) study designs, in which some or all participants are randomized at 2 or more decision points depending on the participant’s response to prior treatment.

This JAMA Guide to Statistics and Methods article examines conditional power, calculated while a trial is ongoing and based on both the currently observed data and an assumed treatment effect for future patients.

This Guide to Statistics and Methods describes the use of target trial emulation to design an observational study so it preserves the advantages of a randomized clinical trial, points out the limitations of the method, and provides an example of its use.

This Guide to Statistics and Methods provides an overview of the use of adjustment for baseline characteristics in the analysis of randomized clinical trials and emphasizes several important considerations.

This Guide to Statistics and Methods provides an overview of regression models for ordinal outcomes, including an explanation of why they are used and their limitations.

This Guide to Statistics and Methods provides an overview of patient-reported outcome measures for clinical research, emphasizes several important considerations when using them, and points out their limitations.

This JAMA Guide to Statistics and Methods discusses instrumental variable analysis, a method designed to reduce or eliminate unobserved confounding in observational studies, with the goal of achieving unbiased estimation of treatment effects.

This JAMA Guide to Statistics and Methods describes collider bias, illustrates examples in directed acyclic graphs, and explains how it can threaten the internal validity of a study and the accurate estimation of causal relationships in randomized clinical trials and observational studies.

This JAMA Guide to Statistics and Methods discusses the CONSERVE guidelines, which address how to report extenuating circumstances that lead to a modification in trial design, conduct, or analysis.

This JAMA Guide to Statistics and Methods discusses the basics of causal directed acyclic graphs, which are useful tools for communicating researchers’ understanding of the potential interplay among variables and are commonly used for mediation analysis.

This JAMA Guide to Statistics and Methods discusses cardinality matching, a method for finding the largest possible number of matched pairs in an observational data set, with the goal of balanced and representative samples of study participants between groups.

This Guide to Statistics and Methods discusses the various approaches to estimating variability in treatment effects, including heterogeneity of treatment effect, which was used to assess the association between surgery to close patent foramen ovale and risk of recurrent stroke in patients who presented with a stroke in a related JAMA article.

This Guide to Statistics and Methods describes how confidence intervals can be used to help in the interpretation of nonsignificant findings across all study designs.

This JAMA Guide to Statistics and Methods describes why interim analyses are performed during group sequential trials, provides examples of the limitations of interim analyses, and provides guidance on interpreting the results of interim analyses performed during group sequential trials.

This JAMA Guide to Statistics and Methods describes how ACC/AHA guidelines are formatted to rate class (denoting strength of a recommendation) and level (indicating the level of evidence on which a recommendation is based) and summarizes the strengths and benefits of this rating system in comparison with other commonly used ones.

This JAMA Guide to Statistics and Methods takes a look at estimands, estimators, and estimates in the context of randomized clinical trials and suggests several qualities that make for good estimands, including their scope, ability to summarize treatment effects, external validity, and ability to provide good estimates.

This JAMA Guide to Statistics and Methods describes how intention-to-treat, per-protocol, and as-treated approaches to analysis differ with regard to the patient population and treatment assignments and their implications for interpretation of treatment effects in clinical trials.



The Beginner's Guide to Statistical Analysis | 5 Steps & Examples

Statistical analysis means investigating trends, patterns, and relationships using quantitative data . It is an important research tool used by scientists, governments, businesses, and other organisations.

To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process . You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.

After collecting data from your sample, you can organise and summarise the data using descriptive statistics . Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalise your findings.

This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.

Table of contents

  • Step 1: Write your hypotheses and plan your research design
  • Step 2: Collect data from a sample
  • Step 3: Summarise your data with descriptive statistics
  • Step 4: Test hypotheses or make estimates with inferential statistics
  • Step 5: Interpret your results
  • Frequently asked questions about statistics

To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.

Writing statistical hypotheses

The goal of research is often to investigate a relationship between variables within a population . You start with a prediction, and use statistical analysis to test that prediction.

A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.

While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.

  • Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers.
  • Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
  • Null hypothesis: Parental income and GPA have no relationship with each other in college students.
  • Alternative hypothesis: Parental income and GPA are positively correlated in college students.

Planning your research design

A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.

First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.

  • In an experimental design , you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.
  • In a correlational design , you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests.
  • In a descriptive design , you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.

Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.

  • In a between-subjects design , you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t).
  • In a within-subjects design , you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
  • In a mixed (factorial) design , one variable is altered between subjects and another is altered within subjects (e.g., pretest and posttest scores from participants who either did or didn’t do a meditation exercise).

Example: Experimental research design
First, you'll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you'll record participants' scores from a second math test. In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Example: Correlational research design
In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents' incomes and their own GPA.

Measuring variables

When planning a research design, you should operationalise your variables and decide exactly how you will measure them.

For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain:

  • Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g. level of language ability).
  • Quantitative data represents amounts. These may be on an interval scale (e.g. test score) or a ratio scale (e.g. age).

Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.

Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.

In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.

Population vs sample

In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.

Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures . You should aim for a sample that is representative of the population.

Sampling for statistical analysis

There are two main approaches to selecting a sample.

  • Probability sampling: every member of the population has a chance of being selected for the study through random selection.
  • Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.

In theory, for highly generalisable findings, you should use a probability sampling method. Random selection reduces sampling bias and ensures that data from your sample is actually typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.

But in practice, it’s rarely possible to gather the ideal sample. While non-probability samples are more likely to be biased, they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.

If you want to use parametric tests for non-probability samples, you have to make the case that:

  • your sample is representative of the population you’re generalising your findings to.
  • your sample lacks systematic bias.

Keep in mind that external validity means that you can only generalise your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialised, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.

If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalised in your discussion section .
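
To make the parametric versus non-parametric choice concrete, the sketch below compares two hypothetical independent groups with a parametric t test and with the Mann-Whitney U test, a common non-parametric counterpart; all numbers are made up.

```python
# Minimal sketch: parametric vs non-parametric comparison of two groups (SciPy).
from scipy import stats

group_1 = [78, 82, 88, 91, 75, 84, 90, 86]   # hypothetical scores
group_2 = [70, 74, 69, 80, 72, 77, 68, 73]

# Parametric test: assumes roughly normal data with similar variances
t_stat, p_t = stats.ttest_ind(group_1, group_2)

# Non-parametric alternative: makes no assumption about the distribution shape
u_stat, p_u = stats.mannwhitneyu(group_1, group_2, alternative="two-sided")

print(f"independent t test: t={t_stat:.2f}, p={p_t:.4f}")
print(f"Mann-Whitney U:     U={u_stat:.1f}, p={p_u:.4f}")
```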

Create an appropriate sampling procedure

Based on the resources available for your research, decide on how you’ll recruit participants.

  • Will you have resources to advertise your study widely, including outside of your university setting?
  • Will you have the means to recruit a diverse sample that represents a broad population?
  • Do you have time to contact and follow up with members of hard-to-reach groups?

Example: Sampling (experimental study)
Your participants are self-selected by their schools. Although you're using a non-probability sample, you aim for a diverse and representative sample.

Example: Sampling (correlational study)
Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.

Calculate sufficient sample size

Before recruiting participants, decide on your sample size either by looking at other studies in your field or using statistics. A sample that's too small may be unrepresentative of the population, while a sample that's too large will be more costly than necessary.

There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, a minimum of 30 units or more per subgroup is necessary.

To use these calculators, you have to understand and input these key components:

  • Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
  • Statistical power : the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.
  • Expected effect size : a standardised indication of how large the expected result of your study will be, usually based on other similar studies.
  • Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
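
For example, the required sample size for a two-group comparison can be estimated from these components with a power calculator. The sketch below uses the power calculator in statsmodels; the effect size of 0.5 is an assumed value you would normally take from prior studies or a pilot.

```python
# Minimal sketch: sample size per group for a two-sample t test (statsmodels).
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,          # assumed standardized effect size (Cohen's d)
    alpha=0.05,               # significance level
    power=0.80,               # desired statistical power
    alternative="two-sided",
)
print(f"required participants per group: {n_per_group:.0f}")
```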

Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarise them.

Inspect your data

There are various ways to inspect your data, including the following:

  • Organising data from each variable in frequency distribution tables .
  • Displaying data from a key variable in a bar chart to view the distribution of responses.
  • Visualising the relationship between two variables using a scatter plot .

By visualising your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.
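
As a rough illustration, the sketch below inspects a small, made-up pretest/posttest data set with pandas and matplotlib: summary statistics and skewness give a first sense of the distribution, while a histogram and scatter plot show its shape and the relationship between the two variables.

```python
# Minimal sketch: inspecting hypothetical data before analysis (pandas/matplotlib).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "pretest":  [72, 75, 78, 80, 81, 83, 85, 88, 90, 94],
    "posttest": [75, 79, 80, 84, 86, 85, 90, 92, 93, 99],
})

print(df.describe())   # range, mean, and spread of each variable
print(df.skew())       # values near 0 suggest a roughly symmetric distribution

df["pretest"].plot(kind="hist", bins=5, title="Distribution of pretest scores")
plt.show()

df.plot(kind="scatter", x="pretest", y="posttest", title="Pretest vs posttest")
plt.show()
```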

A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.

Mean, median, mode, and standard deviation in a normal distribution

In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.

Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:

  • Mode : the most popular response or value in the data set.
  • Median : the value in the exact middle of the data set when ordered from low to high.
  • Mean : the sum of all values divided by the number of values.

However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:

  • Range : the highest value minus the lowest value of the data set.
  • Interquartile range : the range of the middle half of the data set.
  • Standard deviation : the average distance between each value in your data set and the mean.
  • Variance : the square of the standard deviation.

Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
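
The measures above are straightforward to compute. The sketch below uses only Python's standard statistics module on a small, made-up set of test scores.

```python
# Minimal sketch: central tendency and variability with the standard library.
import statistics

scores = [67, 72, 72, 75, 78, 80, 83, 85, 90, 98]

print("mean:  ", statistics.mean(scores))
print("median:", statistics.median(scores))
print("mode:  ", statistics.mode(scores))

print("range: ", max(scores) - min(scores))
q1, _, q3 = statistics.quantiles(scores, n=4)     # quartile cut points
print("IQR:   ", q3 - q1)
print("sample standard deviation:", statistics.stdev(scores))
print("sample variance:          ", statistics.variance(scores))
```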

Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population. Example: Descriptive statistics (correlational study) After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.

It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.

A number that describes a sample is called a statistic , while a number describing a population is called a parameter . Using inferential statistics , you can make conclusions about population parameters based on sample statistics.

Researchers often use two main methods (simultaneously) to make inferences in statistics.

  • Estimation: calculating population parameters based on sample statistics.
  • Hypothesis testing: a formal process for testing research predictions about the population using samples.

You can make two types of estimates of population parameters from sample statistics:

  • A point estimate : a value that represents your best guess of the exact parameter.
  • An interval estimate : a range of values that represent your best guess of where the parameter lies.

If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.

You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).

There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.

A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
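
For a small sample, a confidence interval for the mean is often built from the standard error and the t distribution (rather than the z score mentioned above). The sketch below does this with SciPy on made-up data.

```python
# Minimal sketch: a 95% confidence interval for a sample mean (SciPy).
import numpy as np
from scipy import stats

sample = np.array([19, 22, 17, 21, 20, 18, 23, 19, 20, 21])   # hypothetical values

mean = sample.mean()
sem = stats.sem(sample)    # standard error of the mean
# t distribution with n - 1 degrees of freedom
ci_low, ci_high = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=sem)

print(f"point estimate: {mean:.2f}")
print(f"95% confidence interval: ({ci_low:.2f}, {ci_high:.2f})")
```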

Hypothesis testing

Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.

Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:

  • A test statistic tells you how much your data differs from the null hypothesis of the test.
  • A p value tells you the likelihood of obtaining your results if the null hypothesis is actually true in the population.

Statistical tests come in three main varieties:

  • Comparison tests assess group differences in outcomes.
  • Regression tests assess cause-and-effect relationships between variables.
  • Correlation tests assess relationships between variables without assuming causation.

Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.

Parametric tests

Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression models the extent to which changes in a predictor variable result in changes in an outcome variable (or variables). A brief sketch of both forms appears after the list below.

  • A simple linear regression includes one predictor variable and one outcome variable.
  • A multiple linear regression includes two or more predictor variables and one outcome variable.
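
The sketch below fits both forms with statsmodels on a small, made-up data set; the column names and numbers are illustrative only.

```python
# Minimal sketch: simple and multiple linear regression with statsmodels.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "test_score":    [65, 70, 72, 75, 80, 83, 85, 90],
    "hours_studied": [2, 3, 3, 4, 5, 6, 6, 8],
    "sleep_hours":   [6, 7, 6, 8, 7, 8, 7, 8],
})

simple = smf.ols("test_score ~ hours_studied", data=df).fit()
multiple = smf.ols("test_score ~ hours_studied + sleep_hours", data=df).fit()

print(simple.params)       # intercept and slope for the single predictor
print(multiple.summary())  # coefficients, p values, and R-squared for both predictors
```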

Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.

  • A t test is for exactly 1 or 2 groups when the sample is small (30 or less).
  • A z test is for exactly 1 or 2 groups when the sample is large.
  • An ANOVA is for 3 or more groups.

The z and t tests have subtypes based on the number and types of samples and the hypotheses:

  • If you have only one sample that you want to compare to a population mean, use a one-sample test .
  • If you have paired measurements (within-subjects design), use a dependent (paired) samples test .
  • If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test .
  • If you expect a difference between groups in a specific direction, use a one-tailed test .
  • If you don’t have any expectations for the direction of a difference between groups, use a two-tailed test .

The only parametric correlation test is Pearson’s r . The correlation coefficient ( r ) tells you the strength of a linear relationship between two quantitative variables.

However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.

You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you:

  • a t value (test statistic) of 3.00
  • a p value of 0.0028

Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.

A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test. The t test gives you:

  • a t value of 3.08
  • a p value of 0.001
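
Both of the tests described in these examples can be run directly. The sketch below uses SciPy on made-up pretest/posttest scores and made-up income/GPA pairs; it illustrates the procedure, not the exact numbers reported above.

```python
# Minimal sketch: a paired, one-tailed t test and a Pearson correlation (SciPy).
from scipy import stats

# Dependent-samples (paired) t test: did posttest scores improve over pretest scores?
pretest  = [70, 68, 75, 80, 72, 78, 74, 69]
posttest = [74, 71, 77, 85, 75, 82, 77, 72]
t_stat, p_value = stats.ttest_rel(posttest, pretest, alternative="greater")
print(f"paired t test: t={t_stat:.2f}, p={p_value:.4f}")

# Pearson's r with its significance test (two-sided p value by default)
parental_income = [30, 45, 52, 60, 75, 88, 95, 110]     # thousands, hypothetical
gpa = [2.7, 3.0, 2.9, 3.2, 3.4, 3.5, 3.6, 3.8]
r, p_r = stats.pearsonr(parental_income, gpa)
print(f"Pearson correlation: r={r:.2f}, p={p_r:.4f}")
```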

The final step of statistical analysis is interpreting your results.

Statistical significance

In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.

Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.

This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.

Example: Interpret your results (correlational study)
You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.

Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.

Effect size

A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.

In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper .

With a Cohen's d of 0.72, there's medium to high practical significance to your finding that the meditation exercise improved test scores.

Example: Effect size (correlational study)
To determine the effect size of the correlation coefficient, you compare your Pearson's r value to Cohen's effect size criteria.
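
Cohen's d itself is easy to compute by hand: the difference between group means divided by the pooled standard deviation. A minimal sketch on made-up data:

```python
# Minimal sketch: Cohen's d from two hypothetical groups.
import numpy as np

treatment = np.array([82, 85, 88, 90, 84, 87, 91, 86])
control   = np.array([78, 80, 83, 81, 79, 84, 82, 80])

n1, n2 = len(treatment), len(control)
pooled_sd = np.sqrt(
    ((n1 - 1) * treatment.var(ddof=1) + (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
)
cohens_d = (treatment.mean() - control.mean()) / pooled_sd

# Cohen's rough benchmarks: ~0.2 small, ~0.5 medium, ~0.8 large
print(f"Cohen's d = {cohens_d:.2f}")
```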

Decision errors

Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.

You can aim to minimise the risk of these errors by selecting an optimal significance level and ensuring high power . However, there’s a trade-off between the two errors, so a fine balance is necessary.

Frequentist versus Bayesian statistics

Traditionally, frequentist statistics emphasises null hypothesis significance testing and always starts with the assumption of a true null hypothesis.

However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.

Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis rather than making a conclusion about rejecting the null hypothesis or not.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts, and meanings, use qualitative methods .
  • If you want to analyse a large amount of readily available data, use secondary data. If you want data specific to your purposes with control over how they are generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Statistical analysis is the main method for analyzing quantitative research data . It uses probabilities and models to test predictions about a population from sample data.


Quantitative Research – Methods, Types and Analysis

What is Quantitative Research

Quantitative research is a type of research that collects and analyzes numerical data to test hypotheses and answer research questions . This research typically involves a large sample size and uses statistical analysis to make inferences about a population based on the data collected. It often involves the use of surveys, experiments, or other structured data collection methods to gather quantitative data.

Quantitative Research Methods

Quantitative Research Methods are as follows:

Descriptive Research Design

Descriptive research design is used to describe the characteristics of a population or phenomenon being studied. This research method is used to answer the questions of what, where, when, and how. Descriptive research designs use a variety of methods such as observation, case studies, and surveys to collect data. The data is then analyzed using statistical tools to identify patterns and relationships.

Correlational Research Design

Correlational research design is used to investigate the relationship between two or more variables. Researchers use correlational research to determine whether a relationship exists between variables and to what extent they are related. This research method involves collecting data from a sample and analyzing it using statistical tools such as correlation coefficients.

Quasi-experimental Research Design

Quasi-experimental research design is used to investigate cause-and-effect relationships between variables. This research method is similar to experimental research design, but it lacks full control over the independent variable. Researchers use quasi-experimental research designs when it is not feasible or ethical to manipulate the independent variable.

Experimental Research Design

Experimental research design is used to investigate cause-and-effect relationships between variables. This research method involves manipulating the independent variable and observing the effects on the dependent variable. Researchers use experimental research designs to test hypotheses and establish cause-and-effect relationships.

Survey Research

Survey research involves collecting data from a sample of individuals using a standardized questionnaire. This research method is used to gather information on attitudes, beliefs, and behaviors of individuals. Researchers use survey research to collect data quickly and efficiently from a large sample size. Survey research can be conducted through various methods such as online, phone, mail, or in-person interviews.

Quantitative Research Analysis Methods

Here are some commonly used quantitative research analysis methods:

Statistical Analysis

Statistical analysis is the most common quantitative research analysis method. It involves using statistical tools and techniques to analyze the numerical data collected during the research process. Statistical analysis can be used to identify patterns, trends, and relationships between variables, and to test hypotheses and theories.

Regression Analysis

Regression analysis is a statistical technique used to analyze the relationship between one dependent variable and one or more independent variables. Researchers use regression analysis to identify and quantify the impact of independent variables on the dependent variable.

Factor Analysis

Factor analysis is a statistical technique used to identify underlying factors that explain the correlations among a set of variables. Researchers use factor analysis to reduce a large number of variables to a smaller set of factors that capture the most important information.
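
As a rough sketch, scikit-learn's FactorAnalysis can reduce a block of survey items to a smaller number of factors; the data below are random and purely illustrative.

```python
# Minimal sketch: factor analysis on hypothetical survey responses (scikit-learn).
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(100, 6))   # 100 respondents, six 1-5 Likert items

fa = FactorAnalysis(n_components=2, random_state=0)
factor_scores = fa.fit_transform(responses)

print(fa.components_.shape)   # (2, 6): loading of each item on the two factors
print(factor_scores.shape)    # (100, 2): each respondent's score on the two factors
```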

Structural Equation Modeling

Structural equation modeling is a statistical technique used to test complex relationships between variables. It involves specifying a model that includes both observed and unobserved variables, and then using statistical methods to test the fit of the model to the data.

Time Series Analysis

Time series analysis is a statistical technique used to analyze data that is collected over time. It involves identifying patterns and trends in the data, as well as any seasonal or cyclical variations.
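
The sketch below illustrates this on a made-up monthly sales series: a 12-month rolling average smooths out seasonality to reveal the trend, and statsmodels' seasonal decomposition separates the trend and seasonal components.

```python
# Minimal sketch: trend and seasonality in a hypothetical monthly series.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

months = pd.date_range("2020-01-01", periods=36, freq="MS")
trend = np.linspace(100, 160, 36)                              # slow upward trend
seasonality = 10 * np.sin(2 * np.pi * np.arange(36) / 12)      # yearly cycle
sales = pd.Series(trend + seasonality, index=months)

print(sales.rolling(window=12).mean().tail())   # rolling average smooths the cycle

decomposition = seasonal_decompose(sales, model="additive", period=12)
print(decomposition.seasonal.head(12))          # the repeating monthly pattern
```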

Multilevel Modeling

Multilevel modeling is a statistical technique used to analyze data that is nested within multiple levels. For example, researchers might use multilevel modeling to analyze data that is collected from individuals who are nested within groups, such as students nested within schools.
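
A common way to fit such a model is a mixed-effects (multilevel) regression with a random intercept for each group. The sketch below uses statsmodels on simulated students nested within schools; all values are made up.

```python
# Minimal sketch: multilevel model with a random intercept per school (statsmodels).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_schools, n_per_school = 10, 20
school = np.repeat(np.arange(n_schools), n_per_school)
hours = rng.uniform(1, 10, size=n_schools * n_per_school)
school_effect = rng.normal(0, 5, size=n_schools)[school]        # school-level variation
score = 60 + 3 * hours + school_effect + rng.normal(0, 4, size=len(hours))

df = pd.DataFrame({"score": score, "hours": hours, "school": school})

model = smf.mixedlm("score ~ hours", data=df, groups=df["school"]).fit()
print(model.summary())
```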

Applications of Quantitative Research

Quantitative research has many applications across a wide range of fields. Here are some common examples:

  • Market Research : Quantitative research is used extensively in market research to understand consumer behavior, preferences, and trends. Researchers use surveys, experiments, and other quantitative methods to collect data that can inform marketing strategies, product development, and pricing decisions.
  • Health Research: Quantitative research is used in health research to study the effectiveness of medical treatments, identify risk factors for diseases, and track health outcomes over time. Researchers use statistical methods to analyze data from clinical trials, surveys, and other sources to inform medical practice and policy.
  • Social Science Research: Quantitative research is used in social science research to study human behavior, attitudes, and social structures. Researchers use surveys, experiments, and other quantitative methods to collect data that can inform social policies, educational programs, and community interventions.
  • Education Research: Quantitative research is used in education research to study the effectiveness of teaching methods, assess student learning outcomes, and identify factors that influence student success. Researchers use experimental and quasi-experimental designs, as well as surveys and other quantitative methods, to collect and analyze data.
  • Environmental Research: Quantitative research is used in environmental research to study the impact of human activities on the environment, assess the effectiveness of conservation strategies, and identify ways to reduce environmental risks. Researchers use statistical methods to analyze data from field studies, experiments, and other sources.

Characteristics of Quantitative Research

Here are some key characteristics of quantitative research:

  • Numerical data : Quantitative research involves collecting numerical data through standardized methods such as surveys, experiments, and observational studies. This data is analyzed using statistical methods to identify patterns and relationships.
  • Large sample size: Quantitative research often involves collecting data from a large sample of individuals or groups in order to increase the reliability and generalizability of the findings.
  • Objective approach: Quantitative research aims to be objective and impartial in its approach, focusing on the collection and analysis of data rather than personal beliefs, opinions, or experiences.
  • Control over variables: Quantitative research often involves manipulating variables to test hypotheses and establish cause-and-effect relationships. Researchers aim to control for extraneous variables that may impact the results.
  • Replicable : Quantitative research aims to be replicable, meaning that other researchers should be able to conduct similar studies and obtain similar results using the same methods.
  • Statistical analysis: Quantitative research involves using statistical tools and techniques to analyze the numerical data collected during the research process. Statistical analysis allows researchers to identify patterns, trends, and relationships between variables, and to test hypotheses and theories.
  • Generalizability: Quantitative research aims to produce findings that can be generalized to larger populations beyond the specific sample studied. This is achieved through the use of random sampling methods and statistical inference.

Examples of Quantitative Research

Here are some examples of quantitative research in different fields:

  • Market Research: A company conducts a survey of 1000 consumers to determine their brand awareness and preferences. The data is analyzed using statistical methods to identify trends and patterns that can inform marketing strategies.
  • Health Research : A researcher conducts a randomized controlled trial to test the effectiveness of a new drug for treating a particular medical condition. The study involves collecting data from a large sample of patients and analyzing the results using statistical methods.
  • Social Science Research : A sociologist conducts a survey of 500 people to study attitudes toward immigration in a particular country. The data is analyzed using statistical methods to identify factors that influence these attitudes.
  • Education Research: A researcher conducts an experiment to compare the effectiveness of two different teaching methods for improving student learning outcomes. The study involves randomly assigning students to different groups and collecting data on their performance on standardized tests.
  • Environmental Research : A team of researchers conduct a study to investigate the impact of climate change on the distribution and abundance of a particular species of plant or animal. The study involves collecting data on environmental factors and population sizes over time and analyzing the results using statistical methods.
  • Psychology : A researcher conducts a survey of 500 college students to investigate the relationship between social media use and mental health. The data is analyzed using statistical methods to identify correlations and potential causal relationships.
  • Political Science: A team of researchers conducts a study to investigate voter behavior during an election. They use survey methods to collect data on voting patterns, demographics, and political attitudes, and analyze the results using statistical methods.

How to Conduct Quantitative Research

Here is a general overview of how to conduct quantitative research:

  • Develop a research question: The first step in conducting quantitative research is to develop a clear and specific research question. This question should be based on a gap in existing knowledge, and should be answerable using quantitative methods.
  • Develop a research design: Once you have a research question, you will need to develop a research design. This involves deciding on the appropriate methods to collect data, such as surveys, experiments, or observational studies. You will also need to determine the appropriate sample size, data collection instruments, and data analysis techniques.
  • Collect data: The next step is to collect data. This may involve administering surveys or questionnaires, conducting experiments, or gathering data from existing sources. It is important to use standardized methods to ensure that the data is reliable and valid.
  • Analyze data : Once the data has been collected, it is time to analyze it. This involves using statistical methods to identify patterns, trends, and relationships between variables. Common statistical techniques include correlation analysis, regression analysis, and hypothesis testing.
  • Interpret results: After analyzing the data, you will need to interpret the results. This involves identifying the key findings, determining their significance, and drawing conclusions based on the data.
  • Communicate findings: Finally, you will need to communicate your findings. This may involve writing a research report, presenting at a conference, or publishing in a peer-reviewed journal. It is important to clearly communicate the research question, methods, results, and conclusions to ensure that others can understand and replicate your research.

When to use Quantitative Research

Here are some situations when quantitative research can be appropriate:

  • To test a hypothesis: Quantitative research is often used to test a hypothesis or a theory. It involves collecting numerical data and using statistical analysis to determine if the data supports or refutes the hypothesis.
  • To generalize findings: If you want to generalize the findings of your study to a larger population, quantitative research can be useful. This is because it allows you to collect numerical data from a representative sample of the population and use statistical analysis to make inferences about the population as a whole.
  • To measure relationships between variables: If you want to measure the relationship between two or more variables, such as the relationship between age and income, or between education level and job satisfaction, quantitative research can be useful. It allows you to collect numerical data on both variables and use statistical analysis to determine the strength and direction of the relationship.
  • To identify patterns or trends: Quantitative research can be useful for identifying patterns or trends in data. For example, you can use quantitative research to identify trends in consumer behavior or to identify patterns in stock market data.
  • To quantify attitudes or opinions : If you want to measure attitudes or opinions on a particular topic, quantitative research can be useful. It allows you to collect numerical data using surveys or questionnaires and analyze the data using statistical methods to determine the prevalence of certain attitudes or opinions.

Purpose of Quantitative Research

The purpose of quantitative research is to systematically investigate and measure the relationships between variables or phenomena using numerical data and statistical analysis. The main objectives of quantitative research include:

  • Description : To provide a detailed and accurate description of a particular phenomenon or population.
  • Explanation : To explain the reasons for the occurrence of a particular phenomenon, such as identifying the factors that influence a behavior or attitude.
  • Prediction : To predict future trends or behaviors based on past patterns and relationships between variables.
  • Control : To identify the best strategies for controlling or influencing a particular outcome or behavior.

Quantitative research is used in many different fields, including social sciences, business, engineering, and health sciences. It can be used to investigate a wide range of phenomena, from human behavior and attitudes to physical and biological processes. The purpose of quantitative research is to provide reliable and valid data that can be used to inform decision-making and improve understanding of the world around us.

Advantages of Quantitative Research

There are several advantages of quantitative research, including:

  • Objectivity : Quantitative research is based on objective data and statistical analysis, which reduces the potential for bias or subjectivity in the research process.
  • Reproducibility : Because quantitative research involves standardized methods and measurements, it is more likely to be reproducible and reliable.
  • Generalizability : Quantitative research allows for generalizations to be made about a population based on a representative sample, which can inform decision-making and policy development.
  • Precision : Quantitative research allows for precise measurement and analysis of data, which can provide a more accurate understanding of phenomena and relationships between variables.
  • Efficiency : Quantitative research can be conducted relatively quickly and efficiently, especially when compared to qualitative research, which may involve lengthy data collection and analysis.
  • Large sample sizes : Quantitative research can accommodate large sample sizes, which can increase the representativeness and generalizability of the results.

Limitations of Quantitative Research

There are several limitations of quantitative research, including:

  • Limited understanding of context: Quantitative research typically focuses on numerical data and statistical analysis, which may not provide a comprehensive understanding of the context or underlying factors that influence a phenomenon.
  • Simplification of complex phenomena: Quantitative research often involves simplifying complex phenomena into measurable variables, which may not capture the full complexity of the phenomenon being studied.
  • Potential for researcher bias: Although quantitative research aims to be objective, there is still the potential for researcher bias in areas such as sampling, data collection, and data analysis.
  • Limited ability to explore new ideas: Quantitative research is often based on pre-determined research questions and hypotheses, which may limit the ability to explore new ideas or unexpected findings.
  • Limited ability to capture subjective experiences : Quantitative research is typically focused on objective data and may not capture the subjective experiences of individuals or groups being studied.
  • Ethical concerns : Quantitative research may raise ethical concerns, such as invasion of privacy or the potential for harm to participants.


Inferential Statistics | An Easy Introduction & Examples

Published on September 4, 2020 by Pritha Bhandari. Revised on June 22, 2023.

While descriptive statistics summarize the characteristics of a data set, inferential statistics help you come to conclusions and make predictions based on your data.

When you have collected data from a sample , you can use inferential statistics to understand the larger population from which the sample is taken.

Inferential statistics have two main uses:

  • making estimates about populations (for example, the mean SAT score of all 11th graders in the US).
  • testing hypotheses to draw conclusions about populations (for example, the relationship between SAT scores and family income).

Table of contents

  • Descriptive versus inferential statistics
  • Estimating population parameters from sample statistics
  • Hypothesis testing
  • Frequently asked questions about inferential statistics

Descriptive statistics allow you to describe a data set, while inferential statistics allow you to make inferences based on a data set.

Descriptive statistics

Using descriptive statistics, you can report characteristics of your data:

  • The distribution concerns the frequency of each value.
  • The central tendency concerns the averages of the values.
  • The variability concerns how spread out the values are.

In descriptive statistics, there is no uncertainty – the statistics precisely describe the data that you collected. If you collect data from an entire population, you can directly compare these descriptive statistics to those from other populations.

Inferential statistics

Most of the time, you can only acquire data from samples, because it is too difficult or expensive to collect data from the whole population that you’re interested in.

While descriptive statistics can only summarize a sample’s characteristics, inferential statistics use your sample to make reasonable guesses about the larger population.

With inferential statistics, it’s important to use random and unbiased sampling methods . If your sample isn’t representative of your population, then you can’t make valid statistical inferences or generalize .

Sampling error in inferential statistics

Since the size of a sample is always smaller than the size of the population, some of the population isn’t captured by sample data. This creates sampling error , which is the difference between the true population values (called parameters) and the measured sample values (called statistics).

Sampling error arises any time you use a sample, even if your sample is random and unbiased. For this reason, there is always some uncertainty in inferential statistics. However, using probability sampling methods reduces this uncertainty.


The characteristics of samples and populations are described by numbers called statistics and parameters :

  • A statistic is a measure that describes the sample (e.g., sample mean ).
  • A parameter is a measure that describes the whole population (e.g., population mean).

Sampling error is the difference between a parameter and a corresponding statistic. Since in most cases you don’t know the real population parameter, you can use inferential statistics to estimate these parameters in a way that takes sampling error into account.

There are two important types of estimates you can make about the population: point estimates and interval estimates .

  • A point estimate is a single value estimate of a parameter. For instance, a sample mean is a point estimate of a population mean.
  • An interval estimate gives you a range of values where the parameter is expected to lie. A confidence interval is the most common type of interval estimate.

Both types of estimates are important for gathering a clear idea of where a parameter is likely to lie.

Confidence intervals

A confidence interval uses the variability around a statistic to come up with an interval estimate for a parameter. Confidence intervals are useful for estimating parameters because they take sampling error into account.

While a point estimate gives you a precise value for the parameter you are interested in, a confidence interval tells you the uncertainty of the point estimate. They are best used in combination with each other.

Each confidence interval is associated with a confidence level. A confidence level tells you the probability (as a percentage) that the interval would contain the parameter if you repeated the study.

A 95% confidence interval means that if you repeated your study with a new sample in exactly the same way 100 times, you could expect the resulting intervals to contain the true population value roughly 95 times.

Although the interval will capture the parameter that often across repeated samples, you cannot say for sure whether any single interval actually contains the population parameter. That's because you can't know the true value of the population parameter without collecting data from the full population.

However, with random sampling and a suitable sample size, you can reasonably expect your confidence interval to contain the parameter a certain percentage of the time.

For example, suppose you survey a sample of employees about their paid vacation days: your point estimate of the population mean is the sample mean, say 19 paid vacation days.
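As a rough illustration (not part of the original article), the sketch below computes a point estimate and a 95% confidence interval in Python with scipy; the vacation-day figures are simulated, not real survey data.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of paid vacation days reported by 40 employees
rng = np.random.default_rng(42)
vacation_days = rng.normal(loc=19, scale=4, size=40).round()

point_estimate = vacation_days.mean()   # sample mean (point estimate)
sem = stats.sem(vacation_days)          # standard error of the mean

# 95% confidence interval based on the t-distribution
ci_low, ci_high = stats.t.interval(0.95, df=len(vacation_days) - 1,
                                   loc=point_estimate, scale=sem)

print(f"Point estimate: {point_estimate:.1f} days")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f}) days")
```

Reporting the interval alongside the point estimate conveys both the estimate and its uncertainty, which is exactly the combination recommended above.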

Hypothesis testing is a formal process of statistical analysis using inferential statistics. The goal of hypothesis testing is to compare populations or assess relationships between variables using samples.

Hypotheses, or predictions, are tested using statistical tests. Statistical tests also estimate sampling errors so that valid inferences can be made.

Statistical tests can be parametric or non-parametric. Parametric tests are considered more statistically powerful because they are more likely to detect an effect if one exists.

Parametric tests make assumptions that include the following:

  • the population that the sample comes from follows a normal distribution of scores
  • the sample size is large enough to represent the population
  • the variances, a measure of variability, of each group being compared are similar

When your data violates any of these assumptions, non-parametric tests are more suitable. Non-parametric tests are called “distribution-free tests” because they don’t assume anything about the distribution of the population data.

Statistical tests come in three forms: tests of comparison, correlation or regression.

Comparison tests

Comparison tests assess whether there are differences in means, medians or rankings of scores of two or more groups.

To decide which test suits your aim, consider whether your data meets the conditions necessary for parametric tests, the number of samples, and the levels of measurement of your variables.

Means can only be found for interval or ratio data, while medians and rankings are more appropriate measures for ordinal data.

Correlation tests

Correlation tests determine the extent to which two variables are associated.

Although Pearson’s r is the most statistically powerful test, Spearman’s r is appropriate for interval and ratio variables when the data doesn’t follow a normal distribution.

The chi square test of independence is the only test that can be used with nominal variables.
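For illustration only, here is a minimal Python sketch of these three tests using scipy; the variable names and numbers are made up, not taken from the article.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical paired measurements (e.g., study hours and exam scores)
hours = rng.uniform(0, 20, size=50)
scores = 40 + 2.5 * hours + rng.normal(0, 8, size=50)

r, p_pearson = stats.pearsonr(hours, scores)       # parametric, assumes normality
rho, p_spearman = stats.spearmanr(hours, scores)   # rank-based, distribution-free

print(f"Pearson r    = {r:.2f} (p = {p_pearson:.3f})")
print(f"Spearman rho = {rho:.2f} (p = {p_spearman:.3f})")

# Chi-square test of independence for two nominal variables
# (rows: groups, columns: outcome categories)
table = np.array([[30, 20],
                  [18, 32]])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)
print(f"Chi-square = {chi2:.2f}, p = {p_chi2:.3f}")
```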

Regression tests

Regression tests demonstrate whether changes in predictor variables cause changes in an outcome variable. You can decide which regression test to use based on the number and types of variables you have as predictors and outcomes.

Most of the commonly used regression tests are parametric. If your data is not normally distributed, you can perform data transformations.

Data transformations help you make your data normally distributed using mathematical operations, like taking the square root of each value.
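As a small, hypothetical sketch (not from the original article), the snippet below applies a square-root transformation to simulated right-skewed data and re-checks normality with a Shapiro-Wilk test from scipy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated right-skewed data (e.g., reaction times in seconds)
raw = rng.exponential(scale=2.0, size=200)

w_raw, p_raw = stats.shapiro(raw)
print(f"Raw data:         W = {w_raw:.3f}, p = {p_raw:.4f}")  # low p suggests non-normality

# A square-root (or log) transformation often reduces right skew;
# always re-check the distribution after transforming.
transformed = np.sqrt(raw)
w_t, p_t = stats.shapiro(transformed)
print(f"Sqrt-transformed: W = {w_t:.3f}, p = {p_t:.4f}")
```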


Descriptive statistics summarize the characteristics of a data set. Inferential statistics allow you to test a hypothesis or assess whether your data is generalizable to the broader population.

A statistic refers to measures about the sample, while a parameter refers to measures about the population.

A sampling error is the difference between a population parameter and a sample statistic.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.


What is Statistical Analysis? Types, Methods, Software, Examples

Appinio Research · 29.02.2024 · 31min read

Ever wondered how we make sense of vast amounts of data to make informed decisions? Statistical analysis is the answer. In our data-driven world, statistical analysis serves as a powerful tool to uncover patterns, trends, and relationships hidden within data. From predicting sales trends to assessing the effectiveness of new treatments, statistical analysis empowers us to derive meaningful insights and drive evidence-based decision-making across various fields and industries. In this guide, we'll explore the fundamentals of statistical analysis, popular methods, software tools, practical examples, and best practices to help you harness the power of statistics effectively. Whether you're a novice or an experienced analyst, this guide will equip you with the knowledge and skills to navigate the world of statistical analysis with confidence.

What is Statistical Analysis?

Statistical analysis is a methodical process of collecting, analyzing, interpreting, and presenting data to uncover patterns, trends, and relationships. It involves applying statistical techniques and methodologies to make sense of complex data sets and draw meaningful conclusions.

Importance of Statistical Analysis

Statistical analysis plays a crucial role in various fields and industries due to its numerous benefits and applications:

  • Informed Decision Making : Statistical analysis provides valuable insights that inform decision-making processes in business, healthcare, government, and academia. By analyzing data, organizations can identify trends, assess risks, and optimize strategies for better outcomes.
  • Evidence-Based Research : Statistical analysis is fundamental to scientific research, enabling researchers to test hypotheses, draw conclusions, and validate theories using empirical evidence. It helps researchers quantify relationships, assess the significance of findings, and advance knowledge in their respective fields.
  • Quality Improvement : In manufacturing and quality management, statistical analysis helps identify defects, improve processes, and enhance product quality. Techniques such as Six Sigma and Statistical Process Control (SPC) are used to monitor performance, reduce variation, and achieve quality objectives.
  • Risk Assessment : In finance, insurance, and investment, statistical analysis is used for risk assessment and portfolio management. By analyzing historical data and market trends, analysts can quantify risks, forecast outcomes, and make informed decisions to mitigate financial risks.
  • Predictive Modeling : Statistical analysis enables predictive modeling and forecasting in various domains, including sales forecasting, demand planning, and weather prediction. By analyzing historical data patterns, predictive models can anticipate future trends and outcomes with reasonable accuracy.
  • Healthcare Decision Support : In healthcare, statistical analysis is integral to clinical research, epidemiology, and healthcare management. It helps healthcare professionals assess treatment effectiveness, analyze patient outcomes, and optimize resource allocation for improved patient care.

Statistical Analysis Applications

Statistical analysis finds applications across diverse domains and disciplines, including:

  • Business and Economics : Market research , financial analysis, econometrics, and business intelligence.
  • Healthcare and Medicine : Clinical trials, epidemiological studies, healthcare outcomes research, and disease surveillance.
  • Social Sciences : Survey research, demographic analysis, psychology experiments, and public opinion polls.
  • Engineering : Reliability analysis, quality control, process optimization, and product design.
  • Environmental Science : Environmental monitoring, climate modeling, and ecological research.
  • Education : Educational research, assessment, program evaluation, and learning analytics.
  • Government and Public Policy : Policy analysis, program evaluation, census data analysis, and public administration.
  • Technology and Data Science : Machine learning, artificial intelligence, data mining, and predictive analytics.

These applications demonstrate the versatility and significance of statistical analysis in addressing complex problems and informing decision-making across various sectors and disciplines.

Fundamentals of Statistics

Understanding the fundamentals of statistics is crucial for conducting meaningful analyses. Let's delve into some essential concepts that form the foundation of statistical analysis.

Basic Concepts

Statistics is the science of collecting, organizing, analyzing, and interpreting data to make informed decisions or conclusions. To embark on your statistical journey, familiarize yourself with these fundamental concepts:

  • Population vs. Sample : A population comprises all the individuals or objects of interest in a study, while a sample is a subset of the population selected for analysis. Understanding the distinction between these two entities is vital, as statistical analyses often rely on samples to draw conclusions about populations.
  • Independent Variables : Variables that are manipulated or controlled in an experiment.
  • Dependent Variables : Variables that are observed or measured in response to changes in independent variables.
  • Parameters vs. Statistics : Parameters are numerical measures that describe a population, whereas statistics are numerical measures that describe a sample. For instance, the population mean is denoted by μ (mu), while the sample mean is denoted by x̄ (x-bar).

Descriptive Statistics

Descriptive statistics involve methods for summarizing and describing the features of a dataset. These statistics provide insights into the central tendency, variability, and distribution of the data. Standard measures of descriptive statistics include:

  • Mean : The arithmetic average of a set of values, calculated by summing all values and dividing by the number of observations.
  • Median : The middle value in a sorted list of observations.
  • Mode : The value that appears most frequently in a dataset.
  • Range : The difference between the maximum and minimum values in a dataset.
  • Variance : The average of the squared differences from the mean.
  • Standard Deviation : The square root of the variance, providing a measure of the average distance of data points from the mean.
  • Graphical Techniques : Graphical representations, including histograms, box plots, and scatter plots, offer visual insights into the distribution and relationships within a dataset. These visualizations aid in identifying patterns, outliers, and trends.
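As a quick illustration of the measures listed above, the following Python snippet uses only the standard-library statistics module on a made-up set of exam scores.

```python
import statistics as st

# Hypothetical exam scores
scores = [55, 62, 68, 70, 70, 74, 78, 81, 85, 92]

print("Mean:           ", st.mean(scores))
print("Median:         ", st.median(scores))
print("Mode:           ", st.mode(scores))
print("Range:          ", max(scores) - min(scores))
print("Sample variance:", st.variance(scores))
print("Sample std dev: ", st.stdev(scores))
```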

Inferential Statistics

Inferential statistics enable researchers to draw conclusions or make predictions about populations based on sample data. These methods allow for generalizations beyond the observed data. Fundamental techniques in inferential statistics include:

  • Null Hypothesis (H0) : The hypothesis that there is no significant difference or relationship.
  • Alternative Hypothesis (H1) : The hypothesis that there is a significant difference or relationship.
  • Confidence Intervals : Confidence intervals provide a range of plausible values for a population parameter. They offer insights into the precision of sample estimates and the uncertainty associated with those estimates.
  • Regression Analysis : Regression analysis examines the relationship between one or more independent variables and a dependent variable. It allows for the prediction of the dependent variable based on the values of the independent variables.
  • Sampling Methods : Sampling methods, such as simple random sampling, stratified sampling, and cluster sampling, are employed to ensure that sample data are representative of the population of interest. These methods help mitigate biases and improve the generalizability of results.

Probability Distributions

Probability distributions describe the likelihood of different outcomes in a statistical experiment. Understanding these distributions is essential for modeling and analyzing random phenomena. Some common probability distributions include:

  • Normal Distribution : The normal distribution, also known as the Gaussian distribution, is characterized by a symmetric, bell-shaped curve. Many natural phenomena follow this distribution, making it widely applicable in statistical analysis.
  • Binomial Distribution : The binomial distribution describes the number of successes in a fixed number of independent Bernoulli trials. It is commonly used to model binary outcomes, such as success or failure, heads or tails.
  • Poisson Distribution : The Poisson distribution models the number of events occurring in a fixed interval of time or space. It is often used to analyze rare or discrete events, such as the number of customer arrivals in a queue within a given time period.
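The snippet below is a small illustrative sketch (not from the original text) that draws samples from these three distributions with NumPy and checks that the sample means and variances behave as expected.

```python
import numpy as np

rng = np.random.default_rng(7)

# Normal: e.g., heights in cm with mean 170 and standard deviation 10
heights = rng.normal(loc=170, scale=10, size=10_000)

# Binomial: number of successes in 20 independent trials with p = 0.3
successes = rng.binomial(n=20, p=0.3, size=10_000)

# Poisson: number of customer arrivals per hour with an average rate of 4
arrivals = rng.poisson(lam=4, size=10_000)

for name, sample in [("normal", heights), ("binomial", successes), ("poisson", arrivals)]:
    print(f"{name:9s} mean = {sample.mean():6.2f}, variance = {sample.var():6.2f}")
```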

Types of Statistical Analysis

Statistical analysis encompasses a diverse range of methods and approaches, each suited to different types of data and research questions. Understanding the various types of statistical analysis is essential for selecting the most appropriate technique for your analysis. Let's explore some common distinctions in statistical analysis methods.

Parametric vs. Non-parametric Analysis

Parametric and non-parametric analyses represent two broad categories of statistical methods, each with its own assumptions and applications.

  • Parametric Analysis : Parametric methods assume that the data follow a specific probability distribution, often the normal distribution. These methods rely on estimating parameters (e.g., means, variances) from the data. Parametric tests typically provide more statistical power but require stricter assumptions. Examples of parametric tests include t-tests, ANOVA, and linear regression.
  • Non-parametric Analysis : Non-parametric methods make fewer assumptions about the underlying distribution of the data. Instead of estimating parameters, non-parametric tests rely on ranks or other distribution-free techniques. Non-parametric tests are often used when data do not meet the assumptions of parametric tests or when dealing with ordinal or non-normal data. Examples of non-parametric tests include the Wilcoxon rank-sum test, Kruskal-Wallis test, and Spearman correlation.
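To make the contrast concrete, here is a minimal hypothetical sketch that compares two simulated groups with a parametric t-test and its non-parametric counterpart, the Wilcoxon rank-sum (Mann-Whitney U) test, using scipy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical outcome scores for two independent groups
group_a = rng.normal(loc=52, scale=10, size=30)
group_b = rng.normal(loc=47, scale=10, size=30)

# Parametric: independent-samples t-test (assumes roughly normal data)
t_stat, p_t = stats.ttest_ind(group_a, group_b)

# Non-parametric alternative: Mann-Whitney U / Wilcoxon rank-sum test
u_stat, p_u = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test:       t = {t_stat:.2f}, p = {p_t:.3f}")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {p_u:.3f}")
```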

Descriptive vs. Inferential Analysis

Descriptive and inferential analyses serve distinct purposes in statistical analysis, focusing on summarizing data and making inferences about populations, respectively.

  • Descriptive Analysis : Descriptive statistics aim to describe and summarize the features of a dataset. These statistics provide insights into the central tendency, variability, and distribution of the data. Descriptive analysis techniques include measures of central tendency (e.g., mean, median, mode), measures of dispersion (e.g., variance, standard deviation), and graphical representations (e.g., histograms, box plots).
  • Inferential Analysis : Inferential statistics involve making inferences or predictions about populations based on sample data. These methods allow researchers to generalize findings from the sample to the larger population. Inferential analysis techniques include hypothesis testing, confidence intervals, regression analysis, and sampling methods. These methods help researchers draw conclusions about population parameters, such as means, proportions, or correlations, based on sample data.

Exploratory vs. Confirmatory Analysis

Exploratory and confirmatory analyses represent two different approaches to data analysis, each serving distinct purposes in the research process.

  • Exploratory Analysis : Exploratory data analysis (EDA) focuses on exploring data to discover patterns, relationships, and trends. EDA techniques involve visualizing data, identifying outliers, and generating hypotheses for further investigation. Exploratory analysis is particularly useful in the early stages of research when the goal is to gain insights and generate hypotheses rather than confirm specific hypotheses.
  • Confirmatory Analysis : Confirmatory data analysis involves testing predefined hypotheses or theories based on prior knowledge or assumptions. Confirmatory analysis follows a structured approach, where hypotheses are tested using appropriate statistical methods. Confirmatory analysis is common in hypothesis-driven research, where the goal is to validate or refute specific hypotheses using empirical evidence. Techniques such as hypothesis testing, regression analysis, and experimental design are often employed in confirmatory analysis.

Methods of Statistical Analysis

Statistical analysis employs various methods to extract insights from data and make informed decisions. Let's explore some of the key methods used in statistical analysis and their applications.

Hypothesis Testing

Hypothesis testing is a fundamental concept in statistics, allowing researchers to make decisions about population parameters based on sample data. The process involves formulating null and alternative hypotheses, selecting an appropriate test statistic, determining the significance level, and interpreting the results. Standard hypothesis tests include:

  • t-tests : Used to compare means between two groups.
  • ANOVA (Analysis of Variance) : Extends the t-test to compare means across multiple groups.
  • Chi-square test : Assesses the association between categorical variables.

Regression Analysis

Regression analysis explores the relationship between one or more independent variables and a dependent variable. It is widely used in predictive modeling and understanding the impact of variables on outcomes. Key types of regression analysis include:

  • Simple Linear Regression : Examines the linear relationship between one independent variable and a dependent variable.
  • Multiple Linear Regression : Extends simple linear regression to analyze the relationship between multiple independent variables and a dependent variable.
  • Logistic Regression : Used for predicting binary outcomes or modeling probabilities.
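As an illustrative sketch only (the variable names and data are invented), the snippet below fits a logistic regression with statsmodels to model a binary outcome from a single predictor.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)

# Hypothetical predictor (hours of product usage) and binary outcome (purchase: 1 = yes)
hours = rng.uniform(0, 10, size=200)
true_prob = 1 / (1 + np.exp(-(-2.0 + 0.6 * hours)))  # assumed underlying relationship
purchased = rng.binomial(1, true_prob)

X = sm.add_constant(hours)                 # add an intercept column
model = sm.Logit(purchased, X).fit(disp=False)

print(model.summary())
# exp(coefficient) gives the odds ratio for one extra hour of usage
print("Odds ratio per extra hour:", np.exp(model.params[1]))
```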

Analysis of Variance (ANOVA)

ANOVA is a statistical technique used to compare means across two or more groups. It partitions the total variability in the data into components attributable to different sources, such as between-group differences and within-group variability. ANOVA is commonly used in experimental design and hypothesis testing scenarios.

Time Series Analysis

Time series analysis deals with analyzing data collected or recorded at successive time intervals. It helps identify patterns, trends, and seasonality in the data. Time series analysis techniques include:

  • Trend Analysis : Identifying long-term trends or patterns in the data.
  • Seasonal Decomposition : Separating the data into seasonal, trend, and residual components.
  • Forecasting : Predicting future values based on historical data.
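The following hypothetical sketch decomposes a simulated monthly sales series into trend, seasonal, and residual components with statsmodels; the data and seasonal pattern are made up for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

rng = np.random.default_rng(11)

# Hypothetical 4 years of monthly sales with a trend, yearly seasonality, and noise
months = pd.date_range("2020-01-01", periods=48, freq="MS")
trend = np.linspace(100, 160, 48)
seasonal = 15 * np.sin(2 * np.pi * np.arange(48) / 12)
sales = pd.Series(trend + seasonal + rng.normal(0, 5, 48), index=months)

# Additive decomposition with a 12-month seasonal period
result = seasonal_decompose(sales, model="additive", period=12)
print(result.trend.dropna().head())
print(result.seasonal.head(12))
```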

Survival Analysis

Survival analysis is used to analyze time-to-event data, such as time until death, failure, or occurrence of an event of interest. It is widely used in medical research, engineering, and social sciences to analyze survival probabilities and hazard rates over time.

Factor Analysis

Factor analysis is a statistical method used to identify underlying factors or latent variables that explain patterns of correlations among observed variables. It is commonly used in psychology, sociology, and market research to uncover underlying dimensions or constructs.

Cluster Analysis

Cluster analysis is a multivariate technique that groups similar objects or observations into clusters or segments based on their characteristics. It is widely used in market segmentation, image processing, and biological classification.

Principal Component Analysis (PCA)

PCA is a dimensionality reduction technique used to transform high-dimensional data into a lower-dimensional space while preserving most of the variability in the data. It identifies orthogonal axes (principal components) that capture the maximum variance in the data. PCA is useful for data visualization, feature selection, and data compression.
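Below is a minimal, hypothetical sketch of PCA with scikit-learn: five correlated variables are simulated, standardised, and projected onto two principal components.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(8)

# Hypothetical dataset: 200 observations of 5 correlated variables
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 5))
X = latent @ loadings + rng.normal(scale=0.3, size=(200, 5))

# Standardise, then project onto the first two principal components
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Reduced shape:", X_reduced.shape)
```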

How to Choose the Right Statistical Analysis Method?

Selecting the appropriate statistical method is crucial for obtaining accurate and meaningful results from your data analysis.

Understanding Data Types and Distribution

Before choosing a statistical method, it's essential to understand the types of data you're working with and their distribution. Different statistical methods are suitable for different types of data:

  • Continuous vs. Categorical Data : Determine whether your data are continuous (e.g., height, weight) or categorical (e.g., gender, race). Parametric methods such as t-tests and regression are typically used for continuous data, while non-parametric methods like chi-square tests are suitable for categorical data.
  • Normality : Assess whether your data follows a normal distribution. Parametric methods often assume normality, so if your data are not normally distributed, non-parametric methods may be more appropriate.

Assessing Assumptions

Many statistical methods rely on certain assumptions about the data. Before applying a method, it's essential to assess whether these assumptions are met:

  • Independence : Ensure that observations are independent of each other. Violations of independence assumptions can lead to biased results.
  • Homogeneity of Variance : Verify that variances are approximately equal across groups, especially in ANOVA and regression analyses. Levene's test or Bartlett's test can be used to assess homogeneity of variance.
  • Linearity : Check for linear relationships between variables, particularly in regression analysis. Residual plots can help diagnose violations of linearity assumptions.
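As a small illustration with simulated groups, Levene's test for homogeneity of variance can be run in scipy as shown below; Bartlett's test is available as stats.bartlett with the same call pattern.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical scores for three groups that will be compared with ANOVA
g1 = rng.normal(50, 8, 40)
g2 = rng.normal(55, 8, 40)
g3 = rng.normal(53, 15, 40)   # deliberately more variable

# Levene's test: a small p-value suggests the equal-variance assumption is violated
stat, p = stats.levene(g1, g2, g3)
print(f"Levene's W = {stat:.2f}, p = {p:.4f}")
```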

Considering Research Objectives

Your research objectives should guide the selection of the appropriate statistical method.

  • What are you trying to achieve with your analysis? : Determine whether you're interested in comparing groups, predicting outcomes, exploring relationships, or identifying patterns.
  • What type of data are you analyzing? : Choose methods that are suitable for your data type and research questions.
  • Are you testing specific hypotheses or exploring data for insights? : Confirmatory analyses involve testing predefined hypotheses, while exploratory analyses focus on discovering patterns or relationships in the data.

Consulting Statistical Experts

If you're unsure about the most appropriate statistical method for your analysis, don't hesitate to seek advice from statistical experts or consultants:

  • Collaborate with Statisticians : Statisticians can provide valuable insights into the strengths and limitations of different statistical methods and help you select the most appropriate approach.
  • Utilize Resources : Take advantage of online resources, forums, and statistical software documentation to learn about different methods and their applications.
  • Peer Review : Consider seeking feedback from colleagues or peers familiar with statistical analysis to validate your approach and ensure rigor in your analysis.

By carefully considering these factors and consulting with experts when needed, you can confidently choose the suitable statistical method to address your research questions and obtain reliable results.

Statistical Analysis Software

Choosing the right software for statistical analysis is crucial for efficiently processing and interpreting your data. In addition to statistical analysis software, it's essential to consider tools for data collection, which lay the foundation for meaningful analysis.

What is Statistical Analysis Software?

Statistical software provides a range of tools and functionalities for data analysis, visualization, and interpretation. These software packages offer user-friendly interfaces and robust analytical capabilities, making them indispensable tools for researchers, analysts, and data scientists.

  • Graphical User Interface (GUI) : Many statistical software packages offer intuitive GUIs that allow users to perform analyses using point-and-click interfaces. This makes statistical analysis accessible to users with varying levels of programming expertise.
  • Scripting and Programming : Advanced users can leverage scripting and programming capabilities within statistical software to automate analyses, customize functions, and extend the software's functionality.
  • Visualization : Statistical software often includes built-in visualization tools for creating charts, graphs, and plots to visualize data distributions, relationships, and trends.
  • Data Management : These software packages provide features for importing, cleaning, and manipulating datasets, ensuring data integrity and consistency throughout the analysis process.

Popular Statistical Analysis Software

Several statistical software packages are widely used in various industries and research domains. Some of the most popular options include:

  • R : R is a free, open-source programming language and software environment for statistical computing and graphics. It offers a vast ecosystem of packages for data manipulation, visualization, and analysis, making it a popular choice among statisticians and data scientists.
  • Python : Python is a versatile programming language with robust libraries like NumPy, SciPy, and pandas for data analysis and scientific computing. Python's simplicity and flexibility make it an attractive option for statistical analysis, particularly for users with programming experience.
  • SPSS : SPSS (Statistical Package for the Social Sciences) is a comprehensive statistical software package widely used in social science research, marketing, and healthcare. It offers a user-friendly interface and a wide range of statistical procedures for data analysis and reporting.
  • SAS : SAS (Statistical Analysis System) is a powerful statistical software suite used for data management, advanced analytics, and predictive modeling. SAS is commonly employed in industries such as healthcare, finance, and government for data-driven decision-making.
  • Stata : Stata is a statistical software package that provides tools for data analysis, manipulation, and visualization. It is popular in academic research, economics, and social sciences for its robust statistical capabilities and ease of use.
  • MATLAB : MATLAB is a high-level programming language and environment for numerical computing and visualization. It offers built-in functions and toolboxes for statistical analysis, machine learning, and signal processing.

Data Collection Software

In addition to statistical analysis software, data collection software plays a crucial role in the research process. These tools facilitate data collection, management, and organization from various sources, ensuring data quality and reliability.


How to Choose the Right Statistical Analysis Software?

When selecting software for statistical analysis and data collection, consider the following factors:

  • Compatibility : Ensure the software is compatible with your operating system, hardware, and data formats.
  • Usability : Choose software that aligns with your level of expertise and provides features that meet your analysis and data collection requirements.
  • Integration : Consider whether the software integrates with other tools and platforms in your workflow, such as data visualization software or data storage systems.
  • Cost and Licensing : Evaluate the cost of licensing or subscription fees, as well as any additional costs for training, support, or maintenance.

By carefully evaluating these factors and considering your specific analysis and data collection needs, you can select the right software tools to support your research objectives and drive meaningful insights from your data.

Statistical Analysis Examples

Understanding statistical analysis methods is best achieved through practical examples. Let's explore three examples that demonstrate the application of statistical techniques in real-world scenarios.

Example 1: Linear Regression

Scenario : A marketing analyst wants to understand the relationship between advertising spending and sales revenue for a product.

Data : The analyst collects data on monthly advertising expenditures (in dollars) and corresponding sales revenue (in dollars) over the past year.

Analysis : Using simple linear regression, the analyst fits a regression model to the data, where advertising spending is the independent variable (X) and sales revenue is the dependent variable (Y). The regression analysis estimates the linear relationship between advertising spending and sales revenue, allowing the analyst to predict sales based on advertising expenditures.

Result : The regression analysis reveals a statistically significant positive relationship between advertising spending and sales revenue. For every additional dollar spent on advertising, sales revenue increases by an estimated amount (slope coefficient). The analyst can use this information to optimize advertising budgets and forecast sales performance.
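A hypothetical version of this analysis could look like the sketch below, which fits an ordinary least squares regression with statsmodels on simulated advertising and sales figures (the numbers are invented, not the analyst's actual data).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)

# Hypothetical monthly data: advertising spend ($) and sales revenue ($)
ad_spend = rng.uniform(1_000, 10_000, size=12)
sales = 5_000 + 3.2 * ad_spend + rng.normal(0, 4_000, size=12)

X = sm.add_constant(ad_spend)   # intercept + advertising spend
model = sm.OLS(sales, X).fit()

print(model.summary())
# Slope: estimated additional revenue per extra advertising dollar
print("Estimated slope:", model.params[1])
```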

Example 2: Hypothesis Testing

Scenario : A pharmaceutical company develops a new drug intended to lower blood pressure. The company wants to determine whether the new drug is more effective than the existing standard treatment.

Data : The company conducts a randomized controlled trial (RCT) involving two groups of participants: one group receives the new drug, and the other receives the standard treatment. Blood pressure measurements are taken before and after the treatment period.

Analysis : The company uses hypothesis testing, specifically a two-sample t-test, to compare the mean reduction in blood pressure between the two groups. The null hypothesis (H0) states that there is no difference in the mean reduction in blood pressure between the two treatments, while the alternative hypothesis (H1) suggests that the new drug is more effective.

Result : The t-test results indicate a statistically significant difference in the mean reduction in blood pressure between the two groups. The company concludes that the new drug is more effective than the standard treatment in lowering blood pressure, based on the evidence from the RCT.

Example 3: ANOVA

Scenario : A researcher wants to compare the effectiveness of three different teaching methods on student performance in a mathematics course.

Data : The researcher conducts an experiment where students are randomly assigned to one of three groups: traditional lecture-based instruction, active learning, or flipped classroom. At the end of the semester, students' scores on a standardized math test are recorded.

Analysis : The researcher performs an analysis of variance (ANOVA) to compare the mean test scores across the three teaching methods. ANOVA assesses whether there are statistically significant differences in mean scores between the groups.

Result : The ANOVA results reveal a significant difference in mean test scores between the three teaching methods. Post-hoc tests, such as Tukey's HSD (Honestly Significant Difference), can be conducted to identify which specific teaching methods differ significantly from each other in terms of student performance.
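A hypothetical sketch of this analysis in Python, with simulated test scores, might use scipy's one-way ANOVA followed by statsmodels' Tukey HSD post-hoc test.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(6)

# Hypothetical standardized test scores for three teaching methods
lecture = rng.normal(70, 8, 30)
active = rng.normal(76, 8, 30)
flipped = rng.normal(74, 8, 30)

# One-way ANOVA: are the group means different?
f_stat, p = stats.f_oneway(lecture, active, flipped)
print(f"ANOVA: F = {f_stat:.2f}, p = {p:.4f}")

# Post-hoc Tukey HSD: which pairs of methods differ?
scores = np.concatenate([lecture, active, flipped])
groups = ["lecture"] * 30 + ["active"] * 30 + ["flipped"] * 30
print(pairwise_tukeyhsd(scores, groups))
```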

These examples illustrate how statistical analysis techniques can be applied to address various research questions and make data-driven decisions in different fields. By understanding and applying these methods effectively, researchers and analysts can derive valuable insights from their data to inform decision-making and drive positive outcomes.

Statistical Analysis Best Practices

Statistical analysis is a powerful tool for extracting insights from data, but it's essential to follow best practices to ensure the validity, reliability, and interpretability of your results.

  • Clearly Define Research Questions : Before conducting any analysis, clearly define your research questions or objectives . This ensures that your analysis is focused and aligned with the goals of your study.
  • Choose Appropriate Methods : Select statistical methods suitable for your data type, research design , and objectives. Consider factors such as data distribution, sample size, and assumptions of the chosen method.
  • Preprocess Data : Clean and preprocess your data to remove errors, outliers, and missing values. Data preprocessing steps may include data cleaning, normalization, and transformation to ensure data quality and consistency.
  • Check Assumptions : Verify that the assumptions of the chosen statistical methods are met. Assumptions may include normality, homogeneity of variance, independence, and linearity. Conduct diagnostic tests or exploratory data analysis to assess assumptions.
  • Transparent Reporting : Document your analysis procedures, including data preprocessing steps, statistical methods used, and any assumptions made. Transparent reporting enhances reproducibility and allows others to evaluate the validity of your findings.
  • Consider Sample Size : Ensure that your sample size is sufficient to detect meaningful effects or relationships. Power analysis can help determine the minimum sample size required to achieve adequate statistical power.
  • Interpret Results Cautiously : Interpret statistical results with caution and consider the broader context of your research. Be mindful of effect sizes, confidence intervals, and practical significance when interpreting findings.
  • Validate Findings : Validate your findings through robustness checks, sensitivity analyses, or replication studies. Cross-validation and bootstrapping techniques can help assess the stability and generalizability of your results.
  • Avoid P-Hacking and Data Dredging : Guard against p-hacking and data dredging by pre-registering hypotheses, conducting planned analyses, and avoiding selective reporting of results. Maintain transparency and integrity in your analysis process.

By following these best practices, you can conduct rigorous and reliable statistical analyses that yield meaningful insights and contribute to evidence-based decision-making in your field.

Conclusion for Statistical Analysis

Statistical analysis is a vital tool for making sense of data and guiding decision-making across diverse fields. By understanding the fundamentals of statistical analysis, including concepts like hypothesis testing, regression analysis, and data visualization, you gain the ability to extract valuable insights from complex datasets. Moreover, selecting the appropriate statistical methods, choosing the right software, and following best practices ensure the validity and reliability of your analyses. In today's data-driven world, the ability to conduct rigorous statistical analysis is a valuable skill that empowers individuals and organizations to make informed decisions and drive positive outcomes. Whether you're a researcher, analyst, or decision-maker, mastering statistical analysis opens doors to new opportunities for understanding the world around us and unlocking the potential of data to solve real-world problems.

How to Collect Data for Statistical Analysis in Minutes?

Introducing Appinio , your gateway to effortless data collection for statistical analysis. As a real-time market research platform, Appinio specializes in delivering instant consumer insights, empowering businesses to make swift, data-driven decisions.

With Appinio, conducting your own market research is not only feasible but also exhilarating. Here's why:

  • Obtain insights in minutes, not days:  From posing questions to uncovering insights, Appinio accelerates the entire research process, ensuring rapid access to valuable data.
  • User-friendly interface:  No advanced degrees required! Our platform is designed to be intuitive and accessible to anyone, allowing you to dive into market research with confidence.
  • Targeted surveys, global reach:  Define your target audience with precision using our extensive array of demographic and psychographic characteristics, and reach respondents in over 90 countries effortlessly.


Open access · Published: 06 April 2024

Statistical analyses of ordinal outcomes in randomised controlled trials: a scoping review

Chris J. Selman, Katherine J. Lee, Kristin N. Ferguson, Clare L. Whitehead, Brett J. Manley & Robert K. Mahar

Trials, volume 25, Article number: 241 (2024)


Randomised controlled trials (RCTs) aim to estimate the causal effect of one or more interventions relative to a control. One type of outcome that can be of interest in an RCT is an ordinal outcome, which is useful to answer clinical questions regarding complex and evolving patient states. The target parameter of interest for an ordinal outcome depends on the research question and the assumptions the analyst is willing to make. This review aimed to provide an overview of how ordinal outcomes have been used and analysed in RCTs.

The review included RCTs with an ordinal primary or secondary outcome published between 2017 and 2022 in four highly ranked medical journals (the British Medical Journal, New England Journal of Medicine, The Lancet, and the Journal of the American Medical Association) identified through PubMed. Details regarding the study setting, design, the target parameter, and statistical methods used to analyse the ordinal outcome were extracted.

The search identified 309 studies, of which 144 were eligible for inclusion. The most used target parameter was an odds ratio, reported in 78 (54%) studies. The ordinal outcome was dichotomised for analysis in 47 (33%) studies, and the most common statistical model used to analyse the ordinal outcome on the full ordinal scale was the proportional odds model (64 [44%] studies). Notably, 86 (60%) studies did not explicitly check or describe the robustness of the assumptions for the statistical method(s) used.

Conclusions

The results of this review indicate that in RCTs that use an ordinal outcome, there is variation in the target parameter and the analytical approaches used, with many dichotomising the ordinal outcome. Few studies provided assurance regarding the appropriateness of the assumptions and methods used to analyse the ordinal outcome. More guidance is needed to improve the transparent reporting of the analysis of ordinal outcomes in future trials.


Randomised controlled trials (RCTs) aim to estimate the causal effect of one or more interventions relative to a control or reference intervention. Ordinal outcomes are useful in RCTs because the categories can represent multiple patient states within a single endpoint. The definition of an ordinal outcome is one that comprises monotonically ranked categories that are ordered hierarchically such that the distance between any two categories is not necessarily equal (or even meaningfully quantifiable) [ 1 ]. Ordinal outcomes should have categories that are mutually exclusive and unambiguously defined and can be used to capture improvement and deterioration relative to a baseline value where relevant [ 2 ]. If an ordinal scale is being used to capture change in patient status, then the ordinal outcome should also be symmetric to avoid favouring a better or worse health outcome [ 2 ]. Commonly used ordinal outcomes in RCTs include the modified-Rankin scale, a 7-category measure of disability following stroke or neurological insult [ 3 , 4 , 5 , 6 ], the Glasgow Outcome Scale-Extended (GOS-E), an 8-category measure of functional impairment post traumatic brain injury [ 7 ], and the World Health Organization (WHO) COVID-19 Clinical Progression Scale [ 8 ], an 11-point measure of disease severity among patients with COVID-19. The WHO Clinical Progression Scale, developed specifically for COVID-19 in 2020 [ 8 ], has been used in many RCTs evaluating COVID-19 disease severity and progression [ 9 , 10 ] and has helped to increase the familiarity of ordinal data and modelling approaches for ordinal outcomes for clinicians and statisticians alike [ 11 ].

Randomised controlled trials that use ordinal outcomes need to be designed and analysed with care. This includes the need to explicitly define the target parameter to compare the intervention groups (i.e. the target of estimation, for example, a proportional odds ratio (OR)), the analysis approach, and whether assumptions used in the analysis are valid. Although this is true for all RCTs, these issues are more complex when using an ordinal outcome compared to a binary or continuous outcome. For example, the choice of target parameter for an ordinal outcome depends on both the research question [ 12 , 13 ] and the assumptions that the analyst is willing to make about the data.

One option is to preserve the ordinal nature of the outcome, which can give rise to a number of different target parameters. Principled analysis of ordinal data often relies on less familiar statistical methods and underlying assumptions. Many statistical methods have been proposed to analyse ordinal outcomes. One approach to estimate the effect of treatment on the distribution of ordinal endpoints is to use a cumulative logistic model [ 14 , 15 ]. This model uses the distribution of the cumulative log-odds of the ordinal outcome to estimate a set of ORs [ 16 ], which, for an increase in the value of a covariate, represents the odds of being in the same or higher category at each level of the ordinal scale [ 15 ]. Modelling is vastly simplified by assuming that each covariate in the model exerts the same effect on the cumulative log odds for each binary split of the ordinal outcome, regardless of the threshold. This is known as the proportional odds (PO) assumption, with the model referred to as ordered logistic regression or the PO model (we refer to the latter term herein). The PO model has the desirable properties of palindromic invariance (where the parameter estimates are equivalent, apart from a change of sign, when the order of the categories is reversed) and invariance under collapsibility (where the estimated target parameter is unchanged when categories of the response are combined or removed) [ 17 ]. Studies have shown that an ordinal analysis of the outcome using a PO model increases the statistical power relative to an analysis of the dichotomised scale [ 18 , 19 ]. The target parameter from this model, the proportional or common OR, also has a relatively intuitive interpretation [ 20 , 21 ], representing a shift in the distribution of ordinal scale scores toward a better outcome in an intervention group compared to a reference group.
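The review itself does not prescribe software, but as a purely illustrative sketch, a PO model of the kind described above could be fitted to simulated trial data in Python with statsmodels' OrderedModel; exponentiating the treatment coefficient gives the common odds ratio under the PO assumption.

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(9)

# Hypothetical RCT: binary treatment indicator and a 5-category ordinal outcome
n = 300
treatment = rng.integers(0, 2, size=n)
latent = 0.7 * treatment + rng.logistic(size=n)          # latent score for simulation
outcome = pd.cut(latent, bins=[-np.inf, -1, 0, 1, 2, np.inf],
                 labels=[1, 2, 3, 4, 5], ordered=True)

df = pd.DataFrame({"treatment": treatment, "outcome": outcome})

# Proportional odds (cumulative logit) model
model = OrderedModel(df["outcome"], df[["treatment"]], distr="logit")
res = model.fit(method="bfgs", disp=False)

print(res.summary())
print("Common odds ratio for treatment:", np.exp(res.params["treatment"]))
```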

The PO model approach makes the assumption that the odds are proportional for each binary split of the ordinal outcome. If this assumption is violated then the proportional OR may be misleading in certain circumstances. Specifically, violation to PO can affect type I or II errors and/or may distort the magnitude of the treatment effect. For example, violation of proportional odds can increase the likelihood of making a type I error since the model may incorrectly identify evidence of a relationship between the treatment and outcome. Violation of the proportional odds assumption may also increase the likelihood of a type II error as the model may fail to identify a relationship between the treatment and the ordinal outcome because the model may fail to capture the true complexity of the relationship. In addition, a treatment may exert a harmful effect for some categories of the ordinal outcome but exert a beneficial effect for the remaining categories, which can ‘average’ out to no treatment effect when assuming a constant OR across the levels of the ordinal scale. The violation of PO may be harmful if the interest is also to estimate predicted probabilities for the categories of the ordinal scale, which will be too low or high for some outcomes when PO is assumed. Although the PO assumption will ‘average’ the treatment effect across the categories of the ordinal outcome, this may not be a problem if all of the treatment effects for each cut-point are in the same direction and the research aim is to simply show whether the treatment is effective even in the presence of non-PO. If the PO assumption is meaningfully violated and the interest is either in the treatment effect on a specific range of the outcome or to obtain predicted probabilities for each category of the scale, the PO model can be extended to a partial proportional odds (PPO) model which allows the PO assumption to be relaxed for a specific set or for all covariates in the model [ 22 ]. There are two types of PPO models: the unconstrained PPO model, in which the cumulative log-ORs across each cut-point vary freely across some or all of the cut-points [ 23 ], and the constrained PPO model, which assumes some functional relationship between the cumulative log-ORs [ 21 ]. However, such an approach may be more inefficient than using a PO model [ 24 , 25 ].

Alternative statistical methods that can be used to analyse the ordinal outcome include multinomial regression, which estimates an OR for each category of the ordinal outcome relative to the baseline category. The disadvantage of multinomial regression is that the number of ORs requiring estimation increases with the number of categories in the ordinal outcome. A larger sample size may therefore be required to ensure accurate precision of the many target parameters. Other methods are the continuation ratio model or adjacent-category logistic model, though these models lack two desirable properties: palindromic invariance and invariance under collapsibility [ 15 , 17 , 26 ].

Another option is to use alternative methods, such as the Mann-Whitney U  test or Wilcoxon rank-sum test [ 27 ] (referred to as the Wilcoxon test herein). The Wilcoxon test is equivalent to the PO model with a single binary exposure variable [ 15 , 28 ]. The treatment effect from a Wilcoxon test is the concordance probability that represents the probability that a randomly chosen observation from a treatment group is greater than a randomly chosen observation from a control group [ 29 , 30 ]. This parameter closely mirrors the OR derived from the PO model. Importantly, the direction of the OR from the PO model always agrees with the direction of the concordance probability. The disadvantages of the Wilcoxon test are that the concordance probability may be unfamiliar to clinicians, and the Wilcoxon test cannot be adjusted for covariates.

Another option is to dichotomise the ordinal outcome and use an OR or risk difference as the target parameter, estimated using logistic or binomial regression. This produces an effect estimate with clear clinical interpretations that may be suitable for specific clinical settings. The disadvantage of dichotomising an ordinal outcome is that it means discarding potentially useful information within the levels of the scale. This means that the trial may require a larger sample size to maintain the same level of statistical power to detect a clinically important treatment effect [ 19 ], which may not be feasible in all RCTs depending on cost constraints or the rate of recruitment. The decision to dichotomise may also depend on when the outcome is being measured. This was highlighted in a study that showed that an ordinal analysis of the modified-Rankin scale captured differences in long-term outcomes in survivors of stroke better than an analysis that dichotomised the ordinal outcome [ 3 , 31 ].

An alternative to dichotomisation is to treat the ordinal outcome as continuous and focus on the mean difference as the target parameter. This choice to treat the outcome as continuous may be based on the number of categories, where the more categories, the more the outcome resembles a continuum if proximate categories measure similar states or if the scale reflects a latent continuous variable. This has the advantage that modelling is straightforward and familiar, but it can lead to ill-defined clinical interpretations of the treatment effect since the difference between proximate categories is neither equal nor necessarily quantifiable. Such an analysis also wrongly assumes that the outcome has an unbounded range.

There has been commentary [ 32 ] and research conducted on the methodology of using ordinal outcomes in certain RCT settings that have mainly focused on the benefit of an ordinal analysis using a PO model [ 19 , 33 , 34 , 35 ], including investigations into the use of a PPO model when the PO assumption is violated [ 36 ]. However, these studies have primarily focused on a limited number of statistical methods and in mostly specific medical areas such as neurology and may not be applicable more generally. Given the growing use of ordinal outcomes in RCTs, it is crucial to gain a deeper understanding of how ordinal outcomes are utilised in practice. This understanding will help identify any issues in the use of ordinal outcomes in RCTs and facilitate discussions on improving the reporting and analysis of such outcomes. To address this, we conducted a scoping review to systematically examine the use and analysis of ordinal outcomes in the current literature. Specifically, we aimed to:

Identify which target parameters are of interest in RCTs that use an ordinal outcome and whether these are explicitly defined.

Describe how ordinal outcomes are analysed in RCTs to estimate a treatment effect.

Describe whether RCTs that use an ordinal outcome adequately report key methodological aspects specific to the analysis of the ordinal outcome.

A pre-specified protocol was developed for this scoping review [ 37 ]. Deviations from the protocol are outlined in Additional file 1 . Here, we provide an overview of the protocol and present the findings from the review which have been reported using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist [ 38 ].

Eligibility criteria

Studies were included in the review if they were published in one of four highly ranked medical journals (British Medical Journal (BMJ), New England Journal of Medicine (NEJM), Journal of the American Medical Association (JAMA), or The Lancet) between 1 January 2017 and 31 July 2022 and reported the results of at least one RCT (e.g. if reporting results from multiple trials) with either a primary or secondary outcome that was measured on an ordinal scale. These journals were chosen because they are leading medical journals that publish original and peer-reviewed research with primarily clinical aims and have been used in other reviews of trial methodology [ 39 , 40 ]. RCTs were defined using the Cochrane definition of an RCT, which is a study that prospectively assigns individuals to one of two (or more) interventions using some random or quasi-random method of allocation [ 41 ].

Studies were excluded from this review if they were written in a language other than English, since we did not have sufficient resources to translate studies written in another language. We also excluded studies which were purely methodological, where the abstract or full-text was not available, which reported data from non-human subjects, and those that provided a commentary, review opinion, or were description only. Manuscripts that reported only a trial protocol or statistical analysis plan were also excluded, since one of the main objectives of this review was to determine which statistical methods are being used to analyse trial data. Studies that used ordinal outcomes that were measured on a numerical rating or visual analogue scale were also excluded. Although these scales are often considered ordinal, they imply equidistance between contiguous categories, and can conceivably be analysed as continuous data.

Information sources

Studies were identified and included in the review by searching the online bibliographic database, PubMed, executed on 3 August, 2022.

Search strategy

The search strategy for this review was developed by CJS in consultation with KJL and RKM. The search strategy employed terms that have been developed to identify RCTs [ 41 ] and terms that have been used to describe an ordinal outcome in published manuscripts for RCTs. The complete search strategy that was used in this review is described in Table 1 .

Selection of sources of evidence

There was no pre-specified sample size for this review. All eligible studies that were identified via the search strategy were included in the review.

Piloting of the eligibility criteria was conducted by CJS and RKM, who independently assessed the titles and abstracts of 20 studies to ensure consistency between reviewers. CJS then performed the search on the PubMed database. All titles and abstracts identified were extracted into Covidence, a web-based tool for managing systematic reviews [ 42 ]. A two-phase screening process was employed: all abstracts and titles were screened by CJS in the first phase, and studies that were not excluded then moved to the second phase, in which the full text was evaluated against the eligibility criteria by CJS. A random sample of 40 studies was also assessed for eligibility by a second reviewer (one of KJL, RKM, BJM, or CLW). All studies that were deemed eligible were included in the data extraction.

Data extraction

A data extraction questionnaire was developed in Covidence [ 42 ], piloted by CJS and RKM on a sample of 10 studies, and then refined. The final version of the questionnaire is shown in Additional file 2 , and a full list of the data extraction items is provided in Table 2 . Data were extracted from both the main manuscript and any supplementary material, including statistical analysis plans. CJS extracted data from all eligible studies in the review. Double data extraction was performed by KJL and RKM on a random sample of 20 studies. Any uncertainties in the screening and data extraction process were discussed and resolved by consensus among all reviewers. Simplifications and assumptions that were made for eligibility and data extraction are outlined in Additional file 1 .

Synthesis of results

The data extracted from Covidence were cleaned and analysed using Stata [ 43 ]. Descriptive statistics were used to summarise the data: frequencies and percentages were reported for categorical variables, and medians and interquartile ranges (IQRs) for continuous variables. Qualitative data were synthesised in a narrative format.

Results of the search

The initial search identified 309 studies, of which 46 were excluded for not being an RCT. There were 263 studies that underwent full text review. Of these, 119 were excluded: 110 because they did not have an ordinal outcome, and nine because they were not an RCT. In total, 144 studies were eligible for data extraction [ 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 68 , 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81 , 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 , 91 , 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 , 120 , 121 , 122 , 123 , 124 , 125 , 126 , 127 , 128 , 129 , 130 , 131 , 132 , 133 , 134 , 135 , 136 , 137 , 138 , 139 , 140 , 141 , 142 , 143 , 144 , 145 , 146 , 147 , 148 , 149 , 150 , 151 , 152 , 153 , 154 , 155 , 156 , 157 , 158 , 159 , 160 , 161 , 162 , 163 , 164 , 165 , 166 , 167 , 168 , 169 , 170 , 171 , 172 , 173 , 174 , 175 , 176 , 177 , 178 , 179 , 180 , 181 , 182 , 183 , 184 , 185 , 186 , 187 ]. A flow diagram of the study selection is shown in Fig. 1 . The questionnaire that was used to extract the data from each study is provided in Additional file 2 .

Fig. 1 Flow diagram of the study selection

Study characteristics

A summary of the study characteristics is presented in Table 3 . The highest proportion of studies was published in the NEJM (61 studies, 42%), followed by JAMA (40, 28%) and The Lancet (34, 24%), with only nine studies (6%) published in the BMJ. The number of studies that used an ordinal outcome was higher in 2020 and 2021 (30, 21% in each year) compared with earlier years (21, 15% in 2019; 24, 17% in 2018; and 23, 16% in 2017). Nearly all studies were conducted in a clinical setting (141, 98%). The most common medical condition being studied was stroke (39, 28%), followed by COVID-19 (22, 16%) and atopic dermatitis (6, 4%). The most common medical field was neurology (54, 38%), followed by infectious diseases (22, 16%, all of which were COVID-19 studies), dermatology (13, 9%), and psychiatry (12, 9%). Studies were mostly funded by public sources (104, 72%). The median number of participants in the primary analysis of the ordinal outcome was 380 (interquartile range (IQR): 202–803).

Of the 144 included studies, 58 (40%) used some form of adaptive design: 47 (33%) had explicitly defined early stopping rules for efficacy or futility, 18 (13%) used sample size re-estimation, three (2%) used response-adaptive randomisation, three (2%) used covariate-adaptive randomisation, three (2%) were platform trials, and three (2%) used adaptive enrichment focused on specific subgroups of patients.

Ordinal outcomes and target parameters

A summary of the properties of the ordinal outcomes used in the studies is shown in Table 4 . An ordinal scale was used as a primary outcome in 59 (41%) studies. Most studies used an ordinal scale to describe an outcome at a single point in time (128, 89%), with 16 (11%) studies using an ordinal outcome to capture change over time. One study used a Likert scale in which the categories were ambiguously defined in the manuscript. Another study used an ordinal outcome to measure change over time, but the scale was asymmetric and biased towards a favourable outcome. The median number of categories in the ordinal outcome was 7 (IQR: 6–7), ranging from 3 to 23 categories.

There were 32 studies that determined the sample size in advance based on the ordinal outcome, of which 26 (81%) used an analytical approach and 6 (19%) used simulation to estimate the sample size. Among the studies that used an analytical approach, five reported using the Whitehead method and three reported using a t -test; for the remainder, it was unclear which specific method was used to compute the sample size.
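
As context for the analytical approaches named above, the sketch below illustrates the sample size formula commonly attributed to Whitehead (1993) for a proportional odds comparison of two arms. It is not taken from any of the reviewed studies: the function name, the control-arm category probabilities, and the anticipated odds ratio are all hypothetical, and the formula should be checked against the original reference before use.

```python
import numpy as np
from scipy.stats import norm


def whitehead_total_n(p_control, odds_ratio, alpha=0.05, power=0.9):
    """Approximate total sample size (1:1 allocation) for detecting a
    proportional odds ratio on an ordinal outcome, using the formula commonly
    attributed to Whitehead (1993):

        N = 6 * (z_{1-alpha/2} + z_{1-beta})^2
            / ( log(OR)^2 * (1 - sum(pbar_k^3)) )

    where pbar_k is the mean of the anticipated probabilities of category k
    across the two arms.
    """
    p_c = np.asarray(p_control, dtype=float)
    cum_c = np.cumsum(p_c)
    # Cumulative probabilities in the treatment arm implied by proportional odds.
    odds_c = cum_c / np.clip(1.0 - cum_c, 1e-12, None)
    cum_t = (odds_c * odds_ratio) / (1.0 + odds_c * odds_ratio)
    p_t = np.diff(cum_t, prepend=0.0)
    p_bar = (p_c + p_t) / 2.0

    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 6 * z**2 / (np.log(odds_ratio) ** 2 * (1 - np.sum(p_bar**3)))


# Illustrative values only: a 6-category scale and an anticipated proportional OR of 1.5.
print(round(whitehead_total_n([0.10, 0.20, 0.30, 0.20, 0.15, 0.05], 1.5)))
```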

The ordinal outcome was dichotomised for analysis in 47 ( \(33\%\) ) studies. Some justifications for the dichotomisation of the ordinal outcome included that it represented a clinically meaningful effect and/or that it was common in the analysis of the outcome in similar studies (reported in 24 studies), that the dichotomised outcome represented an agreeable endpoint based on feedback between clinicians and/or patients and families (two studies), or that the assumptions of the statistical model for the categorical outcome were violated (reported in three studies).

A variety of target parameters were used for the ordinal outcomes. The target parameter could be determined in 130 studies; however, 59 of these (45%) did not clearly or explicitly define the target parameter of interest. Among the studies where the target parameter could be determined, either from an explicit definition or inferred from the information provided in the manuscript, an OR was the most common target parameter (78, 54%), followed by a risk difference (31, 22%). A difference in means or medians was the target parameter in 11 (8%) and 8 (6%) studies respectively. There were 14 (10%) studies that did not estimate a target parameter, either because the study was descriptive in nature, the analysis used a non-parametric procedure, or the target parameter could not be determined (or some combination thereof).
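
To make the distinction between these target parameters concrete, the OR that most studies targeted can be written using the standard cumulative-odds formulation (a generic definition, not one quoted from any individual study). For an ordinal outcome \(Y\) with ordered categories \(1, \dots, K\) (higher = better) and treatment indicator \(X\) (1 = intervention, 0 = control), the cumulative OR at split \(j\) is

\[
\mathrm{OR}_j = \frac{\Pr(Y \ge j \mid X = 1)\,/\,\Pr(Y < j \mid X = 1)}{\Pr(Y \ge j \mid X = 0)\,/\,\Pr(Y < j \mid X = 0)}, \qquad j = 2, \dots, K.
\]

Under the PO model these are assumed equal across splits, \(\mathrm{OR}_2 = \dots = \mathrm{OR}_K = \mathrm{OR}\), and this common value is the proportional OR. Dichotomising the outcome at a single cut-point \(j^{*}\) instead targets only \(\mathrm{OR}_{j^{*}}\), which is a different target parameter unless PO holds.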

Statistical methods and assumptions

There was a variety of descriptive measures used to summarise the distribution of the ordinal outcome by intervention groups (Table 5 ). The most common descriptive statistics were frequencies and/or percentages in each category of the ordinal outcome ( \(116, 81\%\) ), followed by the median score across all categories ( \(33, 23\%\) ) and IQRs ( \(31, 22\%\) ). The mean and standard deviation across the categories of the ordinal outcome were only summarised in 16 (11%) and 10 (7%) studies respectively.

Many different statistical methods were used to analyse the ordinal outcome (Table 5 ). The PO model was the most common method (64, 44%) and was used to estimate a proportional OR in 62 of these studies. Among studies that used a PO model, the interpretation of the target parameter varied (see Additional file 3 ); the most frequent definition was that the proportional OR represented an ordinal shift in the distribution of ordinal scale scores toward a better outcome in the intervention group relative to the control group (12, 19%). When the outcome was dichotomised, logistic regression was used in 16 studies (11% of all studies), usually to estimate an OR, or a risk difference via g-computation. Seven studies estimated a risk difference or risk ratio using binomial regression. Studies also calculated and reported a risk difference with a corresponding 95% confidence interval estimated using methods such as the Wald method or bootstrapping (31, 22%). There were 19 (13%) studies that used a non-parametric method to analyse the ordinal outcome (either dichotomised or not), including the Cochran-Mantel-Haenszel test (15, 10%) to estimate an odds ratio, the Wilcoxon test (14, 10%), for which no study reported a concordance probability as the target parameter, and the Fisher's exact or Chi-square test (12, 8%). Other methods included the Hodges-Lehmann estimator, used to estimate a median difference (3, 2%), and the Van Elteren test (2, 1%), an extension of the Wilcoxon test for comparing treatments in a stratified experiment. Linear regression was used in 16 (11%) studies, typically to estimate a mean difference or a risk difference (despite the model having unbounded support).
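
The sketch below illustrates, on simulated data, the two most common approaches reported above: a PO (cumulative logit) model estimating a single proportional OR, and a logistic regression on a dichotomised version of the scale estimating an OR for one split. It uses Python's statsmodels; the data, cut-points, and variable names are hypothetical and not drawn from any reviewed trial.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(1)

# Hypothetical trial: a 7-category ordinal outcome (higher = better) and a
# 1:1 randomised treatment indicator. Not data from any reviewed study.
n = 400
treat = rng.integers(0, 2, size=n)
latent = 0.4 * treat + rng.logistic(size=n)
outcome = pd.cut(latent, bins=[-np.inf, -2, -1, 0, 1, 2, 3, np.inf], labels=False)
df = pd.DataFrame({"treat": treat, "outcome": outcome})

# Proportional odds (cumulative logit) model: one common OR across all splits
# of the scale, here interpreted as the odds of a better (higher) category.
po_fit = OrderedModel(df["outcome"], df[["treat"]], distr="logit").fit(
    method="bfgs", disp=False)
print("proportional OR:", np.exp(po_fit.params["treat"]))

# Dichotomised analysis at a single cut-point for comparison: the target
# parameter is now the OR for that one binary split only.
df["good_outcome"] = (df["outcome"] >= 4).astype(int)
bin_fit = smf.logit("good_outcome ~ treat", data=df).fit(disp=False)
print("dichotomised OR:", np.exp(bin_fit.params["treat"]))
```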

The majority of studies (86, 60%) did not explicitly check the validity of the assumptions of the statistical method(s) used. For example, no study that analysed the ordinal outcome using linear regression commented on the appropriateness of the specific numeric values assigned to the outcome categories. Among the 64 studies that used a PO model, 20 (31%) did not report whether the assumption of PO was satisfied. Overall, 46 studies reported checking key modelling assumptions; however, the method used to check these assumptions was not reported in 6 (13%) of these studies. The most common way to verify model assumptions was to use statistical methods (31, 67%), followed by graphical methods (2, 4%).

Among the 44 studies that assessed the validity of the PO assumption for a PO model, 13 (30%) used a likelihood ratio test, 10 (23%) used the Brant test, and 10 (23%) used the Score test. Six (14%) studies assessed the robustness of the PO assumption by fitting a logistic regression model at every level of the ordinal outcome across the scale and presenting the OR for each dichotomous split. Two studies assessed the PO assumption using graphical methods, plotting either the inverse cumulative log-odds or the empirical cumulative log-odds. In ten studies that reported checking the assumption, it was unclear which method was used.
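
As a companion to the checks described above, the sketch below shows the "one logistic regression per split" approach on simulated data: the ordinal outcome is dichotomised at every possible cut-point and the split-specific ORs are compared. Variable names and data are hypothetical; formal tests (likelihood ratio, Brant, Score) would come from dedicated implementations rather than this sketch.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical trial data (7-category ordinal outcome, binary treatment).
n = 400
df = pd.DataFrame({"treat": rng.integers(0, 2, size=n)})
df["outcome"] = pd.cut(0.4 * df["treat"] + rng.logistic(size=n),
                       bins=[-np.inf, -2, -1, 0, 1, 2, 3, np.inf], labels=False)

# Fit one logistic regression per cumulative split of the scale. Under
# proportional odds the split-specific ORs should be broadly similar (up to
# sampling variability); large, systematic differences suggest a violation.
for j in range(1, int(df["outcome"].max()) + 1):
    df["above"] = (df["outcome"] >= j).astype(int)
    fit = smf.logit("above ~ treat", data=df).fit(disp=False)
    or_j = np.exp(fit.params["treat"])
    lo, hi = np.exp(fit.conf_int().loc["treat"])
    print(f"split Y >= {j}: OR = {or_j:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```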

There were 12 (8%) studies that reported using a different statistical method than originally planned. Ten of these had originally planned to use a PO model, but the PO assumption was determined to have been violated and an alternative method was chosen. One study removed the covariate that was reported to have violated the PO assumption and still used a PO model to analyse the outcome. Two studies used an unconstrained PPO model and reported an adjusted OR for each binary split of the ordinal outcome. Three studies used a Wilcoxon test, with one study stratifying by a baseline covariate that violated the PO assumption. Another study dichotomised the ordinal outcome for the analysis. One study used a Van Elteren test to estimate a median difference (which inappropriately assumes an equal distance between proximate categories), another used a Poisson model with robust standard errors, and one study retained the planned analysis despite the violation of PO. Notably, a PPO model was not reported to have been used in any of the studies in which a covariate other than the treatment violated the PO assumption. Seven studies also did not report which covariate(s) violated the PO assumption.

Frequentist inference was the most common framework for the analysis (133, 92%), with Bayesian methods used in eight (6%) studies (two studies used both), all of which employed an adaptive design. Of those using Bayesian methods, seven studies used a Bayesian PO model for the analysis; of these, four used a Dirichlet prior distribution to model the baseline probabilities and three used a normally distributed prior on the proportional log-OR scale. Two of these studies reported the median proportional OR with a corresponding 95% credible interval, while one study reported the mean proportional OR. Three studies reported that the models were fitted using a Markov chain Monte Carlo algorithm with either 10,000 (one study) or 100,000 (two studies) samples from the joint posterior distribution. No study reported how the goodness-of-fit of the model was assessed.
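
The Bayesian PO models reported above typically combined a Dirichlet prior on the control-arm category probabilities with a normal prior on the proportional log-OR. A generic version of that parameterisation (an illustration consistent with those descriptions, not a model quoted from any individual study) is

\[
(p_{1}, \dots, p_{K}) \sim \mathrm{Dirichlet}(\alpha_{1}, \dots, \alpha_{K}), \qquad
\theta = \log \mathrm{OR} \sim \mathcal{N}(\mu_{0}, \sigma_{0}^{2}),
\]

\[
\operatorname{logit} \Pr(Y \ge j \mid X) = \operatorname{logit}\!\Bigl(\sum_{k \ge j} p_{k}\Bigr) + X\,\theta, \qquad j = 2, \dots, K,
\]

where \(p_k\) denotes the control-arm probability of category \(k\) and \(X\) is the treatment indicator. The posterior distribution of \(\theta\) is typically sampled by MCMC, and summaries such as the posterior median proportional OR with a \(95\%\) credible interval are then reported.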

For the 38 studies that collected repeated measurements of the ordinal outcome, 18 (47%) adjusted for the baseline measurement, 14 (37%) used mixed effects models, and four (11%) used generalised estimating equations to capture the correlation among the repeated measures within an individual.
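
To illustrate one of the approaches reported for repeated measurements, the sketch below fits a GEE with an exchangeable working correlation to a dichotomised version of a repeatedly measured ordinal outcome, using statsmodels. It is a generic illustration only: the data are simulated, the dichotomisation and variable names are hypothetical, and mixed effects or ordinal-specific models are alternatives.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)

# Hypothetical long-format data: 200 participants, 3 visits each, with a
# 7-category ordinal outcome dichotomised at >= 4 ("good outcome").
n_id, n_visits = 200, 3
df = pd.DataFrame({
    "id": np.repeat(np.arange(n_id), n_visits),
    "visit": np.tile(np.arange(n_visits), n_id),
    "treat": np.repeat(rng.integers(0, 2, size=n_id), n_visits),
})
subject_effect = np.repeat(rng.normal(0, 0.8, size=n_id), n_visits)
latent = (0.4 * df["treat"] + 0.2 * df["visit"] + subject_effect
          + rng.logistic(size=len(df)))
df["good_outcome"] = (latent > 0).astype(int)

# GEE with an exchangeable working correlation to account for the repeated
# measurements within a participant.
gee = sm.GEE.from_formula(
    "good_outcome ~ treat + visit",
    groups="id",
    data=df,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
res = gee.fit()
print("OR for treatment:", np.exp(res.params["treat"]))
```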

A range of statistical packages were used for the analysis of the ordinal outcome, with SAS ( \(81, 56\%\) ) and R ( \(35, 24\%\) ) being most common. Twelve ( \(8\%\) ) studies did not report the software used.

This review has provided an overview of how ordinal outcomes are used and analysed in contemporary RCTs. We describe the insight this review has provided on the study design, statistical analyses and reporting of trials using ordinal outcomes.

Target parameter

The target parameter of interest is an important consideration when planning any trial and should be aligned with the research question [ 12 , 13 ]. The most common target parameter in this review was an OR, either for a dichotomised version of the ordinal outcome or in an analysis that used the full ordinal scale. When an ordinal analysis was used, the target parameter was commonly a proportional OR, although its interpretation varied between studies. We found that it was most common to interpret the proportional OR as an average shift in the distribution of the ordinal scale scores toward a better outcome in the intervention group, relative to the comparator(s) [ 19 , 35 , 188 , 189 ]. Many of the studies that dichotomised the ordinal outcome lacked justification for doing so and, in one case, dichotomisation was used only because the PO assumption was violated, despite the fact that this changed the target parameter.

Some studies in our review treated the ordinal outcome as if it were continuous and used a difference in means or medians as the target parameter. These quantities do not represent a clinically meaningful effect when the outcome is ordinal, since proximate categories in the scale are not necessarily separated by a quantifiable or equal distance, which can affect the translation of the trial results into practice. If a study is to use a mean difference, the researchers should justify the appropriateness of the specific numeric values assigned to the ordinal outcome categories.

The target parameter, and the statistical method used to estimate it, could not be determined in some studies. Notably, the target parameter was not explicitly defined in almost half of the studies, despite current recommendations on the importance of clearly defining the estimand of interest, one component of which is the target parameter [ 12 , 13 ]. Furthermore, the target parameter was often poorly defined when a PO model was used, despite its interpretation being analogous to the OR for a binary outcome, applying to every split of the ordinal scale rather than to a single value. Consistency in the definition of the target parameter in RCTs would allow easier interpretation for clinicians and applied researchers. Explicit definition of the target parameter of interest is essential for readers to understand the interpretation of a clinically meaningful treatment effect, and also reflects the current emphasis within clinical research on estimands [ 12 , 13 ].

Statistical methods

It is important to summarise the distribution of the outcome by intervention group in any RCT. When the outcome is ordinal, frequencies and percentages in each category can provide a useful summary of this distribution. Most studies in this review reported frequencies and percentages in each category, although some studies that dichotomised the outcome only reported these summaries for the dichotomised scale. Some studies reported means and standard deviations across the categories which, as mentioned previously, may not have a valid interpretation.

Although there is a range of statistical methods that can be used to analyse an ordinal outcome, we found that the PO model was the most commonly used. This is likely because the PO model is relatively well known among statisticians, is straightforward to fit in most statistical packages, and possesses the desirable properties of palindromic invariance and invariance to collapsing adjacent categories. However, when this approach is used to estimate a treatment effect across the different categories of the outcome, or to estimate predicted probabilities in each category, it is important to assess and report whether the PO assumption has been met. The validity of the PO assumption is less important when the objective is to understand whether one treatment is ‘better’ on average than a comparator. In this review, it was common for studies that used a PO model to define a target parameter relating to a treatment benefit at every level of the outcome scale; however, only 44 of the 64 studies that used a PO model reported checking the PO assumption, highlighting a deficiency in this practice. Statistical tests were commonly used to assess the PO assumption, although it may be preferable to avoid hypothesis testing for this purpose, particularly with small sample sizes, as these tests can have poor statistical power [ 22 , 190 ]. Researchers should also keep in mind that when the PO assumption is tested and the model is revised accordingly, the type I error of the analysis may change, and p -values and confidence intervals based on the updated model ignore the model-fitting uncertainty [ 191 ].

When the PO assumption was violated, a PPO model was rarely used; instead, baseline covariates were removed from the model to address the departure from PO. The underuse of the PPO model may reflect a lack of awareness that such models exist and can be used to address violations of PO. A PPO model could have been particularly useful in the studies where only covariates other than the treatment variable violated the PO assumption, as it could still have been used to estimate a single proportional OR for the treatment effect. Of note, however, an unconstrained PPO model does not fully exploit the ordinal nature of the outcome, since the categories can be rearranged with little effect on the model fit [ 192 ], and the estimated probabilities can be negative [ 193 ].
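
For reference, the PPO model discussed here relaxes PO only for selected covariates. A generic formulation (an illustration of the unconstrained form described by Peterson and Harrell, not a model taken from any reviewed study), in which the treatment effect remains proportional while a baseline covariate is allowed to depart from PO, is

\[
\operatorname{logit} \Pr(Y \ge j \mid X, Z) = \alpha_{j} + X\beta + Z\gamma_{j}, \qquad j = 2, \dots, K,
\]

where \(X\) is the treatment indicator with a single proportional log-OR \(\beta\), and \(Z\) is the covariate whose split-specific coefficients \(\gamma_j\) are allowed to vary; constraining \(\gamma_2 = \dots = \gamma_K\) recovers the PO model. This corresponds to the situation described above, where a single proportional OR for the treatment can still be reported even though a baseline covariate violates PO.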

There are other methods that can be used to assess the validity of the PO assumption, such as plotting the differences in predicted log-odds between categories of the ordinal outcome, which should be parallel [ 16 ]. Another option is to fit a logistic regression model at every level of the ordinal outcome across the scale and compare the estimated ORs and corresponding confidence intervals for each binary split, or to simulate predictive distributions. However, estimating separate ORs in this way can be inefficient, particularly when the ordinal outcome has a large number of categories. Arguably more important than assessing the validity of the PO assumption is assessing the impact of making, compared with not making, the assumption. If the treatment effect goes in the same direction across each category of the ordinal scale and the objective is simply to understand whether one treatment is better overall, then departures from PO may not be important. If, however, the interest is in estimating a treatment effect for every level of the ordinal outcome, and/or the treatment has a detrimental effect at one end of the ordinal scale but a beneficial effect for the remaining categories, then careful consideration should be given to the type I and II error rates and to the validity of the estimated treatment effect if a PO model is used.

Finally, a handful of studies used the Wilcoxon, Chi-square, or Fisher’s exact test (the latter being overly conservative [ 194 ] and potentially providing misleading results); when these methods were used, commonly only a p -value, and not a target parameter, was reported. The lack of a target parameter for the treatment effect can make it difficult for clinicians to translate the results into practice.
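
Where a rank-based test is used, a concordance (or "win") probability can be reported as a target parameter alongside the p-value. The sketch below computes it from the Mann-Whitney U statistic on simulated ordinal scores; the data and arm sizes are hypothetical and not drawn from any reviewed trial.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(3)

# Hypothetical ordinal scores (0-6, higher = better) for two randomised arms.
control = rng.integers(0, 7, size=150)
treated = np.minimum(rng.integers(0, 7, size=150) + rng.integers(0, 2, size=150), 6)

u_stat, p_value = mannwhitneyu(treated, control, alternative="two-sided")

# Concordance ("win") probability: P(treated > control) + 0.5 * P(tie),
# i.e. the probability that a randomly chosen treated participant has a better
# outcome than a randomly chosen control participant. Reporting this quantity
# gives the Wilcoxon/Mann-Whitney analysis an interpretable target parameter.
concordance = u_stat / (len(treated) * len(control))
print(f"concordance probability = {concordance:.3f}, p = {p_value:.3f}")
```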

Strengths and limitations

A strength of this study is that we reviewed a large number of RCTs that used ordinal outcomes, published in four highly ranked medical journals, to characterise the current state of practice for analysing ordinal outcomes. The screening and data extraction process was conducted systematically, and pilot testing and double data extraction ensured the consistency and reliability of the extracted data. The PRISMA-ScR checklist was used to ensure a high standard of reporting.

This review does, however, have limitations. The restriction to the PubMed database and four highly ranked medical journals may affect the generalisability of the findings. We made this decision given the scoping nature of the review, to ensure reproducibility, and to keep the total number of included studies manageable. We also aimed to include studies that are likely to reflect best practice in how research using ordinal outcomes is currently conducted and reported. Because the selected journals are highly ranked and have a reputation for rigour, our findings are likely to represent a best-case scenario. In addition, our search strategy may have missed certain phrases or variants (particularly those relating to an ordinal outcome); however, we attempted to mitigate this through our piloting phase. Finally, we did not review the protocol papers of the trials, which may have included additional information on the statistical methodology, such as the methods planned to assess the PO assumption and any alternative methods to be used instead.

Implications of this research

This review has implications for researchers designing RCTs that use an ordinal outcome. Although the majority of studies included in this review were in the fields of neurology and infectious diseases, the results of this review would apply to RCTs in all medical fields that use an ordinal outcome. We have shown that there is substantial variation in the analysis and reporting of ordinal outcomes in practice. Our results suggest that researchers should carefully consider the target parameter of interest and explicitly report what the target parameter represents; this is particularly important for an ordinal outcome which can be unfamiliar to readers. Defining the target parameter upfront will help to ensure that appropriate analytical methods are used to analyse the ordinal outcome and make transparent the assumptions the researchers are willing to make.

Our review also highlights the need for careful assessment and reporting of the validity of the model assumptions made during the analysis of an ordinal outcome. Doing so will ensure that robust statistical methods that align with the research question and categorical nature of the ordinal outcome are used to estimate a valid, clinically relevant target parameter that can be translated to practice.

Availability of data and materials

The datasets and code generated and/or analysed during the current study are available on GitHub [ 195 ].

Abbreviations

RCT: Randomised controlled trial

PO: Proportional odds

PPO: Partial proportional odds

SAP: Statistical analysis plan

Velleman PF, Wilkinson L. Nominal, ordinal, interval, and ratio typologies are misleading. Am Stat. 1993;47(1):65–72.

MacKenzie CR, Charlson ME. Standards for the use of ordinal scales in clinical trials. Br Med J (Clin Res Ed). 1986;292(6512):40–3.

Banks JL, Marotta CA. Outcomes validity and reliability of the modified Rankin scale: implications for stroke clinical trials: a literature review and synthesis. Stroke. 2007;38(3):1091–6.

de la Ossa NP, Abilleira S, Jovin TG, García-Tornel Á, Jimenez X, Urra X, et al. Effect of direct transportation to thrombectomy-capable center vs local stroke center on neurological outcomes in patients with suspected large-vessel occlusion stroke in nonurban areas: the RACECAT randomized clinical Trial. JAMA. 2022;327(18):1782–94.

Hubert GJ, Hubert ND, Maegerlein C, Kraus F, Wiestler H, Müller-Barna P, et al. Association between use of a flying intervention team vs patient interhospital transfer and time to endovascular thrombectomy among patients with acute ischemic stroke in nonurban Germany. JAMA. 2022;327(18):1795–805.

Bösel J, Niesen WD, Salih F, Morris NA, Ragland JT, Gough B, et al. Effect of early vs standard approach to tracheostomy on functional outcome at 6 months among patients with severe stroke receiving mechanical ventilation: the SETPOINT2 Randomized Clinical Trial. JAMA. 2022;327(19):1899–909.

Wilson L, Boase K, Nelson LD, Temkin NR, Giacino JT, Markowitz AJ, et al. A manual for the glasgow outcome scale-extended interview. J Neurotrauma. 2021;38(17):2435–46.

Marshall JC, Murthy S, Diaz J, Adhikari N, Angus DC, Arabi YM, et al. A minimal common outcome measure set for COVID-19 clinical research. Lancet Infect Dis. 2020;20(8):e192–7.

Lovre D, Bateman K, Sherman M, Fonseca VA, Lefante J, Mauvais-Jarvis F. Acute estradiol and progesterone therapy in hospitalised adults to reduce COVID-19 severity: a randomised control trial. BMJ Open. 2021;11(11):e053684.

Song AT, Rocha V, Mendrone-Júnior A, Calado RT, De Santis GC, Benites BD, et al. Treatment of severe COVID-19 patients with either low-or high-volume of convalescent plasma versus standard of care: a multicenter Bayesian randomized open-label clinical trial (COOP-COVID-19-MCTI). Lancet Reg Health-Am. 2022;10:100216.

Mathioudakis AG, Fally M, Hashad R, Kouta A, Hadi AS, Knight SB, et al. Outcomes evaluated in controlled clinical trials on the management of COVID-19: a methodological systematic review. Life. 2020;10(12):350.

Akacha M, Bretz F, Ohlssen D, Rosenkranz G, Schmidli H. Estimands and their role in clinical trials. Stat Biopharm Res. 2017;9(3):268–71.

Mallinckrodt C, Molenberghs G, Lipkovich I, Ratitch B. Estimands, estimators and sensitivity analysis in clinical trials. CRC Press; 2019.

Walker SH, Duncan DB. Estimation of the probability of an event as a function of several independent variables. Biometrika. 1967;54(1–2):167–79.

McCullagh P. Regression models for ordinal data. J R Stat Soc Ser B Methodol. 1980;42(2):109–27.

Harrell FE, et al. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis, vol 3. Springer; 2015.

Ananth CV, Kleinbaum DG. Regression models for ordinal responses: a review of methods and applications. Int J Epidemiol. 1997;26(6):1323–33.

Armstrong BG, Sloan M. Ordinal regression models for epidemiologic data. Am J Epidemiol. 1989;129(1):191–204.

Roozenbeek B, Lingsma HF, Perel P, Edwards P, Roberts I, Murray GD, et al. The added value of ordinal analysis in clinical trials: an example in traumatic brain injury. Crit Care. 2011;15(3):1–7.

Breheny P. Proportional odds models. 2015. MyWeb. https://myweb.uiowa.edu/pbreheny/uk/teaching/760-s13/notes/4-23.pdf .

Abreu MNS, Siqueira AL, Cardoso CS, Caiaffa WT. Ordinal logistic regression models: application in quality of life studies. Cad Saúde Pública. 2008;24:s581–91.

Peterson B, Harrell FE Jr. Partial proportional odds models for ordinal response variables. J R Stat Soc: Ser C: Appl Stat. 1990;39(2):205–17.

Fullerton AS. A conceptual framework for ordered logistic regression models. Sociol Methods Res. 2009;38(2):306–47.

Senn S, Julious S. Measurement in clinical trials: a neglected issue for statisticians? Stat Med. 2009;28(26):3189–209.

Maas AI, Steyerberg EW, Marmarou A, McHugh GS, Lingsma HF, Butcher I, et al. IMPACT recommendations for improving the design and analysis of clinical trials in moderate to severe traumatic brain injury. Neurotherapeutics. 2010;7:127–34.

McFadden D, et al. Conditional logit analysis of qualitative choice behavior.  Front Econ. 1973;105–142.

Wilcoxon F. Individual comparisons by ranking methods. Springer; 1992.

Liu Q, Shepherd BE, Li C, Harrell FE Jr. Modeling continuous response variables using ordinal regression. Stat Med. 2017;36(27):4316–35.

Fay MP, Brittain EH, Shih JH, Follmann DA, Gabriel EE. Causal estimands and confidence intervals associated with Wilcoxon-Mann-Whitney tests in randomized experiments. Stat Med. 2018;37(20):2923–37.

De Neve J, Thas O, Gerds TA. Semiparametric linear transformation models: effect measures, estimators, and applications. Stat Med. 2019;38(8):1484–501.

Ganesh A, Luengo-Fernandez R, Wharton RM, Rothwell PM. Ordinal vs dichotomous analyses of modified Rankin Scale, 5-year outcome, and cost of stroke. Neurology. 2018;91(21):e1951–60.

French B, Shotwell MS. Regression models for ordinal outcomes. JAMA. 2022;328(8):772–3.

Bath PM, Geeganage C, Gray LJ, Collier T, Pocock S. Use of ordinal outcomes in vascular prevention trials: comparison with binary outcomes in published trials. Stroke. 2008;39(10):2817–23.

Scott SC, Goldberg MS, Mayo NE. Statistical assessment of ordinal outcomes in comparative studies. J Clin Epidemiol. 1997;50(1):45–55.

McHugh GS, Butcher I, Steyerberg EW, Marmarou A, Lu J, Lingsma HF, et al. A simulation study evaluating approaches to the analysis of ordinal outcome data in randomized controlled trials in traumatic brain injury: results from the IMPACT Project. Clin Trials. 2010;7(1):44–57.

DeSantis SM, Lazaridis C, Palesch Y, Ramakrishnan V. Regression analysis of ordinal stroke clinical trial outcomes: an application to the NINDS t-PA trial. Int J Stroke. 2014;9(2):226–31.

Selman CJ, Lee KJ, Whitehead CL, Manley BJ, Mahar RK. Statistical analyses of ordinal outcomes in randomised controlled trials: protocol for a scoping review. Trials. 2023;24(1):1–7.

Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467–73.

Bell ML, Fiero M, Horton NJ, Hsu CH. Handling missing data in RCTs; a review of the top medical journals. BMC Med Res Methodol. 2014;14(1):1–8.

Berwanger O, Ribeiro RA, Finkelsztejn A, Watanabe M, Suzumura EA, Duncan BB, et al. The quality of reporting of trial abstracts is suboptimal: survey of major general medical journals. J Clin Epidemiol. 2009;62(4):387–92.

Higgins JP, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, et al. Cochrane handbook for systematic reviews of interventions. John Wiley & Sons; 2019.

Veritas Health Innovation. Covidence systematic review software. Melbourne; 2022.

StataCorp. Stata statistical software: Release 17. College Station: StataCorp LP; 2021.

Hanley DF, Lane K, McBee N, Ziai W, Tuhrim S, Lees KR, et al. Thrombolytic removal of intraventricular haemorrhage in treatment of severe stroke: results of the randomised, multicentre, multiregion, placebo-controlled CLEAR III trial. Lancet. 2017;389(10069):603–11. https://doi.org/10.1016/S0140-6736(16)32410-2 .

Nangia J, Wang T, Osborne C, Niravath P, Otte K, Papish S, et al. Effect of a scalp cooling device on alopecia in women undergoing chemotherapy for breast cancer: the SCALP randomized clinical trial. JAMA. 2017;317(6):596–605. https://doi.org/10.1001/jama.2016.20939 . United States.

Ruzicka T, Hanifin JM, Furue M, Pulka G, Mlynarczyk I, Wollenberg A, et al. Anti-interleukin-31 receptor A antibody for atopic dermatitis. N Engl J Med. 2017;376(9):826–35. https://doi.org/10.1056/NEJMoa1606490 . United States.

Németh G, Laszlovszky I, Czobor P, Szalai E, Szatmári B, Harsányi J, et al. Cariprazine versus risperidone monotherapy for treatment of predominant negative symptoms in patients with schizophrenia: a randomised, double-blind, controlled trial. Lancet. 2017;389(10074):1103–13. https://doi.org/10.1016/S0140-6736(17)30060-0 . England.

Mathieson S, Maher CG, McLachlan AJ, Latimer J, Koes BW, Hancock MJ, et al. Trial of pregabalin for acute and chronic sciatica. N Engl J Med. 2017;376(12):1111–20. https://doi.org/10.1056/NEJMoa1614292 . United States.

Baud O, Trousson C, Biran V, Leroy E, Mohamed D, Alberti C. Association between early low-dose hydrocortisone therapy in extremely preterm neonates and neurodevelopmental outcomes at 2 years of age. JAMA. 2017;317(13):1329–37. https://doi.org/10.1001/jama.2017.2692 . United States.

van den Berg LA, Dijkgraaf MG, Berkhemer OA, Fransen PS, Beumer D, Lingsma HF, et al. Two-year outcome after endovascular treatment for acute ischemic stroke. N Engl J Med. 2017;376(14):1341–9. https://doi.org/10.1056/NEJMoa1612136 . United States.

Kaufman J, Fitzpatrick P, Tosif S, Hopper SM, Donath SM, Bryant PA, et al. Faster clean catch urine collection (Quick-Wee method) from infants: randomised controlled trial. BMJ. 2017;357:j1341. https://doi.org/10.1136/bmj.j1341 .

Costa Leme A, Hajjar LA, Volpe MS, Fukushima JT, De Santis Santiago RR, Osawa EA, et al. Effect of intensive vs moderate alveolar recruitment strategies added to lung-protective ventilation on postoperative pulmonary complications: a randomized clinical trial. JAMA. 2017;317(14):1422–32. https://doi.org/10.1001/jama.2017.2297 . United States.

Breitenstein C, Grewe T, Flöel A, Ziegler W, Springer L, Martus P, et al. Intensive speech and language therapy in patients with chronic aphasia after stroke: a randomised, open-label, blinded-endpoint, controlled trial in a health-care setting. Lancet. 2017;389(10078):1528–38. https://doi.org/10.1016/S0140-6736(17)30067-3 . England.

Wechsler ME, Akuthota P, Jayne D, Khoury P, Klion A, Langford CA, et al. Mepolizumab or placebo for eosinophilic granulomatosis with polyangiitis. N Engl J Med. 2017;376(20):1921–32. https://doi.org/10.1056/NEJMoa1702079 .

Devinsky O, Cross JH, Laux L, Marsh E, Miller I, Nabbout R, et al. Trial of cannabidiol for drug-resistant seizures in the Dravet syndrome. N Engl J Med. 2017;376(21):2011–20. https://doi.org/10.1056/NEJMoa1611618 . United States.

Anderson CS, Arima H, Lavados P, Billot L, Hackett ML, Olavarría VV, et al. Cluster-randomized, crossover trial of head positioning in acute stroke. N Engl J Med. 2017;376(25):2437–47. https://doi.org/10.1056/NEJMoa1615715 . United States.

Juch JNS, Maas ET, Ostelo RWJG, Groeneweg JG, Kallewaard JW, Koes BW, et al. Effect of radiofrequency denervation on pain intensity among patients with chronic low back pain: the Mint randomized clinical trials. JAMA. 2017;318(1):68–81. https://doi.org/10.1001/jama.2017.7918 .

Mohamed S, Johnson GR, Chen P, Hicks PB, Davis LL, Yoon J, et al. Effect of antidepressant switching vs augmentation on remission among patients with major depressive disorder unresponsive to antidepressant treatment: the VAST-D randomized clinical trial. JAMA. 2017;318(2):132–45. https://doi.org/10.1001/jama.2017.8036 .

Kanes S, Colquhoun H, Gunduz-Bruce H, Raines S, Arnold R, Schacterle A, et al. Brexanolone (SAGE-547 injection) in post-partum depression: a randomised controlled trial. Lancet. 2017;390(10093):480–9. https://doi.org/10.1016/S0140-6736(17)31264-3 . England.

Lapergue B, Blanc R, Gory B, Labreuche J, Duhamel A, Marnat G, et al. Effect of endovascular contact aspiration vs stent retriever on revascularization in patients with acute ischemic stroke and large vessel occlusion: the ASTER randomized clinical trial. JAMA. 2017;318(5):443–52. https://doi.org/10.1001/jama.2017.9644 .

Lindley RI, Anderson CS, Billot L, Forster A, Hackett ML, Harvey LA, et al. Family-led rehabilitation after stroke in India (ATTEND): a randomised controlled trial. Lancet. 2017;390(10094):588–99. https://doi.org/10.1016/S0140-6736(17)31447-2 . England.

Berlowitz DR, Foy CG, Kazis LE, Bolin LP, Conroy MB, Fitzpatrick P, et al. Effect of intensive blood-pressure treatment on patient-reported outcomes. N Engl J Med. 2017;377(8):733–44. https://doi.org/10.1056/NEJMoa1611179 .

Hui D, Frisbee-Hume S, Wilson A, Dibaj SS, Nguyen T, De La Cruz M, et al. Effect of lorazepam with haloperidol vs haloperidol alone on agitated delirium in patients with advanced cancer receiving palliative care: a randomized clinical trial. JAMA. 2017;318(11):1047–56. https://doi.org/10.1001/jama.2017.11468 .

Roffe C, Nevatte T, Sim J, Bishop J, Ives N, Ferdinand P, et al. Effect of routine low-dose oxygen supplementation on death and disability in adults with acute stroke: the stroke oxygen study randomized clinical trial. JAMA. 2017;318(12):1125–35. https://doi.org/10.1001/jama.2017.11463 .

Dwivedi R, Ramanujam B, Chandra PS, Sapra S, Gulati S, Kalaivani M, et al. Surgery for drug-resistant epilepsy in children. N Engl J Med. 2017;377(17):1639–47. https://doi.org/10.1056/NEJMoa1615335 . United States.

Nogueira RG, Jadhav AP, Haussen DC, Bonafe A, Budzik RF, Bhuva P, et al. Thrombectomy 6 to 24 hours after stroke with a mismatch between deficit and infarct. N Engl J Med. 2018;378(1):11–21. https://doi.org/10.1056/NEJMoa1706442 . United States.

Zheng MX, Hua XY, Feng JT, Li T, Lu YC, Shen YD, et al. Trial of Contralateral seventh cervical nerve transfer for spastic arm paralysis. N Engl J Med. 2018;378(1):22–34. https://doi.org/10.1056/NEJMoa1615208 . United States.

Atri A, Frölich L, Ballard C, Tariot PN, Molinuevo JL, Boneva N, et al. Effect of idalopirdine as adjunct to cholinesterase inhibitors on change in cognition in patients with Alzheimer disease: three randomized clinical trials. JAMA. 2018;319(2):130–42. https://doi.org/10.1001/jama.2017.20373 .

Bassler D, Shinwell ES, Hallman M, Jarreau PH, Plavka R, Carnielli V, et al. Long-term effects of inhaled budesonide for bronchopulmonary dysplasia. N Engl J Med. 2018;378(2):148–57. https://doi.org/10.1056/NEJMoa1708831 . United States.

Raskind MA, Peskind ER, Chow B, Harris C, Davis-Karim A, Holmes HA, et al. Trial of prazosin for post-traumatic stress disorder in military veterans. N Engl J Med. 2018;378(6):507–17. https://doi.org/10.1056/NEJMoa1507598 . United States.

Albers GW, Marks MP, Kemp S, Christensen S, Tsai JP, Ortega-Gutierrez S, et al. Thrombectomy for stroke at 6 to 16 hours with selection by perfusion imaging. N Engl J Med. 2018;378(8):708–18. https://doi.org/10.1056/NEJMoa1713973 .

Bath PM, Woodhouse LJ, Appleton JP, Beridze M, Christensen H, Dineen RA, et al. Antiplatelet therapy with aspirin, clopidogrel, and dipyridamole versus clopidogrel alone or aspirin and dipyridamole in patients with acute cerebral ischaemia (TARDIS): a randomised, open-label, phase 3 superiority trial. Lancet. 2018;391(10123):850–9. https://doi.org/10.1016/S0140-6736(17)32849-0 .

Krebs EE, Gravely A, Nugent S, Jensen AC, DeRonne B, Goldsmith ES, et al. Effect of opioid vs nonopioid medications on pain-related function in patients with chronic back pain or hip or knee osteoarthritis pain: the SPACE randomized clinical trial. JAMA. 2018;319(9):872–82. https://doi.org/10.1001/jama.2018.0899 .

Campbell BCV, Mitchell PJ, Churilov L, Yassi N, Kleinig TJ, Dowling RJ, et al. Tenecteplase versus alteplase before thrombectomy for ischemic stroke. N Engl J Med. 2018;378(17):1573–82. https://doi.org/10.1056/NEJMoa1716405 . United States.

Mellor R, Bennell K, Grimaldi A, Nicolson P, Kasza J, Hodges P, et al. Education plus exercise versus corticosteroid injection use versus a wait and see approach on global outcome and pain from gluteal tendinopathy: prospective, single blinded, randomised clinical trial. BMJ. 2018;361. https://doi.org/10.1136/bmj.k1662 .

Sprigg N, Flaherty K, Appleton JP, Al-Shahi Salman R, Bereczki D, Beridze M, et al. Tranexamic acid for hyperacute primary IntraCerebral Haemorrhage (TICH-2): an international randomised, placebo-controlled, phase 3 superiority trial. Lancet. 2018;391(10135):2107–15. https://doi.org/10.1016/S0140-6736(18)31033-X .

Jolly K, Sidhu MS, Hewitt CA, Coventry PA, Daley A, Jordan R, et al. Self management of patients with mild COPD in primary care: randomised controlled trial. BMJ. 2018;361. https://doi.org/10.1136/bmj.k2241 .

Brock PR, Maibach R, Childs M, Rajput K, Roebuck D, Sullivan MJ, et al. Sodium thiosulfate for protection from cisplatin-induced hearing loss. N Engl J Med. 2018;378(25):2376–85. https://doi.org/10.1056/NEJMoa1801109 .

Khatri P, Kleindorfer DO, Devlin T, Sawyer RN Jr, Starr M, Mejilla J, et al. Effect of alteplase vs aspirin on functional outcome for patients with acute ischemic stroke and minor nondisabling neurologic deficits: the PRISMS randomized clinical trial. JAMA. 2018;320(2):156–66. https://doi.org/10.1001/jama.2018.8496 .

Wang Y, Li Z, Zhao X, Wang C, Wang X, Wang D, et al. Effect of a multifaceted quality improvement intervention on hospital personnel adherence to performance measures in patients with acute ischemic stroke in china: a randomized clinical trial. JAMA. 2018;320(3):245–54. https://doi.org/10.1001/jama.2018.8802 . United States.

Fossat G, Baudin F, Courtes L, Bobet S, Dupont A, Bretagnol A, et al. Effect of in-bed leg cycling and electrical stimulation of the quadriceps on global muscle strength in critically ill adults: a randomized clinical trial. JAMA. 2018;320(4):368–78. https://doi.org/10.1001/jama.2018.9592 .

Thomalla G, Simonsen CZ, Boutitie F, Andersen G, Berthezene Y, Cheng B, et al. MRI-guided thrombolysis for stroke with unknown time of onset. N Engl J Med. 2018;379(7):611–22. https://doi.org/10.1056/NEJMoa1804355 . United States.

Perkins GD, Ji C, Deakin CD, Quinn T, Nolan JP, Scomparin C, et al. A randomized trial of epinephrine in out-of-hospital cardiac arrest. N Engl J Med. 2018;379(8):711–21. https://doi.org/10.1056/NEJMoa1806842 . United States.

Wang HE, Schmicker RH, Daya MR, Stephens SW, Idris AH, Carlson JN, et al. Effect of a strategy of initial laryngeal tube insertion vs endotracheal intubation on 72-hour survival in adults with out-of-hospital cardiac arrest: a randomized clinical trial. JAMA. 2018;320(8):769–78. https://doi.org/10.1001/jama.2018.7044 .

Benger JR, Kirby K, Black S, Brett SJ, Clout M, Lazaroo MJ, et al. Effect of a strategy of a supraglottic airway device vs tracheal intubation during out-of-hospital cardiac arrest on functional outcome: the AIRWAYS-2 randomized clinical trial. JAMA. 2018;320(8):779–91. https://doi.org/10.1001/jama.2018.11597 .

Meltzer-Brody S, Colquhoun H, Riesenberg R, Epperson CN, Deligiannidis KM, Rubinow DR, et al. Brexanolone injection in post-partum depression: two multicentre, double-blind, randomised, placebo-controlled, phase 3 trials. Lancet. 2018;392(10152):1058–70. https://doi.org/10.1016/S0140-6736(18)31551-4 . England.

Cooper DJ, Nichol AD, Bailey M, Bernard S, Cameron PA, Pili-Floury S, et al. Effect of early sustained prophylactic hypothermia on neurologic outcomes among patients with severe traumatic brain injury: the POLAR randomized clinical trial. JAMA. 2018;320(21):2211–20. https://doi.org/10.1001/jama.2018.17075 .

Bonell C, Allen E, Warren E, McGowan J, Bevilacqua L, Jamal F, et al. Effects of the Learning Together intervention on bullying and aggression in English secondary schools (INCLUSIVE): a cluster randomised controlled trial. Lancet. 2018;392(10163):2452–64. https://doi.org/10.1016/S0140-6736(18)31782-3 .

Stunnenberg BC, Raaphorst J, Groenewoud HM, Statland JM, Griggs RC, Woertman W, et al. Effect of mexiletine on muscle stiffness in patients with nondystrophic myotonia evaluated using aggregated N-of-1 trials. JAMA. 2018;320(22):2344–53. https://doi.org/10.1001/jama.2018.18020 .

Burt RK, Balabanov R, Burman J, Sharrack B, Snowden JA, Oliveira MC, et al. Effect of nonmyeloablative hematopoietic stem cell transplantation vs continued disease-modifying therapy on disease progression in patients with relapsing-remitting multiple sclerosis: a randomized clinical trial. JAMA. 2019;321(2):165–74. https://doi.org/10.1001/jama.2018.18743 .

Dennis M, Mead G, Forbes J, Graham C, Hackett M, Hankey GJ, et al. Effects of fluoxetine on functional outcomes after acute stroke (FOCUS): a pragmatic, double-blind, randomised, controlled trial. Lancet. 2019;393(10168):265–74. https://doi.org/10.1016/S0140-6736(18)32823-X .

Anderson CS, Huang Y, Lindley RI, Chen X, Arima H, Chen G, et al. Intensive blood pressure reduction with intravenous thrombolysis therapy for acute ischaemic stroke (ENCHANTED): an international, randomised, open-label, blinded-endpoint, phase 3 trial. Lancet. 2019;393(10174):877–88. https://doi.org/10.1016/S0140-6736(19)30038-8 . England.

Basner M, Asch DA, Shea JA, Bellini LM, Carlin M, Ecker AJ, et al. Sleep and alertness in a duty-hour flexibility trial in internal medicine. N Engl J Med. 2019;380(10):915–23. https://doi.org/10.1056/NEJMoa1810641 .

Bath PM, Scutt P, Anderson CS, Appleton JP, Berge E, Cala L, et al. Prehospital transdermal glyceryl trinitrate in patients with ultra-acute presumed stroke (RIGHT-2): an ambulance-based, randomised, sham-controlled, blinded, phase 3 trial. Lancet. 2019;393(10175):1009–20. https://doi.org/10.1016/S0140-6736(19)30194-1 .

Hanley DF, Thompson RE, Rosenblum M, Yenokyan G, Lane K, McBee N, et al. Efficacy and safety of minimally invasive surgery with thrombolysis in intracerebral haemorrhage evacuation (MISTIE III): a randomised, controlled, open-label, blinded endpoint phase 3 trial. Lancet. 2019;393(10175):1021–32. https://doi.org/10.1016/S0140-6736(19)30195-3 .

Turk AS 3rd, Siddiqui A, Fifi JT, De Leacy RA, Fiorella DJ, Gu E, et al. Aspiration thrombectomy versus stent retriever thrombectomy as first-line approach for large vessel occlusion (COMPASS): a multicentre, randomised, open label, blinded outcome, non-inferiority trial. Lancet. 2019;393(10175):998–1008.  https://doi.org/10.1016/S0140-6736(19)30297-1 . England.

Ma H, Campbell BCV, Parsons MW, Churilov L, Levi CR, Hsu C, et al. Thrombolysis guided by perfusion imaging up to 9 hours after onset of stroke. N Engl J Med. 2019;380(19):1795–803. https://doi.org/10.1056/NEJMoa1813046 . United States.

Fischer K, Al-Sawaf O, Bahlo J, Fink AM, Tandon M, Dixon M, et al. Venetoclax and obinutuzumab in patients with CLL and coexisting conditions. N Engl J Med. 2019;380(23):2225–36. https://doi.org/10.1056/NEJMoa1815281 . United States.

Shehabi Y, Howe BD, Bellomo R, Arabi YM, Bailey M, Bass FE, et al. Early sedation with dexmedetomidine in critically ill patients. N Engl J Med. 2019;380(26):2506–17. https://doi.org/10.1056/NEJMoa1904710 . United States.

Johnston KC, Bruno A, Pauls Q, Hall CE, Barrett KM, Barsan W, et al. Intensive vs standard treatment of hyperglycemia and functional outcome in patients with acute ischemic stroke: the SHINE randomized clinical trial. JAMA. 2019;322(4):326–35. https://doi.org/10.1001/jama.2019.9346 .

Widmark A, Gunnlaugsson A, Beckman L, Thellenberg-Karlsson C, Hoyer M, Lagerlund M, et al. Ultra-hypofractionated versus conventionally fractionated radiotherapy for prostate cancer: 5-year outcomes of the HYPO-RT-PC randomised, non-inferiority, phase 3 trial. Lancet. 2019;394(10196):385–95. https://doi.org/10.1016/S0140-6736(19)31131-6 . England.

Pittock SJ, Berthele A, Fujihara K, Kim HJ, Levy M, Palace J, et al. Eculizumab in aquaporin-4-positive neuromyelitis optica spectrum disorder. N Engl J Med. 2019;381(7):614–25.  https://doi.org/10.1056/NEJMoa1900866 . United States.

Gunduz-Bruce H, Silber C, Kaul I, Rothschild AJ, Riesenberg R, Sankoh AJ, et al. Trial of SAGE-217 in patients with major depressive disorder. N Engl J Med. 2019;381(10):903–11.  https://doi.org/10.1056/NEJMoa1815981 . United States.

Nave AH, Rackoll T, Grittner U, Bläsing H, Gorsler A, Nabavi DG, et al. Physical Fitness Training in Patients with Subacute Stroke (PHYS-STROKE): multicentre, randomised controlled, endpoint blinded trial. BMJ. 2019;366:l5101. https://doi.org/10.1136/bmj.l5101 .

Sands BE, Peyrin-Biroulet L, Loftus EV Jr, Danese S, Colombel JF, Törüner M, et al. Vedolizumab versus adalimumab for moderate-to-severe ulcerative colitis. N Engl J Med. 2019;381(13):1215–26.  https://doi.org/10.1056/NEJMoa1905725 . United States.

Cree BAC, Bennett JL, Kim HJ, Weinshenker BG, Pittock SJ, Wingerchuk DM, et al. Inebilizumab for the treatment of neuromyelitis optica spectrum disorder (N-MOmentum): a double-blind, randomised placebo-controlled phase 2/3 trial. Lancet. 2019;394(10206):1352–63.  https://doi.org/10.1016/S0140-6736(19)31817-3 . England.

Cooper K, Breeman S, Scott NW, Scotland G, Clark J, Hawe J, et al. Laparoscopic supracervical hysterectomy versus endometrial ablation for women with heavy menstrual bleeding (HEALTH): a parallel-group, open-label, randomised controlled trial. Lancet. 2019;394(10207):1425–36. https://doi.org/10.1016/S0140-6736(19)31790-8 .

Reddihough DS, Marraffa C, Mouti A, O’Sullivan M, Lee KJ, Orsini F, et al. Effect of fluoxetine on obsessive-compulsive behaviors in children and adolescents with autism spectrum disorders: a randomized clinical trial. JAMA. 2019;322(16):1561–9. https://doi.org/10.1001/jama.2019.14685 .

John LK, Loewenstein G, Marder A, Callaham ML. Effect of revealing authors’ conflicts of interests in peer review: randomized controlled trial. BMJ. 2019;367. https://doi.org/10.1136/bmj.l5896 .

Yamamura T, Kleiter I, Fujihara K, Palace J, Greenberg B, Zakrzewska-Pniewska B, et al. Trial of satralizumab in neuromyelitis optica spectrum disorder. N Engl J Med. 2019;381(22):2114–24.  https://doi.org/10.1056/NEJMoa1901747 . United States.

Hoskin PJ, Hopkins K, Misra V, Holt T, McMenemin R, Dubois D, et al. Effect of single-fraction vs multifraction radiotherapy on ambulatory status among patients with spinal canal compression from metastatic cancer: the SCORAD randomized clinical trial. JAMA. 2019;322(21):2084–94. https://doi.org/10.1001/jama.2019.17913 .

Lascarrou JB, Merdji H, Le Gouge A, Colin G, Grillet G, Girardie P, et al. Targeted temperature management for cardiac arrest with nonshockable rhythm. N Engl J Med. 2019;381(24):2327–37.  https://doi.org/10.1056/NEJMoa1906661 . United States.

Ständer S, Yosipovitch G, Legat FJ, Lacour JP, Paul C, Narbutt J, et al. Trial of nemolizumab in moderate-to-severe prurigo nodularis. N Engl J Med. 2020;382(8):706–16.  https://doi.org/10.1056/NEJMoa1908316 . United States.

Hill MD, Goyal M, Menon BK, Nogueira RG, McTaggart RA, Demchuk AM, et al. Efficacy and safety of nerinetide for the treatment of acute ischaemic stroke (ESCAPE-NA1): a multicentre, double-blind, randomised controlled trial. Lancet. 2020;395(10227):878–87.  https://doi.org/10.1016/S0140-6736(20)30258-0 . England.

Olsen HT, Nedergaard HK, Strøm T, Oxlund J, Wian KA, Ytrebø LM, et al. Nonsedation or light sedation in critically ill, mechanically ventilated patients. N Engl J Med. 2020;382(12):1103–11.  https://doi.org/10.1056/NEJMoa1906759 . United States.

Campbell BCV, Mitchell PJ, Churilov L, Yassi N, Kleinig TJ, Dowling RJ, et al. Effect of intravenous tenecteplase dose on cerebral reperfusion before thrombectomy in patients with large vessel occlusion ischemic stroke: the EXTEND-IA TNK Part 2 randomized clinical trial. JAMA. 2020;323(13):1257–65. https://doi.org/10.1001/jama.2020.1511 .



Acknowledgements

Not applicable.

Funding

This work forms part of Chris Selman’s PhD, which is supported by the Research Training Program Scholarship, administered by the Australian Commonwealth Government and The University of Melbourne, Australia. Chris Selman’s PhD was also supported by a Centre of Research Excellence grant from the National Health and Medical Research Council of Australia ID 1171422, to the Australian Trials Methodology (AusTriM) Research Network. Research at the Murdoch Children’s Research Institute is supported by the Victorian Government’s Operational Infrastructure Support Program. This work was supported by the Australian National Health and Medical Research Council (NHMRC) Centre for Research Excellence grants to the Victorian Centre for Biostatistics (ID1035261) and the Australian Trials Methodology Research Network (ID1171422), including through seed funding awarded to Robert Mahar. Katherine Lee is funded by an NHMRC Career Development Fellowship (ID1127984). Brett Manley is funded by the NHMRC Investigator Grant (Leadership 1). The funding bodies played no role in the study conception, design, data collection, data analysis, data interpretation, or writing of the report.

Author information

Authors and affiliations.

Clinical Epidemiology and Biostatistics Unit, Murdoch Children’s Research Institute, Parkville, VIC, 3052, Australia

Chris J. Selman, Katherine J. Lee & Robert K. Mahar

Department of Paediatrics, University of Melbourne, Parkville, VIC, 3052, Australia

Chris J. Selman & Katherine J. Lee

Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Parkville, VIC, 3052, Australia

Robert K. Mahar

Department of Obstetrics and Gynaecology, University of Melbourne, Parkville, VIC, 3052, Australia

Kristin N. Ferguson, Clare L. Whitehead & Brett J. Manley

Department of Maternal Fetal Medicine, The Royal Women’s Hospital, Parkville, VIC, 3052, Australia

Clare L. Whitehead

Newborn Research, The Royal Women’s Hospital, Parkville, VIC, 3052, Australia

Brett J. Manley

Clinical Sciences, Murdoch Children’s Research Institute, Parkville, VIC, 3052, Australia


Contributions

CJS, RKM, KJL, CLW, and BJM conceived the study, and CJS wrote the first draft of the manuscript. All authors contributed to the design of the study and the revision of the manuscript, and all take responsibility for its content.

Corresponding author

Correspondence to Chris J. Selman .

Ethics declarations

Ethics approval and consent to participate.

As data and information were only extracted from published studies, ethics approval was not required.

Consent for publication

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1.

Deviations from the protocol. This presents a summary of the deviations from the protocol, with reasons. We also provide an explanation of any simplifications and assumptions that were made for eligibility criteria and data extraction.

Additional file 2.

Data extraction questionnaire. This is a copy of the data extraction questionnaire that was used for this review, in PDF format.

Additional file 3.

Interpretation of the proportional odds ratio in proportional odds models. This presents a summary of the ways that the proportional odds ratio was interpreted across the studies.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Selman, C.J., Lee, K.J., Ferguson, K.N. et al. Statistical analyses of ordinal outcomes in randomised controlled trials: a scoping review. Trials 25 , 241 (2024). https://doi.org/10.1186/s13063-024-08072-2


Received : 02 July 2023

Accepted : 22 March 2024

Published : 06 April 2024

DOI : https://doi.org/10.1186/s13063-024-08072-2


Keywords

  • Ordinal outcomes
  • Proportional odds model
  • Randomised controlled trials
  • Scoping review




  • Open access
  • Published: 04 April 2024

Optimization of cassava peel ash concrete using central composite design method

  • Uzoma Ibe Iro 1 ,
  • George Uwadiegwu Alaneme   ORCID: orcid.org/0000-0003-4863-7628 1 , 2 ,
  • Imoh Christopher Attah 3 ,
  • Nakkeeran Ganasen 4 ,
  • Stellamaris Chinenye Duru 5 &
  • Bamidele Charles Olaiya 2  

Scientific Reports volume 14, Article number: 7901 (2024)


  • Engineering
  • Materials science

Abstract

Cassava peel ash (CPA) is an abundant agricultural byproduct that has shown promise as an additional cementitious material in concrete manufacturing. This research study aims to optimize the incorporation of CPA in concrete blends using the central composite design (CCD) methodology to determine the most effective combination of ingredients for maximizing concrete performance. The investigation involves a physicochemical analysis of CPA to assess its pozzolanic characteristics. Laboratory experiments are then conducted to assess the compressive and flexural strengths of concrete mixtures formulated with varying proportions of CPA, cement, and aggregates. The results show that a mix ratio of 0.2:0.0875:0.3625:0.4625 for cement, CPA, fine, and coarse aggregates, respectively, yields a maximum compressive strength of 28.51 MPa. Additionally, a maximum flexural strength of 10.36 MPa is achieved with a mix ratio of 0.2:0.0875:0.3625:0.525. The experimental data were used to develop quadratic predictive models, followed by statistical analyses. The culmination of the research resulted in the identification of an optimal concrete blend that significantly enhances both compressive and flexural strength. To ensure the reliability of the model, rigorous validation was conducted using Student’s t-test, revealing a strong correlation between laboratory findings and simulated values, with computed p-values of 0.9987 and 0.9912 for compressive and flexural strength responses, respectively. This study underscores the potential for enhancing concrete properties and reducing waste through the effective utilization of CPA in the construction sector.


Introduction

Concrete stands as one of the most extensively utilized construction materials worldwide; however, its manufacturing contributes considerably to environmental consequences because of the substantial energy consumption and carbon emissions linked to cement production 1, 2. In response to these environmental considerations, and to advocate for sustainable construction methods, researchers are progressively investigating alternative materials and mix formulations. One such material of interest is cassava peel ash (CPA), a waste product generated from cassava processing 3. Cassava (Manihot esculenta) is a vital crop in many tropical countries, and its processing generates substantial amounts of waste, primarily in the form of cassava peels. Improper disposal of cassava peels can lead to environmental pollution and health hazards 4. However, recent research has shown that these waste cassava peels can be effectively converted into ash, known as cassava peel ash (CPA), and utilized as a supplementary material in concrete production. CPA exhibits pozzolanic properties, similar to other supplementary cementitious materials like fly ash or silica fume. Pozzolanic substances have the capability to interact with calcium hydroxide in cement, resulting in the formation of supplementary cementitious compounds, ultimately enhancing the properties of concrete 4, 5. However, the optimization of CPA’s incorporation into concrete mixes is essential to ensure the desired performance characteristics are achieved 6.

The incorporation of CPA in concrete has gained attention due to its pozzolanic characteristics, which can contribute to enhanced strength, durability, and reduced environmental impact. The pozzolanic nature of cassava peel ash indicates its capacity to react with calcium hydroxide in the presence of moisture, resulting in the formation of extra cementitious compounds 7. This chemical process contributes to the reinforcement of strength and the improvement of the durability of concrete. Moreover, CPA has found wide acceptance in civil engineering applications such as concrete and soil re-engineering. In recent times, this agro-waste derivative has been used as a supplementary cementitious material in cement replacement strategies to enhance concrete’s mechanical properties 8, 9. Several research investigations have explored the possibilities of incorporating CPA in concrete applications. Ogunbode et al. 10 explored the mechanical and microstructure properties of concrete composites made using CPA and kenaf bio-fibers, investigating the potential for incorporating these sustainable materials into concrete mixtures. Similarly, Olubunmi et al. 11 investigated the use of cassava peel ash and wood ash as partial cement replacements in concrete. Various replacement percentages were tested, with 5%, 10%, and 15% replacements meeting plain concrete strength specifications. Higher percentages, such as 20% and 25%, were unsuitable for structural concrete. The study suggests that incorporating these materials into concrete production can help reduce environmental pollution.

Optimization of cassava peel ash (CPA) concrete using the Central Composite Design (CCD) method is an innovative approach that aims to improve the properties and performance of concrete by incorporating cassava peel ash as a supplementary cementitious material 12 , 13 . Statistical analysis of the experimental data enables researchers to model the relationship between the variables and the response using regression techniques and response surface methodology 14 . This allows for the identification of significant factors, evaluation of their individual and interactive effects, and determination of the optimal parameter values that maximize the desired response 15 . The optimization of CPA concrete using the CCD method offers several advantages, including reduced time and cost compared to traditional trial-and-error approaches. It enables researchers to efficiently explore a wide range of parameters and their interactions, leading to improved understanding and control over the properties of CPA concrete 16 , 17 . By identifying the optimal combination of variables, it is possible to enhance the performance, sustainability, and economic viability of concrete structures. It also provides a systematic and data-driven approach to guide the selection and proportioning of materials, ultimately leading to improved concrete performance and sustainability 18 .

Lately, many researchers have employed unconventional techniques to assess concrete performance concerning the interplay of mix ingredients 19, 20, 21. These methods encompass statistical, computational, and analytical approaches. Hassan et al. 22 evaluated the use of micro and nano palm oil fuel ash (POFA) as supplementary cementitious materials in high-strength blended concrete. The research aimed to optimize the concrete mix proportions using Central Composite Design and Response Surface Methodology. The experimental results validated the mathematical models, indicating close agreement between predictions and data. The study suggests an optimal mix with 10% micro POFA and 1.50–2.85% nano POFA, meeting optimization criteria for fresh and hardened concrete properties. Moreover, Ali et al. 23 investigated the use of pumice stone (PS) as a replacement for natural coarse aggregates in concrete. Various percentages of PS were used in the mix, and response surface methodology (RSM) was employed for experimentation. The study suggests that up to 30% of PS can be replaced in lightweight aggregate concrete, resulting in compressive strength greater than 15 MPa, split tensile strength at 7–12% of the compressive strength, and flexural strength at 9–11% of the compressive strength. The proposed quadratic model is highly relevant, with a coefficient of determination (R²) above 99% for all responses. Also, Ali et al. 24 investigated the utilization of waste foundry sand (WFS) as a partial replacement for fine aggregate in concrete mixtures and assessed its impact on fresh concrete performance and mechanical properties. WFS ratios were adjusted using Design-Expert software’s Central Composite Design (CCD) tool in Response Surface Methodology (RSM). Results showed the highest mechanical properties at 20% WFS replacement and 56 days of curing, with a compressive strength of 29.37 MPa, a splitting tensile strength of 3.828 MPa, and a flexural strength of 8.0 MPa. However, at up to 30% replacement, the fresh properties of the substituted mixes were similar to those of the control mix.

Furthermore, the optimization of cassava peel ash concrete using the Central Composite Design method is a valuable, data-driven research approach that allows for the systematic exploration and optimization of various variables to enhance the properties and performance of concrete 13, 25. Utilizing this method, researchers can maximize the utilization of cassava peel ash, a waste material, while improving the performance and sustainability of concrete structures and contributing to the development of more resource-efficient construction materials 26. The present study aims to optimize parameters for concrete with CPA utilizing the CCD method. This approach facilitates a systematic exploration of various variables and their interactions to identify the optimal combination that achieves the desired properties in the concrete. By employing statistical analysis and response surface modeling, the study aims to develop a comprehensive understanding of the relationship between the variables and the response, enabling the identification of the optimal parameter values.

The outcomes of this research will provide valuable insights into the optimization of CPA concrete, enabling more efficient utilization of cassava peel ash and enhancing the sustainability and performance of concrete structures. Ultimately, this study is motivated by several factors. Firstly, it aims to promote sustainable and eco-efficient construction practices by utilizing agricultural waste in concrete production. Secondly, there is potential for economic benefits through cost savings by replacing traditional cement with CPA. Additionally, the study seeks to enhance concrete performance by systematically exploring different mixture formulations using advanced design methodologies. The utilization of technology like Design Expert software streamlines the optimization process and contributes to advancements in sustainable construction practices. Overall, the research aims to improve concrete sustainability, cost-effectiveness, and performance through the effective integration of CPA.

Materials and methods

The experimental investigation utilized Grade 53 Dangote cement, obtained from the open market for building materials in Imo State, Nigeria. The cement adheres to the standards, composition, and compliance requirements outlined in BS 12 (1978).

Water plays a crucial role as a component in the concrete mixture, influencing the mechanical, rheological, and durability properties. For the laboratory tests, we employed potable water that complies with the specifications outlined in ASTM C1602-12 (2012) for concrete applications.

In this experimental study, we employed river sand sourced from Akwa Ibom State, Nigeria, as the fine aggregate. The fine aggregate meets the criteria outlined in BS EN 12620 and ASTM C125-16 and passes through a 2.36 mm sieve. As for the coarse aggregate, well-graded crushed granite free of harmful substances was employed, in adherence to BS EN 12620. The coarse aggregate has a maximum size of 20 mm.

Cassava peel ash (CPA)

The cassava peel was collected from Abayi-umuokoroato village, situated in the Abayi Ancient Kingdom of Obingwa Local Government Area in Abia State, Nigeria. Subsequently, the cassava peel was subjected to sun drying. It was then incinerated in a controlled kiln at a temperature range of approximately 500 °C to 850 °C for 60 min to ensure environmental protection. The resulting burnt material was carefully gathered and sieved in the laboratory, using a 150 µm sieve size, to obtain finely divided ash material for the experiments. A photograph of the cassava peel waste taken in the laboratory during the experiments, along with the processed ash samples, is shown in Fig. 1.

Figure 1. Ash samples derived from cassava peel.

Design of experiment using CCD

Response Surface Methodology is a statistical method employed in the design of experiments to uncover relationships between variables and responses. Its primary goal is optimizing these variables to anticipate the most favorable responses 27. CCD is a valuable technique for establishing a functional connection between the variables and responses. It incorporates a nested factorial or fractional factorial design with central points, enhanced by a set of 'star points' for curvature estimation. While the center-to-factorial point distance is ±1 unit for each factor, the center-to-star point distance is |α| > 1. The specific value of α is determined based on design requirements and the number of factors in question; however, a Face Centered Central Composite Design (FCCD), in which all the axial points are projected onto the faces of the factor space (α = 1), was utilized for the formulation 28. Design Expert 13.0.5.0 Software was used for designing the experiments, mathematically modeling, statistically analyzing, and optimizing the response parameters. In essence, the Central Composite Design (CCD) includes 2^n factorial experiments along with 2n axial experiments, and the experimental error is assessed using center point replicates (n_c). Therefore, an FCCD comprises 2^n factorial runs coded as ±1, expanded by 2n axial points of the form (±α, 0, 0, …, 0), (0, ±α, 0, …, 0), …, (0, 0, …, ±α), and n_c center points (0, 0, 0, …, 0). The total number of required experimental runs (N) for the CCD is determined by Eq. (1) 29:

N = 2^n + 2n + n_c (1)

In this context, n represents the number of variables, while n_c pertains to the number of center points. For our study, which incorporated four input variables, we adopted a CCD consisting of sixteen factorial points, eight axial points, and a single repetition at the center. The arrangement of these points can be visualized in Fig. 2. Consequently, we conducted a total of twenty-five experimental runs, considering four parameters, each varying at three levels denoted as −1, 0, and 1 30.
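As a cross-check of the run count described above, the following Python sketch enumerates the coded points of a face-centered central composite design for n = 4 factors. It is only an illustration of the design geometry (sixteen factorial, eight face-centered axial, and one center point); it is not the Design Expert output, and the coded points are not the actual mixture proportions used in the study.

```python
# Enumerate the coded points of a face-centered CCD (alpha = 1) for 4 factors.
from itertools import product
import numpy as np

n = 4  # coded factors: cement, CPA, fine aggregate, coarse aggregate

factorial = np.array(list(product([-1, 1], repeat=n)))   # 2^n = 16 corner points
axial = np.vstack([s * np.eye(n)[i]                       # 2n = 8 face-centered
                   for i in range(n) for s in (-1, 1)])   # axial points
center = np.zeros((1, n))                                 # n_c = 1 center point

design = np.vstack([factorial, axial, center])
print(design.shape)  # (25, 4), i.e. N = 2^n + 2n + n_c = 16 + 8 + 1
```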

Figure 2. FCCD diagrammatic illustration 29.

Formulation of mixture component ratios

In CCD, mix design refers to the process of determining the composition of the experimental mixtures that will be used in the study. The methodological approach involves selecting appropriate levels or values for the variables being studied and preparing the experimental mixtures accordingly 31. The mix design process in central composite design involves carefully selecting variable ranges, determining design points, assigning variable levels, calculating ingredient proportions, and preparing the experimental mixtures. This allows for a systematic exploration of the variable space and helps in understanding the interactions between the factor levels and the target response(s) of interest. The collected data from the experiments can then be used for statistical analysis and optimization to ascertain the optimal mix composition that achieves the desired objectives of the study 32, 33. The concrete mix design parameters for this experimental study indicated a target strength of 25 N/mm², with a cement content of 290 kg/m³, a coarse aggregate content of 1198.65 kg/m³, and a fine aggregate content of 766.35 kg/m³, which were derived from relevant literature 34, 35. Furthermore, taking a water-cement ratio (w/c) of 0.5, the central composite design mixture formulation obtained with the aid of Design Expert software for the experimental investigations, showing the four component constituents of cement, cassava peel ash (CPA), fine and coarse aggregates, is shown in Tables 1 and 2. Moreover, the experimental factor space for the four components in the mixture design and the cubic plot of the standard error of design are presented in Figs. 3 and 4. The plot displays the factor space on the x-axis, illustrating three sections (center, factorial, and axial) for the central composite design using Design Expert software. Meanwhile, the mixture components’ ratios for the 25 experimental runs are depicted on the y-axis of the plot. Additionally, it was noted that 16 out of the 25 design points are located on the factorial plane within the factor space. Among these, eight data points are positioned at both the lower and upper limits for the four mixture components 36, 37.

Figure 3. Experimental factor space.

Figure 4. Cube standard error of design.

Compressive strength property

The mixture components were accurately weighed and thoroughly mixed based on the specified formula. The resulting uniform concrete mixture was compacted into 150 mm × 150 mm × 150 mm cubic molds. These green concrete specimens, blended with CPA, were submerged in a curing tank filled with clean water for 28 days at normal temperature. After the curing period, they were weighed, and their compressive strength was determined following the BS EN 12390-4 standard. The cubes underwent crushing tests using the Okhard Machine Tool’s WA-1000B digital display Universal Testing Machine, with a testing range of 0–1000 kN. The cubes were positioned between two 25 mm-thick steel plates that covered the top and bottom, and force was incrementally applied until the cubes failed in compression 38, 39.
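For readers unfamiliar with the test, the strength value reported from a cube crushing test is simply the peak load divided by the loaded cross-section. The short sketch below illustrates this for a 150 mm cube; the 641.5 kN peak load is a made-up value chosen only to show the arithmetic, not a measurement from the study.

```python
# Cube compressive strength = peak load / loaded cross-sectional area (MPa).
def cube_compressive_strength(peak_load_kn: float, side_mm: float = 150.0) -> float:
    area_mm2 = side_mm * side_mm              # loaded face of the cube, mm^2
    return peak_load_kn * 1e3 / area_mm2      # N/mm^2, i.e. MPa

print(round(cube_compressive_strength(641.5), 2))  # ~28.51 MPa for this example load
```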

Flexural strength

The procedure for the flexural strength test adhered to BS EN 12390-5 (2009) standards, utilizing test specimens with dimensions of 400 × 100 × 100 mm. The specimens were thoroughly batched and mixed in accordance with the specified component fractions. Subsequently, the concrete beams formed were demolded and allowed to cure for a 28-day hydration period before undergoing the flexural test. After twenty-eight days of curing, three samples from each experimental run were subjected to testing, and the average flexural strength was determined. This process was repeated for each mix proportion, testing three specimens per proportion and calculating the average flexural strength for each 40.
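The flexural strength itself follows from simple beam theory once the failure load is known. The sketch below assumes, purely for illustration, a centre-point loading arrangement with a 300 mm span on the 100 × 100 mm cross-section; the loading arrangement, span, and failure load are assumptions for the example, not details taken from the paper.

```python
# Flexural strength for a centre-point loaded prism: f = 3*F*l / (2*b*d^2), in MPa.
def flexural_strength_centre_point(peak_load_n: float,
                                   span_mm: float = 300.0,
                                   width_mm: float = 100.0,
                                   depth_mm: float = 100.0) -> float:
    return 3.0 * peak_load_n * span_mm / (2.0 * width_mm * depth_mm ** 2)

print(round(flexural_strength_centre_point(23_000), 2))  # ~10.35 MPa for a 23 kN load
```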

Ethics and compliance statement

Authors comply with the International Union for Conservation of Nature (IUCN) Policy Statement on Research Involving Species at Risk of Extinction and the Convention on the Trade in Endangered Species of Wild Fauna and Flora in this research article.

Consent to participate

All authors were highly cooperative and involved in research activities and preparation of this article.

Results, discussion and analysis

Test materials characterization

A sequence of laboratory examinations was carried out on the constituent elements to evaluate their suitability as construction materials in civil engineering. The examinations encompassed sieve analysis and specific gravity tests on the aggregates and admixtures to assess particle size distribution and gradation. The results of the sieve analysis test are presented in Fig. 5, depicting the particle size variation with a cumulative frequency distribution curve. The findings revealed that the coarse aggregate exhibited passing percentages of 76.2–11.6% for sieve sizes of 10–2 mm, while the fine aggregates demonstrated passing percentages of 93.4–0.13% for 2 mm–75 µm 41. Moreover, the CPA admixtures in the concrete showed passing percentages of 99.99–84.63% for 2 mm–75 µm. The results conform to the requirements outlined by BS 882, indicating well-graded sand and gravel particles for enhanced concrete durability performance 38, 39.

Figure 5. Particle size distribution of the test ingredients.

Chemical characterization of the test cement and CPA

The chemical attributes of the examined admixtures were assessed through X-ray fluorescence (XRF). The results revealed that CPA consists of Fe2O3 (6.02%), Al2O3 (19.88%), and SiO2 (55.93%), together totaling 81.83%, indicating a favorable pozzolanic property compliant with ASTM C618-98 specifications 40. Furthermore, the cement composition indicated 9.85% CaO, 51.4% SiO2, and 20.6% Al2O3. The plentiful presence of these elemental oxides in the examined materials supports extensive cement hydration, improving the mechanical strength and longevity of the resulting environmentally friendly concrete, as depicted in Table 3. The reaction mechanism of hydration enables the amalgamation of aluminate and silicate oxides from the admixture with hydrated calcium, leading to the formation of a more robust mass over time 41.
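The pozzolan criterion referred to here is the ASTM C618 requirement that the combined SiO2 + Al2O3 + Fe2O3 content be at least 70% by mass. A trivial check against the quoted XRF values is sketched below.

```python
# Check the ASTM C618 combined-oxide criterion for the CPA against the quoted XRF data.
cpa_oxides = {"SiO2": 55.93, "Al2O3": 19.88, "Fe2O3": 6.02}   # % by mass, from Table 3

combined = sum(cpa_oxides.values())
print(f"SiO2 + Al2O3 + Fe2O3 = {combined:.2f}%")               # 81.83%
print("Meets ASTM C618 minimum of 70%:", combined >= 70.0)     # True
```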

Effects of CPA admixtures on the mechanical laboratory response

The effective mass values for the ingredients were determined using the ratio conversion method, ensuring precise measurements for each experimental run with a w/c of 0.5. This conversion took into account the standard concrete density of 2400 kg/m³ and applied the relationship between volume, density, and mass 42. The mass required to fill the cubic mold was determined by multiplying the calculated mold volume (m³) by the concrete density. For each experimental run, three cube and beam samples were produced, and the average compressive strength response is provided in Tables 4 and 5. The graphical representation of the influence of cement and CPA interactions on the compressive and flexural strength responses is presented in Fig. 6. The contour plot illustrates a consistent rise in both compressive and flexural strength attributes of the CPA-blended concrete as the proportion of CPA replacing cement in the mixture gradually increases from 0.025 to 0.0875. However, the strength responses begin to decline with further increments in the CPA ratio, particularly at 0.12 and beyond. The maximum compressive strength recorded was 28.51 MPa, achieved with a concrete mixture ratio of 0.2:0.0875:0.3625:0.4625 for cement, CPA, fine, and coarse aggregates. Conversely, the minimum compressive strength of 17.25 MPa corresponded to a mixture ratio of 0.15:0.15:0.425:0.525. Moreover, incorporating 19% cement, 2.4% CPA, 34.6% fine aggregate, and 44% coarse aggregates notably enhanced the compressive strength behavior of the green concrete. Additionally, the highest flexural strength of 10.36 MPa was achieved with a mixture ratio of 0.2:0.0875:0.3625:0.525, and the lowest flexural strength of 4.22 MPa was observed with a mixture ratio of 0.15:0.15:0.425:0.525. Furthermore, the obtained results showed that proportions of 17.02% cement, 7.45% CPA, 30.85% fine aggregate, and 44.68% coarse aggregate produced the best performance in terms of the flexural strength of the CPA concrete 43, 44. Overall, the concrete’s mechanical strength behavior complied with NCP-1 and BS-8110 specifications, attributed to the pozzolanic properties derived from the abundance of aluminosilicate oxides in the CPA combined with Portland cement, resulting in the formation of calcium silicate hydrate 45, 46.
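The "ratio conversion" step described above can be illustrated with a short sketch: the mix-ratio parts are normalised and scaled to the fresh mass needed to fill one 150 mm cube mould at the stated density of 2400 kg/m³. Normalising the parts to a total of one is an assumption about the method, since the paper does not spell out every step.

```python
# Convert a mix ratio (parts by mass) into batch masses for one 150 mm cube mould.
CONCRETE_DENSITY = 2400.0          # kg/m^3, stated in the text
MOULD_VOLUME = 0.15 ** 3           # m^3 for a 150 mm cube

def batch_masses(mix_ratio: dict) -> dict:
    total_parts = sum(mix_ratio.values())
    total_mass = CONCRETE_DENSITY * MOULD_VOLUME          # kg of fresh concrete per cube
    return {k: total_mass * v / total_parts for k, v in mix_ratio.items()}

# Ratio reported to give the peak compressive strength (cement : CPA : fine : coarse).
mix = {"cement": 0.2, "CPA": 0.0875, "fine_agg": 0.3625, "coarse_agg": 0.4625}
print({k: round(m, 3) for k, m in batch_masses(mix).items()})  # masses in kg
```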

Figure 6. Impact of the interaction between CPA and cement on (a) compressive strength and (b) flexural strength.

Development and validation of the model

The information obtained from the experimental procedure, which involved the application of the prescribed proportions of mixture ingredients and the corresponding responses used to assess the mechanical performance of the CPA-cement blended concrete, was utilized for constructing the model through response surface methodology. A square-root transformation with a polynomial analysis type was chosen to account for non-linearity in the datasets and to generate accurate model predictions 47. Further statistical computations were conducted on the datasets to assess their appropriateness for the intended modeling purposes, including fit statistics and analysis of variance (ANOVA). This preliminary statistical analysis provides a fit summary to identify candidate models using performance indicators such as the coefficient of determination (R²); PRESS (the predicted residual sum of squares), which evaluates how well the sought-after models fit each point in the design; lack-of-fit tests; and the sequential model sum of squares, which determines the highest polynomial order with significant additional terms, as detailed in Tables 6, 7, 8 and 9. The fit statistics indicate a preference for quadratic models, with R² values of 0.8675 and 0.9102 for the compressive and flexural strength responses, respectively. The sequential sum of squares computations yielded p-values of 0.0237 and 0.0014 for the compressive and flexural strength responses, respectively 38, 48.
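The kind of fit described here, a quadratic response surface on a square-root transformed response, can be reproduced outside Design Expert. The sketch below is a hedged illustration using placeholder random data rather than the study's Tables 4 and 5; it builds the linear, two-way interaction, and squared terms for four factors, fits them by least squares, and reports R² on the back-transformed predictions.

```python
# Fit a quadratic response surface to sqrt(y) and report R^2 (toy data, not the study's).
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(25, 4))   # placeholder factor settings (25 runs, 4 factors)
y = rng.uniform(17.0, 29.0, size=25)      # placeholder responses, e.g. strength in MPa

def quadratic_terms(X: np.ndarray) -> np.ndarray:
    cols = [np.ones(len(X))]                                                    # intercept
    cols += [X[:, j] for j in range(X.shape[1])]                                # linear terms
    cols += [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]  # interactions
    cols += [X[:, j] ** 2 for j in range(X.shape[1])]                           # squared terms
    return np.column_stack(cols)

Z = quadratic_terms(X)
beta, *_ = np.linalg.lstsq(Z, np.sqrt(y), rcond=None)   # square-root transformed response
y_hat = (Z @ beta) ** 2                                  # back-transform the predictions
r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r_squared, 3))
```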

Analysis of variance (ANOVA) result

Following the identification of a suitable polynomial model, as suggested during the fit statistical analysis, ANOVA is conducted. In this step, descriptive and statistical tests are carried out to assess the significance levels of the mixture model independent variables concerning the response parameters 49 . The computational outcomes are detailed in Table 10 for the compressive strength response, indicating a Model F-value of 4.68, signifying the significance of the model. There is only a 0.94% (p-value of 0.0094) probability that an F-value of this magnitude could occur due to random variations. Additionally, the statistical results for the flexural strength response show a Model F-value of 7.24, suggesting the significance of the model as shown in Table 11 . There is only a 0.17% (p-value of 0.0017) chance that an F-value of this magnitude could occur due to random variations 50 .
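The quoted probabilities are simply upper-tail areas of the F distribution. As a hedged illustration, the snippet below evaluates the tail area for the compressive-strength model's F-value of 4.68, assuming 14 model and 10 residual degrees of freedom for a full quadratic model fitted to 25 runs; the degrees of freedom are an assumption, not values read from the paper's ANOVA table, but with them the result lands close to the roughly 1% probability quoted above.

```python
# Upper-tail probability of an F statistic (model significance in ANOVA).
from scipy import stats

f_value = 4.68                 # model F-value for the compressive strength response
df_model, df_resid = 14, 10    # assumed: full quadratic model (15 terms) on 25 runs
p_value = stats.f.sf(f_value, df_model, df_resid)
print(f"p = {p_value:.4f}")
```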

Derived coefficient estimates and model equations

In line with the experimental plan and subsequent statistical fit ANOVA computations, regression analysis enabled the prediction of each response. This analysis was conducted using Design Expert software, exploring the interaction between variables and responses. The CCD experimental design data facilitated the evaluation of mathematical prediction equations, as illustrated in Table 12 . The equations, in terms of coded factors, could be employed to make predictions regarding the response for specified levels of each factor. These predictions were formulated as a function of the factors A, B, C, and D, representing the proportion of cement, CPA, fine aggregates, and coarse aggregates, respectively 51 .

Diagnostic plots

The diagnostic statistical graphs, presented as scattered plots of residuals or model prediction errors against the predicted values, serve to assess whether further refinement of the estimation is possible. These graphs are also utilized to gauge the goodness-of-fit of the developed model using studentized residuals, confirming adherence to regression assumption conditions and identifying potential influential observations that could significantly impact the analysis results. It's noteworthy that the standard errors of the derived residuals differ unless the experimental runs' leverages in the design are identical, signifying that raw residuals belong to varying populations and are insufficient for evaluating regression assumptions 52 , 53 . However, studentized residuals are preferred as they map all normal distributions in different dimensions to a unitary distribution. Regarding the desired response variables, diagnostic statistical tests in this analysis were conducted at upper and lower intervals of ± 4.29681, encompassing predicted vs. residual, normal probability, experimental run vs. residuals, predicted vs. actual, and Box-Cox power transformation. These tests aid in detecting issues with the analysis, including outliers, as depicted in Figs. 7 – 10 . These diagnostic statistical plots provide essential criteria for selecting an appropriate power transformation law to evaluate the effects on the response variables at the current lambda of 0.5. Figures  11 – 13 illustrate the interaction effect of CPA admixture versus the concrete ingredients concerning the mechanical strength response. The patterns of compressive and flexural strength discernible from these plots aid in comprehending the parameters for optimum responses when CPA is incorporated into the concrete mixture. The results indicate that the addition of CPA led to improvements in the mechanical properties of the concrete, with the best results achieved at an 11.21% replacement of cement with CPA in the mixture 54 , 55 .

Figure 7. Residuals normal probability plots for the target responses.

Figure 8. Residuals vs. predicted plots.

Figure 9. Residuals vs. experimental runs plots.

Figure 10. Box-Cox plots for power transformation.

Figure 11. Surface plot for OPC vs. CPA.

Figure 12. Surface plot for fine aggregate vs. CPA.

Figure 13. Surface plot for coarse aggregate vs. CPA.

Optimization analysis

After completing the diagnostic statistical analysis and influence graphical calculations, numerical optimization is undertaken using a desirability function. This function assesses the imposed optimization criteria on the model variables to maximize the target response parameters. To achieve this objective, the characteristics of the objective function are analytically adjusted through modifications to weight functions in accordance with the predetermined model variable criteria 56 . These adjustments consider multicollinearity conditions to enable the attainment of favorable conditions and achieve a desirability score of 1.0 within the boundary conditions of 0 ≤ d(yi) ≤ 1. The optimization component of this experimental design seeks the combination of mixture ratios in the feasible factor space, simultaneously satisfying the formulated and imposed criteria on the response parameters and corresponding factor levels 57 . The primary goal of the optimization is set to maximize the target responses, while the combination ratios of the four components are set within the in-range option to determine the optimal proportion of factor levels that yield a maximum response, as detailed in Table 13 . The optimization solution derived from the analytical procedures of the mixture experiment designs is presented in Table 14 and Fig.  14 . The obtained results reveal an optimal desirability score of 1.0 at a combination ratio of 0.222:0.083:0.306:0.406, resulting in maximized compressive and flexural strength of 29.832 MPa and 10.948 MPa, respectively 58 .
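To make the desirability-based search concrete, here is a hedged Python sketch of the same idea using scipy. The two predict_* functions are placeholders standing in for the fitted quadratic models, and the bounds are illustrative component ranges; none of this reproduces the Design Expert implementation or its exact solution.

```python
# Desirability-based optimisation sketch: maximise the geometric mean of per-response
# desirabilities over the mixture ratios (placeholder models and illustrative bounds).
import numpy as np
from scipy.optimize import minimize

def predict_compressive(x):    # placeholder response surface, MPa
    return 20 + 40 * x[1] * (0.12 - x[1]) / 0.12 + 10 * x[0]

def predict_flexural(x):       # placeholder response surface, MPa
    return 5 + 20 * x[1] * (0.12 - x[1]) / 0.12 + 5 * x[0]

def desirability(y, low, high):                 # "maximise" desirability, weight = 1
    return float(np.clip((y - low) / (high - low), 0.0, 1.0))

def neg_overall_desirability(x):
    d1 = desirability(predict_compressive(x), 17.0, 30.0)
    d2 = desirability(predict_flexural(x), 4.0, 11.0)
    return -np.sqrt(d1 * d2)                    # geometric mean, negated to minimise

bounds = [(0.15, 0.225), (0.025, 0.15), (0.30, 0.425), (0.40, 0.525)]  # cement, CPA, fine, coarse
result = minimize(neg_overall_desirability, [b[0] for b in bounds],
                  bounds=bounds, method="L-BFGS-B")
print(result.x, -result.fun)   # candidate mix ratio and its overall desirability
```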

Figure 14. Optimization ramps.

Optimization contour plot

The contour plot serves as a crucial tool for visualizing the functional points within the feasible experimental region through iterative mixture design optimization solutions. It is a graphical representation tool for presenting 3D surfaces through contour plotting 59. Three-dimensional surface plots provide a diagrammatic presentation of the relationships and interactions between the proportions of mixture components and the response parameters 60, 61. The 3D plots for the optimal solution, considering the desirability function and showing the response surface for the corresponding points in the analysis, are depicted in Fig. 15. These graphical solutions illustrate the desirability function of all optimal solutions, adjusted according to the multi-response optimization. From the plot, it is evident that the green surface represents the lowest desirability, occurring at CPA fractions of 0.025–0.05 and 0.125–0.15. The highest desirability is indicated by the red-colored surface, covering CPA fractions of 0.075–0.12 62, 63, 64.

Figure 15. 3D surface plot for the optimization solutions.

Model simulation and validation

This marks the final phase of the model validation process, where we replicate a real-life scenario to provide essential guidance to designers, contractors, and operators regarding the performance of the developed quadratic model 65 , 66 . The simulation of the model aims to ensure that the validation achieved during statistical diagnostics and inference computations is applicable in real-life situations. Student’s t-test was further employed to determine the statistically significant difference between the simulated model results and the experimental or actual values 64 . A graphical plot illustrating the experimental-derived responses vs. model-simulated results is presented in Fig.  16 . The computed results, obtained with the assistance of Microsoft Excel statistical software, are detailed in Table 15 . The calculated results reveal p (T ≤ t) two-tail values of 0.9987 and 0.9912 for compressive and flexural strength responses, respectively. The statistical outcomes indicate that there is no significant difference between the actual and model-predicted results, signifying acceptable model performance 67 , 68 .
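The validation step described here boils down to a two-tailed t-test on experimental versus model-simulated responses. The sketch below shows one reasonable reading of it as a paired test in scipy; the two arrays are made-up placeholders, not the study's data, and with real data a large p-value (as reported above) would indicate no significant difference between actual and predicted values.

```python
# Paired two-tailed t-test comparing measured responses with model predictions.
import numpy as np
from scipy import stats

actual = np.array([28.5, 24.1, 17.3, 22.8, 26.0])      # placeholder measured strengths, MPa
predicted = np.array([28.3, 24.4, 17.6, 22.5, 26.2])   # placeholder model-simulated values

res = stats.ttest_rel(actual, predicted)
print(f"t = {res.statistic:.3f}, p (two-tail) = {res.pvalue:.4f}")
```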

Figure 16: Actual vs. model-predicted responses.

Conclusions

The present investigation aimed to optimize the formulation of concrete blended with cassava peel ash (CPA) to achieve superior mechanical properties using a mixture design approach. The study focused on four key parameters: cement content, CPA content, fine aggregate content, and coarse aggregate content, with the primary objective of enhancing the compressive and flexural strength characteristics. The main outcomes derived from the experimental research are as follows:

The research study optimized a mixture of four components with the aim of evaluating the mechanical strength characteristics of the resulting green concrete. The limits for the design mixture components' ratios were established from formulations reported in the relevant literature, ensuring an optimal mixture proportion conducive to maximizing the strength responses.

Chemical property analysis affirmed the beneficial pozzolanic characteristics of cassava peel ash (CPA) when utilized as a supplementary cementitious material (SCM). The CPA composition revealed notable percentages of Fe2O3 (6.02%), Al2O3 (19.88%), and SiO2 (55.93%), summing to 81.83%. These findings underscore the potential suitability of CPA as an effective SCM in concrete formulations, owing to its significant content of pozzolanic oxides.

The experimental program utilized a face-centered central composite design for laboratory experiments, resulting in a maximum compressive strength of 28.51 MPa and a flexural strength of 10.36 MPa. Subsequently, a quadratic predictive model was developed using the laboratory data, and statistical analyses were conducted to assess the datasets. Through numerical optimization and graphical statistical computations, the optimal levels of mixture ingredients were identified, resulting in a desirability score of 1.0 at a mix ratio of 0.222:0.083:0.306:0.406. This optimal composition led to enhanced compressive and flexural strengths of 29.832 MPa and 10.948 MPa, respectively.

Adequacy tests performed on the generated model demonstrated a robust correlation between the laboratory results and the model-simulated values, as confirmed by Student's t-test. These findings underscore the effectiveness of the CCD method in optimizing mixture compositions to achieve desired concrete properties, thereby offering valuable insights for enhancing the mechanical performance of green concrete formulations.

Recommendation for future research

Investigation of Additional Parameters: Future studies could explore the impact of varying parameters such as water-cement ratio, curing conditions, and particle size distribution of cassava peel ash (CPA) on the mechanical properties of concrete. This comprehensive approach would provide a more nuanced understanding of the factors influencing concrete performance.

Durability Testing: Given the importance of long-term durability in concrete structures, future research could focus on evaluating the resistance of CPA-blended concrete to environmental factors such as freeze–thaw cycles, sulfate attack, and alkali-silica reaction. Conducting accelerated aging tests and field exposure studies would provide valuable insights into the durability performance of CPA concrete.

Sustainability Assessment: Further studies could assess the environmental impact of utilizing cassava peel ash as a supplementary cementitious material in concrete production. Life cycle assessments and carbon footprint analyses could be conducted to quantify the environmental benefits of incorporating CPA and compare them with traditional concrete formulations.

Optimization of Mixture Design: Continuation of research into optimizing the mixture design of CPA concrete using advanced statistical methods, such as artificial intelligence algorithms, could further enhance the mechanical properties of concrete while minimizing material usage and costs.

Field Applications and Performance Monitoring: Real-world implementation of CPA concrete in construction projects followed by systematic performance monitoring would provide valuable data on its behavior under actual loading and environmental conditions. Long-term monitoring of structures built with CPA concrete would enable the assessment of its structural integrity, durability, and sustainability in practical applications.

Data availability

All data generated or analyzed during this study are included in this published article.

References

Iro, U. I. et al. Optimization and simulation of saw dust ash concrete using extreme vertex design method. Adv. Mater. Sci. Eng. https://doi.org/10.1155/2022/5082139 (2022).

Abdellatief, M. et al. Characterization and optimization of fresh and hardened properties of ultra-high performance geopolymer concrete. Case Stud. Constr. Mater. 19 , e02549. https://doi.org/10.1016/j.cscm.2023.e02549 (2023).

Raheem, S., Arubike, E. & Awogboro, O. Effects of cassava peel ash (CPA) as alternative binder in concrete. Int. J. Constr. Res. Civil Eng. 1 (2), 27–32 (2020).

Schmidt, W., Msinjili, N. S., Pirskawetz, S. & Kühne, H. C. Efficiency of high performance concrete types incorporating bio-materials like rice husk ashes, cassava starch, lignosulfonate, and sisal fibres. In First International Conference on Bio-based Building Materials (eds Sonebi, M. et al. ) (RILEM, 2015).

Alaneme George, U. & Elvis, M. Modelling of the mechanical properties of concrete with cement ratio partially replaced by aluminium waste and sawdust ash using artificial neural network. SN Appl. Sci. 1, 1514. https://doi.org/10.1007/s42452-019-1504-2 (2019).

Uwadiegwu, A. G. & Michael, M. E. Characterization of bambara nut shell Ash (BNSA) in concrete production. J. Kejuruteraan 33 , 621–634. https://doi.org/10.17576/jkukm-2021-33(3)-21 (2021).

Nwachukwu, K. C., Oguaghamba, O., Ozioko, H. O. & Mama, B. O. Optimization of compressive strength of concrete made with partial replacement of cement with cassava peel ash (CPA) and Rice Husk Ash (RHA) using scheffe’s (6,3) Model. Int. J. Trend Sci. Res. Dev. (IJTSRD) 7 (2), 737–754 (2023).

Ogbonna, C., Mbadike, E. & Alaneme, G. Characterisation and use of Cassava peel ash in concrete production. Comput. Eng. Phys. Model. 3 (2), 11–20. https://doi.org/10.22115/cepm.2020.223035.1091 (2020).

Salau, M. A. & Olonade, K. A. Pozzolanic potentials of Cassava Peel Ash. J. Eng. Res. 7 , 9–12 (2011).

Osuide, E. E., Ukeme, U. & Osuide, M. O. An assessment of the compressive strength of concrete made with cement partially replaced with cassava peel ash. SAU Sci.-Tech J. 6 (1), 64–73 (2021).

Ogunbode, E. B. et al. Mechanical and microstructure properties of cassava peel ash–based kenaf bio-fibrous concrete composites. Biomass Conv. Bioref. 13 , 6515–6525. https://doi.org/10.1007/s13399-021-01588-6 (2023).

Olubunmi, A. A., Taiye, J. A. & Tobi, A. Compressive strength properties of cassava peel ash and wood ash in concrete production. Int. J. N. Pract. Manag. Eng. 11(01), 31–40. https://doi.org/10.1776/ijnpme.v11i01.171 (2022).

Erzurumlu, T. & Oktem, H. Comparison of response surface model with neural network in determining the surface quality of molded parts. Mater. Des. 28 (2), 459–465 (2007).

Chimmaobi, O., Mbadike, E. M. & Alaneme, G. U. Experimental investigation of cassava peel ash in the production of concrete and mortar, Umudike. J. Eng. Technol. 6 , 10–21 (2020).

Yi, S., Su, Y., Qi, B., Su, Z. & Wan, Y. Application of response surface methodology and central composite rotatable design in optimization of the preparation condition of vinyltriethoxysilane modified silicate/polydimethylsiloxane hybrid pervaporation membranes. Sep. Purif. Technol. 71, 252–262 (2020).

Priyan, M. V. et al. Recycling and sustainable applications of waste printed circuit board in concrete application and validation using response surface methodology. Sci. Rep. 13 , 16509. https://doi.org/10.1038/s41598-023-43919-9 (2023).

Nwachukwu, K. C. et al. Optimization of compressive strength of concrete made with partial replacement of cement with cassava peel Ash (CPA) and Rice Husk Ash (RHA) using scheffe’s second degree model. Int. J. Eng. Inv. (IJEI) 11 , 40–50 (2022).

Bekta, S. F. & Bekta, S. B. A. Analyzing mix parameters in ASR concrete using response surface methodology. Constr. Build. Mater. 66 , 299–305 (2014).

Alaneme, G. U., Olonade, K. A. & Esenogho, E. Critical review on the application of artificial intelligence techniques in the production of geopolymer-concrete. SN Appl. Sci. 5 , 217. https://doi.org/10.1007/s42452-023-05447-z (2023).

Alaneme, G. U. & Mbadike, E. M. Optimization of strength development of bentonite and palm bunch ash concrete using fuzzy logic. Int. J. Sustain. Eng. 14(4), 835–851. https://doi.org/10.1080/19397038.2021.1929549 (2021).

Behfarnia, K. & Khademi, F. A comprehensive study on the concrete compressive strength estimation using artificial neural network and adaptive neuro-fuzzy inference system. Int. J. Optim. Civ. Eng. 7(1), 71–80 (2017).

Hassan, W. N. F. W. et al. Mixture optimization of high-strength blended concrete using central composite design. Constr. Build. Mater. 243 , 118251. https://doi.org/10.1016/j.conbuildmat.2020.118251 (2020).

Ali, M., Kumar, A., Yuvaz, A. & Bashir, S. Central composite design application in the optimization of the effect of pumice stone on lightweight concrete properties using RSM. Case Stud. Constr. Mater. 18, e01958. https://doi.org/10.1016/j.cscm.2023.e01958 (2023).

Ali, M. et al. Central composite design application in the optimization of the effect of waste foundry sand on concrete properties using RSM. Structures 46 , 1581–1594. https://doi.org/10.1016/j.istruc.2022.11.013 (2022).

Alaneme, G. U., Attah, I. C., Etim, R. K. & Dimonyeka, M. U. Mechanical properties optimization of soil—Cement kiln dust mixture using extreme vertex design. Int. J. Pavement Res. Technol. https://doi.org/10.1007/s42947-021-00048-8 (2021).

Oladipo, I. O., Adams, J. O. & Akinwande, J. T. Using cassava peelings to reduce input cost of concrete: A waste-to-wealth initiative in Southwestern Nigeria. Univ. J. Environ. Res. Technol. 3 (4), 511–516 (2013).

Al Qadi, A., Bin Mustapha, N. K., AL-Mattarneh, H. & AL-Kadi, Q. Statistical models for hardened properties of self-compacting concrete. Am. J. Eng. Appl. Sci. 2(4), 764–770 (2009).

Al-Qadi, A., Mustapha, N. B. & AL-Mattarneh, H. Central composite design models for workability and strength of self-compacting concrete. J. Eng. Appl. Sci. 4(3), 177–183 (2009).

Ali, M. et al. Central composite design application in the optimization of the effect of waste foundry sand on concrete properties using RSM. Structures 46, 1581–1594. https://doi.org/10.1016/j.istruc.2022.11.013 (2022).

Al Salaheen, M. et al. Modelling and optimization for mortar compressive strength incorporating heat-treated fly oil shale ash as an effective supplementary cementitious material using response surface methodology. Materials https://doi.org/10.3390/ma15196538 (2022).

Alqadi, A. N. S. et al. Uses of central composite design and surface response to evaluate the influence of constituent materials on fresh and hardened properties of self-compacting concrete. KSCE J. Civ. Eng. 16 , 407–416. https://doi.org/10.1007/s12205-012-1308-z (2012).

Maia, L. Experimental dataset from a central composite design with two qualitative independent variables to develop high strength mortars with self-compacting properties. Data Brief https://doi.org/10.1016/j.dib.2021.107738 (2021).

Alaneme, G. U. et al. Mechanical properties optimization and simulation of soil-saw dust ash blend using extreme vertex design (EVD) method. Int. J. Pav. Res. Technol. https://doi.org/10.1007/s42947-023-00272-4 (2023).

Maia, L. Experimental dataset from a central composite design to develop mortars with self-compacting properties and high early age strength. Data Brief https://doi.org/10.1016/j.dib.2021.107563 (2021).

Agor, C. D., Mbadike, E. M. & Alaneme, G. U. Evaluation of sisal fiber and aluminum waste concrete blend for sustainable construction using adaptive neuro-fuzzy inference system. Sci. Rep. 13 , 2814. https://doi.org/10.1038/s41598-023-30008-0 (2023).

Alaneme, G. U., Mbadike, E. M., Attah, I. C. & Udousoro, I. M. Mechanical behaviour optimization of saw dust ash and quarry dust concrete using adaptive neuro-fuzzy inference system. Innov. Infrastruct. Solut. 7 , 122. https://doi.org/10.1007/s41062-021-00713-8 (2022).

Zolgharnein, J., Shahmoradi, A. & Ghasemi, J. B. Comparative study of Box-Behnken, central composite, and Doehlert matrix for multivariate optimization of Pb (II) adsorption onto Robinia tree leaves. J. Chemometr. https://doi.org/10.1002/cem.2487 (2013).

Agbenyeku, E. E. & Okonta, F. N. Green economy and innovation: compressive strength potential of blended cement cassava peels ash and laterised concrete. Afr. J. Sci. Technol. Innov. Dev. 6 (2), 105–110. https://doi.org/10.1080/20421338.2014.895482 (2014).

Alaneme, G. U., Olonade, K. A. & Esenogho, E. Eco-friendly agro-waste based geopolymer-concrete: A systematic review. Discov. Mater. 3 , 14. https://doi.org/10.1007/s43939-023-00052-8 (2023).

Ewa, D. E. et al. Scheffe’s simplex optimization of flexural strength of quarry dust and sawdust ash pervious concrete for sustainable pavement construction. Materials 16 (2), 598. https://doi.org/10.3390/ma16020598 (2023).

Alaneme, G. U. & Mbadike, E. M. Experimental investigation of Bambara nut shell ash in the production of concrete and mortar. Innov. Infrastruct. Solut. 6 , 66. https://doi.org/10.1007/s41062-020-00445-1 (2021).

Ikpa, C. C. et al. Evaluation of water quality impact on the compressive strength of concrete. J. Kejuruteraan 33 (3), 527–538. https://doi.org/10.1757/jkukm-2021-33(3)-15 (2021).

Sofi, A., Saxena, A., Agrawal, P., Sharma, A. R. & Sharma, K. Strength predictions of saw dust and steel fibers in concrete. Int. J. Innov. Res. Sci. Eng. Technol. 4 , 12473–12477 (2015).

Ukpata, J. O. et al. Effects of aggregate sizes on the performance of laterized concrete. Sci. Rep. 14 , 448. https://doi.org/10.1038/s41598-023-50998-1 (2024).

Kumar, A. S. et al. Development of eco-friendly geopolymer concrete by utilizing hazardous industrial waste materials. Mater Today Proc. 66 , 2215–2225. https://doi.org/10.1016/j.matpr.2022.06.039 (2022).

De Viguerie, L., Sole, V. A. & Walter, P. Multilayers quantitative X-ray fluorescence analysis applied to easel paintings. Anal. Bioanal. Chem. 395 , 2015–2020 (2009).

Ewa, D. E. et al. Optimization of saw dust ash and quarry dust pervious concrete’s compressive strength using Scheffe’s simplex lattice method. Innov. Infrastruct. Solut. 8 , 64. https://doi.org/10.1007/s41062-022-01031-3 (2023).

Ezeokpube, G. C., Alaneme, G. U., Attah, I. C., Udousoro, I. M. & Nwogbo, D. Experimental investigation of crude oil contaminated soil for sustainable concrete production, architecture. Struct. Constr. https://doi.org/10.1007/s44150-022-00069-2 (2022).

Alaneme George, U. & Mbadike Elvis, M. Optimization of flexural strength of palm nut fibre concrete using Scheffe's theory. Mater. Sci. Energy Technol. 2, 272–287. https://doi.org/10.1016/j.mset.2019.01.006 (2019).

Abdellatief, M., Elemam, W. E., Alanazi, H. & Tahwia, A. M. Production and optimization of sustainable cement brick incorporating clay brick wastes using response surface method. Ceramics Int. https://doi.org/10.1016/j.ceramint.2022.11.144 (2023).

Akeke, G. A. et al. Experimental investigation and modelling of the mechanical properties of palm oil fuel ash concrete using Scheffe’s method. Sci. Rep. 13 , 18583. https://doi.org/10.1038/s41598-023-45987-3 (2023).

Ritter, A. & Muñoz-Carpena, R. Performance evaluation of hydrological models: statistical significance for reducing subjectivity in goodness-of-fit assessments. J. Hydrol. 480 (1), 33–45. https://doi.org/10.1016/j.jhydrol.2012.12.004 (2013).

Ganasen, N. et al. Soft computing techniques for predicting the properties of raw rice husk concrete bricks using regression-based machine learning approaches. Sci. Rep. 13 , 14503. https://doi.org/10.1038/s41598-023-41848-1 (2023).

Rencher, A. C. & Christensen, W. F. Chapter 10, multivariate regression—Section 10.1 introduction. In Methods of Multivariate Analysis Wiley Series in Probability and Statistics (eds Rencher, A. C. et al. ) (Wiley, 2012).

Onyelowe, K. C., Jalal, F. E., Onyia, M. E., Onuoha, I. C. & Alaneme, G. U. Application of gene expression programming to evaluate strength characteristics of hydrated-lime-activated rice husk ash-treated expansive soil. Appl. Comput. Intell. Soft Comput. https://doi.org/10.1155/2021/6686347 (2021).

Attah, I. C. et al. Role of extreme vertex design approach on the mechanical and morphological behaviour of residual soil composite. Sci. Rep. https://doi.org/10.1038/s41598-023-35204-6 (2023).

Alaneme, G. U. et al. Mechanical strength optimization and simulation of cement kiln dust concrete using extreme vertex design method. Nanotechnol. Environ. Eng. https://doi.org/10.1007/s41204-021-00175-4 (2022).

Attah, I. C., Okafor, F. O. & Ugwu, O. O. Durability performance of expansive soil ameliorated with binary blend of additives for infrastructure delivery. Innov. Infrastruct. Solut. 7 , 234. https://doi.org/10.1007/s41062-022-00834-8 (2022).

Olatokunbo, O. et al. Assessment of strength properties of cassava peel ash-concrete. Int. J. Civil Eng. Technol. 9 (1), 965–974 (2018).

Onyelowe, K. et al. Generalized review on EVD and constraints simplex method of materials properties optimization for civil engineering. Civil Eng. J. 5, 729–749. https://doi.org/10.28991/cej-2019-03091283 (2019).

Aju, D. E., Onyelowe, K. C. & Alaneme, G. U. Constrained vertex optimization and simulation of the unconfined compressive strength of geotextile reinforced soil for flexible pavement foundation construction. Clean Eng. Technol. https://doi.org/10.1016/j.clet.2021.100287 (2021).

Ikponmwosa, E. E. & Olonade, K. A. Shrinkage characteristics of cassava peel ash concrete. Pac. J. Sci. Technol. 18 , 23–32 (2017).

Abdellatief, M. et al. A state-of-the-art review on geopolymer foam concrete with solid waste materials: components, characteristics, and microstructure. Innov. Infrastruct. Solut. 8 , 230. https://doi.org/10.1007/s41062-023-01202-w (2023).

Alaneme, G. U., Mbadike, E. M., Iro, U. I., Udousoro, I. M. & Ifejimalu, W. C. Adaptive neuro-fuzzy inference system prediction model for the mechanical behaviour of rice husk ash and periwinkle shell concrete blend for sustainable construction. Asian J. Civil Eng. https://doi.org/10.1007/s42107-021-00357-0 (2021).

Ujong, J. A., Mbadike, E. M. & Alaneme, G. U. Prediction of cost and duration of building construction using artificial neural network. Asian J. Civ. Eng. https://doi.org/10.1007/s42107-022-00474-4 (2022).

Olonade, K. A., Olajumoke, A. M., Omotosho, A. O. & Oyekunle, F. A. Effects of sulphuric acid on the compressive strength of blended cement-cassava peel ash concrete. In Construction Materials and Structures (eds Olonade, K. A. et al. ) (IOS Press, 2014).

Attah, I. C., Etim, R. K., Alaneme, G. U. & Bassey, O. B. Optimization of mechanical properties of rice husk ash concrete using Scheffe’s theory. SN Appl. Sci. https://doi.org/10.1007/s42452-020-2727-y (2020).

Author information

Authors and affiliations

Department of Civil Engineering, Michael Okpara University of Agriculture, Umudike, Nigeria

Uzoma Ibe Iro & George Uwadiegwu Alaneme

Department of Civil, School of Engineering and Applied Sciences, Kampala International University, Kampala, Uganda

George Uwadiegwu Alaneme & Bamidele Charles Olaiya

Department of Civil Engineering, Akwa Ibom State University, Ikot Akpaden, Nigeria

Imoh Christopher Attah

Department of Civil Engineering, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu, 603203, Tamil Nadu, India

Nakkeeran Ganasen

Agricultural and Bioresources Engineering Department, University of Nigeria, Nsukka, Nigeria

Stellamaris Chinenye Duru

Contributions

UII, GUA: conceptualization; UII, GUA, SCD, ICA: methodology; GUA, SCD, NG, BCO: project administration; ICA, NG, BCO: supervision; UII, GUA, SCD, ICA: formal analysis; UII, SCD, NG, BCO: software; GUA, ICA, NG: data curation; UII, GUA, NG, BCO: investigation; UII, GUA, NG: writing of the original draft; SCD, ICA, BCO: editing of the original draft. All authors have declared and agreed to publish this research article.

Corresponding author

Correspondence to George Uwadiegwu Alaneme .

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Iro, U.I., Alaneme, G.U., Attah, I.C. et al. Optimization of cassava peel ash concrete using central composite design method. Sci Rep 14 , 7901 (2024). https://doi.org/10.1038/s41598-024-58555-0

Received : 13 January 2024

Accepted : 01 April 2024

Published : 04 April 2024

DOI : https://doi.org/10.1038/s41598-024-58555-0

Keywords

  • Response surface methodology
  • Analysis of variance
  • Design expert


Cornell University Class Roster
PADM 5313 Managerial Statistics for Public Affairs

Course description

Course information provided by the Courses of Study 2023-2024.

An introduction to statistical methods commonly used in managerial decision making.  Topics to be covered include the descriptive analysis of data, inferential methods (estimation and hypothesis testing), regression and correlation analysis, as well as quality control methods.  The course will involve a research project designed to give experience in collecting and interpreting data.

When Offered: Fall

Permission Note: Enrollment limited to EMPA students.

Distribution Category: KCM-HE, SBA-HE

Learning outcomes:

  • Demonstrate analytical and functional competency in basic statistical skills.
  • Demonstrate a working knowledge of ethics as it relates to statistical analysis and communication.
  • Demonstrate the ability to solve practical problems.
  • Develop skills to be critical consumers of business and policy research.

Section details: PADM 5313 LEC 001 (class number 12834); Seven Week - First session, Aug 26 - Oct 11, 2024; online meeting, time TBA; 1 credit, graded (letter grades only). Instruction mode: Distance Learning-Asynchronous. Enrollment limited to EMPA students; department consent required to add.


An Introduction to Statistics: Choosing the Correct Statistical Test

Priya Ranganathan

1 Department of Anaesthesiology, Critical Care and Pain, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, Maharashtra, India

The choice of statistical test used for analysis of data from a research study is crucial in interpreting the results of the study. This article gives an overview of the various factors that determine the selection of a statistical test and lists some statistical tests used in common practice.

How to cite this article: Ranganathan P. An Introduction to Statistics: Choosing the Correct Statistical Test. Indian J Crit Care Med 2021;25(Suppl 2):S184–S186.

In a previous article in this series, we looked at different types of data and ways to summarise them. 1 At the end of the research study, statistical analyses are performed to test the hypothesis and either prove or disprove it. The choice of statistical test needs to be made carefully, since the use of an incorrect test could lead to misleading conclusions. Some key questions help us to decide the type of statistical test to be used for analysis of study data. 2

What is the Research Hypothesis?

Sometimes, a study may just describe the characteristics of the sample, e.g., a prevalence study. Here, the statistical analysis involves only descriptive statistics. For example, Sridharan et al. aimed to analyze the clinical profile, species distribution, and susceptibility pattern of patients with invasive candidiasis. 3 They used descriptive statistics to express the characteristics of their study sample, including mean (and standard deviation) for normally distributed data, median (with interquartile range) for skewed data, and percentages for categorical data.

Studies may be conducted to test a hypothesis and derive inferences from the sample results to the population. This is known as inferential statistics. The goal of inferential statistics may be to assess differences between groups (comparison), establish an association between two variables (correlation), predict one variable from another (regression), or look for agreement between measurements (agreement). Studies may also look at time to a particular event, analyzed using survival analysis.

Are the Comparisons Matched (Paired) or Unmatched (Unpaired)?

Observations made on the same individual (before–after or comparing two sides of the body) are usually matched or paired. Comparisons made between individuals are usually unpaired or unmatched. Data are considered paired if the values in one set of data are likely to be influenced by the other set (as can happen in before and after readings from the same individual). Examples of paired data include serial measurements of procalcitonin in critically ill patients or comparison of pain relief during sequential administration of different analgesics in a patient with osteoarthritis.

What are the Types of Data Being Measured?

The test chosen to analyze data will depend on whether the data are categorical (and whether nominal or ordinal) or numerical (and whether skewed or normally distributed). Tests used to analyze normally distributed data are known as parametric tests and have a nonparametric counterpart that is used for data, which is distribution-free. 4 Parametric tests assume that the sample data are normally distributed and have the same characteristics as the population; nonparametric tests make no such assumptions. Parametric tests are more powerful and have a greater ability to pick up differences between groups (where they exist); in contrast, nonparametric tests are less efficient at identifying significant differences. Time-to-event data requires a special type of analysis, known as survival analysis.

How Many Measurements are Being Compared?

The choice of the test differs depending on whether two or more than two measurements are being compared. This includes more than two groups (unmatched data) or more than two measurements in a group (matched data).

Tests for Comparison

Table 1 lists the tests commonly used for comparing unpaired data, depending on the number of groups and type of data. As an example, Megahed and colleagues evaluated the role of early bronchoscopy in mechanically ventilated patients with aspiration pneumonitis. 5 Patients were randomized to receive either early bronchoscopy or conventional treatment. Between groups, comparisons were made using the unpaired t-test for normally distributed continuous variables, the Mann–Whitney U-test for non-normal continuous variables, and the chi-square test for categorical variables. Chowhan et al. compared the efficacy of left ventricular outflow tract velocity time integral (LVOTVTI) and carotid artery velocity time integral (CAVTI) as predictors of fluid responsiveness in patients with sepsis and septic shock. 6 Patients were divided into three groups: sepsis, septic shock, and controls. Since there were three groups, comparisons of numerical variables were done using analysis of variance (for normally distributed data) or the Kruskal–Wallis test (for skewed data).

Table 1. Tests for comparison of unpaired data.

A common error is to use multiple unpaired t-tests for comparing more than two groups; i.e., for a study with three treatment groups A, B, and C, it would be incorrect to run unpaired t-tests for group A vs B, B vs C, and C vs A. The correct technique of analysis is to run ANOVA and use post hoc tests (if ANOVA yields a significant result) to determine which group is different from the others.
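A minimal Python sketch of these unpaired comparisons is given below, using SciPy and statsmodels (not necessarily the software used in the cited studies); the group values and contingency table are hypothetical.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical outcome values for three treatment groups A, B, and C
group_a = np.array([5.1, 4.8, 5.6, 5.0, 4.9])
group_b = np.array([6.2, 5.9, 6.5, 6.1, 6.4])
group_c = np.array([5.3, 5.5, 5.2, 5.8, 5.4])

# Two groups, normally distributed data: unpaired t-test
t, p_t = stats.ttest_ind(group_a, group_b)

# Two groups, skewed data: Mann-Whitney U-test
u, p_mw = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Categorical outcome across two groups: chi-square test on a contingency table
table = np.array([[12, 8], [5, 15]])
chi2, p_chi, dof, expected = stats.chi2_contingency(table)

# Three or more groups: one-way ANOVA instead of multiple t-tests
f, p_anova = stats.f_oneway(group_a, group_b, group_c)

# If ANOVA is significant, a post hoc test (here Tukey's HSD) identifies
# which groups differ from the others
values = np.concatenate([group_a, group_b, group_c])
labels = ["A"] * 5 + ["B"] * 5 + ["C"] * 5
print(pairwise_tukeyhsd(values, labels))
```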

Table 2 lists the tests commonly used for comparing paired data, depending on the number of groups and type of data. As discussed above, it would be incorrect to use multiple paired t-tests to compare more than two measurements within a group. In the study by Chowhan, each parameter (LVOTVTI and CAVTI) was measured in the supine position and following passive leg raise. These represented paired readings from the same individual and comparison of prereading and postreading was performed using the paired t-test. 6 Verma et al. evaluated the role of physiotherapy on oxygen requirements and physiological parameters in patients with COVID-19. 7 Each patient had pretreatment and post-treatment data for heart rate and oxygen supplementation recorded on day 1 and day 14. Since data did not follow a normal distribution, they used Wilcoxon's matched pair test to compare the prevalues and postvalues of heart rate (numerical variable). McNemar's test was used to compare the presupplemental and postsupplemental oxygen status expressed as dichotomous data in terms of yes/no. In the study by Megahed, patients had various parameters such as sepsis-related organ failure assessment score, lung injury score, and clinical pulmonary infection score (CPIS) measured at baseline, on day 3 and day 7. 5 Within groups, comparisons were made using repeated measures ANOVA for normally distributed data and Friedman's test for skewed data.

Table 2. Tests for comparison of paired data.
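The paired counterparts can be sketched in the same way; the readings, contingency table, and repeated-measure arrays below are hypothetical illustrations.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.contingency_tables import mcnemar

# Hypothetical pre- and post-treatment readings from the same individuals
pre  = np.array([98, 102, 110, 95, 120, 105, 99, 108])
post = np.array([92, 100, 104, 93, 112, 101, 97, 103])

# Normally distributed paired data: paired t-test
t, p_t = stats.ttest_rel(pre, post)

# Skewed paired data: Wilcoxon matched-pairs signed-rank test
w, p_w = stats.wilcoxon(pre, post)

# Dichotomous paired data (yes/no before vs. after): McNemar's test
# Rows = pre-treatment status, columns = post-treatment status
table = np.array([[20, 5],
                  [12, 13]])
print("McNemar p-value:", mcnemar(table, exact=True).pvalue)

# More than two repeated measurements, skewed data: Friedman's test
day1, day3, day7 = pre, post, post - 2
chi2, p_f = stats.friedmanchisquare(day1, day3, day7)
print(f"paired t p={p_t:.3f}, Wilcoxon p={p_w:.3f}, Friedman p={p_f:.3f}")
```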

T ests for A ssociation between V ariables

Table 3 lists the tests used to determine the association between variables. Correlation determines the strength of the relationship between two variables; regression allows the prediction of one variable from another. Tyagi examined the correlation between ETCO2 and PaCO2 in patients with chronic obstructive pulmonary disease with acute exacerbation, who were mechanically ventilated. 8 Since these were normally distributed variables, the linear correlation between ETCO2 and PaCO2 was determined by Pearson's correlation coefficient. Parajuli et al. compared the acute physiology and chronic health evaluation II (APACHE II) and acute physiology and chronic health evaluation IV (APACHE IV) scores to predict intensive care unit mortality, both of which were ordinal data. Correlation between APACHE II and APACHE IV score was tested using Spearman's coefficient. 9 A study by Roshan et al. identified risk factors for the development of aspiration pneumonia following rapid sequence intubation. 10 Since the outcome was categorical binary data (aspiration pneumonia: yes/no), they performed a bivariate analysis to derive unadjusted odds ratios, followed by a multivariable logistic regression analysis to calculate adjusted odds ratios for risk factors associated with aspiration pneumonia.

Table 3. Tests for assessing the association between variables.
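A hedged Python illustration of these association analyses follows; the simulated measurements and binary outcome are hypothetical and merely stand in for variables such as ETCO2/PaCO2 or risk factors for aspiration pneumonia.

```python
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Hypothetical paired numerical measurements (e.g., two related gas tensions)
x = rng.normal(40, 5, 50)
y = 0.9 * x + rng.normal(0, 3, 50)

# Normally distributed variables: Pearson's correlation coefficient
r, p_r = stats.pearsonr(x, y)

# Ordinal or non-normal variables: Spearman's rank correlation
rho, p_rho = stats.spearmanr(x, y)

# Binary outcome (e.g., complication yes/no): logistic regression gives
# adjusted odds ratios for candidate risk factors
outcome = (y + rng.normal(0, 5, 50) > 38).astype(int)
X = sm.add_constant(np.column_stack([x, rng.normal(size=50)]))
model = sm.Logit(outcome, X).fit(disp=0)
odds_ratios = np.exp(model.params)

print(f"Pearson r={r:.2f}, Spearman rho={rho:.2f}")
print("Adjusted odds ratios (constant, factor 1, factor 2):", odds_ratios)
```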

T ests for A greement between M easurements

Table 4 outlines the tests used for assessing agreement between measurements. Gunalan evaluated concordance between the National Healthcare Safety Network surveillance criteria and CPIS for the diagnosis of ventilator-associated pneumonia. 11 Since both the scores are examples of ordinal data, Kappa statistics were calculated to assess the concordance between the two methods. In the previously quoted study by Tyagi, the agreement between ETCO2 and PaCO2 (both numerical variables) was represented using the Bland–Altman method. 8

Table 4. Tests for assessing agreement between measurements.
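The sketch below illustrates both kinds of agreement analysis in Python, assuming scikit-learn for the kappa statistic and a hand-rolled Bland–Altman plot; all ratings and measurements are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import cohen_kappa_score

# Hypothetical ordinal ratings of the same cases by two diagnostic criteria
rater_a = [1, 2, 2, 3, 1, 2, 3, 3, 1, 2]
rater_b = [1, 2, 3, 3, 1, 2, 3, 2, 1, 2]
print(f"Cohen's kappa = {cohen_kappa_score(rater_a, rater_b):.2f}")

# Hypothetical paired numerical measurements from two methods: Bland-Altman plot
m1 = np.array([38.2, 41.5, 44.0, 39.8, 42.7, 40.3, 43.1, 45.6])
m2 = np.array([37.5, 42.0, 43.2, 40.5, 42.1, 41.0, 42.6, 46.3])
mean = (m1 + m2) / 2
diff = m1 - m2
bias = diff.mean()
loa = 1.96 * diff.std(ddof=1)  # 95% limits of agreement

plt.scatter(mean, diff)
plt.axhline(bias, color="k")
plt.axhline(bias + loa, linestyle="--")
plt.axhline(bias - loa, linestyle="--")
plt.xlabel("Mean of the two methods")
plt.ylabel("Difference between methods")
plt.show()
```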

Tests for Time-to-Event Data (Survival Analysis)

Time-to-event data represent a unique type of data where some participants have not experienced the outcome of interest at the time of analysis. Such participants are considered to be “censored” but are allowed to contribute to the analysis for the period of their follow-up. A detailed discussion on the analysis of time-to-event data is beyond the scope of this article. For analyzing time-to-event data, we use survival analysis (with the Kaplan–Meier method) and compare groups using the log-rank test. The risk of experiencing the event is expressed as a hazard ratio. Cox proportional hazards regression model is used to identify risk factors that are significantly associated with the event.

Hasanzadeh evaluated the impact of zinc supplementation on the development of ventilator-associated pneumonia (VAP) in adult mechanically ventilated trauma patients. 12 Survival analysis (Kaplan–Meier technique) was used to calculate the median time to development of VAP after ICU admission. The Cox proportional hazards regression model was used to calculate hazard ratios to identify factors significantly associated with the development of VAP.
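For completeness, a minimal survival-analysis sketch is shown below. It assumes the third-party lifelines package (not necessarily what the cited study used), and the follow-up times, event indicators, and group labels are hypothetical.

```python
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test

# Hypothetical time-to-VAP data: follow-up days, event indicator, and group
df = pd.DataFrame({
    "days":  [4, 7, 10, 12, 5, 9, 14, 6, 11, 13, 8, 15],
    "event": [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0],   # 1 = VAP, 0 = censored
    "zinc":  [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],   # supplementation group
})

# Kaplan-Meier estimate of the median time to event
kmf = KaplanMeierFitter()
kmf.fit(durations=df["days"], event_observed=df["event"])
print("Median time to VAP:", kmf.median_survival_time_)

# Log-rank test comparing the two groups
zinc, control = df[df["zinc"] == 1], df[df["zinc"] == 0]
result = logrank_test(zinc["days"], control["days"],
                      event_observed_A=zinc["event"],
                      event_observed_B=control["event"])
print("Log-rank p-value:", result.p_value)

# Cox proportional hazards model: hazard ratios for candidate risk factors
cph = CoxPHFitter()
cph.fit(df, duration_col="days", event_col="event")
print(cph.summary[["exp(coef)", "p"]])  # exp(coef) is the hazard ratio
```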

The choice of statistical test used to analyze research data depends on the study hypothesis, the type of data, the number of measurements, and whether the data are paired or unpaired. Reviews of articles published in medical specialties such as family medicine, cytopathology, and pain have found several errors related to the use of descriptive and inferential statistics. 12 – 15 The statistical technique needs to be carefully chosen and specified in the protocol prior to commencement of the study, to ensure that the conclusions of the study are valid. This article has outlined the principles for selecting a statistical test, along with a list of tests used commonly. Researchers should seek help from statisticians while writing the research study protocol, to formulate the plan for statistical analysis.

Priya Ranganathan https://orcid.org/0000-0003-1004-5264

Source of support: Nil

Conflict of interest: None
