
Case Study: Quality Management System at Coca Cola Company

Coca-Cola’s history can be traced back to Asa Candler, who bought the formula for the drink from the pharmacist John Stith Pemberton. Candler founded The Coca-Cola Company a few years later and began producing soft drinks based on the formula he had bought. From then on, the company grew to become the world’s largest producer of soft drinks, with more than five hundred brands sold and consumed in more than two hundred countries worldwide.

Although the company is known worldwide for its bottled drinks, it does relatively little bottling itself. Instead, the Coca-Cola Company manufactures a syrup concentrate, which is bought by bottlers all over the world. This franchise system ensures the soft drink is bottled by these smaller firms according to the company’s standards and guidelines. Although franchising is the primary method of distribution, the parent company owns a key bottler in North America, Coca-Cola Refreshments.

In addition to its flagship soft drinks, the company also produces diet soft drinks. These are variations of the original drinks with reduced sugar content and improved nutritional value. Saccharin replaced sugar in the company’s first diet drink in 1963 so that it could appeal to health-conscious consumers. A major cause for concern was competition between the company’s own products, which saw sales of some products dwindle in favor of others.

Coca-Cola began diversifying its products during the Second World War, when ‘Fanta’ was introduced. With the war underway, the head of Coca-Cola’s operation in Nazi Germany could no longer promote an American brand or obtain the original ingredients, so he decided to launch a new soft drink under a new name, and ‘Fanta’ was born. The creation was successful and production continued even after the war. ‘Sprite’ followed soon after.

In the 1990s, health concerns among consumers of soft drinks forced their manufacturers to consider altering the energy content of these products. ‘Minute Maid’ juices, ‘PowerAde’ sports drinks, and a few flavored tea variants were Coca-Cola’s initial responses to this new interest. Although most of these new products were well received, some did not perform as well. One example was Coca-Cola C2, a mid-calorie version of the classic drink.

The Coca-Cola Company has been successful for more than a century. This can be attributed partly to the nature of its products, since soft drinks will always appeal to people, and partly to one of the best commercial and public relations programs in the world. The company’s products can be found in adverts in virtually every corner of the globe. This success has led to its support for a wide range of sporting activities; soccer, baseball, ice hockey, athletics and basketball are some of the sports in which Coca-Cola is involved.

The Quality Management System at Coca-Cola

It is very important that every product Coca-Cola produces meets the same high quality standard, so that each product is exactly the same wherever it is bought. This matters because the company wants to meet customer requirements and expectations, and with a brand of such global presence, these checks must be continually consistent. A standard bottle of Coca-Cola has several elements that need to be checked on the production line to make sure high quality is being met; the most common checks cover ingredients, packaging and distribution. Much of the testing takes place during the production process, as machines and a small team of employees monitor progress. Checking quality is the responsibility of all of Coca-Cola’s staff, from hygiene operators to those overseeing product and packaging quality. These constant checks require staff to be on the lookout for problems and to take responsibility for them, to ensure quality is maintained.

Coca-Cola uses inspection throughout its production process, especially in testing the Coca-Cola formula to ensure that each product meets specific requirements. Inspection normally refers to sampling a product after production in order to take corrective action and maintain quality. Coca-Cola has incorporated this method into its organisational structure because it can eliminate mistakes and maintain high quality standards, reducing the chance of a product recall. It is also easy to implement and cost effective.

Coca-Cola uses both Quality Control (QC) and Quality Assurance (QA) throughout its production process. QC mainly focuses on the production line itself, whereas QA covers the entire operations process and related functions, addressing potential problems quickly. In QC and QA, state-of-the-art computers check all aspects of the production process, maintaining consistency and quality by checking the consistency of the formula, the creation (blowing) of the bottle, the fill level of each bottle and the labeling of each bottle, increasing the speed of production and of quality checks and ensuring that product demand is met. QC and QA help reduce the risk of defective products reaching a customer, because problems are found and resolved in the production process; for example, bottles considered defective are placed in a waiting area for inspection. QA also covers the quality of goods supplied to Coca-Cola, for example the sugar supplied by Tate & Lyle; the company reports that it has never had a problem with its suppliers. QA can also involve training staff to ensure that employees understand how to operate machinery. Coca-Cola ensures that all members of staff receive training prior to their employment, so that they can operate machinery efficiently, and machinery is under constant maintenance by highly skilled engineers who fix problems and help Coca-Cola maintain high output.

Every bottle is also checked to confirm it is at the correct fill level and has the correct label. This is done by a computer through which every bottle passes during the production process, and any faulty products are taken off the main production line. Should the quality control measures find any errors, the production line is frozen back to the last good check that was made. The Coca-Cola bottling plant also checks the utilization level of each production line using a scorecard system, which shows the percentage of the line’s capacity being used and allows managers to increase the production level of a line if necessary.
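To make these two checks concrete, the sketch below shows how such logic might be expressed in software. It is a minimal illustration in Python with invented names, targets and tolerances; the article does not describe Coca-Cola’s actual systems.

```python
# Hypothetical sketch of the in-line checks described above: flag bottles
# whose fill level falls outside tolerance, and compute a line-utilization
# percentage for the scorecard. Values are illustrative, not Coca-Cola's.

TARGET_FILL_ML = 500.0
TOLERANCE_ML = 5.0

def check_fill(fill_ml: float) -> bool:
    """Return True if the bottle is within the accepted fill range."""
    return abs(fill_ml - TARGET_FILL_ML) <= TOLERANCE_ML

def line_utilization(bottles_produced: int, rated_capacity: int) -> float:
    """Utilization as a percentage of the line's rated capacity."""
    return 100.0 * bottles_produced / rated_capacity

samples = [499.1, 503.8, 492.3, 500.6]          # fill levels from sensors (ml)
rejects = [s for s in samples if not check_fill(s)]
print(f"{len(rejects)} of {len(samples)} bottles diverted for inspection")
print(f"Line utilization: {line_utilization(41800, 50000):.1f}%")
```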

Coca-Cola also uses Total Quality Management (TQM), which involves managing quality at every level of the organisation, including suppliers, production and customers. This allows Coca-Cola to retain (and regain) competitiveness and achieve increased customer satisfaction, and the company uses the method to continuously improve the quality of its products. Teamwork is very important, and Coca-Cola ensures that every member of staff is involved in the production process, meaning that each employee understands their role; this improves morale and motivation and so increases productivity. TQM practices can also increase customer involvement, as many organisations, including Coca-Cola, relish the opportunity to receive feedback and information from their consumers. Overall, TQM reduces waste and costs and provides Coca-Cola with a competitive advantage.

The Production Process

Before production starts on a line, cleaning tasks are performed to rinse internal pipelines, machines and equipment. This is often done during a switch-over of lines, for example when changing from Coke to Diet Coke, to ensure that the taste is unaffected. This check serves both hygiene and product quality, and once it is complete the production process can begin.

Coca-Cola uses a database system called Questar which enables checks to be performed on the line. For example, all materials are coded and each line is issued with a bill of materials before the process starts, ensuring that the correct materials are put on the line. This check is designed to eliminate problems on the production line and is audited regularly; without this system, product quality could not be assessed at such a high level. Other quality checks on the line cover packaging and carbonation, which are monitored by an operator who notes down the values to ensure they meet standards.
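The bill-of-materials check lends itself to a simple set comparison. The following is a minimal sketch of that idea, not the actual Questar system; the material codes and structures are hypothetical.

```python
# Illustrative sketch: verify that the coded materials issued to a line
# match its bill of materials before production starts. Codes are invented.

bill_of_materials = {"SYR-001", "CO2-014", "BTL-500", "CAP-012", "LBL-230"}
issued_to_line = {"SYR-001", "CO2-014", "BTL-500", "CAP-012", "LBL-231"}

missing = bill_of_materials - issued_to_line      # required but not issued
unexpected = issued_to_line - bill_of_materials   # issued but not on the BOM

if missing or unexpected:
    print(f"Hold line: missing {sorted(missing)}, unexpected {sorted(unexpected)}")
else:
    print("Materials match the bill of materials; line cleared to start")
```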

To test product quality further, lab technicians carry out over 2,000 spot checks a day to ensure quality and consistency. These checks can take place before or during production and can involve taking a sample of bottles off the production line. Quality tests include CO2 and sugar values, micro testing, packaging quality and cap tightness, and they are designed so that total quality management ideas can be put forward. For example, one way in which Coca-Cola has improved its production process is at the wrapping stage at the end of the line. The machine performed revolutions around the products, wrapping them in plastic until the contents were secure; one initiative meant that one less revolution was needed. This change did not affect the quality of the packaging or the product itself, and therefore saved large amounts of money on packaging costs. Continuous improvement can also be used to adhere to the environmental and social principles the company has a responsibility to abide by. Continuous improvement opportunities are sometimes easy to identify but can lead to big changes within the organisation; the idea is to reveal opportunities to change the way something is performed, and any source of waste, scrap or rework is a potential improvement project.
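A lab spot check of the kind described, comparing measured CO2 and sugar values against specification ranges, could look like the sketch below; the specification limits are invented for illustration and do not come from the article.

```python
# A minimal sketch of a lab spot check: compare measured CO2 and sugar
# (Brix) values against specification ranges. Spec limits are made up.

SPECS = {
    "co2_volumes": (3.6, 4.0),    # dissolved CO2, volumes of gas
    "sugar_brix":  (10.2, 10.8),  # sugar content, degrees Brix
}

def spot_check(sample: dict) -> list:
    """Return the names of any measurements outside their spec range."""
    failures = []
    for name, (low, high) in SPECS.items():
        value = sample[name]
        if not (low <= value <= high):
            failures.append(name)
    return failures

sample = {"co2_volumes": 3.52, "sugar_brix": 10.5}
failed = spot_check(sample)
print("Out of spec:", failed if failed else "none")
```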

The success of this system can be measured by assessing the consistency of product quality. Coca-Cola says: ‘Our Company’s Global Product Quality Index rating has consistently reached averages near 94 since 2007, with a 94.3 in 2010, while our Company Global Package Quality Index has steadily increased since 2007 to a 92.6 rating in 2010, our highest value to date.’ This is a clear indication that the quality system is working well throughout the organisation, and the rise in the index shows that the consistency of the products is being recognized by consumers.



Making quality assurance smart

For decades, outside forces have dictated how pharmaceutical and medtech companies approach quality assurance. The most influential force remains regulatory requirements: both individual interpretations of regulations and feedback received during regulatory inspections have shaped quality assurance systems and processes. At the same time, mergers and acquisitions, along with the proliferation of different IT solutions and quality software, have resulted in a diverse and complicated quality management system (QMS) landscape. Historically, the cost of consolidating and upgrading legacy IT systems has been prohibitive. Further challenged by a scarcity of IT support, many quality teams have learned to rely on the processes and workflows provided by off-the-shelf software without questioning whether they actually fit their company’s needs and evolving regulatory requirements.

In recent years, however, several developments have enabled a better way. New digital and analytics technologies make it easier for quality teams to access data from different sources and in various formats without replacing existing systems. Companies can now build dynamic user experiences in web applications at a fraction of the cost of traditional enterprise desktop software, raising the prospect of more customized, user-friendly solutions. Moreover, regulators such as the FDA are increasingly focused on quality systems and process maturity (see the MDIC Case for Quality program). The FDA has also identified the enablement of innovative technologies as a strategic priority (see its Technology Modernization Action Plan), opening the door for constructive dialogue about potential changes.

Smart quality at a glance

“Smart quality” is a framework that pharma and medtech companies can apply to redesign key quality assurance processes and create value for the organization.

Smart quality has explicit objectives:

  • to perceive and deliver on multifaceted and ever-changing customer needs
  • to deploy user-friendly processes built organically into business workflows, reimagined with leading-edge technologies
  • to leapfrog existing quality management systems with breakthrough innovation, naturally fulfilling the spirit—not just the letter—of the regulations

The new ways in which smart quality achieves its objectives can be categorized into five building blocks (exhibit).

To learn more about smart quality and how leading companies are reimagining the quality function, please see “Smart quality: Reimagining the way quality works.”

The time has arrived for pharmaceutical and medtech companies to act boldly and reimagine the quality function. Through our work on large-scale quality transformation projects and our conversations with executives, we have developed a new approach we call “smart quality” (see sidebar, “Smart quality at a glance”). With this approach, companies can redesign key quality processes using design-thinking methodology (to make processes more efficient and user-friendly), automation and digitization (to deliver speed and transparency), and advanced analytics (to provide deep insights into process capability and product performance).

The quality assurance function thereby becomes a driver of value in the organization and a source of competitive advantage—improving patient safety and health outcomes while operating efficiently, effectively, and fully aligned with regulatory expectations. In our experience, companies applying smart quality principles to quality assurance can quickly generate returns that outweigh investments in new systems, including line-of-sight impact on profit; a 30 percent improvement in time to market; and a significant increase in manufacturing and supply chain reliability. Equally significant are improvements in customer satisfaction and employee engagement, along with reductions in compliance risk.

Revolutionizing quality assurance processes

The following four use cases illustrate how pharmaceutical and medtech companies can apply smart quality to transform core quality assurance processes: complaint management, quality management review, deviation investigations, and supplier quality risk management.

1. Complaint management

Responding swiftly and effectively to complaints is not only a compliance requirement but also a business necessity. Assessing and reacting to feedback from the market can have an immediate impact on patient safety and product performance. Today, a pharmaceutical or medtech company may believe it is handling complaints well if it has a single software platform deployed around the globe for complaint management, with some elements of automation (for example, flagging reportable malfunctions in medical devices) and several processing steps happening offshore (such as intake, triage, and regulatory reporting).

Yet for most quality teams, the average investigation and closure cycle time hovers around 60 days; a few adverse events are reported late every month, and negative trends are addressed two or more months after the signals come in. It can take quality assurance teams even longer to identify complaints that collectively point to negative trends for a particular product or device. At the same time, less than 5 percent of incoming complaints are truly new events that have never been seen before. The remainder can usually be categorized either as well-known issues within expected limits or as previously investigated issues whose root causes have been identified and are already being addressed.

The smart quality approach improves customer engagement and speed

By applying smart quality principles and the latest technologies, companies can reduce turnaround times and improve the customer experience. They can create an automated complaint management process that reduces costs yet applies the highest standards:

  • For every complaint, the information required for a precise assessment is captured at intake, and the event is automatically categorized.
  • High-risk issues are immediately escalated by the system, with autogenerated reports ready for submission.
  • New types of complaints and out-of-trend problems are escalated and investigated quickly.
  • Low-risk, known issues are automatically trended and closed if they are within expected limits or already being addressed.
  • Customer responses and updates are automatically available.
  • Trending reports are available in real time for any insights or analyses.

To transform the complaint management process, companies should start by defining a new process and ensuring it meets regulatory requirements. The foundation for the new process can be a structured event assessment that allows automated issue categorization based on the risk level defined in the company’s risk management documentation. A critical technological component is the automation of customer complaint intake: a dynamic front-end application can guide a customer through a series of questions (Exhibit 1), capturing only the information relevant to a specific complaint evaluation, investigation, and, if necessary, regulatory report. Real-time trending can quickly identify signals that indicate issues exceeding expected limits. In addition, companies can use machine learning to scan text and identify potential high-risk complaints. Finally, risk-tailored investigation pathways, automated reporting, and customer response solutions complete the smart quality process. Successful companies maintain robust procedures and documentation that clearly explain how the new process reliably meets specific regulatory requirements. Usually, a minimum viable product (MVP) for the new process can be built within two to four months for the first high-volume product family.
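To make the categorization step concrete, here is a minimal sketch of risk-based triage logic in Python. The risk table, categories, and rules are assumptions invented for illustration; a real system would derive them from the company’s risk management documentation.

```python
# Hedged sketch of risk-based complaint triage. Categories and the risk
# table are illustrative, not taken from any actual complaint system.

RISK_TABLE = {
    "device_malfunction": "high",     # escalate and autogenerate report
    "labeling_error":     "medium",
    "packaging_cosmetic": "low",      # trend automatically, close if in limits
}

def triage(complaint: dict) -> str:
    category = complaint["category"]
    risk = RISK_TABLE.get(category)
    if risk is None:
        return "escalate: new event type, investigate"
    if risk == "high":
        return "escalate: autogenerate regulatory report"
    if complaint["within_expected_limits"]:
        return "auto-close: known issue, within trend limits"
    return "investigate: known issue, out of trend"

print(triage({"category": "packaging_cosmetic", "within_expected_limits": True}))
print(triage({"category": "device_malfunction", "within_expected_limits": True}))
```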

In our experience, companies that redesign the complaint management process can respond more swiftly, often within a few hours, to reduce patient risk and minimize the scale and impact of potential issues in the field. For example, one medtech company that adopted the new complaint management approach can now automatically assess all complaints and close more than 55 percent of them within 24 hours without human intervention, and few, if any, reportable events have missed submission deadlines. Subject matter experts are now free to focus on investigating new or high-risk issues, understanding root causes, and developing the most effective corrective and preventive actions. The company also reports that its customers prefer digital interfaces to paper forms and are pleased to be updated promptly on the status and resolution of their complaints.

2. Quality management review

Real-time performance monitoring is crucial to executive decision making at pharmaceutical and medtech companies. During a 2019 McKinsey roundtable discussion, 62 percent of quality assurance executives rated it as a high priority for the company, exceeding all other options.

For many companies today, the quality review process involves significant manual data collection and chart creation. Often, performance metrics focus on quality compliance outcomes and quality systems—such as deviation cycle times—at the expense of leading indicators and connection to culture and cost. Managers and executives frequently find themselves engaged in lengthy discussions, trying to interpret individual metrics and often missing the big picture.

Although many existing QMS solutions offer automated data-pull and visualization features, the interpretation of complex metric systems and trends remains largely a manual process. A team may quickly address one performance metric or trend, only to learn several months later that the change negatively affected another metric.

The smart quality approach speeds up decision making and action

By applying smart quality principles and the latest digital technologies, companies can get a comprehensive view of quality management in real time. This approach to performance monitoring allows companies to do the following:

  • automatically collect, analyze, and visualize relevant leading indicators and outcomes on a simple and intuitive dashboard
  • quickly identify areas of potential risk and emerging trends, as well as review their underlying metrics and connections to different areas
  • rapidly make decisions to address existing or emerging issues and monitor the results
  • adjust metrics and targets to further improve performance as goals are achieved
  • view the entire value chain and create transparency for all functions, not just quality

To transform the process, companies should start by reimagining its design and settling on a set of metrics that balances leading and lagging indicators. A key technical enabler is an interconnected metrics structure that automates data pull and visualization and digitizes analysis and interpretation (Exhibit 2). Key business processes, such as regular quality management reviews, may require changes to include a wider range of functional stakeholders and to streamline the review cascade.
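A simplified sketch of such an interconnected metrics review follows: pull indicators, compare each against its target, and flag linked metrics for joint review. The metric names, targets, and links are assumptions for illustration, not a real company’s dashboard.

```python
# Illustrative sketch of connected quality metrics: flag off-target
# indicators and surface the metrics linked to them for joint review.

metrics = {
    "late_regulatory_reports": {"value": 4,    "target": 0,    "higher_is_better": False},
    "complaint_closure_days":  {"value": 38.0, "target": 30.0, "higher_is_better": False},
    "training_completion_pct": {"value": 97.0, "target": 95.0, "higher_is_better": True},
}
links = {"late_regulatory_reports": ["complaint_closure_days"]}

for name, m in metrics.items():
    off_target = (m["value"] < m["target"]) if m["higher_is_better"] else (m["value"] > m["target"])
    if off_target:
        related = links.get(name, [])
        print(f"Flag {name}: {m['value']} vs target {m['target']}; review with {related}")
```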

Healthcare companies can use smart quality to redesign the quality management review process and see results quickly. At one pharmaceutical and medtech company, smart visualization of connected, cross-functional metrics significantly improved the effectiveness and efficiency of quality management review at all levels. Functions throughout the organization reported feeling better positioned to ascertain the quality situation quickly, support decision making, and take necessary actions. Because of connected metrics, management can not only see alarming trends but also link them to other metrics and quickly align on targeted improvement actions. For example, during a quarterly quality management review, the executive team linked late regulatory reporting to an increase in delayed complaint submissions in some geographic regions. Following the review, commercial leaders drew attention to the issue in their respective regions, and in less than three months, late regulatory reporting was reduced to zero. Although the company is still in the process of fully automating data collection, it has already noticed a significant shift in its work: the quality team no longer spends the majority of its time on data processing but has pivoted to understanding, interpreting, and addressing complex and interrelated trends to reduce quality and compliance risks.

3. Deviation or nonconformance investigations

Deviation or nonconformance management is a critical topic for companies today because unaddressed issues can lead to product recalls and reputational damage. More often, deviations or nonconformances affect a company’s product-release process, capacity, and lead times. As many quality teams can attest, the most challenging and time-consuming part of a deviation or nonconformance investigation is often the root cause analysis. In the best of circumstances, investigators use a tracking and trending system to identify similar occurrences, but more often than not, these systems lack good classification of root causes and similarities. Searching them can become another hurdle for quality teams, resulting in longer lead times and ineffective root cause assessment. Failing to meet the standards defined by regulators for deviation or nonconformance categorization and root cause analysis is one of the main causes of warning letters and consent decrees.

The smart quality approach improves effectiveness and reduces lead times

Our research shows companies that use smart quality principles to revamp the investigation process may reap these benefits:

  • all pertinent information related to processes and equipment is easily accessible in a continuously updated data lake
  • self-learning algorithms predict the most likely root cause of new deviations, thereby automating the review of process data and statements

In our experience, advanced analytics is the linchpin of transforming the investigation process. The most successful companies start by building a real-time data model from local and global systems that continuously refreshes and improves the model over time. Natural language processing can generate additional classifications of deviations or nonconformances to improve the quality and accuracy of insights. Digitization ensures investigators can easily access graphical interfaces that are linked to all data sources. With these tools in place, companies can readily identify the most probable root cause for deviation or nonconformance and provide a fact base for the decision. Automation also frees quality assurance professionals to focus on corrective and preventive action (Exhibit 3).
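As a concrete, deliberately simplified illustration of the prediction step, the sketch below uses scikit-learn to train a text classifier on historical deviation descriptions and suggest the most likely root cause for a new one. The data, labels, and model choice are assumptions for illustration, not the approach of any specific company.

```python
# A minimal sketch, assuming scikit-learn is installed: predict the most
# likely root-cause category for a new deviation from its description,
# using a TF-IDF + logistic regression pipeline. Training data is invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

history = [
    ("temperature excursion in cold room during transfer", "equipment_failure"),
    ("operator skipped line clearance step before batch start", "procedure_not_followed"),
    ("pressure sensor drifted out of calibration range", "equipment_failure"),
    ("wrong label reel loaded at packaging station", "procedure_not_followed"),
]
texts, causes = zip(*history)

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, causes)

new_deviation = ["chiller failed overnight, temperature out of range"]
print(model.predict(new_deviation)[0])      # most likely root cause
print(model.predict_proba(new_deviation))   # confidence across causes
```

In practice the model would be retrained continuously from the data lake described above, and its suggestion would serve as a starting fact base for the investigator, not an automatic conclusion.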

Pharmaceutical and medtech companies that apply these innovative technologies and smart quality principles can see significant results. Our work with several companies shows that identifying, explaining, and eliminating the root causes of recurring deviations and nonconformances can reduce the overall volume of issues by 65 percent. Companies that use the data and models to determine which unexpected factors in processes and products influence the end quality are able to control for them, thereby achieving product and process mastery. What’s more, by predicting the most likely root causes and their underlying drivers, these companies can reduce the investigation cycle time for deviations and nonconformances by 90 percent.

4. Supplier quality risk management

Drug and medical device supply chains have become increasingly global, complex, and opaque as more pharmaceutical and medtech companies outsource major parts of production to suppliers and contract manufacturing organizations (CMOs). More recently, the introduction of new, complex modalities, such as cell therapy and gene editing, has further increased pressure to ensure the quality of supplier products. Against this backdrop, it is critical to have a robust supplier quality program that can proactively identify and mitigate supplier risks or vulnerabilities before they become material issues.

Today, many companies conduct supplier risk management manually and at one specific point in time, such as at the beginning of a contract or annually. Typically, risk assessments are done in silos across the organization; every function completes individual reports and rarely looks at supplier risk as a whole. Because the results are often rolled up and individual risk signals can become diluted, companies focus more on increasing controls than addressing underlying challenges.

The smart quality approach reduces quality issues and optimizes resources

Companies that break down silos and apply a more holistic risk lens across the organization have a better chance of proactively identifying supplier quality risks. With smart quality assurance, companies can do the following:

  • identify vulnerabilities by utilizing advanced analytics on a holistic set of internal and external supplier and product data
  • ensure real-time updates and reviews to signal improvements in supplier quality and any changes that may pose an additional risk
  • optimize resource allocation and urgency of action, based on the importance and risk level of the supplier or CMO

Current technologies make it simpler than ever to automatically collect meaningful data. They also make it possible to analyze the data, identify risk signals, and present information in an actionable format. Internal and supplier data can include financials, productivity, and compliance metrics. Such information can be further enhanced by publicly available external sources—such as regulatory reporting, financial statements, and press releases—that provide additional insights into supplier quality risks. For example, using natural language processing to search the web for negative press releases is a simple yet powerful method to identify risks.
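A very simple version of that press-release screening idea is sketched below: scan fetched headlines for negative signal terms tied to a supplier. The supplier name, headlines, and term list are invented; a production system would use a proper NLP model rather than keyword matching.

```python
# Illustrative sketch of supplier risk screening: flag headlines that
# mention a supplier alongside a negative signal term. Terms are examples.

NEGATIVE_TERMS = {"recall", "warning letter", "bankruptcy", "contamination",
                  "import ban", "data integrity"}

def flag_headlines(supplier: str, headlines: list) -> list:
    """Return headlines mentioning the supplier and a risk term."""
    flagged = []
    for h in headlines:
        text = h.lower()
        if supplier.lower() in text and any(t in text for t in NEGATIVE_TERMS):
            flagged.append(h)
    return flagged

headlines = [
    "Acme Biologics announces new fill-finish capacity",
    "FDA issues warning letter to Acme Biologics over data integrity",
]
print(flag_headlines("Acme Biologics", headlines))
```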


Once a company has identified quality risks, it must establish a robust process for managing these risks. Mitigation actions can include additional monitoring with digital tools, supporting the supplier to address the sources of issues, or deciding to switch to a different supplier. In our experience, companies that have a deep understanding of the level of quality risk, as well as the financial exposure, have an easier time identifying the appropriate mitigation action. Companies that identify risks and proactively mitigate them are less likely to experience potentially large supply disruptions or compliance findings.

Many pharmaceutical and medtech companies have taken steps to improve visibility into supplier quality risks by using smart quality principles. For example, a large pharmaceutical company that implemented this data-driven approach eliminated, in less than two years, the major CMO and supplier findings that had been identified during audits. In addition, during the COVID-19 pandemic, a global medtech company was able to proactively prevent supply chain disruptions by drawing on insights derived from smart quality supplier risk management.

Getting started

Pharmaceutical and medtech companies can approach quality assurance redesign in multiple ways. In our experience, starting with two or three processes, codifying the approach, and then rolling it out to more quality systems accelerates the overall transformation and time to value.

Smart quality assurance starts with clean-sheet design. By deploying modern design techniques, organizations can better understand user needs and overcome constraints. To define the solution space, we encourage companies to draw upon a range of potential process, IT, and analytics solutions from numerous industries. In cases where the new process is substantially different from the legacy process, we find it beneficial to engage regulators in an open dialogue and solicit their early feedback to support the future-state design.

Once an MVP that includes digital and automation elements is ready, companies can test and refine new solutions in targeted pilots. Throughout the process, we encourage companies to remain mindful of training and transition planning. Plans should include details on ensuring uninterrupted operations and maintaining compliance during the transition period.

The examples in this article are not exceptions. We believe that any quality assurance process can be significantly improved by applying a smart quality approach and the latest technologies. Pharmaceutical and medtech companies that are willing to make the organizational commitment to rethink quality assurance can significantly reduce quality risks, improve their speed and effectiveness in handling issues, and see long-term financial benefits.




Total quality management: three case studies from around the world

With organisations to run and big orders to fill, it’s easy to see how some CEOs inadvertently sacrifice quality for quantity. By integrating a system of total quality management, it’s possible to have both.


There are few boardrooms in the world whose inhabitants don’t salivate at the thought of engaging in a little aggressive expansion. After all, there’s little room in a contemporary, fast-paced business environment for any firm whose leaders don’t subscribe to ambitions of bigger factories, healthier accounts and stronger turnarounds. Yet too often such tales of excess go hand-in-hand with complaints of a severe drop in quality.

Food and entertainment markets are riddled with cautionary tales, but service sectors such as health and education aren’t immune to the disappointing by-products of unsustainable growth either. As always, the first steps in avoiding a catastrophic forsaking of quality begin with good management.

There are plenty of methods and models geared at managing the quality of a particular company’s goods or services. Yet very few of those models take into consideration the widely held belief that any company is only as strong as its weakest link. With that in mind, the statistician and management consultant W. Edwards Deming developed an entirely new set of methods with which to address quality.

Deming, whose managerial work revolutionised the titanic Japanese manufacturing industry, perceived quality management to be more of a philosophy than anything else. Top-to-bottom improvement, he reckoned, required uninterrupted participation of all key employees and stakeholders. Thus, the total quality management (TQM) approach was born.

All in

Similar to the Six Sigma improvement process, TQM ensures long-term success by enforcing all-encompassing internal guidelines and process standards to reduce errors. By way of serious, in-depth auditing – as well as some well-orchestrated soul-searching – TQM ensures firms meet stakeholder needs and expectations efficiently and effectively, without forsaking ethical values.

By opting to reframe the way employees think about the company’s goals and processes, TQM allows CEOs to make sure certain things are done right from day one. According to Teresa Whitacre of ASQ (the American Society for Quality), proper quality management also boosts a company’s profitability.

“Total quality management allows the company to look at their management system as a whole entity — not just an output of the quality department,” she says. “Total quality means the organisation looks at all inputs, human resources, engineering, production, service, distribution, sales, finance, all functions, and their impact on the quality of all products or services of the organisation. TQM can improve a company’s processes and bottom line.”

Embracing the entire process sees companies strive to improve in several core areas, including customer focus, total employee involvement, process-centred thinking, systematic approaches, good communication, leadership and integrated systems. Yet Whitacre is quick to point out that companies stand to gain very little from TQM unless they’re willing to go all-in.

“Companies need to consider the inputs of each department and determine which inputs relate to its governance system. Then, the company needs to look at the same inputs and determine if those inputs are yielding the desired results,” she says. “For example, ISO 9001 requires management reviews occur at least annually. Aside from minimum standard requirements, the company is free to review what they feel is best for them. While implementing TQM, they can add to their management review the most critical metrics for their business, such as customer complaints, returns, cost of products, and more.”

The customer knows best: AtlantiCare

TQM isn’t an easy management strategy to introduce into a business; in fact, many attempts tend to fall flat. More often than not, it’s because firms maintain natural barriers to full involvement. Middle managers, for example, tend to complain that their authority is being challenged when boots on the ground are encouraged to speak up in the early stages of TQM. Yet in a culture of constant quality enhancement, the views of any given workforce are invaluable.

AtlantiCare in numbers

5,000 Employees

$280m Revenues before quality improvement strategy was implemented

$650m Revenues after quality improvement strategy

One firm that’s proven the merit of TQM is New Jersey-based healthcare provider AtlantiCare. Managing 5,000 employees at 25 locations, AtlantiCare is a serious business that’s boasted a respectable turnaround for nearly two decades. Yet in order to increase that margin further still, managers wanted to implement improvements across the board. Because patient satisfaction is the single-most important aspect of the healthcare industry, engaging in a renewed campaign of TQM proved a natural fit. The firm chose to adopt a ‘plan-do-check-act’ cycle, revealing gaps in staff communication – which subsequently meant longer patient waiting times and more complaints. To tackle this, managers explored a sideways method of internal communications. Instead of information trickling down from top-to-bottom, all of the company’s employees were given freedom to provide vital feedback at each and every level.

AtlantiCare decided to ensure all new employees understood this quality culture from the outset. At orientation, staff now receive a crash course in the company’s performance excellence framework – a management system that organises the firm’s processes into five key areas: quality, customer service, people and workplace, growth and financial performance. As employees rise through the ranks, this emphasis on improvement follows, so managers can operate within the company’s tight-loose-tight process management style.

After creating benchmark goals for employees to achieve at all levels – including better engagement at the point of delivery, increasing clinical communication and identifying and prioritising service opportunities – AtlantiCare was able to thrive. The number of repeat customers at the firm tripled, and its market share hit a six-year high. Profits unsurprisingly followed. The firm’s revenues shot up from $280m to $650m after implementing the quality improvement strategies, and the number of patients being serviced dwarfed state numbers.

Hitting the right notes: Santa Cruz Guitar Co

For companies further removed from the long-term satisfaction of customers, it’s easier to let quality control slide. Yet there are plenty of ways in which growing manufacturers can pursue both quality and sales volumes simultaneously. Artisan instrument maker the Santa Cruz Guitar Co (SCGC) provides a salient example. Although the California-based company is still a small-scale manufacturing operation, SCGC has grown in recent years from a basement operation to a serious business.

SCGC in numbers

14 Craftsmen employed by SCGC

800 Custom guitars produced each year

Owner Dan Roberts now employs 14 expert craftsmen, who create over 800 custom guitars each year. In order to ensure the continued quality of his instruments, Roberts has created an environment that improves with each sale. To keep things efficient (as TQM must), the shop floor is divided into six workstations in which guitars are partially assembled and then moved to the next station. Each bench is manned by a senior craftsman, and no guitar leaves that builder’s station until he is 100 percent happy with its quality. The process is akin to a traditional assembly line; however, unlike in a traditional, top-to-bottom factory, Roberts is intimately involved in all phases of instrument construction.

Utilising this doting method of quality management, it’s difficult to see how customers wouldn’t be satisfied with the artists’ work. Yet even if there were issues, Roberts and other senior managers also spend much of their days personally answering web queries about the instruments. According to the managers, customers tend to be pleasantly surprised to find the company’s senior leaders are the ones answering their technical questions and concerns. While Roberts has no intention of taking his manufacturing company to industrial heights, the quality of his instruments and the high levels of customer satisfaction speak for themselves; the company currently boasts a lengthy backlog of orders.

A quality education: Ramaiah Institute of Management Studies

Although it may appear easier to find success with TQM at a boutique-sized endeavour, the philosophy’s principles hold true in virtually every sector. Educational institutions, for example, have utilised quality management in much the same way – albeit to tackle decidedly different problems.

The global financial crisis hit higher education harder than many might have expected, and nowhere have the odds stacked higher than in India. The nation is home to one of the world’s fastest-growing markets for business education. Yet over recent years, the relevance of business education in India has come into question; a report by one recruiter recently asserted just one in four Indian MBAs were adequately prepared for the business world.

RIMS in numbers

9% Increase in test scores post total quality management strategy

22% Increase in number of recruiters hiring from the school

$20,000 Increase in the salary offered to graduates

$50,000 Rise in placement revenue

At the Ramaiah Institute of Management Studies (RIMS) in Bangalore, recruiters and accreditation bodies specifically called into question the quality of students’ educations. Although the relatively small school has always struggled to compete with India’s renowned Xavier Labour Relations Institute, the faculty finally began to notice clear hindrances to the success of graduates. The RIMS board decided it was time for a serious reassessment of quality management.

The school nominated Chief Academic Advisor Dr Krishnamurthy to head a volunteer team that would audit, analyse and implement process changes that would improve quality throughout (all in a particularly academic fashion). The team was tasked with looking at three key dimensions: assurance of learning, research and productivity, and quality of placements. Each member underwent extensive training to learn about action plans, quality auditing skills and continuous improvement tools – such as the ‘plan-do-study-act’ cycle.

Once faculty members were trained, the team’s first task was to identify the school’s key stakeholders, processes and their importance at the institute. Unsurprisingly, the most vital processes were identified as student intake, research, knowledge dissemination, outcomes evaluation and recruiter acceptance. From there, Krishnamurthy’s team used a fishbone diagram to help identify potential root causes of the issues plaguing these vital processes. To illustrate just how bad things were at the school, the team selected control groups and administered domain-based knowledge tests.

The deficits were disappointing. RIMS students’ knowledge base was rated at just 36 percent, while students at Harvard rated 95 percent. Likewise, students’ critical thinking abilities rated nine percent, versus 93 percent at MIT. Worse yet, the mean salaries of graduating students averaged $36,000, versus $150,000 for students from Kellogg. Krishnamurthy’s team had their work cut out.

To tackle these issues, Krishnamurthy created an employability team, developed strategic architecture and designed pilot studies to improve the school’s curriculum and make it more competitive. In order to do so, he needed absolutely every employee and student on board – and there was some resistance at the outset. Yet the educator asserts it didn’t actually take long to convince the school’s stakeholders the changes were extremely beneficial.

“Once students started seeing the results, buy-in became complete and unconditional,” he says. Acceptance was also achieved by maintaining clearer levels of communication with stakeholders: the school started to provide them with detailed plans and projections. Then it proceeded with a variety of new methods, such as incorporating case studies into the curriculum, which increased general test scores by almost 10 percent. Administrators also introduced a mandate that students must be certified in English by the British Council – increasing scores from 42 percent to 51 percent.

With those test scores improved, the perceived quality of RIMS rose sharply. The number of top 100 businesses recruiting from the school shot up by 22 percent, while the average salary offered to graduates increased by $20,000. Placement revenue rose by an impressive $50,000, and RIMS has since climbed both domestic and international education rankings.

No matter the business, total quality management can and will work. Yet this philosophical take on quality control only pays off for firms that are in it for the long haul. Every employee must be in tune with the company’s ideology and desire to improve, and customer satisfaction must reign supreme.


Study Quality Assessment Tools

In 2013, NHLBI developed a set of tailored quality assessment tools to assist reviewers in focusing on concepts that are key to a study’s internal validity. The tools were specific to certain study designs and tested for potential flaws in study methods or implementation. Experts used the tools during the systematic evidence review process to update existing clinical guidelines, such as those on cholesterol, blood pressure, and obesity. Their findings are outlined in the following reports:

  • Assessing Cardiovascular Risk: Systematic Evidence Review from the Risk Assessment Work Group
  • Management of Blood Cholesterol in Adults: Systematic Evidence Review from the Cholesterol Expert Panel
  • Management of Blood Pressure in Adults: Systematic Evidence Review from the Blood Pressure Expert Panel
  • Managing Overweight and Obesity in Adults: Systematic Evidence Review from the Obesity Expert Panel

While these tools have not been independently published and would not be considered standardized, they may be useful to the research community. These reports describe how experts used the tools for the project. Researchers may want to use the tools for their own projects; however, they would need to determine their own parameters for making judgements. Details about the design and application of the tools are included in Appendix A of the reports.

Quality Assessment of Controlled Intervention Studies - Study Quality Assessment Tools

Abbreviations used in the tool’s response options: CD, cannot determine; NA, not applicable; NR, not reported.

Guidance for Assessing the Quality of Controlled Intervention Studies

The guidance document below is organized by question number from the tool for quality assessment of controlled intervention studies.

Question 1. Described as randomized

Was the study described as randomized? A study does not satisfy quality criteria as randomized simply because the authors call it randomized; however, this is a first step in determining whether a study is randomized.

Questions 2 and 3. Treatment allocation–two interrelated pieces

Adequate randomization: Randomization is adequate if it occurred according to the play of chance (e.g., a computer-generated sequence in more recent studies, or a random number table in older studies). Inadequate randomization: Randomization is inadequate if there is a preset plan (e.g., alternation, where every other subject is assigned to the treatment arm, or another method of allocation is used, such as time or day of hospital admission or clinic visit, ZIP Code, phone number, etc.). In fact, this is not randomization at all–it is another method of assignment to groups. If assignment is not by the play of chance, then the answer to this question is no. There may be some tricky scenarios that will need to be read carefully and considered for the role of chance in assignment. For example, randomization may occur at the site level, where all individuals at a particular site are assigned to receive treatment or no treatment. This scenario is used for group-randomized trials, which can be truly randomized, but often are "quasi-experimental" studies with comparison groups rather than true control groups. (Few, if any, group-randomized trials are anticipated for this evidence review.)
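For illustration only, here is a minimal sketch (not part of the NHLBI tool) contrasting allocation by the play of chance with alternation; the group labels and sample size are invented:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10  # hypothetical number of participants

# Adequate: a computer-generated random allocation sequence.
random_allocation = list(rng.permutation(["treatment"] * (n // 2) + ["control"] * (n // 2)))

# Inadequate: alternation is a preset plan, not randomization.
alternation = ["treatment" if i % 2 == 0 else "control" for i in range(n)]

print(random_allocation)
print(alternation)
```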

Allocation concealment: This means that one does not know in advance, or cannot guess accurately, to what group the next person eligible for randomization will be assigned. Methods include sequentially numbered opaque sealed envelopes, numbered or coded containers, central randomization by a coordinating center, computer-generated randomization that is not revealed ahead of time, etc.

Questions 4 and 5. Blinding

Blinding means that one does not know to which group–intervention or control–the participant is assigned. It is also sometimes called "masking." The reviewer assessed whether each of the following was blinded to knowledge of treatment assignment: (1) the person assessing the primary outcome(s) for the study (e.g., taking the measurements such as blood pressure, examining health records for events such as myocardial infarction, reviewing and interpreting test results such as x ray or cardiac catheterization findings); (2) the person receiving the intervention (e.g., the patient or other study participant); and (3) the person providing the intervention (e.g., the physician, nurse, pharmacist, dietitian, or behavioral interventionist).

Generally placebo-controlled medication studies are blinded to patient, provider, and outcome assessors; behavioral, lifestyle, and surgical studies are examples of studies that are frequently blinded only to the outcome assessors because blinding of the persons providing and receiving the interventions is difficult in these situations. Sometimes the individual providing the intervention is the same person performing the outcome assessment. This was noted when it occurred.

Question 6. Similarity of groups at baseline

This question relates to whether the intervention and control groups have similar baseline characteristics on average, especially those characteristics that may affect the intervention or outcomes. The point of randomized trials is to create groups that are as similar as possible except for the intervention(s) being studied, in order to compare the effects of the interventions between groups. When reviewers abstracted baseline characteristics, they noted when there was a significant difference between groups. Baseline characteristics for intervention groups are usually presented in a table in the article (often Table 1).

Groups can differ at baseline without raising red flags if: (1) the differences would not be expected to have any bearing on the interventions and outcomes; or (2) the differences are not statistically significant. When concerned about baseline difference in groups, reviewers recorded them in the comments section and considered them in their overall determination of the study quality.

Questions 7 and 8. Dropout

"Dropouts" in a clinical trial are individuals for whom there are no end point measurements, often because they dropped out of the study and were lost to followup.

Generally, an acceptable overall dropout rate is considered 20 percent or less of participants who were randomized or allocated into each group. An acceptable differential dropout rate is an absolute difference between groups of 15 percentage points at most (calculated as the absolute difference between the two groups' dropout rates). However, these are general guidelines. Lower overall dropout rates are expected in shorter studies, whereas higher overall dropout rates may be acceptable for studies of longer duration. For example, a 6-month study of weight loss interventions should be expected to have nearly 100 percent followup (almost no dropouts–nearly everybody gets their weight measured regardless of whether or not they actually received the intervention), whereas a 10-year study testing the effects of intensive blood pressure lowering on heart attacks may be acceptable if there is a 20-25 percent dropout rate, especially if the dropout rate between groups was similar. The panels for the NHLBI systematic reviews may set different levels of dropout caps.

Conversely, differential dropout rates are not flexible; there is a cap of 15 percentage points. If the differential dropout rate between arms is 15 percentage points or higher, then there is a serious potential for bias. This constitutes a fatal flaw, resulting in a poor quality rating for the study.
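As a minimal sketch of the arithmetic above (the function name and example counts are hypothetical, not part of the tool):

```python
def dropout_flags(randomized_a, completed_a, randomized_b, completed_b):
    """Apply the general dropout guidance: <=20% overall, <15 points differential."""
    rate_a = 1 - completed_a / randomized_a
    rate_b = 1 - completed_b / randomized_b
    overall = 1 - (completed_a + completed_b) / (randomized_a + randomized_b)
    differential = abs(rate_a - rate_b)  # absolute difference between arm dropout rates
    return {
        "overall_dropout": round(overall, 3),
        "differential_dropout": round(differential, 3),
        "overall_acceptable": overall <= 0.20,
        "fatal_flaw": differential >= 0.15,  # the 15-percentage-point cap
    }

# Example: 250 randomized per arm; 205 complete in arm A, 190 in arm B.
print(dropout_flags(250, 205, 250, 190))
```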

Question 9. Adherence

Did participants in each treatment group adhere to the protocols for assigned interventions? For example, if Group 1 was assigned to 10 mg/day of Drug A, did most of them take 10 mg/day of Drug A? Another example is a study evaluating the difference between a 30-pound weight loss and a 10-pound weight loss on specific clinical outcomes (e.g., heart attacks), but the 30-pound weight loss group did not achieve its intended weight loss target (e.g., the group only lost 14 pounds on average). A third example is whether a large percentage of participants assigned to one group "crossed over" and got the intervention provided to the other group. A final example is when one group that was assigned to receive a particular drug at a particular dose had a large percentage of participants who did not end up taking the drug or the dose as designed in the protocol.

Question 10. Avoid other interventions

Changes that occur in the study outcomes being assessed should be attributable to the interventions being compared in the study. If study participants receive interventions that are not part of the study protocol and could affect the outcomes being assessed, and they receive these interventions differentially, then there is cause for concern because these interventions could bias results. The following scenario is another example of how bias can occur. In a study comparing two different dietary interventions on serum cholesterol, one group had a significantly higher percentage of participants taking statin drugs than the other group. In this situation, it would be impossible to know if a difference in outcome was due to the dietary intervention or the drugs.

Question 11. Outcome measures assessment

What tools or methods were used to measure the outcomes in the study? Were the tools and methods accurate and reliable–for example, have they been validated, or are they objective? This is important as it indicates the confidence you can have in the reported outcomes. Perhaps even more important is ascertaining that outcomes were assessed in the same manner within and between groups. One example of differing methods is self-report of dietary salt intake versus urine testing for sodium content (a more reliable and valid assessment method). Another example is using BP measurements taken by practitioners who use their usual methods versus using BP measurements done by individuals trained in a standard approach. Such an approach may include using the same instrument each time and taking an individual's BP multiple times. In each of these cases, the answer to this assessment question would be "no" for the former scenario and "yes" for the latter. In addition, a study in which an intervention group was seen more frequently than the control group, enabling more opportunities to report clinical events, would not be considered reliable and valid.

Question 12. Power calculation

Generally, a study's methods section will address the sample size needed to detect differences in primary outcomes. The current standard is at least 80 percent power to detect a clinically relevant difference in an outcome using a two-sided alpha of 0.05. Often, however, older studies will not report on power.
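For example, a sketch of such a calculation using statsmodels (the effect size of 0.5 is an assumed, illustrative value, not a standard from the tool):

```python
from statsmodels.stats.power import TTestIndPower

# Sample size per group for 80% power at a two-sided alpha of 0.05,
# to detect an assumed standardized effect size of d = 0.5.
n_per_group = TTestIndPower().solve_power(
    effect_size=0.5, alpha=0.05, power=0.80, alternative="two-sided"
)
print(f"~{n_per_group:.0f} participants per group needed")  # roughly 64
```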

Question 13. Prespecified outcomes

Investigators should prespecify outcomes reported in a study for hypothesis testing–which is the reason for conducting an RCT. Without prespecified outcomes, the study may be reporting ad hoc analyses, simply looking for differences supporting desired findings. Investigators also should prespecify subgroups being examined. Most RCTs conduct numerous post hoc analyses as a way of exploring findings and generating additional hypotheses. The intent of this question is to give more weight to reports that are not simply exploratory in nature.

Question 14. Intention-to-treat analysis

Intention-to-treat (ITT) means everybody who was randomized is analyzed according to the original group to which they were assigned. This is an extremely important concept because conducting an ITT analysis preserves the whole reason for doing a randomized trial; that is, to compare groups that differ only in the intervention being tested. When the ITT philosophy is not followed, the groups being compared may no longer be the same. In this situation, the study would likely be rated poor. However, if an investigator used another type of analysis that could be viewed as valid, this would be explained in the "other" box on the quality assessment form. Some researchers use a completers analysis (an analysis of only the participants who completed the intervention and the study), which introduces significant potential for bias. Characteristics of participants who do not complete the study are unlikely to be the same as those who do. The likely impact of participants withdrawing from a study treatment must be considered carefully. ITT analysis provides a more conservative (potentially less biased) estimate of effectiveness.
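A small sketch of the distinction, with invented data and column names:

```python
import pandas as pd

df = pd.DataFrame({
    "assigned_group": ["A", "A", "A", "B", "B", "B"],
    "completed":      [True, True, False, True, False, False],
    "event":          [1, 0, 0, 1, 1, 0],  # outcome observed for everyone
})

# ITT: everyone analyzed by the group they were randomized to.
itt = df.groupby("assigned_group")["event"].mean()

# Completers-only: drops non-completers, inviting bias.
completers = df[df["completed"]].groupby("assigned_group")["event"].mean()

print(itt, completers, sep="\n")
```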

General Guidance for Determining the Overall Quality Rating of Controlled Intervention Studies

The questions on the assessment tool were designed to help reviewers focus on the key concepts for evaluating a study's internal validity. They are not intended to create a list that is simply tallied up to arrive at a summary judgment of quality.

Internal validity is the extent to which the results (effects) reported in a study can truly be attributed to the intervention being evaluated and not to flaws in the design or conduct of the study–in other words, the ability for the study to make causal conclusions about the effects of the intervention being tested. Such flaws can increase the risk of bias. Critical appraisal involves considering the risk of potential for allocation bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed in the questions above. High risk of bias translates to a rating of poor quality. Low risk of bias translates to a rating of good quality.

Fatal flaws: If a study has a "fatal flaw," then risk of bias is significant, and the study is of poor quality. Examples of fatal flaws in RCTs include high dropout rates, high differential dropout rates, and the absence of an ITT analysis or use of another unsuitable statistical analysis (e.g., a completers-only analysis).

Generally, when evaluating a study, one will not see a "fatal flaw;" however, one will find some risk of bias. During training, reviewers were instructed to look for the potential for bias in studies by focusing on the concepts underlying the questions in the tool. For any box checked "no," reviewers were told to ask: "What is the potential risk of bias that may be introduced by this flaw?" That is, does this factor cause one to doubt the results that were reported in the study?

NHLBI staff provided reviewers with background reading on critical appraisal, while emphasizing that the best approach to use is to think about the questions in the tool in determining the potential for bias in a study. The staff also emphasized that each study has specific nuances; therefore, reviewers should familiarize themselves with the key concepts.

Quality Assessment of Systematic Reviews and Meta-Analyses - Study Quality Assessment Tools

Guidance for Quality Assessment Tool for Systematic Reviews and Meta-Analyses

A systematic review is a study that attempts to answer a question by synthesizing the results of primary studies while using strategies to limit bias and random error. [424] These strategies include a comprehensive search of all potentially relevant articles and the use of explicit, reproducible criteria in the selection of articles included in the review. Research designs and study characteristics are appraised, data are synthesized, and results are interpreted using a predefined systematic approach that adheres to evidence-based methodological principles.

Systematic reviews can be qualitative or quantitative. A qualitative systematic review summarizes the results of the primary studies but does not combine the results statistically. A quantitative systematic review, or meta-analysis, is a type of systematic review that employs statistical techniques to combine the results of the different studies into a single pooled estimate of effect, often given as an odds ratio. The guidance document below is organized by question number from the tool for quality assessment of systematic reviews and meta-analyses.
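To make the idea of a pooled estimate concrete, here is a minimal fixed-effect (inverse-variance) sketch with made-up study results; it illustrates the general technique, not any particular review's method:

```python
import numpy as np

log_or = np.log([1.8, 1.4, 2.1, 1.1])    # hypothetical per-study odds ratios
se = np.array([0.30, 0.25, 0.40, 0.20])  # their standard errors
w = 1 / se**2                            # inverse-variance weights

pooled = np.sum(w * log_or) / np.sum(w)
pooled_se = np.sqrt(1 / np.sum(w))
ci = np.exp(pooled + np.array([-1.96, 1.96]) * pooled_se)
print(f"pooled OR = {np.exp(pooled):.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```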

Question 1. Focused question

The review should be based on a question that is clearly stated and well-formulated. An example would be a question that uses the PICO (population, intervention, comparator, outcome) format, with all components clearly described.

Question 2. Eligibility criteria

The eligibility criteria used to determine whether studies were included or excluded should be clearly specified and predefined. It should be clear to the reader why studies were included or excluded.

Question 3. Literature search

The search strategy should employ a comprehensive, systematic approach in order to capture all of the evidence possible that pertains to the question of interest. At a minimum, a comprehensive review has the following attributes:

  • Electronic searches were conducted using multiple scientific literature databases, such as MEDLINE, EMBASE, Cochrane Central Register of Controlled Trials, PsychLit, and others as appropriate for the subject matter.
  • Manual searches of references found in articles and textbooks should supplement the electronic searches.

Additional search strategies that may be used to improve the yield include the following:

  • Studies published in other countries
  • Studies published in languages other than English
  • Identification by experts in the field of studies and articles that may have been missed
  • Search of grey literature, including technical reports and other papers from government agencies or scientific groups or committees; presentations and posters from scientific meetings, conference proceedings, unpublished manuscripts; and others. Searching the grey literature is important (whenever feasible) because sometimes only positive studies with significant findings are published in the peer-reviewed literature, which can bias the results of a review.

In their reviews, researchers described the literature search strategy clearly and verified that it could be reproduced by others with similar results.

Question 4. Dual review for determining which studies to include and exclude

Titles, abstracts, and full-text articles (when indicated) should be reviewed by two independent reviewers to determine which studies to include and exclude in the review. Reviewers resolved disagreements through discussion and consensus or with third parties. They clearly stated the review process, including methods for settling disagreements.

Question 5. Quality appraisal for internal validity

Each included study should be appraised for internal validity (study quality assessment) using a standardized approach for rating the quality of the individual studies. Ideally, at least two independent reviewers should appraise each study for internal validity. However, there is not one commonly accepted, standardized tool for rating the quality of studies. So, in the research papers, reviewers looked for an assessment of the quality of each study and a clear description of the process used.

Question 6. List and describe included studies

All included studies were listed in the review, along with descriptions of their key characteristics. This was presented either in narrative or table format.

Question 7. Publication bias

Publication bias is a term used when studies with positive results have a higher likelihood of being published, being published rapidly, being published in higher impact journals, being published in English, being published more than once, or being cited by others. [425,426] Publication bias can be linked to favorable or unfavorable treatment of research findings due to investigators, editors, industry, commercial interests, or peer reviewers. To minimize the potential for publication bias, researchers can conduct a comprehensive literature search that includes the strategies discussed in Question 3.

A funnel plot–a scatter plot of component studies in a meta-analysis–is a commonly used graphical method for detecting publication bias. If there is no significant publication bias, the graph looks like a symmetrical inverted funnel.
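A hedged sketch of such a plot with simulated data (matplotlib); real reviews plot each component study's effect estimate against its standard error or precision:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
true_effect = 0.4
se = rng.uniform(0.05, 0.5, 60)        # varying study precision
effects = rng.normal(true_effect, se)  # observed effects scatter more as SE grows

plt.scatter(effects, se)
plt.gca().invert_yaxis()               # large, precise studies appear at the top
plt.axvline(true_effect, linestyle="--")
plt.xlabel("Effect estimate")
plt.ylabel("Standard error")
plt.title("Funnel plot: symmetry suggests little publication bias")
plt.show()
```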

Reviewers assessed and clearly described the likelihood of publication bias.

Question 8. Heterogeneity

Heterogeneity is used to describe important differences in studies included in a meta-analysis that may make it inappropriate to combine the studies. [427] Heterogeneity can be clinical (e.g., important differences between study participants, baseline disease severity, and interventions); methodological (e.g., important differences in the design and conduct of the study); or statistical (e.g., important differences in the quantitative results or reported effects).

Researchers usually assess clinical or methodological heterogeneity qualitatively by determining whether it makes sense to combine studies. For example:

  • Should a study evaluating the effects of an intervention on CVD risk that involves elderly male smokers with hypertension be combined with a study that involves healthy adults ages 18 to 40? (Clinical Heterogeneity)
  • Should a study that uses a randomized controlled trial (RCT) design be combined with a study that uses a case-control study design? (Methodological Heterogeneity)

Statistical heterogeneity describes the degree of variation in the effect estimates from a set of studies; it is assessed quantitatively. The two most common methods used to assess statistical heterogeneity are Cochran's Q test (a chi-squared test) and the I² statistic.
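Continuing the illustrative meta-analysis numbers from the sketch above, a minimal computation of both quantities:

```python
import numpy as np
from scipy import stats

log_or = np.log([1.8, 1.4, 2.1, 1.1])   # hypothetical per-study odds ratios
se = np.array([0.30, 0.25, 0.40, 0.20])
w = 1 / se**2

pooled = np.sum(w * log_or) / np.sum(w)
q = np.sum(w * (log_or - pooled) ** 2)  # Cochran's Q
df = len(log_or) - 1
p_value = stats.chi2.sf(q, df)
i2 = max(0.0, (q - df) / q) * 100       # share of variation beyond chance

print(f"Q = {q:.2f} (df = {df}, p = {p_value:.3f}), I^2 = {i2:.0f}%")
```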

Reviewers examined studies to determine if an assessment for heterogeneity was conducted and clearly described. If the studies are found to be heterogeneous, the investigators should explore and explain the causes of the heterogeneity, and determine what influence, if any, the study differences had on overall study results.

Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies - Study Quality Assessment Tools

Guidance for Assessing the Quality of Observational Cohort and Cross-Sectional Studies

The guidance document below is organized by question number from the tool for quality assessment of observational cohort and cross-sectional studies.

Question 1. Research question

Did the authors describe their goal in conducting this research? Is it easy to understand what they were looking to find? This issue is important for any scientific paper of any type. Higher quality scientific research explicitly defines a research question.

Questions 2 and 3. Study population

Did the authors describe the group of people from which the study participants were selected or recruited, using demographics, location, and time period? If you were to conduct this study again, would you know who to recruit, from where, and from what time period? Is the cohort population free of the outcomes of interest at the time they were recruited?

An example would be men over 40 years old with type 2 diabetes who began seeking medical care at Phoenix Good Samaritan Hospital between January 1, 1990 and December 31, 1994. In this example, the population is clearly described as: (1) who (men over 40 years old with type 2 diabetes); (2) where (Phoenix Good Samaritan Hospital); and (3) when (between January 1, 1990 and December 31, 1994). Another example is women ages 34 to 59 in 1980 who were in the nursing profession and had no known coronary disease, stroke, cancer, hypercholesterolemia, or diabetes, and were recruited from the 11 most populous States, with contact information obtained from State nursing boards.

In cohort studies, it is crucial that the population at baseline is free of the outcome of interest. For example, the nurses' population above would be an appropriate group in which to study incident coronary disease. This information is usually found either in descriptions of population recruitment, definitions of variables, or inclusion/exclusion criteria.

You may need to look at prior papers on methods in order to make the assessment for this question. Those papers are usually in the reference list.

If fewer than 50% of eligible persons participated in the study, then there is concern that the study population does not adequately represent the target population. This increases the risk of bias.

Question 4. Groups recruited from the same population and uniform eligibility criteria

Were the inclusion and exclusion criteria developed prior to recruitment or selection of the study population? Were the same underlying criteria used for all of the subjects involved? This issue is related to the description of the study population, above, and you may find the information for both of these questions in the same section of the paper.

Most cohort studies begin with the selection of the cohort; participants in this cohort are then measured or evaluated to determine their exposure status. However, some cohort studies may recruit or select exposed participants in a different time or place than unexposed participants, especially retrospective cohort studies–which is when data are obtained from the past (retrospectively), but the analysis examines exposures prior to outcomes. For example, one research question could be whether diabetic men with clinical depression are at higher risk for cardiovascular disease than those without clinical depression. So, diabetic men with depression might be selected from a mental health clinic, while diabetic men without depression might be selected from an internal medicine or endocrinology clinic. This study recruits groups from different clinic populations, so this example would get a "no."

However, the women nurses described in the question above were selected based on the same inclusion/exclusion criteria, so that example would get a "yes."

Question 5. Sample size justification

Did the authors present their reasons for selecting or recruiting the number of people included or analyzed? Do they note or discuss the statistical power of the study? This question is about whether or not the study had enough participants to detect an association if one truly existed.

A paragraph in the methods section of the article may explain the sample size needed to detect a hypothesized difference in outcomes. You may also find a discussion of power in the discussion section (such as the study had 85 percent power to detect a 20 percent increase in the rate of an outcome of interest, with a 2-sided alpha of 0.05). Sometimes estimates of variance and/or estimates of effect size are given, instead of sample size calculations. In any of these cases, the answer would be "yes."

However, observational cohort studies often do not report anything about power or sample sizes because the analyses are exploratory in nature. In this case, the answer would be "no." This is not a "fatal flaw." It just may indicate that attention was not paid to whether the study was sufficiently sized to answer a prespecified question–i.e., it may have been an exploratory, hypothesis-generating study.

Question 6. Exposure assessed prior to outcome measurement

This question is important because, in order to determine whether an exposure causes an outcome, the exposure must come before the outcome.

For some prospective cohort studies, the investigator enrolls the cohort and then determines the exposure status of various members of the cohort (large epidemiological studies like Framingham used this approach). However, for other cohort studies, the cohort is selected based on its exposure status, as in the example above of depressed diabetic men (the exposure being depression). Other examples include a cohort identified by its exposure to fluoridated drinking water and then compared to a cohort living in an area without fluoridated water, or a cohort of military personnel exposed to combat in the Gulf War compared to a cohort of military personnel not deployed in a combat zone.

With either of these types of cohort studies, the cohort is followed forward in time (i.e., prospectively) to assess the outcomes that occurred in the exposed members compared to nonexposed members of the cohort. Therefore, you begin the study in the present by looking at groups that were exposed (or not) to some biological or behavioral factor, intervention, etc., and then you follow them forward in time to examine outcomes. If a cohort study is conducted properly, the answer to this question should be "yes," since the exposure status of members of the cohort was determined at the beginning of the study before the outcomes occurred.

For retrospective cohort studies, the same principle applies. The difference is that, rather than identifying a cohort in the present and following them forward in time, the investigators go back in time (i.e., retrospectively) and select a cohort based on their exposure status in the past and then follow them forward to assess the outcomes that occurred in the exposed and nonexposed cohort members. Because in retrospective cohort studies the exposure and outcomes may have already occurred (it depends on how long they follow the cohort), it is important to make sure that the exposure preceded the outcome.

Sometimes cross-sectional studies are conducted (or cross-sectional analyses of cohort-study data), where the exposures and outcomes are measured during the same timeframe. As a result, cross-sectional analyses provide weaker evidence than regular cohort studies regarding a potential causal relationship between exposures and outcomes. For cross-sectional analyses, the answer to Question 6 should be "no."

Question 7. Sufficient timeframe to see an effect

Did the study allow enough time for a sufficient number of outcomes to occur or be observed, or enough time for an exposure to have a biological effect on an outcome? In the examples given above, if clinical depression has a biological effect on increasing risk for CVD, such an effect may take years. In the other example, if higher dietary sodium increases BP, a short timeframe may be sufficient to assess its association with BP, but a longer timeframe would be needed to examine its association with heart attacks.

The issue of timeframe is important to enable meaningful analysis of the relationships between exposures and outcomes to be conducted. This often requires at least several years, especially when looking at health outcomes, but it depends on the research question and outcomes being examined.

Cross-sectional analyses allow no time to see an effect, since the exposures and outcomes are assessed at the same time, so those would get a "no" response.

Question 8. Different levels of the exposure of interest

If the exposure can be defined as a range (examples: drug dosage, amount of physical activity, amount of sodium consumed), were multiple categories of that exposure assessed? (for example, for drugs: not on the medication, on a low dose, medium dose, high dose; for dietary sodium, higher than average U.S. consumption, lower than recommended consumption, between the two). Sometimes discrete categories of exposure are not used, but instead exposures are measured as continuous variables (for example, mg/day of dietary sodium or BP values).

In any case, studying different levels of exposure (where possible) enables investigators to assess trends or dose-response relationships between exposures and outcomes–e.g., the higher the exposure, the greater the rate of the health outcome. The presence of trends or dose-response relationships lends credibility to the hypothesis of causality between exposure and outcome.

For some exposures, however, this question may not be applicable (e.g., the exposure may be a dichotomous variable like living in a rural setting versus an urban setting, or vaccinated/not vaccinated with a one-time vaccine). If there are only two possible exposures (yes/no), then this question should be given an "NA," and it should not count negatively towards the quality rating.

Question 9. Exposure measures and assessment

Were the exposure measures defined in detail? Were the tools or methods used to measure exposure accurate and reliable–for example, have they been validated or are they objective? This issue is important as it influences confidence in the reported exposures. When exposures are measured with less accuracy or validity, it is harder to see an association between exposure and outcome even if one exists. Also as important is whether the exposures were assessed in the same manner within groups and between groups; if not, bias may result.

For example, retrospective self-report of dietary salt intake is not as valid and reliable as prospectively using a standardized dietary log plus testing participants' urine for sodium content. Another example is measurement of BP, where there may be quite a difference between usual care, where clinicians measure BP however it is done in their practice setting (which can vary considerably), and use of trained BP assessors using standardized equipment (e.g., the same BP device which has been tested and calibrated) and a standardized protocol (e.g., patient is seated for 5 minutes with feet flat on the floor, BP is taken twice in each arm, and all four measurements are averaged). In each of these cases, the former would get a "no" and the latter a "yes."

Here is a final example that illustrates the point about why it is important to assess exposures consistently across all groups: If people with higher BP (exposed cohort) are seen by their providers more frequently than those without elevated BP (nonexposed group), it also increases the chances of detecting and documenting changes in health outcomes, including CVD-related events. Therefore, it may lead to the conclusion that higher BP leads to more CVD events. This may be true, but it could also be due to the fact that the subjects with higher BP were seen more often; thus, more CVD-related events were detected and documented simply because they had more encounters with the health care system. Thus, it could bias the results and lead to an erroneous conclusion.

Question 10. Repeated exposure assessment

Was the exposure for each person measured more than once during the course of the study period? Multiple measurements with the same result increase our confidence that the exposure status was correctly classified. Also, multiple measurements enable investigators to look at changes in exposure over time, for example, people who ate high dietary sodium throughout the followup period, compared to those who started out high then reduced their intake, compared to those who ate low sodium throughout. Once again, this may not be applicable in all cases. In many older studies, exposure was measured only at baseline. However, multiple exposure measurements do result in a stronger study design.

Question 11. Outcome measures

Were the outcomes defined in detail? Were the tools or methods for measuring outcomes accurate and reliable–for example, have they been validated or are they objective? This issue is important because it influences confidence in the validity of study results. Also important is whether the outcomes were assessed in the same manner within groups and between groups.

An example of an outcome measure that is objective, accurate, and reliable is death–the outcome measured with more accuracy than any other. But even with a measure as objective as death, there can be differences in the accuracy and reliability of how death was assessed by the investigators. Did they base it on an autopsy report, death certificate, death registry, or report from a family member? Another example is a study of whether dietary fat intake is related to blood cholesterol level (cholesterol level being the outcome), and the cholesterol level is measured from fasting blood samples that are all sent to the same laboratory. These examples would get a "yes." An example of a "no" would be self-report by subjects that they had a heart attack, or self-report of how much they weigh (if body weight is the outcome of interest).

Similar to the example in Question 9, results may be biased if one group (e.g., people with high BP) is seen more frequently than another group (people with normal BP) because more frequent encounters with the health care system increases the chances of outcomes being detected and documented.

Question 12. Blinding of outcome assessors

Blinding means that outcome assessors did not know whether the participant was exposed or unexposed. It is also sometimes called "masking." The objective is to look for evidence in the article that the person(s) assessing the outcome(s) for the study (for example, examining medical records to determine the outcomes that occurred in the exposed and comparison groups) is masked to the exposure status of the participant. Sometimes the person measuring the exposure is the same person conducting the outcome assessment. In this case, the outcome assessor would most likely not be blinded to exposure status because they also took measurements of exposures. If so, make a note of that in the comments section.

As you assess this criterion, think about whether it is likely that the person(s) doing the outcome assessment would know (or be able to figure out) the exposure status of the study participants. If the answer is no, then blinding is adequate. An example of adequate blinding of the outcome assessors is to create a separate committee, whose members were not involved in the care of the patient and had no information about the study participants' exposure status. The committee would then be provided with copies of participants' medical records, which had been stripped of any potential exposure information or personally identifiable information. The committee would then review the records for prespecified outcomes according to the study protocol. If blinding was not possible, which is sometimes the case, mark "NA" and explain the potential for bias.

Question 13. Followup rate

Higher overall followup rates are always better than lower followup rates, even though higher rates are expected in shorter studies, whereas lower overall followup rates are often seen in studies of longer duration. Usually, an acceptable overall followup rate is considered 80 percent or more of participants whose exposures were measured at baseline. However, this is just a general guideline. For example, a 6-month cohort study examining the relationship between dietary sodium intake and BP level may have over 90 percent followup, but a 20-year cohort study examining effects of sodium intake on stroke may have only a 65 percent followup rate.

Question 14. Statistical analyses

Were key potential confounding variables measured and adjusted for, such as by statistical adjustment for baseline differences? Logistic regression or other regression methods are often used to account for the influence of variables not of interest.

This is a key issue in cohort studies, because statistical analyses need to control for potential confounders, in contrast to an RCT, where the randomization process controls for potential confounders. All key factors that may be associated both with the exposure of interest and the outcome–that are not of interest to the research question–should be controlled for in the analyses.

For example, in a study of the relationship between cardiorespiratory fitness and CVD events (heart attacks and strokes), the study should control for age, BP, blood cholesterol, and body weight, because all of these factors are associated both with low fitness and with CVD events. Well-done cohort studies control for multiple potential confounders.
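As a hedged sketch of such an adjustment (simulated data; the variable names simply mirror the fitness/CVD example above and are not from the tool):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "fitness": rng.normal(0, 1, n),   # exposure of interest (standardized)
    "age": rng.normal(55, 8, n),      # potential confounders
    "sbp": rng.normal(130, 15, n),
    "chol": rng.normal(200, 30, n),
})
# Simulate events influenced by both the exposure and a confounder (age).
logit = -2 - 0.5 * df["fitness"] + 0.03 * (df["age"] - 55)
df["cvd_event"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Adjusted model: the fitness coefficient is estimated holding confounders fixed.
model = smf.logit("cvd_event ~ fitness + age + sbp + chol", data=df).fit(disp=0)
print(np.exp(model.params["fitness"]))  # adjusted odds ratio per 1 SD of fitness
```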

General Guidance for Determining the Overall Quality Rating of Observational Cohort and Cross-Sectional Studies

The questions on the form are designed to help you focus on the key concepts for evaluating the internal validity of a study. They are not intended to create a list that you simply tally up to arrive at a summary judgment of quality.

Internal validity for cohort studies is the extent to which the results reported in the study can truly be attributed to the exposure being evaluated and not to flaws in the design or conduct of the study–in other words, the ability of the study to draw associative conclusions about the effects of the exposures being studied on outcomes. Any such flaws can increase the risk of bias.

Critical appraisal involves considering the risk of potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues throughout the questions above. High risk of bias translates to a rating of poor quality. Low risk of bias translates to a rating of good quality. (Thus, the greater the risk of bias, the lower the quality rating of the study.)

In addition, the more attention in the study design to issues that can help determine whether there is a causal relationship between the exposure and outcome, the higher quality the study. These include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, sufficient timeframe to see an effect, and appropriate control for confounding–all concepts reflected in the tool.

Generally, when you evaluate a study, you will not see a "fatal flaw," but you will find some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, you should ask yourself about the potential for bias in the study you are critically appraising. For any box where you check "no" you should ask, "What is the potential risk of bias resulting from this flaw in study design or execution?" That is, does this factor cause you to doubt the results that are reported in the study or doubt the ability of the study to accurately assess an association between exposure and outcome?

The best approach is to think about the questions in the tool and how each one tells you something about the potential for bias in a study. The more you familiarize yourself with the key concepts, the more comfortable you will be with critical appraisal. Examples of studies rated good, fair, and poor are useful, but each study must be assessed on its own based on the details that are reported and consideration of the concepts for minimizing bias.

Quality Assessment of Case-Control Studies - Study Quality Assessment Tools

Guidance for Assessing the Quality of Case-Control Studies

The guidance document below is organized by question number from the tool for quality assessment of case-control studies.

Question 1. Research question

Did the authors describe their goal in conducting this research? Is it easy to understand what they were looking to find? This issue is important for any scientific paper of any type. High quality scientific research explicitly defines a research question.

Question 2. Study population

Did the authors describe the group of individuals from which the cases and controls were selected or recruited, using demographics, location, and time period? If the investigators conducted this study again, would they know exactly who to recruit, from where, and from what time period?

Investigators identify case-control study populations by location, time period, and inclusion criteria for cases (individuals with the disease, condition, or problem) and controls (individuals without the disease, condition, or problem). For example, the population for a study of lung cancer and chemical exposure would be all incident cases of lung cancer diagnosed in patients ages 35 to 79, from January 1, 2003 to December 31, 2008, living in Texas during that entire time period, as well as controls without lung cancer recruited from the same population during the same time period. The population is clearly described as: (1) who (men and women ages 35 to 79 with (cases) and without (controls) incident lung cancer); (2) where (living in Texas); and (3) when (between January 1, 2003 and December 31, 2008).

Other studies may use disease registries or data from cohort studies to identify cases. In these cases, the populations are individuals who live in the area covered by the disease registry or included in a cohort study (i.e., nested case-control or case-cohort). For example, a study of the relationship between vitamin D intake and myocardial infarction might use patients identified via the GRACE registry, a database of heart attack patients.

NHLBI staff encouraged reviewers to examine prior papers on methods (listed in the reference list) to make this assessment, if necessary.

Question 3. Target population and case representation

In order for a study to truly address the research question, the target population–the population from which the study population is drawn and to which study results are believed to apply–should be carefully defined. Some authors may compare characteristics of the study cases to characteristics of cases in the target population, either in text or in a table. When study cases are shown to be representative of cases in the appropriate target population, it increases the likelihood that the study was well-designed per the research question.

However, because these statistics are frequently difficult or impossible to measure, publications should not be penalized if case representation is not shown. For most papers, the response to question 3 will be "NR." Those subquestions are combined because the answer to the second subquestion–case representation–determines the response to this item. However, it cannot be determined without considering the response to the first subquestion. For example, if the answer to the first subquestion is "yes," and the second, "CD," then the response for item 3 is "CD."

Question 4. Sample size justification

Did the authors discuss their reasons for selecting or recruiting the number of individuals included? Did they discuss the statistical power of the study and provide a sample size calculation to ensure that the study is adequately powered to detect an association (if one exists)? This question does not refer to a description of the manner in which different groups were included or excluded using the inclusion/exclusion criteria (e.g., "Final study size was 1,378 participants after exclusion of 461 patients with missing data" is not considered a sample size justification for the purposes of this question).

An article's methods section usually contains information on sample size and the size needed to detect differences in exposures and on statistical power.

Question 5. Groups recruited from the same population

To determine whether cases and controls were recruited from the same population, one can ask hypothetically, "If a control was to develop the outcome of interest (the condition that was used to select cases), would that person have been eligible to become a case?" Case-control studies begin with the selection of the cases (those with the outcome of interest, e.g., lung cancer) and controls (those in whom the outcome is absent). Cases and controls are then evaluated and categorized by their exposure status. For the lung cancer example, cases and controls were recruited from hospitals in a given region. One may reasonably assume that controls in the catchment area for the hospitals, or those already in the hospitals for a different reason, would attend those hospitals if they became a case; therefore, the controls are drawn from the same population as the cases. If the controls were recruited or selected from a different region (e.g., a State other than Texas) or time period (e.g., 1991-2000), then the cases and controls were recruited from different populations, and the answer to this question would be "no."

The following example further explores selection of controls. In a study, eligible cases were men and women, ages 18 to 39, who were diagnosed with atherosclerosis at hospitals in Perth, Australia, between July 1, 2000 and December 31, 2007. Appropriate controls for these cases might be sampled using voter registration information for men and women ages 18 to 39, living in Perth (population-based controls); they also could be sampled from patients without atherosclerosis at the same hospitals (hospital-based controls). As long as the controls are individuals who would have been eligible to be included in the study as cases (if they had been diagnosed with atherosclerosis), then the controls were selected appropriately from the same source population as cases.

In a prospective case-control study, investigators may enroll individuals as cases at the time they are found to have the outcome of interest; the number of cases usually increases as time progresses. At this same time, they may recruit or select controls from the population without the outcome of interest. One way to identify or recruit cases is through a surveillance system. In turn, investigators can select controls from the population covered by that system. This is an example of population-based controls. Investigators also may identify and select cases from a cohort study population and identify controls from outcome-free individuals in the same cohort study. This is known as a nested case-control study.

Question 6. Inclusion and exclusion criteria prespecified and applied uniformly

Were the inclusion and exclusion criteria developed prior to recruitment or selection of the study population? Were the same underlying criteria used for all of the groups involved? To answer this question, reviewers determined if the investigators developed I/E criteria prior to recruitment or selection of the study population and if they used the same underlying criteria for all groups. The investigators should have used the same selection criteria, except for study participants who had the disease or condition, which would be different for cases and controls by definition. Therefore, the investigators use the same age (or age range), gender, race, and other characteristics to select cases and controls. Information on this topic is usually found in a paper's section on the description of the study population.

Question 7. Case and control definitions

For this question, reviewers looked for descriptions of the validity of case and control definitions and the processes or tools used to identify study participants as such. Was a specific description of "case" and "control" provided? Is there a discussion of the validity of the case and control definitions and the processes or tools used to identify study participants as such? Reviewers determined if the tools or methods were accurate, reliable, and objective. For example, cases might be identified as "adult patients admitted to a VA hospital from January 1, 2000 to December 31, 2009, with an ICD-9 discharge diagnosis code of acute myocardial infarction and at least one of two confirmatory findings in their medical records: at least 2 mm of ST elevation changes in two or more ECG leads and an elevated troponin level." Investigators might also use ICD-9 or CPT codes to identify patients. All cases should be identified using the same methods. Unless the distinction between cases and controls is accurate and reliable, investigators cannot use study results to draw valid conclusions.

Question 8. Random selection of study participants

If a case-control study did not use 100 percent of eligible cases and/or controls (e.g., not all disease-free participants were included as controls), did the authors indicate that random sampling was used to select controls? When it is possible to identify the source population fairly explicitly (e.g., in a nested case-control study, or in a registry-based study), then random sampling of controls is preferred. When investigators used consecutive sampling, which is frequently done for cases in prospective studies, then study participants are not considered randomly selected. In this case, the reviewers would answer "no" to Question 8. However, this would not be considered a fatal flaw.

If investigators included all eligible cases and controls as study participants, then reviewers marked "NA" in the tool. If 100 percent of cases were included (e.g., NA for cases) but only 50 percent of eligible controls, then the response would be "yes" if the controls were randomly selected, and "no" if they were not. If this cannot be determined, the appropriate response is "CD."
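As a minimal sketch of what random selection of controls can look like in practice (the data and column name are hypothetical, not part of the tool), controls can be drawn from the eligible pool with a fixed seed so the sampling is reproducible:

    import pandas as pd

    # Hypothetical pool of eligible, disease-free individuals.
    eligible = pd.DataFrame({"person_id": range(1000)})

    # Randomly sample 50 percent of the eligible pool as controls; a fixed
    # seed makes the selection reproducible and auditable.
    controls = eligible.sample(frac=0.5, random_state=42)
    print(len(controls))  # 500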

Question 9. Concurrent controls

A concurrent control is a control selected at the time another person became a case, usually on the same day. This means that one or more controls are recruited or selected from the population without the outcome of interest at the time a case is diagnosed. Investigators can use this method in both prospective case-control studies and retrospective case-control studies. For example, in a retrospective study of adenocarcinoma of the colon using data from hospital records, if hospital records indicate that Person A was diagnosed with adenocarcinoma of the colon on June 22, 2002, then investigators would select one or more controls from the population of patients without adenocarcinoma of the colon on that same day. This assumes they conducted the study retrospectively, using data from hospital records. The investigators could have also conducted this study using patient records from a cohort study, in which case it would be a nested case-control study.

Investigators can use concurrent controls in the presence or absence of matching and vice versa. A study that uses matching does not necessarily mean that concurrent controls were used.
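To make concurrent (risk-set) selection concrete, the following sketch, using hypothetical data and column names, draws one control per case from the individuals who were still outcome-free on that case's diagnosis date:

    import pandas as pd

    # Hypothetical data: one row per person; diagnosis_date is NaT for
    # people who never developed the outcome.
    people = pd.DataFrame({
        "person_id": [1, 2, 3, 4, 5, 6],
        "diagnosis_date": pd.to_datetime(
            ["2002-06-22", None, None, "2003-01-10", None, None]),
    })

    cases = people.dropna(subset=["diagnosis_date"])
    matched = []
    for case in cases.itertuples():
        # Controls are outcome-free on the case's diagnosis date: either
        # never diagnosed, or diagnosed only later.
        at_risk = people[
            (people["person_id"] != case.person_id)
            & (people["diagnosis_date"].isna()
               | (people["diagnosis_date"] > case.diagnosis_date))
        ]
        control = at_risk.sample(n=1, random_state=int(case.person_id))
        matched.append((case.person_id, control["person_id"].iloc[0]))

    print(matched)  # [(case_id, control_id), ...]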

Question 10. Exposure assessed prior to outcome measurement

Investigators first determine case or control status (based on presence or absence of outcome of interest), and then assess exposure history of the case or control; therefore, reviewers ascertained that the exposure preceded the outcome. For example, if the investigators used tissue samples to determine exposure, did they collect them from patients prior to their diagnosis? If hospital records were used, did investigators verify that the date a patient was exposed (e.g., received medication for atherosclerosis) occurred prior to the date they became a case (e.g., was diagnosed with type 2 diabetes)? For an association between an exposure and an outcome to be considered causal, the exposure must have occurred prior to the outcome.

Question 11. Exposure measures and assessment

Were the exposure measures defined in detail? Were the tools or methods used to measure exposure accurate and reliable–for example, have they been validated or are they objective? This is important, as it influences confidence in the reported exposures. Equally important is whether the exposures were assessed in the same manner within groups and between groups. This question pertains to bias resulting from exposure misclassification (i.e., exposure ascertainment).

For example, a retrospective self-report of dietary salt intake is not as valid and reliable as prospectively using a standardized dietary log plus testing participants' urine for sodium content because participants' retrospective recall of dietary salt intake may be inaccurate and result in misclassification of exposure status. Similarly, BP results from practices that use an established protocol for measuring BP would be considered more valid and reliable than results from practices that did not use standard protocols. A protocol may include using trained BP assessors, standardized equipment (e.g., the same BP device which has been tested and calibrated), and a standardized procedure (e.g., patient is seated for 5 minutes with feet flat on the floor, BP is taken twice in each arm, and all four measurements are averaged).

Question 12. Blinding of exposure assessors

For this item, blinding or masking means that the person assessing exposure did not know whether a participant was a case or a control. To answer this question, reviewers examined articles for evidence that the exposure assessor(s) was masked to the case or control status of the research participants. An exposure assessor, for example, may examine medical records to determine the exposure history of cases and controls. Sometimes the person determining case or control status is the same person conducting the exposure assessment. In this case, the exposure assessor would most likely not be blinded to case or control status. A reviewer would note such a finding in the comments section of the assessment tool.

One way to ensure good blinding of exposure assessment is to have a separate committee, whose members have no information about the study participants' status as cases or controls, review research participants' records. To help answer the question above, reviewers determined if it was likely that the exposure assessor knew whether the study participant was a case or control. If it was likely, then the reviewers marked "no" to Question 12. Exposure assessors who used medical records should not have been directly involved in the study participants' care, since they probably would have known about their patients' conditions. If the medical records contained information on the patient's condition that identified him/her as a case (which is likely), that information would have had to be removed before the exposure assessors reviewed the records.

If blinding was not possible, which sometimes happens, the reviewers marked "NA" in the assessment tool and explained the potential for bias.

Question 13. Statistical analysis

Were key potential confounding variables measured and adjusted for, such as by statistical adjustment for baseline differences? Investigators often use logistic regression or other regression methods to account for the influence of variables not of interest.

This is a key issue in case-control studies; statistical analyses need to control for potential confounders, in contrast to RCTs, in which the randomization process controls for potential confounders. In the analysis, investigators need to control for all key factors that may be associated with both the exposure of interest and the outcome but are not of interest to the research question.

A study of the relationship between smoking and CVD events illustrates this point. Such a study needs to control for age, gender, and body weight; all are associated with smoking and CVD events. Well-done case-control studies control for multiple potential confounders.
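As a sketch of this kind of adjustment (all variable names and data below are hypothetical), a logistic regression of case status on smoking that controls for age, gender, and body weight might look like this with statsmodels:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical case-control data: case = 1 for cases, 0 for controls.
    rng = np.random.default_rng(0)
    n = 400
    df = pd.DataFrame({
        "case": rng.integers(0, 2, n),
        "smoking": rng.integers(0, 2, n),
        "age": rng.integers(40, 80, n),
        "gender": rng.choice(["F", "M"], n),
        "weight": rng.normal(80, 12, n),
    })

    # Logistic regression of case status on smoking, adjusting for the
    # potential confounders age, gender, and body weight.
    model = smf.logit("case ~ smoking + age + C(gender) + weight", data=df).fit()

    # The exponentiated smoking coefficient is the adjusted odds ratio.
    print(np.exp(model.params["smoking"]))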

Matching is a technique used to improve study efficiency and control for known confounders. For example, in the study of smoking and CVD events, an investigator might identify cases that have had a heart attack or stroke and then select controls of similar age, gender, and body weight to the cases. For case-control studies, it is important that if matching was performed during the selection or recruitment process, the variables used as matching criteria (e.g., age, gender, race) should be controlled for in the analysis.

General Guidance for Determining the Overall Quality Rating of Case-Control Studies

NHLBI designed the questions in the assessment tool to help reviewers focus on the key concepts for evaluating a study's internal validity, not to use as a list from which to add up items to judge a study's quality.

Internal validity for case-control studies is the extent to which the associations between disease and exposure reported in the study can truly be attributed to the exposure being evaluated rather than to flaws in the design or conduct of the study. In other words, what is the ability of the study to draw associative conclusions about the effects of the exposures on outcomes? Any such flaws can increase the risk of bias.

In critically appraising a study, the following factors need to be considered: the potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed in the questions above. High risk of bias translates to a poor quality rating; low risk of bias translates to a good quality rating. Again, the greater the risk of bias, the lower the quality rating of the study.

In addition, the more attention in the study design to issues that can help determine whether there is a causal relationship between the outcome and the exposure, the higher the quality of the study. These include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, sufficient timeframe to see an effect, and appropriate control for confounding–all concepts reflected in the tool.

If a study has a "fatal flaw," then risk of bias is significant; therefore, the study is deemed to be of poor quality. An example of a fatal flaw in case-control studies is a lack of a consistent standard process used to identify cases and controls.

Generally, when reviewers evaluated a study, they did not see a "fatal flaw," but instead found some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, reviewers examined the potential for bias in the study. For any box checked "no," reviewers asked, "What is the potential risk of bias resulting from this flaw in study design or execution?" That is, did this factor lead to doubt about the results reported in the study or the ability of the study to accurately assess an association between exposure and outcome?

By examining questions in the assessment tool, reviewers were best able to assess the potential for bias in a study. Specific rules were not useful, as each study had specific nuances. In addition, being familiar with the key concepts helped reviewers assess the studies. Examples of studies rated good, fair, and poor were useful, yet each study had to be assessed on its own.

Quality Assessment Tool for Before-After (Pre-Post) Studies With No Control Group

Guidance for Assessing the Quality of Before-After (Pre-Post) Studies With No Control Group

Question 1. Study question

Question 2. Eligibility criteria and study population

Did the authors describe the eligibility criteria applied to the individuals from whom the study participants were selected or recruited? In other words, if the investigators were to conduct this study again, would they know whom to recruit, from where, and from what time period?

Here is a sample description of a study population: men over age 40 with type 2 diabetes, who began seeking medical care at Phoenix Good Samaritan Hospital, between January 1, 2005 and December 31, 2007. The population is clearly described as: (1) who (men over age 40 with type 2 diabetes); (2) where (Phoenix Good Samaritan Hospital); and (3) when (between January 1, 2005 and December 31, 2007). Another sample description is women who were in the nursing profession, who were ages 34 to 59 in 1995, had no known CHD, stroke, cancer, hypercholesterolemia, or diabetes, and were recruited from the 11 most populous States, with contact information obtained from State nursing boards.

To assess this question, reviewers examined prior papers on study methods (listed in reference list) when necessary.

Question 3. Study participants representative of clinical populations of interest

The participants in the study should be generally representative of the population in which the intervention will be broadly applied. Studies on small demographic subgroups may raise concerns about how the intervention will affect broader populations of interest. For example, interventions that focus on very young or very old individuals may affect middle-aged adults differently. Similarly, researchers may not be able to extrapolate study results from patients with severe chronic diseases to healthy populations.

Question 4. All eligible participants enrolled

To further explore this question, reviewers may need to ask: Did the investigators develop the I/E criteria prior to recruiting or selecting study participants? Were the same underlying I/E criteria used for all research participants? Were all subjects who met the I/E criteria enrolled in the study?

Question 5. Sample size

Did the authors present their reasons for selecting or recruiting the number of individuals included or analyzed? Did they note or discuss the statistical power of the study? This question addresses whether there was a sufficient sample size to detect an association, if one did exist.

An article's methods section may provide information on the sample size needed to detect a hypothesized difference in outcomes and a discussion of statistical power (e.g., the study had 85 percent power to detect a 20 percent increase in the rate of an outcome of interest, with a 2-sided alpha of 0.05). Sometimes estimates of variance and/or estimates of effect size are given instead of sample size calculations. In any case, if the reviewers determined that the power was sufficient to detect the effects of interest, then they would answer "yes" to Question 5.
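For readers who want to see what such a calculation looks like, here is a minimal sketch using statsmodels; the baseline rate of 0.20 is an assumption made purely for illustration:

    from statsmodels.stats.power import NormalIndPower
    from statsmodels.stats.proportion import proportion_effectsize

    # Mirroring the example in the text: 85 percent power to detect a
    # 20 percent relative increase in an outcome rate (assumed baseline
    # 0.20 -> 0.24), with a 2-sided alpha of 0.05.
    effect = proportion_effectsize(0.24, 0.20)
    n_per_group = NormalIndPower().solve_power(
        effect_size=effect, alpha=0.05, power=0.85, alternative="two-sided")
    print(round(n_per_group))  # required sample size per group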

Question 6. Intervention clearly described

Another pertinent question regarding interventions is: Was the intervention clearly defined in detail in the study? Did the authors indicate that the intervention was consistently applied to the subjects? Did the research participants have a high level of adherence to the requirements of the intervention? For example, if the investigators assigned a group to 10 mg/day of Drug A, did most participants in this group take the specific dosage of Drug A? Or did a large percentage of participants end up not taking the specific dose of Drug A indicated in the study protocol?

Reviewers ascertained that changes in study outcomes could be attributed to study interventions. If participants received interventions that were not part of the study protocol and could affect the outcomes being assessed, the results could be biased.

Question 7. Outcome measures clearly described, valid, and reliable

Were the outcomes defined in detail? Were the tools or methods for measuring outcomes accurate and reliable–for example, have they been validated or are they objective? This question is important because the answer influences confidence in the validity of study results.

An example of an outcome measure that is objective, accurate, and reliable is death–the outcome measured with more accuracy than any other. But even with a measure as objective as death, differences can exist in the accuracy and reliability of how investigators assessed death. For example, did they base it on an autopsy report, death certificate, death registry, or report from a family member? Another example of a valid study is one whose objective is to determine if dietary fat intake affects blood cholesterol level (cholesterol level being the outcome) and in which the cholesterol level is measured from fasting blood samples that are all sent to the same laboratory. These examples would get a "yes."

An example of a "no" would be self-report by subjects that they had a heart attack, or self-report of how much they weigh (if body weight is the outcome of interest).

Question 8. Blinding of outcome assessors

Blinding or masking means that the outcome assessors did not know whether the participants received the intervention or were exposed to the factor under study. To answer the question above, the reviewers examined articles for evidence that the person(s) assessing the outcome(s) was masked to the participants' intervention or exposure status. An outcome assessor, for example, may examine medical records to determine the outcomes that occurred in the exposed and comparison groups. Sometimes the person applying the intervention or measuring the exposure is the same person conducting the outcome assessment. In this case, the outcome assessor would not likely be blinded to the intervention or exposure status. A reviewer would note such a finding in the comments section of the assessment tool.

In assessing this criterion, the reviewers determined whether it was likely that the person(s) conducting the outcome assessment knew the exposure status of the study participants. If not, then blinding was adequate. An example of adequate blinding of the outcome assessors is to create a separate committee whose members were not involved in the care of the patient and had no information about the study participants' exposure status. Using a study protocol, committee members would review copies of participants' medical records, which would be stripped of any potential exposure information or personally identifiable information, for prespecified outcomes.

Question 9. Followup rate

Higher overall followup rates are always preferable to lower followup rates, although higher rates are expected in shorter studies, and lower overall followup rates are often seen in longer studies. Usually an acceptable overall followup rate is considered 80 percent or more of participants whose interventions or exposures were measured at baseline. However, this is a general guideline.

In accounting for those lost to followup, investigators may have imputed outcome values in the analysis or used other methods. For example, they may carry forward the baseline value or the last observed value of the outcome measure and use it as the imputed final outcome for research participants lost to followup.
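A minimal sketch of last observation carried forward (LOCF), one of the simpler imputation methods described above, with hypothetical column names:

    import pandas as pd

    # Hypothetical long-format data: one row per participant per visit.
    visits = pd.DataFrame({
        "participant": [1, 1, 1, 2, 2, 2],
        "visit":       [0, 1, 2, 0, 1, 2],
        "outcome":     [140.0, 135.0, None, 150.0, None, None],
    })

    # LOCF: within each participant, fill a missing outcome with the most
    # recent observed value.
    visits = visits.sort_values(["participant", "visit"])
    visits["outcome_locf"] = visits.groupby("participant")["outcome"].ffill()
    print(visits)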

Question 10. Statistical analysis

Were formal statistical tests used to assess the significance of the changes in the outcome measures between the before and after time periods? The reported study results should present values for statistical tests, such as p values, to document the statistical significance (or lack thereof) for the changes in the outcome measures found in the study.
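One common formal test for a pre-post design is a paired t-test on the before and after measurements; a minimal sketch with made-up numbers:

    import numpy as np
    from scipy import stats

    # Hypothetical before/after measurements for the same participants.
    before = np.array([142.0, 138.0, 150.0, 145.0, 160.0])
    after = np.array([136.0, 135.0, 148.0, 139.0, 151.0])

    # The paired t-test asks whether the mean within-person change
    # differs from zero.
    t_stat, p_value = stats.ttest_rel(before, after)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")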

Question 11. Multiple outcome measures

Were the outcome measures for each person measured more than once during the course of the before and after study periods? Multiple measurements with the same result increase confidence that the outcomes were accurately measured.

Question 12. Group-level interventions and individual-level outcome efforts

Group-level interventions are usually not relevant for clinical interventions such as bariatric surgery, in which the interventions are applied at the individual patient level. In those cases, the questions were coded as "NA" in the assessment tool.

General Guidance for Determining the Overall Quality Rating of Before-After Studies

The questions in the quality assessment tool were designed to help reviewers focus on the key concepts for evaluating the internal validity of a study. They are not intended to create a list from which to add up items to judge a study's quality.

Internal validity is the extent to which the outcome results reported in the study can truly be attributed to the intervention or exposure being evaluated, and not to biases, measurement errors, or other confounding factors that may result from flaws in the design or conduct of the study. In other words, what is the ability of the study to draw associative conclusions about the effects of the interventions or exposures on outcomes?

Critical appraisal of a study involves considering the risk of potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues throughout the questions above. High risk of bias translates to a rating of poor quality; low risk of bias translates to a rating of good quality. Again, the greater the risk of bias, the lower the quality rating of the study.

In addition, the more attention in the study design to issues that can help determine if there is a causal relationship between the exposure and outcome, the higher the quality of the study. These issues include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, and sufficient timeframe to see an effect.

Generally, when reviewers evaluate a study, they will not see a "fatal flaw," but instead will find some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, reviewers should ask themselves about the potential for bias in the study they are critically appraising. For any box checked "no" reviewers should ask, "What is the potential risk of bias resulting from this flaw in study design or execution?" That is, does this factor lead to doubt about the results reported in the study or doubt about the ability of the study to accurately assess an association between the intervention or exposure and the outcome?

The best approach is to think about the questions in the assessment tool and how each one reveals something about the potential for bias in a study. Specific rules are not useful, as each study has specific nuances. In addition, being familiar with the key concepts will help reviewers be more comfortable with critical appraisal. Examples of studies rated good, fair, and poor are useful, but each study must be assessed on its own.


A case study evolving quality management in Indian civil engineering projects using AI techniques: a framework for automation and enhancement


Kaushal Kumar, Saurav Dixit, Umank Mishra & Nikolai Ivanovich Vatin


The present research examines a wide range of civil engineering projects across India, each providing a distinct platform for investigating quality management, automation techniques, and improvement activities using artificial intelligence (AI) techniques. The study covers projects demonstrating the variety of India's civil engineering undertakings, from the Smart City Mission to the Mumbai Metro Line 3 and the Chennai-Madurai Expressway. The adoption of quality management techniques, including ISO 9001 Certification, Lean Construction, Six Sigma, Building Information Modeling (BIM), and Total Quality Management (TQM), is evaluated in the projects. In this case study, experimental datasets and AI techniques such as Artificial Neural Networks (ANN) are employed to predict accurate outcomes. It was also observed that the regression coefficient (R²) and error (MSE) varied more for networks with 1 to 5 hidden layer nodes, while networks with 6 to 10 hidden layer nodes produced stable outcomes. Of these, the network with 9 hidden layer nodes performed best, with the highest regression coefficient (R² = 99.4%) and the minimum error (MSE = 0.04). The complete investigation of the outcomes indicates the suitability of the existing model for accurately predicting the UCS. A thorough framework for improving quality management in Indian civil engineering projects is the research's final product, and it offers insightful information to industry stakeholders.


Data availability

No datasets were generated or analysed during the current study.


Acknowledgements

The authors are thankful to Lovely Professional University, Jalandhar, Punjab, India, an autonomous organization, for providing the basic data set for the analysis carried out in this study.

This research was also funded by the Ministry of Science and Higher Education of the Russian Federation within the framework of the state assignment No. 075-03-2022-010 dated 14 January 2022 and No. 075-01568-23-04 dated 28 March 2023 (Additional agreement 075-03-2022-010/10 dated 09 November 2022, Additional agreement 075-03-2023-004/4 dated 22 May 2023), FSEG-2022-0010.

Author information

Authors and Affiliations

Department of Mechanical Engineering, K. R. Mangalam University, Gurugram, Haryana, 122103, India

Kaushal Kumar

Division of Research and Development, Lovely Professional University, Phagwara, Punjab, 144401, India

Saurav Dixit

Department of Civil Engineering, Shri Shankaracharya Technical Campus, Bhilai, Chhattisgarh, 490020, India

Umank Mishra

Peter The Great St. Petersburg Polytechnic University, Saint Petersburg, 195251, Russia

Nikolai Ivanovich Vatin

Division of Research and Innovation, Uttaranchal University, Dehradun, India


Contributions

Author contributions: K.K. wrote the main manuscript text, K.K. and S.D. provided the methodology, and U.M. and N.V. reviewed the manuscript.

Corresponding author

Correspondence to Kaushal Kumar.

Ethics declarations

Competing interests.

The authors declare no competing interests.


About this article

Kumar, K., Dixit, S., Mishra, U. et al. A case study evolving quality management in Indian civil engineering projects using AI techniques: a framework for automation and enhancement. Asian J Civ Eng (2024). https://doi.org/10.1007/s42107-024-01029-5

Received: 21 February 2024

Accepted: 06 March 2024

Published: 02 April 2024

DOI: https://doi.org/10.1007/s42107-024-01029-5


Keywords

  • Artificial intelligence (AI)
  • Automation tools
  • Building information modeling (BIM)
  • Enhancement initiatives
  • Indian projects
  • Quality management



Quality Control Case Study

Golden Rule Auto Care

To gauge the quality of work leaving the shop, Chris inspected completed vehicles at Golden Rule Auto Care, asking questions such as:

  • Is there grease on the hood, door, steering wheel, console, etc.?
  • Are check engine or warning lights on?
  • If an oil change was performed, are lights reset and sticker placed in windshield?
  • Are tools left in the vehicle?
  • Is the work completed and the vehicle fully reassembled?
  • Are all caps put back on?
  • Are all belts and hoses tightened?
  • Are all fluids filled up?
  • Are the tires properly inflated?

Chris found that 80% of the vehicles he inspected had one or more of these issues. He presented the data to his team, and it was decided something had to change: a quality control process was to be implemented.

Quality Control Chart

Quality control is a process followed by nearly all Fortune 500 companies to ensure that top quality work and performance are met on every single product and service. With his software engineering background, Chris applied the practice of quality control from the software industry to the auto repair industry. Software is not released until a separate team or person reviews and tests it for bugs.


Following is the plan that was implemented at Golden Rule Auto Care:

  • Every vehicle was to be inspected and wiped down by someone at the front counter.
  • If a vehicle had driveability work performed, a counter person would perform another test drive and complete the quality control inspection.
  • The results were to be captured in a monthly report for employees to see the effectiveness of their new quality process (a simple way to compute such a report is sketched after this list).
  • Golden Rule Auto Care wanted to celebrate the fact that they were finding the issues vs. the customer but also wanted to make sure the technicians were not using the quality control process as a crutch to not properly complete their jobs.
  • The customer would be informed about the process as an added advantage to the shop and invited to observe or even participate in the quality control inspection.
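A monthly report like the one described can be derived from a basic inspection log. The following sketch (hypothetical column names and made-up numbers) computes, per month, how many vehicles were inspected and what share had at least one QC issue:

    import pandas as pd

    # Hypothetical inspection log: one row per vehicle QC inspection.
    log = pd.DataFrame({
        "date": pd.to_datetime(["2024-01-03", "2024-01-15", "2024-02-02",
                                "2024-02-20", "2024-02-27"]),
        "issues_found": [1, 0, 2, 0, 1],
    })

    # Monthly report: vehicles inspected and share with at least one issue.
    log["has_issue"] = log["issues_found"] > 0
    report = log.groupby(log["date"].dt.to_period("M")).agg(
        vehicles=("has_issue", "size"),
        issue_rate=("has_issue", "mean"),
    )
    print(report)  # an issue_rate of 0.10 or less meets the shop's KPI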

Since initiating the quality control process, Chris has created a quality control checklist in Autoflow.

Note: There was pushback from the counter people and technicians.

  • Counter people said they didn't have time but agreed change was needed given an 80% quality-issue rate. It turned out that an average quality control ("QC") inspection took only 5 to 10 minutes.
  • Technicians felt they were being disrespected by someone reviewing their work. They were assured that mistakes are common and needed to be caught by the shop rather than by customers.
The shop also saw several benefits from the quality control process:

  • Customer Retention – Customers will keep coming back, knowing that they received top quality service the first time and every time they visit your shop. Loyal customers are worth 10 times their initial visit. It is 6-7 times more expensive to acquire a new customer than it is to keep an existing one.
  • Positive Reputation – It takes 12 positive consumer experiences to make up for one bad consumer experience. Consumers are likely to tell 10-15 people about their bad experience.
  • Add-on Sales – A service writer/counter person may notice something that was originally not written up, such as a past due oil change sticker that a technician missed, which can lead to add-on sales.
Time investment and results at Golden Rule Auto Care:

  • Quality control time without drive time: 5 to 10 minutes
  • Quality control time with drive time: 15 to 20 minutes
  • Golden Rule Auto Care's average: 12 minutes
  • KPI or goal for QC issues per month: 10% or less
  • This year, they have run as high as 25% and as low as 9% per month.
  • Most common issue found: grease on handle, door panel, seat, console, floor, kick plate, etc.
  • Second most common issue found: fluids not full
Notable issues caught during QC inspections include:

  • Intake boot under upper radiator hose was disconnected and pressed against the exhaust manifold, melting and smoking.
  • Vehicle started leaking a large amount of coolant due to incorrect installation.
  • Positive battery terminal not fully tightened.
  • Scanner still plugged in (these can get expensive to give away!)

Following is a snapshot of Golden Rule Auto Care’s monthly report:

Quality Control Report


Using machine learning prediction models for quality control: a case study from the automotive industry

Mohamed Kais Msakni

Department of Industrial Economics and Technology Management, Norwegian University of Science and Technology, Torgarden, 7491 Trondheim, Norway

Anders Risan

Peter Schütz

This paper studies a prediction problem using time series data and machine learning algorithms. The case study is related to the quality control of bumper beams in the automotive industry. These parts are milled during the production process, and the locations of the milled holes are subject to strict tolerance limits. Machine learning models are used to predict the location of milled holes in the next beam. By doing so, tolerance violations are detected at an early stage, and the production flow can be improved. A standard neural network, a long short-term memory (LSTM) network, and random forest algorithms are implemented and trained with historical data, including a time series of previous product measurements. Experiments indicate that all models have similar predictive capabilities, with a slight dominance for the LSTM and random forest. The results show that some holes can be predicted with good quality, and the predictions can be used to improve the quality control process. However, other holes show poor results and support the claim that real data problems are challenged by inappropriate information or a lack of relevant information.

Introduction

The emergence of the fourth industrial revolution, Industry 4.0, is primarily driven by advancements in information, communication, and intelligence technologies that can improve production flexibility, efficiency, and productivity in industry (Ibarra et al. 2018). While the definition of Industry 4.0 is broad, there are several key concepts associated with it, such as smart factories, the Internet of Things (IoT), cloud computing, cyber-physical systems, and Big Data Manufacturing (Santos et al. 2017). IoT technology connects manufacturing resources, such as sensors, machines, and other equipment, enabling interconnection between components and reducing human intervention. This also allows real-time, high-accuracy monitoring of product quality, equipment, and production processes. Real-time data flow can help identify problems early on and provide better visibility into the flow of materials and products. In addition, cloud computing makes data available to other systems with powerful resources, such as servers, storage, and software (Lee and Lee 2015). As many manufacturers have large amounts of data that go unused, cloud computing is seen as a way to transform the traditional manufacturing business model into an effective collaboration, helping manufacturers align business strategies and product innovation and create smart networks (Xu 2012). The amount of data collected from various systems and objects is growing at an exponential rate and is commonly referred to as Big Data. This concept is characterized by high dimensionality and high complexity due to the variety of formats, semantics, and quality of sensors and processes generating the data (Wuest et al. 2016). As a key concept in smart factories, Big Data can impact Industry 4.0 in three ways: enabling self-diagnosis, forecasting, and control (Tao et al. 2017). Conventional data processing software and technologies cannot fully leverage the potential of these large and complex datasets, and advanced methods such as machine learning algorithms are needed to organize and derive value from the data.

In the context of Industry 4.0, machine learning has been applied to different levels of the industrial process, such as anomaly detection, process optimization, predictive maintenance, quality control, diagnosis, and resource management (Roblek et al. 2016). Machine learning is seen as a promising improvement in manufacturing as it allows for decentralized, autonomous, and real-time decision-making without human interaction. It has the advantages of addressing large and complex processes and enabling continuous quality improvement (Dogan and Birant 2021). Unlike conventional algorithms, machine learning algorithms can dynamically learn from the system and automatically adapt to changes in the environment. They can also detect patterns and implicit knowledge from the data, improving existing processes and methods in manufacturing (Wuest et al. 2016). However, the application of machine learning is not straightforward. The performance of these algorithms can be hindered by the acquisition of relevant data in terms of volume and quality. On the one hand, the training data must be sufficiently numerous to reach the level of generalization at which the learning model also performs well on new (unseen) data. On the other hand, the data may either contain inappropriate and redundant information or lack relevant information, as not all data is captured during the manufacturing process, and some attributes may not be available. Data preprocessing, which includes selecting relevant inputs and normalizing the data (Wuest et al. 2016), is also an important step before learning. The challenges of machine learning are not only limited to data but also include the algorithm itself. Some machine learning algorithms are more appropriate for specific applications, and the performance of some of them depends on selecting suitable hyperparameter settings. Despite these challenges, machine learning algorithms have the capacity to extract new information and provide better results than conventional algorithms.

One of the advances offered by Industry 4.0 is the opportunity to improve quality control in manufacturing. Traditionally, manufacturers have used Statistical Process Control (SPC) to ensure that product features are defect-free and meet specifications. SPC is based on the statistical assumption that random factors, such as humidity, temperature changes, and variations in raw material, tend to form a normal distribution centered on the quality characteristics of the product (e.g., length, weight, and hardness). Thus, the process is under statistical control, which allows for analyzing the outputs and the capability of the process. SPC provides tools and techniques for monitoring and exploring the process behavior and identifying anomalies (Tao et al. 2017; Oakland and Oakland 2018). With the technological capabilities of Industry 4.0, SPC can be supplemented to improve quality control further. Big data and cloud computing can use real-time data to detect quality defects and process instability at an early stage. For example, Gokalp et al. (2017) describe real-time data analysis used to self-calibrate a process when a deviation occurs in the trajectory of an ongoing machining process. In addition, machine learning can use time-series data of process and product variables to identify patterns and detect early process deviations so that preventive measures can be taken and the production process is stabilized.

This paper investigates the use of machine learning algorithms to predict product quality in manufacturing in order to support quality control. The focus is on bumper beams, which are an essential component of automotive crash management systems and are subject to strict quality control. The goal is to improve the quality control process in production by predicting the quality of future products, allowing for early adjustments, and reducing scrap production and downtime in the production system. The machine learning algorithms used in this study are based on neural networks and random forests. They are trained on historical data consisting of previously produced and measured parts provided by the manufacturer. The effectiveness of the neural network and random forest models is compared and evaluated for their ability to predict key product characteristics important for quality control. This work differs from previous research in that it develops machine learning models that use previously measured products to predict the quality of the next product rather than using the real-time state of the system to predict the quality of the current part.

The outline of the remainder of this paper is as follows. Section 2 discusses machine learning for quality control in manufacturing systems and presents related works in the literature. Section 3 introduces the concept of time series and relates it to process control and machine learning prediction models. The case study of this paper is discussed in Sect. 4. Section 5 shows the implementation and the obtained performance of the learning models. Finally, Sect. 6 concludes the paper.

Related works

Maintaining high-quality products and processes is essential for success in a competitive environment. In manufacturing, product quality relates to the functional aspects of the product, which must be free of defects and out-of-tolerance conditions. The process of ensuring that any manufactured product meets the requirements is called quality control. If a product does not satisfy the requirements, it is considered a poor-quality product and will be removed from the production line. Many factors can cause quality to vary in the production process, such as humidity, temperature, and variations in raw materials and tools.

As technology advances and data becomes more available, new ways to perform more accurate, real-time quality control are emerging. Machine learning has already been successfully applied to tasks involving quality control and quality assessments and is expected to further improve the field of quality control in the future (Wuest et al. 2016 ). Collected data can be analyzed by learning algorithms in two ways (Tao et al. 2018 ). The first is to monitor the process in real-time to ensure product quality; for example, a deviation in tool trajectory can be detected using real-time analysis, and the process can be adjusted according to the requirements. The second is to identify emerging problems. Using historical data, the learning algorithms can identify patterns or predict the output characteristics of a process, enabling early detection of faulty products.

The neural network is one of the most widely used machine learning algorithms for process and quality control in manufacturing environments. Most of the applications are related to real-time analysis for process control or detection of defective products by image recognition. Karayel (2009) uses a feedforward neural network as an observer for a control system by predicting surface roughness in a computer numerical control (CNC) lathe. The prediction model uses process parameters as input data, i.e., cutting depth, cutting speed, and feed rate, to predict surface roughness. This prediction is later sent to the controller to determine the best cutting parameters for the CNC turning system. A similar network structure is used by Tsai et al. (1999) to predict real-time surface roughness in milling cutting operations. The model uses process parameters that consist of vibration measures, depth of cut, speed of the axis, and rotation and feed speed. Martin et al. (2007) propose a supervised feedforward neural network to replace a human expert in the quality control process of resistance spot welding. The machine learning model uses ultrasonic oscillograms to classify the quality of spot welds into one of six predefined levels. Zhao et al. (2020) use power signals to predict the nugget diameter from spot-welded joints in a real-time prediction system. The paper compares the performance of a regression model and a feedforward network in monitoring weld quality and shows that the latter model provides better performance. To detect defects in parts, Wang et al. (2018) propose a deep convolutional neural network that uses raw images of flat surfaces to automatically extract product features for defect detection and improve production efficiency. For geometrically complex products (such as turbo blades), Wang et al. (2020) develop a similar model architecture in a cloud-based platform to meet the high-speed performance required by complex product images. Risan et al. (2021) develop a feedforward neural network to predict the location of a milled hole in a bumper beam.

Other works in the literature focus on prediction models to support quality control using random forest algorithms. With respect to the machining process, most works use process parameters as input variables for prediction models, such as feed rates, tool wear, and drive power. Bustillo et al. (2021) investigate the prediction of surface flatness deviations in a face milling process using different machine learning algorithms. The input data consists of tool life and wear and drive power. The problem is first designed and evaluated as a regression problem and then as a classification problem using discretized flatness levels. For the regression problem, the random forest is outperformed by the two artificial neural networks, a multilayer perceptron network and a radial basis function network. However, when the classification problem is considered, the random forest gives the most accurate predictions. Wu et al. (2018) develop a random forest model to predict the surface roughness in fused deposition modeling. The input data is mainly based on the temperature and vibration of the table and extruder. The model can predict the surface roughness of a printed part with very high accuracy. Bustillo et al. (2018) propose different machine learning algorithms for surface roughness and loss-of-mass predictions in machining processes. The models studied include regression trees, multilayer perceptrons, radial basis networks, and random forest. The experiments show that multilayer perceptrons achieve the best surface roughness prediction. However, the random forest has the advantage of being more suitable for industrial use in the absence of experts in machine learning, as this model has a non-parametric property. Agrawal et al. (2015) develop a multiple regression model and a random forest model for the prediction of surface roughness during hard turning of a hardened steel piece. Both models use the same cutting parameters as input variables. The results show better surface predictions for the random forest model. Li et al. (2019) use an ensemble of machine learning models to predict the surface roughness for an extrusion-based additive manufacturing process. A random forest is used to reduce the input variable size to improve the computational efficiency and avoid overfitting. Then, an ensemble of different machine learning models is used for surface roughness prediction.

Quality control can also be enhanced using time series data and prediction models. Ma et al. (2022) propose a soft sensor model for quality prediction of industrial products using time series data and process features. The model framework is based on a neighborhood dimension reduction and a bidirectional gated recurrent unit. Shohan et al. (2022) use time series modeling to help improve the prediction quality of a biofabrication process. Standard autoregressive time series models and machine learning models were tested, and experiments showed that the Long Short-Term Memory model provides the best performance in terms of mean square errors. In another application, Freeman et al. (2018) implement deep learning techniques for the prediction of air quality. The data consists of time series events of hourly air quality and meteorological events. Kim et al. (2022) use a descriptive time series analysis to predict downtime of a linear medical accelerator by using long-term maintenance data. Meng et al. (2022) deploy a deep learning model using time series data of historical images and image recognition of plants. The objective is to improve quality by helping growers maintain healthy plants with high yields. Prediction using time series data is not limited to quality improvement but covers a wide range of applications. Many works have recently emerged to predict COVID-19 transmission using time series and deep learning models (Long Short-Term Memory networks and Gated Recurrent Units), e.g., Rauf et al. (2021); Ayoobi et al. (2021).

Methodology

Time series data, which consists of observations recorded at specific times, is often available in manufacturing processes and equipment. It is then important to exploit these data to extract valuable information for the manufacturers. This task corresponds to finding a model that describes a time series. This model estimates the relationship between the variable of interest Y and the input variables X using a function f . While various approaches can be applied, i.e., physical, statistical, and machine learning models, the nonlinear and high-dimensional aspects of manufacturing systems make it very difficult to develop a satisfactory model for estimating f . Despite this challenge, developing a time series model has several advantages, such as a compact description of the time series, hypothesis testing, separation and filtering of noise from data, and time series prediction (Brockwell and Davis 2016 ).

Time series and statistical process control

In the context of quality control, SPC is a widely used method that involves process capability analysis and statistical analysis of process results. These methods rely on monitoring and analyzing the product features relevant to product quality. By using samples of a specific size from the process, causes of variation can be identified and adjustments can be made (Groover 2019 ).

One of the primary techniques is the control chart, which offers a visual way to study the evolution of a process over time. Time-series data are represented in a chart with a central line for the average, an upper line for the upper control limit, and a lower line for the lower control limit. These control limits are then compared to the actual data to see whether the process variation is under control. When the process is under statistical control, the control limits are defined based on the process capability ( PC ), which provides information about the accuracy of a process's performance over time and measures the ability of a process to meet its specifications (Oakland and Oakland 2018 ). It can be defined as:

$$PC = \mu \pm 3\sigma \tag{1}$$

where $\mu$ is the mean of the process and $\sigma$ is the standard deviation. Thus, 99.73% of the outputs of a controlled process fall within the $3\sigma$ limits.
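To make this control logic concrete, the following is a minimal sketch in Python (the language used for the experiments later in the paper); the simulated deviations and all variable names are hypothetical:

```python
import numpy as np

def control_limits(samples: np.ndarray):
    """Center line and 3-sigma control limits for a monitored feature."""
    mu = samples.mean()
    sigma = samples.std(ddof=1)  # sample standard deviation
    return mu, mu - 3 * sigma, mu + 3 * sigma

def out_of_control(samples: np.ndarray) -> np.ndarray:
    """Indices of observations falling outside the 3-sigma limits."""
    _, lcl, ucl = control_limits(samples)
    return np.where((samples < lcl) | (samples > ucl))[0]

# Hypothetical deviations of one monitored feature (in mm)
rng = np.random.default_rng(0)
deviations = rng.normal(loc=0.0, scale=0.1, size=200)
mu, lcl, ucl = control_limits(deviations)
print(f"CL={mu:.3f}  LCL={lcl:.3f}  UCL={ucl:.3f}")
print("Out-of-control observations:", out_of_control(deviations))
```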

Machine learning prediction models

The main challenge of machine learning models is to establish a valid representation of the input data (the input variables X of the time series) by performing some transformation (the model or function f ) that approximates the expected outcomes (the variable of interest Y ). The provided data set is commonly referred to as the "training set" and is used by machine learning algorithms, which apply a predefined set of operations to build a model. Thus, machine learning models are not explicitly programmed to make decisions or predictions but are created during the learning stage. Machine learning has been successfully applied to a variety of problems across different domains, such as image recognition, anomaly detection, and quality control. In this work, two classes of machine learning algorithms are developed, namely neural networks and random forests, to predict the location of holes in future products. Since the locations are continuous values and the training set is composed of input and output variables, the problem is a regression problem with supervised learning. A general description of neural networks and random forests is given in the following subsections.

Neural networks

Neural networks are one of the most widely known machine learning algorithms and have been successfully applied to a wide range of fields. The algorithm is inspired by biological networks of neurons, in which neurons are chemically connected to form an extensive network. In artificial neural networks, neurons are modeled as nodes and connections as weights. The weights scale the signal passed between two nodes, strengthening or inhibiting the connection. A node receives many weighted inputs that are transformed into a single output. Typically, the neurons in a neural network are organized in layers. The first (input) layer passes the input data to the network without any transformation, and the last (output) layer consists of the output variables. The hidden layers connect the input layer to the output layer and perform the data transformation using activation functions. The role of an activation function in a hidden layer is to transform the weighted sum of the inputs into an output that is used in the following layers. A general structure of the feedforward network is illustrated in Fig.  1 .

Fig. 1. An example of a feedforward neural network with an input layer, two hidden layers, and one output layer with one target variable. Adapted from Ketkar and Moolayil (2021).

The layered representation of neurons can capture complex relationships between input and output data and extract complex patterns. Indeed, neural networks can model nonlinear statistical data and handle high-dimensional and multivariate data. However, they require more data than other machine learning models, and the best performance requires extensive customization, as neural networks depend on several hyper-parameters. Also, neural networks do not provide any information about how the outputs are computed, a problem commonly referred to as the black box problem in machine learning (Goodfellow et al. 2016 ).

Although neural networks are well suited for a large variety of problems, such as image recognition and text recognition, they suffer from a major issue known as the vanishing gradient problem, which prevents learning long-term dependencies (Rehmer and Kroll 2020 ). This makes it difficult to train standard neural networks on long data series (Kinyua and Jouandeau 2021 ). The vanishing gradient problem can be addressed by including gated units, such as the Long Short-Term Memory and the Gated Recurrent Unit (Rehmer and Kroll 2020 ).

Random forests

The random forest algorithm has become a widely used machine learning algorithm because of its simplicity and accuracy (Biau and Scornet 2016 ) and its ability to perform both supervised and unsupervised learning, as well as classification and regression (Genuer and Poggi 2020 ). The algorithm is a statistical learning method proposed by Breiman ( 2001 ), based on the principles of ensemble learning. In machine learning, ensemble learning refers to techniques that combine the predictions of a group of trained models (an ensemble). The idea is that by aggregating the outcomes of several models, the ensemble is more likely to perform better than any individual model in it. For the random forest, the algorithm is trained on different and independent training subsets (bootstrap samples) to obtain several models, referred to as trees. Figure  2 illustrates the general structure of the random forest.

Fig. 2. Flowchart of training a random forest tree and aggregating the results. Adapted from Genuer and Poggi (2020).

A decision tree is a predictive model with a tree-like structure where the decision progresses from the root node through internal nodes until it reaches a leaf. A node corresponds to a binary split of the predictor space, continuing the decision flow in one of the two sub-trees of the node. A leaf in the decision tree represents a predicted value or class label, and the path to the leaf represents the classification rules. Such a representation makes decision trees readable and simple to interpret. Although there are different algorithms for building a decision tree, the Classification and Regression Tree (CART) algorithm is widely used for random forests (James et al. 2013 ).

For a classification problem, the decision tree uses the input values to reach one of its leaves and, hence, to find the predicted class. A regression problem uses the same decision tree structure, with the difference that the leaves correspond to continuous target values. To make the decision trees independent of each other, they are trained on B randomly drawn, independent subsets of equal size. Each subset is used to train one decision tree, and the B resulting trees are finally aggregated into a forest.

One advantage of the random forest algorithm is that it does not require heavy computation for training, and it is easy to tune as it depends on only a few hyper-parameters. Another advantage is that it is suitable for high-dimensional, multivariate problems, including cases where the number of variables far exceeds the number of observations (Géron 2019 ). However, the prediction quality of the random forest is highly dependent on the quality of the training set; for example, it cannot predict values outside the range of values seen in the training set.
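As an illustration of the bootstrap-and-aggregate principle described above, the following sketch trains B regression trees on bootstrap samples and averages their predictions. The synthetic data is hypothetical, and in practice scikit-learn's RandomForestRegressor encapsulates this whole loop:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(300, 5))            # hypothetical training data
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)

# Train B trees, each on its own bootstrap sample
B = 100
trees = []
for _ in range(B):
    idx = rng.integers(0, len(X), size=len(X))         # sample with replacement
    tree = DecisionTreeRegressor(max_features="sqrt")  # random feature subset per split
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# The forest prediction is the average of the individual tree predictions
X_new = rng.uniform(-3, 3, size=(10, 5))
forest_prediction = np.mean([t.predict(X_new) for t in trees], axis=0)
```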

A case study

The product studied in this paper is the bumper beam, a component of a crash management system in cars. The beam is formed from an extruded aluminum profile and is machined and cut before being fastened to the bumper with screws.

The beam is placed using clamps at predefined locations during the machining step. Then, the CNC machining starts with the milling of reference holes, which are of particular interest because they are used to locate and mill the other holes. In total, there are 20 milled holes, each with a narrow tolerance range regarding its location in the beam. Any displacement of the reference holes results in a deviation of the connected holes. Quality control of the milled holes is performed after the machining process, when a new product is released, or at predefined intervals. The interval between two quality controls is typically two hours, which the manufacturer considers satisfactory to guarantee high-quality standards while ensuring smooth production. During quality control, the geometric characteristics of all milled holes and the beam curvature are automatically measured in an XYZ grid system, resulting in a total of 144 different features. When the measurement report shows any deviation, the production line is stopped and the entire batch produced since the last control is scrapped. An experienced operator then makes the necessary changes to the machine settings, a new beam is machined, and another quality control is performed. The goal of the manufacturer is to reduce production downtime as much as possible to minimize direct economic loss.

Many factors can cause variations in the CNC machining process, including both random variations, such as clamping force, temperature, and variations in upstream activity, and assignable variations, such as the replacement of CNC parts and changes in the beam type being processed. Unfortunately, not all of these variations are available to be considered as part of the input to the learning models.

Figure  3 illustrates the shape of the bumper beam and the locations of the reference holes. Two reference holes (H1 and H4) are located on the left side of the beam, and three other holes (H2, H3, and H5) are located on the right side of the beam. H1, H2, and H3 are located using the YZ coordinate system, whereas H4 and H5 are located using the XZ coordinate system. This work aims to improve the quality control of the milled holes by predicting the reference hole locations of the next product to be manufactured. Machine learning models are implemented to predict future hole positions, which can be used as a preventive measure to avoid out-of-tolerance products. With this information available, early adjustments can be made and the production flow kept smooth. The proposed learning models do not depend on real-time data, as is the case in many studies in the literature, but use a time series analysis of previous measurements to predict the hole locations in the upcoming product. Historical data from all available measurements is used as input to train the models. The target variables are the coordinates of all reference holes.

Fig. 3. Illustration of the bumper beam shape and the locations of the five reference holes.

Figure  4 shows an example of the measurements and data collected for the reference hole H1. This hole is located using a measured value (MS) and a nominal value (NM). The deviation (DV) of H1 is the difference between MS and NM. Based on the deviations of Y and Z , denoted here by dy and dz , the true position (TP) of the measured hole can be computed. The TP defines a circular tolerance area for the hole position and is defined by Eq.  2 :

$$TP = 2\sqrt{dy^2 + dz^2} \tag{2}$$

For the example of Fig.  4 , the TP measure is within the predefined tolerance limits ( - T and + T ). The angular deviation ( DA ) complements the TP measure and provides information about the direction in which the hole has moved. The actual location of H1 in this example is represented by a dotted circle in Fig.  5 .
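Assuming the standard circular true-position convention for Eq. 2 and an atan2-based angle for DA (the manufacturer's exact DA convention is not stated, so this is an illustration), the computation can be sketched as:

```python
import math

def true_position(dy: float, dz: float) -> float:
    """Circular true position from the two coordinate deviations (Eq. 2)."""
    return 2 * math.sqrt(dy ** 2 + dz ** 2)

def angular_deviation(dy: float, dz: float) -> float:
    """Direction of the hole displacement (DA), in degrees."""
    return math.degrees(math.atan2(dz, dy))

# Hypothetical deviations for H1, in mm
dy, dz = 0.12, -0.08
print(f"TP = {true_position(dy, dz):.3f} mm, DA = {angular_deviation(dy, dz):.1f} deg")
```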

Fig. 4. A table from a control report of the studied beam showing the measurements of the reference hole H1.

Fig. 5. An illustration of the displacement of H1 of Fig. 4. The related milled hole is represented by a dotted circle, the solid circle corresponds to the nominal position, and the dotted square represents the area where the hole meets the specifications. DA shows the angular deviation of the milled hole.

The TP values of the other holes are calculated similarly using the deviations of the two coordinates locating a hole. The decision as to whether a hole location is within specification or not depends on the TP values, which must be within the lower and upper limits defined by the manufacturer.

Experiments and results

In this section, the data, implementation, and performance of each machine learning model are discussed.

Training, validation and test set

The dataset used for the quality control prediction consists of 1255 measurement reports, covering three years. Each report includes a timestamp of the measurement operation, the locations of the 20 milled holes, and the curvature of the beam, resulting in 144 different point measurements. It should be mentioned that the interval between two quality control measurements is not always two hours and can vary greatly depending on the production schedule, holidays, priorities, etc. As shown in Fig.  6 , which depicts the measurements of two hole-coordinate pairs using a time-stamped axis, the production for the bumper beam under study was partially interrupted during November and December 2019. This kind of interruption can be found several times (about 15 times) throughout the dataset, with most of them lasting for one or two weeks. Despite these interruptions, we assume that the dataset is continuous as the number of interruptions is small, and the machine learning models used in this study depend only on lag features for prediction.

Fig. 6. Scatter plot of the measurement values of H1-Y and H2-Z over the data collection period.

For a given learning algorithm, the variable to be predicted (the output) is a single hole-coordinate pair, e.g., H1-Y, that is trained and tested separately. The prediction of measurement t uses all points of the three lagged measurements, i.e., t - 3 , t - 2 , and t - 1 , as input, resulting in 3 × 144 independent variables for every variable to predict. Indeed, preliminary testing showed that the prediction mainly depends on the last observation t - 1 and that it can be slightly improved by integrating a three-lag input. Furthermore, it should be noted that all measures are expressed as deviations relative to the nominal values; these values are available in the raw data used in this study.
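A sketch of how such a lagged supervised dataset could be built with pandas, assuming a DataFrame `reports` with one row per control report and one column per measured point (the function and column naming are illustrative):

```python
import pandas as pd

def make_lag_features(reports: pd.DataFrame, target: str, n_lags: int = 3):
    """Build a supervised dataset: the n_lags previous reports form the
    input, and the current value of `target` is the output."""
    lags = pd.concat({f"t-{k}": reports.shift(k) for k in range(1, n_lags + 1)}, axis=1)
    lags.columns = [f"{col}({lag})" for lag, col in lags.columns]  # e.g., "H1-Y(t-1)"
    y = reports[target]
    valid = lags.notna().all(axis=1)  # drop the first n_lags rows
    return lags[valid], y[valid]

# Usage, assuming `reports` has 144 measurement columns:
# X, y = make_lag_features(reports, target="H1-Y")   # X has 3 x 144 columns
```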

The dataset is divided into two subsets. The first 70% of the dataset is used to train and validate the machine learning algorithms, and the remaining 30% is used to test the prediction performance of the models. Since the default hyperparameters of the models cannot guarantee optimal results for the prediction problem, we performed a hyperparameter tuning step on the first subset of data. This subset was, in turn, divided into two parts, with the first 50% of the full dataset used for training and the next 20% for validation. The hyperparameter test was done on a randomly chosen hole, namely H5-X (the results of subsection 5.4 show that this hole has an average performance). The best parameters found were then used for the other holes. It should be noted that other holes were also selected for the hyperparameter tests, and similar results were obtained.
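The chronological 50/20/30 split described above amounts to slicing the time-ordered data without shuffling, for example:

```python
def chronological_split(X, y, train_frac=0.5, val_frac=0.2):
    """Split time-ordered data into train/validation/test without shuffling."""
    n = len(X)
    i, j = int(n * train_frac), int(n * (train_frac + val_frac))
    return (X[:i], y[:i]), (X[i:j], y[i:j]), (X[j:], y[j:])

# (X_tr, y_tr), (X_val, y_val), (X_te, y_te) = chronological_split(X, y)
```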

Implementation

In addition to the random forest, two neural network models are considered for prediction purposes. The first model is based on a standard neural network, hereafter referred to simply as a ‘neural network’, and the second is a Long-Short Term Memory (LSTM). All machine learning models were implemented in Python 3.8.6 using the Scikit-Learn library (for the neural network and the random forest) and the Keras library (for the LSTM). The input data is processed using Pandas 1.1.3 and Numpy 1.19.2, and the visualization tools are based on Matplotlib 3.3.2. The working environment is Jupyter Notebook on a Windows machine with a Core i7 CPU and 32 GB of RAM.

The parameters of the neural network are one hidden layer of 100 neurons, the LBFGS solver, and the identity activation function. A hyperparameter search for the random forest returned the default settings as the best configuration. A greedy hyperparameter tuning was performed for the LSTM, where the number of epochs ranged between 1 and 1000, the batch size was set to 1, 2, or 4, and the number of neurons was set to 1, 2, 4, or 10. The best parameters were 100 epochs, a batch size of two, and one neuron.
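Based on these settings, the three models could be instantiated as sketched below. The paper does not state the LSTM optimizer or output layer, so the Adam optimizer and the single dense output are assumptions:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from tensorflow import keras

n_lags, n_points = 3, 144

# Standard neural network: one hidden layer of 100 neurons,
# identity activation, trained with the LBFGS solver
nn = MLPRegressor(hidden_layer_sizes=(100,), activation="identity",
                  solver="lbfgs", max_iter=1000)

# Random forest with default hyperparameters (best in the tuning step)
rf = RandomForestRegressor(random_state=0)

# LSTM with one unit, trained for 100 epochs with a batch size of two
lstm = keras.Sequential([
    keras.layers.Input(shape=(n_lags, n_points)),
    keras.layers.LSTM(1),
    keras.layers.Dense(1),
])
lstm.compile(optimizer="adam", loss="mse")
# nn.fit(X_train, y_train); rf.fit(X_train, y_train)
# lstm.fit(X_train.reshape(-1, n_lags, n_points), y_train, epochs=100, batch_size=2)
```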

Finally, it should be mentioned that the neural network and LSTM models are more sensitive to data scaling than the random forest. The input data is standardized, and the same scaling is then applied to the inputs of the test set.
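With scikit-learn, this amounts to fitting the scaler on the training data only and reusing it on the test inputs:

```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler().fit(X_train)   # fit on the training data only
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)    # reuse the same scaling on the test set
```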

Data analysis

The first step in analyzing the collected data is to understand the problem and verify the data quality. This step involves visualizing and evaluating the relevance of the data, identifying outliers, and removing bad entries. Figure  6 presents the measured values for the hole-coordinate pairs H1-Y and H2-Z over the data collection period. It can be observed that the H2-Z measurements are more spread out than the H1-Y measurements. In contrast, the H1-Y measurements fall within a narrow range that varies over time without a distinctive trend (e.g., degradation over time). These variations could potentially be explained by changes in the production process, but without additional data, this cannot be confirmed. Furthermore, the dispersion of the values in Fig.  6 (especially for H1-Y) supports the idea of using lagged measurements for prediction purposes.

Figure  7 a illustrates the distribution of the measurements of the hole-coordinate pairs. Since showing all pairs in the same figure makes it difficult to read, we restrict the representation to five randomly selected variables with data properties similar to the other variables. Figure  7 a indicates that all variables share similar distribution properties. The median is almost equal to zero, and the interquartile ranges (IQRs), represented by the boxes, are very narrow, revealing that the measurements are concentrated within this area, especially for H1-Y, H1-Z, and H4-Y. In addition, Fig.  7 a shows the outliers for each pair, defined as the points outside the whiskers, at 1.5 IQR from the first and third quartiles. We can see that the majority of outliers are concentrated near the ends of the whiskers, and very few measurements are far from the rest of the data. Thus, the data available for this study is considered to be of good quality and does not require further preprocessing.

Fig. 7. Analysis of measurement data for a subset of reference holes and coordinates.

Furthermore, Fig.  7 b illustrates the distribution plots of the hole-coordinate pairs. For the sake of readability, we select only three representative pairs, as they have distributions similar to the other holes. The H2-Z plot shows a normal distribution that is symmetric and bell-shaped but slightly deviates from a zero mean; that is, this particular hole-coordinate pair can be approximated by a normal distribution. As for the H1-Y measurements, the distribution has a slight multimodal shape that can be smoothed to a normal distribution with high kurtosis, meaning that most observations have zero deviation. However, H5-Y has several peaks with different densities, which means that many measurements deviate slightly from zero.

Performance of the models

In this subsection, the performance of the neural network, LSTM, and random forest models is assessed from both a quantitative and a qualitative perspective. In addition, the models are compared to a standard autoregressive time series model.

A quantitative comparison

Three common performance metrics for regression problems are used to evaluate the predictive quality of the models. The first is the Mean Absolute Error (MAE), which shows the magnitude of the overall error between the observed and predicted values; because it averages absolute differences, positive and negative errors do not cancel out, and extreme forecast errors are not penalized more heavily. The second is the Mean Squared Error (MSE), which penalizes extreme values. A high MSE indicates a significant deviation between observed and predicted values, whereas a low value indicates that the predicted values are very close to the observations. Finally, the Root Mean Squared Error (RMSE) is the square root of the second sample moment of the residuals. Being scale-dependent, RMSE is used to compare the prediction errors of different models on the same data set and a particular variable. The definitions of these three metrics are given in Eqs. ( 3 ), ( 4 ), and ( 5 ):

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{3}$$

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \tag{4}$$

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{5}$$

where
  • $y_i$ is the observed target value,
  • $\hat{y}_i$ represents the predicted target value, and
  • $n$ is the number of observations.
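These metrics can be computed directly with scikit-learn; `y_test` and `y_pred` below stand for the observed and predicted values of one hole-coordinate pair:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)  # square root of the MSE
print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}")
```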

Figure  8 shows the MAE, MSE, and RMSE metrics for the random forest and the two neural network models. Except for the reference hole H2, all models provide reasonable predictions relative to the actual observations, i.e., the MAE ranges from 0.11 to 0.28 mm for the other holes. In particular, the predictions for hole H1 have the best metrics, while H2 has the worst prediction metrics for the Y and Z coordinates. Comparing the learning models, the LSTM provides the best performance for H2-Z, where the MAE is improved by 31% over the neural network. For the remaining hole-coordinate pairs, the average MAE is the same for all models, i.e., 0.16. However, the random forest and LSTM perform slightly better than the neural network in terms of MSE, i.e., 0.045 against 0.047.

Fig. 8. Comparison between the random forest, neural network, and LSTM models using the MAE, MSE, and RMSE metrics for the locations of all reference holes.

Figure  9 groups the hole-coordinate pairs that are located in the same direction and provides the MAE and MSE metrics of each direction by learning model. Together with the results shown in Fig.  8 , we observe that for holes in the YZ coordinates (H1, H2, and H3), all prediction errors in the Z coordinate are higher than the corresponding Y coordinate. The same pattern appears for holes in the XY coordinates (H4 and H5) where the prediction errors in the X direction are slightly higher than in the Y direction. Since the comparison by coordinate is consistent for all learning models, it can be concluded that the models are better suited to one direction than another. Furthermore, when considering the locations of the holes in the beam, it can be observed that the holes on the left side of the beam, i.e., H1 and H4, are better predicted than the holes on the right side of the beam, i.e., H2, H3, and H5.

Fig. 9. Comparison between the prediction models by grouping the holes located in the same coordinate (X, Y, or Z). The metrics are MAE and MSE.

The previous analysis is further extended to include the feature importance for the predicted hole-coordinate pairs in the Z direction. The metrics of H2-Z and H3-Z show poor predictions compared to H1-Z, despite all of them being located in the same direction. Therefore, it is worth exploring which variables are the most significant and which factors have the greatest impact on the prediction. This can be done by analyzing the average feature importance of the decision trees in the random forest model. Figure  10 shows the top 20 most important features for H1-Z, H2-Z, and H3-Z. It can be observed that a variety of variables appear in the feature importance ranking, including the previous measurements of the hole to be predicted, with the lag shown in parentheses, as well as bend measurements and measurements of other (non-reference) holes. In the label in front of a variable name, the letter T refers to the twist tolerance for a bend measurement. The same letter is also used for small holes and indicates the distance to a reference hole (a specific metric set by the manufacturer is used). Lastly, DF denotes the diameter of a milled hole.

Fig. 10. Top 20 most important features of the random forest model for the hole-coordinate pairs in the Z direction (H1-Z, H2-Z, and H3-Z).

Figure  10 shows that three factors affect the prediction quality. First, all predicted pairs depend strongly on their immediate previous measurement t - 1 and, to a lesser extent, on t - 2 and t - 3 . However, basing the predictions solely on previous measurements leads to poor prediction metrics, as can be seen for H2-Z. Second, a diverse range of information leads to better predictions. The feature importance of H1-Z shows that different sources of information are used: Bend 66 and Bend 67 are bend measurements near the location of H1, Bend 61 is in the middle of the beam, and Bend 20 and Bend 21 are on the other side of the beam. Third, having only low-importance values does not lead to good predictions (e.g., H3-Z), indicating that the learning model cannot identify the relevant features for a good prediction of the target variable. Overall, this analysis confirms the importance of considering all available information and three-lagged measurements for prediction purposes.
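With a fitted scikit-learn forest, the average feature importance analyzed here is available directly; `rf` and `X` below refer to the fitted model and the lagged-feature table from the earlier sketches:

```python
import pandas as pd

# Average impurity-based importance over all trees in the forest
importances = pd.Series(rf.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False).head(20))  # top 20 features
```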

A qualitative comparison

Figure  11 gives a qualitative comparison of the prediction models for the best-performing hole-coordinate pair, namely H1-Y. The bottom part of Fig.  11 plots the actual values and those predicted by the random forest, while the top part shows the predictions of the neural network and LSTM (since they belong to the same family of learning models) together with the actual values. It can be observed that, in general, the random forest provides conservative and smooth predictions. This is most notable for the first and last segments of observations, where the values predicted by the random forest always lie within the fluctuations of the actual measurements. The LSTM and neural network models, however, track the spikes better and generate predictions as high as the actual values. Except for the measurements around observation 100, the LSTM performs marginally better than the neural network. Overall, all H1-Y predictions can be considered of high quality, with good performance for both the random forest and the LSTM.

Fig. 11. Prediction performance of the neural network and LSTM (top) and random forest (bottom) models for the best-predicted coordinate, H1-Y.

In the second qualitative comparison, Fig.  12 illustrates the performance of the prediction models for the worst-performing hole-coordinate pair, namely H2-Z. It can be seen that there is a significant gap between the actual observations and the predicted values for all models. Except for the first 40 observations, the random forest and neural network models generate poor predictions compared to the actual values. For the interval between observations 40 and 270, the neural network attempts to follow the trend of the actual values without providing good predictions, while the random forest predicts deviations close to zero. However, the LSTM shows much better performance, as the actual values are tracked more closely. Indeed, over this segment of observations, the MAE of the LSTM is 0.24 against 0.36 and 0.30 for the neural network and random forest, respectively. This explains the better metrics obtained by the LSTM for H2-Z. From about observation 270, the deviation of the predicted values from the actual observations becomes increasingly significant for all models; in particular, the random forest generates predictions close to zero. We can conclude that this last segment is very peculiar: unknown changes were made to the production process, preventing the learning models from making good predictions. The limited performance is not due to limited learning capacity but rather to missing information not provided to the models. Indeed, as previously discussed and shown in Figs.  8 and 9 , the prediction is better in one direction than in another. Similarly, the holes located on the left side of the beam are better predicted than the holes located on the other side. This may be due to the clamping forces applied to the beam during the machining process, which are not available for this study. Another reason may be a variation in the upstream activity, for example, when the aluminum profiles are bent.

Fig. 12. Prediction performance of the neural network (top) and random forest (bottom) models for the worst-predicted coordinate, H2-Z.

Comparison with an autoregressive time series model

The performance of the machine learning models is compared to an autoregressive integrated moving average (ARIMA) model. The model was fitted to the time series data to predict future points in the series. We recall that ARIMA is a univariate model, which means that only previous data of a specific hole-coordinate pair is used to predict the next observations. The ARIMA model was implemented on top of the 'statsmodels' library, and the pre-built 'auto_arima' function (from the companion 'pmdarima' package) was called to identify the best values of the order parameters ( p ,  d ,  q ), where p , d , and q are the autoregression order, the degree of differencing, and the moving-average order, respectively. The best parameters differ from one hole to another; for example, the ARIMA(3,1,3) model is used for H1-Y, and ARIMA(2,0,4) is used for H2-Z.
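A sketch of this setup, with auto_arima from the pmdarima package (which wraps statsmodels) and hypothetical train/test series for one hole-coordinate pair:

```python
import numpy as np
import pmdarima as pm

# Fit on the training portion of one hole-coordinate series, e.g., H1-Y
model = pm.auto_arima(series_train, seasonal=False, stepwise=True)
print(model.order)  # e.g., (3, 1, 3) for H1-Y

# Static one-step-ahead prediction: forecast, then feed in the actual
# observation before moving to the next point
predictions = []
for actual in np.asarray(series_test):
    predictions.append(float(np.asarray(model.predict(n_periods=1))[0]))
    model.update([actual])
```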

The metrics of the ARIMA models are shown in Fig.  13 and compared to the random forest and LSTM models. To compute the MAE and MSE, a static prediction is used for ARIMA for all pairs, meaning that the fitted models employ the actual value of the lagged dependent variable for each prediction. It is clear that the learning models perform much better than the ARIMA models for all holes. This indicates that integrating other information, such as the bend measurements and the measurements of other holes, helps to achieve better predictions. The exception is H2-Z, where ARIMA performs better than the random forest and is comparable to the LSTM. This can be explained by the fact that ARIMA provides a very restrictive prediction that always stays within the bounds of the actual values, which helps to reduce the error, especially for the last observation segment of H2-Z. However, as shown in Fig.  12 , the random forest and LSTM perform poorly for this last segment, starting at observation 270. Despite this poor performance, the learning models provide important information for this particular segment: they indicate that unknown changes have been made to the production process. This information cannot be derived from the ARIMA results.

Fig. 13. A comparison between ARIMA, random forest, and LSTM using the MAE and MSE metrics.

Residual evaluation

The prediction quality of the proposed models can be further assessed by analyzing the residuals, which are the difference between the actual and predicted values. These residuals should be uncorrelated and normally distributed with zero mean (Kuhn and Johnson 2013 ). Figure  14 shows the histograms of residuals for the hole-coordinate pairs studied above, namely H2-Z and H1-Y, for all models. In addition to the worst- and best-predicted coordinates, the analysis includes H5-X and H4-X, which are ranked around 7th and 3rd positions in terms of MAE for all models considered.

Fig. 14. Residual normality test for the predictions of LSTM, neural network, and random forest for the worst and best MAE metrics, H2-Z and H1-Y, respectively. The comparison also includes H5-X and H4-X, ranked 7th and 3rd on the same MAE metric.

The histograms in Fig.  14 provide insight into the distribution of the residual errors and the prediction quality. In particular, the distribution of residuals for H2-Z (the worst-predicted coordinate) is non-Gaussian and positively skewed with high kurtosis (large tails) for the neural network and random forest. This confirms the poor prediction quality of these two models. However, the LSTM shows a better distribution of residuals for H2-Z, which confirms its better performance on this coordinate compared to the other two models. For the H1-Y coordinate, the distribution of residuals has a mean close to zero and low kurtosis, confirming the strong performance obtained for this coordinate. As for H5-X and H4-X, the residuals are close to a Gaussian distribution, especially for H5-X with the neural network. With the mean of the distributions for H5-X and H4-X being almost equal to zero for all models, it can be concluded that the residuals show no systematic bias. Overall, the prediction performance of the learning models is good for several hole-coordinate pairs.

Confidence bounds and TP-outlier detection

Figure  15 shows the predicted TP values for the hole H1 (discussed in Sect. 4 ), along with the corresponding 3 σ level (shown as an orange dashed line) and the TP limits set by the manufacturer (between 0.0 and 1.0). The predicted TP value is calculated according to Eq.  2 and is based on the coordinates H1-Y and H1-Z predicted by the random forest model. This figure aims to assess whether the predicted TP values are within statistical control and whether they can be used for quality control of the bumper beam. The validation set is used to compute the 3 σ control limits. As can be seen, the TP limits are stricter than the 3 σ level: only two outliers out of four are above the confidence bound. When the remaining holes are considered, a similar observation is obtained, i.e., the TP limits and the 3 σ level are almost the same. The only exception is the hole H4, for which the 3 σ level is stricter than the TP limit; this can be considered acceptable, as only one observation of H4 deviates from both the TP and 3 σ limits. In conclusion, the process variation is under control and subject to random factors.

Fig. 15. Predicted TP values for the hole H1 using the random forest model with a 3σ level. The TP limits for H1 are between 0.0 and 1.0 as set by the manufacturer.

In the last experiment, Figs.  16 , 17 , 18 , 19 and 20 compare the real and predicted TP values for holes H1–H5. The TP limits of each hole, as set by the manufacturer, are highlighted with a dashed green line. These figures show a difference between the actual and predicted TP values: the predictions cannot follow the fluctuations of the actual TP values. However, when it comes to outliers, the model can give some insight into when deviations might occur. For the hole with the best-predicted coordinates (H1, shown in Fig.  16 ), the predicted TP values are located in the same observation areas where actual TP deviations are observed. The first area, located around observations 210–213, is detected in advance by the prediction model, before observation 210. The second area records consecutive actual TP measures that exceed the upper limit; although the prediction model cannot detect all of these outliers, it places a prediction in this area. The last area includes only observation 344, which is perfectly detected by the prediction model with a predicted TP value similar to the actual one.

Fig. 16. Comparison between the predicted and true TP values for the hole H1. The random forest model is used to predict the coordinates of H1.

Fig. 17. Comparison between the predicted and true TP values for the hole H2. The random forest model is used to predict the coordinates of H2.

Fig. 18. Comparison between the predicted and true TP values for the hole H3. The random forest model is used to predict the coordinates of H3.

Fig. 19. Comparison between the predicted and true TP values for the hole H4. The random forest model is used to predict the coordinates of H4.

Fig. 20. Comparison between the predicted and true TP values for the hole H5. The random forest model is used to predict the coordinates of H5.

The same statement is also valid for H3, with the deviations of observations 226 and 272 being correctly reported by the model. Although the deviation of observation 45 is not detected, the predicted value of TP is very close to the upper limit, which may indicate that an early adjustment of the CNC machining settings should be made. As for H4, only one deviation is reported that is perfectly predicted by the model. While the results are satisfactory for the holes discussed above, the prediction of TP outliers for H5 is poor. The predictions of the X and Y coordinates of H5 are among the worst, which may explain the poor quality of the prediction of TP values for this hole. Finally, no conclusions can be drawn from the analysis of the H2 results because there are no outliers for this set of observations. However, it is unlikely that the prediction model would be able to identify outliers for H2, as the H2-Y and H2-Z coordinates are poorly predicted.

Furthermore, the accuracy of the learning model is evaluated using the True Positive Rate (TPR, or sensitivity) and the True Negative Rate (TNR, or specificity). The accuracy is defined as the ability of the learning model to predict actual TP-outliers. In this case, measuring the specificity is not relevant, as the number of actual TP values within limits is significantly higher than the number of actual outliers. Therefore, the focus is on the ability of the model to correctly predict actual TP-outliers. As the model can detect outliers in nearby areas, the definition of a true positive is expanded to include predicted outliers that actually occur within the next 24 h. A false positive is a predicted outlier for which no actual TP-outlier occurs during the next 24 h, and a false negative is an actual TP-outlier that is not predicted within the previous 24 h. Equations ( 6 ), ( 7 ), and ( 8 ) define the TPR, the False Negative Rate (FNR), and the Threat Score (TS), respectively:

$$TPR = \frac{TP}{TP + FN} \tag{6}$$

$$FNR = \frac{FN}{TP + FN} \tag{7}$$

$$TS = \frac{TP}{TP + FN + FP} \tag{8}$$
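A sketch of how these rates could be computed under the relaxed 24-hour matching rule; the exact matching procedure used by the authors may differ, so this is only an illustration:

```python
import pandas as pd

def outlier_rates(times, actual_out, predicted_out, window="24h"):
    """TPR, FNR, and TS under the relaxed rule: a predicted outlier counts
    as a hit if an actual TP-outlier occurs within `window` after it.
    `actual_out` and `predicted_out` are boolean arrays over the reports."""
    t = pd.DatetimeIndex(times)
    act, pred = t[actual_out], t[predicted_out]
    w, zero = pd.Timedelta(window), pd.Timedelta(0)

    tp = sum(any(zero <= a - p <= w for a in act) for p in pred)
    fp = len(pred) - tp  # no actual outlier follows within the window
    fn = sum(not any(zero <= a - p <= w for p in pred) for a in act)

    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fnr = fn / (tp + fn) if (tp + fn) else float("nan")
    ts = tp / (tp + fn + fp) if (tp + fn + fp) else float("nan")
    return tpr, fnr, ts
```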

Table 1 reports the number of outliers and the TPR , FNR , and TS rates for each hole. The performance for holes H4 and H5, while noteworthy, is not significant, since each is related to a single outlier that the model either detects or misses. For hole H1, Table 1 indicates low TPR and TS rates, which is a result of some predicted and actual TP-outliers being separated by more than 24 h. For instance, the predicted TP-outlier at observation 200 is separated by more than 24 h from the actual TP-outliers at observations 213–215, and similarly for the predicted TP-outlier at observation 272 and the actual outliers at subsequent observations. In the case of hole H3, the TP-outlier at observation 226 is correctly reported by the learning model; however, the next actual outlier, at observation 232, occurs after more than 24 h, explaining the relatively low TS score obtained for H3. Overall, Table 1 confirms that the learning model cannot accurately predict TP-outliers. Nonetheless, the predicted information can still be used by the manufacturer to make early adjustments.

Table 1. Accuracy of the learning model for detecting TP outliers within a 24-hour interval.

Conclusion

This paper deals with a prediction problem for quality control. The underlying problem is related to the automotive industry, and the product under study is the bumper beam, which is subject to stringent quality criteria. To support the quality control process of this product, we proposed machine learning models to predict the location of the reference holes of the next produced beam. The models are based on a time series consisting of the historical data set of previous measurements, which includes the beam characteristics. The learning models developed are a neural network, a long short-term memory network, and a random forest, all trained under similar conditions. The experimental study showed that the performance of all models is generally quite similar, with a slight dominance of the long short-term memory network and the random forest. The results also indicate that the prediction can be good for some hole-coordinate pairs. However, there are considerable discrepancies for other coordinates, where the predictions deviate significantly from the actual values. Since all models showed similar behavior, it can be concluded that the available information is not sufficient for prediction and that other sources should be included, such as process parameters or data from upstream activities.

This work shows that applying machine learning models to real-life problems is not as easy as it sounds and is hampered by several factors. Not all data is captured or made available to be used for other purposes. For example, in the context of this work, information about changes in CNC settings is volatile and cannot be retrieved later, limiting its use for learning purposes. This example also shows that the transition to Industry 4.0 is not a straightforward process and could be challenging in several areas.

Open access funding provided by NTNU Norwegian University of Science and Technology (incl St. Olavs Hospital - Trondheim University Hospital). This work was supported by the Research Council of Norway as part of the LeanDigital research project, number 295145.

Declarations

One of the co-authors of this manuscript is a member of the editorial board of Computational Management Science. The authors declare no conflict of interest regarding the publication of this paper.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Anders Risan and Peter Schütz contributed equally to this work.

Contributor Information

Mohamed Kais Msakni, Email: [email protected] .

Anders Risan, Email: andris@stud.ntnu.no .

Peter Schütz, Email: [email protected] .

References

  • Agrawal A, Goel S, Rashid WB, et al. Prediction of surface roughness during hard turning of AISI 4340 steel (69 HRC). Appl Soft Comput. 2015;30:279–286. doi:10.1016/j.asoc.2015.01.059.
  • Ayoobi N, Sharifrazi D, Alizadehsani R, et al. Time series forecasting of new cases and new deaths rate for COVID-19 using deep learning methods. Results Phys. 2021;27:104495.
  • Biau G, Scornet E. A random forest guided tour. Test. 2016;25(2):197–227. doi:10.1007/s11749-016-0481-7.
  • Breiman L. Random forests. Mach Learn. 2001;45(1):5–32. doi:10.1023/A:1010933404324.
  • Brockwell PJ, Davis RA. Introduction to time series and forecasting. Springer Texts in Statistics. Cham: Springer; 2016. pp. 73–96.
  • Bustillo A, Pimenov DY, Matuszewski M, et al. Using artificial intelligence models for the prediction of surface wear based on surface isotropy levels. Robot Comput Integr Manufact. 2018;53:215–227. doi:10.1016/j.rcim.2018.03.011.
  • Bustillo A, Pimenov DY, Mia M, et al. Machine-learning for automatic prediction of flatness deviation considering the wear of the face mill teeth. J Intell Manufact. 2021;32(3):895–912. doi:10.1007/s10845-020-01645-3.
  • Dogan A, Birant D. Machine learning and data mining in manufacturing. Expert Syst Appl. 2021;166:114060.
  • Freeman BS, Taylor G, Gharabaghi B, et al. Forecasting air quality time series using deep learning. J Air Waste Manag Assoc. 2018;68(8):866–886. doi:10.1080/10962247.2018.1459956.
  • Genuer R, Poggi JM. Random forests. London: Springer; 2020.
  • Géron A. Hands-on machine learning with Scikit-Learn, Keras and TensorFlow: concepts, tools, and techniques to build intelligent systems. O'Reilly Media; 2019.
  • Gokalp MO, Kayabay K, Akyol MA, et al. Big data for Industry 4.0: a conceptual framework. In: Proceedings of the 2016 International Conference on Computational Science and Computational Intelligence (CSCI 2016); 2017. pp. 431–434.
  • Goodfellow I, Bengio Y, Courville A. Deep learning. Adaptive Computation and Machine Learning. MIT Press; 2016.
  • Groover MP. Fundamentals of modern manufacturing: materials, processes, and systems. Wiley; 2019.
  • Ibarra D, Ganzarain J, Igartua JI. Business model innovation through Industry 4.0: a review. Procedia Manufact. 2018;22:4–10. doi:10.1016/j.promfg.2018.03.002.
  • James G, Witten D, Hastie T, et al. An introduction to statistical learning. Springer; 2013.
  • Karayel D. Prediction and control of surface roughness in CNC lathe using artificial neural network. J Mater Process Technol. 2009;209(7):3125–3137. doi:10.1016/j.jmatprotec.2008.07.023.
  • Ketkar N, Moolayil J. Feed-forward neural networks. In: Deep learning with Python. 2021. pp. 93–131.
  • Kim KH, Sohn MJ, Lee S, et al. Descriptive time series analysis for downtime prediction using the maintenance data of a medical linear accelerator. Appl Sci. 2022;12(11):5431. doi:10.3390/app12115431.
  • Kinyua P, Jouandeau N. Sample-label view transfer active learning for time series classification. In: International Conference on Artificial Neural Networks. Springer; 2021. pp. 600–611.
  • Kuhn M, Johnson K. Applied predictive modeling. New York: Springer; 2013.
  • Lee I, Lee K. The Internet of Things (IoT): applications, investments, and challenges for enterprises. Bus Horiz. 2015;58(4):431–440. doi:10.1016/j.bushor.2015.03.008.
  • Li Z, Zhang Z, Shi J, et al. Prediction of surface roughness in extrusion-based additive manufacturing with machine learning. Robot Comput Integr Manufact. 2019;57:488–495. doi:10.1016/j.rcim.2019.01.004.
  • Ma L, Wang M, Peng K. A novel bidirectional gated recurrent unit-based soft sensor modeling framework for quality prediction in manufacturing processes. IEEE Sens J. 2022;22(19):18610–18619. doi:10.1109/JSEN.2022.3199474.
  • Martin O, Lopez M, Martin F. Artificial neural networks for quality control by ultrasonic testing in resistance spot welding. J Mater Process Technol. 2007;183(2–3):226–233. doi:10.1016/j.jmatprotec.2006.10.011.
  • Meng Y, Xu M, Yoon S, et al. Flexible and high quality plant growth prediction with limited data. Front Plant Sci. 2022;13:989304. doi:10.3389/fpls.2022.989304.
  • Oakland RJ, Oakland JS. Statistical process control. Routledge; 2018.
  • Rauf HT, Lali M, Khan MA, et al. Time series forecasting of COVID-19 transmission in Asia Pacific countries using deep neural networks. Pers Ubiquitous Comput. 2021;20:1–18.
  • Rehmer A, Kroll A. On the vanishing and exploding gradient problem in gated recurrent units. IFAC-PapersOnLine. 2020;53(2):1243–1248. doi:10.1016/j.ifacol.2020.12.1342.
  • Risan A, Msakni MK, Schütz P. A neural network model for quality prediction in the automotive industry. In: IFIP Advances in Information and Communication Technology. Berlin: Springer; 2021. pp. 567–575.
  • Roblek V, Meško M, Krapež A. A complex view of Industry 4.0. SAGE Open. 2016;6(2):1–11. doi:10.1177/2158244016653987.
  • Santos C, Mehrsai A, Barros AC, et al. Towards Industry 4.0: an overview of European strategic roadmaps. Procedia Manufact. 2017;13:972–979. doi:10.1016/j.promfg.2017.09.093.
  • Shohan S, Hasan M, Starly B, et al. Investigating autoregressive and machine learning-based time series modeling with dielectric spectroscopy for predicting quality of biofabricated constructs. Manufact Lett. 2022;33:902–908. doi:10.1016/j.mfglet.2022.07.110.
  • Tao F, Cheng J, Qi Q, et al. Digital twin-driven product design, manufacturing and service with big data. Int J Adv Manufact Technol. 2017;94(9):3563–3576.
  • Tao F, Qi Q, Liu A, et al. Data-driven smart manufacturing. J Manufact Syst. 2018;48(C):157–169. doi:10.1016/j.jmsy.2018.01.006.
  • Tsai YH, Chen JC, Lou SJ. An in-process surface recognition system based on neural networks in end milling cutting operations. Int J Mach Tools Manufact. 1999;39(4):583–605. doi:10.1016/S0890-6955(98)00053-4.
  • Wang T, Chen Y, Qiao M, et al. A fast and robust convolutional neural network-based defect detection model in product quality control. Int J Adv Manufact Technol. 2018;94(9–12):3465–3471. doi:10.1007/s00170-017-0882-0.
  • Wang Y, Liu M, Zheng P, et al. A smart surface inspection system using faster R-CNN in cloud-edge computing environment. Adv Eng Informat. 2020;43:101037.
  • Wu D, Wei Y, Terpenny J. Surface roughness prediction in additive manufacturing using machine learning. In: ASME 13th International Manufacturing Science and Engineering Conference (MSEC 2018), vol. 3; 2018.
  • Wuest T, Weimer D, Irgens C, et al. Machine learning in manufacturing: advantages, challenges, and applications. Prod Manufact Res. 2016;4(1):23–45.
  • Xu X. From cloud computing to cloud manufacturing. Robot Comput Integr Manufact. 2012;28(1):75–86. doi:10.1016/j.rcim.2011.07.002.
  • Zhao D, Wang Y, Liang D, et al. Performances of regression model and artificial neural network in monitoring welding quality based on power signal. J Mater Res Technol. 2020;9(2):1231–1240. doi:10.1016/j.jmrt.2019.11.050.

Application of flipped classroom teaching method based on ADDIE concept in clinical teaching for neurology residents

Juan Zhang, Hong Chen, Xie Wang, Xiaofeng Huang & Daojun Xie

BMC Medical Education, volume 24, Article number: 366 (2024)


As an important medical personnel training system in China, standardized residency training plays an important role in enriching residents' clinical experience and improving their ability to communicate with patients as well as their clinical expertise. The difficulty of teaching neurology lies in the fact that there are many types of diseases, complicated conditions, and a high degree of specialization, which places higher demands on residents' independent learning ability, critical thinking, and learning outcomes. Based on the ADDIE (Analysis-Design-Development-Implementation-Evaluation) concept, this study combines the theory and clinical practice of the flipped classroom teaching method and evaluates the teaching effect, so as to provide a basis and reference for the future implementation of the flipped classroom in neurology residency training.

The participants of the study were 90 neurology residents in standardized training in our hospital in the classes of 2019 and 2020. The 90 residents were divided into a control group and an observation group of 45 cases each using the random number table method. The control group used traditional teaching methods, including problem-based learning (PBL), case-based learning (CBL), and lecture-based learning (LBL). The observation group adopted the flipped classroom teaching method based on the ADDIE teaching concept. A unified assessment of the residents' learning outcomes was conducted before they left the department in the fourth week, covering theoretical and skill knowledge, independent learning ability, critical thinking ability, and clinical practice ability. Finally, the overall quality of teaching was assessed.

The theoretical and clinical skills assessment scores achieved by the observation group were significantly higher than those of the control group ( P  < 0.001). The scores for independent learning ability and critical thinking ability of the observation group were also better than those of the control group, with statistically significant differences ( P  < 0.001). The observation group outperformed the control group on all Mini-CEX indicators ( P  < 0.05). In addition, the observation group reported better teaching quality than the control group ( P  < 0.001).

The flipped classroom teaching method based on the ADDIE concept can effectively improve the effect of standardized training for neurology residents and has a positive impact on residents' autonomous learning ability, critical thinking ability, theoretical knowledge, and comprehensive clinical ability.


Introduction

As an important medical education system, the standardized residency training system is of great significance in China's clinical medical training system [ 1 – 2 ]. In order to continuously improve the clinical medical talent training system and build a talent training system with clinical medical characteristics, China began to implement the standardized residency training system in 2014. Under the standardized clinical teaching plan, residents can achieve the requirements and objectives of the multidisciplinary training required for the primary professional title through rotational learning and clinical teaching evaluation across departments [ 3 ]. The implementation of the system not only greatly improves the professional ability of clinical medical staff but also effectively saves medical resources and costs. However, neurological diseases are relatively abstruse and complex, with many critical conditions and a high degree of specialization, which requires physicians to have strong autonomous learning ability, a rich knowledge reserve, and clinical emergency problem-solving ability.

The ADDIE model consists of five components: analysis, design, development, implementation, and evaluation [4]. The ADDIE teaching theory, as a new type of teaching theory, focuses on the needs and goals of the students. It allows the teacher to be the decision maker for learning [5], setting and developing the necessary learning steps and implementing them effectively by analysing the students' main learning objectives and taking their actual circumstances into account. Learning effectiveness is checked through appropriate clinical teaching practice sessions to assess whether the learning requirements have been met, which helps students deepen their understanding of the learning content. It not only improves the educator's ability to teach; most importantly, it also improves the effectiveness of the students' learning. The Gagné instructional design method consists of nine learning events, such as gaining attention, informing the learner of objectives, stimulating recall of prior learning, presenting the stimulus, and providing learning guidance [6]. Compared with the Gagné method, the ADDIE model has the advantages of simple steps and easy implementation, and it is often used in medical education design. Tobase et al. [7] used the ADDIE model to develop a basic life support course for adult cardiac arrest. Under the guidance of this theory, the course not only achieved technical innovation and systematisation in cardiopulmonary resuscitation education, but also had important positive significance for medical education. Maya et al. [8] used the ADDIE teaching concept to develop and implement a COVID-19 elective course for pediatric residents; as an effective teaching method, this course provided necessary disaster response and flexible education for pediatric residents. Therefore, the teaching concept plays an important role in medical education.

The flipped classroom [9] was first popularised in the United States, where advocates replaced the traditional in-class lecture format with at-home learning, and it has gradually been applied to medical education in recent years [10]. It differs from traditional teaching. As an emerging teaching mode, it advocates a student-centred approach: the teacher prepares teaching videos or materials on an online platform and distributes them uniformly, after which the students arrange their own study plan and time [11, 12]. This model is therefore not limited by time and place, and students can learn at their own pace according to their own situation. When encountering difficult points, students can watch the video repeatedly, interact and discuss with other students, or compile their questions and feed them back to the teacher for individual answers.

Therefore, a flipped classroom teaching method based on the ADDIE teaching concept can formulate and implement a learning and training plan suited to the clinical teaching needs of standardized neurology residency training and the actual situation at this stage. It encourages students to arrange their study time independently and gives them the initiative in learning, thereby overcoming the disadvantages of traditional medical teaching, such as tight classroom time, heavy workloads, and students' inability to study and think deeply. This has a positive effect on cultivating students' autonomous learning ability, forming critical thinking ability, and improving professional knowledge and comprehensive clinical ability. The Mini-CEX (mini clinical evaluation exercise) is considered an effective method for evaluating the clinical ability and teaching function of residents [13]. In this study, theoretical and technical knowledge, autonomous learning ability, and critical thinking ability were evaluated and scored, and the residents' comprehensive clinical ability was evaluated with the Mini-CEX, so as to provide a comprehensive and objective evaluation of the clinical teaching results. This study is an exploration of a medical clinical education mode, intended to provide a reference for the clinical teaching of standardized residency training.

Materials and methods

Study design

A prospective controlled experimental design was used in this study.

Participants

The participants were 90 residents of the classes of 2019 and 2020 participating in standardized residency training in the Department of Neurology of our hospital. The random number table method was used to divide the 90 residents into a control group and an observation group, with 45 residents in each. There were 21 males and 24 females in the control group, aged 23–28 (25.40 ± 2.78) years. The observation group consisted of 23 males and 22 females, aged 22–27 (24.37 ± 2.59) years. All subjects signed an informed consent form. Comparison of the two groups' general data showed no statistically significant differences (p > 0.05).

Training methods

Both groups of residents underwent a one-month standardized residency training in the Department of Neurology. During the training period, the instructors trained the residents according to the standardized residency training syllabus, which mainly included theoretical learning and skills operation. Teachers for the two groups were randomly assigned, and the quality of teaching was monitored by the department head.

Control group

This group adopted traditional teaching methods, including problem-based learning (PBL), case-based learning (CBL), and lecture-based learning (LBL). PBL is a problem-oriented teaching method in which students seek solutions around problems [14]. CBL is a case-based teaching method in which cases are designed according to the teaching objectives and the teacher takes the leading role in guiding students to think, analyse, and discuss [15]. LBL refers to the traditional lecture-based teaching method [16]. In the first week after enrollment, teachers conducted a unified entrance assessment, entrance education, and instruction in basic neurology knowledge. The second week was mainly based on the traditional LBL method, covering common diseases in the Department of Neurology, including ward rounds, bedside physical examination, analysis of auxiliary examinations, and proposing the diagnostic basis and treatment plan. In the third week, the CBL method was mainly used to consolidate the knowledge learned through case study. In the fourth week, the PBL method was mainly used to promote problem-based learning and knowledge understanding through asking and answering questions. Learning outcomes were evaluated before the residents left the department four weeks later. The detailed process is shown in Fig. 1.

Fig. 1 Flow chart of the resident training process for the two groups

Observation group

This group adopted the flipped classroom teaching method based on the ADDIE teaching concept. The training content of the first week was the same as that of the control group. From the second to the fourth week, the flipped classroom teaching method based on the ADDIE teaching concept was adopted, for a total of 38 class hours. By analysing the content of the syllabus and the actual situation of the subjects, we designed, developed, and implemented a distinctive and targeted teaching programme, and conducted a unified assessment of learning outcomes before the residents left the department in the fourth week. The concrete programme is shown in Table 1.

Step 1: composition of the teaching team

The teaching team comprised a department head, 10 neurology lead teachers, and two non-neurology ADDIE specialists. The department head is responsible for overseeing the overall quality of teaching, and the lead teachers are responsible for the teaching and learning of all students and the assessment of their outcomes. The ADDIE specialists integrate the ADDIE concepts into the clinical learning curriculum plan of the standardised residency training according to the specific arrangement and actual situation of the curriculum.

Step 2: setting of teaching objectives

The teaching objectives of standardised training for neurology residents mainly include the following: (1) to understand and master common neurological diseases and their diagnosis and treatment processes, such as migraine, tension headache, benign paroxysmal positional vertigo, peripheral facial palsy, Parkinson's disease, posterior circulation ischemia, cerebral infarction, cerebral haemorrhage, subarachnoid haemorrhage, and epilepsy; (2) to understand and master systematic neurological physical examination methods; (3) to perform skilled operations related to neurological diseases proficiently, including lumbar puncture; (4) to be familiar with the management of common neurological emergencies, including acute-phase cerebral infarction, acute-phase cerebral haemorrhage, and status epilepticus; and (5) to improve the resident's ability to communicate and collaborate with the team, communicate with patients, and deal with emergency problems as they arise.

Step 3: concrete teaching plan

With the unanimous agreement and sustained efforts of the teaching team, the curriculum and methodology for the standardised training of residents in the flipped classroom based on the ADDIE teaching concept was finalised. The teaching plan was carried out in five steps, as shown in Table 1.

Step 4: implementation of flipped classroom teaching method based on ADDIE teaching philosophy

Project analysis

The final teaching task of this training mainly included two aspects: (1) to complete all the teaching objectives set out above; (2) to improve the residents' comprehensive clinical ability in the process. Before the start of training, a questionnaire was used to make an initial assessment of the residents' baseline knowledge of the neurology specialty, which helped us understand the students' current learning situation and tailor the teaching accordingly. At the same time, the main teaching tasks and objectives were combined to analyse the specific form and content of the project, so as to develop a more practical and targeted programme.

Project design

The specific content of the project mainly included: (1) Admission assessment: after admission to the department, all residents underwent a unified admission assessment and received instruction in basic neurology knowledge. (2) Flipped classroom teaching: before class, the lead teacher analysed and organised the common neurological diseases and their diagnosis and treatment processes by disease type, based on the requirements of the syllabus, prepared a teaching plan, and covered one disease type at a time. Teachers posted teaching resources, including slides, videos, cases, and literature, to the social platform. At the same time, they set out the content and requirements to be mastered and posed 3–5 questions for students to think about, in line with the focus of the teaching. Students could arrange their own study time, form groups, and hold group discussions to try to solve the problems; they could also ask the teaching staff questions through the social platform at any time. Students could go to the library or search the relevant literature on the Internet to expand their knowledge. In this session, knowledge transfer was completed. (3) Bedside practice teaching: the teacher communicated with the patient in advance so that the students could conduct bedside history-taking, physical examination, and analysis of auxiliary examinations. The students proposed the diagnosis and its basis, with the teacher observing and assisting throughout.

Project development

After the theoretical learning and practical teaching, the teacher asked targeted questions, pointing out what the students had done well and what needed improvement in questioning and treating patients. At the same time, specific learning tasks were assigned to different students. Students were encouraged to report to the teacher on the patient's condition and treatment plan and to propose their own treatment ideas. They were also able to raise with the teacher any questions or problems they could not solve during the consultation. This teaching method is of great significance for students' mastery of the theoretical knowledge of diseases and the cultivation of their clinical thinking.

Project implementation

Following the teaching team's specific and detailed teaching programme, methods such as the entrance examination, the flipped classroom teaching method, bedside practical teaching, and special case discussion were adopted. When encountering problems, students took the initiative to consult the literature or solve the problems independently through group discussion. If a problem could not be solved, the students sought help from the teachers. This practised the students' independent learning, teamwork, and clinical diagnostic thinking.

Programme assessment

Students were assessed on their theoretical and professional skills knowledge at the end of the programme. Their independent learning ability, critical thinking ability, and clinical practice ability were assessed using the relevant assessment instruments, and finally the overall teaching quality was assessed, after which the teacher commented on and summarised the assessment results.

Observation indicators

Theory and skill knowledge assessment

This assessment included two parts: theory and skill operation. The theoretical assessment mainly covered the basic knowledge of neurology and the diagnosis, treatment processes, and medication of common neurological diseases. The skill operation assessment involved lumbar puncture, thoracentesis, abdominal puncture, cardiopulmonary resuscitation, and other required items. The theory and skill operation parts were each worth 50 points, for a total of 100 points. Unified assessment and grading were conducted by the teachers.

Self-directed learning ability assessment scale

After the fourth week of training, the self-directed learning ability assessment scale [17] was used to assess residents' self-directed learning ability. Its main contents are self-motivation beliefs and objective behaviour. Self-motivation beliefs comprise self-motivation (5 items) and learning beliefs (3 items). Objective behaviour covers four aspects: setting learning goals and plans (4 items), self-monitoring and adjustment (7 items), obtaining and processing information (4 items), and communication and cooperation ability (7 items). A 5-level Likert response system [18] is used, with the levels "completely non-compliant", "basically non-compliant", "average", "basically compliant", and "completely compliant" scored 1, 2, 3, 4, and 5 points respectively, for a total score of 150 points. Higher scores indicate stronger autonomous learning ability. The Cronbach's alpha coefficient was 0.929, the split-half reliability was 0.892, and the content validity index was 0.970, indicating that the scale has good internal consistency, reliability, and validity.
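To make the arithmetic concrete, the sketch below totals the scale in Python under the structure just described; the subscale keys are illustrative names, and the validation logic is an assumption rather than part of the published instrument.

```python
# Minimal scoring sketch for the 30-item self-directed learning scale described
# above (assumed structure; subscale names are illustrative). Each item is
# answered on the 5-point Likert scale, so the total ranges from 30 to 150.

SUBSCALES = {
    "self_motivation": 5,
    "learning_beliefs": 3,
    "goals_and_plans": 4,
    "self_monitoring_adjustment": 7,
    "information_processing": 4,
    "communication_cooperation": 7,
}

def total_score(responses: dict) -> int:
    """Validate and sum all item responses (maximum 30 items x 5 points = 150)."""
    for name, items in responses.items():
        if len(items) != SUBSCALES[name]:
            raise ValueError(f"{name}: expected {SUBSCALES[name]} items")
        if any(not 1 <= r <= 5 for r in items):
            raise ValueError(f"{name}: responses must be integers from 1 to 5")
    return sum(sum(items) for items in responses.values())
```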

Critical thinking skills assessment scale

The critical thinking skills assessment scale [19], which consists of seven dimensions, including truth-seeking, open-mindedness, analytical ability, and systematisation, with 10 items per dimension, was used for the assessment at the end of the fourth week of training. A 6-point scale was used, ranging from "strongly disagree" to "strongly agree" and scored 1 to 6, with reverse scoring for negatively worded items. The total score ranges from 70 to 420, where ≤ 210 indicates negative performance, 211–279 neutral performance, 280–349 positive performance, and ≥ 350 strong critical thinking skills. The Cronbach's alpha coefficient was 0.90, the content validity index was 0.89, and the reliability was 0.90, indicating good internal consistency, reliability, and validity.
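As a worked illustration of the reverse scoring and score bands just described, here is a small Python sketch; which item numbers are negatively worded depends on the instrument itself, so NEGATIVE_ITEMS below is a hypothetical placeholder.

```python
# Illustrative scoring for the 70-item critical thinking scale: positively
# worded items keep their 1-6 response, negatively worded items are reversed
# (a response r becomes 7 - r). NEGATIVE_ITEMS is a made-up placeholder set.

NEGATIVE_ITEMS = {3, 12, 27}  # hypothetical indices of negatively worded items

def score_item(index: int, response: int) -> int:
    if not 1 <= response <= 6:
        raise ValueError("responses must be on the 1-6 scale")
    return 7 - response if index in NEGATIVE_ITEMS else response

def interpret(total: int) -> str:
    """Map a total score (70-420) to the bands given in the text."""
    if total <= 210:
        return "negative performance"
    if total <= 279:
        return "neutral performance"
    if total <= 349:
        return "positive performance"
    return "strong critical thinking skills"
```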

Clinical practice competence assessment

Clinical practice competence was assessed at the end of the fourth week of training using the Mini-CEX scale [20], which covers seven aspects: medical interview, physical examination, humanistic care, clinical diagnosis, communication skills, organisational effectiveness, and overall performance. Each aspect is rated from 1 to 9: 1 to 3 is "unqualified", 4 to 6 "qualified", and 7 to 9 "excellent". The Cronbach's alpha coefficient of the scale was 0.780 and the split-half reliability coefficient was 0.842, indicating relatively high internal consistency and reliability.

Teaching quality assessment

Teaching quality was assessed at the end of the fourth week of training using the teaching quality assessment scale [21]. It covers five aspects: teaching attitude, teaching method, teaching content, teaching characteristics, and teaching effect. A 5-point Likert scale was used, with higher ratings indicating better teaching quality. The Cronbach's alpha coefficient was 0.85 and the reliability was 0.83, showing good reliability and validity.

Data analysis

SPSS 23.0 statistical software was used to analyse the data. Measurement data were expressed as mean ± standard deviation (x̄ ± s), and the t-test was used for comparisons between groups. Unordered categorical data were compared between the two groups using the χ² test or Fisher's exact test. A p-value < 0.05 was considered statistically significant.
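The authors analysed the data in SPSS 23.0; purely as an illustration of the same tests, the sketch below runs an independent-samples t-test, a chi-square test, and Fisher's exact test in Python with scipy, on made-up placeholder numbers rather than the study's data.

```python
# Sketch of the statistical comparisons described above, using scipy instead of
# SPSS (illustration only). All numbers are placeholders, not study results.
import numpy as np
from scipy import stats

control = np.array([72.0, 68.5, 75.0, 70.5, 69.0])      # placeholder scores
observation = np.array([81.0, 79.5, 84.0, 80.0, 82.5])  # placeholder scores

# Independent-samples t-test for measurement data (mean ± SD between groups)
t_stat, p_value = stats.ttest_ind(observation, control)

# Chi-square test for unordered categorical data, e.g. a 2x2 table of
# qualified/unqualified counts per group (placeholder counts)
table = np.array([[40, 5], [32, 13]])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# Fisher's exact test when expected counts are too small for chi-square
odds_ratio, p_fisher = stats.fisher_exact(table)

print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print(f"chi2 = {chi2:.2f}, p = {p_chi2:.4f}; Fisher p = {p_fisher:.4f}")
```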

Results

The scores and statistical analyses of the two groups' theory and skill assessments, self-directed learning ability assessment, and critical thinking ability assessment are shown in Table 2. The Mini-CEX assessment results and statistical analysis are shown in Table 3. The teaching quality assessment results and statistical analysis are shown in Table 4.

Discussion

The standardised training of residents is an important medical personnel training system in China. It is a key link in the training of high-quality residents, requiring clinicians to have not only solid clinical expertise but also a noble medical character, so as to better serve patients in outpatient and inpatient work. In recent years, with the continuous development of China's economy, people's demand for health care has also been increasing. Neurological diseases are diverse, and certain conditions such as acute cerebrovascular disease, epilepsy, central nervous system infections, acute disseminated encephalomyelitis, and Guillain-Barré syndrome have an acute onset and change rapidly, requiring neurology residents to accurately identify and manage neurological emergencies and serious illnesses at an early stage. This places higher demands on the basic qualities of neurology residents and brings more challenges to the clinical teaching of standardised neurology residency training. The traditional teaching methods can therefore no longer meet the teaching requirements of the new situation and new policies. Only by continuously improving and innovating clinical teaching methods and raising the quality of teaching can the professional quality and training of residents be improved [22].

This study found that, after four weeks of teaching, the theoretical and clinical skills assessment scores of the observation group were significantly higher than those of the control group (P < 0.001). The observation group's scores for autonomous learning ability and critical thinking ability were also better than the control group's, with statistically significant differences (P < 0.001). On the Mini-CEX assessment, the observation group scored better than the control group in medical interview and physical examination (P < 0.01) and in humanistic care, clinical diagnosis, communication skills, organisational effectiveness, and overall performance (P < 0.05). In addition, the observation group rated the quality of teaching higher than the control group did (P < 0.001). Previous studies have shown that the ADDIE concept can be applied to the design of clinical ethics education programmes and can be an effective tool for healthcare education, providing an established structure for developing educational programmes [23]. Saeidnia et al. [24] used the ADDIE model to develop and design an educational application for COVID-19 self-prevention and self-care, helping people learn self-care skills at home during isolation; to some extent it can serve as an effective tool against COVID-19. To reduce postoperative complications of breast cancer, Aydin et al. [25] designed and developed a mobile application supporting self-care after breast cancer surgery based on the ADDIE model; it provides professional medical guidance and advice for postoperative patients and is widely used in both education and clinical settings. Therefore, the ADDIE model has not only achieved good outcomes in the design of medical education, but has also played a positive role in disease prevention guidance and postoperative care.

As a flexible, targeted, and effective new teaching method, the flipped classroom has been studied by many scholars in basic medical and clinical education. Paul et al. [26] found that the flipped classroom was more effective for teaching clinical skills by comparing flipped and online-only course delivery. Du et al. [27] found that a fully online flipped classroom increased classroom participation and student-faculty interaction in distance education and improved overall medical student exam pass rates during the COVID-19 pandemic, with better teaching and learning outcomes. Sierra-Fernández et al. [28] found that the flipped classroom achieved better teaching and learning outcomes in a cardiology residency training programme, with higher acceptance among participants and teachers and improved physician assessment scores compared with traditional and virtual teaching methods. Meanwhile, the Mini-CEX was used in this study to assess the residents' overall clinical competence. As a formative assessment, it can not only provide a more accurate and comprehensive evaluation of physicians' comprehensive clinical competence, but also effectively promote their learning and growth [29, 30]. The objective structured clinical examination (OSCE), a method of evaluating students' comprehensive clinical ability, understanding, and application by simulating clinical scenarios, is widely used in pre-internship training of undergraduates' professional clinical practice skills [31]. Compared with the OSCE, the Mini-CEX is not limited by site and time, and it is time-saving, simple, and comprehensive; it can evaluate students' comprehensive clinical ability more systematically and thoroughly [32, 33]. The Mini-CEX was therefore selected as the main clinical evaluation method in this study. Khalafi et al. [34] found that using the Mini-CEX as a formative assessment method significantly improved the clinical skills of nurse anaesthesia students. Shafqat et al. [35] adopted the Mini-CEX as a direct observation tool to assess its effectiveness and feasibility in an undergraduate medical curriculum, finding that it was effective in measuring student competence, improving medical students' clinical and diagnostic skills, and enhancing teacher-student interaction.

This study found that, using the ADDIE concept combined with the flipped classroom teaching method, residents' autonomous learning ability, critical thinking ability, theoretical knowledge, and comprehensive clinical ability all improved. Possible reasons are as follows. ADDIE, as a comprehensive instructional design concept for medical teaching, comprises five dimensions: analysis, design, development, implementation, and evaluation. First, it systematically analyses the specific clinical teaching needs and combines them with the students' current situation; on this basis it flexibly sets the teaching plan, especially with the flipped classroom method, and emphasises a student-centred approach, which is quite different from the teacher-centred concept of traditional teaching. This method encourages students to study independently in their spare time using the text and video materials distributed through the teaching platform, meeting each student's individual needs. At the same time, students actively explore the questions raised by teachers and encountered in practice, which not only stimulates their interest in learning but also greatly improves their autonomous learning and independent thinking. Furthermore, students' collaborative discussion of problems and the teachers' in-depth explanations promoted the formation of critical thinking, improved learning outcomes and classroom efficiency, and enhanced the students' comprehensive clinical ability.

Limitations and recommendations

Although this study achieved some clinical teaching value, it still has several shortcomings. First, the limited number of residency trainees resulted in an insufficient sample size, which may affect the results. Second, due to the limitations of the residency training syllabus and policy, the training in this study lasted only one month; in fact, specialty knowledge training and talent development often require more time. Third, the study used only the Mini-CEX to assess the residents' comprehensive clinical competence, so the choice of scales in this area is relatively homogeneous, which may affect how well the assessment reflects reality. In the future, we will therefore expand the sample size, allow more reasonable and sufficient time for teaching, training, and knowledge assimilation, and use multiple scales for in-depth assessment of various aspects, with a view to obtaining more reliable and persuasive results that can inform the teaching of specialised clinical medicine.

Conclusion

Applying the ADDIE concept combined with the flipped classroom teaching method in residency training, this study found that, compared with traditional teaching, the new approach can effectively improve neurology residents' autonomous learning ability, critical thinking ability, theoretical knowledge, and comprehensive clinical ability, with better teaching quality. In clinical medical education, we should actively embrace modern teaching ideas and, on the basis of traditional teaching, integrate new concepts and methods and give full play to the advantages of different teaching methods, so as to continuously improve teaching efficiency and quality.

Data availability

The datasets used and/or analysed in this study are available from the corresponding author upon reasonable request.

References

1. Hongxing L, Yan S, Lianshuang Z, et al. The practice of professional degree postgraduate education of clinical medicine under the background of the reform of the medical education cooperation. Contin Med Educ. 2018;32(12):16–8.


2. Shilin F, Chang G, Guanlin L, et al. The investigation of the training model in four-in-one professional Master's degree of medicine. China Contin Med Educ. 2018;10(35):34–7.

3. Man Z, Dayu S, Lulu Z, et al. Study on the evaluation index system of clinical instructors by clinical professional postgraduates under the dual-track system mode. Med Educ Res Prac. 2018;26(6):957–61.

4. Boling E, Easterling WV, Hardré PL, Howard CD, Roman TA. ADDIE: perspectives in transition. Educational Technology. 2011;51(5):34–8.

5. Hsu T, Lee-Hsieh J, Turton MA, Cheng S. Using the ADDIE model to develop online continuing education courses on caring for nurses in Taiwan. J Contin Educ Nurs. 2014;45(3):124–31. https://doi.org/10.3928/00220124-20140219-04

6. Woo WH. Using Gagné's instructional model in phlebotomy education. Adv Med Educ Pract. 2016;7:511–6. https://doi.org/10.2147/AMEP.S1103


7. Tobase L, Peres HHC, Almeida DM, Tomazini EAS, Ramos MB, Polastri TF. Instructional design in the development of an online course on basic life support. Rev Esc Enferm USP. 2018;51:e03288. https://doi.org/10.1590/S1980-220X2016043303288

8. MS, Lo CB, Scherzer DJ, et al. The COVID-19 elective for pediatric residents: learning about systems-based practice during a pandemic. Cureus. 2021;13(2):e13085. https://doi.org/10.7759/cureus.13085

9. Pierce R, Fox J. Vodcasts and active-learning exercises in a flipped classroom model of a renal pharmacotherapy module. Am J Pharm Educ. 2012;76(10):196.

10. Bergmann J, Sams A. Remixing chemistry class. Learn Lead Technol. 2008;36(4):24–7.

11. Mehta NB, Hull AL, Young JB, Stoller JK. Just imagine: new paradigms for medical education. Acad Med. 2013;88(10):1418–23.

12. Ramnanan CJ, Pound LD. Advances in medical education and practice: student perceptions of the flipped classroom. Adv Med Educ Pract. 2017;8:63–73.

13. Norcini JJ, Blank LL, Duffy FD, Fortna GS. The mini-CEX: a method for assessing clinical skills. Ann Intern Med. 2003;138:476–81. https://doi.org/10.7326/0003-4819-138-6-200303180-00012

14. Zhang J, Xie DJ, Bao YC, et al. The application of goal setting theory combined with PBL teaching mode in clinical teaching of neurology. J Clin Chin Med. 2017;29(06):946–8. https://doi.org/10.16448/j.cjtcm.2017.0316 (Chinese).

15. Zhang J, Xie DJ, Huang XF, et al. The application of PAD combined with CBL teaching method in neurology teaching. Chin Med Records. 2023;24(06):98–101. (Chinese).

16. Liu CX, Ouyang WW, Wang XW, Chen D, Jiang ZL. Comparing hybrid problem-based and lecture learning (PBL + LBL) with LBL pedagogy on clinical curriculum learning for medical students in China: a meta-analysis of randomized controlled trials. Med (Baltim). 2020;99(16):e19687. https://doi.org/10.1097/MD.0000000000019687

17. Wang Xiaodan T, Gangqin W, Suzhen, et al. Construction of the self-study ability assessment scale for medical students. Chin J Health Psychol. 2014;22(7):1034–7. (Chinese).

18. Fang B. Analysis of the influencing factors on the effectiveness of the Likert rating scale survey results. J Shiyan Vocat Tech Coll. 2009;22(2):25–8. (Chinese).

19. Meici P, Guocheng W, Jile C, et al. Research on the reliability and validity of the critical thinking ability measurement scale. Chin J Nurs. 2004;39(9):7–10.

20. Yusuf L, Ahmed A, Yasmin R. Educational impact of mini-clinical evaluation exercise: a game changer. Pak J Med Sci. 2018;34(2):405–11.

21. Zhou Tong X, Ling W, Dongmei, et al. The application of the teaching model based on CDIO concept in the practice teaching of cardiovascular nursing. Chin Gen Med. 2022;(09):1569–72.

22. Li Q, Shuguang L. Analysis and reflection on the standardized training of 24-hour responsible physicians. China Health Manage. 2016;33(5):374–6. (Chinese).

23. Kim S, Choi S, Seo M, Kim DR, Lee K. Designing a clinical ethics education program for nurses based on the ADDIE model. Res Theory Nurs Pract. 2020;34(3):205–22. https://doi.org/10.1891/RTNP-D-19-00135

24. Saeidnia HR, Kozak M, Ausloos M, et al. Development of a mobile app for self-care against COVID-19 using the analysis, design, development, implementation, and evaluation (ADDIE) model: methodological study. JMIR Form Res. 2022;6(9):e39718. https://doi.org/10.2196/39718

25. Aydin A, Gürsoy A, Karal H. Mobile care app development process: using the ADDIE model to manage symptoms after breast cancer surgery (step 1). Discov Oncol. 2023;14(1):63. https://doi.org/10.1007/s12672-023-00676-5

26. Paul A, Leung D, Salas RME, et al. Comparative effectiveness study of flipped classroom versus online-only instruction of clinical reasoning for medical students. Med Educ Online. 2023;28(1):2142358. https://doi.org/10.1080/10872981.2022.2142358

27. Du J, Chen X, Wang T, Zhao J, Li K. The effectiveness of the fully online flipped classroom for nursing undergraduates during the COVID-19: historical control study. Nurs Open. 2023;10(8):5766–76. https://doi.org/10.1002/nop2.1757

28. Sierra-Fernández CR, Alejandra HD, Trevethan-Cravioto SA, Azar-Manzur FJ, Mauricio LM, Garnica-Geronimo LR. Flipped learning as an educational model in a cardiology residency program. BMC Med Educ. 2023;23(1):510. https://doi.org/10.1186/s12909-023-04439-2

29. Jamenis SC, Pharande S, Potnis S, Kapoor P. Use of mini clinical evaluation exercise as a tool to assess the orthodontic postgraduate students. J Indian Orthod Soc. 2020;54(1):39–43.

30. Devaprasad PS. Introduction of mini clinical evaluation exercise as an assessment tool for M.B.B.S. interns in the Department of Orthopaedics. Indian J Orthop. 2023;57(5):714–7. https://doi.org/10.1007/s43465-023-00866-x

31. Hatala R, Marr S, Cuncic C, et al. Modification of an OSCE format to enhance patient continuity in a high-stakes assessment of clinical performance. BMC Med Educ. 2011;11:23.

32. Niu L, Mei Y, Xu X, et al. A novel strategy combining Mini-CEX and OSCE to assess standardized training of professional postgraduates in department of prosthodontics. BMC Med Educ. 2022;22(1):888. https://doi.org/10.1186/s12909-022-03956-w

33. Lörwald AC, Lahner FM, Mooser B, et al. Influences on the implementation of Mini-CEX and DOPS for postgraduate medical trainees' learning: a grounded theory study. Med Teach. 2019;41(4):448–56. https://doi.org/10.1080/0142159X.2018.1497784

34. Khalafi A, Sharbatdar Y, Khajeali N, Haghighizadeh MH, Vaziri M. Improvement of the clinical skills of nurse anesthesia students using mini-clinical evaluation exercises in Iran: a randomized controlled study. J Educ Eval Health Prof. 2023;20:12. https://doi.org/10.3352/jeehp.2023.20.12

35. Shafqat S, Tejani I, Ali M, Tariq H, Sabzwari S. Feasibility and effectiveness of mini-clinical evaluation exercise (Mini-CEX) in an undergraduate medical program: a study from Pakistan. Cureus. 2022;14(9):e29563. https://doi.org/10.7759/cureus.29563


Acknowledgements

The authors would like to thank all the faculty members of the Department of Neurology of the First Affiliated Hospital of Anhui University of Traditional Chinese Medicine for their support of the clinical teaching programme for standardized residency training.

Funding

This study was funded by the National Natural Science Foundation of China (Grant No. 82274493) and the Scientific Research Project of Higher Education Institutions in Anhui Province (Grant No. 2023AH050791).

Author information

Authors and affiliations

Department of Neurology, The First Affiliated Hospital of Anhui University of Traditional Chinese Medicine, 117 Meishan Road, Hefei, Anhui, China

Juan Zhang, Xiaofeng Huang & Daojun Xie

The First Clinical Medical College of Anhui University of Chinese Medicine, Hefei, China

Hong Chen & Xie Wang


Contributions

JZ wrote the manuscript. JZ and HC collected the data. HC, XW, and XH obtained and analysed the data. DX revised the manuscript for intellectual content. JZ confirmed the authenticity of all original data. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Juan Zhang.

Ethics declarations

Ethical approval and consent to participate

All procedures performed in the study involving human participants were in accordance with institutional and/or national research council ethical standards and in accordance with the 1964 Declaration of Helsinki and its subsequent amendments or similar ethical standards. All participants signed an informed consent form. All experimental protocols were approved by the Ethics Committee of the First Affiliated Hospital of Anhui University of Traditional Chinese Medicine.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Zhang, J., Chen, H., Wang, X. et al. Application of flipped classroom teaching method based on ADDIE concept in clinical teaching for neurology residents. BMC Med Educ 24, 366 (2024). https://doi.org/10.1186/s12909-024-05343-z


Received: 26 September 2023

Accepted: 23 March 2024

Published: 03 April 2024

DOI: https://doi.org/10.1186/s12909-024-05343-z


Keywords

  • ADDIE teaching model
  • Flipped classroom
  • Standardized training for residents



  • Case report
  • Open access
  • Published: 06 April 2024

Patients' perspectives on buprenorphine subcutaneous implant: a case series

  • Claudio Pierlorenzi 1 ,
  • Marco Nunzi 1 ,
  • Sabino Cirulli 1 ,
  • Giovanni Francesco Maria Direnzo 2 ,
  • Lucia Curatella 1 ,
  • Sandra Liberatori 2 ,
  • Annalisa Pascucci 2 ,
  • Edoardo Petrone 3 ,
  • Generoso Ventre 2 ,
  • Concettina Varango 4 ,
  • Maria Luisa Pulito 4 ,
  • Antonella Varango 4 ,
  • Cosimo Dandolo 4 ,
  • Brunella Occupati 5 ,
  • Roberta Marenzi 6 &
  • Claudio Leonardi 7  

Journal of Medical Case Reports volume 18, Article number: 202 (2024)


Background

Considering the enormous burden represented by opioid use disorder (OUD), it is important, when implementing opioid agonist therapy (OAT), always to consider the potential impact on the patient's adherence, quality of life, and detoxification. The purpose of this study is therefore to evaluate how the introduction of a novel OAT approach influences these key factors in the management of OUD.

Case presentation

This article marks the pioneering use of OAT through the buprenorphine implant in Europe and delves into the experience of six patients diagnosed with OUD at a relatively young age. The patients, male and female, are of Caucasian Italian and African Italian (case 4) ancestry and range in age from 23 to 63, with an average drug abuse history of 19 ± 12 years. All patients were on stable traditional OAT before transitioning to buprenorphine implants. Despite their heterogeneous social and educational backgrounds, health status, and histories of drug abuse initiation, the case series shows consistent positive treatment outcomes, such as detoxification and the absence of withdrawal symptoms and side effects. Notably, all patients reported a newfound sense of freedom and improved quality of life.

Conclusions

These results emphasise the promising impact of OAT via buprenorphine implants in enhancing the well-being and quality of life in the context of OUD.


Introduction

Opioid use disorder (OUD) is a chronic, relapsing condition affecting over 16 million people worldwide [1, 2]. International guidelines recommend opioid agonist therapy (OAT) with sublingual buprenorphine or methadone as the first-line treatment of opioid dependence [3]. However, the rates of oral OAT misuse, abuse, and diversion are of public concern because of their social, sanitary, and economic repercussions [4, 5]. Patient adherence to oral OAT also remains a challenge.

Little research has been carried out on strategies to support long-term remission from opioid dependence [6, 7]. An implantable formulation of buprenorphine has been developed to address problems with adherence, diversion, and non-medical use [6]. The rod-shaped implant consists of a mixture of a polymeric ethylene vinyl acetate matrix and buprenorphine that, following an initial pulse release, delivers a constant and stable medication level over 6 months after a single procedure [8].

The buprenorphine implant has shown its effectiveness in placebo-controlled studies [6, 8, 9], displaying a significant reduction in opioid abuse (percentage of opioid-negative urine samples: 36% in the implant group vs. 14.4% with placebo) and a higher percentage of participants completing the study [8]. Compared with standard sublingual buprenorphine or buprenorphine/naloxone tablets, the implant showed comparable efficacy and adverse event rates [6, 8]. A systematic benefit-risk assessment, based on a semiquantitative analysis of the available data, found a favourable profile for the buprenorphine implant in comparison with sublingual buprenorphine [10]. The main benefits identified included improved compliance and convenience, reduced risk of illicit opioid abuse, better quality of life, and lower risk of misuse/diversion. The risks, on the other hand, were mostly associated with the insertion and removal procedure. The benefits outweighed the risks [10], and long-acting buprenorphine implants appear to sustain long-term remission in patients suffering from OUD [10].

This article describes a series of patients with OUD who received OAT through the buprenorphine implant, marking the pioneering cases at the European level. Each case report provides a comprehensive narrative encompassing the patient's history and clinical progression, from the initiation of drug abuse to the outcomes achieved with the buprenorphine implant (in terms of detoxification, absence of withdrawal symptoms and side effects, and improved quality of life).

Case report 1

Clinical case description

The patient is a 54-year-old male of Caucasian Italian ancestry with lower secondary education. The youngest of five children, he experienced the tragic loss of a brother at the age of 17 due to an accident, and his father passed away 38 years ago from gastric haemorrhage. His mother, who is still alive and in fair health, works from home as a seamstress. The patient lives with his mother but often sleeps away from home because of work. He was in a romantic relationship, including cohabitation, which lasted a few years; at its conclusion, he returned to live with his mother, stating: "I was not in the right state of mind… Who wants to be with me? I'm never at home… and then I'm fine like this". He worked as a welder for a brief period. At the age of 18, he started working as a courier and then as a truck driver for third parties, constantly moving around Italy. He continues to work as a truck driver, now on his own account.

Medical history

The patient reports having contracted common childhood exanthems and undergoing a splenectomy due to a car accident in his 20s, followed by hemotransfusions. In 1986, he was diagnosed with chronic HCV hepatitis (it is unclear whether it was related to drug addiction), classified as G4, F2-related, and was treated with glecaprevir/pibrentasvir.

Toxicological history

At the age of 23, the patient began his journey with drugs by abusing intravenous heroin, cocaine, and alcohol (the latter moderately). He was referred in April 1991, based on Article 75, to the Addiction Service of Lodi by the Prefecture of Piacenza. Two months later, he started OAT with a daily dose of 50 mg of methadone. From the age of 23 to 27, the patient exhibited very oppositional behaviour: he lied, was provocative, and was sometimes aggressive and threatening. During that period, he began and interrupted several therapeutic programmes.

Traditional opioid agonist therapy

In July 2004, the first contact with our Addiction Service occurred. The patient began therapy with sublingual buprenorphine at 8 mg/day with planned reductions, but he never completed the taper. In this regard, a 2013 entry in the clinical diary reads: "He is not able to disengage from buprenorphine despite remaining abstinent from drugs for some months". The patient continued with sublingual buprenorphine 2 mg/day until May 2018, at which point he transitioned to a dosage of 2 mg every other day. He maintained this regimen until June 2022.

Buprenorphine implant

In May 2022, we proposed subcutaneous buprenorphine implant treatment to the patient, as he appeared to match the profile of the ideal candidate. He showed immediate interest and accepted. The selection was based on his consistent use of 2 mg sublingual buprenorphine every other day over the years, prolonged negative drug tests, frequent business-related travel as a truck driver, and the logistical challenge of attending the Addiction Service every weekend (which also involved transfers to various Services). Furthermore, the patient expressed a desire to avoid encounters with other users at the Addiction Service with whom he no longer wished to share experiences.

In August 2022, the implant surgery was performed.

Follow-up visits

Throughout the six months of treatment, the patient attended several visits, including monthly and sometimes fortnightly follow-ups. A urine toxicology check was performed every two weeks, consistently yielding negative results. The patient did not encounter any issues with the implant site on his arm and found the treatment easy to live with. He reported a notable absence of the fluctuations ("spikes") experienced with tablet intake, a diminished taste for cigarettes, and a complete lack of cravings for drugs. He expressed satisfaction with his choice but recommended the buprenorphine implant primarily for individuals aiming to cease the use of drugs of abuse; in his view, the implant may seem "a bit light" and more suitable for those seeking complete abstinence than for those intending to remain on agonist therapy. The patient did not have interviews with the psychologist due to work-related commitments.

The organisation and management of the patient's surgery proceeded smoothly. The patient was consistently monitored through visits, urine tests, and phone calls, especially during his business travels. His psychophysical condition was consistently good, and he also observed a stabilisation of his nightly rest. In February 2023, the patient had the device removed after the 6-month period, expressing great satisfaction with the experience. Subsequently, he did not encounter any issues and did not require buprenorphine/naloxone. In fact, he conveyed the intention to forgo a second implant and further OAT because he felt well. During the months with the implant, he successfully distanced himself from addiction after many years.

Case report 2

The patient is a 63-year-old man of Caucasian Italian ancestry who underwent treatment at the Medical Toxicology Department in Florence. He is a former addict, having maintained abstinence for over 30 years from heroin and methadone. After an extended period of traditional OAT with sublingual buprenorphine, he consistently expressed his desire to discontinue this treatment. Subsequently, the patient was presented with the option of a buprenorphine implant, which he accepted with the goal of achieving detoxification as the dosage in the subcutaneous implant is depleted by the end of the 6th month.

The patient's family history includes a hypertensive mother who died in 2010at the age of 86, a father who died at the age of 89, and an older sister in apparent good health. Throughout his life, the patient has experienced chronic hypoxia, maintained a low body mass index (BMI), and displayed regular diuresis and bowel function. Employed as an office worker, he grapples with insomnia and smokes approximately 15 cigarettes daily. Since the 1990s, the patient has tested positive for Hepatitis C (HCV). In 2008, he was diagnosed with renal heteroplasia on the right side, necessitating surgical exeresis. In 2010, a fracture of the right distal condyle of the femur occurred, prompting surgical intervention. From 2017 onward, the patient has been under the surveillance of the Systemic Manifestations of Hepatitis Virus Centre (MASVE), where he was diagnosed with cryoglobulinemia. Successful HCV eradication measures were undertaken.

The patient began illicit drug abuse in 1978 at the age of 19, with heroin being the primary substance of abuse. Of note, around the age of 30, the patient underwent a period of community day care. Concomitantly, he has consistently used and continues to use cannabinoids. Currently, the patient has been abstinent from heroin use for about 30 years.

From 1982 to 2007, the patient received treatment at the Medical Toxicology Department of the regional reference centre with methadone for heroin use disorder. Subsequently, he underwent OAT with sublingual buprenorphine until October 2022 (Table  1 ), at which point he transitioned to buprenorphine implant therapy.

Psychological aspects prior to buprenorphine implant

The patient exhibits compensated histrionic traits without psychosocial relapse. He is also characterised by an anxious temperament but maintains a stable, euthymic mood [8]. His acceptance of this treatment stems from the desire for increased freedom, as it eliminates the need for frequent visits to the facility for sublingual buprenorphine, with the ultimate goal of achieving definitive detoxification.

At the time of implantation, the patient was on 8 mg sublingual buprenorphine agonist therapy. The patient underwent subcutaneous implant surgery in October 2022. The implantation was performed at the Vascular Access Centre Unit, Department of Anaesthesia and Resuscitation AOUC (for a comprehensive outline of the procedure, please refer to Additional file 1 : Appendix SI). Except for the initial days when the patient experienced mild withdrawal symptoms and a minor infection at the implant site, promptly addressed with antibiotics, the patient expressed overall satisfaction and happiness with the decision made.

Follow-up visits

The patient engaged in numerous follow-up visits, during which evaluations were performed to assess both physical and psychological outcomes. The Clinical Opiate Withdrawal Scale (COWS) score was employed throughout these visits to monitor the patient's withdrawal symptoms and general well-being (Table  2 ). The COWS categorical score ranges are defined as follows: no withdrawal (0–4), mild (5–12), moderate (13–24), moderately severe (25–36), and severe withdrawal (> 36) [ 11 , 12 ].
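Purely as an illustration of these categorical ranges, and not part of the case report's clinical workflow, a small scoring helper might look like this in Python:

```python
# Minimal helper mapping a COWS total to the categorical ranges quoted above;
# purely illustrative and assumes an integer total score.
def cows_category(score: int) -> str:
    if score < 0:
        raise ValueError("COWS scores are non-negative")
    if score <= 4:
        return "no withdrawal"
    if score <= 12:
        return "mild"
    if score <= 24:
        return "moderate"
    if score <= 36:
        return "moderately severe"
    return "severe withdrawal"

assert cows_category(3) == "no withdrawal"
assert cows_category(14) == "moderate"
```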

The removal of the implant, initially planned for no later than 7 months after insertion, was delayed by a few months at the patient's request. The patient underwent monitoring of buprenorphine blood levels, which showed a slow decline in values while maintaining excellent toxicological compensation. The removal procedure was scheduled for July 17, 2023, at the Vascular Access Centre Unit of the AOUC, but it was unsuccessful. After 2 h, the removal intervention was interrupted, and the patient was referred for ultrasound and MRI examination, which visualised the implants in the subfascial space within the biceps brachii muscle of the left arm rather than in the subcutaneous space. Following a thorough orthopaedic consultation, it was decided to forgo surgical intervention given the patient's asymptomatic clinical presentation; instead, the plan is to monitor progress through semi-annual follow-ups. As of now, no complications have been identified.

The patient consistently reported minimal withdrawal symptoms and no significant cravings throughout the follow-up period, with excellent toxicological compensation. Furthermore, he expressed overall satisfaction with the subcutaneous implant, emphasising its positive impact on mood, anxiety levels, and sleep patterns. Despite the initial challenges in the removal procedure, the patient's clinical presentation remains asymptomatic, contributing to the overall success of the buprenorphine implant treatment.

Case report 3

The patient, a 55-year-old woman of Caucasian Italian ancestry, was admitted to a psychiatric clinic in 2012 with a diagnosis of “depressive syndrome in a patient suffering from bipolar disorder, diffuse polyarthralgias and resumption of alcoholism”. She has been consistently under the care of a trusted psychiatrist since then.

Her primary substance of abuse was heroin until the late 1990s, followed by the development of alcohol use disorder. Alcohol consumption persisted over the years with long periods of remission and brief relapses mainly in a binge-like manner. Due to her history, the patient had been actively engaging with the Alcohol Centre and participating in self-help groups. She has been abstinent from alcohol consumption since 2021 and from heroin for over 20 years.

Since 2004, the patient has been undergoing OAT with buprenorphine (Table 3), and during this period she has also been receiving stable, concurrent psychopharmacological therapy. The patient had repeatedly expressed interest in discontinuing OAT, so at the end of 2022 she was offered the option of a buprenorphine implant. The proposed plan involved using the implant for either 6 or 12 months, contingent on the patient's decision to pursue or decline a second implant at the end of the initial period. This approach aimed to facilitate the detoxification process. At the time of the decision, the patient was well compensated from a psychiatric and toxicological point of view.

The patient underwent subcutaneous implant surgery in February 2023. For the detailed procedure, please refer to Additional file 1 : Appendix SII. The patient did not show any signs of withdrawal or overdose in the days following implantation.

Psychological aspects following the buprenorphine implant

Generally, the patient considers herself satisfied and happy with the choice made. Moreover, the World Health Organization Quality of Life – BREF (WHOQOL-BREF), a self-report questionnaire assessing quality of life [ 13 ], was administered to the patient. Her assessment yielded the following scores: physical health = 21 (scale range: 7–35), psychological health = 23 (scale range: 6–30), social relationships = 10 (scale range: 3–15), and environment = 27 (scale range: 8–40).
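For readers comparing domains with different ranges, the sketch below linearly rescales the reported raw scores to a common 0–100 scale; note that this generic rescaling is only an illustration and not necessarily the official WHOQOL-BREF scoring algorithm.

```python
# Illustrative normalisation of the WHOQOL-BREF domain scores quoted above.
# The raw scores and ranges are those reported for this patient; the linear
# rescaling to 0-100 is a generic transformation for comparison purposes.
DOMAINS = {
    # name: (raw score, scale minimum, scale maximum)
    "physical health": (21, 7, 35),
    "psychological health": (23, 6, 30),
    "social relationships": (10, 3, 15),
    "environment": (27, 8, 40),
}

for name, (raw, low, high) in DOMAINS.items():
    pct = 100 * (raw - low) / (high - low)
    print(f"{name}: {raw} -> {pct:.0f}/100")
```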

During the follow-up visits, the patient’s physical and psychological state were assessed, and the COWS score was employed to evaluate withdrawal symptoms and general well-being (Table  4 ).

Throughout the observation period, the patient displayed overall well-being. However, as the removal procedure approached, she experienced mild anxiety, which was successfully managed with low doses of sublingual buprenorphine. The clinician notes that the patient's overall progress indicates a positive response to the buprenorphine implant treatment, showcasing effective control over withdrawal symptoms and cravings. The patient herself expresses satisfaction with her experience.

Case report 4

The patient, a 53-year-old male of African Italian ancestry, reported that his initial exposure to drugs, particularly THC, occurred around the age of 12. Subsequently, following the dissolution of his marriage, he had encounters with cocaine and later opioids, leading to the development of addiction.

After a period spent abroad, the patient returned to Italy in 2002 and sought treatment from various Addiction Services, where he began treatment with methadone. Approximately four years ago he transitioned to OAT with sublingual buprenorphine. Upon admission to our Service in May 2022, his therapy consisted of sublingual buprenorphine 6 mg + sublingual naloxone 1.5 mg per day.

During the meetings, the patient consistently demonstrated willingness and motivation. While his speech was fairly fluent, there were occasional interruptions attributed to difficulties in recalling certain phases of his life history. He expressed himself spontaneously, without prompting, showing reflexivity and an ability to contextualise. Adequate introspection and the absence of emotional blocks related to traumatic experiences were evident. The patient exhibited an internal locus of control and an overall sense of self-efficacy. From a behavioural point of view, within the service and with the clinical staff, he showed good adherence to the indications given and to the scheduled appointments, and good general compliance.

Due to pharmacological stability for over 5 years and drug use restricted to cannabinoids, the patient was deemed eligible for the buprenorphine implant, meeting the psychosocial inclusion criteria. Following the proposal, he exhibited heightened curiosity about the implant, experiencing a sense of "euphoria" in anticipation of this novel experience. His interest increased during the presentation of the implant procedure, which he quickly accepted. The impetus to accept the proposal stemmed from some of the patient's reflections, especially regarding the potential for a lifestyle change and the reclamation of "his time", envisioning more opportunities for hobbies, family, and travel. Moreover, he imagined the recovery and achievement of life goals linked both to everyday life and to the possibility of planning without "personal" constraints of time and organisation. Eventually, some reflections "almost of tiredness" emerged, referring both to the regular visits to the Addiction Service and to the interactions with other service users. This weariness stemmed from the perceived hindrance of traditional OAT, seen as a substantial impediment to daily freedom because of the commitment the therapy requires, and extended to the challenges of achieving personal and life goals.

Since this was the first buprenorphine implant carried out at our facility, it was necessary to draw up a procedure, and have it approved by the Health Management. This protocol encompassed the establishment of a dedicated outpatient file and the provision of a specialized room, serving both for the surgical procedure and for consultations with prospective candidates, some of whom were referred from other Addiction Services.

The patient exhibited a comprehensive shift in mood, a heightened inclination toward openness with others, and a rejuvenated approach to life planning. Following the implant procedure, the patient demonstrated improved speech fluency attributed to heightened introspective abilities. He identified the socio-affective dimension as the most significant element in the initiated change, leading to increased stability on the affective level. This translated into a newfound capacity to navigate relationships with more meaningful and secure emotional grounding. Moreover, the initial days following the implant marked a shift in self-perception and in how the patient was perceived by others. The awareness of the significant impact of the intervention on his life became apparent, bringing about a rediscovery of energy, an enhanced "esprit de vivre", and a transformation in interpersonal relations with the Addiction Service staff. Overall, a newfound optimism and fortitude were evident.

During the post-implant interviews, the patient completed a patient-reported outcome (PRO) measure using a visual analog scale (VAS) to capture the severity and other aspects of craving. A VAS measure usually requires participants to indicate their response by marking a point on a 100-mm line, with the extremities represented by 0 as "no craving for heroin" and 100 as "absolute craving for heroin" [12, 14]. At follow-up the patient reported a "lack of craving" in terms of both intensity and frequency, and he also denied the possibility of starting drug use in the event of experiencing craving. Throughout the course of treatment, the patient underwent weekly visits during the first month, followed by fortnightly visits in the second month, eventually transitioning to monthly visits. Toxicological tests were conducted during these visits to monitor the patient's progress, and no additional sublingual buprenorphine tablets or other drugs were necessary. Out of 11 toxicological tests carried out on urine samples, 2 were negative for all substances sought; all other tests were positive for cannabinoids, consistent with the patient's reported reduced daily use of THC before going to sleep.
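Since the VAS is central to the craving assessments reported here, the following minimal sketch shows how such a rating reduces to a single number: the distance, in millimetres, of the patient's mark from the 0 ("no craving") anchor [12, 14]. The class name and validation are illustrative assumptions, not part of any clinical software used by the authors.

```python
# Minimal sketch of scoring the craving VAS described above: the patient
# marks a 100-mm line anchored at 0 ("no craving for heroin") and
# 100 ("absolute craving for heroin"); the score is the measured distance
# of the mark from the left anchor, in millimetres.

from dataclasses import dataclass

@dataclass
class VasRating:
    mark_mm: float  # distance of the patient's mark from the 0 anchor

    def __post_init__(self):
        if not 0 <= self.mark_mm <= 100:
            raise ValueError("mark must lie on the 100-mm line")

    @property
    def score(self) -> float:
        return self.mark_mm  # 1 mm corresponds to 1 point on the 0-100 scale

print(VasRating(mark_mm=0).score)  # a "lack of craving" report -> 0.0
```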

From the outset, the patient expressed a reluctance to pursue a second implant, although he did not entirely rule out the possibility. As a result, the decision was made to defer the removal of the implant, allowing for close follow-up to monitor any changes; if needed, oral therapy could be resumed while awaiting placement of a potential second implant. In line with the patient's preferences and his agreement with healthcare providers, the implant remained in place beyond the initially planned sixth month. This extension allowed the patient additional time to contemplate the option of a second implant while urine buprenorphine screening, toxicological monitoring, and regular interviews continued. The removal was originally scheduled for the end of the seventh month but, owing to the patient's unavailability for serious personal reasons, was postponed by two weeks. As of today, the removal procedure has been successfully performed, and the patient exhibits a complete absence of craving and no desire to use substances. During the last interview the patient reported: "Every day I feel better!"

Overall, the patient has experienced significant improvements in mood, interpersonal openness, and life planning. Additionally, there appears to be a reduction in THC use. Remarkably, even after the removal of the implant, the patient has not reported any cravings related to substance use.

Case report 5

The patient, a 40-year-old male of Caucasian Italian ancestry, a university graduate currently in permanent employment, initiated drug experimentation around the age of 20. In this period, he became fascinated with and started attending rave parties, leading to gradual experimentation (reported as "controlled") with illicit drugs, including Afghan opium, and eventually to the development of an addiction disorder.

The patient's toxicological history indicated a pattern of polyaddiction, involving the use of cannabinoids, particularly hashish, since the age of 18. At the age of 21, he began attending rave parties, engaging in simultaneous and occasional consumption of various drugs such as cocaine, MDMA, amphetamines, LSD, and ketamine. Subsequently, after approximately two years, the patient transitioned from regular opium use to heroin.

He continued his substance use until the age of 25, at which point he initiated treatment with sublingual buprenorphine at an Addiction Service. Upon admission, he had a diagnosis of OUD in protracted remission under treatment with partial-agonist OAT (buprenorphine in combination with naloxone), and concomitant depression. Throughout the course of treatment, the patient maintained a steady intake of buprenorphine/naloxone sublingual tablets at a fixed dosage of 2 mg/0.5 mg per day. Since his initial admission, he consistently reported challenges in discontinuing OAT: he was able to refrain from the medication for a few days (up to a maximum of 4), but with the onset of anxiety and intensified craving for buprenorphine he would resume his daily 2 mg intake. Since initiating OAT, the patient reported abstinence from opium and heroin use. Despite a stable clinical picture, the recurrent unsuccessful attempts to discontinue OAT prompted consideration of transitioning from sublingual to subcutaneous therapy. In August 2022, during a toxicology interview, the possibility of buprenorphine implant therapy was proposed to the patient.

In conjunction with the pharmacological aspect of the new therapy, the patient concurrently received treatment with specific antidepressants. Additionally, he has actively participated in individual psychotherapy for a duration of two years and is presently engaged in group psychotherapy. The patient promptly made himself available and demonstrated willing adherence to the instructions provided by the medical staff, consistently attending his scheduled appointments. Notably, he exhibited overall good mentalisation and fair self-esteem.

The patient initially exhibited moderate curiosity during the first interview introducing the buprenorphine implant. However, his interest in the proposed treatment escalated swiftly. This interest and curiosity stimulated thoughts about the prospect of embarking on a new lifestyle. Throughout the interviews, he conveyed that he embraced the proposal due to tiredness from the constant mood swings induced by traditional OAT, which required daily visits to the facility. These factors, coupled with other personal considerations, amplified his discomfort with commitment, hindering the overall pursuit of life goals.

After establishing a dedicated room at our facility, the patient was directed to the Addiction Service, where an external doctor from the hospital conducted the implant surgery. Following the surgery, we maintained continuous monitoring through both group psychotherapy and individual therapy sessions.

Throughout the course of treatment, the patient initially underwent weekly visits during the first month, followed by fortnightly visits in the second and third months, eventually transitioning to monthly visits. During these regular check-ups, the patient underwent toxicological controls, and notably, no additional sublingual buprenorphine tablets were required.

After the implant procedure, the patient experienced mood stabilization, which he described as surprisingly positive. This positive change was openly shared by the patient within the therapy group. He demonstrated introspective ability, albeit stereotyped, aligning with the ideological and social models of his peer group. Following the implant there appeared to be a recognition of subjective aspects that he had not previously explored, potentially serving as a foundation for renewed self-awareness. Moreover, the patient exhibited rich and articulate language, along with good introspective and self-reflective ability, fair insight, and a proficient recall of his life history. Shortly after the implant, he conveyed his sense of liberation in an email, stating: "…I am a free man…" . In a group session, he elaborated on this feeling, expressing that he now perceives himself as "like everyone else," no longer dependent on the daily tablet, and experiencing mood fluctuations akin to any other individual.

The patient has been undergoing treatment for several months and reported only experiencing a headache in the initial days following the implant. Toxicological controls indicate positivity only for THC, as the patient has consistently used cannabinoids by smoking a "joint" in the evening to relax before going to sleep, with no intention of discontinuing this habit.

In post-implant interviews, the patient underwent the VAS test and reported a "lack of craving" both in terms of intensity and frequency. Furthermore, he expressed no inclination to initiate drug use in the event of experiencing cravings.

From the outset, the patient made it clear that he had no intention of pursuing a second implant. Although he does not rule out the possibility entirely, his hope is to attain complete liberation from OAT and, more broadly, from drugs; this suggests a reasonably sound capacity for judgment on his part. Hence, the decision was made to defer the removal of the initial implant, exploiting the gradual decline in drug release and assessing how best to support the patient on his journey towards detoxification.

The patient appears to be progressing well on the detoxification path, as evidenced by his expressed intention to refrain from further OAT after the removal of the implant. The patient's determination is a crucial factor in the success of the detoxification process. The absence of craving after the removal of the implant, along with the noted mood stabilization and positive treatment perception reported by the patient, are significant indicators contributing to the success of the patient's detoxification journey.

Case report 6

Filippo (fictitious name), a 23-year-old male of Caucasian Italian ancestry, reflects on his childhood, describing it as “normal”. His father is portrayed as a diligent worker, while his mother is characterized as a pragmatic and less sentimental woman. As an intelligent child, Filippo sensed the weight of the expectations his mother had placed on him. During the transition to middle school, he experienced a loss in friendships, became apathetic, distracted, and spent most of his time playing video games, rarely venturing outside. However, there was an improvement in his social life and academic performance during high school, which led Filippo to enrol in university, where he also initiated a romantic relationship with a girl.

In the summer of 2018, following his first year of university, Filippo started experiencing anxiety disorders, making it challenging for him to cope with his exams. Simultaneously, he found out that his girlfriend was using heroin and cocaine. In response, he decided to experiment with these substances. Initially, his usage was occasional and seemed "manageable", but Filippo rapidly developed both physical and mental addiction. Furthermore, he began using cocaine to counteract the effects of heroin. His drug abuse progressively escalated from occasional to daily, extending beyond social contexts to solitary moments. Filippo found himself trapped in a vicious cycle, marked by a constant need to soothe himself and promptly reactivate. His academic performance suffered, and financial resources were increasingly diverted towards substance abuse. Recognizing the severity of the situation, Filippo sought help from a psychiatrist-psychotherapist, who advised him to approach an Addiction Service. Although Filippo was not fully convinced, he perceived that seeking help was his only viable option. When he shared his predicament with his family, their initial response was a mix of anger and concern. However, that single conversation remained an isolated instance, and subsequently, they seemed to adopt an approach of denial, choosing not to acknowledge the reality of Filippo's struggles.

Filippo initiated his treatment at the Addiction Service in January 2020 with a dosage of 2 mg of sublingual buprenorphine. This regimen was subsequently increased to 4 mg after a few weeks. Notably, Filippo demonstrated commendable adherence to the treatment regimen, attending interviews regularly and concurrently engaging in private psychotherapy. He ceased his heroin use immediately after commencing OAT, and he also managed to discontinue cocaine, with only a few relapses in October 2020. Subsequently, Filippo experienced improvements in mood, school performance, and social interactions. However, his main concern revolved around the prospect of discontinuing the daily tablet intake.

In January 2022, after Filippo's previous doctor departed from the service, I had a clinical interview with Filippo. During this meeting, I suggested a questionnaire to assess the current state of his therapy and his interest in transitioning to newly available drug formulations. Filippo embraced the idea of transitioning to a subcutaneous buprenorphine implant with enthusiasm. Despite considering the possibility of balancing his personal life with regular visits to the Addiction Service, he expressed a keen interest in the new treatment. At that point, he had been on a 4 mg sublingual buprenorphine tablet regimen for approximately two years, and his toxicological tests consistently showed negative results for illicit drugs.

Filippo's excitement stemmed from several profound considerations: the weariness of identifying himself as an addict, a label he felt no longer accurately portrayed his current state; the conscious desire to disengage from the daily ritual of medication, which he defined as a "substitute" for his previous heroin use, so that taking it felt to him like still "getting high" every day; and the wish to regain control over his daily routine without being tethered to the demands of therapy, envisioning a future where he could plan vacations and travel abroad without the constraints of regular visits. Lastly, Filippo held a hopeful anticipation of achieving a definitive conclusion to his therapy, marking a significant milestone in his journey towards recovery.

Despite receiving comprehensive information from the data sheet, Filippo's determination to pursue the subcutaneous buprenorphine implant treatment remained unwavering. He maintained a steadfast commitment to this choice, eagerly anticipating further details about the practicality and feasibility of undergoing the implant procedure. The Addiction Service practitioners collaborated closely with the hospital pharmacy and the Palliative Care operating unit to organize the day of the surgery efficiently. On the morning of the surgery, Filippo exhibited no signs of agitation: he adhered to the given directions, refrained from taking the morning sublingual buprenorphine tablet, and, without experiencing any withdrawal symptoms, maintained focus on the day's objective. The surgical procedure proceeded smoothly, lasting approximately an hour, after which Filippo went on to attend his university activities.

In the days following the surgery, Filippo reported a sustained, almost heightened sense of well-being, exceptional concentration (especially in his studies), and an energy level he had not experienced before. While there may have been a brief, two-day period resembling a hypomanic phase, Filippo soon returned to a stable and regular state of well-being, seamlessly resuming his daily activities. Despite being aware of the option to supplement the implant with buprenorphine tablets, Filippo never felt the necessity to do so. In agreement with the department director, we limited Filippo's visits to the Ser.D to the bare minimum needed to perform the monitoring required by the implant protocol. These included urine tests at various intervals post-intervention: 1 week, 2 weeks, 1.5 months, 3 months, 4.5 months, 6 months, and 7 months. During these visits, we assessed his overall health, general well-being, reactions at the implantation site, degree of patient satisfaction, and any withdrawal or craving symptoms, along with potential drug abuse.
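The monitoring calendar just described is straightforward to derive programmatically. The sketch below generates the urine-test dates from the reported intervals, approximating a month as 30 days; both that approximation and the example implant date are assumptions of the sketch, not details taken from the protocol.

```python
# Minimal sketch: generating the post-implant urine-test dates from the
# monitoring intervals reported for Filippo (1 and 2 weeks, then 1.5, 3,
# 4.5, 6, and 7 months after the intervention). A month is approximated
# as 30 days; this is an assumption of the sketch, not of the protocol.

from datetime import date, timedelta

INTERVALS_DAYS = [7, 14, 45, 90, 135, 180, 210]

def follow_up_dates(implant_day: date) -> list[date]:
    """Return the scheduled urine-test dates for a given implant date."""
    return [implant_day + timedelta(days=d) for d in INTERVALS_DAYS]

for visit in follow_up_dates(date(2022, 4, 6)):  # hypothetical implant date
    print(visit.isoformat())
```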

In our regular phone interviews with Filippo, he would describe positive events that were taking place in his life. Approximately four months post-intervention, during an in-person interview, we delved into the impact of the implant on Filippo's lifestyle. A significant transformation was evident: his self-perception had undergone a complete shift. During the six-month period of the implant, Filippo encountered an emotional reconnection with his mother when he shared his experience with the subcutaneous treatment. Until then, his addiction had only been briefly mentioned within the family context, resulting in a negative outcome. This revelation left his mother surprised, astonished, and moved, but also visibly proud.

On a separate occasion, Filippo attended a party and unexpectedly spent the night away from home. He emphasized that he only realized the next day that such spontaneity would not have been possible without the implant. Without the need for daily tablets, he could participate freely without the fear of experiencing withdrawal symptoms the following morning. He no longer needed the “daily heroin substitute” and he no longer needed heroin. These and other episodes strengthened his conviction to “get rid” of therapy and of the fear of not being able to “walk without that crutch”. His determination grew, accompanied by the belief that the removal of the implant would mark the conclusion of his therapy. Filippo explicitly requested the removal of the implant not at the initially specified deadline but at a later time, and he duly signed a written request expressing this desire.

The implant removal occurred in mid-November 2022, precisely 7 months and 9 days after the initial placement. Although Filippo was a little tense at the time, the removal proceeded smoothly. The urine test conducted at this point still showed a positive result for buprenorphine. In the subsequent days, Filippo experienced symptoms including chills, tearing, arthralgia, and asthenia. He initially attributed these symptoms, without much conviction, to a bout of flu, and persevered. After 20 days, despite lingering discomfort, his determination to discontinue oral OAT prevailed. The subsequent urine test confirmed the absence of buprenorphine, marking the achievement of Filippo's goal.

General discussion

The buprenorphine implant represents an innovative formulation for OAT, specifically designed for individuals with OUD who have achieved stabilization through prior oral therapy. Notably, the implant demonstrates equivalent therapeutic effectiveness and similar rates of adverse effects when compared to standard sublingual buprenorphine or buprenorphine/naloxone tablets [6, 8]. Nonetheless, a comprehensive risk–benefit evaluation has revealed several advantages of the subcutaneous buprenorphine implant in comparison to conventional OAT [10]. These benefits include enhanced treatment adherence, improved quality of life for patients, decreased likelihood of engaging in illicit opioid abuse, and a reduced risk of misuse or diversion [10]. These findings have been validated through the experiences of the first six patients in Europe who underwent the buprenorphine implant, as outlined in this case series. The report provides insights into the tangible effects of the buprenorphine implant on patients' quality of life and the achievement of therapeutic objectives, specifically focusing on abstinence from illicit drug abuse and the detoxification process.

Eligible patients were carefully assessed by the medical team in terms of clinical, psychological, and pharmacological status. All patients had refrained from using illicit drugs, were receiving low-dose sublingual buprenorphine (≤ 8 mg), demonstrated adherence to OAT and regular attendance at the Addiction Service, and exhibited psychological stability. The heterogeneity observed in this group of patients stemmed from variations in sociocultural background, gender, age, duration of substance abuse history, length of drug abstinence, and the specifics of their medical and pharmacological history, including the duration, dosage, and any prior OAT before transitioning to buprenorphine (Table 5).
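The inclusion screen described above can be summarised as four jointly necessary conditions. The sketch below encodes them for illustration only; the field names are hypothetical, and the boolean check is a simplification of what was, in practice, a careful clinical evaluation by the medical team.

```python
# Minimal sketch of the psychosocial/pharmacological screen described in
# the text: stable low-dose sublingual buprenorphine (<= 8 mg), abstinence
# from illicit drugs, adherence to OAT and visits, psychological stability.
# Field names are illustrative, not from any clinical system.

from dataclasses import dataclass

@dataclass
class Candidate:
    buprenorphine_mg_per_day: float
    abstinent_from_illicit_drugs: bool
    adherent_to_oat_and_visits: bool
    psychologically_stable: bool

def eligible_for_implant(c: Candidate) -> bool:
    """True only if the candidate meets all four screening criteria."""
    return (
        c.buprenorphine_mg_per_day <= 8
        and c.abstinent_from_illicit_drugs
        and c.adherent_to_oat_and_visits
        and c.psychologically_stable
    )

print(eligible_for_implant(Candidate(6, True, True, True)))  # -> True
```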

The buprenorphine implant emerges as a viable treatment alternative for diverse patient profiles, contingent upon a certain level of pharmacological stability (≤ 8 mg buprenorphine), psychological well-being, and a documented recent history of abstinence from illicit drugs.

The reaction of the patients to the implant proposal ranged from moderate interest in some cases to genuine enthusiasm in others, as delineated in Table 6 (Buprenorphine implant proposal). All patients embraced the buprenorphine implant as a way to enhance their quality of life, eliminating the need for regular visits to the Addiction Service for the administration of tablets and moving closer to complete detoxification.

To assess the impact of both traditional OAT and the buprenorphine implant, a semi-quantitative narrative analysis was conducted [15,16,17,18]. Every quote pertaining to patients' experiences with either treatment was considered in the analysis and subsequently categorised into one or more of the following topics: commitment to achieving complete detoxification, disengagement from therapy, smoothness of therapy, emotional impact, and improved quality of life in terms of free time, finances, work, and interpersonal relationships (Fig. 1). Subsequently, the positive or negative valence associated with each statement was recorded. The implant was viewed as a valuable means to achieve abstinence from both drugs and medications, as evidenced by a total of 22 positive statements (Fig. 1A), compared to 6 for traditional OAT (Fig. 1B). The regular attendance at the Addiction Service was seen as a "constraint that disrupted daily routines" and "contributed to social stigma", undermining patients' commitment to therapy and overall quality of life (7 out of 9 negative statements). The desire to break free from the daily tablet intake, perceived as a "substitute for heroin" and a source of mood swings, was a common sentiment. In contrast to the peaks associated with oral intake, the subcutaneous implant offered a stable release of buprenorphine, as evidenced by the 22 positive statements (Fig. 1A) compared to 7 (Fig. 1B) associated with traditional oral intake. This consistency helped mitigate both the physical and the emotional fluctuations experienced by the patients.
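Mechanically, this semi-quantitative analysis amounts to hand-coding each quote with one or more topics and a valence, then counting statements per (treatment, topic, valence) cell. The sketch below illustrates that tally; the two coded quotes are invented placeholders, not data from the study.

```python
# Minimal sketch of the semi-quantitative tally described above: each
# patient quote is hand-coded with one or more of the five topics and a
# valence, and counts are accumulated per (treatment, topic, valence).

from collections import Counter

TOPICS = [
    "complete detoxification",
    "disengagement from therapy",
    "smoothness of therapy",
    "emotional impact",
    "improved quality of life",
]

def tally(coded_quotes):
    """coded_quotes: iterable of (treatment, topics, valence) tuples."""
    counts = Counter()
    for treatment, topics, valence in coded_quotes:
        for topic in topics:
            counts[(treatment, topic, valence)] += 1
    return counts

coded = [  # placeholder codings, not actual study quotes
    ("implant", ["emotional impact"], "positive"),
    ("oral OAT", ["improved quality of life"], "negative"),
]
print(tally(coded)[("implant", "emotional impact", "positive")])  # -> 1
```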

Fig. 1 A narrative analysis of patients' reported experiences was conducted for both traditional OAT (A) and the buprenorphine implant (B). The analysis categorized the statements related to each treatment into the five identified topics positioned at the vertices of the pentagon. The number of positive (blue line) and negative (red line) statements per topic was plotted along the direction of the corresponding vertex and connected by a five-pointed closed line; the distance from the centre indicates the frequency of iterations. Notably, the scale of the pentagon differs between the two graphs
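A plot of this kind can be reproduced with a standard polar ("radar") chart. The sketch below, using matplotlib, shows the construction with placeholder counts; the actual per-topic statement counts are those reported in the study, not the numbers used here.

```python
# Minimal sketch of a pentagon ("radar") plot like Fig. 1, drawn with
# matplotlib. The counts below are placeholders for illustration only.

import numpy as np
import matplotlib.pyplot as plt

topics = ["detoxification", "disengagement", "smoothness",
          "emotional impact", "quality of life"]
positive = [5, 4, 4, 5, 4]   # placeholder counts (blue line)
negative = [0, 1, 0, 1, 1]   # placeholder counts (red line)

# One axis per topic; repeat the first angle to close the 5-pointed line.
angles = np.linspace(0, 2 * np.pi, len(topics), endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for counts, color in [(positive, "blue"), (negative, "red")]:
    values = counts + counts[:1]  # close the polygon
    ax.plot(angles, values, color=color)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(topics)
plt.show()
```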

In terms of the surgical procedure, the buprenorphine implant insertion was carried out in a specialised facility by a professional surgeon, and no significant issues were encountered for any of the patients (Table 6, Surgery outcome). Only one patient developed a minor infection at the implant site, which was promptly addressed with antibiotics. He also reported a subjective feeling of overdose in the initial days post-insertion, followed by mild withdrawal symptoms, which were stabilised by a 3-day course of 1 mg sublingual buprenorphine. Consistent with findings from a previous study [19], the buprenorphine implant insertion procedure and the subsequent adaptation to treatment appear to be overall safe and well tolerated.

During the 6 months of follow-up, as outlined in Table 6 (Follow-up visits), the potential onset of withdrawal symptoms was closely monitored through regular assessments for most patients, and psychometric tests were conducted to evaluate various aspects. Importantly, no patient reported experiencing cravings throughout the course of treatment, and all toxicological tests were negative for illicit opioid use. All the patients expressed satisfaction with the buprenorphine implant treatment, and most reported being content with their decision, as indicated in Fig. 1 and Table 6 (End of therapy). On an emotional level, all patients reported a sense of well-being, with 18 positive statements compared to 3 for traditional OAT (Fig. 1A, B), and no instances of relapse were noted. Half of the patients experienced increased lucidity, improved introspective ability, and greater affective stability (Table 6, End of therapy). The majority (4 out of 6) showed a euthymic mood, absence of anxiety, and a regular, restful sleep pattern. Two out of six patients explicitly described a marked improvement in self-perception during the 6-month buprenorphine implant treatment. Overall, the buprenorphine implant was perceived as a step closer to complete detoxification, with 13 positive statements (Fig. 1A) vs. 5 for traditional OAT (Fig. 1B).

The removal procedure was successful for most of the patients, and none of them opted for a second implant. Only one patient reintroduced sublingual buprenorphine at low dosages, although this decision was not prompted by any withdrawal symptom. Most importantly, none of the patients experienced craving episodes, indicating the potential for them to continue living without any OAT and ultimately achieve complete detoxification.

In summary, this case series explores the pioneering use of the buprenorphine implant as a treatment option for OUD in a small European cohort of eligible patients. The findings suggest positive outcomes, including improved patient satisfaction and quality of life, reduced stigma associated with regular clinic attendance, and perceived advantages in achieving opioid abstinence. However, certain limitations must be acknowledged, including the small sample size, the relatively short follow-up period, and the reliance on self-reported questionnaires to evaluate patients' perspectives and experiences. The relatively small yet heterogeneous sample, while providing valuable insights into how various patient profiles might respond to this treatment approach, could limit the generalizability of the findings to a broader population. Moreover, the variability in the frequency and duration of follow-up visits, while helping to capture the moderate-to-long term effects of the treatment, limits the ability to assess longer-term outcomes. Furthermore, the study's reliance on self-reported questionnaires, while centring patients' perspectives, may introduce response bias, including an inclination to offer responses that align with social expectations, as well as recall bias. Therefore, in future studies the adoption of standardized assessment tools will ensure consistency and facilitate more robust cross-study comparisons. Future research should prioritise larger cohorts, comparative analyses with traditional OAT, and long-term investigations to assess sustained efficacy and the diverse dynamics of patient profiles. Collaborative efforts to standardize assessment protocols across facilities would further strengthen the reproducibility of research findings in this evolving field.

Final conclusions

This case series outlines the therapeutic journey of the first six European patients who underwent buprenorphine implant therapy. The results demonstrate favourable outcomes, including successful opioid abstention, alleviation of withdrawal symptoms, and enhanced quality of life and psychological well-being. Importantly, the treatment exhibited a high level of safety and tolerability, with no significant adverse events reported during the peri-operative period. The smooth insertion procedure and subsequent adaptation highlight the consistent benefits of the implant, with most patients achieving complete abstention, a milestone that might have been challenging with traditional approaches. Overall, the patients' satisfaction with the buprenorphine implant underscores its potential as a viable treatment option for pharmacologically stable individuals seeking to transition from traditional OAT. Nevertheless, further research into patient profiles, craving dynamics, and patient-centred outcomes is essential for optimizing personalized interventions in the field of addiction medicine.

Availability of data and materials

The datasets used during the current study are available from the corresponding author on reasonable request.

Abbreviations

  • OUD: Opioid use disorder
  • OAT: Opioid agonist therapy
  • HCV: Hepatitis C virus
  • MaSVE: Centro Manifestazioni Sistemiche Virus Epatitici, Systemic Manifestations of Hepatitis Virus Centre
  • WHOQOL-BREF: World Health Organization Quality of Life-BREF
  • THC: Tetrahydrocannabinol
  • VAS: Visual analog scale
  • MDMA: Methylenedioxymethamphetamine
  • LSD: Lysergic acid diethylamide

References

1. Dydyk AM, Jain NK, Gupta M. Opioid use disorder. [Updated 2023 Jul 21]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024. https://www.ncbi.nlm.nih.gov/books/NBK553166/.

2. Strang J, Volkow ND, Degenhardt L, Hickman M, Johnson K, Koob GF, et al. Opioid use disorder. Nat Rev Dis Primers. 2020;6(1):3.

3. National Institute for Health and Care Excellence (NICE). Drug misuse in over 16s: opioid detoxification. Clinical guideline CG52. https://www.nice.org.uk/guidance/cg52. Accessed 1 Dec 2022.

4. Bell J, Strang J. Medication treatment of opioid use disorder. Biol Psychiatry. 2020;87(1):82–8.

5. Mannaioni G, Lugoboni F. Precautions in the management of opioid agonist therapy: from target population characteristics to new formulations and post-marketing monitoring—a focus on the Italian system. Drugs Context. 2023;12.

6. Rosenthal RN, Lofwall MR, Kim S, Chen M, Beebe KL, Vocci FJ, et al. Effect of buprenorphine implants on illicit opioid use among abstinent adults with opioid dependence treated with sublingual buprenorphine: a randomized clinical trial. JAMA. 2016;316(3):282–90.

7. Lagios K. Buprenorphine: extended-release formulations "a game changer"! Med J Aust. 2021;214(11):534.

8. Rosenthal RN, Ling W, Casadonte P, Vocci F, Bailey GL, Kampman K, et al. Buprenorphine implants for treatment of opioid dependence: randomized comparison to placebo and sublingual buprenorphine/naloxone. Addiction. 2013;108(12):2141–9.

9. Ling W, Casadonte P, Bigelow G, Kampman KM, Patkar A, Bailey GL, et al. Buprenorphine implants for treatment of opioid dependence: a randomized controlled trial. JAMA. 2010;304(14):1576–83.

10. Osborne V, Davies M, Roy D, Tescione F, Shakir SAW. Systematic benefit-risk assessment for buprenorphine implant: a semiquantitative method to support risk management. BMJ Evid Based Med. 2020;25(6):199–205.

11. Wesson DR, Ling W. The clinical opiate withdrawal scale (COWS). J Psychoactive Drugs. 2003;35(2):253–9.

12. Boyett B, Wiest K, McLeod LD, Nelson LM, Bickel WK, Learned SM, et al. Assessment of craving in opioid use disorder: psychometric evaluation and predictive validity of the opioid craving VAS. Drug Alcohol Depend. 2021;229:109057.

13. World Health Organization. WHOQOL: measuring quality of life [Internet]. 2012. https://www.who.int/tools/whoqol. Accessed 26 Jan 2024.

14. Goodyear K, Haass-Koffler CL. Opioid craving in human laboratory settings: a review of the challenges and limitations. Neurotherapeutics. 2020;17(1):100–4.

15. Nolte K, Drew AL, Friedmann PD, Romo E, Kinney LM, Stopka TJ. Opioid initiation and injection transition in rural northern New England: a mixed-methods approach. Drug Alcohol Depend. 2020;217:108256.

16. Lai J, Goldfine C, Chapman B, Taylor M, Rosen R, Carreiro S, et al. Nobody wants to be narcan'd: a pilot qualitative analysis of drug users' perspectives on naloxone. West J Emerg Med. 2021;22(2).

17. Meyer M, Rist B, Strasser J, Lang UE, Vogel M, Dürsteler KM, et al. Exploring why patients in heroin-assisted treatment are getting incarcerated—a qualitative study. BMC Psychiatry. 2022;22(1):169.

18. Scurti P, Nunzi M, Leonardi C, Pierlorenzi C, Marenzi R, Lamartora V. The experience of buprenorphine implant in patients with opioid use disorder: a series of narrative interviews. Front Psychiatry. 2023;14.

19. Frost M, Bailey GL, Lintzeris N, Strang J, Dunlop A, Nunes EV, et al. Long-term safety of a weekly and monthly subcutaneous buprenorphine depot (CAM2038) in the treatment of adult out-patients with opioid use disorder. Addiction. 2019;114(8):1416–26.


Acknowledgements

Realized with the unconditional support of L. Molteni & C. dei F.lli Alitti Società di Esercizio S.p.A.

Funding

Not applicable.

Author information

Authors and Affiliations

UOS Patologie da Dipendenza D9 ASL Roma 2, Rome, Italy

Claudio Pierlorenzi, Marco Nunzi, Sabino Cirulli & Lucia Curatella

UOC Patologie da Dipendenza D8 ASL Roma 2, Rome, Italy

Giovanni Francesco Maria Direnzo, Sandra Liberatori, Annalisa Pascucci & Generoso Ventre

UOS Terapia del Dolore ASL Roma 2 Ospedale S. Eugenio, Rome, Italy

Edoardo Petrone

Servizio Dipendenze Casalpusterlengo, ASST Lodi, Lodi, Italy

Concettina Varango, Maria Luisa Pulito, Antonella Varango & Cosimo Dandolo

Tossicologia Medica, Azienda Ospedaliero Universitaria Careggi, Florence, Italy

Brunella Occupati

ASST Papa Giovanni XXII, Ospedale Di Bergamo, Bergamo, Italy

Roberta Marenzi

Dipartimento Tutela Delle Fragilità ASL Roma 2, Rome, Italy

Claudio Leonardi


Contributions

CV, MLP, AV, and CD collected, analysed, and interpreted the patient data for Case report 1. BO collected, analysed, and interpreted the patient data for Case reports 2 and 3. CL, SB, GFMD, LC, SL, MN, AP, EP, CP, and GV collected, analysed, and interpreted the patient data for Case reports 4 and 5. RM collected, analysed, and interpreted the patient data for Case report 6. All authors participated in the writing process, and reviewed and approved the final version of the manuscript.

Corresponding author

Correspondence to Claudio Leonardi .

Ethics declarations

Ethics approval and consent to participate

The study was performed in accordance with the Declaration of Helsinki. Informed consent was obtained from the patients prior to their participation in the study.

Consent for publication

Written informed consent was obtained from the patients for publication of these case reports and any accompanying images. A copy of the written consent is available for review by the Editor-in-Chief of this journal.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Case report 1. Buprenorphine implant procedure.

Additional file 2: Case report 2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Pierlorenzi, C., Nunzi, M., Cirulli, S. et al. Patients' perspectives on buprenorphine subcutaneous implant: a case series. J Med Case Reports 18, 202 (2024). https://doi.org/10.1186/s13256-024-04483-6


Received: 24 May 2023

Accepted: 01 March 2024

Published: 06 April 2024

DOI: https://doi.org/10.1186/s13256-024-04483-6


Keywords: Sublingual buprenorphine

