Laboratories have a major impact on patient safety, as 80 – 90% of all diagnoses are made on the basis of laboratory tests (1). Laboratory errors have a frequency of 0.012 – 0.6% for all test results (2). A series of regulatory requirements and practice guidelines have been introduced to guide the establishment and continuous improvement of the quality management system to reduce the risk of the total testing process (3-5).
Failure mode and effects analysis (FMEA), one of the most proactive methods of risk management, has been accepted as the method of choice for identifying potential points of failure within a process, determining their effects and identifying actions to mitigate the failures (6). The first step in FMEA is to identify all potential failure modes of the product or system. Critical analysis is then performed on these failure modes, and the risk priority number (RPN) is calculated by multiplying the scores for occurrence (O), severity (S) and detection (D). Finally, the failure modes are ranked and actions are taken preferentially on the high-risk failure modes (7). The application of the FMEA tool is consistent with the risk-based thinking required by ISO 9001 for critical decisions, and plays an important role in ensuring the reliability of the product or system. However, Shebl et al. conducted numerous interviews with hospital staff in the United Kingdom and concluded: “FMEA in health care is associated with a lack of standardization in how the scoring scales are used and how failures are prioritized.” (8). Different technicians and different scoring methods yielded dissimilar results, and evidence supporting the tool is lacking (9). The Clinical and Laboratory Standards Institute (CLSI) EP23-A guideline, Laboratory Quality Control Based on Risk Management, provides an introduction to risk management techniques and guidance on developing a risk-based quality control plan (QCP) (3). The 2-factor model, which includes only the probability of occurrence of harm and the severity of harm and does not consider the detection capability, is not conducive to the development of a robust laboratory QCP (10). Six Sigma is a technique that allows objective assessment of process performance.
The resulting RPN on a sigma-scale is more objective, because it is less reliant on subjective rankings and more reliant on observed performance (11). Six sigma quality control (QC) design tools can enhance FMEA, the risk assessment process and design of QC plans (11).
In this study, a risk analysis and assessment model based on Sigma metrics and intended use was constructed to ensure the quality in clinical laboratories and meet the low risk requirements of patients and clinicians.
Materials and methods
This study was performed in the Clinical Chemistry Laboratory of the Peking University Shenzhen Hospital, Shenzhen, China, in 2017. The laboratory was accredited according to the International Organization for Standardization (ISO) 15189:2012 by the China National Accreditation Service for Conformity Assessment in 2015. Thirty-six serum analytes were evaluated on the Beckman AU5800 chemistry analyser (Beckman Coulter, Tokyo, Japan), which included 27 original manufacturer reagents and 9 “non-kit” reagents. The 27 original manufacturer reagents were: alanine aminotransferase (ALT), aspartate aminotransferase (AST), alkaline phosphatase (ALP), gamma-glutamyltransferase (GGT), total protein (TP), albumin (Alb), total bilirubin (BT), direct bilirubin (BD), urea, creatinine (CREA), uric acid (UA), glucose (Glc), creatine kinase (CK), lactate dehydrogenase (LD), amylase (AMY), triglycerides (TG), total cholesterol (CHOL), high density lipoprotein cholesterol (HDL), low density lipoprotein cholesterol (LDL), potassium (K), sodium (Na), chloride (Cl), calcium (Ca), magnesium (Mg), inorganic phosphate (Phos), iron (Fe), and transferrin (TRSF). Of the 9 “non-kit” reagents, IgG, IgM, IgA, pre-albumin (PA) and cystatin C (Cys-C) were from DiaSys (Holzheim, Germany); β2-microglobulin (BMG), C3 and C4 were from Leadman (Beijing, China); and C-reactive protein (CRP) was from Sekisui (Tokyo, Japan). The Johnson Vitros5600 analyser (Ortho Clinical Diagnostics, Raritan, USA) was used to measure cardiac troponin I (cTn-I), the Roche e601 analyser (Roche Diagnostics, Mannheim, Germany) for cardiac troponin T (cTn-T), the Siemens RP500 blood gas analyser (Siemens Healthcare Diagnostics, Suffolk, United Kingdom) for pH, pCO2 and pO2, and the SEBIA CAPILLARY 2 capillary electrophoresis analyser (Sebia, Evry Cedex, France) for HbA1c, each with original manufacturer reagents.
Commercial control samples of human origin were used for evaluation. Unassayed Specialty Chemistry & Protein Control level 1 and 2 (Qualab Biotech, Shanghai, China) were used for Cys-C. Liquid Unassayed Special Protein Control level 2 and 3 (Qualab Biotech, Shanghai, China) were used for PA, BMG and CRP. Liquichek Cardiac Markers Plus Control LT level 1 and 2 (Bio-Rad Laboratories, Irving, USA) were used for cTn-I and cTn-T. Rapid QC Complete level 1, 2 and 3 (Siemens Healthcare Diagnostics Inc, Tarrytown, USA) were used for pH, pCO2 and pO2. HbA1c capillary controls (expected values-HbA1c percentages) (Sebia, Lisses, France) were used for HbA1c. For all other tests, Liquid Assayed Multiqual level 1, 2 and 3 (Bio-Rad Laboratories, Irving, CA, USA) were used.
Calculation of Sigma metrics at the critical decision level
The critical decision levels were determined from a sigma verification program (SVP) (Westgard QC, Madison, USA) or established in consultation with clinicians. Imprecision was calculated from the cumulative internal quality control (IQC) results of different time periods in 2016, excluding out-of-control results (Table 1). The cumulative coefficient of variation (CV) or standard deviation (SD) was taken from the QC level with the concentration closest to the critical decision level, as marked in the CV% column of Table 1. Bias was calculated from the 2016 external quality assurance (EQA) programs in laboratory medicine run by the National Centre of Clinical Laboratories (NCCL) of China. Passing-Bablok regression was performed using Microsoft Excel 2007 to determine the bias between the test result and the instrument group mean, based on the ten or fifteen EQA program results for each test (12). The resulting equation, y = ax + b (R2 > 0.95), was then applied to determine the bias at the critical decision level. Total allowable error (TEa) was taken from the SVP and the EQA criteria of the NCCL of China. The Sigma metrics were calculated as follows: Sigma metrics = (TEa - bias) / SD or Sigma metrics = (TEa% - bias%) / CV%.
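The calculation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the bias at the critical decision level is the difference between the value predicted by the regression line y = ax + b and the decision level itself, and that the bias enters the Sigma formula as an absolute value, which is the common convention. The function names and all numeric inputs are hypothetical and are not taken from Table 1.

```python
def bias_at_decision_level(slope, intercept, decision_level):
    """Bias (in concentration units) predicted by y = slope * x + intercept.

    Assumption: x is the comparison (instrument group mean) value and y is
    the laboratory result, so the bias at x equals y(x) - x.
    """
    predicted = slope * decision_level + intercept
    return predicted - decision_level


def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma metric = (TEa% - |bias%|) / CV%.

    Assumption: the bias is used as an absolute value.
    """
    if cv_pct <= 0:
        raise ValueError("CV% must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


# Hypothetical inputs, for illustration only (not values from Table 1).
decision_level = 95.0          # critical decision level, e.g. U/L
slope, intercept = 1.02, -0.5  # regression coefficients a and b
tea_pct, cv_pct = 20.0, 2.5    # TEa% and cumulative CV%

bias = bias_at_decision_level(slope, intercept, decision_level)
bias_pct = 100.0 * bias / decision_level
sigma = sigma_metric(tea_pct, bias_pct, cv_pct)
print(f"bias = {bias:.2f}, bias% = {bias_pct:.2f}, sigma = {sigma:.2f}")
```

With these hypothetical inputs the bias is about 1.47% and the Sigma metric is about 7.4. Where SD and TEa are expressed in concentration units, the same function applies with absolute values in place of percentages.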
Risk assessment based on Sigma metrics and intended use
The severity of harm due to exceeding TEa was investigated via an online questionnaire survey (www.wenjuan.com). The questionnaire was constructed with reference to “Guidelines for constructing a survey” (13). It contained 42 questions, one for each of the 42 analytes, and the severity of harm was defined at the critical medical decision level. For example: “If ALT (test name) results deviate from the true result by more than 20% (TEa) at the level of 95 U/L (critical decision level), what do you think the impact on clinical diagnosis and treatment would be? Negligible, minor, serious, critical, catastrophic”. The questionnaire was published in three WeChat work groups comprising 44 doctors from different clinical departments of Peking University Shenzhen Hospital, 22 laboratory technicians from Peking University Shenzhen Hospital and 23 clinical biochemistry laboratory supervisors from different hospitals in Shenzhen. The doctors answered only the questions within the scope of their practice, whereas the specialists in laboratory medicine answered all the questions. The proportions of the severity of harm classifications were summarized and submitted to the FMEA team for discussion. To draw the laboratory’s attention to the risk, tests without consensus were assigned the higher level of risk. The intended use of each test was taken from expert advice, application guides, or reagent manuals and categorized as diagnosis, screening, or patient management decision.
A modified FMEA was applied to produce an analytic risk rating for each test based on three novel factors: 1) Sigma metrics; 2) the severity of harm; 3) intended use (diagnosis, screening, patient management decision). Each factor was scored on a 5-point scale, as shown in Table 2. The RPN of each test was obtained by multiplying the Sigma metrics score by the severity of harm score and the intended use score. When a test had different intended uses in different clinical applications, it was classified according to the use with the highest risk score. RPN > 50 was considered high risk, for which the degree of risk was unacceptable; 25 < RPN ≤ 50 was considered medium risk, for which laboratory personnel needed to pay attention to the test; and RPN ≤ 25 was considered low risk and acceptable. The sigma performance expectations were calculated from the intended use and the accumulated score of the severity of harm.
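The RPN rule and the risk thresholds can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation. The function names are hypothetical. The factor scores in the example are placeholders, because the Table 2 scoring scheme is not reproduced here. They were chosen only so that the products match the RPN values of 45 and 75 that this paper reports for HbA1c.

```python
def risk_priority_number(sigma_score, severity_score, use_score):
    """RPN = Sigma metrics score x severity of harm score x intended use score.

    Each score is expected to be an integer on the 5-point scale (1 to 5).
    """
    for name, score in (("sigma", sigma_score),
                        ("severity", severity_score),
                        ("intended use", use_score)):
        if not isinstance(score, int) or not 1 <= score <= 5:
            raise ValueError(f"{name} score must be an integer from 1 to 5")
    return sigma_score * severity_score * use_score


def risk_level(rpn):
    """Classify an RPN with the thresholds stated in the text."""
    if rpn > 50:
        return "high"
    if rpn > 25:
        return "medium"
    return "low"


# Placeholder scores (assumed, not taken from Table 2), chosen so that the
# products reproduce the RPN values of 45 and 75 reported for HbA1c.
rpn_management = risk_priority_number(5, 3, 3)
rpn_diagnosis = risk_priority_number(5, 3, 5)
print(rpn_management, risk_level(rpn_management))
print(rpn_diagnosis, risk_level(rpn_diagnosis))
```

Because the three scores are multiplied, a change in intended use alone can move a test across a risk boundary, as the two placeholder cases show.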
Results
Sigma metrics at the critical decision levels
The 42 clinical chemistry analytes were measured on five instruments. The results of the Passing-Bablok regression and the 95% confidence intervals (CI) for slope and intercept are listed in Table 1. The tests whose 95% CI for the slope did not include 1 were: ALT, AST, ALP, BT, Alb, AMY, TG, HDL, Ca, Mg, C3, CRP, cTn-T, pH and pO2. The 95% CI of the y-axis intercept included zero for most tests, the exceptions being Alb, BT, HDL, Cl, Ca, Mg, Fe, PA, pO2 and HbA1c.
The Sigma metrics at the critical decision level were calculated and are listed in Table 1, and the normalized method decision chart showing the sigma values is presented in Figure 1. There were 21 analytes with world class performance (σ ≥ 6). The analytes with excellent performance (5 ≤ σ < 6) were urea, IgG, IgM, cTn-I and pH; those with good performance (4 ≤ σ < 5) were Glc, CHOL, Cl, Ca and PA; those with marginal performance (3 < σ < 4) were Na, IgA, C4, Cys-C, cTn-T and pCO2; and those with poor or unacceptable performance (σ ≤ 3) were Alb, C3, BMG, pO2 and HbA1c.
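The performance classes used above can be written as a simple lookup. The following Python sketch is illustrative only. The function name is hypothetical. The boundaries of the marginal class (3 < σ < 4) are inferred from the neighbouring classes rather than stated explicitly in the text.

```python
def sigma_performance_class(sigma):
    """Map a Sigma metric to the performance classes used in this study.

    Assumption: "marginal" covers 3 < sigma < 4, inferred from the stated
    boundaries of the "good" (4 <= sigma < 5) and "poor" (sigma <= 3) classes.
    """
    if sigma >= 6:
        return "world class"
    if sigma >= 5:
        return "excellent"
    if sigma >= 4:
        return "good"
    if sigma > 3:
        return "marginal"
    return "poor or unacceptable"


# The HbA1c value of 2.8 is the one quoted in this paper; the others are
# arbitrary illustrative inputs.
for value in (6.4, 5.2, 4.0, 3.5, 2.8):
    print(value, sigma_performance_class(value))
```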
Risk analysis and assessment
A total of 52 professionals participated in the questionnaire survey: 32 doctors, 12 laboratory technicians and 8 clinical biochemistry laboratory supervisors. The numbers of tests whose severity of harm was rated negligible, minor, serious, critical, and catastrophic were 3, 13, 14, 9 and 3, respectively; the numbers of diagnostic, screening, and patient management decision tests were 14, 11, and 17, respectively. Seven tests (Glc, Na, Ca, BMG, cTn-T, pCO2 and pO2) were high risk (RPN > 50), and 8 tests were medium risk (25 < RPN ≤ 50). The 5 tests with σ < 3 were evaluated as high or medium risk. All of these results are shown in Table 1.
Establishing differential sigma performance expectations
Here, 13 tests had sigma performance expectations ≥ 6; 15 tests had sigma performance expectations ≥ 5; 3 tests had sigma performance expectations ≥ 4; and 11 tests had sigma performance expectations ≥ 3. The results are shown in detail in Table 3.
Discussion
When assessing quality on the σ scale, the higher the σ metric, the better the quality. Here, quality was assessed on the σ scale with a benchmark for minimum process performance of 3σ and a goal for world-class quality of 6σ (14). Of the 42 tests in this study, 21 had σ ≥ 6 and 5 had σ ≤ 3. When calculating Sigma metrics, the selection of an appropriate TEa and analyte concentration is crucial. A study in Belgium showed that the Sigma metrics of Alb ranged from 1.3 to 32 depending on the analyte concentration and the TEa target (15). It is desirable that TEa is defined by the highest possible model in the hierarchy; simple point estimates of sigma at medical decision concentrations are then sufficient for laboratory applications (16, 17). However, outcome-based goals may not be possible to set for all analytes (18). In this study, the TEa specifications were obtained from the SVP and the EQA criteria of the NCCL of China; the CV and bias were estimated at the critical decision level, and the Sigma metrics at that level were calculated.
The RPN in this study integrates three novel factors: Sigma metrics, the severity of harm and intended use. Sigma metrics are directly related to the probability of risk and can also be indirectly associated with the detection capability of Six Sigma QC rules. Thus, using Sigma metrics to determine the probability of occurrence directly simplifies the process of risk assessment. The evaluation of severity is usually highly subjective and ultimately depends on the team’s experience and competence. The summarized survey data collected from clinicians and technicians in this study therefore help to make the evaluation relatively objective. Accounting for the intended use of a test also helps to design a comprehensive risk assessment model. For example, HbA1c had σ = 2.8 and is mainly used for patient management decisions in China, so its RPN of 45 indicates medium risk. However, HbA1c has been approved by the American Diabetes Association for use as a diagnostic indicator of diabetes; with that intended use, the RPN would be adjusted to 75, which is high risk.
In this study, bias was estimated from the EQA data. However, this approach has several limitations, such as the acceptance criteria and the reliance on peer group comparison, compared with the primary method using a reference standard material (19). The intended use of a test is mainly based on expert advice, application guides, or reagent manuals, some of which may lack clear criteria. These problems and their solutions still need to be explored and further standardized.
At present, clinical laboratories cannot achieve world class quality (σ ≥ 6) for all tests. The results of the risk assessment also showed that tests posing negligible risk to the patient could be allowed to reach lower Sigma metrics. Identifying the differentiated sigma performance expectations can avoid repeated residual risk evaluation, which is regarded as a time-consuming task (8, 9). If a test cannot achieve its expected sigma performance, it should be adjusted or changed. If intended use lowers the RPN so that the “Sigma performance expectation” is not 6 but only 3 or 4, that still needs to be aligned with the QC procedures implemented. Currently, a test with 3 sigma or below will need more sensitive QC rules, testing of multiple QC samples at each QC event, and more frequent QC events to reduce patient risk (5, 20-22).
In conclusion, this study demonstrates that the implications of Sigma metrics can be extended beyond QC design and method acceptability. A new RPN based on Sigma metrics and intended use has been explored, which can provide a more comprehensive and objective assessment of the risk of tests. Such a model can also be used to establish Sigma performance expectations and meet the low risk requirements of patients and clinicians.