Clinical laboratory test results are important for ensuring patient safety. Approximately two-thirds of important clinical decisions on patient management are based on laboratory test results (1). Continuous improvement and the minimization of testing errors are therefore major goals of every clinical laboratory. Six Sigma (6σ) quality management is a data-based, customer-centered, advanced quality management model that has been recently developed and is used globally. Following its introduction in clinical laboratories, 6σ quality management has become a major research focus (2). Sigma (σ) metrics evaluate process capability. The clinical application of 6σ quality management combines quality requirements with laboratory performance to evaluate quantitatively whether a laboratory meets clinical testing standards. This evaluation is typically based on the expected defect rate. The ultimate goal of 6σ quality management is to implement laboratory risk management and thus ensure patient safety.
Several studies on the application of 6σ management in laboratory testing have been reported, including studies on theoretical methods and their significance, the evaluation of performance of different assays, and the optimization of quality control (QC) schedules based on performance evaluation (3-5). Sigma metrics quantitatively estimate quality from the traditional parameters used in the clinical laboratory: allowable total error (TEa), bias and imprecision. Imprecision is usually expressed as standard deviation (SD) or coefficient of variation (CV). However, quality requirements from different sources may differ for the same assay, which can affect parameter selection. Furthermore, different approaches to bias and CV calculation may influence the final σ calculation. Laboratories should thus be aware of such sources of variation when σ management is applied, as they may alter the value of σ and its accuracy. For example, previous studies have derived bias from external quality assessment (EQA) survey reports (e.g., College of American Pathologists or the Randox International Quality Assessment Scheme), whereas CV was derived from the cumulative coefficient of variation of internal quality control (IQC) materials (6-9). However, CV is strongly correlated with the concentration of the analyte. If the concentration of IQC materials differs significantly from that of the proficiency testing (PT) samples used in bias calculation, this method is inadequate for calculating σ. Bias may also be derived from reagent package inserts (10). Given that the cumulative average value of QC material may change over time, it is inappropriate to use the target value from manufacturers’ QC material when calculating bias.
With the continuous development of information technology, vendors can now statistically analyse IQC data from the majority of laboratories to determine a more appropriate “group mean”. This approach facilitates inter-laboratory quality management based on IQC, which has become popular among clinical laboratories. Using this group mean to calculate bias has been shown to be a convenient and reliable method.
Therefore, in the present study, we aimed to compare two approaches to the calculation of CV and bias and their effect on σ calculation at different TEa. The σ values for 10 routine biochemical assays on three different analysers were calculated using two methods: (i) a PT-based approach, in which PT materials for routine clinical chemistry from the China National Center for Clinical Laboratories (NCCL) were used to evaluate imprecision (CV%) for each assay, and bias (bias%) was calculated by comparison with the group mean for each PT sample in the NCCL report; and (ii) an IQC-based approach, in which IQC results were used to calculate the CV% of each assay, and bias% was calculated by comparison with the global group mean. Within each method, bias and CV were thus derived from the same material, which allowed σ values to be compared between methods that draw their parameters from different sources. This method of evaluating σ has not previously been reported.
Materials and methods
Ten assays were evaluated using the manufacturer/analyser combinations routinely used in Peking Union Medical College Hospital (Table 1). All reagents and calibrators for the three analysers were obtained from the original manufacturer except for creatinine (CREA; Maccura Biotechnology Co., Ltd., Chengdu, China) and bilirubin, total (BT; Wako Pure Chemical Industries, Ltd., Osaka, Japan) on the Beckman AU5800 system (Beckman Coulter, Inc., Brea, USA).
Sample preparation. PT materials (five lots: 201721–201725) for routine clinical chemistry were provided by NCCL (Peking, China). Samples were prepared using an analytical balance with an accuracy of 0.001 g. Powdered samples were dissolved in 3 mL deionized water, capped, maintained at room temperature for 30 min and gently mixed until completely dissolved. Samples were protected from light and stored between 2 °C and 8 °C until use within 7 days.
Imprecision evaluation. During the PT period (June 5 – 9, 2017), the Clinical and Laboratory Standards Institute (CLSI) EP15-A3 protocol was followed, with the same sample tested five times daily for albumin (Alb), alanine aminotransferase (ALT), potassium (K), sodium (Na), chloride (Cl), calcium (Ca), total bilirubin (BT), glucose (Glc), creatinine (CREA, enzymatic), and urea (Urea) on each analyser for five consecutive days (11). The mean, SD (within laboratory) and CV (within laboratory) for each assay were calculated.
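The within-laboratory imprecision described above can be sketched as follows. This is a minimal illustration, assuming a balanced design (the same number of replicates on every day) and the usual one-way analysis-of-variance decomposition into repeatability and between-day components; the function name `within_lab_imprecision` and its input layout are hypothetical and are not taken from the study or from the CLSI document.

```python
from statistics import mean


def within_lab_imprecision(runs):
    """Estimate mean, within-laboratory SD and CV% from a days x replicates layout.

    `runs` is a list of per-day result lists (e.g. 5 days x 5 replicates).
    Assumption: balanced design, one-way ANOVA with day as the grouping factor.
    """
    k = len(runs)        # number of days
    n = len(runs[0])     # replicates per day
    if k < 2 or n < 2 or any(len(r) != n for r in runs):
        raise ValueError("need at least 2 days with equal replicate counts >= 2")

    grand_mean = mean(x for r in runs for x in r)
    day_means = [mean(r) for r in runs]

    # Mean squares between days and within days
    ms_between = n * sum((m - grand_mean) ** 2 for m in day_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(runs, day_means) for x in r
    ) / (k * (n - 1))

    # Between-day variance component, truncated at zero if negative
    var_between = max(0.0, (ms_between - ms_within) / n)

    sd_within_lab = (ms_within + var_between) ** 0.5
    cv_percent = 100.0 * sd_within_lab / grand_mean
    return grand_mean, sd_within_lab, cv_percent
```

With 25 results per assay, `runs` would hold five lists of five values, and the returned CV% would be the imprecision term used in the σ calculation.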
Bias calculation. Based on NCCL routine clinical chemistry requirements, the mean value of the instrument group (excluding data more than two standard deviations away from the mean) was used to verify the target value of each assay. Our mean (N = 25) for each assay from the different analysers was calculated as described above. Bias% was determined as (our mean − mean of all laboratories using the same instrument and method) / (mean of all laboratories using the same instrument and method) x 100.
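The bias calculation can be illustrated with a short sketch. It assumes a single-pass exclusion of peer results lying more than two sample standard deviations from the untrimmed mean; the text does not state whether the exclusion was iterative or which SD estimator was used, so both choices are assumptions. The function names and the example numbers are hypothetical.

```python
from statistics import mean, stdev


def trimmed_group_mean(peer_results, k=2.0):
    """Peer-group mean after dropping values more than k SDs from the raw mean.

    Assumption: one exclusion pass using the sample standard deviation.
    """
    m = mean(peer_results)
    s = stdev(peer_results)
    kept = [x for x in peer_results if abs(x - m) <= k * s]
    return mean(kept)


def bias_percent(lab_mean, group_mean):
    """Bias% = (laboratory mean - group mean) / group mean x 100."""
    return (lab_mean - group_mean) / group_mean * 100.0


# Hypothetical peer results with one extreme value
peers = [100, 100, 100, 100, 100, 100, 100, 100, 100, 200]
target = trimmed_group_mean(peers)   # the value 200 is excluded
print(bias_percent(102, target))
```

With few participating laboratories, a single extreme result shifts the untrimmed mean and inflates the SD noticeably, which is one reason small peer groups give less reliable targets.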
Bio-Rad (Bio-Rad Laboratories, Inc., Hercules, CA, US) Liquid assay multiqual QC materials (694/696; lot numbers 45751, 45753) were used daily to monitor internal testing quality. All assays participated in the Bio-Rad global report. Data were collected from internal QC at Peking Union Medical College Hospital from January 1, 2017 to June 30, 2017. Monthly (June 2017) mean, SD and CV were calculated. Bias was calculated based on target value, which was averaged from the Bio-Rad global report for the same assay performed with the same instrument/method.
Sigma calculation and data analysis. According to quality requirements, the formulas σ = (TEa − |Bias%|) / CV (for percentage) and σ = (TEa − |Bias|) / SD (for concentration value) were used to calculate σ metrics for each assay. The allowable total error of each assay was based on the American Clinical Laboratory Improvement Amendments of 1988 (CLIA ‘88) and People’s Republic of China Health Industry Standard (WS/T403-2012), designated as TEaCLIA and TEaWS/T, respectively (12, 13). The specific requirements of the 10 assays are listed in Table 2. Excel 2010 software (Microsoft Corporation, Redmond, Washington State, US) was used for data analysis and graphing.
|bias%| - absolute value of bias%. TEaCLIA - allowable total error derived from US Clinical Laboratory Improvement Amendments 1988 (CLIA ’88). TEaWS/T - allowable total error derived from the People’s Republic of China Health Industry Standard (WS/T403-2012). Alb - albumin, g/L. ALT - alanine aminotransferase, U/L 37 °C. BT - bilirubin, total, µmol/L. Glc - glucose, mmol/L. CREA - creatinine, µmol/L. Urea - urea, mmol/L. K - potassium, mmol/L. Na - sodium, mmol/L. Cl - chlorides, mmol/L. Ca - calcium, mmol/L.
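The two σ formulas given in the methods share one form and differ only in units, as the following sketch shows. The function name `sigma_metric` and the numeric inputs are hypothetical illustrations, not values from the study; the only requirement is that TEa, bias and imprecision are expressed in the same units (all in percent, or all in concentration units).

```python
def sigma_metric(tea, bias, imprecision):
    """Sigma = (TEa - |bias|) / imprecision.

    Use (TEa%, bias%, CV%) for percentage limits, or
    (absolute TEa, absolute bias, SD) for concentration limits.
    """
    if imprecision <= 0:
        raise ValueError("imprecision must be positive")
    return (tea - abs(bias)) / imprecision


# Hypothetical percentage example: TEa 10%, bias -2%, CV 2%
print(sigma_metric(10.0, -2.0, 2.0))

# Hypothetical concentration example: TEa 0.5 mmol/L, bias 0.05 mmol/L, SD 0.05 mmol/L
print(sigma_metric(0.5, 0.05, 0.05))
```

Because bias enters as an absolute value, a negative and a positive bias of equal size give the same σ.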
PT Sigma metrics
The σ values calculated using two TEa sources (σWS/T and σCLIA) for the three analysers are shown in Table 3. TEaCLIA is expressed as an absolute value for the K, Na and Ca assays, whereas TEaWS/T is expressed as a percentage for all assays and is more stringent than TEaCLIA. σWS/T was lower than σCLIA for all assays except Na. For σWS/T, only the Siemens Dimension analyser achieved 6σ for BT testing in all five lots. The 3σ level was achieved for ALT, CREA and K on all three analysers; BT, Glc and Na on the Roche C8000 analyser; and Cl and Ca on the Beckman AU5800 analyser.
The σCLIA calculation based on TEaCLIA showed that eight assays (all except Urea and Na) achieved σ > 3 on the Beckman AU5800 analyser. Among the eight, CREA and K reached 6σ levels in the five lot numbers during proficiency testing; four assays (ALT, BT, Cl and Ca) achieved 5σ. For the Roche C8000 analyser, all assays except Urea, Na and Cl achieved 3σ, whereas BT, Glc, CREA and K reached 6σ. For the Siemens Dimension analyser, BT and K reached 6σ; ALT, CREA and Urea reached 4σ; and Alb, Glc, Na (201721, 201724, 201725) and Cl (201721) were < 3σ at certain concentrations.
For the same assay at the same concentration, σCLIA values differed significantly between analysers. Supplementary Figure 1 presents the σCLIA levels for all assays at different concentrations and on different analysers graphically.
IQC Sigma metrics
The σ metrics for the ten assays calculated from internal QC data (June 2017) are shown in Table 4. Similar to the PT results, σWS/T was < σCLIA in 9 out of 10 assays (all except Na), with ALT, BT and Glc reaching 6σ; Na reaching 4σ; and K and Ca reaching 3σ on the Beckman AU5800 analyser. On the Roche C8000 analyser, Glc, CREA, K and Na reached the 6σ level; ALT, BT and Urea reached 4σ; and Ca reached 3σ. For the Siemens Dimension analyser, CREA reached 6σ, Alb reached 5σ, and Glc and K reached 3σ.
For σCLIA, all 10 assays were above 3σ on the Beckman AU5800 analyser, of which ALT, BT, CREA and Ca achieved 6σ at two levels of QC materials. For the Roche C8000 system, σCLIA was < 1 for Cl, whereas the other nine assays were > 3σ; BT, CREA, K and Ca achieved 6σ and ALT reached 5σ. For the Siemens Dimension analyser, σCLIA was < 3 for Glc, Na and Cl, but Alb, CREA, K and Ca achieved 6σ. Supplementary Figure 2 presents the σCLIA levels for the 10 assays at various concentrations on the three analysers graphically. We also calculated σ metrics for the 10 assays based on IQC data from January – June 2017 and observed some differences compared with the σ metrics calculated from the IQC data in June (Supplementary Table 1).
Comparison of σ metrics between methods
In Figure 1, the differences in σCLIA as calculated using the two methods and three analysers are shown. For some analytes, the values of σCLIA derived from the two approaches are significantly different. For example, σCLIA for the PT-based approach versus the IQC-based approach at similar concentrations was 6.5 (201722) versus 3.9 (45751) for Alb and 1.4 (201722) versus 4.6 (45751) for Na on the Beckman AU5800 analyser. To allow comparison of the differences between the two methods of calculating σCLIA at similar concentrations, we have listed the σCLIA values from IQC materials and PT samples in Supplementary Figure 3.
Evaluating the quality of laboratory testing is an important research topic in clinical laboratories. Six Sigma quality standards take bias (systematic error) and CV (random error) into account to guide quality management in clinical laboratories systematically: analysing possible causes of error, identifying solutions, better assuring testing quality and optimizing the QC schedule. However, the optimal sources of TEa, bias, CV and other indicators for calculating σ remain unclear, particularly when the sources of bias and CV vary between laboratories. We therefore compared two new approaches to calculating σ metrics as a future reference for the application of 6σ quality management in clinical laboratories.
To our knowledge, the present study is the first to use PT samples to assess imprecision and further calculate σ values. We obtained CV values from these samples to ensure that the same source of bias and CV were used in σ calculations. Given that PT samples typically have five different levels of concentrations, and it is easier to cover different levels of an assay for medical decision-making, this approach conveniently evaluates σ at different concentrations to better indicate analyser performance. A limitation of the PT-based approach is the assessment of short-term imprecision, which may lead to a lower CV and overestimated σ value. According to the manufacturer’s instructions for the PT materials, prepared samples are stable for only seven days, and so are unsuitable for use in long-term evaluations.
To compare differences in σ levels calculated by the two methods, the IQC-based approach used the relatively long-term CV calculated from IQC data for the same month, which accounts for other factors (batch number, instrument status, calibration, personnel, temperature, humidity, etc.). Our findings indicate that, for some assays, the σ values derived from the two approaches are significantly different. Such differences can have significant consequences for QC rule selection; for example, a 4σ method requires multi-rule QC while a 6σ method can be controlled by a simple, single-rule QC.
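The link between σ level and QC rule selection mentioned above can be sketched as a simple lookup. The thresholds and rule sets below follow one commonly published form of the Westgard Sigma Rules for two levels of control material, as recalled here; they are an assumption for illustration, are not taken from this study, and should be checked against the published rules before use. The function name `suggested_qc_rules` is hypothetical.

```python
def suggested_qc_rules(sigma):
    """Map a sigma value to an illustrative set of control rules.

    Assumption: thresholds at 6, 5 and 4 sigma, with rules added
    cumulatively as the sigma level falls.
    """
    rules = ["1_3s"]                      # single rule, sufficient at >= 6 sigma
    if sigma < 6:
        rules += ["2_2s", "R_4s"]         # multi-rule from 5 sigma downwards
    if sigma < 5:
        rules.append("4_1s")              # added at the 4 sigma level
    if sigma < 4:
        rules.append("8_x")               # fullest rule set below 4 sigma
    return rules


for s in (6.5, 3.9):
    print(s, suggested_qc_rules(s))
```

Applied to the Alb example on the Beckman AU5800 analyser, a σ of 6.5 from the PT-based approach and 3.9 from the IQC-based approach would fall into different rule sets under this mapping.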
There may be several reasons for the difference between the two approaches. First, there may be obvious differences in analyte concentrations between the IQC control samples and the PT materials. Second, the group mean might not have been appropriate. Third, systematic deviations may have been present, possibly due to the short detection time period when evaluating σ with PT. In addition, the σ value itself is influenced by both bias and imprecision, and in clinical laboratories, multiple factors can influence these parameters. Where the σ value is not satisfactory, the cause should be determined, whether it is a bias or an imprecision issue, and an appropriate solution should be identified. In the PT-based approach, the imprecision was mostly acceptable, and the suboptimal 6σ values may have been mainly attributable to bias. If a laboratory wants to use individualized quality control rules, it may select one method for evaluating σ level and use another method for verification.
We also compared the σ calculation in the IQC-based approach during different time periods and found that the 1- and 6-month values differed significantly for some analytes (Table 4 and Supplementary Table 1). As greater imprecision is expected over longer periods, 6-month σ values were expected to be lower. However, this was not observed in all cases; for example, σ for Glc (low QC level, 45751) on the Siemens analyser was 2.5 at 1 month versus 5.1 at 6 months, and σ for Urea (low QC level, 45751) on the Beckman analyser was 4.7 versus 7.2. Analysis of the original data showed that the difference in Glc on the Siemens analyser was due to the larger CV observed in June. The difference in Urea on the Beckman analyser, however, may have resulted from periodic biases of opposite sign averaging out to a negligible long-term bias. In practice, laboratories should exercise caution when implementing “Westgard Sigma Rules” in quality control based on σ values, because these values may change continuously as precision and bias shift with, for example, calibrations, reagents, or personnel (4). Because σ values for some analytes may differ between periods, laboratories should monitor the σ level continuously when using individualized quality control rules based on σ evaluation.
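The averaging effect proposed for Urea can be illustrated with invented numbers. The sketch below uses hypothetical monthly bias% values of alternating sign (not data from this study) to show how a long-term signed mean can be close to zero while the typical monthly bias is not; the function name `summarize_bias` is also hypothetical.

```python
from statistics import mean


def summarize_bias(monthly_bias_percent):
    """Return (mean signed bias, mean absolute bias) over several periods."""
    signed = mean(monthly_bias_percent)
    absolute = mean(abs(b) for b in monthly_bias_percent)
    return signed, absolute


# Hypothetical six monthly bias% values with alternating sign
monthly = [3.0, -3.0, 2.0, -2.0, 1.0, -1.0]
signed, absolute = summarize_bias(monthly)
print(signed, absolute)
```

A σ computed from the signed long-term mean would therefore look better than the σ experienced in any single month, which supports evaluating σ over more than one period.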
Bias can significantly impact the σ metric. It can (theoretically) be corrected, while imprecision is more difficult to influence. However, bias is generally more difficult to estimate. The most reliable way to do so is to use a reference method. Most studies, including the present study, use group means from EQA data on the same instruments and methods as a target, rather than the reference method. Therefore, the observed bias is only “arbitrary” instead of “true”. External quality assessment peer group evaluation has also been shown to be insufficient in determining analytical quality and may compromise patient care, despite its acceptance by participating laboratories and manufacturers (14). This approach is therefore a limitation of the present study, but also represents a common limitation of 6σ for quality management at present, as most routine laboratory testing does not have a reference method that can be conveniently implemented. Determining bias from proficiency testing or global QC reports thus remains the primary approach in current 6σ evaluations. It is therefore critical to select the appropriate group when using group mean to assess bias. In the PT-based approach used in our study, group means were calculated after excluding data more than two standard deviations away from the mean in order to exclude extreme values. Compared with the IQC-based approach, the relatively small number of laboratories using Siemens instruments (< 10) in the PT-based approach may have affected the reliability of the means. This limitation should be considered when selecting an appropriate method for σ calculation.
The TEa is another important parameter in σ calculations, and extensive efforts to understand, establish and unify analytical quality requirements are ongoing. In May 2014, the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) held its first Strategic Conference in Milan, Italy, under the theme “Analytical performance targets set 15 years after the Stockholm Conference”. At the conference, experts discussed in depth the progress made in setting analytical performance goals for clinical laboratories in the 15 years since the Stockholm meeting, and issued a consensus statement after the meeting (15). At present, TEa values are primarily derived from CLIA guidelines, although a few reports have used biological variation (6, 7, 9, 16, 17). In China, the Ministry of Health published analytical quality specifications for routine clinical biochemistry (WS/T 403-2012) in 2012, derived from data on within-subject and between-subject biological variation, while taking into account the quality of analysis currently achievable. However, these standards are expert-based and aim (at least in the case of CLIA) to set broad quality limits that the majority of laboratories can meet; CLIA guidelines, for example, are often considered “loose” in terms of analytical performance. The choice of TEa can lead to significant differences in σ values (10). This limitation should be considered when selecting an appropriate TEa for σ calculation.
We also compared the effects of TEa values on σ calculations. σWS/T was significantly lower than σCLIA in most assays, given that TEaWS/T is more stringent than TEaCLIA. For some assays, the analyser could not achieve even the 3σ level. In addition, the CLIA guidelines specify absolute limits for the K, Na and Ca assays, and for BT, Glc, Urea and CREA at low concentrations, whereas TEaWS/T uses percentage limits throughout. Therefore, for low-concentration specimens (201722, 45751), σCLIA was significantly higher than σWS/T for BT, CREA, Urea, K and Ca. When choosing a TEa source, laboratories should select the one most closely related to their own performance, to ensure continuous improvement in quality management. To avoid unnecessary burden, laboratories should neither pursue the best possible σ metrics as a goal in itself nor automatically select the most stringent TEa source.
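The effect of an absolute limit at low concentration can be shown with a short worked example. The sketch below converts a fixed absolute TEa into its equivalent percentage at two concentrations; the 0.5 mmol/L limit and the concentrations are illustrative inputs chosen for this example, and the function name `absolute_tea_as_percent` is hypothetical.

```python
def absolute_tea_as_percent(tea_abs, concentration):
    """Express an absolute allowable error as a percentage of the concentration."""
    if concentration <= 0:
        raise ValueError("concentration must be positive")
    return tea_abs / concentration * 100.0


# Illustrative: the same absolute limit at a low and a high concentration
print(absolute_tea_as_percent(0.5, 2.5))   # wider in relative terms
print(absolute_tea_as_percent(0.5, 5.0))   # narrower in relative terms
```

Because the relative allowance widens as concentration falls, an absolute limit tends to give a higher σ than a fixed percentage limit for low-concentration specimens with the same bias and SD.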
In the present study, only the reagents for CREA and BT on the Beckman AU5800 system were not obtained from the original manufacturer. The results showed a minimum σCLIA for BT and CREA of 5.6 and 17.4, respectively, when calculated from proficiency testing samples, and 7.4 and 8.4, respectively, when calculated from IQC. Both results were satisfactory. These findings indicate that both domestic and foreign reagents selected for routine laboratory testing can achieve a high quality level.
The performance of the analysers was also compared. The Beckman AU5800 and Roche C8000 systems each reached 3σ levels for seven assays, while the Siemens Dimension analyser reached 3σ levels for five assays. Performance for individual assays varied among the analysers, although the differences were not significant. A laboratory may select an analyser based on assay usage frequency while still considering the σ evaluation, thereby personalizing the selection. Different assays may also be assigned to different instruments based on these results. We found that, for the Siemens Dimension system, σ assessed by both methods had multiple values < 3 (14% for PT samples, 20% for IQC materials). Given that the Siemens Dimension instrument in our laboratory has been in daily use for more than 5 years, it was replaced by a new instrument in December 2017.
In conclusion, both methods of evaluating σ in this study can be used to assess the performance of a specific analyser, despite the observed differences in σ calculated by different methods. In the practical application of σ metrics for QC management, σ should be evaluated multiple times when optimizing a QC schedule.