Sigma metrics for assessing the analytical quality of clinical chemistry assays: a comparison of two approaches

Introduction. Two approaches were compared for the calculation of coefficient of variation (CV) and bias, and their effect on sigma calculation, when different allowable total error (TEa) values were used to determine the optimal method for Six Sigma quality management in the clinical laboratory.
Materials and methods. Sigma metrics for routine clinical chemistry tests using three systems (Beckman AU5800, Roche C8000, Siemens Dimension) were determined in June 2017 in the laboratory of Peking Union Medical College Hospital. Imprecision (CV%) and bias (bias%) were calculated for ten routine clinical chemistry tests using a proficiency testing (PT)- or an internal quality control (IQC)-based approach. Allowable total error from the Clinical Laboratory Improvement Amendments of 1988 and the Chinese Ministry of Health Clinical Laboratory Center Industry Standard (WS/T403-2012) were used with the formula: Sigma = (TEa − bias) / CV to calculate the Sigma metrics (σCLIA, σWS/T) for each assay for comparative analysis.
Results. For the PT-based approach, eight assays on the Beckman AU5800 system, seven assays on the Roche C8000 system and six assays on the Siemens Dimension system showed σCLIA > 3. For the IQC-based approach, ten, nine and seven assays, respectively, showed σCLIA > 3. Some differences in σ were therefore observed between the two calculation methods and the different TEa values.
Conclusions. Both methods of calculating σ can be used for Six Sigma quality management. In practice, laboratories should evaluate Sigma multiple times when optimizing a quality control schedule.


Introduction
Clinical laboratory testing results are important for ensuring patient safety. Approximately two-thirds of important clinical decisions on patient management are based on laboratory test results (1). Therefore, continuous improvement and the minimizing of errors in testing are the major goals of every clinical laboratory. Six Sigma (6σ) quality management is a data-based, customer-centered, advanced quality management model that has been recently developed and is used globally. Following its introduction in clinical laboratories, 6σ quality management has become a major research focus (2). Sigma (σ) metrics evaluate process capability. The clinical application of 6σ quality management involves the combined use of quality requirements and laboratory performance to quantitatively evaluate whether a laboratory meets clinical testing standards. This evaluation is typically based on the expected defect rate. The ultimate goal of 6σ quality management is to implement laboratory risk management and thus ensure patient safety.
Guo X. et al. Comparison of sigma calculated from two approaches
Several studies on the application of 6σ management in laboratory testing have been reported, including studies on theoretical methods and their significance, the evaluation of performance of different assays, and the optimization of quality control (QC) schedules based on performance evaluation (3)(4)(5). Sigma metrics quantitatively estimate quality based on the traditional parameters used in the clinical laboratory: allowable total error (TEa), bias and imprecision. Imprecision is usually expressed as standard deviation (SD) or coefficient of variation (CV). However, quality requirements from different sources may differ for the same assay, which can affect the parameter selection. Furthermore, different approaches to bias and CV calculation may influence the final σ calculation. Laboratories should thus be aware of such sources of variation when σ management is applied, as they may modulate the value of σ and the accuracy of σ measurement. For example, previous studies have taken bias from external quality assessment (EQA) survey reports (e.g., College of American Pathologists or the Randox International Quality Assessment Scheme), whereas CV was derived from the cumulative coefficient of variation of internal quality control (IQC) materials (6)(7)(8)(9). However, a strong correlation has been established between CV and the concentration of the test substance. If the concentration of IQC materials differs significantly from that of proficiency testing (PT) samples used in bias calculation, this method is inadequate for calculating σ. Bias may also be derived from reagent package inserts (10). Given that the cumulative average value of QC material may change over time, it is inappropriate to use the target value from manufacturers' QC material when calculating bias.
In fact, with the continuous development of information technology, vendors can statistically analyse IQC data from the majority of laboratories to determine a more appropriate "group mean". This approach can facilitate inter-laboratory quality management based on IQC, which has become popular among clinical laboratories. Using this group mean to calculate bias has been shown to be a convenient and reliable method.
Therefore, in the present study, we aimed to compare two approaches to the calculation of CV and bias and the effect on σ calculation at different TEa. The two methods used to calculate the σ value for 10 routine biochemical assays on three different analysers were a PT-based approach, where materials for routine clinical chemistry from the China National Center for Clinical Laboratories (NCCL) were evaluated for imprecision (CV%) in each assay and bias (bias%) was calculated by comparison with the group mean for each PT sample in the NCCL report; and an IQC-based approach, where IQC results were used to calculate the CV% of each assay and bias% was calculated by comparison with the global group mean. Both methods thus harmonized the source of bias and CV derived from the same sample, based on which variations in σ were compared despite the different sources of parameters. This method of evaluating σ has not previously been reported.

Materials
Ten assays were evaluated using the manufacturer/analyser combinations routinely used in Peking Union Medical College Hospital (Table 1). All reagents and calibrators for the three analysers were obtained from the original manufacturer except for creatinine (CREA; Maccura Biotechnology Co., Ltd., Chengdu, China) and bilirubin, total (BT; Wako Pure Chemical Industries, Ltd., Osaka, Japan) on the Beckman AU5800 system (Beckman Coulter, Inc., Brea, USA).

Methods
Sample preparation. PT materials (five lots: 201721-201725) for routine clinical chemistry were provided by NCCL (Peking, China). Samples were prepared using an analytical balance with an accuracy of 0.001 g. Powdered samples were dissolved in 3 mL deionized water, capped, maintained at room temperature for 30 min and gently mixed until dissolved.
Imprecision evaluation. During the PT period (June 5-9, 2017), the Clinical Laboratory Standards Institute (CLSI) EP15-A3 protocol was followed, with the same sample tested five times daily for albumin (Alb), alanine aminotransferase (ALT), potassium (K), sodium (Na), chloride (Cl), calcium (Ca), total bilirubin (BT), glucose (Glc), creatinine (CREA, enzymatic), and urea (Urea) on each analyser for five consecutive days (11). The mean, SD (within laboratory) and CV (within laboratory) for each test item were calculated.
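The within-laboratory SD and CV from a design of five days with five replicates per day can be estimated by a one-way analysis of variance. The following is a minimal sketch under that assumption, not the study's actual computation; the function name `within_lab_imprecision` and the example values are hypothetical.

```python
from statistics import mean


def within_lab_imprecision(runs):
    """Estimate mean, within-laboratory SD and CV% from a balanced design.

    runs: list of daily runs, each a list of replicate results
          (for example 5 days x 5 replicates). Balanced data are assumed.
    """
    k = len(runs)        # number of days
    n = len(runs[0])     # replicates per day
    if k < 2 or n < 2 or any(len(r) != n for r in runs):
        raise ValueError("need >= 2 days with the same number (>= 2) of replicates")

    run_means = [mean(r) for r in runs]
    grand_mean = mean(x for r in runs for x in r)

    # One-way ANOVA mean squares (within day and between days).
    ss_within = sum((x - m) ** 2 for r, m in zip(runs, run_means) for x in r)
    ms_within = ss_within / (k * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2 for m in run_means) / (k - 1)

    var_repeatability = ms_within
    # A negative between-day variance estimate is set to zero.
    var_between_day = max(0.0, (ms_between - ms_within) / n)

    sd_wl = (var_repeatability + var_between_day) ** 0.5
    return grand_mean, sd_wl, 100.0 * sd_wl / grand_mean


# Hypothetical potassium results (mmol/L), 5 days x 5 replicates.
example = [
    [4.02, 4.05, 3.98, 4.01, 4.03],
    [4.06, 4.04, 4.07, 4.03, 4.05],
    [3.99, 4.00, 4.02, 3.98, 4.01],
    [4.03, 4.05, 4.02, 4.04, 4.06],
    [4.01, 3.99, 4.00, 4.02, 4.03],
]
m, sd, cv = within_lab_imprecision(example)
print(f"mean={m:.3f}  SD_WL={sd:.3f}  CV_WL={cv:.2f}%")
```

Combining the repeatability and between-day components, rather than taking a single SD over all 25 results, keeps day-to-day shifts from being underweighted in the within-laboratory estimate.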
Bias calculation. Based on NCCL routine clinical chemistry requirements, the mean value of the instrument group (excluding data more than two standard deviations away from the mean) was used to verify the target value of each assay. Our mean (N = 25) for each assay from the different analysers was calculated as described above. Bias% was determined as (our mean − mean of all laboratories using the same instrument and method) / (mean of all laboratories using the same instrument and method) × 100.
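The bias calculation described above can be sketched as follows. This is an illustration only: the function names and peer-group values are hypothetical, and the sketch assumes a single exclusion pass using the sample SD, since the text does not state whether the exclusion is repeated or which SD estimator is used.

```python
from statistics import mean, stdev


def trimmed_group_mean(peer_results, n_sd=2.0):
    """Peer-group mean after excluding results more than n_sd SDs from the mean.

    Assumption: one exclusion pass with the sample SD.
    """
    m = mean(peer_results)
    s = stdev(peer_results)
    kept = [x for x in peer_results if abs(x - m) <= n_sd * s]
    return mean(kept)


def bias_percent(lab_mean, target):
    """Bias% = (lab mean - group mean) / group mean x 100."""
    return (lab_mean - target) / target * 100.0


# Hypothetical peer-group results for one PT sample (mmol/L).
peers = [4.0, 4.1, 3.9, 4.0, 4.1, 3.9, 4.0, 4.0, 4.0, 6.0]
target = trimmed_group_mean(peers)
print(f"group mean={target:.2f}  bias={bias_percent(4.1, target):.2f}%")
```

With few peer laboratories, one extreme result can inflate the SD enough to survive a two-SD cut, so the reliability of the target depends on group size.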

PT Sigma metrics
The σ values calculated using two TEa sources (σ WS/T and σ CLIA) for the three analysers are shown in Table 3. The TEa CLIA used absolute limits for the K, Na and Ca assays, whereas TEa WS/T used percentage limits for all assays and was more stringent than TEa CLIA. We showed that σ WS/T < σ CLIA for all [...] Significant differences in σ CLIA values were observed using the same assay at the same concentration but for different analysers. More intuitive σ CLIA levels for all assays at different concentrations [...]

IQC Sigma metrics
The σ metrics for the ten assays calculated from internal QC data (June 2017) are shown in Table 4. Similar to the PT results, σ WS/T was < σ CLIA in 9 out of 10 assays (all except Na), with ALT, BT and Glc reaching 6σ; Na reaching 4σ; and K and Ca reaching 3σ on the Beckman AU5800 analyser. On the Roche C8000 analyser, Glc, CREA, K and Na reached the 6σ level; ALT, BT and Urea reached 4σ; and Ca reached 3σ. For the Siemens Dimension analyser, CREA reached 6σ, Alb reached 5σ, and Glc and K reached 3σ.
For σ CLIA, all 10 assays were above 3σ on the Beckman AU5800 analyser, of which ALT, BT, CREA and Ca achieved 6σ at two levels of QC materials. For the Roche C8000 system, σ CLIA < 1σ for Cl, whereas all other nine items were > 3σ, and BT, CREA, K and Ca achieved 6σ while ALT reached 5σ. For the Siemens Dimension analyser, the σ CLIA values for Glc, Na and Cl were < 3, but Alb, CREA, K and Ca achieved 6σ. Supplementary Figure 2 displays more intuitive σ CLIA levels for the 10 assays at various concentrations using the three analysers. We also calculated σ metrics for the 10 assays based on IQC data from January-June 2017 and observed some differences compared with the σ metrics calculated from the IQC data in June (Supplementary Table 1).

Comparison of σ Metrics Between Methods
In Figure 1, the differences in σ CLIA as calculated using the two methods and three analysers are shown. For some analytes, the values of σ CLIA derived from the two approaches are significantly different. For example, σ CLIA for the PT-based approach versus the IQC-based approach at similar concentrations was 6.5 (201722) versus 3.9 (45751) for Alb and 1.4 (201722) versus 4.6 (45751) for Na on the Beckman AU5800 analyser. To allow comparison of the differences between the two methods of calculating σ CLIA at similar concentrations, we have listed the σ CLIA values from IQC materials and PT samples in Supplementary Figure 3.

Discussion
Evaluating the quality of laboratory testing is an important research topic in clinical laboratories. Six Sigma quality standards take bias (systematic error) and CV (random error) into account to systematically and extensively guide quality management in clinical laboratories while analysing possible causes of error, identifying solutions, better assuring testing quality and optimizing the QC schedule. However, the optimum TEa, bias, CV, and other indicators to calculate 6σ remain unclear, particularly when the sources of bias and CV vary between laboratories. We therefore compared two new approaches to calculate σ metrics as a future reference for the application of 6σ quality management in clinical laboratories.
To our knowledge, the present study is the first to use PT samples to assess imprecision and further calculate σ values. We obtained CV values from these samples to ensure that the same source of bias and CV were used in σ calculations. Given that PT samples typically have five different levels of concentrations, and it is easier to cover different levels of an assay for medical decision-making, this approach conveniently evaluates σ at different concentrations to better indicate analyser performance. A limitation of the PT-based approach is the assessment of short-term imprecision, which may lead to a lower CV and overestimated σ value. According to the manufacturer's instructions for the PT materials, prepared samples are stable for only seven days, and so are unsuitable for use in long-term evaluations.
To compare differences in σ levels calculated by the two methods, we used the relatively long-term CV calculated in the same month of IQC data in the IQC-based approach, thus accounting for other factors (batch number, instrument status, calibration, personnel, temperature, humidity, etc.).
Our findings indicate that, for some assays, the σ values derived from the two approaches are significantly different. These differences may have significant consequences for QC rule selection; for example, a 4σ method requires multi-rule QC while a 6σ method can be controlled by a simple, single-rule QC. [...] been present, possibly due to the short detection time period when evaluating σ with PT. In addition, the σ value itself is influenced by both bias and imprecision, and in clinical laboratories, multiple factors can influence these parameters. Where the σ value is not satisfactory, the cause should be determined, whether it is a bias or an imprecision issue, and an appropriate solution should be identified. In the PT-based approach, the imprecision was mostly acceptable, and the suboptimal σ values may have been mainly attributable to bias. If a laboratory wants to use individualized quality control rules, it may select one method for evaluating σ level and use another method for verification.
We also compared the σ calculation in the IQC-based approach during different time periods and found that the 1- and 6-month values differed significantly for some analytes (Table 4 and Supplementary Table 1). As greater imprecision is expected with prolonged time, 6-month σ values were expected to be lower. However, this was not observed in all cases; for example, Glc (low QC level, 45751) on the Siemens analyser was 2.5 at 1 month versus 5.1 at 6 months, and Urea (low QC level, 45751) on the Beckman analyser was 4.7 versus 7.2. From the original data analysis, the difference in Glc on the Siemens analyser was due to the larger CV observed in June. The difference in Urea on the Beckman analyser, however, may have been the result of periodic biases with different signs averaging out to a negligible long-term bias. In practice, laboratories should exercise caution when implementing "Westgard Sigma Rules" in quality control based on σ values, because these values may change continuously with respect to precision and bias arising, for example, from calibrations, reagents, or personnel (4). Because σ values for some analytes may differ between periods, as described above, laboratories should monitor the σ level continuously when using individualized quality control rules based on the σ evaluation.
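To illustrate why a shift in σ between evaluation periods matters for rule selection, the sketch below maps a σ value to a QC procedure. The mapping follows a commonly described form of the "Westgard Sigma Rules" for two control levels and is an assumption here, not something specified in this study; the function name `westgard_sigma_rule` is hypothetical. The σ values are the 1-month and 6-month figures quoted above.

```python
def westgard_sigma_rule(sigma):
    """Return a QC procedure for a given sigma (assumed mapping, two control levels).

    N = control measurements per run, R = number of runs.
    """
    if sigma >= 6:
        return "1_3s (N=2, R=1)"
    if sigma >= 5:
        return "1_3s/2_2s/R_4s (N=2, R=1)"
    if sigma >= 4:
        return "1_3s/2_2s/R_4s/4_1s (N=4, R=1 or N=2, R=2)"
    return "1_3s/2_2s/R_4s/4_1s/8_x (N=4, R=2 or N=2, R=4)"


# 1-month versus 6-month sigma values quoted in the text (low QC level, 45751).
periods = {
    "Glc, Siemens": (2.5, 5.1),
    "Urea, Beckman": (4.7, 7.2),
}
for assay, (one_month, six_month) in periods.items():
    print(assay)
    print("  1 month :", one_month, "->", westgard_sigma_rule(one_month))
    print("  6 months:", six_month, "->", westgard_sigma_rule(six_month))
```

For both analytes the selected procedure differs between the two periods, which is the practical reason for re-evaluating σ before fixing an individualized QC schedule.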
Bias can significantly impact the σ metric. It can (theoretically) be corrected, while imprecision is more difficult to influence. However, bias is generally more difficult to estimate. The most reliable way to do so is to use a reference method. Most studies, including the present study, use group means from EQA data on the same instruments and methods as a target, rather than the reference method. Therefore, the observed bias is only "arbitrary" rather than "true". External quality assessment peer group evaluation has also been shown to be insufficient in determining analytical quality and may compromise patient care, despite its acceptance by participating laboratories and manufacturers (14). This approach is therefore a limitation of the present study, but also represents a common limitation of 6σ for quality management at present, as most routine laboratory testing does not have a reference method that can be conveniently implemented. Determining bias from proficiency testing or global QC reports thus remains the primary approach in current 6σ evaluations. It is therefore critical to select the appropriate group when using the group mean to assess bias. In the PT-based approach used in our study, group means were calculated after excluding data more than two standard deviations away from the mean in order to exclude extreme values. Compared with the IQC-based approach, the relatively small number of laboratories using Siemens instruments (< 10) in the PT-based approach may have affected the reliability of the means. This limitation should be considered when selecting an appropriate method for σ calculation.
The TEa is another important parameter in σ calculations, and extensive efforts to understand, establish and unify the quality of testing and analysis are ongoing. In May 2014, the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) held its first strategic conference in Milan, Italy, under the theme "Analytical performance targets set 15 years after the Stockholm Conference". At the conference, experts discussed in depth the progress made in setting analytical performance goals in clinical laboratories in the 15 years since the Stockholm meeting, and issued a consensus statement afterwards (15). At present, TEa values are primarily derived from CLIA guidelines, although a few reports have used biological variation (6,7,9,16,17). In China, the Ministry of Health published analytical quality specifications for routine clinical biochemistry (WS/T 403-2012) in 2012, derived from data on within-subject and between-subject biological variation, while taking into account the quality of analysis currently achievable. However, these standards are expert-based and aim (at least for CLIA) to set broad quality limits that will include the majority of laboratories; CLIA guidelines, for example, are often considered "loose" in terms of analytical performance. The TEa value selection can lead to significant differences in the evaluation of σ values (10). This limitation should be considered when selecting an appropriate TEa for σ calculation.
We also compared the effects of TEa values on σ calculations. The σ WS/T value was significantly lower than that of σ CLIA in most assays, given that TEa WS/T is more stringent than TEa CLIA. For some assays, the analyser could not achieve even the 3σ level. In addition, absolute limits were used in the CLIA guidelines for the K, Na and Ca assays, and for BT, Glc, Urea and CREA at low levels, whereas percentage limits were used throughout for TEa WS/T. Therefore, for low-concentration specimens (201722, 45751), σ CLIA was significantly higher than σ WS/T for BT, CREA, Urea, K and Ca. When screening TEa sources, the source most closely related to the performance for a given laboratory should be selected to ensure continuous improvement in quality management. Laboratories should not pursue the best σ metrics as a laboratory goal, nor should they select the most stringent TEa sources, to avoid unnecessary burden on laboratories.
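The effect of an absolute versus a percentage TEa at low concentrations can be sketched with the formula Sigma = (TEa − bias) / CV. The example is illustrative only: the function names are hypothetical, the limits and performance figures are placeholders rather than values from the study's tables, and taking the absolute value of bias is an assumption about how signed bias is handled.

```python
def tea_as_percent(tea_value, concentration, absolute=False):
    """Express TEa as a percentage of the sample concentration.

    If `absolute` is True, tea_value is in concentration units and is
    converted; otherwise it is already a percentage and returned unchanged.
    """
    if absolute:
        return tea_value / concentration * 100.0
    return tea_value


def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma = (TEa% - |bias%|) / CV%. Using |bias| is an assumption."""
    if cv_pct <= 0:
        raise ValueError("CV% must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


# Placeholder limits for a potassium-like assay: an absolute limit of
# 0.5 mmol/L versus a percentage limit of 6%. Bias and CV are hypothetical.
bias_pct, cv_pct = 1.0, 1.5
for conc in (3.0, 6.0):
    tea_abs = tea_as_percent(0.5, conc, absolute=True)
    tea_rel = tea_as_percent(6.0, conc)
    print(
        f"conc={conc} mmol/L  "
        f"sigma(absolute TEa)={sigma_metric(tea_abs, bias_pct, cv_pct):.1f}  "
        f"sigma(percentage TEa)={sigma_metric(tea_rel, bias_pct, cv_pct):.1f}"
    )
```

Because a fixed absolute limit is a larger fraction of a low concentration, the σ obtained from it rises as concentration falls, whereas the σ from a percentage limit does not change with concentration when bias% and CV% are held constant.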
In the present study, only the reagents for CREA and BT on the Beckman AU5800 system were not obtained from the original manufacturer. The results showed a minimum σ CLIA for BT and CREA of 5.6 and 17.4, respectively, when calculated from proficiency testing samples, and 7.4 and 8.4, respectively, when calculated from IQC. Both results were satisfactory. These findings indicate that both domestic and foreign reagents selected for routine laboratory testing can achieve a high quality level.
The performance of the analysers was also compared. The Beckman AU5800 and Roche C8000 systems each reached 3σ levels for seven assays, while the Siemens Dimension analyser reached 3σ levels for five assays. Different assays showed variations in performance among the analysers, although these variations were not significant. A laboratory may select an analyser based on assay usage frequency while still considering the σ evaluation, thereby personalizing the selection. Different assays may also be assigned to different instruments based on these results. We found that, for the Siemens Dimension system, σ assessed by both methods had multiple values < 3 (14% for PT samples, 20% for IQC materials). Given that the Siemens Dimension instrument in our laboratory has been in daily use for more than 5 years, it was replaced by a new instrument in December 2017.
In conclusion, both methods of evaluating σ in this study can be used to assess the performance of a specific analyser, despite the observed differences in σ calculated by different methods. In the practical application of σ metrics for QC management, σ should be evaluated multiple times when optimizing a QC schedule.