Comparison between Sigma metrics in four accredited Egyptian medical laboratories in some biochemical tests: an initiative towards sigma calculation harmonization

Introduction: Analytical quality is an essential requirement for best practice in any medical laboratory. The lack of a harmonized approach to sigma calculation is an obstacle to objective comparison of analytical performance among laboratories adopting sigma metrics. Laboratory professionals interested in analytical quality urgently need to work towards a harmonized protocol for sigma calculation so that they can properly select their analytical goals. This study aims at harmonization of Sigma metrics calculation in four accredited Egyptian laboratories. Materials and methods: This observational cross-sectional study compared the sigma levels for certain biochemical parameters in the four participating laboratories. Results: Coefficient of variation (CV) and bias were determined for some biochemical analytes assayed on different automated analysers in the four accredited laboratories. The sigma level of each biochemical parameter was calculated for each of the four laboratories, and sigma levels changed after the total allowable error (TEa) was unified among the participating laboratories. Conclusion: Each laboratory should select its TEa goal based on clear, standardized selection criteria without subjective preferences. Either under- or overestimation of Sigma metrics harms patient-centred care, because quality control procedures based on incorrectly calculated Sigma metrics are applied wrongly and lead to misleading medical decisions.


Introduction
Analytical quality is an essential requirement for best practice in any medical laboratory. Patient-centred care, the main target of medical laboratories, depends on two key concepts. The first is internal quality control (IQC), which was established by Levey and Jennings and followed by the interpretative rules published by Westgard and colleagues. The second is external quality assurance (EQA) programs, which were established in the late nineties as a complementary pillar to IQC and provide a tool for peer comparison (1)(2)(3)(4)(5).
Consequently, the minimization of analytical imprecision reflected as random errors via proper IQC plans and minimization of analytical bias seen as systematic errors through EQA programs are considered fundamental tools for any quality management system in laboratory medicine (6).
Quality decision specifications based on laboratory performance characteristics (bias and imprecision) were recommended many years ago as a mechanism to support quality in medical laboratories. It is now necessary to determine how the performance of a measurement procedure relates to the medical requirements for interpreting results, in order to determine how frequently quality control (QC) samples should be measured and their results evaluated (7,8). Sigma metrics (SM) have been used to assess quality in a quantitative manner. There are two different methodologies for assessing process performance in terms of Sigma metrics. The first method depends on counting the defects or errors, which are expressed as defects per million (DPM); the DPM are subsequently converted to an SM scale of 0 to 6, with 6 being world class (3.4 defects per million) and 3 being the minimum level of performance (about 66,800 defects per million) (9).
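The conversion from a defect rate to the sigma scale can be sketched as follows. This is a minimal illustration only: it assumes the conventional 1.5-sigma long-term shift, which is what the figures quoted above (3.4 and about 66,800 DPM) imply, and the function name `dpm_to_sigma` is hypothetical.

```python
from statistics import NormalDist


def dpm_to_sigma(dpm: float, shift: float = 1.5) -> float:
    """Convert defects per million (DPM) to a sigma level.

    Assumption: the conventional 1.5-sigma long-term shift is added to the
    normal quantile of the defect-free fraction.
    """
    if not 0 < dpm < 1_000_000:
        raise ValueError("dpm must lie strictly between 0 and 1,000,000")
    defect_free_fraction = 1 - dpm / 1_000_000
    return NormalDist().inv_cdf(defect_free_fraction) + shift


print(round(dpm_to_sigma(3.4), 1))     # world class end of the scale
print(round(dpm_to_sigma(66_807), 1))  # minimum acceptable end of the scale
```

Under this assumption, 3.4 DPM maps to about 6 sigma and roughly 66,800 DPM to about 3 sigma, matching the scale described in the text.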
The second method depends mainly on measuring the variation of the measurement process to predict its performance and evaluate how well a measurement procedure performs, using the two pillars of performance characteristics (bias and precision) and the total allowable error (TEa) (10). The goal is to seek 6-sigma (world class) quality, with the common minimum level of acceptable quality broadly considered to be 3-sigma (11).
The use of SM offers many advantages to laboratories as it helps in determining their IQC frequency; thus avoiding repeated IQC testing during periods of stable performance, consequently minimizing unnecessary costs and human-hour wastage. In addition, it facilitates the comparison of the same assay performance across multiple systems (9,12,13).
To the best of our knowledge, this is the first study to tackle the variation in Sigma metrics calculation among accredited laboratories in Egypt. Most laboratories calculate sigma and compare the results while using different methods for calculating the sigma elements (bias and CV) and for selecting a suitable TEa. This variation might affect the comparability of analytical performance even though the laboratories are all accredited. The study also sheds light on some key points in Sigma metrics calculation that allow laboratorians to use this valuable tool to assess method performance more objectively.
This study aims at harmonization of Sigma metrics calculation following a standardized protocol, in order to improve its utility for evaluating performance among accredited laboratories, which is in turn reflected in patient-centred care, and to make sigma values comparable in a more objective way.

Study design
The current study is an observational cross-sectional study that was conducted after approval by the research ethics committee of each university; the whole study design was approved by the ethics committee of the Medical Research Institute. The presented data are from four Egyptian medical laboratories accredited to International Organization for Standardization (ISO) 15189:2012: the Chemical Pathology Department of the Medical Research Institute, Alexandria University (MRI laboratory), a hospital laboratory in Alexandria governorate; the Zagazig University hospital laboratory, located in Zagazig governorate; the Ain Shams University hospital laboratory, located in Cairo governorate; and a private laboratory in Alexandria governorate.

Methods
Coefficient of variation (CV, from IQC records) and bias (from proficiency testing data) were determined for some biochemical analytes assayed on different automated analysers in the four accredited laboratories; Sigma metrics were then calculated.
Estimated parameters were glucose (Glc), urea, creatinine (CREA), uric acid (UA), cholesterol (CHOL), triglycerides (Tg), albumin (Alb), bilirubin, direct (BD), bilirubin, total (BT), total protein (TP), calcium (Ca), inorganic phosphates (Phos), magnesium (Mg) and potassium (K). Moreover, the following enzyme activities were measured: alanine aminotransferase (ALT), alkaline phosphatase (ALP), lactate dehydrogenase (LD) and gamma-glutamyltransferase (GGT). Two immunoassay parameters were also evaluated: alpha fetoprotein (AFP) and carcinoembryonic antigen (CEA). Some parameters were not calculated by all four laboratories; they are presented in the current study to show the effect of different TEa on sigma calculation and to emphasize the need for harmonization of the current sigma calculation.
The instruments used by the four laboratories were as follows. In the MRI laboratory, an Olympus AU 400 (Beckman Coulter International SA, Nyon, Switzerland) was used; most biochemical reagents were dedicated Beckman Coulter reagents, except bilirubin (Spectra, Cairo, Egypt) and creatinine (Randox, Antrim, United Kingdom). Immunoassay parameters were assayed on an Immulite 1000 (Siemens Healthineers GmbH, Erlangen, Germany), while an AVL 9180 (Roche Diagnostics GmbH, Mannheim, Germany) was used to assay potassium. In the Zagazig University hospital laboratory, a Cobas 8000 modular system (Roche Diagnostics GmbH, Mannheim, Germany) was used to assay both biochemical and immunoassay parameters, while an AVL 9180 (Roche Diagnostics GmbH, Mannheim, Germany) was used to assay potassium. In the Ain Shams University hospital laboratory, an Olympus AU 480 with dedicated Beckman Coulter reagents (Beckman Coulter International SA, Nyon, Switzerland) was used, and an AVL 9180 (Roche Diagnostics GmbH, Mannheim, Germany) was used to assay potassium. The data collected from the Ain Shams University hospital laboratory lacked the immunoassay parameters. Finally, the private laboratory used a Cobas c501 (Roche Diagnostics GmbH, Mannheim, Germany) to assay biochemical parameters.
Internal QC data were extracted from the analysers' records from January to May 2016 (130 QC runs over 130 working days, one run per day). Control materials were run before each analytical run. Each laboratory followed its own customized internal quality control and calibration protocol, in accordance with its internal quality control policies and procedures. Each laboratory selected the TEa according to its current analytical performance.
Internal quality control (IQC) data (same lot within each laboratory, with level 1 QC values determined by the manufacturers) were used by the four laboratories to determine the CV of each parameter after exclusion of outliers (QC observations that violate the 1-3s rule, that is, fall outside the mean ± 3 SD). External quality assurance (EQA) data were used by the four laboratories to determine the bias for each analyte. EQA results for at least 3 months were included. The EQA programs used by the four laboratories were not accuracy based; the mean of the comparator group was taken as the consensus peer value. The comparator mean was selected according to each laboratory's method and instrument, so none of the participating laboratories used true values to determine the bias of the studied biochemical parameters.
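The CV estimation step described above can be sketched as follows. This is an illustrative sketch rather than the procedure of any participating laboratory: it assumes the 1-3s limits are built from an established target mean and SD (for example the manufacturer's assigned values), and the function name and QC values are hypothetical.

```python
from statistics import mean, stdev


def cv_after_outlier_exclusion(values, target_mean, target_sd):
    """Return (CV in %, retained values) after dropping 1-3s violations.

    Assumption: control limits are target_mean +/- 3 * target_sd, taken from
    established values rather than re-estimated from the data.
    """
    lower = target_mean - 3 * target_sd
    upper = target_mean + 3 * target_sd
    kept = [v for v in values if lower <= v <= upper]
    if len(kept) < 2:
        raise ValueError("need at least two in-control observations")
    return stdev(kept) / mean(kept) * 100, kept


# Hypothetical level 1 QC results; 131.0 lies outside mean +/- 3 SD.
qc_results = [98.0, 101.0, 100.0, 99.0, 102.0, 131.0, 100.0]
cv, kept = cv_after_outlier_exclusion(qc_results, target_mean=100.0, target_sd=2.0)
print(f"CV = {cv:.2f}% from {len(kept)} observations")
```

Whether the limits come from assigned or cumulative laboratory values changes which points are excluded, which is one of the sources of between-laboratory variability the study discusses.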
The participating laboratories took part in the Bio-Rad monthly program as their external quality assessment scheme, which consisted of twelve monthly samples in each cycle. According to the manufacturer, all samples for the entire cycle were provided at the same time. All submitted results for each analyte were grouped according to comparators (peer, method, and mode/all results), and an ISO 13528 robust statistical analysis was then performed (14).
The approach used in the current study to calculate the Sigma metrics relied on method performance measurement. For laboratory measurements, the Sigma metric is calculated by the following formula (8): Sigma metric = (TEa - bias observed) / CV observed, where CV is the coefficient of variation. The studied parameters were sorted into 6 categories: world class performance (SM = 6 or more), excellent performance (SM = 5-6), good performance (SM = 4-5), marginal performance (SM = 3-4), poor performance (SM = 2-3) and unacceptable performance (SM less than 2).
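The formula and the six performance categories above can be sketched as follows. This is a minimal illustration under two assumptions not stated in the text: the absolute value of the bias is used, and each category includes its lower bound but not its upper bound. The function names and input values are hypothetical.

```python
def sigma_metric(tea_pct: float, bias_pct: float, cv_pct: float) -> float:
    """Sigma metric = (TEa - |bias|) / CV, with all inputs in percent."""
    if cv_pct <= 0:
        raise ValueError("CV must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


def classify_sigma(sigma: float) -> str:
    """Map a sigma metric to the six categories used in the study."""
    bands = [
        (6.0, "world class"),
        (5.0, "excellent"),
        (4.0, "good"),
        (3.0, "marginal"),
        (2.0, "poor"),
    ]
    for lower_bound, label in bands:
        if sigma >= lower_bound:
            return label
    return "unacceptable"


# Hypothetical inputs: TEa 10%, bias 2%, CV 2%.
sigma = sigma_metric(tea_pct=10.0, bias_pct=2.0, cv_pct=2.0)
print(sigma, classify_sigma(sigma))
```

A negative sigma, such as arises when the observed bias exceeds the TEa, falls into the unacceptable category under this scheme.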
The CV was estimated from the QC data as previously described. It is critically important that the CV is estimated from QC data that represent all or most components of variability occurring over an extended time period. A CV that represents stable measurement performance can usually be estimated from the cumulative standard deviation (SD) over a 6 to 12-month period for a single lot of QC material (8). It is worth noting that no unified protocol was used in the current study to estimate either the CV or the mean % bias of the studied parameters; each laboratory calculated these performance characteristics according to its own policy.
Two Sigma metrics were calculated for each parameter using 2 different CVs that were obtained from 2 levels IQC data.
The harmonization protocol (Annex 1) is a novel protocol that was suggested by the working group of the current study. It aims at checking most of the key points that are considered as potential sources of variability in Sigma metrics calculation.

Data presentation
The coefficient of variation was calculated as CV = (standard deviation / mean) x 100. The bias (%) of a single PT measurement in the Bio-Rad EQAS programs was calculated as [(mean of all laboratories using the same instrument and method - laboratory mean) / mean of all laboratories using the same instrument and method] x 100. Mean bias (%) was calculated as the sum of all % bias values of the PT results for a specific parameter divided by the number of PT values. The Sigma metric was calculated as (TEa - bias observed) / CV observed.
Table 1 shows the performance characteristics of the parameters from the Chemical Pathology Laboratory, Medical Research Institute, Alexandria University; Sigma metrics were calculated using the total allowable errors from the different sources as shown. Among the assayed parameters total bilirubin had the highest sigma (10.5) while magnesium had the lowest sigma value (-0.7).
Table 2 shows the performance characteristics of the parameters assayed in the Zagazig University hospital laboratory. Sigma metrics were calculated using the total allowable errors from the different sources as shown in the table. Triglycerides and direct bilirubin had the highest sigma (10.7) while calcium had the lowest sigma value (0.7) among the assayed parameters.
Abbreviations: TEa - total allowable error. CV - coefficient of variation. BV - biological variation. RiliBÄK - guidelines of the German medical association for the quality assurance of laboratory medical examinations. AFP - alpha fetoprotein. ALT - alanine aminotransferase. Alb - albumin. ALP - alkaline phosphatase. BD - bilirubin, direct. BT - bilirubin, total. CEA - carcinoembryonic antigen. LD - lactate dehydrogenase. GGT - gamma-glutamyltransferase.
Table 3 shows the performance characteristics of the parameters from the Ain Shams University hospital laboratory; Sigma metrics were calculated using the total allowable errors from the different sources as shown in the table. Among the assayed parameters glucose had the highest sigma (4.9) while albumin had the lowest sigma value (0.4).
Figures 1-4 demonstrate the precision and accuracy of the studied parameters for the four participating laboratories using the method decision chart, while Figure 5 shows the different sigma metrics in the four laboratories using the same total allowable error.
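The bias calculation described at the start of this section can be sketched as follows. This is an illustration only: the sign convention (peer mean minus laboratory result) follows the formula as written in the text, the mean is taken over signed values because the text does not say otherwise, and the function names and PT values are hypothetical.

```python
def percent_bias(peer_mean: float, lab_result: float) -> float:
    """% bias of one PT sample relative to the peer-group mean."""
    return (peer_mean - lab_result) / peer_mean * 100


def mean_percent_bias(pt_pairs) -> float:
    """Average the signed % bias over (peer_mean, lab_result) pairs."""
    biases = [percent_bias(peer, lab) for peer, lab in pt_pairs]
    return sum(biases) / len(biases)


# Hypothetical PT samples: (peer-group mean, laboratory result).
pt_samples = [(100.0, 98.0), (200.0, 204.0), (150.0, 147.0)]
print(round(mean_percent_bias(pt_samples), 2))
```

With signed values, positive and negative deviations partly cancel; averaging absolute values instead would give a larger bias and a lower sigma, so the choice is one more item a harmonized protocol would need to fix.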

Discussion
This study was conducted over one year in four accredited medical laboratories. The sigma values were calculated using the performance-based approach and were compared with each other to highlight the lack of an objective method of comparison. Control of analytical performance is an essential procedure for medical laboratories, especially those seeking accreditation; it is achieved through method verification, which is itself a standardized process but has no harmonized approach.
Implementing harmonized QC procedures in a medical laboratory requires both knowledge and practical updates. For each of these updates a lot of considerations can be made and a lot of prob-
TEa sources are variable: the biological variation data used by Ricos and her colleagues are completely different from the PT limits used by CLIA. Even when the same source, such as biological variation (BV), is chosen, a TEa might not be available (e.g. direct bilirubin) or, if available, might vary as the underlying studies are updated. Even if all these sources of variability are nullified for the sake of harmonization, BV will answer the question of method performance in three different ways according to which BV tier is used (the optimal TEa equals half the desirable TEa and one third of the minimal TEa).
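The effect of the three BV tiers on the sigma metric can be sketched as follows. The tier ratios are derived from the relationship stated above (optimal equals half the desirable TEa and one third of the minimal TEa, so minimal equals 1.5 times desirable); the function names and the input values are hypothetical and are not taken from any BV database.

```python
def bv_tea_tiers(desirable_tea_pct: float) -> dict:
    """Derive the optimal and minimal TEa tiers from the desirable TEa."""
    return {
        "optimal": 0.5 * desirable_tea_pct,
        "desirable": desirable_tea_pct,
        "minimal": 1.5 * desirable_tea_pct,
    }


def sigma_by_tier(desirable_tea_pct: float, bias_pct: float, cv_pct: float) -> dict:
    """Sigma metric for the same bias and CV under each BV tier."""
    tiers = bv_tea_tiers(desirable_tea_pct)
    return {name: (tea - abs(bias_pct)) / cv_pct for name, tea in tiers.items()}


# Hypothetical analyte: desirable TEa 8%, bias 1%, CV 2%.
print(sigma_by_tier(desirable_tea_pct=8.0, bias_pct=1.0, cv_pct=2.0))
```

In this hypothetical case the same method is rated unacceptable, marginal or excellent depending only on the tier chosen.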
Precision and bias verification are considered the cornerstones of the verification procedures and, as mentioned previously, neither has a harmonized protocol: the materials used and the targets to be achieved differ. Even with the same inputs, the outputs may differ. For example, when external quality assessment material is used to determine bias, the mean percent bias will differ according to the number, levels, commutability and uncertainty of the materials used, as all of these characteristics differ from one program to another.
Sigma metrics calculation harmonization will help laboratories not to waste time and effort analysing SM values and changing TEa sources to fit each analyte. After harmonization, laboratory managers' efforts will be directed towards the possible causes of poor performance. Taking calcium in this study as an example, its sigma was unacceptable in three out of four laboratories, and this might be due to: improper reagent handling (shipment, storage, preparation or on-board stability); poor calibration or quality control procedures (reconstitution vehicle, storage and/or mixing); personnel incompetency; lack of instrument preventive maintenance; or insufficient monitoring of environmental conditions.
After reaching the right root cause, we will have the opportunity to select the proper corrective action and eventually achieve the medical laboratories' ultimate goal, which is high-quality patient care.
As an initiative to harmonize sigma calculations, this study compared the sigma levels in four accredited medical laboratories (three university laboratories and one private laboratory) and, on that basis, suggested a harmonized protocol for sigma calculation (Annex 1), which is novel to the best of our knowledge.
Our results showed different sigma levels for different parameters, calculated using the different TEa selected by each laboratory, resulting in different categories of performance. This was in agreement with Schoenmakers et al., who discussed the variables that affect sigma calculations and concluded that the use of a road map based on sigma metrics leads to fast and easy implementation of optimal Westgard QC rules. This approach needs standardization in order to lead to better patient care and ultimately to a reduction in costs (16).
We compared the sigma levels of the participating laboratories after unifying the selected TEa: most parameters were evaluated against the CLIA TEa, and those with no CLIA limits were evaluated against the RCPA TEa. Table 5 shows the SM calculated in the four laboratories after harmonization of the TEa source. These results highlight how the selection of the TEa source can affect the sigma level significantly, in a way that may obscure the analytical performance.
Comparing the SM using the same TEa, as a step towards harmonization, gives a more realistic and more objective indicator of performance than using a different TEa in each laboratory. For example, according to the data calculated by each laboratory, total bilirubin performed far worse in the MRI laboratory (SM 2.0) than in the Zagazig laboratory (SM 7.8), whereas after one common TEa (CLIA) was applied in both laboratories, the SM at MRI (7.8) showed better performance than that at Zagazig (4.1). Another example was glucose, which according to the data calculated by each laboratory had almost the same SM at MRI and Zagazig; this proved to be wrong when the TEa was harmonized in both laboratories (CLIA), which showed that glucose performance at MRI (6.3) was better than at Zagazig (3.9). Moreover, the data calculated by each laboratory showed that magnesium had poor performance at the MRI and Zagazig laboratories and excellent performance at the private laboratory. After the source of TEa was unified and the magnesium SM was recalculated using the same TEa (CLIA), the performance of magnesium at Zagazig proved to be excellent and even better than that of the private laboratory (6 and 5.6, respectively), and the magnesium SM at MRI also showed some increase (3.5).
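The TEa unification step described above (use the CLIA limit where one exists, otherwise fall back to RCPA) can be sketched as follows. This is a hedged illustration: the analyte names, the TEa numbers and the laboratory bias and CV values are placeholders and are not the published CLIA or RCPA limits or the study data; the function names are hypothetical.

```python
# Placeholder TEa limits in percent; NOT the published CLIA or RCPA values.
TEA_SOURCES = {
    "CLIA": {"analyte_a": 10.0},
    "RCPA": {"analyte_a": 8.0, "analyte_b": 12.0},
}


def harmonized_tea(analyte, sources=TEA_SOURCES, priority=("CLIA", "RCPA")):
    """Return (source, TEa %) from the first source in `priority` listing the analyte."""
    for source in priority:
        limit = sources.get(source, {}).get(analyte)
        if limit is not None:
            return source, limit
    raise KeyError(f"no TEa available for {analyte!r}")


def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma metric = (TEa - |bias|) / CV, all in percent."""
    return (tea_pct - abs(bias_pct)) / cv_pct


# Hypothetical (bias %, CV %) for analyte_a in two laboratories.
labs = {"lab_1": (1.0, 1.5), "lab_2": (2.0, 2.0)}
source, tea = harmonized_tea("analyte_a")
harmonized = {lab: sigma_metric(tea, bias, cv) for lab, (bias, cv) in labs.items()}
print(source, harmonized)
```

Because every laboratory is scored against the same limit, differences in the resulting sigma values reflect bias and CV alone, which is the point of the comparison in Table 5.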
The limitation of this study included: the Sigma metric equation as formulated by Westgard (17) (25). In 2009, a convocation of experts on quality control issued a collective opinion paper recommending the use of Sigma metrics in the Westgard formulation (26). Simply put, the Coskun formulation of the Sigma metrics is neither in wide acceptance nor wide use.
Many variables affect the comparability of estimated Sigma metrics among medical laboratories. These include: the time interval over which the Sigma metrics are calculated; the different vendor systems providing the external proficiency testing and quality control programs from which bias and imprecision values are calculated; different environmental conditions; the variability in the methods used for bias calculation; and the different analytical or clinical benchmarks chosen for evaluation of TEa.
The laboratories participating in this study determined analytical bias through EQA programs where the bias was calculated through the difference between laboratory result and that of the EQA group mean against the group mean. Therefore, this is not a true value.
In conclusion, this study is considered the first to highlight the need for Sigma metrics harmonization. All laboratory professionals interested in analytical quality should therefore harmonize the approach to sigma calculation, with special emphasis on the bias and CV, which are the main components of the sigma equation, and should unify the methodology used among different laboratories. For bias calculation, if PT samples are used as the source, it is recommended to standardize the calculation by using duplicate readings of a number of materials with different concentrations, to exclude the element of random error. This in turn will help laboratories to find a unified, objective tool to judge their methods correctly. Finally, the TEa sources shall be rigorously reviewed and only approved sources shall be adopted for calculation.
Each laboratory should select its TEa goal based on clear, standardized selection criteria without subjective preferences. Either under- or overestimation of Sigma metrics harms patient-centred care, because quality control procedures based on incorrectly calculated Sigma metrics are applied wrongly and lead to misleading medical decisions. The performance of laboratories using different tolerance limits cannot be compared using the sigma approach. Further studies adopting the concept of a harmonized approach should be conducted by accredited laboratories in different sectors. One of the most important outcomes of this study is the suggested harmonized protocol presented in Annex 1. A steering committee consisting of the laboratory director, technical manager and quality manager should be involved in the selection of the TEa in each laboratory according to the following criteria.
a. Unifying the source of TEa upon which the SM of each level should be compared among the compared laboratories.

Bias 2
Optional: a. Duration over which bias was calculated, the longer the better (more than 6 months)
The lab should document the following requirements for CV calculation:
Optional: a. Whether the laboratory is using IQC for CV calculation according to EP15 (or equivalent)
Mandatory: Length of the period during which QC was assayed and included in the CV calculation (not less than 1 month)
Mandatory: Number of QC observations used in the calculation (no less than 25)
Optional: Length of the period during which QC was assayed and included in the CV calculation, the longer the better (more than 6 months)
Optional: Method for calculating the analytical CV (within-run and between-run imprecision considered in the calculation)
Optional: Statistical method for exclusion of outliers
Optional: Medical decision covered
Mandatory: CV calculated over multiple concentrations
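The CV requirements above can be checked programmatically, as in the following sketch. It rests on assumptions: the three items are read as mandatory (QC period of at least 1 month, at least 25 QC observations, CV calculated over multiple concentrations), "multiple" is taken to mean at least two levels, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CvRecord:
    """Documentation a laboratory keeps for one analyte's CV calculation."""
    qc_period_months: float
    n_observations: int
    n_concentration_levels: int


def mandatory_cv_failures(record: CvRecord) -> list:
    """List the mandatory CV requirements that the record does not meet."""
    failures = []
    if record.qc_period_months < 1:
        failures.append("QC period shorter than 1 month")
    if record.n_observations < 25:
        failures.append("fewer than 25 QC observations")
    if record.n_concentration_levels < 2:  # assumption: "multiple" means >= 2
        failures.append("CV not calculated over multiple concentrations")
    return failures


# Hypothetical record resembling the study period: 5 months, 130 runs, 2 levels.
print(mandatory_cv_failures(CvRecord(5, 130, 2)))
print(mandatory_cv_failures(CvRecord(0.5, 20, 1)))
```

The optional items, such as a period longer than 6 months, could be reported in the same way as recommendations rather than failures.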