A practical approach example to measurement uncertainty: Evaluation of 26 immunoassay parameters

Introduction: Measurement uncertainty is a non-negative parameter that characterizes the dispersion of the values attributed to a measurand and is associated with the measurement result. In this study, we aimed to calculate the measurement uncertainties of immunoassays such as fertility hormones, drug concentration tests, cardiac markers, thyroid function tests and tumour markers, to compare the results with various published recommendations, and thereby to produce results of higher quality.
Materials and methods: Uncertainty was calculated with the top-down approach according to the Nordtest guide. Internal quality control and external quality assessment results covering 12 months were used. Drug concentration tests were performed on the Abbott Architect c8000, and the other hormones and markers on the i2000 of the same brand.
Results: The factors that increased the measurement uncertainty of a test stemmed from external quality control data. The calculations showed that 13 of 26 parameters satisfied quality requirements. The highest uncertainty, 28%, belonged to the cancer antigen 19-9 test. The lowest, 8.3%, was calculated for prolactin. Dehydroepiandrosterone sulfate and phenytoin performed poorly in terms of measurement uncertainty, although internal and external quality control assessment results were considered favourable for both.
Conclusion: We recommend that clinical laboratory experts follow up measurement uncertainty, which plays an important role in the total quality performance of the laboratory, at regular intervals, and that the awareness of clinicians about the subject be increased.


Introduction
The famous scientist Galileo Galilei, who is known as the father of modern physics, once said "Doubt is the father of invention" (1). He emphasized that the most important factor triggering the development of science was doubt. Every subject that is doubted, and therefore investigated, is more open to development and progress. The word uncertainty means doubt. In its broadest definition, measurement uncertainty means doubt about the validity of the measurement results (2).
Today, in economics, physics, chemistry and many other fields, the concepts of doubt and uncertainty are gaining ground day by day, because every measurement result carries an "uncertainty" arising from its own nature (3)(4)(5).
Measurements made in clinical laboratories play an important role in the diagnosis, treatment and follow-up of diseases. However, it is known that when any test is repeated, even if all conditions are optimized, the probability of obtaining exactly the same result is very low. In other words, the results of chemical reactions are in fact members of a distribution. It is therefore appropriate to talk about the measurement uncertainty of results.
According to the definition of the International Vocabulary of Metrology, measurement uncertainty (MU) is a non-negative parameter characterizing the dispersion of the quantity values being attributed to a measurand, based on the information used (6). In simple terms, it is a quantitative indicator of the range in which the "real" result may actually lie.
In laboratory measurements, there are many variables related to sampling, transport and storage, preparation of reagents, maintenance of devices, the measurement method and so on. To obtain reliable results, the sources of variation should be identified and minimized (7). The International Organization for Standardization (ISO) 15189 standard, which was prepared for the standardization of medical laboratories, requires laboratories to have a defined method for establishing the MU of tests, determining performance criteria, and reviewing MU data regularly (8).
In general, there are two approaches to MU calculation: bottom-up and top-down (9). The top-down approach relies on quality control results. For this purpose, random and systematic errors need to be quantified: internal quality control data are used for random errors, and external quality assessment (EQA) data can be used for systematic errors. This makes the approach very easy to adopt in clinical laboratories. In the bottom-up approach, all uncertainty components are rigorously identified. All of the above-mentioned variables are examined one by one and included in the calculation in proportion to their contribution to uncertainty. That process is considered laborious for clinical laboratories. Moreover, such complex calculations may not be necessary, given that the effects of the individual uncertainty components are already reflected in the results of the internal control sample, which is analysed daily just like a patient sample. Since similar results were found in a study in which uncertainty was calculated according to both approaches, the top-down approach was recommended as the simpler method (10).
There is no restriction on which method to use. Unfortunately, there is no internationally recognized guide for judging whether the results are appropriate or not. It is therefore important to publish the calculated MU of tests for comparison between laboratories.
In our search of the literature, we found no study of drug concentration tests that used the same or a similar method. In this respect, we believe that our study will shed light on future research. The purpose of this study is to present an example of MU calculation that is practical to use in clinical routine and to discuss how the results can be evaluated.

Materials and methods
In our study, we planned to calculate the MU of immunoassay tests such as fertility hormones, drug concentration tests, cardiac markers, thyroid function tests and tumour markers using the internal quality control results and EQA data of our laboratory between January 01, 2018 and December 31, 2018. Permission was obtained from the Ethics Committee of Aydın Adnan Menderes University Faculty of Medicine Non-Invasive Clinical Research with protocol number 2020/09. Table 1 shows the tests included in the study and their measurement methods with ranges.
Drug concentration tests were performed on the Abbott Architect c8000 (Abbott Diagnostics, Abbott Park, USA), and the remaining hormones and biomarkers were measured on the Abbott Architect i2000 SR hormone analyser using the manufacturer's original kits and calibrators. Internal quality control was run daily and external quality control once a month. The EQA goal of our laboratory is to prevent the z-score from rising above 3 and to keep it below 2 as far as possible. The z-score is a measure of the laboratory's bias relative to the comparison group: if it is 2-3, a warning and an investigation are recommended; a z-score ≥ 3 is unacceptable and usually requires remedial action. The EQA data of every parameter were compared with its peer group, except for drug concentration tests. Because the number of laboratories in the equivalent group was low (< 20), the calculations for drugs were made according to the mode group or the overall results.
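The z-score action levels described above can be illustrated with a minimal sketch. It assumes the common definition z = (laboratory result - group mean) / group SD; the function names, the "acceptable" label for values below 2, and the numbers in the example are hypothetical and are not taken from the study.

```python
def z_score(lab_result: float, group_mean: float, group_sd: float) -> float:
    """Deviation of one laboratory result from the comparison group, in SD units.

    Assumes the common EQA definition z = (result - group mean) / group SD.
    """
    if group_sd <= 0:
        raise ValueError("group_sd must be positive")
    return (lab_result - group_mean) / group_sd


def classify_z(z: float) -> str:
    """Map a z-score onto the action levels described in the text."""
    magnitude = abs(z)
    if magnitude >= 3:
        return "unacceptable"  # remedial action usually required
    if magnitude >= 2:
        return "warning"       # warning and investigation recommended
    return "acceptable"        # label assumed here for |z| < 2


# Hypothetical EQA round (illustrative numbers, not study data).
z = z_score(lab_result=11.2, group_mean=10.0, group_sd=0.5)
print(round(z, 2), classify_z(z))
```

The absolute value is used so that results below the group mean are judged in the same way as results above it.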

Calculation steps
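The top-down approach was chosen. For this purpose the steps of the Nordtest guide were followed (11):
1. Specify measurand, range, and target MU.
2. Quantify the uncertainty component associated with within-laboratory reproducibility (u(Rw)).
3. Quantify the uncertainty component associated with method and laboratory bias (u(bias)).
4. Convert the components to standard uncertainties (u(x)).
5. Calculate the combined standard uncertainty.
6. Calculate the expanded uncertainty (U).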
The standard uncertainty component for the within-laboratory reproducibility (u(Rw)) is derived from internal control sample results collected long enough to cover all worst-case scenarios and studied just like a patient sample. The number of results should ideally be more than 60 and cover a time period of at least one year to reflect all variations such as different stock solutions, new batches of critical reagents, recalibrations of equipment (11).
Internal quality control results were examined separately according to the lots of each level. The coefficient of variation (CV%) is the standard deviation (SD) expressed as a percentage of the mean of the internal control sample results: $CV\% = (SD/\mathrm{Mean}) \times 100$. The CV is useful because it is independent of concentration. The standard uncertainty component for the within-laboratory reproducibility is then calculated as $u(R_w) = R_w/2$, where $R_w = \sqrt{\sum (CV\%)^2 / N}$ and $N$ is the number of calculated CV% values.
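As a minimal sketch, the within-laboratory reproducibility step can be written out as follows. The helper names and the control results are hypothetical, and the use of the sample SD (n - 1 denominator) is an assumption, since the text does not state which SD convention was applied. The division by 2 follows the expression for u(Rw) as it is given above.

```python
import math
import statistics


def cv_percent(results):
    """CV% = (SD / mean) x 100 for the results of one control lot and level."""
    mean = statistics.mean(results)
    sd = statistics.stdev(results)  # sample SD; this convention is an assumption
    return sd / mean * 100.0


def pooled_rw(cv_values):
    """Rw = sqrt(sum(CV%^2) / N): root mean square of the per-lot CV% values."""
    if not cv_values:
        raise ValueError("at least one CV% value is required")
    return math.sqrt(sum(cv ** 2 for cv in cv_values) / len(cv_values))


def u_rw(cv_values):
    """u(Rw) = Rw / 2, following the expression given in the text."""
    return pooled_rw(cv_values) / 2.0


# Hypothetical control results for two lots/levels (not data from the study).
lots = {
    "level 1, lot A": [10.0, 10.4, 9.6, 10.2, 9.8],
    "level 2, lot A": [20.0, 21.0, 19.0, 20.5, 19.5],
}
cvs = [cv_percent(values) for values in lots.values()]
print([round(cv, 2) for cv in cvs], round(u_rw(cvs), 2))
```

Keeping one CV% per lot and level, and pooling them only at the end, mirrors the statement that results were examined separately according to the lots of each level.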
For method and laboratory bias (u(bias)), two components have to be estimated: 1) the root mean square of the individual bias values ($RMS_{bias}$), and 2) the mean of the standard uncertainty of the assigned values ($u(C_{ref})$). There are three ways to obtain the u(bias) component: the use of certified reference material, participation in EQA, and recovery tests. Use of EQA was the most practical and cost-effective option. However, for reference materials a mean value over time is used, whereas for each EQA round a single laboratory result is used. Therefore the $RMS_{bias}$ estimated from EQA will usually be higher. To obtain a reasonably clear picture of the bias from EQA results, a laboratory should participate at least 6 times within a reasonable time interval (11). The calculations are as follows: $RMS_{bias} = \sqrt{\sum (bias)^2 / N}$, where $N$ is the number of EQA rounds; and $u(C_{ref}) = \frac{1}{N} \sum \left( CV_{EQA} / \sqrt{N_{Lab}} \right)$, where $N_{Lab}$ is the number of participating laboratories in each round and $CV_{EQA}$ is the CV% of each EQA round.
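The two bias components can be sketched as below. This is an illustration under stated assumptions: u(Cref) is read as the mean over rounds of CV_EQA / sqrt(N_Lab), as the verbal definition ("the mean of the standard uncertainty of the assigned values") suggests; the function names and all numbers are hypothetical.

```python
import math


def rms_bias(bias_percent):
    """RMS_bias = sqrt(sum(bias^2) / N), with one bias value (%) per EQA round."""
    if not bias_percent:
        raise ValueError("at least one EQA round is required")
    return math.sqrt(sum(b ** 2 for b in bias_percent) / len(bias_percent))


def u_cref(cv_eqa_percent, n_labs):
    """Mean over rounds of CV_EQA / sqrt(N_Lab).

    Assumption: this follows the verbal definition of u(Cref) in the text.
    """
    if not cv_eqa_percent or len(cv_eqa_percent) != len(n_labs):
        raise ValueError("need one CV and one laboratory count per EQA round")
    per_round = [cv / math.sqrt(n) for cv, n in zip(cv_eqa_percent, n_labs)]
    return sum(per_round) / len(per_round)


# Hypothetical EQA rounds (illustrative numbers, not data from the study).
bias = [3.0, -4.0, 2.5, -1.5, 5.0, -2.0]     # % bias against the assigned value
cv_eqa = [10.0, 12.0, 9.0, 11.0, 10.5, 9.5]  # between-laboratory CV% per round
labs = [25, 36, 30, 28, 40, 33]              # participating laboratories per round

print(round(rms_bias(bias), 2), round(u_cref(cv_eqa, labs), 2))
```

Squaring the bias values before averaging means that positive and negative deviations do not cancel, which is why a laboratory with a mean bias near zero can still show a sizeable RMS bias.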
The standard uncertainty of the bias component is calculated as $u(bias) = \sqrt{RMS_{bias}^2 + u(C_{ref})^2}$. The combined standard uncertainty is $u_c = \sqrt{u(R_w)^2 + u(bias)^2}$, and the expanded uncertainty is $U = 2 \times u_c$.
Since there are no universally accepted limits for the measurement uncertainty of tests, it was planned to make a comparison against allowable total error and biological variation data, as in many other articles (12)(13)(14).
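The combination of the components into u(bias), the combined standard uncertainty and the expanded uncertainty can be sketched as follows. The function names and the component values are hypothetical. The final helper, which turns a relative expanded uncertainty into an interval around a reported result, is an added illustration of what U expresses rather than a step described in the methods.

```python
import math


def u_bias(rms_bias_value: float, u_cref_value: float) -> float:
    """u(bias) = sqrt(RMS_bias^2 + u(Cref)^2)."""
    return math.hypot(rms_bias_value, u_cref_value)


def combined_standard_uncertainty(u_rw_value: float, u_bias_value: float) -> float:
    """u_c = sqrt(u(Rw)^2 + u(bias)^2)."""
    return math.hypot(u_rw_value, u_bias_value)


def expanded_uncertainty(u_c: float, k: float = 2.0) -> float:
    """U = k x u_c, with coverage factor k = 2 as in the text."""
    return k * u_c


def interval(result: float, u_expanded_percent: float):
    """Illustrative helper: result +/- U, with U given as a percentage of the result."""
    half_width = result * u_expanded_percent / 100.0
    return result - half_width, result + half_width


# Hypothetical component values in % (not data from the study).
ub = u_bias(rms_bias_value=5.0, u_cref_value=1.2)
uc = combined_standard_uncertainty(u_rw_value=3.0, u_bias_value=ub)
U = expanded_uncertainty(uc)
print(round(ub, 2), round(uc, 2), round(U, 2), interval(100.0, U))
```

Because the components are added in quadrature, the largest one dominates the total; with these illustrative numbers, u(Cref) changes u(bias) only slightly.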

Results
To make these calculations easier to follow, the calculation for one parameter is demonstrated as an example. Table 2 shows the calculations for the u(Rw) component of phenytoin (PHNY). The calculation steps for all parameters according to the Nordtest approach are shown in Table 4. For each test, the u(bias) value was higher than the u(Rw) value. The u(Cref) component had the smallest effect on the MU. Prolactin had the lowest MU (8.3%) and CA 19-9 had the highest MU (28%).
[Table legend, sources (15)(16)(17)(18)(19)(20): U: calculated uncertainty results of tests. Westgard: Westgard's desirable specification for allowable total error (15). CLIA: Clinical Laboratory Improvement Amendments, Criteria for acceptable analytical performance (16). Ricos: Ricos's databases on biologic variation (17). Rili-BAEK: German Guidelines for Quality, Acceptable relative deviation in interlab tests (18). RCPA: Royal College of Pathologists of Australasia, Allowable limits of performance for biochemistry (19).]
When our other 'Failed' test, DHEAS, was examined, it was seen that the uncertainty value found was not far beyond the limits specified by the guidelines. The MU of the parameter was found to be 14% for our laboratory. The recommendations of Westgard, Ricos and RCPA are 13.8%, 11.5% and 12%, respectively. However, when all proficiency testing (PT) data were re-examined for u(bias), which is the dominant contributor to the MU, in order to improve the parameter, it was noticed that in none of our 12 EQA participations was the z-score above two. At this point, it was thought that MU calculations could guide us in recognizing overlooked problems.
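The comparison of a calculated MU with guideline limits can be illustrated with the DHEAS figures quoted above (U = 14%; Westgard 13.8%, Ricos 11.5%, RCPA 12%). The sketch assumes that a test meets a guideline when U is less than or equal to that guideline's limit; this pass rule and the function names are assumptions, since the exact decision rule is not spelled out in the text.

```python
def compare_with_limits(u_percent, limits):
    """Return, per guideline, whether U (%) is within the stated limit (%).

    Assumption: 'within' is taken to mean U <= limit.
    """
    return {name: u_percent <= limit for name, limit in limits.items()}


def excess_over_limits(u_percent, limits):
    """Return, per guideline, by how many percentage points U exceeds the limit."""
    return {name: round(u_percent - limit, 1) for name, limit in limits.items()}


# DHEAS values as quoted in the text.
dheas_u = 14.0
dheas_limits = {"Westgard": 13.8, "Ricos": 11.5, "RCPA": 12.0}

print(compare_with_limits(dheas_u, dheas_limits))
print(excess_over_limits(dheas_u, dheas_limits))
```

Reporting the margin alongside the pass/fail verdict makes it visible when a test fails a guideline only narrowly, as is the case here for the Westgard limit.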
In a study conducted by Ayyıldız, the MU for DHEAS was reported as 15.5% (21). When that report was examined, it was seen that the same brand and model of device was used as in our study. There is no standardized formula for clinical laboratories to estimate the MU of tests (12). Furthermore, there are no standardized limits against which to compare results. Thus every laboratory should define its own formulation system and evaluation method, so that errors can be noticed by repeating the calculations at regular intervals.
Measurement uncertainty is among the standards as a criterion in quality assessment. Nevertheless, such calculations will not become widespread unless they can be done practically in daily routine. Hence, MU models that can be calculated from the available data without the need for a separate budget are important and valuable.
The limitation of this study is that the EQA data for drug concentration tests could not be evaluated against our peer group. Although we expect that this will increase the uncertainty value of the parameters calculated from the EQA data, we think that special attention should be paid to the measurement uncertainty calculations of these tests, especially since they are analytes that must be kept within a narrow range in plasma and must be precisely adjusted.
In conclusion, our study showed that the recommendations for a single parameter can vary widely and that the choice of guideline matters. A test that is appropriate according to one guideline may give very poor results according to another.
We recommend that clinical laboratory experts follow up measurement uncertainty, which plays an important role in the total quality performance of the laboratory, at regular intervals, and that the awareness of clinicians about the subject be increased.

Potential conflict of interest
None declared.