Paal, Habler, and Vogeser: Estimation of inter-laboratory reference change values from external quality assessment data


A major objective of laboratory medicine is standardization, which is intended to enable the interoperability of results from different test sites (1, 2). This is essential both for the development and application of clinical algorithms with decision limits based on laboratory values and for the long-term follow-up of patients with chronic diseases. Similarly, interoperability of laboratory results is a major challenge for scientific investigations that rely on unselected routine data that are not individually traceable (e.g., big data approaches). In particular, the implementation of universal standards for identifying medical laboratory observations in electronic records, such as the Logical Observation Identifiers Names and Codes (LOINC) system, has fuelled the mining of laboratory data (3).

Discrepancies in values due to insufficient standardization can in principle be compensated by determining method-specific reference values; for scientific applications, results can then be expressed, for example, as a multiple of a reference limit. However, this is currently not practiced for essential laboratory analytes. Indeed, clinical guidelines addressing analytes such as prostate-specific antigen (PSA) or creatinine do not consider possible methodological discrepancies at all (4, 5).

So far, no comprehensive data are available to assess the limits of interoperability of standard laboratory analytes. Results of external quality assessment (EQA) schemes, some of which are publicly available, are an attractive data source in this context.

Reference change values (RCV), or critical differences, have become established for judging whether a change between serial results of a patient reflects a true intra-individual dynamic beyond measurement uncertainty (6). The calculation of the RCV takes into account the analytical variation of the measurement of an analyte (CVA) and the intra-individual biological variation (CVI) of the respective analyte (7). The RCV indicates the percentage by which two sequential results of a patient must differ before an actual biological change in concentration can be assumed with a high degree of probability. In current practice, the estimation of RCVs assumes that follow-up measurements are performed in one and the same laboratory under unchanged conditions.

The evaluations of EQA campaigns across all participants – without method-specific evaluations – can be used to estimate CVA in an inter-laboratory setting. The database of the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) provides a good basis for the CVI to be assumed (8). Calculated inter-laboratory RCVs (IL-RCV) could in turn be used for a critical appraisal of inter-laboratory records as exploited in epidemiological and data mining studies (Figure 1).

Figure 1

Scheme for the calculation of inter-laboratory reference change values. EQA analytical variation from participating clinical laboratories and intra-individual biological variation (CVI) data can be used to calculate analyte-specific IL-RCV, which may be used for a comprehensive evaluation of combined inter-laboratory data sets from unselected laboratories with heterogeneous measurement procedures, e.g., as typically extracted in data mining and epidemiological studies. IL-RCV – inter-laboratory reference change values. EQA – external quality assessment. CVA – analytical variation. CVI – intra-individual biological variation.


The aim of the study was to provide a proof-of-concept for the feasibility of estimating an IL-RCV from the above-mentioned data sources as a basis for assessing the interoperability of a particular analyte, and to reflect on the limitations of this approach. We addressed five representative standard analytes from different biochemical classes.

Materials and methods


External quality assessment data of the Reference Institute for Bioanalytics (RfB, Bonn, Germany) were obtained from its freely accessible web resources (9). Four EQA programs were distributed in four campaigns in 2019 (identification: KS 1-4 2019; HM 1-4 2019; TM 1-4 2019; GH 1-4 2019). The proficiency test samples are lyophilized materials. Data sets were evaluated for five analytes: serum calcium, creatinine, aldosterone, PSA, and whole blood haemoglobin A1c (HbA1c). For each EQA sample, we recorded the mean of the concentrations found by the participants, the CV observed across the participants' data for each analyte and sample (taken as the total analytical CV, CVA), the number of participants in the respective campaign, and the number of different methods stated in the EQA report.
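The aggregation step can be sketched in a few lines of Python. This is a minimal illustration, assuming the per-sample inter-laboratory CVs are summarized by their median (as described in the footnote of Table 1); the eight CV values and the function name `summarize_cva` are hypothetical and are not taken from the RfB reports.

```python
from statistics import median

# Hypothetical inter-laboratory CVs (%) for one analyte:
# four campaigns x two samples each. Illustrative values only,
# NOT taken from the RfB reports.
sample_cvs = [4.2, 4.9, 4.5, 5.3, 4.7, 4.4, 5.0, 4.7]


def summarize_cva(cvs_percent):
    """Return the median of the per-sample inter-laboratory CVs (%)."""
    if not cvs_percent:
        raise ValueError("at least one CV is required")
    return median(cvs_percent)


cva = summarize_cva(sample_cvs)
print(f"Median CVA over {len(sample_cvs)} EQA samples: {cva:.1f}%")
```

Using the median rather than the mean keeps a single aberrant EQA sample from dominating the estimate, which is presumably why a median was chosen for Table 1.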

The data on biological variation (within-subject BV, CVI) of the five exemplary analytes were taken from the respective database of the EFLM and, in the case of serum calcium, from a printed publication (8, 10).


There are several ways to calculate the RCV (7). The traditional approach introduced by Harris and Yasaka is $\mathrm{RCV} = 2^{1/2} \times Z_\alpha \times (\mathrm{CV_A}^2 + \mathrm{CV_I}^2)^{1/2}$ (11). This is used to calculate symmetrical limits of the RCV for analytes that follow a normal distribution, where $Z_\alpha$ is the number of standard deviations appropriate for the chosen probability.
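As a worked illustration of the Harris and Yasaka formula, the following Python sketch computes the symmetrical RCV for one pair of inputs. The function name `symmetric_rcv` is hypothetical; the inputs (CVA 4.7%, CVI 4.5%) are the creatinine values listed in Table 1, and the default of 1.96 for Zα is the value used later in this section. Note that the study itself reports log-normal rather than symmetrical limits, so the result is shown only for comparison.

```python
import math


def symmetric_rcv(cva_percent, cvi_percent, z=1.96):
    """Symmetrical RCV (%) = sqrt(2) * z * sqrt(CVA^2 + CVI^2).

    Both CVs are given in percent; the result is also in percent.
    """
    return math.sqrt(2.0) * z * math.sqrt(cva_percent ** 2 + cvi_percent ** 2)


# Creatinine inputs as listed in Table 1 (CVA 4.7%, CVI 4.5%).
rcv = symmetric_rcv(4.7, 4.5)
print(f"Symmetrical RCV: +/-{rcv:.1f}%")
```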

Given that many laboratory analytes have skewed rather than normal distributions, the log-normal approach is typically considered the best approach to determining RCVs (12, 13). Accordingly, we applied the log-normal approach described by Fokkema et al. to calculate asymmetrical limits for the positive (upward) and negative (downward) reference change values (RCVpos, RCVneg) with the following equation: $\mathrm{RCV_{pos/neg}} = 100\% \times \left(\exp\left(\pm Z_\alpha \times 2^{1/2} \times (\mathrm{SD_A}^2 + \mathrm{SD_I}^2)^{1/2}\right) - 1\right)$ (14). Here $\mathrm{SD_A}^2 = \ln(\mathrm{CV_A}^2 + 1)$ and $\mathrm{SD_I}^2 = \ln(\mathrm{CV_I}^2 + 1)$, with $\mathrm{CV_A}$ and $\mathrm{CV_I}$ expressed as fractions rather than percentages.
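The log-normal calculation can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation; the function name `lognormal_rcv` is hypothetical, and the CVs are assumed to be entered in percent and converted to fractions before the logarithm is taken. The example inputs are the creatinine and aldosterone values listed in Table 1. Because the tabulated CVs are rounded, recomputed limits may differ from the published ones in the last digit.

```python
import math


def lognormal_rcv(cva_percent, cvi_percent, z=1.96):
    """Return (RCVpos, RCVneg) in percent using the log-normal approach."""
    cva = cva_percent / 100.0  # convert percent to fraction
    cvi = cvi_percent / 100.0
    sd_a2 = math.log(cva ** 2 + 1.0)  # SDA^2 = ln(CVA^2 + 1)
    sd_i2 = math.log(cvi ** 2 + 1.0)  # SDI^2 = ln(CVI^2 + 1)
    exponent = z * math.sqrt(2.0) * math.sqrt(sd_a2 + sd_i2)
    rcv_pos = 100.0 * (math.exp(exponent) - 1.0)
    rcv_neg = 100.0 * (math.exp(-exponent) - 1.0)
    return rcv_pos, rcv_neg


# CVA and CVI (%) as listed in Table 1.
for name, cva, cvi in [("Creatinine", 4.7, 4.5), ("Aldosterone", 18.6, 36.6)]:
    pos, neg = lognormal_rcv(cva, cvi)
    print(f"{name}: RCVpos {pos:+.1f}%, RCVneg {neg:+.1f}%")
```

The asymmetry follows directly from the exponential: the same exponent applied with a positive and a negative sign yields a larger upward than downward limit, and the gap widens as the combined variation grows.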

In many clinical situations, decision-making is based on the assessment of a significant rise or fall of a target analyte. We therefore set Zα to 1.96, corresponding to a 95% probability (P < 0.05), which is regarded as significant.
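For readers who want to see where the value 1.96 comes from, the short Python sketch below derives it from the standard normal distribution using only the standard library. The helper name `z_for_probability` is hypothetical. The one-sided value is included purely for comparison and is not used in this study.

```python
from statistics import NormalDist


def z_for_probability(probability, two_sided=True):
    """Return the standard normal quantile for a given probability level."""
    alpha = 1.0 - probability
    tail = alpha / 2.0 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail)


z_two_sided = z_for_probability(0.95)                   # rise OR fall
z_one_sided = z_for_probability(0.95, two_sided=False)  # comparison only
print(f"two-sided 95%: {z_two_sided:.2f}")
print(f"one-sided 95%: {z_one_sided:.3f}")
```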


For all five exemplary analytes, the estimation of an IL-RCV was possible. The results are summarized in Table 1. The IL-RCVs of the analytes investigated varied considerably: RCVpos ranged from 13.3% to 203% and RCVneg from -11.8% to -67.0% for serum calcium and aldosterone, respectively.

Table 1

Estimation of inter-laboratory positive and negative reference change values for five exemplary analytes based on data from external quality assessment schemes

| Analyte | Unit | Lowest – highest concentration (mean)* | Median of results† | CVA (%)‡ | CVI (%)§ | Estimated RCVpos/neg (%)‖ | Number of methods¶ | Number of EQA participants** |
|---|---|---|---|---|---|---|---|---|
| Creatinine | µmol/L | 115 – 478 | 168 | 4.7 | 4.5 | +19.7 / -16.5 | 14 | 645 |
| Calcium | mmol/L | 1.81 – 3.35 | 2.20 | 3.1 | 3.3 | +13.3 / -11.8 | 12 | 615 |
| Aldosterone | nmol/L | 0.3 – 20.1 | 0.8 | 18.6 | 36.6 | +203 / -67.0 | 8 | 203 |
| PSA | µg/L | 0 – 14.0 | 0.4 | 18.8 | 6.8 | +73.1 / -42.2 | 13 | 804 |
| HbA1c | mmol/mol | 36.0 – 71.2 | 51.5 | 5.1 | 1.7 | +16.1 / -13.9 | 12 | 810 |

*Lowest/highest mean concentration observed in the four EQA campaigns with two samples each per analyte. †Median of the 8 means observed in the four EQA campaigns with two samples each per analyte. ‡Median CV of all participants observed in the four EQA campaigns studied, two samples each (median of 8 CVs). §Biological variation according to the EFLM database (8) and, for serum calcium, reference (10). ‖RCV according to Fokkema et al. (14). ¶Mean number of methods in the four campaigns studied. **Mean number of participants per EQA scheme from four campaigns in 2019. PSA – prostate-specific antigen. HbA1c – haemoglobin A1c. CV – coefficient of variation. CVA – analytical variation of all participants. CVI – intra-individual biological variation. RCVpos/neg – positive (upward) and negative (downward) reference change values. EQA – external quality assessment. EFLM – European Federation of Clinical Chemistry and Laboratory Medicine.


We have shown that the estimation of IL-RCVs from publicly available EQA and intra-individual biological variation data is possible for essential laboratory analytes.

We observed very large differences in RCVpos and RCVneg between five exemplary analytes from different biochemical classes when applying the formula for log-normal RCV calculation (14). While the IL-RCVs for serum calcium, creatinine, and whole blood HbA1c were below 20%, the IL-RCV for serum aldosterone must be assumed to be well above 150% for positive and above 50% for negative changes. This means that if a patient's samples are sent to randomly selected laboratories for follow-up testing, a real change in the concentration of the analyte in the biological system can only be assumed with a high degree of probability (95%) if the rise in values exceeds 203% in the case of serum aldosterone. A smaller increase in reported concentrations cannot be considered a real change, given the measurement uncertainty in a between-laboratory setting.
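The interpretation described above can be sketched as a simple decision rule in Python. This is an illustrative sketch only: the function name `classify_change` and the patient concentrations are hypothetical, while the limits (+203% / -67.0%) are the aldosterone IL-RCVs from Table 1.

```python
def classify_change(first, second, rcv_pos, rcv_neg):
    """Compare the percentage change between two serial results with the IL-RCV.

    rcv_pos is the upward limit in percent (positive number),
    rcv_neg is the downward limit in percent (negative number).
    """
    if first <= 0:
        raise ValueError("the first result must be positive")
    change = 100.0 * (second - first) / first
    if change > rcv_pos:
        return change, "significant rise"
    if change < rcv_neg:
        return change, "significant fall"
    return change, "within IL-RCV"


# Aldosterone IL-RCV from Table 1; concentrations (nmol/L) are hypothetical.
ALDO_POS, ALDO_NEG = 203.0, -67.0
for first, second in [(0.8, 1.6), (0.8, 2.6), (0.8, 0.2)]:
    change, label = classify_change(first, second, ALDO_POS, ALDO_NEG)
    print(f"{first} -> {second} nmol/L: {change:+.0f}% ({label})")
```

In this sketch, even a doubling of the aldosterone concentration between two different laboratories stays within the IL-RCV, which illustrates how limited the interoperability of this analyte is.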

We recognize that our approach has some limitations. External quality assessment samples are typically processed specimens (e.g., for virus inactivation), which can limit their commutability. Lyophilized control materials in particular can show commutability problems and behave differently from patient samples. Accordingly, analytical variation calculated from lyophilized materials may not correspond to the analytical variation calculated from commutable material. It is therefore possible that the real IL-RCV would be lower than the values we observed if exclusively unprocessed single-donor material were used in EQA. Corresponding data should in principle be generated but are currently not publicly available.

The biological variation data in the EFLM database are regularly updated from systematic reviews and published studies that are appraised with the Biological Variation Data Critical Appraisal Checklist (BIVAC). Still, it must be taken into account that the determination of biological variation data is not always based on very large data sets and that it carries uncertainties, given that the published CVI may not necessarily match the investigated patient cohort (15).

Our procedure can be used to assess the measurement uncertainty in epidemiological surveys and data mining studies, such as the Medical Informatics Initiative Germany, when data from unselected laboratories are used (16). The resulting IL-RCVs allow an estimation of the limits of interoperability of routine data collected with today’s heterogeneous standard procedures and kits and should be considered when interpreting corresponding combined data sets. Furthermore, IL-RCVs may assist in the interpretation of changes in serial patient results obtained from different laboratories. For a more comprehensive assessment, multiple EQA schemes and accessions might be combined to establish a corresponding expanded data set for inter-laboratory analytical variation.

A notable caveat in large-scale data mining studies is the usage of pooled data from multiple laboratories. It must be emphasized that corresponding IL-RCVs are not exact mathematical calculations that can be transferred to individual evaluations without exception but are rather estimates that may be useful for critical appraisal, in particular of data mining studies.

We conclude that EQA data together with data on the biological variation – both freely available – allow the estimation of inter-laboratory RCVs. These differ substantially between different analytes and can help to assess the boundaries of interoperability in laboratory medicine.


Potential conflict of interest

None declared.



1. Plebani M, Laposata M, Lippi G. A manifesto for the future of laboratory medicine professionals. Clin Chim Acta. 2019;489:49–52.
2. Ricós C, Perich C, Boned B, Gonzalez-Lao E, Diaz-Garzon J, Ventura M, et al. Standardization in laboratory medicine: Two years’ experience from category 1 EQA programs in Spain. Biochem Med (Zagreb). 2019;29:010701.
3. Parr SK, Shotwell MS, Jeffery AD, Lasko TA, Matheny ME. Automated mapping of laboratory tests to LOINC codes using noisy labels in a national electronic health record system database. J Am Med Inform Assoc. 2018;25:1292–300.
4. Carter HB, Albertsen PC, Barry MJ, Etzioni R, Freedland SJ, Greene KL, et al. Early detection of prostate cancer: AUA Guideline. J Urol. 2013;190:419–26.
5. Kidney Disease: Improving Global Outcomes (KDIGO) Acute Kidney Injury Work Group. KDIGO Clinical Practice Guideline for Acute Kidney Injury. Kidney Int Suppl. 2012;2:1–138.
6. Plebani M, Lippi G. Biological variation and reference change values: an essential piece of the puzzle of laboratory testing. Clin Chem Lab Med. 2012;50:189–90.
7. Fraser CG. Reference change values. Clin Chem Lab Med. 2011;50:807–12.
8. Aarsand AK, Fernandez-Calle P, Webster C, Coskun A, Gonzales-Lao E, Diaz-Garzon J, et al. The EFLM Biological Variation Database. Available at: Accessed December 20th 2020.
9. Reference Institute for Bioanalytics. Available at: Accessed November 15th 2020.
10. Lacher DA, Hughes JP, Carroll MD. Estimate of biological variation of laboratory analytes based on the third national health and nutrition examination survey. Clin Chem. 2005;51:450–2.
11. Harris EK, Yasaka T. On the calculation of a “reference change” for comparing two consecutive measurements. Clin Chem. 1983;29:25–30.
12. Frankenstein L, Wu AH, Hallermayer K, Wians FH Jr, Giannitsis E, Katus HA. Biological variation and reference change value of high-sensitivity troponin T in healthy individuals during short and intermediate follow-up periods. Clin Chem. 2011;57:1068–71.
13. Klersy C, d’Eril GV, Barassi A, Palladini G, Comelli M, Moratti R, et al. Advantages of the lognormal approach to determining reference change values for N-terminal propeptide B-type natriuretic peptide. Clin Chim Acta. 2012;413:544–7.
14. Fokkema MR, Herrmann Z, Muskiet FA, Moecks J. Reference change values for brain natriuretic peptides revisited. Clin Chem. 2006;52:1602–3.
15. Fuentes-Arderiu X, Padro-Miquel A, Rigo-Bonnin R. Disadvantages of using biological variation data for reference change values. Clin Chem Lab Med. 2011;50:961, author reply 963–4.
16. Medical Informatics Initiative Germany. Available at: Accessed July 20th 2020.