Habibzadeh and Habibzadeh: The likelihood ratio and its graphical representation

Introduction

First described in 1763, Bayes’ theorem, named after Reverend Thomas Bayes (an English statistician and philosopher), is now one of the cornerstones of methods used for interpreting diagnostic test results. In mathematical terminology, it is presented as follows in equation (Eq.) 1:

\[ P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \tag{1} \]
provided P(B) ≠ 0, where A and B are two events, P(A) represents the probability that A happens, and P(A | B) is the conditional probability that A happens given that B has happened (1).

Likelihood ratio

Suppose that A is the presence (D+) or absence (D–) of a disease and that B is the condition that the result of a diagnostic test (x) fulfils, say the test result being equal to the value r. Based on Eq. 1, the probability of the presence of a disease (D+) given a test value r is:

\[ P(D^{+} \mid x = r) = \frac{P(x = r \mid D^{+})\,P(D^{+})}{P(x = r)} \tag{2} \]

The probability of the absence of the disease (D–) given a test result equal to r is therefore:

\[ P(D^{-} \mid x = r) = \frac{P(x = r \mid D^{-})\,P(D^{-})}{P(x = r)} \tag{3} \]

Dividing Eq. 2 by Eq. 3, and replacing P(D–) with 1 – P(D+), gives:

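\[ \frac{P(D^{+} \mid x = r)}{P(D^{-} \mid x = r)} = \frac{P(x = r \mid D^{+})}{P(x = r \mid D^{-})} \times \frac{P(D^{+})}{1 - P(D^{+})} \tag{4} \]
which is:

\[ \text{Post-test odds} = \text{Likelihood ratio} \times \text{Pre-test odds} \tag{5} \]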

the well-known equation used in the Bayesian approach to interpreting test results (2). The factor P(x = r | D+) / P(x = r | D–) is termed the likelihood ratio (LR) for a test result equal to r and is denoted LR(r) (1). Generally speaking, the likelihood ratio indicates how many times more (or less) likely a certain condition for a test result is to be observed in diseased, compared with non-diseased, people (3). Four general conditions are possible: the likelihood ratio for a certain test value, for a positive test, for a negative test, and for a range of test values (Table 1). To better understand the concept, let us examine the graphical representation of LR(r).

Table 1

Likelihood ratio for various test value conditions

| Likelihood ratio for | Notation | Definition | Graphical representation and equation |
|---|---|---|---|
| Certain test value of r | LR(r) | The probability of observing a test value equal to r in diseased compared with non-diseased people | Slope of the tangent line to the ROC curve at the point corresponding to r; LR(r) = f(r) / g(r) |
| Positive test (a test value equal to or more than a set cut-off value) | LR(+) | The probability of observing a positive test in diseased compared with non-diseased people | Slope of the line segment joining the origin of the unit square to the point on the ROC curve corresponding to r; LR(+) = Se / (1 – Sp) |
| Negative test (a test value less than a set cut-off value) | LR(–) | The probability of observing a negative test in diseased compared with non-diseased people | Slope of the line segment joining the point on the ROC curve corresponding to r to the upper-right corner of the unit square; LR(–) = (1 – Se) / Sp |
| A range of test values | LR(Δ) | The probability of observing test values within a certain range in diseased compared with non-diseased people | Slope of the line segment joining the two points on the ROC curve corresponding to the upper (r) and lower (s) limits of the range; LR(Δ) = (Se_s – Se_r) / (Sp_r – Sp_s) |

Se - sensitivity. Sp - specificity. The subscripts r and s indicate the cut-off value at which Se and Sp are evaluated.

Graphical representation

Likelihood ratio for a specific test result

Let f(x) and g(x) be the probability density functions of a hypothetical diagnostic test with continuous results (x) for the diseased (D+) and non-diseased (D–) populations, respectively (Figure 1). We arbitrarily chose normally distributed test values for both populations, although the functions can theoretically have any distribution. Each test value (x) can be considered a cut-off value. Previously, we showed that the test sensitivity (Se) and specificity (Sp) are functions of the cut-off value as follows (4):

\[ Se(x) = \int_{x}^{\infty} f(t)\,dt, \qquad Sp(x) = \int_{-\infty}^{x} g(t)\,dt \tag{6} \]
Figure 1

The probability density functions of a diagnostic test with continuous results for diseased, f(x), and non-diseased, g(x), persons. On the horizontal axis are test values with an arbitrary unit. Graphically, the likelihood ratio is generally a ratio of two areas, except for LR(r), which is the ratio of two lengths. There are two test values, r and s (in our example, fasting blood sugar (FBS) values of 98 and 93 mg/dL, respectively, on the x axis). For the calculation of LR(+) and LR(–), r was considered the cut-off value. FN – false negative. TP – true positive. TN – true negative. FP – false positive.


Assume that we set our cut-off value at x = r. Se is then the area under the curve f(x) to the right of the cut-off value r (the pink area in Figure 1). By definition, the partial derivative of Se with respect to x is:

\[ \frac{\partial Se}{\partial x} = \frac{\partial}{\partial x} \int_{x}^{\infty} f(t)\,dt = -f(x) \tag{7} \]

The minus sign before f(x) is because Se is a decreasing function of the cut-off value—Se decreases as cut-off value increases (4).
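The dependence of Se and Sp on the cut-off value can be illustrated numerically. The following is a minimal sketch, assuming the binormal FBS parameters given in the Example section (mean 99.7, SD 7.2 mg/dL for diseased; mean 89.7, SD 5.0 mg/dL for non-diseased); the function names `sensitivity` and `specificity` and the chosen cut-offs are illustrative only.

```python
from statistics import NormalDist

# Binormal parameters taken from the Example section (FBS, mg/dL).
diseased = NormalDist(mu=99.7, sigma=7.2)      # f(x)
non_diseased = NormalDist(mu=89.7, sigma=5.0)  # g(x)

def sensitivity(cutoff):
    """Area under f(x) to the right of the cut-off."""
    return 1.0 - diseased.cdf(cutoff)

def specificity(cutoff):
    """Area under g(x) to the left of the cut-off."""
    return non_diseased.cdf(cutoff)

# Illustrative cut-offs: Se falls and Sp rises as the cut-off increases.
cutoffs = [88, 93, 98, 103]
se_values = [sensitivity(c) for c in cutoffs]
sp_values = [specificity(c) for c in cutoffs]
for c, se, sp in zip(cutoffs, se_values, sp_values):
    print(f"cut-off {c}: Se = {se:.3f}, Sp = {sp:.3f}")
```

The printed Se values decrease and the Sp values increase from one cut-off to the next, in line with the signs of the two derivatives.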

In a similar way, the partial derivative of Sp with respect to x can be derived:

\[ \frac{\partial Sp}{\partial x} = \frac{\partial}{\partial x} \int_{-\infty}^{x} g(t)\,dt = g(x) \tag{8} \]

By definition:

\[ LR(r) = \frac{P(x = r \mid D^{+})}{P(x = r \mid D^{-})} \tag{9} \]

However, considering that f(x) and g(x) are density functions illustrating the distribution of the result values in the diseased and non-diseased population (Figure 1), we have:

\[ LR(r) = \frac{f(r)}{g(r)} \tag{10} \]

Before going further, one technical point is worth mentioning: from a theoretical point of view, the probability that a continuous random variable (here, x) will assume a particular value (here, r) is zero. Therefore, in the above equation, the statement x = r should be construed as r – h ≤ x ≤ r + h, as h approaches zero. Combining Equations 7 and 8 with this result gives:

\[ LR(r) = \frac{f(r)}{g(r)} = \left. \frac{-\,\partial Se / \partial x}{\partial Sp / \partial x} \right|_{x = r} = \left. \frac{\partial Se}{\partial (1 - Sp)} \right|_{x = r} \tag{11} \]

meaning that the likelihood ratio for a test result equal to r, LR(r), is equal to the slope of the line tangent to the receiver operating characteristic (ROC) curve (grey short dashed line, Figure 2) at the point corresponding to the test cut-off value, r (Table 1) (4).
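This equality can be checked numerically. The sketch below assumes the binormal FBS parameters from the Example section and approximates the tangent slope by a central difference over the small interval r – h to r + h. The helper `roc_point` and the step size `h` are illustrative choices, not part of the original derivation.

```python
from statistics import NormalDist

# Binormal parameters taken from the Example section (FBS, mg/dL).
diseased = NormalDist(mu=99.7, sigma=7.2)      # f(x)
non_diseased = NormalDist(mu=89.7, sigma=5.0)  # g(x)

def roc_point(cutoff):
    """Return the ROC coordinates (1 - Sp, Se) for a given cut-off."""
    return 1.0 - non_diseased.cdf(cutoff), 1.0 - diseased.cdf(cutoff)

r, h = 98.0, 1e-3  # h is an assumed small step around r

# A lower cut-off lies further up and to the right on the ROC curve.
fpr_right, tpr_right = roc_point(r - h)
fpr_left, tpr_left = roc_point(r + h)

tangent_slope = (tpr_right - tpr_left) / (fpr_right - fpr_left)
density_ratio = diseased.pdf(r) / non_diseased.pdf(r)

print(f"ROC slope near r: {tangent_slope:.3f}")
print(f"f(r) / g(r):      {density_ratio:.3f}")
```

Under these assumptions the two printed values agree to three decimal places, which is what the slope interpretation of LR(r) predicts.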

Figure 2

The ROC curve (solid black line) fitted to the data points (open circles) assuming the test value has a binormal distribution (Figure 1). The slope of the tangent line to the ROC (grey short dashed line) at the solid circle, the point corresponding to a test value r (FBS = 98 mg/dL in our example) in Figure 1, is the likelihood ratio of having an FBS of 98 mg/dL. Assuming a cut-off value of 98 mg/dL for FBS for the diagnosis of diabetes mellitus, the likelihood ratio of having a positive test, LR(+), is the slope of the line joining the origin to the solid circle (grey long dashed line). The likelihood ratio of a negative test, LR(–), is the slope of the line joining the solid circle to the upper-right corner (grey dash dotted line). The slope of the line segment joining the solid circle to the solid square (grey dash dot dotted line) is the likelihood ratio of having a test value between s and r (Figure 1). Se - sensitivity. Sp - specificity.


Although LR(r) might provide useful information, it generally cannot be derived precisely in practice unless a large database is available (5). In practice, the ROC curve is drawn from a set of discrete data that cannot be well fitted to a function; we have only a few discrete points. Although these points can be joined by various methods (line segments, splines, curve fitting, etc.), the curve is not differentiable and thus it is not possible to determine the exact slope of the curve at a given point based on the available data (4-6). This makes accurate derivation of LR(r) very difficult, if not impossible.
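The difficulty can be seen in a small sketch of an empirical ROC curve joined by line segments. The FBS readings below are hypothetical and chosen only for illustration; they are not data from this article. The function `empirical_roc` is likewise an illustrative helper.

```python
# Hypothetical FBS readings (mg/dL); NOT data from the article.
diseased = [92, 95, 97, 99, 101, 104, 108]
healthy = [84, 86, 88, 89, 90, 92, 93, 96]

def empirical_roc(pos, neg):
    """ROC points (1 - Sp, Se), using each observed value c as a cut-off (x >= c is positive)."""
    points = [(0.0, 0.0)]
    for c in sorted(set(pos + neg), reverse=True):
        se = sum(x >= c for x in pos) / len(pos)
        fpr = sum(x >= c for x in neg) / len(neg)
        points.append((fpr, se))
    return points

points = empirical_roc(diseased, healthy)

# Slope of each line segment joining two adjacent ROC points.
slopes = []
for (x0, y0), (x1, y1) in zip(points, points[1:]):
    slopes.append(float("inf") if x1 == x0 else (y1 - y0) / (x1 - x0))

print(slopes)
```

With so few observations, the segment slopes jump between zero, infinity, and a few finite values, so no single slope at a given point can be read off the empirical curve.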

Likelihood ratio for a positive/negative test result

Although determining the likelihood ratio for a test value of r is difficult, we can easily derive the likelihood ratio for test values equal to or more than r, or for tests with dichotomous results (positive or negative). Suppose that the value r is the test cut-off value. This means that test values equal to or more than r are considered positive (T+); otherwise, the test result is considered negative (T–). The positive likelihood ratio, LR(+), is:

\[ LR(+) = \frac{P(T^{+} \mid D^{+})}{P(T^{+} \mid D^{-})} = \frac{P(x \ge r \mid D^{+})}{P(x \ge r \mid D^{-})} \tag{12} \]

Graphically, LR(+) is the area under the curve f(x) to the right of the cut-off line (true-positive rate = Se) divided by the area under the curve g(x) to the right of the cut-off line (false-positive rate = 1 – Sp) (Figure 1). Mathematically, it is (4):

\[ LR(+) = \frac{\int_{r}^{\infty} f(x)\,dx}{\int_{r}^{\infty} g(x)\,dx} = \frac{Se}{1 - Sp} \tag{13} \]

LR(+) is then clearly the slope of the line segment joining the origin of the unit square to the point on the ROC curve corresponding to the test cut-off value, r (the solid circle, Figure 2, and Table 1).

There is a long-standing confusion between LR(r) and LR(+) in the scientific literature. Some authors have repeatedly stated that LR(+) is equal to the slope of the ROC curve at the cut-off point, whereas it is really the slope of the line joining the origin of the unit square to the cut-off point (Figure 2) (7-11). Although Choi has already addressed this misunderstanding, herein we try to make things clearer using a graphical approach, in the hope of providing a better understanding of the key concepts of the likelihood ratio (5).

In a similar way, the negative likelihood ratio, LR(–), can be calculated as:

\[ LR(-) = \frac{P(T^{-} \mid D^{+})}{P(T^{-} \mid D^{-})} = \frac{P(x < r \mid D^{+})}{P(x < r \mid D^{-})} \tag{14} \]

In other words (4),

\[ LR(-) = \frac{\int_{-\infty}^{r} f(x)\,dx}{\int_{-\infty}^{r} g(x)\,dx} = \frac{1 - Se}{Sp} \tag{15} \]

Graphically, LR(–) is the slope of the line segment joining the cut-off point on the ROC curve to the upper-right corner of the unit square (grey dash dotted line, Figure 2, and Table 1). It is also the area under the curve f(x) to the left of the cut-off line, x = r (false-negative rate = 1 – Se, yellow plus the red-hatched area in Figure 1), divided by the area under the curve g(x) to the left of the cut-off line (true-negative rate = Sp, green plus the blue-hatched area in Figure 1).
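The area-ratio view of LR(+) and LR(–) can be sketched in a few lines. The code assumes the binormal FBS parameters from the Example section and a cut-off of 98 mg/dL; `lr_positive_negative` is an illustrative name. Because the areas are computed from the model rather than read off Figure 2, the results differ slightly from the rounded values (12.0 and 0.42) used in the Example.

```python
from statistics import NormalDist

# Binormal parameters taken from the Example section (FBS, mg/dL).
diseased = NormalDist(mu=99.7, sigma=7.2)      # f(x)
non_diseased = NormalDist(mu=89.7, sigma=5.0)  # g(x)

def lr_positive_negative(cutoff):
    """Return (LR(+), LR(-)) as ratios of areas on either side of the cut-off."""
    se = 1.0 - diseased.cdf(cutoff)   # area under f(x) to the right
    sp = non_diseased.cdf(cutoff)     # area under g(x) to the left
    lr_pos = se / (1.0 - sp)          # slope from the origin to the ROC point
    lr_neg = (1.0 - se) / sp          # slope from the ROC point to (1, 1)
    return lr_pos, lr_neg

lr_pos, lr_neg = lr_positive_negative(98.0)
print(f"LR(+) = {lr_pos:.2f}, LR(-) = {lr_neg:.2f}")
```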

Likelihood ratio for a range of test results

Suppose that we want to decrease the cut-off value from r to s (Figure 1). Graphically, this corresponds to moving along the ROC curve from the solid circle up and to the right to the solid square (Figure 2). Here, we want to examine the likelihood of having a test value between s and r in those with a disease compared with those without the disease. This is particularly important for tests with polytomous results, say scores obtained from a questionnaire used to categorize people into those with no, mild, moderate, and severe depression. We can define the likelihood ratio for an interval, LR(Δ), as follows (4, 5):

\[ LR(\Delta) = \frac{P(s \le x < r \mid D^{+})}{P(s \le x < r \mid D^{-})} = \frac{Se_{s} - Se_{r}}{Sp_{r} - Sp_{s}} \tag{16} \]

where the indices indicate the Se and Sp for the cut-off values of r and s (Figures 1 and 2). Graphically, it is equal to the slope of the line segment joining the two points on the ROC curve corresponding to the two cut-off points (grey dash dot dotted line, Figure 2, and Table 1). It also corresponds to the ratio between the red-hatched and blue-hatched areas in Figure 1.
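For a test with polytomous results, the interval likelihood ratio of each category is the proportion of diseased people in that category divided by the proportion of non-diseased people in it, which equals the slope of the ROC segment between the two cut-offs bounding the category. The following sketch uses hypothetical counts for a four-category questionnaire; the counts and the function name `interval_likelihood_ratios` are assumptions for illustration, not data from this article.

```python
# Hypothetical counts per score category; NOT data from the article.
categories = ["none", "mild", "moderate", "severe"]
diseased_counts = [10, 20, 30, 40]
healthy_counts = [60, 25, 10, 5]

def interval_likelihood_ratios(pos_counts, neg_counts):
    """LR for each category: share of diseased in it / share of non-diseased in it."""
    n_pos, n_neg = sum(pos_counts), sum(neg_counts)
    return [(p / n_pos) / (q / n_neg) for p, q in zip(pos_counts, neg_counts)]

interval_lrs = interval_likelihood_ratios(diseased_counts, healthy_counts)
for name, lr in zip(categories, interval_lrs):
    print(f"{name}: LR = {lr:.2f}")
```

In this made-up data set, the likelihood ratio rises steadily from the lowest to the highest category, which is the pattern expected when higher scores indicate disease.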

Example

Suppose the fasting blood sugar (FBS) concentration has a binormal distribution in a group of studied people, with a mean of 89.7 (SD 5.0) mg/dL in healthy people and 99.7 (SD 7.2) mg/dL in patients with diabetes mellitus. The data presented in Figures 1 and 2 are based on these assumptions. The test values r and s are 98 and 93 mg/dL, respectively.

As mentioned earlier, LR(r) for an FBS of 98 mg/dL is, in general, very hard to derive precisely. However, assuming the binormal distribution of FBS in our example, we can easily calculate the density values f(r) and g(r) using either the MS Excel® function NORMDIST() or the R function dnorm(). For example, based on the above information, using the Excel function, f(r) is NORMDIST(98, 99.7, 7.2, FALSE), which is equal to 0.0539. Using the R function, g(r) is dnorm(98, mean = 89.7, sd = 5), which is 0.0201. Note that you do not need to use both functions; one is enough. Here, we used both only to show how each is used. LR(FBS = 98 mg/dL), the slope of the tangent line to the ROC curve at the point corresponding to r, f(r) / g(r), is then 2.68 (= 0.0539 / 0.0201), meaning that an FBS of exactly 98 mg/dL is 2.68 times more likely to be observed in a person with diabetes mellitus than in a healthy person.
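For readers who use neither Excel nor R, the same calculation can be sketched in Python with the standard-library class `statistics.NormalDist`, whose `pdf()` method plays the role of NORMDIST(..., FALSE) and dnorm(). This is an equivalent illustration, not part of the original analysis.

```python
from statistics import NormalDist

r = 98  # FBS value of interest, mg/dL

# Density of the diseased distribution at r: mean 99.7, SD 7.2.
f_r = NormalDist(mu=99.7, sigma=7.2).pdf(r)
# Density of the non-diseased distribution at r: mean 89.7, SD 5.0.
g_r = NormalDist(mu=89.7, sigma=5.0).pdf(r)

lr_r = f_r / g_r  # slope of the tangent to the ROC curve at r
print(f"f(r) = {f_r:.4f}, g(r) = {g_r:.4f}, LR(r) = {lr_r:.2f}")
```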

Now, suppose that the prevalence of diabetes mellitus in the studied population is 0.1. This translates to pre-test odds of 0.11 [= 0.1 / (1 – 0.1)]. Also, suppose that we take the FBS cut-off value for the diagnosis of diabetes mellitus to be 98 mg/dL, i.e., those with FBS ≥ 98 mg/dL are considered diabetic. Considering the Se of 0.60 and Sp of 0.95 (1 – Sp = 0.05) at the point on the ROC curve corresponding to r (Figure 2), the LR(+), the slope of the line segment joining the origin of the unit square to the point corresponding to r on the ROC curve, is 12.0 (= 0.60 / 0.05). To determine the post-test odds of the disease, we have:

\[ \text{Post-test odds} = LR(+) \times \text{Pre-test odds} = 12.0 \times \frac{0.1}{1 - 0.1} \approx 1.33 \tag{17} \]

But,

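\[ \text{Post-test probability} = \frac{\text{Post-test odds}}{1 + \text{Post-test odds}} = \frac{1.33}{1 + 1.33} \approx 0.57 \tag{18} \]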

Here, a positive test, having an FBS ≥ 98 mg/dL, increased the probability of diabetes mellitus in a person from 0.1 to 0.57.

Now, what if a person has a negative test result (FBS < 98 mg/dL)? Considering the Se of 0.60 (1 – Se = 0.40) and Sp of 0.95 at the cut-off point, r (Figure 2), the LR(–), the slope of the line joining the point corresponding to r on the ROC curve to the upper-right corner of the unit square, is 0.42 (= 0.40 / 0.95). Then, the post-test odds of having diabetes mellitus is:

\[ \text{Post-test odds} = LR(-) \times \text{Pre-test odds} = 0.42 \times 0.11 \approx 0.05 \tag{19} \]
translating to a post-test probability of the disease of about 0.05 (0.045 when the intermediate values are not rounded). Notice that when the probability and odds are small, the two values are almost equal.
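The conversion from pre-test probability to post-test probability used for both the positive and the negative result can be written as one small function. This is a minimal sketch using the Se, Sp, and prevalence values quoted in the text; the function name `post_test_probability` is an illustrative choice.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply the odds form of Bayes' theorem and convert the result back to a probability."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = likelihood_ratio * pre_test_odds
    return post_test_odds / (1.0 + post_test_odds)

prevalence = 0.1
se, sp = 0.60, 0.95  # values at the cut-off of 98 mg/dL, as quoted in the text

p_positive = post_test_probability(prevalence, se / (1.0 - sp))    # uses LR(+)
p_negative = post_test_probability(prevalence, (1.0 - se) / sp)    # uses LR(-)

print(f"After a positive test: {p_positive:.3f}")
print(f"After a negative test: {p_negative:.3f}")
```

Because the function carries no intermediate rounding, its results may differ in the last digit from figures obtained with rounded odds and likelihood ratios.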

Finally, to calculate the likelihood ratio of having an FBS between 93 and 98 mg/dL, we need to calculate the slope of the line segment joining the points corresponding to r and s on the ROC curve (Figure 2). The Se and Sp at s are 0.81 and 0.77, respectively. Then we have:

\[ LR(\Delta) = \frac{Se_{s} - Se_{r}}{Sp_{r} - Sp_{s}} = \frac{0.81 - 0.60}{0.95 - 0.77} = \frac{0.21}{0.18} \approx 1.17 \tag{20} \]
meaning that an FBS between 93 and 98 mg/dL is 1.17 times more likely to be found in a person with diabetes mellitus as compared with a healthy person.
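The interval calculation can be reproduced as a short sketch, using the Se and Sp values quoted in the text for the two cut-offs. The function name `interval_lr` and its parameter names are illustrative.

```python
def interval_lr(se_upper, sp_upper, se_lower, sp_lower):
    """LR for test values between the lower and the upper cut-off.

    Equals the slope of the ROC segment joining the two cut-off points:
    rise = Se(lower) - Se(upper), run = Sp(upper) - Sp(lower).
    """
    return (se_lower - se_upper) / (sp_upper - sp_lower)

# r = 98 mg/dL (upper cut-off) and s = 93 mg/dL (lower cut-off).
lr_delta = interval_lr(se_upper=0.60, sp_upper=0.95, se_lower=0.81, sp_lower=0.77)
print(f"LR(93 <= FBS < 98) = {lr_delta:.2f}")
```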

Conclusion

Having a clear understanding of the meaning and usage of the likelihood ratio is of paramount importance for the correct interpretation of test results. Graphical representation of test indices is very helpful for understanding this issue. Care should be taken not to confuse the likelihood ratio for a specific test result, for a positive or negative test result, and for a range of test values.

Acknowledgments

The authors would like to thank Professor Anders Kallner of Karolinska University Hospital, Stockholm, Sweden, for the long time he spent discussing the issues raised in this manuscript. The views expressed in this paper are the sole responsibility of the authors.

Notes

Conflicts of interest: None declared.

References

1. Newman TB, Kohn MA, editors. Evidence-Based Diagnosis. New York: Cambridge University Press; 2009. https://doi.org/10.1017/CBO9780511759512

2. Goodman SN. Toward evidence-based medical statistics. 2: The Bayes factor. Ann Intern Med. 1999;130:1005–13. https://doi.org/10.7326/0003-4819-130-12-199906150-00019

3. Fletcher RH, Fletcher SW, Fletcher GS, editors. Clinical Epidemiology: The Essentials. Fifth ed. Philadelphia: Lippincott Williams & Wilkins; 2014.

4. Habibzadeh F, Habibzadeh P, Yadollahie M. On determining the most appropriate test cut-off value: the case of tests with continuous results. Biochem Med (Zagreb). 2016;26:297–307. https://doi.org/10.11613/BM.2016.034

5. Choi BC. Slopes of a receiver operating characteristic curve and likelihood ratios for a diagnostic test. Am J Epidemiol. 1998;148:1127–32. https://doi.org/10.1093/oxfordjournals.aje.a009592

6. Fierz W. Likelihood ratios of quantitative laboratory results in medical diagnosis: The application of Bezier curves in ROC analysis. PLoS One. 2018;13:e0192420. https://doi.org/10.1371/journal.pone.0192420

7. Erdreich LS, Lee ET. Use of relative operating characteristic analysis in epidemiology. A method for dealing with subjective judgement. Am J Epidemiol. 1981;114:649–62. https://doi.org/10.1093/oxfordjournals.aje.a113236

8. Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem. 1993;39:561–77.

9. Shapiro DE. The interpretation of diagnostic tests. Stat Methods Med Res. 1999;8:113–34. https://doi.org/10.1177/096228029900800203

10. Kallner A. Bayes’ theorem, the ROC diagram and reference values: Definition and use in clinical diagnosis. Biochem Med (Zagreb). 2018;28:010101. https://doi.org/10.11613/BM.2018.010101

11. Hazra A, Gogtay N. Biostatistics series module 7: the statistics of diagnostic tests. Indian J Dermatol. 2017;62:18–24. https://doi.org/10.4103/0019-5154.198047