On determining the sensitivity and specificity of a new diagnostic test through comparing its results against a non-gold-standard test

Diagnostic tests are important clinical tools. To assess the sensitivity and specificity of a new test, its results should be compared against a gold standard. However, the gold-standard test is not always available. Herein, I show that we can compare the new test against a well-established diagnostic test (not a gold-standard test, but one with known sensitivity and specificity) and compute the sensitivity and specificity that the new test would have had if it had been compared against the gold-standard test. The technique presented is useful for situations where the gold standard is not readily available.


Introduction
Diagnostic tests are among the important tools commonly used in clinical medicine. Before a new test can be used in clinical practice, it should be evaluated for clinical validity. Studies assessing the clinical validity of a test (also termed diagnostic accuracy studies) involve determining the test performance indices, including the test sensitivity (Se) and specificity (Sp) (1). Other common performance indices are positive and negative predictive values and likelihood ratios, which can be calculated from the Se, the Sp, and the prevalence (pr) of the disease of interest (2,3). To determine a test's performance, its results should be evaluated against another test, the so-called reference standard (4). The reference standard can be a gold-standard test, i.e., a test with a Se and Sp of 1.0 (or 100%). The gold-standard test can thus correctly discriminate those with and without the disease or condition of interest. For a test with binary results, the outcome is clear: positive or negative. For tests with continuous results, however, we need to set a cut-off value to categorize the results into positive or negative (2). Compared to the gold standard, the obtained results can be categorized into true-positive (TP), true-negative (TN), false-positive (FP), and false-negative (FN) results (Table 1a). The test's Se and Sp are defined as follows (5):

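\[
Se = P(T^{+} \mid D^{+}) = \frac{TP}{TP + FN}, \qquad Sp = P(T^{-} \mid D^{-}) = \frac{TN}{TN + FP}
\]

where T+ and T- represent positive and negative test results; and D+ and D-, presence and absence of the disease, respectively. P(A|B) denotes the conditional probability of event A given event B. The prevalence of the disease (π) is then:

\[
\pi = P(D^{+}) = \frac{TP + FN}{TP + FP + TN + FN}
\]

where P(x) designates the probability of x.

To evaluate the Se and Sp of a new test, it is common to compare its results against those obtained from a gold-standard test. Nonetheless, the gold-standard test may not always be available: it either does not exist or is very difficult or expensive to perform for certain disease conditions (6). The question that arises is whether it is possible to calculate the Se and Sp of the new test based on the results obtained from its comparison with an imperfect reference standard, that is, a well-established (but not a gold-standard) test. This is not a new question, and several solutions have so far been proposed (1). Herein, I wish to propose an analytical method to address this question.

Stating the question
Suppose that we have a well-established test, say T_1, with known Se and Sp (measured against a gold-standard test) of Se_1 and Sp_1 (Table 1a). Now, suppose that we have a new test, say T_2, the results of which were compared against T_1 (not against a gold standard), and that it had a Se and Sp (against T_1) of Se_{2,1} and Sp_{2,1} (Table 1b). We wish to derive the Se and Sp of T_2 (Se_2 and Sp_2) that would have been obtained had it been tested against the gold standard (e.g., Table 1c).

The proposed solution
When we compare T_2 against T_1, the calculated prevalence, pr, is not really the true prevalence, π, as T_1 is not a gold standard and thus would have FP and FN results. However, we can calculate the true prevalence, π, as follows (7):

\[
\pi = \frac{pr + Sp_1 - 1}{Se_1 + Sp_1 - 1} \tag{5}
\]

If the results of T_1 and T_2 are conditionally independent given the disease status, the probabilities that both tests are positive and that both tests are negative are:

\[
P(T_1^{+} \cap T_2^{+}) = P(D^{+})\,P(T_1^{+} \mid D^{+})\,P(T_2^{+} \mid D^{+}) + P(D^{-})\,P(T_1^{+} \mid D^{-})\,P(T_2^{+} \mid D^{-}) = \pi\, Se_1\, Se_2 + (1 - \pi)(1 - Sp_1)(1 - Sp_2) \tag{6}
\]

and

\[
P(T_1^{-} \cap T_2^{-}) = P(D^{-})\,P(T_1^{-} \mid D^{-})\,P(T_2^{-} \mid D^{-}) + P(D^{+})\,P(T_1^{-} \mid D^{+})\,P(T_2^{-} \mid D^{+}) = (1 - \pi)\, Sp_1\, Sp_2 + \pi (1 - Se_1)(1 - Se_2) \tag{7}
\]

Based on Eq. 6, and because Se_{2,1} = P(T_2^{+} | T_1^{+}) and P(T_1^{+}) = pr, we have:

\[
pr\, Se_{2,1} = \pi\, Se_1\, Se_2 + (1 - \pi)(1 - Sp_1)(1 - Sp_2) \tag{9}
\]

In the same way, based on Eq. 7, and because Sp_{2,1} = P(T_2^{-} | T_1^{-}) and P(T_1^{-}) = 1 - pr, we have:

\[
(1 - pr)\, Sp_{2,1} = (1 - \pi)\, Sp_1\, Sp_2 + \pi (1 - Se_1)(1 - Se_2) \tag{11}
\]

Equations 9 and 11 are a system of two simultaneous equations. Substituting π from Eq. 5 and solving for Se_2 and Sp_2 yields:

\[
Se_2 = \frac{pr\, Se_{2,1}\, Sp_1 - (1 - pr)(1 - Sp_{2,1})(1 - Sp_1)}{pr + Sp_1 - 1}, \qquad
Sp_2 = \frac{(1 - pr)\, Sp_{2,1}\, Se_1 - pr\,(1 - Se_{2,1})(1 - Se_1)}{Se_1 - pr} \tag{12}
\]

For a function f of independent random variables x_1, ..., x_k, the squared standard error (SE) can be approximated to the first order by:

\[
SE^2(f) \approx \sum_{i=1}^{k} \left( \frac{\partial f}{\partial x_i} \right)^{2} SE^2(x_i) \tag{13}
\]

Assuming that Se_2 is a function of independent random variables pr, Se_{2,1}, Sp_{2,1}, and Sp_1 (Eq. 12), using Eq. 13 and employing basic calculus, we have:

\[
SE^2(Se_2) \approx \left( \frac{\partial Se_2}{\partial pr} \right)^{2} SE^2(pr) + \left( \frac{\partial Se_2}{\partial Se_{2,1}} \right)^{2} SE^2(Se_{2,1}) + \left( \frac{\partial Se_2}{\partial Sp_{2,1}} \right)^{2} SE^2(Sp_{2,1}) + \left( \frac{\partial Se_2}{\partial Sp_1} \right)^{2} SE^2(Sp_1) \tag{14}
\]

In the same way, assuming that Sp_2 is a function of independent random variables pr, Se_{2,1}, Sp_{2,1}, and Se_1 (Eq. 12), we have:

\[
SE^2(Sp_2) \approx \left( \frac{\partial Sp_2}{\partial pr} \right)^{2} SE^2(pr) + \left( \frac{\partial Sp_2}{\partial Se_{2,1}} \right)^{2} SE^2(Se_{2,1}) + \left( \frac{\partial Sp_2}{\partial Sp_{2,1}} \right)^{2} SE^2(Sp_{2,1}) + \left( \frac{\partial Sp_2}{\partial Se_1} \right)^{2} SE^2(Se_1) \tag{15}
\]

The SE for the Se and Sp of the tests can be calculated using Eq. 2.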
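As a minimal sketch of the calculation, the Python snippet below evaluates Eq. 5 and the closed-form solution of the two simultaneous equations for Se_2 and Sp_2. The function names and the input values are illustrative assumptions and are not taken from Table 1; the closed forms assume that the results of T_1 and T_2 are conditionally independent given the disease status.

```python
def true_prevalence(pr, se1, sp1):
    """True prevalence from the apparent prevalence pr measured with T1 (Eq. 5)."""
    denom = se1 + sp1 - 1.0
    if denom == 0.0:
        raise ValueError("Se1 + Sp1 must differ from 1")
    return (pr + sp1 - 1.0) / denom


def corrected_se_sp(pr, se21, sp21, se1, sp1):
    """Se and Sp of the new test T2 as if compared against a gold standard.

    pr   : apparent prevalence (fraction positive by T1)
    se21 : Se of T2 measured against T1
    sp21 : Sp of T2 measured against T1
    se1  : known Se of T1 against the gold standard
    sp1  : known Sp of T1 against the gold standard
    Assumption: T1 and T2 are conditionally independent given disease status.
    """
    d_se = pr + sp1 - 1.0  # equals pi * (Se1 + Sp1 - 1)
    d_sp = se1 - pr        # equals (1 - pi) * (Se1 + Sp1 - 1)
    if d_se == 0.0 or d_sp == 0.0:
        raise ValueError("denominator is zero; estimates are undefined")
    se2 = (pr * se21 * sp1 - (1.0 - pr) * (1.0 - sp21) * (1.0 - sp1)) / d_se
    sp2 = ((1.0 - pr) * sp21 * se1 - pr * (1.0 - se21) * (1.0 - se1)) / d_sp
    return se2, sp2


if __name__ == "__main__":
    # Hypothetical inputs chosen for illustration only.
    pr, se1, sp1 = 0.25, 0.85, 0.90
    se21, sp21 = 0.672, 0.784
    pi = true_prevalence(pr, se1, sp1)
    se2, sp2 = corrected_se_sp(pr, se21, sp21, se1, sp1)
    print(f"pi = {pi:.4f}, Se2 = {se2:.4f}, Sp2 = {sp2:.4f}")
```

With these hypothetical inputs the sketch gives a true prevalence of about 0.20, Se_2 of about 0.90, and Sp_2 of about 0.81. Both denominators shrink toward zero as pr approaches 1 - Sp_1 or Se_1, so the estimates are likely to be unstable near those limits.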

Table 1: a) a well-established test, T_1, against the gold-standard test; b) a new test, T_2, against T_1; note that here, the true prevalence, π, is replaced by the apparent prevalence, pr (7). [Table body not recoverable; surviving entries are "(1 - π) Sp_2" with the value 260 and a totals row of 80, 320, and 400.]

Based on the information provided (Table 1b), the apparent prevalence, pr, is 0.25 (SE² = 3.1 × 10⁻⁴). Using Eq. 5, the true prevalence (π) is …, which is correct when the disease prevalence is measured by a gold-standard test (Table 1a). The Se and Sp (along with their SE²) of T_2 against T_1 … (Table 1c). Note that the 95% CI of the calculated Se_2 and Sp_2, when they are derived through comparison with T_1, is wider than that obtained when T_2 is compared directly against a gold-standard test.
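The SE and 95% CI calculations referred to above can be sketched numerically. The snippet below is only an illustration under stated assumptions: it applies first-order error propagation for independent inputs, approximates the partial derivatives by central differences rather than closed-form calculus, and builds a normal-approximation (Wald) interval clipped to [0, 1]. All function names are hypothetical, and every input except pr = 0.25 and its SE² of 3.1 × 10⁻⁴ is invented for the example.

```python
import math


def se2_sp2(pr, se21, sp21, se1, sp1):
    """Se2 and Sp2, assuming T1 and T2 are conditionally independent given disease."""
    se2 = (pr * se21 * sp1 - (1.0 - pr) * (1.0 - sp21) * (1.0 - sp1)) / (pr + sp1 - 1.0)
    sp2 = ((1.0 - pr) * sp21 * se1 - pr * (1.0 - se21) * (1.0 - se1)) / (se1 - pr)
    return se2, sp2


def propagate_variance(f, values, variances, h=1e-6):
    """First-order variance of f(*values) for independent inputs.

    Partial derivatives are approximated by central differences with step h.
    """
    total = 0.0
    for i, var in enumerate(variances):
        up, down = list(values), list(values)
        up[i] += h
        down[i] -= h
        deriv = (f(*up) - f(*down)) / (2.0 * h)
        total += deriv ** 2 * var
    return total


def wald_ci(estimate, variance, z=1.96):
    """Normal-approximation interval, clipped to the valid range [0, 1]."""
    half = z * math.sqrt(variance)
    return max(0.0, estimate - half), min(1.0, estimate + half)


if __name__ == "__main__":
    # Order of inputs: pr, Se2,1, Sp2,1, Se1, Sp1 (all but pr are hypothetical).
    values = (0.25, 0.672, 0.784, 0.85, 0.90)
    variances = (3.1e-4, 5e-4, 5e-4, 1e-3, 1e-4)  # squared SEs
    se2, sp2 = se2_sp2(*values)
    var_se2 = propagate_variance(lambda *a: se2_sp2(*a)[0], values, variances)
    var_sp2 = propagate_variance(lambda *a: se2_sp2(*a)[1], values, variances)
    print("Se2", round(se2, 4), "95% CI", wald_ci(se2, var_se2))
    print("Sp2", round(sp2, 4), "95% CI", wald_ci(sp2, var_sp2))
```

Because the uncertainty of pr and of the reference test's own Se_1 or Sp_1 enters the sum, the propagated variance is expected to exceed the variance of a direct comparison against a gold standard, which is consistent with the wider interval noted in the text.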
In conclusion, this technique appears to be useful, particularly where the gold-standard test is not readily available or is expensive. Further studies are needed to elaborate on the conditions of the validity study in which Se_1 and Sp_1 are estimated, the minimum number of data points to be examined, and the probable effect of the prevalence of the disease or condition of interest on the choice of the reference test, among other things.

Potential conflict of interest
None declared.