Two parameters are key to understanding how useful medical tests are: sensitivity and specificity. Sensitivity measures the fraction of affected people that the test successfully identifies, so the complement to 1 of the sensitivity measures the false negatives. Specificity is the complement to 1 of the fraction of healthy people that the test erroneously flags as affected (i.e., the false positives).
Clearly, a test is better when both sensitivity and specificity are high. A high sensitivity means that few affected people go through the test undetected, while a high specificity means that few healthy people get the scare of a positive test result.
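To make the two definitions concrete, here is a minimal sketch in Python (the counts are invented purely for illustration):

```python
# Sensitivity and specificity from the four cells of a confusion matrix.
# All counts below are made up for illustration.

def sensitivity(true_positives, false_negatives):
    """Fraction of affected people the test correctly flags."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Fraction of healthy people the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)

# Example: of 100 affected people the test catches 80 (20 false negatives);
# of 900 healthy people it correctly clears 855 and wrongly flags 45.
print(sensitivity(80, 20))   # 0.8
print(specificity(855, 45))  # 0.95
```

The complement of each value gives the corresponding error rate: 1 − 0.8 = 20% false negatives, 1 − 0.95 = 5% false positives.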
When a test for prostate cancer gives a positive result, doctors recommend a biopsy, which is an invasive and often uncomfortable procedure. Regarding prostate biopsies, Wikipedia says: "Biopsies detect prostate cancer in about 25% of men with abnormal screening tests. However, a negative biopsy does not ensure the absence of disease. Repeat prostate biopsies are positive in about 25-30% of patients whose initial biopsy was negative." Now, Wikipedia is not always reliable, but I have read from other sources that prostate biopsy is not particularly dependable. To complicate the picture, I have also read (I don't remember where) that many prostate cancers are not malignant, and people can live with them without consequences.
I know: my last paragraph is just hand-waving. Without references to original sources, my statements have no scientific value whatsoever. But, now that I have written it, I don’t like to remove it. Let me leave it at that and go back to statistical analysis.
For the purpose of this post, I will consider two tests: PSA (Prostate-Specific Antigen) and PCA3 (Prostate Cancer Antigen 3). PSA measures the level in the blood of a protein produced by the prostate gland. PCA3 checks the urine for the expression of a prostate-specific gene.
The following table for PSA comes from an article on the website of the Prostate Cancer Research Institute (www.prostate-cancer.org/pcricms/node/122):
| PSA (ng/mL) | Sensitivity (%) | Specificity (%) |
|-------------|-----------------|-----------------|
| 1.1         | 83.4            | 38.9            |
| 1.6         | 67.0            | 58.7            |
| 2.1         | 52.6            | 72.5            |
| 2.6         | 40.5            | 81.1            |
| 3.1         | 32.2            | 86.7            |
| 4.1         | 20.5            | 93.8            |
| 6.1         | 4.6             | 98.5            |
| 8.1         | 1.7             | 99.4            |
| 10.1        | 0.9             | 99.7            |
The study was conducted (if I understand it correctly) on 5,587 subjects, of whom 4,362 had no cancer of any type. The Symbion Laverty Pathology Lab that conducted my PSA test a few months ago stated that a level below 4.5 ng/mL is considered ‘normal’ (although, to be picky, the correlation between PSA and prostate cancer actually changes with the age of the subject).
If you (linearly) interpolate the table, you find that the specificity for a PSA of 4.5 is 94.7%. Considering that specificity as a function of PSA ‘flattens up’, we can be generous and say that the cut-off is around 95%. This means that 5% of healthy subjects will be encouraged to take a biopsy if their PSA is 4.5 ng/mL or higher. Similarly, the interpolated sensitivity for a PSA of 4.5 is 17.3%, which I am happy to generously round up to 18% to be on the safe side.
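For the curious, here is a sketch of that interpolation in Python, using the PSA table above (the cut-off of 4.5 ng/mL is the 'normal' threshold my lab quoted):

```python
# Linearly interpolate the PSA table to estimate sensitivity and
# specificity at a cut-off of 4.5 ng/mL.
psa  = [1.1, 1.6, 2.1, 2.6, 3.1, 4.1, 6.1, 8.1, 10.1]
sens = [83.4, 67.0, 52.6, 40.5, 32.2, 20.5, 4.6, 1.7, 0.9]
spec = [38.9, 58.7, 72.5, 81.1, 86.7, 93.8, 98.5, 99.4, 99.7]

def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x outside the table range")

print(round(interp(4.5, psa, sens), 1))  # 17.3
print(round(interp(4.5, psa, spec), 1))  # 94.7
```

The interpolated values match the figures quoted in the text: a sensitivity of about 17.3% and a specificity of about 94.7% at 4.5 ng/mL.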
The following data, relative to PCA3, comes from an article published in Reviews in Urology and accessible via the website of the US National Institutes of Health (www.ncbi.nlm.nih.gov/pmc/articles/PMC2556484/):
| PCA3 Cut-off | Sensitivity (%) | Specificity (%) |
|--------------|-----------------|-----------------|
| 5            | 96              | 14              |
| 20           | 71              | 56              |
| 35           | 54              | 74              |
| 50           | 40              | 83              |
| 65           | 32              | 91              |
| 90           | 20              | 95              |
The study was conducted on 570 men, of whom 36% tested positive. But what does 'positive' mean here? It was decided to use a PCA3 cut-off of 35 (whatever that means...), because that score (I quote) "combined the greatest cancer sensitivity and specificity". Indeed, 35 is the value that maximises both the sum and the product of sensitivity and specificity.
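That last claim is easy to check against the table above with a couple of lines of Python:

```python
# Verify that a PCA3 cut-off of 35 maximises both the sum and the
# product of sensitivity and specificity in the table above.
table = {5: (96, 14), 20: (71, 56), 35: (54, 74),
         50: (40, 83), 65: (32, 91), 90: (20, 95)}

best_sum  = max(table, key=lambda c: table[c][0] + table[c][1])
best_prod = max(table, key=lambda c: table[c][0] * table[c][1])
print(best_sum, best_prod)  # 35 35
```

The runner-up by both criteria is the cut-off of 20 (sum 127 versus 128, product 3976 versus 3996), so 35 wins only narrowly.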
Notice that, like with PSA, higher specificity is associated with lower sensitivity. This obviously makes a lot of sense: the price you pay to reduce false positives is to get more false negatives.
To summarise, the two major tests used to diagnose prostate cancer (if we exclude the practice of sticking a finger up your anus to check whether your prostate is too enlarged to be considered healthy) have sensitivities of 18% and 54% respectively, and specificities of 95% and 74%.
Let's set these figures aside for a moment to concentrate on the incidence of prostate cancer. I will use the figures provided by the Prostate Cancer Foundation of Australia (www.prostate.org.au/articleLive/pages/Prostate-Cancer-Statistics.html), but I am very confident that my conclusions will be applicable worldwide.
54 out of 1000 men in their sixties are on average diagnosed with prostate cancer (1/1000 for the 40s, 12/1000 for the 50s, and 80/1000 for the 70s). This formulation is in my opinion a bit ambiguous, because it is not clear to me whether all those with a positive diagnosis actually had the cancer. And it doesn't say how many men had a cancer that remained undiagnosed. But let's move on, because it will not significantly affect the result of my analysis.
OK. If I assume that 54/1000 measures the ‘real’ number of Australian men in their 60s with prostate cancer, I now have all the information I need to calculate the number of PSA and PCA3 tests that give a positive result.
| Test | Sensitivity | Specificity | Positive among the 54 sick | Positive among the 946 healthy |
|------|-------------|-------------|----------------------------|--------------------------------|
| PSA  | 18%         | 95%         | 10                         | 47                             |
| PCA3 | 54%         | 74%         | 29                         | 246                            |

(Positive results per 1,000 subjects.)
All in all, more than 82% (obtained as 47/(10+47)) of the men that might subject themselves to a biopsy because of a positive PSA test wouldn’t need to do it at all. And the figure rises to 89% with the PCA3 test. And imagine the shock of being diagnosed with cancer! Furthermore, if you happen to have prostate cancer, PSA will only detect it with a probability of 18%, while PCA3 will detect it with a probability just above fifty-fifty.
I might be too analytical, but to me, a test for prostate cancer doesn’t seem worth doing.
Then, you might ask, why did I decide to get a PSA test done? Well, I did and I didn't... My doctor decided to do it against my expressed desire not to have it done. Fortunately, it came back with 0.31 ng/mL. That is, the result was so low that the sensitivity was off the lower end of the scale, probably 90% or better. But next time I am going to insist. No more PSA tests for me!
In general, the statistical problems with medical tests occur when the disease being tested is rare, even if the test is very good. To explain why, let’s look at two very good tests, one for a hypothetical disease that affects 1% of the population and one for an equally hypothetical rare disease that affects 0.01% of the population. Further, let’s assume that both tests have a sensitivity of 100% (actually impossible, because there are always cases that remain undiagnosed, but bear with me) and a specificity of 99.99%.
With the more common disease, when screening 10,000 subjects, the test correctly identifies the 100 sick people and gives a positive result for 9900 * 0.01% = 0.99 ≈ 1 person who is in fact healthy.
With the rarer disease, when screening 10,000 subjects, the test, as in the previous case, identifies all sick people and picks 9999 * 0.01% = 0.9999 ≈ 1 person who is healthy. But this time, the number of sick people is also 1. Therefore, half of the people for whom the test gives a positive result are in fact healthy.
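The two scenarios above can be reproduced with a short sketch (same assumptions: 10,000 subjects, sensitivity 100%, specificity 99.99%):

```python
# Expected true and false positives when screening n subjects for a
# disease of given prevalence.
def screen(n, prevalence, sensitivity=1.0, specificity=0.9999):
    sick       = n * prevalence
    healthy    = n - sick
    true_pos   = sick * sensitivity
    false_pos  = healthy * (1 - specificity)
    return true_pos, false_pos

print(screen(10_000, 0.01))    # common disease: ~100 sick, ~1 false alarm
print(screen(10_000, 0.0001))  # rare disease:   ~1 sick,   ~1 false alarm
```

For the rare disease the true and false positives are equal, so even this near-perfect test is wrong about half the time when it says 'positive'.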
Next time your doctor suggests that you take a test, perhaps you should check whether it is worth the effort. I wonder what the sensitivity and specificity of the many tests doctors prescribe for check-ups actually are. And how rare are the pathologies being tested for? MMmmm...