I used to work for a company that worked with drug abusers, so drug testing was of obvious interest.  The company also worked with people with AIDS, so disease testing was of interest too.  For the purposes of this diary, the two are roughly equivalent.  There are also profound civil liberties questions involved in these, but they aren't covered here; I'm just the stats man.

More below the fold, but first, a word about this series:

This series is for anyone.  There will be no advanced math used.  Nothing beyond high school, usually not beyond grade school.  But it'll go places you didn't go in elementary school or high school.

I welcome thoughts, ideas, or what-have-you.  If anyone would like to write a diary in this series, that's cool too.  Just ask me.  Or if you want to co-write with me, that's fine.

The rules:  Any math that is required beyond arithmetic and very elementary algebra will be explained.  Anything much beyond that will be VERY CAREFULLY EXPLAINED.

Anyone can feel free to help me explain, but NO TALKING DOWN TO PEOPLE.  I'll hide rate anything insulting, but I promise to be generous with the mojo otherwise.

A test for a disease or a drug (or other things) usually gives a result of POSITIVE or NEGATIVE.  Sometimes the result is INCONCLUSIVE, but most tests don't do that, and when they do, the solution is usually to test again, so we won't cover those.  Similarly, in reality, the person may have used the drug (or have the disease) or not.  There are, thus, four possibilities:

(For simplicity, I will just deal with "drugs" from now on.)

True positive: The person did the drug, and the test says he/she did.
True negative: The person did not do the drug, and the test says he/she did not.
False positive: The person did not do the drug, but the test says he/she did.
False negative: The person did the drug, but the test says he/she did not.

These are usually presented in a table:

| Test result | Reality: Negative | Reality: Positive |
| --- | --- | --- |
| Negative | TN | FN |
| Positive | FP | TP |

When you test a test, you typically take a group of people known to have done the drug and a group known not to have done it, give the test to them all, and record the results.  Sometimes people summarize these results with two numbers called sensitivity and specificity.  Sensitivity is TP/(TP + FN); specificity is TN/(TN + FP).  Overall accuracy is (TP + TN)/(TP + TN + FP + FN).  Suppose we have a test that is 98% accurate, with sensitivity and specificity both 98%:

| Test result | Reality: Negative | Reality: Positive |
| --- | --- | --- |
| Negative | 98% | 2% |
| Positive | 2% | 98% |
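The sensitivity, specificity, and accuracy formulas can be sketched in a few lines of Python. This is only an illustration: the function names and the validation counts (100 known users and 100 known non-users, with 98 classified correctly in each group) are assumptions chosen to match the 98% table, not data from a real study.

```python
def sensitivity(tp, fn):
    """Share of actual users that the test flags as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual non-users that the test clears as negative."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Share of all results, positive or negative, that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical validation study: 100 known users, 100 known non-users.
tp, fn = 98, 2   # known users: caught / missed
tn, fp = 98, 2   # known non-users: cleared / wrongly flagged

print(f"sensitivity = {sensitivity(tp, fn):.0%}")
print(f"specificity = {specificity(tn, fp):.0%}")
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.0%}")
```

Note that none of these three numbers says anything yet about how common the drug is among the people being tested.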

Now, we pull someone in off the street and give him the test.  He tests positive.  What are the chances he has done the drug?  It depends.  What were those numbers (as counts, not percentages) in the general population?

Were they

| Test result | Reality: Negative | Reality: Positive |
| --- | --- | --- |
| Negative | 9800 | 2 |
| Positive | 200 | 98 |

or were they

| Test result | Reality: Negative | Reality: Positive |
| --- | --- | --- |
| Negative | 980 | 2 |
| Positive | 20 | 98 |

or were they

| Test result | Reality: Negative | Reality: Positive |
| --- | --- | --- |
| Negative | 98 | 20 |
| Positive | 2 | 980 |

The first might be for some drug done by relatively few people (like, say, injecting heroin), the second for a drug done by quite a few (like, say, nicotine), and the last for a drug done by nearly everyone (maybe caffeine).

In the first case, if you give the guy the test and he comes up positive, his chance of actually having done the drug is 98/(98 + 200), or about 1 in 3.  In the second case, it is 98/(98 + 20), or about 83%.  In the last case, it's 980/982, or almost certain!  And, if the drug were done by almost no one in the population, then the chance could be much less than 1 in 3.  In fact, if no one in that population does the drug, then, by definition, the chance that this person does the drug is 0, regardless of how accurate the test is!
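The arithmetic for the three tables can be checked with a short Python sketch. The function name and the table labels are made up for illustration; the counts are the true-positive and false-positive cells from the "Positive" test row of each table.

```python
def chance_positive_is_real(tp, fp):
    """Of everyone who tests positive, the share who actually did the drug."""
    return tp / (tp + fp)


# (true positives, false positives) from the "Positive" row of each table.
tables = {
    "first table (drug used by few)": (98, 200),
    "second table (drug used by quite a few)": (98, 20),
    "third table (drug used by nearly everyone)": (980, 2),
}

for label, (tp, fp) in tables.items():
    print(f"{label}: {chance_positive_is_real(tp, fp):.1%}")
```

The test itself is identical in all three cases; only the mix of users and non-users in the population changes.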

So, you have to know what population the person comes from, and you have to know something about that population as well.
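To see how much the population matters, here is a minimal sketch that holds the test fixed and varies only the share of the population that uses the drug. It assumes the 98% sensitivity and specificity used in this diary; the function name, the `prevalence` parameter, and the list of prevalence values tried are assumptions for illustration.

```python
def chance_given_positive(prevalence, sensitivity=0.98, specificity=0.98):
    """Chance that a person who tests positive really did the drug.

    prevalence is the fraction of the population that did the drug (0 to 1).
    """
    true_pos = sensitivity * prevalence                # users who test positive
    false_pos = (1 - specificity) * (1 - prevalence)   # non-users who test positive
    if true_pos + false_pos == 0:
        return 0.0  # nobody tests positive at all
    return true_pos / (true_pos + false_pos)


# Same test, different populations (illustrative prevalence values).
for prevalence in (0.0, 0.001, 0.01, 0.1, 0.5, 0.9):
    chance = chance_given_positive(prevalence)
    print(f"prevalence {prevalence:6.1%} -> chance a positive is real: {chance:.1%}")
```

With a prevalence of 100 users out of 10,100 people, this gives the same answer as the first table, 98/(98 + 200).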


#### Comment Preferences

• ##### Would you have any thoughts on(4+ / 0-)

a recently proposed rule concerning drug testing put forward by the federal Mine Safety and Health Administration?

Mine operators would be required to establish an alcohol- and drug-free mine program, which includes a written policy, employee education, supervisory training, alcohol- and drug-testing for miners that perform safety- sensitive job duties and their supervisors, and referrals for assistance for miners and supervisors who violate the policy. The proposed rule would also require those who violate the prohibitions to be removed from the performance of safety-sensitive job duties until they successfully complete the recommended treatment and their alcohol- and drug-free status is confirmed by a return-to-duty test.

Of course, practically every job in a mine is going to be "safety-sensitive," obviously.

The agency has received some criticism in that the proposal was issued abruptly on September 8, without the usual prior notice through the semi-annual rulemaking agenda, and in that the agency is holding public hearings this week -- by webcast, simultaneously in several places, rather than in-person and sequentially as per normal. Definitely the fast track.

It's known that some miners use alcohol and drugs, and in a tiny number of cases, coroners found intoxication that likely played a role in a fatality -- either alcohol or methamphetamines.

Any idea if this proposal is likely to do more harm than good?

• ##### Whether it will do more harm than good(4+ / 0-)

depends on a lot of things that I don't know.  This is the first I've heard of this.

I am not opposed to all drug testing.  There is some balance point between public safety and private rights; the question is how to strike that balance.

I am not at all expert on how drug testing is done for, say, pilots.

Statistically, there are ways to vary the balance between false negatives and false positives.  But there is no way to eliminate them --- they can be reduced with better tests, or repeated testing with different tests.

One approach might be to require tests of ability, rather than of drug residue in the body.

• ##### Never mind that some "drugs" stay in the(7+ / 0-)

body for weeks while some - the most serious - vacate in just a few days.

Drug testing can be traced to friends of Ronald Reagan, particularly a man named Carlton Turner:

Why do so many U.S. companies piss-test? Conrad says that in the Reagan years, Carlton Turner did studies for the National Institute of Drug Abuse on drugs in the work place.

"Turner didn't find much about the effects of drugs in the work place, but he did discover someone had invented a way to test for THC in urine. Turner recognized it could be a gold mine. The Reagan administration, at Turner's bidding, agreed to put urinalysis requirements in the government contract process."

The U.S. government effectively subsidized the pee-test industry. (Note: In '91, the U.S. Center for Disease Control reported labs routinely come up with false positives in piss tests.)

He and others became wealthy, and the Repubs and Dems (Tip O'Neill was as reefer mad as the worst Repub) got people to totally piss away their rights to privacy.

Note the focus on false positives and remember stories about the witchhunters with their "special tools" like knives with blades that could be moved into the handle such that a woman could be "shown" to be a witch because the "knife sank in deep but left no mark and no blood".

The huge focus on drug testing the Repubs managed to force on us was clearly - and remains - a central part of a fascist ideology that wants different ways to invade your privacy and know what you are doing on your own time.

Those who think it's about "saving lives" clearly forget about the 360000 people who die every year from smoking legal tobacco.

Drug testing is nothing but a very lucrative scam enriching GOP-connected companies and enforcing GOP ideology.

And it is 90% about pot smokers since THC is the thing it's best at.

Not to piss on your excellent work plf....

• ##### You raise good issues(0+ / 0-)

although they are not what I focused on .... like I said, I'm the stats guy.

But I am not opposed to all drug testing ... for instance, I am not opposed to testing pilots.

OTOH, I know the test for THC is bad; it picks up all kinds of stuff that it shouldn't because, as you note, THC sets off the test long after the effects of THC have passed.

One approach might be to have ability tests, rather than content tests.  Or to have ability tests if a content test had given a positive result.

As for 'saving lives', I don't see how testing plays a role in saving the lives of users of drugs, even if we assume drugs cause health problems.  The idea is that some drug use endangers others ... and I don't think there's much doubt that some drugs impair functioning, at least temporarily.

• ##### There are performance tests used by some companies (3+ / 0-)

that are much more accurate in determining actual impairment - but I know little about them.

I thought I remembered the name of one of them...but it's for diagnosing ADHD... D'oh!.

I merely inserted my comment because so many people accept the intrusion of drug testing like it's no big deal, yet get up in arms when Bush is rifling through their e-mail.

It's the same exact people doing the same exact thing - invading your privacy. Why piss for them with a smile and then bitch about your e-mail?

And, last observation - making people piss for a job is what pack animals like wolves would do - to make you show real submission to the Leader. Pre-employment drug testing is clearly fascism.

• ##### Reminds me of a Bayes' theorem example problem(2+ / 0-)

in my probability textbook last year. So basically if the drug is rare, the test tends to suck?

November 4th: McCuster's Last Stand That One '08

• ##### It is definitely related to Bayes' theorem(1+ / 0-)

but your conclusion is a bit too simplistic.

There are good tests for very rare diseases (less so for very rare drugs) but you can't take a single number about a test and then assume that it works equally well for everyone.  That's why, e.g., HIV testing of the whole population makes little sense... there would be an absolutely huge number of false positives.  It's much more sensible to test among groups that are at risk for the disease (or that are suspected of using a drug).

Take alcohol testing of drivers .... if you test everyone, then you will get a much higher percentage of false positives than if you only test people who are driving oddly.

• ##### If we could only get those who set the tests for (1+ / 0-)

blood donors to read, mark, learn and inwardly digest this!!

• ##### I can't fault you on your analysis of the problem(1+ / 0-)

that you define, but I don't think the generalization of the results to the assessment of test results is valid.  The application of the result assumes that test failures, such as false positives, are probabilistic in character.  This is rarely the case, however.  I could give you several examples but I think it might be better to flesh out your problem instead.

In your first example, let's say a positive result is assigned to a test value of 100 ng/mL or more.  For the test results that were deemed positive, 250 of them fell into a range of 95 to 120 ng/mL, 30 of them between 120 ng/mL and 200 ng/mL, 10 between 200 ng/mL and 300 ng/mL, 5 between 300 ng/mL and 500 ng/mL, and 3 between 500 ng/mL and 800 ng/mL.  The values found by a gold reference test showed that the test under consideration overestimated values by ~ 21 ng/mL.

"Now, we pull someone in off the street and give him the test.  He tests positive.  What are the chances he has done the drug? It depends."  In this case, however, the likelihood of the person having done the drug will depend on the test result.  Someone with a test result of 720 ng/mL is much more likely to have done the drug than someone with a value of 102 ng/mL.

The problem you define is equivalent to placing various numbers of black and white jellybeans and gumballs into a bag, picking either a jellybean or a gumball, and asking: what's the probability that it's white or black?  Applying this type of problem to the evaluation of test results is a bit of a stretch IMHO, although you definitely illustrate some of the potential misinterpretations that can arise from a test that is "98% accurate".

• ##### You're certainly right(1+ / 0-)

and I regularly argue with my clients about dichotomizing continuous results.  However, my results are also true for the average of the people who have positive results.

• ##### I'm still not so sure about that.(1+ / 0-)

I really do understand your point, but its applicability does depend on the type of test, how its specificity was determined, and its purpose.

For screening tests like immunoassays that may cross-react with certain compounds present in a significant fraction of the population, I think your approach is reasonable if the accuracy assessment was done on a similarly representative sample.  Often, however, this is not done.  As an example, consider assays for tumor markers.  These types of screening assays are potentially helpful in alerting physicians to the potential for cancer in patients who otherwise show no symptoms.  However, assays for these analytes are often assessed using samples taken from unhealthy patients, as this provides a means of evaluating the accuracy of the test in comparison to biopsy results; i.e., serum from patients who show symptoms is evaluated using the assay under consideration, in addition to performing the biopsy that conclusively shows the presence of cancer.  Serum from patients who show no symptoms is not evaluated because no one wants to give or undergo a biopsy procedure if there's no reason to do so.  As a result, the assay assessment is done on a sample of the population (those showing symptoms) that is distinctly different from its intended target (those showing no symptoms).

Additionally, almost all assays are continuous in nature, even if the reported result is only positive or negative.  Consider the famed "litmus test".  Most people would consider this a simple yes/no or acidic/basic type of test.  The visual evaluation, however, depends on the extent of protonation of the litmus dye, which in turn depends on the pH of the sample.  As a result, the test is useful for discriminating between solutions of pH 6 and 8, but not so good for pH 6.9 and 7.1.

I'm curious about the context in which you actually did this assessment.  For the toxicological example you give, I'd find it difficult to view a positive GC/MS result with the type of probabilistic assessment you provide.  From my perspective, short of operational errors by the test lab, like sample mishandling, I can't see this probabilistic result being true.  I'll also admit, however, that probability evaluations are not my strong suit.  I still can't quite believe the result to the "Monty Hall" problem even though I know and understand two solutions to that problem.

• ##### I can't quote an exact source(1+ / 0-)

but I've seen analyses similar to mine in a lot of intro to probability books.

Oddly, I'd never seen pH used in a 'yes/no' manner... it's clearly more or less.  OTOH, the question of whether someone has a disease is yes/no, even though the probability of having a disease, based on a test, is dependent on the exact score, not just whether it's past a certain cutoff.

In my experience, epidemiologists are among the worst offenders in this regard, e.g., classifying newborns as 'low birth weight' or 'normal weight' even though low birth weight covers a vast range of weights.

Anyway, I think we agree with each other ... This diary (and series) are really more introductory.

• ##### Yeah, a lot of my reply was essentially thinking(1+ / 0-)

"aloud" about your thought-provoking diary.  I've done assay development for a number of applications and have never seen your approach to evaluating the end likelihood of a given result.  Quite simply, I've never really had to address that issue.  Even when calculating or using confidence intervals, I don't really consider the question of "how likely is this result the correct one?"  From my experience, non-statistical sources of error dominate statistical ones for most assays, but your point involves the relevance of the distribution of the population to the interpretation of the results.  I'll be thinking about this for quite some time.  Thanks for the excellent diary.

• ##### It's not a bug, it's a feature(2+ / 0-)

The continuous nature of the actual test results means that if you're getting too many false positives, you don't have to abandon the test method and go back to guessing: you just have to change the threshold at which you deem a result "positive".

Put that way, an excess of false positives becomes a political failing, not a technical one: someone, somewhere, panicked so badly about the drug to be tested that they ratcheted up the sensitivity of the test to eliminate any possibility of false negatives, because they thought eliminating false negatives was more important than avoiding false positives.

See the infamous No-Fly List: thousands of false positives, because someone thinks the world will end if a single false negative gets through (it won't. More money was lost, and more people will die, because of the credit crisis, than because of the 9/11 attacks).

• ##### Testing the miners is a vindictive response(1+ / 0-)

from Bush's mining buddies, IMHO.

Mining is a tough job, and as with other tough jobs, the workers like to blow off steam or relax when they're off duty. I truly doubt that many miners would use substances while on the job (other than maybe a little amphetamine to stay sharp); alcohol use on the job would be deadly, and the guy would be drummed out by his peers.

Something tells me this is some kind of harassment of the miners.