Separating Gold From Junk in Medical Studies

New York Times
October 23, 2002

You no doubt hear or read about startling new study findings all the time: some food or supplement is found to prevent cancer, relieve arthritis or reverse hair loss; some drug is shown to prevent deaths from heart disease and stroke, or a new outpatient procedure has been developed to replace a major operation.

In years past, only physicians had to know how to interpret the findings of a medical study, though certainly not all of them were up to the task. Now, with a seemingly insatiable public appetite for research findings, the job of interpreting studies for the public has fallen largely to journalists, many of whom are far less qualified than physicians to make sense of the studies. Members of the news media often have a poor understanding of research methods and statistics, lack an appreciation for the extent and limitations of new data, and are often unable to convey the subtle but critically important nuances of research in the time or space allotted.

Unfortunately, this has not stopped the news media from proclaiming all sorts of medical findings as "facts" that may be far from certain. So now, the job of understanding the relevance of research is falling increasingly to the general public.

And so, my attempt here at a quick lesson on how to read between the lines and determine whether some new finding has any significance.

What Kind of Study?

Some studies produce more certain findings than others. Least certain of all are animal studies. Even though people share many genetic characteristics with research animals, metabolism and immune defenses may differ enough to make the finding irrelevant to people.

Epidemiological studies or observational studies follow people for years, and the studies may uncover correlations between exposure to substances or living habits and particular health outcomes. These findings may suggest true relationships, or they may be due to some other unmeasured or unmeasurable factor.

Likewise, case-control studies in which patients with a particular disease are compared with similar people who are healthy may suggest, but not prove, that some factor was responsible for the illness.

More and more, people are likely to hear or read about "meta-analyses." With these, many smaller studies are combined to search for a finding that only a large study could reveal. But it is important to realize that meta-analyses are no more accurate than the studies they include. If these smaller studies were poorly designed, the conclusions of a meta-analysis are likely to be erroneous.
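For readers who like to see the arithmetic, here is a minimal sketch of one common way to combine studies: a fixed-effect, inverse-variance average. The function name and the three study results are hypothetical, made up only to illustrate the point; real meta-analyses involve many further steps.

```python
import math

def pool_fixed_effect(estimates, standard_errors):
    """Combine study estimates with inverse-variance (fixed-effect) weights."""
    # More precise studies (smaller standard error) get more weight.
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical scenario: the true effect is zero, but all three small
# studies share a design flaw that inflates their estimates by about 0.2.
estimates = [0.25, 0.15, 0.20]
standard_errors = [0.2, 0.2, 0.2]

pooled, pooled_se = pool_fixed_effect(estimates, standard_errors)
print(f"pooled estimate: {pooled:.2f}, standard error: {pooled_se:.2f}")
```

In this made-up case the pooled standard error is smaller than that of any single study, so the combined result looks more certain, yet the pooled estimate still carries the same flaw the individual studies shared.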

Even the so-called gold standard of medical research (the placebo-controlled, randomized, double-blind clinical trial) sometimes produces spurious results or results that apply to a limited group or only under certain conditions.

Nonetheless, such a trial is most likely to yield results that can be reliably applied to people like those in the study population. In such a study, participants are randomly assigned to an experimental group or control group, and neither the participants nor the researchers who evaluate them know which person is in which group until the study is completed.

Does It Apply to You?

How participants are recruited can influence the reliability of the findings. Advertising for participants in a newspaper may favor those who are better educated or more highly motivated than the general population. Such people may have habits or attitudes that can affect the outcome of the study.

Many studies exclude people who have other ailments, take certain medicines or speak languages other than English. If a study of a new drug is conducted among healthy young men, the findings may not apply to older women with an existing illness. Or if a study is done among people with advanced disease, the outcome may be different for those with milder forms.

Finally, where and how was the study conducted? If the participants had to be hospitalized or if the research involved equipment that was not generally available to practicing physicians, the findings might be useless to an ambulatory patient being cared for by a private doctor or in an outpatient clinic.

The Issue of Structure

The question the study was designed to answer limits the dependability and extendability of the results. Thus, in one placebo-controlled randomized study of postmenopausal hormone replacement, participants who took the hormones had levels of blood fats that strongly suggested better protection against heart disease.

This is considered a "soft endpoint": an indication of, but not proof of, protection against heart disease. For proof, a study has to include many more participants and last much longer to show that those on hormones do or do not suffer fewer cardiac problems, thus providing a "hard endpoint."

Most studies are designed to test whether one particular outcome reaches statistical significance. This is called the primary endpoint. Sometimes other findings, called secondary endpoints, are also found to have statistical significance, but these results are not in themselves dependable enough to consider the finding an established fact.

So when a study designed to examine the relationship between pancreatic cancer and smoking found a link between this cancer and coffee drinking, the latter finding was a secondary endpoint, which ultimately proved to be untrue.

The size and duration of the study are also important. It must be big enough and last long enough to produce statistically significant results, and this is determined by how likely the event in question is to occur among the participants and in what length of time.

For example, a clinical trial assessing the ability of two different drugs to prevent breast cancer in healthy women considered at high risk of developing the disease needs 22,000 participants, who must be followed for seven years.

But treatment studies in women who already have breast cancer may require only 6,000 patients who are followed for five years to determine whether the treatment in question prevents a recurrence.

Or if a study involves a much more common disease, like heart disease, far fewer participants may be needed to determine whether a drug lowers high blood levels of cholesterol or, if the participants are over 65, whether it prevents heart attacks.
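Why the rarity of an event drives the size of a study can be seen with a rough calculation. The sketch below uses a standard textbook approximation for comparing two proportions; the function name and the event rates are hypothetical, chosen only for illustration, and are not taken from the trials mentioned above.

```python
import math
from statistics import NormalDist

def participants_per_group(control_rate, treated_rate, alpha=0.05, power=0.80):
    """Approximate group size needed to detect a difference in event rates."""
    # Normal quantiles for a two-sided significance level and the desired power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = control_rate * (1 - control_rate) + treated_rate * (1 - treated_rate)
    difference = control_rate - treated_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / difference ** 2)

# Hypothetical rates: in both cases the treatment halves the risk.
rare = participants_per_group(0.02, 0.01)    # event strikes 2 in 100 untreated people
common = participants_per_group(0.20, 0.10)  # event strikes 20 in 100 untreated people

print(f"rare event: about {rare} participants per group")
print(f"common event: about {common} participants per group")
```

Under these assumptions, the rare event calls for more than ten times as many participants per group as the common one, even though the treatment halves the risk in both cases.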

You may also want to know who paid for the study and whether the research findings were independently evaluated. More and more research is now financed by private industry and conducted by individual physician investigators. You should know whether the sponsor or researchers will benefit financially from a particular outcome. Safeguards against conflicts of interest, including teams of independent reviewers, must be in place.

But just because a drug company pays for a study does not mean the findings will be misrepresented. For example, a study financed by Wyeth-Ayerst, the maker of the hormone drug Prempro, examined the value of hormone replacement in women who already had heart disease, but to everyone's disappointment, the drug resulted in more, not fewer, deaths among these women.

Finally, it is important to realize that the result of even the most thorough and careful study may require independent confirmation before it is considered a fact that should change medical practice. Rarely does one study bring about a major change in disease treatment or prevention.