From "The Economist," September 27, 1997
SCIENCE AND TECHNOLOGY
Methodical progress
Applying the scientific method to the processes of science
can be illuminating
The claim of science to be a superior route to the truth
rests on its procedures. Hypothesis suggests experiment and
experiment hypothesis in a never-ending virtuous circle.
Ideas that turn out to be incorrect are ruthlessly
discarded. Individual scientists may be fallible, but their
weaknesses are inevitably exposed. And so forth.
The success of many scientific disciplines suggests there is
much to this claim. But that does not exempt it from
examination. And one way to do that is to apply to science
the same method that it applies to everything else, in order
to see if its everyday practices are, themselves, scientific
enough. With his colleagues, Christopher Martyn of
Southampton University, in Britain, has done just that. He
has looked at the process known as peer review, which is
supposed to filter the torrents of scientific papers that
pour into editors' offices and identify those worthy of
publication. And having looked, he has found it wanting.
Dr. Martyn was one of some 300 people at the recent
Conference on Biomedical Peer Review, in Prague. The paper
he presented made disturbing listening. Peer review involves
the editors of scientific journals forwarding the papers
they have received to experts for assessment. The
assessments are passed back to the authors, but
anonymously, the idea being to encourage honest appraisal.
Papers are accepted, with or without modification, or
rejected, largely on the say-so of these expert referees.
Peer review has long had its critics. The referees are
usually busy people and are rarely paid for their trouble,
so the process is often slow. There is also a feeling that,
despite the anonymity, an old-boy network operates in some
fields (an idea reinforced by a recent study in Sweden which
showed how discrimination in favour of acquaintances, and
against women, operates in the related area of reviewing
grant applications). But some people may think that what Dr.
Martyn has found is even more troubling: that even
well-meaning peer review is of disturbingly low quality.
In 1995 his research group sent a paper about the risk
factors for death in the elderly to the British Medical
Journal. After normal peer review, it was accepted. But
then, in collaboration with the journal's editors, he
deliberately introduced eight errors into his paper.
The modified manuscript was sent to 420 potential reviewers
from the BMJ's database. Of the 221 who responded, none
identified all eight mistakes and few caught more than two
or three of them. Nonetheless, the reviewers tended to be
free with their suggestions. For example, one neurologist,
who described himself as "unqualified to comment" because he
lacked the requisite training in epidemiology or statistics,
wrote "having said all this, the paper is clearly rubbish . . ."
Obviously the latter problem is partly the BMJ's fault for
sending the modified paper to inappropriate reviewers
(though all were deemed suitable by the database). That
shows the importance of picking reviewers carefully but does
not undermine the whole concept of peer review. The main
conclusion of the study (that even appropriate referees fail
to spot mistakes) is, however, more damning. It suggests that
a fundamental overhaul of the review process is called for.
A partial solution may be more use of electronics. In a
virtual version of what sometimes happens at conferences, a
paper could be published first to a limited audience.
Criticisms would be invited and then incorporated into the
final version, or not, as the case may be.
Later this year, the Medical Journal of Australia will begin
an experiment along these lines. It will post research
articles on the World Wide Web and give an expert group of
reviewers a password that will allow them to comment. After
a period, it will then give a more broadly based group of
practitioners the same access, so that the version of the
paper which is ultimately printed will reflect both of these
points of view. How such editing-by-committee will work in
practice remains to be seen, but if the experiment fails, the
theory behind it can, of course, be rejected like any other
failed hypothesis.
The scientific method can also be applied to assessing the
papers themselves. In another conference presentation Simon
Wessely of King's College, London, argued that the
nationalities of authors, and their areas of specialisation,
can be the enemies of objectivity. These are serious
allegations. That specialists might be a little myopic is
understandable, but one of science's claims to elevated
status is that it is blind to such trivia as nationality.
That was not, however, what Dr. Wessely found.
He deliberately chose a controversial topic: a debilitating
illness commonly known as chronic fatigue syndrome (CFS),
the cause (and even the existence) of which is a subject of
much discussion. He and his colleagues analysed 89 overview
articles published about CFS in English-language journals
between 1980 and 1996.
That, in itself, might be thought a rather narrow approach
to a project that was examining cultural bias (though, to be
fair, even journals from non-English-speaking countries are
often published in English these days). But even in the
monoglot world the researchers had chosen, national biases
were apparent. Studies originating in America, for example,
rarely cited British research, while British reports on the
subject tended to give information from the other side of
the Atlantic short shrift. This suggests that the two
countries' CFS researchers were, to a large extent, ignoring
each other. (For good measure, different groups of experts,
such as those who study infectious disease and those who
study mental health, did, indeed, ignore each other as
well.)
Worrying as all this is, a third issue addressed by the
conference was still more disturbing. It is an axiom of the
scientific method that any data being analysed in a study
must be a representative sample of reality. But publishers
generally prefer research that has a positive result. Papers
showing that something happens are more likely to be printed
than those which show that it does not, even though such
"negative" results can be important.
Amnesty international
The problem is particularly acute in the field of clinical
trials for new medical treatments. The bias is so strong
that many researchers do not bother to report trials in
which the new treatment is no better than an existing one.
At first sight, that may seem to make sense. But in this
area negative results are particularly significant. That is
because few trials are large and clear enough to be
decisive. To overcome this, a new science, known as
meta-analysis, has grown up over the past few years.
Meta-analysis is a way of extracting statistically
meaningful information from lots of small trials, even if
they have been conducted in ways that make them difficult to
compare "by eye". Its conclusions, however, are only valid
if the negative trials are included as well as the positive
ones. Leave out the negatives and the results may be too
optimistic.
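
To see why omitting the negatives matters, consider a minimal
sketch of one common pooling technique, fixed-effect
inverse-variance weighting. The article does not say which method
any particular meta-analysis uses, so the choice of technique is
an assumption, as are the function name pooled_effect, the five
trial results (which are invented purely for illustration) and
the 0.1 cut-off standing in for "positive enough to publish".

```python
import math


def pooled_effect(trials):
    """Fixed-effect inverse-variance pooling (hypothetical helper).

    trials: list of (effect_estimate, standard_error) pairs.
    Returns (pooled_estimate, pooled_standard_error).
    """
    # Each trial is weighted by the inverse of its variance,
    # so more precise trials count for more.
    weights = [1.0 / se ** 2 for _, se in trials]
    total = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, trials)) / total
    return pooled, math.sqrt(1.0 / total)


# Invented illustrative data: (treatment effect, standard error).
# Positive means the new treatment looked better than the old one.
all_trials = [
    (0.30, 0.20),
    (0.25, 0.25),
    (-0.05, 0.20),  # "negative" trial
    (0.00, 0.25),   # "negative" trial
    (0.40, 0.30),
]

# Assumed publication filter: only clearly positive trials get printed.
published_only = [t for t in all_trials if t[0] > 0.1]

est_all, se_all = pooled_effect(all_trials)
est_pub, se_pub = pooled_effect(published_only)

print(f"all trials:     {est_all:.3f} (s.e. {se_all:.3f})")
print(f"published only: {est_pub:.3f} (s.e. {se_pub:.3f})")
```

With these made-up numbers the estimate based only on the
"published" trials is larger than the one based on all five,
which is the over-optimism described above. Pooling also yields a
smaller standard error than any single trial, which is the reason
for combining small trials in the first place.
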
Around 500,000 controlled clinical trials are thought to
have been carried out since the method was devised in 1948,
and it is estimated that at least 10% of these have
languished in unpublished obscurity simply because they
failed to demonstrate that "A" was better than "B". In this
case, however, there is good news. To try to overcome the
effects of publication bias, more than 100 journals around
the world made an announcement coinciding with the
conference that they are declaring an "amnesty" for such
unpublished work (something of a cheek, since they helped to
cause the problem in the first place), and are opening a
register to receive it (meta@ucl.ac.uk). They have asked for
researchers who have conducted a study that was never
published in full, or who know of others who have conducted
such studies, to come forward. Better late than never.