In retrospect, perhaps one of the most important results in the ISIS trials was the analysis of the results by astrological star sign. All of the patients had their date of birth entered as an important 'identifier'. We were therefore able to divide our population into 12 subgroups by astrological star sign. Even in a highly positive trial such as ISIS-2, in which the overall statistical benefit for aspirin over placebo was extreme (P < 0.00001), division into only 12 subgroups threw up two (Gemini and Libra) for which aspirin had a nonsignificantly adverse effect (9% ± 13%).
Of course, most physicians (but not all!) laughed when they were presented with these results. However, when presented with other, less ridiculous subgroup analyses, physicians are likely to believe the results and to forget the example from astrology, particularly if the result can be justified by some pet theory.
When one divides a trial by a seemingly more legitimate grouping (eg by the individual countries in a multinational study), it is highly probable that a negative or neutral result will be seen in at least one country. Indeed, this was a point of discussion during the 1 May 2000 US Food and Drug Administration hearings (Yusuf S, personal communication) on the results of the recent HOPE study, in which ramipril had no significant effect in the US participants. We have seen similar results in the ISIS trials, but did not report these because of the possibility of harm caused by misinterpretation of such statistical 'flukes' (and hence a failure to use a useful treatment in that country).
ISIS-2 was carried out in 16 countries. For the streptokinase randomization, two countries had nonsignificantly negative results, and a single (different) country had a nonsignificantly negative result for aspirin.
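How often chance alone should produce such apparently adverse subgroups can be gauged with a rough calculation. The sketch below is an illustration only, not an analysis of the ISIS-2 data. It assumes equal-sized, independent subgroups (real countries differ greatly in size), a true treatment effect that is identical in every subgroup, and an overall z-statistic of 4.4, which is about the smallest value consistent with a two-sided P < 0.00001, so the output is an upper-end figure. The function names and parameters are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def prob_any_subgroup_adverse(overall_z, n_subgroups):
    """Chance that at least one subgroup points the 'wrong' way.

    Assumptions (illustrative only): subgroups are equal-sized and
    independent, and the true effect is the same in all of them.
    """
    # Each subgroup holds 1/k of the information, so its expected
    # z-statistic shrinks by a factor of sqrt(k).
    subgroup_z = overall_z / sqrt(n_subgroups)
    # Probability that one subgroup's estimate falls on the adverse side.
    p_one_adverse = normal_cdf(-subgroup_z)
    # Complement of "no subgroup looks adverse".
    return 1.0 - (1.0 - p_one_adverse) ** n_subgroups


if __name__ == "__main__":
    for k in (12, 16):  # star signs; countries in ISIS-2
        print(k, round(prob_any_subgroup_adverse(4.4, k), 2))
```

Under these assumptions the probability of at least one apparently adverse subgroup is roughly 0.7 for 12 subgroups and 0.9 for 16, even though the treatment works equally well in all of them. A larger overall z-statistic would lower these figures; unequal subgroup sizes would tend to raise them, because small subgroups are the most likely to point the wrong way.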
There is no plausible explanation for such findings except for the entirely expected operation of the statistical play of chance. More generally, another reasonable explanation for negative or curious results in a subgroup is that the statistical power to detect an effect is reduced, either by a low event rate (eg in a low-risk subgroup such as young hypertensive persons) or by a low number of subjects in a particular subgroup (eg the elderly or women).
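The loss of power in a low-risk or small subgroup can be made concrete with a standard normal-approximation formula for comparing two proportions. The sketch below is a minimal illustration under stated assumptions: the event rates, the 20% relative risk reduction, and the sample sizes are hypothetical numbers chosen for the example, not figures from any ISIS trial, and the function name is invented.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(control_rate, relative_risk_reduction, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Uses the normal approximation and ignores the negligible chance of a
    significant result in the wrong direction.
    """
    treated_rate = control_rate * (1.0 - relative_risk_reduction)
    # Standard error of the difference between the two event rates.
    se = sqrt(
        control_rate * (1.0 - control_rate) / n_per_arm
        + treated_rate * (1.0 - treated_rate) / n_per_arm
    )
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    expected_z = (control_rate - treated_rate) / se
    return NormalDist().cdf(expected_z - z_crit)


if __name__ == "__main__":
    # Hypothetical scenarios, all with the same 20% relative risk reduction.
    print("whole trial:       ", round(approx_power(0.10, 0.20, 8000), 2))
    print("low-risk subgroup: ", round(approx_power(0.01, 0.20, 8000), 2))
    print("small subgroup:    ", round(approx_power(0.10, 0.20, 500), 2))
```

With the same true relative benefit in every scenario, power is high for the whole hypothetical trial but falls well below 50% when either the event rate or the number of subjects is cut, so a nonsignificant subgroup result is the expected outcome there rather than a surprise.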
It is very important to realize that lack of a statistically significant effect is not evidence of lack of a real effect. Unfortunately, this error is often made by physicians.
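A confidence interval shows why. The sketch below applies a rough normal approximation to the Gemini and Libra figure quoted earlier (9% ± 13% adverse), written here as a benefit of -9%. It assumes that the ± value is one standard error, which the text does not state, and that the percentage scale can be treated as approximately normal; the function name is hypothetical.

```python
from statistics import NormalDist


def confidence_interval(estimate, standard_error, level=0.95):
    """Normal-approximation confidence interval for an effect estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * standard_error, estimate + z * standard_error


if __name__ == "__main__":
    # Benefit expressed as a percentage; a negative value means harm.
    # Assumption: the quoted "± 13%" is one standard error.
    low, high = confidence_interval(-9.0, 13.0)
    print(f"95% CI for benefit: {low:.1f}% to {high:.1f}%")
```

Under these assumptions the interval runs from a harm of about 34% to a benefit of about 16%. It includes zero, so the result is nonsignificant, but it also includes a worthwhile benefit, so the data in that subgroup do not rule out a real effect.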