Pharmikos Pharmacology Consulting | John Lehmann, PhD
Scientific analysis of therapeutic and side effects of drugs, data interpretation, causation, patent claims, NDAs

Epidemiology

Case Reports. Single case reports can be useful in supporting intuitive reasoning, but as empirical evidence they carry relatively little weight. The absence of case reports drawing attention to a given adverse reaction is particularly meaningless: clinicians may consider the relationship too obvious to report, or publication of the observation may be blocked by the absence of supporting data.

Case Series. Case series are collections of case reports, or reviews of previously published case reports, which share one or more common features (e.g., drug exposure, type of adverse reaction). Some case series can provide compelling evidence of a meaningful association; for instance, the otherwise extremely rare clear cell vaginal carcinoma found in women exposed in utero to Diethylstilbestrol (the "DES Daughters"). Nonetheless, as a rule, larger, more formal, and more rigorous studies must be performed to demonstrate association. Good case series depend on alert clinicians identifying trends, either empirically or based on reasoning, leading to a new hypothesis.

Case Control Study. Individuals suffering an adverse reaction (e.g., lung cancer) are scored for various risk factors (e.g., tobacco smoking, vitamin A use), and age-, gender-, and race-matched individuals are scored for the same risk factors. A higher rate of risk factors in one group suggests that those risk factors are in fact causes. A disadvantage of Case-Control Studies is that only Relative Risk, and not Absolute Risk, can be determined. Hazard Ratios, derived by Cox's proportional hazards method, are among the most elegant measures in Epidemiology.
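As a sketch of the arithmetic (with entirely hypothetical counts, not real data): because a case-control design fixes how many cases and controls are enrolled, absolute risk cannot be computed, and the measure actually obtained is the odds ratio, which serves as the estimate of relative risk:

```python
# Hypothetical 2x2 table from a case-control study (illustrative counts only).
cases = {"exposed": 80, "unexposed": 20}      # individuals with the disease
controls = {"exposed": 30, "unexposed": 70}   # matched individuals without it

# The cross-product odds ratio estimates relative risk; absolute risk is
# unobtainable because the case/control totals were chosen by the investigator.
odds_ratio = (cases["exposed"] * controls["unexposed"]) / (
    cases["unexposed"] * controls["exposed"]
)
print(round(odds_ratio, 2))  # 9.33
```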

Cross-Sectional Studies. Cross-Sectional or Prevalence Studies can be used to generate baseline data. A major problem arises when a statistically significant difference found in a Clinical Trial (see below) or other well-designed experiment is rejected by the investigators because one or both values conflict with such baseline data. That comparison between a Cross-Sectional Study and another study is not valid, since the two draw on different samples.

Cohort Study. As patient registries become more feasible in the informatics age, Cohort Studies are becoming more significant and useful as a means of performing epidemiological studies, both prospective and retrospective. A Cohort Study is simple in design, involving the comparison of different exposure groups. Major advantages of Cohort Studies can include: detailed and accurate temporal information about exposure to risk factors and the appearance of disease; study in a more naturalistic situation, mimicking actual "market conditions" better than clinical trials do (e.g., no "recall bias"); multiple outcomes can be followed; both absolute risk and relative risk can be determined; and retrospective Case-Control Studies can also be performed on the same collected data.
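Because a cohort follows defined exposure groups forward, both risk measures can be computed directly from the counts; a minimal sketch with hypothetical numbers:

```python
# Hypothetical prospective cohort (illustrative counts only).
exposed_n, exposed_events = 1000, 30       # e.g., patients taking the drug
unexposed_n, unexposed_events = 2000, 20   # comparable patients not taking it

risk_exposed = exposed_events / exposed_n         # absolute risk: 0.03
risk_unexposed = unexposed_events / unexposed_n   # absolute risk: 0.01
relative_risk = risk_exposed / risk_unexposed     # ~3.0
risk_difference = risk_exposed - risk_unexposed   # ~0.02 excess absolute risk
```

Note that a case-control study could report only the ratio; the cohort design also yields the absolute excess risk, which is what matters clinically.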

Randomized Double-Blind Placebo-Controlled Clinical Trials. A well-designed randomized double-blind placebo-controlled clinical trial can be an excellent experiment, providing a rigorous test with good sensitivity. Widely called the "Gold Standard", such Clinical Trials are unfortunately overrated relative to other epidemiological approaches by non-experts. A few of the weaknesses of Randomized Clinical Trials include:

  • It is not economically feasible to detect adverse reactions of low incidence.
  • Results can be obscured by combinations or subdivisions of experimental groups (see False Negative / Type II Errors).
  • Results are not necessarily representative of patient populations under less rigorous medical supervision.
  • The failure of the investigators in the VIGOR trial (Bombardier et al., 2000) to accept their own results showing that Vioxx (rofecoxib) caused more heart attacks than naproxen did, rationalizing the data instead to a more amenable interpretation, is an example of how a so-called "gold standard" can be misused.
  • What happens when two Gold Standards disagree with each other? Like any other type of experiment, clinical trials sometimes give results that do not replicate. This shows that the concept of a "gold standard" is the wrong mindset to bring to any general type of experimental design.


Statistics

Parametric and non-parametric data. The first step in statistical analysis, and even in the display of data, is to determine what type of data is being treated. "Parametric" refers to continuously variable data, and furthermore assumes that the distributions are normal and that variances are equivalent in the samples being compared. Examples of typically parametric data include temperatures and distances. Non-parametric data violate one or more of these assumptions. Quantal data are typically non-parametric. Examples of non-parametric data include results from Hamilton Depression Scales or golf scores.
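One practical consequence, sketched below with made-up ordinal scores: parametric summaries such as the mean treat the numbers as true measurements, while rank-based (non-parametric) summaries use only the ordering and so resist distortion by a single extreme value:

```python
# Illustrative ordinal scores (e.g., rating-scale results) with one extreme value.
scores = [3, 4, 4, 5, 5, 6, 30]

mean = sum(scores) / len(scores)           # parametric summary, pulled up to ~8.1
median = sorted(scores)[len(scores) // 2]  # rank-based summary, stays at 5
```

The same logic is why rank-based tests (e.g., Mann-Whitney rather than Student's t) are preferred for such data.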

Tests for Statistical Significance are very important tools for the scientist. They do not tell a person anything about truth or causation. Rather, they give an indication of whether a particular result is likely to be fortuitous, a fluke occurring merely by chance. The level of significance P<0.05 is arbitrarily chosen to indicate that a finding is not merely fortuitous. It means that the chances are better than 20-to-1 that the result is not simply fortuitous.

Statistical and Systematic Errors. Errors can be either Statistical (due to ubiquitous "noise") or Systematic (due to an accidental or intentional design flaw). Obviously, if a person performs 20 experiments, probably one of them will show a P<0.05 difference that is nonetheless fortuitous. Statistical tests are designed to reduce the likelihood of this kind of error, called a False Positive or Type I Error.

False Positive or Type I Errors. A socially relevant example of a pernicious Type I Error is the practice of not publishing clinical trials which have negative results. If 20 clinical trials are performed to determine whether the mythical drug Zowax reverses pattern baldness, probably one will come out positive at P<0.05. The mythical and unscrupulous pharmaceutical company making Zowax (Zowagen) could theoretically conceal the negative results of 19 clinical trials and claim successful therapeutic results on the strength of just one. In practice, FDA requires pharmaceutical companies to disclose all clinical trials to FDA, reducing the seriousness of this problem. But what appears in peer-reviewed scientific journals and the lay press is not subject to the same requirement. This is one reason why FDA is opposed to off-label uses of drugs: apparently valid data could in fact be completely fallacious.
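The 20-trials arithmetic can be simulated. The sketch below (hypothetical drug; a permutation test written from scratch) repeatedly tests two groups drawn from the same distribution, so any "significant" result is by construction a false positive:

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def null_trial(n=30, reps=200):
    """One trial of a drug with no real effect: both groups are drawn from
    the same distribution. Returns a permutation-test p-value."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    observed = abs(sum(a) / n - sum(b) / n)
    pooled = a + b
    hits = 0
    for _ in range(reps):
        random.shuffle(pooled)
        diff = abs(sum(pooled[:n]) / n - sum(pooled[n:]) / n)
        if diff >= observed:
            hits += 1
    return hits / reps

p_values = [null_trial() for _ in range(20)]
false_positives = sum(p < 0.05 for p in p_values)
# On average, about 1 of the 20 null trials crosses P < 0.05 purely by chance.
```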

Absence of evidence does not constitute evidence of absence. Scientists are always extra cautious when drawing conclusions from negative results. Statistics are tools that are not broadly understood or used by the public.

False Negative or Type II Errors occur when a test indicates no statistically significant difference although a real difference in fact exists. Aside from inevitable statistical error, systematic errors producing False Negative results are very easy to produce, intentionally or unintentionally, and the designer of experiments must be vigilant against an unconscious desire to suppress certain kinds of results. Well-intentioned scientists at Zowagen, for instance, may be interested in benefiting society with hairier men, and convinced that hypertensive side effects are just a red herring that could pointlessly slow approval of the drug. For this reason, blood pressures of subjects before and after Zowax are averaged together in their clinical trial, and no significant change is detected. Had they instead measured the change in blood pressure for each individual, they would have detected a 10 mm Hg increase after 4 weeks of use. If FDA does not insist on better design prior to approval, the drug could be approved and marketed, and the clinically significant adverse reaction would probably never be detected.
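A numerical sketch of the flaw (invented readings; each subject rises exactly 10 mm Hg): comparing group averages buries the consistent within-subject change under the much larger spread between subjects, while the paired analysis exposes it:

```python
# Hypothetical blood pressures (mm Hg) for five subjects, before and after
# four weeks on the drug; every subject rises by exactly 10 mm Hg.
before = [95, 150, 110, 170, 125]
after = [105, 160, 120, 180, 135]

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

# Unpaired (flawed) view: a 10 mm Hg difference in group means, swamped by a
# roughly 30 mm Hg spread between subjects -- "no significant change".
diff_of_means = mean(after) - mean(before)  # 10.0
between_subject_sd = sd(before)             # ~30

# Paired view: each subject's own change is exactly +10 mm Hg, with zero
# variability -- an unmistakable signal.
changes = [a - b for a, b in zip(after, before)]
```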

Very little in medical research is easier than obscuring a real result and generating a Type II Error with bad design. One of the classic methods is the pooling of data, as in meta-analysis. Often post-hoc re-analysis of the data can reveal positive results that were obscured through bad design. Those who consider Randomized Clinical Trials (Placebo-Controlled and Double-Blind) the Gold Standard would do well to be on their guard against Systematic Type I and Type II Errors.
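The dilution mechanism can be shown in a few lines (entirely hypothetical counts): an adverse reaction confined to a small susceptible subgroup stands out tenfold in the subgroup analysis, but nearly vanishes once the data are pooled:

```python
# Hypothetical trial counts; the reaction is real only in a susceptible
# subgroup comprising a tenth of the patients.
sub_drug_n, sub_drug_e = 100, 20       # 20% incidence on drug in the subgroup
sub_plac_n, sub_plac_e = 100, 2        #  2% background on placebo
rest_drug_n, rest_drug_e = 9000, 180   #  2% background in everyone else
rest_plac_n, rest_plac_e = 9000, 180

# Subgroup analysis: a tenfold excess (0.20 vs 0.02).
subgroup_ratio = (sub_drug_e / sub_drug_n) / (sub_plac_e / sub_plac_n)

# Pooled analysis: the excess is diluted to roughly 1.1-fold and would likely
# be dismissed as noise.
pooled_drug = (sub_drug_e + rest_drug_e) / (sub_drug_n + rest_drug_n)
pooled_plac = (sub_plac_e + rest_plac_e) / (sub_plac_n + rest_plac_n)
pooled_ratio = pooled_drug / pooled_plac
```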


Pharmikos Inc ©2006 - 2010 | Consultant in Pharmacology & Toxicology