Critical Thinking in Biology:
Case Problems

A Guide for Instructors

By PETER OMMUNDSEN


This page illustrates how CASE PROBLEMS can inspire students in an introductory college biology course to develop scientific reasoning skills.


Critical thinking means seeking reliable knowledge. Many students fail to assess the reliability of information to which they are exposed in everyday life, let alone pursue the dissection of scientific literature. And many people are deceived and defrauded by pseudoscience. Practice in critical thinking prompts thoughtful examination of the role of science in society. This is an important outcome of a biology education, and brings us closer to addressing the Socratic dictum "The unexamined life is not worth living."

CONTENTS:
SAMPLE SOLVED PROBLEMS:

Analysis of a Biology News Report
Weighing Conflicting Evidence
Judging Confidence
Judging Testability of a Claim
Science vs Religion
MORE PROBLEMS
REFERENCES


PART I: SAMPLE SOLVED PROBLEMS:

A good starting point in development of critical thinking skills is use of authentic examples meaningful to the student. The popular media are rich in such material -- sports physiology, reproductive health, nutrition, fad diets, psychoactive drugs, alternative therapies, pollution, genetic engineering, and evolution.

The following examples show how active learning can be incorporated into the lecture theater environment to target critical thinking skills. Active learning requires that the students themselves grapple with the case examples, such as in temporary small groups within the lecture hall. During small group work, the instructor can circulate among the groups and suggest directions for student discussion. Short lectures (15-20 minutes) are excellent for inspiring students and for demonstrating how to attack problems, but active learning is superior to a mimetic learning environment in which students only listen, take notes, and repeat what they have been told.

Note the brevity of these introductory cases. Brief quotations help students focus on the fundamental scientific issues without distraction and provide ample substrate for group discussion. Students may of course progress to the analysis of longer articles, scientific papers, advertisements, and web pages assigned as homework or term papers.

In each case, the students must evaluate the reliability of the claim being promulgated. Reliable knowledge is evidence-driven. The students must ask, "What is the quality of evidence supporting this claim?"


ANALYSIS OF A BIOLOGY NEWS REPORT
Case Example: Reproductive Health

This case provides skills and practice in evaluating a report of a scientific investigation, and introduces students to the need for evidence-based medicine.

Exposure of students to a diversity of health care claims is an important component of a biology education. Health care is a multi-billion dollar biology-based industry, but with many examples of pseudoscience, deception and fraud. Scientific reasoning is an indispensable tool for the citizenry in evaluating health care claims.

Alternative therapies such as homeopathy, reflexology, acupressure, and therapeutic touch are often of interest to students and their families, and can be critically examined in a biology course on a topical basis, e.g., iridology when studying the biology of the eye. Students can critique the evidence for alternative medicine against a rigorous scientific protocol. Pathologist and consumer advocate Marcia Angell has repeatedly stated that there are but two kinds of medicine: that which has been adequately tested and that which has not (Angell and Kassirer 1998).

Example:

Ask the students to form groups of three to five within the lecture room or assign groups by lottery. Provide the following brief news report to analyze. Tell the groups that you may call upon them when the class reconvenes after 15 minutes of small group work. Ask the groups to decide whether or not the claim in the report is justified by the evidence cited in the report.

The 11 November 1998 issue of the Journal of the American Medical Association reported that a Chinese remedy used on pregnant women improves the position of the fetus for an easier birth. The treatment reduces the risk of breech (rump-first) births.

The treatment involved heating an acupuncture point upon the smallest toe ("moxibustion"). Two hundred and sixty women with poorly positioned fetuses were studied. Half of the women were randomly selected to receive moxibustion. The other women ["controls"] received no treatment. The investigators found that the untreated women had significantly more breech births. The authors stated that previously no randomized controlled trial [like this one] had ever been conducted.

Ask the students to form their decision using the following six criteria:

  1. Outcome measure: Was the promised treatment outcome actually measured to determine if it occurred?

  2. Control: Was the outcome of the treatment group compared to the outcome of an otherwise similar untreated group?

  3. Replication: Was the treatment replicated, that is tested on an adequate number of subjects to rule out coincidence?

  4. Randomization: Were subjects assigned to the treatment or control in an unbiased manner?

  5. Reproducibility: Has other research produced similar evidence?

  6. Plausibility: Are the results consistent with established science?


Reconvene the class and solicit responses from the groups for an instructor-led discussion with the class as a whole. Comments may be similar to the following.

  1. Outcome measure: Was the claimed treatment outcome actually measured?
    In this case the answer is yes. The claimed outcome pertains to the risk of breech births, and that is the outcome that was measured -- the number of breech births. In some cases fraudulent claims are pseudojustified by measuring irrelevant variables or by citing speculative assertions by "authorities" (argumentum ad verecundiam).

  2. Control: Was the treatment outcome compared to that of an otherwise similar untreated group?
    In this case the outcome of treated women was compared to that of a control group of 130 untreated women that should have been otherwise similar. However, this control group was not otherwise similar. Treated women knew that they were being treated and untreated women knew that they were different -- they were not being treated. Differing expectations between the two groups may have affected the motility of the fetuses. For example, the treated women may have been less worried about birthing because they knew that they were being treated and therefore had different levels of stress hormones. A better investigative design would blind the women to their treatment status. The control women would receive a sham treatment such as heating of a non-acupuncture point.

  3. Replication: Was the treatment replicated?

    The answer is yes. One hundred and thirty pairs of women were used in the investigation. If only one pair of women was studied (no replication) there would be little confidence that a difference in outcome between them was other than happenstance. Adequate replication establishes a statistical benchmark against which to judge the treatment, that is the likelihood of breech births among untreated women.

  4. Randomization: Was the treatment allocated randomly?

    The answer is yes. The women were randomly assigned to either the treatment group or the control group. This is important because it guards against bias. For example, if all women of slim physique were assigned to the treatment group, unique results in that group might be credited to the treatment when in fact perhaps physique was the cause.

  5. Reproducibility: Has other research produced similar evidence?

    The answer is no. The report stated that until now no randomized controlled trial had been conducted. A claim tested by only a single experiment, as in this case, is tenuous until the results have been reproduced by a number of high quality trials conducted by independent investigators. Results must be reproducible.

  6. Plausibility: Are the results consistent with established science?

    The answer is no. There is no known biological mechanism whereby heat applied to the smallest toe could affect the position of a fetus. If a claim is not founded in basic science, or contradicts established laws of nature, caution is required in viewing the results. A common aphorism is that "extraordinary claims require extraordinary evidence," or as stated by Thomas Jefferson, evidence must be "proportional to the difficulty" of the claim.
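
For instructors who wish to demonstrate random allocation (criterion 4) concretely, the following is a minimal Python sketch, not the procedure used in the published trial. The subject numbers 1 to 260, the function name randomize, and the seed value are hypothetical and chosen only for illustration.

```python
import random

def randomize(subject_ids, seed=None):
    """Randomly split subjects into two equal-sized groups (treatment, control)."""
    rng = random.Random(seed)          # a seed makes the demonstration repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)              # every ordering is equally likely
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# 260 hypothetical subject numbers, matching the number of women in the report
treatment, control = randomize(range(1, 261), seed=1998)
print(len(treatment), len(control))    # 130 130
```

Because group membership depends only on the shuffle, no characteristic of the women (such as physique) can systematically steer them into one group.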

The claim that moxibustion is helpful in positioning of the fetus is not well justified. A major flaw in the evidence is the lack of proper controls. A better test of this claim would involve double-blinding, that is giving a sham treatment to the control group, thus isolating the independent variable.

Also, the claim is implausible as there is no known mechanism. It is especially desirable to determine if other investigators can repeatedly reproduce the results using better quality methods. Johnson (1999) states, "If results from a study cannot be reproduced, they have no credibility ... Individual studies rarely contain sufficient information to support a final conclusion about the truth or value of a hypothesis." The next example addresses this issue in more depth.



Critical thinking skills may be tested in large classes using multiple-choice exams. [An example is shown below.]


WEIGHING CONFLICTING EVIDENCE
Case Example: Treatment of Stroke

This case introduces skills and practice in weighing conflicting evidence and detecting publication bias. It is a surprise to many students that a multitude of apparently similar experiments may produce somewhat differing results, but this is a crucial point to address in teaching critical thinking skills.

Have the students participate in a two-stage analysis:

  1. Ask the students to form groups of three to five in the lecture hall or assign groups by lottery.

    Provide the students with practice in analyzing funnel plots. Display or issue copies of funnel plots such as Exhibits A and B, below. Each dot shows the outcome of a published controlled trial. Exhibit A is a plot of the outcomes of 34 trials that investigated the effect of hostility on the risk of developing coronary heart disease (adapted from Petticrew et al. 1999). Exhibit B shows the results of 19 trials that studied the effects of maternal smoking on risk of preterm delivery (adapted from Shah and Bracken 2001).

    Remind the students that reproducibility is an important criterion of reliable knowledge. Ask the students to spend ten minutes or so preparing an interpretation of the plots: Is there an effect of the independent variable (hostility and smoking)? Why are the results within each exhibit so variable? How might we account for the shape of the plots?

    When the class reconvenes, solicit suggestions from the student groups, and assist the class in explaining the plots. The students might suggest: (1) The shape of each plot is consistent with the Law of Large Numbers: greater variability among estimates from small samples. (2) Results from the larger sample sizes in each plot may be worthy of greater weighting in drawing a conclusion. (3) The hypothesis that hostility increases the risk of coronary heart disease is not well justified by the pattern in Exhibit A. (4) The weight of evidence is consistent with the inference that maternal smoking may be a cause of preterm delivery.

  2. Now the students can progress to the stroke case. Ask them to form small groups again. Display or issue copies of Exhibit C, the effect of acupuncture in the treatment of stroke (adapted from Tang et al. 1999). Ask the students to spend ten minutes or so discussing the shape of the plot and proposing explanations. Then reconvene the class, solicit suggestions from the various groups, and assist the class in explaining the shape of the plot.

    The students may suggest: (1) The position of the outcomes of trials with large sample sizes suggests little evidence of a treatment effect of acupuncture on stroke. (2) The lack of symmetry is suspicious -- we would expect additional points to the left of the plot (in this case negative outcomes). (3) The lack of symmetry suggests the possibility of publication bias, a tendency for negative outcomes to not be published. (4) Other factors also could explain the shape, such as disproportionately poor responses of control subjects in small trials. Recognition of the possibility of publication bias is an important learning outcome.
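
The funnel shape and the effect of publication bias can be reproduced with a short simulation. The following is a rough Python sketch under stated assumptions: every trial is simulated with no true treatment effect, outcomes are drawn from a standard normal distribution, and the trial sizes (10 and 200 per group), the number of trials (500), the seed, and the "publish only clearly positive small trials" rule are all hypothetical choices, not values taken from the exhibits.

```python
import math
import random
import statistics

def simulate_trial(n, rng):
    """Simulate one two-group trial with NO true effect; return (difference, z)."""
    treated = [rng.gauss(0.0, 1.0) for _ in range(n)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = math.sqrt(statistics.variance(treated) / n + statistics.variance(control) / n)
    return diff, diff / se

rng = random.Random(1)                                   # arbitrary fixed seed
small = [simulate_trial(10, rng) for _ in range(500)]    # 500 small trials
large = [simulate_trial(200, rng) for _ in range(500)]   # 500 large trials

# Law of Large Numbers: estimates from small trials scatter more widely
spread_small = statistics.stdev([d for d, _ in small])
spread_large = statistics.stdev([d for d, _ in large])

# False positives: share of large null trials that look "significant"
false_positive_rate = sum(abs(z) > 1.96 for _, z in large) / len(large)

# Publication bias: suppose small trials reach print only if clearly positive
published_small = [d for d, z in small if z > 1.96]
mean_all_small = statistics.mean([d for d, _ in small])
mean_published_small = statistics.mean(published_small)

print(f"spread of estimates: small {spread_small:.2f}, large {spread_large:.2f}")
print(f"false positive rate among large null trials: {false_positive_rate:.3f}")
print(f"mean effect, all small trials: {mean_all_small:.2f}")
print(f"mean effect, published small trials only: {mean_published_small:.2f}")
```

Plotting the differences against sample size would give a symmetric funnel for all trials and a lopsided one for the "published" subset, which is the asymmetry the students are asked to explain in Exhibit C.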



JUDGING CONFIDENCE
Case Example: Treatment of a Cold

This case introduces the concept of confidence. At first glance it appears to involve some heavy slogging, but the numerical values used in this example are very easy to work with and the educational benefits are immense.

Display or issue copies of the plot below and ask the students to form groups of three to five in the lecture hall. The graph shows the last reported day of cold symptoms in (a) a group of 100 people treated with an experimental purported cold remedy designed by a biotech company (dark bars) and (b) a group of 100 people treated with placebo (light bars).

  1. Ask the student groups to discuss the graph for ten minutes. Ask them to decide if the treatment "worked," and ask them to decide how much confidence they have in their decision. The students will soon realize that they must formulate an arbitrary criterion in order to make a decision, and criteria may vary among the groups. Also they will notice that experimental data do not necessarily segregate nicely: in this case many subjects who consumed the placebo fared as well or better than those who used the remedy.

  2. Reconvene the class and tally the results of the decisions, solicit criteria employed, and discuss these as appropriate.

    Inform the students that they will be asked to further interpret the data, but should use some specific tools. You may wish to suggest that they weigh any or all of the following:

    Difference in mean recovery time
    The mean of the treated group is six days and the mean of the control group is seven days.

    Relative risk
    The relative risk of still showing symptoms beyond six days can easily be computed (= 0.6). The number of subjects showing symptoms beyond six days in the treatment group was 37 and in the control group was 61. [(37/100)/(61/100) = 0.6].

    Comparison of standard deviations between the groups
    These are 1.9 and 2.1, respectively, or for purposes of this exercise could be rounded to 2.

    Confidence intervals
    By rounding, a ninety-five percent confidence interval around the mean (plus or minus 0.4 days) can be calculated in the students' heads simply as plus or minus two standard errors. This simplified process gives the students an intuitive feel for a confidence interval. If the confidence intervals do not overlap, the students would be justified in rejecting the statistical hypothesis that the difference between the groups was random variation.

    P value
    A simple explanation of a P value can be presented regarding the data. P is the probability of observing the data (or more extreme data) if the statistical null hypothesis is true, that is assuming samples were drawn from the same population. In this case the likelihood of observing such a difference between sample means (per unit of standard error) is improbably small (P < 0.05).

    Risk of error
    Weighing risk of error requires considering whether to set the P value rejection criterion high (P=0.05) or lower. The latter protects against a type I error (false positive). Setting a high value protects against a type II error (false negative).

    Effect size
    Effect size (difference between the means divided by the mean standard deviation) is estimated with a precision that depends on sample size. The results of fifteen pairs of random subsamples of n = 5 and fifteen pairs of n = 60 are shown below:

  3. Ask the students to return to their groups and spend fifteen minutes reassessing the graph using their new tools. Ask the students (a) Do the confidence intervals overlap? (b) What does a P value tell us about the probability that the treatment works? (c) Does the data pattern justify the inference that there is a biologically or clinically significant difference between the groups in this trial? (d) How might a different relative risk statistic be derived from this plot? (e) What factors could have biased the outcome of the trial? (f) What further research, if any, would they like to view before inferring that the treatment is effective?

  4. Reconvene the class, solicit opinions from the students, and discuss these as appropriate.

    Likely the students will point out that although there is a statistically significant difference between the groups, the two confidence intervals are very close, which raises questions about the biological significance. Cold symptoms are subjective and to infer symptom clearance within a time frame of a fraction of a day is likely fraught with problems. The P value tells us nothing about the probability that the treatment works (the biological hypothesis). The P value tells us only that the chances of obtaining our sampled data are small given a statistical null hypothesis. This may or may not justify inferring a clinical treatment effect. Many other aspects of the trial require scrutiny.

    Make a list of potential sources of bias and confounding variables suggested by the students. It would be desirable to know what operational definition of a "cold" was used by investigators. Were all subjects infected with the same cultured virus? "Colds" may be caused by many viruses, which may show differences among them in response to a particular drug. The students should note that the numerical value of the relative risk statistic depends upon the end point that is chosen. In this case the end point was six days. Is there a more meaningful end point? Ultimately, the students may state that they would like to know if the results could be consistently reproduced by other investigators.

    Time permitting, you may wish to demonstrate how to derive a confidence interval on relative risk (in this example the 95% interval is 0.46 - 0.78) and introduce the students to graphical methods of plotting confidence intervals such as forest plots.

    Finally, you may wish to introduce some discussion of Bayesian analysis, in which the probability of the statistical hypothesis, given the data, is calculated, P(H | D), rather than the probability of the data given the hypothesis, P(D | H), as above. A Bayesian approach allows the likelihood to be modified by a prior estimate of probability of the hypothesis. The effect of assigning a low prior probability is to require the accumulation of more data to reach the same conclusion as a less skeptical researcher.

    Major outcomes of the discussion should focus on (a) the difficulty of hypothesis-testing as the treatment effect moves toward zero, (b) the difference between rejecting a statistical null hypothesis and making an inference about a biological treatment effect, two separate operations, and (c) the reasons why the literature on many useless alternative medicines may be plagued by false positives.
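
The arithmetic in this case can be checked with a few lines of Python. This is a sketch under stated assumptions rather than a definitive analysis: it uses the rounded standard deviation of 2 suggested above, a normal approximation for the P value, and a log-scale interval for relative risk. The text does not say which method produced its quoted relative risk interval, so the bounds computed here need not match it exactly. The likelihood ratio of 10 in the Bayesian lines is a purely hypothetical value.

```python
import math

n = 100                                   # subjects per group
mean_treated, mean_control = 6.0, 7.0     # mean last day of symptoms
sd = 2.0                                  # both SDs rounded to 2

# Confidence intervals: mean plus or minus two standard errors
se = sd / math.sqrt(n)                    # 0.2 days
ci_treated = (mean_treated - 2 * se, mean_treated + 2 * se)
ci_control = (mean_control - 2 * se, mean_control + 2 * se)
intervals_overlap = ci_treated[1] >= ci_control[0]

# P value: difference between means per unit of standard error (normal approx.)
z = (mean_control - mean_treated) / math.sqrt(se**2 + se**2)
p_two_sided = math.erfc(z / math.sqrt(2))

# Effect size: difference between the means / standard deviation
effect_size = (mean_control - mean_treated) / sd

# Relative risk of symptoms beyond six days, with a log-scale 95% interval
a, b = 37, 61                             # treated, control still symptomatic
rr = (a / n) / (b / n)
se_log_rr = math.sqrt(1 / a - 1 / n + 1 / b - 1 / n)
rr_ci = (math.exp(math.log(rr) - 1.96 * se_log_rr),
         math.exp(math.log(rr) + 1.96 * se_log_rr))

def posterior(prior, likelihood_ratio):
    """P(H | D) from a prior P(H) and the ratio P(D | H) / P(D | not H)."""
    odds = prior / (1 - prior) * likelihood_ratio
    return odds / (1 + odds)

# Same hypothetical data (likelihood ratio 10), two different priors
open_minded = posterior(0.5, 10)          # about 0.91
skeptical = posterior(0.01, 10)           # about 0.09

print(f"CIs: {ci_treated}, {ci_control}; overlap: {intervals_overlap}")
print(f"z = {z:.2f}, P = {p_two_sided:.4f}, effect size = {effect_size}")
print(f"RR = {rr:.2f}, 95% CI {rr_ci[0]:.2f} to {rr_ci[1]:.2f}")
print(f"posterior: prior 0.5 -> {open_minded:.2f}, prior 0.01 -> {skeptical:.2f}")
```

The last two lines illustrate the point about priors: identical data leave the skeptical researcher far less convinced, so more evidence is needed to reach the same conclusion.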



JUDGING TESTABILITY OF A CLAIM
Case Example: Brain and Behavior

This case introduces the concept of testability.

A common impediment to scientific progress is failure to frame testable questions. Pseudoscience may employ untestable metaphors as explanations for alleged healing powers and may invoke unmeasurable variables. (Wilhelm Reich, the psychotherapist, championed "orgone rays," touch therapists cite "energy fields," traditional Chinese medicine cites qi, and traditional Indian medicine (Ayurveda) cites the tridosha.) Challenging students to dissect such claims provides important lessons in the need for testability, parsimony, elegance, and rigor in scientific reasoning. An example follows:

Ask the students to form groups of three to five within the lecture room or assign groups by lottery. Provide the following brief quotation to analyze. Tell the groups that you may call upon them when the class reconvenes after 15 minutes of small group work. Ask the students how they would decide if the claim were justified, and how they might investigate the claim. Encourage them to draw upon their knowledge of biology to suggest tests of any alternate hypotheses that might occur to them.

Why do some people have little difficulty quitting smoking? The answer is Will Power. This is what psychotherapists term "intrinsic motivation," an underlying need for competence and self determination.

To investigate why some people are successful nicotine abstainers, the students might suggest an experiment that compares long-term abstainers to a control group of people who frequently resume smoking. The dependent variable in this case is frequency (per unit time) of resuming smoking after quitting. The independent variable, Will Power, is of course elusive. The only way to measure "Will Power" and its trappings as portrayed in the quotation is to measure the dependent variable, success in quitting smoking, leaving the experiment with no independent variable! The reasoning is circular, a common problem of pseudoscience.

Once the students identify the circularity, they can proceed only if they frame a testable hypothesis, and this will require application of biology. The students will have to brainstorm a list of measurable variables that might affect levels of nicotine addiction. They might hypothesize that easy-quitters have fewer nicotinic receptors and/or nicotinic receptors that are less sensitive to nicotine. There are many possibilities for creative investigative designs. In terms of critical analysis, however, a major outcome of interest to the educator is that the students recognize the flaw in the original proposal regarding will power.

This is a milestone case for many students, as it awakens them to literature that masquerades as science, and makes them more critical readers. Much writing on natural history, healing, and human behavior is entangled in untestable metaphors and therefore falls outside the scope of science. A splendid analysis of the concept of "motivation" is provided by Chiesa (1994).

The eminent microbiologist Louis Pasteur revolutionized medical practice largely by demanding testability and scientific rigor (Debre and Forster 1998). This was a hard-won lesson that needs constant reinforcement in biology classes.



SCIENCE VS RELIGION
Case Example: Evolutionary Biology

This case provides practice in discriminating scientific explanations from nonscientific explanations.

Most students are aware that some religious organizations lobby to have supernatural claims, particularly intelligent design creationism, taught in science classes. Media attention given to this issue is fortuitous, as it provides a dramatic substrate upon which to confront the question "What is science?"

Rather than lecturing students on science as though it were dogma, such as the "evidence for evolution" as is presented in many textbooks, it is preferable to actively engage the students in examining and comparing scientific and nonscientific theories. Ask the students to bring to class lists of criticisms of evolution that they find on the Internet. These can be analyzed in small groups and students can themselves generate a list of criteria to discriminate science from religion.

Example --

Ask the students how they would go about testing a phylogenetic hypothesis. This forces them to confront a question that many have never before contemplated.

Have the students form groups of three to five in the lecture hall or assign groups by lottery. Ask them to suggest an investigation to test the phylogeny hypotheses in the following quotation. ALSO ask them to reflect on their methodology and brainstorm a list of characteristics of science.

Evolutionary biologists have proposed that new kinds of living organisms arise sequentially through time via genetic modification. For example, it is hypothesized that amphibians evolved from fish and that birds evolved from saurischian dinosaurs. By contrast, there are "scientific" creationists [members of the Creation Research Society] who believe that all basic kinds of life (birds, amphibians, reptiles, etc.) arose within a six day period.

The students will likely suggest that the evolutionary explanation predicts that the fossil record would show a sequential emergence of fish, amphibians, reptiles, and birds over a span of time geologically dated at significantly longer than six days, whereas the creation hypothesis predicts that all life forms would appear together throughout the fossil record, inhabiting the earth from the beginning of the record of life. The students may have other suggestions, such as seeking transitional fossils, quiescent ancestral genes, and present-day mutants.

The creationist claim, by contrast, predicts that the fossil record would show the emergence of birds and reptiles (and all other major life forms) simultaneously without transitional forms.

When the class reconvenes, the students' ideas can be elicited, and an instructor-led discussion can include a review of the fossil record, transitional forms (Archaeopteryx, Tiktaalik, etc.), and genetics (such as Hoxd13 pattern found in Australian lungfish, tooth genes in birds, the Talpid specimen, etc.).

Of most interest, however, is the list of characteristics of science contributed by the students. Any of the following might emerge and are worthy of discussion.

  1. Scientific ideas are provisional

    The tentative language of biological science used in the quotation ("proposed", "hypothesized") contrasts with the dogmatic language ("believe") of the scientific creationist. Creation Research Society literature states that "members of the Society are ...committed to full belief" in their explanation. By contrast, scientific hypotheses are provisional and are abandoned if they fail rigorous testing. For example, it was once thought that plant development was influenced by "mitogenetic rays," but this idea has not withstood extensive investigation (Langmuir 1989) and has been discarded. As such, it is important for the student to recognize that science is simply a human behavior, not a mirror of Truth or a mirror of Reality. The provisional nature of science illustrates that truth and reality are human interpretations subject to revision. As David Western has said, "Anyone who believes that science is dispassionate and objective has never worked with scientists" (Western 1997).

  2. Science deals only with the observable

    Science constructs explanations based upon a language of human perceptions of the natural environment. Scientific claims are tested by observation and measurement. In the case of phylogeny this may involve radiometric dating, fossil records, DNA typing, genome sequencing, etc. The value of scientific hypotheses is measured by their predictive power -- whether or not patterns of physical evidence in nature conform to those expected by the hypothesis. Nonscientific explanations may encompass the non-observable (supernatural), may rely upon historic writings (religious texts), or may invent untestable constructs (psychoanalysis). In the phylogenetic problem, the creationists may rely on religious scripture as one form of evidence.

  3. Science relies upon persuasion rather than force

    Competition among scientific theories is resolved through continual research. For example, by the late 1800s most scientists accepted the Germ Theory of Disease because it had been well tested. The popularity of the theory was not dependent upon scientists being ordered to believe it. This contrasts with religions in which research may be unwelcome (e.g., Galileo's testing of Copernican Theory), and members are expected to hold certain beliefs. Membership in the Creation Research Society implies commitment "to full belief."

  4. There is considerable consensus among scientists

    Science relies on full disclosure of investigative methods and on principles of testability and reproducibility. This promotes consensus. The vast majority of scientists worldwide use the theory of evolution as an explanatory tool. Nonscientific disciplines are often highly fragmented as they lack a protocol for resolving differences. The number of religions is testimony to this trait, as are their differing positions on evolution.

  5. Science thrives on curiosity and eagerness for new information and new discoveries

    Science as a cultural enterprise encourages ongoing research. Science is self-critical and evolutionary hypotheses are continually being reshaped and fine-tuned. Nonscientific and pseudoscientific organizations may be wary of research and new ideas. The alternative health care movement "largely denies the need" for research (Angell and Kassirer 1998), and organized religion has punished (Galileo) or killed (Aikenhead, Vanini) reflective thinkers.

A few students may view religious teachings as incontrovertible, and be impervious to scientific evidence, and it may be helpful to point out that acquisition of new and more reliable knowledge through controlled trials was in fact exemplified in ancient texts:

Daniel 1:12-15 -- "Test your servants for ten days; let us be given vegetables to eat and water to drink. Then let our appearance and the appearance of the youths who eat the king's food be observed by you, and deal with your servants according to what you see." So he listened to them in this matter, and tested them for ten days. At the end of ten days it was seen that they were better in appearance and fatter in flesh than all the youths who ate the king's food.

Although several of the examples discussed above were from popular literature, students should be made well aware that articles in peer-reviewed journals are not necessarily scientifically credible. For example, Skalski and Robson (1992) listed a large number of published peer-reviewed ecological experiments that lacked randomization and/or adequate replication.



PART II: MORE CASE PROBLEMS

  • Homeopathy is a form of treatment in which a remedy may be repeatedly diluted until only the solvent remains in the container, a solvent thought to have acquired a "memory" of the remedy once present. A recent study of the effectiveness of homeopathy reported here involved interviewing 6544 patients to determine if their health improved after homeopathic treatment. "70.7% reported positive health changes." Does this result support the claim that homeopathy has a unique biological effect? Explain.

    [The students should point out that the claim is not justified because there was no control group. It is unknown what percentage of people would have reported positive health changes without treatment. And most treatments have placebo effects -- that is why blood-letting was popular for thousands of years. Recipients of biologically ineffective treatments may claim that "it works" because of placebo effects, but this is not evidence of a unique biological effect. The response of a treatment group must be compared to the response of a sham-treated group (both groups and clinicians being unaware of which group is receiving which treatment). Also in this example, it would be important to have clear outcome measures, not just reports of generally feeling better. A discussion of this study is available here.]

  • A homeopathic therapist admits that most large well-designed trials of homeopathy show no treatment effect, but he can point to quite a few experiments that do show a treatment effect and he claims that these demonstrate that homeopathy must have some value. Is he justified in his claim?

    [The students should respond that he is not justified. In fact we can EXPECT false positives due to chance even when the weight of evidence from many trials points toward NO treatment effect. And the smaller the sample sizes used in the experiments, the greater the likelihood that their estimated treatment effects are not representative of what would be obtained with large sample sizes.]

  • A touch therapist claimed that therapeutic touch can reduce the frequency of headaches and as evidence published testimonials from clients claiming reduced headache frequency after one treatment. Critique this evidence and suggest a better test of this claim.

    [The students should respond that it must be determined if touch is better than placebo at reducing headache frequency. An investigative design must rule out coincidence. Testimonials are not useful as evidence because a positive result may be coincidence or a placebo effect. A test of the claim requires a large number of subjects (replication) who are randomly assigned to either touch or a sham treatment (control group) such as an inert pill. The outcome measure is frequency of headaches.]

  • An ad for an herbal product that has failed large clinical trials contained the following testimonial as evidence of its effectiveness: "I was very pleased with your product. It certainly works for me. My headaches now disappear within a few hours of taking the pills." Critique the quality of evidence in support of a treatment effect for this product in this customer, and suggest a better source of evidence.

    [This is an interesting claim, because it is quite possible that a drug may show no significant effect on a test population yet benefit a unique subset of people. This was seen with the drug bucindolol in treating heart failure. Therefore the question in this case is whether the product is better than placebo in this one person. The "target of inference" from any evidence is this one single person, not a population. That rules out replication. However, randomization is possible. An experiment could be designed in which the subject was randomly given (blindly) either the herb or a placebo each time he/she suffered a headache. The outcome measure is headache duration.]

  • An experiment was reported in which the potential effects of a pollutant were assayed by applying the pollutant to a large fish tank containing 100 fish, and comparing their survival to 100 otherwise similar fish in an unpolluted control tank. The treatment tank was selected in a random manner. Critique this experiment based on this brief description. What is the greatest flaw in the investigative design? [Such questions are adaptable to multiple-choice format:]

    a. Lack of clear outcome measure.
    b. Lack of appropriate controls.
    c. Lack of randomization.
    d. Lack of replication.

    [The answer is (d): this is an example of pseudoreplication. It is the tanks that must be replicated (Hurlbert 1984).]

  • A drug company is seeking licensing for a drug that reduces breast cancer risk by 50 percent. However, this same drug increases the risk of uterine cancer by 100 percent. How would you decide if the drug should be approved?

    [The students should realize that they require baseline data on the risks of the two cancers. The incidence of breast cancer is about 80 per 100,000 population. The incidence of uterine cancer is about 15 per 100,000.]

  • Homeopathy is a form of treatment in which a remedy may be repeatedly diluted until only water remains in the container, water which is thought to have acquired a "memory" of the remedy once present. Arguments that homeopathy is pseudoscience cite the fact that scientists have been unable to demonstrate that water has a memory. They also cite the failure of well-designed randomized controlled trials to show any value beyond a placebo effect. An argument in favor of the effectiveness of homeopathy is that there are three thousand practicing homeopaths in Great Britain. Is this good evidence of the efficacy of homeopathy? Explain.

    [The students should respond that this is an example of argumentum ad verecundiam -- argument from authority rather than from measurement of effectiveness. The number of proponents has no relation to determining whether or not something works.]

  • Some proponents of "intelligent design" demand that schools include a supernatural explanation for the diversity of life in science classes. What feature of science distinguishes it from the realm of religion and the supernatural?

    [Science is a method of explaining nature using hypotheses that can be tested against human observations. It does not deal with supernatural claims which are by definition untestable.]

  • An advertisement for ginseng, a herbal product, claims that it "promotes endurance." How would you decide if this claim were fraudulent? What type of evidence would you like to see prior to accepting this claim? Provide details of an investigative design.

  • Biologists hypothesize that whales evolved from land-dwelling animals now extinct (mesonychids). How might this hypothesis be tested?

  • A web page promoting the health benefits of "touch therapy" claims that "Healing Touch provides an energetic liver cleanse." What evidence would you like to see in support of such a claim?

  • Natural history programs commonly attribute adaptive value to animal colors. For example, the snake species Thamnophis elegans occurs in two color phases: grey and brown. Grey snakes predominate on rocky beaches, where they blend with the background color, while brown snakes are most common in the uplands. What kind of experiment would verify whether or not the colors were adaptive? How would you design a rigorous experiment that would discriminate between the hypotheses of selection and drift?

  • A documentary TV program on "reflexology" claimed that pressure massage of a certain region of the foot would relieve congested sinuses. What evidence would be necessary before you would accept such a claim?

  • Iridologists claim that by examining the eye they can detect diseases in other body organs because the iris is neurally connected to them. Biologists say that there is no neural transmission between the iris and the major organs. What evidence would be required to validate iridology, for example, to diagnose kidney disease by viewing only the eye? See if you can find if such evidence exists.

  • Sigmund Freud, in his book "The Interpretation of Dreams," claimed that dream content may receive contributions from a special "psychic [censorship] function" separate from dream thoughts. Freud said that episodes of dream resistance and dream realization prior to wakening were "incontestable proof" of this psychic function. Critique Freud's evidence from a biological perspective.

  • The web site of The American College of Orgonomy states that this branch of psychiatry is based upon the theory of Wilhelm Reich that "in almost all individuals, the flow and release of orgone energy [which 'fills the universe and pulsates in all living things'] is blocked by chronic muscle contraction in various areas of the body." Orgone therapy [of emotional illness] is thus aimed at relief of spastic muscles. Critique this theory. What type of investigation would be necessary to test this theory? What type of evidence would you like to see prior to accepting this proposition and spending money on orgone therapy?
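
For the drug licensing problem above (breast cancer risk down 50 percent, uterine cancer risk up 100 percent), students can turn the relative risks into absolute numbers of cases. The following is a minimal sketch in Python using only the two baseline incidences quoted in the answer. The function name is hypothetical, and the calculation assumes that both incidences refer to the same population. It counts cases only; it says nothing about differences in severity, treatability, or mortality between the two cancers, which students should also weigh.

```python
# Baseline incidences quoted in the answer (cases per 100,000 population).
BREAST_BASELINE = 80
UTERINE_BASELINE = 15


def absolute_change(baseline, relative_change):
    """Return the change in cases per 100,000 for a relative change in risk.

    relative_change is a fraction: -0.50 is a 50 percent reduction,
    +1.00 is a 100 percent increase.
    """
    return baseline * relative_change


# Hypothetical helper applied to the two relative risks stated in the problem.
breast_change = absolute_change(BREAST_BASELINE, -0.50)   # fewer breast cancers
uterine_change = absolute_change(UTERINE_BASELINE, 1.00)  # more uterine cancers
net_change = breast_change + uterine_change

print(f"Breast cancer:  {breast_change:+.0f} cases per 100,000")
print(f"Uterine cancer: {uterine_change:+.0f} cases per 100,000")
print(f"Net change:     {net_change:+.0f} cases per 100,000")
```

The point for discussion is that a "100 percent increase" in a rare outcome can represent fewer cases than a "50 percent reduction" in a more common one, so relative risks cannot be compared without baselines.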
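
For the problem above about the therapist who points to "quite a few" positive homeopathy trials, the expectation of false positives can be made concrete. This is a rough sketch under stated assumptions, none of which come from the original problem: the trials are independent, the treatment truly has no effect, and each trial uses a significance threshold of 0.05. The number of trials is hypothetical, as are the function names.

```python
def expected_false_positives(n_trials, alpha):
    """Expected number of 'significant' trials when the treatment does nothing."""
    return n_trials * alpha


def prob_at_least_one(n_trials, alpha):
    """Chance that at least one of n independent null trials is a false positive."""
    return 1 - (1 - alpha) ** n_trials


N_TRIALS = 100  # hypothetical number of published trials
ALPHA = 0.05    # assumed significance threshold for each trial

print(f"Expected false positives: {expected_false_positives(N_TRIALS, ALPHA):.1f}")
print(f"P(at least one false positive): {prob_at_least_one(N_TRIALS, ALPHA):.3f}")
```

Under these assumptions, a handful of positive trials is what chance alone would produce, which is why the weight of evidence from all trials matters more than a selected few.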


REFERENCES

Angell, M and Kassirer, JP (1998) Alternative medicine: The risks of untested and unregulated remedies. The New England Journal of Medicine, 339:839-841.

Chiesa, M (1994) Radical behaviorism: The philosophy and the science. Author's Cooperative, Inc.

Debre, P and Forster, E (1998) Louis Pasteur. Johns Hopkins University Press.

Giere, RN (1998) Understanding scientific reasoning. Holt, Rinehart, and Winston.

Hurlbert, SH (1984) Pseudoreplication and the design of ecological experiments. Ecological Monographs, 54:187-211.

Johnson, D (1999) The insignificance of statistical significance testing. Journal of Wildlife Management, 63:763-772.

Langmuir, I (1989) Pathological science. Physics Today, October:36-48.

Nussbaum, MC (1997) Cultivating humanity. Harvard University Press.

Peters, RH (1991) A critique for ecology. Cambridge University Press.

Petticrew, M et al. (1999) Relation between hostility and coronary heart disease. British Medical Journal, 319:917.

Rorty, R (1991) Objectivity, relativism, and truth. Cambridge University Press.

Shah, N and Bracken, MB (2001) A systematic review and meta-analysis of prospective studies on the association between maternal cigarette smoking and preterm delivery. Journal of Nutrition, 131:1032S-1040S.

Skalski, JR and Robson, DS (1992) Techniques for wildlife investigations. Academic Press.

Tang, J-L, Zhan, S-Y and Ernst, E (1999) Review of randomised controlled trials of traditional Chinese medicine. British Medical Journal, 319:160-161.

Western, D (1997) In the dust of Kilimanjaro. Island Press.


Copyright 1999, 2005 Peter Ommundsen

A copy of this web page is available in Estonian, translated from English by Anna Galovich:

Kriitiline mõtlemine Biology: Juhul, kui probleem.

