Science misinformation is being spread in the lecture halls of top universities

Should universities remove online courses that contain incorrect or misleading information?

There are lots of scientific controversies where different scientists have conflicting views. Eventually these controversies will be resolved by normal scientific means involving evidence and logic, but for the time being there isn't enough data to settle them. Many of us are interested in these controversies, and some of us have chosen to invest time and effort in defending one side or the other.

But there's a dark side of science that infects these debates—false or misleading information used to support one side of a legitimate controversy. To give just one example, I'm frustrated by the constant claim that junk DNA was defined as non-coding DNA. Many scientists believe that this is how junk DNA was defined by its earliest proponents, and they go on to say that the recent discovery of functional non-coding DNA refutes junk DNA.

I don't know where this idea came from because there's nothing in the scientific literature from 50 years ago to support such a ridiculous claim. It must be coming from somewhere since the idea is so widespread.

Where does misinformation come from and how is it spread?


Philip Ball’s new book: “How Life Works”

Philip Ball has just published a new book "How Life Works." The subtitle is "A User’s Guide to the New Biology" and that should tell you all you need to know. This is going to be a book about how human genomics has changed everything.


How many genes in the human genome (2023)?

The latest summary of the number of genes in the human genome gets the number of protein-coding genes right, but its estimate of the number of known non-coding genes is far too high.

In order to have a meaningful discussion about molecular genes, we have to agree on the definition of a molecular gene. I support the following definition (see What Is a Gene?).


Definition of a gene (again)

The correct definition of a molecular gene isn't difficult but getting it recognized and accepted is a different story.

When writing my book on junk DNA I realized that there was an issue with genes. The average scientist, and consequently the average science writer, has a very confused picture of genes and the proper way to define them. The issue shouldn't be confusing for Sandwalk readers since we've covered that ground many times in the past. I think the best working definition of a gene is, "A gene is a DNA sequence that is transcribed to produce a functional product" [What Is a Gene?]


David Allis (1951 – 2023) and the “histone code”

C. David Allis died on January 8, 2023. You can read about his history of awards and accomplishments in the Nature obituary with the provocative subtitle Biologist who revolutionized the chromatin and gene-expression field. This refers to his work on histone acetyltransferases (HATs) and his ideas about the histone code.

The key paper on the histone code is,

Strahl, B. D., and Allis, C. D. (2000) The language of covalent histone modifications. Nature, 403:41-45. [doi: 10.1038/47412]

Histone proteins and the nucleosomes they form with DNA are the fundamental building blocks of eukaryotic chromatin. A diverse array of post-translational modifications that often occur on tail domains of these proteins has been well documented. Although the function of these highly conserved modifications has remained elusive, converging biochemical and genetic evidence suggests functions in several chromatin-based processes. We propose that distinct histone modifications, on one or more tails, act sequentially or in combination to form a ‘histone code’ that is read by other proteins to bring about distinct downstream events.

They are proposing that the various modifications of histone proteins can be read as a sort of code that's recognized by other factors that bind to nucleosomes and regulate gene expression.

This is an important contribution to our understanding of the relationship between chromatin structure and gene expression. Nobody doubts that transcription is associated with an open form of chromatin that correlates with demethylation of DNA and covalent modification of histones, and nobody doubts that there are proteins that recognize modified histones. However, the key question is what comes first: the binding of transcription factors followed by changes to the DNA and histones, or changes to DNA and histones that open the chromatin so that transcription factors can bind? These two models are referred to as the histone code model and the recruitment model.

Strahl and Allis did not address this controversy in their original paper; instead, they concentrated on what happens after histones become modified. That's what they mean by "downstream events." Unfortunately, the histone code model has been appropriated by the epigenetics cult and they do not distinguish between cause and effect. For example,

The “histone code” is a hypothesis which states that DNA transcription is largely regulated by post-translational modifications to these histone proteins. Through these mechanisms, a person’s phenotype can change without changing their underlying genetic makeup, controlling gene expression. (Shahid et al., 2022)

The language used by fans of epigenetics strongly implies that it's the modification of DNA and histones, and not the sequence of DNA, that is the primary event in regulating gene expression. The recruitment model states that regulation is primarily due to the binding of transcription factors to specific regulatory DNA sequences, which then leads to DNA and histone modification as an epiphenomenon.

The unauthorized expropriation of the histone code hypothesis should not be allowed to diminish the contribution of David Allis.


Editing the ‘Intergenic region’ article on Wikipedia

Just before getting banned from Wikipedia, I was about to deal with a claim on the Intergenic region article. I had already fixed most of the other problems but there is still this statement in the subsection labeled "Properties."

According to the ENCODE project's study of the human genome, due to "both the expansion of genic regions by the discovery of new isoforms and the identification of novel intergenic transcripts, there has been a marked increase in the number of intergenic regions (from 32,481 to 60,250) due to their fragmentation and a decrease in their lengths (from 14,170 bp to 3,949 bp median length)"[2]

The source is one of the ENCODE papers published in the September 6 edition of Nature (Djebali et al., 2012). The quotation is accurate. Here's the full quotation.

As a consequence of both the expansion of genic regions by the discovery of new isoforms and the identification of novel intergenic transcripts, there has been a marked increase in the number of intergenic regions (from 32,481 to 60,250) due to their fragmentation and a decrease in their lengths (from 14,170 bp to 3,949 bp median length).

What's interesting about that data is what it reveals about the percentage of the genome devoted to intergenic DNA and the percentage devoted to genes. The authors claim that there are 60,250 intergenic regions, which means that there must be more than 60,000 genes.1 The median length of these intergenic regions is 3,949 bp, which means that roughly 204.5 × 10^6 bp are found in intergenic DNA. That's roughly 7% of the genome, depending on which genome size you use. It doesn't mean that all the rest is genes, but it sounds like they're saying that about 90% of the genome is occupied by genes.
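The arithmetic can be checked with a quick back-of-the-envelope calculation. Note the assumptions: the median length is used as a stand-in for the typical length (which slightly overstates the 204.5 × 10^6 bp figure above, presumably because the authors used a different summary statistic), and a genome size of about 3.1 Gb. The conclusion is the same either way: intergenic DNA comes out to well under 10% of the genome.

```python
# Rough check of the ENCODE intergenic DNA estimate.
# Assumptions: median length as a proxy for typical length; ~3.1 Gb genome.
regions = 60_250          # intergenic regions reported by Djebali et al.
median_len = 3_949        # median intergenic length in bp
genome_size = 3.1e9       # approximate human genome size in bp

intergenic_bp = regions * median_len      # ~2.4e8 bp
fraction = intergenic_bp / genome_size    # roughly 7-8% of the genome

print(f"{intergenic_bp:,} bp intergenic, {fraction:.1%} of the genome")
```

Whatever summary statistic is used, the implication that everything else (roughly 90%) is genic is what makes the claim so remarkable.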

In case you doubt that's what they're saying, read the rest of the paragraph in the paper.

Concordantly, we observed an increased overlap of genic regions. As the determination of genic regions is currently defined by the cumulative lengths of the isoforms and their genetic association to phenotypic characteristics, the likely continued reduction in the lengths of intergenic regions will steadily lead to the overlap of most genes previously assumed to be distinct genetic loci. This supports and is consistent with earlier observations of a highly interleaved transcribed genome, but more importantly, prompts the reconsideration of the definition of a gene.

It sounds like they are anticipating a time when the discovery of more noncoding genes will eventually lead to a situation where the intergenic regions disappear and all genes will overlap.

Now, as most of you know, the ENCODE papers have been discredited and hardly any knowledgeable scientist thinks there are 60,000 genes that occupy 90% of the genome. But here's the problem. I probably couldn't delete that sentence from Wikipedia because it meets all the criteria of a reliable source (published in Nature by scientists from reputable universities). Recent experience tells me that the Wikipedia editors would have blocked me from deleting it.

The best I could do would be to balance the claim with one from another "reliable source" such as Piovesan et al. (2019), who list the total number of exons and introns and their average sizes, allowing you to calculate that protein-coding genes occupy about 35% of the genome. Other papers give slightly higher values for protein-coding genes.

It's hard to get a reliable source on the real number of noncoding genes and their average size, but I estimate that there are about 5,000 such genes and that, by a generous estimate, they could take up a few percent of the genome. In my upcoming book I assume that genes probably occupy about 45% of the genome because I'm trying to err on the side of function.

An article on Intergenic regions is not really the place to get into a discussion about the number of noncoding genes, but in the absence of a well-sourced explanation the audience will be left with the statement from Djebali et al., and that's extremely misleading. Thus, my preference would be to replace it with a link to some other article where the controversy can be explained, preferably a new article on junk DNA.2

I was going to say,

The total amount of intergenic DNA depends on the size of the genome, the number of genes, and the length of each gene. That can vary widely from species to species. The value for the human genome is controversial because there is no widespread agreement on the number of genes but it's almost certain that intergenic DNA takes up at least 40% of the genome.

I can't supply a specific reference for this statement, so it would never have gotten past the Wikipedia editors. This is a problem that can't be solved because any serious attempt to fix it will probably lead to getting blocked on Wikipedia.

There is one other statement in that section in the article on Intergenic region.

Scientists have now artificially synthesized proteins from intergenic regions.[3]

I would have removed that statement because it's irrelevant. It does not contribute to understanding intergenic regions. It's undoubtedly one of those little factoids that someone has stumbled across and thinks it needs to be on Wikipedia.

Deletion of a statement like that would have met with fierce resistance from the Wikipedia editors because it is properly sourced. The reference is to a 2009 paper in the Journal of Biological Engineering: "Synthesizing non-natural parts from natural genomic template."


1. There are no intergenic regions between the last genes on the end of a chromosome and the telomeres.

2. The Wikipedia editors deleted the Junk DNA article about ten years ago on the grounds that junk DNA had been disproven.

Djebali, S., Davis, C. A., Merkel, A., Dobin, A., Lassmann, T., Mortazavi, A. et al. (2012) Landscape of transcription in human cells. Nature 489:101-108. [doi: 10.1038/nature11233]

Piovesan, A., Antonaros, F., Vitale, L., Strippoli, P., Pelleri, M. C., and Caracausi, M. (2019) Human protein-coding genes and gene feature statistics in 2019. BMC Research Notes 12:315. [doi: 10.1186/s13104-019-4343-8]

Richard Dawkins talks about the genetic code and information

This is a video published a few weeks ago in which Jon Perry interviews Richard Dawkins. Jon Perry is the author of the animations posted on his website Stated Clearly. He (Perry) has a very adaptationist view of evolution—a view that he got from Richard Dawkins.

The main topic of the interview concerns DNA as information and the genetic code. Both Dawkins and Perry give the impression that the only kind of information in the genome is the genetic code (sensu stricto); in other words, the code that specifies a sequence of amino acids using the sequence of nucleotides in a coding region [The Real Genetic Code]. Dawkins makes the same point he has often made; namely, that this is a real code just like any other code.
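Dawkins's point that the genetic code is a real code, not a metaphor, can be made concrete: translation is literal symbol-for-symbol substitution through a lookup table. Here is a minimal sketch using a handful of codons from the standard genetic code (the function name and the tiny codon subset are mine, for illustration only).

```python
# A literal lookup table: a small subset of the standard genetic code.
# Translating a coding sequence is symbol substitution, nothing metaphorical.
CODONS = {
    "ATG": "Met", "AAA": "Lys", "GGC": "Gly",
    "TGG": "Trp", "TAA": "STOP",
}

def translate(dna: str) -> list[str]:
    """Translate a reading frame codon by codon, halting at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODONS[dna[i:i + 3]]
        if aa == "STOP":
            break
        protein.append(aa)
    return protein

print(translate("ATGAAAGGCTGGTAA"))  # ['Met', 'Lys', 'Gly', 'Trp']
```

Nothing like this lookup applies to a tRNA gene or a replication origin, which is why it's misleading to treat all genomic information as instances of "the code."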

Perry points out that most people don't understand this, including many atheists who argue that the "code" is merely an analogy and not to be taken literally. Atheists, and others, also argue that the information content of DNA includes lots of other things such as genes that specify functional RNAs and sites that bind proteins. It's hard to argue that a gene for tRNA functions as any kind of a code and it's hard to argue that the DNA binding sites in origins of replication are codes even though you could argue that they carry information.

I don't get excited about arguments over whether DNA carries "information" because there's not much to be gained by such arguments. Who cares whether the genetic code falls under the definition of "information theory"? However, I do get annoyed when people say that the ONLY information in DNA is in the form of the genetic code.

Watch the video and let me know what you think. Jerry Coyne watched it and he wasn't the least bit bothered by the things that bothered me [A discussion on genetics, evolution, and information with Richard Dawkins].


The bad news from Ghent

A group of scientists, mostly from the University of Ghent (Belgium),1 have posted a paper on bioRxiv.

Lorenzi, L., Chiu, H.-S., Cobos, F.A., Gross, S., Volders, P.-J., Cannoodt, R., Nuytens, J., Vanderheyden, K., Anckaert, J. and Lefever, S. et al. (2019) The RNA Atlas, a single nucleotide resolution map of the human transcriptome. bioRxiv:807529. [doi: 10.1101/807529]

The human transcriptome consists of various RNA biotypes including multiple types of non-coding RNAs (ncRNAs). Current ncRNA compendia remain incomplete partially because they are almost exclusively derived from the interrogation of small- and polyadenylated RNAs. Here, we present a more comprehensive atlas of the human transcriptome that is derived from matching polyA-, total-, and small-RNA profiles of a heterogeneous collection of nearly 300 human tissues and cell lines. We report on thousands of novel RNA species across all major RNA biotypes, including a hitherto poorly-cataloged class of non-polyadenylated single-exon long non-coding RNAs. In addition, we exploit intron abundance estimates from total RNA-sequencing to test and verify functional regulation by novel non-coding RNAs. Our study represents a substantial expansion of the current catalogue of human ncRNAs and their regulatory interactions. All data, analyses, and results are available in the R2 web portal and serve as a basis to further explore RNA biology and function.

They spent a great deal of effort identifying RNAs from 300 human samples in order to construct an extensive catalogue of five kinds of transcripts: mRNAs, lncRNAs, antisense RNAs, miRNAs, and circular RNAs. The paper goes off the rails in the first paragraph of the Results section, where they immediately equate transcripts with genes. They report the following:

  • 19,107 mRNA genes (188 novel)
  • 18,387 lncRNA genes (13,175 novel)
  • 7,309 asRNA genes (2,519 novel)
  • 5,427 miRNAs
  • 5,427 circRNAs

As Sandwalk readers know, there's a bit of a controversy over the functionality of transcripts. I maintain that most noncoding transcripts are junk RNA resulting from spurious transcription; therefore, it is incorrect to associate each transcript with a gene [On the misrepresentation of facts about lncRNAs] [How many lncRNAs are functional?].

I'm not the only one who's skeptical about lncRNAs. I haven't got time to list all the papers that discuss the controversy but here's one of the most important ones.

Palazzo, A.F. and Lee, E.S. (2015) Non-coding RNA: what is functional and what is junk? Frontiers in Genetics 6:2. [doi: 10.3389/fgene.2015.00002]

The genomes of large multicellular eukaryotes are mostly comprised of non-protein coding DNA. Although there has been much agreement that a small fraction of these genomes has important biological functions, there has been much debate as to whether the rest contributes to development and/or homeostasis. Much of the speculation has centered on the genomic regions that are transcribed into RNA at some low level. Unfortunately these RNAs have been arbitrarily assigned various names, such as “intergenic RNA,” “long non-coding RNAs” etc., which have led to some confusion in the field. Many researchers believe that these transcripts represent a vast, unchartered world of functional non-coding RNAs (ncRNAs), simply because they exist. However, there are reasons to question this Panglossian view because it ignores our current understanding of how evolution shapes eukaryotic genomes and how the gene expression machinery works in eukaryotic cells. Although there are undoubtedly many more functional ncRNAs yet to be discovered and characterized, it is also likely that many of these transcripts are simply junk. Here, we discuss how to determine whether any given ncRNA has a function. Importantly, we advocate that in the absence of any such data, the appropriate null hypothesis is that the RNA in question is junk.

Here's the problem. Not only do the Ghent scientists make the mistake of equating transcripts with genes, they also completely ignore the controversy. They do not reference the Palazzo and Lee paper in their list of 90(!) references, nor do they reference any other papers that question the functionality of noncoding transcripts.

This is not right. Is it possible that they are completely unaware of the controversy in their own field? Or is there another explanation?2

I'm reminded of something said by one of my favorite scientists when discussing "cargo cult science."

Details that could throw doubt on your interpretation must be given, if you know them. You must do the best you can — if you know anything at all wrong, or possibly wrong — to explain it. If you make a theory, for example, and advertise it, or put it out, then you must also put down all the facts that disagree with it, as well as those that agree with it.

      Richard Feynman "Cargo Cult Science"

The essence of cargo cult science is a complete lack of skepticism and an unwillingness to consider any explanation other than the one preferred by the cult.


1. The title of this post is a take-off on How They Brought the Good News from Ghent to Aix. I'm aware that the citizens of Gent are mostly Flemish and prefer the spelling "Gent." I'm only using the English version of the name because that's the one used in the preprint.

2. I'm normally a fan of Hanlon's razor but there are times when stupidity just doesn't seem to be the right answer.

Of mice and Michael

Michael Behe has published a book containing most of his previously published responses to critics. I was anxious to see how he dealt with my criticisms of The Edge of Evolution but I was disappointed to see that, for the most part, he has just copied excerpts from his 2014 blog posts (pp. 335-355).

I think it might be worthwhile to review the main issues so you can see for yourself whether Michael Behe really answered his critics as the title of his most recent book claims. You can check out the dueling blog posts at the end of this summary to see how the discussion evolved in real time more than four years ago.

Many Sandwalk readers participated in the debate back then and some of them are quoted in Behe's book although he usually just identifies them as commentators.

My Summary

Michael Behe has correctly identified an extremely improbable evolutionary event; namely, the development of chloroquine resistance in the malaria parasite. This is an event that is close to the edge of evolution, meaning that more complex events of this type are beyond the edge of evolution and cannot occur naturally. However, several of us have pointed out that his explanation of how that event occurred is incorrect. This is important because he relies on his flawed interpretation of chloroquine resistance to postulate that many observed events in evolution could not possibly have occurred by natural means. Therefore, god(s) must have created them.

In his response to this criticism, he completely misses the point and fails to understand that what is being challenged is his misinterpretation of the mechanisms of evolution and his understanding of mutations.


The main point of The Edge of Evolution is that many of the beneficial features we see could only have evolved by selecting for a number of different mutations where none of the individual mutations confer a benefit by themselves. Behe claims that these mutations had to occur simultaneously or at least close together in time. He argues that this is possible in some cases but in most cases the (relatively) simultaneous occurrence of multiple mutations is beyond the edge of evolution. The only explanation for the creation of these beneficial features is god(s).

An important part of The Edge is defining the edge and this is where he discusses the development of chloroquine resistance in the malaria parasite, Plasmodium falciparum. He relies heavily on a guesstimate made by Nicholas White some years ago. Here’s how Behe describes it [An Open Letter to Kenneth Miller and PZ Myers].

... considering the number of cells per malaria patient (a trillion), times the number of ill people over the years (billions), divided by the number of independent events (fewer than ten) — the development of chloroquine-resistance in malaria is an event of probability about 1 in 10^20 malaria-cell replications.
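The guesstimate above is pure order-of-magnitude arithmetic, and it can be written out explicitly (the round numbers below are the ones quoted, not measured values):

```python
# White's guesstimate as Behe uses it: orders of magnitude only.
cells_per_patient = 1e12     # "a trillion" parasite cells per malaria patient
patients = 1e9               # "billions" of ill people over the years
independent_origins = 10     # "fewer than ten" independent resistance events

replications_per_event = cells_per_patient * patients / independent_origins
print(f"about 1 resistance event per {replications_per_event:.0e} replications")
```

Any of these inputs could be off by an order of magnitude or so without changing the conclusion that resistance arises roughly once per 10^20 cell replications.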

None of us have a serious problem with this guesstimate but several of us have objected to the way Behe interprets it. Here's how he thinks chloroquine resistance evolves.

From the sequence and laboratory evidence it’s utterly parsimonious and consistent with all the data — especially including the extreme rarity of the origin of chloroquine resistance — to think that a first, required mutation to PfCRT is strongly deleterious while the second may partially rescue the normal, required function of the protein, plus confer low chloroquine transport activity. Those two required mutations — including an individually deleterious one which would not be expected to segregate in the population at a significant frequency — by themselves go a long way (on a log scale, of course) to accounting for the figure of 1 in 10^20, perhaps 1 in 10^15 to 10^16 of it (roughly from the square of the point mutation rate up to an order of magnitude more than it).

He is assuming that the frequency of a single mutation is approximately 10^-8 and, since two mutations are required, they have to occur together within a very short time. For the sake of simplicity I'll say that the two mutations have to occur simultaneously, although Behe will quibble that this is not a strict requirement. The probability of these two mutations occurring simultaneously is 10^-8 × 10^-8 = 10^-16, or one in 10^16. This is what he means when he says that simultaneous mutations account for most of the observed frequency of chloroquine resistance.

The rest of Behe's argument is based on this explanation of chloroquine resistance. He argues that the simultaneous occurrence of two different mutations is required to achieve a beneficial effect, and this is possible in single-cell organisms like Plasmodium because they have huge populations. The probabilities drop significantly if more than two mutations are required and/or the populations are much smaller, as in humans. He claims that in humans and other species there are many observed examples of beneficial effects where multiple mutations are necessary. Given the upper limit seen in chloroquine resistance, it would be impossible to evolve such beneficial effects because they are beyond the edge of evolution.

Ken Miller and PZ Myers tried to explain to Behe that his assumption of simultaneous mutations is unnecessary and probably incorrect. The observed frequency of chloroquine resistance is compatible with other explanations, including explanations assuming that some of the required mutations are neutral and fixed by random genetic drift. Behe didn't seem to understand why this is important because in the post quoted above he says, "What’s puzzling to me is your thinking the exact route to resistance matters much when the bottom line is that it’s an event of probability 1 in 10^20." But, in fact, it matters a great deal since Behe's argument depends on his explanation being correct.

He then doubles down on his explanation ....

It’s also entirely reasonable shorthand to characterize such a situation as needing "simultaneous" or "concurrent" mutations, as has been done by others in the malaria literature, even if the second mutation actually occurs separately in the recent progeny of some sickly, rare cell that had already suffered the first, harmful mutation. Guys, please don’t hide behind some dictionary or Einsteinian definition of "simultaneous." It matters not a whit to the practical bottom line. If you think it does, don’t just wave your hands, show us your calculations.

This is where I stepped in to answer Behe's challenge and try to come up with calculations based on realistic assumptions. I began with the results of Summers et al. (2014), who showed that many chloroquine-resistant lines of Plasmodium had multiple different mutations (usually four) that contributed to the resistance phenotype. Several of these mutations were effectively neutral and were segregating in the populations long before the parasites ever encountered chloroquine. Another complication is that you have to account for fixation, or at least an increase in frequency, by random genetic drift, so you can't just assume that the observed frequency (10^-20) is mostly due to mutation alone.

I also pointed out that the real mutation rate is probably closer to 10^-10, which makes the probability of two simultaneous mutations even more unlikely. This really confused Behe, who thought that I was making an even stronger case for his explanation when, in fact, I was challenging it.
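The point is easy to see by running the two-simultaneous-mutation arithmetic at both rates (these are illustrative round numbers, as in the discussion above):

```python
# Probability of two specific mutations arising in the same replication,
# under Behe's assumed rate (1e-8) and a more realistic rate (1e-10).
for rate in (1e-8, 1e-10):
    double_mutant = rate * rate
    print(f"per-site rate {rate:.0e} -> double mutant ~ {double_mutant:.0e}")
```

At 10^-10 the simultaneous-mutation route alone predicts 10^-20, consuming the entire observed frequency and leaving no room for drift, population structure, or the additional mutations Summers et al. identified, which is why a lower rate undermines rather than strengthens Behe's explanation.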

There are lots of other complications and many unknown variables. Some of these were discussed in the comments under my post(s) and others were discussed in the Summers et al. paper. I concluded that it's quite possible to explain some versions of chloroquine resistance without ever assuming that two simultaneous mutations are required and without ever assuming that one of the mutations is deleterious.

Thus, I believe I answered Michael Behe's challenge to Ken Miller and PZ Myers as he (Behe) described it. I can provide estimates but not precise calculations; then again, neither can Behe. Here's the Behe challenge.

If you folks think that [my] direct, parsimonious, rather obvious route to 1 in 10^20 isn’t reasonable, go ahead, calculate a different one, then tell us how much it matters, quantitatively. Posit whatever favorable or neutral mutations you want. Just make sure they’re consistent with the evidence in the literature (especially the rarity of resistance, the total number of cells available, and the demonstration by Summers et al. that a minimum of two specific mutations in PfCRT is needed for chloroquine transport). Tell us about the effects of other genes, or population structures, if you think they matter much, or let us know if you disagree for some reason with a reported literature result.

Not only did I (we) meet the challenge of providing a reasonable alternative route, we also showed why Behe's route to 10^-20 is unreasonable. Naturally, Behe didn't like this and he responded on that well-known scientific website, Evolution News and Views, where comments are forbidden. These are the responses that he incorporated into his latest book.

Behe's first attempt was extremely disingenuous. He agreed that we could provide alternative routes to 10^-20 by postulating neutral, or nearly-neutral, mutations that did not have to occur simultaneously. In other words, we met the challenge, but he dismisses this in his blog post [Laurence Moran’s Sandwalk Evolves Chloroquine Resistance] by saying ...

The bottom line is that numbers can be tweaked and a few different scenarios can be floated, but there’s no escaping the horrendous improbability of developing chloroquine resistance in particular, or of getting two required mutations for any biological feature in general, most especially if intermediate mutations are disadvantageous. If a (selectable) step has to be skipped, the wind goes out of Darwin’s sails.

This is disingenuous because what we showed was that Behe's simple calculation using two simultaneous mutations doesn't take into account lots of variables that he simply ignored. We also showed a reasonable pathway to chloroquine resistance based on single mutations occurring in a background of a population with lots of variation due to segregating nearly-neutral alleles.

Effects that require only two mutations will be common if the first one is effectively neutral (or nearly-neutral), and that's the real lesson of chloroquine resistance. That's not a lesson that Behe likes because he wants to argue that it's difficult to get "two required mutations for any biological feature in general."

To a Mouse

The best laid schemes o' mice and men

Gang aft agley

Robert Burns

I do not deny that the observed routes to chloroquine resistance were highly improbable (10^-20), but I account for this low probability by trusting the results of Summers et al. (2014), who showed that four separate mutations were required for effective chloroquine resistance and that the mutations had to occur in a particular order. In addition, there are several other factors that contribute to the low overall probability; the most important is the demonstration that this particular combination of mutations is probably the only possible route to resistance.

Michael Behe tries really hard to counter my objections, but as so often happens his best laid schemes go agley (awry). He never really grasps the objections to the chloroquine frequency data: it's not that we disagree with the frequency of chloroquine resistance (about 10^-20), it's that we disagree with his explanation of that frequency. That's why I was surprised to see him admit defeat but then claim that it didn't matter [How Many Ways Are There to Win at Sandwalk?]

...it matters not a whit for the prospects of Darwinian theory whether the pathway consists of two required mutations that are individually lethal to a cell and must occur strictly simultaneously (that is, in the exact same replication cycle), or whether it consists of several mutations each with moderately negative selection coefficients, or consists of, say, five required mutations that are individually neutral and segregating at some appreciable frequency in the population, or some other scenario or combination thereof. The bottom line for all of them is that the acquisition of chloroquine resistance is an event of statistical probability 1 in 10^20. It is the outlandish improbability of the pathway — not its particular features — that is the crux. It puts strong limits on what we can expect from Darwinian processes. And that is an important point for any biologist — whether in a medical field or not — to appreciate.

I can assure you that we all agree with Michael Behe that the development of chloroquine resistance in Plasmodium falciparum is an extremely rare event that puts strong limits on what we can expect of evolution. That's not the issue: the issue is his incorrect interpretation of that observation and, subsequently, his conclusion that events requiring two or more mutations in humans are beyond the reach of evolution by natural means.

You have to read The Edge of Evolution carefully to see how he develops his argument. For example, he says on page 112 ...

Suppose, however, that the first mutation wasn't a net plus; it was harmful. Only when both mutations occurred together was it beneficial. Then on average a person born with the mutation would leave fewer offspring than otherwise. The mutation would not increase in the population, and evolution would have to skip a step for it to take hold, because nature would need both necessary mutations at once. ... The Darwinian magic works well only when intermediate steps are each better ("more fit") than preceding steps, so that the mutant gene increases in number in the population as natural selection favors the offspring of people who have it. Yet its usefulness quickly declines when intermediate steps are worse than earlier steps, and it is pretty much worthless if several intervening steps aren't improvements.

Behe is trying to convince his readers that evolution ("Darwinism") can only work if every single step in a pathway is beneficial. There are a few exceptions, such as chloroquine resistance, but these are such low probability events that they can only occur in species with huge population sizes and short generation times. He then goes on to propose the "Two-Binding-Sites" rule, which states that it is very unlikely that a new protein-protein interaction could arise by chance if two mutations are required for binding, and that it is beyond the edge of evolution for complexes of three or more interactions to evolve.

In short, complexes of just three or more different proteins are beyond the edge of evolution. They are lost in shape space.

And the great majority of proteins in the cell work in complexes of six or more. Far beyond that edge. (p. 135)


The immediate, most important implication is that complexes with more than two different binding sites—ones that require three or more different kinds of proteins—are beyond the edge of evolution, past what is biologically reasonable to expect Darwinian evolution to have accomplished in all of life in all of the billion-year history of the world. (p. 146)

... With the criterion of two protein-protein binding sites, we can quickly see why stupendously complex structures such as the cilium, the flagellum, and the machinery that builds them are beyond Darwinian evolution.

At the risk of beating a dead horse, allow me to state the obvious, once again. Behe's entire argument in The Edge of Evolution is founded on a false understanding of evolution because he assumes that most of evolution requires a "Darwinian" mechanism where each new mutation has to confer a beneficial effect. This allows for the sequential evolution of effects with multiple mutations. However, according to Behe, most effects require several mutations where one of the steps is deleterious and those effects are beyond the edge of evolution. Here's how he summarizes the argument ...

Although two or three missing steps [mutations] doesn't sound like much, that's one or two more Darwinian jumps than were required to get chloroquine resistance in malaria. In Chapter 3 I dubbed that level a "CCC," a "chloroquine-complexity cluster," and I showed that its odds were 1 in 10^20 births. In other words (keeping in mind the roughness of the calculation):

Generating a single new cellular protein-protein binding site is of the same order of difficulty or worse than the development of chloroquine resistance in the malarial parasite. (p. 134-135)

It is ridiculous to claim that creating a new protein-protein interaction is as difficult as a CCC.

That's because lots of spurious protein-protein interactions are already present inside the cell. These can be due to single mutations that are effectively neutral, especially in small populations where the effects of negative selection are diminished (drift-barrier hypothesis). It then takes only one more mutation to make the interaction strong enough to make a difference. This is the essential point in constructive neutral evolution. We tried to explain this to Behe using the development of chloroquine resistance, because that also involves neutral mutations, but he failed to understand why that pathway was relevant.
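The role of drift here is easy to demonstrate. The following toy Wright-Fisher simulation (a minimal sketch with made-up parameters; the function name and numbers are my own, not from any of the papers discussed) tracks a single neutral mutation introduced as one copy in a haploid population. Most runs lose the allele, but some runs carry it to appreciable frequency by drift alone, which is all a neutral first mutation needs to do for a second mutation to complete the pathway:

```python
import random

def neutral_drift(pop_size=1000, generations=2000, seed=1):
    """Toy Wright-Fisher model: track the frequency of one neutral
    mutation, starting from a single copy in a haploid population."""
    random.seed(seed)
    freq = 1 / pop_size
    trajectory = [freq]
    for _ in range(generations):
        # Binomial resampling: each of the pop_size offspring inherits
        # the mutant allele with probability equal to its current frequency.
        count = sum(random.random() < freq for _ in range(pop_size))
        freq = count / pop_size
        trajectory.append(freq)
        if freq in (0.0, 1.0):  # allele lost or fixed
            break
    return trajectory

traj = neutral_drift()
print(f"final frequency: {traj[-1]:.3f} after {len(traj) - 1} generations")
```

Running this with different seeds shows the point: no selection is acting, yet the allele's frequency wanders, and a fraction of runs end in fixation. A "spurious" neutral interaction can therefore be segregating at meaningful frequency long before the second mutation arrives.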

I'm disappointed that in this particular instance Behe did not answer his critics. What this means is that A Mousetrap for Darwin is Behe's fourth book and his fourth failed attempt to come up with a decent scientific argument against evolution.

A Key Inference of The Edge of Evolution Has Now Been Experimentally Confirmed July 14, 2014 (Michael Behe)
So, Michael Behe Was Right After All; What Will the Critics Say Now? July 16, 2014 (Casey Luskin)
Quote-mined by Casey Luskin! July 17, 2014 (PZ Myers)
An Open Letter to Kenneth Miller and PZ Myers July 21, 2014 (Michael Behe)
A Pretty Sharp Edge: Reflecting on Michael Behe's Vindication July 28, 2014 (Ann Gauger)
Michael Behe and the edge of evolution July 31, 2014 (Larry Moran)
Taking the Behe challenge! Aug. 1, 2014 (Larry Moran)
Laurence Moran's Sandwalk Evolves Chloroquine Resistance Aug. 13, 2014 (Michael Behe)
Flunking the Behe Challenge! Aug. 13, 2014 (Larry Moran)
CCC's and the edge of evolution Aug. 15, 2014 (Larry Moran)
How Many Ways Are There to Win at Sandwalk? Aug. 15, 2014 (Michael Behe)
Guide of the Perplexed: A Quick Reprise of The Edge of Evolution Aug. 20, 2014 (Michael Behe)
Understanding Michael Behe Aug. 22, 2014 (Larry Moran)
Drawing My Discussion with Laurence Moran to a Close Aug. 26, 2014 (Michael Behe)
Michael Behe's final thoughts on the edge of evolution Aug. 27, 2014 (Larry Moran)


Summers, R.L., Dave, A., Dolstra, T.J., Bellanca, S., Marchetti, R.V., Nash, M.N., Richards, S.N., Goh, V., Schenk, R.L., Stein, W.D., Kirk, K., Sanchez, C.P., Lanzer, M. and Martin, R. (2014) "Diverse mutational pathways converge on saturable chloroquine transport via the malaria parasite’s chloroquine resistance transporter." Proceedings of the National Academy of Sciences 111: E1759-E1767. [doi: 10.1073/pnas.1322965111]

Why is the Central Dogma so hard to understand?

The Central Dogma of molecular biology states ...

... once (sequential) information has passed into protein it cannot get out again (F.H.C. Crick, 1958).

The central dogma of molecular biology deals with the detailed residue-by-residue transfer of sequential information. It states that such information cannot be transferred from protein to either protein or nucleic acid (F.H.C. Crick, 1970).

This is not difficult to understand since Francis Crick made it very clear in his original 1958 paper and again in his 1970 paper in Nature [see Basic Concepts: The Central Dogma of Molecular Biology]. There's nothing particularly complicated about the Central Dogma. It merely states the obvious fact that sequence information can flow from nucleic acid to protein but not the other way around.
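Crick's 1970 paper literally tabulated the nine conceivable residue-by-residue transfers. A minimal sketch of that table (the function and set names are my own, but the classification is Crick's) makes it obvious that the Central Dogma forbids only transfers *out of* protein:

```python
# The nine conceivable transfers, classified as in Crick (1970):
# "general" transfers occur in all cells, "special" transfers occur
# under particular circumstances, and the remaining three are the
# ones the Central Dogma rules out.
GENERAL = {("DNA", "DNA"), ("DNA", "RNA"), ("RNA", "protein")}
SPECIAL = {("RNA", "RNA"), ("RNA", "DNA"), ("DNA", "protein")}
FORBIDDEN = {("protein", "protein"), ("protein", "RNA"), ("protein", "DNA")}

def allowed(source, target):
    """The Central Dogma only forbids transfers out of protein."""
    return (source, target) not in FORBIDDEN

assert allowed("RNA", "DNA")          # reverse transcription: permitted
assert not allowed("protein", "RNA")  # ruled out by the Central Dogma
```

Note that RNA → DNA (reverse transcription) sits comfortably in the "special" set: its discovery did not violate the Central Dogma, contrary to another common misconception.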

So, why do so many scientists have trouble grasping this simple idea? Why do they continue to misinterpret the Central Dogma while quoting Crick? It seems obvious that they haven't read the paper(s) they are referencing.

I just came across another example of such ignorance and it is so outrageous that I just can't help sharing it with you. Here are a few sentences from a recent review in the 2020 volume of the Annual Review of Genomics and Human Genetics (Zerbino et al., 2020).

Once the role of DNA was proven, genes became physical components. Protein-coding genes could be characterized by the genetic code, which was determined in 1965, and could thus be defined by the open reading frames (ORFs). However, exceptions to Francis Crick's central dogma of genes as blueprints for protein synthesis (Crick, 1958) were already being uncovered: first tRNA and rRNA and then a broad variety of noncoding RNAs.

I can't imagine what the authors were thinking when they wrote this. If the Central Dogma actually said that the only role for genes was to make proteins then surely the discovery of tRNA and rRNA would have refuted the Central Dogma and relegated it to the dustbin of history. So why bother even mentioning it in 2020?


Crick, F.H.C. (1958) On protein synthesis. Symp. Soc. Exp. Biol. XII:138-163. [PDF]

Crick, F. (1970) Central Dogma of Molecular Biology. Nature 227, 561-563. [PDF file]

Zerbino, D.R., Frankish, A. and Flicek, P. (2020) "Progress, Challenges, and Surprises in Annotating the Human Genome." Annual Review of Genomics and Human Genetics 21:55-79. [doi: 10.1146/annurev-genom-121119-083418]