Thesis defense: 50th anniversary

Today is the 50th anniversary of my Ph.D. oral defense. The event took place in the Department of Biochemical Sciences at Princeton University back in 1974. It began with a departmental seminar. When the seminar was over I retired with my committee to a small classroom for the oral exam.

I don't remember everyone who was on my committee. My Ph.D. supervisor (Bruce Alberts) was there, as was my second reader, Abe Worcel. I know Uli Laemmli was there and so was Arnie Levine. I'm pretty sure the external member of the committee was Nancy Nossal from NIH in Bethesda, MD (USA). It's a bit of a blur after all these years.

I remember being fairly confident about the exam. After five and a half years I was pretty sure that everyone on my committee wanted to get rid of me and the easiest way to do that was to let me pass. Bruce stood to gain $3000 per year of research money and Uli was going to get back the basement of his house where I had been living for the past month after getting kicked out of the married graduate students housing project for taking too long to complete my thesis.

The toughest questions were from Uli Laemmli, which should not come as a surprise to anyone who knows him. He has this annoying habit of expecting people to understand the basic physics and chemistry behind the biochemical sciences. Fortunately, my inability to answer most of his questions didn't deter him from voting to pass me.


My father on D-Day: 75 years ago

Today is the 75th anniversary of D-Day, the day British, Canadian, and American troops landed on the beaches of Normandy. [1]

For us baby boomers it always meant a day of special significance for our parents. In my case, it was my father who took part in the invasion. That's him on the right as he looked in 1944. He was an RAF pilot flying rocket-firing Typhoons in close support of the ground troops. His missions were limited to quick strikes and reconnaissance during the first few days of the invasion because Normandy was at the limit of their range from southern England. During the second week of the invasion (June 14th) his squadron landed in Crepon, Normandy, and things became very hectic from then on, with several close support missions every day [see Hawker Hurricanes and Typhoons in World War II].


I have my father's log book and here are the pages from June 1944 (below). The red letters on June 6 say "DER TAG." It was his way of announcing D-Day. On the right it says "Followed SQN across channel. Saw hundreds of ships ... jumped by 190s. LONG AWAITED 2nd FRONT IS HERE." Later that day they shot up German vehicles south-east of Caen where there was heavy fighting by British and Canadian troops. The next few weeks saw several sorties over the Allied lines. These were mostly attack missions using rockets to shoot up German tanks, vehicles, and trains.


The photograph on the right shows a crew loading rockets onto a Typhoon based just a few kilometers from the landing beaches in Normandy. You can see from the newspaper clipping in my father's log book that his squadron was especially interested in destroying German headquarters units and they almost got Rommel. It was another RAF squadron that wounded Rommel on July 17th.

The colorized photo on the left is my father in his Typhoon.

The log book entry (above) for June 10th says, "Wizard show. Recco area at 2000' south west of Caen F/S Moore and self destroyed 2 flak trucks, 2 arm'd trucks, and 1 arm'd command vehicle, Every vehicle left burning but one. Must have been a divisional headquarters? No casualties."

Here's another description of that rocket-firing Typhoon raid [Air Power Over the Normandy Beaches and Beyond].
Intelligence information from ULTRA set up a particularly effective air strike on June 10. German message traffic had given away the location of the headquarters of Panzergruppe West on June 9, and the next evening a mixed force of forty rocket-armed Typhoons and sixty-one Mitchells from 2 TAF struck at the headquarters, located in the Chateau of La Caine, killing the unit's chief of staff and many of its personnel and destroying fully 75 percent of its communications equipment as well as numerous vehicles. At a most critical point in the Normandy battle, then, the Panzer group, which served as a vital nexus between operating armored forces, was knocked out of the command, control, and communications loop; indeed, it had to return to Paris to be reconstituted before resuming its duties a month later.

My father was awarded the Distinguished Flying Cross (DFC) for his efforts during the war.

(This article was first posted on June 6, 2014.)


1. The British landed at Sword Beach and Gold Beach, the Canadians at Juno Beach, and American troops landed at Omaha and Utah Beaches.

Most popular Sandwalk posts of 2018

Blogging was light last year because I was busy with other things and because the popularity of blogs is declining rapidly. The most popular post, based on the number of views, garnered only 9229 views, which is more than the most popular post of 2017 but only half as many as the most popular post of 2016. The post with the most comments (53) has roughly a tenth as many comments as posts from a few years ago, but that's partly because more people are commenting on Facebook and because I'm restricting blog comments in various ways.

Here's the most popular post by total views. It attracted a number of people who attempted, rather unsuccessfully, to defend evolutionary psychology.
Is evolutionary psychology a deeply flawed enterprise?

You may disagree with these criticisms of evolutionary psychology but there's no denying that the discipline is under attack. In fact, it's hard to think of any other academic discipline whose fundamental validity is being questioned so openly.
The post with the most comments generated a lively discussion about Neutral Theory and Nearly-Neutral Theory and I learned a lot.

Celebrating 50 years of Neutral Theory
The journal Molecular Biology and Evolution has published a special issue: Celebrating 50 years of the Neutral Theory. The key paper published 50 years ago was Motoo Kimura's paper on "Evolutionary rate at the molecular level" (Kimura, 1968) followed shortly after by a paper from Jack Lester King and Thomas Jukes on "Non-Darwinian Evolution" (King and Jukes, 1969).
One of my posts that took a lot of work is also one that I think is pretty informative.
How many protein-coding genes in the human genome?

There are many ways of predicting protein-coding genes using various algorithms that look for open reading frames. The software is notorious for overpredicting genes leading to many false positives and that's why every new genome sequence contains hundreds of so-called "orphan" genes that lack homologues in other species. When these predicted genes are examined more closely they turn out to be artifacts—they are not functional genes.
I figured out how to do pie charts and how to represent the overlapping categories of junk DNA (e.g. defective transposons within introns). You can see my first attempt in the figure at the top of the page but watch for an update in the post below.
What's In Your Genome? - The Pie Chart

This adds up to 8% of the genome. The remaining 92% is junk.

Most of the junk consists of: (1) very obvious examples of broken genes (pseudogenes 5%); (2) bits and pieces of transposon sequences that used to be capable of transposing but have mutated over time (45%); and (3) ancient viral sequences that have degenerated (9%). That's 59% of the genome that's clearly junk DNA. In addition, there's plenty of evidence that most intron sequences are dispensable. That accounts for another 28% of the genome. The total amount of junk DNA is at least 87%.

Note that protein-coding genes take up about 23% of the genome (1% exons, 22% introns). Genes for functional noncoding RNAs take up an additional 7% of the genome (1% exons, 6% introns). (Much of the functional region of noncoding RNA genes consists of 300 copies of ribosomal RNA genes (0.4%).) The important point is that roughly 30% of the genome is genes when we define a gene as a DNA sequence that's transcribed. A lot of this is junk within introns.
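
The percentages quoted above are easy to check by hand, but a short tally makes the bookkeeping explicit. This is only a sketch of the arithmetic: the variable names are mine, and the numbers are the ones given in the excerpt. Note that the 28% attributed to introns appears to correspond to the sum of the two intron figures (22% + 6%).

```python
# Percentages of the human genome as quoted in the excerpt above.
# The dictionary keys and variable names are illustrative labels only.
clearly_junk_parts = {
    "pseudogenes": 5,
    "degenerate transposon sequences": 45,
    "degenerate viral sequences": 9,
}
intron_junk = 28  # "most intron sequences are dispensable"

clearly_junk = sum(clearly_junk_parts.values())
minimum_junk = clearly_junk + intron_junk

# Gene territory, defining a gene as a DNA sequence that's transcribed.
protein_coding = {"exons": 1, "introns": 22}
noncoding_rna = {"exons": 1, "introns": 6}
gene_total = sum(protein_coding.values()) + sum(noncoding_rna.values())
intron_total = protein_coding["introns"] + noncoding_rna["introns"]

print(f"clearly junk: {clearly_junk}%")
print(f"junk including introns: at least {minimum_junk}%")
print(f"genes (exons + introns): {gene_total}%")
print(f"introns within genes: {intron_total}%")
```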


Most popular Sandwalk posts of 2017

I was looking at some of my posts from the past few years and wondered which ones were the most popular. I had previously identified the most popular post of 2016 but not the most popular ones from 2017 so here they are.

The one with the most views (7481) is a link to a video by Michio Kaku who tells us that humans have stopped evolving [Another physicist teaches us about evolution].

The one with the most comments (259) is a post about my attempts to teach a creationist about glycolysis and evolution [Trying to educate a creationist (Otangelo Grasso)].

The post that I'm most proud of is: Historical evolution is determined by chance events


My DNA story

This is the latest update from Ancestry.com. Their algorithms are getting better and better. This corresponds very closely to what I know of my ancestors.



Who wants “A Sad Case: Owen vs Huxley” pamphlet and a possible Darwin letter?

A friend has a neighbor who's in possession of a pamphlet from 1863 on the Owen vs Huxley debate. The text of the pamphlet is here: A Report of A SAD CASE, Recently tried before the Lord Mayor, OWEN versus HUXLEY, In which will be found fully given the Merits of the great Recent BONE CASE. A photocopy of the pamphlet is shown below along with a possible letter from Charles Darwin (I have not authenticated the letter).

The owners are willing to donate the material to a worthy cause, preferably a museum if it's valuable. Does anyone know of a worthy home?










I’m going to a birthday party!

It's Bruce Alberts' 80th birthday party in San Francisco. There will be food, wine, cake, and (probably) dancing but first you go to the symposium on education.


Bruce Alberts’ 80th Birthday Gathering and Symposium

Saturday, April 14
Symposium on Science Education and Science Policy in Honor of Bruce Alberts’ 80th Birthday
(At the Metropolitan Club, 640 Sutter St., San Francisco 94102)

9a Guests arrive and register

10a Introduction by Master of Ceremonies Gregor Eichele

10:10a Session 1 How do we convey the importance of science to the public?
Moderator: Maureen Munn
Panelists: Janet Coffey, Will Colglazier, Janet English, Caroline Kiehle

11:40a Break

12p Buffet Lunch served in the Garden Room

1:30p Session 2 Innovations in Teaching and Learning in Higher Education
Moderators: Doug Kellogg and Kimberly Tanner.
Panelists: Judy Miner, Sally Pasion (one more panelist TBA)

2:30p Coffee and tea break

3p Session 3 Challenges Facing the Next Generation of Scientists
Moderators: Cynthia Fuhrmann and Bill Theurkauf.
Panelists: Marc Kirschner, Barry Selick, Nolan Sigal

4p Break

4:30p Session 4 Science Policy
Moderators: Mary Maxon and Jason Rao
Panelists: Bill Colglazier, Haile Debas, Donna Riordan, Keith Yamamoto

5:30p Elaine Bearer’s Duet for clarinet and viola: “Replication Machine”

6:15p Reception at Metropolitan Club Bar (4th Floor)

7p Buffet Dinner (Metropolitan Club Main Dining Hall — 4th Floor) Ending at 9:30p.

Sunday, April 15

10a - 2p Drop-in Brunch for all hosted at Beth Alberts’ home


Photo: Bruce Alberts with his first three graduate students: Glenn Herrick (right), Keith Yamamoto (left), Larry Moran (middle right), Bruce Alberts (middle left).

Happy Darwin Day 2018!

Charles Darwin, the greatest scientist who ever lived, was born on this day in 1809 [Darwin still spurs tributes, debates] [Happy Darwin Day!] [Darwin Day 2017]. Darwin is mostly famous for two things: (1) he described and documented the evidence for evolution and common descent and (2) he provided a plausible scientific explanation of evolution, the theory of natural selection. He put all this in a book, On the Origin of Species by Means of Natural Selection, published in 1859, a book that spurred a revolution in our understanding of the natural world.

Modern evolutionary theory has advanced well beyond Darwin's theory but he still deserves to be honored for being the first to explain evolution and promote it in a way that convinced others. Here's one passage from the introduction to Origin of Species.
Although much remains obscure, and will long remain obscure, I can entertain no doubt, after the most deliberate and dispassionate study of which I am capable, that the view which most naturalists entertain, and which I formerly entertained—namely, that each species has been independently created—is erroneous. I am fully convinced that species are not immutable; but that those belonging to what are called the same genera are lineal descendants of some other and generally extinct species, in the same manner as the acknowledged varieties of any one species are the descendants of that species. Furthermore, I am convinced that Natural Selection has been the main but not exclusive means of modification.


What’s in Your Genome?: Chapter 4: Pervasive Transcription (revised)

I'm working (slowly) on a book called What's in Your Genome?: 90% of your genome is junk! The first chapter is an introduction to genomes and DNA [What's in Your Genome? Chapter 1: Introducing Genomes]. Chapter 2 is an overview of the human genome. It's a summary of known functional sequences and known junk DNA [What's in Your Genome? Chapter 2: The Big Picture]. Chapter 3 defines "genes" and describes protein-coding genes and alternative splicing [What's in Your Genome? Chapter 3: What Is a Gene?].

Chapter 4 is all about pervasive transcription and genes for functional noncoding RNAs. I've finally got a respectable draft of this chapter. This is an updated summary—the first version is at: What's in Your Genome? Chapter 4: Pervasive Transcription.
Chapter 4: Pervasive Transcription

How much of the genome is transcribed?
The latest data indicates that about 90% of the human genome is transcribed if you combine all the data from all the cell types that have been analyzed. This is about the same percentage that was reported by ENCODE in their preliminary study back in 2007 and about the same percentage they reported in the 2012 papers. Most of the transcripts are present in less than one copy per cell. Most of them are only found in one or two cell types. Most of them are not conserved in other species.
How do we know about pervasive transcription?
There are several technologies that are capable of detecting all the transcripts in a cell. The most powerful is RNA-Seq, a technique that copies RNAs into cDNA then performs massively parallel sequencing ("next gen" sequencing) on all the cDNAs. The sequences are then matched back to the reference genome to see which parts of the genome were transcribed. The technique is capable of detecting concentrations of less than one transcript per cell.
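
The last step, matching sequences back to the reference to see which parts were transcribed, amounts to asking what fraction of positions is covered by at least one aligned read. Here's a minimal sketch of that calculation. It assumes the alignments have already been reduced to (start, end) coordinates; the function name and the toy numbers are hypothetical, and real pipelines do far more than this.

```python
def transcribed_fraction(genome_length, alignments):
    """Fraction of positions covered by at least one aligned read.

    alignments: iterable of (start, end) pairs, 0-based, end exclusive.
    Overlapping or adjacent intervals are merged so that no position
    is counted twice.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(alignments):
        if current_end is None or start > current_end:
            # Close the previous merged interval, if any, and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps or touches the current interval: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Toy example: a 100-base "genome" with three aligned reads.
toy_alignments = [(0, 10), (5, 20), (50, 60)]
print(transcribed_fraction(100, toy_alignments))
```

A single read is enough to mark a region as covered, so pooling reads from many cell types can only push the fraction up. It says nothing about how abundant any transcript is.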
Different kinds of noncoding RNAs
There are ribosomal RNAs, tRNAs, and a variety of unique RNAs like those that are part of RNase P, the signal recognition particle, etc. In addition there are six main classes of other noncoding RNAs in humans: small nuclear RNAs (snRNAs); small nucleolar RNAs (snoRNAs); microRNAs (miRNAs); short interfering RNAs (siRNAs); PIWI-interacting RNAs (piRNAs); and long noncoding RNAs (lncRNAs). There are many proven examples of functional RNAs in each of the main classes but there are also large numbers of putative members that may or may not be true functional noncoding RNAs.
        Box 4-1: Long noncoding RNAs (lncRNAs)
There are more than 100,000 transcripts identified as lncRNAs. Nobody knows how many of these are actually real functional lncRNAs and how many are just spurious transcripts. The best analyses suggest that fewer than 20,000 meet the minimum criteria for function and probably only a fraction of these are actually functional.
Understanding transcription
It's important to understand that transcription is an inherently messy process. Regulatory proteins and RNA polymerase initiation complexes will bind to thousands of sites in the human genome that have nothing to do with transcription of nearby genes.
        Box 4-2: Revisiting the Central Dogma
Many scientists and journalists believe that the discovery of massive numbers of noncoding RNAs overthrows the Central Dogma of Molecular Biology. They are wrong.
        Box 4-3: John Mattick proves his hypothesis?
John Mattick claims that the human genome produces tens of thousands of regulatory RNAs that are responsible for fine-tuning the expression of the protein-coding genes. He was given the 2012 Chen Award by the Human Genome Organization for "proving his hypothesis over the course of 18 years." He has not proven his hypothesis.
Antisense transcription
Some transcripts are complementary to the coding strand in protein-coding genes. This is consistent with spurious transcription to yield junk RNA but many workers have suggested functional roles for most of these antisense RNAs.
What the scientific papers don't tell you
There are hundreds of scientific papers devoted to proving that most newly discovered noncoding RNAs have a biological function. What they don't tell you is that most of these transcripts are present in concentrations that are inconsistent with function (<1 molecule per cell). They also don't tell you that conservation is the best measure of function and these transcripts are (mostly) not conserved. More importantly, the majority of these papers don't even mention the possibility that these transcripts could be junk RNA produced by spurious transcription. That's a serious omission because it means that science writers who report on this work are unaware of the controversy.
On the origin of new genes
Some scientists are willing to concede that most transcripts are just noise but they claim this is an adaptation for future evolution. The idea here is that the presence of these transcripts makes it easier to evolve new protein-coding genes. While it's true that such genes could evolve more readily in a genome full of noise and junk, this cannot be a reason for such a sloppy genome.
How do you determine function?
The best way to determine function is to take a single transcript and show that it has a demonstrable function. If you take a genomics approach, then the best way to narrow down the list is to concentrate on those transcripts that are present in sufficient concentrations and are conserved in related species. In the absence of evidence, the null hypothesis is junk.
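
The genomics approach described above is essentially a two-part filter: keep transcripts that are abundant enough and that are conserved. A rough sketch follows. The record type, the field names, and the example transcripts are all invented for illustration, and the cutoff of one copy per cell is an assumption taken from the "<1 molecule per cell" figure mentioned earlier. Passing the filter only makes a transcript a candidate; it doesn't demonstrate function.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    # Hypothetical record; real annotation formats carry much more detail.
    name: str
    copies_per_cell: float
    conserved: bool  # conserved in related species


def candidate_functional(transcripts, min_copies_per_cell=1.0):
    """Return transcripts that are both abundant enough and conserved.

    Everything that fails either test stays under the null hypothesis (junk).
    """
    return [
        t for t in transcripts
        if t.copies_per_cell >= min_copies_per_cell and t.conserved
    ]


# Invented examples, not real data.
examples = [
    Transcript("abundant_conserved", 50.0, True),
    Transcript("abundant_not_conserved", 12.0, False),
    Transcript("rare_conserved", 0.2, True),
    Transcript("rare_not_conserved", 0.05, False),
]

for t in candidate_functional(examples):
    print(t.name)
```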
Biochemistry is messy
We're used to the idea that errors in DNA replication give rise to mutations and mutations drive evolution. We're less used to the idea that all other biochemical processes have much higher error rates. This is true of highly specific enzymes and it's even more true of complex processes like transcription, RNA processing (splicing), and translation. The idea that transcription errors could give rise to spurious transcripts in large genomes is perfectly consistent with everything we know about such processes. In fact, it's inevitable that spurious transcripts will be common in such genomes.
        Box 4-4: The random genome project
Sean Eddy has proposed an experiment to establish a baseline level of spurious transcripts and to demonstrate that the null hypothesis is the best explanation for the majority of transcripts. He suggests that scientists construct a synthetic chromosome of random DNA sequences and insert it into a human cell line. The next step is to perform an ENCODE project on this DNA. He predicts that the methods will detect hundreds of transcription factor binding sites and transcripts.
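
To get a feel for why random DNA should produce hits, consider the simplest possible case: exact matches to one short motif on one strand of uniformly random sequence. This is not Eddy's proposed experiment, just a back-of-the-envelope sketch under those assumptions. The motif is an arbitrary string and the function names are mine. For a 6-letter motif in 1,000,000 bases the expected number of chance matches is (1,000,000 - 5)/4^6, or about 244.

```python
import random


def expected_chance_matches(sequence_length, motif_length):
    """Expected exact matches of one fixed motif on one strand,
    assuming each base is drawn independently and uniformly from ACGT."""
    positions = sequence_length - motif_length + 1
    return max(positions, 0) / 4 ** motif_length


def count_motif(sequence, motif):
    """Count occurrences of motif in sequence, including overlapping ones."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count


def random_dna(length, seed=0):
    """Generate uniform random DNA; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    motif = "ACGTAC"  # arbitrary 6-letter motif, not a real binding site
    sequence = random_dna(100_000)
    print("expected:", expected_chance_matches(len(sequence), len(motif)))
    print("observed:", count_motif(sequence, motif))
```

Real binding sites are short and degenerate, so the true background is likely higher than this exact-match estimate.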
Change your worldview
There are two ways of looking at biochemical processes within cells. The first imagines that everything has a function and cells are as fine-tuned and functional as a Swiss watch. The second imagines that biochemical processes are just good enough to do the job and there are lots of mistakes and sloppiness. The first worldview is inconsistent with the evidence. The second worldview is consistent with the evidence. If you are one of those people who think that cells and genomes are the products of adaptive excellence then it's time to change your worldview.