History of the word ‘data’

Sandra Rendgen describes the history of the word “data” and where it stands today.

All through the evolution of statistics through the 19th century, data was generated by humans, and the scientific methodology of measuring and recording data had been a constant topic of debate. This is not trivial, as the question of how data is generated also answers the question of whether and how it is capable of delivering a “true” (or at least “approximated”) representation of reality. The notion that data begins to exist when it is recorded by the machine completely obscures the role that human decisions play in its creation. Who decided which data to record, who programmed the cookie, who built the sensor? And more broadly – what is the specific relationship of any digital data set to reality?

Oh, so there's more to it than just singular versus plural.

He was once a prominent cancer researcher. Then his gambling — and a finding of scientific misconduct — got in the way.

In September 2014, an investigation into the work of an award-winning cancer researcher in Illinois concluded that multiple papers had been affected by misconduct. Now, nearly four years later, two of those articles have been retracted. What happened in the intervening years reveals a complicated and at times bizarre story involving not only scientific misconduct, …

Press release from the Francis Crick Institute misrepresents junk DNA

Press releases have become a serious problem. I'm frequently upset whenever I read a press release covering a field I'm familiar with. They rarely do a good job of explaining what's actually in the paper and putting it into the proper context. The people who write press releases are more concerned with sensationalizing the work than they are with teaching the general public about how science works. They often do this with the blessing and participation of the scientists who did the work.

Let me illustrate the problem using a recent example from the Francis Crick Institute in London, UK [Non-coding DNA changes the genitals you're born with]. The press release covers a recent Science paper from the Lovell-Badge lab ....
Gonen, N., Futtner, C.R., Wood, S., Garcia-Moreno, S.A., Salamone, I.M., Samson, S.C., Sekido, R., Poulat, F., Maatouk, D.M., and Lovell-Badge, R. (2018) Sex reversal following deletion of a single distal enhancer of Sox9. Science. [doi: 10.1126/science.aas9408]
These workers discovered and characterized a regulatory region upstream of the mouse Sox9 gene. The Sox9 gene controls testis development, and deletion of the regulatory region reduces the level of Sox9 gene expression, leading to XY individuals that are phenotypically female.

We have known about regulatory DNA for more than 50 years so the paper doesn't make any contribution to our general understanding of transcriptional regulation. In fact, it fits right in with decades of work on enhancers, promoters, and transcription factors.

You wouldn't know that from reading the press release. Even the title of the article (Non-coding DNA changes the genitals you're born with) suggests that there's something unusual about noncoding DNA that has a function. This point is highlighted in the press release ...
Only 2% of human DNA contains the 'code' to produce proteins, key building blocks of life. The remaining 98% is 'non-coding' and was once thought to be unnecessary 'junk' DNA, but there is increasing evidence that it can play important roles.
This is 2018. Isn't it about time that science writers stopped spreading this fake news? There was never a time when knowledgeable scientists thought that all noncoding DNA was junk. Never.

Furthermore, we've had a pretty good understanding of regulatory DNA since the early 1980s. Think about what that means. It means that "increasing evidence" is a misleading way of saying that the basic facts have been known and understood for more than thirty years. Forty years ago you might have gotten away with saying that there's "increasing evidence" that noncoding DNA has a function, but not today.

Is this just sloppy science written by an employee who really doesn't understand the history of gene expression and genome composition? No, it isn't just ignorance on the part of the press office, because they have the support of the lead author on the paper, a postdoc named Nitzan Gonen. She is quoted in the press release ...
Dr Nitzan Gonen, first author of the paper and postdoc at the Crick, says: "Typically, lots of enhancer regions work together to boost gene expression, with no one enhancer having a massive effect. We identified four enhancers in our study but were really surprised to find that a single enhancer by itself was capable of controlling something as significant as sex."

"Our study also highlights the important role of what some still refer to as 'junk' DNA, which makes up 98% of our genome. If a single enhancer can have this impact on sex determination, other non-coding regions might have similarly drastic effects. For decades, researchers have looked for genes that cause disorders of sex development but we haven't been able to find the genetic cause for over half of them. Our latest study suggests that many answers could lie in the non-coding regions, which we will now investigate further."
Here's a better way of explaining the significance of this paper.
The opening sentence of the paper says, "The regulation of genes with important roles in embryonic development can be complex, involving multiple, often redundant enhancers, repressors, and insulators."

These regulatory elements are usually found near the genes they regulate and they represent an important part of the genome. This study identifies a regulatory element that controls the Sox9 gene in mice. Defects in regulatory elements are known to cause genetic disorders and it has long been suspected that disorders of sex development are also due to mutations in regulatory elements. This study identifies an important regulatory element that controls sex development and demonstrates that mutations in this element cause sex development disorders.
There's no mention of "noncoding DNA" in the paper and no mention of junk DNA. That's because nobody is surprised to find regulatory elements that aren't in coding exons. Nobody who reads the paper is going to be surprised to learn that noncoding DNA has a function even though they understand that 90% of our genome is junk. Why can't the press release make this clear to the general reader? Why can't the authors make sure the press release accurately represents the published report?

Finding the best Mario Kart character, statistically speaking

Henry Hinnefeld answers the age-old debate of which Mario Kart character is best, using data as his guide.

Some people swore by zippy Yoshi, others argued that big, heavy Bowser was the best option. Back then there were only eight options to choose from; fast forward to the current iteration of the Mario Kart franchise and the question is even more complicated because you can select different karts and tires to go with your character. My Mario Kart reflexes aren’t what they used to be, but I am better at data science than I was as a fourth grader, so in this post I’ll use data to finally answer the question “Who is the best character in Mario Kart?”

For me, it doesn't matter. You will smoke me regardless of which character I have, because I am the world's worst video game player.

