An Author Rights Perspective on Scientific Editors

  By Hilda Bastian   What should scientific editors be able to do well? We would all be able to agree easily on some basics. Last year, a group led by David Moher and colleagues

Black History Month: Mathematicians’ Powerful Stories

It was a turning point. The previous year, the US Civil Rights Act had passed. On 26 January 1969 in New Orleans, 17 African-American mathematicians gathered at the annual national mathematical meeting.

The abysmal response of the Salk Institute to accounts of gender discrimination in its midst

Last week news broke of a pair of lawsuits filed by two prominent female scientists alleging they had been subject to persistent gender discrimination by The Salk Institute, the storied independent research center in La Jolla, California, where they both work.

I obviously can’t speak to the validity of these specific charges – it’s not a trivial task to dissect the basis for the successes and failures of small numbers of individuals. But the accounts of Lundblad and Jones sound all too familiar: case studies of a system that we know from myriad individual stories and a bevy of rigorous studies to be systematically biased against women.

Lawsuits are complicated, obviously, and tend to bring out the worst in institutions. But, even given this, the responses of the Salk and its leaders to these charges have been incredibly disappointing.

In an initial statement issued on July 14th, the Salk coupled anodyne verbiage about their commitment to equality and diversity with a document listing “issues” with the careers of both Lundblad and Jones.

Amongst the Salk’s complaints were that, in the past decade, both Jones and Lundblad had “failed to publish a single paper in any of the most respected scientific publications (Cell, Nature and Science)” and that their annual productivity (measured in numbers of papers per year) was below the median of their colleagues.

There are so many things wrong with this statement it is hard to know where to begin. First, counting publications is a horrible way to measure someone’s contributions to science – many fantastic scientists publish slowly and carefully, and a lot of highly “productive” labs publish a large number of worthless papers. Even worse, attempting to equate a scientist’s value to the number of papers they have in Cell, Nature and Science (CNS) is pure bullshit. Everyone in science knows that getting papers into these journals is a brutally competitive lottery, based on a highly flawed system for projecting the quality and impact of a work, heavily impacted by the perceived sexiness of the topic (hence the referral by many scientists to these as “glam journals”). There are plenty of people – myself included – who think that the system of review and editorial selection at these journals does not lead to their publishing the best science – and to use this as the primary way of judging someone’s career is absurd.

There is also a deeply political aspect to getting papers into these journals, and many serious and outstanding scientists simply choose not to play the game. Crucially, exactly the same kind of “old boys club” effect that Jones and Lundblad cite as affecting their careers at the Salk also plays a role in selecting papers for these “top” journals. I will put aside the fact for now that this obsession with top journals by the Salk is perpetuating the toxic culture of the impact factor that many top Salk scientists (including its president) have derided. More directly relevant to this issue, in citing a poor record of CNS publications as the primary reason that Jones and Lundblad have not been rewarded as much as their male colleagues, the Salk is not strengthening its case – rather, it is confessing that it relies on a biased system to judge its scientists, precisely what Jones and Lundblad are alleging.

The Salk was pretty harshly – and justifiably – trashed for their stance over the ensuing few days, leading to a second statement from the Salk’s president Elizabeth Blackburn, which I repeat here in full:

I’m saddened that an institute as justly revered as the Salk Institute is being misrepresented by accusations of gender discrimination. Our stellar scientists, both female and male, hail from 46 countries around the world and all bring their unique and valuable perspective to our efforts to unravel biological mysteries and discover cures.

I am a female scientist. I have been successfully pursuing scientific research with passion and energy for my entire career. I am not blind to the history of a field that has, unfortunately and sometimes unconsciously, favored males. But I would never preside over an Institute that in any way condoned, openly or otherwise, the marginalizing of female scientists. The Salk Institute and some of the greatest female scientific minds in the world have always worked together for their mutual benefit and the benefit of humanity.

At every place where I have had a leadership voice—the World Economic Forum, the President’s Council, the American Society for Cell Biology, the American Association for Cancer Research, our nation’s prestigious universities, and many committees—I have emphasized diversity and inclusion. That’s an undebatable tenet of mine. Important biological research that is going to impact humanity and improve the condition of our people and our planet is difficult work. Thus we need the best minds in the world— regardless of race, gender or nationality—to help us discover solutions.

This is what we do at the Salk Institute and what we will continue to do: work together to help people live longer, healthier lives.

I have tremendous respect for Blackburn as a scientist and a person, and her words passionately defending diversity are nice. But, to be blunt, this statement is pathetic.

First of all, the fact that Blackburn emphasized diversity and inclusion in Davos or anywhere else is of no consequence. She is now the leader of an academic institution and what matters now is not words but tangible steps to eliminate discrimination at her institution. And the idea that she would “never preside over an Institute that in any way condoned, openly or otherwise, the marginalizing of female scientists” is risible. The marginalization of female and many other types of scientists is not a rare, isolated facet of specific institutions – it is an endemic, universal problem in science.

Almost by definition every leader of every institute is presiding over an organization that participates in the marginalization of women in science, because it is intrinsic to operating in the world we live in today. The Salk would have to be an unprecedentedly remarkable place if it were free of gender and other forms of discrimination. The question for Blackburn and other scientific leaders is not whether they condone discrimination, it is whether they are willing to confront the fact that it unequivocally does exist AT THEIR INSTITUTION, whether they endeavor to recognize the specific ways it manifests AT THEIR INSTITUTION and whether they use their leadership position to take tangible action to eliminate it AT THEIR INSTITUTION.

Instead of doing any of this, Blackburn would have us believe that any assertion of discrimination must be false simply because she would never be the leader of such an institution. Instead of dealing with the problem and instead of recognizing the bravery it took for Jones and Lundblad to put themselves forward in this way, Blackburn has publicly called them bad scientists and liars. And in doing so Blackburn joins a long list of institutional leaders who, when presented with evidence of discrimination at their institution, attack the messengers, valuing the short-term interests of their institution over the long-term interests of science and the people who carry it out.

For all her lofty rhetoric about the value of diversity, Blackburn has failed the acid test of promoting it.

Exploring the relationship between gender and author order and composition in NIH-funded research

Last week there was a brief but interesting conversation on Twitter about the practice of “co-first” authors on scientific papers that led me to do some research on the relationship between author order and gender using data from the NIH’s Public Access Policy.

I want to note at the outset that this is my first foray into analyzing this kind of data, so I would love feedback on the data, analyses and findings, especially links to other work on the subject, as I know some of these issues have been addressed elsewhere.

A long post follows, but here are some main things I found:

  • The number of female authors falls off as you go down the list of authors of a paper, with fewer than 30% of senior authors female.
  • Contrary to my expectation, there doesn’t seem to be a bias to put the male author first when there are male-female co-first author pairs.
  • There are, however, far fewer male-female co-first author pairs than there should be based on the number of male and female first and second authors.
  • The same thing holds true more generally for first-second author pairs. There is a deficit of cross gender pairs and a surplus of same gender pairs.
  • Part (and maybe most) of this effect is due to an overall skew in gender composition of authors on papers.
  • If you are female, there is a 45% chance that a random co-author on one of your papers is female. If you are male, there is only a 35% chance that a random co-author on one of your papers is female.

Before I explain how I got all this, let me start with a quick explainer about how to parse the list of authors on a scientific paper.

By convention in many scientific disciplines (including biology, which this post is about), the first position on the author list of a paper goes to the person who was most responsible for doing the work it describes (typically a graduate student or postdoc) and the last position to the person who supervised the project (typically the person in whose lab the work was done). If there are more than two authors an effort is made to order them in rough relationship to their contributions from the front, and degree of supervision from the back.

Of course a single linear ordering cannot do justice to the complexity of contribution to a scientific work, especially in an era of increasingly collaborative research. One can imagine many better systems. But, unfortunately, author order is currently the only way that the relative contributions of different people to a work are formally recorded. And when a scientist’s CV is being scrutinized for jobs, grants, promotions, etc… where they are in the author order matters A LOT – you only really get full credit if you are first or last.

Because of the disproportionate weight placed on the ends of the author list, these positions are particularly coveted, and discussions within and between labs about who should go where, while sometimes amicable, are often difficult and contentious.

In recent years it has become increasingly common for scientists to try and acknowledge ambiguity and resolve conflicts in author order by declaring that two or more authors should be treated as “co-first authors” who contributed equally to the work, marking them all with a * to designate this special status.

But, as the discussion on Twitter pointed out, this is a bit of a ruse. First is still first, even if it’s first among equals (the most obvious manifestation of this is that people consider it to be dishonest to list yourself first on the author list on your CV if you were second with a * on the original paper).

Anyway, during this discussion I began to wonder about how the various power dynamics at play in academia played out in the ordering of co-equal authors. And it seemed like an interesting opportunity to actually see these power dynamics at play, since the * designation indicates that the contributions of the *’d authors were similar and therefore any non-randomness in the ordering of *’d authors with respect to gender, race, nationality or other factors likely reflects biases or power asymmetries.

I’m interested in all of these questions, but the one that seemed most accessible was to look at the role of gender. There are probably many ways to do this, but I decided to use data from PubMed Central (PMC), the NIH’s archive of full-text scientific papers. Papers in PMC are available in a fairly robust XML format that has several advantages over other publicly available databases: 1) full names of authors are generally provided, making it possible to infer many of their genders with a reasonable degree of accuracy, and 2) co-first authorship is listed in the file in a structured manner.

I downloaded two sets of papers from PMC: 1,355,350 papers in their “open access” (OA) subset, which contains papers from publishers like PLOS that allow the text to be freely redistributed and reused, and 424,063 papers from the “author manuscript” (AM) subset, which contains papers submitted as part of the NIH’s Public Access Policy. These papers are all available here.

I then wrote some custom Python scripts to parse the XML, extracting from each paper the author order, the authors’ given names and whether or not they were listed as “co-first” or “equal” authors (this turned out to be a bit trickier than it should have been, since the encoding of this information is not consistent). I will comment up the code and post it here ASAP.
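To make the parsing step concrete, here is a minimal sketch of pulling author order, given names and an equal-contribution flag out of one article. It is not the scripts used for this post. It assumes the JATS-style layout in which each author is a `contrib` element with `contrib-type="author"` and equal contributors carry `equal-contrib="yes"`; as noted above, the real files encode this inconsistently (footnotes, for example), and this sketch handles only that one attribute. The function names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_authors(xml_text):
    """Return authors in document order with given name, surname and equal flag."""
    root = ET.fromstring(xml_text)
    authors = []
    for contrib in root.iter("contrib"):
        if contrib.get("contrib-type") != "author":
            continue
        authors.append({
            "given": (contrib.findtext("name/given-names") or "").strip(),
            "surname": (contrib.findtext("name/surname") or "").strip(),
            # Assumption: only the attribute encoding is recognized here.
            "equal": contrib.get("equal-contrib") == "yes",
        })
    return authors


def co_first_authors(authors):
    """Leading run of equal-flagged authors; empty unless it has at least two."""
    run = []
    for author in authors:
        if not author["equal"]:
            break
        run.append(author)
    return run if len(run) >= 2 else []


# Hypothetical sample record, for illustration only.
SAMPLE = """<article><front><article-meta><contrib-group>
<contrib contrib-type="author" equal-contrib="yes">
  <name><surname>Doe</surname><given-names>Jane A.</given-names></name></contrib>
<contrib contrib-type="author" equal-contrib="yes">
  <name><surname>Roe</surname><given-names>John</given-names></name></contrib>
<contrib contrib-type="author">
  <name><surname>Poe</surname><given-names>Mary</given-names></name></contrib>
</contrib-group></article-meta></front></article>"""

sample_authors = extract_authors(SAMPLE)
sample_co_first = co_first_authors(sample_authors)
```

Requiring the flagged authors to sit at the start of the list mirrors the check described later in the post, where co-equal authors count as co-first only if they are at the beginning of the author list.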

I looked at several options for inferring an author’s gender from their given name, recognizing that this is a non-trivial challenge, with many potential pitfalls. I found that a program called genderReader, recommended by Philip Cohen, worked very well. It’s a bit out of date, but names don’t change that quickly, so I decided to use it for my analyses.

I parsed all the files (a bit of a slow process even on a fast computer) and started to look at the data. I’m going to focus on the AM subset here first, because these are all NIH funded papers and thus mostly from the US, so intercountry differences in authorship practices won’t confound the analyses, and because the set is likely more representative of the universe of papers as a whole than is the OA subset. I will try to note where these two datasets agree and disagree.

Of the 424,063 papers in AM, there are 2,568,858 total authors, with a maximum of 496 on a single paper and a wide distribution.

Author Number Histogram

There are 219,559 unique given names (including first name + middle initials), of which about 75% were classified successfully by genderReader as male, mostly male, female, mostly female or unisex. About 25% were not in its database. For the purpose of these analyses, I treated mostly male as male and mostly female as female. I’m sure there are some errors in this process, but I looked over a reasonable subset of the calls and the only clear bias I saw was that it didn’t do a good job of classifying Asian names – treating most of them as unisex, and thereby excluding them from my analysis. All together there were 1,206,616 male authors, 737,424 female authors and 624,818 who weren’t classified. Of the authors who were classified, 62% were male.
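The collapsing rule and the 62% figure can be sketched as follows. This is an illustration only: the label strings are assumptions (the exact values genderReader emits are not given here), and `collapse` and `male_share` are hypothetical helper names. The last lines simply redo the arithmetic on the totals reported above.

```python
from collections import Counter

# Assumed label strings; "unisex" and names missing from the database
# are deliberately left out so they collapse to None and are excluded.
COLLAPSE = {
    "male": "M",
    "mostly male": "M",
    "female": "F",
    "mostly female": "F",
}


def collapse(label):
    """Map a classifier label to 'M', 'F' or None (unclassified)."""
    return COLLAPSE.get(label)


def male_share(labels):
    """Fraction of classified authors who are male, or None if none classified."""
    counts = Counter(collapse(label) for label in labels)
    classified = counts["M"] + counts["F"]
    return counts["M"] / classified if classified else None


# Arithmetic check on the totals reported in the text.
reported_male, reported_female = 1_206_616, 737_424
reported_share = reported_male / (reported_male + reported_female)
print(f"{reported_share:.1%} of classified authors are male")
```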

Of the 424K papers, 32,304 contained co-equal authors, and 28,184 contained two or more co-first authors (assessed by asking if the co-equal authors were at the beginning of the author list). Of these, 85% (24,087) had exactly two co-first authors and 12% (3,285) had three co-first authors (one had 20 co-first authors, which I’m just going to leave here for discussion). I decided to use only those with exactly two co-first authors for the next set of analyses.

There were a total of 11,340 papers with exactly two co-first authors both of whose genders were inferred. Of these, the author order counts were as follows:


I will admit I expected to see a lot more papers with Male-Female than Female-Male orders amongst two co-first authors. That is, however, not what the data show.

However, that doesn’t mean there’s not something interesting going on with gender here. First, there are obviously a lot more male authors than female authors. In this set of papers, only 40.3% of authors in position 1 and 41.0% in position 2 are female. Given this you can easily calculate the expected number of MM, MF, FM and FF pairs there should be.
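As a worked version of that calculation, the sketch below multiplies the two position-specific frequencies, assuming the genders of the first and second author are independent. The function name is hypothetical; the inputs are the 11,340 papers and the rounded percentages quoted above, so the outputs are approximate.

```python
def expected_pair_counts(n_papers, p_female_first, p_female_second):
    """Expected MM/MF/FM/FF counts if the two positions were independent."""
    p_male_first = 1 - p_female_first
    p_male_second = 1 - p_female_second
    return {
        "MM": n_papers * p_male_first * p_male_second,
        "MF": n_papers * p_male_first * p_female_second,
        "FM": n_papers * p_female_first * p_male_second,
        "FF": n_papers * p_female_first * p_female_second,
    }


expected = expected_pair_counts(11_340, 0.403, 0.410)
for pair, count in expected.items():
    print(f"{pair}: {count:.0f}")
```

With these rounded inputs the expectation comes out at roughly 3,994 MM, 2,776 MF, 2,696 FM and 1,874 FF pairs.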


Although there doesn’t seem to be a bias in favor of M-F over F-M, there are significantly (p << .0000000001 by Chi-square) fewer mixed gender co-first author pairs than you’d expect given the overall number of male and female co-first authors.
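One way to run such a test is a chi-square test of independence on the 2x2 table of first-author by second-author gender. The sketch below is an assumption about the setup, not necessarily the exact test used here: expected counts come from the table's own margins, which leaves one degree of freedom, and for one degree of freedom the p-value is `erfc(sqrt(stat / 2))`. The counts in `toy_counts` are hypothetical and are not the data from this post.

```python
import math


def chi_square_2x2(observed):
    """Chi-square independence test for counts keyed 'MM', 'MF', 'FM', 'FF'.

    The first letter is the first author's gender, the second letter the
    second author's. Assumes every row and column total is nonzero.
    """
    n = sum(observed.values())
    first = {"M": observed["MM"] + observed["MF"],
             "F": observed["FM"] + observed["FF"]}
    second = {"M": observed["MM"] + observed["FM"],
              "F": observed["MF"] + observed["FF"]}
    stat = 0.0
    for pair, count in observed.items():
        expected = first[pair[0]] * second[pair[1]] / n
        stat += (count - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# HYPOTHETICAL counts with a deficit of mixed-gender pairs, for illustration.
toy_counts = {"MM": 400, "MF": 200, "FM": 200, "FF": 200}
toy_stat, toy_p = chi_square_2x2(toy_counts)
print(f"chi-square = {toy_stat:.2f}, p = {toy_p:.2e}")
```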

What can explain this? Are young scientists less likely to collaborate across gender lines than within them? Are male and female pairs better able to resolve their authorship disputes, and are thus underrepresented amongst co-first authors? Or are there fewer opportunities for them to collaborate because of biased lab compositions?

First I wanted to ask if there was a similar bias if we looked at all papers, not just the relatively rare co-first author papers. Here is the fraction of female authors by position in author list for all papers (excluding the last author for now).

Author gender by position

Female authors are most common in the first author position and they are increasingly less represented as you go back in the author order. Maybe this has to do with the well documented problem of academia driving out women between graduate school and faculty position. So next I asked what fraction of senior authors are women.

Gender by Senior Author Position

Yikes. Only 28% of senior authors of NIH author manuscripts are female compared to 46% of first authors. That’s horrible.

So what about the question from above? Are mixed gender first and second author pairs less common across all papers, not just co-firsts? The answer is yes.


Again, there are lots of possible explanations for this, but I was curious about the effect of biased lab composition (if the gender composition of labs is skewed away from parity then you’d expect more same gender author pairs). It’s hard to look at this directly with this data, but if one were going to guess at a covariate for skewed lab gender it would be the gender of the PI, and this I can look at with this data.

So, I next broke the data down by the gender of the senior author.

Author gender by PI gender

And in tabular form since the data are so striking.

Position   PI Female   PI Male
1st        56.3%       41.0%
2nd        53.0%       40.6%
3rd        50.6%       40.0%
4th        48.5%       39.0%
5th        45.1%       37.1%

This data very strongly suggests that women are more likely to join labs with female PIs and men more likely to join labs with male PIs. But it doesn’t say why. It could be that people simply choose labs with a PI of their gender, or that PIs select people of the same gender for their labs. This could have to do with direct gender bias, or with lab style or many other things. Or it could be that there’s a hidden field effect here – that different fields have different gender biases, which would drive the gender distribution of labs on average away from parity.

But whatever the reason it’s a clear confounding factor in looking at gender and authorship. Interestingly, the bias against mixed gender first and second authorship is still there (p-values << .0000000001) even if you control for the gender of the PI.

Next I asked if we could detect a skew in the gender composition of the entire author list of papers. So I took sets of papers with number of authors ranging from 2 to 8 (these are the ones for which we have enough data), filtered out papers where one or more authors didn’t have an inferred gender, and compared the distribution of the number of female authors to that expected by the frequency of male and female authors at each position. There is very consistently a skew towards the extremes, with a significant excess in every case of papers with authors of one gender.
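The expected distribution in that comparison can be sketched as a Poisson binomial: if each position has its own probability of being female and positions are independent, the distribution of the number of female authors follows from a simple dynamic program. This is a sketch under that independence assumption, not the code used for the figures; the function name is hypothetical and the per-position frequencies in the usage lines are placeholders, not measured values.

```python
def female_count_distribution(p_female_by_position):
    """P(k female authors) for k = 0..n, given independent positions."""
    dist = [1.0]  # with zero authors, P(0 females) = 1
    for p in p_female_by_position:
        updated = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            updated[k] += prob * (1 - p)   # this author is male
            updated[k + 1] += prob * p     # this author is female
        dist = updated
    return dist


# HYPOTHETICAL frequencies for a three-author paper, for illustration only.
toy_distribution = female_count_distribution([0.45, 0.42, 0.40])
for k, prob in enumerate(toy_distribution):
    print(f"{k} female authors: {prob:.3f}")
```

Comparing these probabilities, scaled by the number of papers of that size, with the observed counts is what would reveal an excess at the all-male and all-female extremes.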

Gender skew

So there’s a pretty systemic skew in the gender composition of authors on papers, but where that skew comes from is unclear. Let’s look at the gender mix of all of the other authors on a paper as a function of the gender of the last author.

Gender skew by last author

Again, there’s a pretty strong skew. But is this due to the PI’s gender or to a more general gender imbalance? It’s a bit hard to tell from this data alone. It turns out the skew you see after dividing based on the gender of the last author is roughly the same if you divide based on the gender of any other position in the author order. Here, for example, is what you get for papers with six authors.

effect of reference author

There’s a lot more one could and should do with this data, and I will come back to it later, but for now I will end with this observation. If you are female, there is a 45% chance that a random co-author on one of your papers is female. If you are male, it goes down to 35%. That’s a pretty big and striking difference, and I’m curious if anyone has a good explanation for it.
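For anyone who wants to reproduce this kind of number, here is one plausible way to define it: pool every (author, co-author) pair across papers and ask what share of co-authors are female, split by the gender of the focal author. This is an assumed definition, and the exact calculation behind the 45% and 35% may differ (for example, by averaging per paper or per author). The function name and toy papers are hypothetical.

```python
def coauthor_female_share(papers):
    """Share of female co-authors, pooled over pairs, by focal author gender.

    `papers` is a list of author-gender lists using 'F' and 'M'; any other
    value (unclassified) is ignored.
    """
    female_coauthors = {"F": 0, "M": 0}
    total_coauthors = {"F": 0, "M": 0}
    for genders in papers:
        known = [g for g in genders if g in ("F", "M")]
        n_female = known.count("F")
        for g in known:
            total_coauthors[g] += len(known) - 1
            # Do not count the focal author as her own co-author.
            female_coauthors[g] += n_female - (1 if g == "F" else 0)
    return {
        g: female_coauthors[g] / total_coauthors[g]
        for g in ("F", "M")
        if total_coauthors[g]
    }


# HYPOTHETICAL toy papers, for illustration only.
toy_papers = [["F", "F", "M"], ["M", "M", "M"]]
toy_shares = coauthor_female_share(toy_papers)
```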

Wikipedia Activism and Diversity in Science

There’s no getting around it. A lot of scientists are white men, and it’s always been that way. But it’s never been the whole picture. Getting a better picture of scientists whose work or lives

The Value of 3 Degrees of Separation on Twitter

  The more interconnected our Twitter networks get, the more the distance between us and total strangers shrinks [PDF]. That’s not always a good thing. Twitter is fabulous. There’s fun, camaraderie, fascinating people, and ideas you wouldn’t otherwise encounter. Victoria Costello …

The post The Value of 3 Degrees of Separation on Twitter appeared first on PLOS Blogs Network.

The Outrage Factor – Then and Now

  There’s a lot of outrage about outrage storming around women in science and science journalism at the moment. And fear of causing it, too. It’s easy to cast outrage as inimical to thinking and discussion. It’s not unusual to want …

The post The Outrage Factor – Then and Now appeared first on PLOS Blogs Network.

“Just” Joking? Sexist Talk in Science

    I’m a scientist who’s also a cartoonist. So I’ve got a pretty keen interest in scholarship and empirical research on humor. And I want to talk about research and sexist jokes, and where that leads. It’s a response to a narrative …

The post “Just” Joking? Sexist Talk in Science appeared first on PLOS Blogs Network.

Sympathy for the Devil?

My Facebook feed is awash with people standing up for Tim Hunt: “The witch hunt against Tim Hunt is unbearable and disgraceful”, “This is how stupidity turns into big damage. Bad bad bad”, “Regarding the Tim Hunt hysteria”, and so on. Each of these posts has prompted a debate between people who think a social media mob has unfairly brought a good man down, and people like me who think that the response has been both measured and appropriate.

I happened to meet Tim Hunt earlier this year at a meeting of young Indian investigators held in Kashmir. We both were invited as external “advisors” brought in to provide wisdom to scientists beginning their independent careers. While his “How to win a Nobel Prize” keynote had a bit more than the usual amount of narcissism, he was in every other way the warm, generous and affable person that his defenders of the last week have said he is. I will confess I kind of liked the guy.

But it is not my personal brush with Hunt that has had me thinking about this meeting the past few days. Rather it is a session towards the end of the meeting held to allow women to discuss the challenges they have faced building their scientific careers in India. During this session (in which I was seated next to Hunt) several brave young women stood up in front of a room of senior Indian and international scientists and recounted the specific ways in which their careers have been held back because of their gender.

The stories they told were horrible, and it was clear from the reaction of women in the room that these were not isolated incidents. If any of the scientists in positions of power in the room (including Hunt) were not already aware of the harassment many women in science face, and the myriad obstacles that can prevent them from achieving a high level of success, there is no way they could have emerged from that session without understanding it.

When I think about what happened here, I am not thinking about how Twitter hordes brought down a good man because he had a bad day. I am instead thinking about what it says to the women in that room in Kashmir that this leading man of science – who it was clear everybody at the meeting revered – had listened to their stories and absorbed nothing. It is unconscionable that, barely a month after listening to a woman moved to tears as she recounted a sexual assault by a senior colleague and how hard it was for her to regain her career, Hunt would choose to mock women in science as teary love interests.

Hunt’s words, and even more so his response to being called out for them, suggest that he does not understand the damage his words caused. I will take him at his word that he did not mean to cause harm. But the fact that he did not realize that those words would cause harm is worse even than the words themselves. That a person as smart as Hunt could go his entire career without realizing that a Nobel Prizewinner deriding women – even in a joking way – is bad just serves to show how far we have to go.

So, you’ll have to forgive me for recoiling when people ask me to measure my words based on the effect they will have on Hunt. I understand all too well the effects that criticism can have on people. But silence also has its consequences. And we see around us the consequences of decades of silence and inaction on sexism in science. If the price of standing up to that history is that Tim Hunt has to weather a few bad weeks, well so be it.

Celebrating 10 years of Athena SWAN Charter advancing women in science

By Sara Carvalhal Gender inequality in science has been in the news lately, particularly around the fall-out of Sir Tim Hunt’s biased comments toward female scientists. Sir Hunt’s comments are not held in isolation, but rather indicate the need for …

The post Celebrating 10 years of Athena SWAN Charter advancing women in science appeared first on PLOS Blogs Network.