Continuous geographic structure is real, "discrete races" aren't

Posted 29 February 2012 by Nick Matzke

Note: This topic is outside of my specialty, so it may be that I have missed some important points. I think I've got the basics correct, but this is a very complex topic. I will be interested in critical but constructive posts in the comments.

Update: required reading, which basically confirms my points I think:

Weiss, K. M. and J. C. Long (2009). "Non-Darwinian estimation: My ancestors, my genes' ancestors." Genome Research 19(5), 703-710.

Nievergelt, C. M., O. Libiger and N. J. Schork (2007). "Generalized Analysis of Molecular Variance." PLoS Genetics 3(4), e51.

On Monday, Jerry Coyne at Why Evolution is True posted on "Are there human races?" While acknowledging the very bad history of the race concept in human history, and noting some of the problems with applying the concept to humans, Coyne concluded, basically, that the answer was yes, there are human races. While reviewing Jan Sapp's piece which concluded that human races did not objectively exist, Coyne wrote:

As Sapp notes, and supports his conclusion throughout the review:
Although biologists and cultural anthropologists long supposed that human races--genetically distinct populations within the same species--have a true existence in nature, many social scientists and geneticists maintain today that there simply is no valid biological basis for the concept. The consensus among Western researchers today is that human races are sociocultural constructs.
Well, if that's the consensus, I am an outlier. I do think that human races exist in the sense that biologists apply the term to animals, though I don't think the genetic differences between those races are profound, nor do I think there is a finite and easily delimitable number of human races. Let me give my view as responses to a series of questions.

Unfortunately, while Coyne's position is admirably clearly expressed, his reasons for accepting race don't hold up. Some of his commenters have already pointed out several of the problems. My own critique of his position is informed by a symposium I attended in San Francisco last year, "Is There Space for Race in Evolutionary Biology?", and particularly by the arguments that speakers Massimo Pigliucci and Alan Templeton made there. Here's the conference information:

Is There Space for Race in Evolutionary Biology? The 2011 Spring Conference for the Bay Area Biosystematists Are human races geographical subspecies? How about distinct evolutionary lineages? What about ecotypes? Does it even make sense to divide humans into subspecies? Can there be biological races in other species even though there aren't any in humans? This conference aims to answer these questions and more as it explores the question of whether race can be validly defined in the context of evolutionary biology. Come join us at the 2011 Spring Conference for the Bay Area Biosystematists! Please peruse this site to learn more about the conference. Date: May 14, 2011 Time: 12:45-7pm Location: McLaren Conference Center, Room 250, University of San Francisco

Coyne asks a series of questions and gives answers: 1. What are races? Coyne answers:

"races of animals (also called "subspecies" or "ecotypes") are morphologically distinguishable populations that live in allopatry (i.e. are geographically separated)"

2. Under that criterion, are there human races? Coyne answers:

Yes. As we all know, there are morphologically different groups of people who live in different areas, though those differences are blurring due to recent innovations in transportation that have led to more admixture between human groups.

3. How many human races are there? Answer:

That's pretty much unanswerable, because human variation is nested in groups, for their ancestry, which is based on evolutionary differences, is nested in groups.

4. How different are the races genetically? Answer:

Not very different. [...]

...but he goes on to cite the results of clustering analyses that are often able to use genetic data to place humans quite accurately into one population or another. That is, the argument is basically that if there is enough signal in the genetics to place humans into one population or another, then races exist.

5. Why do these differences exist? Answer:

The short answer is, of course, evolution. The groups exist because human populations have an evolutionary history, and, like different species themselves, that ancestry leads to clustering and branching, though humans have a lot of genetic interchange between the branches!

He goes on to discuss the possible influence of natural and sexual selection on a few traits, and genetic drift and neutral evolution on the rest.

6. What are the implications of these differences? Coyne writes:

Not much. There are some medical implications.

...and goes on to (appropriately) dismiss certain pernicious ideas like the hard-to-kill one that IQ is genetically different between different races. However, he does leave open the possibility that a trait like intelligence could vary genetically between races, although he considers this unlikely and says there is no evidence supporting it. In any event, even a statistically detectable difference between populations would be swamped by the within-population variability in such a trait. (Therefore, even if such a population-level difference were found, one could not safely extrapolate from a race identification to a prediction about an individual.)

The critique

The key problems with the point of view that Coyne presents were made clear, at least to me, by Alan Templeton's talk on this subject. I went to the talk in a quite skeptical mood, since Templeton is in my judgement on the wrong side of certain issues in phylogeography*. But Templeton's points on race are simple and pretty convincing. See also:

Templeton, A. R. (1998). "Human Races: A Genetic and Evolutionary Perspective." American Anthropologist 100(3), 632-650.

Race is generally used as a synonym for subspecies, which traditionally is a geographically circumscribed, genetically differentiated population. Sometimes traits show independent patterns of geographical variation such that some combination will distinguish most populations from all others. To avoid making "race" the equivalent of a local population, minimal thresholds of differentiation are imposed. Human "races" are below the thresholds used in other species, so valid traditional subspecies do not exist in humans. A "subspecies" can also be defined as a distinct evolutionary lineage within a species. Genetic surveys and the analyses of DNA haplotype trees show that human "races" are not distinct lineages, and that this is not due to recent admixture; human "races" are not and never were "pure." Instead, human evolution has been and is characterized by many locally differentiated populations coexisting at any given time, but with sufficient genetic contact to make all of humanity a single lineage sharing a common evolutionary fate.

Taking Coyne's points in order, I will summarize what I think Templeton would say, and add my own points.

1. What are races? Are human races "allopatric"?

Templeton would deny that human populations "live in allopatry". Humans are not, in fact, living in discrete, disconnected populations. Instead, humans are pretty continuously distributed from Africa all the way around the globe, and have been since glacial times (the tiny possible exceptions to this statement involve the recent colonization of remote islands, e.g. by Polynesians, but this occurred only in the last few hundred to 2,000 years, a geological eyeblink). And genetic contact, i.e. interbreeding, seems to have been occurring between adjacent "populations" the whole time.

The overwhelming signal in human genetics seems to be isolation-by-distance -- the larger the geographical distance between two samples, the greater the average genetic difference. But it is crucial to understand that this is not a confirmation of the idea of discrete races. If races were discrete and you sampled along a transect, you would sample individuals from race A, A, A, A, and then suddenly switch to race B, B, B, B. What we typically see instead is that some index of genetic difference, measured along a transect, produces values something like 0, 1, 2, 3, 4, 5, 6, 7. (I am making these numbers up, but imagine that they are a measurement of genetic difference from the first sample, i.e. the difference is 0 at the first sampling location, 1 at a sample 1000 km away, 2 at a sample 2000 km away, etc.)

This is a *crucial* distinction to get before we proceed, so I encourage readers to go back and re-read the previous paragraphs to get the idea into their heads.
Key concept: if you sample humans along a transect, the typical pattern is that genetic difference will increase in a continuous fashion with greater distance between samples, rather than in large discrete "steps" as you would expect with well-separated allopatric populations, or discrete "races". Here is Templeton's depiction of this phenomenon: Templeton_1999_AA_Fig5_geographic_distance_in_miles_vs_genetic_distance.png
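As a toy illustration of this distinction, here is a minimal Python sketch. The numbers are made up, like the ones in the text, and the function names are hypothetical. It compares two patterns along a transect sampled every 1000 km: a discrete model, where genetic difference jumps once at a sharp boundary, and an isolation-by-distance model, where it grows steadily.

```python
# Toy transect: sampling sites every 1000 km (made-up values, not real data).
SITES_KM = [0, 1000, 2000, 3000, 4000, 5000, 6000, 7000]


def discrete_model(km, boundary_km=3500, jump=7):
    """Genetic difference from the first site if two discrete groups met at a sharp boundary."""
    return 0 if km < boundary_km else jump


def isolation_by_distance(km, per_1000_km=1):
    """Genetic difference from the first site if it grows smoothly with distance."""
    return per_1000_km * km / 1000


def steps(values):
    """Change in genetic difference between each pair of neighbouring sites."""
    return [b - a for a, b in zip(values, values[1:])]


discrete = [discrete_model(km) for km in SITES_KM]
clinal = [isolation_by_distance(km) for km in SITES_KM]

print("discrete:", discrete, "steps:", steps(discrete))
print("clinal:  ", clinal, "steps:", steps(clinal))
```

Both series end at the same total difference. What differs is whether that difference accumulates in one jump or in many small, equal steps.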


2. Under that criterion, are there human races?

Coyne answers "yes", because we can see obvious differences between humans in different locations. But that, by itself, does not establish that the differences are discrete and categorical, which is what would have to be the case for ideas like "counting the number of races" and "distinguishing one race from the next" to make sense. Quite often, geneticists like Coyne seem to be misled by the fact that sampling is usually discrete. Usually, scientists go and get some samples from one region in Africa, some samples from one region in Europe, some from a region or two in Asia, etc. Before there was genetic sampling, there were similar issues in skull measurements, photographs, anthropologists' monographs, etc. When this form of sampling is done, of course it *looks like* the races are discrete, but what was actually discrete was your sampling. Templeton's assertion is that if you had sampled on a grid instead, and sampled all regions equally, you wouldn't be able to draw discrete lines between races. The pattern would be that closer samples are more related, but this similarity would decline continuously with distance, and not in big discrete steps.

3. How many human races are there?

Coyne says the question is "pretty much unanswerable", and notes that the answers have varied from 3 to over 30. Coyne is correct to say the question is unanswerable, and the reason this is so is easily explained on Templeton's view. But the unanswerability is quite a puzzle -- or ought to be -- if one is trying to argue that races are real. If you are going to assert that there is a rigorous and objective and scientific concept called "race", but you can't even explain how to objectively identify the races and thus count them, there's a big problem somewhere.
For example, even though there are all kinds of definitions of "species", and thus some variation in the count of species, at least (a) the various species concepts are more-or-less objective, and (b) different concepts will often, although not always, give the same or similar answers. But it is well known that if "defining a species" is somewhat difficult and controversial, it is much worse for defining a subspecies, and it is close to hopeless for the ubiquitous concept of "population". There are hundreds of papers on species concepts, but there are almost none published on "population concepts" (Millstein 2009 is one of the only ones; see also discussion of the paper). In practice, "populations" of organisms are basically either convenient groupings determined subjectively by biologists for sampling purposes, very often geographically defined; or they are convenient abstract entities useful for mathematical and computational modeling. But whether or not the "populations" you create for sampling and analysis purposes represent real discrete groups out there in the real world is an empirical question, and it looks like, in the case of humans, there is little basis for assuming discrete populations.

Finally, Coyne asserts at one point in his blogpost that races exist in humans much as subspecies exist in other species. But in terms of genetic divergence, at least, this is difficult to support. Templeton points out that the genetic divergence among living humans is much less than that typically observed between subspecies. This is not controversial or poorly known so I won't argue this in detail. Templeton (1999) provides a graphical version:

(Note: see further discussion of criticism of Templeton (1998)'s Figure 1 at this comment.) Or, we can compare the branch lengths of a tree built from the aligned mitochondrial sequences of humans and our ape relatives: Gagneux_etal_1999_PNAS_mtDNA_African_hominids_Fig1_phylogeny.png


(From: Gagneux et al. 1999, PNAS; full-resolution figure) (Hat tip: Reed Cartwright)

Now, it is important to remember that the mitochondrial DNA sequence is really just one nonrecombining genetic marker, and it is possible for the mtDNA phylogeny to differ from the true species phylogeny (due to introgression) or to not reflect the pattern of diversity in the nuclear markers. Nuclear alleles coalesce more slowly, and recombination means that nuclear DNA will not have a simple branching phylogenetic history that can be easily represented by a tree, at least not within an interbreeding population. However, my sense of it is that in the great ape group (including humans), the mtDNA accurately represents the overall picture of phylogeny and genetic diversity.

5. Why do these differences exist? Coyne writes:

The short answer is, of course, evolution. The groups exist because human populations have an evolutionary history, and, like different species themselves, that ancestry leads to clustering and branching, though humans have a lot of genetic interchange between the branches!

Obviously we agree that selection, drift, etc. explain the evolution of observed diversity in the human species. But there are important questions here that Coyne doesn't address.

(1) How do we know that we've got discrete or even somewhat discrete "populations" which then can be said to have a distinct evolutionary history? Maybe we've just got one big interconnected population. This was addressed above.

(2) How much interchange do you have to have before claims of clustering and branching break down? The evidence seems to indicate that the genetic differences between human "populations" in different regions are primarily a matter of differences in allele frequencies: most non-tiny populations have most of the alleles observed in the global population, just at varying frequencies. If this observation is true, this causes a problem for Coyne's statement, because another way to say "the differences are differences in allele frequencies" is to say that "the differences are not fixed between populations", and another way to say that is "coalescence has not occurred in these populations for these markers, either because of the lack of time or the presence of gene flow due to migration or both". Yet another way to say this is to say that "for most loci, these populations are not reciprocally monophyletic", and if you are a phylogeneticist, you know that this statement is equivalent to saying "your data are not appropriately represented by a single branching tree, instead you have a number of trees representing the individual histories of individual loci in the population". Yet one more way of saying it is "you don't have discrete populations, you are sampling within one genetically connected population."

Coyne mentions clustering results, which have indeed impressed some people, and led to a minor resurgence in the race concept among geneticists in the last decade.
It is true that, if you sample a region where people have mostly lived in or near the villages where they grew up for dozens of generations (e.g. parts of Europe, and e.g. not North America), you can take an individual's genetic data and place their ancestry on the map with a fair degree of accuracy. And it is also true that a computer algorithm, if given a number of samples from e.g. Africa, Europe, and China, could accurately place the samples into those 3 clusters, if told the number of clusters ahead of time. It is even true that newer algorithms can figure out the number of clusters in the data on their own, e.g. Structurama (full disclosure: authored by my advisor). These sorts of observations have led a number of geneticists to speak of "Lewontin's Fallacy" (e.g. Edwards 2003). The geneticist Richard Lewontin famously argued in 1972 that "about 85% of the total genetical variation is due to individual differences within populations and only 15% to differences between populations or ethnic groups" (Edwards 2003). Lewontin argued that this fact made the idea of racial classification irrelevant, since most variation was within groups instead of between them. You can see his point in an intuitive way by imagining that you have a sample of two people. Assuming the 85%/15% distribution of variation, for any one locus that varies between the two individuals, it is much more probable that the difference is just due to within-population variability than due to between-population differences. Lewontin's paper and argument have been cited thousands of times -- for example, it was cited in just this fashion in the review by Sapp which Coyne is critiquing. Edwards points out that the problem with Lewontin's statement is that it is only true if you have only one locus to look at, or if you can safely assume that the variation at all loci is independent. 
It turns out that if you have, say, dozens or hundreds or thousands of loci to look at, and there is some population structure in the data, for example, population-level differences in allele frequencies, then by looking at all those loci, it becomes quite easy to identify the clusters and to accurately place individuals into them. Edwards describes how this works:

Consider two haploid populations each of size n. In population 1 the frequency of a gene, say "+" as opposed to "-", at a single diallelic locus is p and in population 2 it is q, where p + q = 1. (The symmetry is deliberate.) Each population manifests simple binomial variability, and the overall variability is augmented by the difference in the means. The natural way to analyse this variability is the analysis of variance, from which it will be found that the ratio of the within-population sum of squares to the total sum of squares is simply 4pq. Taking p = 0.3 and q = 0.7, this ratio is 0.84; 84% of the variability is within-population, corresponding closely to Lewontin's figure. The probability of misclassifying an individual based on his gene is p, in this case 0.3. The genes at a single locus are hardly informative about the population to which their bearer belongs. Now suppose there are k similar loci, all with gene frequency p in population 1 and q in population 2. The ratio of the within-to-total variability is still 84% at each locus. The total number of "+" genes in an individual will be binomial with mean kp in population 1 and kq in population 2, with variance kpq in both cases. Continuing with the former gene frequencies and taking k=100 loci (say), the mean numbers are 30 and 70 respectively, with variances 21 and thus standard deviations of 4.58. With a difference between the means of 40 and a common standard deviation of less than 4.6, there is virtually no overlap between the distributions, and the probability of misclassification is infinitesimal, simply on the basis of counting the number of "+" genes. Fig. 1 shows how the probability falls off for up to 20 loci.
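Edwards's arithmetic is easy to check numerically. The Python sketch below recomputes the figures in the quote (4pq, the means, and the standard deviation) and then the exact binomial misclassification probability for several numbers of loci. The decision rule is my assumption, since the quote only says "counting the number of '+' genes": assign an individual to population 2 when more than half of the loci carry "+", and flip a coin on an exact tie. The function names are hypothetical.

```python
from math import comb, sqrt


def within_to_total_ratio(p):
    """Edwards's single-locus within-to-total sum-of-squares ratio, 4pq with q = 1 - p."""
    return 4 * p * (1 - p)


def misclassification_probability(k, p):
    """Chance of misassigning a population-1 individual by counting "+" genes at k loci.

    Assumed rule (not spelled out in the quote): assign to population 2 when
    more than half of the k loci carry "+", and flip a coin on an exact tie.
    """
    prob = 0.0
    for plus in range(k + 1):
        # Binomial probability of seeing exactly `plus` "+" genes in population 1.
        mass = comb(k, plus) * p**plus * (1 - p) ** (k - plus)
        if 2 * plus > k:
            prob += mass
        elif 2 * plus == k:
            prob += 0.5 * mass
    return prob


p, k = 0.3, 100
print("within/total ratio:", within_to_total_ratio(p))
print("means:", k * p, k * (1 - p))
print("standard deviation:", sqrt(k * p * (1 - p)))
for loci in (1, 5, 20, 100):
    print(loci, "loci ->", misclassification_probability(loci, p))
```

With one locus the error rate is p itself, and it shrinks rapidly as loci are added. Because Edwards's setup is symmetric (p + q = 1), the same error rate applies to an individual from population 2.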

This sort of information (obviously the between-population allele frequencies always differ between loci in real life, but this doesn't matter much for the classification) is what lets the clustering algorithms find their clusters, and place people into them. Edwards concludes:

There is nothing wrong with Lewontin's statistical analysis of variation, only with the belief that it is relevant to classification. It is not true that "racial classification is... of virtually no genetic or taxonomic significance". It is not true, as Nature claimed, that "two random individuals from any one group are almost as different as any two random individuals from the entire world", and it is not true, as the New Scientist claimed, that "two individuals are different because they are individuals, not because they belong to different races" and that "you can't predict someone's race by their genes". Such statements might only be true if all the characters studied were independent, which they are not. Lewontin used his analysis of variation to mount an unjustified assault on classification, which he deplored for social reasons. It was he who wrote "Indeed the whole history of the problem of genetic variation is a vivid illustration of the role that deeply embedded ideological assumptions play in determining scientific 'truth' and the direction of scientific inquiry".(5) In a 1970 article Race and intelligence(11) he had earlier written "I shall try, in this article, to display Professor Jensen's argument, to show how the structure of his argument is designed to make his point and to reveal what appear to be deeply-embedded assumptions derived from a particular world view, leading him to erroneous conclusions." A proper analysis of human data reveals a substantial amount of information about genetic differences. What use, if any, one makes of it is quite another matter. But it is a dangerous mistake to premise the moral equality of human beings on biological similarity because dissimilarity, once revealed, then becomes an argument for moral inequality. 
One is reminded of Fisher's remark in Statistical Methods and Scientific Inference(12) "that the best causes tend to attract to their support the worst arguments, which seems to be equally true in the intellectual and in the moral sense."

Edwards's statistical argument is correct, but unfortunately -- and ironically for someone criticizing Lewontin for not recognizing a mistake caused by a "deeply-embedded assumption" -- Edwards is at least potentially being misled by his own deeply embedded assumption, as are others who argue along the lines of "clustering of humans based on genetics works, therefore different races exist." Just because an analysis shows that the humans you sample fall into clusters does not necessarily mean that the clusters are out there in the real world in the human population. If the human population basically has continuous genetic variation, with genetic similarity decaying smoothly with distance, then, if your sampling was geographically clustered, your samples will show clusters too, and that structure comes from the sampling design rather than from the population.

I am not an expert in this area, but my sense of the matter from Templeton's comments is that it is much closer to the truth to say that, at least for humans living basically in their ancestral locations, genetic difference between "populations" is essentially a smooth function of geographic distance. This is definitely "geographic structure" in the human population, but it is not evidence for discrete populations or races. And genetic clustering that is due to clustered sampling (done for convenience, funding reasons, political access reasons, and/or prior assumptions that localized sampling of a region represents that region, because you "already know" that a particular race inhabits that region) cannot be taken as evidence of real discrete structure in the human population. It is possible that there is some discrete structure -- due to language barriers or large migration events or whatever -- but I would want to see this explicitly tested for rather than just assumed. And if it exists, it will be important to ask how much of the genetic variation it explains, compared to the isolation-by-distance effect that we already know is quite strong.
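To make the sampling point concrete, here is a small Python sketch under stated assumptions: genetic difference is a perfectly smooth (linear) function of distance along a single transect, and "clusters" are detected with an ad hoc rule of my own that flags unusually large gaps between neighbouring samples. The positions, names, and thresholds are all hypothetical. The same smooth cline is sampled two ways: on an even grid, and at three convenient locations.

```python
from statistics import median


def genetic_distance(km_a, km_b, per_1000_km=1.0):
    """Toy isolation-by-distance model: difference grows linearly with separation."""
    return per_1000_km * abs(km_a - km_b) / 1000


def apparent_clusters(sites_km, gap_factor=3):
    """Count groups separated by unusually large genetic gaps between neighbouring samples.

    A "break" is any neighbour-to-neighbour gap more than gap_factor times
    the median gap; this is an ad hoc rule chosen for illustration.
    """
    ordered = sorted(sites_km)
    gaps = [genetic_distance(a, b) for a, b in zip(ordered, ordered[1:])]
    typical = median(gaps)
    return 1 + sum(gap > gap_factor * typical for gap in gaps)


# Same smooth cline, two sampling designs (positions in km along a transect).
even_grid = list(range(0, 7001, 500))
clustered = [0, 100, 200, 3400, 3500, 3600, 6800, 6900, 7000]

print("even grid:", apparent_clusters(even_grid), "apparent group(s)")
print("clustered sampling:", apparent_clusters(clustered), "apparent group(s)")
```

The underlying population is identical in both runs; only the sampling design changes. The even grid yields one continuous group, while the clustered design yields three apparent groups that reflect where the samples were taken.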

A little more on clustering

One other point about clustering, this time with hierarchical clustering. Hierarchical clustering algorithms produce groups within groups, which are commonly represented by a tree structure very similar (superficially) to a phylogeny. You will sometimes see the argument that one can take the genetic distances between a bunch of samples of humans, throw them into a hierarchical clustering algorithm, and get a tree out, and therefore we can safely think of human races as subspecies within our species. This matches up superficially with a lot of our background conceptual model of how species can be placed as groups within groups on a phylogeny, and thus tends to reinforce the "races are real" idea in some people's heads. However, this argumentation is at least dubious.

First, clustering methods taken as a phylogenetic method are a form of "distance method", and distance methods are well-known in phylogenetics to be the crudest sort of phylogenetic method and the one most prone to errors; unless some pretty specific assumptions are met (e.g. molecular clock) they can produce very bad errors. Basically, distance methods rely on the pairwise Euclidean distances between samples in your data space. You can imagine this as the "as the crow flies" distance. However, the distance that we are really interested in when we produce a phylogeny is not the "as the crow flies", Euclidean distance between our samples; instead, we are most interested in the path distance, i.e., the actual mutational steps (or character-change steps) that led from a common ancestor, up each branch, to the tips that we observe. This form of distance is known as a "Manhattan distance", named after the square street grid in downtown New York. In New York City, you can't get from A to B by going "as the crow flies", you have to drive along the street grid, avoiding one-way streets, construction, traffic jams, etc.
Only in ideal situations will the actual traveled path distance equal, closely approximate, or correlate linearly with the as-the-crow-flies distance. Phylogenetic methods are thus really about finding the best paths (branching patterns) between the observed samples, according to some criterion (e.g. parsimony, maximum likelihood, posterior probability). Therefore, clustering methods are not phylogenetic methods, except in special situations.

Second, a hierarchical clustering method will produce hierarchical clustering from almost any but the most artificially-constructed distance data. If there is true hierarchical structure, the method can find it, but some sort of clustering will be produced, even if it's poorly supported and changes dramatically under e.g. resampling of the data. And there can be many sources of any clustering that is observed. I bet that you could:

* lay down a latitude/longitude grid on the earth,
* take a "sample point" at every intersection point between latitude and longitude lines (done at, say, 1 degree or 5 degree intervals)
* exclude points that fell in the ocean
* calculate the great circle geographic distance between all points
* input this into a hierarchical clustering algorithm

...and you would get out a nice treelike diagram that would show each of the major continents as clusters, probably with linked continents forming groups of groups. This would be successful hierarchical clustering, for sure, but it obviously wouldn't be telling you about the evolutionary connectedness of the continents or anything like that. It's a summary of the geographical distance matrix between the samples.

The interesting question is -- what if you had the samples to calculate the genetic distances between the (indigenous) humans sampled at each of these points? Would you get a clustering scheme all that much different than the one you would get by clustering pure geographic distance?
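The geographic half of this thought experiment can be sketched in a few lines of Python. This is a toy version under loud assumptions, not a real analysis: the "land mask" is two hypothetical rectangular landmasses rather than real continents, the grid uses 10 degree intervals, the earth is treated as a sphere of radius 6371 km, and average linkage is my choice of hierarchical clustering rule. All function names are illustrative.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # spherical-earth approximation


def on_land(lat, lon):
    """Toy land mask: two hypothetical rectangular landmasses, not real continents."""
    return 0 <= lat <= 20 and (0 <= lon <= 20 or 60 <= lon <= 80)


def grid_points(step=10):
    """Sample every latitude/longitude intersection and drop the 'ocean' points."""
    return [(lat, lon)
            for lat in range(-90, 91, step)
            for lon in range(-180, 180, step)
            if on_land(lat, lon)]


def great_circle_km(a, b):
    """Haversine great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def average_linkage(points, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[p] for p in points]

    def dist(c1, c2):
        # Average pairwise great-circle distance between two clusters.
        return sum(great_circle_km(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


clusters = average_linkage(grid_points(), n_clusters=2)
for cluster in clusters:
    print(sorted(cluster))
```

The two clusters that come out are simply the two landmasses. The algorithm has recovered where the land is, which is all a geographic distance matrix can tell it.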
Probably there would be some differences, since we know that humans started in Africa and genetic diversity is highest there -- but would it be a big effect? Or would the "clustering" that is observed in the genetic distances primarily just be the effect of isolation by distance and the fact that humans live on land, which is unevenly distributed around the globe? This might verify "race" in a very vague way, but it would be a long way from the traditional concepts that most people have in their heads. Lastly, we reach point 6.

6. What are the implications of these differences? Coyne writes:

Not much. There are some medical implications. [...]

Here, interestingly, Templeton might actually agree, at least for the United States. Although it is illegitimate to do discrete sampling of continuous variation, and then conclude that discrete variation exists in reality, it is important to note that in some cases, "discrete sampling" in history has played a role in producing modern populations. For example, the population of the U.S. is not a random sample of the global human population. For historical reasons, in the formation of the U.S. population there really were major contributions from discrete parts of the globe -- Europe, West Africa, East Asia, etc. For this reason, if you were to examine a few hundred randomly sampled Americans, you would indeed find detectable discrete genetic variation! In this (very limited) sense, "race" is physically real, but that physical reality was completely constructed by contingent cultural events in recent history! In certain cases, this might even be medically relevant, for example if certain genotypes are common in a certain population, and these genotypes react especially well or especially poorly to a particular drug. Certainly you would want to know this if you were a doctor prescribing medication to a patient, and government agencies like the FDA will want to know it when they are approving drugs for sale or for insurance coverage. And certainly, it seems arbitrary and unfair to do drug tests on one group (say, white males) and then assume that the results will apply with equal validity to everyone else. Finally, the fact that there is some geographical structure (whether discrete or continuous) in the human population, or in the weird sample of the human population that we have in the U.S., is important to take into account for statistical reasons. Many studies are attempting to correlate genetic traits with various diseases and conditions. 
But these studies can be badly misled if there is correlation in the genetic data that is due to, say, geographic ancestry that is unrecognized. This is a vast field, and pretty much represents a huge job market for evolutionary biologists and population geneticists who decades ago would have puttered around putting equations on chalkboards, but who are now crucial components of any well-done study involving genetic data from populations.

Those are some of the arguments favoring taking "race" into account in medicine and medical research. However, as you can imagine, there are all kinds of difficulties and dangers. A big one is that some statistical effect that is true in one situation -- say, people who identify as African American have a higher risk of condition X -- will be illegitimately extrapolated to other people who fall into doctors' and the public's cultural definition of African American. What if one's ancestry is from southern Africa, but the genetic trait that correlates with the condition derives from West Africa? The statistical generalization made in some study may not apply to you.

This post is long enough, but hopefully readers can see that the topic of race in the human species is a complicated one, fraught with danger even from a purely scientific point of view, even before we get to the even more hazardous arenas of culture and politics. I think that it should be clear that one cannot give the question "Do races exist?" any kind of simple "yes" answer in humans. One can say "yes" to the question, "Is there geographical structure in human genetics?", but hopefully I have shown that this is a long way from establishing that there is any kind of discrete genetic structure in the human population at large, i.e. discrete "racial groups" which could be identified and counted.
It is possible that with ever more data, some shadow of this idea may be statistically supportable, but it seems unlikely that it will compare in strength to the strong overall pattern of continuous genetic variation following an isolation-by-distance model. Conclusion I'll let the conclusion of Templeton (1998) speak for itself:

Conclusions

The genetic data are consistently and strongly informative about human races. Humans show only modest levels of differentiation among populations when compared to other large-bodied mammals, and this level of differentiation is well below the usual threshold used to identify subspecies (races) in nonhuman species. Hence, human races do not exist under the traditional concept of a subspecies as being a geographically circumscribed population showing sharp genetic differentiation. A more modern definition of race is that of a distinct evolutionary lineage within a species. The genetic evidence strongly rejects the existence of distinct evolutionary lineages within humans. The widespread representation of human "races" as branches on an intraspecific population tree is genetically indefensible and biologically misleading, even when the ancestral node is presented as being at 100,000 years ago. Attempts to salvage the idea of human "races" as evolutionary lineages by invoking greater racial purity in the past followed by admixture events are unsuccessful and falsified by multilocus comparisons of geographical concordance and by haplotype analyses. Instead, all of the genetic evidence shows that there never was a split or separation of the "races" or between Africans and Eurasians. Recent human evolution has been characterized by both population range expansions (with perhaps some local replacements but no global replacement within the last 100,000 years) and recurrent genetic interchange. The 100,000 years ago "divergence time" between Eurasians and Africans that is commonly found in the recent literature is really only an "effective divergence time" in sensu Nei and Roychoudhury (1974, 1982). Since no split occurred between Africans and Eurasians, it is meaningless to assign a date to an "event" that never happened. Instead, the effective divergence time measures the amount of restricted gene flow among the populations (Slatkin 1991).
Because of the extensive evidence for genetic interchange through population movements and recurrent gene flow going back at least hundreds of thousands of years ago, there is only one evolutionary lineage of humanity and there are no subspecies or races under either the traditional or phylogenetic definitions. Human evolution and population structure have been and are characterized by many locally differentiated populations coexisting at any given time, but with sufficient genetic contact to make all of humanity a single lineage sharing a common, long-term evolutionary fate.

My recommendation would be to ditch the discussion of "human races" in science wherever possible (it may be unavoidable in certain fields, e.g. medicine, due to its cultural reality), particularly in genetics, and go with "geographic ancestry" instead. Some of us humans will trace back to a small region somewhere, others of us will be fusions from two or five or six continents. Coyne's post argued that races exist, but were becoming less distinct due to interbreeding. I would say that continuous geographic structure exists, but that it is becoming weaker as the human population becomes more and more panmictic.

Note: After I wrote the above, I noticed that Coyne has a followup post up that cites the famous "map of Europe reflected in genetics" (paraphrasing) figure from Novembre et al. (2008). He notes the evidence for the isolation-by-distance effect, and concludes that there are no races in Europe, at least:

In other words, genetically closer populations are more genetically similar, as expected if individuals tend to mate with other individuals from the same country, and close by. This is an "isolation by distance" model: genetic similarity falls off gradually with distance. As the authors note, this does not support the existence of "discrete, well-differentiated populations," i.e., there are no races. None are expected in such a small area, particularly because biological "races" are those populations that (at least at one time) were geographically isolated and genetically differentiated. That geographical isolation never happened in Europe. [...] As I said, this doesn't show that there are discrete "races" in Europe, and I don't think there are obviously discrete "races" anywhere these days, though there is large-scale genetic differentiation among worldwide population suggesting that such races once existed as relatively discrete and geographically isolated populations. The discreteness that once existed, or so I think, is now blurring out as transportation and migration are beginning to mix the discrete groups into not a melting pot, but sort of a lumpy pudding of humanity. What is clear is that, with considerable accuracy, you can diagnose an individual's geographic origin from his genes. Nearly everyone's DNA contains reliable information about their recent and ancient past. We are not all genetically alike. If we were, you couldn't do studies like the one of Novembre et al. But neither are we radically different genetically, for if we were, you wouldn't need hundreds of thousands of genes for such accurate predictions.

So Coyne is actually approaching the Templeton view of human genetic diversity pretty rapidly, although he is still buying into "relatively discrete" races at a larger scale (what are they then?), which I would disagree with.

References

Edwards, A. W. F. (2003). "Human genetic diversity: Lewontin's fallacy." BioEssays 25(8), 798-801.

Gagneux, P., C. Wills, U. Gerloff, D. Tautz, P. A. Morin, C. Boesch, B. Fruth, G. Hohmann, O. A. Ryder and D. S. Woodruff (1999). "Mitochondrial sequences show diverse evolutionary histories of African hominoids." Proceedings of the National Academy of Sciences 96(9), 5077-5082.

Lewontin, R. C. (1972). "The apportionment of human diversity." In: T. Dobzhansky, M. K. Hecht and W. C. Steere (eds.), Evolutionary Biology 6. New York: Appleton-Century-Crofts, 381-398.

Millstein, R. L. (2009). "Populations as Individuals." Biological Theory 4(3), 267-273.

Templeton, A. R. (1998). "Human Races: A Genetic and Evolutionary Perspective." American Anthropologist 100(3), 632-650.

Notes

* For example (1) he promotes a method that seems to be subjective and to not work (it's called Nested Clade Analysis, it's not about nested clades in the traditional phylogenetic/macroevolutionary senses of those words, and it would take a while to explain... but see "Why does a method that fails continue to be used?" by Lacy Knowles for the basics, and Google Scholar for the ongoing discussion), and (2) he doesn't seem to get Bayesian methods, and his one-man war against ABC (Approximate Bayesian Computation) consists primarily of misunderstandings of Bayesian logic (although ABC does have various limitations and problems, they aren't primarily the ones he identifies; see e.g. "Invalid arguments against ABC: Reply to AR Templeton" by Csillery et al.). However, NCA is Templeton's pet method and was popular for a long time (thousands of citations), and Bayesian logic can be difficult even for very smart professionals, especially if they have been raised thinking of frequentism as the only way to think about statistics. Templeton also has vast experience in human population genetics and popgen applied to medicine, so I don't think his opinions on broader, simpler issues can be dismissed because of the fight over NCA/ABC. I might as well add that my specialty is evolutionary biogeography at the macroevolutionary level of large numbers of species related by a phylogeny, not phylogeography, which operates primarily at the level of population genetics of one species or a few closely-related species/populations, so my opinions in this arena are still developing as I learn more about it.

** Another note: re-reading Templeton (1998) after some time just now, I found rather more Nested Clade Analysis in there than I remembered. I don't think this affects the parts of his conclusions that I quoted, except perhaps for his skepticism of the 100,000 year "divergence time" for Africans vs. non-Africans and his endorsement of an alternative model (derived from his "trellis model"). Templeton in 1998 is in part lobbying for the "trellis" model, where all humans are connected by gene flow and always have been, all the way back to Homo erectus. He makes the modern isolation-by-distance observation a part of the evidence for this, but I don't see any reason to endorse this. We could have a fairly standard Recent Out-of-Africa model (adding minor interbreeding with Neandertals etc.) and still get the isolation-by-distance effect as long as humans have the behavior of breeding with nearby tribes and thus having high gene flow but mostly with adjacent groups.

Other links of interest:
http://en.wikipedia.org/wiki/%22Human_genetic_diversity:_Lewontin%27s_fallacy%22_%28scientific_paper%29

64 Comments

harold · 29 February 2012

As far as I can tell Coyne has no point here; he claims that human races exist but can't say how many there are (and therefore, by extension, if he can't identify what the races are in order to count them, he can't say who belongs to what "race").

Meanwhile, there is unequivocally a sufficient association of certain medical issues with certain socially-defined ethnic identities, to warrant recognition of a patient's self-identified ethnicity in some circumstances.

Many associations are the result of environmental factors. For example, "Mormon" isn't an ethnic group, but it is a group of people who have a low rate of pathologies related to excess consumption of ethanol or use of cigarettes. However, if an originally Mormon person takes up heavy drinking and smoking, that characteristic disappears. This is an example of how membership in a culturally defined group can be related to risk of certain diseases, without any genetic input.

On the other hand, sometimes there is an association which may have something to do with genes. People of East Asian and Amerindian descent have different rates of certain types of lymphoma/leukemia (fortunately these particular diseases are extremely rare; their incidence would not have significant impact on overall life expectancy of any population, but people get them and they need to be diagnosed and treated). As it happens, knowing the patient's ethnic group has absolutely nothing to do with the process of diagnosing these disorders, but if this association holds up, it might lead to a clue that would help us to better understand the pathogenesis of these disorders. Note that risk of these rare disorders is not increased in one "race" as culturally defined. The disorders discussed in the references below are also strongly associated with viruses, HTLV-1 and EBV, respectively. Other environmental factors can't be ruled out. Nevertheless, they occur in multiple diverse environments, but the increased (although still very low) risk in East Asian and Amerindian populations seems to hold up.

http://www.ncbi.nlm.nih.gov/pubmed/21205073

http://www.ncbi.nlm.nih.gov/pubmed/21738302

http://www.ncbi.nlm.nih.gov/pubmed/21510468

Conclusion - "Race" is a social construct based on clustering of a few superficial characteristics that often lumps together populations that aren't even closely related. However, pathology can associate with culturally defined ethnicity, probably usually for environmental reasons, but sometimes perhaps for genetic reasons.

D P Robin · 29 February 2012

Any of us, who took Biological Anthropology at the University of Michigan, from the 60's until a year or so ago, had the chance to take C. Loring Brace's course on race, which made most, if not all these points. The faculty in my time there (mid-late 1970's) had little good to say about the concept of human races at all. It was hard to cling to ideas that races were "real" after hearing Conrad Kottak talk about racial taxonomy in Brazil (where he had done fieldwork).

IMHO, this is old news, but I suppose race is one of those concepts that will take a loooong time to die its well deserved death.

dpr

Nick Matzke · 29 February 2012

D P Robin said: Any of us, who took Biological Anthropology at the University of Michigan, from the 60's until a year or so ago, had the chance to take C. Loring Brace's course on race, which made most, if not all these points. The faculty in my time there (mid-late 1970's) had little good to say about the concept of human races at all. It was hard to cling to ideas that races were "real" after hearing Conrad Kottak talk about racial taxonomy in Brazil (where he had done fieldwork). IMHO, this is old news, but I suppose race is one of those concepts that will take a loooong time to die its well deserved death. dpr

I thought it was old news also, but then I saw Jerry Coyne, a famous population geneticist and current head of the Society for the Study of Evolution, making these arguments.

diogeneslamp0 · 29 February 2012

Uh, what is NUMT in the big chart above?

Henry J · 29 February 2012

Summary: the boundary between "races" is way fuzzier even than the boundary between closely related species.

Also the differences between "races" are often swamped by the differences within each of them.

I suppose that a long long time ago (in a galaxy far far away?), Europe and Asia may have had groups more or less distinct from each other or the ones in Africa (though I wouldn't count on it), but as the above points out, that ain't the case today. (Oh, and "African" is certainly not a distinct group!)

Henry

howard.peirce · 29 February 2012

Following the death of Christopher Hitchens, Coyne posted an impassioned defense of virulent misogyny. That's when I stopped reading him. Unfortunately, in the early 21st century, "Conservatism" trumps science, atheism, and rationalism. I wish this weren't true. It's a shame, too, because he's potentially a valuable person, and not a liability. But these are the times we live in.

Nick Matzke said:
D P Robin said: Any of us, who took Biological Anthropology at the University of Michigan, from the 60's until a year or so ago, had the chance to take C. Loring Brace's course on race, which made most, if not all these points. The faculty in my time there (mid-late 1970's) had little good to say about the concept of human races at all. It was hard to cling to ideas that races were "real" after hearing Conrad Kottak talk about racial taxonomy in Brazil (where he had done fieldwork). IMHO, this is old news, but I suppose race is one of those concepts that will take a loooong time to die its well deserved death. dpr
I thought it was old news also, but then I saw Jerry Coyne, a famous population geneticist and current head of the Society for the Study of Evolution, making these arguments.

Robert Byers · 1 March 2012

YEC here.
The bible says the people were one and then they divided up and covered the earth.
Is there races?
Well if merely being very segregated people groups and this a good long time equals race then there is races.
It doesn't matter how different people become and small or great differences are just manifestations of being segregated peoples.
For example I am of the great tribes of German and Celt. In no way am i biologically close to the slavic peoples. We were segregated at babel or at least everyone must agree since the language changes.
Yet I look exactly like this people. THis just because of the same adaptations due to the Northern areas we live in.
Likewise the great differences in men are from triggers in our bodies in reaction to needs in the world.
A problem here is that evolution is teaching small steps leading to big changes forces conclusions of great genetic segregation.
Yet in fact its just quick physical adaption that affects or doesn't segregated populations in like or unlike ways.

They are trying to say there is a white Adam and Eve and then the people segregated into different languages.
instead there was first the language segregation in Southern areas and then these segregated peoples moved into northern areas and equally adapted for like needs but nowhere near each other. JUst like animals in the north who all get white fur.

By the way its brought again, as a careful option, that iQ is different. some smarter then others upon birth.
This is impossible for many reasons yet shows that giving in a little on this will lead to whatever anyone thinks.
Yes different people and different peoples are "intelligent" in different degrees but this is just about time and place.
all babies are equal in bringing no intelligence with them. its all learnt after birth with results relative to motivation and whats within reach.

Evolutionary presumptions are what confuses matters.

dalehusband · 1 March 2012

Robert Byers said: YEC here. The bible says the people were one and then they divided up and covered the earth.

It does not matter what a 2000 year old book says. Only what can be discovered in the present reality we all live in.

Is there races? Well if merely being very segregated people groups and this a good long time equals race then there is races. It doesn't matter how different people become and small or great differences are just manifestations of being segregated peoples. For example I am of the great tribes of German and Celt. In no way am i biologically close to the slavic peoples. We were segregated at babel or at least everyone must agree since the language changes. Yet I look exactly like this people. THis just because of the same adaptations due to the Northern areas we live in. Likewise the great differences in men are from triggers in our bodies in reaction to needs in the world. A problem here is that evolution is teaching small steps leading to big changes forces conclusions of great genetic segregation. Yet in fact its just quick physical adaption that affects or doesn't segregated populations in like or unlike ways. They are trying to say there is a white Adam and Eve and then the people segregated into different languages. instead there was first the language segregation in Southern areas and then these segregated peoples moved into northern areas and equally adapted for like needs but nowhere near each other. JUst like animals in the north who all get white fur. By the way its brought again, as a careful option, that iQ is different. some smarter then others upon birth. This is impossible for many reasons yet shows that giving in a little on this will lead to whatever anyone thinks. Yes different people and different peoples are "intelligent" in different degrees but this is just about time and place. all babies are equal in bringing no intelligence with them. its all learnt after birth with results relative to motivation and whats within reach.

You clearly have never studied human brains and how they function.

Evolutionary presumptions are what confuses matters.

Once again, you have lied outright. That's about all you do, it seems.

Dave Luckett · 1 March 2012

There are times when Byers is embarrassing, and times when he's painful. This is two of those times.

https://www.google.com/accounts/o8/id?id=AItOawl13BBLvI0CYAoRYez4pAStq9oizm0pW2I · 1 March 2012

@diogeneslamp0,

NUMT is a sequence of mitochondrial origin that was copied into the nuclear genome. The tree is based on a mitochondrial genome segment.

The relevant part from the paper, "We examined part of hypervariable CR1, corresponding to the human mtDNA nucleotide positions 16,053–16,465 (16), from 1,158 unique haplotypes, almost all (,1000) of known geographic provenance. These include 83 previously unpublished haplotypes representing 70 chimpanzees and 13 bonobos and 1,070 published sequences from GenBank (811 humans from around the world, 26 gorillas, 11 bonobos, and 222 chimpanzees). In addition, we included one Neanderthal (17) and one human nuclear sequence of mitochondrial origin (numt) involving a CR1 sequence that became inserted into chromosome 11 sometime in the past (18). Haplotypes compared ranged in length from 300–415 bp except for three comprising 270–298 bp that were included to capture the full extent of known variability. Three published orangutan sequences representing both subspecies were used as an outgroup for analytical purposes (19, 20). GenBank accession nos. of all sequences used, with their 35 specific source references and our alignments, are available from P.G."

-The Other Jim

D P Robin · 1 March 2012

Byers and Coyne on the same page?

iain.brown · 1 March 2012

How about the Aboriginal Australians? As far as I know they were genetically isolated for around 50,000 years. Could they form a race as you're defining it?

DS · 1 March 2012

Robert Byers said: YEC here. The bible says the people were one and then they divided up and covered the earth. Is there races? Well if merely being very segregated people groups and this a good long time equals race then there is races. It doesn't matter how different people become and small or great differences are just manifestations of being segregated peoples. For example I am of the great tribes of German and Celt. In no way am i biologically close to the slavic peoples. We were segregated at babel or at least everyone must agree since the language changes. Yet I look exactly like this people. THis just because of the same adaptations due to the Northern areas we live in. Likewise the great differences in men are from triggers in our bodies in reaction to needs in the world. A problem here is that evolution is teaching small steps leading to big changes forces conclusions of great genetic segregation. Yet in fact its just quick physical adaption that affects or doesn't segregated populations in like or unlike ways. They are trying to say there is a white Adam and Eve and then the people segregated into different languages. instead there was first the language segregation in Southern areas and then these segregated peoples moved into northern areas and equally adapted for like needs but nowhere near each other. JUst like animals in the north who all get white fur. By the way its brought again, as a careful option, that iQ is different. some smarter then others upon birth. This is impossible for many reasons yet shows that giving in a little on this will lead to whatever anyone thinks. Yes different people and different peoples are "intelligent" in different degrees but this is just about time and place. all babies are equal in bringing no intelligence with them. its all learnt after birth with results relative to motivation and whats within reach. Evolutionary presumptions are what confuses matters.

So that would be a no, you can't deal with the evidence at all. Too bad for you Robert. As for Adam and Eve, they were obviously black, but that's OK, undoubtedly Jesus was too. Get a clue man. Biblical presuppositions are what confuses you Robert, get rid of them and deal with reality. I suggest an immediate dump to the bathroom wall for Robert and his fantasies. Why derail an otherwise intelligent conversation?

eric · 1 March 2012

It seems we are rehashing Darwin, yet again. OOS 6th edition, chapter 2, section 3 (titled "Doubtful Species"):

How many of the birds and insects in North America and Europe, which differ very slightly from each other, have been ranked by one eminent naturalist as undoubted species, and by another as varieties, or, as they are often called, geographical races!

He even covered Nick's sampling issue!

After this discussion, the result of so much labour, he [A. de Candolle] emphatically remarks: "They are mistaken, who repeat that the greater part of our species [oaks] are clearly limited, and that the doubtful species are in a feeble minority. This seemed to be true, so long as a genus was imperfectly known, and its species were founded upon a few specimens, that is to say, were provisional. Just as we come to know them better, intermediate forms flow in, and doubts as to specific limits augment."

Maybe it's time to resurrect Darwin's term 'doubtful species,' and apply it to ourselves.

Paul Burnett · 1 March 2012

Robert Byers said: ...iQ is different.

Would you happen to know your IQ, Robert? Do you know if it's positive or negative?

diogeneslamp0 · 1 March 2012

Robert Byers said: YEC here... Is there races? ...there is races.

EV here. Is the creationists learning? No.

diogeneslamp0 · 1 March 2012

EV here.

Thanking Masked Panda for definition of NUMT in above graph. But why is NUMT such an outlier?

Among the humans, who are the genetic outliers?

Khoisan? Nuristani? Melanesians?

The cast of Jersey Shore?

John · 1 March 2012

Dave Luckett said: There are times when Byers is embarrassing, and times when he's painful. This is two of those times.

Agreed. But he's utterly pathetic without any hope for redemption. What more do you expect?

Nick Matzke · 1 March 2012

diogeneslamp0 said: Thanking Masked Panda for definition of NUMT in above graph. But why is NUMT such an outlier?

It's some combination of two possibilities. Nuclear mtDNAs (NUMTs):

(a) are basically junk DNA, at least usually, and so experience higher substitution rates than a sequence under stabilizing selection would (the functioning genes in the mtDNA are under stabilizing selection, for instance).

(b) It's quite possible that that particular NUMT originated in the human lineage back in the Homo erectus days. For example, perhaps some mitochondrion in some germline cell was damaged, its DNA was floating around, and some of it got spliced into the nucleus by mistake by some DNA repair enzymes. Then, by chance, this got fixed in the population, or at least became frequent enough to get sampled.

And actually, (b) is a pretty interesting idea if you think about it. If we had enough of these, we'd have some "fossil" genetic information about ancestral populations -- whereas with actual hominid fossils, we will probably never be able to get DNA from a Homo erectus, since the time limit on DNA is about 100,000 years even in the best conditions.

See this paper I was recently on: we discovered that some mtDNAs in domestic horses that had been published in GenBank seemed to stick out against the background of a bunch of sequences that we had. One possibility is that these were NUMTs that had accidentally been sequenced and were mistakenly thought to be true mtDNAs. When you grind up some cells and try to amplify your target mtDNA sequence using PCR, there is no guarantee you won't accidentally amplify a NUMT instead, without careful double-checking. (Ditto for having a computer reconstruct a sequence from a library of short sequences produced by shotgun sequencing. The computer could accidentally stick some NUMT sequence into your mtDNA sequence.)

Lippold, S., N. J. Matzke, M. Reissmann and M. Hofreiter (2011). "Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication." BMC Evolutionary Biology 11(1), 328.

Among the humans, who are the genetic outliers? Khoisan? Nuristani? Melanesians? The cast of Jersey Shore?

If the argument I presented in the main post is correct, this is a meaningless question. The biggest overall genetic distance, measured across many loci, will be between samples that are (ancestrally, pre-1600s) geographically distant. In just the mtDNA sequence, the biggest diversity and deepest branches are in Africa, and this is also true for many individual loci in the nuclear DNA (although not all, e.g. the "Neandertal markers" are deeply-branching loci in Eurasians). But this cannot be extrapolated to make statements about populations and genetic distance, because of interbreeding. The mtDNA tree will not line up with the tree from nuclear locus #1, which won't line up with nuclear locus #2, etc. This all washes out when averaged across many loci, and you get the observation that genetic similarity correlates with geographic distance.

Nick Matzke · 1 March 2012

Byers and Coyne on the same page?

No. My disagreement with Coyne is fairly minor in the grand scheme of things. Byers is on another planet.

diogeneslamp0 · 1 March 2012

howard.peirce said: Following the death of Christopher Hitchens, Coyne posted an impassioned defense of virulent misogyny. That's when I stopped reading him. Unfortunately, in the early 21st century, "Conservatism" trumps science, atheism, and rationalism. I wish this weren't true.

Where did he post that, and what was his logic? Um, should we read Coyne out of the movement now? I’m still steamed about that goddamn peppered moth thing. To think that the IDologues were citing COYNE in support of their mottephobic conspiracy theory! I’m reading Coyne’s “Why Evolution Is True”, and noticed outrageous errors. At one point he writes that some woodpecker’s tongues are anchored in their nostrils! That’s a creationist myth. He can’t double-check that on TalkOrigins? You realize TalkOrigins is far more accurate than a tree-killin book?

https://me.yahoo.com/a/iX7ogXAAzJTzNjXBRaGw5n0Bwpz2uulxmPduIrnhZEg-#cc736 · 1 March 2012

Matt,

I personally believe that I can pretty much figure out if someone traces his ancestors to Sweden, India, East Africa or Japan, or some place in the general region. Rednecks also believe this. Moreover, the redneck will see that you and your science are barking mad when you claim, using mumbo-jumbo technicalities, that human races do not exist. Every time biology or medicine finds another population-specific trait, racists are vindicated, and science falsified.

I personally believe that it is necessary for biology to recognize that human races exist, with the caveat that they are neither sharply defined genetically nor biologically, that the genetic differences that exist are small and of recent origin (abt. 50000 years; 2000 generations), and that whatever population differences that may exist can be usefully combined through miscegenation. In any case, Edwards is right – denying the geography of genetics is not a solution.

https://me.yahoo.com/a/iX7ogXAAzJTzNjXBRaGw5n0Bwpz2uulxmPduIrnhZEg-#cc736 · 1 March 2012

Previous "Masked Panda" is W. Benson

Leszek · 1 March 2012

Since you people in general know a lot more than I do, I don't normally get to contribute.

However in this case I think I have something visual that can help get the point across.

On second thought this is also a question which I will get to in a moment. But first what I think:

If you look at the geographical distribution of a genetic trait using a map, you will often notice that the distribution doesn't really line up with the borders of races. For example this link
http://en.wikipedia.org/wiki/File:Westernparadigm_blue_eye_color_map.jpg

Shows the distribution of blue eyes in Europe. There is not any distinction that I can tell separating Germanic from Slavic or Francs for example. In fact the distribution seems to ignore most human ideas for territories and racial ancestry.

I am sure someone here can correct me if I am wrong, but I don't recall any similar map that does respect such boundaries. My question is: am I correct that this pattern is more or less repeated for any particular genetic trait?

Incidentally, looking at the resident troll: I have both Slavic and Germanic ancestry, and I got my blue eyes from both sides. The groups aren't nearly as segregated as it imagines (as the diagram shows).

harold · 1 March 2012

Matt, I personally believe that I can pretty much figure out if someone traces his ancestors to Sweden, India, East Africa or Japan, or some place in the general region. Rednecks also believe this.

I doubt if you can do this as accurately as you seem to think; nevertheless, it is irrelevant even if you can. No-one here is denying the obvious fact that superficial traits like skin/hair/eye pigmentation and some others cluster geographically. It is very clearly explained in the original post why this is different from biological recognition of "races".

Moreover, the redneck will see that you and your science are barking mad when you claim, using mumbo-jumbo technicalities, that human races do not exist.

No-one has made this claim either. Of course the cultural construct "race" exists. Of course, in US society, it is related to pigmentation of skin, hair texture, and some other features. The term "race" is used differently in biology. The point here is that culturally recognized "races" of humans are not the same thing as biological races.

Every time biology or medicine finds another population-specific trait, racists are vindicated, and science falsified.

Racism is the subjective ethical decision to deliberately treat some people badly or unfairly because of their perceived ethnic origin. I personally prefer the term "ethnic bigotry" for clarity (since it is common for people to be bigoted against others whom they define as belonging to a different ethnic group, even if the two groups cannot be distinguished based on physical appearance), although I sometimes use "racism" because it is a more powerful word. To repeat for emphasis, the decision of whether or not to be a racist is purely a subjective ethical decision. For example, people with trisomy 21 unequivocally are genetically different from other humans in a significant way, and are disproportionately likely to have certain problems due to genetics. Yet I oppose treating them badly or unfairly. In an opposite example, various closely related Scandinavian populations who barely spoke different dialects, and were indistinguishable to outsiders, have been quite bigoted against each other in the past. Science cannot tell you whether it is "good" or "bad" to treat people unfairly. As it happens, racists often do make exaggerated claims that are falsified by science. I noted above that certain hematopoietic malignancies have different rates of incidence in different populations. This does not conflict with Nick Matzke's point about race.

I personally believe that it is necessary for biology to recognize that human races exist, with the caveat that they are not sharply defined either genetically or biologically, that the genetic differences that exist are small and of recent origin (about 50,000 years; 2,000 generations),

If you agree with that, then what are you arguing about?

and that whatever population differences that may exist can be usefully combined through miscegenation.

No doubt you are unaware of the obnoxious historical associations, and thus, potentially offensive nature, of the word "miscegenation". If it is not your objective to deliberately offend, or to send coded signals of approval of racism - and hopefully it is not - I recommend that you use a synonym in the future.

In any case, Edwards is right – denying the geography of genetics is not a solution.

Luckily, no-one is doing that at all, nor denying the existence of the cultural construct "race". The point is that genetic differences between human populations do not rise to the level that would justify designating different biological races.

mpearle017 · 1 March 2012

***Human “races” are below the thresholds used in other species, so valid traditional subspecies do not exist in humans. A “subspecies” can also be defined as a distinct evolutionary lineage within a species. ***

Templeton unfortunately appears to have based this idea on a misreading of Smith et al.'s 1997 article from Herpetological Review entitled “Subspecies and Classification”. Templeton suggested that there needed to be an FST of at least 25%-30%. However, Smith et al. were not referring to FST. They were referring to the 75% rule for identifying subspecies.

Incredibly few people have bothered to go back and read Smith's paper!

mpearle017 · 1 March 2012

*** Key concept: if you sample humans along a transect, the typical pattern is that genetic difference will increase in a continuous fashion with greater distance between samples, rather than in large discrete “steps” as you would expect with well-separated allopatric populations, or discrete “races”.***

If you read Rosenberg's 2005 paper on Clines, Clusters, and the Effect of Study Design on the Inference of Human Population Structure there are in fact discrete clusters:

"That is, for the five clusters with K = 5 in Figure 2 of the present study and in Figure 1 of [3]—corresponding to Africa, Eurasia (Europe, Middle East, and Central/South Asia), East Asia, Oceania, and the Americas—an intercluster population pair is plotted only if it includes one population from Africa and one from Eurasia, one from Eurasia and one from East Asia, or one from East Asia and one from Oceania or the Americas.

For population pairs from the same cluster, as geographic distance increases, genetic distance increases in a linear manner, consistent with a clinal population structure. However, for pairs from different clusters, genetic distance is generally larger than that between intracluster pairs that have the same geographic distance. For example, genetic distances for population pairs with one population in Eurasia and the other in East Asia are greater than those for pairs at equivalent geographic distance within Eurasia or within East Asia. Loosely speaking, it is these small discontinuous jumps in genetic distance—across oceans, the Himalayas, and the Sahara—that provide the basis for the ability of STRUCTURE to identify clusters that correspond to geographic regions."

mpearle017 · 1 March 2012

Leszek said: If you look at the geographical distribution of a genetic trait using a map, you often will notice that the distribution doesn't really line up with the borders of races.

Look up Steve Hsu's post on 40 years of clustering (also see his posts on European genetic substructure, and scientific basis for race). "Represent each individual human by their DNA sequence. When aggregated, they cluster into readily identifiable groups. This has been known for 40 years now, although the technology and methods of analysis continue to improve. Below are results from 1966, 1978 and 2008." http://infoproc.blogspot.co.nz/2009/06/genetic-clustering-40-years-of-progress.html

mpearle017 · 1 March 2012

D P Robin said: Any of us, who took Biological Anthropology at the University of Michigan, from the 60's until a year or so ago, had the chance to take C. Loring Brace's course on race, which made most, if not all these points. The faculty in my time there (mid-late 1970's) had little good to say about the concept of human races at all. It was hard to cling to ideas that races were "real" after hearing Conrad Kottak talk about racial taxonomy in Brazil (where he had done fieldwork). dpr

Anthropologist recalled this conversation with Loring Brace: "One commenter asked whether Loring Brace actually believes what he says and writes. I’m sure he does. ‘Belief’ is rarely the result of careful thinking and impartial weighing of pros and cons. People want to feel good about themselves, and often they prefer feeling good and being wrong to feeling bad and being right. I remember corresponding with Loring Brace about Lewontin’s finding that genetic variation within human populations greatly exceeds genetic variation between human populations. I pointed out that we see this same genetic overlap between dog breeds that are nonetheless phenotypically distinct. He replied that dog breeds are a creation of human-directed selection and, thus, irrelevant. I then pointed out that not all dog breeds have been created by kennel clubs. More to the point, there are many sibling species that show the same kind of genetic overlap and yet are distinct in anatomy and behavior. At that point, he backed off completely. He said that genetic arguments were not critical to this issue anyway. I also had the creepy feeling that he wasn’t really surprised by what I said." http://westhunt.wordpress.com/2011/11/14/six-black-russians/

mpearle017 · 1 March 2012

edit: that should have read anthropologist Peter Frost.

mpearle017 · 1 March 2012

***Humans show only modest levels of differentiation among populations when compared to other large-bodied mammals, and this level of differentiation is well below the usual threshold used to identify subspecies (races) in nonhuman species. ***

As noted above - Templeton's comment that the level of differentiation is below the usual threshold is based on his own misreading of the Smith et al. 1997 "Subspecies and Classification" paper.

Also, if you look at the figures for the level of genetic diversity within humans, it is actually quite similar to that of other mammals that have various races or subspecies.

(eg Woodley 2009) http://tinyurl.com/7wsb43d

Nick Matzke · 1 March 2012

mpearle017 said: *** Key concept: if you sample humans along a transect, the typical pattern is that genetic difference will increase in a continuous fashion with greater distance between samples, rather than in large discrete “steps” as you would expect with well-separated allopatric populations, or discrete “races”.*** If you read Rosenberg's 2005 paper on Clines, Clusters, and the Effect of Study Design on the Inference of Human Population Structure there are in fact discrete clusters: "That is, for the five clusters with K = 5 in Figure 2 of the present study and in Figure 1 of [3]—corresponding to Africa, Eurasia (Europe, Middle East, and Central/South Asia), East Asia, Oceania, and the Americas—an intercluster population pair is plotted only if it includes one population from Africa and one from Eurasia, one from Eurasia and one from East Asia, or one from East Asia and one from Oceania or the Americas. For population pairs from the same cluster, as geographic distance increases, genetic distance increases in a linear manner, consistent with a clinal population structure. However, for pairs from different clusters, genetic distance is generally larger than that between intracluster pairs that have the same geographic distance. For example, genetic distances for population pairs with one population in Eurasia and the other in East Asia are greater than those for pairs at equivalent geographic distance within Eurasia or within East Asia. Loosely speaking, it is these small discontinuous jumps in genetic distance—across oceans, the Himalayas, and the Sahara—that provide the basis for the ability of STRUCTURE to identify clusters that correspond to geographic regions."

I didn't deny that some discreteness was possible, and this would be evidence of it. But even the authors you cite say it is "small". Nievergelt et al. (2007), now cited at the top of the OP, use an ANOVA technique to assess explainers of diversity/similarity. When looking at variation in the genetic similarity of individuals, they find that only 9-11% can be explained by geography (similar to Lewontin's point, which is that ~85% of human genetic diversity is within-population, not between-population, diversity). Of that 10% or so, when comparing continuous (geographic distance) and discrete (continental region) predictors, geographic distance from Addis Ababa comes out as the single strongest predictor (3.65% of the variation), presence/absence in Africa the next (2.37%), then East Asia (2.18%), then Oceania (1.11%), and the rest of the regions less than 1% individually or added up (and of course their sampling was somewhat discrete to start, from 51 sampled populations in this case IIRC). When comparing allele frequencies between the 51 predefined populations with Fst, geography explained about 70% of it, with geographic distance from Addis Ababa again taking first place by explaining 35.75% of the variance, presence/absence in the Americas second with 13.6%, and the rest of the continents explaining less.
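To make the comparison of continuous and discrete predictors easier to see, here is a minimal Python sketch that simply tabulates the individual-level percentages quoted above. The numbers are copied from this comment, not re-derived from Nievergelt et al. (2007); the variable names are made up for illustration, and treating the percentages as additive shares is an assumption.

```python
# Percent of variation in individual genetic similarity attributed to each
# predictor, as quoted in the comment above (assumed additive for illustration).
individual_level = {
    "geographic distance from Addis Ababa (continuous)": 3.65,
    "presence/absence in Africa (discrete)": 2.37,
    "presence/absence in East Asia (discrete)": 2.18,
    "presence/absence in Oceania (discrete)": 1.11,
}

# Total explained by the predictors that were listed individually.
listed_total = sum(individual_level.values())

# The single strongest predictor by percent explained.
strongest = max(individual_level, key=individual_level.get)

# Everything else: the remaining small predictors plus within-group variation.
not_listed = 100.0 - listed_total

for name, pct in sorted(individual_level.items(), key=lambda kv: -kv[1]):
    print(f"{pct:5.2f}%  {name}")
print(f"{listed_total:5.2f}%  total for the listed predictors")
print(f"{not_listed:5.2f}%  everything else")
```

On these figures the continuous predictor is the largest single term, and the listed predictors together account for roughly the low end of the 9-11% range mentioned above.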

Nick Matzke · 1 March 2012

Also, if you look at the figures for the level of genetic diversity within humans, it is actually quite similar to that of other mammals that have various races or subspecies. (eg Woodley 2009) http://tinyurl.com/7wsb43d

I posted two published figures showing genetic diversity in comparable large mammals (one of them of humans' closest relatives!) which contradict this statement. Maybe I stand to be corrected, but your link is broken.

Nick Matzke · 2 March 2012

mpearle017 said: ***Human “races” are below the thresholds used in other species, so valid traditional subspecies do not exist in humans. A “subspecies” can also be defined as a distinct evolutionary lineage within a species. *** Templeton unfortunately appears to have based this idea on a misreading of Smith et al.'s 1997 article from Herpetological Review entitled “Subspecies and Classification”. Templeton suggested that there needed to be an FST of at least 25%-30%. However, Smith et al. were not referring to FST. They were referring to the 75% rule for identifying subspecies. Incredibly few people have bothered to go back and read Smith's paper!

Well, I just did, and it doesn't seem to support what you are saying. Smith et al. do mention a 75% identifiability rule at one point in the paper, but pages before that they discuss variance and overlap in characters in a population. On what I think is page 13 according to this HTML transcription on this guy's Race FAQ derived from Anthro-L newsgroup discussions back in the day, Smith et al. (1997) write:

The non-discrete nature of subspecies is evident from their definition as geographic segments of any given gonochoristic (bisexually reproducing) species differing from each other to a reasonably practical degree (e.g., at least 70-75%), but to less than totality. All subspecies are allopatric (either dichopatric [with non-contiguous ranges] or parapatric [with contiguous ranges], except for cases of circular overlap with sympatry); sympatry is conclusive evidence (except for cases of circular overlap) of allospecificity (separate specific status). Parapatric subspecies interbreed and exhibit intergradation in contact zones, but such taxa maintain the required level of distinction in one or more characters outside of those zones. Dichopatric populations are regarded as subspecies if they fail to exhibit full differentiation (i.e., exhibit overlap in variation of their differentiae up to 25-30%), even in the absence of contact (overlap exceeding 25-30% does not qualify for taxonomic recognition of either dichopatric populations or of parapatric populations outside of their zones of intergradation). Phenotypic adjustment to differing environmental conditions through natural selection is likely the primary factor in divergence of parapatric subspecies, and undoubtedly is involved in some dichopatric subspecies. The founder effect and genetic drift are involved more in the latter than in the former.

According to dictionary.com, "differentia" ("noun, plural -tiae") means "the character or basic factor by which one entity is distinguished from another", so this passage is clearly referring to characters, not subjective human abilities. And since Templeton specifically refers to a 25%-30% cutoff, and also specifically cites p. 13 of Smith et al. (1997) on the previous page of his (Templeton's) article, I think we can be confident he is thinking of this passage and not the later one. That said, it is true that Fst specifically is not mentioned by Smith et al., although they mention genetic processes. The passage by Smith et al. is not particularly easy to interpret in a precise statistical way, but it looks to me like they are saying that subspecies are geographic populations which differ in their variable characters by at least 70-75%, but with some overlap and ability to interbreed (if there were no overlap and/or no interbreeding in sympatry, they would be full species). If anything, it looks to me as though, if we translate Smith et al. (1997) into Fst terms, their recommended minimum cutoff for subspecies would be an Fst of 0.70-0.75, much higher than the cutoff Templeton reads, 0.25-0.30! Or, if we assume individuals from two populations have an average of 70%-75% difference in characters measured between populations, and 25%-30% difference in characters measured within populations, we get Fst values of (0.7-0.3)/0.7 ≈ 0.57 to (0.75-0.25)/0.75 ≈ 0.67. Either way, the cutoff is much higher than the human value of Fst ~0.15, and higher than the cutoff Templeton assumed, so this makes his argument (slightly) stronger, not weaker. Obviously, though, Smith et al. is just one reference that was current when he was writing in 1998, and Templeton himself goes out of his way to argue that genetic cutoffs alone aren't enough to define subspecies.
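The back-of-the-envelope conversion above can be written out as a short Python sketch. This is only the rough ratio used in this comment, (between - within) / between, and not a formal Fst estimator; the function name and constants are hypothetical and chosen for illustration.

```python
def fst_like_ratio(between: float, within: float) -> float:
    """Rough Fst-style ratio: the share of between-population difference
    that is not also found within populations."""
    if between <= 0:
        raise ValueError("between-population difference must be positive")
    return (between - within) / between


# Two ends of the range read off the Smith et al. (1997) passage:
# 70-75% difference between populations, 25-30% overlap within.
low = fst_like_ratio(between=0.70, within=0.30)
high = fst_like_ratio(between=0.75, within=0.25)

HUMAN_FST = 0.15                  # approximate human value cited in the comment
TEMPLETON_CUTOFF = (0.25, 0.30)   # cutoff range Templeton reads from Smith et al.

print(f"implied cutoff range: {low:.2f} to {high:.2f}")
print(f"above Templeton's cutoff: {low > max(TEMPLETON_CUTOFF)}")
print(f"above the human value:    {low > HUMAN_FST}")
```

Under these assumptions even the low end of the implied range sits above both Templeton's cutoff and the human value, which is the point being made about the direction of any misreading.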

Because of these difficulties, the modern evolutionary perspective of a "subspecies" is that of a distinct evolutionary lineage within a species (Shaffer and McKnight 1996) (although one should note that many current evolutionary biologists completely deny the existence of any meaningful definition of subspecies, as argued originally by Wilson and Brown [1953]--see discussions in Futuyma [1986:108-109] and Smith et al. [1997:13]). The Endangered Species Act requires preservation of vertebrate subspecies (Pennock and Dimmick 1997), and the distinct evolutionary lineage definition has become the de facto definition of a subspecies in much of conservation biology (Amato and Gatesy 1994; Brownlow 1996; Legge et al. 1996; Miththapala et al. 1996; Pennock and Dimmick 1997; Vogler 1994). This definition requires that a subspecies be genetically differentiated due to barriers to genetic exchange that have persisted for long periods of time; that is, the subspecies must have historical continuity in addition to current genetic differentiation. It cannot be emphasized enough that genetic differentiation alone is insufficient to define a subspecies. The additional requirement of historical continuity is particularly important because many traits should reflect the common evolutionary history of the subspecies, and therefore in theory there is no need to prioritize the informative traits in defining subspecies. Indeed, the best traits for identifying subspecies are now simply those with the best phylogenetic resolution. In this regard, advances in molecular genetics have greatly augmented our ability to resolve genetic variation and provide the best current resolution of recent evolutionary histories (Avise 1994), thereby allowing the identification of evolutionary lineages in an objective, explicit fashion (Templeton 1994b, 1998a, 1998b; Templeton et al. 1995).

(italics original)

Templeton goes on to argue that humans do not qualify as having subspecies under either the older definition of subspecies as "Geographically Circumscribed, Sharply Differentiated Populations" or the newer phylogenetic definition.

Note on Templeton's Figure 1

This page from the race-FAQ guy says that the Templeton 1998 Figure 1 which I used in my post is comparing Fsts for human nuclear markers to Fsts for mostly mitochondrial markers, and is thus comparing apples and oranges. The page posts some (old) comparisons which indicate that the difference could be dramatic. However, other biases go in other directions (e.g. humans are globally distributed, which in almost any other species would lead to a higher Fst, whereas all the other compared species have much smaller geographic ranges; ditto for population size). And at any rate, the criticism doesn't apply to the great ape mtDNA tree I posted, which shows that the diversity of all modern humans is much lower than that of mere geographic regions of the other great apes. Finally, when I saw Templeton talk in 2011, IIRC he used newer figures which made the same point about genetic diversity; e.g. geographic subspecies of various songbirds have more genetic differentiation than the human species.

Nick Matzke · 2 March 2012

mpearle017 said:
Leszek said: If you look at the geographical distribution of a genetic trait using a map, you often will notice that the distribution doesn't really line up with the borders of races.
Look up Steve Hsu's post on 40 years of clustering (also see his posts on European genetic substructure, and scientific basis for race). "Represent each individual human by their DNA sequence. When aggregated, they cluster into readily identifiable groups. This has been known for 40 years now, although the technology and methods of analysis continue to improve. Below are results from 1966, 1978 and 2008." http://infoproc.blogspot.co.nz/2009/06/genetic-clustering-40-years-of-progress.html

Au contraire, the figures he cites mostly fall prey to the clustered-sampling-into-analysis = clusters-in-results critique. Africans, Oceanians, and Amerindians form distinct clusters, therefore races exist? Please. This is cherry-picking the populations most geographically distant from each other, and ignoring the continuity that occurs if you put in the in-between populations. The figure with the most complete sampling (all continents, ironically the oldest figure, from the 1960s) shows (a) the clusters are not widely-spaced and distinct, rather they touch or even overlap, and (b) the arrangement of the samples from various regions closely matches their geographic distances -- heck, it doesn't even take much imagination to rotate that figure and see the rough outline of the arrangement of the continents. Steve Hsu appears to be a physicist, not a biologist; this may be part of the problem...

Jason Antrosio · 2 March 2012

Thank you for writing this. As I've been examining the material, I was introduced (at the suggestion of anthropologist Henry Harpending) to the work of Guido Barbujani. His 2010 paper (co-authored with Vincenza Colonna) is a very careful overview of the scientific literature from someone who has been studying human genetic diversity for a long time. It's readable and current and answers many of the issues posed by Jerry Coyne: Human genome diversity: frequently asked questions.

I appreciate your use of Alan Templeton's work here--Templeton is a hidden gem who really should be read more widely. However, the problem with this 1998 paper is that some people dismiss it out of hand since it was published before those famous clustering studies and before the Lewontin's Fallacy argument really got chugging along. They also tend to dismiss anthropology. The advantage to the Barbujani and Colonna piece is that it is very careful in discussing all these post-2000 developments, and they are not anthropologists.

I discuss related issues in a blog-post titled Race redux: What are people "tilting against"?

Thank you again for writing this.

Nick Matzke · 2 March 2012

An irate reaction to Coyne from an anthropologist here: http://anthropomics.blogspot.com/2012/03/rant-on-race-and-genetics.html

Friday, March 2, 2012

A rant on race and genetics

I have really had it with anti-intellectualism masquerading as biological science. What really set me off is a blog post by the distinguished evolutionary geneticist Jerry Coyne (http://whyevolutionistrue.wordpress.com/2012/02/28/are-there-human-races/). Unlike the great fruitfly geneticist Theodosius Dobzhansky, who was a member of various anthropological associations and had personal and professional relationships with anthropologists who worked on human diversity (notably Sherry Washburn, Ashley Montagu, and Margaret Mead) - and even let his daughter marry one, archaeologist Michael Coe - Coyne writes in abject ignorance of anthropology. Freed from the constraints of actual knowledge, then, Coyne is able to present his own commonsensical views as if they were based on science.

He quotes historian of biology Jan Sapp, reviewing two new books on the subject which both come to the same conclusion - that human races are biocultural constructs, not natural facts - and dismisses the conclusion of the books and the reviewer: "Well, if that's the consensus, I am an outlier." The irony is that Coyne is not self-aware enough to appreciate that that is precisely parallel to the position of the creationists.

His idea of race is the existence of between-group variation in the human species, and the discovery that groups of people are different from one another. Anthropologists have been studying the nature of that difference for around a century and a half, but Coyne isn't interested in what they've learned. Since there exist "morphologically different groups of people who live in different areas" then there are, ipso facto, human races, regardless of what anthropologists think they have learned about the subject. Of course the discovery that people in different places are different is a trivial one. At issue is the pattern of those differences and its relation to the classification of the human species.
To equate the existence of between-group variation to the existence of human races is to miss the point of race entirely. Race is not difference; race is meaningful difference. It's the "meaningful" that takes the question of human races out of the geneticist's domain and places it into the anthropologist's domain (which is where it has always been - although occasionally opposed by reactionary geneticists like Charles Davenport and Ruggles Gates, whom Coyne would do well to read). At issue is the (cultural) decision about how much difference and what kinds of difference "count" in deciding that this kind of a person is categorically different from that kind of a person. The merest familiarity with the modern literature on race would have made that clear to Coyne. Coyne echoes right-wing ignoramuses with the sentiment that "the subject of human races, or even the idea that they exist, has become taboo." Jon Entine made the same claim in his stupid 2000 book on the imaginary genetic superiority of black athletes; and the segregationists made the same argument in the early 1960s. But of course, race is only taboo in the same sense that creationism has become taboo, as being a false theory about the world, from which scholars have moved on. In fact, Coyne's anti-intellectualism here is the equivalent of the creationist's claim that "We obviously did not evolve from apes, since apes still exist". It reveals such an abject ignorance of the topic that all you can do is suggest a return to kindergarten. Coyne's post, as it turns out, was inspired by a review (in American Scientist by Jan Sapp) of two books on race. He explains, "I haven't talked much about Sapp's review, as I find it tendentious; nor have I read the books he's reviewing." The books he's reviewing are: Race and the Genetic Revolution: Science, Myth, and Culture, edited by Sheldon Krimsky and Kathleen Sloan. 
I haven't read that one, but I can vouch that many of the contributors - including Troy Duster, Duana Fullwiley, Jonathan Kahn, Joe Graves, and Pilar Ossorio - have written insightfully and at considerable length on the subject, and know a heck of a lot more about it than Jerry Coyne does. The other book is called Race?: Debunking a Scientific Myth and is actually by a biological anthropologist and an evolutionary geneticist - Ian Tattersall and Rob DeSalle. If Coyne ever gets around to handling a copy of the book that inspired the review that inspired his ignorant blog post, he'll discover that the jacket blurb says, a prominent anthropologist and a prominent evolutionary geneticist have teamed up to give us a powerful scientific critique of the commonsensical idea of race. Distinguished scholars and skilled communicators, Ian Tattersall and Rob DeSalle show clearly how "race" simply cannot be used as a synonym for "human biological diversity". In the age of genomics, this partnership of intellectual specialties is particularly valuable, and the result is a splendid testament to the merits of trans-disciplinary collaborations. The good news is that there are evolutionary geneticists like Rob DeSalle out there. But the scholarly boat seems to have sailed away without Jerry Coyne on board. Ironically, the last time I gave a talk at the University of Chicago, about three years ago, it was on this very subject. My title was, "Some More Things I'm Pissed Off About". Coyne wasn't in attendance. And yes, I wrote that blurb.

I think Coyne's mistake wasn't due to deliberate neglect of anthropology etc. (Yes, IMHO on a topic like race it would be better if every scientist who spoke read broadly and thoroughly on the topic first, but that's another discussion.) Coyne's problem was mostly that he was applying fairly common and fairly accurate intuitions amongst those who study speciation to the inappropriate and unusual situation of humans. In many species, discrete geographic structure is common, incipient speciation is common, and if you study that you tend to think of everything as populations diverging and becoming more distinct. And if you're a geneticist, there is a danger of reifying the word "population" into being something out there in reality, rather than a term of convenience for sampling and a logical construct for mathematical theory.

Henry J · 2 March 2012

In many species, discrete geographic structure is common, incipient speciation is common, and if you study that you tend to think of everything as populations diverging and becoming more extinct.

s/extinct/distinct/ ? Henry

Nick Matzke · 2 March 2012

Henry J said:
In many species, discrete geographic structure is common, incipient speciation is common, and if you study that you tend to think of everything as populations diverging and becoming more extinct.
s/extinct/distinct/ ? Henry

Yes, thanks. Edit made.

mpearle017 · 3 March 2012

I think Coyne's mistake wasn't due to deliberate neglect of anthropology etc. (Yes, IMHO on a topic like race it would be better if every scientist who spoke read broadly and thoroughly on the topic first, but that's another discussion.) Coyne's problem was mostly that he was applying fairly common and fairly accurate intuitions amongst those who study speciation to the inappropriate and unusual situation of humans. In many species, discrete geographic structure is common, incipient speciation is common, and if you study that you tend to think of everything as populations diverging and becoming more distinct. And if you're a geneticist, there is a danger of reifying the word "population" into being something out there in reality, rather than a term of convenience for sampling and a logical construct for mathematical theory.

The other unusual situation of humans is that humans have politics. You're much less likely to get an "irate" post like the one you linked above about someone talking about biological races in a non-human species. The political milieu and social pressure are powerful amongst scientists too. In fact the author of that post has previously suggested firing someone for asking Nicholas Wade to speak at the Leakey Foundation's annual lectures! Straying from the party line obviously has social consequences which scientists aren't immune to.

Nick Matzke · 3 March 2012

mpearle017 said:
I think Coyne's mistake wasn't due to deliberate neglect of anthropology etc. (Yes, IMHO on a topic like race it would be better if every scientist who spoke read broadly and thoroughly on the topic first, but that's another discussion.) Coyne's problem was mostly that he was applying fairly common and fairly accurate intuitions amongst those who study speciation to the inappropriate and unusual situation of humans. In many species, discrete geographic structure is common, incipient speciation is common, and if you study that you tend to think of everything as populations diverging and becoming more distinct. And if you're a geneticist, there is a danger of reifying the word "population" into being something out there in reality, rather than a term of convenience for sampling and a logical construct for mathematical theory.
The other unusual situation of humans is that humans have politics. You're much less likely to get an "irate" post like the one you linked above about someone talking about biological races in a non-human species. The political milieu and social pressure are powerful amongst scientists too. In fact the author of that post has previously suggested firing someone for asking Nicholas Wade to speak at the Leakey Foundation's annual lectures! Straying from the party line obviously has social consequences which scientists aren't immune to.

All this is true. Although, here in the 21st century I sometimes wonder if "rebelling against the 'party line'" isn't perhaps more popular than following the party line on any given issue. It's particularly enticing when you can claim "I know I'm being politically incorrect here, but..." or "I know textbook orthodoxy says X, but I say Y", or whatever. Ooooh. I just had a thought. Let's dub this phenomenon "intellectual hipsterism".

Nick Matzke · 3 March 2012

Googled it, and, wow.

http://lesswrong.com/lw/2pv/intellectual_hipsters_and_metacontrarianism/

mpearle017 · 4 March 2012

Nick Matzke said: Googled it, and, wow. http://lesswrong.com/lw/2pv/intellectual_hipsters_and_metacontrarianism/

Heh!

thousand.islands.and.one · 4 March 2012

My comments would be:

1) It's true genetic distance is roughly clinal, but the slope of genetic distance against geographic distance is smaller, and the fit better, within, say, Africa and Europe separately than when they are considered together.

A person living in Europe is more differentiated from a person in Africa than would be expected based on similarity to another person living in Africa.

This is also true on a more local scale still (e.g. national and other traditionally considered measures of relatedness) compared to the continental, to Europe considered separately to Asia, to Eurasia considered separately to Eurasia+Africa.

It is true within the areas considered to be racial groups separately more than when combining these together.

Similarity as a function of distance is not the same between groups classically considered to be within a race or within a continent and groups considered to be outside of a race or from different continents. This is clear from FST tables and comparing to distance, even without looking at cluster analysis or components analysis.

2) Some parts of the world which are on a cline between the traditionally considered racial groups are both very sparsely populated and also often tend to be net population sinks rather than sources from the areas where races are considered to come from, without much backflow into those areas. For example the Sahara and Inner Asia are probably like this, certainly in the former respect.

3) Fst between human groups may not be large compared to similar animals, but may be more interesting to us than similar Fsts in other groups in light of the fact that we are a fast-evolving lineage that is evolving fast in ways which we specifically care about (intelligence, emotion, mind, personality, creativity).

4) Using geographical designations is not really better than races - e.g. take Asia or Oceania. Where do these geographical designations begin and end, as applied to human populations: where the geography ends, or where clinal relatedness relative to distance begins to decrease massively and sharply (what would be seen, classically, as a racial boundary)? Given this decrease in clarity (either within the realm of geography or human zoology), why choose to use geographic designations? Geographical designations are also distorting for African population structure between people with non-recent central-west and southern African ancestry (the former having dispersed through Africa recently) since Africa lacks a clear geographical boundary.

5) No verbal description is ever going to easily convey all of the information in the genomes of millions of individuals for which large amounts are lost even in the first 3 dimensions of a highly sampled principal components analysis (which is about the most information about population structure that any human can really take in at once). A verbal description is never, ever going to be near the most accurate description of population structure, to the extent that it is even humanly possible to have a complete understanding. That being the case it's a question of going for the least distorting terminology, preferably adjusted relative to intelligibility for the most people and convenience of use.

Make of all this what you will.

Nick Matzke · 4 March 2012

thousand.islands.and.one said:
My comments would be:
1) It's true genetic distance is roughly clinal, but the slope of genetic distance against geographic distance is smaller, and the fit better, within, say, Africa and Europe separately than when they are considered together. A person living in Europe is more differentiated from a person in Africa than would be expected based on similarity to another person living in Africa. This is also true on a more local scale still (e.g. national and other traditionally considered measures of relatedness) compared to the continental, to Europe considered separately to Asia, to Eurasia considered separately to Eurasia+Africa. It is true within the areas considered to be racial groups separately more than when combining these together. Similarity as a function of distance is not the same between groups classically considered to be within a race or within a continent and groups considered to be outside of a race or from different continents. This is clear from FST tables and comparing to distance, even without looking at cluster analysis or components analysis.
2) Some parts of the world which are on a cline between the traditionally considered racial groups are both very sparsely populated and also often tend to be net population sinks rather than sources from the areas where races are considered to come from, without much backflow into those areas. For example the Sahara and Inner Asia are probably like this, certainly in the former respect.
3) Fst between human groups may not be large compared to similar animals, but may be more interesting to us than similar Fsts in other groups in light of the fact that we are a fast-evolving lineage that is evolving fast in ways which we specifically care about (intelligence, emotion, mind, personality, creativity).
4) Using geographical designations is not really better than races - e.g. take Asia or Oceania. Where do these geographical designations begin and end, as applied to human populations: where the geography ends, or where clinal relatedness relative to distance begins to decrease massively and sharply (what would be seen, classically, as a racial boundary)? Given this decrease in clarity (either within the realm of geography or human zoology), why choose to use geographic designations? Geographical designations are also distorting for African population structure between people with non-recent central-west and southern African ancestry (the former having dispersed through Africa recently) since Africa lacks a clear geographical boundary.
5) No verbal description is ever going to easily convey all of the information in the genomes of millions of individuals for which large amounts are lost even in the first 3 dimensions of a highly sampled principal components analysis (which is about the most information about population structure that any human can really take in at once). A verbal description is never, ever going to be near the most accurate description of population structure, to the extent that it is even humanly possible to have a complete understanding. That being the case it's a question of going for the least distorting terminology, preferably adjusted relative to intelligibility for the most people and convenience of use.
Make of all this what you will.

I've been reading up on the literature, and it looks to me like, at a minimum, we can say that different studies reach different conclusions on the relative importance of clines vs. e.g. "discrete" continental categories. My sense of it is that the pro-clustering people are typically (a) assuming rather than testing that clusters exist -- e.g., in the program STRUCTURE, you decide ahead of time how many clusters you have; and (b) ignoring the crucial issue of clustered sampling and how this induces artificial clusters even when reality is smooth, or more smooth than clustered. Here are a couple of quotes from recent literature that address this. Guido Barbujani and Vincenza Colonna (2010). "Human genome diversity: frequently asked questions." Trends in Genetics. http://www.ncbi.nlm.nih.gov/pubmed/20471132

A popular approach to this question is based on an algorithm, STRUCTURE (Box 2), that assigns individual genotypes to an arbitrary number of groups, k. In the first, and most influential, worldwide analysis based on STRUCTURE, Rosenberg and colleagues [40] typed 377 STRs in the CEPH dataset and recognized six clusters, five of them corresponding to continents or subcontinents, and the sixth to a genetic isolate in Pakistan, the Kalash. In general, individuals of the same population fell consistently in the same cluster, or shared similar membership coefficients in two clusters. The authors concluded that self-reported ancestry contains information on DNA diversity, and hence that an objective clustering of genotypes is possible despite the low between-population variances, if large amounts of data are considered. In fact, clustering is certainly possible, but is not consistent across studies. In a subsequent paper based on a larger sample of microsatellites [63] the same authors rejected the claim that the distribution of samples in space itself accounts for the apparent population differentiation [43], but failed to confirm the Kalash as a separate unit; instead, the native American populations were this time split in two clusters [63]. The Kalash resurfaced as a distinct group when 15 Indian populations were added to the analysis, leading to the identification of 7 clusters, with most populations of Eurasia now showing multiple memberships [64]. In these studies, all African genotypes formed a single group, in contrast to broadly replicated evidence of deep population subdivision within Africa [35,45,47]. However, when the CEPH 377-marker dataset was analyzed by a different method that searches for zones of sharp genetic change or genetic boundaries [65], Africa appeared subdivided in four groups, and each American population formed an independent group, giving a total of 11 [66]. 
Previous work based on discriminant analysis had already shown that different clusters emerge when Y-chromosome markers or Alu insertions are considered [67]. In a study of more than 500 000 SNPs, STRUCTURE indicated different clusterings if the SNPs were individually analyzed or if they were combined to form haplotypes; in turn, both inferred clusterings were inconsistent with those inferred from CNVs in the same individuals [44]. An independent study of more than 400 000 SNPs in over 3000 individuals [24] identified five clusters that overlap only in part with those listed in previous studies. Finally, it has been repeatedly observed that the genotypes of individuals collected in discrete areas of the world form clear continental clusters for low values of k, but subcontinental structuring emerges at finer levels of analysis [26,27,40,44]. In practice, when the number of markers is large, many dissimilarities are detected, and a fraction of these are likely to achieve statistical significance. However, minor differences in the markers considered, in the sample distribution, or in the method of analysis, can lead to different clusterings. This should not come as a surprise; more than 80% of human SNP alleles are cosmopolitan: that is, they are present at different frequencies in all continents [44], and the differences between populations and groups represent only a small fraction of the species' diversity. In addition, not all polymorphisms, including most Alu insertions and CNV, show the gradients prevailing among SNPs and STRs [44,67-69], and hence should not be expected to reveal an identical population structure. As a consequence, we can cluster people based on any set of polymorphisms, but there is no guarantee that the same clustering will be observed when considering other polymorphisms in the same individuals.

Later in the article:

However, what matters for future research is whether by racial labeling we can approximate what is in a person's genome, and this does not often appear to be the case. For instance, Europeans and Asians appear to be clearly separated in Figure 2, and yet Watson's and Venter's complete genome sequences share more SNPs with a Korean subject (1 824 482 and 1 736 340, respectively) than with each other (1 715 851) [1]. This does not mean that Europeans in general are genetically closer to random Koreans than to each other, but instead highlights the limitations of such coarse categorizations. Populations are indeed structured in the geographical space but, when it comes to predicting individual DNA features, labels such as 'European', 'Asian' and the like are misleading because members of the same group, Watson and Venter in this case, can be very different.

Handley et al. (2007). "Going the distance: human population genetics in a clinal world." Trends in Genetics. http://www.ncbi.nlm.nih.gov/pubmed/17655965

Box 2. The 'clines versus clusters' debate The strong clinal patterns in Figure 1 in the main text seem to be at odds with work that has described human genetic diversity as discontinuous or 'clustered' (e.g. Ref. [8,15]). For instance, using the programme STRUCTURE [63], Rosenberg and colleagues identified six groups of genetically similar individuals ('clusters'), five of which correspond to major geographic regions, suggesting reduced gene flow at continental boundaries [8,10,49]. These two apparently incompatible representations of human genetic diversity led to numerous reanalyses of the HGDP-CEPH datasets and promoted debate on whether human genetic variation could be better described by clusters or clines [11,12,39,49-51,56,64]. STRUCTURE reveals gradients of ancestry proportions even under a model of strict IBD [51,63]. If sampling is heterogeneous (sampling sites are themselves clustered) then the data will reveal genetic clusters that are biologically meaningless [51,63] (see Figure I). Serre and Paabo [50], investigating this through simulations, argued that the clusters described by Rosenberg et al. [8] were caused by the discontinuous nature of the sampling scheme used for the HGDP-CEPH panel and found that, by sampling individuals uniformly across the globe, a picture of continuous, clinal variation emerged. Rosenberg et al. [49] subsequently explored several sub-sampling strategies and reached an opposite conclusion: clusters remain even when sampling uniformly across the globe. They suggested these clusters were genuine and attributed their presence to slight discontinuities in the pattern of IBD previously identified [10,11,17], which is consistent with reduced gene flow at geographical barriers such as the Himalayas and Sahara [49,65]. 
These different representations of human genetic diversity are, however, not mutually exclusive and several authors agree that human genetic diversity can probably be best explained by a synthetic model, in which most of the population differentiation can be explained by IBD, with some discontinuities arising from barriers to dispersal [51,56,66,67]. In other words, human genetic variation might be best explained by a combination of both clines and clusters. However, clusters explain only a minute fraction of the variance [8,49] relative to clines. As mentioned in the main text (Figure 1b), [greater than] 75% of the total variance of pairwise FST can be captured by geographic distance alone. Adding information on genetic clusters to this model captures only an extra ~2% of the variance. Figure I. Heterogeneous sampling can reveal genetic clusters that are biologically meaningless. The gradation in colour from blue to orange represents a hypothetical situation of strictly continuous variation in allele frequencies. If sampling is heterogeneous (population samples represented here by circles) then the pattern of clinal variation can be mistaken for genetically distinct clusters (black ellipses).

A response to one specific point:

4) Using geographical designations is not really better than races - e.g. take Asia or Oceania. Where do these geographical designations begin and end, as applied to human populations: where the geography ends, or where clinal relatedness relative to distance begins to decrease massively and sharply (what would be seen, classically, as a racial boundary)?

If such a "massive and sharp" decline in clinal relatedness actually exists, I want to see it measured, tested, and plotted on a map. Everything I can dig up that addresses the issue says that whatever clustering effect is found beyond continuous geographic distance (or some sort of friction-based path distance in more sophisticated studies) is relatively small, and explains only a few percent more of the variance beyond a model just using continuous change.
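
To make concrete what "explains only a few percent more of the variance" would mean as a test, here is a rough sketch using simulated toy data only (nothing below comes from a real dataset). It assumes a one-dimensional strict isolation-by-distance world with no barriers, samples it at clustered sites, and compares the R² of pairwise genetic distance on geographic distance alone against a model that also gets a "different cluster" indicator. The function names, slope, noise level, and site positions are all hypothetical choices for illustration.

```python
import random
from itertools import combinations


def ols_residuals(x, y):
    """Residuals of a simple least-squares fit y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def sum_sq(values):
    return sum(v * v for v in values)


def variance_explained(sites, cluster_of, slope=0.01, noise_sd=0.005, seed=1):
    """Return (R2 with distance only, R2 with distance + cluster indicator)."""
    rng = random.Random(seed)
    geo, gen, crosses = [], [], []
    for i, j in combinations(range(len(sites)), 2):
        d = abs(sites[i] - sites[j])
        geo.append(d)
        # Strict isolation by distance: genetic distance grows linearly with
        # geographic distance plus noise. There is no barrier term at all.
        gen.append(slope * d + rng.gauss(0.0, noise_sd))
        crosses.append(1.0 if cluster_of[i] != cluster_of[j] else 0.0)

    mean_gen = sum(gen) / len(gen)
    total_ss = sum_sq([g - mean_gen for g in gen])

    res_gen = ols_residuals(geo, gen)
    r2_geo = 1.0 - sum_sq(res_gen) / total_ss

    # Frisch-Waugh-Lovell: partial distance out of the indicator, then regress
    # residuals on residuals to get the two-predictor residual sum of squares.
    res_cross = ols_residuals(geo, crosses)
    res_full = ols_residuals(res_cross, res_gen)
    r2_full = 1.0 - sum_sq(res_full) / total_ss
    return r2_geo, r2_full


# Hypothetical clustered sampling: three tight groups of sites on a line.
sites = [0, 1, 2, 3, 20, 21, 22, 23, 40, 41, 42, 43]
cluster_of = [0] * 4 + [1] * 4 + [2] * 4
r2_geo, r2_full = variance_explained(sites, cluster_of)
print(f"distance only: {r2_geo:.4f}  distance + clusters: {r2_full:.4f}")
```

In this toy world the sampled sites look like three tidy clusters, but the cluster indicator adds almost nothing once distance is in the model, because the clusters come from the sampling scheme rather than from the underlying process. With real data one would also need a permutation procedure such as a Mantel-type test, since pairwise distances are not independent observations.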

Given this decrease in clarity (either within the realm of geography or human zoology), why choose to use geographic designations?

Because the real situation in the real genetics isn't clear in real life, because there aren't clearly dividable groups. The main pattern is a continuum, just like there is an approximate continuum in geographical distance on the land surface of the globe. Clarity is *not* a good reason to adopt racial classification, if the clear classes are not clear in the data. If geographic terminology is fuzzy and unclear, well then, good! -- because human genetics is very similarly fuzzy. Ancestral language might be reasonable as well, but then we would have 10,000 "races", many people would be unable to trace their ancestral language, and inevitably languages would be lumped into language groups, at which point we would inevitably discover numerous exceptions where genetic similarity and linguistic similarity don't line up.

Geographical designations are also distorting for African population structure between people with non-recent central-west and southern African ancestry (the former having dispersed through Africa recently) since Africa lacks a clear geographical boundary.

Not sure what the point of this sentence is. Certainly this would be at least as big a problem for someone who wants to use a classic racial classification like e.g. "black". The only other options, really, would be to specify subregions in Africa (western, southern, etc.), but obviously southern African Zulu could be somewhat different from southern African Xhosa. Or one could go down to tribe, at the expense of the fact that many people won't be able to do that. In reality, the future will be that everyone gets their personal genome done, and the analysis won't spit out "black" or "white"; it will spit out a variety of classifications based on a variety of reference datasets, and it might be possible one day to say what percentage of one's ancestry is from locations X, Y, and Z, where X, Y, and Z are each regions a few hundred kilometers across.

acoolcoolhippo · 5 March 2012

When considering whether or not there are human subspecies, it's crucially important to start with a specific subspecies concept and that concept's criteria. So, for example, we might meaningfully ask whether or not there are (or were) human races by the geographic subspecies concept or some other, but it's meaningless to ask whether or not there are (or were) human races per se -- as if "race" were an ontological (and not taxonomic) category. Based on both Coyne's and Templeton's discussion, it seems that the race concept under discussion is the geographic one. Hence Coyne says: “races of animals (also called “subspecies” or “ecotypes”) are morphologically distinguishable populations that live in allopatry (i.e. are geographically separated).” This coheres with Ernst Mayr's characterization of this concept: "The subspecies category has been defined as “a geographically defined aggregate of local populations which differ taxonomically from other subdivisions of the species.” A valuable recent modification urged that the evidence for BSC subspecies designation should come from the concordant distribution of multiple, independent, genetically based traits. In an attempt to provide formal criteria for subspecies classifications we offer the following guidelines: Members of a subspecies share a unique geographical range or habitat, a group of phylogenetically concordant phenotypic characters, and a unique natural history relative to other subdivisions of the species. (O’Brien and Mayr, 1991. Bureaucratic Mischief: Recognizing Endangered Species and Subspecies.)"

The qualifying criteria by this concept seem to be:

(a) share a unique geographical range or habitat
(b) share a group of phylogenetically concordant phenotypic characters
(c) share a unique natural history relative to other subdivisions of the species
(d) differ taxonomically from other subdivisions of the species

Templeton (1998) argues that there is an additional criterion -- an FST above .25 to .30. In the cited paper, he states:

"A standard criterion for a subspecies or race in the nonhuman literature under the traditional definition of a subspecies as a geographically circumscribed, sharply differentiated population is to have Fst values of at least 0.25 to 0.30 (Smith et al. 1997). Hence, as judged by the criterion in the nonhuman literature, the human Fst value is too small to have taxonomic significance under the traditional subspecies definition…Although human “races” do not satisfy the standard quantitative criterion for being traditional subspecies (Smith et al. 1997)…" (Templeton, 1998. Human races: a genetic and evolutionary perspective.)

As mpearle017 has noted, however, Templeton's claim is based on a misreading of Smith et al. (1997). Smith et al.'s statement was that subspecies are "geographic segments ... differing from each other to a reasonably practical degree (e.g., at least 70-75%)." The 70-75% refers to defining phylogenetically concordant morphological differences or sets of them. See, for example, Patten and Unitt (2002):

"The standard level for defining a subspecies is based on the ‘‘75% rule’’ (Amadon 1949, Mayr 1969). Stated simply, to be a valid subspecies 75% of a population effectively must lie outside 99% of the range of other populations for a given defining character or set of characters." (Patten and Unitt, 2002)

This 70-75% difference would correspond with criterion (d) above.

Nick Matzke has commented on this point, stating:

"If anything, to me this looks like if we translate Smith et al. (1997) into Fst terms, their recommendation minimum cutoff for subspecies should be Fsts of 0.70-0.75, much higher than the cutoff Templeton reads, 0.25-0.30! Or, if we assume individuals between two populations have an average of 70%-75% difference in characters measured between populations, and 25%-30% difference in characters measured within populations, we get Fst values of (0.7-0.3)/0.7 to (0.75-0.25)/0.75 = Fst between 0.57 and 0.67."

Yet, what is of interest is not total genetic variability but a set of "defining character or set of characters." So Templeton's critique doesn't hold up. It's not clear, however, if Coyne is right. Amadon (1949) offers the following simplified inequality for diagnosing subspecies: M(a) - M(b) ≥ 3.24·SD(a) + 0.68·SD(b). That is, the mean of population (a) minus the mean of population (b) for some set of traits must be equal to or greater than 3.24 times the SD of the set of traits in (a) plus 0.68 times the SD of the set of traits in (b). I'm not aware of many traits for which there is an approximately 4 sigma difference between regional populations. Skin color would be one, at least between certain populations (see: Relethford, 2009, table a), but differences in this trait are clinal. (Clinality, per se, doesn't seem to be an issue for the 75% rule as this rule is merely used to diagnose taxonomic significance, not to define populations. For the geographic race concept, these are defined by criteria (a) and (c), above.)
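
For anyone who wants to plug numbers in, here is a minimal sketch of the Amadon inequality exactly as I've written it above. The function names are mine, the coefficients 3.24 and 0.68 are simply taken from the formula as quoted, and the example means and SDs are made up for illustration.

```python
def amadon_threshold(sd_a, sd_b):
    """Minimum mean difference required by the inequality as quoted above."""
    return 3.24 * sd_a + 0.68 * sd_b


def meets_amadon(mean_a, mean_b, sd_a, sd_b):
    """True if M(a) - M(b) is at least 3.24*SD(a) + 0.68*SD(b)."""
    return (mean_a - mean_b) >= amadon_threshold(sd_a, sd_b)


# With equal SDs the threshold is 3.92 SD, hence "approximately 4 sigma".
threshold_equal_sd = amadon_threshold(1.0, 1.0)

# Made-up trait values in SD units, purely illustrative.
four_sigma_gap = meets_amadon(4.0, 0.0, 1.0, 1.0)
three_sigma_gap = meets_amadon(3.0, 0.0, 1.0, 1.0)
print(threshold_equal_sd, four_sigma_gap, three_sigma_gap)
```

Note the check is directional: it assumes population (a) has the larger mean, and the two SDs are weighted differently, so swapping the labels can change the answer when the SDs differ.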

There is probably no one characteristic for which major historic human populations effectively lie outside 99% of the range of other populations, but there may be sets of such characteristics. Five classically proposed races are Sub Saharan Africans, Caucasians, Pacific Islanders, Amerindians, and East Asians. See: figure 1 in Risch et al (2002) "Categorization of humans in biomedical research: genes, race and disease," and table 1 in Gill (1998) "Craniofacial criteria in the skeletal attribution of race." These continental populations fill criteria (b) and (c) of the geographic subspecies concept; in the recent past, they filled criterion (a). The real issue is (d). Given the 75% rule as discussed above, I'm not aware of a set of traits which would lead to the diagnosis of subspecies. But I wouldn't be surprised if an exotic combination could be pieced together, at least if one is allowed to use different sets for different populations.

Nick Matzke · 5 March 2012

I don't get this part of what you are saying:

“The standard level for defining a subspecies is based on the ‘‘75% rule’’ (Amadon 1949, Mayr 1969). Stated simply, to be a valid subspecies 75% of a population effectively must lie outside 99% of the range of other populations for a given defining character or set of characters.” (Patten and Unitt, 2002)

This is now a third version I have read in the past few days of what the 75% rule refers to, but I'll leave that aside. What is confusing me at the moment: if a subspecies A is defined as a population where 75% of the population is outside the 99% range for the other populations B, C, D... doesn't this mean that the other populations B, C, and/or D could be inside the 99% range of the traits for species A?

acoolcoolhippo · 5 March 2012

Jason Antrosio | March 2, 2012 10:09 AM

Thank you for writing this. As I’ve been examining the material, I was introduced (at the suggestion of anthropologist Henry Harpending) to the work of Guido Barbujani. His 2010 paper (co-authored with Vincenza Colonna) is a very careful overview of the scientific literature from someone who has been studying human genetic diversity for a long time. It’s readable and current and answers many of the issues posed by Jerry Coyne: Human genome diversity: frequently asked questions

I appreciate your use of Alan Templeton’s work here–Templeton is a hidden gem who really should be read more widely. However, the problem with this 1998 paper is that some people dismiss it out of hand since it was published before those famous clustering studies and before the Lewontin’s Fallacy argument really got chugging along. They also tend to dismiss anthropology. The advantage to the Barbujani and Colonna piece is that it is very careful in discussing all these post-2000 developments, and they are not anthropologists.

Templeton's 1998 conclusion is dismissible for the reason given above. The paper was notable, though, for its proper methodology: one starts with a popularly used subspecies concept and its criteria, sees how the criteria are applied in practice, and evaluates human populations. Barbujani and Colonna (2010) add nothing to the debate. Their only argument is that "clustering is certainly possible, but is not consistent across studies"; this in no way contradicts the existence of human races, given the concept Coyne is talking about.

acoolcoolhippo · 5 March 2012

I don’t get this part of what you are saying: “The standard level for defining a subspecies is based on the ‘‘75% rule’’ (Amadon 1949, Mayr 1969). Stated simply, to be a valid subspecies 75% of a population effectively must lie outside 99% of the range of other populations for a given defining character or set of characters.” (Patten and Unitt, 2002) This is now a third version I have read in the past few days of what the 75% rule refers to, but I’ll leave that aside. What is confusing me at the moment: if a subspecies A is defined as a population where 75% of the population is outside the 99% range for the other populations B, C, D… doesn’t this mean that the other populations B, C, and/or D could be inside the 99% range of the traits for species A?

There are several formulations of the rule, and which to use -- or whether or not to use any -- hasn't been settled. You can read Patten and Unitt's discussion here: http://www.biosurvey.ou.edu/patten/Auk2002.pdf As for your question, I'm not sure what you're asking. Ya, you could look at A to see if 75% of the members are outside the range of 99% (or 97% as Mayr put it) of B, or vice versa. It's not clear to me how this works when it comes to sets of traits; I've come across a few instances where canonical analysis was used.

Nick Matzke · 5 March 2012

As for your question, I’m not sure what you’re asking. Ya, you could look at A to see if 75% of the members are outside the range of 99% (or 97% as Mayr put it) of B, or vice versa. It’s not clear to me how this works when it comes to sets of traits; I’ve come across a few instances where canonical analysis was used.

The weird thing is that it seems like the criteria have to be symmetrical, since the labeling of A vs. B etc is arbitrary.
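
Here is a quick sketch of why the asymmetry bothers me. It assumes a single normally distributed trait, and it reads "75% of a population lies outside 99% of the range of the other" as "the 25th percentile of the focal population clears the 99th percentile of the other" (mirrored if the focal mean is lower). That reading is my assumption, not necessarily what Patten and Unitt compute, and the means and SDs below are invented.

```python
from statistics import NormalDist

# Standard normal quantiles for the 75% and 99% cut-offs.
Z75 = NormalDist().inv_cdf(0.75)
Z99 = NormalDist().inv_cdf(0.99)


def focal_is_diagnosable(mean_focal, sd_focal, mean_other, sd_other):
    """True if 75% of the focal population lies beyond 99% of the other.

    Assumes both populations are normal on one trait, so only the size of
    the gap between the means matters, not its sign.
    """
    gap = abs(mean_focal - mean_other)
    return gap >= Z75 * sd_focal + Z99 * sd_other


# Invented example: A is tightly distributed, B is widely distributed.
mean_a, sd_a = 3.0, 0.5
mean_b, sd_b = 0.0, 1.5

a_vs_b = focal_is_diagnosable(mean_a, sd_a, mean_b, sd_b)
b_vs_a = focal_is_diagnosable(mean_b, sd_b, mean_a, sd_a)
print("A diagnosable from B:", a_vs_b)
print("B diagnosable from A:", b_vs_a)
```

With equal SDs the two directions always agree. With unequal SDs, as here, the answer depends on which population you happen to call A, so in practice one would presumably have to require the test to pass in both directions.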

acoolcoolhippo · 5 March 2012

It's notable that Sapp doesn't merely conclude that races don't exist, but, following Lewontin, he -- and those he reviews -- also concludes that human genetic variation is trivial:

Although race is void of biological foundation, it has a profound social reality. All too apparent are disparities in health and welfare. Despite all the evidence indicating that “race” has no biological or evolutionary meaning, the biological-race concept continues to gain strength today in science and society, and it is reinforced by those who design and market DNA-based technologies...Tattersall and DeSalle confront those industries head on and in no uncertain terms, arguing that “race-based medicine” and “raced-based genomics” are deeply flawed.

This is an absolutely amazing conclusion because there's no support for it -- and because the argument made for it is ripe with fallacies. Apparently, the case goes: (1) the continental populations we call races don't qualify as "real races"; (2) therefore they aren't "biologically real"; (3) therefore, differences between the population we call races aren't "biologically real." What's bizarre is that the anti- genetic variation crowd frequently cites the low within to between ratio of genetic variance to support their claim. The average FST between race-like populations is .10. When one factors out intra-individual variance, the between individual, between population FST becomes .22; which is modest in magnitude. It could be argued that this modest FST was due to drift and is of no functional importance but the typically estimated phylogenetically concordant phenotypic FST is .12; for example, Relethford (2002) estimated a craniometric FST of 0.15 based on 6 continental regions and assuming a global craniometric heritability of 0.55; Hanihara (2008) estimated a nonmetric dental trait FST ranging from .7 to .16 based on 12 regions and assuming global heritabilities ranging from .4 to .8. So, a large chunk of the between individual, between population genetic variance seems to code for phenotypic variability, which is as we would expect. The between population genetic variance, nonetheless, seems small. And it is, relative to the within population variance. But the same holds for the "health and welfare" variance. Just imagine an anti-Lewontin arguing that the variance in educational attainment between ethnic population in the US effectively doesn't exist because it's only a minute fraction of the within population variance -- the ratio is maybe 7 percent to 93. "All too apparent are disparities" or just "disparities are only apparent"? 
Now, a phylogenetically concordant phenotypic FST of .12 is equivalent to a between population standardized difference of 0.7 ((SQRT(.12/.88))*2). So if we consider the differences in health and welfare to be "all too apparent," symmetry demands that we should consider the phylogenetically concordant phenotypic differences to be "all too apparent." To the extent that the former argues for the reality of "social race," the latter argues for the reality of biological -- if not taxonomic -- race. Whatever the case, the "low" level of between to within population variance, whether based on average genetic variability or estimated phylogenetically concordant phenotypic variability, in no way supports the claim that the "profound" social reality of race is not conditioned, in part, by biological differences.
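The conversion quoted above, (SQRT(.12/.88))*2, can be reproduced with a short sketch. It assumes two equally sized groups with equal within-group variance, which is what makes the factor of 2 appear; the function name is hypothetical and only the arithmetic comes from the comment.

```python
import math


def fst_to_standardized_difference(fst: float) -> float:
    """Convert a between/total variance ratio into a standardized mean difference.

    Assumption (not stated in the comment): two equally sized groups with the
    same within-group variance, so that FST = (d**2 / 4) / (d**2 / 4 + 1)
    when d is measured in within-group standard deviations.
    """
    if not 0.0 <= fst < 1.0:
        raise ValueError("fst must lie in [0, 1)")
    return 2.0 * math.sqrt(fst / (1.0 - fst))


d = fst_to_standardized_difference(0.12)
print(round(d, 2))
```

Under those assumptions the value comes out slightly above the rounded 0.7 given in the comment; with more than two groups or unequal group sizes the factor of 2 no longer applies.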

Nick Matzke · 5 March 2012

What the heck is "phylogenetically concordant" Fst? Fst is just a number, ie a distance. In certain situations a phylogenetic process can produce a distance matrix that contains phylogenetic information, but other processes can produce a distance matrix in Fst also. Eg actual geographic distance plus IBD. We don't have a phylogeny for living humans, we have the IBD pattern. So I'm not getting what you are saying.

acoolcoolhippo · 5 March 2012

Hanihara (2008) estimated a nonmetric dental trait FST ranging from .7 to .16 based on 12 regions and assuming global heritabilities ranging from .4 to .8.

This should read: "a nonmetric dental trait FST ranging from .07 to .16." The paper in reference is: "Morphological Variation of Major Human Populations Based on Nonmetric Dental Traits."

DS · 5 March 2012

Nick Matzke said: What the heck is "phylogenetically concordant" Fst? Fst is just a number, ie a distance. In certain situations a phylogenetic process can produce a distance matrix that contains phylogenetic information, but other processes can produce a distance matrix in Fst also. Eg actual geographic distance plus IBD. We don't have a phylogeny for living humans, we have the IBD pattern. So I'm not getting what you are saying.

Actually, the related parameter D (Nei's genetic distance) can be used for phylogenetic inference. Perhaps that would be better than trying to use Fst values to infer degrees of relatedness.

acoolcoolhippo · 5 March 2012

What the heck is “phylogenetically concordant” Fst? Fst is just a number, ie a distance. In certain situations a phylogenetic process can produce a distance matrix that contains phylogenetic information, but other processes can produce a distance matrix in Fst also. Eg actual geographic distance plus IBD. We don’t have a phylogeny for living humans, we have the IBD pattern. So I’m not getting what you are saying.

I just meant the genetic differentiation for the phenotype (e.g., dentition or craniometric). I guess I should have just said "phenotypic FST" in the sense of "craniometric FST" but I wanted to make clear that we were talking about genetically mediated phenotypic differentiation. And I had Mayr's criteria of "a group of phylogenetically concordant phenotypic characters" on my mind when I was writing. Sorry for the confusion.

Nick Matzke · 5 March 2012

DS said:
Nick Matzke said: What the heck is "phylogenetically concordant" Fst? Fst is just a number, ie a distance. In certain situations a phylogenetic process can produce a distance matrix that contains phylogenetic information, but other processes can produce a distance matrix in Fst also. Eg actual geographic distance plus IBD. We don't have a phylogeny for living humans, we have the IBD pattern. So I'm not getting what you are saying.
Actually, the related parameter D (Nei's genetic distance) can be used for phylogenetic inference. Perhaps that would be better than trying to use Fst values to infer degrees of relatedness.

The key question is whether combining the signal from a bunch of independently assorting loci is a reasonable thing to do. For well-separated species, it is. For within-population studies, it's a big no-no.

Nick Matzke · 5 March 2012

acoolcoolhippo said:
What the heck is “phylogenetically concordant” Fst? Fst is just a number, ie a distance. In certain situations a phylogenetic process can produce a distance matrix that contains phylogenetic information, but other processes can produce a distance matrix in Fst also. Eg actual geographic distance plus IBD. We don’t have a phylogeny for living humans, we have the IBD pattern. So I’m not getting what you are saying.
I just meant the genetic differentiation for the phenotype (e.g., dentition or craniometric). I guess I should have just said "phenotypic FST" in the sense of "craniometric FST" but I wanted to make clear that we were talking about genetically mediated phenotypic differentiation. And I had Mayr's criteria of "a group of phylogenetically concordant phenotypic characters" on my mind when I was writing. Sorry for the confusion.

OK, I sort of get what you were saying, thanks. However, I don't buy your argument that because phenotypic Fsts and genetic Fsts are both about 0.15, that therefore "a large chunk of the between individual, between population genetic variance seems to code for phenotypic variability, which is as we would expect." The sorts of phenotypes you were discussing could well be controlled by variation in just a few genes, or a few dozen. But the vast majority of the *genetic* variation is in all likelihood neutral, and of that which is not neutral, it's likely irrelevant to the morphological characters you cite. E.g. blood groups, immune system genes, etc. Finally, the phenotypic data is subject to the same clines vs. clusters argument that I've been raising throughout. Everything I've seen says that phenotypes seem to be mostly explained by clines rather than clusters -- and arguments to the contrary have got to explicitly take into account e.g. the effect when the *sampling was geographically clustered to start with*, the tendency of many algorithms to find clusters even if they don't exist, and the relative weight of clines vs. clusters for explaining the data.

acoolcoolhippo · 6 March 2012

Finally, the phenotypic data is subject to the same clines vs. clusters argument that I’ve been raising throughout. Everything I’ve seen says that phenotypes seem to be mostly explained by clines rather than clusters – and arguments to the contrary have got to explicitly take into account

I guess I don't agree with this. I would argue that, when deciding if there are human subspecies, we should use the standards and criteria that are used when deciding if there are subspecies in non-humans. Above, I pointed to a commonly used subspecies concept. The criteria were:

(a) share a unique geographical range or habitat
(b) share a group of phylogenetically concordant phenotypic characters
(c) share a unique natural history relative to other subdivisions of the species
(d) differ taxonomically from other subdivisions of the species

Cluster analysis is often used to evidence B and C. If members of the respective populations share a unique natural history and a group of phylogenetically concordant characters, one should be able to find genetic and phenotypic clusters. This doesn't mean, though, that all or most differences should cluster. Just that some should. Amongst non-human populations such defining characteristics are often functionally trivial (e.g., a difference in plumage coloration), so they should be able to be trivial for humans.

Now, when it comes to non-human populations, "the effect when the *sampling was geographically clustered to start with*, the tendency of many algorithms to find clusters even if they don’t exist" etc., is rarely, if ever, taken into account. I see no reason, then, why proponents of human subspecies should be burdened with this requirement. On the other hand, they should be burdened with the requirement of specifying which populations they wish to call races and of specifying these populations' defining characteristics -- as these are basic requirements for recognizing non-human subspecies. I think that the problem with this debate, in addition to it being confused with that concerning global genetic variation, is that no one applies concepts, criteria, or standards consistently.
For example, Templeton, whom you cite, argues that human races don't exist because they don't meet an FST criterion; Graves (2010) and a number of others repeat this claim; but there is no such criterion in the "nonhuman literature." You argue that we should take into account "the tendency of many algorithms to find clusters even if they don’t exist," but in the nonhuman literature this is rarely done, so if we take this into account, we are not using nonhuman standards; that's fine, but if so, you shouldn't be surprised if proponents of human races disregard other nonhuman standards such as specifying the populations under question. But are there human subspecies? Based on the natural history of humans, I would propose the following: If there are human subspecies, at least by the geographic concept, which is the concept many people seem to refer to, then at the very least, the human population should be divisible into Sub Saharan Africans and everyone else, since this split represents the largest one both in terms of natural history and geography. It seems clear that the respective populations meet criteria (a) to (c). Obviously, it could be pointed out that both non-Sub Saharan Africans and Sub Saharan Africans represent a diverse population, but if these two populations and their many sub divisions are together capable of representing one race, as many argue, then each population independently surely is capable of representing one race. As for criterion (b), I would point to the typical results from craniometric cluster analysis:

Applying the neighbor-joining method to the MMD distances results in the dendrogram illustrated in Figure 3. The initial split, suggesting the greatest dissimilarity, is between Subsaharan Africans and the rest of the world. (Hanihara et al. (2003) "Characterization of Biological Diversity Through Analysis of Discrete Cranial Traits")

A similar pattern can be seen for dentition:

[…] compared to other world populations, Africans south of the Sahara Desert are distinct dentally — especially in their expression of nine high- and two low-frequency morphological features. This suite of traits was termed the “Sub-Saharan African Dental Complex” (SSADC). (Irish (2011) "Afridonty: the “Sub-Saharan African Dental Complex” revisited.")

Assuming that phenetic expression approximates genetic variation, previous dental morphological analyses of Sub-Saharan Africans by the author show they are unique among the world's modern populations. Numerically-derived affinities, using the multivariate Mean Measure of Divergence statistic, revealed significant differences between the Sub-Saharan folk and samples from North Africa, Europe, Southeast Asia, Northeast Asia and the New World, Australia/Tasmania, and Melanesia. Sub-Saharan Africans are characterized by a collection of unique, mass-additive crown and root traits relative to these other world groups. (Irish (1997). "Ancestral dental traits in recent Sub-Saharan Africans and the origins of modern humans")

In these traits at least, the largest subdivision is between non-Sub Saharan Africans and Sub Saharan Africans. And this subdivision matches the one found by genetic analysis. The issue then, for my proposed two races, seems to be (d) "differ taxonomically from other subdivisions of the species." Unfortunately, there seems to be some ambiguity concerning what it means to "differ taxonomically." As discussed above, some have proposed using the 75% rule. Yet there is a good deal of ambiguity concerning how to interpret this rule. One interpretation is simply that you can correctly assign at least 75% of individuals to the correct proposed race. Sewall Wright mentions this in his classic work:

There is also no question, however, that populations that have long inhabited separated parts of the world should, in general, be considered to be of different subspecies by the usual criterion that most individuals of such populations can be allocated correctly by inspection. ("Evolution and the Genetics of Populations")

This "correct classification" interpretation is still used. For example, in "Systematics of Steller sea lions (Eumetopias jubatus): subspecies recognition based on concordance of genetics and morphometrics," Phillips et al. (2009) take correct assignment as an indication of taxonomic significance. Another interpretation, as discussed above, is that at least 75% of one population differ from 97-99% of another population in some defining trait. These are importantly different interpretations. My two races clearly meet the 75% rule by the former. For example, Relethford (2009) tells us:

Using Howells’s six geographic regions and all 57 measurements, the overall rate of correct classification is 97% for male crania and 96% for female crania. Jackknifed classification rates are slightly lower but still very high (96% for males, 94% for females). Such high levels of classification would seem to suggest that dividing humanity (or at least that section of humanity represented by Howells’s samples) into six discrete regions is valid. To argue against this subdivision in the face of such high accuracy often seems counterintuitive . . . at first. (Race and Global Patterns of Phenotypic Variation)

It seems Relethford (2009) would agree with the "correct classification" interpretation of the 75% rule. He just goes on to argue that one could use craniometrics to classify any number of human subdivisions. That's an interesting point, but it's irrelevant, given the subspecies concept that I'm discussing. Anyways, if Relethford (2009) is correct, we could use craniometrics to correctly classify more than 75% of members into my proposed two races. But what about the 75% differing interpretation? This is where I get caught up. Craniometric and dental variation is such that while you can use it to correctly classify individuals, the differences aren't so extreme that 75% of one population differ from 97% of another even using any combination of the traits; it's not even close. These then are not good differentiating criteria. As a possible one, I propose natural hair curliness, which has been found to have a heritability of .95 (Medland, 2009). Based on Loussouarn (2007) "Worldwide diversity of hair curliness: a new method of assessment" I would propose that at least 75% of non-Sub-Saharan Africans, excluding those in hybridization zones, naturally, lie outside of the range of Sub-Saharan Africans, who tend to have type 6 to 8 hair. This is clearly the case between Sub-Saharan Africans and certain non-Sub-Saharan African subgroups (e.g., Caucasians and N.E. Asians). Some Oceanian/South Asian groups make for an obvious exception, but given the numbers, they likely would not drive the average below 75%. That would be my tentative argument for human races, given the geographic race concept and what seem to be the criteria and standards used.

However, I don’t buy your argument that because phenotypic Fsts and genetic Fsts are both about 0.15, that therefore “a large chunk of the between individual, between population genetic variance seems to code for phenotypic variability, which is as we would expect.” The sorts of phenotypes you were discussing could well be controlled by variation in just a few genes, or a few dozen. But the vast majority of the *genetic* variation is in all likelihood neutral, and of that which is not neutral, it’s likely irrelevant to the morphological characters you cite. E.g. blood groups, immune system genes, etc

Well, my major point was that a between population FST of .12 is not trivial, at least given how both "biological race" proponents and opponents interpret such magnitudes. For example, in regards to one rather controversial proposed difference, you said:

However, he does leave open the possibility that a trait like intelligence could vary genetically between races, although it seems unlikely to him, he says there is no evidence supporting this, and in any event even a statistically-detectable difference between populations would be swamped by the within-population variability in such a trait.

If we used the same methodology as Relethford (2002) and Hanihara (2008) to calculate an IQ FST (in the way they calculate a craniometric or dentition FST), using Lynn (2006)'s estimate of phenotypic 'race' differences, our derived value would be somewhat under .05 (if we group Lynn's national or population IQs into continental clusters). This is compared to a between individual, between continental cluster genetic FST of .22 (i.e., after partitioning out intra-individual variance). My point then would be that such proposed differences are consistent with the known genetic differences. Your counter point would be that "the vast majority of the *genetic* variation is in all likelihood neutral." In reply, I would say that we would have to look at the regions in question. Here's a blurb from Wu and Zhang (2011):

In this study, we find that genes involved in osteoblast development, hair follicles development, pigmentation, spermatid, nervous system and organ development, and some metabolic pathways have higher levels of population differentiation....Our analysis demonstrates different level of population differentiation among human populations for different gene groups. ("Different level of population differentiation among human genes")

You would, no doubt, reply that when it comes to highly polymorphic differences like those that code for general intelligence, FSTs above .05 are unlikely. I understand what you're saying, but I don't see any support on genetic grounds, one way or the other. If you were simply to take the genetic differentiation or genetically conditioned phenotypic differentiation with respect to some aspects of the nervous system as a measure of IQ differentiation, one would predict a magnitude of differences higher than typically proposed. Obviously for this specific trait or others there may be none and probably is little, but the genetic data doesn't imply this. (What does is the intra-national performance of different populations e.g., the performance of 2nd generation Black Africans in the UK or Indonesians in the Netherlands on cognitive tests.) Whatever the case, this issue, which concerns human genetic diversity, is only indirectly related to that concerning subspecies. (Since "biologic race" is used to refer both to "genetic differences" and "taxonomic groupings," the issues are conflated -- as was the case in Sapp's review.)

Nick Matzke · 6 March 2012

Now, when it comes to non-human populations, “the effect when the *sampling was geographically clustered to start with*, the tendency of many algorithms to find clusters even if they don’t exist” etc., is rarely, if ever, taken into account. I see no reason, then, why proponents of human subspecies should be burdened with this requirement.

Several papers have recently criticized the nonhuman literature for exactly this. I'm traveling so I don't have the papers handy, but amongst the points made are (1) the clustering algorithms can produce clusters even when the data was simulated with no clustering, only clinal patterns; (2) therefore people should test for clinal variation before naively throwing the data into clustering algorithms; (3) different clustering algorithms can produce different clusters with the same data; (4) small changes in the input data or the settings can produce different clusters in many situations.
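Points (1) and (2), together with the earlier remark about geographically clustered sampling, can be illustrated with a toy simulation. This is a minimal sketch, not any published method: the one-dimensional "trait", the noise level, the hand-rolled two-cluster k-means, and all function names are assumptions made for illustration.

```python
import random
import statistics


def simulate_cline(positions, noise_sd=0.05, seed=0):
    """Trait value = geographic position + noise: a pure cline, no built-in groups."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_sd) for x in positions]


def two_means_1d(values, iterations=50):
    """Plain k-means (Lloyd's algorithm) with k=2 on one-dimensional data."""
    c0, c1 = min(values), max(values)  # start the centroids at the extremes
    for _ in range(iterations):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        c0, c1 = statistics.fmean(g0), statistics.fmean(g1)
    return g0, g1


def between_cluster_share(g0, g1):
    """Fraction of the total sum of squares that lies between the two clusters."""
    pooled = g0 + g1
    grand = statistics.fmean(pooled)
    total = sum((v - grand) ** 2 for v in pooled)
    between = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in (g0, g1))
    return between / total


# Even sampling along the whole gradient.
even_sites = [i / 199 for i in range(200)]
# Clumped sampling: only the two ends of the same gradient are visited.
clumped_sites = [0.1 * i / 99 for i in range(100)] + [0.9 + 0.1 * i / 99 for i in range(100)]

share_even = between_cluster_share(*two_means_1d(simulate_cline(even_sites)))
share_clumped = between_cluster_share(*two_means_1d(simulate_cline(clumped_sites)))

print(f"even sampling:    {share_even:.2f} of variance between clusters")
print(f"clumped sampling: {share_clumped:.2f} of variance between clusters")
```

Even with evenly spaced sampling, the algorithm returns two "clusters" that account for most of the variance, although the simulated data contain no gap at all. Sampling only the two ends of the gradient makes the split look sharper still, which is why a test for clinal variation belongs before the clustering step.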

Jason Antrosio · 7 March 2012

With regard to acoolcoolhippo, would like to throw a couple of recent studies into the mix. On the distinction between Sub-Saharan Africans and non-Sub-Saharan Africans, the 2009 study by Long et al. Human DNA sequences: More variation and less race finds non-Sub-Saharan Africans to be a nested subset of the larger diversity already contained within Sub-Saharan Africa, and that this Sub-Saharan African genetic diversity could in some ways also cluster distinctly: "The pattern of DNA diversity is one of nested subsets, such that the diversity in non-Sub-Saharan African populations is essentially a subset of the diversity found in Sub-Saharan African populations. The actual pattern of DNA diversity creates some unsettling problems for using race as meaningful genetic categories. For example, the pattern of DNA diversity implies that some populations belong to more than one race (e.g., Europeans), whereas other populations do not belong to any race at all (e.g., Sub-Saharan Africans)."

On craniometrics I respect the careful work of Relethford cited above. However, a more recent assessment, Strauss and Hubbe 2010, Craniometric Similarities Within and Between Human Populations in Comparison with Neutral Genetic Data suggests that the craniometric clusters may not be quite as evident as they have been portrayed: "Contrary to what was observed for the genetic data, our results show that cranial morphology asymptotically approaches a mean ω of 0.3 and therefore supports the initial statement--that is, that individuals from the same geographic region do not form clear and discrete clusters--further questioning the idea of the existence of discrete biological clusters in the human species."

loujost · 7 March 2012

Many participants in this exchange, and articles like Templeton's that people are using to support their ideas, appear to think that Fst or Gst are measures of genetic divergence. There are even discussions of cut-off values of Fst = 0.25 to 0.75 to define subspecies. This is crazy, since the maximum possible value of Fst for a given locus is constrained by the mean within-group heterozygosity at that locus. If within-group variation is high, it is mathematically impossible for Fst to reach those thresholds (or any given threshold), even if the groups cannot interbreed, share no alleles, and are each on their own evolutionary course. See Hedrick 2005 or Jost 2008 for the math. Fst and its relatives can approach zero even if no alleles are shared between groups, and can equal unity even if almost all groups are fixed for the same allele.
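Both extreme cases in the last sentence can be checked numerically. The sketch below assumes Nei's GST = (HT - HS) / HT with equally weighted populations and no sampling corrections; the allele frequencies are invented for illustration and the function names are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def gst(pop_freqs):
    """Return (GST, HS) for equally weighted populations, with GST = (HT - HS) / HT."""
    n = len(pop_freqs)
    hs = sum(heterozygosity(f) for f in pop_freqs) / n
    n_alleles = len(pop_freqs[0])
    pooled = [sum(f[i] for f in pop_freqs) / n for i in range(n_alleles)]
    ht = heterozygosity(pooled)
    return (ht - hs) / ht, hs


# Case 1: two populations that share NO alleles, each with ten equally common alleles.
pop_a = [0.1] * 10 + [0.0] * 10
pop_b = [0.0] * 10 + [0.1] * 10
g_disjoint, hs_disjoint = gst([pop_a, pop_b])

# Case 2: 99 populations fixed for one allele, a single population fixed for another.
g_fixed, hs_fixed = gst([[1.0, 0.0]] * 99 + [[0.0, 1.0]])

print(f"no shared alleles:       GST = {g_disjoint:.3f}, ceiling 1 - HS = {1 - hs_disjoint:.3f}")
print(f"almost all groups alike: GST = {g_fixed:.3f}")
```

In the first case GST stays far below any of the proposed cut-offs because it cannot exceed 1 - HS; in the second it reaches 1 even though 99 of the 100 groups are identical.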

Also, some commentators on other threads reacting to Coyne's post have claimed that subpopulations would only diverge genetically at neutral or nearly neutral alleles if the absolute number of migrants per generation were less than 1, based on standard population genetic arguments. Since such a low value of migration is improbable in humans, they argued that human subpopulations would not diverge at neutral or nearly neutral loci. This argument is ultimately based on linking the number of migrants to the equilibrium value of Gst or Fst under simple genetic models. It rests on the misinterpretation of Fst as a measure of genetic differentiation ranging from 0 to 1. As I just mentioned, Fst is not a measure of differentiation in this sense. Therefore the standard population genetic argument used by these commentators is wrong. As I show in Jost 2008, the real quantity determining whether two subpopulations cohere or diverge genetically is m/u, where m is the relative migration rate and u is the mutation rate of the locus in question. Divergence does not depend on the absolute number of migrants but on the relative migration rate: (number of migrants) / (group population). Thus oceans, mountains, deserts, and other leaky barriers might have been sufficient to allow divergence of subpopulations, especially when the subpopulations are large and migration channels are narrow.

I would urge all parties to try to avoid using Fst, and to be cautious about using pop gen arguments that depend on Fst, in this discussion.

Hedrick P (2005) A standardized genetic differentiation measure. Evolution, 59, 1633-1638.
Jost L (2008) Gst and its relatives do not measure differentiation. Molecular Ecology, 17, 4015-4026.

mpearle017 · 8 March 2012

Jason Antrosio said: With regard to acoolcoolhippo, would like to throw a couple of recent studies into the mix. On the distinction between Sub-Saharan Africans and non-Sub-Saharan Africans, the 2009 study by Long et al. Human DNA sequences: More variation and less race finds non-Sub-Saharan Africans to be a nested subset of the larger diversity already contained within Sub-Saharan Africa, and that this Sub-Saharan African genetic diversity could in some ways also cluster distinctly: "The pattern of DNA diversity is one of nested subsets, such that the diversity in non-Sub-Saharan African populations is essentially a subset of the diversity found in Sub-Saharan African populations. The actual pattern of DNA diversity creates some unsettling problems for using race as meaningful genetic categories. For example, the pattern of DNA diversity implies that some populations belong to more than one race (e.g., Europeans), whereas other populations do not belong to any race at all (e.g., Sub-Saharan Africans)." On craniometrics I respect the careful work of Relethford cited above. However, a more recent assessment, Strauss and Hubbe 2010, Craniometric Similarities Within and Between Human Populations in Comparison with Neutral Genetic Data suggests that the craniometric clusters may not be quite as evident as they have been portrayed: "Contrary to what was observed for the genetic data, our results show that cranial morphology asymptotically approaches a mean ω of 0.3 and therefore supports the initial statement--that is, that individuals from the same geographic region do not form clear and discrete clusters--further questioning the idea of the existence of discrete biological clusters in the human species."

@ Jason, the hippo character has asked me to pass on these comments:

“You bring up a great example of why it’s important to specify a subspecies concept -- along with this concept’s criteria. How does Long et al.’s critique relate to the classification of human populations by the geographic race concept being discussed? (Refer to my discussion above.) Maybe Long et al.’s point is relevant when it comes to some other subspecies concepts, but it is not when it comes to this one.

However, a more recent assessment, Strauss and Hubbe 2010, Craniometric Similarities Within and Between Human Populations in Comparison with Neutral Genetic Data suggests that the craniometric clusters may not be quite as evident as they have been portrayed: ‘Contrary to what was observed for the genetic data, our results show that cranial morphology asymptotically approaches a mean ω of 0.3 and therefore supports the initial statement--that is, that individuals from the same geographic region do not form clear and discrete clusters--further questioning the idea of the existence of discrete biological clusters in the human species.’

If you noticed, above I said:

But what about the 75% differing interpretation? This is where I get caught up. Craniometric and dental variation is such that while you can use it to correctly classify individuals, the differences aren’t so extreme that 75% of one population differ from 97% of another even using any combination of the traits; it’s not even close. These then are not good differentiating criteria.

The reason I said this is because I was aware of Strauss and Hubbe’s study. The findings seem to imply that in no set of craniometric differences are more than 75% of one population different from 97% of another. It would be interesting to run the same analysis using the dental data or skeletal data or a combination of all. Whatever the case, the situation illustrates the important difference between two main interpretations of the 75% rule. One can correctly classify more than 75% of individuals based on a set of differentia, without 75% of individuals lying outside 97% of the range of the others in the same differentia. Anyways, this is why I offered natural hair curliness as a differentia (for Subsaharan Africans and non-Africans.) Or was that too superficial?”
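The gap between the two readings of the 75% rule can be made concrete with a small sketch. It assumes a single normally distributed trait, two populations with equal standard deviations, and a classification cutoff at the midpoint between their means; these simplifications and the function names are assumptions for illustration, not anything taken from the cited papers.

```python
from statistics import NormalDist

STANDARD = NormalDist()  # population B: mean 0, SD 1


def correct_classification_rate(d):
    """Share of individuals assigned to the right group by a midpoint cutoff.

    d is the distance between the two population means in SD units.
    """
    return STANDARD.cdf(d / 2.0)


def share_of_a_beyond_b(d, b_percentile=0.97):
    """Share of population A lying above the given percentile of population B."""
    cutoff = STANDARD.inv_cdf(b_percentile)
    return 1.0 - NormalDist(mu=d, sigma=1.0).cdf(cutoff)


for d in (1.5, 2.6):
    print(d, round(correct_classification_rate(d), 3), round(share_of_a_beyond_b(d), 3))
```

Under these assumptions, a separation of about 1.35 SD is enough for 75% correct classification, while having 75% of one population beyond the 97th percentile of the other takes about 2.56 SD. A trait with a 1.5 SD separation therefore passes the first reading and fails the second.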

mpearle017 · 10 March 2012

@ Nick & Jason,

Chuck has asked me to add one final comment as follows:

"Last comment.

Nick and Jason,

I would agree that human populations are at best only weakly subspeciated. As such, if you tighten up the taxonomic standards, likely, no set will qualify as taxonomic races. But, as it is, the standards for non human populations are fairly lax. Let me quote from Remsen (2010), “Subspecies as a meaningful taxonomic rank in avian classifications”:
Mayr et al. (1953) provided objective, quantitative definitions of subspecies based on degree of overlap that can be applied across taxa. They outlined why using simple linear overlap in measurements, for example, overemphasizes extreme individuals in a population and overestimates true population overlap. They also discussed various interpretations of the “75% rule” as the threshold for naming subspecies. Although one interpretation is that only 75% of the individuals of each sample have to be correctly classified, the rule as defined by Amadon (1949), Mayr et al. (1953), and Patten and Unitt (2002) is based on standard deviations from the mean of normally distributed data. Depending on which metric is applied, in essence these definitions mean that 90–97% of the individuals of one population must be distinguishable from the equivalent percentage of the other population to be considered subspecies under the somewhat misleadingly named 75% rule.

As I was noting above, there are multiple interpretations of the 75% rule. The more lax interpretation is (or was) just that 75% of individuals can be correctly classified into the respective populations. This is the interpretation that Sewall Wright (quoted above) was referring to in “Evolution and the Genetics of Populations” in his discussion of human races. Ditto Bodmer and Cavalli-Sforza in “Genetics, Evolution and Man.” By this reading of the rule, clearly the 5-7 major human population clusters qualify as subspecies. I can’t imagine anyone seriously dissenting on this point. Perhaps they would argue that, owing to mass transportation, many individuals don’t “share a unique geographical range” with others of their said race, and maintain that this invalidates the concept as applied to humans. This is an interesting point -- or would be were someone to make it -- but this would, at best, only regress the question from “Are the said populations subspecies?” to “Were the said populations recently subspecies?”

Continuing….
[...]Although the 75% rule has a long history in ornithology, its application has been erratic at best. For example, it is generally not mentioned as a criterion for recognizing subspecies in classifications (e.g., American Ornithologists’ Union 1957, Dickinson 2003) or in any of the Handbook of the World series (del Hoyo et al. 1992–2008). It is not possible to tell how many of the subspecies currently recognized in such sources would qualify as subspecies under the 75% rule, but it is certain that many subspecies, especially in North America, would not qualify as valid taxa under this rule, particularly those defined by mensural differences. From personal experience in attempting to use subspecies diagnoses, such as the keys in the Birds of North and Middle America series (Ridgway and Friedmann 1901–1950), I predict that more than 75% of North American subspecies taxa delimited by mensural data would not survive application of the 75% rule

Now, by this more rigorous interpretation of the rule, it’s not clear if the 5-7 major human population clusters qualify using any combination of traits -- at least, I’m not aware of the qualifying differentia. They might be out there, though. I suggested hair curliness as a differentia between non-Sub Saharans and Sub Saharans. Based on the means and standard deviations presented in Hrdy’s “Quantitative Hair Form Variation in Seven Populations,” for “hair curvature,” these two populations seem to meet the 75% (differentia) criterion. The among to within F-ratio for this trait was 365. And the results agree with that of Loussouarn (2007), assuming curliness and curvature refer to basically the same trait. I’m sure if pressed, though, I could find other traits.

Whatever the case, going by Remsen (2010), it doesn’t seem as if the 75% (differentia) rule is rigorously applied. Why should it be de rigueur, then, when it comes to humans? Can one really maintain that describing human populations as subspecies is an affront to the subspecies concept, given how the concept is actually employed?