The Science of Human Races, Part 1

Uploaded by C0nc0rdance on 16.06.2012

I've been taking a long time to make this video... partly because of things going on
at work, the start of a new project and my recent promotion, but also because I wanted
to get this video just right. I care about the science behind this video and it's easy
for me to get carried away mixing my feelings and the evidence.
In my last video, I made it clear that I don't see much of a basis for biological races in
the human species, but I now realize that I was using the wrong terms, opposing the
wrong movements. I spent my efforts showing that races are not terribly useful, are arbitrary,
and that they don't really matter much in genetic terms, a position shared by most geneticists,
but my critics rightly pointed out that even arbitrary categories exist. So I don't see
any reason to oppose the assertion that races exist, given the proper biological definition
of races of organisms, and that they have at least some small aspect that is rooted
in biology. We'll call this perspective the biological race concept, or BRC.
What I continue to oppose, and what I will present evidence against, is the idea that
races are essential concepts in the human population, what we'll call "race essentialism"
or RE. This is the idea that the buckets we assign people to are somehow more meaningful
than any other way of dividing human diversity. Central to this question is the number of
races, the number of buckets, in which we find that the populations are homotypic, that
is, of a single type, a uniformity of members. Race essentialists defend the idea that there
is something in the biology that suggests the categories we draw, that they are not
completely arbitrary, and it's an idea that has roots in Victorian Era science... an era
that was dominated in the biological sciences by ideas like phrenology and eugenics and
genetic determinism. We see these concepts reborn in the language of the race essentialists.
Ideas like dividing the entire human species into large continental categories and then
comparing social characters like criminality, IQ, and income, as though those were entirely
determined by immutable characters of inheritance, rather than complex and interconnected factors
of society, environment, genetics and non-genetic inheritance. It is a simplistic understanding
of biology, a retro approach to science, and that's specifically what I want to differentiate
from the simple act of defining groups in the human population, the biological race
concept. The key difference is how much we become wedded to the idea that these concepts
really have meaning or force in biology, that they are useful categories in how we treat
people, or what freedoms or rights we grant, or how we perceive our fellow humans. In my
own estimate, if we let the biology alone determine these divisions, there are either
about 350 human races, about 250 of which reside exclusively in Africa, or there is
one human race.
I'm going to break this up into a series of videos. The first topic we'll cover is the
biological vs. the social construction of race.
1. The social construct model of race.
The human races, as most people think of them, are unquestionably social constructs. That's
not to say that there are no biological races, but they're not the ones we use in everyday
parlance. The labels we stick on people are not dependent on their actual ancestry, merely
on our perception of them. Haplogroups, as I mentioned in my previous video, would be
one such biological race concept, since it uses actual genetic markers, but is devoid
of the cultural elements that we have imbued with false meaning. It's less prone to confirmation
bias and confounding by social norms. What we actually use when we talk about races are
socially constructed, truly skin deep because we judge someone's race strictly on their
outward appearance, not their inward genetics. Any overlap with actual biological race is
purely statistical and accidental.
I give you exhibit A, the current US president, Barack Hussein Obama. What race is he? You
can answer: white, black, mixed and be right on all counts, since his mother identifies
with one group and his father another. To most people, his race is not a matter of what
genetic markers he possesses... it's how he's perceived. His social identity. He may choose
it, he may not. Growing up in Southeast Asia with an adopted father, who knows how he was
thought of there, what group he most identified with? But to most Americans he's the first
black president... not the first mixed race president, not the first half-black or half-white
president... simply black. The one-drop rule of race identification has been the standard
in the US since long before DNA testing was available. It's obviously not very reflective
of the actual markers I would find if I genotyped the US President. He's not atypical for the
American population with African ancestry. When genotyped, someone who self-identifies
as African American will typically have between 5 and 20% European ancestral markers. Many
will have Asian markers as well. The same for those who identify as European-Americans...
most will be unaware of very recent African ancestry. This is called admixture, the mixing
of different ancestral populations in the genetic complement of a modern population
through interbreeding. Admixture is the rule in most places, but especially in areas with
a high diversity or multi-regional immigration.
Many of the race realists would say that the existence of mixed races doesn't mean they
don't exist biologically, but it absolutely does mean that the rules of social race identification
are not based on classical taxonomy. There are no half-robin/half sparrows. That's simply
not how taxonomy works. As a general rule, if we can tell two populations apart by only a single
character, then there really aren't two populations. If there's significant genetic overlap between
two subgroups, then they aren't distinct after all. There are two exceptions, not found within
strict taxonomy but still useful in population genetics, and we'll need to see if either
or both of them applies to the human species. But I want to clarify early on the idea that
an alien coming to Earth with our same understanding of phylogenetics would put all modern humans
into a single species and a single subspecies. We are all Homo sapiens sapiens. Any differentiation
from this point on will have to be below the level of subspecies.
The first concept we need to address is a "deme". A deme is a distinctive group within
a species where reproduction is still possible between the groups, but each group is subject
to different selection factors. An example might be a single species of bird with two
distinctive groups, say a Western and Eastern group that differ a bit in their mating call
and only rarely interbreed. If the differences between the two groups vary continuously across
geography, then we have our second concept, a "cline". A cline is different from a deme,
or perhaps could be considered a special type of deme, where there is no demarcation between
the groups, what we would call a discontinuity. Imagine for example a species of rabbit with
several different coat types, depending on the local environment, where the mountain
deme blends into the desert deme in small steps. In classical taxonomy, a deme or cline
would fall below the level of a subspecies, and the definitions for these groups are flexible
and only very loosely defined.
So, how do we define whether a group is a subspecies, race, deme or cline?
The famous geneticist Sewall Wright, the man who wrote most of the equations that we use
today in the field of population genetics, developed a particular descriptive number
called the Fst, or population fixation index. It was a way of describing something we call
population substructure... that is, whether a large group is genetically homogeneous, composed
of a single large population, or heterogeneous, composed of diverse groups of breeding individuals
isolated from each other and genetically distinct. This is clearly the best statistic for describing
whether or not races are genetically distinct. The Fst for modern humans is approximately
0.110, or we could say 11%. What that means, in essence, is that 89% of all the variation
in humans is shared across all groups. Only 11% of human genetic diversity can best be
explained by the presence of distinctive subgroups. Wright himself proposed that a subspecies
should be considered valid when this Fst value exceeded 0.25. Humans don't even make it halfway
to this standard criterion. So, objectively, by the standard taxonomic practices governing
subspecies, human populations don't qualify as subspecies. If we do decide to assign subcategories
to the human population, they cannot be called subspecies.
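To make the Fst number concrete, here is a minimal sketch of the textbook heterozygosity form, Fst = (H_T - H_S) / H_T, for a single two-allele marker. The function names and the allele frequencies are invented for illustration, and the sketch assumes equally sized subpopulations; it is not the calculation behind the 0.11 figure quoted above.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for a two-allele marker at frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """Fst = (H_T - H_S) / H_T, assuming equally sized subpopulations.

    freqs holds the frequency of one allele in each subpopulation.
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)  # diversity if everyone is pooled
    if h_t == 0.0:
        return 0.0  # no variation at all, so nothing to partition
    # Average diversity found inside the subpopulations themselves.
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies, chosen only to show the range of the statistic.
identical_groups = fst([0.5, 0.5, 0.5])  # no substructure
modest_split = fst([0.33, 0.67])         # groups differ, but overlap heavily
fixed_split = fst([0.0, 1.0])            # each group fixed for a different allele

print(identical_groups, modest_split, fixed_split)
```

The middle case lands a little above 0.11, which gives a feel for how much two groups can differ in allele frequency while most of the variation is still found inside each group.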
We might still use the term deme or cline. We might still talk about races. There are
no criteria that a population needs to meet for these categories. They're completely arbitrary
in division. We can have 3 million human biological races or 1... both are equally valid because
both are completely arbitrary buckets in which to put diversity. I want to make it clear
that I don't object to arbitrary buckets, I only object to the essentialist concept
that the buckets were there before we created them, and that dividing up diversity in this
way reveals something significant. It's just an arbitrary division, like dividing up World
History into the Classical era, the Middle Ages and Renaissance Period. It's a way of
simplifying a continuum, breaking it up into understandable chunks. The essentialism in
our history example would be to treat 400 BC and 350 BC as though some real division
exists between them (beyond the 50 intervening years) simply because one falls into our arbitrary
division of the Classical period and the other does not.
My central question in this video is: does the biology, the genetics, of the human population
suggest that there is a good place to draw divisions, these buckets or categories? Are
we defined best in terms of the Victorian Era ideas of race, or rather the modern concept
of arbitrary but objective divisions like haplogroups or clines?
For that, we're going to have to look at some real data. The type of graph we'll look at
first is called a principal component analysis. On each axis, or properly each eigenvector,
we're going to use a combination of lots of genetic markers. I don't want to focus too
much on the nature of these markers except to say that they're from non-coding regions,
not from genes, but from the vast stretches between genes. They have not been subjected to specific
selection, so what we're measuring has nothing to do with adaptation to the local environment...
merely the natural genetic drift of two populations with some reproductive isolation.
Each dot on the PCA graph represents a group of individuals drawn from a different population.
The goal here is to see how much of the variation in the dots can be accounted for by our two
marker sets. If human races are distinctive demes, this is what they would look like.
You'll find graphs just like this one in papers showing how biological races differ from each
other. So why would I bring it up? Doesn't data like this destroy the idea that the races
aren't distinctive? Yes and no. The problem here is one of how we select our populations.
If you choose one population from Central Africa, one from Central Asia, and one from
Australia, this is indeed what you see. Data like this suggests that these populations
must have been isolated for a long time, with no intermarriage and hence a great deal of
genetic drift and differentiation. Africans and Europeans didn't share a gene pool for
a long time.
However, what happens when we sample from all those geographic regions in between these
distant populations? Now we can see what looks like a continuous change from one region to
another. This is consistent with a model where gene flow is fairly continuous across geography.
The only reproductive isolation was from geographic distance, so that markers found in populations
in Africa remain distributed in populations in the Middle East. This is the classical
presentation of a cline, the continuous distribution of alleles that we discussed earlier. Human
diversity is best represented by a continuum of change across geography, with the occasional
gap where a physical separation existed between two peoples, producing what is called in genetics
a discontinuity. We see these clines on the global scale, say from Portugal to Siberia,
but we also see them on the very, very small scale, even within a single ethnic group or
national population. For example, geneticists can map a cline of Spanish markers into Western
France, or differentiate French and German speaking Swiss people.
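The sampling point can be sketched with a toy model. Assume, purely for illustration, that one allele's frequency rises in a straight line along a transect from one end of a continent to the other. The function names, the linear shape and the numbers are all hypothetical. The only thing that changes between the two runs is how many places we sample.

```python
def cline_frequency(position):
    """Allele frequency at a position in [0, 1] along a hypothetical transect.

    A straight-line cline is assumed here only to keep the example simple.
    """
    return 0.1 + 0.8 * position


def largest_gap(positions):
    """Biggest jump in allele frequency between neighbouring sampled populations."""
    freqs = sorted(cline_frequency(x) for x in positions)
    return max(b - a for a, b in zip(freqs, freqs[1:]))


# Three far-apart populations versus twenty-one populations along the same transect.
sparse_sites = [0.0, 0.5, 1.0]
dense_sites = [i / 20 for i in range(21)]

sparse_gap = largest_gap(sparse_sites)
dense_gap = largest_gap(dense_sites)

print(sparse_gap, dense_gap)
```

With three distant sites the populations look separated by large jumps; with the in-between sites filled in, the same underlying cline shows only small steps. The apparent gaps came from where we chose to sample, not from the data.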
To me, this suggests that dividing up diversity into homotypic groups is probably doomed from
the very beginning. If you attempt to draw dividing lines at each steep-sloped cline
or discontinuity, you will find that there are anywhere from 120 to 600 of these clinal
sets in the global population. If you attempt to keep your divisions at sharp discontinuities,
you'll find that you can't have more than 1 or 2 subgroups. That's why I say that the
number of human homotypic races suggested by biological diversity is either 300 or 1.
There just are no other divisions that account for the continuous changes both between and
within populations.
Let's take our PCA graph and try some other arbitrary divisions. What we're going to alter
here is the number of inferred populations, and we assign the capital letter K to this number.
If we assume 2 homotypic groups exist in the human population, that is K = 2, this is what
the analysis looks like. Perfectly valid, we've created two lobes. Set K = 3, and now
we see three distinctive populations, suggesting a central population that two groups migrated
from. Set K=4 and we have the basic continental populations. Again, the data isn't changing,
only how many inferred populations we assume. If we increase K to 9 inferred groups, you
get an analysis that still looks quite valid. There's nothing in the data here that tells
us what value K should be unless we already have a goal in mind.
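The same idea can be tried in a few lines of code. This is a sketch only: it uses plain k-means (Lloyd's algorithm) on one made-up axis of evenly spread points, standing in for whatever clustering produced the graphs in the video, which the transcript does not specify. All names and data are hypothetical.

```python
def kmeans_1d(points, k, iterations=50):
    """Lloyd's algorithm on 1-D data; returns (centers, labels)."""
    lo, hi = min(points), max(points)
    # Deterministic start: centers spread evenly over the data range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in points]
        # Move each center to the mean of the points assigned to it.
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, labels


def within_cluster_ss(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum((x - centers[lab]) ** 2 for x, lab in zip(points, labels))


# A made-up continuum: 100 evenly spaced values with no gaps anywhere.
points = [i / 99 for i in range(100)]

results = {}
for k in (2, 3, 4, 9):
    centers, labels = kmeans_1d(points, k)
    results[k] = (len(set(labels)), within_cluster_ss(points, centers, labels))

for k, (groups_found, spread) in results.items():
    print(k, groups_found, round(spread, 4))
```

Every value of K returns exactly K tidy groups, and the fit simply keeps improving as K grows. Nothing in the output singles out one K as the right one, even though the data were built with no groups at all.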
So, how does my simulated data compare to actual PCA plots of human populations?
Let's look at a few. Depending on how we set up the markers and the populations, we can
get tightly clustered demes, or we can get clinal variation. That information, by itself,
can tell us something about the history of our species.
In the next video, we'll take another look at admixture data using a program called STRUCTURE,
and we'll explore the question of IQ, criminality and the presence of genetic markers from ancient
hominins in modern humans.