The Racist Origins of Standardized Testing Still Matter

Would you walk across a bridge that was designed to break?

Of course you wouldn’t.

But what if someone told you the bridge had been fixed?

Would you trust it – especially if people were still falling off of it all the time?

That’s the situation we’re in with standardized testing.

The tests were explicitly created more than a century ago to fail minorities and the poor.

And today, after countless revisions and new editions, they still do exactly the same thing.

Yet we’re exhorted to keep using them.

A BRIEF HISTORY LESSON

Modern testing comes out of U.S. Army IQ tests developed during World War I.


In 1917, a group of psychologists led by Robert M. Yerkes, president of the American Psychological Association (APA), created the Army Alpha and Beta tests. These were specifically designed to measure the intelligence of recruits and help the military distinguish those of “superior mental ability” from those who were “mentally inferior.”


These assessments were based on explicitly eugenicist foundations – the idea that certain races were distinctly superior to others.

His colleague Lewis Terman made the goal clear in his book The Measurement of Intelligence, predicting that these “experimental” tests would show “enormously significant racial differences in general intelligence, differences which cannot be wiped out by any scheme of mental culture.”

In 1923, another psychologist, Carl Brigham, took these ideas further in his seminal work A Study of American Intelligence. In it, he used data gathered from these IQ tests to argue the following: 


 
“The decline of American intelligence will be more rapid than the decline of the intelligence of European national groups, owing to the presence here of the negro. These are the plain, if somewhat ugly, facts that our study shows. The deterioration of American intelligence is not inevitable, however, if public action can be aroused to prevent it.”


 
 Thus, Yerkes, Terman and Brigham’s pseudoscientific tests were used to justify Jim Crow laws, segregation, and even lynchings. Anything for “racial purity.” 


People took this research very seriously. States passed forced sterilization laws for people with “defective” traits, sterilizing between 60,000 and 70,000 people to keep them from “polluting” America’s ruling class.

The practice was even upheld by the US Supreme Court in the 1927 Buck v. Bell decision. Justices decided that mandatory sterilization of “feeble-minded” individuals was, in fact, constitutional.


 Of the ruling, which has never been explicitly overturned, Justice Oliver Wendell Holmes wrote, “It is better for all the world, if instead of waiting to execute degenerate offspring for crime, or to let them starve for their imbecility, society can prevent those who are manifestly unfit from continuing their kind…. Three generations of imbeciles are enough.” 


Eventually Brigham took his experience with Army IQ tests to create a new assessment for the College Board – the Scholastic Aptitude Test – now known as the Scholastic Assessment Test or SAT. It was first given to high school students in 1926 as a gatekeeper. Just as the Army intelligence tests were designed to distinguish the superior from the inferior, the SAT was designed to predict which students would do well in college and which would not. It was meant to show which students should be given the chance at a higher education and which should be left behind. 


And unsurprisingly it has always privileged – and continues to privilege – white students over children of color. The same as nearly every standardized test still does.

HAS IT CHANGED?

None of this can be challenged. These are historical facts. They are simply what happened, justified in the words of the very people who perpetrated them.

However, champions of standardized testing most often defend the practice by appealing to time.

This was all a long time ago, they say. Much has changed between now and then.

But has it? Really?

We certainly don’t use the editions of the tests written by the original eugenicists, but the practices used to create them and the results of these assessments are extremely similar.

In 1964, a Department of Education report found that the average black high school senior scored below 87% of white seniors (in the 13th percentile) on standardized assessments. Fifty years later, the National Assessment of Educational Progress (NAEP) found that black seniors had narrowed the gap until they were merely behind 81% of white seniors (scoring in the 19th percentile).

Is that really the kind of progress you want to champion?

The reason for the disparity has nothing to do with how much students of color (and the poor, whose scores are similar) actually learn, nor with their worth as human beings.

It is in the very concept of standardized testing.

Discrimination is purposefully built into the standardization process, according to W. James Popham, PhD, Professor Emeritus at the University of California at Los Angeles and former test maker. He explained in an interview with Frontline:

“Traditionally constructed standardized achievements, the kinds that we’ve used in this country for a long while, are intended chiefly to discriminate among students … to say that someone was in the 83rd percentile and someone is at 43rd percentile. And the reason you do that is so you can make judgments among these kids. But in order to do so, you have to make sure that the test has in fact a spread of scores. One of the ways to have that test create a spread of scores is to limit items in the test to socioeconomic variables, because socioeconomic status is a nicely spread out distribution, and that distribution does in fact spread kids’ scores out on a test.”

The scores have to fall into categories – Below Basic, Basic, Proficient and Advanced, for instance. If too many students cluster in the middle, the results are invalid. We need the scores spread out – even if we must resort to non-educational factors to get there.

Family income is not something the tests ignore, says Popham. It is an essential component specifically tested for in question construction. In fact, he claims that between 15% and 80% of the questions (depending on the subject area) on norm-referenced exams are linked to socioeconomic status (SES).

Thus minority groups, which have higher percentages of people living in poverty, are selected against. Not because of any explicit racist ideology – but to get the pretty bell curve standardized assessments require.

Young Whan Choi, Manager of Performance Assessments at Oakland Unified School District in Oakland, California, agreed.

“Too often, test designers rely on questions which assume background knowledge more often held by White, middle-class students. It’s not just that the designers have unconscious racial bias; the standardized testing industry depends on these kinds of biased questions in order to create a wide range of scores.”

For example, Choi recalled a 10th-grade student in his class asking him about a standardized test question. “With a puzzled look, she pointed to the prompt asking students to write about the qualities of someone who would deserve a ‘key to the city.’ Many of my students, nearly all of whom qualified for free and reduced lunch, were not familiar with the idea of a ‘key to the city.’”

So when they get such a question wrong, it isn’t necessarily because they don’t know the concept being tested, but because they don’t understand what is being asked in the first place.

Test makers could work to eliminate such instances, but that would reduce the spread of scores. It would destroy the bell curve – and thus defeat the goal of the test, which ultimately is not assessing learning but sorting and ranking students.

Jay Rosner, a national admissions test expert, explained how this bias is built into the process for each revision of assessments like the SAT:


“Compare two 1998 SAT verbal [section] sentence-completion items with similar themes: The item correctly answered by more blacks than whites was discarded by the Educational Testing Service, whereas the item that has a higher disparate impact against blacks became part of the actual SAT. On one of the items, which was of medium difficulty, 62% of whites and 38% of African-Americans answered correctly, resulting in a large impact of 24%…On this second item, 8% more African-Americans than whites answered correctly…”


In other words, the criterion for whether a question is chosen for future tests is whether it replicates the outcomes of previous exams – specifically tests where students of color score lower than white children. And this is still the criterion test makers use to determine which questions to use on future editions of nearly every assessment in wide use in the US.

Public schools have no control over these factors. That’s why schools serving poor and minority students invariably have lower test scores. Popham concludes there will always be a testing gap because that’s the way the system is designed.

He says it’s “A game without winners.” Or more likely a game where the poor and minorities cannot win.

And that’s how the system was designed.

Whether it’s the 1920s or the 2020s.

Standardized tests came from racist assumptions about human intelligence.

And that still matters today.

We no longer profess the eugenicist ideas of racial purity embedded in our assessments as self-evident or based on science. But they’re there nonetheless.

Today, the very concept of intelligence being quantifiable remains in question.

In “The Mismeasure of Man,” evolutionary biologist Stephen Jay Gould challenged many of these ideas – in particular those of Terman. Rather than see intelligence as a genetic trait, Gould envisioned it more abstractly and thus shattered the idea that any mere number could capture human value.

Complex instances of learning cannot be accurately standardized. They have to be measured in the context of the students and the learning community where they were acquired.

But this is a relatively new idea. Accepting it requires us to turn the page on a rather dark passage of our national history.

We must move on from the original test makers’ narrow-minded, racist ideas about intelligence and human worth.

And to do that we must leave standardized testing far, far behind.

We can’t just give a faulty bridge a new coat of paint. We must demolish it and rebuild an entirely new structure to carry us into the future.



I’ve also written a book, “Gadfly on the Wall: A Public School Teacher Speaks Out on Racism and Reform,” now available from Garn Press. Ten percent of the proceeds go to the Badass Teachers Association. Check it out!

Standardized Testing is a Tool of White Supremacy


Let’s say you punched me in the face.

 

I wouldn’t like it. I’d protest. I’d complain.

 

And then you might apologize and say it was just an accident.

 
Maybe I’d believe you.

 

Until the next time we met and you punched me again.

 

That’s the problem we, as a society, have with standardized tests.

 

We keep using them to justify treating students of color as inferior and/or subordinate to white children. And we never stop or even bother to say, “I’m sorry.”

 

Fact: black kids don’t score as high on standardized tests as white kids.

 

It’s called the racial achievement gap and it’s been going on for nearly a century.

 

Today we’re told that it means our public schools are deficient. There’s something more they need to be doing.

 
But if this phenomenon has been happening for nearly 100 years, is it really a product of today’s public schools or a product of the testing that identifies it in the first place?

 

After all, teachers and schools have changed. They no longer educate children today the same way they did in the 1920s when the first large scale standardized tests were given to students in the US. There are no more one-room schoolhouses. Kids can’t drop out at 14. Children with special needs aren’t kept in the basement or discouraged from attending school. Moreover, none of the educators and administrators on the job during the Jazz Age are still working.
 

Instead, we have robust buildings serving increasingly large and more diverse populations. Students stay in school until at least 18. Children with special needs are included with their peers and given a multitude of services to meet their educational needs. And that’s to say nothing of the innovations in technology, pedagogy and restorative justice discipline policies.

 

But standardized testing? That hasn’t really changed all that much. It still reduces complex processes down to a predetermined set of only four possible answers – a recipe good for guessing what a test-maker wants more than expressing a complex answer about the real world. It still attempts to produce a bell curve of scores so that so many test takers fail, so many pass, so many get advanced scores, etc. It still judges correct and incorrect by reference to a predetermined standard of how a preconceived “typical” student would respond.

 

Considering how and why such assessments were created in the first place, the presence of a racial achievement gap should not be surprising at all. That’s the result these tests were originally created to find.

 

Modern testing comes out of Army IQ tests developed during World War I.

 
In 1917, a group of psychologists led by Robert M. Yerkes, president of the American Psychological Association (APA), created the Army Alpha and Beta tests. These were specifically designed to measure the intelligence of recruits and help the military distinguish those of “superior mental ability” from those who were “mentally inferior.”
 

These assessments were based on explicitly eugenicist foundations – the idea that certain races were distinctly superior to others.
 
In 1923, one of the men who developed these intelligence tests, Carl Brigham, took these ideas further in his seminal work A Study of American Intelligence. In it, he used data gathered from these IQ tests to argue the following:
 

 

“The decline of American intelligence will be more rapid than the decline of the intelligence of European national groups, owing to the presence here of the negro. These are the plain, if somewhat ugly, facts that our study shows. The deterioration of American intelligence is not inevitable, however, if public action can be aroused to prevent it.”

 

 
Thus, Yerkes and Brigham’s pseudoscientific tests were used to justify Jim Crow laws, segregation, and even lynchings. Anything for “racial purity.”
 

People took this research very seriously. States passed forced sterilization laws for people with “defective” traits, sterilizing between 60,000 and 70,000 people to keep them from “polluting” America’s ruling class.
 
The practice was even upheld by the US Supreme Court in the 1927 Buck v. Bell decision. Justices decided that mandatory sterilization of “feeble-minded” individuals was, in fact, constitutional.

 
Of the ruling, which has never been explicitly overturned, Justice Oliver Wendell Holmes wrote, “It is better for all the world, if instead of waiting to execute degenerate offspring for crime, or to let them starve for their imbecility, society can prevent those who are manifestly unfit from continuing their kind…. Three generations of imbeciles are enough.”
 

Eventually Brigham took his experience with Army IQ tests to create a new assessment for the College Board – the Scholastic Aptitude Test – now known as the Scholastic Assessment Test or SAT. It was first given to high school students in 1926 as a gatekeeper. Just as the Army intelligence tests were designed to distinguish the superior from the inferior, the SAT was designed to predict which students would do well in college and which would not. It was meant to show which students should be given the chance at a higher education and which should be left behind.
 

And unsurprisingly it has always privileged – and continues to privilege – white students over children of color.

 
The SAT remains a tool for ensuring white supremacy that is essentially partial and unfair – just as its designers always meant it to be.
 
Moreover, it is the model by which all other high stakes standardized tests are designed.

 
But Brigham was not alone in smuggling eugenicist ideals into the education field. These ideas dominated pedagogy and psychology for generations until after World War II when their similarity to the Nazi philosophy we had just defeated in Europe dimmed their exponents’ enthusiasm.
 

Another major eugenicist who made a lasting impact on education was Lewis Terman, Professor of Education at Stanford University and originator of the Stanford-Binet intelligence test. In his highly influential 1916 textbook, The Measurement of Intelligence, he wrote:

 

“Among laboring men and servant girls there are thousands like them [feebleminded individuals]. They are the world’s ‘hewers of wood and drawers of water.’ And yet, as far as intelligence is concerned, the tests have told the truth. … No amount of school instruction will ever make them intelligent voters or capable citizens in the true sense of the word.

… The fact that one meets this type with such frequency among Indians, Mexicans, and negroes suggests quite forcibly that the whole question of racial differences in mental traits will have to be taken up anew and by experimental methods.

Children of this group should be segregated in special classes and be given instruction which is concrete and practical. They cannot master abstractions, but they can often be made efficient workers, able to look out for themselves. There is no possibility at present of convincing society that they should not be allowed to reproduce, although from a eugenic point of view they constitute a grave problem because of their unusually prolific breeding” (91-92).

 

This was the original justification for academic tracking. Terman and other educational psychologists convinced many schools to use high-stakes and culturally biased tests to place “slow” students into special classes or separate schools while placing more advanced students of European ancestry into college preparatory courses.

 
The modern wave of high stakes testing has its roots in the Reagan administration – specifically the infamous propaganda hit piece A Nation at Risk: The Imperative for Education Reform.

 
In true disaster capitalism style, it concluded that our economy was at risk because of poor public schools. Therefore, it suggested circumventing the schools and subordinating them to a system of standardized tests, which would be used to determine everything from teacher quality to resource allocation.

 
It’s a bizarre argument, but it goes something like this: the best way to create and sustain a fair educational system is by rewarding “high-achieving” students.
 

So we shouldn’t provide kids with what they need to succeed. We should make school a competition where the strongest get the most and everyone else gets a lesser share.

 
And the gatekeeper in this instance (as it was in access to higher education) is high-stakes testing. The higher the test score, the more funding your school receives – and with it come smaller class sizes, a wider curriculum, more tutors, more experienced and better-compensated teachers, etc.
 

It’s a socially stratified education system completely supported by a pseudoscientific series of assessments.

 
After all, what is a standardized test but an assessment that refers to a specific standard? And that standard is white, upper class students.
 
In How the SAT Creates Built-in Headwinds, national admissions-test expert Jay Rosner explains the process by which SAT designers decide which questions to include on the test:

 

“Compare two 1998 SAT verbal [section] sentence-completion items with similar themes: The item correctly answered by more blacks than whites was discarded by [the Educational Testing Service] (ETS), whereas the item that has a higher disparate impact against blacks became part of the actual SAT. On one of the items, which was of medium difficulty, 62% of whites and 38% of African-Americans answered correctly, resulting in a large impact of 24%…On this second item, 8% more African-Americans than whites answered correctly…”

 
In other words, the criterion for whether a question is chosen for future tests is whether it replicates the outcomes of previous exams – specifically tests where students of color score lower than white children. And this is still the criterion test makers use to determine which questions to use on future editions of nearly every assessment in wide use in the US.
 

Some might argue that this isn’t racist because race was not explicitly used to determine which questions would be included. Yet the results are exactly the same as if it were.

 
Others want to reduce the entire enterprise to one of social class. It’s not students of color that are disadvantaged – it’s students living in poverty. And there is overlap here.
 

Standardized testing doesn’t show academic success so much as the circumstances that caused that success or failure. Poor nutrition, food insecurity, lack of prenatal care or early childcare, fewer books in the home, exposure to violence – all of these and more combine to result in lower academic outcomes.

 

But this isn’t an either/or situation. It’s both. Standardized testing has always been about BOTH race and class. They are inextricably entwined.

 
Which leads to the question of intention.

 
If these are the results, is there some villain laughing behind the curtain and twirling the ends of a handlebar mustache?
 

Answer: it doesn’t matter.
 

As in the entire edifice of white supremacy, intention is beside the point. These are the results. This is what a policy of high stakes standardized testing actually does.
 

Regardless of intention, we are responsible for the results.
 

If every time we meet, you punch me in the face, it doesn’t matter if that’s because you hate me or you’re just clumsy. You’re responsible for changing your actions.
 
And we as a society are responsible for changing our policies.

 
Nearly a century of standardized testing is enough.

 
It’s time to stop the bludgeoning.
 
It’s time to treat all our children fairly.
 

It’s time to hang up the tests.

 


NOTE: This article expands upon many ideas I wrote about in an article published this week in Public Source.


 
