Is the SAT an IQ test?
No.
Why isn't it an IQ test?
Because it doesn't measure IQ. It is used that way. And it was developed
from the army IQ test. But even the College Board will refuse to say that this
is an intelligence test. And I'd love to see them say it. I'd love to see them
say anything because then you can attack it. But there's this kind of mushy
response that when you work your way through it, there's sort of nothing
left--'Well, it has a slight predictive validity to freshman year grades in
college.' We spend 100 million dollars a year for that? You know--your grades
in high school predict college grades better than this and we didn't have to
spend anything.
They say the SAT provides a common yardstick for comparing grades at
different schools.
Right, that's where all of the anti-test people, I think, are wrong. And
where the testing folks are right. You do need a common yardstick. You do need
some way to judge an A at this school or this teacher versus an A at this
school or this teacher. But there are lots of common yardsticks. Again, you
could use blood type. You could use height. Anything is a common yardstick.
What you have to say is, fine, it's common. But is it useful? And there are
lots of tests that are more useful than the SAT that are also common.
Such as?
Well, for instance, the Advanced Placement Tests. They are rigorous.
They're difficult. There are lots of them. You can say, my interest is history.
And so I'm going to take a history Advanced Placement test. And I'm going to
get--I want kids to be rigorous. I want curricula to be rigorous. But I want them
to be not one-size-fits-all and not mindless. Let's have kids studying hard but
let's have them studying something useful, hard.
You say the SAT doesn't measure intelligence. What does it
measure?
The SAT is said to predict freshman year grades in college, a little. And
it does. It measures it a little. Almost anything you do, including family
income, will measure freshman year grades a little. But the point is that it
doesn't measure intelligence. It doesn't measure anything that's worth 100
million dollars a year prepping for it.
Are you saying the SAT is a guessing game?
Part of the SAT is a guessing game. FairTest's position is reasonably new
on it. We've never said that the SAT is not measuring something meaningful.
There's a little part of meaning, there's a little part of background, there's
a little part of schooling. But there's a lot of test-wiseness. There's a lot
of 'how shrewdly you can play the game?' There's a lot that can be taught in
coaching courses that has nothing to do with any of the skills you need to
succeed in college or in life.
Wayne Camara, head of the College Board's Office of Research, says the
SAT is a measure of verbal and mathematical reasoning ability.
Well, Wayne Camara, who I know is a decent person--maybe, he believes that.
But if you talk to representatives of tobacco companies, they tell you their
products are good and healthy and don't lead to diseases. And auto
manufacturers have told you that their cars don't burst into flames. And back
at the beginning of the century, you know, stockyard owners claimed that they
were slaughtering the animals in a clean way. I think we've learned not to
trust those who profit from manufacturing products, as the sole source of
information.
Yes, one of the things the SAT is measuring in small part, are those
skills. But it's measuring a whole lot of other things. And when you use the
SAT as the major factor--or worse, the sole factor--to make high stakes
decisions, to define what is merit, you're relying not on what most people
think of as merit, but on these very trainable skills that coaching courses and
others help people learn.
What does the SAT predict?
The sole scientific claim of the SAT is its capacity to predict first year
grades. According to the technical studies done by the Educational Testing
Service and College Board, the SAT predicts about one factor in six--one sixth
of the difference between two kids' first-year grades. The predictive value
declines after that--looking at four year grades or graduation rates. So even
the test makers agree that five out of six parts of whatever it takes to
predict how well you're going to do in your freshman year, is not their test.
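The "one sixth" figure can be turned into a correlation coefficient with a line of arithmetic. The sketch below assumes that "one sixth of the difference" means the share of variance in first-year grades that the test explains (r squared). That reading is an assumption, and the function name is illustrative only.

```python
import math

def correlation_from_variance_share(share):
    """Convert a share of variance explained (r squared) into a correlation r."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return math.sqrt(share)

# Assumption: "one factor in six" is read as r squared = 1/6.
explained = 1 / 6
r = correlation_from_variance_share(explained)
unexplained = 1 - explained

print(f"correlation r = {r:.2f}")
print(f"share of variance left unexplained = {unexplained:.2f}")
```

Under that reading, a test that explains one sixth of the variance corresponds to a correlation of roughly 0.4, and five sixths of the variance is left to everything else.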
It does correlate extremely highly with an IQ test. It was developed from
the army IQ test...
That's part of the seedy underside of the SAT. The SAT was originally
developed by straight-out racists--eugenicists, people who thought my
forebears--not just people of color--were imbeciles and shouldn't be allowed
in their country because they didn't know the language and couldn't score high
on their test. I wouldn't suggest the current people who run those companies
share those kinds of ugly views. But it's a self-reinforcing notion: defining
intelligence as whatever the dominant group in society has ends up
giving that group higher scores and other groups lower scores. The fact that test scores
correlate with test scores is rather meaningless. The tests are measuring the
same set of factors. What's more important is whether the test accurately
predicts how well you're going to do.
Would you say that we are the only country in the world that administers
a national IQ test?
Well, despite the efforts of the Educational Testing Service--which is a
global corporation with nearly half a billion dollars in total
revenues--the US still is the major country that administers a test like this
across the board to college bound seniors. If you take the SAT and its
competitor, the ACT--which about 80 percent as many kids take--the vast
majority of college bound kids take those tests. And yes, the SAT in particular
has its roots in IQ testing. Which are at best controversial and, at worst,
quite, quite poor predictors of anything of value.
So is it an IQ test?
It's a variant of an IQ-like test. It is set up somewhat differently. It
begs the question of, what is an IQ test measuring? What is intelligence? And
you talk to test makers. And intelligence is what their test measures. And that's
a circular definition. So to the extent that it's measuring the same thing that an
intelligence test is measuring--then, yes it is. But there's three fallacies
there: That there is such a thing as intelligence. That it can be measured. And
that you can put the measurements on a linear scale. And even people who
believe that there is such a construct as intelligence believe that
intelligence is not one thing but seven or eight or possibly nine different
things. Robert Sternberg at Yale says it's three different things.
At best, the SAT is badly measuring one of those parts of what goes into
intelligence.
What exactly does the SAT measure? You say reasoning and ability
skills.
The SAT measures two areas. It measures developed verbal reasoning, which
are the type of skills that would be measured by reading long reading passages.
For example, in our new test students have an essay where they would read two
contrasting views on a topic. It could be political. It could be in
humanities. It could be in science. And they need to piece together
similarities and differences of the arguments--contrasting views. So that's a
type of analytic thinking and critical thinking skills that are acquired when
you read essays in college. Or the type of scientific or literature work that
you'll encounter in college and in English.
In mathematics, the SAT measures developed mathematical reasoning. So it
shies away from simple computation. As a matter of fact, the SAT of
today--unlike the SAT that you and I probably took--allows and even encourages
students to bring calculators. So it cannot measure simple addition or
division or fractions because those would be incredibly easy with the use of a
calculator. It has to measure reasoning problems, the type that you would
have, in real world applications.
And it also has a number of items that are not multiple choice. Students
have to read the context, understand the mathematical applications involved,
and then generate their own answer.
Isn't it an IQ test?
No, it's not an IQ test. It's far from it. Developed reasoning skills
measured on a test like the SAT, will link directly to the breadth and the
depth of the curriculum students have been exposed to in school, but also out
of school learning. Students who have read an incredible amount, whether it's
in school assignments or out of school assignments, are more likely to do
better on tests like the SAT but also in college.
So it's not an achievement measure, which would be redundant with what
grades are. But it's certainly not an IQ test which would be an innate measure
of ability. It's much more developed reasoning--the type of skills students
develop over an extended period of time.
But the SAT test was basically a stepchild of the army IQ test. Right?
Right.
Even when it was first being proposed, Henry Chauncey would have to
say--no, it's not measuring achievement. It's measuring ability more like an
IQ test. Did it change then from that?
The SAT in the past 40-50 years has changed remarkably. Just as
cognitive ability tests in general--that field has changed remarkably. The
types of items and the way we consider intelligence and aptitude and
achievement today, has evolved in the last 40 years. And tests have gotten
much better and more accurate in what they're doing.
What does the SAT measure?
The classic phrase is that these tests measure what they test. And the SAT
is no exception to that. The way items get on that test is the way items get on
most tests of mental ability which is that they are items that correlate with
performance in school. So an item that you would give to a norming sample that
doesn't correlate very well with school success gets dropped off the test.
Items that do correlate get put on the test. That's how tests get made up.
They're just empirical creations, creations of American and European
pragmatism. If you want to find out what actual mental capacity they measure
you have to work backwards. You have to use statistical techniques to classify
the kinds of performances that they're measuring and work backwards to "Well,
if it measures this cluster of performances, maybe it measures this kind of
capacity." And then there have developed big arguments about which performance
this cluster measures and what performance that cluster measures and which are
central to performance. So it's a very complicated game trying to work
backwards and figure out what these tests actually measure.
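The item-selection procedure described here (keep the items that correlate with school performance in a norming sample, drop the ones that don't) can be sketched in a few lines. Everything in the example is hypothetical: the item names, the six-student sample, and the 0.5 cutoff are made up for illustration and are not taken from any real test.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # an item everyone answers the same way carries no signal
    return cov / math.sqrt(var_x * var_y)

def select_items(responses, criterion, threshold):
    """Keep the items whose right/wrong pattern correlates with the criterion."""
    return [name for name, answers in responses.items()
            if pearson(answers, criterion) >= threshold]

# Hypothetical norming sample: six students, ordered by school grades.
school_grades = [3.8, 3.5, 3.1, 2.9, 2.4, 2.0]
responses = {              # 1 = answered correctly, 0 = answered incorrectly
    "item_a": [1, 1, 1, 0, 0, 0],
    "item_b": [0, 1, 0, 1, 0, 1],
    "item_c": [1, 1, 0, 1, 0, 0],
}

kept = select_items(responses, school_grades, threshold=0.5)
print(kept)
```

Nothing in the procedure says what mental capacity a surviving item taps. The only thing checked is whether getting the item right goes along with the criterion, which is why the meaning has to be worked out backwards afterwards.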
But is this SAT an IQ test?
It is in a sense an IQ test. The SAT and IQ test correlate very highly.
Between the SAT and the IQ, they correlate almost as much as the SAT correlates
with a second administration of the SAT, as much as it correlates with itself.
So they're very similar tests in content.
Give me the little history lesson.
The methodology for standardized tests of the kind that we use today was
developed in the 19th century by Francis Galton who was as many say, the
jealous cousin of Charles Darwin. And he was trying to get a test that would
test his kind of evolutionary, social Darwinist hypothesis that intelligence
ran in families. Of course all kinds of other things ran in families like
wealth, advantage, and so on, but that didn't bother him. He wanted a test that
would discriminate between basically upper class and lower class Brits.
He developed this technology of finding items and seeing how much they
would correlate with other performances as a criterion for whether the item
would be put on the test or not. So he had this situation in the British Museum
I guess where he would have people come in and perform tasks: reaction time
tasks, visual acuity tasks, a whole variety of kind of physiologically-rooted
tasks that he thought would tap into intelligence, sort of innate, physical
intelligence. His presumption was that upper class Brits would do better on
these things than the lower class Brits and he would therefore have a set of
items that he could give to people that were a measure of intelligence that
would discriminate. People who would score high on this would be more likely to
be the upper class Brits. People who scored low on this would be the lower
class Brits. So, he died a failed scientist, never finding a set of items that
worked like that.
Alfred Binet in Paris at the turn of the century, beginning of the 20th
century, was given a practical task of coming up with a test that would help
identify kids who were retarded and wouldn't do well in school. So he simply
used Galton's technology. He said, "Well, I'll make up a bunch of items. And
the items that kids who do well in school get right, I'll put on the test. And
items that kids who don't do so well at school get right, I'll put those off
the test because they couldn't be measuring something relevant to school
success." So he gets a subset of items that kids who do well in school can
perform well on, and now he's got a test that when given to people will tend to
identify those who are not going to do well in school. And he can do what the
Paris school board asked him to do: screen out kids who are going to have real
trouble with school.
Well, as everybody knows, that became the basis of the IQ test. It was
transported into the United States, the Stanford-Binet test. That same
technology of using success in school as a criterion for whether an item gets
put on a test or taken off of a test. And that is how essentially, roughly
speaking, all standardized tests are constructed. The SAT, the GRE, the mini-IQ
test all have that inherent methodology to them.
The man who developed the SAT, Carl Brigham, was an outright racist. Do
you even mull that fact?
As I say, that fact has not been wasted on me. And the area of standardized
testing and intelligence testing has always been one of the most controversial
areas of psychology for precisely that reason. It has often been used as a way
of implementing racist intent, most recently with regard to blacks. But in the
post-World War I wave of immigration, it was used to screen out Southern
Europeans, Jews, and other groups who did not score well on tests at that
particular time. So it has, as a tool, a very, very racist past.
What do you think of the SAT, personally?
I think it is an exam that can tell you something. I've used a metaphor, if
you can indulge that, that I think captures the basic argument I would use. If
you had to select a basketball team by the number of free throws out of 10 that a
player could hit, the first thing you'd worry about is selecting a basketball
player based on how they shoot free throws and you know you'd never pick
Shaquille O'Neal because he's terrible at free throws even though he's a
magnificent basketball player. That's what a standardized test is, compared to
the domain of real school performance. Real school performance out there--it's
like having to select a basketball player based on how well they shoot free
throws. That's the first problem with standardized tests.
And the SAT reflects that. The predictive statistics reflect that. The SAT
measures only about 18%, [an] estimate range from 7 to 25%, of the things that
it takes to do well in school. This is something that people should realize
about the test. People think of it as capturing a very large proportion of
things that are important to school success. The people that make these tests
tell us, "No, that is not true. They don't capture a large portion of the
things--about 18%." In many of the samples I've done research on, much smaller
than that, sometimes 4% of the things that are predicting success in college
for example. So it's not great, just like a free throw is to selecting a
basketball team. And the SAT is not going to get you very far with predicting who's
going to do well in college. And certainly not far with regards to who is going
to do well in society or contribute to society. It's just not that good a tool
and that's the first thing to realize about it.
The second set of problems have to do with interpreting the scores on SAT
tests. And again, the free throw example is useful. If a kid comes in and he
shoots 10 out of 10 or zero out of 10, you might take note of that kind of
performance with regard to selecting him on the basketball team. If he hits 10
out of 10, you say, "Well, okay, he's probably pretty good and that probably
reflects something about his basketball playing. I'll put him on the team. Zero
out of 10, that probably reflects something about his playing, he's off the
team." Same with SAT tests I think. When you get really strong scores one way
or the other, even though they're not as reliable, they often can bring to
light talent that would not otherwise be seen.
And so I am not one who thinks they should be done away with entirely. They
can be useful in that regard as long as we understand how to interpret them and
how little to use them. And I think many college admissions committees are very
sophisticated about this. They are closer to this issue of how predictive tests
are, and they can get a feel for it. So, that's the second thing.
Middling scores on the test are very difficult to interpret because you
don't know. If the kid practiced a little bit more, maybe he would have hit 9
free throws. Maybe he hit only 4 and he's been practicing for 10 years. It's
just hard to interpret the meaning of middling scores and the same is true with
the SAT. A kid who gets anywhere from 1000 to 1200, maybe he got those scores
because of coaching or maybe he got those scores because he didn't have enough
coaching or maybe he got those scores because he went to Europe every summer
and got a great vocabulary about cathedrals and that happened to be on the test
that day. All kinds of things can contribute to performance and it muddies up
the diagnosticity of the test.
Do we or don't we have a neutral and impersonal meritocracy measuring
merit?
Well, it is certainly impersonal. I don't know I'd go so far as to say
it's either neutral or meritocratic. FairTest, for example, would say
that these tests--and I want to be clear, I'm [not] talking about all tests.
I'm a professor, I believe in methods of evaluation. I think some methods are
not only more fair but also more valuable. And what I'm talking about here in
the guise of tests is aptitude testing, tests which are used to predict future
performance, not tests which are used to give feedback, either to the teacher
or to the student, as to what they have actually mastered or what they are
learning. I'm not talking about diagnostic tests. I am talking only about
aptitude tests. Because it is the aptitude test that we are using as the proxy
for merit. And it is as if this test functions as a thermometer. And you give
each person the test as if you were taking their smartness temperature. And
that unfortunately, is not how the test functions. Even the test makers do not
claim it is a thermometer of smartness. All they claim is that it correlates
with first year college grades. And if it's the LSAT, with first-year law
school grades.
Now, correlate--that's a big word. What does correlate mean? There's some
consistency. There's some relationship between the score on this aptitude test
and your first year college grades. That's true. There is some relationship.
The problem is it's a very modest relationship. It is a positive relationship,
meaning it is more than zero. But it is not what most people would assume when
they hear the term correlation. For example, your height correlates better
with your weight than your test score correlates with your first year grades.
Jane Balin, Michelle Fine and I did a study at the University of
Pennsylvania Law School where we actually looked at the first year law school
grades of 981 students and then looked at their LSAT scores. And it turned out
that there was a relationship between their LSAT and their first-year law
school grades. The LSAT predicted 14 percent of the variance between the first
year grades. And it did a little better second year: 15 percent. And I was at
a meeting with a person who at the time worked for the law school admissions
council who constructs the LSAT. And she said, well, nationwide the test is
nine percent better than random. Nine percent better than random. That's what
we're talking about.
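One way to get a feel for what "14 percent of the variance" means in practice is to ask how often the higher scorer of two randomly chosen students also ends up with the higher grade. The sketch below rests on a strong assumption that is not in the study: that scores and grades follow a bivariate normal distribution, in which case that probability is 1/2 + arcsin(r)/pi. It is an illustration only, not a reconstruction of the "nine percent better than random" figure, whose definition is not given here.

```python
import math

def chance_higher_scorer_gets_higher_grade(variance_explained):
    """Probability that the higher-scoring of two random students also earns
    the higher grade, ASSUMING scores and grades are bivariate normal."""
    r = math.sqrt(variance_explained)       # correlation implied by r squared
    return 0.5 + math.asin(r) / math.pi     # concordance probability

p = chance_higher_scorer_gets_higher_grade(0.14)
print(f"{p:.2f}")
```

Under that assumption the higher scorer comes out ahead in a bit more than six pairs out of ten, where a coin flip would give five out of ten.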
So it may be an efficient tool in that you get the students to pay for it.
The schools don't pay for it. It allows the schools to then rank order people
based on a number that is assigned to them. But it is a fairly arbitrary tool.
And it is certainly not a thermometer of merit, if by merit--and I'm assuming
we don't mean merit is the equal of first-year college grades or first-year law
school grades. Merit is a big word. And it has to carry a lot of weight. It
does a lot of heavy lifting. It means more than just how you're going to do
first year in college. Because if all we cared about is how well you do first
year in college, we would have college as one year. Right? Why would you have
to be there and pay tuition for three more years if this is only about first
year of college? If it's such a good predictor, why do you even go to college?
Just take the test and then get a diploma.
So there must be something going on within the institution of higher
education or within the legal academy that we think also carries, quote, merit.
In which people are learning how to work and play well together with others, in
which people are learning intellectual self-confidence, in which people are
being exposed to research skills, in which people are being trained to be
leaders. None of this has any relationship to the testocracy. No one claims
that aptitude tests predict leadership, predict emotional intelligence, predict
the capacity to make a contribution to the society. The only relationship is
between the test and first-year college grades.
And what I was about to say earlier was that, with FairTest and others,
they will say that what the test actually judges is quick strategic guessing
with less than perfect information. Boys, for example, do better on the math
portion of the SAT than girls. They routinely score 40 to 50 points higher.
Many people say, well that's because girls are ignored in high school math.
That may be true. And yet the girls do just as well in college when they take
math courses as the boys, despite their lower SAT scores on the math portion.
And when you interview the boys as to how they approach the test, the answer is
they basically viewed it as a pinball machine. And the goal was speed and
winning. And the girls on the other hand, wanted to work through the problems
before they put down the answer. That, apparently, is not merit.
Somebody who wants to work through a problem before concluding with an
answer, is not guessing and they're not fast. And so on some level, what we
are confusing as a result of this over-emphasis on the testocracy--what we're
confusing merit with is speed and the confidence to guess.
Isn't one of the criticisms of the SAT that nobody's quite sure of what
it does measure? Is it because ETS can't say, or don't want to say, what it
measures?
I think it's hard to say exactly what it measures. And I'm very
sympathetic. It's especially hard to say what the SAT measures if you want to
keep the acronym SAT, which is the most successful marketing tool in testing
history. Well IQ is right up there too, but SAT is what everybody knows you
have to take to go to college and it's the test that's marketed by ETS. So if
they change the name in any way that doesn't reproduce SAT, they're in real
trouble. So I think they've got a constraint there.
But in fact, if from the beginning it had been called, say the Scholastic
Achievement Test, I think it would've taken the political curse off the thing.
It isn't exactly an achievement test, but it's certainly not exactly an
aptitude test. But if you recognize in the label that this involves
achievement, then people will say, okay, well it may not be the kind of
achievement we should test, but it's reasonable that you should give a test if
it measures achievement.
Whenever Henry Chauncey, who was working with James Conant, ever got close
to the word achievement, Conant would say that's not what I want, because
achievement was then the privilege of the guys who went at that time to Exeter
and Andover.
Well I think, when these tests were originally developed, people really
believed if they did the job right, they would be able to measure this sort of
underlying, biological potential. And they often called it aptitude, sometimes
they called it genes, sometimes they called it intelligence. But whatever they
called it, they thought that there was something there and if they just tweaked
and fiddled and worked at it a little harder, they would get pretty close to
being able to measure it.
I don't think people believe that anymore. They believe that how you do on
almost any test is substantially affected by both your heredity and your
environment and both things make a difference. Any psychologist would tell you
that. But, the problem of finding a label for something which is both A and B,
is a tough one and you could say it's the Scholastic Aptitude and Achievement
Test for instance, but that doesn't sound good when you're selling it.
Especially to someone like Conant who wanted a test that measured aptitude.
I mean it wasn't an accident that they came up with these terms. They came
up with the terms because that's what people wanted to do. The fact that they
didn't actually quite do it was, you know, well we've seen this in a lot of
other fields of merchandising too. You know, people want a product that does
so and so, then say it does so and so.
What's the only way to eliminate labeling bias in tests?
I think you've got to re-label these tests. You need to call these tests
things that really reflect what it is that they measure. And then it's also a
question of what kinds of things the tests should really measure that
you give, but that's a different question.
Do the SAT's do a good job of predicting academic
performance?
I think they really don't do a good job of predicting. They do a pretty
poor job of predicting, but they're the best we've got. Well that means the
people who are good at those things, high school grades and test-taking, are
going to do well in getting into good colleges. And other people who would do
equally well in college are just not going to make it because we have no way of
picking out the kid who will do well even though his high school grades weren't so
great and even though his SAT scores weren't so great.
There are lots of kids out there like that and on the average they won't do
as well as the other ones and the ones who will, are going to lose out because
we can't identify them. They don't come with little tags. If we knew what else it was,
you know, if we could say, well it's stick-to-itiveness or, it's getting
excited by a teacher or something, then we could measure it. That would help a
lot. And it would probably help minority kids in particular because they don't
do well on these tests and they are put at a disadvantage by that, and that's
even more of an issue on the job where we know that tests are not terribly
strong predictors of job performance and we know that lots of other stuff
counts. But we don't know how to measure most of the stuff except by hiring
somebody and seeing how they do.
Do you think we're misguided in using them?
I think we'd be way better off if we gave achievement tests and didn't
emphasize the so-called aptitude test or now, just the mysteriously unlabeled
SAT. I don't think it would change the results in favor of minorities to any
great extent in the short run, but I do think it would have a good effect in
the long run.
And the reason it would have a good effect is that if you start testing
achievement you send a message to people that this is what you've got to learn
to go to a good college, or to any college, whatever. And we know from all
kinds of evidence that if you actually set a task like that, the minority
students can do better than they're now doing. So I think that if we kind of
change the way we set up the task and said this is a question of achievement,
it's just like lots of other forms of achievement. You've got to work hard at it,
you've got to practice, you've got to get good at it.
You would have a very different state of mind than when it seems to people
that this is something that is aptitude, unchangeable, inborn, you know, if I
can't do it, I just can't do it. That's a signal for defeat and giving up. And
of course, it's not just a signal to minorities for giving up, it's a signal to
any kid who tests badly and says, gee, I just don't get good scores on these
kinds of tests. Whereas if you tell him, you know this is a math test, you have
to understand the test, lots of people can learn that math if they work hard at
it.