In education, we tend to take that score and ... act on it, and not necessarily
get other indices in these high-stakes testing programs.
And that's not the right thing to do?
No, I don't think it is. ... [Say] you get a 220 [on a state test]. I know
right away that that can be, 67 times out of 100, a 215 or a 225. ... It isn't
this precision [measurement] that people think it is. There's a range in which
your true score falls. A true score is if I could test you over and over and
over again and I could estimate what your true performance level is. OK? It's a
construct. It's an imaginary thing. ...
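The "test you over and over" idea can be sketched as a small simulation. This is only an illustration under stated assumptions: measurement error is taken to be normally distributed, and the standard error of 5 points is inferred from the 215-to-225 example rather than taken from any published MCAS figure. The names `simulate_retests`, `TRUE_SCORE` and `SEM` are hypothetical.

```python
import random
import statistics

def simulate_retests(true_score, sem, n_tests, seed=0):
    """Draw n_tests observed scores: true score plus normal measurement error."""
    rng = random.Random(seed)
    return [rng.gauss(true_score, sem) for _ in range(n_tests)]

TRUE_SCORE = 220.0  # hypothetical true score (a construct, never observed directly)
SEM = 5.0           # assumed standard error of measurement, inferred from 215-225

scores = simulate_retests(TRUE_SCORE, SEM, n_tests=10_000)

# Share of simulated retests that land within one standard error of the true score.
within_band = sum(
    TRUE_SCORE - SEM <= s <= TRUE_SCORE + SEM for s in scores
) / len(scores)

print(f"mean of retests: {statistics.mean(scores):.1f}")
print(f"share within 215-225: {within_band:.2f}")
```

Under these assumptions the average of many retests settles near the true score, while roughly two-thirds of individual results fall inside the one-standard-error band, which is the sense of "67 times out of 100" above.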
[But in Massachusetts, for example, the state says 220 is the cut score on
its test, the MCAS. Students must score 220 on the MCAS in order to graduate.]
So if I got a 219, could that as easily have been a 221?
Oh, sure. And a 221 could as easily have been a 218. ...
We've been following one student, an 11th grader named Madeline Valera.
She scored a 216 on the math MCAS test, so she didn't pass. ... How firm a
number is that for Madeline?
Depending on what the standard error is, if the standard error is four, let's
say, it could be 212 [to 220]. [If] it's six, then it's 222 to 210. That's the
error band. ...
And so?
If, in fact, it was 222, she should have passed.
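The arithmetic behind the error band can be sketched as follows. The observed score of 216, the cut score of 220, and the standard errors of four and six come from the interview; the one-standard-error band width and the helper names `error_band` and `band_includes` are assumptions made for illustration, not the state's scoring procedure.

```python
def error_band(observed, sem, n_sem=1.0):
    """Return (low, high): the observed score plus or minus n_sem standard errors."""
    return observed - n_sem * sem, observed + n_sem * sem

def band_includes(observed, sem, cut):
    """True if the cut score lies inside the one-standard-error band."""
    low, high = error_band(observed, sem)
    return low <= cut <= high

CUT_SCORE = 220  # passing score cited in the interview
OBSERVED = 216   # Madeline's reported math score

for sem in (4, 6):  # the two standard errors mentioned in the interview
    low, high = error_band(OBSERVED, sem)
    reaches_cut = band_includes(OBSERVED, sem, CUT_SCORE)
    print(f"SEM {sem}: band {low:.0f} to {high:.0f}, includes cut score: {reaches_cut}")
```

With either standard error the band reaches the cut score, which is why a single reported 216 may not settle whether the student's true score is passing.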
So Massachusetts says, "Madeline, take the test again."
Yes. And Massachusetts could have done what your doctor would do and say,
"Let's see if we can figure out another way you can demonstrate these
competencies." You don't have to do it for everybody. But we do it for people
like Madeline around the cut score. ...
Now, I understand that the last time the [MCAS] was given, there were about
3,000 students, 10th graders, who were on that line. They were below 220, but
just barely below it.
That could very well be. ...
What is your argument against using that score [to prevent someone from
graduating from high school]?
The technology isn't up to that. I mean, one of the things that people don't
see is the arcane underpinning of all this. There's something called the "three
parameter item response theory" -- algorithms that they use to arrive at these
scores. These involve assumptions, these involve rounding, these involve all
kinds of things. ...
Maybe if I had rounded a different way or I had put in a different assumption
into my program, I might have gotten a slightly different result -- not a
dramatically different result, but a slightly different result. And just to
take a number and say, "This is your score. This is it," we know that it isn't
the true score.
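For readers unfamiliar with the model named above, here is a minimal sketch of the standard three-parameter logistic form from item response theory. The parameter values are invented for illustration and are not MCAS item parameters; some formulations also include a scaling constant, which is omitted here. The function name `p_correct_3pl` is hypothetical.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Three-parameter logistic model: probability of a correct answer.

    theta: examinee ability
    a:     item discrimination
    b:     item difficulty
    c:     lower asymptote (chance of a correct answer by guessing)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Same examinee and item, two slightly different guessing assumptions.
theta = 0.0
p_low_guess = p_correct_3pl(theta, a=1.2, b=0.0, c=0.20)
p_high_guess = p_correct_3pl(theta, a=1.2, b=0.0, c=0.25)

print(f"c = 0.20: {p_low_guess:.3f}")
print(f"c = 0.25: {p_high_guess:.3f}")
```

A small change in one assumed parameter shifts the modeled probability slightly, which is the kind of "slightly different result" described in the answer above.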
The people who make these policies know that when they say, "OK, 220 is
passing and 219 is failing" -- they know these scores are squishy?
Sure.
Why do they do it?
Well, again, you're into a series of issues. One of them is political. You
don't want to seem to be fiddling around with the cut scores. These cut scores
somehow, as I said, get reified. That's one reason. The other reason is that
you're going to get a lot of backlash when people really understand how this
classification system works, and they don't want to deal with that. And the
other reason is that they, again, leave out one of the most important
informational things we have about these kids, and that's teacher judgments.
They don't trust teachers?
No, that's why we have a lot of these state testing programs. They simply don't
trust teachers. Interestingly, a recent poll shows that teachers are one of the
most trusted groups and professional groups in American society.
There's a long history of kids being pushed along, [falsely promoted]. ...
Maybe there's a good reason for not trusting teachers?
Certainly I think there should be testing programs. I mean, I think testing
gives really valuable information on how schools and systems are doing,
particularly certain systems that traditionally serve certain populations of
kids, ESL kids, special-ed kids, minority kids, poor kids. Tests' information
can throw a lot of light on that.
One of the reasons [a cut score is] used is [it's] a stick that you can beat
people with. But it's the kid you're making the decision on. We can get very
adequate information on how a school is doing, or how a system is doing, or how
a state is doing without saying, "If you don't pass this test, you're not going
to get a diploma, or you're not going to go to the next grade." You don't have
to do that in my opinion.
But I think that these tests can and should be used to judge and hold the
system and schools accountable.
Hold the grownups' feet to the fire?
That's right, yes. ... When all this started, there were going to be not just
performance standards; there were going to be "opportunity to learn" standards. ...
[What are opportunity to learn standards?]
Originally, that was that there would be standards for textbooks, classrooms,
teacher training, teacher experience, funding levels -- a whole series of
things like that. But this was back when it was pie-in-the-sky talk of a
national test. And at the time, people raised the issue that it isn't fair to
impose a punitive testing system without first making sure kids had the
opportunity to learn whatever the test was going to [cover].
So if you're going to test in chemistry, then every kid ought to have--
Certainly have access to a lab and to a qualified chemistry teacher, and things
like that. ... [But] start right down in the kindergarten, a kid [should have]
a good meal before the kid comes to school.
Politically, [the opportunity to learn standards] went down in flames, because
a lot of the governors didn't want to get into that whole area at the time. ...
And those disappeared. So there isn't a level playing field for a lot of these
schools and kids.
And also, the one-size-fits-all [test] bothers me a lot. ... I want to be a
bricklayer, so I go to a vo-tech school [where] I'm learning bricklaying. Why
should I take the same science test as someone who is going to go to Harvard?
Why shouldn't you?
Because I have different life's choices, and I have different interests, and I
have different ability.
Isn't that an awfully elitist argument?
As I said, a lot of this is philosophical and ideological. ...
These are basic [skills tests].
They're not. The MCAS is not a basic skills test. I have no trouble with a
basic skills test. And in some states, it is a basic skills test [but]
MCAS is not a basic skills test. ...
The National Goals Panel issued a report on eighth-grade science and where the
United States fell relative to other countries in the world. I think there were
45 countries. Now, the headlines [said], "United States does mediocre at best."
If you break it out by states, five states, including Massachusetts, scored as
high or higher than every country except Singapore. OK? Now, on the
eighth-grade MCAS science test in 1999, 73 percent of the kids either failed it
or needed improvement. There's a disconnect here. So that science test is not a
basic skill science test.
Should all kids be able to read, write and compute? Yes. But do all kids need
to read the same thing and have the same level of math and sciences? No, I
don't think so. If that's elitist, then I guess I'm an elitist. ...
Are you saying the MCAS, the Massachusetts test, is a bad test?
No, I didn't say that. In fact, it is state of the art. I mean, all these tests
have limitations. It has limitations, the same limitations that any test done
by very qualified, very conscientious professionals building that test at that
company, Harcourt. ... These guys and women know what they're doing.
Nonetheless, the technology has inherent limitations to it.
Your objection is to the use of the score?
Yes, it's the use of the score, right. To give it the precision that they want
to give it is crazy. ... It's bad practice. I think that they need to get,
again, second opinions. They need to get clinical judgment of teachers. They
need to get other measures, and then come up with a decision about
Johnny, if they want to do that. ...
You said it's a good idea to have these tests, and to use them to hold
schools and the adults in them accountable. But if the tests have no meaning to
the kids--
... A lot of people think that if we don't put the pressure on the kid, then
nothing is going to happen. And that's an argument with a moral dimension.
Well, why should you punish the kid in order to whip the system into line?
There was a piece in The New York Times about three weeks ago on the
foot-and-mouth disease issue in England. And the article was about whether you
vaccinate these cows or not, and sheep. And one of the arguments is if you
vaccinate, then you can't tell whether the beast has hoof and mouth after that.
I mean, it masks it. And one of the leading veterinarians in England said, "We
are developing tests that will give us a very accurate measure of the health of
the herd, but doesn't give us accurate measure on an individual beast."
And in an analogous way, that's the same issue we have here on educational
testing. And it isn't just pass-fail; it's the difference between proficient
and advanced, between needs improvement and proficient. There's error all along
that scale.
I think you should get the numbers [from the tests]. I think you can use the numbers. But I
think you need to use them with other kinds of information about the thing
you're trying to make decisions about. ...
We have a system set up which holds kids accountable. [We say],
"Pass this [test] if you want to graduate." You're saying it hurts
kids. ... But normally, somebody's benefitting somewhere. Who's benefitting from
this system that we're in the middle of?
Well, politicians are. Test companies are. Teachers in schools are. And the
teachers in schools with the standards-based reform, they look at the
standards, and it helps them better understand what they should be teaching. It
gives them good information about the level of performance expected. All that's
to the good.
The other thing we ought to look at is, how do other Western
industrialized countries [test]? I don't know of any that test the way we do
below age 16 and 18. ...
The argument here is that we need these tests [because] they enable us to
have a meritocracy, or identify the best and the brightest who do well on these
tests, and will get opportunities they might not otherwise have had.
In this country? We know who isn't doing well. We don't need these tests to
tell us who is having a hard time and who's in trouble in school. ... You can
ask any classroom teacher and they can tell you ... who the kids are that are
having trouble in math, reading -- you name the subject. We know that certain
populations are poorly served, that there are schools that aren't doing a
good job for these [kids]. We know that. But now we've added this test as a
quasi-documentary of those problems that we've known, for years, exist. And
you're not going to test your way out of those problems.
What do you mean by that?
Well, just giving a test and getting the results back, that's not going to
necessarily change things for the better. You've got to do other things. You've
got to have opportunity to learn standards. You've got to have better funding.
... You don't have level playing fields. ... We're not going to solve the
problem by pulling up the tree and looking at the roots every year and then
planting it back again. ...
The commissioner of education in Massachusetts, David Driscoll, defends the
use of the high-stakes MCAS test because, as he says and the law prescribes,
kids have multiple opportunities. If the first time they don't make it, they
have four more chances. And the last two are targeted to give them even a
better shot at it.
But that's not what the professional standards, the [American Educational
Research Association] standards, or the National Research Council, calls for.
They call not only for multiple opportunities, but for multiple measures of the
same construct. Not just repeating the same test four or five times. ...
Commissioner Driscoll is absolutely right. They do have these opportunities. We
also know from the Texas data that a lot of kids don't stay around to exercise
those opportunities -- they leave school. And that's a problem. There is a cost
that we can only estimate. ... Even if they didn't get discouraged, it may be
that, for some kids, they can't demonstrate what it is you want them to
demonstrate on that mode of testing. In another mode of testing, they might
very well be able to show you what it is you're looking for.
And so we need to try to get other indicators of what it is we are truly
interested in, in addition to the multiple opportunities.
So, multiple opportunities and multiple [measures]? ... Multiple measures
like what?
The MCAS isn't the only fourth-grade math test around. There are an awful lot
of them around. Or for kids on the borderline, you can go in and get direct
measures. ... [Y]ou might need to go in and give a kid a book and say, "Please
read for me." Or you might, again, start to get teachers back into the process.
... They have a ton of information on kids. ...
The test companies, essentially, put a warning on these things. [They] say,
"Don't use these for high stakes." But states do. How do you explain
this?
It's a political question, and education has become a political issue. One of
the things that legislators or governors can do is they can impose tests. And
they don't have to worry about what goes on in classrooms; they don't have to
get into the messy details. They get numbers out that are quantifiable. So it's
very attractive, and it's cheap, relatively speaking.
Lots of money being spent.
... [G]iven the overall education budget, testing is a very small part. It's
getting bigger, but it's a relatively small part. It's not nearly as expensive
as equalizing funding, putting money into service training of teachers -- a
whole series of things that would cost a lot more money.
As I say, you get quantitative results that can go in newspapers, and you can
appear to be addressing the problem. And that's why I say you're not going to
test your way out of the problem. ...
So the test companies, when they say, "Don't use this for high-stakes
decisions," is that just being disingenuous? They're taking the money.
Sure, they're taking the money. ... This is the other thing that people don't
fully understand. [In] other industrialized countries, testing is not a big
business run by publishing companies. It's run by departments of education or
by examination boards. ... [Here] it's a commercial enterprise.
Accountable to?
Whoever the person paying the bill is. ...
The tests themselves, I wonder how good they are. ... The other day a young
student, a 10th-grader, found a mistake on the math tests. Someone else
revealed that James Madison was identified as John Madison. Is that an area of
concern?
Oh, sure. That happens. ... That's what I mean about the limitations of the
technology. These kind of items are going to slip through. And it can become a
serious issue, in some cases. ...
You make it sound as if these policies really do hurt a lot of kids.
They can, yes.
Is this mean-spirited people at work?
No, no, not at all. I think these people have what they think are the best
interests of the kids at heart. It's just, as I said, we have our
ideological/philosophical differences here about what is good, what should be
done and what shouldn't be done. What I'm saying is that the technology that
you're using to do these things has inherent limitations that we don't fully
take into account. ...
Young Pete Peterson, the 10th grader who found [one of the errors on the
MCAS], ... does that suggest that maybe there are other mistakes? Other bad
questions?
Maybe not as blatant as that, but there may be other what we call ambiguous
items, where A is the answer that they want, but B isn't that bad. ... And that
ambiguity, again, masks the true ability level of the kid. The kid may have
read something into the [question]. Remember, adults are the ones who write
these items. Kids are the ones who answer them.
My favorite example of that is the famous cactus question. There was a third-
or fourth-grade test. And they said, "Which of the following needs the least
amount of water?" [There was a picture of a] cactus, they had a geranium [in a
pot], and then they had a cabbage. And, of course, the adults wanted [the kids to answer] cactus. A
number of kids picked the cabbage. And when asked why, they said because the
cabbage was picked, it doesn't need water anymore. Perfectly, perfectly
sensible choice for those kids, but they got marked wrong.
There's a lot we don't know about how young kids approach these items, what
they read into these items. ...
You're talking about the bad questions and ambiguity. But the technocrats
would say, "Well, yes, but we're getting better."
... In terms of the testing technology, we are still back in the Model-T era.
It's basically the same technology that we've had. ... I think eight, 10 years
down the road, when we find new ways to use the computer technology and we meld
the two together -- and by that, I don't mean we just throw multiple choice
items into a machine and have the kid do that -- I mean simulations. For
example, one of the exciting things that we're working on is with doctors and
medical simulations. It's a test ... but it has all kinds of potential for ...
training. ...
Pilots take flight simulators. ... So I think that we're going to find more and
more that K-12 testing eventually is going to go down that road. But right now,
the technology is a Model-T technology. And we're very good at that Model-T
technology, but it's not a Porsche.
The simulations that medical schools, the Army, other places use ... those
are actually teaching. ... Whereas the [multiple-choice] tests we give now are
... not necessarily teaching.
No, they're not. And the other thing is the misinformation that the purpose of
a test like the MCAS is for diagnosis. Again, let's do a medical analogy. You
come in and you take a test in May, and I don't give you the results back until
November. And then I don't even give them to the same doctor; I give them to
another doctor. That's not diagnosis. If you want diagnostic information,
you've got to give stuff with very fast turnaround and very fast feedback. And
that's not what these tests do. These tests classify people -- that's what they
do. That's not diagnosis. ...
There's a proposal [by] the president of the United States for expanded
testing. Is this a good idea?
I don't think it is. I think that we have enough testing now. We know who the
kids are that need help. We know the kids that aren't doing well. [Putting] another
layer of testing on top of all that we have is, if nothing else, going to
take away from instructional time. ...
Some say [this movement] goes way back to "A Nation at Risk" in 1983. ...
Give me a sense of context. ...
Well, I can go back to the 15th century in Italy, where the schoolmaster's
salary depended on how kids did on a viva voce examination on the curriculum,
which at that time was pretty much rhetoric. And up until the 19th
century, you had payment by results in Australia, Jamaica, Great Britain,
Ireland -- almost everywhere where the Brits went, except Scotland. ...
You had a performance contract here in the United States in the 1970s. You had
minimum competency movement in the 1970s. This is not a new thing.
And in every single case, we know what the effects of those are. We know that
teachers teach to the exam. ... Now, people say [that's fine], if you can have
tests worth teaching for. Well, no test should replace a curriculum. ...
So this is an old [issue]?
... And it's predictable. We know that scores are low in the first few years of
a testing program, and then they gradually go up as people catch on. Like in
any public policy thing, you can corrupt the social indicator. Whether it's
ambulance response time, on-time flight for airlines, arrest rates -- these
indicators are corruptible and tests are corruptible. And you can have these
scores go up, and not have the underlying learning that you seek to improve go
up. There are wonderful examples of this going back a long, long time. ...
And I predict that with what's going on now, we'll start to implode. ... You're
going to start to see it happen in states when large numbers of suburban kids
don't do well. ...
There have been protests -- parents keeping their kids home in well-to-do
communities out in California, Scarsdale, N.Y., and other places. Is that
going to keep happening?
Yes. I think so.
There's also reports of teachers saying, "I don't want to teach in this
[environment]," and leaving.
Well, that's one of the unintended consequences that needs to be documented.
That may be an urban myth, but we need to know. ... You can have a very good
goal [in] mind, but the unintended consequences of those goals very often come
back and bite you.
some photographs ©2002 getty images all rights reserved
web site copyright WGBH educational foundation