Interview: James Popham

A professor emeritus at the University of California at Los Angeles and a former test maker, James Popham is a noted expert on educational testing. In this interview with FRONTLINE, he discusses the uses and misuses of standardized tests, the pitfalls of a public policy that fails to take the nature of tests into account, and why the results of traditional standardized achievement tests are not accurate measures of school quality. This interview was conducted by producer John Tulenko on April 25, 2001.

In the public's mind, tests seem to have taken on some kind of new meaning. How do you describe that? What meaning has been attached to test scores?

We've had educational tests in this country for many years. But in the last decade or so, tests have come to be viewed by the public as indicators of how well schools are teaching. As a consequence, they are the evidence by which they judge the quality of schooling.

You're constantly warning educational leaders that they've got to understand this stuff. Why?

Educational testing has become the chief indicator of how well schools are doing. As a consequence, educators are being judged by these tests about which, frankly, they don't know all that much.

Currently, in only 14 of our 50 states are teachers required to take a course in educational testing. And I think in only a couple of states are administrators required to take a course in educational testing. As a consequence, most educators know very little more about testing than what they remember when they were students themselves.

What about the general public? How much does the general public know about the ins and outs of testing?

Unfortunately, the public knows less about tests than even educators do. And that's really unfortunate, because tests are becoming such a significant criterion by which to judge the quality of schooling.

How does that change things for teachers -- change the job of teaching, the experience of being in the classroom?

There was a time when teachers worried chiefly about the extent to which they could transmit important knowledge and skills to youngsters. Now, that situation has been altered, because they're being held accountable to produce high scores on tests. As a consequence, the preoccupation with raising test scores has become dominant throughout most parts of the country. ...

We have to create tests that really do reflect how well teachers have been teaching. ... The kind of test we're using now is setting up public educators for absolute failure.

I've spent a lot of time working with teachers in the past several years, and many of them will recount instances where they or their colleagues had devoted inordinate amounts of attention simply to raising test scores. The preoccupation was with test-score raising, not necessarily with teaching kids the things that children ought to be learning.

You said "preoccupation with raising test scores." How do they do that?

The pressures on teachers are so immense to raise test scores, the pressures that we see from their administrators, from board members, policy makers, that teachers are sometimes driven to boost those scores using techniques that are not all that defensible. For example, they may employ test items that are very similar to the actual test items on the test and, in some instances, teachers have actually used the real test items on the test preparation activities. So that kind of preparation is so enormous, so relentless, that the quality of schooling, frankly, sometimes is reduced.

How does that change the experience of school for kids? Does it send them any kind of message if your teacher is spending all of his or her time, or much of his or her time, doing practice exercises?

To me, one of the most frightening things about the preoccupation with raising test scores is the message it sends to children about what's important in school. Rather than trying to make the classroom a learning environment where exciting new things are acquired, the classroom becomes a drill factory, where relentless pressure, practice on test items, may raise test scores -- but may end up having children hate school.

Describe the climate today. You've been around a while; has it always been this way with test scores?

Eons ago, I was a high-school teacher, and we had standardized achievement tests way back then. They were given and they didn't make a lot of difference. We sometimes used them to make judgements about children -- who is better or worse than whom -- but they did not influence our conduct. Then these tests began to be used as part of an accountability drive to make sure that educators were doing their job well. And the minute those tests became the indicator of educational quality, all of a sudden they became terribly important.

Is it true that educators were not doing their jobs particularly well?

There is the belief on the part of the public that schools are not as effective as they should be. I would share that view. When youngsters are getting high-school diplomas without being able to read, write and compute, that's not a good thing. So taxpayers wanted to make sure that their schools were actually functioning properly, and the accountability movement was initiated. It was enacted by state legislators so that we could have evidence that the schools were, in fact, running properly. And with that evidence, the role of tests became very dominant.

At that time, how much thought was given to how you measure educational quality or educational success? I mean, conceivably there could be many ways.

At the very beginning of the accountability movement, I don't believe the policy makers really understood what kinds of measures should be used to judge schools; the policy makers stipulated that student test scores would be the prime determiner of educational quality. They were nationally standardized tests. They were produced by reputable companies. So the belief was these will be the appropriate tests to use. The fact is, however, these are not the right kinds of tests to use to judge the quality of schooling.

There's an equation out there in the public view: high test scores equals good quality school; low test scores equals poor performing, bad school. Do you believe that equation holds true?

The common belief that schools that score high on a standardized achievement test are effective and that schools that score low are ineffective is simply misguided. It reflects ignorance about the nature of the test being used, because those tests, frankly, in many cases measure the kind of conduct, knowledge and skills that children bring to school -- not necessarily what they learn at school. What you want to judge the quality of schooling is a test that measures how well children were taught, not whether they come from a ritzy background.

How could the test be measuring children's backgrounds? It's supposed to be asking questions about what's being taught in school. What's going on there?

Traditionally constructed standardized achievement tests, the kinds that we've used in this country for a long while, are intended chiefly to discriminate among students ... to say that someone is at the 83rd percentile and someone is at the 43rd percentile. And the reason you do that is so you can make judgements among these kids. But in order to do so, you have to make sure that the test has in fact a spread of scores. One of the ways to have that test create a spread of scores is to link items in the test to socioeconomic variables, because socioeconomic status is a nicely spread out distribution, and that distribution does in fact spread kids' scores out on a test.

What would a question look like that fit into that category?

An example I often use is a question that involved a child's familiarity with fresh celery. There are actually questions on one of the currently used standardized achievement tests where you have to know what fresh celery looks like. But kids from upper-class homes, middle-class homes, where they buy fresh celery all the time, have a much better shot at that question than do kids from families where they're getting by on food stamps.

Now, there are many such questions in a test. You wouldn't think there would be. Why would they have them? But those tests spread out examinee performances very well.

Can you think of some other examples?

I'm thinking of one that I saw in a standardized achievement test recently. This is one that's currently used right now, where the emphasis was on the youngster's being able to tell what the word "field" meant. "In which field do you plan to work after you graduate?" Well, children from families where a mother or father has a professional field, like a lawyer or a dentist or a physician, they're going to be more familiar with the word "field" in that connection than would be a child from a family where a mom is a grocery store clerk or a dad works in a car wash. So the kids from the middle- and upper-class families, where they have fields of occupation, will clearly have a better shot at that item than will kids from disadvantaged families.

So on these national standardized achievement tests, how much of what's taught in school shows up on the test itself?

A nationally standardized achievement test is given in about an hour. In about an hour, you can't test all that much, so you have to sample from larger domains of knowledge and skills. And what you end up with sometimes does not match at all well with what's being taught in school or what's supposed to be taught in school. Some studies suggest that fully 75 percent of what is on a test is not even supposed to be covered in a particular school. Clearly, it's unfair to judge the quality of schooling based on a test that's largely covering things that ought not be taught.

How does that situation occur? How's the test written?

... The tests are created by companies that are in the business of selling tests, and so they want their tests to be as attractive, as marketable as possible. So, they try to isolate the content they want to include in the test by looking at national curricular preferences, like those of the National Council of Teachers of English or of Mathematics. They look at state curricula. They look at textbooks. And they try to create a test that does the best job in being acceptable to many people.

But you really can't create a one-size-fits-all test because, in attempting to do so, you still may have some real gaps between what your test measures and what is taught in a given situation. ...

If one compares the content of textbooks used in mathematics with standardized achievement tests in mathematics, you will frequently find that fully half of the content in the test is not addressed in those textbooks -- simply not addressed in those textbooks.

Sounds patently unfair.

Remember, we live in a country where we're trying to allow local curricular choice, where the states determine the curriculum. Beyond that, the districts determine the curriculum. That being the case, there's a lot of variability. We do not have a national curriculum in this nation. And as a consequence, the gaps between what is on a test and what is actually taught sometimes are profound. ...

The tests were designed to spread out the scores. How do they go about doing it?

There are two kinds of items that are very effective in spreading out youngsters' scores. One kind is an item that is dominantly influenced by the youngster's socioeconomic background. If you have a middle-class or upper-class background, you'll do better on the item, because it deals with content more likely to be encountered by youngsters from that background.

The second kind of item is one that is linked to the inherited academic aptitudes with which kids are born. Kids vary in the way they are born, with more verbal aptitude or more quantitative or more spatial aptitude. And those variations can be used in the test. You can build the test item to capitalize on what kids are born with, not what they learn in school. ...

What's most disturbing to me is in traditionally constructed standardized achievement tests, many of the items, such as those that are linked to inherited academic aptitudes or socioeconomic status, do not measure at all what is supposed to be taught in classrooms. ... They measure things that children bring to school. They measure how smart a kid is when he walked through the door, and not what he was supposed to learn in that school. ...

If the test writers prefer to write questions that may tap an innate ability or discriminate according to socioeconomic status, what can they ask about? What are they not asking about? ...

Another problem with standardized achievement tests, traditionally constructed ones, is that you want to have a very substantial spread of scores. And one of the best ways to do that is to have questions that are answered correctly by about 50 percent of the kids; 50 percent get it right, 50 percent get it wrong. You don't want items in there that are answered correctly by large numbers of youngsters: 80 percent, 90 percent. Unfortunately, those items typically cover the content the teachers thought important enough to stress.

So the more significant the content, the more the teacher bangs at it, the better the kids do. And as soon as the kids do very well in that item, 80 percent, 90 percent getting it right, the item will be removed from the test. ... So you miss items covering the most important things that teachers teach. ...

It may seem strange that these tests are designed not to measure the most important things that teachers teach. But these tests were not designed to judge the quality of schooling. The tests were designed to spread out examinees. ... You don't want items in there that most of the kids get right, because those items don't spread out examinees. ... So you don't include those. Unfortunately, it turns out that those items often cover the very most important things teachers should be teaching.
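Editor's illustration: the preference for items that about half of examinees get right can be sketched with a small piece of standard test-theory arithmetic that is not spelled out in the interview. For an item scored right or wrong, the variance of the item score is p(1 - p), where p is the fraction answering correctly, and that quantity peaks at p = 0.5. The function name below is hypothetical, and the three percentages are simply the ones mentioned in the answer above.

```python
def item_variance(p_correct: float) -> float:
    """Variance of a right/wrong (0/1) item that a fraction p_correct of examinees answer correctly."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be between 0 and 1")
    return p_correct * (1.0 - p_correct)


# Compare the 50 percent item with the 80 and 90 percent items from the interview.
for p in (0.5, 0.8, 0.9):
    print(f"{p:.0%} correct -> item variance {item_variance(p):.2f}")
```

Under this simple model, an item that most students answer correctly contributes much less to the spread of total scores, which is consistent with the description of why such items tend to be dropped.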

California uses the SAT-9 to measure standards. What can you conclude from a student's test scores on the SAT-9 as to whether or not he or she has learned enough or met the standards?

A number of states now use standardized achievement tests to measure the content standards, that is, the knowledge or skills that the state wants taught. And sometimes the off-the-shelf test is said to be sufficiently aligned with the standards to serve as a reflection of those standards.

This is simply not the case. If you look at the degree of match between any commercialized standardized achievement test and a state's content standard, it's not good enough to make a judgement about whether those standards have been achieved, and you certainly don't know which standards have been achieved. So this is simply a pretend assessment. It's not useful for helping teachers judge or parents judge whether their kids are really learning what they're supposed to learn.

So, then, this is a misuse of tests. ... Not only as a measure of standards, but it's also a misuse as a fundamental measure of school quality?

The most profound misuse of educational tests these days is to employ a traditionally constructed standardized achievement test and base the student's scores, use those scores, as a reflection of school quality. These tests should not be used to evaluate school quality. And many citizens think that should be done and many educators can't disabuse them of that notion, because they don't know better.

If the tests aren't measuring what's being taught in school, what are they measuring?

Traditionally constructed standardized achievement tests measure a bit of what's taught in school. But, by and large, they measure what children bring to school, not what they learn there. They measure the kinds of native smarts that kids walk through the door with. They measure the kinds of experiences the kids have had with their parents. They do not measure, in the main, what is taught in school.

Do you think the politicians know this? They're the ones who sign off on these tests.

Most educational policy makers, state board members, members of legislatures, are well intentioned, and install accountability measures involving these kinds of tests in the belief that good things will happen to children. But most of these policy makers are dirt-ignorant regarding what these tests should and should not be used for. And the tragedy is that they set up a system in which the primary indicator of educational quality is simply wrong. ...

Because of the misuse of traditionally constructed standardized achievement tests to judge the quality of schooling, there's some really terrible things happening to our children in schools these days. One of those is important curriculum content is being driven out, because it isn't measured by the test. Another is that kids are being drilled relentlessly on the content of these high-stakes tests and, as a consequence, are beginning to hate school. And a third is that, in many instances, teachers are engaging in test preparation, which hovers very close to cheating, because they're boosting kids' scores without boosting kids' mastery of whatever the test was supposed to measure.

What's the message to teachers?

Today's accountability framework sends a message to teachers that raising test scores is all-important. And, as a consequence, teachers frequently don't worry about the whole education they're providing. They're worried about only what happens to be covered on that particular high-stakes test that will indicate how well they're performing. So it's test boosting -- at all costs. And it's really unfortunate, because the quality of schooling is being lowered as a consequence. ...

A lot of states are moving toward writing their own tests, so-called criterion-referenced tests, and the feeling is that these tests will reflect more of what's going on in classrooms. Is that how you see it?

Many states are currently abandoning off-the-shelf standardized achievement tests and developing customized versions of those tests that supposedly relate better to the state's curriculum content and what's taught in schools. But the reality is these tests are typically created by the very same companies that generated the original traditional standardized achievement test. And in many instances, there's no reason to believe they function any differently than a standardized achievement test. Just because a state says it has a so-called criterion-referenced customized test does not automatically mean that that is a better test.

What could be wrong with that test?

The customized tests that are being built for many states now have the same kinds of items in them that you'll find in a traditional standardized achievement test. They're created by the same companies who have the same item developers who create the same kinds of items, and they simply try to make it a little more related to the state's curriculum. The fact is they function identically to traditionally constructed standardized achievement tests. ...

The people in state departments of education frequently do not know how to demand the creation of an alternative kind of test. You can have a test that simply indicates what a student knows and doesn't know. But when these customized tests are developed, there has to be a new vision of a different kind of test, and many times that new vision simply doesn't sit there in the state capital. ...

Now, you have all these standards. You've got 49 states that have adopted academic standards, and many of them in core subjects. I'm wondering how helpful are these standards in terms of directing test writers, helping them know what kinds of questions to ask?

Content standards describe the knowledge and the skills you want kids to learn. And that's very sensible, to lay out in advance what it is you want children to learn. Unfortunately, the standards movement in this country is not working as well as it should, because the people who put together the content standards are usually curriculum specialists who want children to learn all sorts of great things. And so the content standards become wishlists of the many things that you would like children to master. So when you present the content standards to teachers in that state, there's way too much to cover, there's way too much to test. And, as a consequence, the standards movement is not having the positive impact we hoped it would.

On the other hand, you have standards that are incredibly vague. I read one, "Students will understand historical events in the twentieth century." What do you do with that one?

There are many standards that are far more vague than they ought to be. My favorite was that "The student will relish literature." I kept looking for one that would have mayonnaise mathematics. But those are of no utility to educators; they have no utility to item writers. They are simply pie-in-the-sky kinds of aspirations. And so, although it is helpful to identify in advance what you want children to be able to do after instruction is over, if you describe this with a litany of vague, ambiguous statements, you haven't benefited anyone.

So in some states, the standards movement is more pretense than reality.

But if the standards were very detailed, I would think that might help the test writers. They would know exactly what they could be asking about. Is that possible, or am I wrong there?

The virtue of detail is that it would help item writers and it would help teachers, because they would have a more specific notion about what is to be accomplished. The downside of that kind of specificity is that it usually ends up with so many instructional targets the teachers have to cover, they're simply overwhelmed, as are the test writers. So the trick is to isolate a small number of really high-powered standards, standards that embrace lesser sub-skills and focus your instructional energy on that modest number. In general, the content standards we see in states across the land have not been isolated in that fashion. ...

How do you suggest they go about writing standards? How do you do it differently?

If I were standards czar, here's exactly what I'd do. I'd go to a specialist and I'd say, "Isolate the things that you want children to be able to do and put them in three piles: the absolutely essential, the highly desirable and the desirable." And having done that, then I'd put the other two piles away and just go with the absolutely essential. And then I would say, "Now rank them from top to bottom; the most important, the next most important," and so on.

And then I would have the assessment people come in and say, "These four can be assessed in the time we have available, and can be assessed in such a way that teachers will know how to promote children's mastery of them." And then we'd have a reasonable standards-based assessment system.

You might have to bring in some outsiders; business people or lawyers, doctors, people in the community.

It's perfectly reasonable to involve people other than educators in the isolation of what ought to be taught in our schools. Citizens have a stake in this game, business people, moms and dads. I'd get everyone involved in the enterprise, just as long as they weren't cowed by the subject matter. I would not have it decided only by subject-matter specialists, but I would most assuredly rank in order of import what should be promoted, and then only assess that which can legitimately be assessed in the time available. ...

You've got all these tests out there. They only seem to ask one or two kinds of questions. Why is that?

As a practical matter, you can divide the kinds of test questions into all sorts of categories. But there are really three: there are selected-response tests, like multiple-choice or true-false tests, where the kid chooses from choices you present to them; short-answer response, where the kid writes a phrase or a sentence or two; and then performance-task, where the kid may write an essay or do something more elaborate. Those are the three kinds.

Now clearly, the first kind, the selected-response test, is much less expensive to score. The others take scorer time; they're somewhat less precise. The consequence is most of the tests across the country tend to be dominantly selected-response in nature. Sometimes a little constructed-response. A little short-answer, and maybe an essay or two. But the reason you don't have more of the latter kinds of tests is they cost too much to score.

You're saying we go for cheap tests -- tests that we know may not be as good as the tests that we could have if we spent more? Is that what you're saying?

Yes. The distressing reality is that the amount of money available for this kind of assessment operation is usually insufficient to provide for many students' constructed responses. In some states where they went vigorously for lots of performance-test items and lots of short-answer items, such as Kentucky, they've been forced to reduce their attention now to a dominantly multiple-choice kind of test. It just costs a lot of money. Should we be spending the money? Of course we should. But that, of course, is a social decision. ...

"Reliability." On the surface, it seems like reliability is usually described as consistency in the scores. Is a machine-scored test always a reliable test?

Reliability is a technical characteristic of tests. It's very important. And unfortunately, it comes in three flavors. When you think about consistency of the test, it might be a test that's administered one time, and then a week later, you get the same consistent scores. Or there are two forms of the test: form A and form B, and you get consistent score reports on both. Or it might be all the items in the test functioning in about the same way. So there are different ways to look at reliability.

And once you see that a test is reliable, you always must ask, "What kind of reliability?" So consistency in items is good. But one kind of reliability is not identical to the other kinds. ...
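Editor's illustration: two of the flavors described above are commonly quantified in ways the interview does not name. A minimal sketch, assuming test-retest (or form A versus form B) consistency is summarized by a Pearson correlation between two sets of scores, and internal consistency by Cronbach's alpha. The function names and the tiny data sets are hypothetical and exist only to show the arithmetic.

```python
from statistics import mean, pvariance


def pearson(first: list[float], second: list[float]) -> float:
    """Correlation between two score lists, e.g. the same students tested a week apart."""
    m1, m2 = mean(first), mean(second)
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second))
    spread1 = sum((a - m1) ** 2 for a in first) ** 0.5
    spread2 = sum((b - m2) ** 2 for b in second) ** 0.5
    return cov / (spread1 * spread2)


def cronbach_alpha(item_scores: list[list[int]]) -> float:
    """Internal consistency; item_scores holds one row per examinee, one 0/1 entry per item."""
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical scores for four students on two administrations of the same test.
week_one = [30, 35, 40, 45]
week_two = [32, 34, 41, 44]
print(f"test-retest correlation: {pearson(week_one, week_two):.2f}")

# Hypothetical right/wrong answers for four students on a three-item test.
answers = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(f"Cronbach's alpha: {cronbach_alpha(answers):.2f}")
```

The two numbers answer different questions about the same test, which is the point of asking "What kind of reliability?"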

Why is reliability important? Why do people make such a big deal out of reliability?

In general, if a test is reliable, it's more likely you'll make valid interpretations from it. If a test is absolutely unreliable, that is, if scores bounced around all the time, you wouldn't even know what the kids came up with; how could you ever make a valid interpretation? Because this might be Molly's high day, as opposed to Molly's low day. So tests that are unreliable cannot yield valid score interpretations. ...

We talked about measuring schools. Let's talk about accuracy for individual students now. The scores that you get back on these tests, whether they're criterion state tests, or tests like the SAT-9, how accurate are they? Is a 62 always a 62?

When I first started teaching, I had a girl named Sally Palmer who had an IQ test score of 126. I believed Sally Palmer had an IQ of 126 and not 127 or 125. I believed in the accuracy of numbers. And many parents still do. They believe that those numbers are so darn precise. Psychometricians, experts in testing, have a term they call the "standard error of measurement," which indicates how likely it is the kid's score will be off by a certain amount every time the kid takes a test. It's a lot. These tests are not as precise as is widely believed.

How much is it?

The standard error of measurement depends on the particular test, and how much variability there is on the scores. And sometimes it can be fairly modest, but more often than not it's quite substantial. ... These tests are far, far more approximations than most people believe. ...

Are we talking a few points? Are we talking 20 points? Thirty points?

The standard error of measurement for many tests is such that, let's say on a 50-item test, you might find three or four points as a standard error. ...

I get my score back and it says 62. How should I look at that number? As a parent, what should I be thinking when I look at that number?

Fortunately, most testing firms these days are beginning to report results as a score in a particular area. ... They don't give you a single score. Rather, they'll give you a little chart that has a score in it with a little graph that says how much higher or how much lower the kid might actually have scored. So as a parent, you look at that and you say, "Ah, my youngster scored somewhere in that range; not necessarily that precise point." ...
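Editor's illustration: the score range on such a report can be sketched with the classical test-theory formula SEM = SD × sqrt(1 - reliability), which is standard but is not given in the interview. The standard deviation (8 points), reliability (0.84), raw score (31) and 50-item length below are hypothetical values chosen so that the result lands in the "three or four points" range mentioned earlier. The function names are also hypothetical.

```python
import math


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if score_sd < 0 or not 0.0 <= reliability <= 1.0:
        raise ValueError("score_sd must be >= 0 and reliability must be in [0, 1]")
    return score_sd * math.sqrt(1.0 - reliability)


def score_band(observed: float, sem: float, n_sems: float = 1.0,
               min_score: float = 0.0, max_score: float = 50.0) -> tuple[float, float]:
    """Range of plausible scores around an observed score, clipped to the test's score scale."""
    low = max(min_score, observed - n_sems * sem)
    high = min(max_score, observed + n_sems * sem)
    return low, high


# Hypothetical 50-item test: SD of 8 raw-score points, reliability of 0.84.
sem = standard_error_of_measurement(score_sd=8.0, reliability=0.84)
low, high = score_band(observed=31, sem=sem)
print(f"SEM is about {sem:.1f} points; a raw score of 31 reads as roughly {low:.1f} to {high:.1f}")
```

With these assumed inputs the SEM comes out near 3.2 points, so a raw score of 31 is better read as a band from about 28 to 34. Widening `n_sems` to 2 gives a more conservative band.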

In many places, states are making high-stakes decisions -- who graduates, who doesn't -- using a precise test score -- 220 you're in, 219 you're out. Is that appropriate?

The measurement community is universal in its condemnation of a single criterion, like a single test score, to be used in making an important decision like denial of a diploma. But to use a test score as one contributor to a variety of evidence to make that decision, that's acceptable.

Now the question is, is this kind of test accurate enough to be a contributor? In most instances, the states allow the youngster to take and retake the test several times, to make sure that the kid didn't just have a bad outing the first time and so on. And so if you are allowed retakes, and to have that test be used as a contributor to the decision, I think that's acceptable. ...

If the scores are in that plus-or-minus five-point range, what makes the scores flop?

The chief reasons that you have variation in the kids' scores are the kids themselves. They may literally be approaching the test today with less sleep than they had the night before; some kind of emotional disturbance, parental and so on. It may be the way they're responding to the particular items that preceded certain items on the test. There may be something that offends a given child. A minority youngster finds minority children depicted in a way that bothers them. It may be the way they were taught in a particular classroom allows them to be confused by an item, which itself, had it not been taught that way, would have been very clear. And so on.

There are all sorts of little things that go into making a kid's performance less than totally accurate. ... It might be the temperature of the room. It might be the day of the week. It might be so many things. And so the score, even though it is a number, and even though it is earned on a test that comes from a technical firm, may be inaccurate. ...

I'm going to guess that many people out there hear the president speak about tests, and they hear everyone saying, "I want tough accountability in schools," and they think, "Oh, what's the big deal? We write a test." How simple is it to do this and come up with a good test?

It's hard to write a test that does an accurate job in reflecting what students have learned, and simultaneously give teachers and students guidance as to what they should be promoting instructionally. It's very difficult to do that, and there's an underestimate of that difficulty.

How long does that take? You're constructing a valid, reliable test of fourth-grade social studies. Start to finish, how long is it going to take?

At one time, I was involved in the development of tests for states. And if you were developing a test, let's say, of math and English and reading for a given grade level, you were looking at, at least, a year and a half to two years for the development of the items, for the field testing of the items, for the review of the items, to make sure they weren't biased, to measure the right kind of content. It's not an overnight enterprise, and my guess is, in general, you're looking at somewhere between a year and three years to develop a suitable test.

So what do you think when you hear President Bush calling for expanded testing, every year in grades 3 through 8? Are we ready for that?

My concern about the president's call for more testing is that he and his advisors may not recognize that if we have more of the same kinds of tests we're currently using, good things will not happen in American education. I'm not opposed to high-stakes testing. I think the proper kinds of high-stakes tests could be very useful for not only accountability, but for instruction. But if we have same old same old, in this instance, we'll be harming the kids, not helping them.

And we'll be in fact measuring what we think we're measuring?

We will not be measuring what we think we're measuring, because we'll create these tests that are designed to spread people out, and not necessarily assess precisely the knowledge and the skills that our children should be learning. If you have the right kind of test in there, you can do good things for education. The wrong kinds of tests can stultify and corrupt education in our country. ...

How long will it take to create the right kind of tests?

It will probably take anywhere from two to three, four years to develop crackerjack tests across the board. These tests will call for a different way of thinking about educational assessment, for a different way of thinking about how you measure content standards. But that thinking would be worth it. ...

What is the proper role of tests in schools?

Educational tests, if properly developed, can be a marvelous tool, not only to tell the world how well schools are doing, but to help teachers and children promote the kinds of knowledge and skills that children should be mastering. You have to think of tests differently than the traditional kinds of tests. My criticism is not of high-stakes tests, but of traditionally constructed standardized achievement tests. ...

I met this teacher in California who said, "I don't believe in any testing. I don't want any testing in my classroom. I don't believe that there's a single thing that tests can do that can help me do my job better."

There's a resistance emerging in our country to high-stakes tests of any sort. I think that's unsound. I believe that properly constructed high-stakes tests, tests that can help teachers teach more effectively, should be used. I think the public has a right to know how well their schools are doing. So to resist any kind of testing, I think is disadvantaging the children. You have to create the right kinds of tests. But they can be a powerful force for instructional design, for getting kids to learn what they ought to learn.

How can we do that?

You have to build tests in a different way. You build tests with instruction in mind. You don't build tests to spread out examinees, and stop the action there. You build tests where you're always thinking, "How could this be promoted by a reasonably effective instructor? How could this really be taught?" And you build tests in such a way that they capture worthwhile skills, but capture them in a way that lets teachers know how they should be teaching. ...

Before the interview, we were talking about my kids and their public school, and I think you said something like, "Let's do something about this use of tests, so that when my kids are in school, there will still be public schools." Do you really think that way? Is this what's at stake here?

The public disenchantment with American schooling is profound. And many people are looking for alternative solutions, whether they're charter schools or vouchers or something else. I believe in the public schools. And I believe those public schools can be made effective if they are not judged with the wrong assessment tools, and they're given assessment tools that help them do a better job. I want to see our public schools persist. But I think you have to start focusing on a different way of measuring their performance.

Are we setting up schools for failure with these tests that we currently use?

We're making it impossible in many instances for teachers to do any better unless they cheat. If we build tests that fundamentally measure what children bring to school, not what they learn there, then quite clearly those children are never going to get better than what they brought to school. We have to create tests that really do reflect how well teachers have been teaching. Those kinds of tests will allow, I think, public education to survive. The kind of test that we're using now is setting up public educators for absolute failure.

Care to predict where we'll be three years from now?

I think the only way that we're really going to make progress in this arena is for more people to learn about the subject matter. For more policy makers, for more citizens, for more parents, to learn about assessment, what tests should be used, what tests shouldn't be used. Because if they don't learn this, we'll continue to use the wrong assessment tools.


some photographs ©2002 getty images all rights reserved
web site copyright WGBH educational foundation
