Indiana's school letter grades were released Wednesday, nearly three months later than last year and just six days before Indiana voters choose a state superintendent of public instruction and new school board representatives.
Chris Himsel, superintendent of Northwest Allen County Schools, sent this scathing critique of the Indiana Department of Education's revised A-F grading to selected lawmakers last month. It also has been distributed by the Indiana Association of Public School Superintendents, and is being widely praised by school officials across the state:
I am proud of the progress that Indiana public schools continue to make in raising student achievement. Northwest Allen County Schools remains devoted to establishing healthy and safe learning environments that engage, support, and challenge each child with the commitment of helping each child achieve. This commitment to helping each child achieve exists regardless of the child's race, religion, economic advantage or disadvantage, mental or physical challenges, or native language. Each educator also recognizes that more work needs to be done and continues to commit to the success of each child.
However, the accountability grades recently released also reflect flaws in a system that does not accurately reflect the learning that takes place in our public schools each day. I applaud the concept of measuring and taking into consideration individual student growth. I also commit to continue working with policymakers to establish a growth model that makes sense and truly measures individual student growth. In the meantime, I encourage policymakers to replace the current growth statistic because it is not criterion based, it does not statistically make sense, it does not account for standard measure of error, it is unexplainable and difficult to understand, and it fails to comply with current law and administrative code.
Not criterion based: If the current growth statistic were criterion based, each educator, parent, and student would know, before the test is given, the target scale score needed to achieve typical and high growth status. Because the growth statistic is based on a normal curve that compares kids within score bands, no predictability or transparency exists. Likewise, students whose scores increase at a rate 2 or 3 times the rate that the cut score increases can be labeled low growth while other students whose scale score decreases compared to the previous year can be labeled high growth – it all depends on who the student is compared to. Likewise, a student whose score increases 25 points may be high growth one year, and a different student in the same grade level the following year may be considered low growth for the exact same 25-point increase. This does not make sense and does not measure growth. It measures competition among students and assumes no matter how much or how little learning is taking place that some students are high, others are typical, and some are low. True criterion based data points are designed so that it is possible for 100% to achieve it or 0% to achieve it; neither is ever possible in a norm-referenced world.
Does not statistically make sense: It simply does not make sense that one student can increase his/her scale score at a rate 2-3 times the rate of the cut score and be considered low growth and another student can have his/her score decrease and be considered high growth. Again, it goes back to the growth statistic not being a criterion referenced achievement target. Instead, it is a comparison statistic that guarantees winners and losers regardless of how much or how little learning actually occurs.
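The contrast drawn in the two points above can be illustrated with a small sketch. It is not the state's actual growth model, whose details are not given here. The peer groups, the one-third and two-thirds rank cutoffs, and the 10- and 20-point targets are all hypothetical values chosen only to show how a ranking-based label and a fixed-target label can disagree.

```python
def norm_referenced_label(gain, peer_gains, low_cut=1/3, high_cut=2/3):
    """Label a score gain by its rank among peers (hypothetical cutoffs)."""
    # Share of peers whose gain is strictly smaller than this student's.
    rank = sum(1 for g in peer_gains if g < gain) / len(peer_gains)
    if rank < low_cut:
        return "low"
    if rank >= high_cut:
        return "high"
    return "typical"


def criterion_label(gain, typical_target=10, high_target=20):
    """Label a score gain against fixed targets known in advance (hypothetical)."""
    if gain >= high_target:
        return "high"
    if gain >= typical_target:
        return "typical"
    return "low"


# Hypothetical peer groups: the same 25-point gain lands differently each year.
cohort_year1 = [-5, 0, 5, 10, 15, 25]
cohort_year2 = [25, 30, 35, 40, 45, 50]
# A hypothetical group in which every score fell.
cohort_declining = [-20, -15, -10, -5, -2, -1]

same_gain_year1 = norm_referenced_label(25, cohort_year1)      # "high"
same_gain_year2 = norm_referenced_label(25, cohort_year2)      # "low"
drop_by_rank = norm_referenced_label(-1, cohort_declining)     # "high"
drop_by_target = criterion_label(-1)                           # "low"
```

Under these assumed numbers, the fixed-target label for a 25-point gain is "high" in both years, while the rank-based label depends entirely on the comparison group.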
Does not account for standard measure of error: The results reflect absolute data points. However, in the world of psychometrics, there is no such thing as an absolute data point. We are assessing the progress of kids, not producing widgets, and there is standard measure of error inherent in all test scores used to determine the current growth statistic. Because the standard measure of error is not shared, we do not know the score bands defined by the standard measure of error. However, a scale score of 500 should be read as plus or minus a few points. For example, if the standard measure of error was 5 points for a 95% confidence interval, which is the confidence interval typically used in social science research, then the score band would be between 495 and 505. I believe that if standard measure of error was considered within the growth statistic, then many schools may or may not have lost an additional point. The concept of standard measure of error is similar to the polling data that is reported daily during the election cycle. For example, a poll might suggest that a community supports a candidate at a 47% rate, plus or minus 4 percentage points. This means that actual support lies somewhere between 43 and 51 percent. This is the same concept we are referring to when we speak of the standard measure of error. Unfortunately, our schools and our kids are being subjected to high stakes decisions without taking the standard measure of error into consideration.
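The arithmetic behind the score band and the polling example can be written out as a short sketch. The function names are illustrative, and the second pair of scores (500 and 506) is a hypothetical case added only to show why a small difference may not be meaningful once the margin is taken into account.

```python
def score_band(value, margin):
    """Return the (low, high) range for a reported value plus or minus a margin."""
    return (value - margin, value + margin)


def bands_overlap(band_a, band_b):
    """True when two (low, high) ranges share at least one point."""
    return band_a[0] <= band_b[1] and band_b[0] <= band_a[1]


# Examples from the letter: a scale score of 500 with a 5-point margin,
# and a poll result of 47% with a 4-point margin.
test_band = score_band(500, 5)   # (495, 505)
poll_band = score_band(47, 4)    # (43, 51)

# Hypothetical case: scores of 500 and 506, each with a 5-point margin,
# have overlapping bands, so the 6-point gap may not reflect a real difference.
indistinguishable = bands_overlap(score_band(500, 5), score_band(506, 5))
```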
Unexplainable and difficult to understand: I do not understand this system, and I live in this world each day. I will need to explain the system to parents and media members in the next few days and weeks, and I do not know how I will accomplish it since I do not understand it myself. I do not understand how some students can have their score decrease and be considered high growth while others see dramatic increases in their scale score and are considered low growth. I do not understand how one cutoff for determining growth bonus points or growth penalty points is 36.2%, another is 42.5%, another is 39.2%, another is 44.9%, etc. The varying percentages make it look like the policymakers are trying to determine cutoff points that identify a particular quantity of students or schools in certain categories. Ultimately, the difficulty in understanding the growth statistic is based on the fact that none of it is criterion referenced, transparent, or communicated prior to the testing cycle.
Fails to comply with current law: Additionally, IC 20-31-8-2 states that, "the department
And in the original and current accountability administrative codes (511 IAC 6.2-6) that clarified IC 20-31-8, it states multiple times that, "improvement shall be determined by the average of the yearly improvement for the three-year period … (three-year rolling average)." Nowhere in this new system is there a mention or consideration of an average over a three-year period. The purpose of the three-year average was to ensure that decisions are based on trends of performance instead of one-year aberrations.
511 IAC 6.2-6 also reinforces the idea of requiring criterion referenced data points by stating, "scores to pass tests will be set at the levels necessary to demonstrate solid academic performance on standards. These scores will not be set or skewed for the reason to cause more or fewer student to pass or more or fewer schools to rise or fall in category placements." In other words, the administrative codes and the Indiana law require the establishment of a pre-determined criterion referenced data point that each school and student knows prior to the test cycle, and ideally before the learning cycle.
Northwest Allen County Schools remains steadfast in our commitment to help each child succeed. We also welcome the opportunity to partner with policymakers to develop a system that more accurately reflects the learning that occurs in our public schools each day.
Superintendent, Northwest Allen County Schools