to [email protected] I believe your scoring algorithm (PEG) is having a problem. My students can no longer earn a score higher than 24. We have even used some previous essays that received a 30 earlier this year. Again, nothing higher than a 24.
Is this just happening for my classes, or are you having this problem across the board?
Thank you for your email. We are so sorry for the frustration you and your students are experiencing with the scoring of essays. The issue does indeed stem from a correction made to the code of our automated scoring engine. When we tested this correction on October 30th, we did not find significant scoring changes. The code change was released along with enhancements to spelling, grammar, and targeted feedback.
Since the release, we have heard from teachers that the change has indeed affected scoring. Had we realized the effects, we would have forewarned our users or delayed the release. Our intention was to get these enhancements out as quickly as possible, in our continuing effort to give users the most accurate and rigorous scoring and the best feedback possible. Rest assured that we are looking into this issue.
Though understanding why this happened and what it means will not change the scores, it is important that you know that this change does not call the accuracy of the scoring engine into question. As you may already know, we build models for the scoring engine from thousands of essays that have been hand-scored by humans. Once these models are built, their output correlates closely with human scoring of any essay. Our artificial intelligence scoring engine, PEG, is nationally acclaimed for its accuracy and correlation to human scoring, and we take great pains to ensure its integrity.
The current models were put into place last summer to allow for differentiation between types of essays: argumentative, informative/explanatory, and narrative. Until then, all essays were scored generically. Since then, we have noticed that the way students used paragraphing and/or blank lines affected their scores inconsistently. This was neither an intentional part of the model nor a part of the human scoring, but a by-product of the literal nature of a computer. Since PEG Writing is a formative program and formatting is not part of the rubric, we wanted to ensure that line spacing had no effect on student scores.
We regret that you and your students have felt the impact of this change. However, please keep in mind that the adjusted scores more accurately reflect those that would be assigned by human scorers. Our standards for good writing have not changed; we have adjusted a scoring model to better reflect those standards. Also, please reiterate to your students that this program is formative, and that the lower scores indicate only more room to improve their writing. We know, of course, that this is little consolation.
We again apologize to you and your students. Let us know if you have any further questions.
Kim Wilson
Utah Compose Support
Toll Free: 866-691-1231
Email: [email protected]
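Kim's point about blank lines deserves a closer look. If a scoring engine counts paragraphs or line breaks among its features, purely cosmetic formatting can move a score, which is exactly the by-product she describes. Here is a minimal sketch of the kind of normalization that would remove that effect; PEG's actual code is not public, so the function and approach below are my own guess:

```python
# A guess at the blank-line normalization described above;
# PEG's actual implementation is proprietary.
import re

def normalize_spacing(essay: str) -> str:
    """Collapse runs of blank lines to a single newline so that
    line spacing cannot influence downstream scoring features."""
    return re.sub(r"\n\s*\n+", "\n", essay).strip()

# Two formattings of the same prose now produce identical input
# for feature extraction:
a = "First point.\n\nSecond point."
b = "First point.\n\n\n\n\nSecond point."
assert normalize_spacing(a) == normalize_spacing(b)
```

After a change like this, essays that previously picked up points for extra blank lines would score lower, which matches the drop teachers are seeing.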
It can be tough identifying the right times to fight for something. You want to stand up for what you think is right, but you also don't want to exacerbate the confrontation. These simple guidelines can help you decide when it's worth your time.
When you get into it with someone, things can get pretty heated. There can be lasting effects from a confrontation about something small. Kathleen Kelley Reardon at Big Think put together a great list of times when it's best not to engage in battle:
-There's a low probability of winning without doing excessive damage
-Upon reflection, winning isn't as important as it originally seemed
-There likely will be a time down the line when you can raise the issue again with a different person or in a different way
-The other party's style is provocative whether they're speaking with you or with others, so it's not worth taking personally
-You could win on the immediate issue, but lose big in terms of the relationship
"How are the essays scored? In the 2014 SAGE operational field test, all student essays will be scored by a panel of trained writing evaluators. Beginning with the 2015 SAGE test, all student essays will be scored by a writing analysis algorithm. Additionally, about 10% of student essays will also be scored by an evaluation panel."
Since its acquisition of the legacy PEG system from Dr. Ellis Batten Page and his associates in 2002, MI has been an active force in AI scoring, also known as automated essay scoring. PEG is the industry's most researched AI scoring system and has been used by MI to provide over two million scores to students over the past five years. PEG is currently used by one state as the sole scoring method on its summative writing assessment, and we have conducted pilot studies with three other states. In addition, PEG is currently used in 1,000 schools and 3,000 public libraries as a formative assessment tool. Using advanced, proven statistical techniques, PEG analyzes written prose, calculates more than 300 measures that reflect the intrinsic characteristics of writing (fluency, diction, grammar, construction, etc.), and achieves results comparable to human scorers in terms of reliability and validity.
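MI doesn't publish those 300-plus measures, but the general recipe behind automated essay scoring is well known: extract numeric measures from the text, fit a statistical model to essays that humans have already scored, and validate against held-out human scores. Everything in the sketch below (the features, the model choice) is an illustrative assumption, not PEG's actual method:

```python
# Illustrative sketch of the AI-scoring recipe described above. The
# three toy measures stand in for PEG's 300-plus proprietary ones.
import re
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

def measures(text: str) -> list[float]:
    words = text.split()
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [
        len(words),                                            # fluency proxy
        len({w.lower() for w in words}) / max(len(words), 1),  # diction proxy
        len(words) / max(len(sentences), 1),                   # construction proxy
    ]

def fit_and_validate(essays: list[str], human_scores: list[float]):
    X = np.array([measures(e) for e in essays])
    y = np.array(human_scores)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = Ridge().fit(X_tr, y_tr)
    # "Comparable to human scorers" would show up here as a high
    # correlation between predicted and human scores on held-out essays.
    r = np.corrcoef(model.predict(X_te), y_te)[0, 1]
    return model, r
```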
A well-crafted email written by Scot Meldrum to Utah Compose:
Thank you for your swift response. I understand that you have made some changes to PEG, and it is admirable that you are still refining the algorithm. However, I don't believe that you have responded to my actual concern. It's fine that there are changes to the algorithm. It's fine if it takes more work to get a higher score. The problem is the cap that all my students are hitting. Out of over 130 essays that PEG has evaluated, none has received a score higher than 24, and many landed exactly at that cap of 24. This is very concerning because it doesn't accurately reflect a student's skill or level of knowledge.
There is plenty of evidence of PEG's cap on my website, http://www.refriedteacher.com/week-3-4.html. Please go and look at the graphs of student scores and see the problem firsthand.
Thank you again for your help and support.
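For anyone who wants to verify a ceiling like this in their own classes, the check is simple: export the scores and look at the top of the distribution. A rough sketch, assuming a hypothetical CSV export of the roughly 130 essays:

```python
# Quick check for a score ceiling. The file and column names are
# hypothetical; any per-essay score export would work.
import pandas as pd

scores = pd.read_csv("peg_scores.csv")["overall_score"]

print("Highest score awarded:", scores.max())          # stuck at 24?
print("Essays at that score:", (scores == scores.max()).sum())
print(scores.value_counts().sort_index())              # a pile-up at 24 is the cap
```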
Good morning Mike,
We do appreciate your feedback. Your classes are not the only ones affected by the change. Please see the following message, which includes more information that you will hopefully find helpful.
“We have been made aware that essays are scoring at lower-than-expected levels for some students. We are investigating this issue. In the meantime, we want to help users understand how and why scoring has been affected.
The PEG team is constantly modifying the scoring engine to make it more accurate. Last fall we implemented an update to PEG that adjusted its sensitivity to several features so that it better reflected human scoring. One adjustment increased PEG’s sensitivity to run-on sentences, which resulted in more in-text identifications of run-ons. Another decreased its sensitivity to paragraph breaks, so that higher paragraph counts would no longer tend to increase a score. Modifications such as these have made PEG’s scoring more accurate but also more severe.
Whenever adjustments such as these are made, the severity of the scoring will naturally change as well. In the case of this last release, the improvements somewhat increased the severity of the scoring. The PEG team is investigating appropriate steps to address scoring severity while ensuring that PEG remains both sensitive and accurate, and that its integrity is never compromised. We want to be cautious in our response so that if and when changes are made, they have little or no impact on reporting and do not increase confusion among users.
There have been some concerns that it is no longer possible for students to earn a combined score of 30 on an essay. We have determined that this is not the case, and that 30s are indeed being attained by students at all grade bands.”
Kim Wilson
Utah Compose Support
Toll Free: 866-691-1231
Email: [email protected]
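Reading between the lines, the adjustments Kim describes amount to reweighting features: the run-on penalty went up and the paragraph-count bonus went away. Treating the score as a weighted sum of features is my own simplification (PEG's internals are proprietary), but it shows why scores dropped across the board:

```python
# Sketch of the sensitivity adjustments described above, treating the
# score as a weighted sum of features. The features, weights, base
# score, and run-on heuristic are all invented for illustration.
import re

def count_run_ons(text: str) -> int:
    """Crude heuristic: long, comma-heavy sentences count as run-ons."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(1 for s in sentences
               if s.count(",") >= 3 and len(s.split()) > 40)

def score(text: str, weights: dict[str, float], base: float = 20.0) -> float:
    feats = {
        "paragraph_count": len([p for p in text.split("\n\n") if p.strip()]),
        "run_on_sentences": count_run_ons(text),
    }
    return base + sum(weights[k] * v for k, v in feats.items())

# Before the update: paragraph breaks helped, run-ons cost little.
old_weights = {"paragraph_count": 0.8, "run_on_sentences": -0.3}
# After: paragraph count is neutralized and run-ons cost more, so the
# same essays score lower even though nothing about them changed.
new_weights = {"paragraph_count": 0.0, "run_on_sentences": -0.9}
```

Under the old weights, an essay chopped into many short paragraphs scored higher than the same prose in one block; under the new ones, it doesn't. That is consistent with both the across-the-board drop and the claim that 30s are still attainable.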