Interesting to Know

GPT-4 Can Evaluate Student Answers as Accurately as Human Examiners — Passau University Study

OpenAI's GPT-4 has shown that it can assess students' written responses at a level comparable to, and sometimes exceeding, that of human lecturers. This is the conclusion of a study conducted by researchers at the University of Passau, led by Professor Johann Graf Lambsdorff.

Published in Scientific Reports, the study aimed to determine whether AI could reliably grade open-ended macroeconomics responses. The team analyzed 300 student answers to six typical questions, comparing evaluations made by both human reviewers and GPT-4.

Key findings of the study:

  • Innovative comparison method: Rather than treating human scores as the gold standard, the researchers measured the consistency between evaluators. When GPT-4 replaced one of the three human reviewers and agreement within the resulting panel increased, this indicated a higher-quality assessment (a code sketch of this idea follows the list).
  • Accuracy of GPT-4: The AI system accurately ranked answers based on completeness and correctness. It frequently aligned with human judgments in identifying the best, middle, and weakest responses.
  • Tendency to over-score: GPT-4 occasionally assigned marks up to one point higher than human reviewers in numerical scoring.
  • Resilience to ambiguity: The technical part of the experiment, conducted by Abdullah Al Zubair under the guidance of Professor Michael Granitzer, showed that GPT-4 maintained consistent grading even when questions were vaguely formulated.
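
To make the swap-in comparison concrete, here is a minimal sketch in Python. It assumes mean pairwise Spearman correlation as the agreement metric and uses made-up scores; the article does not specify which statistic the Passau team computed, so both the metric and the numbers are illustrative only.

```python
import numpy as np
from scipy.stats import spearmanr

def panel_agreement(scores: np.ndarray) -> float:
    """Mean pairwise Spearman correlation across all rater pairs.

    scores has shape (n_raters, n_answers).
    """
    n = scores.shape[0]
    corrs = [spearmanr(scores[i], scores[j])[0]
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(corrs))

# Hypothetical 0-5 marks for ten answers from three human reviewers.
humans = np.array([
    [3, 5, 2, 4, 1, 5, 3, 2, 4, 1],  # reviewer A
    [2, 5, 2, 3, 1, 4, 3, 1, 4, 2],  # reviewer B
    [3, 4, 1, 4, 2, 5, 2, 2, 5, 1],  # reviewer C
])
# Hypothetical GPT-4 marks for the same ten answers.
gpt4 = np.array([3, 5, 2, 4, 1, 5, 3, 2, 4, 2])

baseline = panel_agreement(humans)
print(f"human-only panel agreement: {baseline:.3f}")
for k in range(humans.shape[0]):
    swapped = humans.copy()
    swapped[k] = gpt4  # GPT-4 takes this reviewer's seat
    print(f"GPT-4 replaces reviewer {k}: {panel_agreement(swapped):.3f}")
```

If agreement rises when GPT-4 takes a reviewer's seat, its grades are at least as consistent with the remaining humans as the replaced reviewer's were, which is the signal the study treats as evidence of assessment quality.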

Despite these promising results, the researchers emphasized that AI should not fully replace human evaluators. Humans remain essential for preparing model answers and making final grading decisions. GPT-4 can, however, serve as a secondary reviewer, improving both grading efficiency and objectivity.

The Passau study points to a new model of collaboration between humans and AI in higher education, with artificial intelligence functioning as a reliable assistant rather than a replacement.