Several of Britain’s most prestigious universities have quietly withdrawn AI-detection software from their assessment processes after internal reviews found the tools were disproportionately flagging work by dyslexic students and those who speak English as a second language. The retreat, confirmed by staff at three Russell Group institutions, marks a significant turning point in how higher education responds to generative AI — shifting the emphasis away from policing students and towards redesigning assessment altogether.
The decision follows mounting unease among academics about the reliability of detectors such as Turnitin’s AI-writing indicator and similar third-party tools, which claim to identify text generated by systems like ChatGPT. According to documents and accounts shared with TheAIPulse, several universities found error rates high enough to make the software indefensible — particularly where a false accusation could derail a student’s degree.
The data behind the retreat
Internal analyses at the institutions concerned reportedly showed that writing produced by neurodivergent students and non-native English speakers was flagged as AI-generated at notably higher rates than writing by the general student population. The pattern echoes findings from international research suggesting that detectors often misread simpler sentence structures, limited vocabulary variation and certain grammatical patterns as machine-authored.
Dr Helena Crayford, an assessment researcher at the fictional Centre for Digital Pedagogy, says the bias is structurally baked in.
“These tools work by measuring how ‘predictable’ text is. The problem is that a dyslexic student using assistive software, or someone writing in their third language, often produces exactly the kind of clean, low-variation prose the detector associates with AI. We were effectively penalising students for writing clearly.”
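The "predictability" Dr Crayford describes can be illustrated with a toy example. The sketch below is not how Turnitin or any commercial detector works internally; it assumes a tiny bigram word model with add-one smoothing, standing in for the large language models real tools are understood to rely on. The function names `train_bigram` and `perplexity` are hypothetical. The idea it shows is simply that text made of common, expected word sequences earns a lower perplexity score, and a detector built on that signal would treat a low score as a sign of machine authorship.

```python
import math
from collections import Counter


def train_bigram(corpus_tokens):
    """Count single words and adjacent word pairs in a reference corpus."""
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    vocab_size = len(unigrams)
    return unigrams, bigrams, vocab_size


def perplexity(tokens, model):
    """Return the perplexity of `tokens` under the bigram model.

    Lower values mean the text is more predictable to the model.
    """
    unigrams, bigrams, vocab_size = model
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        raise ValueError("need at least two tokens to score")
    log_prob = 0.0
    for prev, cur in pairs:
        # Add-one smoothing; the extra +1 reserves mass for unseen words.
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size + 1)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(pairs))


# Invented toy corpus, for illustration only.
model = train_bigram("the cat sat on the mat the cat sat on the rug".split())
plain = perplexity("the cat sat on the mat".split(), model)
unusual = perplexity("mat rug on cat the sat".split(), model)
print(plain < unusual)  # the plainer sentence scores as more predictable
```

On this logic, a writer who sticks to familiar phrasing and a narrow vocabulary, for whatever reason, produces text that looks "predictable" in exactly the way the quote describes.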
One senior administrator, speaking on condition of anonymity, described the moment the figures landed as “a quiet panic.” The university had, they said, already processed academic misconduct cases partly informed by detection scores. “Once you see the false-positive rate broken down by student group, you cannot keep using it in good conscience. The legal and ethical exposure is enormous.”
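A breakdown of the kind the administrator describes is straightforward to compute once each piece of work is labelled. The sketch below is an assumption about how such an audit might be done, not a description of any university's actual analysis. The function name, the record layout and the sample records are all hypothetical, and the figures are invented purely to show the calculation. It counts only work known to be human-written, since only that work can be falsely flagged.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Return {group: share of human-written work flagged as AI}.

    Each record is (group, was_flagged, used_ai). Work that really did
    use AI is skipped, because flagging it is not a false positive.
    """
    flagged = defaultdict(int)
    total = defaultdict(int)
    for group, was_flagged, used_ai in records:
        if used_ai:
            continue
        total[group] += 1
        if was_flagged:
            flagged[group] += 1
    return {group: flagged[group] / total[group] for group in total}


# Invented records for illustration only; not real audit data.
sample = [
    ("group_a", True, False),
    ("group_a", False, False),
    ("group_a", False, False),
    ("group_a", False, False),
    ("group_b", True, False),
    ("group_b", True, False),
    ("group_b", False, False),
    ("group_b", False, False),
    ("group_b", True, True),  # genuine AI use, excluded from the rate
]
print(false_positive_rate_by_group(sample))
```

The hard part in practice is not the arithmetic but the labels: an audit of this kind depends on knowing which flagged work was in fact written by the student.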
From detection to redesign
Rather than seeking better detectors, the institutions are pivoting towards assessment formats that are harder to outsource to AI and, crucially, do not rely on surveillance. These include oral examinations, in-person assessed seminars, iterative coursework with documented drafting, and assignments that require students to reflect on their own learning process.
Professor Idris Okonkwo, a fictional specialist in higher-education policy, argues the change is overdue.
“Detection was always a stopgap that treated students as suspects. The smarter response is to ask what we actually want to measure. If an assessment can be completed in thirty seconds by a chatbot, the problem is the assessment, not the technology.”
Several departments are now trialling what one of them calls “AI-aware” assessment, in which students are permitted to use generative tools but must declare and critically evaluate their use. The approach reframes AI literacy as a skill to be taught rather than a behaviour to be caught.
What about students already accused?
The most uncomfortable question concerns those who may have been wrongly penalised over the past two years. Because detectors have been in widespread use since 2023, an unknown number of students could have faced misconduct investigations, capped marks or reputational harm based on flawed evidence.
Student advocacy groups are now calling for institutions to audit historical cases. Maya Trent, a fictional caseworker with a national students’ union, says inquiries have risen sharply.
“We are hearing from students who maintained their innocence, were disbelieved, and carried that stigma. Some are neurodivergent and felt singled out. If universities accept the tools were unreliable, they have a moral duty to revisit those decisions.”
So far, no institution has publicly committed to a retrospective review — a reluctance critics attribute to fears of legal liability and reputational damage. Universities contacted for this article declined to comment on individual cases, citing confidentiality.
A wider reckoning
The quiet nature of these withdrawals is itself telling. Few universities have announced the change openly, preferring to update internal guidance without fanfare. That caution reflects the awkward position the sector finds itself in: having adopted detection tools rapidly under pressure, it must now unwind that decision without admitting the original approach may have caused harm.
Equality campaigners argue the episode should prompt closer scrutiny of any algorithmic tool deployed against students. “The lesson is not just about AI essays,” Dr Crayford adds. “It is about deploying automated systems on vulnerable populations without understanding their failure modes.”
What this means
The retreat from AI-detection software signals a maturing — if belated — response to generative AI in education, replacing surveillance with sounder assessment design. But it leaves a difficult legacy. For students who may have been wrongly accused, the shift offers little comfort without a willingness from universities to revisit past judgements. As the sector moves forward, the real test will be whether institutions treat this as a quiet course correction or as a genuine reckoning with the harm their tools may have done.