Frontier AI Models Fail Basic UK Benefits Maths in New Turing Institute Welfare Test
An Alan Turing Institute benchmark found GPT-5, Claude and Gemini erred on over 30% of Universal Credit calculations, raising alarm as the...