by Haja Mo February 8, 2026 Tech Large language models struggle to solve research-level math questions. It takes a human to assess just how poorly they perform.