© 2025 Bryan Caplan
https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
"As further evidence for this hypothesis, we tested it on Codeforces problems from different times in 2021. We found that it could regularly solve problems in the easy category before September 5, but none of the problems after September 12.
In fact, we can definitively show that it has memorized problems in its training set: when prompted with the title of a Codeforces problem, GPT-4 includes a link to the exact contest where the problem appears (and the round number is almost correct: it is off by one). Note that GPT-4 cannot access the Internet, so memorization is the only explanation."
It sounds like GPT-4 was trained on questions similar to your exam questions. Have you tried asking it to generate exam questions like the ones you gave, to get a sense of what it may have seen in training? Another thing to try would be writing new questions that test the same concepts.
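The before/after-cutoff comparison described in the quoted post could be sketched roughly as follows. This is a minimal illustration with made-up problem IDs and invented solved/unsolved outcomes; in practice the `solved` flags would come from actually running the model on each problem, and the exact cutoff date is an assumption based on the quote:

```python
from datetime import date

# Hypothetical records: (problem_id, contest_date, solved_by_model).
# The IDs and outcomes below are invented for illustration only.
results = [
    ("1559A", date(2021, 8, 16), True),
    ("1560B", date(2021, 8, 19), True),
    ("1593A", date(2021, 10, 12), False),
    ("1598B", date(2021, 10, 17), False),
]

# Approximate training-data cutoff suggested by the quoted test.
CUTOFF = date(2021, 9, 5)

def solve_rate(records):
    """Fraction of problems in `records` the model solved."""
    return sum(solved for _, _, solved in records) / len(records)

before = [r for r in results if r[1] < CUTOFF]
after = [r for r in results if r[1] >= CUTOFF]

print(f"solve rate before cutoff: {solve_rate(before):.0%}")
print(f"solve rate after cutoff:  {solve_rate(after):.0%}")
```

A sharp drop in solve rate at the cutoff, as in the quoted Codeforces test, suggests memorization rather than genuine problem-solving ability.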