I’m not sure why they are describing it as “a new paper” - this came out in May of 2023 (and as such notably only used GPT-3 and not GPT-4, which is where some of the biggest leaps to date have been documented).
For those interested in the debate on this, the rebuttal by Jason Wei (from the original emergent abilities paper and also the guy behind the CoT prompting paper) is interesting: https://www.jasonwei.net/blog/common-arguments-regarding-emergent-abilities
In particular, I find his argument at the end compelling:
“Another popular example of emergence which also underscores qualitative changes in the model is chain-of-thought prompting, for which performance is worse than answering directly for small models, but much better than answering directly for large models. Intuitively, this is because small models can’t produce extended chains of reasoning and end up confusing themselves, while larger models can reason in a more-reliable fashion.”
If you follow the evolution of prompting in research lately, there’s definitely a pattern of reliance on increased inherent capabilities.
Whether that’s using analogy to solve similar problems (https://openreview.net/forum?id=AgDICX1h50) or self-determining the optimal strategy for a given problem (https://arxiv.org/abs/2402.03620), there are double-digit performance gains in state-of-the-art models from having them perform actions that less sophisticated models simply cannot achieve.
The compounding effects of competence alone mean that progress here isn’t going to be a linear trajectory.