the increases are not as fast, but they're still there. the models are already exceptionally strong, I'm not sure that basic questions can capture differences very well
"plateau" in the sense that your tests are not capturing the improvements. If your usage isn't using its new capabilities then for you then effectively nothing changed, yes.