The virtual keyword in C++ is more of a compiler optimization and less of a design decision. C++ doesn't want everyone paying the overhead of virtual function calls the way other languages do.
I think that's an oversimplification. There was pressure on the language to keep data structures compatible with C structs, and a class with virtual functions typically carries a hidden vtable pointer that a C struct doesn't have. Leaving the vtable out of simple classes was therefore a win for moving data between the two languages.
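To make the layout point concrete, here is a minimal sketch. The type names (c_point, PlainPoint, VirtualPoint) are made up for illustration and don't come from any real codebase. It only checks what the standard's type traits report; the exact size of the virtual version depends on the implementation.

```cpp
#include <type_traits>

// Hypothetical types for illustration only.
extern "C" {
struct c_point { int x; int y; };  // what a C header might declare
}

// No virtual functions: the object holds just x and y.
class PlainPoint {
public:
    int sum() const { return x + y; }
    int x;
    int y;
};

// One virtual function is enough: on mainstream implementations the object
// now also stores a hidden pointer to the vtable.
class VirtualPoint {
public:
    virtual ~VirtualPoint() = default;
    virtual int sum() const { return x + y; }
    int x;
    int y;
};

// The non-virtual class is standard-layout and the same size as the C struct.
static_assert(std::is_standard_layout<PlainPoint>::value,
              "PlainPoint should be standard-layout");
static_assert(sizeof(PlainPoint) == sizeof(c_point),
              "PlainPoint should match the C struct in size");

// A class with virtual functions is never standard-layout.
static_assert(!std::is_standard_layout<VirtualPoint>::value,
              "VirtualPoint should not be standard-layout");
```

If this compiles, the static_asserts held. Printing sizeof(VirtualPoint) on your own toolchain shows how much the vtable pointer (plus padding) adds there.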
Of course, these days LTO blurs the whole performance question: de-virtualisation can happen across a whole application at link time, so the presumed cost of virtual calls can disappear (even where it was never a real performance issue to begin with). It's tough to lay down hard-and-fast rules here.
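As a rough illustration of what de-virtualisation has to work with, here is a small sketch with hypothetical names (Shape, Square, sides_via_base, sides_via_final). Whether the indirect call actually goes away depends on the compiler, the flags (for example -O2 with -flto on GCC or Clang), and how much of the program the optimiser can see, so treat the comments as what is permitted rather than what is guaranteed.

```cpp
// Hypothetical types and functions for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

// 'final' promises that nothing derives from Square, so a call made through
// a Square reference or pointer has exactly one possible target.
struct Square final : Shape {
    int sides() const override { return 4; }
};

// Through the base class the dynamic type is unknown at this point, so this
// is an indirect call unless the optimiser can prove the type (for example
// after inlining, or with whole-program information at link time).
int sides_via_base(const Shape& s) { return s.sides(); }

// Here the static type is a final class, so the compiler is allowed to call
// Square::sides directly without consulting the vtable.
int sides_via_final(const Square& s) { return s.sides(); }
```

Both functions return the same answer for a Square; the difference, if any, only shows up in the generated code, which is why inspecting the assembly is more reliable than a rule of thumb.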