A lot of people who use containers are also fans of immutable infrastructure, which may be where some of the disconnect is here. Even before containers got popular there were a lot of shops that had disabled SSH into their machines to discourage the "artisan" server mentality.
I also don't know anyone who rolls containers by hand. Generally speaking they're automated (using things like Dockerfiles and provisioning scripts, which are often right in the repository storing the code), making rolling a new container easier than trying to upgrade a single library.
Generally speaking, I am not a big fan of restricting options. Having SSH is very helpful in case of a catastrophe requiring immediate attention.
Let's leave these emergencies aside. I prepare binaries, and deploy them to servers. The deployment contains several "moving parts", so there are many tests to validate deployment and make sure that all parts work well together.
There is also a rolling deployment, to compare the performance of the current version N to N-1 and N-2 on multiple axes, and catch weird regressions. When something that could not be caught in tests goes wrong (recently, a cascading issue caused by an increase in latency due to a change in the routing), I have to see what's happening on one of the live instances. I have to tweak things there.
Doing the equivalent of git bisect with binaries running on many servers is not fun.
If a feature or a fix only impacts one shared library, I would LOVE to be able to roll only that to part of the deployment fleet and see if it fixes the issue -- and roll a different mix of libraries to another small part of the deployment, etc.
Say I am testing x different versions of library X and y different versions of library Y. I could do that just as well with x*y = N different static binaries, but if I can do it with the master binary and x+y dynamic libraries, I believe it makes my life easier.
Live A/B testing is not really possible if the difference is a few percent in efficiency - you have to wait to have enough samples and do statistical tests to see if 1900 samples per time unit on average on servers A, B and C is a regression compared to 2000 samples on average on servers D, E and F. If I could test combinations of versions, I could more easily compare server A with library X version x to server B with library X version x-1 to server C with library X version x and library Y version y-1, etc.
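To make the "wait for enough samples and do statistical tests" part concrete, here is a minimal sketch of one way to do it: a permutation test on the difference in mean throughput between two groups of servers. The function name, the seed and the numbers are all made up for illustration, and this is just one test among several that would fit; it is not what any particular shop actually runs.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, rounds=10_000, seed=0):
    """Estimate how often a random relabelling of the measurements
    gives a gap in means at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed so reruns are reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The +1 keeps the estimate away from an impossible p-value of 0.
    return (hits + 1) / (rounds + 1)

# Hypothetical per-interval throughput, NOT real measurements.
servers_abc = [1880, 1925, 1890, 1910, 1905, 1890]  # candidate build
servers_def = [2010, 1985, 2005, 1990, 2015, 1995]  # baseline build

p = permutation_p_value(servers_abc, servers_def)
print(f"estimated p-value: {p:.4f}")
```

A small p-value says the gap is unlikely to be noise, but with only a handful of intervals per server you still have to wait for more data before trusting a difference of a few percent.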
The gap between xy and x+y seems small, but increase x and y, add many dimensions, and you really start to want more "breathing room".
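A quick way to see how fast the two counts diverge is to compute them. This is only arithmetic on the argument above; the function names are mine, and it assumes one static binary per combination of versions versus one master binary plus one shared library per version.

```python
from math import prod

def static_binaries(versions_per_library):
    """One statically linked binary per combination of library versions."""
    return prod(versions_per_library)

def dynamic_artifacts(versions_per_library):
    """One master binary plus one shared library per library version."""
    return 1 + sum(versions_per_library)

# Each tuple lists how many versions of each library are under test.
for counts in [(2, 2), (3, 4), (5, 5, 5), (4, 4, 4, 4)]:
    print(counts, static_binaries(counts), dynamic_artifacts(counts))
```

With two versions of two libraries there is nothing to gain (4 static binaries against 5 dynamic artifacts), but with five versions of three libraries it is already 125 against 16.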