You still need health checks, though? Otherwise, how do you tell the difference between "no traffic + server alive" and "some traffic + server dead"? Yeah, you can monitor throughput on a load balancer, but if I ever again wake up to an alert about no traffic being served, I will throw hands.
Yeah, they’d have to exist, but for quite a different purpose. I wonder if that means we could implement them in different ways, e.g. have a health check service that everything pings, and if a ping isn’t received for N minutes, assume the service is dead and trigger some replacement routine or alert.
This begins to look a lot more like a software watchdog at that point. You could even have each service report a count of outstanding/processed requests per tick, and if a server never shows any outstanding or processed requests, the watchdog could kill it off.
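A minimal sketch of that watchdog idea, assuming services push heartbeats with per-tick request counts (the `Watchdog` class, its thresholds, and the `ping`/`dead_services` names are all made up for illustration):

```python
import time


class Watchdog:
    """Hypothetical software watchdog: services ping with per-tick request
    counts; prolonged silence, or pings with zero activity, mark them dead."""

    def __init__(self, timeout_s=300, idle_ticks_allowed=3):
        self.timeout_s = timeout_s              # N minutes of silence => dead
        self.idle_ticks_allowed = idle_ticks_allowed
        self.last_ping = {}                     # service -> last ping timestamp
        self.idle_ticks = {}                    # service -> consecutive idle ticks

    def ping(self, service, outstanding, processed, now=None):
        now = time.time() if now is None else now
        self.last_ping[service] = now
        if outstanding == 0 and processed == 0:
            self.idle_ticks[service] = self.idle_ticks.get(service, 0) + 1
        else:
            self.idle_ticks[service] = 0

    def dead_services(self, now=None):
        now = time.time() if now is None else now
        dead = set()
        for service, ts in self.last_ping.items():
            if now - ts > self.timeout_s:
                dead.add(service)               # stopped pinging entirely
            elif self.idle_ticks.get(service, 0) > self.idle_ticks_allowed:
                dead.add(service)               # still pings, but never sees traffic
        return dead
```

The two conditions map onto the two failure modes above: silence catches a crashed process, while the zero-activity counter catches a process that's up but never serving anything.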
server health is necessarily a function of the actual production traffic it receives, determined at the application layer, as observed by a specific observer
it can't be known by the server itself, as (among many other reasons) the server can't know about network issues between itself and any upstream caller
it can't be determined by out-of-band health check queries, because those queries don't represent actual traffic; the simplifying assumption that they _do_ introduces many common failure modes that any seasoned engineer can speak about at length
health checks can be a nice additional signal on top of monitoring actual prod traffic, but they can't be used by themselves; they just don't capture enough relevant information
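One way to picture "health checks as an additional signal, not the signal" is a combiner that only trusts the probe when there's no real traffic to judge by. This is a sketch under assumed names and thresholds (the `assess` function, the 5% error-rate cutoff, and the state labels are all hypothetical):

```python
def assess(healthcheck_ok, requests_last_window, errors_last_window,
           expected_min_requests):
    """Hypothetical health combiner: real traffic is the primary signal;
    the out-of-band check only breaks ties when there's no traffic."""
    if requests_last_window >= expected_min_requests:
        # Enough real traffic to judge health at the application layer.
        error_rate = errors_last_window / requests_last_window
        return "healthy" if error_rate < 0.05 else "degraded"
    # No meaningful traffic: we can't distinguish "idle" from "unreachable"
    # from this vantage point, so the probe only narrows it down.
    if healthcheck_ok:
        return "unknown-idle"   # probe answers, but health is unproven
    return "suspect"            # no traffic AND failing probes
```

Note the combiner never returns "healthy" from the probe alone, which is the point: a passing check with no traffic is compatible with a server nobody can actually reach.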