Restarts don't always need a debugger or intervention.
A classic case would be retrying a memory allocation after it fails and the error handler has freed some cache memory.
Another case would be retrying to open a file when the process has too many files open already. The error handler may close a few that are not critical, then have the restart fire.
Another case is sending something through TCP. Perhaps you try to send something, and it gives you an error. Unbeknownst to you, the message was already sent, but you wait a second or until other connections quiet down, then restart and try again, and it succeeds. The other end gets a duplicate, which is no matter as long as the receiver tolerates repeats; TCP itself only discards duplicate segments from its own retransmissions, not a message the application sends twice.
Another case is DNS. Say you need to get the IP address for some hostname, and you query your default DNS server first. However, it happens to be run by your local incompetent sysadmins, and it happens to be down. Your error handler may choose a different, maybe public, DNS server, like Cloudflare or Google, and then restart.
If you think, 'Oh, well, I could program those in without restarts,' you are correct, but the thing is that doing so couples things.
Take the DNS example: if you put that extra error handling logic in the code that actually tries resolving things, then how do you change error handling when you need to?
Let's make the example even more detailed: perhaps you have a fleet of servers, a whole data center. Most of those servers could use a public DNS if they needed to, but perhaps your head node must NEVER use a public DNS for security reasons. The typical way to implement that would mean having an `if` statement for acting differently based on whatever condition would indicate head node or not. That is coupling the DNS resolution with error handling.
But if you have conditions and restarts, then you simply register a different DNS error handler at startup based on whether it's the head node or not. Or the error handler could have the `if` statement instead. Either way would decouple DNS resolution from the error handling.
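To make the DNS case concrete, here is a minimal sketch in Python. Python has no built-in restarts, so this only emulates the idea: the resolver asks a registered handler which restart to take at the point of failure. Everything here is made up for illustration: `ResolveFailed`, `register_dns_handler`, `resolve`, `query`, the `FAKE_SERVERS` table, and the addresses (`10.0.0.53` stands in for the internal server, `1.1.1.1` for a public one).

```python
from typing import Callable, Optional

PUBLIC_DNS = "1.1.1.1"      # stands in for a public resolver
INTERNAL_DNS = "10.0.0.53"  # hypothetical internal server

# Fake network: server -> records, or None when the server is down.
FAKE_SERVERS = {
    INTERNAL_DNS: None,
    PUBLIC_DNS: {"example.com": "192.0.2.10"},
}


class ResolveFailed(Exception):
    """Condition signaled when a server cannot answer a query."""

    def __init__(self, host: str, server: str):
        super().__init__(f"{server} could not resolve {host}")
        self.host = host
        self.server = server


# A handler looks at the condition and returns the server to retry with,
# or None to decline (the error then propagates as usual).
Handler = Callable[[ResolveFailed], Optional[str]]

_handler: Handler = lambda cond: None


def register_dns_handler(handler: Handler) -> None:
    global _handler
    _handler = handler


def query(host: str, server: str) -> str:
    """Stand-in for a real DNS query."""
    records = FAKE_SERVERS.get(server)
    if records is None or host not in records:
        raise ResolveFailed(host, server)
    return records[host]


def resolve(host: str, server: str = INTERNAL_DNS) -> str:
    """Knows how to retry with another server, but not when to do so."""
    while True:
        try:
            return query(host, server)
        except ResolveFailed as cond:
            replacement = _handler(cond)  # policy lives outside this function
            if replacement is None:
                raise                     # no restart chosen
            server = replacement          # the "retry with this server" restart


def worker_policy(cond: ResolveFailed) -> Optional[str]:
    # Fall back to the public server once; decline if it failed too.
    return PUBLIC_DNS if cond.server != PUBLIC_DNS else None


def head_node_policy(cond: ResolveFailed) -> Optional[str]:
    # The head node must never use a public DNS server.
    return None
```

At startup a worker would call `register_dns_handler(worker_policy)` and the head node `register_dns_handler(head_node_policy)`; `resolve` itself never changes.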
That does help. Network stuff is a use case I could see.
It seems like with a restart system you could do something like this in a generic, reusable library; correct me if I'm wrong. On network failure, register a callback with some OS-level network activity watcher, and once the network has resumed working, continue execution as normal.
You could do that, except that you would register the handler before the network code executes. Then, on failure, the network code would run the handler that waits until activity resumes, then restarts.
I hope all of that helps.