Totally agree. I used to work on a load testing product that spent many, many dev hours trying to achieve a high degree of fidelity when replaying web recordings at the HTTP, and sometimes even the socket, level of emulation. It was extremely tricky. We relied on a lot of regex matching mechanisms and kept a regression bucket of thousands of example HTTP traffic recordings to avoid breaking cookies, headers, POST data, and query strings, to name a few things.
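For the curious, the correlation step amounted to something like this minimal sketch (all names, patterns, and values here are hypothetical, not the actual product's code): extract a dynamic value from the previous recorded response and splice it into the next replayed request.

    import re

    # Hypothetical recorded response from a capture; the session value is
    # dynamic and has to be re-extracted on every replay.
    recorded_response = '<input type="hidden" name="SESSIONID" value="abc123xyz">'

    # One of many maintained patterns: pull the live session value
    # out of the previous response.
    SESSION_RE = re.compile(r'name="SESSIONID"\s+value="([^"]+)"')

    def correlate(prev_response: str, next_request: str) -> str:
        """Swap the stale recorded session value in the next request
        for the live one pulled from the previous response."""
        m = SESSION_RE.search(prev_response)
        if m is None:
            return next_request  # pattern drift: the site changed under us
        return re.sub(r'SESSIONID=[^&\s]+', f'SESSIONID={m.group(1)}', next_request)

    print(correlate(recorded_response, 'GET /page?SESSIONID=stale999 HTTP/1.1'))
    # GET /page?SESSIONID=abc123xyz HTTP/1.1

Multiply that by every cookie, hidden field, and query parameter a site invents, and you see where the dev hours went.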
In the early days, the developer abuses of ASP.NET ViewState payloads were an absolute nightmare to deal with. I used to half-joke that I could speak HTTP after staring at so much raw traffic, watching five page loads generate 100+ requests with dependencies on one another.
Interestingly, there was also a class of client-server bugs that only became obvious in recordings (e.g., multiple repeated HTTP HEAD requests checking whether a resource existed). Each object or library dev clearly had no idea that the function triggered just before had also wanted to check whether that resource exists. The result was a huge number of redundant, unnecessary calls, because nobody coordinated and optimized at this level.
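Once you spot it in a recording, the fix is usually trivial: memoize the existence check so uncoordinated callers share a single probe. A rough sketch, assuming Python's requests library (the function name and URL are made up for illustration):

    import functools
    import requests

    # Memoized existence check: however many uncoordinated libraries ask,
    # each URL gets probed with HEAD at most once per process.
    @functools.lru_cache(maxsize=None)
    def resource_exists(url: str) -> bool:
        return requests.head(url, timeout=5).status_code == 200

    resource_exists("https://example.com/asset.js")  # one real HEAD request
    resource_exists("https://example.com/asset.js")  # answered from the cache

The hard part was never the code; it was that no single team owned the traffic as a whole, so nobody ever saw the duplicates until a recording put them side by side.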