>it’s an overall performance win. Speaking as someone who has benchmarked the sh...

marginalia_nu · on Jan 11, 2023

While I agree Java is almost never the fastest, I'll still object it's extremely hard to meaningfully benchmark these things. It's hard to benchmark even within the same language.

There are so much hidden state in the JVM. Like are you running out of TLAB space? How aggressively has the code been optimized? Does JVM elide any object creations? It may be tempting to try to compensate for these things, to isolate the benchmark from the statefulness of the JVM, but that is also isolating the results from any sort of practical applicability because real world code exists in the context of a stateful JVM.

It's also difficult to generalize benchmarks of allocation costs alone because it's rare for code to only be allocating stuff and this isn't something anyone is optimizing for. The GC will choke if you don't give it some breathing room. You'll get resource contention that realistically wouldn't be there in real-world scenarios because real code doesn't (typically) just allocate objects for minutes on end.

lowbloodsugar · on Jan 11, 2023

>There are so much hidden state in the JVM.

Indeed. GB of state.

To the OPs statement:

OP>Using something like a reference count adds contention to this high-concurrency data structure, slowing it down.

So instead we'll have an engine that walks the DAG all the time; that trashes the CPU caches; in a language where there are no nested structures or stack allocations, so that what might be a single allocation in C or C++ or Rust (if an allocation happened at all) ends up being 28 for a single API call. Then you've either got to traverse the DAG completely every cycle, or you've got to write somewhere else that the object has changed and that thing has the same (or worse) access cost as a reference count. I know that there are very clever people working on very clever algorithms, but at this point, most of these algorithms have crossed into optimizations that work great 99% of the time, and it is the 1% that brings your app down in production.

And I'm not even complaining: if I have to architect my application so that it submits to the rules of the GC algo just so I can use Java, well that's often a fair trade. But I'm not pretending it's free.

OP>Threads have to fight over updating that counter,

I've done a lot of work that involved exactly this, and measured it, and they just don't: most of the time they just don't because it is incredibly unlikely for two threads to be hitting the same object at the same time. Some languages (Swift) even identify that a refcount doesn't need to be taken at all. If you do have a "central" object with such contention then you switch to a different locking model.

winrid · on Jan 11, 2023

Unless you can afford the AMG Black Mercedes (Graal Native Enterprise lol)