I looked into this a bit a few years back and found it quite interesting. Despite the name "tiny pointers", I'd say it's closer to an open-addressing hash map. You have a specific key, and you can "allocate" an entry in the table, which gives you back a "pointer". Later you use the original key and the pointer together to determine the index of the entry. There's also a slight chance that the allocation will fail (similar to a hash collision in a hash map). The trick is that two different keys can end up with the exact same pointer value (because you always dereference with the key), which is what makes the pointers so compact.
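For anyone who just wants the shape of the API, here's a toy sketch of the interface as I understand it. All names are mine, and the naive linear-probing dereference table below stands in for the paper's much cleverer load-balancing scheme:

    import hashlib

    class DereferenceTable:
        """Toy dereference table: allocate(key) hands back a small integer
        (the "tiny pointer"); dereference(key, ptr) recomputes the slot.
        This is naive linear probing, not the paper's actual scheme."""

        def __init__(self, size, max_ptr=256):
            self.size = size
            self.max_ptr = max_ptr        # tiny pointers fit in log2(max_ptr) bits
            self.used = [False] * size

        def _slot(self, key, ptr):
            # The slot is a function of (key, ptr), so two different keys can
            # share the same tiny pointer value yet land in different slots.
            h = int(hashlib.blake2b(f"{key}/{ptr}".encode()).hexdigest(), 16)
            return h % self.size

        def allocate(self, key):
            for ptr in range(self.max_ptr):
                s = self._slot(key, ptr)
                if not self.used[s]:
                    self.used[s] = True
                    return ptr            # the "tiny pointer"
            return None                   # allocation failure

        def dereference(self, key, ptr):
            return self._slot(key, ptr)

        def free(self, key, ptr):
            self.used[self._slot(key, ptr)] = False

    table = DereferenceTable(1024)
    p = table.allocate("user:42")
    slot = table.dereference("user:42", p)   # index where the caller stores its value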
I was struggling a bit to come up with good use cases for it. Their examples are all around combining them with existing data structures and they show that the space complexity is smaller, but it wasn't completely clear to me how feasible this would actually be in practice.
I read the paper a few years ago and I agree that for such an incredible algorithmic improvement, it's not trivial to find a use case, since you still need to maintain a separate (albeit algorithmically insignificant) lookup table. When I read the paper I (mistakenly) hoped it could be used for indexing on-disk structures without ever hitting the disk for things like B-tree internal nodes. To get its sweet, sweet algorithmic complexity, it relies (as I recall) on the age-old trick of rebuilding the structure whenever the size doubles, which makes it much less efficient in practice than it sounds for most use cases. I suppose one good use case might be compressing database indexes, where you need to maintain a separate structure anyway and the space savings can be worth it.
I didn't invest a lot of time in it because when these papers show no practical applications I always assume the constant factors are too high, especially when no specific values for those constants are presented. It's still somewhere in the back of my mind to implement it one day and figure out the nitty-gritty details.
Back-of-the-envelope calculations from the abstract: an 8-bit tiny pointer would be sufficient to reference an array of size 2^2^2^3 = 2^256 ≈ 1e77 (roughly the number of atoms in the visible universe) at fullness 31/32 ≈ 97%, or an array of size 65536 at fullness 63/64. For 16-bit tiny pointers there's no point in going bigger than the universe; fullness goes to 99.98%. Like you, I'd love to see such an example worked out.
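If I'm reading the abstract's bound the right way (roughly log2 log2 log2 n bits for the universe size plus log2 of 1/(1 - fullness) for the load factor, ignoring the constant term), the arithmetic works out like this:

    from math import log2

    def tiny_pointer_bits(n, fullness):
        # Rough reading of the abstract's bound: log log log n bits for the
        # array size plus log(1/delta) bits for load factor 1 - delta.
        # The paper's O(1) terms are ignored here.
        delta = 1 - fullness
        return log2(log2(log2(n))) + log2(1 / delta)

    print(tiny_pointer_bits(2**256, 31/32))       # 8.0  -> size 2^2^2^3, ~97% full
    print(tiny_pointer_bits(2**16, 63/64))        # 8.0  -> size 65536
    print(tiny_pointer_bits(2**256, 1 - 2**-13))  # 16.0 -> fullness ~99.988%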
That paper was on the front page two days ago; it's the reason this paper is on the front page today. And while it is fantastic work, it has no obvious breakthrough application, as was discussed: https://news.ycombinator.com/item?id=43004955
Yeah, I posted about that algorithm on the Nim forum, and someone used DeepSeek to create a Nim version [1]. The constant-factor overhead seems quite large, presuming DeepSeek implemented it correctly. There are some obvious easy perf tweaks, but it still seems to have high overhead.
It isn't hard to make a data structure that indexes into itself. BDDs, for example, are often coded this way. I did an admittedly poor job of it at https://taeric.github.io/trading-with-bdds.html, but I think it's enough to see the idea.
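The core of it is just a node table where "pointers" are indices into the same array, with hash-consing so equal nodes share an index. A generic sketch (not the code from that post):

    class BDD:
        """Minimal BDD where node "pointers" are indices into self.nodes
        rather than machine pointers. Terminals sit at indices 0 and 1."""

        def __init__(self):
            self.nodes = [("T0",), ("T1",)]  # index 0 = FALSE, index 1 = TRUE
            self.unique = {}                  # hash-consing: (var, lo, hi) -> index

        def mk(self, var, lo, hi):
            if lo == hi:                      # redundant test -> skip the node
                return lo
            key = (var, lo, hi)
            if key not in self.unique:
                self.unique[key] = len(self.nodes)
                self.nodes.append(key)        # children stored as plain indices
            return self.unique[key]

        def eval(self, idx, assignment):
            while idx > 1:
                var, lo, hi = self.nodes[idx]
                idx = hi if assignment[var] else lo
            return idx == 1

    bdd = BDD()
    x_and_y = bdd.mk("x", 0, bdd.mk("y", 0, 1))       # x AND y, children by index
    print(bdd.eval(x_and_y, {"x": True, "y": True}))  # True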