
At least IPv6 addresses aren't "middle-endian" like UUIDs can be.



This is not a UUID problem; it is a Microsoft problem from the 90s. Just don't use Microsoft software (/s) and use big-endian as specified by the standard.
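For anyone who hasn't seen the two layouts side by side, here is a minimal sketch using Python's standard `uuid` module. The UUID value is an arbitrary sample chosen so the byte swaps are easy to spot; `bytes` is the big-endian layout and `bytes_le` is the mixed-endian GUID layout.

```python
import uuid

# Arbitrary sample value, picked so each byte is easy to track.
u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

# Big-endian layout: every field is stored most-significant byte first.
print(u.bytes.hex())     # 00112233445566778899aabbccddeeff

# Mixed-endian GUID layout: the first three fields are byte-swapped,
# the last eight bytes are left untouched.
print(u.bytes_le.hex())  # 33221100554477668899aabbccddeeff

# Round trip: the same 16 bytes mean different UUIDs depending on
# which layout the reader assumes.
assert uuid.UUID(bytes_le=u.bytes_le) == u
assert uuid.UUID(bytes=u.bytes_le) != u
```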


It is a general UUID specification problem. The dashes represent a struct breakdown, and that struct has internal endianness issues. It is also laid out in a way that made sense for v1 but doesn't for any later version. Why is the version number in the middle? Why is the relatively static "Machine ID" at the end? If you want to cluster your sort by machine, you have to sort things "backwards". That's what SQL Server did, and it's why you might blame this on Microsoft: it tried to avoid clustered index churn by assuming GUIDs were inserted by static Machine IDs. That assumption broke hard in later versions of UUID, when "Machine ID" just became "random entropy". But the idea of sorting like that in the first place wasn't wrong for v1; there was good sense to it. Just like it makes sense to sort v7 UUIDs by timestamp to get mostly "log ordered" clustered indexes. At least there the sort data is all up front, but it crosses "struct field" boundaries if you are still relying on the v1 chunking.
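A rough sketch of the struct layout point, again with Python's `uuid` module. All values are made up, `node_first_key` is a hypothetical helper, and this is a simplified "machine first, then time" ordering, not SQL Server's actual comparison routine. The v7-style value is assembled by hand from a made-up timestamp rather than generated by a library.

```python
import uuid

# Made-up v1-style UUIDs: fields are (time_low, time_mid, time_hi_version,
# clock_seq_hi_variant, clock_seq_low, node). Two hypothetical machines.
ids = [
    uuid.UUID(fields=(0x00000003, 0, 0x1000, 0x80, 0, 0x0000000000AA)),
    uuid.UUID(fields=(0x00000001, 0, 0x1000, 0x80, 0, 0x0000000000BB)),
    uuid.UUID(fields=(0x00000002, 0, 0x1000, 0x80, 0, 0x0000000000AA)),
]

# The version nibble sits in the middle: first hex digit of the third group.
print(str(ids[0]))      # 00000003-0000-1000-8000-0000000000aa
print(str(ids[0])[14])  # 1

def node_first_key(u):
    # Hypothetical key: group by the trailing node field, then by v1 time.
    return (u.node, u.time)

# Plain big-endian order interleaves the two machines...
print([hex(u.node) for u in sorted(ids)])                      # ['0xbb', '0xaa', '0xaa']
# ...while the "backwards" order clusters each machine together.
print([hex(u.node) for u in sorted(ids, key=node_first_key)])  # ['0xaa', '0xaa', '0xbb']

# v7-style value built by hand: 48-bit timestamp up front, then the
# version nibble, then the variant bits. The timestamp is made up.
ts_ms = 0x0123456789AB
v7_like = uuid.UUID(int=(ts_ms << 80) | (0x7 << 76) | (0b10 << 62))

# The timestamp spills across two of the old v1 fields.
print(hex(v7_like.time_low), hex(v7_like.time_mid))  # 0x1234567 0x89ab
```

The last line is the "crosses struct field boundaries" part: the 48-bit timestamp fills all of `time_low` and all of `time_mid`.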

(Ultimately UUID v1 was full of mistakes that we all will keep paying for.)

For the record, it is Java that has the worst possible UUID sort algorithm, sorting parts of the UUID as signed numbers: https://devblogs.microsoft.com/oldnewthing/20190913-00/?p=10...

(Friends don't let friends use Java. /s)
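To make the signed-comparison quirk concrete, here is a small Python emulation. It assumes the behavior is "compare the high 64 bits, then the low 64 bits, each as a signed integer"; `to_signed64` and `java_like_key` are hypothetical helper names, and the two UUIDs are arbitrary samples on either side of the sign bit.

```python
import uuid

def to_signed64(x):
    # Reinterpret an unsigned 64-bit value as two's-complement signed.
    return x - (1 << 64) if x >= (1 << 63) else x

def java_like_key(u):
    # Hypothetical emulation: split into two 64-bit halves and compare
    # each half as a signed number, high half first.
    hi = u.int >> 64
    lo = u.int & ((1 << 64) - 1)
    return (to_signed64(hi), to_signed64(lo))

a = uuid.UUID("7fffffff-ffff-4fff-8fff-ffffffffffff")
b = uuid.UUID("80000000-0000-4000-8000-000000000000")

# Unsigned (byte-wise) order: a comes before b.
print(sorted([a, b]) == [a, b])                     # True

# Signed order: b's high half has the sign bit set, so it is "negative"
# and jumps in front of a.
print(sorted([a, b], key=java_like_key) == [b, a])  # True
```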


UUID sort orders get wild indeed. I have been down that rabbit hole.



