> It is amazing that big endian is almost dead. I wish the same applied to writt...

idoubtit · 2025-09-01T20:43:04 1756759384

> I also wish that the world would settle on a sane date-time format like the ISO 8601

IIRC, in most countries the native format is D-M-Y (with varying separators), but some Asian countries use Y-M-D. Since those formats are easy to distinguish, that's no problem. That's why Y-M-D is spreading in Europe for official or technical documents.

There's mainly one country which messes things up...

tavavex · 2025-09-01T22:29:15 1756765755

YYYY-MM-DD is also the official date format in Canada, though it's not officially enforced, so outside of government documents you end up seeing a bit of all three formats all over the place. I've always used ISO 8601 and no one bats an eye, and it's convenient since YYYY-DD-MM isn't really a thing, so it can't be confused for anything else, unlike the other two formats.

zahlman · 2025-09-01T23:07:43 1756768063

YMD has caught on, I think, because it allows for the numbers to be "in order" (not mixed-endian) while still having the month before the day which matches the practice for speaking dates in (at least) the US and Canada.

adgjlsfhk1 · 2025-09-02T00:48:03 1756774083

The primary reason for YMD is that DDMMYYYY is ambiguous with MMDDYYYY

christophilus · 2025-09-02T01:09:29 1756775369

It is also sortable, which I think is the real advantage.

globular-toast · 2025-09-02T08:42:17 1756802537

I used to think this was really important, but what's the use case here?

If I'm writing a document for human consumption then why would I expect the dates to be sortable by a naive string sorting algorithm?

On the other hand, if it's data for computer consumption then just skip the complicated serialisation completely and dump the Unix timestamp as a decimal. Any modern data format would include the ability to label that as a timestamp data type. If you really want to be able to "read" the data file then just include another column with a human-formatted timestamp, but I can't imagine why in 2025 I would be manually reading through a data file like some ancient mathematician using a printed table of logarithms.

Majestic121 · 2025-09-02T09:10:47 1756804247

> If I'm writing a document for human consumption then why would I expect the dates to be sortable by a naive string sorting algorithm?

If you're naming a document for human consumption, having the files sorted by date easily without relying on modification date (which is changed by fixing a typo/etc...) is pretty neat

bombcar · 2025-09-02T10:59:59 1756810799

This is exactly it - file name is easy to control and sort on; creates date and modified date are (for most users) random and uncontrolled.

globular-toast · 2025-09-03T05:55:11 1756878911

So you can't sort by name, author etc? One sort key? What year is it?!

privatelypublic · 2025-09-02T01:49:37 1756777777

As ling as you pad to two characters!

rocqua · 2025-09-02T05:52:13 1756792333

8601, when used fully according to spec sucks. It makes today 20250902. It doesn't have seperators. And for adding a time it gets even less readable.

Its a serialization and machine communication format. And that makes me sad. Because YYYY-MM-DD is a great format, without a good name.

em500 · 2025-09-02T07:49:06 1756799346

YYYY-MM-DD is ISO8601 extended format, YYYYMMDD is ISO8601 basic format (section 5.2.1.1 of ISO8601:2000(E)[1]). Both are fully according to spec, and neither format takes precedence over the other.

[1] https://www.pvv.org/~nsaa/8601v2000.pdf

account42 · 2025-09-02T08:08:59 1756800539

It does have a good name: RFC 3339. Unlike the ISO standard, that one mandates the "-" separators. Meanwhile it lets you substitute a space for the ugly "T" separator between date and time:

> NOTE: ISO 8601 defines date and time separated by "T". Applications using this syntax may choose, for the sake of readability, to specify a full-date and full-time separated by (say) a space character.

Y_Y · 2025-09-02T10:51:48 1756810308

I always like the compromise of the M/D/M system popularised by the British documentary series Look Around You, e.g. "January the fourth of March".

christophilus · 2025-09-02T01:08:21 1756775301

I live in that country, and I am constantly messing up date forms. My brain always goes yyyy-mm-dd. If I write it out, September 1st, 2025, I get it in the “right” order. But otherwise, especially if I’m tired, it’s always in a sortable format.

pavon · 2025-09-02T04:41:05 1756788065

There are a lot of computations where 256 is too small of a range but 65536 is overkill. When designers of early computers were working out how many digits of precision their calculations needed to have for their intended purpose 12 bits commonly ended up being a sweet spot.

When your RAM is vacuum tubes or magnetic core memory, you don't want 25% of it to go unused, just to round your word size up a power of two.

skeezyboy · 2025-09-02T15:44:58 1756827898

> There are a lot of computations where 256 is too small of a range but 65536 is overkill

wasnt this more to do with cost? they could do arbitrary precision code even back then. its not like they were only calculating numbers less than 65537, ignoring anything larger

jcranmer · 2025-09-01T21:25:38 1756761938

I don't know that 7-bit bytes were ever used. Computer word sizes have historically been multiples of 6 or 8 bits, and while I can't say as to why particular values were chosen, I would hypothesize that multiples of 6 and 8 work well for representation in octal and hexadecimal respectively. For many of these early machines, sub-word addressability wasn't really a thing, so the question of 'byte' is somewhat academic.

For the representation of text of an alphabetic language, you need to hit 6 bits if your script doesn't have case and 7 bits if it does have case. ASCII ended up encoding English into 7 bits and EBCDIC chose 8 bits (as it's based on a binary-coded decimal scheme which packs a decimal digit into 4 bits). Early machines did choose to use the unused high bit of an ASCII character stored in 8 bits as a parity bit, but most machines have instead opted to extend the character repertoire in a variety of incompatible ways, which eventually led to Unicode.

cardiffspaceman · 2025-09-01T22:30:44 1756765844

On the DEC-10 the word size is 36 bits. There was (an option to include) a special set of instructions to enable any given byte size with bytes packed. Five 7-bit bytes per word, for example, with a wasted bit in each word.

I wouldn’t be surprised if other machines had something like this in hardware.

Affric · 2025-09-02T03:02:33 1756782153

Could you use the extra bit for parity?

cardiffspaceman · 2025-09-02T20:29:18 1756844958

I don’t remember if there are instructions for putting any value of parity as such into the spare bit.

int_19h · 2025-09-02T01:43:50 1756777430

> For the representation of text of an alphabetic language, you need to hit 6 bits if your script doesn't have case

Only if you assume a 1:1 mapping. But e.g. the original Baudot code was 5-bit, with codes reserved to switch between letters and "everything else". When ASCII was designed, some people wanted to keep the same arrangement.

dboreham · 2025-09-02T04:00:32 1756785632

Quick note that parity was never used in "characters stored". It was only ever used in transmission, and checked/removed by hardware[1].

[1] Yes, I remember you could bit-bang a UART in software, but still the parity bit didn't escape the serial decoding routine.

goku12 · 2025-09-02T03:19:56 1756783196

I wasn't asking about word sizes in particular, and had ASCII in mind. Nevertheless, your answer is in the right direction.

creshal · 2025-09-02T07:27:38 1756798058

> both of which would reverse if my first wish is also granted

But why? The brilliance of 8601/3339 is that string sorting is also correct datetime sorting.

goku12 · 2025-09-02T17:52:21 1756835541

> But why?

To get the little-endian ordering. The place values of digits increase from left to right - in the same direction as how we write literature (assuming LTR scripts), allowing us to do arithmetic operations (addition, multiplication, etc) in the same direction.

> The brilliance of 8601/3339 is that string sorting is also correct datetime sorting.

I hadn't thought about that. But it does reveal something interesting. In literature, we assign the highest significance to the left-most (first) letter - in the direction opposite to how we write. This needs a bit more contemplation.

1718627440 · 2025-09-04T14:42:57 1756996977

> we assign the highest significance to the left-most (first) letter

Yes, we do that with everything, which is why little-endian numbers would be really inconsistent for humans.

formerly_proven · 2025-09-01T20:52:48 1756759968

Computers never used 7-bit bytes similarly to how 5-bit bytes were uncommon, but both 6-bit and 8-bit bytes were common in their respective eras.

goku12 · 2025-09-02T03:30:15 1756783815

I was asking about ASCII encoding and not the word size. But this information is also useful. So apparently, people were representing both numbers and script codes (EBCDIC in particular) in packed decimal or octal at times. The standardization on 8 bits and adoption of raw binary representation seems to have come later.

formerly_proven · 2025-09-02T11:40:21 1756813221

Because character encodings were primarily designed around transmission over serial lines.

goku12 · 2025-09-02T17:36:09 1756834569

Agreed. But how does that affect how encoding are designed? I mean packed decimal vs octal vs full binary, etc?

blahedo · 2025-09-01T20:54:26 1756760066

I believe that 10- and 12-bit bytes were also attested in the early days. As for "why": the tradeoffs are different when you're at the scale that any computer was at in the 70s (and 60s), and while I can't speak to the specific reasons for such a choice, I do know that nobody was worrying about scaling up to billions of memory locations, and also using particular bit combinations to signal "special" values was a lot more common in older systems, so I imagine both were at play.

globular-toast · 2025-09-02T07:03:15 1756796595

In Britain the standard way to write a date has always been, e.g "12th March 2023” or 12/3/2023 for short. Don't think there's a standard for where to put the time, though, I can imagine it both before and after.

Doing numbers little-endian does make more sense. It's weird that we switch to RTL when doing arithmetic. Amusingly the Wikipedia page for Hindu-Arabic numeral system claims that their RTL scripts switch to LTR for numbers. Nope... the inventors of our numeral system used little-endian and we forgot to reverse it for our LTR scripts...

Edit: I had to pull out Knuth here (vol. 2). So apparently the original Hindu scripts were LTR, like Latin, and Arabic is RTL. According to Knuth the earliest known Hindu manuscripts have the numbers "backwards", meaning most significant digit at the right, but soon switched to most significant at the left. So I read that as starting in little-endian but switching to big-endian.

These were later translated to Arabic (RTL), but the order of writing numbers remained the same, so became little-endian ("backwards").

Later still the numerals were introduced into Latin but, again, the order remained the same, so becoming big-endian again.

goku12 · 2025-09-02T17:34:42 1756834482

We in India use the same system for dates as you described, for obvious reasons. But I really don't like the pattern of switching directions multiple times when reading a date and time.

And as for numbers, perhaps it isn't too late to set it right once and for all. The French did that with the SI system after all.

> So apparently the original Hindu scripts were LTR

I can confirm. All Indian scripts are LTR (Though there are quite a few of them. I'm not aware of any exceptions). All of them seem to have evolved from an ancient and now extinct script named Brahmi. That one was LTR. It's unlikely to have switched direction any time during subsequent evolution into modern scripts.

1718627440 · 2025-09-04T14:38:33 1756996713

> I also wish that the world would settle on a sane date-time format like the ISO 8601 or RFC 3339 (both of which would reverse if my first wish is also granted).

YYYY-MM-DD to me always feels like a timestamp, while when I want to write a date, I think of a name, (for me DD. MM. YYYY).

vrighter · 2025-09-03T06:36:28 1756881388

7 bits was chosen to reduce transmission costs, not storage costs, because you send 12.5% less data. Also, because computers usually worked on 8-bit bytes, the 8th bit could be used as a parity bit, where extra reliability was needed.