
I think you are thinking about this in the wrong way.

Sure, Lat/Lon is the common presentation format for this particular coordinate system.

Right now in Sweden the time is 10:41 (it would be great if it were a couple of hours later, so I could say it's 14:41 to demonstrate the 24-hour time format). Yet, in software, I would represent that as a time in UTC. Only when presenting to the user would I convert it to the user's time zone.
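
A minimal sketch of that pattern in Python (the zone name and the formatting are just illustrative):

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo

    now_utc = datetime.now(timezone.utc)                       # internal representation: UTC
    local = now_utc.astimezone(ZoneInfo("Europe/Stockholm"))   # presentation only
    print(local.strftime("%H:%M"))                             # e.g. "10:41"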

My last name contains the letter "ö". In software, I would use a Unicode string internally, then when writing out I would encode it to UTF-8. (20 years ago, I would have used an older character encoding called ISO/IEC 8859-1 or something like that, but you get my point).
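
For example (a sketch, with a made-up name):

    name = "Söderström"               # internal: an abstract sequence of code points
    data = name.encode("utf-8")       # at the I/O boundary: concrete bytes
    assert data.decode("utf-8") == name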

For some damn reason I still don't understand, the decimal separator in Sweden is the comma and not the period. Still, I would represent numbers internally as an integer or maybe a float, and only when printing to the user would I convert that to "123,4" (123.4) or something like that.
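
The same pattern with Python's standard locale module (a sketch; it assumes the sv_SE.UTF-8 locale is installed on the system):

    import locale

    locale.setlocale(locale.LC_NUMERIC, "sv_SE.UTF-8")
    value = 123.4                                   # internal representation: plain float
    print(locale.format_string("%.1f", value))      # presentation: "123,4"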

In Sweden, WGS84 is not the only common coordinate system. There are many others: SWEREF and SWEREF TM, for example. Yes, internally, depending on the use case, I would probably use a WGS84-based representation as the reference, then convert when presenting to the user...
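
A sketch of that conversion using the third-party pyproj library (EPSG:4326 is WGS84 and EPSG:3006 is SWEREF 99 TM; the coordinates are roughly Stockholm):

    from pyproj import Transformer

    to_sweref = Transformer.from_crs("EPSG:4326", "EPSG:3006", always_xy=True)
    lat, lon = 59.3293, 18.0686                         # internal: WGS84
    easting, northing = to_sweref.transform(lon, lat)   # presentation: SWEREF 99 TM
    print(f"N {northing:.0f}, E {easting:.0f}")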

This is how I think about coordinates.




> For some damn reason I still don't understand, the decimal separator in Sweden is the comma and not the period.

I'm curious to know the history of why (some?) Euro countries went with the comma and the Anglo world went with the period. Some details:

> In France, the full stop was already in use in printing to make Roman numerals more readable, so the comma was chosen.[13] Many other countries, such as Italy, also chose to use the comma to mark the decimal units position.[13] It has been made standard by the ISO for international blueprints.[14] However, English-speaking countries took the comma to separate sequences of three digits. In some countries, a raised dot or dash (upper comma) may be used for grouping or decimal separator; this is particularly common in handwriting.

* https://en.wikipedia.org/wiki/Decimal_separator

ISO seems to say use a comma:

* https://en.wikipedia.org/wiki/ISO/IEC_80000#Part_2:_Mathemat...


(Fellow Swede here). I really want to agree with you, but in practice, the difference between "internal" and "external" is not always clear. For example, if I were producing a printed document in Swedish, I would definitely use decimal commas. (By the way, the time now is not 10:54, it's 10.54, if you are to believe "Svenska Skrivregler" :-) )

However, when printing output in a terminal window, a comma means that I can't copy-paste that number into the interactive Python session in another window. Coordinates as lat, lon mean I can't copy-paste them into PostGIS. Makes one long for those Lisp systems of legend, where the system kept track of where things on screen came from, so if the program printed a coordinate and you copied it from the terminal, it was copied as a coordinate, not as text...


So this is sorta above my pay grade but I think this is a tangential issue.

Even if you are reading exotic data in some parochial format, internally you probably wanna use one consistent format. If you have some massive data sets and don't wanna convert on read-in, there are ways around that (you can set up a memoization system to convert and cache on an as-needed basis). This internal format should be intuitive and convenient. I proposed a South/East format, but you're free to choose whatever you want here. The point is that Lon/Lat is likely never a good choice at this juncture. It still has issues and you can generally do better.
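
A rough sketch of that convert-and-cache idea in Python (the record id, the raw lat/lon data, and the toy "South/East" conversion are all made up for illustration):

    from functools import lru_cache

    RAW_DATA = {42: (59.3293, 18.0686)}    # data set kept in its original lat/lon form

    @lru_cache(maxsize=None)
    def internal_coord(record_id):
        lat, lon = RAW_DATA[record_id]
        return (-lat, lon)                 # toy stand-in for the real internal conversion

    print(internal_coord(42))              # converted on first use, served from cache after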

At the interface, the default we've arrived at as a society is Lat/Lon WGS84 :) The merits here are irrelevant. If you want to support other I/O formats then you may. That really depends on your use cases - but they shouldn't map directly to whatever format you've decided on internally. And even if they did, you probably shouldn't be using Lon/Lat internally anyway.


What you use internally really depends on the application requirements. Different projections have different properties. Depending on the area you need to cover and the questions you need to answer, you may need to use multiple projections, and to meet performance requirements you may need to store all of these projections so you don't always have to convert on the fly.


> In software, I would use a Unicode string internally, then when writing out I would encode it to UTF-8.

I don't understand what distinction you're drawing here? UTF-8 is Unicode. In what way would you be modifying it at the presentation layer? (Unless you're dealing with true UI code, and are saying "I would map the characters to font glyphs according to the UTF-8 standard".)

I know UTF-8 isn't the only way of encoding Unicode codepoints, for what it's worth. I'm just struggling to see how you would be using just 'Unicode', as opposed to a particular encoding, at the storage layer. It's still just bits and bytes.


I think it would be more accurate to say that UTF-8 is a Unicode Transformation Format, which, as the name suggests, is logically distinct from Unicode itself. There are good reasons to store and process Unicode in UTF-8 format internally in many cases, but UTF-32 / UCS-4 would probably take over for internal processing if it weren't for memory usage and efficiency issues.
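
A quick illustration of that memory trade-off in Python (the name is just an example string):

    s = "Söderström"                  # 10 code points
    print(len(s.encode("utf-8")))     # 12 bytes: each "ö" takes 2 bytes, the rest 1 each
    print(len(s.encode("utf-32")))    # 44 bytes: 4 per code point plus a 4-byte BOM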



