Using Lidar to map tree shadows (tedpiotrowski.svbtle.com)
324 points by tppiotrowski on July 9, 2023 | 41 comments


This is ridiculously cool, demo is slick and fast! I work pretty extensively with tiling pipelines and GeoTiffs right now as I'm building out a mass repository and platform for historical map and aerial analysis (https://pastmaps.com - still very early so don't judge it too hard please)

As part of this work, I similarly hit problems with using the raw GeoTiff files as my source and found that I was able to build some custom tiling hooks into MapLibre coupled with http range queries on the static files hosted in S3 to bypass the need for tiling. It does push the compute to the clients but I've found it's actually pretty fast even on older mobile devices.
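The mechanics of the ranged-read trick are simple enough to sketch. Here is a minimal Python illustration of the idea (the real client-side version uses geotiff.js, and the byte offsets come from the TIFF's internal tile index; the URL is a placeholder):

```python
from urllib.request import Request

def tile_range_request(url: str, offset: int, length: int) -> Request:
    # Fetch just one tile's worth of bytes out of a static GeoTIFF --
    # no tile server needed, only an HTTP server that honors Range.
    req = Request(url)
    req.add_header("Range", f"bytes={offset}-{offset + length - 1}")
    return req

# e.g. a 512-byte tile starting at byte offset 1024 of the file
req = tile_range_request("https://example.com/dem.tif", 1024, 512)
```

S3, R2, and most static hosts honor the `Range` header out of the box, which is what makes this work without any server-side code.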

If this MapLibre GeoTiff source support is at all of interest, I'm happy to share some basic code or even open source some of that work. Here I was thinking I was the only weird dude on the internet messing with this stuff :D


Ah. I hadn't even thought that I could do byte offsets on the fly. In general I would err on the side of over-sharing: until ChatGPT came along, I struggled to understand what the tools and options for generating tiles even were, as there wasn't much data I could find online.

One thing specific to my case: the LiDAR GeoTiffs are in imperial feet with 32-bit floating-point precision. If you take the elevation range from sea level to Everest in meters (8848) and pack it into an int16, you can get .2 meter precision. That's plenty for ShadeMap, so converting from float32 to int16 should save half the cloud storage space in theory, and more when taking PNG compression into account.
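Back-of-the-envelope, the packing looks something like this (a sketch with hypothetical helper names, not ShadeMap's actual pipeline; uint16 shown, which gives ~0.135 m steps over the 0-8848 m range):

```python
FT_TO_M = 0.3048
MAX_ELEV_M = 8848.0
SCALE = 65535 / MAX_ELEV_M  # ~7.4 quantization steps per meter

def encode(elev_ft: float) -> int:
    """Quantize a float32 elevation in feet to a 16-bit integer (meters)."""
    m = max(0.0, min(MAX_ELEV_M, elev_ft * FT_TO_M))
    return round(m * SCALE)

def decode(q: int) -> float:
    return q / SCALE

# Mount Rainier, 14411 ft: round-trips to within half a quantization step
q = encode(14411.0)
err = abs(decode(q) - 14411.0 * FT_TO_M)
```

The maximum round-trip error is half a step (~0.07 m), which is where the sub-meter precision claim comes from.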


This code is rough, and that's an understatement, but here's a basic gist showing how I'm doing what I'm doing in case it helps you in any way: https://gist.github.com/craigsc/fdb867f8971ff5b4ae42de4e0d7c...

I'm similarly using R2 as my static hosting backend and it's been pretty fast and seamless

note that i'm using the geotiff.js and fast-png libraries for the heavy-lifting behind the range-queries and the client-side png encoding. why reinvent the wheel, right?


Your browser has a very powerful image decoder built in; offloading the PNG decoding to JavaScript instead is very resource-hungry.

Using MapLibre (or any map viewer), you can load blobs of image data out of a TIFF and use `Image` or `Canvas` to render the data onto a map.

It's even easier if the TIFFs are already cloud-optimized: they align 1-to-1 with map tiles and don't need to be rescaled, so you can just render the images straight onto the map. E.g. here is a viewer that loads WebPs out of a 15 GB TIFF and uses Canvas to render them onto a map [1]

Unless you are trying to layer all your maps together, you could also stop reprojecting them into WebMercator; or if your goal is to layer them, then storing them in WebMercator would save a ton of users' compute time.

There are a bunch of us who talk web mapping and imagery in the #maplibre and #imagery channels in OSMUS's Slack [2]

[1] https://blayne.chard.com/cogeotiff-web/index.html?view=cog&i...

[2] https://github.com/maplibre/maplibre-gl-js#getting-involved


Amazing comments and callouts, thank you! I actually tried to load the raw image data blobs as a layer into MapLibre but couldn't figure out a way to do it, and finally capitulated and did the "bad" move of re-encoding just to get the initial interactive map collection out the door for folks. It sounds like this is in fact possible and I just missed something. I'll take a look at the Image and Canvas sources, thanks!

Re the WebMercator reprojection - yea, it's gnarly that I'm doing it client-side, but it's exactly because I'm working towards the ability to layer the maps interactively on top of each other (as well as on various basemaps). My projection code is also only half-working at the moment, and it's where I'll be spending my time next week. I'm trying to avoid building pipelines to re-encode the GeoTiffs as long as I can, since there's 10+ TB of them in my backend, which is why you're seeing me do this client-side instead. This is a solo project, so I need to be really picky about where I spend my time so I can keep moving the ball forward.
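For anyone following along, the lon/lat-to-tile half of that client-side work boils down to the standard WebMercator (EPSG:3857) tiling math - a sketch, not anyone's actual code:

```python
import math

def lonlat_to_tile(lon: float, lat: float, z: int) -> tuple:
    # Standard XYZ / WebMercator tile indexing: x from longitude,
    # y from the Mercator-projected latitude, both scaled by 2^z.
    n = 1 << z
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y
```

The gnarly part is the inverse direction (which source pixels land in a given output tile), but it's the same projection run backwards.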

I'll join those 2 communities, thank you! It's been crazy hard to find folks who are deep in this stuff, so most of my learning has been through endless googling down deep dark corners of the web for the past 2 months.


Great points. Thank you for the links. The one trade-off here is that uncompressed blobs will require longer downloads than PNG, and I think the network transfer is usually slower than PNG decoding.

But maybe the sample gist takes a Tiff blob and encodes it to a PNG on the client and then maplibre decodes the PNG to canvas. That would be quite inefficient if that's what it's doing.


Those comments were aimed more at pastmaps.

For elevation data, we store our DEM/DSM in S3 as LERC [1] COGs. LERC has a WASM bundle which I think can be used in the browser. We found LERC COGs to be one of the most space-efficient ways of storing high-resolution DEM/DSM data [2]. If you wanted to, you could fetch LERC tiles directly out of a remote COG and use them directly for the terrain heights.

I am more focused on the storage/archiving/publishing of our LiDAR capture program [3] than on web-based visualizations of it, though, so I am unsure whether a LERC COG would even be better for you than a PNG TerrainRGB.

[1] https://www.npmjs.com/package/lerc

[2] https://github.com/linz/elevation/tree/master/docs/tiff-comp...

[3] https://linz.maps.arcgis.com/apps/MapSeries/index.html?appid...


WASM could surely be an improvement over JS, especially for big-data-ish/repetitive jobs, where client load might become the next wall once the cloud/server part is optimized, or when we try to run JS on cloud leaf nodes.


> coupled with http range queries on the static files hosted in S3 to bypass the need for tiling

Did you look into Cloud Optimized GeoTIFF format?

https://www.cogeo.org/

It is supported by OpenLayers: https://openlayers.org/en/latest/examples/cog.html

(I don't think that Maplibre or Leaflet have built-in support for it)


I did actually! For my particular source dataset it was far easier just to lean on the plain old GeoTiff format, since that's what my source data was already in, and my testing of client-side, on-the-fly tiling using range queries and fast-png for encoding came out close to par in performance, with zero increase in hosting costs and zero headache building custom pipelines to encode the cloud-optimized versions. Basically, I'm lazy. I'm sure COG is the "correct" answer for other use-cases.

I've also been digging pretty extensively into Protomaps, though, for some newer non-GeoTiff datasets I'm in the process of pulling in; in my opinion it's the future for this space - https://protomaps.com/


See also https://geoblaze-gsoc.vercel.app and the underlying libraries it uses, which also do range queries on GeoTIFFs.

I'm super interested in this space, including in helping financially support some projects. I already emailed Ted about this, but would be happy to chat to anyone doing this stuff. My email's in my profile.

(Luckily this is niche enough I'm not worried about my inbox blowing up....)


You're not "the weirdest" anymore. The two of you are still unique, though. Share and exchange, and try to support each other. It's all going to be good.

I need to learn more about where the issue with the GeoTIFF format lies. Perhaps a "pure conversion" pipeline infrastructure shared between your two projects could help.

If there are two of you, perhaps there are more hitting the same wall.


> This is ridiculously cool

I think so too. And it reminds me that I refactored Leaflet's core event-handling and SVG rendering code 8 years ago. It's good to know your open-source contributions play a part in such cool things.


> Radar clearly misses 90% of the shadows cast because it does not include vegetation. Radar only reflects off the ground, making objects such as trees and buildings invisible.

This doesn’t sound right to me. Certainly radar can, at certain bands, see through foliage (so-called FOPEN). I don’t have familiarity with radar seeing through buildings at the sorts of ranges and coverage rates that you’d want to use for ground mapping.

The article references the Shuttle Radar Topography Mission, which used C- and X-band radars - both should see returns from foliage and buildings.

Without digging more into it, my guesses as to why foliage and building shadows are absent from the radar data: 1) the resolution of the radar data is too low (tens of meters or more), 2) post-processing of multiple radar passes with different geometries may have removed them, 3) steep grazing angles from the radar don't generate much shadow to begin with.


SRTM's FAQ shows that the argument in the article is backwards:

    Did the radar sample the tops of trees or the ground level?

    The radar does not "see" through thick vegetation canopies. It probably penetrated a little way into some canopies, but in general it followed near the top of the canopy.

    Did the radar signal bounce off treetops, or topography, or some combination of both that will provide separate data sets (geodesists like myself care about topography, whereas scientists more interested in forestry care about the height of the canopy).

    Unfortunately, the wavelength used, 5.6 centimeters, didn't penetrate vegetation very well. That means, for moderate-heavy vegetation, we mapped near the canopy top. We did penetrate a little, as some studies comparing our technique with laser altimeters showed, but not to the ground. If the vegetation was sparse, or had no leaves, we might get a return from the ground. The Vegetation Canopy Lidar, scheduled to fly as part of the Earth Observing System, will have this capability, which may provide some interesting data-set comparisons.
https://www2.jpl.nasa.gov/srtm/faq.html


Good clarification. My understanding is that SRTM data did not include buildings or foliage (maybe they filtered out all data except the lowest elevation values?) but that's not "radar" in general.


> resolution of the radar data is too low

SRTMv3 is 30 m/px covering latitudes ±60°. The earliest release was 90 m/px. There was also a release that was 30 m within the United States and 90 m elsewhere.

> steep grazing angles from the radar not generating much shadow

Mountain shadows are actually a problem! Some releases contain voids where there was no radar return, particularly around the Himalayas. There's a body of publications on void filling the SRTM data, if you're interested.


If you do end up preprocessing the GeoTiff, and if you already have the pipeline to give terrain elevation to the user, I guess you could encode only the difference between lidar and radar in your tiles, in order to have just the tree data on top of your already-served terrain data. The objects you are encoding and the precision you need could fit in as few as 4 bits, with lots of zeros that could be compressed away? Just a brainstormy kind of comment.


This is a good idea. There are two popular encodings of elevation data into RGB tiles. Neither is optimal in size, because their value ranges need to accommodate bathymetric data (negative elevations for mapping the sea floor):

height = -10000 + ((R * 256 * 256 + G * 256 + B) * 0.1) [mapbox/maptiler]

height = (R * 256 + G + B / 256) - 32768 [mapzen terrarium]

If you only care about elevations above sea level (0-8848 meters), you can pack the data into just two bytes while maintaining .13 meter precision (Mapbox precision is .1):

height = (R * 256 + B) / (256 * 256) * 8848 [shademap]

This is the encoding I'm going to use. I've already trialed it and it saves space (I'm not sure about processing time).
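For concreteness, here are the three decoders side by side, plus a sketch of the inverse for the two-byte scheme (the encode function is my own illustration of the formula above, not anyone's shipped code):

```python
def decode_mapbox(r, g, b):
    return -10000 + (r * 256 * 256 + g * 256 + b) * 0.1

def decode_terrarium(r, g, b):
    return (r * 256 + g + b / 256) - 32768

def decode_shademap(r, b):
    return (r * 256 + b) / (256 * 256) * 8848

def encode_shademap(height_m):
    # 16 bits over 0..8848 m -> 8848 / 65536 ~= 0.135 m per step
    q = min(65535, max(0, round(height_m / 8848 * 65536)))
    return q >> 8, q & 0xFF  # (high byte, low byte)

r, b = encode_shademap(4392.5)             # roughly Mount Rainier
err = abs(decode_shademap(r, b) - 4392.5)  # within half a step
```

Dropping the -10000 m bathymetric offset is what frees up the third byte.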

The best encoding would be to store the minimum elevation for an entire tile in the header and then just store the delta between that minimum and the elevation for a given pixel. It would be the most space-efficient, but it would involve loading the whole tile into memory to find the minimum elevation (which is less efficient than streaming and encoding one pixel at a time).
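That min-plus-delta idea, sketched (hypothetical helper names, with an assumed fixed 0.1 m step):

```python
STEP = 0.1  # meters per delta unit

def encode_tile(heights):
    # Pass 1: find the tile minimum -- this is why the whole tile
    # has to sit in memory before any pixel can be written out.
    lo = min(heights)
    # Pass 2: store small non-negative deltas against that minimum.
    return lo, [round((h - lo) / STEP) for h in heights]

def decode_tile(lo, deltas):
    return [lo + d * STEP for d in deltas]

lo, deltas = encode_tile([1200.0, 1200.3, 1201.7])
restored = decode_tile(lo, deltas)
```

The deltas stay small for most tiles, which is what a generic compressor (PNG or otherwise) can then squeeze.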


Uhm - I really need to know more about you/what you know, because this is a passion of mine as I seek out the next thing I want to focus on, and the terminology you have is exactly where I want my knowledge to be.

Please point me at what I should study to be proficient in topological data science (which is what I want to study)


Are you sure you mean topological data science? I know that there are topological methods for classifying high-dimensional data structures, but this discussion is mostly geographical/topographical. Yes, it does describe a surface, but there's a fundamental assumption that all objects are either on a plane or a sphere.

edit: If you mean GIS (geographical information systems/science), there are plenty of undergraduate courses strewn over github. IMO, the R geospatial ecosystem is more mature than its Python counterpart, but both are very usable.


Thank you, I didn't have the vocabulary to accurately describe my interest - I appreciate it. Thanks.

EDIT: I also made up "topological data science" - not sure that's a thing, but I want it to be.


I love Shademaps. I wish my product that has been using it were more of a hit, but I'll say this: Ted and Shademaps are cool. Adding trees is super practical; many of the areas where I leverage this tool are urban centres, but in the scenarios where I'm not, the tree data is almost always more relevant than the buildings or elevation (Ontario is pretty flat, and that's 99% of my users).

Keep up the great work!


Thanks Andrew. Still struggling a bit with the business/marketing side of things but the idea appears popular and the work itself is fun and fulfilling.

Thanks again for your support.


Have you thought of marketing at cities or public sector consultancies for modelling urban heat islands? Might be handy to prioritize climate adaptation measures.


This feels like it should be an interview question for a GIS firm:

"So, you have some radar and lidar data and want to merge on a box with limited memory. What would you do?"


In a way, the problems I'm dealing with here are what my CS education prepared me for. Previously I did web development, where you rarely had to worry about whether your webpage would fit into memory, and you rarely used more than a few percent of the cycles your processor was capable of.


The French mapping service is starting to do HD LIDAR captures of the whole of France, and some of them are already available: https://geoservices.ign.fr/lidarhd (At the bottom)

I'd be curious to know if you plan to include these in your app at some point in the future?

Thanks for all the work you did on Shademaps!


For sure. Appreciate the link.

My main roadblock now is that my fiancée is off for the summer (she's a teacher), so we're outside a lot, and it's also hard to find the LiDAR datasets online.

I think the Washington one has been out for a while but I didn't know it existed until a few weeks ago.

I'll probably do an Ask HN for LiDAR datasets at some point and hope we can crowdsource as much data as we can.


Awesome project!

What about hosting the data in an S3 bucket with "Requester Pays"? You'd only have the storage cost.

It disables anonymous access (but so would a Dropbox share) and reduces your cost massively.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/Reques...

You wouldn't necessarily need an SQL frontend since it's read-only anyway, and there are multiple ways of letting SQLite access databases in S3 buckets, e.g. https://github.com/michalc/sqlite-s3-query


A very cool demo, but most of the output has no value, as it is calculating the shadows on top of the tree canopy rather than the "ground" (hence you get sunlight even at dawn in a dense forest).

But this would be very useful for many applications at the edges of forests and for urban vegetation; in that use case the map tiles can be much smaller, and maybe the lidar data can be fetched and converted on demand.


Something that could be converted into an SQLite database and served as a static file at a fraction of the cost? https://news.ycombinator.com/item?id=27016630

Range requests in HTTP do the magic, but that part is already "solved"


that's essentially the route Mapbox went down; they even invented an entire mbtiles file format, which is just a sqlite db for doing these kinds of queries on the server

it's the "status quo" approach today in the industry, but it has some downsides still especially for smaller builders (ie me):

1. i'd have to run a separate tile server to take in the tile requests and convert them to "sql" requests (or mbtile requests) under the hood, and I'm just not a fan of more moving parts

2. I'd have to take my 10+ TB (and growing) of geotiffs and process them all into mbtiles, which is a huge compute and walltime cost

3. the resulting mbtiles end up being similar in size at best, but at worst far larger than the original geotiffs, so it balloons the hosting and egress costs in exchange for faster requests. this is a great compression optimization breakdown for geotiffs that dives into this if you're interested - https://blog.cleverelephant.ca/2015/02/geotiff-compression-f...
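for anyone unfamiliar, the mbtiles layout really is just sqlite, and a tile server's job reduces to something like this (a minimal sketch following the MBTiles spec, with made-up tile coordinates and a stub payload):

```python
import sqlite3

# MBTiles is a SQLite file with a `tiles` table (plus `metadata`);
# a tile server just translates z/x/y requests into this query.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE metadata (name TEXT, value TEXT)")
db.execute("CREATE TABLE tiles (zoom_level INTEGER, tile_column INTEGER, "
           "tile_row INTEGER, tile_data BLOB)")
db.execute("INSERT INTO tiles VALUES (?, ?, ?, ?)",
           (14, 2625, 10468, b"\x89PNG..."))

def get_tile(z: int, x: int, y: int):
    # MBTiles stores rows in the TMS scheme, so flip y from XYZ
    tms_y = (1 << z) - 1 - y
    row = db.execute(
        "SELECT tile_data FROM tiles WHERE zoom_level=? "
        "AND tile_column=? AND tile_row=?", (z, x, tms_y)).fetchone()
    return row[0] if row else None
```

the "moving part" objection is that something has to host this process, whereas range queries against a static file need nothing but object storage.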

anyways, I'm sure ted has his own thoughts as well but this is at least what I've taken away from this space after diving into it with fresh eyes in the past few months


Love it.

Vaguely related and potentially boring ramble, from the perspective of a dinosaur/former geospatial tech: starting in the 2000s, we went through the emergence of a trend where local governments commissioned (part-)district-wide LiDAR fly-overs and, even better, were willing to release the resulting digital data to the public.

Even over a decade ago, we private-sector techs were receiving those city-wide LiDAR datasets en masse for our everyday use in small-scale analysis, maps and plans.

The data-sharing policies of local governments did (and probably still do) vary widely between areas, but sometimes the LiDAR data would be passed on in the form of multiple layers: "ground terrain", "buildings", "tree canopy", as initially captured and computed by the LiDAR operators by way of varying frequencies.

We office techs would figure out and run the procedures necessary to adapt the data to any given request. We usually used commercial software and routines to create small-scale products, and analyses of this precise nature - sun/shade, and viewshed (or "what a viewer will see from a given point") - were a couple of the services that were called upon, even back then.

The early seeds of inclusion of tree canopy LiDAR specifically in such work, at a small neighbourhood scale, began to percolate through the private sector (yes, the seeds percolated, thank you metaphor gods), but it was new and rare, and to see this work now at large scale is heartening.

While my dinosaur experience includes very little regarding the serving of large scale data online, dealing with larger scale raw geospatial datasets has always been one of, if not the, key challenges of working in this area. All of the work boils down to the translation or abstraction of those sources, into efficient, easily digestible and suitably focused outputs.

In that sense, the CS, and more specifically data science, aspects of this type of work are what really drive the practical implementation of such innovations at large scale.


This is neat.

I guess you are mostly building the tool (is that right?), but what sort of things do you imagine your users using it for? Is it granular enough to, say, help plan a home solar installation?

It would be a neat, if slightly over-the-top, flex to have this in a flight simulator.


I think it's granular enough for solar. I put a tool together for annual sunlight[1] charts but haven't calculated the energy potential yet.

Some personal problems it solves for me:

- I hike/ski/climb a lot and want to know how late I can sleep in and still avoid the worst of the sun [2]

- I'm getting married this year and had wedding photos taken in a meadow in the mountains; we needed to set a time to meet the photographer for a 2-hour session where we would still get sun, but as late in the evening as possible.

- We're getting married in a back yard and want to place tables in a shady location for 5pm dinner

- I have a car I use occasionally and want to park it in a location that receives very little sun so the rubber/paint doesn't deteriorate as fast [3]

- We have a small van we take to the mountains and want to park it in the shade to keep it cool or sun to charge solar panels.

- etc, etc, etc

All these use-cases would be enhanced by trees, and I can do it on my local machine, but I want to share it in case other people have similar needs.

[1] https://shademap.app/sunchart/#15/47.61754/-122.34365

[2] https://shademap.app/shadeprofile/

[3] https://shademap.app/@47.61767,-122.34993,16.13693z,16889544...


I have also used shademaps for my project on assessing walkability! While coverage for my country is pretty patchy, it helps with looking at things at a glance. I used some maps to analyze the correlation between urban heat islands and the effect of tree presence and artificial shade.


Shademaps for the application of PV (photovoltaic) planning and optimization sounds cool.

Could they be helpful for agriculture planning, for example vineyards, etc.?

Another bunch of use-cases could perhaps be mobile bots that need to follow the sun or the shade, either way.


I think I would try the cloud first, or, for completely non-profit projects, crowdsourcing/distributed computation, perhaps with help from people in projects like OpenStreetMap.


Regarding hosting, have people forgotten about BitTorrent already?


Wow, this is very neat. Awesome project



