markisus's comments

Back in my earlier days working on autonomous vehicles, I dreamed of something like this.

The issue with bounding boxes is missed detections, occlusions, and impoverished geometrical information. But if you have a hundred points being stably tracked on an object, it's now much easier to keep tracking it through partial occlusions, figure out its 3D geometry and kinematics, and even re-identify it coming in and out of occlusion.


Insurance appeals are an actual problem though. Medical practices have to hire staff to argue with insurance companies, which increases healthcare costs.

The demos look great! I imagine it's not pure JavaScript. Are you using WebGPU?

The WebGL API is based on the OpenGL ES standard, which jettisoned a lot of the procedural pipeline calls that made it easy to write CPU-bound 3D logic.

The tradeoff is initial complexity (your "hello world" for WebGL showing one object will include a shader and priming data arrays for that shader), but as a consequence of that design the API forces more computation into the GPU layer, so the fact that JavaScript is driving it matters very little.
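
To make that concrete, here is roughly what the single-triangle "hello world" looks like in raw WebGL. This is a sketch, assuming a <canvas> element already exists on the page, with error handling omitted:

    // Minimal WebGL "hello world": a single triangle.
    // Even this requires a shader pair and a typed-array upload.
    const gl = document.querySelector('canvas').getContext('webgl');

    const vsSource = `
      attribute vec2 aPos;
      void main() { gl_Position = vec4(aPos, 0.0, 1.0); }`;
    const fsSource = `
      precision mediump float;
      void main() { gl_FragColor = vec4(1.0, 0.5, 0.0, 1.0); }`;

    function compile(type, source) {
      const s = gl.createShader(type);
      gl.shaderSource(s, source);
      gl.compileShader(s);
      return s;
    }

    const program = gl.createProgram();
    gl.attachShader(program, compile(gl.VERTEX_SHADER, vsSource));
    gl.attachShader(program, compile(gl.FRAGMENT_SHADER, fsSource));
    gl.linkProgram(program);
    gl.useProgram(program);

    // "Priming data arrays": vertex data lives in a GPU buffer,
    // not in per-frame JavaScript draw logic.
    const buf = gl.createBuffer();
    gl.bindBuffer(gl.ARRAY_BUFFER, buf);
    gl.bufferData(gl.ARRAY_BUFFER,
      new Float32Array([0, 0.8, -0.8, -0.8, 0.8, -0.8]),
      gl.STATIC_DRAW);

    const loc = gl.getAttribLocation(program, 'aPos');
    gl.enableVertexAttribArray(loc);
    gl.vertexAttribPointer(loc, 2, gl.FLOAT, false, 0, 0);

    gl.drawArrays(gl.TRIANGLES, 0, 3);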

THREE.js adds a nice layer of abstraction atop that metal.


Spark allows you to construct compute graphs at runtime in JavaScript and have them compiled and run on the GPU, without being bound by the CPU: https://sparkjs.dev/docs/dyno-overview/

WebGL2 isn't the best graphics API, but it allows anyone to write JavaScript code to harness the GPU for compute and rendering, and run on pretty much any device via the web browser. That's pretty amazing IMO!


Just WebGL2

It’s an abstraction that helps mathematicians study interesting phenomena. I believe the famous squaring the circle problem was resolved using the language of fields.


That we can't square the circle comes from pi being transcendental. The result that you're thinking of is Galois' proof that there is no general algebraic formula for the roots of 5th degree polynomials.


Yeah, and constructibility is usually handled by proving that a length is constructible if it lives in an iterated quadratic extension of the rationals. Pi does not lie in such an extension, so it is not a constructible length (and neither is its square root).
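
In field language the whole argument is short. Here is a sketch in LaTeX, using the standard constructibility criterion from Galois theory:

    % A length $\alpha$ is constructible iff it lies in a tower of fields
    \[
      \mathbb{Q} = F_0 \subset F_1 \subset \dots \subset F_n,
      \qquad [F_{i+1} : F_i] = 2,
    \]
    % which forces $[\mathbb{Q}(\alpha) : \mathbb{Q}]$ to divide $2^n$,
    % so in particular $\alpha$ is algebraic over $\mathbb{Q}$.
    % Squaring the circle would make $\sqrt{\pi}$ (hence $\pi$) constructible,
    % but $\pi$ is transcendental (Lindemann, 1882), so
    % $[\mathbb{Q}(\pi) : \mathbb{Q}]$ is infinite and no such tower exists.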


"transcendental" is field language


I've always thought of "transcendental" as number theory language, though I can see how someone could argue that it is field language.

But the Galois group of a field extension definitely is field language.


A field extension is the thing that is transcendental or not.


Yes, this converts a video stream (plus depth) into Gaussian splats on the fly. While the system is running, you can move the camera around to view the splats from different angles.

I took a screen recording of this system as it was running and cut it into clips to make the demo video.

I hope that makes sense?


If the scene is static, the normal Gaussian splatting pipeline will give much better results. You take a bunch of photos and then let the optimizer run for a while to create the scene.


I don't have a 3060 at hand so I'm not sure. Ideally someone with that setup will try it out and report back. There is no noticeable latency when comparing visually with standard pointcloud rendering.

With framerate, there are two different frame rates that matter. One is the splat construction framerate, which is the speed at which an entirely new set of Gaussians can be constructed. LiveSplat can usually maintain 30fps in this case.

The second is the splat rendering framerate. In VR this is important to prevent motion sickness. Even if you have a static set of splats, you need the rendering to react to the user's minor head movements at around 90fps for the best in-headset experience.

All these figures are from my setup with a 4090, but I have gotten similar results with a 3080 (maybe 70fps splat rendering instead of 90fps).
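
The two rates are decoupled: new splat sets arrive at roughly 30Hz, while rendering re-projects the latest set at roughly 90Hz. In JavaScript-flavored pseudocode (every function name here is hypothetical, and LiveSplat itself is not a JS app):

    // Illustrative sketch only; constructSplats, renderSplats,
    // camera.nextRgbdFrame, and headsetPose are hypothetical names.
    let splats = null;

    // Construction loop: build an entirely new splat set from the
    // latest RGBD frame, targeting ~30fps (a ~33ms budget per set).
    async function constructionLoop(camera) {
      while (true) {
        const frame = await camera.nextRgbdFrame();
        splats = await constructSplats(frame);
      }
    }

    // Render loop: redraw the *current* splat set from the user's
    // head pose every display refresh (~90Hz in VR), so small head
    // movements stay smooth even though splats update at only ~30Hz.
    function renderLoop() {
      if (splats) renderSplats(splats, headsetPose());
      requestAnimationFrame(renderLoop);
    }
    requestAnimationFrame(renderLoop);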


The application has this feature and lets you switch back and forth. What you are describing is the standard pointcloud rendering algorithm. I have an older video where I display the corresponding pointcloud [1] in a small picture-in-picture frame so you can compare.

I actually started with pointclouds for my VR teleoperation system, but I hated how ugly they looked. You end up seeing through objects, and objects become unparseable if you get too close. Textures present in the RGB frame also become very hard to make out because everything becomes "pointilized". In the linked video you can make out the wood grain direction in the splat rendering, but not in the pointcloud rendering.

[1] https://youtu.be/-u-e8YTt8R8?si=qBjYlvdOsUwAl5_r&t=14


I had to make a lot of concessions to make this work in real-time. There is no way that I know of to replicate the fidelity of the "actual" Gaussian splatting training process within the 33ms frame budget.

However, I have not baked the size or orientation into the system. Those are "chosen" by the neural net based on the input RGBD frames. The view-dependent effects are also "chosen" by the neural net, but not through an explicit radiance field. If you run the application and zoom in, you will be able to see splats of different sizes pointing in different directions. The system has limited ability to readjust the positions and sizes due to the compute budget, leading to the pixelated effect.
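
For reference, in the standard Gaussian splatting parameterization each splat carries roughly the following fields, which is what the net is choosing per frame here. A hypothetical sketch (LiveSplat's internal format may differ):

    // Hypothetical per-splat record following the standard Gaussian
    // splatting parameterization; the real internal layout may differ.
    const splat = {
      position: [0.10, 0.02, 1.25],    // 3D center, seeded from the depth frame
      scale:    [0.012, 0.012, 0.002], // anisotropic extent chosen by the net
      rotation: [1, 0, 0, 0],          // orientation as a quaternion (w, x, y, z)
      opacity:  0.9,                   // alpha used when blending splats
      color:    [0.58, 0.42, 0.25],    // view-dependent in the full method
    };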


I've uploaded a screenshot from LiveSplat where I zoomed in a lot on a piece of fabric. You can see that there is actually a lot of diversity in the shape, orientation, and opacity of the Gaussians produced [1].

[1] https://imgur.com/a/QXxCakM


I've tried to make it clear in the link that the actual application is closed source. I'm distributing it as a .whl full of binaries (see the installation instructions).

I've considered publishing the source, but it depends on some proprietary utility libraries from my bigger project that are hard to fully disentangle. I'm also not sure whether this project has business applications, and I'd like to keep that door open at this time.

