> History is full of “what if”s, what if something like DuckDB had existed in 2012? The main ingredients were there, vectorized query processing had already been invented in 2005. Would the now somewhat-silly-looking move to distributed systems for data analysis have ever happened?

I like the gist of the article, but the conclusion sounds like 20/20 hindsight.

All the elements were there, and the author nails that, but maybe the right incentive structure wasn't there to create the conditions for it to be built.

Between 2010 and 2015, almost the whole industry genuinely felt that we would converge on massive amounts of data, because until then it had never seen such an abundance of data, both in how easily it could be captured and in how easily sensors could be placed everywhere.

In that scenario, the natural step is usually not "let's find efficient ways to do it with the same capacity" but "let's invest so we can process this in a distributed manner, regardless of the volume we end up with."

It's the same with OpenAI/ChatGPT and DeepSeek: one can say the math was always there, but the first mover was OpenAI, with something less efficient and a different set of incentive structures.




It would not have happened. The problem is that people believe their app will be web-scale pretty soon, so they need to solve the problem ASAP.

Only after being burned many, many times does the need for simplicity arise.

It's the same with NoSQL. Only after suffering through it do you appreciate going back.

i.e., tools like this circle back only after the pain of a bubble. It can't be done from inside it.


> The problem is that people believe their app will be web-scale pretty soon, so they need to solve the problem ASAP.

Investors really wanted to hear about your scaling capabilities, even when it didn't make sense. But the burn rate at places that didn't let a spreadsheet determine scale was insane.

Years working on microservices, and now I start planning/discovery with "why isn't this running on a box in the closet" and only accept numerical explanations. Putting a dollar value on excess capacity and labeling it "ad spend" changes perspectives.



