
I asked ChatGPT o3 whether lidar could damage phone sensors, and it said yes: https://chatgpt.com/share/683ee007-7338-800e-a6a4-cebc293c46...

I also asked Gemini 2.5 Pro preview, and it said yes: https://g.co/gemini/share/0aeded9b8220

I find it instructive to test for myself whenever someone suggests to me that an "LLM" failed at a task.


I should have been more specific, but I believe you missed my point.

I tested this at the time on Claude 3.7 Sonnet, which has an earlier cutoff date, and I just tested again with this prompt: “Can the lidar of a self-driving car damage a phone camera sensor?” The answer is still wrong in my test.

I believe the issue is the training cutoff date; that's my point. LLMs seem smart, but they have limits, and when asked about something discovered after their training cutoff date, they will sometimes confidently be wrong.


I didn't miss your point; rather, I wanted you to realize some deeper points I was trying to make:

- Not all LLMs are the same, and not identifying your tool is problematic, because "LLMs can't do a thing" is very different from "the particular model I used failed at this thing". I demonstrated that by showing that many LLMs get the answer right. It puts the onus of correctness entirely on the category of technology, and not on the tool used or the skill of the tool user.

- Training data cutoffs are only one part of the equation: tool use by LLMs allows them to search the internet and run arbitrary code (amongst many other things).

In both of my cases, the training data did not include the relevant results either; both models used a tool call to search the internet for data.

Not realizing that modern AI tools are more than an LLM with static training data, that they have tool calling and full internet access, and that they can reason over a wide variety of up-to-date data sources, demonstrates a fundamental misunderstanding of how these systems work.
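
To make the mechanism concrete, here's a minimal sketch of a tool-calling round trip using the OpenAI Python SDK. The web_search tool is a hypothetical function you'd implement and execute yourself; only its schema is shown to the model:

    from openai import OpenAI

    client = OpenAI()

    # Advertise a (hypothetical) web search tool to the model.
    tools = [{
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web and return result snippets",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="gpt-4o",  # any tool-capable model works here
        messages=[{"role": "user", "content":
                   "Can lidar damage a phone camera sensor?"}],
        tools=tools,
    )

    msg = resp.choices[0].message
    if msg.tool_calls:
        # The model chose to search rather than answer from its
        # training data alone. You run the search, append the
        # result as a "tool" message, and call the API again.
        print(msg.tool_calls[0].function.arguments)

The point is that the answer can come from a live search result rather than from whatever was frozen into the weights at training time.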

Having said that:

Claude Sonnet 4.0 says "yes": https://claude.ai/share/001e16f8-20ea-4941-a181-48311252bca0

Personally, I don't use Claude for this kind of thing because while it has proven to be a very good coding assistant, interacting with my IDE in an "agentic" manner, it's clearly not designed to be a deep-research assistant that broadly searches the internet and other data sources to provide accurate, up-to-date information. (This means that model selection is a skill issue, and getting good results from AI tools is a skill, which is borne out by the fact that I get the right answer every time I try, and you can't get the right answer once.)


Still not getting it, I think.

My point is: LLMs sound very plausible and very confident when they are wrong.

That’s it. And I was just offering a trick to help remember this, to keep checking their output – nothing else.


Bulk discounts don't "penalize" smaller purchases, they reward larger purchases.

Companies offer bulk discounts on basically... everything.

This is like pointing out that the Dollar Store penalizes people for buying small quantities and thus suggesting that Costco should raise prices to "make it fair".


No, it is discrimination based on marital/family status.

If they were willing to sell me 5 flight coupons for a discount, that would be acceptable. There is nothing I can do as an individual to take advantage of the discount.


Nonsense. The airline doesn't ask your marital/family status. They simply offer a bulk discount.

By your logic, Costco is also "discriminating based on marital/family status" by selling bulk at a discount. Costco doesn't sell "buy 1/5 of a toilet paper package, come back and get the other 4/5 later". They sell the whole package up front, use it or lose it.

That's how bulk discounts work.

Heck, you could say that a gallon of milk discriminates against you because you have to pay way more to buy a pint of it, and you can't "come back later for the other pints at the same price".

This is absurdism to the highest degree.

You are welcome to book flights with friends, or even organize a flight-share program and go in on flights with strangers. Bulk doesn't discriminate. I know folks who go in together on Costco bulk because they can't use it all. Make the system work for you.


I'd say algebra is required, just because that's what introduces variables and functions. If you get into programming without knowing algebra, you'll learn algebra on Day 1 without realizing that's what you're doing.


Programming doesn’t require algebra[1], which is about manipulating equations.

You only need to understand processes and assignments, neither of which appear in a typical algebra class. You certainly won’t learn how to pass an algebra class about manipulating equalities by programming.

[1] — The high school kind.
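
A two-line example makes the distinction concrete (a minimal sketch in Python):

    # In an algebra class, "x = x + 1" is an equation with no
    # solution. In a program, "=" is assignment, not equality:
    x = 5
    x = x + 1      # a process step: overwrite x with 6
    print(x)       # 6

    # Equality testing is a separate operator entirely:
    print(x == 6)  # True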


"understand" is doing a lot of work here. LLMs can explain math and solve math problems better than most programmers.


CEOs laying off engineers for AI is a good thing.

It's a natural selection mechanism whereby poorly run companies will fail extremely quickly.

These are bad companies with horrible leadership that are wasting resources. Their existence is a net negative to society and our industry. Goodbye to the garbage Klarnas and Duolingos. No one will miss you.


> It's a natural selection mechanism whereby poorly run companies will fail extremely quickly.

Sure, but at the short term cost of engineers' livelihoods.

Or were you thinking C-suite would be held to account for the failure?


No one has to hold the C-suite to account or directly communicate with them in any way.

You just need investors and consumers to allocate capital to their competitors. The flow of capital and consumer preference is a decentralized communication method. Price signals, supply and demand, and government regulations are a form of stigmergy.


So you'd rather be a parasite than a free agent?


It's a mutualistic relationship. I get paid to learn and employers get to ingest my nutrient-rich metabolite.

Poorly run companies can be an opportunity to direct development towards whatever I'm curious about, and that's usually beneficial to both parties.


That would be commensalism at best, or perhaps phoresy, but never mind. You're not exactly wrong, but you may be depending on the financial equivalent of a nutrient waste spill: like an alga, assuming this bloom is forever.


It's more about reading the room and being proactive with my decades of experience, but I take your point and enjoyed it.

I needed to look up phoresy and loved it! Thank you.


Right. Except jobs are useful to help people pay for rent/mortgage, food, and other essentials. How is it good if companies fail?

Who gives a crap if bad companies fail or don't? And for that matter, how is it good or bad for you and me if bad companies fail, or succeed despite being poorly run?

I'm curious about the context of your judgment.


Bad companies tie up capital and talent unproductively.

If a CEO is making a tech product, but knows so little about tech that he's replacing engineers with 2025's AIs, we're all better off if that CEO goes and does something else, and those engineers go work for technical people.

Temporary stability is not a substitute for long-term prosperity. Creative destruction.


If paying rent, mortgage and food is all that matters, why not just pay people to dig ditches and then pay them to fill them back in? Why go through the rigamarole of engineering at all?

If we accept that we should be productive, then it seems easy to justify that engineers should be working at good, well run companies producing real value for society.


In theory I agree, companies should be productive.

In practice, having worked in financial services and ad tech, at high salaries in each, I found it was absolutely the intellectual equivalent of digging ditches for pay. The only job I ever had that felt productive was in academia, and the pay was off the charts low.


Imo current models can already break things up into bite-sized pieces. The limiter I've seen is twofold:

1) Maintaining context of the overall project and goals while working in the weeds on a subtask of a task on an epic (so to speak), both in terms of what has been accomplished already and what still needs to be accomplished.

and 2) Getting an agentic coding tool that can actually handle the scale of doing 50 small projects back to back. I find these tools start projects off really strong, but by task #5 they're just making a mess with every change.

I've played with keeping basically a dev-progress.md file and an implementation-plan.md file that I keep in context for every request, ending each task by updating the files (see the sketch below). But me manually keeping all this context isn't solving all my problems.

And all the while, tools like Cline are gobbling up 2M tokens to make small changes.
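
For what it's worth, the manual workflow above is simple enough to script. A rough sketch, where the file names match mine and send_to_agent is a placeholder for whatever tool you drive:

    from pathlib import Path

    # Prepend standing project context to every task prompt so the
    # agent doesn't lose the plot between subtasks.
    CONTEXT_FILES = ["implementation-plan.md", "dev-progress.md"]

    def build_prompt(task: str) -> str:
        context = "\n\n".join(
            f"## {name}\n{Path(name).read_text()}"
            for name in CONTEXT_FILES
        )
        return (f"{context}\n\n## Current task\n{task}\n\n"
                "When finished, list the updates to make to "
                "dev-progress.md.")

    # send_to_agent(build_prompt("Add pagination to /users"))  # placeholder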


> Maintaining context of the overall project and goals while working in the weeds on a subtask of a task on an epic (so to speak) both in terms of what has been accomplished already and what still needs to be accomplished

This is a struggle for every human I’ve ever worked with.


This is probably the biggest difference between people who write code and people who should never write code. Some people just can't write several connected program files without logical conflicts. It's almost like their brain's context is only capable of holding one file.


True, but if AI only gets as useful as an average developer, it isn’t that useful.


Yes. I wonder if the path forward will be to create systems of agents that work as a team, with an "architect" or "technical lead" AI directing the work of more specialized execution AIs. This could alleviate the issue of context pollution: the technical lead doesn't have to hold the low-level context of each small problem, and the execution AIs don't have to hold the overall project context.

Shit. Do we need agile AI now?
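
A minimal sketch of the shape I mean, assuming a generic complete(system, prompt) wrapper around whatever model API you use; the role split is the point, not the specific prompts:

    # Hypothetical two-tier setup: an "architect" holds the
    # project-level context and hands small, self-contained briefs
    # to "engineer" calls that each start with a fresh context.

    def complete(system: str, prompt: str) -> str:
        raise NotImplementedError  # wire up your model API here

    def run_project(goal: str) -> list[str]:
        plan = complete(
            "You are a technical lead. Break the goal into small, "
            "independent tasks, one per line.",
            goal,
        )
        results = []
        for task in plan.splitlines():
            if not task.strip():
                continue
            # Each subtask sees only its own brief, not the full
            # history -- that's the context-pollution fix.
            results.append(complete(
                "You are an engineer. Complete exactly this task.",
                f"Overall goal: {goal}\nYour task: {task}",
            ))
        return results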


This is kind of what the modes in Roo Code do now. I'm having great success with them, and they just rolled out as defaults a couple of days ago.

There is a default set of modes (orchestrator, code, architect, debug, and ask), and you can create your own custom ones (or have Roo do it for you, which is kind of a fun meta play).

Orchestrator basically consults the others and uses them when appropriate, feeding a sensible amount of task definition and context into each subtask. You can use different LLMs for different modes as well (I like Gemini 2.5 Pro for most of the thinking-style ones and o4-mini for the coding).

I've done some reasonably complicated things and haven't really had an orchestrator task creep past ~400k tokens before I was finished and able to start a new task.

There are some people out there who do really cool stuff with memory banks (basically logging and progress tracking), but I haven't played a ton with that yet.

Basic overview: https://docs.roocode.com/features/boomerang-tasks

Custom Modes: https://docs.roocode.com/features/custom-modes


It doesn't make sense for a middleman like booking.com to let you completely bypass everything they offer.

However it certainly might make sense for an individual hotel to let you bypass the booking.com middleman (a middleman that the hotel dislikes already).

Scenario 1: You log on to booking.com, deal with a beg to join a subscription service (?), block hundreds of ads and trackers, just to go searching through page after page of slop trying to find a matching hotel. You find it, go to the hotel's actual webpage, and book there, saving a little bit of money.

Scenario 2: You ask your favorite Deep Research AI (maybe they've come up with Diligent Assistant mode) to scan for Thai hotels meeting your specific criteria (similar to the search filters you entered on booking.com) and your AI reaches out to Hotel Discovery MCP servers run by hotels, picks a few matches, and returns them to you with a suggestion. You review the results and select one. The AI agent points out some helpful deals and rewards programs that might apply. Your AI completes the booking.

The value the AI gave you is that you no longer did the searching, dealt with the middleman, viewed the ads, got begged to join a subscription service, etc.

However, the hotel already doesn't really like the booking.com middleman. They strongly prefer that you book directly with them and give you extra benefits for doing so. From the hotel's perspective, the AI middleman is cheaper than booking.com and still preserves the direct business relationship.
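
To gesture at what a "Hotel Discovery MCP server" could look like, here's a minimal sketch using the official MCP Python SDK; the tool shape and the inventory are made up for illustration:

    from mcp.server.fastmcp import FastMCP

    # Hypothetical hotel-side server exposing a search tool that an
    # AI assistant could call directly, with no aggregator between.
    mcp = FastMCP("hotel-discovery")

    ROOMS = [  # made-up inventory
        {"name": "Seaview Deluxe", "city": "Phuket", "usd": 120},
        {"name": "Garden Bungalow", "city": "Krabi", "usd": 75},
    ]

    @mcp.tool()
    def search_rooms(city: str, max_usd: int) -> list[dict]:
        """Return available rooms in a city under a price cap."""
        return [r for r in ROOMS
                if r["city"] == city and r["usd"] <= max_usd]

    if __name__ == "__main__":
        mcp.run()  # serves over stdio by default

Whether hotels would actually run these is the open question, but the incentive lines up: it's their direct channel, not someone else's storefront.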


Both are a good thing? Even among folks in America who strongly dislike the fascist leadership, you'd be hard-pressed to find anyone who thinks "Europe should underfund their military and rely on the American taxpayer for defense" or "Europe should be uncompetitive in technology and just buy American and Chinese solutions".

I don't understand why Europeans have to blame Trump for "spending a fair minimum on their own defense in the face of existential threats" or "investing domestically in technology".

These are things you always should have been doing.


For us Europeans it's a good thing. There have been sane people here in Europe who never wanted to be reliant on US subscription software etc., but who had no chance of getting their view through.

For the US though, this European reliance on what is in effect rubbish is great. Rubbish in return for cars and all sorts of sophisticated technology.

I don't like calling ad firms etc. 'tech'. Technology for me is things like new chemical processes, whatever TSMC is doing, etc., and the US does have that, but firms like Meta are not big players outside of machine learning research.


> rely on the American tax payer for defense

Not sure how much that's true. EU spends billions buying US military hardware such as the Reaper. It's unclear how the story will end up for US taxpayers here.


> Not sure how much that's true. EU spends billions buying US military hardware such as the Reaper.

The thing is, West Germany alone had about 3,000 tanks at the end of the Cold War, and reunified Germany nowadays has 300. Navies and air forces have been run into the ground across Europe; even the nuclear forces of the UK and France are nowhere near what they once were.

The expectation over the last few decades, especially after the pacification of the Balkans, was that open war would not return to Europe any time soon - on the one hand because almost all European countries were members of the EU, EEA, or Schengen and were thus likely to integrate further instead of going back to 19th-century fiefdom wars, and on the other because the only realistic opponent was Russia... which we thought we could handle with "peace through trade".


By "relying on the American tax payer" they mean trusting the US military to deter Russia.


A few thousand rich people don't need 8 billion pets.

"Maintain humanity under 500,000,000 in perpetual balance with nature" (Georgia Guidestones, a monument describing someone's ideal future that is surrounded by conspiracy)

Will you be one of the 500 million pets? Or one of the 7.5 billion leftovers? Sure would be a shame if billions became unnecessary and then a pandemic 100x worse than covid got lab-leaked.


> Training on all papers does not mean the model believes or knows the truth. It is just a machine that spits out words.

Sounds like humans at school. Cram the material. Take the test. Eject the data.



